Psychology Statistics Exam 1

Tom's score is 7 and the class mean is 5. What is his deviation score and what does it mean?

+2. Tom's score is 2 points ABOVE the mean

The number of scores in data

N: the number of scores in a population. n: the number of scores in a sample.

A set of 50 scores has a mean of 30. Every student receives a bonus of 2 points for organizing his or her notes, what will be the new value for the mean?

New mean= 30 +2 = 32

A sample of n= 15 scores has a mean of M=30. A second sample of n=5 scores has a mean of m=24. If the two samples are combined, what value will be obtained for the mean of the combined sample?

[(15)(30) + (5)(24)] / (15 + 5) = 570/20 = 28.5

What if the second sample has n = 15 scores (the same as the first sample)? What will be the mean of the combined sample?

(30 + 24)/2 = 27. Because the two groups have the same sample size, we don't need a weighted mean; simply add the two means and divide by 2. Compare the previous problem, where the sample sizes differ: [(15)(30) + (5)(24)] / (15 + 5) = 28.5.

A study was conducted to examine whether marijuana is effective in reducing pain among cancer patients. The researchers selected 60 cancer patients at Mid Michigan Health Center and assessed the amount of pain among the patients before and after the use of marijuana. 1. Identify the sample _____________. 2. What would be the largest population, to which the researchers could draw conclusions, based on the study results? ________________.

1. The 60 cancer patients at Mid Michigan Health Center. 2. All cancer patients.

A researcher asks 100 CMU students how much money they spend on food per week in order to draw inferences about the amount of money all students at CMU spend. 1. The mean cost of food among the 100 students is _____________ 2. The 100 CMU students are __________________ 3. All CMU students are __________________ 4. What type of statistics does the researcher intend to use? Descriptive or inferential statistics? 5. The researcher would like to generalize the result (from 100 CMU students) to all college students in the U.S. What kind of problem do you expect?

1. statistic (M). 2. sample 3. population 4. inferential statistics 5. a sampling error (our sample may not be representative of population).

Skewed distributions

1. They have only one mode (peak), but 2. they are not symmetrical. Skewness is the degree of asymmetry. A skewed distribution is either positively or negatively skewed.

A sample has a mean of 10, and the sum of all the scores is 150. What is n?

10 = 150/n, so n = 150/10 = 15

A set of 8 scores has a mean of 20. If one of the scores is changed from x=42 to x=2, what will be the new value for the mean?

20 = ΣX/8, so ΣX = 160. New ΣX = 160 - 40 = 120. New mean = 120/8 = 15.

A sample of n=10 scores has a mean of 9. What is the sum of all 10 scores (=ΣX) for this sample?

9 = ΣX/10, so ΣX = 9 × 10 = 90

Variables

A characteristic or condition that changes or has different values for different individuals. To be a variable, something must be capable of existing in more than one state, or must be capable of being changed. For example, sex (female or male) and skin color can be variables.

Constant

A characteristic or condition that does not vary or is the same for every individual. Suppose there are 20 female students and 3 male students in a class. Sex is a variable in this situation. If all the male students drop out of the class, the variable "sex" becomes a constant.

Symmetrical distribution

A distribution where the left hand side is a mirror image of the right hand side when a vertical line is drawn through the middle

Positively skewed distribution

A distribution where the scores pile up on the left side and taper off to the right. It has a long tail in the positive (right side) direction. If a distribution of exam scores is positively skewed, we can say that most people received low scores. When a difficult exam is given to students, you expect the scores to form a positively skewed distribution.

Negatively skewed distribution

A distribution where the scores pile up on the right side and taper off to the left. It has a long tail in the negative (left side) direction. If a distribution of exam scores is negatively skewed, we can say that most people received high scores. When an easy exam is given to students, you expect the scores to form a negatively skewed distribution.

Polygon (line graph)

A graph consisting of a line that connects a series of dots. A dot is placed above each score or interval so that the height of the dot corresponds to the frequency. Used with interval and ratio scales.

Histogram

A graph showing a bar above each score or interval so that the height of the bar corresponds to the frequency and the width extends to the real limits. Used with interval and ratio scales.

Bar Graph

A graph that is used to represent the frequencies of nominal or ordinal scaled variables. The height of the bar corresponds to the frequency. A space is left between adjacent bars because the variable consists of separate, distinct categories.

Parameter

A parameter is a numerical quantity measuring some aspect of a population of scores. Greek letters are used to designate parameters

Percentile

A point (score) on the measurement scale below which a specified percentage of the cases in the distribution falls. Example: what is the score below which 95% of cases fall? If that score is 4, always report it in terms of real limits, so the answer is 4.5. Thus we say that 95% of students received a score of 4.5 or below, and 4.5 is referred to as the 95th percentile.

Population

A population is the set of all the individuals of interest in a particular study. It can vary in size from very large to very small, but it is usually large (all Facebook users, all teen girls). It could consist of people, rats, products, schools, etc., but it is typically people.

Sample

A sample is a subset of individuals selected from a population, intended to represent the population in a research study

Rule 4 Characteristics of the mean: If every score in a distribution is multiplied or divided by a constant value, the mean will change in the same way.

A set of 100 scores has a mean of 45. If each score is divided by 5, what will be the new value for the mean? 45/5 = 9

Statistic

A statistic is a numerical quantity calculated in a sample. Such statistics are used to estimate parameters. Roman letters are used to designate a statistic.

Placebo

A treatment which does not contain the features that the researcher believes will make a difference in the outcome of the experiment. Essentially, this is a fake treatment.

1. A researcher would like to examine the effect of different lengths of instruction on the reading acquisition of 2nd graders. The researcher took 50 students and randomly assigned 25 to one group and 25 to the other. The first group received one fifteen-minute period of instruction; the second group received five fifteen-minute periods of instruction. The students then took the ABC reading test. a. Identify the type of research design: Experimental, Non-experimental, or Correlational b. Identify IV and DV IV _____________________ DV ________________

a. Experimental b. IV = length of instruction; DV = scores on the reading test

Measurement

Measurement involves assigning individuals or events to categories. - Simple category: sex (male vs. female) - Numerical category: height or weight - You need to assign numbers to observations according to rules. - Four scales of measurement are typically discussed in statistics. - The measurement process is important because it determines which statistics you can use with those numbers.

A personality trait is defined as a long-lasting, relatively stable characteristic of an individual's personality. The trait "relaxed/tense" has been shown to be a consistent factor differentiating creative artists from airline pilots. The following data represent relaxed/tense scores for two groups of subjects. Lower scores indicate a more relaxed personality. Artists: 7, 7, 6, 9, 8, 8, 7, 9, 5, 3, 6, 8, 7, 9 Pilots: 4, 2, 2, 3, 1, 5, 4, 3, 2, 2, 6, 2, 5, 3 a. Identify the type of research design: Experimental, Non-experimental, or Correlational b. Identify IV and DV IV _____________________ DV ________________

a. Non-experimental b. Quasi-IV = type of occupation (artists vs. pilots); DV = personality scores

Mean

An average. The sum of the scores divided by the number of scores For example: There are n=50 students in a class. The instructor divided students into 10 groups, 5 students in each group. She wants to assign a total of 300 problems to groups and students. If the groups and students divide the problems equally among themselves, how many problems will each group solve? How many problems will each student solve? Answer: How many problems will each group solve? 300 problems and 10 groups. 300/10=30 problems. How many problems will each student solve? 300 problems and 50 students = 300/50=6 or 30 problems and 5 students in each group = 30/5 =6

When to choose Median when measuring central tendency

Appropriate: -Interval, ratio scale with extreme outliers/skewed distribution -Undetermined values -open ended distributions -ordinal scale Should not be used when: -Nominal scale

When to choose Mode when measuring central tendency

Appropriate: -Nominal scale (sex) -Ordinal scale (when the score range is limited; a 1-3 scale, for example) -Discrete variables (clothing size) Should not be used when: -Interval or ratio data, except to accompany the mean or median

When to choose Mean when measuring central tendency

Appropriate: -Interval, ratio scale when there are no extreme outliers -Ordinal scale (when the score range is large; a 1-9 scale, for example), use with caution Should not be used when: -Nominal scale, ordinal scale (when the score range is limited; a 1-3 scale, for example) -Interval, ratio scale with extreme outliers/skewed distribution -Undetermined values -Open-ended distributions

Discrete variable

Consists of separate, indivisible categories. They are commonly restricted to whole, countable numbers. No values can exist between two neighboring categories. Examples: number of children in a family (it is not possible to have 2.5 children in a family), gender (1 = men, 2 = women), occupation (1 = nurse, 2 = teacher, 3 = lawyer, etc.)

Variables ( Are they Continuous or Discrete?) 1. Number of words recalled =______ 2. Body type (slim, average, heavy) = _____________ 3. Temperature= _________ 4. A letter Grade (ABCDE)=__________ 5. Weight in pounds of an infant= ___________ 6. When measuring height to the nearest 0.2 inch, what are the real limits for a height of 68 inches?

Continuous vs discrete 1. discrete 2. discrete 3. continuous 4. discrete 5. continuous 6. 67.9-68.1

Data

Data are measurements or observations. A data set is a collection of measurements or observations. A datum is a single measurement or observation and is commonly called a score or raw score.

Rule 1 Characteristics of the mean: Changing the value of any score will change the mean

Example: The mean of the first quiz among 10 students was 7. The teacher found a mistake after the grading was done; one student's score was supposed to be 10, not 1. What will be the new value for the mean after one score is changed from X = 1 to X = 10? 7 = ΣX/10, so ΣX = 70. New ΣX = 70 + 9 = 79. New mean = 79/10 = 7.9.
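The arithmetic behind Rule 1 can be sketched in a few lines of Python. This is only an illustration: the function name `mean_after_change` is made up for this example, and it relies on the identity ΣX = mean × n used on the card.

```python
def mean_after_change(old_mean, n, old_score, new_score):
    """Return the new mean after one score changes (hypothetical helper)."""
    total = old_mean * n               # recover ΣX from the mean
    total += new_score - old_score     # adjust ΣX by the size of the change
    return total / n                   # n itself does not change


print(mean_after_change(7, 10, 1, 10))   # quiz example: 7.9
print(mean_after_change(20, 8, 42, 2))   # 8 scores, one changed from 42 to 2: 15.0
```

The same helper covers any "one score is changed" problem, because only the sum changes while n stays fixed.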

True or false? Is it possible for more than 50% of the scores in a distribution to have values greater than the median?

False

True or false? A distribution can have more than one median?

False, because the median is the value that divides a distribution in half

True or false? If you have a score of 52 on an 80 point exam, then you definitely scored above the median?

False, we need more info. For example, if most students scored 70 and above, 52 will be below the median, but if most students scored 40 and below, 52 will be above the median.

Melissa's deviation score on the statistics exam is 0. What does 0 mean?

Her score is the same as the mean

Ratio scale

Highest level. Any type of statistic may be used if the data are on a ratio scale. Properties that the scale has: Identity (same as nominal). Logical order (same as ordinal). Equal distance (same as interval). True zero point: the point 0 reflects an absence of the characteristic that the variable measures. As 0 reflects an absence of the property, a negative value does not make sense. For example, weight cannot be -10 lbs. Examples: age, height, weight, time to complete a task, number of children in a family, number of soda cans you can drink, temperature measured in Kelvin.

Range

Highest value - lowest value. Ranges from 0 to infinity. Data: 1, 4, 5, 6, 8, 9, 13. Range = 13 - 1 = 12. Data: -9, -7, 0, 1, 6. Range = 6 - (-9) = 15. Data: -100, -93, -80, -23, -11. Range = -11 - (-100) = 89. The simplest measure of variability. The range is most informative for data sets without outliers. The range is a very unreliable measure because it depends on only 2 points in the entire distribution.
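A short Python sketch can reproduce the three range calculations and show why the range is unreliable. The helper name `value_range` and the data set with an outlier are made up for illustration.

```python
def value_range(scores):
    """Range = highest value - lowest value."""
    return max(scores) - min(scores)


print(value_range([1, 4, 5, 6, 8, 9, 13]))         # 12
print(value_range([-9, -7, 0, 1, 6]))              # 15
print(value_range([-100, -93, -80, -23, -11]))     # 89

# Hypothetical data: replacing the top score with one outlier changes the
# range completely, because only the two extreme points are used.
print(value_range([1, 4, 5, 6, 8, 9, 130]))        # 129
```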

A distribution of scores shows mean= 31 and median= 43. This distribution is probably_____

Negatively skewed. The mean is less than the median, indicating the tail is in the left side.

Independent Variable (IV)

IV is the factor that the researcher believes will cause something to happen. It is presumed cause in the cause and effect relationship that the experimental method seeks to demonstrate. It contains categorical group membership (experimental vs control group)

Ordinal scale

Identity (same as nominal): The number gives us the identity of the category assigned. Logical order: The numbers have a logical order (scaled according to the amount of the particular characteristic they possess). Measurements on an ordinal scale rank observations in terms of size or magnitude. Properties that the scale does not have: equal distance and true zero point. Examples: Class: 1 = freshman, 2 = sophomore, 3 = junior, 4 = senior. Grade: 1 = A, 2 = B, 3 = C, 4 = D, 5 = F. Attitude survey question (Likert scale): 0 = strongly disagree, 1 = disagree, 2 = neutral, 3 = agree, 4 = strongly agree.

Interval Scale

Identity (same as ordinal and nominal): The number gives us the identity of the category assigned. Logical order (same as ordinal): The numbers have a logical order; measurements rank observations in terms of size or magnitude. Equal distance: Differences in the characteristic are represented by equal differences in the numbers assigned to the categories. Property that the scale does not have: true zero point. Examples: temperature measured in Celsius or Fahrenheit; standardized, normalized test scores (IQ, SAT, etc.). Approximately interval: an ordinal scale with a larger range of scores (final exam score, depression score).

I assign the value of 1 to male and 2 to female. The mean of gender variable is 1.75. Is the mean an appropriate central tendency to use for gender variable?

It is not appropriate as gender is a nominal scale.

Proportion (p)

It measures the fraction (proportion) of the total group that is associated with each score p= f/n Σp =1 (the sum of all proportions in the sample must equal 1).

Nominal Scale

Lowest level. If data are on a nominal scale, only limited statistical techniques can be used. Identity: The number gives us the identity of the category assigned. Nominal means having to do with names: classifying individuals into categories that have different names but are not related to each other in any systematic way. Properties that the scale does not have: logical order, equal distance, and true zero point. Examples: Gender: 1 = men, 2 = women. Ethnicity: 1 = White, 2 = Black, 3 = Hispanic, 4 = others. Eye color: 1 = blue, 2 = brown, 3 = green, 4 = hazel.

Find the mean for the following sample of 6 scores: 4, 10, 5, 2, 1, 5

M (mean) = (4 + 10 + 5 + 2 + 1 + 5)/6 = 27/6 = 4.5

Suppose I teach 2 sections of the same class, each section consisting of different number of students. n1= 20 and n2= 40. I administered the same midterm test to the students in both sections and calculated the mean: m1= 80 and m2= 70. What is the mean for the total group if I combine the two sections?

M = (ΣX1 + ΣX2)/(n1 + n2) = (80 × 20 + 70 × 40)/(20 + 40) = 4400/60 = 73.33

In a symmetrical distribution, but with multiple modes or no modes, do we use mean, median or mode?

Mean = median. The mean and median have the same value.

In a Normal distribution, do we use mean median or mode?

Mean, median and mode

In a skewed distribution do we use mean, median or mode?

Mean: influenced by extreme scores, is found far toward the long tail (positive or negative). Median: in order to divide scores in half, is found toward the long tail, but not as far as the mean. Mode is found near the short tail. Positively skewed distribution: Mean is closer to the skewed side (right) because mean is affected by extreme cases. Mean is greater than median. Negative skewed: mean is closer to the skewed side (left) because mean is affected by extreme cases. Mean is less than median.
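The ordering of mean, median, and mode in skewed data can be checked with Python's standard `statistics` module. Both data sets below are invented for illustration and are not taken from the course.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most are low, one is very high.
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 15]
# Hypothetical negatively skewed scores: most are high, one is very low.
negatively_skewed = [1, 12, 13, 13, 14, 14, 14, 15]

for label, scores in [("positive", positively_skewed),
                      ("negative", negatively_skewed)]:
    print(label, "mean:", mean(scores),
          "median:", median(scores), "mode:", mode(scores))

# Positive skew: the mean is pulled toward the right tail (mean > median > mode).
# Negative skew: the mean is pulled toward the left tail (mean < median < mode).
```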

A set of 100 scores has a mean of 45. If each score is divided by 5, what will be the new value for the mean?

New Mean= 45/ 5 = 9

If the stats exam was very easy, do you expect the mean to be greater than the median?

No. If the exam is easy, most students would receive very high scores, which would form a negatively skewed distribution. As the mean is pulled toward the skewed side (the long tail), the mean should be located on the left side of the distribution. Therefore, we expect the median to be higher than the mean.

Non experimental methods

Nonequivalent groups and pre-post studies. These studies do not use a manipulated variable to differentiate the groups. Instead, the variable that differentiates the groups is usually a pre-existing participant variable (such as boys/girls) or a time variable (such as before/after treatment).

Normal Distributions

Normal distributions are a family of distributions that have the same general shape. 1. They are symmetric, with scores more concentrated in the middle than in the tails. 2. They have only one mode (peakness). Normal distributions are sometimes described as bell shaped. Symmetrical distribution and unimodal distribution

Operational definition

A procedure that makes unobservable constructs observable. Through an operational definition, constructs are measured and inferred. A construct can have many operational definitions. Example: construct = anxiety; operational definition = score on an anxiety test.

What do X and Y mean in statistical notation?

Raw scores are the original, unchanged scores obtained in the study. X: The letter X usually indicates raw scores for a particular variable. If Susan got a score of 3 on the first homework assignment, we could state that X = 3 for Susan. Y: The letter Y usually indicates raw scores for an additional variable. If Susan got a score of -5 on a math attitude survey ranging from -5 (dislike) to +5 (like a lot), we could state that Y = -5 for Susan.

Nonequivalent groups

The researcher compares groups. The researcher cannot control who goes into which group.

Rules for summation notation(order of operation)

Rule 1: Any calculation contained within parentheses, including ( ), [ ], { }, | |, and √, is done first.
Rule 2: Exponents (squaring or raising to other exponents) are done second.
Rule 3: Multiplying and/or dividing is done third (working from left to right).
Rule 4: Summation (∑) is done next.
Rule 5: Addition and/or subtraction are done last (working from left to right).
Rule 6: C∑Y = ∑CY (C = constant). Example: 5∑Y = ∑5Y. Likewise, ∑Y/C = (∑Y)/C. Example: ∑Y/10 = (∑Y)/10.
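The order-of-operations rules can be illustrated with a small Python sketch. The scores in `X` and the constant `C` are made-up values chosen only so the arithmetic is easy to follow.

```python
X = [1, 2, 3, 4]   # hypothetical raw scores
C = 5              # hypothetical constant

sum_x = sum(X)                           # ΣX: add all the scores
sum_x_minus_2 = sum(x - 2 for x in X)    # Σ(X-2): parentheses first, then sum
sum_of_squares = sum(x ** 2 for x in X)  # ΣX²: square each score, then sum
square_of_sum = sum(X) ** 2              # (ΣX)²: sum inside parentheses, then square

print(sum_x, sum_x_minus_2, sum_of_squares, square_of_sum)   # 10 2 30 100

# Rule 6: a constant can be moved inside or outside the summation.
assert C * sum(X) == sum(C * x for x in X)   # 5ΣX = Σ5X = 50
```

Note that ΣX² and (ΣX)² give different answers here (30 versus 100), which is why the order of the rules matters.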

Susan's deviation score on the statistics exam is +10. What does +10 mean?

Susan's score (X) is 10 points above the mean.

Frequency (f)

The number of cases associated with each score. Σf = N (the sum of the frequencies must equal the number of individuals in the sample).

Cumulative percentage (c%) this is the same thing as percentile rank

The percentage of cases (frequencies) at and below the upper real limit of each class interval; that is, the percentage of cases falling at or below a given point on the measurement scale. c% = (cf/N)(100). Examples: a c% of 10 means 10% of the students scored 1.5 or below; a c% of 70 means 70% of the students scored 3.5 or below.

Percentage

The percentage of the total group that is associated with each score. It is p multiplied by 100. Σ% = 100 (the sum of all percentages must equal 100). Proportion and percentage are often called relative frequency. Relative frequencies are particularly helpful when comparing frequency distributions in which the number of cases differs.

Sampling Error

The discrepancy or amount of error that exists between a sample statistic and the corresponding population parameter. a. If the instructor is interested in finding out the average number of hours all CMU students study per week, she should sample participants from various departments. The better the sample represents the population of interest, the less sampling error we can expect! b. By using only CMU students who take PSY 211, the sampling error becomes large because the sample she used is probably a biased sample. Why might this be the case? - Only one geographical region is covered - PSY 211 is a technical course for which students may study a lot

Descriptive Statistics

The goal of descriptive statistics is to summarize or describe a collection of data in a clear and understandable way. For example, assume the PSY 211 instructor asks each individual in her class the number of hours they spend studying per week. She then calculates the average and finds that students study approximately 5 hours per week. This finding can be used to summarize and describe the students' study habits in a clear and understandable way.

Inferential statistics

The goal of inferential statistics is to draw inferences about a population from a sample. For example, the Psy 211 instructor would like to find out if the average she obtained (5 hours a week) in her class is significantly different from the average that was obtained from all CMU students.

The experimental method

The goal of this method is to establish a cause and effect relationship between two variables; that is to show that changing the value of one variable causes changes to occur in a second variable. To achieve this goal, the experimental method has two characteristics that differentiate experiments from other types of research studies.

Experimental Group

The group of participants that gets exposed to the treatment

Control Group

The group of participants that receives either no treatment or a Placebo treatment

Balance point

The mean is the balance point for the distribution. The total distance of scores above the mean is ALWAYS the same as the total distance of scores below the mean. Ignoring the sign (+,-) the distance is 7.

Rule 2: characteristics of the mean Introducing a new score or removing a score will usually, not always, change the mean. The exception is when the new or removed score is equal to the mean.

The mean of the first quiz among 15 students was 10. Jane missed the test, took a make-up test later, and received a score of 5. What will be the new value for the mean after Jane's score is added to the other scores? 10 = ΣX/15, so ΣX = 150. New ΣX = 150 + 5 = 155. New mean = 155/16 = 9.69.

Rule 3 Characteristics of the mean: If a constant value is added to (or subtracted from) every score in a distribution, that same value will be added to (or subtracted from) the mean.

The mean score on the stats exam was 89. The instructor decided to add 2 bonus points to each student's score because the exam was too difficult. What is the new overall class mean? 89 + 2 = 91
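Rules 3 and 4 (adding a constant to every score, or multiplying/dividing every score by a constant) can be verified numerically. The four scores below are invented for illustration; any data set would behave the same way.

```python
from statistics import mean

scores = [2, 4, 6, 8]                 # hypothetical scores, mean = 5

shifted = [x + 2 for x in scores]     # Rule 3: add 2 to every score
scaled = [x / 2 for x in scores]      # Rule 4: divide every score by 2

print(mean(scores))    # original mean
print(mean(shifted))   # original mean + 2
print(mean(scaled))    # original mean / 2
```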

Mode

The most frequently occurring value or category in a distribution. In a graph it is the value that shows the highest peak (the category with the tallest bar). Can be used for all levels of measurement: nominal, ordinal, and interval/ratio scales.

Cumulative frequency (cf)

The number of cases (frequencies) at and below the upper real limit of each class interval. For example: a cf of 6 means 6 students scored 2.5 or below; a cf of 19 means 19 students scored 4.5 or below.
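As a rough sketch, the columns of a frequency distribution table (f, p, %, cf, c%) can be built in Python as follows. The ten scores are hypothetical, and the upper real limit assumes scores measured in whole numbers.

```python
from collections import Counter

scores = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]   # hypothetical sample
n = len(scores)
freq = Counter(scores)                    # f for each score

rows = []
cf = 0                                    # running cumulative frequency
for x in sorted(freq):
    f = freq[x]
    cf += f
    rows.append({
        "X": x,
        "f": f,                           # frequency
        "p": f / n,                       # proportion, p = f/n
        "%": 100 * f / n,                 # percentage = p * 100
        "cf": cf,                         # cases at or below the upper real limit
        "c%": 100 * cf / n,               # cumulative percentage = (cf/N)(100)
        "upper_real_limit": x + 0.5,      # assumes whole-number scores
    })

for row in rows:
    print(row)
```

The last row always has cf = n and c% = 100, and the f, p, and % columns sum to n, 1, and 100 respectively.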

Manipulation

The researcher manipulates one variable ( independent variable) by changing its value from one level to another. The second variable (dependent variable) is observed to determine whether the manipulation causes changes to occur.

Median

The value that divides a distribution exactly in half; the middle value: 50% of the scores fall above the median and 50% of the scores fall at or below the median. (The median is equivalent to the 50th percentile.) Can be used for ordinal and interval/ratio scale data, NOT nominal scale data. To compute the median, list all the scores from lowest to highest and apply the following criteria: If n is an odd number, the median is the middle score/category in the list. If n is an even number, the median is the point halfway between the two middle scores: select the middle pair of scores, add them, and divide the sum by 2.
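The odd/even procedure for the median translates directly into a short Python function. This is a minimal sketch for simple lists of scores; the data are the x and y distributions used in another card of this set.

```python
def median(scores):
    """Median by the odd/even rule: sort, then take the middle value(s)."""
    ordered = sorted(scores)          # list scores from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the middle score
        return ordered[mid]
    # even n: halfway between the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 6, 1, 4, 10]))       # 4
print(median([2, 10, 4, 7, 3, 20]))   # 5.5
```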

The weighted mean

The weighted mean is the mean that is computed with extra weight given to a group that has a larger sample size when the two or more group means are combined.
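A minimal Python sketch of the weighted mean, using the combined-sample problems from this set. The function name `weighted_mean` and the (n, mean) pair format are choices made for this illustration.

```python
def weighted_mean(groups):
    """Combine group means; `groups` is a list of (n, mean) pairs."""
    total = sum(n * m for n, m in groups)    # ΣX1 + ΣX2 + ...
    count = sum(n for n, _ in groups)        # n1 + n2 + ...
    return total / count


print(weighted_mean([(15, 30), (5, 24)]))             # 28.5
print(weighted_mean([(15, 30), (15, 24)]))            # 27.0 (equal n: same as (30 + 24)/2)
print(round(weighted_mean([(20, 80), (40, 70)]), 2))  # 73.33
```

When the groups have equal sizes the weights cancel, which is why the simple average of the two means gives the same answer in the second call.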

Continuous variable

There are an infinite number of possible values between any two observed values; a continuous variable is divisible into an infinite number of parts. Examples: weight, height, age, time to complete a test.

Correlational Method

Two variables are observed in a natural setting to determine whether there is a relationship between them. A. No manipulation B. No systematic Control C. Just examine the relationship in a natural setting D. As no variables are manipulated and controlled, a cause- effect relationship cannot be established.

There are five students in a class with the class mean of 8. If james' score is 5 points above the mean, Jill's score is 1 point above the mean, Jane's score is 10 points below the mean and Jims is right at the mean, what would Jerry's score (x) be?

Using the Σ(X - μ) = 0 rule, with M = 8 (the class mean is 8): 5 + 1 - 10 + 0 + (X_Jerry - M) = 0, so -4 + (X_Jerry - M) = 0 and (X_Jerry - M) must be 4. Then X_Jerry - 8 = 4, so X_Jerry = 12. Jerry's score is 4 points above the mean, so Jerry's score (X) would be 12.

Deviation score

What it means: How far does an individual's score deviate from the mean? What is the distance between an individual's score and the mean? Population: X - μ. Sample: X - M. Positive deviation score (+) = the individual's score is above the mean. Negative deviation score (-) = the individual's score is below the mean. The sum of the deviation scores is always zero.

Real Limits Chapter 2

When a continuous variable is measured, the resulting measurements correspond to intervals on the number line, not just a single point. Real limits represent the boundaries of intervals for scores on a continuous number line. The real limit separating two adjacent scores is located exactly halfway between the scores. Upper real limit = at the top of the interval (whole number + .5). Lower real limit = at the bottom of the interval (whole number - .5).
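Real limits can be computed as the score plus or minus half of the measurement unit. The helper below is a hypothetical sketch; the `unit` parameter (the precision of measurement) is an assumption added so that the same rule covers both whole-number scores and the "nearest 0.2 inch" problem from this set.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits: score -/+ half the measurement unit."""
    half = unit / 2
    return score - half, score + half


print(real_limits(4))                       # (3.5, 4.5) for whole-number scores

lower, upper = real_limits(68, unit=0.2)    # height measured to the nearest 0.2 inch
print(round(lower, 1), round(upper, 1))     # 67.9 68.1
```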

Real Limits

When a continuous variable is measured, the resulting measurements correspond to intervals on the number line, not just a single point. Real limits represent the boundaries of intervals for scores on a continuous number line. The real limit separating two adjacent scores is located exactly halfway between the scores

∑ ("Sigma")

Addition (summation): Most statistical calculations require adding a set of scores. ∑ ("sigma") means add all the scores. ∑ is always followed by a symbol or mathematical expression. 1) ∑X: add all the scores of the X variable. 2) ∑(X-2): subtract 2 from each X score and then sum.

The mean uses all the scores in the data, so it is the best measure of central tendency for skewed data

false. Median is the best index when the distribution is skewed

Pretest/ post test

Individuals are measured at two points in time. The researcher cannot control the influence of the passage of time.

Variability

A measure of the dispersion or spread of scores in a distribution. It provides a numeric index of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together. It ranges from 0 to infinity.

Dependent Variable (DV)

is the outcome that is measured. It is the presumed "effect" in the cause and effect relationship that the experimental method seeks to demonstrate. In a study attempting to demonstrate that a particular drug is effective in relieving anxiety, the anxiety is the dependent variable.

standard deviation

It measures the typical (standard, average) distance of scores from the mean. It is the most commonly used and most important measure of variability. It ranges from 0 to infinity. It determines whether the scores are generally near or far from the mean.
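The card gives no formula, so the sketch below is only an assumption about how the standard deviation might be computed: it uses `statistics.pstdev` from Python's standard library, which treats the data as a whole population (dividing by N). For a sample, `statistics.stdev` divides by n - 1 instead. The scores are invented for illustration.

```python
from statistics import mean, pstdev

spread_out = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, mean = 5
identical = [5, 5, 5, 5]                # no variability at all

print(mean(spread_out), pstdev(spread_out))   # typical distance from the mean is 2.0
print(pstdev(identical))                      # 0.0: every score equals the mean
```

A standard deviation of 0 is the lower bound: it occurs only when all scores are the same.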

Unimodal distribution

one mode (peakness)- only one score that occurs the most frequently

Methods of control

Random assignment of subjects. Matching of subjects. Holding the level of some potentially influential variables constant.

Control

rules out influence of other variables

The mean and median have the same values, so the distribution is probably symmetrical

true

Constructs

Unobserved hypothetical concepts that are used to help describe and explain behavior (IQ, anxiety, depression).

Find the median for x and y distributions x: 3,6,1,4,10 y: 2,10,4,7,3,20

x: 1, 3, 4, 6, 10; median = 4. y: 2, 3, 4, 7, 10, 20; median = (4 + 7)/2 = 5.5

What is an example of a deviation score?

Σ(X - μ) = 0
Name   X    μ    X - μ   Meaning
Tom    1    5    -4      Tom's score of 1 is 4 points below the mean
Matt   2    5    -3      Matt's score of 2 is 3 points below the mean
Ama    6    5    +1      Ama's score of 6 is 1 point above the mean
Kim    6    5    +1      Kim's score of 6 is 1 point above the mean
Brad   10   5    +5      Brad's score of 10 is 5 points above the mean
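The deviation scores in this example can be recomputed with a few lines of Python, which also confirms that they sum to zero. The dictionary layout is just one convenient way to hold the five scores.

```python
scores = {"Tom": 1, "Matt": 2, "Ama": 6, "Kim": 6, "Brad": 10}

mu = sum(scores.values()) / len(scores)                   # mean = 25/5 = 5.0
deviations = {name: x - mu for name, x in scores.items()}

for name, d in deviations.items():
    direction = "above" if d > 0 else "below" if d < 0 else "at"
    print(f"{name}: deviation {d:+.0f} ({abs(d):.0f} points {direction} the mean)")

# The positive and negative deviations cancel out exactly.
assert sum(deviations.values()) == 0
```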

