Making Sense of Statistics
What is the symbol for the coefficient of determination?
(lowercase italicized) r squared.
Which one of the following indicates the strongest relationship? .68,.77,-.98,.50
-.98, because strength is judged by how close the value is to either extreme (-1.00 or 1.00), regardless of the sign.
Which one of the following indicates the weakest relationship? .93,-.88,-.95,.21
.21, because it is closest to 0, which indicates the complete absence of a relationship.
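The two comparisons above can be sketched in Python by comparing absolute values. This is only an illustration; the function names `strongest` and `weakest` are made up for this example.

```python
def strongest(coefficients):
    """Return the coefficient farthest from 0 (closest to -1.00 or 1.00)."""
    return max(coefficients, key=abs)


def weakest(coefficients):
    """Return the coefficient closest to 0."""
    return min(coefficients, key=abs)


print(strongest([.68, .77, -.98, .50]))  # -0.98
print(weakest([.93, -.88, -.95, .21]))   # 0.21
```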
When r=.30, what percentage of the variance on one variable is accounted for by the variance on the other?
.30 X .30 = .09, and .09 X 100 = 9%
When r=.30, what percentage of the variance on one variable is NOT accounted for by the variance on the other?
.30 X .30 = .09, and .09 X 100 = 9%, meaning 91% is NOT accounted for.
Suppose a researcher found a value of R of .40 for predicting the scores on variable Y from variables X and Z. Expressed as a percentage, what is the amount of variance in variable Y accounted for by the variance in the combination of variable X and Z?
.40 X .40 = .16, and .16 X 100 = 16% (R is squared and interpreted the same way as lowercase r)
When the Pearson r = .40, is the percentage accounted for equal to 40%? Explain
.40 X .40 = .16, and .16 X 100 = 16%, so the answer is "NO" because .40 has to be squared before converting to a percentage.
When r=.50, what is the value of the coefficient of determination?
.50 X .50 = .25
When r=.50, what percentage of the variance on one variable is accounted for by the variance on the other?
.50 X .50 = .25, and .25 X 100 = 25%
When r=.50, what percentage of the variance on one variable is NOT accounted for by the variance on the other?
.50 X .50 = .25, and .25 X 100 = 25%, meaning 75% is NOT accounted for.
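The arithmetic in the cards above (square r, then multiply by 100) can be sketched as a small Python helper. The function name `variance_explained` is hypothetical and used only for illustration.

```python
def variance_explained(r):
    """Return (percent accounted for, percent NOT accounted for) for a given r."""
    r_squared = r * r             # coefficient of determination
    explained = r_squared * 100   # convert to a percentage
    return explained, 100 - explained


for r in (.30, .40, .50):
    explained, unexplained = variance_explained(r)
    print(f"r = {r:.2f}: {explained:.0f}% accounted for, "
          f"{unexplained:.0f}% not accounted for")
```

The same helper applies to a multiple correlation coefficient R, since R is squared in the same way.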
Which of the following values of R represents the strongest relationship? A) R=.45 B) R=.12 C) R=.66
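C) R=.66, because it is the largest value. (R ranges from 0.00 to 1.00, so it does not indicate whether a relationship is direct or inverse.)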
If there is no relationship between two sets of scores, what is the value of the correlation coefficient?
0
When a relationship is perfect, what is the percentage of explained variance?
100%
What is the mode of the following scores? 11,13,16,16,18,21,25
16
What is the outlier in the following set of scores? 2,31,33,35,36,38,39
2
Suppose you read that 20% of a population of 1000 was opposed to a city council resolution. How many were opposed?
200
If 21% of kindergarten children are afraid of monsters, how many out of each 100 are afraid?
21
Suppose M=30 and S=3 for a normal distribution of scores. What percentage of the cases lies between scores 27 and 30?
34%. In a normal distribution, about 68% of the cases lie within one standard deviation of the mean (the 68% rule), with half on each side. Since 30 - 3 = 27, the score of 27 is one standard deviation below the mean, so about 34% (half of 68%) of the cases lie between 27 and 30.
If you read that the median equals 42 on a test, what percentage of the participants have scores higher than 42?
50%
What percentage of the cases in a normal curve lies within one standard deviation unit of the mean?
68% (34% on the left and 34% on the right)
Suppose M=80 and S=10 for a normal distribution of scores. About 68% of the cases lie between what two scores?
70 and 90 (10 on the left and 10 on the right)
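As a rough check of the 68% rule used in the cards above, Python's standard `statistics.NormalDist` can compute the percentage of cases between two scores. The helper name `percent_between` is made up for this sketch.

```python
from statistics import NormalDist


def percent_between(mean, sd, low, high):
    """Percentage of cases in a normal distribution between two scores."""
    dist = NormalDist(mu=mean, sigma=sd)
    return (dist.cdf(high) - dist.cdf(low)) * 100


print(round(percent_between(30, 3, 27, 30)))   # 34
print(round(percent_between(80, 10, 70, 90)))  # 68
```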
What is the outlier in the following set of scores? 50,50,52,53,56,57,75
75
What is the name of the type of study in which all members of a population are included?
A census
What does a frequency polygon show?
A line graph in which the horizontal axis shows the scores (X) and the vertical axis shows the frequencies (f). It shows how many participants have each score. Good for smaller groups, since its points are connected with straight lines; smooth curves are reserved for larger populations and don't work as well with smaller numbers.
While scattergrams are seldom presented in research reports, they are useful for obtaining what?
A general overview of the correlational relationship
What does a Pearson r of -1.00 indicate?
A perfect inverse relationship
What term is used to refer to all members of a group in which a researcher has an interest?
A population
Suppose a researcher is planning to conduct a study on attitudes on a controversial topic and expects a wide degree of variation. Given that a wide degree of variation is expected, should the researcher use "a relatively large sample" or "a relatively small sample"?
A relatively large sample.
What is the name for a subset of a population?
A sample
Suppose a researcher is examining the validity of a set of scores on an oral language test to predict a set of scores that first-graders will earn on a beginning reading test. Which correlation statistic should the researcher compute for his research problem? A) r or B) R
A) r, as there are only two variables (one predictor and one outcome).
What type of study is needed in order to identify cause-and-effect relationships?
An experiment
Which of the following is the best single predictor of Algebra grades? A) Basic math Scores B) Attitude towards math scores
Attitude towards math
The term "measure of central tendency" is synonymous with what other term?
Average
Suppose a researcher is examining the validity of a combination of the length of engagement and the number of hours in premarital counseling as predictors of subsequent marital satisfaction. What correlation statistic should the researcher compute for this research problem? A) r or B) R
B) R, because two predictors are combined to predict a third variable.
The most important characteristic of a good sample is that it is free from what?
Bias
What is the name of the type of distribution that has two high points?
Bimodal - usually an event interrupted the frequency.
Which type of distribution is found much less frequently in research than the others?
Bimodal because it usually signifies an unusual event during gathering of information which affected the data.
How is a "mean" computed?
By summing the scores and dividing by the number of scores.
What is the symbol for the NUMBER of participants in a study?
Capital italicized N
What does a Pearson r of 0.00 indicate?
Complete absence of a relationship.
What is the general term that refers to the extent to which two variables are related across one group of participants?
Correlation
The observations researchers make result in what?
Data
In an experiment, are the responses the "independent variable" or the "dependent variable"?
Dependent
In an experiment, a researcher administered various dosage levels of aspirin to different groups of participants in order to determine the effects of the various dosage levels on heart attack rates. In this study, "heart attack rates" is the....
Dependent Variable
What is the purpose of correlation statistics?
Describes the relationship between 2 or more variables for one group of participants.
Is an average a "descriptive statistic" or an "inferential statistic"?
Descriptive
Is the range of a set of scores a "descriptive statistic" or an "inferential statistic"?
Descriptive
If two people have the highest math scores AND the two highest algebra scores, what kind of relationships does this suggest?
Direct
Inspection (without any computations) of the attitude-towards-math scores and the algebra grades (both high) suggests that the correlation is A) Direct or B) Inverse
Direct
Is the relationship between the scores on Test C and Test D "direct" or "inverse". Vertical 2 Column i.e. Test C: 1,2,4,6,8,9 Test D: 33,38,40,45,52,57
Direct because the higher the score on Test C, the higher the score on test D
What is the formula to determine a percentage?
Divide the part by the total and then multiply by 100. e.g. Got 60 out of 80 on a test. 60 divided by 80 times 100 = 75. I got a C.
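The percentage formula above, written as a short Python sketch (the function name `percentage` is just an illustrative choice):

```python
def percentage(part, total):
    """Divide the part by the total, then multiply by 100."""
    return part / total * 100


print(percentage(60, 80))  # 75.0
```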
When researchers systematically use the empirical approach to acquire knowledge, we say they are engaging in what?
Empirical Research
In which type of study are treatments given in order to see how participants respond?
Experiment
"Perfect relationships are frequently found in the social and behavioral sciences". True or false?
False
"For populations with very limited variability, only very large samples can yield precise results." Is this statement "true" or "false"?
False.
"Only very large samples can identify very large group differences" Is this statement "true" or "false"?
False.
What is the most important characteristic of a good sample?
Freedom from bias
lowercase italicized f means
Frequency - the number of cases producing a certain result from a study
What is the name of a table that shows how many participants have each score?
Frequency Distribution. i.e. (Italicized lowercase f - frequency) which is the number of participants associated with each score (italicized upper X). Italicized upper N is the total number of participants.
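A frequency distribution like the one described above can be tallied with Python's `collections.Counter`. This is a minimal sketch that reuses the seven scores from the mode question earlier in this set.

```python
from collections import Counter

scores = [11, 13, 16, 16, 18, 21, 25]
frequencies = Counter(scores)  # maps each score (X) to its frequency (f)

print("X   f")
for score in sorted(frequencies):
    print(f"{score:<3} {frequencies[score]}")
print("N =", sum(frequencies.values()))
```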
Will the scores for "Group D" or "Group E" below have a larger standard deviation if the two standard deviations are computed? Group D: 23,23,24,25,27,27,27 or Group E: 10,19,20,21,25,30,40
Group E - larger spread.
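To check the answer above numerically, the standard deviations of both groups can be computed in Python. This sketch assumes the formula that divides by N (`statistics.pstdev`); a formula that divides by N - 1 (`statistics.stdev`) gives slightly larger values but the same ordering here.

```python
from statistics import pstdev

group_d = [23, 23, 24, 25, 27, 27, 27]
group_e = [10, 19, 20, 21, 25, 30, 40]

sd_d = pstdev(group_d)  # standard deviation dividing by N
sd_e = pstdev(group_e)

print(f"Group D: S = {sd_d:.2f}")
print(f"Group E: S = {sd_e:.2f}")
print("Larger spread:", "Group E" if sd_e > sd_d else "Group D")
```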
If you read the following statistics in a research report, which group should you conclude has the greatest variability? Group F: M=30.23, S=2.14 or Group G: M=25.99, S=3.01 or Group H: M=22.43, S=4.79
Group H - 4.79 is the greatest
A direct relationship was found between scores on a reading test and a vocabulary test. This indicates that those who scored high on the reading test tended to have what kind of score on the vocabulary test?
High score
Which type of planning involves constructing or selecting measuring instruments?
How
What is the basic way to increase precision?
Increase sample size.
In an experiment, a researcher used group counseling with some participants and used individual counseling with other participants in order to study the effectiveness of the two types of counseling on raising the participants' self esteem. In this study, the two types of counseling constitute the
Independent Variable
Is a margin of error a descriptive statistic or an inferential statistic?
Inferential, because it indicates how much confidence we can have when generalizing from the sample to the population.
Which two scales of measurement indicate the amount by which participants differ from each other?
Interval and Ratio
For which scales of measurement is the mean appropriate?
Interval and ratio
When the dots on a scattergram form a pattern going from the upper left to the lower right, what type of relationship is indicated?
Inverse
Is the relationship between the scores on Test A and Test B "direct" or "inverse". Vertical 2 Column i.e. Test A: 20,30,40,50,60,70 Test B: 600,500,400,300,200,100
Inverse because the higher the score on test A, the lower the score on Test B
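The direction of the relationships in the Test A/B and Test C/D cards can be confirmed by computing a Pearson r. The sketch below uses the deviation-score form of the formula, which is an assumption about presentation and may differ from the computational formula in the book; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for paired scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


test_a = [20, 30, 40, 50, 60, 70]
test_b = [600, 500, 400, 300, 200, 100]
test_c = [1, 2, 4, 6, 8, 9]
test_d = [33, 38, 40, 45, 52, 57]

print(round(pearson_r(test_a, test_b), 2))  # -1.0 (perfect inverse)
print(pearson_r(test_c, test_d) > 0)        # True (direct)
```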
What is the weakness of the range?
It is based on only the two most extreme scores, which may not accurately reflect the true variability in the entire group.
Which scale of measurement is between the ordinal and the ratio scales?
Interval
Why is it a good idea to report the underlying frequencies when reporting percentages?
A percentage alone does not tell us how many cases it is based on. We need to know the number of participants (N) in order to put it in context. Saying 8% of students are high doesn't tell us much. How many were tested? If we say that 100 were tested, then we know that 8 out of the hundred were high.
There is a positive relationship between the scores on Tests E and F. Which participant is an exception to the rule? Explain why he or she is an exception. Vertical 2 Column i.e. Test E: 1050,2508,2702,3040,5508,5567 Test F: 160,169,184,205,90,210
Leona is the exception because the relationship is direct (the higher the Test E score, the higher the Test F score) except for her. She scores higher on Test E than 4 other people, yet she has the lowest Test F score of everyone. Her scores do NOT fit the pattern.
If the differences among a set of scores are small, this indicates which of the following? Much or little variability?
Little
What are the most commonly used symbols for the mean in academic journals?
M and m
If the differences among a set of scores is great, do we say that there is "much variability" or "little variability"?
MUCH variability
When a large number of cases are examined and a positive relationship is found, what else should one expect to find?
Many exceptions to the positive relationship. LARGE is the key word: with many cases, there are more chances for individuals who do not fit the overall trend.
In a distribution with a positive skew, does the "mean" or the "median" have a higher value?
Mean, because it is pulled toward the extreme high scores in the right tail, so it is further up the number line.
What is the name of the group of statistics designed to concisely describe the amount of variability in a set of scores?
Measures of variability.
Which average always has 50% of the cases below it?
Median
Which average is defined as the middle point in a distribution?
Median
In a distribution with a negative skew, does the "mean" or the "median" have a higher value?
Median. The keyword is HIGHER. In a negative skew, the extreme low scores in the left tail pull the mean DOWN the number line, so the median is HIGHER on the number line.
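The effect of skew on the mean and median can be seen with two small sets of made-up scores (both lists below are hypothetical and chosen only to illustrate the pattern):

```python
from statistics import mean, median

negative_skew = [20, 95, 96, 97, 98, 99, 100]  # one extreme low score
positive_skew = [5, 6, 7, 8, 9, 10, 95]        # one extreme high score

for label, scores in (("negative", negative_skew), ("positive", positive_skew)):
    print(f"{label} skew: mean = {mean(scores):.2f}, median = {median(scores)}")
```

In the negatively skewed set the median is higher than the mean; in the positively skewed set the mean is higher than the median.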
Which average is defined as the most frequently occurring score?
Mode
Suppose you draw a random sample of 20 hospitals from the population of hospitals in the United States, then draw a random sample of maternity wards from the 20 hospitals, and then draw a random sample of patients in the maternity wards previously selected. What type of sampling are you doing?
Multistage Random
What is the most important type of curve?
The NORMAL curve, sometimes called the bell curve. Its high point is in the middle, showing that most participants fall in the middle range. e.g. Heights of women in large groups are normally distributed: few are very short, few are very tall, and most are in the middle range. The normal curve is the one most often found in nature.
Italicized capital N means
NUMBER of participants total in a study
What is another name for an inverse relationship?
Negative relationship
Suppose that on a 100-item multiple-choice test, almost all students scored between 95 and 100 but a small scattering scored as low as 20. When plotted on a curve, the distribution will show what type of skew?
Negative. Tail is to the left. Most people scored high on the number line which is where the top of the curve is.
Are the data that researchers collect always "scores"?
No
Does stratification eliminate all sampling errors?
No
Consider a value of .50. Would it be appropriate to multiply this value by 100 and to interpret it as representing 50%?
No. A correlation coefficient is not a proportion, so .50 does not mean half. It has to be squared before converting to a percentage (.50 X .50 = .25, or 25%).
Consider a value of r of .65. According to this section, would it always be appropriate to characterize the relationship as being "very strong"?
No. What counts as strong depends on the context, such as the variables being studied and the values typically found for them.
Is the mean usually appropriate for describing the average of a highly skewed distribution?
No. It is pulled in the direction of the extreme scores, so it doesn't really reflect most of the cases.
Are inverse relationships always weak?
No. Direction and strength are separate. An inverse relationship is strong when r is close to -1.00, e.g. -.95.
As a general rule, is the range appropriate for describing a distribution of scores with outliers?
No. Because the range is based only on the two most extreme scores, a single outlier can greatly inflate it, so it does not reflect the variability of most of the scores.
What phrase should you memorize in order to remember the scales of measurement (In order).
"No Oil In Rivers": Nominal, Ordinal, Interval, Ratio
Is correlation a good way to determine cause-and-effect?
No, because a correlation shows that two variables are related but doesn't explain WHY. An exception is a great example: if everyone in a group fits a direct relationship (higher score on one variable, higher score on the other) except for one person, the correlation cannot tell us why that person is the exception. You would need to run an experiment to determine cause and effect.
Is the mean appropriate for describing highly skewed distributions?
No, because it is pulled in the direction of the extreme scores in the tail.
Does using a large sample correct for bias?
No.
Is the interquartile range unduly affected by outliers?
No. It is based on the middle 50% of the scores, so outliers at either end are ignored.
Is all research in which biased samples are used worthless?
No. There are many situations in which researchers have no choice BUT to use biased samples for ethical and legal reasons, e.g. they can't make people smoke to see the results. The results should be interpreted with caution, and using a larger sample does not reduce the bias.
Is selecting a large sample an effective way to reduce the effects of bias in sampling?
No. A large biased sample is still biased. Increasing the sample size reduces random sampling error, not bias.
What is the name of the lowest scale of measurement?
Nominal
Which level of measurement should be thought of as the "naming" level?
Nominal
If you ask participants to name the country they were born in, which scale of measurement are you using?
Nominal as it just gives the name. Nominal is the lowest level of measurement.
This is a guideline from this section: "Choose the "median" when the "mean" is inappropriate". What is the exception to this guideline?
Nominal data, because it is named data and it has no order. We would choose the mode in this instance and see what the most repeated answer is.
In which type of study do researchers try NOT to change the participants?
Nonexperimental
Which type of distribution is often found in nature?
Normal curve. Most living things function in the middle range.
Inspection (without any computations) of the attitude-towards-math scores and the algebra grades suggests that the correlation is A) Perfect or B) Not Perfect
Not Perfect
What type of scattergram has the greatest amount of scatter?
One that has many deviations from the overall trend, indicating very little correlation (a very low Pearson r).
Which scale of measurement puts participants in rank order?
Ordinal
If you rank participants from most cooperative to least cooperative, which scale of measurement are you using?
Ordinal, because it ranks participants from highest to lowest.
If samples yield statistics, what do populations yield?
Parameters
What is a proportion?
Part of one (1). Proportions are less preferred than percentages, e.g. .2619 is about twenty-six hundredths, which is messier to read than 26%.
In recent decades, researchers have increasingly used what term to refer to the individuals being studied?
Participants (implies they willingly participated)
What is the full name of the Pearson r?
Pearson product-moment correlation coefficient
Which is easier to interpret - "percentages" or "proportions"?
Percentages, because proportions deal in hundredths and get messy.
For describing nominal data, what is an alternative to reporting the mode?
Percentages. i.e. there are more registered democrats than republicans. To state only that the modal political affiliation is Democrat is much less informative than reporting percentages.
Even the best plans for research often cannot be fully executed for physical reasons. According to this section of the book, what are some of the other reasons for this?
Ethical and financial reasons (in addition to physical ones)
In a distribution with a negative skew, is the long tail pointing to the "left" or to the "right"?
Points to the left. High point of the curve is to the right.
When plotted, income in large populations usually has what type of skew?
Positive skew, meaning the tail is pointing to the right. As we go UP the number line to the right, FEWER people have incomes that high, so the curve tails off the further up we go.
Suppose that a broad cross section of high school students took a very difficult scholarship examination and almost all scored very low, but a very small number scored very high. When plotted on a curve, the distribution will show what type of skew?
Positive. Tail is to the right. Most scored low on the number line.
What statistic is part of one (1)?
Proportion
Briefly describe how one could select a simple random sample?
Put the names of every member of the population in a giant hat, mix them, and have a blindfolded person pick out the desired number of names.
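The "names in a hat" procedure can be imitated in Python with `random.sample`, which draws without replacement so that every member has an equal chance of selection. The population of 60 numbered people and the sample size of 10 are hypothetical values chosen for this sketch.

```python
import random

# Hypothetical population of 60 named members.
population = [f"Person {i}" for i in range(1, 61)]

rng = random.Random(42)  # seeded only so the example is repeatable
sample = rng.sample(population, k=10)  # draw 10 names without replacement

print(sample)
```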
What is the definition of "median"
Put the numbers in order from lowest to highest. The median is the middle point in a distribution. With an odd number of scores, it is the middle score, with an equal number of scores higher and lower. With an even number of scores, it is halfway between the two middle scores. To find the halfway point, sum the two middle scores and divide by 2, e.g. 7+10=17 (17/2=8.5).
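Both cases of the median definition above can be checked with Python's `statistics.median`. The odd-sized list reuses scores from the mode question in this set; the even-sized list is a made-up example whose two middle scores are 7 and 10.

```python
from statistics import median

odd_scores = [11, 13, 16, 16, 18, 21, 25]  # middle score is the 4th of 7
even_scores = [3, 7, 10, 12]               # two middle scores: 7 and 10

print(median(odd_scores))   # 16
print(median(even_scores))  # 8.5
```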
Suppose you draw a sample of 12 of the homerooms in a school district at random and administer a questionnaire to all students in the selected homerooms. What type of sampling are you doing?
Random Cluster
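A minimal Python sketch of random cluster sampling, where whole homerooms (clusters) are drawn at random and every student in a selected homeroom is included. The 40 homerooms of 25 students each are invented for illustration.

```python
import random

# Hypothetical district: 40 homerooms, each with 25 students.
homerooms = {
    f"Homeroom {i}": [f"Student {i}-{j}" for j in range(1, 26)]
    for i in range(1, 41)
}

rng = random.Random(1)  # seeded only so the example is repeatable
chosen = rng.sample(sorted(homerooms), k=12)  # draw 12 clusters at random

# Everyone in a selected cluster receives the questionnaire.
students = [s for room in chosen for s in homerooms[room]]
print(len(chosen), "homerooms,", len(students), "students")
```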
Which scale of measurement has an absolute zero?
Ratio
If you measure the weight of participants in pounds, which scale of measurement are you using?
Ratio, because weight has an absolute zero and we can see the amount by which participants differ in weight.
When studying the incidence of rare phenomena, should researchers use "relatively large samples" or "relatively small samples"?
Relatively large samples.
Suppose that Researcher Doe increased her sample size from 100 to 120, while Researcher Smith increased his sample size from 500 to 520. Which researcher will get a greater increase in precision by increasing the sample size by 20?
Researcher Doe.
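One way to see why Researcher Doe gains more: the standard error of a mean is proportional to 1 divided by the square root of the sample size. That formula is a standard result rather than something stated in these cards, and the helper name below is hypothetical.

```python
from math import sqrt


def percent_error_reduction(old_n, new_n):
    """Percent reduction in the standard error of the mean when n grows."""
    return (1 - sqrt(old_n / new_n)) * 100


doe = percent_error_reduction(100, 120)
smith = percent_error_reduction(500, 520)

print(f"Doe:   {doe:.1f}% smaller standard error")    # 8.7%
print(f"Smith: {smith:.1f}% smaller standard error")  # 1.9%
```

Adding 20 participants helps much more when the starting sample is small, which reflects diminishing returns from increasing sample size.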
If you put the names of all members of a population on slips of paper, mix them, and draw some, what type of sampling are you doing?
Simple Random
What type of sampling eliminates bias in the selection of participants?
Simple random sampling
Do "large values" or "small values" of r shrink more dramatically when squared?
Small values. e.g. .20 squared is .04 (only one-fifth of its original size), while .90 squared is .81 (nine-tenths of its original size).
What are two synonyms for variability?
Spread & Dispersion
In what type of sampling is the population first divided into strata that are believed to be relevant to the variable(s) being studied?
Stratified Random
Suppose you draw at random the names of 5% of the registered voters separately from each county in a state. What type of sampling are you using?
Stratified Random
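Drawing the same percentage separately from each county can be sketched in Python as follows. The county names, voter lists, and the function `stratified_sample` are all hypothetical.

```python
import random


def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction at random, separately, from each stratum."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)
        sample[name] = rng.sample(members, k)
    return sample


counties = {
    "County A": [f"A-{i}" for i in range(200)],
    "County B": [f"B-{i}" for i in range(1000)],
}

drawn = stratified_sample(counties, 0.05)
print({name: len(voters) for name, voters in drawn.items()})
# {'County A': 10, 'County B': 50}
```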
What is the formula to discover the deviation from the mean?
Subtract the mean from each score. This produces the deviation from the mean.
How does the deviation from the mean sum to 0?
Subtract the mean from each score. This produces the deviations from the mean. Add up the deviations and you always get 0 (the negatives cancel the positives), e.g. (-5) + (-1) + (-1) + 2 + 5 = 0
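The deviations in the example above can be reproduced in Python. The five scores are hypothetical, chosen so that the mean is 10 and the deviations are -5, -1, -1, 2, and 5.

```python
scores = [5, 9, 9, 12, 15]        # hypothetical scores with a mean of 10
mean = sum(scores) / len(scores)

# Subtract the mean from each score to get the deviations.
deviations = [x - mean for x in scores]

print(deviations)       # [-5.0, -1.0, -1.0, 2.0, 5.0]
print(sum(deviations))  # 0.0
```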
When the median is reported as the average, it is also customary to report which measure of variability?
The Interquartile range.
What is the definition of the "range"
The difference between the highest number and the lowest number. Subtract lowest from highest.
The term variability refers to what?
The differences among participants
How do statisticians define the term precision?
The extent to which the same results would be obtained if another random sample were drawn from the same population.
The standard deviation provides an overall measurement of how much participants' scores differ from what other statistic?
The mean of their group
A margin of error is reported as a warning to readers that WHAT might have happened?
The random sampling may have produced errors
What is the definition of the interquartile range?
The range of the middle 50% of the participants. This nullifies the influence of the outlier.
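To compare the range and the interquartile range on a set with an outlier, here is a small Python sketch using the scores from the outlier question earlier in this set. Note the assumption: `statistics.quantiles` uses one particular rule for locating quartiles, and textbooks differ on the exact rule, so hand-computed values may vary slightly.

```python
from statistics import quantiles

scores = [2, 31, 33, 35, 36, 38, 39]  # 2 is the outlier

score_range = max(scores) - min(scores)  # highest minus lowest
q1, _, q3 = quantiles(scores, n=4)       # quartile cut points
iqr = q3 - q1                            # range of the middle 50%

print(score_range)  # 37
print(iqr)          # 7.0
```

The outlier stretches the range to 37, while the interquartile range stays close to the spread of the bulk of the scores.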
What is a "skewed distribution"?
A distribution in which a few extreme scores at one end create a long tail. The mean is pulled in the direction of the extreme scores and then does not genuinely reflect the typical score. e.g. 7 kids gave about 21 cents but one kid gave $1.50. The mean in this example does not really tell you that most kids gave about 21 cents.
Why are inferential statistics NOT needed when analyzing the results of a census?
There is no sampling error - the whole population was accounted for.
If all participants have the same score on a test, what should be said about the variability in the set of scores?
There isn't any. NO variability.
What is the purpose of an experiment?
To determine cause and effect relationships.
Inferential Statistics are tools that tell us what?
Tools that tell us how much confidence we can have when generalizing from a sample to a population.
"The smaller the anticipated difference in the population, the larger the sample size should be." Is this statement "true" or "false"?
True.
What is the symbol for the standard deviation when a population has been studied?
Uppercase Italicized S
Empiricism refers to what?
Using direct observation to obtain knowledge
What is the formula to find out the FREQUENCY number the % is referring to using the example that out of 2200 participants, 44% are democrats.
We know that 44% means 44 people out of each hundred are Democrats. To find the ACTUAL number, multiply .44 X 2200: the actual number of Democrats out of 2200 is 968.
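Converting a percentage back to a frequency, as in the answer above, can be sketched in Python (the function name `count_from_percentage` is made up for this example):

```python
def count_from_percentage(percent, total):
    """Number of cases that a percentage of the total represents."""
    return round(percent / 100 * total)


print(count_from_percentage(44, 2200))  # 968
print(count_from_percentage(20, 1000))  # 200
```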
Is the standard deviation a frequently used measure of variability?
Yes
Is it possible for a relationship to be both inverse and strong?
Yes - any negative number that is close to -1.0 is both inverse and strong. i.e. -.93
Is it possible for a relationship to be both direct and weak?
Yes - any positive number that is only a little above 0 is direct AND weak, e.g. .11
Can multiple correlation coefficients be calculated for a combination of more than two predictors?
Yes - so we can increase the percentage of predictability.
If a researcher uses a sample of volunteers from a population, should we presume that the sample is biased?
Yes. Volunteers may differ from those who do not volunteer, and we never know what the nonparticipants would have said. This creates a bias.
Does everyday observation employ the empirical approach? Give an example.
Yes. Noticing a speedtrap cop hiding behind a bush lets us know that if someone runs the stoplight, there is a good chance they will get ticketed.
If there are 60 members of a population and you give them all number names using a 10 line by 10 line random box of numbers, and you've decided each member will have a two number name, and you are moving left to right for each row, what happens when you come across 61 or higher?
You would skip it because there are only sixty members of the population.
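The skipping rule above can be sketched in Python. The row of two-digit numbers is invented, and two details are assumptions rather than something stated in the card: members are numbered 01 through 60, and a number that comes up a second time is also skipped.

```python
def select_from_table(digit_pairs, population_size, sample_size):
    """Walk through two-digit numbers, skipping any that match no member."""
    selected = []
    for pair in digit_pairs:
        number = int(pair)
        if number < 1 or number > population_size:
            continue  # e.g. 61 or higher when there are only 60 members
        if number in selected:
            continue  # assumption: a repeated number is skipped
        selected.append(number)
        if len(selected) == sample_size:
            break
    return selected


row = ["07", "61", "45", "99", "12", "45", "60"]  # hypothetical table row
print(select_from_table(row, population_size=60, sample_size=4))
# [7, 45, 12, 60]
```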
For a given distribution, if you subtract the mean from each score to get the deviations and then sum the deviations, what will the sum of the deviations equal?
Zero
If all the participants in a group have the same score, what is the value of the standard deviation of the scores?
Zero
What is the definition of a nonexperimental study?
a study in which observations are made to determine the status of what exists at a given point in time without the administration of treatments.
What is a correlation coefficient?
a value from -1.00 (a perfect inverse relationship) through 0.00 (no relationship between the variables) to 1.00 (a perfect direct relationship)
For a given value of (lower case italicized) r, how is the value of the coefficient of determination computed?
by squaring the r
"The more scattered the dots are in a scattergram, the stronger the relationship". This statement is true or false?
false
In an inverse relationship, those who tend to score low on one variable tend to have what kind of score on the other variable?
high
In an inverse relationship, those who tend to score high on one variable tend to have what kind of score on the other variable?
low
% means
number per hundred who have a certain characteristic
What is the definition of "mean"?
the balance point in a distribution of scores. Sum the scores and divide by the number of scores.
What does "sum" mean?
total number added up
The amount of random sampling error obtained from unbiased samples tends to be small when what is done?
when large samples are used
Does random sampling produce sampling errors?
yes - by chance alone, a random sample may differ from the population. These are random sampling errors, which inferential statistics help us evaluate.