STATS Robertson Exam 1
The quartile range divides the percentile scale into _____ equal parts that have a range of _____ percentage points.
4 25
Mr. E's physical education class ran as many laps as they could without stopping around a 400-meter track. The following is the distribution of laps completed by the students: 1, 2, 4, 4, 4, 5, 6, 6, 7, 10, 12. What is the mean of the distribution?
5.5
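A quick check of this answer, sketched with Python's standard `statistics` module. The variable names are illustrative only; the lap counts are the ones given in the question.

```python
from statistics import mean, median, mode

# Lap counts from the question above
laps = [1, 2, 4, 4, 4, 5, 6, 6, 7, 10, 12]

lap_mean = mean(laps)      # sum of scores (61) divided by N (11)
lap_median = median(laps)  # middle score of the ordered list
lap_mode = mode(laps)      # most frequently occurring score

print(round(lap_mean, 1), lap_median, lap_mode)
```

The mean is 61/11, which rounds to 5.5; the median and mode are included only for comparison.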
statistics are
characteristics of a sample
When dealing with large data sets, it makes most sense to
create a frequency distribution of the data
A percentile scale that is divided into 10 parts, each having a range of 10 percentage points, is called the
decile range
Distribution 1 has a sum of squared deviations of 1,000 and a sample size of 10. Distribution 2 has a sum of squared deviations of 1,500 and a sample size of 50. Which distribution has more variance?
distribution 1 has more variance
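The comparison can be sketched as follows, assuming the sample variance definition used elsewhere in these notes (sum of squared deviations divided by N minus 1). The function name is hypothetical.

```python
import math

def sample_variance_from_ss(sum_of_squares, n):
    """Sample variance: sum of squared deviations divided by (n - 1)."""
    return sum_of_squares / (n - 1)

var1 = sample_variance_from_ss(1000, 10)  # 1000 / 9
var2 = sample_variance_from_ss(1500, 50)  # 1500 / 49

# The standard deviation is the square root of the variance,
# so the distribution with more variance also has the larger SD.
sd1 = math.sqrt(var1)
sd2 = math.sqrt(var2)

print(round(var1, 1), round(var2, 1))
```

Distribution 2 has the larger sum of squares, but it is spread over far more scores, so its variance is smaller.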
If the raw deviation scores from the mean are added together for a given distribution of numbers and the result is not equal to zero, which of the following is true?
a calculation error was made; raw deviations from the mean always sum to zero, regardless of skew
The coefficient of variation can be found by ________ and multiplying by 100.
dividing the standard deviation by the mean
In a frequency polygon graph the Y-axis represents
frequency of scores
interquartile range
the difference between the raw scores at the 75th and the 25th percentile points
Which of the following defines the range of a data set?
the distance from the highest to the lowest value in a data set
Which of the following would you estimate to have a bimodal distribution?
the distribution for height that includes both males and females
range
the largest score minus the smallest score in a distribution
mean
the arithmetic average; the most commonly used measure of central tendency in statistics
mode
the score that occurs most frequently
standard deviation
the square root of the variance • Should be used to summarize variability in a set of data • Easiest measure of variability to interpret
variance
the sum of squared deviation scores about the mean divided by the number of scores minus 1
cumulative frequency
the sum of the frequencies for that class and all previous classes
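A minimal sketch of a cumulative frequency column, borrowing the lap data from the Mr. E question as an example (the definition itself does not name a data set).

```python
from collections import Counter
from itertools import accumulate

# Lap counts borrowed from the Mr. E question (illustrative choice)
laps = [1, 2, 4, 4, 4, 5, 6, 6, 7, 10, 12]

freq = Counter(laps)                          # frequency (f) of each score (X)
values = sorted(freq)                         # ordered listing of X
frequencies = [freq[v] for v in values]
cumulative = list(accumulate(frequencies))    # running total of f

for x, f, cf in zip(values, frequencies, cumulative):
    print(x, f, cf)
```

The last cumulative frequency always equals N, here 11.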
You are comparing the effects of low-fat, low-carb, and low-protein diets on percent body fat and body mass. The independent variable contains how many levels?
three
Dr. X wishes to test the strength of individuals with Parkinson's disease. Dr. X decides to use the 6-minute walk test as a measure of muscle strength. Dr. Contrarian thinks the 6-minute walk test is a poor choice since it primarily assesses the cardiorespiratory fitness of the individuals, not their strength. Dr. Contrarian's criticism concerns the _____ of the test.
validity -> the soundness or the appropriateness of the test in measuring what it is designed to measure
body mass
variable
The raw deviation scores from the mean are added together for a given distribution of numbers. What is the result?
zero
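This can be verified numerically. The sketch below reuses the lap data from the Mr. E question; floating-point arithmetic may leave a tiny remainder, so the check uses a tolerance.

```python
# Lap counts borrowed from the Mr. E question (illustrative choice)
laps = [1, 2, 4, 4, 4, 5, 6, 6, 7, 10, 12]

m = sum(laps) / len(laps)
deviations = [x - m for x in laps]   # raw deviation of each score from the mean
total = sum(deviations)

# Zero up to floating-point rounding error
print(abs(total) < 1e-9)
```

Because the raw deviations cancel out, they are squared before being summed when computing variance.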
Which of the following would you estimate to have a bimodal distribution?
a distribution that has two modes; the height of offensive linemen alone would not be an example
variability
indices of dispersion • Describes the scatter of scores in a distribution • Describes the spread of the data
ties in scores
• A raw score of 5 is at positions 6, 7, and 8. • Since we define percentile as fraction at or below a given score, use position 8. • 8/15 = 0.53 → all raw scores of 5 are given 53rd percentile.
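The tie rule can be sketched as a small function. The 15 scores below are hypothetical, chosen only so that the score 5 occupies positions 6, 7, and 8 as in the card; the function name is also an assumption.

```python
def percentile_rank(scores, value):
    """Percent of scores at or below `value` (ties take the highest position)."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

# Hypothetical rank order distribution with N = 15;
# the three 5s sit at positions 6, 7, and 8.
scores = [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 8, 8, 9, 10, 12]

rank = percentile_rank(scores, 5)
print(round(rank))
```

Counting every score at or below 5 gives 8 of 15, so all three tied scores receive the 53rd percentile.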
Parameter and Statistics
• Parameter—a characteristic of the population • Statistics—a characteristic of a sample that is used to estimate the value of the population parameter • Sampling error example: • Average body fat percentage of all undergraduate males at KU is 21% • If a random sample of undergraduate males at KU is collected and average is 18% fat • Sampling error = 3% fat
mean
• The most sensitive of the central tendency indices • Mean is affected by every score in the distribution. • Outliers have a greater effect on mean than other measures of central tendency. • Used for subsequent calculations of statistical inference • Useful for interval and ratio data but is arguably not appropriate for ordinal data
Ten sprinters ran a 100-meter dash. The average time the sprinters took to complete the 100-meter dash was 11.7 seconds and the standard deviation was 1.2 seconds. What is the coefficient of variation?
10
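The arithmetic, as a short sketch using the formula from the coefficient of variation card (standard deviation divided by the mean, times 100). The function name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

cv = coefficient_of_variation(sd=1.2, mean=11.7)
print(round(cv, 1))
```

The unrounded value is about 10.3%, which the answer above rounds to 10.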
It was determined that the variance for a distribution of eight bench press test scores was 100. What is the standard deviation of the data set?
10
In a grouped frequency distribution it is common to group scores into ______ groups.
15
Mr. E's physical education class ran as many laps as they could without stopping around a 400-meter track. The following is the distribution of laps completed by the students: 1, 2, 4, 4, 4, 5, 6, 6, 7, 10, 12. What is the mode of the distribution?
4
Distribution 1 has a sum of squared deviations of 1,000 and a sample size of 10. Distribution 2 has a sum of squared deviations of 1,500 and a sample size of 50. Which distribution has a greater standard deviation?
Distribution 1 has a greater standard deviation.
In order to find variance, add up the raw deviation scores from the mean and divide by the number of scores.
False
Standardizing scores does not allow comparisons between two sets of scores with different units.
False
There can be only a single mode in any given distribution
False
normal curve
Gaussian curve (after Karl Gauss), bell-shaped curve • Symmetrical • Mean, median, mode are all same value (the middle) • Frequency of scores declines in a predictable manner as scores deviate farther and farther from the center
Hypothesis
Must be testable • Must be falsifiable
rank order distribution
Ordered listing of the data in a single column
Person A scored in the 50th percentile. Person B scored in the 80th percentile. If persons A and B both increase their score by the same absolute amount (e.g., 10 units), who will have the greatest change in percentile rank?
Person A will have the greatest change in percentile rank.
In a histogram, the Y-axis represents
frequency of scores
Dr. P had her entire class complete a push-up test and a sit-up test. Student J performed 21 push-ups and 27 sit-ups. When Dr. P standardized the scores from her students, Student J's raw scores placed him at the 60th percentile for the push-up test and the 40th percentile for the sit-up test. How did Student J compare to the other students in the class?
Student J was better at the push-up test than the sit-up test.
Dr. T had his entire class complete a push-up test and a sit-up test. Student R performed 21 push-ups and 27 sit-ups. When Dr. T standardized the scores from his students, Student R's raw scores placed her at the 50th percentile for both the push-up test and the sit-up test. How did Student R compare to the other students in the class?
Student R was equally good at both tests.
Student X performed 6 pull-ups, which scored in the 45th percentile of the population, and Student Y performed 20 pull-ups, which scored in the 90th percentile of the population. If Student X and Student Y were tested again and both increase their raw scores by 3, who will likely have a greater increase in standard score?
Student X will have a greater increase in standard score
Student X scored in the 50th percentile and Student Y scored in the 75th percentile of the population for scores. Based on this information alone, which of the following is determined?
Student Y has a greater raw score than Student X.
A study is investigating the effects of low-carbohydrate and low-fat diets on change in blood triglyceride levels. Which of the following statements is FALSE?
The independent variable is change in blood triglycerides.
What is the relationship of the mean, median, and mode in a positively skewed distribution of numbers?
The mean will be greater than the median, and the median will be greater than the mode.
What is the relationship of the mean, median, and mode in a normal distribution of numbers?
The three measures will fall at or very near the same number.
If you want to compare the effect of low-fat versus low-carbohydrate diets on change in blood triglyceride levels, which would be an acceptable null hypothesis?
There is no difference in change in blood triglyceride levels between low-fat and low-carb diets. Null hypothesis -> predicts no relationship or no difference between the groups
determining the percentile from the score
To calculate a percentile from a rank order distribution, calculate the fraction of scores that fall at and below the score of interest.
Standardizing scores is a useful technique because it allows for the interpretation of raw scores in relation to the population.
True
relationship continued:
Use the mode if only a rough estimate of the central tendency is needed, and the data are normal or nearly so. • Use the median if • the data are on an ordinal scale; • the middle score of the group is needed; • the most typical score is needed; or • the curve is badly skewed by extreme scores.
Hypothesis Testing
We reject H0 and accept H1 when differences or relationships between variables are established beyond a reasonable doubt. • Level of confidence (LOC) is the value we set that defines our reasonable doubt. • May set level of confidence at 5%; then reject H0 when p ≤ .05 • If set level of confidence at 1%, then reject H0 when p ≤ .01
standard dev and normal curve
When N is large and the distribution is close to normal, there are usually five or six standard deviations within the range of a data set. • Does not apply if N is small • Does not apply if data are skewed
relationship between mean median and mode
When the data are distributed normally, the three measures of central tendency all fall at or near the same value. • When the data are skewed, these values are no longer identical.
Theory
a belief regarding a concept or a series of related concepts • Generate hypotheses that can be tested • If hypotheses survive testing, more confidence in the theory • Examples: gravity, evolution, sliding filament theory of muscle contraction
Sample
a subset of a population • Need samples that are representative of the population of interest • All else being equal, larger samples will more likely reflect the population characteristics • Random sample—each member of the population has an equal opportunity of being selected into the sample • Stratified sample—break population into subcategories, then sample from each stratum • Bias—extraneous factors operate on the sample to make it unrepresentative of the population
central limit theorem
a sum of random numbers becomes normally distributed as more and more of the random numbers are added together • Imagine a random number generator spits out 20 random numbers → calculate the mean and store it → repeat many times • Create frequency distribution of those means • Resulting frequency distribution approximately normal • The ubiquity of the normal distribution—randomness leads to Gaussian distributions
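The thought experiment above can be sketched as a simulation. The choice of uniform random numbers, the seed, and the 5,000 repetitions are assumptions made for illustration; the notes only specify 20 numbers per mean.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable (an assumption)

def sample_means(n_per_sample=20, n_repeats=5000):
    """Draw n_per_sample uniform random numbers, store their mean, repeat."""
    return [
        mean([random.random() for _ in range(n_per_sample)])
        for _ in range(n_repeats)
    ]

means = sample_means()
center = mean(means)   # should sit near 0.5, the mean of a uniform(0, 1)
spread = stdev(means)  # much narrower than the raw numbers' spread

# In a normal distribution roughly 68% of values fall within 1 SD of the mean
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 2), round(within_one_sd, 2))
```

Although each raw number is uniformly distributed, the stored means pile up symmetrically around the center, which is the pattern the card describes.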
Ratio
based on order, has equal distance between scale points, and uses zero to represent the absence of value • Kelvin temperature scale, force (N) • Subject A's mass = 50 kg, subject B's mass = 100 kg; B is twice as massive as A
ordinal
gives quantitative order to the variables, but does not indicate how much better one score is than another • Often rank order scales • Examples—pain scales (0-10), Borg Rating of Perceived Exertion Scale (6-20)
interval
has equal units, or intervals, of measurement— that is, the same distance between each division of the scale—but has no absolute zero point (zero is arbitrary) • F and C temperature scales • Joint angles
Which of the following is an example of experimental research?
having two groups eat different diets and measuring their change in blood pressure. Experimental research -> a research process that involves manipulating and controlling events or variables to solve a problem
A disadvantage of the mean is that it
is heavily influenced by extreme outliers
Your score ranked you in the 80th percentile of a normally distributed data set. If your score does not change, but additional scores are added that cause the distribution to become more leptokurtic, how will your percentile rank change?
it will increase
Which measure of central tendency is always affected by every score in a data set?
mean
What type of measure should be used to describe how scores tend to cluster in a distribution of scores?
measure of central tendency
Which measure of central tendency is useful with highly skewed data sets because it is unaffected by extreme outliers?
median
If a distribution of data has a longer tail extending toward the left, or the lower values, the distribution would be
negatively skewed
Choice of college major is an example of a _____ variable.
nominal -> mutually exclusive categories • Assigned number does not indicate amount of something • Examples—sex, political party, college major
If a distribution of data has tails that are equal on both sides, the distribution would
not be skewed
All else being equal, decreasing a sample size should ___________ sampling error.
increase
Dr. Stats has developed a statistical model to predict whether athletes will, or will not, experience a stress fracture based upon their blood concentrations of vitamin D. The study is _____ study.
observational
Dr. Who wants to understand the relationship between calcium intake and bone mineral density. Subjects keep dietary records for a month, from which average daily calcium intake is quantified. Bone mineral density is then quantified with a DEXA scanner. This is an example of _____ study.
observational
When is an ordered listing of data presented in a single column?
only when the data set is relatively small
simple frequency distribution
ordered listing of the variable being studied (X), with a frequency column (f) that indicates the number of cases at each given value of X
grouped frequency distribution
ordered listing of, in one column, a variable (X) in groups and, in a second column [the frequency column (f)], the number of persons who performed in each group of scores • Use if N > 20 and R > 20 • Typically form 15 groups • Interval size (i) • i = range/15
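A minimal sketch of the grouping rule, assuming integer scores and an interval size rounded up to a whole number. The scores are hypothetical (N = 40, range = 30) and the function name is an assumption.

```python
import math
from collections import Counter

def grouped_frequency(scores, n_groups=15):
    """Return (lower limit, upper limit, f) rows for integer scores."""
    low, high = min(scores), max(scores)
    # Interval size i = range / 15, rounded up to a whole number
    width = max(1, math.ceil((high - low) / n_groups))
    counts = Counter((s - low) // width for s in scores)
    table = []
    for k in sorted(counts):
        start = low + k * width
        table.append((start, start + width - 1, counts[k]))
    return table

# Hypothetical scores from 40 to 70, so N > 20 and R > 20
scores = [40 + (7 * i) % 31 for i in range(40)]
table = grouped_frequency(scores)

for lower, upper, f in table:
    print(f"{lower}-{upper}: {f}")
```

With a range of 30 the interval size is 2. Because the highest score starts a group of its own here, the table has 16 rows rather than exactly 15, which is why the rule is stated as "typically" 15 groups.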
A distribution that is shorter and has thinner tails than a normal distribution is a __________ distribution
platykurtic
Statistical Inference -> Population
population: any group of persons, places, or objects that have at least one common characteristic. Can be any group as long as criteria for inclusion in the group are defined so that it is clear who qualifies as a member
If a distribution of data has a longer tail extending toward the right, or the higher values, the distribution would be
positively skewed
In a distribution of scores, the highest score was 35 and the lowest was 20; the difference between these scores is 15. Which measure of variability does this describe?
range
In order to determine the interquartile range for a distribution of scores, take the difference of the _____ scores at the _____ and _____ percentile points.
raw; 25th; 75th
Dr. Stats has developed a statistical model to predict whether athletes will, or will not, experience a stress fracture based upon their blood concentrations of vitamin D. The subjects in the study are Division 1 collegiate volleyball players. Dr. Y wonders whether the model will predict stress fractures in high school volleyball players. Dr. Y's question centers on the issue of
external validity (whether the results generalize beyond the population that was sampled)
Dr. Stats has developed a statistical model to predict whether athletes will, or will not, experience a stress fracture based upon their blood concentrations of vitamin D. The dependent variable is
stress fractures
relationship part 2
• Use the mean if • the curve is near normal and the data are of the interval or ratio type; • all available information from the data is to be considered (i.e., the order of the scores as well as their relative values); or • further calculations, such as standard deviations or standard scores, are to be made.