Module 3
A set of measurements is divided into four parts, each containing approximately 25% of the measurements. The three dividers are called ______.
quartiles
A z score has units interpreted as _____________
standard deviations from the mean.
The sample standard deviation can be found by ________________ the sample variance.
taking the square root of
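As a rough illustration of that relationship, the sketch below computes a sample variance and then takes its square root. The data values and function names are made up for the example.

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_std(xs):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
print(sample_variance(data), sample_std(data))
```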
When calculating an approximate variance for grouped data, ______
the midpoint of each class is used to approximate the measurements in that class.
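A minimal sketch of the midpoint approximation, assuming the sample form of the variance (divisor n - 1) and a made-up set of classes. The function name and class boundaries are hypothetical.

```python
def grouped_sample_variance(classes):
    """Approximate sample variance from (lower, upper, frequency) classes.

    Each class midpoint stands in for every measurement in that class.
    """
    n = sum(f for _, _, f in classes)
    midpoints = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(m * f for m, f in midpoints) / n
    return sum(f * (m - mean) ** 2 for m, f in midpoints) / (n - 1)

# Hypothetical classes: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
approx_var = grouped_sample_variance(classes)
print(approx_var)
```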
What does a correlation coefficient close to 0 indicate?
A weak correlation between two variables
What type of table can be used to analyze relationships between two or more categorical variables?
Contingency table
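One way to build a small contingency table by hand is to count each combination of categories. The observations and variable names below are invented for illustration.

```python
from collections import Counter

# Hypothetical observations: (region, preferred product) pairs
observations = [
    ("North", "A"), ("North", "B"), ("South", "A"),
    ("South", "A"), ("North", "A"), ("South", "B"),
]

cell_counts = Counter(observations)
rows = sorted({r for r, _ in observations})
cols = sorted({c for _, c in observations})

# table[row][col] holds the count for that combination of categories
table = {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, [table[r][c] for c in cols])
```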
The interquartile range is the difference between which two quartiles?
first and third
For populations that are positively or negatively skewed, the empirical rule tends to ______.
give unreliable descriptions of the distribution.
Data summarized in frequency distribution or histogram form, without giving the individual measurements, is called __________ data.
grouped
When do we tend to calculate a sample covariance?
When we wish to measure the strength and direction of the linear association between two numerical variables.
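A short sketch of the sample covariance calculation, using the usual n - 1 divisor. The paired values are hypothetical; a positive result suggests the variables tend to move together, a negative one that they move in opposite directions.

```python
def sample_covariance(xs, ys):
    """Average product of paired deviations from the means, using n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

x = [1, 2, 3, 4, 5]  # hypothetical paired measurements
y = [2, 4, 5, 4, 5]
s_xy = sample_covariance(x, y)
print(s_xy)
```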
The length of the interval that contains the middle 50% of measurements is known as ______.
the interquartile range
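The interquartile range can be sketched with Python's standard library. Quartile conventions differ between textbooks and software, so the "inclusive" method used here is an assumption, and the data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical measurements

# quantiles(..., n=4) returns the three cut points Q1, Md, Q3
q1, md, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # length of the interval holding the middle 50%
print(q1, md, q3, iqr)
```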
μ is the symbol used to denote which of the following: the population variance, the sample variance, the sample mean, or the population mean?
the population mean
Standard deviation (the square root of variance) is expressed in _____ units as the original population measurements.
the same
Population _____________ is the average of the squared deviations from the population mean.
variance
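For contrast with the sample version, here is a small sketch of the population variance, which divides by N rather than n - 1. The values are a hypothetical population.

```python
def population_variance(xs):
    """Average of the squared deviations from the population mean."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

population = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical population
sigma_squared = population_variance(population)
print(sigma_squared)
```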
A ___________ mean is calculated by multiplying each measurement by its weight, summing the resulting products, and dividing the resulting sum by the sum of the weights.
weighted
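The weighted mean described above can be sketched as follows. The scores and weights are invented for the example.

```python
def weighted_mean(values, weights):
    """Sum of value * weight products divided by the sum of the weights."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)

scores = [90, 80, 70]   # hypothetical measurements
weights = [3, 2, 1]     # hypothetical weights
wm = weighted_mean(scores, weights)
print(wm)
```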
A population value, x, has a z score of 3. What does this mean?
x is 3 standard deviations above the mean.
What does µ represent? What does x̅ represent?
µ: Population mean; x̅: Sample mean
What are the symbols for standard deviation?
σ: Population standard deviation; s: Sample standard deviation
What are the three key statistics used to describe the centrality of numerical data?
Sample Mean, Sample Median, Sample Mode
What are the key statistics for understanding the variability of numerical data?
Sample Variance, Sample Standard Deviation, Sample Range
Statistics describe _______________________. Parameters describe __________________________.
Statistics describe samples; parameters describe populations.
What are the steps to create a box-and-whiskers display?
1. Calculate Q1, Md, and Q3.
2. Draw a box that extends from Q1 to Q3, and draw a vertical line through the box at Md.
3. Mark the values of the lower and upper limits.
4. Draw the whiskers.
5. Plot the outliers.
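The numbers behind those steps can be sketched in code. This assumes the common convention that the lower and upper limits sit 1.5 interquartile ranges below Q1 and above Q3, and it uses the standard library's "inclusive" quartile method; both are assumptions, and the function name and data are hypothetical.

```python
import statistics

def box_plot_summary(data):
    """Return the values needed to draw a box-and-whiskers display."""
    xs = sorted(data)
    q1, md, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr  # assumed convention for the limits
    upper_limit = q3 + 1.5 * iqr
    inside = [x for x in xs if lower_limit <= x <= upper_limit]
    outliers = [x for x in xs if x < lower_limit or x > upper_limit]
    return {
        "q1": q1, "md": md, "q3": q3,
        "limits": (lower_limit, upper_limit),
        # whiskers end at the most extreme measurements that are not outlying
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary)
```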
What do the 1st, 2nd, and 3rd quartiles correspond to?
1st Quartile: 25th percentile; 2nd Quartile: median (50th percentile); 3rd Quartile: 75th percentile
Using Chebyshev's Theorem, at least 93.75% of the population measurements lie within _______ standard deviations of the mean.
4
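To see where the answer comes from, the sketch below evaluates the Chebyshev bound 1 - 1/k² and also solves it for k. The function names are hypothetical.

```python
import math

def chebyshev_lower_bound(k):
    """Minimum proportion of measurements within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2

def k_for_proportion(p):
    """Solve 1 - 1/k^2 = p for k."""
    return 1 / math.sqrt(1 - p)

print(chebyshev_lower_bound(4))   # proportion guaranteed for k = 4
print(k_for_proportion(0.9375))   # k needed for at least 93.75%
```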
What would be an example of a very strong positive correlation?
A correlation coefficient of 0.938 between advertising expenditure and sales volume
What summaries portray grouped data?
A histogram; a frequency distribution
What is the Coefficient of Variation (CV)?
A measure of the relative variability of the data, calculated as the Standard Deviation divided by the Mean
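A quick sketch of that ratio, assuming sample data and the sample standard deviation. The values are made up; multiplying the result by 100 expresses it as a percentage.

```python
import statistics

def coefficient_of_variation(xs):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(xs) / statistics.mean(xs)

data = [10, 12, 14, 16, 18]  # hypothetical sample
cv = coefficient_of_variation(data)
print(cv)
```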
What does a correlation coefficient of -1 indicate?
A perfect negative correlation between two numerical variables
What does a correlation coefficient of 1 signify?
A perfect positive correlation between two numerical variables
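The two extreme cases can be checked with a small sketch of the correlation coefficient. The data are hypothetical: one set of points lies exactly on an upward-sloping line and the other on a downward-sloping line.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # the (n - 1) divisors cancel between numerator and denominator
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]
rising = [2 * v + 1 for v in x]   # points on an upward-sloping line
falling = [-v for v in x]         # points on a downward-sloping line
print(correlation(x, rising), correlation(x, falling))
```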
How can you convert counts to frequencies in a frequency table?
Divide individual counts by the total count
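A minimal sketch of that conversion, with invented category labels and counts.

```python
from collections import Counter

# Hypothetical categorical observations
responses = ["A"] * 5 + ["B"] * 3 + ["C"] * 2

counts = Counter(responses)
total = sum(counts.values())

# Relative frequency = individual count / total count
frequencies = {category: count / total for category, count in counts.items()}
print(frequencies)
```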
True or false: In a weighted mean, each measurement is given the same importance or weight.
False
What is the primary numerical technique for summarizing a single categorical variable?
Frequency table
What is the empirical rule?
If a population has mean µ and standard deviation σ and is described by a normal curve, then:
68.26% of the population measurements lie within one standard deviation of the mean: [µ − σ, µ + σ]
95.44% lie within two standard deviations of the mean: [µ − 2σ, µ + 2σ]
99.73% lie within three standard deviations of the mean: [µ − 3σ, µ + 3σ]
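These percentages can be checked against the normal curve itself. The sketch below uses the standard library's NormalDist; the helper name is hypothetical.

```python
from statistics import NormalDist

def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0, sigma=1)
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4%}")
```

The computed proportions are close to 68.27%, 95.45%, and 99.73%. The small differences from the 68.26% and 95.44% figures appear to come from rounding in printed normal tables.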
What is Chebyshev's Theorem?
Let µ and σ be a population's mean and standard deviation. Then for any value k > 1, at least 100(1 − 1/k²)% of the population measurements lie in the interval [µ − kσ, µ + kσ].
What does Md stand for?
Median
Which of the following symbols is used to denote mode?
Mo
In a box-and-whiskers display, the lower whisker is drawn from ____ to _____.
Q1; the smallest measurement that is not outlying.
What is true about population variance and standard deviation?
The more spread out the measurements in a data set are, the larger the variance will be. The raw deviations from the mean sum to zero. The more spread out the measurements in a data set are, the larger the standard deviation will be.
What is a z-score and how do you find it?
The z score is the number of standard deviations that x is from the mean. A positive z score is for x above (greater than) the mean; a negative z score is for x below (less than) the mean.
1. Find the mean (μ): Calculate the mean of the dataset.
2. Find the standard deviation (σ): Calculate the standard deviation of the dataset.
3. Subtract the mean from the raw score: This gives you the deviation of the raw score from the mean.
4. Divide by the standard deviation: This standardizes the deviation in terms of standard deviations.
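The four steps can be sketched as follows, treating the data as a full population (so the population standard deviation is used). The values and the function name are hypothetical.

```python
import statistics

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

population = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population

mu = statistics.mean(population)         # step 1: find the mean
sigma = statistics.pstdev(population)    # step 2: find the standard deviation
z = z_score(9, mu, sigma)                # steps 3 and 4: subtract, then divide
print(mu, sigma, z)
```

A positive result places the value above the mean; a negative one places it below.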
True or false: The sample mean is the point estimate of the population mean.
True
How do you decide what measure of central tendency to use?
Use the mean for normally distributed data without outliers. Use the median for skewed data or when dealing with outliers. Use the mode for categorical data or when identifying the most common value is important.
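A small made-up example shows why the median is preferred when outliers are present: one extreme value pulls the mean well away from the bulk of the data, while the median and mode stay put.

```python
import statistics

# Hypothetical measurements with one extreme value
values = [30, 32, 35, 35, 38, 40, 400]

mean = statistics.mean(values)      # pulled upward by the outlier
median = statistics.median(values)  # middle value, unaffected by the outlier
mode = statistics.mode(values)      # most common value
print(mean, median, mode)
```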
A ______ is a graphical summary of a data set's quartiles and interquartile range.
box-and-whiskers display
The sample ______ is the average of the sample measurements.
mean
Which measure of central tendency is calculated by adding up all the measurements and then dividing by the number of measurements?
mean
A population ______ is a number calculated using the population measurements that describes some aspect of the population.
parameter
The ______________ of measurements in a data set is calculated as the largest measurement minus the smallest measurement.
range