Chapter 3 - Stat 1

Unimodal

A data set that has only one value that occurs with the greatest frequency.

Symmetric Distribution

A distribution in which the data values are evenly distributed on both sides of the mean.

Population Variance

Symbol: σ² (lowercase sigma, squared). The average of the squared distances of each value from the mean.
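
As a minimal sketch of this definition, the population variance can be computed step by step in Python. The function name `population_variance` and the data values are illustrative assumptions, not part of the original card.

```python
def population_variance(values):
    """Average of the squared distances of each value from the mean."""
    n = len(values)
    mu = sum(values) / n                      # population mean
    squared_distances = [(x - mu) ** 2 for x in values]
    return sum(squared_distances) / n         # divide by N, not N - 1

data = [3, 2, 6, 5, 4]                        # illustrative values
sigma_squared = population_variance(data)
print(sigma_squared)                          # 2.0
```

The standard library's `statistics.pvariance` gives the same result for this data.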

Modal Class

The class with the largest frequency.

Mean

The sum of the values divided by the total number of values; also known as the arithmetic average. Symbol: µ for a population. Example: the mean of 3, 2, 6, 5, and 4 is (3 + 2 + 6 + 5 + 4) / 5 = 20 / 5 = 4.

Mode

The value that occurs most often in a data set.

Outlier

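An extremely high or extremely low data value when compared with the rest of the data values; in a modified boxplot it is excluded from the whiskers and plotted as a separate point.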

Sample Mean

Denoted by X̄ (pronounced "X bar"); calculated by using sample data. The sample mean is a statistic.

Parameter

A characteristic or measure obtained by using all the data values for a specific population.

Statistic

A characteristic or measure obtained by using the data values from a sample.

Data Array

A data set that has been ordered.

Multimodal

A data set with three or more modes.

Range Rule of Thumb

An approximation of the standard deviation obtained by dividing the range by 4. It is only an approximation and should be used when the distribution of data values is unimodal and roughly symmetric. The rule can also be used to estimate the smallest and largest data values of a data set: the smallest will be approximately 2 standard deviations below the mean, and the largest approximately 2 standard deviations above the mean.
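
A short Python sketch of both uses of the rule follows. The function names and all numbers are made up for illustration; the results are rough estimates only.

```python
def range_rule_std(values):
    """Estimate the standard deviation as range / 4."""
    return (max(values) - min(values)) / 4

def estimate_extremes(mean, std_dev):
    """Estimate the smallest and largest values as mean -/+ 2 standard deviations."""
    return mean - 2 * std_dev, mean + 2 * std_dev

data = [10, 12, 15, 18, 22]                   # illustrative values
estimated_std = range_rule_std(data)
print(estimated_std)                          # 3.0

low, high = estimate_extremes(mean=50, std_dev=5)
print(low, high)                              # 40 60
```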

Five-Number Summary

Five specific values for a data set that consist of the lowest and highest values, Q1 and Q3, and the median.
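
The five values can be sketched in Python as below. Textbooks and software use several different quartile conventions; this sketch assumes one common classroom method (Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when the count is odd). The function name and data are illustrative, and the function expects at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (lowest, Q1, median, Q3, highest) using the median-of-halves method."""
    data = sorted(values)                     # work on the data array
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary([21, 3, 13, 7, 18, 5, 12, 8, 14])
print(summary)                                # (3, 6.0, 12, 16.0, 21)
```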

∑X

Means to find the sum of the X values in the data set.

Interquartile Range (IQR)

Q3 - Q1 (i.e. the distance between the first and third quartiles)

The spread or variability of data is shown commonly by what measures?

Range, variance, and standard deviation.

Bimodal

A data set with two modes.

Positively Skewed or Right-Skewed Distribution

A distribution in which the majority of the data values fall to the left of the mean.

Negatively Skewed or Left-Skewed Distribution

A distribution in which the majority of the data values fall to the right of the mean.

Boxplot

A graph of a data set obtained by drawing a horizontal line from the minimum data value to Q1, drawing a horizontal line from Q3 to the maximum data value, and drawing a box whose vertical sides pass through Q1 and Q3 with a vertical line inside the box passing through the median or Q2.

Decile

A location measure of a data value; it divides the distribution into 10 groups.

Percentile

A location measure of a data value; it divides the distribution into 100 groups.

Coefficient of Variation

Denoted by CVar; the standard deviation divided by the mean, with the result expressed as a percentage. It allows the variation of data sets measured in different units to be compared. (Not to be confused with the coefficient of determination, which is the ratio of the explained variation to the total variation in regression.)
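
The standard-deviation-over-mean calculation can be sketched in Python as follows. This sketch assumes the data are a whole population, so it uses `statistics.pstdev`; for a sample, `statistics.stdev` would be the usual choice. The function name and data are illustrative.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return pstdev(values) / mean(values) * 100

data = [3, 2, 6, 5, 4]                        # illustrative values
cvar = coefficient_of_variation(data)
print(f"{cvar:.2f}%")                         # 35.36%
```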

Empirical Rule

A rule that states that when a distribution is bell-shaped (normal), approximately 68% of the data values will fall within 1 standard deviation of the mean; approximately 95% of the data values will fall within 2 standard deviations of the mean; and approximately 99.7% of the data values will fall within 3 standard deviations of the mean.
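
As a quick check of these percentages, the share of a normal distribution within k standard deviations of the mean can be computed with Python's `statistics.NormalDist`. The helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: round(share_within(k), 3) for k in (1, 2, 3)}
print(shares)                                 # {1: 0.683, 2: 0.954, 3: 0.997}
```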

Resistant Statistic

A statistic that is relatively unaffected by outliers or extreme values; the median is an example.

Nonresistant Statistic

A statistic that is strongly affected by outliers or extreme values; the mean is an example.

Chebyshev's Theorem

A theorem that states that the proportion of values from a data set that fall within k standard deviations of the mean will be at least 1 - 1/k², where k is a number greater than 1 (not necessarily an integer). For example, at least 75% of the data values fall within 2 standard deviations of the mean, and at least 88.89% fall within 3 standard deviations. Chebyshev's theorem applies to any distribution regardless of its shape.
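
The bound from the theorem is easy to evaluate directly. Below is a minimal Python sketch; the function name `chebyshev_lower_bound` is an illustrative assumption.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

bound_2 = chebyshev_lower_bound(2)
bound_3 = chebyshev_lower_bound(3)
print(bound_2)                                # 0.75
print(round(bound_3, 4))                      # 0.8889
```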

Population Mean

Denoted by µ (pronounced "mew"), is calculated by using all the values in the population. The population mean is a parameter.

Exploratory Data Analysis (EDA)

The act of analyzing data to determine what information can be obtained by using stem and leaf plots, medians, interquartile ranges, and boxplots. Its purpose is to examine data to find out what can be discovered about them, such as the center and the spread.

z Score or Standard Score

The difference between a data value and the mean, divided by the standard deviation.
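
In code, the definition is a single expression. The sketch below uses made-up numbers (a value of 65 with a mean of 50 and a standard deviation of 10); the function name is illustrative.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

z = z_score(65, 50, 10)
print(z)                                      # 1.5
```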

Deviation

The difference or distance each data value is from the mean.

Range

The highest data value minus the lowest data value.

Weighted Mean

The mean found by multiplying each value by its corresponding weight and dividing by the sum of the weights.
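
A small Python sketch of this calculation follows, using a grade-point average as a hypothetical example: the grade points are the values and the course credits are the weights. The names and numbers are illustrative.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

grade_points = [4, 3, 2]                      # illustrative values (A, B, C)
credits = [3, 3, 4]                           # illustrative weights
gpa = weighted_mean(grade_points, credits)    # (12 + 9 + 8) / 10
print(gpa)                                    # 2.9
```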

Median

The midpoint of a data array.

Population Standard Deviation

Symbol: σ (lowercase sigma). The square root of the population variance.

Midrange

The sum of the lowest and highest data values, divided by 2.

Quartile

Values, denoted by Q1, Q2, and Q3, that divide the distribution into four approximately equal groups. Q1 is the same as the 25th percentile; Q2 is the same as the 50th percentile, or the median; Q3 corresponds to the 75th percentile.

