Statistics Chapter 3: Numerical Descriptions of Data

Properties of the Mode

1. A data set may not have a mode
2. A data set may have one mode or more than one mode
3. If a mode exists for a data set, the mode is a value in the data set
4. Not affected by outliers in the data set
5. The only measure of center appropriate for qualitative data

Creating a Box Plot

1. Begin with a horizontal (or vertical) number line that contains the five-number summary
2. Draw a small line segment above (or next to) the number line to represent each of the numbers in the five-number summary
3. Connect the line segment that represents the 1st quartile to the line segment that represents the 3rd quartile, forming a box with the median's line segment inside it
4. Connect the "box" to the line segments representing the minimum and maximum values to form the "whiskers"

Properties of Range

1. Easiest measure of dispersion to calculate
2. Only affected by the largest and smallest values in the data set, so it can be misleading

Properties of Standard Deviation

1. Easily computed using a calculator or computer
2. Affected by every value in the data set
3. Population standard deviation and sample standard deviation formulas yield different results
4. Interpreted as the average distance of a data value from the mean, so it cannot take on negative values
5. Has the same units as the data
6. A larger standard deviation indicates the data values are more spread out; a smaller standard deviation indicates the data values lie closer together
7. If it equals 0, then all of the data values are equal to the mean
8. Equal to the square root of the variance

Properties of the Median

1. Easy to compute by hand
2. Determined only by the middle values of a data set, so it is not affected by outliers
3. May not be a value in the data set if there is an even number of values in the set
4. Useful measure of center for skewed distributions

Determining the Most Appropriate Measure of Center

1. For qualitative data, the mode should be used
2. For quantitative data, the mean should be used unless the data set contains outliers or is skewed
3. For quantitative data sets that are skewed or contain outliers, the median should be used

Finding the Median of a Data Set

1. List the data in ascending order, making an ordered array
2. If the data set contains an ODD number of values, the median is the middle value in the ordered array
3. If the data set contains an EVEN number of values, the median is the arithmetic mean of the 2 middle values in the ordered array
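
The three steps above can be sketched in Python. This is a minimal illustration; the function name `median` and the sample values are made up for the example.

```python
def median(values):
    """Median of a data set, following the three steps on the card."""
    ordered = sorted(values)              # step 1: build the ordered array
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                        # step 2: odd count, take the middle value
        return ordered[mid]
    # step 3: even count, average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 3]))       # 3
print(median([7, 1, 3, 5]))    # 4.0
```

Note that in the even case the result (4.0) is not a value in the data set.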

Properties of the Mean

1. Most familiar and widely used measure of center
2. Its value is affected by EVERY value in the data set
3. May not be a value in the data set
4. Appropriate measure of center for quantitative data with no outliers

Graphs & Measures of Center

1. The mode is the data value at which a distribution has its highest peak
2. The median is the number that divides the area of the distribution in half
3. The mean of a distribution will be pulled toward any outliers
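
A small sketch of how an outlier pulls the mean but barely moves the median. The data values are invented for illustration; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

salaries = [30, 32, 35, 36, 38]      # made-up values, in thousands
with_outlier = salaries + [400]      # the same data plus one extreme value

print(mean(salaries), median(salaries))            # 34.2 35
print(mean(with_outlier), median(with_outlier))    # mean jumps above 95, median is 35.5
```

The single extreme value nearly triples the mean, while the median shifts by only half a unit, which is why the median is suggested for skewed data or data with outliers.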

Properties of Variance

1. Easily computed using a calculator or computer
2. Affected by every value in the data set
3. Population variance and sample variance formulas yield different results
4. Difficult to interpret because of its unusual squared units
5. Equal to the square of the standard deviation
6. Preferred over the standard deviation in many statistical tests because of its simpler formula

Five-Number Summary

A numerical description of a data set that lists in order from smallest to largest:
• Minimum value
• 1st Quartile, Q1
• 2nd Quartile, Q2 (the median)
• 3rd Quartile, Q3
• Maximum value

Chebyshev's Theorem

The proportion of data that lie within K standard deviations of the mean is at least $1 - 1/K^2$ for K > 1. When K = 2 and K = 3:
• K = 2: At least $1 - 1/2^2 = 3/4$ = 75% of the data values lie within 2 standard deviations of the mean
• K = 3: At least $1 - 1/3^2 = 8/9$ ≈ 88.9% of the data values lie within 3 standard deviations of the mean
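
A rough sketch of the bound next to an actual data set. The helper names `chebyshev_bound` and `proportion_within` and the data values are hypothetical, and the population standard deviation (`pstdev`) is assumed here.

```python
from statistics import mean, pstdev


def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 1 - 1 / k**2


def proportion_within(values, k):
    """Observed proportion of values within k standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)           # assumption: treat the data as a population
    inside = [x for x in values if abs(x - mu) <= k * sigma]
    return len(inside) / len(values)


data = [2, 4, 4, 4, 5, 5, 7, 9, 40]  # made-up, deliberately skewed data

print(chebyshev_bound(2))            # 0.75
print(proportion_within(data, 2))    # observed share, never below the bound
```

The bound is only a guaranteed minimum; the observed proportion is usually higher.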

Percentiles

Values that divide the data into 100 equal parts; each percentile indicates approximately what percentage of the data lie at or below a given value

Quartiles

Values that divide the data into four equal parts, equivalent to the 25th, 50th and 75th percentiles
• Q1 = First Quartile: 25% of the data are less than or equal to this value
• Q2 = Second Quartile: 50% of the data are less than or equal to this value
• Q3 = Third Quartile: 75% of the data are less than or equal to this value

Box Plot

a graphical representation of a five-number summary, sometimes referred to as a "box-and-whisker plot"

Standard Deviation

a measure of how much we might expect a typical member of the data set to differ from the mean

Hinge

an approximation of the first or third quartile, found by using the median to divide the data set into an upper half and a lower half (without including the median in either half), and then finding the median of either half of the data set
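
The hinge method can be sketched as follows and used to build a five-number summary. The function name `five_number_summary` and the sample data are made up; other quartile methods can give slightly different values for Q1 and Q3.

```python
from statistics import median


def five_number_summary(values):
    """(min, lower hinge, median, upper hinge, max) using the hinge method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower = ordered[:half]
    # For an odd count the median itself is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


print(five_number_summary([1, 3, 4, 6, 8, 9, 12]))       # (1, 3, 6, 9, 12)
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))     # (1, 2.5, 4.5, 6.5, 8)
```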

No mode

describes a data set in which all of the data values occur only once or each value occurs an equal number of times

Bimodal

describes a data set in which exactly 2 data values occur equally often and more often than any other value

Multimodal

describes a data set in which more than two data values occur equally often and more often than any other value

Unimodal

describes a data set in which only one data value occurs most often
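
A minimal sketch that finds the mode(s) and labels a data set with the four terms above. The names `modes` and `describe` are hypothetical. The "no mode" rule follows the card literally: if every value occurs the same number of times, the function reports no mode.

```python
from collections import Counter


def modes(values):
    """Sorted list of the most frequent values; empty list if there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    if len(set(counts.values())) == 1:
        return []                     # every value occurs equally often: no mode
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def describe(values):
    """Label a data set as no mode, unimodal, bimodal, or multimodal."""
    k = len(modes(values))
    return {0: "no mode", 1: "unimodal", 2: "bimodal"}.get(k, "multimodal")


print(describe([1, 2, 3]))                # no mode
print(describe([1, 2, 2, 3]))             # unimodal
print(describe([1, 1, 2, 2, 3]))          # bimodal
print(describe([1, 1, 2, 2, 3, 3, 4]))    # multimodal
```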

Chebyshev's Theorem

gives a minimum estimate of the percentage of data within a few standard deviations of the mean for any distribution

Standard score (or z-score)

indicates how many standard deviations from the mean a particular data value lies
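
The standard formula is z = (x - mean) / standard deviation. A short sketch, with a hypothetical function name and made-up numbers:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


print(z_score(85, 70, 10))    # 1.5  (85 is 1.5 standard deviations above 70)
print(z_score(60, 70, 10))    # -1.0 (60 is 1 standard deviation below 70)
```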

Pth Percentile of a Data Value

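the Pth percentile of a particular value in a data set is given by $P = \frac{l}{n} \cdot 100$, where l is the number of values in the data set less than or equal to the given value, n is the number of values in the data set, and P is rounded to the nearest whole number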
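
A sketch of this calculation, assuming the usual textbook form P = (l / n) · 100, with l the count of values at or below the given value and n the size of the data set. The function name and the scores are made up.

```python
import math


def percentile_of_value(values, x):
    """Percentile rank of x: percent of values at or below x, rounded to a whole number."""
    n = len(values)
    if n == 0:
        raise ValueError("data set is empty")
    at_or_below = sum(1 for v in values if v <= x)
    # floor(p + 0.5) rounds halves up; Python's round() would round halves to even.
    return math.floor(100 * at_or_below / n + 0.5)


scores = [55, 60, 62, 70, 71, 75, 80, 88, 90, 95]    # made-up test scores
print(percentile_of_value(scores, 75))    # 60 (6 of the 10 scores are at or below 75)
```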

Sample Mean

the arithmetic mean of a set of sample data

Population Mean

the arithmetic mean of all the values in a population

Range

the difference between the largest and smallest values in the data set: Range = maximum data value - minimum data value

Location of Data Value for the Pth percentile

the location of the Pth percentile in an ordered array of n data values is given by $l = n \cdot \frac{P}{100}$
1. If the formula results in a decimal value for l, the location is the next larger whole number
2. If the formula results in a whole number, the percentile's value is the arithmetic mean of the data value in that location and the data value in the next larger location
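
The two rules can be sketched as below, assuming the location formula l = n · P / 100 and locations counted from 1. The function name `percentile_value` and the data are hypothetical, and P is restricted to values strictly between 0 and 100 so both rules stay within the array.

```python
import math


def percentile_value(values, p):
    """Value of the p-th percentile using the location rules on the card."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    loc = n * p / 100                      # assumed location formula, 1-based
    if not loc.is_integer():
        # Rule 1: decimal location, move up to the next whole number.
        return ordered[math.ceil(loc) - 1]
    # Rule 2: whole-number location, average it with the next location.
    loc = int(loc)
    return (ordered[loc - 1] + ordered[loc]) / 2


data = [3, 5, 7, 8, 12, 13, 14, 18, 21, 25]    # made-up data, n = 10
print(percentile_value(data, 25))    # 7    (l = 2.5, so use location 3)
print(percentile_value(data, 50))    # 12.5 (l = 5, so average locations 5 and 6)
```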

Weighted Mean

the mean of a data set in which each data value in the set does not hold the same relative importance
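
The usual calculation multiplies each value by its weight, adds the products, and divides by the total weight. A minimal sketch with a hypothetical function name and made-up grades:

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up example: two quizzes with weight 1 each and a final exam with weight 2.
print(weighted_mean([90, 80, 70], [1, 1, 2]))    # 77.5
```

With equal weights the result is the ordinary arithmetic mean (80.0 for the same three grades).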

Median

the middle value in an ordered array of data

Interquartile Range (IQR)

the range of the middle 50% of the data, given by $\text{IQR} = Q_3 - Q_1$
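
A sketch comparing the IQR (third quartile minus first quartile) with the range when one extreme value is present. It assumes Python's `statistics.quantiles` with its default method, which may differ slightly from hand methods such as hinges; the data are made up.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _q2, q3 = quantiles(values, n=4)    # three cut points: Q1, Q2, Q3
    return q3 - q1


def data_range(values):
    """Range: maximum data value minus minimum data value."""
    return max(values) - min(values)


data = [2, 4, 4, 5, 6, 7, 9, 10, 12]
with_outlier = [2, 4, 4, 5, 6, 7, 9, 10, 100]    # largest value replaced by an outlier

print(data_range(data), data_range(with_outlier))    # 10 98
print(iqr(data) == iqr(with_outlier))                # True
```

The range changes from 10 to 98, while the IQR stays the same because it depends only on the middle of the data.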

Coefficient of variation, CV

the ratio of the standard deviation to the mean as a percentage, allows comparison of the spreads of data from different sources, regardless of differences in units of measurement
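
A short sketch of the calculation. The function name is hypothetical, the sample standard deviation (`stdev`) is assumed, and the numbers are invented.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (sample version assumed)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is 0")
    return stdev(values) / m * 100


lengths_m = [10, 20, 30]          # made-up lengths in metres
lengths_cm = [1000, 2000, 3000]   # the same lengths in centimetres

print(coefficient_of_variation(lengths_m))     # 50.0
print(coefficient_of_variation(lengths_cm))    # 50.0
```

Changing the unit rescales the standard deviation and the mean by the same factor, so the CV is unchanged, which is what makes it usable across different units.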

Variance

the square of the standard deviation

Population Standard Deviation

the standard deviation of a population data set

Sample Standard Deviation

the standard deviation of a set of sample data
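
The population and sample versions differ only in the divisor: n for a population, n - 1 for a sample. A minimal sketch with hypothetical function names and made-up data:

```python
import math


def population_variance(values):
    """Average squared distance from the mean, dividing by n."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Same sum of squared distances, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]    # mean is 5, squared distances sum to 32

print(population_variance(data), math.sqrt(population_variance(data)))    # 4.0 2.0
print(sample_variance(data), math.sqrt(sample_variance(data)))            # about 4.57 and 2.14
```

In each case the standard deviation is the square root of the corresponding variance.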

Arithmetic mean

the sum of all of the data values divided by the number of data values, often simply called the mean

Mode

the value in a data set that occurs most frequently

Population Variance

the variance of a population data set

Sample Variance

the variance of a set of sample data

Empirical Rule

used with bell-shaped distributions of data to estimate the percentage of values within a few standard deviations of the mean

Empirical Rule for Bell-Shaped Distributions

• Approx. 68% of the data values lie within 1 standard deviation of the mean
• Approx. 95% of the data values lie within 2 standard deviations of the mean
• Approx. 99.7% of the data values lie within 3 standard deviations of the mean
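
These percentages can be checked against an exact normal distribution, the idealized bell shape, using `NormalDist` from Python's standard `statistics` module. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist()    # mean 0, standard deviation 1


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```

Real data that are only roughly bell-shaped will match these figures approximately, not exactly.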

