Math 2342 Chap 3

outlier

An extremely high or extremely low data value in a data set can have a striking effect on the mean of the data set. This is one reason why, when analyzing a frequency distribution, you should be aware of any of these values.

weighted mean

Find the weighted mean of a variable X by multiplying each value by its corresponding weight and dividing the sum of the products by the sum of the weights.
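A minimal Python sketch of this rule follows. The function name `weighted_mean` and the grade-point example are illustrative assumptions, not part of the study set.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of the weights must be nonzero")
    # Multiply each value by its weight, sum the products,
    # then divide by the sum of the weights.
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: grade points weighted by credit hours.
grade_points = [4.0, 2.0, 3.0]
credit_hours = [3, 3, 4]
print(weighted_mean(grade_points, credit_hours))  # 3.0
```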

Sample Variance

(denoted by $s^2$) is $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$, where X = individual value, $\bar{X}$ = sample mean, and n = sample size.

standard deviation

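is a measure of how spread out the numbers are. SD = √Variance. To find the standard deviation of a sample, take the square root of the sample variance. The formula for the sample standard deviation, denoted by s, is $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$, where X = individual value, $\bar{X}$ = sample mean, and n = sample size. The procedure is the same as the procedure for finding the population variance and standard deviation, except that the sum of the squared deviations is divided by n − 1 instead of N.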
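As a sketch of the sample calculation, the Python below divides the sum of squared deviations from the sample mean by n − 1 and then takes the square root. The function names and the data values are made up for illustration.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values for a sample variance")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_std_dev(data):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(data))


# Hypothetical sample: mean is 35, squared deviations sum to 1750.
sample = [10, 60, 50, 30, 40, 20]
print(sample_variance(sample))           # 350.0
print(round(sample_std_dev(sample), 1))  # 18.7
```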

Finding the Mean for Grouped Data

Step 1 Make a table with four columns: A Class, B Frequency f, C Midpoint Xm, D f · Xm.
Step 2 Find the midpoint of each class and place it in column C.
Step 3 Multiply the frequency by the midpoint for each class, and place the product in column D.
Step 4 Find the sum of column D.
Step 5 Divide the sum obtained in column D by the sum of the frequencies obtained in column B.
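The five steps can be sketched in Python as below. The function name `grouped_mean`, the (lower limit, upper limit, frequency) input layout, and the three example classes are assumptions made for illustration. The midpoint is taken as the average of the two class limits.

```python
def grouped_mean(classes):
    """classes: list of (lower_limit, upper_limit, frequency) tuples."""
    sum_f = 0        # running total of column B (frequencies)
    sum_f_xm = 0.0   # running total of column D (f * Xm)
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2  # Step 2: column C
        sum_f_xm += f * midpoint        # Steps 3 and 4: column D and its sum
        sum_f += f
    if sum_f == 0:
        raise ValueError("total frequency must be nonzero")
    return sum_f_xm / sum_f             # Step 5


# Hypothetical frequency distribution with three classes.
table = [(1, 5, 2), (6, 10, 3), (11, 15, 5)]
print(grouped_mean(table))  # 9.5
```

Here the midpoints are 3, 8, and 13, the products are 6, 24, and 65, and 95 divided by the total frequency of 10 gives 9.5.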

Rounding Rule for the Mean

The mean should be rounded to one more decimal place than occurs in the raw data. For example, if the raw data are given in whole numbers, the mean should be rounded to the nearest tenth. If the data are given in tenths, the mean should be rounded to the nearest hundredth, and so on.

median

The median is the midpoint of the data array. The symbol for the median is MD.
Finding the Median
Step 1 Arrange the data values in ascending order.
Step 2 Determine the number of values in the data set.
Step 3 If n is odd, select the middle data value as the median. If n is even, find the mean of the two middle values. That is, add them and divide the sum by 2.
Properties of the median:
- is used to find the center or middle value of a data set.
- is used when it is necessary to find out whether the data values fall into the upper half or lower half of the distribution.
- is used for an open-ended distribution.
- is affected less than the mean by extremely high or extremely low values.
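The three steps for finding the median translate directly into a short Python sketch. The function name and the sample values are illustrative only.

```python
def median(data):
    """Median by the sort-and-pick-the-middle procedure."""
    ordered = sorted(data)  # Step 1: arrange in ascending order
    n = len(ordered)        # Step 2: count the values
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:          # Step 3, n odd: the middle value
        return ordered[mid]
    # Step 3, n even: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 1, 5]))  # 5
print(median([8, 2, 6, 4]))     # 5.0
```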

modal class

The mode for grouped data is the modal class. The modal class is the class with the largest frequency.

negatively skewed or left-skewed distribution

The majority of the data values fall to the right of the mean and cluster at the upper end of the distribution, with the tail to the left. Also, the mean is to the left of the median, and the mode is to the right of the median. As an example, a negatively skewed distribution results if the majority of students score very high on an instructor's examination. These scores will tend to cluster to the right of the distribution.

population variance

is the average of the squares of the distance each value is from the mean. The symbol for the population variance is $\sigma^2$ (σ is the Greek lowercase letter sigma). The formula for the population variance is $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$, where X = individual value, μ = population mean, and N = population size. The population standard deviation is the square root of the variance. The symbol for the population standard deviation is σ.

population standard deviation

is the square root of the variance. The symbol for the population standard deviation is σ.
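A short Python sketch of the population versions: average the squared distances from the population mean, then take the square root. The function names and the eight data values are hypothetical.

```python
import math


def population_variance(values):
    """Average of the squared distances of each value from the mean."""
    N = len(values)
    if N == 0:
        raise ValueError("population must contain at least one value")
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical population: mean is 5, squared distances sum to 32.
population = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(population))  # 4.0
print(population_std_dev(population))   # 2.0
```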

positively skewed or right-skewed distribution

the majority of the data values fall to the left of the mean and cluster at the lower end of the distribution; the "tail" is to the right. Also, the mean is to the right of the median, and the mode is to the left of the median.

Uses of the Variance and Standard Deviation

Variances and standard deviations:
- can be used to determine the spread of the data. If the variance or standard deviation is large, the data are more dispersed. This information is useful in comparing two (or more) data sets to determine which is more (most) variable.
- are used to determine the consistency of a variable. For example, in the manufacture of fittings, such as nuts and bolts, the variation in the diameters must be small, or else the parts will not fit together.
- are used to determine the number of data values that fall within a specified interval in a distribution. For example, Chebyshev's theorem (explained later) shows that, for any distribution, at least 75% of the data values will fall within 2 standard deviations of the mean.
- are used quite often in inferential statistics.
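The 75% figure follows from the usual statement of Chebyshev's theorem: at least 1 − 1/k² of the values lie within k standard deviations of the mean, for k > 1. That formula is not written out in this study set, so the Python below is a sketch under that assumption, with an illustrative function name.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem applies only for k > 1")
    return 1 - 1 / k ** 2


print(chebyshev_min_proportion(2))  # 0.75
```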

sample mean

denoted by $\bar{X}$ (pronounced "X bar"), is calculated by using sample data. The sample mean is a statistic.

population mean

denoted by µ (pronounced "mew"), is calculated by using all the values in the population. The population mean is a parameter.

*Percentiles*

divide the data set into 100 equal groups. A percentile gives the relative ranking of a value within the data set; a percentage does not.

A parameter

is a characteristic or measure obtained by using all the data values from a specific population.

A statistic

is a characteristic or measure obtained by using the data values from a sample.

boxplot

is a graph of a data set obtained by drawing a horizontal line from the minimum data value to Quartile 1, drawing a horizontal line from Quartile 3 to the maximum data value, and drawing a box whose vertical sides pass through Q1 and Q3 with a vertical line inside the box passing through the median or Q2.
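The five values a boxplot is drawn from (minimum, Q1, median, Q3, maximum) can be computed as in the Python sketch below. Quartile conventions differ between textbooks and software. This sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the overall median when n is odd. The function names and data are hypothetical.

```python
def _median_of_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are medians of the lower and upper halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    # Skip the middle value itself when n is odd.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        ordered[0],
        _median_of_sorted(lower),
        _median_of_sorted(ordered),
        _median_of_sorted(upper),
        ordered[-1],
    )


values = [89, 47, 164, 296, 30, 215, 138, 78, 48, 39]
print(five_number_summary(values))  # (30, 47, 83.5, 164, 296)
```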

midrange

is defined as the sum of the lowest and highest values in the data set, divided by 2. The symbol MR is used for the midrange. MR = (lowest value + highest value) / 2
Properties of the midrange:
- is easy to compute.
- gives the midpoint.
- is affected by extremely high or low values in a data set.
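A small Python sketch of the midrange, including how a single extreme value moves it. The function name and both data sets are made up for illustration.

```python
def midrange(data):
    """Sum of the lowest and highest values, divided by 2."""
    if not data:
        raise ValueError("midrange of an empty data set is undefined")
    return (min(data) + max(data)) / 2


typical = [2, 3, 6, 8, 4, 1]
with_outlier = typical + [100]  # one extremely high value added

print(midrange(typical))       # 4.5
print(midrange(with_outlier))  # 50.5
```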

mode

is the value that occurs most often in the data set. It is sometimes said to be the most typical case.
- unimodal: a data set that has only one value that occurs with the greatest frequency.
- bimodal: if a data set has two values that occur with the same greatest frequency, both values are considered to be the mode.
- multimodal: if a data set has more than two values that occur with the same greatest frequency, each value is used as the mode.
- no mode: when no data value occurs more than once, the data set is said to have no mode.
Note: Do not say that the mode is zero. That would be incorrect, because in some data, such as temperature, zero can be an actual value.
Properties of the mode:
- is used when the most typical case is desired.
- is the easiest average to compute.
- can be used when the data are nominal or categorical, such as religious preference, gender, or political affiliation.
- is not always unique. A data set can have more than one mode, or the mode may not exist for a data set.
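The unimodal, bimodal, and no-mode cases can be sketched in Python as below. The function name `modes` is an illustrative choice. It returns an empty list, not zero, when no value occurs more than once.

```python
from collections import Counter


def modes(data):
    """Return every value tied for the greatest frequency.

    Returns an empty list when no value occurs more than once (no mode).
    """
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no mode; do not report 0
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3]))     # [2]      unimodal
print(modes([1, 1, 2, 2, 3]))  # [1, 2]   bimodal
print(modes([1, 2, 3]))        # []       no mode
```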

mean

is the sum of the values, divided by the total number of values. The sample mean, denoted by $\bar{X}$ (pronounced "X bar"), is calculated by using sample data: $\bar{X} = \frac{\sum X}{n}$. The sample mean is a statistic.
Properties of the mean:
- varies less than the median or mode when samples are taken from the same population and all three measures are computed for these samples.
- is used in computing other statistics, such as the variance.
- is unique for a data set and is not necessarily one of the data values.
- cannot be computed for the data in a frequency distribution that has an open-ended class.
- is affected by extremely high or low values, called outliers, and may not be the appropriate average to use in these situations.
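A Python sketch of the mean, combined with the rounding rule of one more decimal place than the raw data. The function names, the `raw_decimals` parameter, and the data are assumptions for illustration. Python's built-in `round` works on binary floats, so exact halfway cases may not round the way a hand calculation would.

```python
def mean(data):
    """Sum of the values divided by the number of values."""
    if not data:
        raise ValueError("mean of an empty data set is undefined")
    return sum(data) / len(data)


def rounded_mean(data, raw_decimals=0):
    """Mean rounded to one more decimal place than the raw data.

    raw_decimals: number of decimal places in the raw data (0 for whole numbers).
    """
    return round(mean(data), raw_decimals + 1)


# Hypothetical whole-number data: the sum is 276 and n is 9.
values = [20, 26, 40, 36, 23, 42, 35, 24, 30]
print(rounded_mean(values))  # 30.7
```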

range

is a measure of the spread or variability of a data set. It is the simplest of the three measures of variation (range, variance, and standard deviation). The range is the highest value minus the lowest value. The symbol R is used for the range. R = highest value − lowest value

*"z" score* or standard score

tells us how many standard deviations (SD) a data value is above or below the mean of a specific distribution.
z score = (value − mean) / standard deviation
- If the z score is positive, the score is above the mean.
- If the z score is 0, the score is the same as the mean.
- If the z score is negative, the score is below the mean.
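A minimal Python sketch of the z score. The subtraction is done before the division. The function name and the test scores are hypothetical.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies above or below the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Hypothetical exam: mean 50, standard deviation 10.
print(z_score(65, 50, 10))  # 1.5   above the mean
print(z_score(50, 50, 10))  # 0.0   same as the mean
print(z_score(40, 50, 10))  # -1.0  below the mean
```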

symmetric distribution

the data values are evenly distributed on both sides of the mean. In addition, when the distribution is unimodal, the mean, median, and mode are the same and are at the center of the distribution. Examples of symmetric distributions are IQ scores and heights of adult males.
