Chapter 3: Numerical Descriptive Measures

Q1 = ?

25th percentile

As a measure of central location, the mode's value diminishes with data sets that have more than ___ modes

3

IQR:

the range spanned by the middle 50% of the data

Q2 = ?

50th percentile = median

Q3 = ?

75th percentile

what does the z-score measure?

the distance of a given sample value from the mean, expressed in standard deviations; the z-score itself is unitless

to calculate the position of the percentile (Lp):

Lp = (n + 1) (p/100)

Median:

Midpoint in a string of sorted data, where 50% of the observations, or values, are below and 50% are above.

drawback of Chebyshev's theorem?

it results in conservative bounds for the percentage of observations falling in a particular interval. The actual percentage of observations lying in the interval may in fact be much larger.

dispersion (variability) is good to look at with ____?

location; for example, in choosing between supplier A and supplier B, we should consider not only the average delivery time for each but also the variability in delivery time for each

Use Chebyshev's theorem and the empirical rule to make ___?

precise statements regarding the percentage of data values that fall within a specified number of standard deviations from the mean

Empirical rule:

provides the approximate percentage of observations that fall within 1, 2, or 3 standard deviations from the mean.

the measures of dispersion/variability:

range, variance/standard deviation, coefficient of variation (CV)

Coefficient of variation (CV) textbook definition:

serves as a relative measure of dispersion and adjusts for differences in the magnitudes of the means.

the two main causes of variation:

special and common

if CV is greater than 1 or 100%, then ?

standard deviation is greater than the mean

if CV is less than 1 or 100%, then ?

standard deviation is less than the mean

what does a CV of 1.5 or greater indicate?

that it is an out-of-control condition and the data shouldn't be used to draw conclusions

the common measures of central tendency (mean, median, and mode) demonstrate which principle of variation?

that variation can be measured

what does a z-score of −1.5 imply?

that the given sample value is 1.5 standard deviations below the mean.

a high standard deviation indicates that ___?

the data are spread out

if the numbers calculated with the empirical rule and Chebyshev's theorem are different, then....

the data is non-normal

A low value of standard deviation indicates that ___?

the data points are close to the mean

Variance is based on ?

the difference between the value of each observation (xi) and the mean

in a box plot, if the median is right of center and the left whisker is longer than the right whisker....?

the distribution is negatively skewed.

in a box plot, If the median is left of center and the right whisker is longer than the left whisker....?

the distribution is positively skewed

For example, a z-score of 2 implies that...?

the given sample value is 2 standard deviations above the mean.

what is the Best measure of location for normal data?

the mean

which measure of location is Influenced by outliers or skewed data?

the mean

what is the Best measure of location for skewed or non-normal data?

the median

If a process is affected only by common or inherent causes of variation...

the process measurements will form distributions that are stable or predictable over time

if a process is affected by special causes of variation...

the process measurements will form distributions that are unstable or unpredictable over time

Use the z−score to find...

the relative position of a sample value within the data set by dividing the deviation of the sample value from the mean by the standard deviation

Range:

the simplest measure of dispersion; it is the difference between the maximum (Max) and the minimum (Min) values in a data set.

for a sample, standard deviation = s = ?

the square root of s^2 (the variance)

for a sample, variance = s^2 = ?

the sum of ((each observation) - (the mean)) squared, divided by ((the number of observations) - 1)
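A minimal sketch of the two sample formulas above, written out by hand rather than with a library. The function names and the data values are made up for illustration.

```python
import math

def sample_variance(data):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def sample_std(data):
    """s = positive square root of the sample variance."""
    return math.sqrt(sample_variance(data))

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample, mean = 5
print(sample_variance(values))       # squared deviations sum to 32, divided by 7
print(sample_std(values))
```

Dividing by n - 1 rather than n is what distinguishes the sample variance from the population variance.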

to calculate weighted mean (x-bar):

the sum of (the weight of an observation multiplied by the value of the observation), where the weights sum to 1; if they do not, divide by the sum of the weights
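A short sketch of the weighted mean. Dividing by the total weight is an added safeguard so the function also works when the weights do not sum to 1; the scores and weights are invented for illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

scores = [80, 90, 70]          # hypothetical exam, homework, quiz scores
weights = [0.5, 0.3, 0.2]      # hypothetical weights that sum to 1
print(weighted_mean(scores, weights))   # approximately 81
```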

mode:

the value of a data set that occurs most frequently

what happens when the CV increases?

the variation, relative to the mean, increases

Central location:

the way quantitative data tend to cluster around some middle or central value

what should you use when trying to forecast?

the weighted mean

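why doesn't the interquartile range, IQR = Q3 − Q1, depend on the extreme values?

because it uses only Q1 and Q3, the bounds of the middle 50% of the data; this makes the IQR a good measure of dispersion when extreme values are present, although it still does not incorporate all the data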

The purpose of measuring central location is...

to find a typical or central value that describes the data

true or false: An equal number of observations lie above and below the median

true

z-scores can predict outliers, true or false

true

with Chebyshev's theorem you can...

use the standard deviation to make statements about the proportion of observations that fall within certain intervals

Population Mean:

used when the data come from the entire population, as opposed to a sample

finding the Median of a data set:

value in the middle when data items are arranged in ascending order

Mode of a data set:

value that occurs with greatest frequency

dispersion (in this sense) = ?

variability

multimodal:

when more than two modes exist

bimodal:

when two modes exist

with the z-score of the min and max, what can you determine?

whether the data is normal or non-normal

Mode:

The most frequently occurring value

Standard Deviation of a data set:

positive square root of the variance

CV is Calculated by...

(s / x-bar) × 100: dividing a data set's standard deviation by its mean and expressing the result as a percentage. CV is a unitless measure that allows for direct comparisons of mean-adjusted dispersion across different data sets.
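A small sketch of the CV calculation using Python's standard `statistics` module (`statistics.stdev` is the sample standard deviation). The two data sets are made up; they differ in magnitude by a factor of 10 but have the same relative dispersion.

```python
import statistics

def coefficient_of_variation(data):
    """CV as a percentage: (s / x_bar) * 100."""
    x_bar = statistics.mean(data)
    if x_bar == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(data) / x_bar * 100

small = [10, 12, 14]       # hypothetical data, mean 12, s = 2
large = [100, 120, 140]    # same shape, ten times larger
print(coefficient_of_variation(small))
print(coefficient_of_variation(large))
```

Both calls give the same CV (about 16.7%), even though the standard deviations themselves are 2 and 20.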

z-score = ?

(x - x-bar) / s
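A minimal sketch of standardizing a sample with this formula, assuming the sample standard deviation from Python's `statistics` module. The data values are invented for illustration.

```python
import statistics

def z_scores(data):
    """Convert each value to (x - x_bar) / s."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)
    if s == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - x_bar) / s for x in data]

sample = [10, 12, 14]      # hypothetical data, mean 12, s = 2
print(z_scores(sample))    # one unit below, at, and one unit above the mean
```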

the principles of variation:

1. variation always exists
2. variation can be measured
3. variation forms a pattern called a distribution
4. a distribution can vary in central tendency/location, spread, and shape

the pth percentile divides a data set into two parts:

Approximately p percent of the observations have values less than the pth percentile; AND Approximately (100 - p ) percent of the observations have values greater than the pth percentile.

Mean/Average:

Average of a set of values

what is The main difference between Chebyshev's theorem and the empirical rule?

Chebyshev's theorem applies to all data sets whereas the empirical rule is appropriate when the distribution is symmetric and bell-shaped

standardizing the data

Converting sample data into z-scores

Chebyshev's theorem:

For any data set, the proportion of observations that lie within k standard deviations from the mean is at least 1 − 1/k^2, where k is any number greater than 1.
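To see how conservative the bound is, the sketch below evaluates 1 − 1/k^2 for k = 2 and k = 3 next to the usual empirical-rule approximations (about 68%, 95%, and 99.7%). Those percentages are standard figures for bell-shaped data and are not stated in this set; the names are illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

# Approximate proportions for symmetric, bell-shaped data only.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(k, chebyshev_lower_bound(k), EMPIRICAL_RULE[k])
```

For k = 2 the theorem guarantees at least 75%, while bell-shaped data typically show about 95% in the same interval.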

IQR = ?

Q3 - Q1

Box plot (box-and-whisker plot):

a convenient way to graphically display the minimum value (Min), the quartiles (Q1, Q2, and Q3), and the maximum value (Max) of a data set; also an effective tool for identifying outliers and skewness, and used to informally gauge the shape of the distribution

the median is the measure most often reported for

annual income and property value data [A few extremely large incomes or property values can inflate the mean.]

why is the range Not considered a good measure of dispersion?

because it focuses solely on the extreme values and ignores every other observation in the data set

why use the variance and s.d. instead of finding the average distance from the mean?

because the deviations from the mean always sum to 0, so their average would be 0
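A quick numerical check of this point, using made-up data: the signed deviations cancel out, while the squared deviations (the basis of the variance) do not.

```python
data = [4, 8, 15, 16, 23, 42]          # hypothetical values, mean = 18
mean = sum(data) / len(data)

deviations = [x - mean for x in data]
squared_deviations = [d ** 2 for d in deviations]

print(sum(deviations))           # positive and negative deviations cancel
print(sum(squared_deviations))   # squaring removes the signs, so this is positive
```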

the following are examples that describe what?
• A typical value that describes the return on an investment
• The number of defects in a production process
• The salary of a business graduate
• The rental price in a neighborhood
• The number of customers at a local convenience store

central location

variation exists in ______ and can be measured by ____ _____ because it has a _____ that we want to find

everything; central tendency; pattern

what do measures of dispersion indicate?

how the data vary around the center

when is it preferable to use the empirical rule?

if the histogram or other visual and numerical measures suggest a symmetric and bell-shaped distribution

what does "1.5 × IQR" tell us when used correctly?

whether there is an outlier: a value more than 1.5 × IQR below Q1 or above Q3 is flagged as an outlier
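A sketch of the common 1.5 × IQR check, which flags values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR. It assumes `statistics.quantiles` from the Python standard library (Python 3.8+), whose default "exclusive" method interpolates at positions based on n + 1. The function name and data are hypothetical.

```python
import statistics

def iqr_outliers(data):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(data, n=4)   # default method="exclusive"
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

delivery_days = [3, 5, 7, 8, 12, 13, 14, 18, 21, 60]   # made-up delivery times
print(iqr_outliers(delivery_days))   # only the value 60 falls outside the fences
```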

Coefficient of Variation:

indicates how large the standard deviation is in relation to the mean

what does it mean If the mean and median are substantially different?

it is most likely that the data set contains outliers

to calculate "mu":

population mean = mu = (sum of values) / (number of observations in the population)

measures of location:

mean, median, mode, weighted mean, percentiles/quartiles

Variance:

measure of variability that utilizes all the data

Population Parameters:

measures computed for data from a population

Sample Statistics:

measures computed for data from a sample

what answers this question? Where does the average or location of the population "tend to center?"

measures of central tendency

what divides the data in half?

median

what is Also referred to as the 50th percentile?

median

When data set has extreme values...

median is the preferred measure of central location

which measure of location is not often used?

mode

unimodal:

when exactly one mode exists

the five summary values:

• Min = smallest value
• Q1 = first quartile = 25th percentile
• Q2 = median = second quartile = 50th percentile
• Q3 = third quartile = 75th percentile
• Max = largest value
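The five values can be collected in one small helper. This is a sketch that assumes `statistics.quantiles` from the Python standard library (Python 3.8+) with its default "exclusive" method; the function name and data are illustrative.

```python
import statistics

def five_number_summary(data):
    """Return (Min, Q1, Q2, Q3, Max) for a data set."""
    q1, q2, q3 = statistics.quantiles(data, n=4)   # default method="exclusive"
    return (min(data), q1, q2, q3, max(data))

sample = [3, 7, 8, 5, 12, 14, 21, 13, 18]   # hypothetical data
print(five_number_summary(sample))
```

These are the same five values a box plot displays.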

the weighted mean is used when...

some observations are more important than others

after finding the position Lp, you find the percentile or Q_ =

p1 + [(decimal in Lp)(p2 - p1)], where p1 is the observation at the integer part of Lp in the sorted data and p2 is the next observation; so, the pth percentile is located __% (decimal in Lp converted into a percentage) of the distance between the ___th and ___th observation
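A sketch of the two-step percentile calculation: find the position Lp = (n + 1)(p/100) in the sorted data, then move the decimal part of Lp of the way from the observation at the integer part of Lp toward the next higher observation. Returning the smallest or largest value when Lp falls outside the data is an assumption of this sketch, and the function name and data are illustrative.

```python
import math

def percentile(data, p):
    """pth percentile via the position L_p = (n + 1) * (p / 100)."""
    values = sorted(data)
    n = len(values)
    position = (n + 1) * p / 100         # L_p, a 1-based position
    lower_rank = math.floor(position)    # integer part of L_p
    fraction = position - lower_rank     # decimal part of L_p
    if lower_rank < 1:                   # assumption: clamp to the smallest value
        return values[0]
    if lower_rank >= n:                  # assumption: clamp to the largest value
        return values[-1]
    lower = values[lower_rank - 1]       # observation at the integer part of L_p
    upper = values[lower_rank]           # next observation in sorted order
    return lower + fraction * (upper - lower)

sample = [3, 7, 8, 5, 12, 14, 21, 13, 18]   # hypothetical data, n = 9
print(percentile(sample, 25))   # L_25 = 2.5: halfway between the 2nd and 3rd values
print(percentile(sample, 50))   # L_50 = 5: the 5th value, which is the median
print(percentile(sample, 75))   # L_75 = 7.5: halfway between the 7th and 8th values
```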

sample mean (x-bar):

point estimator of the population mean (mu)

Sample Statistic =

point estimator for corresponding population parameter

