Quantitative Survey Methods Ch 3

What are the five values in a five-number summary?

Smallest value, first quartile, median, third quartile, largest value

an outlier might be:

An incorrectly recorded data value; a data value that was incorrectly included in the data set; or a correctly recorded unusual data value that belongs in the data set

For Chebyshev's theorem, at least 75% of the data values must be within z = ____ standard deviations of the mean.

2

First Quartile = ____th Percentile

25th

According to the empirical rule, almost all of the data values will be within +/- ____ standard deviations of the mean.

3

For Chebyshev's theorem, at least 94% of the data values must be within z = ____ standard deviations of the mean.

4

Second Quartile = ___th Percentile = Median

50th

According to the empirical rule, for data having a bell-shaped distribution, approximately ____% of the data values will be within +/- 1 standard deviation of the mean.

68%

Third Quartile = ___th Percentile

75th

For Chebyshev's theorem, at least _____% of the data values must be within z = 3 standard deviations of the mean.

89

According to the empirical rule, approximately ___% of the data values will be within +/- 2 standard deviations of the mean.

95

___________ is a measure of linear association and not necessarily causation.

Correlation

_________ _______ refers to functionality in interactive dashboards that allows the user to access information and analyses at increasingly detailed levels.

Drilling down

Range =

Largest value - Smallest value

Formula to compute Lp, the location of the pth percentile

Lp= (p/100)(n + 1)
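
The location formula above can be sketched in Python as follows. The card only gives the location Lp; the linear interpolation used when Lp is not a whole number is an assumed convention, and the function name `percentile` and the data are made up for illustration.

```python
def percentile(data, p):
    """Return the pth percentile using the location Lp = (p/100)(n + 1)."""
    values = sorted(data)
    n = len(values)
    location = (p / 100) * (n + 1)   # 1-based position in the sorted data
    lower = int(location)            # whole part of the position
    fraction = location - lower      # fractional part of the position
    if lower < 1:                    # position falls before the first value
        return values[0]
    if lower >= n:                   # position falls at or after the last value
        return values[-1]
    # Assumed convention: interpolate linearly between the two neighbours.
    return values[lower - 1] + fraction * (values[lower] - values[lower - 1])


sample = [10, 20, 30, 40, 50, 60, 70, 80]   # made-up data, n = 8
q1 = percentile(sample, 25)   # location 2.25
q2 = percentile(sample, 50)   # location 4.5 (the median)
q3 = percentile(sample, 75)   # location 6.75
print(q1, q2, q3)
```

With n = 8, the 25th percentile sits a quarter of the way between the 2nd and 3rd sorted values.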

___________ are specific percentiles.

Quartiles

If the data distribution is symmetric, the skewness is _____. a. 0 b. .5 c. 1 d. None of these answers are correct.

a. 0

Which of the following is NOT a measure of variability of a single variable? a. covariance b. standard deviation c. range d. coefficient of variation

a. covariance

A(n) _____ is an unusually small or unusually large data value. a. outlier b. median c. sample statistic d. z-score

a. outlier

A numerical measure computed from a sample, such as sample mean, is known as a _____. a. sample statistic b. population parameter c. sample parameter d. population statistic

a. sample statistic

which of the following is not a measure of location: a) median b) variance c) mode d) mean

b) variance

Which of the following is not a measure of dispersion? a. range b. 50th percentile c. standard deviation d. interquartile range

b. 50th percentile

The measure of location often used in analyzing growth rates in financial data is the _____. a. hyperbolic mean b. geometric mean c. arithmetic mean d. weighted mean

b. geometric mean

The coefficient of variation indicates how large the standard deviation is relative to the _____. a. median b. mean c. range d. variance

b. mean

For data skewed to the left, the skewness is _____. a. positive b. negative c. between 0 and .5 d. less than 1

b. negative

Which of the following symbols represents the standard deviation of a population? a. μ b. σ c. x̄ d. σ²

b. σ

If the data have exactly two modes, the data are ____________

bimodal

A ____ ________ is a graphical summary of data that is based on a five-number summary.

box plot

A set of visual displays organizing and presenting information used to monitor the performance of a company or organization in a manner that is easy to read, understand, and interpret is called a _____. a. crosstabulation b. stem-and-leaf display c. data dashboard d. stacked bar chart

c. data dashboard

In a five-number summary, which of the following is NOT used for data summarization? a. largest value b. median c. mean d. smallest value

c. mean

The measure of variability easiest to compute, but seldom used as the only measure, is the _____. a. interquartile range b. variance c. range d. standard deviation

c. range

Which of the following values of r indicates the strongest correlation? a. .361 b. 0 c. −.9 d. .82

c. −.9

The mean provides a measure of ___________ _____________.

central location

The _____________ ___ __________ indicates how large the standard deviation is in relation to the mean

coefficient of variation

the _____________ is a measure of the linear association between two variables. Positive values indicate a positive relationship. Negative values indicate a negative relationship

covariance

Two descriptive measures of the relationship between two variables are ______________ and _____________ ________________

covariance and correlation coefficient
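
A minimal sketch of both measures, assuming sample data (so sums are divided by n - 1) and the Pearson product-moment definition of the correlation coefficient. The function names and the three data points are hypothetical.

```python
import math


def sample_covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))   # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


x = [1, 2, 3]   # made-up paired observations
y = [1, 3, 2]
cov_xy = sample_covariance(x, y)
r_xy = correlation(x, y)
print(cov_xy, r_xy)
```

Unlike the covariance, the correlation coefficient does not depend on the units of x and y, which is why it is easier to compare across data sets.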

Since the median is the middle value of a data set, it must always be _____. a. smaller than the mode b. smaller than the mean c. larger than the mode d. None of these answers are correct

d. None of these answers are correct

When the data are positively skewed, the mean will usually be _____. a. less than the median b. greater than the mode c. less than the mode d. greater than the median

d. greater than the median

A graph with skewness −1.8 would be which of the following? a. moderately skewed right b. highly skewed left c. moderately skewed left d. highly skewed right

b. highly skewed left

The correlation coefficient ranges from which two values? a. 0 and 1 b. 1 and 100 c. minus infinity and plus infinity d. −1 and +1

d. −1 and +1

When the data are believed to approximate a bell-shaped distribution: The _____________ ________ can be used to determine the percentage of data values that must be within a specified number of standard deviations of the mean.

empirical rule
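
The 68%, 95%, and "almost all" figures quoted for the empirical rule can be checked against the normal distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within +/- {k} standard deviations: {share:.4f}")
```

These proportions hold only for bell-shaped data; for any other shape, Chebyshev's theorem gives the guaranteed minimum.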

Summary statistics and easy-to-draw graphs can be used to quickly summarize large quantities of data. Two tools that accomplish this are _______-________ summaries and _____ ______

five-number summaries and box plots

The _______________ _________ is calculated by finding the nth root of the product of n values. It is often used in analyzing growth rates in financial data (where using the arithmetic mean will provide misleading results). It should be applied anytime you want to determine the mean rate of change over several successive periods (be it years, quarters, weeks, . . .). Other common applications include changes in populations of species, crop yields, pollution levels, and birth and death rates.

geometric mean
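
A short sketch of why the arithmetic mean misleads for growth rates. The two growth factors are hypothetical (a 10% gain followed by a 10% loss), and `geometric_mean` is an illustrative helper, not a function named in the course material.

```python
import math


def geometric_mean(growth_factors):
    """nth root of the product of n growth factors."""
    n = len(growth_factors)
    return math.prod(growth_factors) ** (1 / n)


factors = [1.10, 0.90]                    # hypothetical: +10% one year, -10% the next
g_mean = geometric_mean(factors)          # mean growth factor per period
a_mean = sum(factors) / len(factors)      # arithmetic mean of the same factors
final_value = 100 * math.prod(factors)    # what 100 is worth after both years
print(g_mean, a_mean, final_value)
```

The arithmetic mean of the factors is 1.0, which suggests no change, yet 100 ends up at about 99. The geometric mean is slightly below 1 and reproduces that ending value when compounded over both periods.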

A data value greater than the sample mean will have a z-score ___________ than zero.

greater

At least (1- 1/z^2) of the items in any data set will be within z standard deviations of the mean, where z is any value _________ than 1.

greater

Chebyshev's theorem requires z > 1, but z need not be an __________.

integer
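
The bound 1 - 1/z^2 from Chebyshev's theorem can be evaluated directly. A minimal sketch; the function name `chebyshev_bound` is made up, and the check for z > 1 mirrors the requirement stated on the card.

```python
def chebyshev_bound(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2


# z does not have to be an integer, so 1.5 is a valid input as well.
for z in (1.5, 2, 3, 4):
    print(f"z = {z}: at least {chebyshev_bound(z):.2%}")
```

For z = 2, 3, and 4 this gives 75%, about 89%, and about 94%, matching the cards above.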

The ________________ _______ of a data set is the difference between the third quartile and the first quartile. It is the range for the middle 50% of the data. It overcomes the sensitivity to extreme data values.

interquartile range

Limits for a box plot are located (not drawn) using the _______________ ________.

interquartile range (IQR)
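
A sketch of how the limits might be located. The card does not give the multiplier; the 1.5 x IQR rule used here is the common convention and is an assumption, as are the function names and the example quartiles and data.

```python
def box_plot_limits(q1, q3, k=1.5):
    """Lower and upper limits located k * IQR beyond the quartiles (k = 1.5 assumed)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, q1, q3, k=1.5):
    """Values that fall outside the box plot limits."""
    lower, upper = box_plot_limits(q1, q3, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical quartiles and data, chosen only to illustrate the rule.
lower, upper = box_plot_limits(q1=20, q3=60)
outliers = find_outliers([5, 20, 35, 40, 60, 130], q1=20, q3=60)
print(lower, upper, outliers)
```

Here the IQR is 40, so the limits fall at -40 and 120, and only the value 130 is flagged.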

A data value less than the sample mean will have a z-score _____ than zero.

less

The median is the measure of location most often reported for annual income and property value data, because a few extremely large incomes or property values can inflate the __________.

mean

Perhaps the most important measure of location is the __________.

mean

The ______ of a data set is the average of all the data values.

mean

Data dashboards are not limited to graphical displays. The addition of numerical measures, such as the ______ and _________ __________ of KPIs, to a data dashboard is often critical

mean and standard deviation

Whenever a data set has extreme values, _________ is the preferred measure of central location.

median

The _________ of a data set is the value in the middle when the data items are arranged in ascending order.

median

The _____________ provides the preferred measure of location when the data are highly skewed.

median

for an even number of observations, the _______ is the average of the two middle values

median

the 50th percentile is the __________

median

A key to the development of a box plot is the computation of the __________ and the quartiles ____ and ____.

median; Q1 and Q3

for an odd number of observations, the median is the _________ value

middle

The _________ of a data set is the value that occurs with greatest frequency. The greatest frequency can occur at two or more different values.

mode

If the data have more than two modes, the data are ________________

multimodal

If the distribution shape is moderately skewed left, the skewness is _______________. The mean will usually be less than the median.

negative

The empirical rule is based on _________ _________________

normal distribution

n=

number of observations in the sample

An __________ is an unusually small or unusually large value in a data set. A data value with a z-score less than -3 or greater than +3 might be considered an outlier. A stricter cutoff of -2 and +2 can be used instead, depending on how outliers are to be flagged.

outlier

Box plots provide another way to identify __________

outliers

In a box plot, data outside the limits located using the interquartile range are considered __________. The location of each is shown with a dot.

outliers

The pth percentile of a data set is a value such that at least ___ percent of the items take on this value or less and at least (100 - p ) percent of the items take on this value or more

p

a ____________ provides information about how the data are spread over the interval from the smallest value to the largest value.

percentile

A sample statistic is referred to as the _________ ______________ of the corresponding population parameter

point estimator

The sample mean x̄ is the ________ _______________ of the population mean μ.

point estimator

If the measures are computed for data from a population, they are called ________________ ______________.

population parameters

If the distribution shape is moderately skewed right, the skewness is ____________. The mean will usually be more than the median.

positive

the __________ of a data set is the difference between the largest and smallest data values. It is the simplest measure of variability. It is very sensitive to the smallest and largest data values

range

If the measures are computed for data from a sample, they are called ____________ _________

sample statistics

An important measure of the shape of a distribution is called _____________

skewness

The ____________ __________ of a data set is the positive square root of the variance. It is measured in the same units as the data, making it more easily interpreted than the variance

standard deviation

The z-score is often called the ____________ _____________.

standardized value

ΣXi=

sum of the values of the n observations

Another measure sometimes used when extreme values are present, is the ___________ _________. It is obtained by deleting a percentage of the smallest and largest values from a data set and then computing the mean of the remaining values

trimmed mean
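
A minimal sketch of a trimmed mean, assuming the same fraction is removed from each end and that the number of values to drop is rounded down. The function name and the data are hypothetical.

```python
def trimmed_mean(data, fraction):
    """Mean after dropping `fraction` of the values from each end of the sorted data."""
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be at least 0 and less than 0.5")
    values = sorted(data)
    k = int(len(values) * fraction)          # values dropped from each end (rounded down)
    kept = values[k:len(values) - k]
    return sum(kept) / len(kept)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]      # made-up data with one extreme value
plain = sum(data) / len(data)
trimmed = trimmed_mean(data, 0.10)           # drops the smallest and the largest value
print(plain, trimmed)
```

The single extreme value pulls the ordinary mean up to 14.5, while the 10% trimmed mean is 5.5.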

True or false: Just because two variables are highly correlated, it does not mean that one variable is the cause of the other.

true

Often a manager or decision-maker is interested in the relationship between _____ variables.

two

It is often desirable to consider measures of ____________ (dispersion), as well as measures of location.

variability

The ____________ is a measure of variability that utilizes all the data. It is based on the difference between the value of each observation (xi) and the mean (X̅ for a sample, μ for a population). The variance is useful in comparing the variability of two or more variables

variance

The ______________ is the average of the squared differences between each data value and the mean

variance
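
A sketch of the variance, standard deviation, and coefficient of variation. One detail the card leaves out: for a sample the sum of squared differences is conventionally divided by n - 1, while for a population it is divided by N. The function name and the five data values are for illustration only.

```python
import math


def variance(data, sample=True):
    """Sum of squared differences from the mean, divided by n - 1 (sample) or n (population)."""
    n = len(data)
    mean = sum(data) / n
    squared_diffs = sum((x - mean) ** 2 for x in data)
    return squared_diffs / (n - 1 if sample else n)


values = [46, 54, 42, 46, 32]                # illustrative data, mean = 44
mean = sum(values) / len(values)
s2 = variance(values)                        # sample variance
s = math.sqrt(s2)                            # sample standard deviation
cv = s / mean * 100                          # coefficient of variation, in percent
pop_var = variance(values, sample=False)     # population variance of the same numbers
print(s2, s, round(cv, 1), pop_var)
```

The squared differences sum to 256, so the sample variance is 64, the standard deviation is 8, and the coefficient of variation is about 18.2%.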

The correlation coefficient can take on values between -1 and +1. Values near -1 indicate a strong negative linear relationship. Values near +1 indicate a strong positive linear relationship. The closer the correlation is to zero, the __________ the relationship

weaker

In some instances the mean is computed by giving each observation a _________ that reflects its relative importance. The choice of weights depends on the application

weight
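
A weighted mean can be sketched as the sum of weight times value divided by the sum of the weights. The function name, the scores, and the weights below are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


scores = [80, 90]          # hypothetical: a midterm and a final exam
weights = [1, 3]           # the final counts three times as much
course_score = weighted_mean(scores, weights)
print(course_score)
```

The result, 87.5, sits closer to the final exam score because that observation carries more weight; with equal weights the formula reduces to the ordinary mean.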

The process of converting a value for a variable to a z-score is often referred to as a ___ ___________________.

z transformation

An observation's ____-_______ is a measure of the relative location of the observation in a data set.

z-score

The ____-_______ denotes the number of standard deviations a data value Xi is from the mean.

z-score

Suppose annual salaries for sales associates from Hayley's Heirlooms have a bell-shaped distribution with a mean of $32,500 and a standard deviation of $2,500. The z-score for a sales associate from this store who earns $37,500 is _____.

z-score = 2
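
The arithmetic behind this answer, as a small sketch: the z-score is the data value minus the mean, divided by the standard deviation. The function name `z_score` is made up; the numbers come from the question.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


salary_z = z_score(37_500, mean=32_500, std_dev=2_500)   # (37,500 - 32,500) / 2,500
print(salary_z)
```

The salary is $5,000 above the mean, which is two standard deviations of $2,500 each.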

A data value equal to the sample mean will have a z-score of _________

zero

If the distribution shape is symmetric (not skewed), the skewness is ______. The mean and median are equal.

zero

Sample mean x̄ formula

x̄ = ΣXi / n

Population mean μ formula

μ = ΣXi / N

