Statistics 2001 Chapters 1, 2, 3

A pie chart is a segmented circle whose segments add up to ______ degrees.

360

The p-value approach to hypothesis testing has ______ steps.

4

A one-way ANOVA test is based on which distribution?

$F_{(df_1, df_2)}$

These days, it has become easy to access data by simply using a search engine like ______.

Google

There are several guidelines to follow when constructing graphs that summarize statistical data. Which of the following statements is LEAST accurate?

Graphs should have a lot of adornments.

The two-way ANOVA test can be extended to capture the ______ between the factors.

interaction

SSB/(r − 1) = ______

MSB

When constructing a histogram, what values/labels go on the horizontal (x) axis and the vertical (y) axes?

Quantitative class limits on the horizontal axis; frequency or relative frequency on the vertical axis.

In a one-way ANOVA table, ______ = SSTR + SSE.

SST

True or false: The alternative hypothesis always states the opposite of the null hypothesis.

True

True or false: The mean is the most widely used measure of central location for quantitative data.

True

True or false: The two-way ANOVA test can be conducted with or without examining the interaction of the two factors.

True

The significance level is the probability of making

a Type I error.

In order to calculate the arithmetic mean, one

adds all of the data points, then divides by the number of data points.

One method of graphical presentation for qualitative data is a _____.

bar chart

An important final conclusion to a statistical test is to...

clearly interpret the results in terms of the initial claim.

Relative frequency distributions are generally more useful than frequency distributions when

comparing data sets of different sizes.

Consider the following variable: a runner's time in a 100-meter race. This variable is best categorized as a ______ variable.

continuous

The R ______ function generates the correlation coefficient as well as the value of the test statistic and the p-value.

cor.test

The ______ ______ approach to hypothesis testing is attractive when a computer is unavailable and all calculations must be done by hand.

critical value

The branch of statistics that summarizes important aspects of a data set is often referred to as ______ statistics.

descriptive

The ANOVA test assumes the samples are selected ______.

independently

If the value of the test statistic falls in the rejection region, then the p-value must be

less than α.

The test statistic when the population standard deviation is known is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$. This formula is valid only if $\bar{X}$ follows a ______ distribution.

normal

In most applications, we require some form of the equality sign in the ______ hypothesis.

null

The two competing hypotheses used in hypothesis testing are called the ______ hypothesis and the ______ hypothesis.

null, alternative

A cumulative frequency distribution identifies the number of ______ that fall below the upper limit of a particular interval.

observations

When performing a hypothesis test on μ, the p-value is defined as the

observed probability of making a Type I error.

The average of the sum of squared differences from the mean is the

population variance

The ______ is not considered a good measure of dispersion because it focuses solely on the extreme values and ignores every other observation in the sample or the population.

range

We always use ______ evidence and the chosen significance level α to conduct hypothesis tests.

sample

Sampling, rather than surveying an entire population, can offer some substantial benefits. Some of those benefits include

saving money and time.

A one-way ANOVA test is better than using a series of two-sample t tests because conducting a

series of two-sample t tests inflates the risk of committing a Type I error.
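
A rough numerical illustration of this inflation, as a sketch only. It assumes the pairwise tests are independent, which is a simplification (pairwise t tests share samples), and the values of `alpha` and `c` are illustrative choices, not taken from the cards.

```python
from math import comb

alpha = 0.05   # significance level of each individual t test (illustrative)
c = 3          # number of populations being compared (illustrative)

# Number of two-sample t tests needed to compare every pair of populations.
num_tests = comb(c, 2)

# Probability of at least one Type I error across all tests,
# ASSUMING the tests are independent (a simplification).
familywise = 1 - (1 - alpha) ** num_tests

print(num_tests)
print(round(familywise, 4))
```

With three populations the combined risk is roughly 14%, well above the 5% used for each individual test.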

A polygon gives a general idea of the ______ of a distribution.

shape

Histograms can be used to determine the ______ of the data.

shape

In an ANOVA test, we compute the grand mean by calculating

the sum of all the observations and then dividing by the total number of observations.

The critical value of a hypothesis test is

the value that separates the rejection region from the non-rejection region.

Data that are collected by recording a characteristic of a subject over several time periods are referred to as ______ data.

time series

Since ANOVA techniques were originally developed in connection with agricultural experiments, the term ______ is often used to identify the populations being examined for an ANOVA analysis.

treatment

The formula for the sample mean is

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Match these terms with their meanings: α β

α: The probability of a Type I error. β: The probability of a Type II error.

In which of the following data sets would the arithmetic mean NOT be a good measure of central location?

7, 8, 8, 9, 25
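
A quick check of why the mean is a poor summary here, using only Python's standard `statistics` module. The single large value 25 pulls the mean above four of the five observations, while the median and mode stay at 8.

```python
from statistics import mean, median, mode

data = [7, 8, 8, 9, 25]   # the data set from the card above

data_mean = mean(data)      # sum of the observations divided by their count
data_median = median(data)  # middle value of the sorted data
data_mode = mode(data)      # most frequently occurring value

print(data_mean, data_median, data_mode)
```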

What is the most widely-used measure of central location?

Mean

Place the sums of squares from a one-way ANOVA table in the correct order.

SSTR SSE SST

The branch of statistics that draws conclusions about a large set of data based on a smaller set of data is often referred to as ______ statistics.

inferential

Hypothesis testing is analogous to a criminal court of law where someone is ______ until proven ______.

innocent, guilty

It is not sufficient to end the analysis with a conclusion that you reject the null hypothesis or you do not reject the null hypothesis. You must ______ the results.

interpret

Generally, the ______ is the best measure of central location when outliers are present.

median

A(n) ______ is a segmented circle whose segments portray the relative frequencies of the categories of some qualitative variable.

pie chart

One method of graphical presentation for qualitative data is a(n) ______.

pie chart or bar chart

The first step to determine the median is to

place the data in numerical order

When performing a hypothesis test on μ when the value of σ is unknown, the test statistic is computed as $\frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ and it follows the

$t_{df}$ distribution with (n - 1) degrees of freedom.

This is the symbol for the population mean.

μ

This is the symbol for the sample mean.

$\bar{x}$

Suppose you are performing a hypothesis test on μ and the value of σ is known. At the 5% significance level, the critical value(s) for a two-tailed test is (are):

$-z_{0.025}$ and $z_{0.025}$
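
As a sketch, these critical values can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # significance level from the card above

# For a two-tailed test, alpha is split equally between the two tails,
# so the upper critical value cuts off an area of alpha / 2 on the right.
z_upper = NormalDist().inv_cdf(1 - alpha / 2)
z_lower = -z_upper

print(round(z_lower, 2), round(z_upper, 2))
```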

True or false: The alternative hypothesis HA in one-way ANOVA requires that all means differ from one another.

False

The alternative hypothesis for a two-sided test for a population mean would be denoted as

HA: μ ≠ μ0

Which of the following BEST describes a frequency distribution for qualitative data?

It groups data into categories, and records the number of observations in each category.

In a neighborhood there are five houses listed for sale for the following amounts: $250,000; $275,000; $280,000; $295,000; and $515,000. What is the BEST measure of central location for the price of a house in the neighborhood?

Median

Which of the following graphical depictions displays cumulative data?

Ogive

Which of the following is an example of cross-sectional data?

Results of market research testing current consumer preferences for soda drinks

In one-way ANOVA, the mean square for treatments (MSTR) is calculated how?

SSTR/(c-1)
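
A minimal sketch of the one-way ANOVA arithmetic on a small made-up data set (the three samples below are hypothetical). It uses the standard definitions: SSTR from the sample means, SSE from the within-sample deviations, MSTR = SSTR/(c − 1), MSE = SSE/(nT − c), and F = MSTR/MSE.

```python
# Three hypothetical treatments (samples), three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

all_obs = [x for g in groups for x in g]
n_total = len(all_obs)   # total number of observations, nT
c = len(groups)          # number of treatments

# Grand mean: sum of all observations divided by the total count.
grand_mean = sum(all_obs) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Between-treatments variability (sum of squares due to treatments).
sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-treatments variability (error sum of squares).
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
# Total sum of squares; should equal SSTR + SSE.
sst = sum((x - grand_mean) ** 2 for x in all_obs)

mstr = sstr / (c - 1)
mse = sse / (n_total - c)
f_stat = mstr / mse

print(sstr, sse, sst, f_stat)
```

The F statistic would then be compared with the right tail of the F distribution with (c − 1, nT − c) degrees of freedom.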

Match these shape and association measures with their Excel function names.

Skewness: =SKEW(array)
Kurtosis: =KURT(array)
Sample covariance: =COVARIANCE.S(array1, array2)
Correlation: =CORREL(array1, array2)

______ is the science that deals with the collection, preparation, analysis, interpretation, and presentation of data.

Statistics

A sales invoice is what type of data?

Structured

Which of the following is an example of inferential statistics?

Test the longevity of all light bulbs based on a sample of 100 light bulbs.

When there are an odd number of observations, and the observations are in order from smallest to largest, the median is...

The middle observation

All of the following are examples of continuous variables EXCEPT:

The number of children in a family

A(n) ______ depicts the frequency or the relative frequency for each category of a qualitative variable as a series of horizontal or vertical bars, the lengths of which are proportional to the values that are depicted.

bar chart

In one-way ANOVA, two independent estimates of the common population variance $\sigma^2$ are computed. These estimates are commonly referred to as ______.

between-treatments variability and within-treatments variability

The ______ is a weighted sum of the sample variances of each treatment.

error sum of squares

The p-value is the likelihood of obtaining a sample mean that is at least as ______ as the one derived from the given sample, under the assumption that the null hypothesis is true as an equality.

extreme

In general, data are compilations of ______, ______, or other ______.

facts, figures, or other contents

An owner of a grocery store wants to determine the brands of soda that customers purchase at the store. When summarizing the data about soda brand purchases, the meaningful measure of central location is the ______.

mode

The ______ is a measure of central location that is the most frequently occurring value in the data set.

mode

When summarizing a qualitative data set, the ______ is the best measure of central location.

mode

A quantitative variable is also known as a ______ variable.

numerical

The mean is usually greater than the median when the data are ______ skewed.

positively

Performing a one-way ANOVA test, instead of performing a series of two-sample t tests, ______ the risk of incorrectly rejecting the null hypothesis.

reduces

A ______ is a (measured) subset of a population.

sample

Histograms can be used to observe the -----of the data.

spread or variability

In one-way ANOVA, the error sum of squares (SSE) is the

sum of the weighted sample variances of each treatment.

A ______ distribution is one that is a mirror image of itself on both sides of its center.

symmetric

The term ______ is often used to identify the c populations being examined.

treatments

In two-way ANOVA with interaction, we partition the total sum of squares SST into the following components:

SSA, SSB, SSAB, and SSE

If one variable decreases as the other variable decreases, the two variables have what type of relationship?

Positive

If the interaction between two factors is not significant, what are the next ANOVA tests to be done?

Tests about the population means of factor A and/or factor B

When constructing classes for a frequency distribution for quantitative data, which of the following statements is LEAST accurate?

The number of classes should equal the number of observations.

Which of the following is NOT an assumption for performing a one-way ANOVA?

The population correlation coefficients indicate a strong linear relationship.

Which of these is a NULL hypothesis applicable for a two-way ANOVA test with interaction?

There is no interaction between factors A and B.

In descriptive statistics, a polygon is best described as a

graph that connects the midpoints of each class and its associated frequency or relative frequency.

In a two-way ANOVA test, the sum of squares for factor B is based on the sum of the squared differences between the mean for each level of factor B and the ______ ______.

grand mean

In two-way ANOVA without interaction, the error sum of squares (SSE) is calculated as ______.

SST - (SSA + SSB)
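
A sketch of this decomposition for a two-way layout without interaction, with one observation per cell. The table is hypothetical, and the convention assumed here is that factor A has c levels (columns) and factor B has r levels (rows).

```python
# Hypothetical data: rows are levels of factor B, columns are levels of factor A.
table = [
    [2, 4, 6],
    [4, 6, 8],
    [3, 8, 13],
]
r = len(table)      # levels of factor B (rows)
c = len(table[0])   # levels of factor A (columns)
n_total = r * c

grand_mean = sum(sum(row) for row in table) / n_total
row_means = [sum(row) / c for row in table]
col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

# Sums of squares for each factor, based on level means versus the grand mean.
ssa = r * sum((m - grand_mean) ** 2 for m in col_means)
ssb = c * sum((m - grand_mean) ** 2 for m in row_means)
sst = sum((x - grand_mean) ** 2 for row in table for x in row)

# Error sum of squares is whatever SSA and SSB leave unexplained.
sse = sst - (ssa + ssb)

msa = ssa / (c - 1)
msb = ssb / (r - 1)
mse = sse / (n_total - c - r + 1)

print(ssa, ssb, sse, msa / mse, msb / mse)
```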

_____ data often consist of numerical information that is objective and is not open to interpretation.

Structured

The ANOVA test is a ______-tailed test.

right

If the value of the sample covariance between the two random variables X and Y equals -150, then we can conclude that X and Y have a (an) ______.

negative linear relationship

A one-way analysis of variance (ANOVA) test compares population ______ based on one categorical variable or factor.

means

The mode is defined as

the most frequently occurring value in the data set.

If the population standard deviation is unknown, it can be estimated by using ______.

s

An auditor for a small business wants to determine whether the mean value of all accounts receivable is less than $550. She takes a sample of 40 and computes the sample mean and the sample standard deviation. The null and alternative hypotheses for this test are

H0: μ ≥ 550 and HA: μ < 550

The only way we can reduce both Type I and Type II errors is by increasing ______.

n or sample size

In ANOVA testing, if the ratio of the between-treatment variability to within-treatment variability is significantly greater than one, then we

reject the null hypothesis of equal population means.

The critical value approach specifies a region of values, called the ______. If the test statistic falls into this region, we reject the ______.

rejection region, null hypothesis

This value of rxy represents a perfect negative linear relationship.

-1

Which of the following are correctly configured two-tailed tests?

- H0: p = p0, HA: p ≠ p0
- H0: μ = μ0, HA: μ ≠ μ0

When decomposing total variation in a two-way ANOVA test with interaction, SST = ______ + ______ + ______ + ______.

SST = SSA + SSB + SSAB + SSE

Not rejecting the null hypothesis when the null hypothesis is false.

Type II error

When creating a bar chart or a histogram, each bar/rectangle should be of the ______ width.

same

The correlation coefficient measures the strength of the linear relationship between ______ variables.

two

For a hypothesis test of μ when σ is known, the value of the test statistic is calculated as

$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
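
A minimal sketch of this calculation with hypothetical summary values (`x_bar`, `mu_0`, `sigma`, and `n` are made up for illustration). The two-tailed p-value uses `statistics.NormalDist` from the standard library.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 52.0   # sample mean (hypothetical)
mu_0 = 50.0    # hypothesized population mean (hypothetical)
sigma = 6.0    # known population standard deviation (hypothetical)
n = 36         # sample size (hypothetical)

# Test statistic: distance of the sample mean from mu_0 in standard-error units.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-tailed p-value: probability of a value at least as extreme as z.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(z, round(p_value, 4))
```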

In general, we follow three steps when formulating the competing hypotheses. Place these steps in the correct sequence.

- Identify the relevant population parameter of interest.
- Determine whether it is a one- or two-tailed test.
- Include some form of the equality sign in the null hypothesis and use the alternative hypothesis to establish a claim.

Place the steps to perform an ANOVA difference of means test in their proper sequence.

- Specify the null and the alternative hypotheses.
- Specify the significance level.
- Calculate the value of the test statistic and the p-value.
- State the conclusion and interpret the results.

______ ≤ rxy ≤ ______.

-1 , 1
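
A short sketch of how the sample covariance and the correlation coefficient are computed, written out by hand rather than with a library call. The data are hypothetical.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]   # hypothetical observations
y = [2, 1, 4, 3, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: average cross-product of deviations, dividing by n - 1.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations.
s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation coefficient: covariance scaled to lie between -1 and 1.
r_xy = s_xy / (s_x * s_y)

print(s_xy, round(r_xy, 4))
```

The sign of the covariance gives the direction of the linear relationship; dividing by the two standard deviations removes the units so that the strength can be read on a fixed scale.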

In hypothesis testing, two correct decisions are possible:

- Reject the null hypothesis when it is false.
- Do not reject the null hypothesis when it is true.

Put the following steps in the p-value approach to hypothesis testing in the correct order.

1.) Specify the null and alternative hypotheses.
2.) Specify the significance level.
3.) Calculate the value of the test statistic and its p-value.
4.) State the conclusion and interpret results.

In a two-way ANOVA test, what is the maximum number of different hypotheses that can be tested?

3

Many experts believe that _____ of the data in the world today were created in the last two years alone.

90%

With respect to a bar chart, which of the following statements is MOST accurate?

A bar chart is a useful graphical tool for qualitative data.

Sampling is necessary when it is either impractical or impossible to survey the entire population. In which situation does surveying the entire population INSTEAD OF sampling just a part of the population make the most sense?

A teacher who has 30 students in her class wants to determine the average of the most recent test scores.

Which of the following statements is NOT correct concerning the p-value and critical value approaches to hypothesis testing?

Both approaches use the same decision rule concerning when to reject H0.

Which of the following is an example of descriptive statistics?

Calculate the percent of 2500 U.S. voters in an opinion poll who approve of the President's performance.

Suppose the competing hypotheses for a test are H0: μ ≤ 10 versus HA: μ > 10. If the value of the test statistic is 1.90 and the critical value at the 1% level of significance is z0.01 = 2.33, then the correct conclusion is:

Do not reject H0 and conclude that the population mean does not appear to be greater than 10 at the 1% significance level.
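
The same conclusion can be checked with both approaches. This sketch takes the test statistic and significance level from the card above and computes the critical value and p-value with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.90       # value of the test statistic (from the card)
alpha = 0.01   # significance level (from the card)

std_normal = NormalDist()

# Critical value approach (right-tailed): reject H0 if z exceeds z_alpha.
z_crit = std_normal.inv_cdf(1 - alpha)
reject_by_critical_value = z > z_crit

# p-value approach (right-tailed): reject H0 if the p-value is less than alpha.
p_value = 1 - std_normal.cdf(z)
reject_by_p_value = p_value < alpha

print(round(z_crit, 2), round(p_value, 4))
print(reject_by_critical_value, reject_by_p_value)
```

Both flags come out `False`, so H0 is not rejected under either approach.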

For an ANOVA test, the p-value is found using the ______ table.

F

True or false: A Type I error occurs if we do NOT reject the null hypothesis when it is actually false.

False

Match these symbols with their meanings. H0 HA

H0- Null Hypothesis HA- Alternative hypothesis

A researcher for a store chain wants to determine whether the proportion of customers who try out the samples being offered is more than 0.15. The null and alternative hypotheses for this test are

H0: p ≤ 0.15 and HA: p > 0.15

The null hypothesis for a two-sided test for a population mean would be denoted as

H0: μ = μ0

Which of the following is a right-tailed test for the correlation coefficient?

H0: ρxy ≤ 0 HA: ρxy > 0

In order to approximate the class width for a frequency distribution of quantitative data, we calculate:

(Largest value − Smallest value) / Number of classes
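
A small sketch of building a frequency distribution from this class-width formula. The data and the number of classes are hypothetical, and in practice the width is usually rounded up to a convenient number.

```python
values = [12, 15, 21, 24, 28, 33, 37, 41, 46, 52]  # hypothetical data
num_classes = 4                                      # chosen by the analyst

lo, hi = min(values), max(values)
width = (hi - lo) / num_classes   # (largest - smallest) / number of classes

counts = [0] * num_classes
for v in values:
    # Index of the class containing v; the largest value goes in the last class
    # so that classes are mutually exclusive and cover every observation.
    k = min(int((v - lo) // width), num_classes - 1)
    counts[k] += 1

# Relative frequency: proportion of observations in each class.
relative = [cnt / len(values) for cnt in counts]

for k, (cnt, rel) in enumerate(zip(counts, relative)):
    lower = lo + k * width
    print(f"{lower:g} up to {lower + width:g}: frequency={cnt}, relative={rel:.2f}")
```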

SSA/(c − 1) = ______

MSA

In a two-way ANOVA, the $F_{(df_1, df_2)}$ statistic that determines whether significant differences exist between the factor A means is calculated as ______ / ______.

MSA / MSE

______ is the within-treatments variance.

MSE

SSE/(rc(w − 1)) = ______

MSE

SSE/(nT − c − r + 1) = ______

MSE

______ is the between-treatments variance.

MSTR

Match these location measures with their R function names.

Mean: mean(df$var)
Median: median(df$var)
Multiple measures: summary(df)
Minimum: min(df$var)
Maximum: max(df$var)
Percentile: quantile(df$var, p)

Match these location measures with their Excel function names.

Mean: =AVERAGE(array)
Median: =MEDIAN(array)
Mode: =MODE(array)
Minimum: =MIN(array)
Maximum: =MAX(array)
Percentile: =PERCENTILE.INC(array, p)

When there are an even number of observations, and the observations are in order from smallest to largest, the median is...

The average of the two middle observations

What is the relationship between the variance and the standard deviation?

The standard deviation is the positive square root of the variance.
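
A quick check of this relationship using the standard `statistics` module on a hypothetical data set. `pvariance`/`pstdev` treat the data as a population (divide by N), while `variance`/`stdev` treat it as a sample (divide by n − 1).

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

pop_var = pvariance(data)   # average squared deviation from the mean
pop_sd = pstdev(data)       # positive square root of the population variance

samp_var = variance(data)   # sample variance, s^2
samp_sd = stdev(data)       # sample standard deviation, s

print(pop_var, pop_sd)
print(round(samp_var, 4), round(samp_sd, 4))
```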

Which of the following statements is true about the test of H0: ρxy = 0?

The test statistic is assumed to follow the $t_{df}$ distribution with n - 2 degrees of freedom.
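
For reference, the usual form of this test statistic is written out below. The cards do not state it, so treat it as the standard textbook formula; the numbers in the second line (r = 0.8, n = 5) are hypothetical.

```latex
t_{df} = \frac{r_{xy}\sqrt{n-2}}{\sqrt{1-r_{xy}^{2}}}, \qquad df = n-2

% Hypothetical example with r_{xy} = 0.8 and n = 5:
t_{3} = \frac{0.8\sqrt{3}}{\sqrt{1-0.64}} = \frac{0.8\sqrt{3}}{0.6} \approx 2.31
```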

A company wants to estimate the mean price of oil over the past 10 years. What type of data does the company need?

Time series data

Which of the following is not a measure of central location?

Variance

Which characteristic of big data does the following describe? Data come in all types, forms, and granularity, both structured and unstructured.

Variety

Which characteristic of big data does the following describe? The credibility and quality of data.

Veracity

The conclusions of a hypothesis test that are drawn from the p-value approach versus the critical value approach are

always the same.

A qualitative variable is also known as a ______ variable.

categorical

The term ______ ______ relates to the way data tend to cluster around some middle or central value.

central location

If the two independent estimates of $\sigma^2$ are relatively close together, then it is likely that the variability of the sample means can be explained by

chance

Data that are collected about many subjects at the same point in time or without regard to differences in time are known as ______ data.

cross-sectional

In two-way ANOVA with interaction, we partition the total sum of squares into ______ distinct components.

four

A ______ is a way to organize qualitative data into categories and record the number of observations in each category.

frequency distribution

In a two-way ANOVA test, the sum of squares for factor A is based on the sum of the squared differences between the mean for each level of factor A and the ______ ______.

grand mean

The range is the difference between

largest and smallest values

We reject H0 if the p-value is ______ ______ α.

less than

In one-way ANOVA, between-treatments variability is based on the variability between sample ______.

means

The one-way analysis of variance (ANOVA) test is used to determine if differences exist between the ______ of three or more populations.

means

The measure of central location where half the values of the data set lie above this measure and half the values of the data set lie below this measure is known as the ______.

median

A negative value of the covariance implies that x and y have a ______ linear relationship.

negative

What type of relationship exists between two variables if, as one increases, the other decreases?

negative

The mean is usually less than the median when the data are ______ skewed.

negatively

If we reject the null hypothesis when it is actually false we have committed...

no error.

In order to implement a hypothesis test, it is essential that $\bar{X}$ is ______ distributed.

normally

Extremely small or large values are also referred to as ______.

outliers

Most researchers and practitioners favor the ______ approach.

p-value

The notation σ^2 represents the

population variance.

A relative frequency distribution for quantitative data identifies the

proportion of observations that occur in each class.

If the chosen significance level is α = 0.05, then there is a 5% chance of

rejecting a true null hypothesis.

The standard error of the estimate is the standard deviation of the ______.

residuals or errors

When testing μ and σ is known, H0 can never be rejected if z ≤ 0 for a

right-tailed test.

Generally, for a frequency distribution, the width of each interval is the ______ for each interval.

same

The ______ level of a hypothesis test is defined as 100α%.

significance

If a distribution is not symmetric, then it is either positively skewed or negatively ______.

skewed

An ogive is a graph that plots the cumulative frequency, or cumulative relative frequency, against the

upper limit of the corresponding class.
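
A sketch of the points an ogive would plot, computed from a hypothetical frequency distribution with `itertools.accumulate`. Each point pairs a class's upper limit with the cumulative relative frequency up to that limit.

```python
from itertools import accumulate

upper_limits = [22, 32, 42, 52]   # hypothetical class upper limits
frequencies = [3, 2, 3, 2]        # hypothetical class frequencies

# Cumulative frequency: number of observations below each upper limit.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Cumulative relative frequency: proportion of observations below each limit.
cumulative_relative = [cf / total for cf in cumulative]

# (x, y) points for the ogive.
ogive_points = list(zip(upper_limits, cumulative_relative))
print(ogive_points)
```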

When performing a hypothesis test on μ when σ is known, H0 can never be rejected if

z ≥ 0 for a left-tailed test.

The ______ ______ frequency for a particular interval indicates the proportion of the observations that fall below the upper limit of that particular interval.

Cumulative relative

We would conduct a hypothesis test to determine whether or not

sample evidence contradicts H0.

How many means can you test for differences using ANOVA?

3 or more

The notation $\bar{x}$ represents the ______.

sample mean

In order to determine if significant differences exist between some of the population means, we develop two independent estimates of the common population ______.

variance

Two widely used measures of dispersion are...

variance and standard deviation

Northern University wants to determine the average starting salary for last year's graduates of its College of Business. What is the population from which the survey is taken?

All of last year's graduates from Northern's College of Business who started working

True or false: ANOVA is a statistical technique used to determine if there is a difference in three or more population standard deviations.

False

When testing whether the correlation coefficient differs from zero, the value of the test statistic is $t_{20} = 1.95$ with a corresponding p-value of 0.0653. At the 5% significance level, can you conclude that the correlation coefficient differs from zero?

No, since the p-value exceeds 0.05.

For quantitative data, a ______ groups data into classes and records the number of observations that falls into each class.

frequency distribution

The notation $s^2$ represents the

sample variance

Which one of the following is NOT a step we use when formulating the null and alternative hypotheses?

Calculate the value of the sample statistic.

The competing hypotheses for a one-way ANOVA test that compares the means of three populations are defined as

H0: μ1 = μ2 = μ3 HA: Not all population means are equal

In order to summarize qualitative data, a useful tool is a(n) ______.

frequency distribution

In inferential statistics, we use ______ information to make inferences about an unknown ______ parameter.

sample, population

For a hypothesis test on μ when the value of σ is unknown, the value of the test statistic is calculated as ______, provided that we sample from a normal population.

$t_{df} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
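
A minimal sketch of this calculation on a hypothetical sample. It only computes the statistic and its degrees of freedom; the p-value would come from a t table or statistical software, since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

sample = [48, 52, 55, 49, 51]   # hypothetical observations
mu_0 = 50                        # hypothesized population mean (hypothetical)

n = len(sample)
x_bar = mean(sample)   # sample mean
s = stdev(sample)      # sample standard deviation, used because sigma is unknown

# Test statistic and its degrees of freedom.
t_stat = (x_bar - mu_0) / (s / sqrt(n))
df = n - 1

print(round(t_stat, 4), df)
```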

A ______ ______ distribution shows the number of observations that fall below the upper limit of a particular interval.

cumulative frequency

In a frequency distribution for a categorical variable, intervals are ______.

mutually exclusive

A ______ is a series of rectangles where the width and height of each rectangle represent the interval width and frequency of the respective interval.

histogram

