Statistics Exam 2

The z-value that is used to construct a 95% confidence interval is:

1.96

What is the critical value for a two-tailed hypothesis test on a population mean when alpha is 5%, and the population standard deviation is known?

1.96
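As an illustrative sketch (not part of the original exam material), the critical z values quoted in these cards can be reproduced with Python's standard library. The helper name `z_critical` is hypothetical, and a two-sided interval or test is assumed.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical z value for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.2f}")
```

The same value applies to a two-tailed test at alpha = 1 - confidence when sigma is known.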

The appropriate null hypothesis for the Kruskal-Wallis Rank Test is:

The c independent groups have equivalent medians.

The degrees of freedom for the F test in a one-way ANOVA are:

(c - 1) and (n - c).

The probability distribution used to test for differences between two population variances is the

F distribution

All F tests are one-tailed tests.

False

The value of alpha for a 98% confidence interval would be:

alpha = 0.02

The t distribution is used to construct confidence intervals for the population mean when the population standard deviation (i.e., sigma) is known.

FALSE - The t distribution is used when the population standard deviation (sigma) is not known.

For a given level of significance, if the sample size is increased, the power of the test will decrease.

False

For a given level of significance, if the sample size is increased, the probability of committing a Type II error will increase.

False

The distribution of the F test statistic is symmetrical.

False

Among-group variation is considered to be random error.

False - Among-group variation is due to differences from group to group.

In a one-way ANOVA F test, the "among group" variation is attributable to:

differences from group to group (treatment effects).

When determining the proper sample size for estimation of the mean, one needs to know the acceptable sampling error, "e."

True

When the sample size is increased, the statistical power of the hypothesis test (1.00 minus Beta) also increases

True

When the One-Way ANOVA F test is found to be significant, which statistical method is used as a follow-up procedure to determine between which means there is a statistically significant difference?

Tukey-Kramer Procedure

If a researcher wishes to determine whether there is evidence that the mean family income in the U.S. is greater than $30,000, then

a one-tailed test should be utilized, in which the region of rejection is in the right (upper tail).

It is possible to directly compare the results of a confidence interval estimate to the results obtained by testing a null hypothesis if:

a two-tailed test is used.

The most common approach to finding the y-intercept and slope is the method of least squares. This method minimizes the sum of squared differences between

actual Y values and the predicted Y values
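A minimal sketch of the least-squares idea, using made-up data: the slope and intercept below are the values that minimize the sum of squared differences between actual and predicted Y. The function name `least_squares` and the sample data are assumptions for illustration only.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the simple least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data, chosen only to keep the arithmetic simple.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
predicted = [b0 + b1 * xi for xi in x]
# SSE is the quantity that the least-squares line minimizes.
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {sse:.2f}")
```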

If the plot of residuals is fan-shaped, which assumption is violated?

equal variance along the regression line (i.e., homoscedasticity).

In general, the expected frequencies per cell in the conduct of a Chi-Square test are those one would

expect to find in a given cell if the null hypothesis were actually true

The region of rejection for a one-tailed test is:

found in the tail that supports the alternative hypothesis.

In order to reduce the likelihood of committing a Type II error, the researcher could

increase the sample size

The type of reasoning involved in obtaining confidence interval estimates is called

inductive

If an individual rejects a true null hypothesis, then she/he has

made a Type I error.

In testing for differences between the means of two related populations (i.e., Matched-Pairs) where the variance of the differences is unknown, the degrees of freedom are:

n - 1.

When estimating the population mean with a small sample, the t distribution may be used with ______ degrees of freedom.

n-1

When testing for the difference between two population variances with sample sizes of n1 = 8 and n2 = 10, the number of degrees of freedom are

numerator d.f. = 7, denominator d.f. = 9

In a two-way ANOVA the degrees of freedom for the error term are:

rc(n' - 1).

The quantity (1 - alpha) is called:

the confidence coefficient.

When testing the null hypothesis using the confidence interval estimate of the difference between two means, one would reject the null hypothesis when

the confidence interval does not include zero

In testing for differences between the means of two independent populations the null hypothesis states that:

the difference between the two population means is not significantly different from zero

An interaction term in a multiple regression model may be used when:

the effect of an X variable on Y is affected by the value of a second X variable.

If the p-value is less than alpha in a one-tailed test:

the null hypothesis should be rejected.

If a one-tailed test for a proportion is being performed and the upper critical value is +2.33 and the test statistic is equal to +1.37, then:

the null hypothesis should not be rejected.

The Tukey-Kramer procedure would be used:

to test for pair-wise mean differences

If the data is nominal

use the sample proportion (Ps) with the Z test for a proportion

If sigma, the population standard deviation, is given

use Z

The maximum probability of a Type I error that you are willing to live with.

α

One should reach the same conclusions in the conduct of a hypothesis test, regardless of whether one is using the critical-value approach, or the p-Value approach to hypothesis testing.

TRUE

The confidence interval estimate of the population mean is constructed around the sample mean.

TRUE
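To illustrate how the interval is built around the sample mean, here is a small sketch for the case where sigma is known. The function name `z_interval` and the numbers (sample mean 50, sigma 10, n = 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / sqrt(n)  # the sampling error, e
    # The interval is centered on the point estimate (the sample mean).
    return sample_mean - margin, sample_mean + margin


low, high = z_interval(50.0, 10.0, 25)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Raising the confidence level increases z and therefore widens the interval, other things being equal.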

The t-critical values used when computing confidence intervals for the mean vary by degrees of freedom.

TRUE

In a sample of size 25, in order to compute any sample statistic, how many sample values are free to vary?

24

If we are testing for the difference between the means of two independent samples with samples of n1 = 20 and n2 = 20, the number of degrees of freedom is equal to:

38

The degrees of freedom for the Chi-Square test statistic when testing for independence in a contingency table with 4 rows and 4 columns would be

9

In a two-way ANOVA the degrees of freedom for the interaction term are:

(r - 1)(c - 1).

Suppose there is interest in comparing the median response time for three independent groups learning a specific task. The appropriate nonparametric procedure is

Kruskal-Wallis Rank Test for Differences in Medians.

A point estimate consists of a single sample statistic that is used to estimate the true population parameter.

True

The coefficient of determination is r². It can have values from 0 to 1. It is the square of the correlation coefficient, r, and is interpreted as a percentage of variation in the dependent variable explained by changes in the independent variables. For example, an r² value of .56 means 56% of the variation is explained. Adjusted r² values are helpful in comparing several models.

...

When using a simple regression model, extrapolating the linear relationship between X and Y is acceptable under what conditions?

Extrapolation is never acceptable.

In order to calculate the F test statistic for a one-way ANOVA experiment, one would perform which of the following operations?

MSA/MSW

Which of the following statistical methods is appropriate to test whether or not there is sufficient evidence of a difference between the proportions of two related samples?

McNemar Test

If we are testing for the difference between the means of two related samples with samples of n1 = 20 and n2 = 20, the number of degrees of freedom is equal to:

19

The z-value which is used to construct a 99% confidence interval is:

2.58

Simple linear regression involves the use of

A single numerical independent variable to predict the numerical dependent variable

It is possible to construct a 100% confidence interval estimate for the population mean.

FALSE

The Z-critical values used when computing confidence intervals for the proportion vary by degrees of freedom.

FALSE

The confidence interval obtained will always correctly estimate the true population parameter.

FALSE

The tails of the t distribution contain less area than the tails of the normal distribution.

FALSE

When determining the proper sample size, the t-distribution is used based on degrees of freedom and the chosen degree of confidence.

FALSE - Because sample size, and therefore degrees of freedom, are unknown, only the Z-distribution can be used in the determination of the proper sample size.

The power of a test is the probability of rejecting the Null Hypothesis when the Null Hypothesis is really true.

False

As is the case with the Normal probability distribution, the Chi-Square distribution is symmetrical.

False - The Chi-Square distribution is right-skewed.

To determine whether or not the data meets the assumption of equal variances, which of the following tests should be conducted?

The Levene Test

A test for the difference between two proportions can be performed using the chi-square distribution.

True

Hypothesis testing provides a "confirmatory" approach to data analysis.

True

If the level of significance is chosen to be 5% and the p-value for the hypothesis test is 0.044, then the null hypothesis should be rejected.

True

If the null hypothesis is rejected, then one can conclude that the alternative hypothesis is supported by the observed findings.

True

The F test for the equality of two population variances is based on the difference between the two variances.

False - The test is based on the ratio of the two sample variances, not their difference.

The Marascuilo Procedure is appropriate for conducting multiple comparisons of proportions between all pairs of groups.

True

The McNemar Test is appropriate to test whether or not there is sufficient evidence of a difference between the proportions of two related samples

True

The larger the sampling error acceptable to the researcher, the smaller the minimum required sample size needs to be.

True

An alternative approach to utilizing the chi-square test for equality of c proportions would be to use the:

chi-square test for independence.

To determine whether a set of observed frequencies differ from their corresponding expected frequencies, we could apply the

chi-square test.

The statistic that measures the proportion of the variation in Y that is explained by an X variable while controlling for the other X variables is known as

coefficients of partial determination

The value of z selected for constructing a given confidence interval is called the ________ value.

critical

The one-way ANOVA is used to test statistical hypotheses concerning:

group means.

How many degrees of freedom are associated with the multiple regression model when running a t-test for the individual coefficients?

n-k-1

In what way do the One-Way and Two-Way ANOVA designs differ? In a one-way ANOVA,

one can only test for the treatment effect of a single factor

The t test for the difference between the means of two independent samples assumes that the respective:

populations are approximately normal; samples are randomly and independently drawn; sample variances are equal.

If p is < α you would

reject the null

If a p-value for a hypothesis test on a mean was given as 0.0330, and the level of significance used was 5%, then the conclusion would be to

reject the null hypothesis

In a one-way ANOVA, if the computed F statistic exceeds the critical F value we may:

reject the null hypothesis since there is evidence of a treatment effect.

If the p-value is greater than alpha in a two-tailed test:

the null hypothesis should not be rejected.

In a hypothesis test, the probability of obtaining a value of the test statistic equal to or even more extreme than the value observed - given the null hypothesis is true - is referred to as:

the p-value.

If s, the sample standard deviation, is given

use t

A Type II error is committed when:

you fail to reject a null hypothesis that is really false.

When using the chi-square test for differences in two proportions with a contingency table that has r rows and c columns, the degrees of freedom for the test statistic will be:

(r - 1)(c - 1).

Which of the following values of the chi-square distribution cannot occur?

-2.45. The chi-square distribution cannot take negative values.

What are some properties of the student's t distribution?

- It is symmetric. - Its exact shape (i.e., spread) is characterized by the degrees of freedom. - As the sample size grows, it gradually approaches the normal distribution.

Each of the following factors must be known in order to determine the sample size for quantitative (numerical) data

-the acceptable sampling error -the desired confidence level -the population standard deviation
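As a hedged sketch of how these three inputs combine, the usual sample-size formula for a mean is n = (z * sigma / e)^2, rounded up. The function name `sample_size_for_mean` and the example values (sigma = 15, e = 5 or 10) are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence=0.95):
    """Minimum n so the sampling error is at most e, given known sigma."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Always round up: a fractional observation cannot be collected.
    return ceil((z * sigma / e) ** 2)


n_tight = sample_size_for_mean(sigma=15, e=5)
n_loose = sample_size_for_mean(sigma=15, e=10)
print(n_tight, n_loose)
```

Allowing a larger sampling error e reduces the required sample size, which matches the related cards in this set.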

1-p will give you the level of confidence you can have in a decision to reject the null. For example, assume the p-value is p = 0.04. 1 minus p would be 1.00 - .04 = .96. Therefore, you could have 96% confidence in a decision to reject the null. If α = .05, that would allow you to reject the null because 96% is greater than the required 95% confidence.

...

Analysis of variance is sort of a misleading name for this technique. You are testing the means of several populations (3 or more) while assuming that the variances are equal, that the populations are independent, and that they come from random samples. The null hypothesis for this is μ1 = μ2 = μ3, while the alternative hypothesis is that they are not all equal. If the null is not rejected, there is not enough evidence to indicate any difference in the several populations. If the null is rejected, the Tukey-Kramer technique will have to be used to determine which of the means are different. Prior to conducting the ANOVA test on the means, you must test the population variances to determine whether they are equal or not by using Levene's Test for Homogeneity of Variances.

...

Chapter 12 - Chi-Square Tests of Independence and Equality of Proportions. These problems assume nominal or ordinal data. The populations do NOT have to be normally distributed. There are two types of chi-square problems with which you should be familiar. The first is the test for independence. This involves two variables and one population. For example, if the variables are age and income, the null hypothesis would be that age and income are independent (not related). The alternative hypothesis is that they are related. Use the chi-square table to find the critical value based on (number of rows minus 1) times (number of columns minus 1) degrees of freedom. Observed (sample) frequencies will be given in the problem. You will need to find the expected value for each cell by multiplying its row total by its column total and dividing by N (the total number of observations in the table). The chi-square distribution is positively skewed, meaning that high values of chi-square will cause you to reject the null. A high value means that the observed and expected values are quite different, thus the variables are likely related. The calculation of chi-square is the same for either type of chi-square problem. For each cell of the table, find the difference between observed and expected frequencies and square the difference. Then, divide that squared difference by the expected value and sum the results for all cells of the table to obtain ONE overall chi-square value for the problem.
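The cell-by-cell calculation described above can be sketched as follows. The function name `chi_square_statistic` and the 2 x 2 table of observed counts are hypothetical; the critical value would still come from a chi-square table.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the null hypothesis were true.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Made-up observed frequencies, used only to show the arithmetic.
table = [[10, 20],
         [20, 10]]
chi2, df = chi_square_statistic(table)
print(f"chi-square = {chi2:.3f} with {df} degree(s) of freedom")
```

Here every expected count is 15, so each cell contributes 25/15 to the total.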

...

Nonparametric tests are used when a normal distribution cannot be assumed. These tests use ordinal data and are based on ranks. The McNemar Test is used to test for differences in two proportions for related samples. The Wilcoxon Rank Sum Test (WRST) is used with two independent samples. It is the nonparametric alternative to the t test for two populations. The Kruskal-Wallis (KW) Test is the nonparametric alternative to ANOVA; the chi-square table is used to find its critical value. The WRST can be used to determine which medians are different if the KW Test leads to rejecting the null.

...

The ANOVA table F test results can be used to determine whether there is anything useful in the equation. This is particularly helpful if you have two or more independent variables. A high F value means that at least one independent variable is related to the dependent variable in the equation.

...

The F test formula is used to test whether the population variances are likely to be equal (sigma1² = sigma2²) before conducting a Z or t test that assumes that they are equal. The ratio of the sample variances (s1²/s2²) is compared to two critical values from the F table. The upper critical value is obtained directly from the table based on the numerator and denominator degrees of freedom. With equal sample sizes, the lower critical F value is found by taking the reciprocal of the upper critical F value. If the sample sizes are unequal, you first look up the upper value in the table based on the numerator and denominator degrees of freedom. To find the lower critical value, you must reverse the degrees of freedom, look that value up in the table, and take its reciprocal. The null hypothesis is that the variances are equal. If the calculated F is greater than the upper critical value or less than the lower critical value, you would reject the null and assume the variances are unequal. Otherwise, the variances are assumed equal.

...

The SST, SSA, and SSW formulas are used to calculate the sum of squares for use in the ANOVA summary table. MST, MSA, and MSW are found by dividing the sum of squares by the degrees of freedom. F is found by the ratio of MSA divided by MSW. The greater the value of F the more likely you can reject the null hypothesis that the means are equal. A value of 4-5 is usually large enough to reject. Look in the F table using the numerator degrees of freedom (c-1) and the denominator degrees of freedom (n-c) to find the critical value. If the calculated F is greater than the critical value you can reject the null. Otherwise, fail to reject. If you reject, you need to use Tukey-Kramer to determine whether the sample differences are large enough to indicate a significant difference in the population values for each pair.
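The sums of squares and the F ratio described above can be sketched in a few lines. The function name `one_way_anova` and the three small groups are hypothetical, and the critical F value would still be looked up in an F table with (c - 1) and (n - c) degrees of freedom.

```python
def one_way_anova(groups):
    """Return (F, numerator df, denominator df) for a one-way ANOVA."""
    c = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # SSA: among-group variation; SSW: within-group variation.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return msa / msw, c - 1, n - c


# Made-up data for three groups of three observations each.
f_stat, df_num, df_den = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

For this data SSA = 26 and SSW = 6, so MSA = 13, MSW = 1, and F = 13.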

...

The correlation coefficient is r. It is used to test for the presence of a linear RELATIONSHIP between two variables. The null is no relationship while the alternative is that there is a relationship. The possible range of values is from -1 to +1. A zero indicates no linear relationship while +1 or -1 indicates a perfect positive or negative relationship. Because it is difficult to interpret these values, a quick rule is that the absolute value of r must exceed 2 divided by the square root of n to be significant.
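A short sketch of the quick rule, with r computed by hand so that no particular Python version is required. The function name `correlation` and the five data points are assumptions for illustration; the 2/sqrt(n) cutoff is the rule of thumb from the card, not a formal test.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
threshold = 2 / sqrt(len(x))          # quick-rule cutoff
significant = abs(r) > threshold
print(f"r = {r:.3f}, cutoff = {threshold:.3f}, significant: {significant}")
```

With only five observations the cutoff is high (about 0.894), so even r of roughly 0.775 does not pass the quick rule.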

...

The first formula in Chapter 9 is for a Z Hypothesis Test for the Mean. It is used to determine whether a hypothesized mean value is likely to be correct where the population standard deviation σ (sigma) is known. It is assumed that the population is normally distributed.

...

The first formula is for a Z test for two population means. The hypothesis always deals with the population means (μ), not x bar. It may be a one- or two-tailed test depending on the wording of the question in the problem. Assumptions for this test include normally distributed populations, random samples, two independent samples, equal variances, and large samples (n1 plus n2 = 60 or more) with known population standard deviations (σ).

...

The second formula in Chapter 9 is for a t Hypothesis Test for the Mean. It is used to determine whether a hypothesized mean value is likely to be correct where the population standard deviation σ (sigma) is unknown and the sample standard deviation s must be used. It is assumed that the population is normally distributed. Here you need to use t instead of z. The t value comes from the t table and will vary depending on the confidence level (99%, 95% and 90%). Degrees of freedom are n-1 for these problems, and as the sample size increases, the t and z distributions and values become nearly identical.

...

The second type chi-square problem is a test for equality of proportions. The data is still nominal and the populations need not be normally distributed. There are now several populations and one variable. The problem might be asking whether the proportion of Democrats, Republicans and Independents favoring Obama is the same. The null hypothesis is that the proportions are the same for the three groups. The alternative is that they are not the same. Degrees of freedom are found by taking # of rows (2) minus 1 and multiplying by the number of columns minus 1. Remember either type of chi-square problem must have EXPECTED values of at least 5 to be valid.

...

The t formula is used to test the means for two independent populations. The assumptions are the same as for the Z test of two means (equal variances, normal population distributions, and random samples), but the combined sample sizes are less than 60 and the population standard deviations are unknown.

...

The third formula in Chapter 9 is for a Hypothesis Test for a Proportion. It is used to determine whether a hypothesized proportion value is likely to be correct. The sample proportion Ps must be used. It is assumed that the population is normally distributed. You will need to calculate the sample proportion by dividing the number with the characteristic you are interested in by the sample size or n. For example, if in a sample of 50, 30 people prefer candidate A you would divide 30 by 50 to get 60% or .60 as the sample proportion who prefer candidate A. The Z value comes from the Z table and will vary depending on the confidence level (99%, 95% and 90%).

...

To find p-values in the tables, always start with the calculated value of the test statistic in step 5. If it is a Z problem, look down the left column to find that value and go to the center of the table to find the area or probability that corresponds to that value. If the problem is two-tailed (the null uses = and the alternative uses ≠), you must multiply the value by two to account for both sides of the distribution.
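The table lookup can be mimicked in code. This is only a sketch: the function name `z_p_value` is hypothetical, and for the one-tailed case it assumes the test statistic falls in the tail that the alternative hypothesis points to.

```python
from statistics import NormalDist


def z_p_value(z, tails=2):
    """p-value for a Z test statistic; tails is 1 or 2."""
    # Area beyond |z| in one tail of the standard normal distribution.
    one_tail_area = 1.0 - NormalDist().cdf(abs(z))
    return one_tail_area * tails


p_one = z_p_value(1.37, tails=1)
p_two = z_p_value(1.37, tails=2)
print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

For the test statistic of +1.37 used in an earlier card, the one-tailed p-value is well above 0.01, which agrees with not rejecting the null against a critical value of +2.33.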

...

How would one interpret a standard error of estimate that is equal to $65 in a simple linear regression analysis?

About 95% of the observed Y values fall within $130 of the least squares line.

The probability that the test statistic will fall inside the region of rejection due to chance alone is equal to which probability?

Alpha

In the design of a study using a One-Way ANOVA, the factor may have several groups defined by only numerical levels.

False - Groups may be defined by either numerical or categorical levels.

The Null Hypothesis for conducting a One-Way ANOVA is that "Not all the means are equal."

False - The null hypothesis indicates that there are no statistically significant differences between any of the means.

The expected cell frequencies in the conduct of a Chi-Square test are those one would expect to find in a given cell if the null hypothesis were actually false

False - They are what one would expect if the null hypothesis were really true.

The t distribution allows the calculation of confidence intervals for means for small samples when the population variance is not known, regardless of the shape of the distribution in the population

False - If the sample size were small and the population were skewed, the assumptions for using the t distribution would be violated.

When the population standard deviation is unknown, the population standard deviation can be estimated by dividing the approximate range of the variable by three

False - The population standard deviation can be estimated by dividing the approximate range of the variable by six.

This data gives more information by answering the question, how much difference? For example, if it was 60 degrees yesterday and 50 degrees today it was 10 degrees colder today. There is no zero point.

Interval

In a multiple regression model, which of the following is correct regarding the value of the adjusted multiple coefficient of determination?

It is usually smaller than the coefficient of multiple determination.

This data is a count, frequency or percentage based on specific categories. Example: there are 20 people in the class. Ten or 50% have brown hair, four or 20% have black hair, four or 20% have blond hair, and two or 10% have red hair. Proportion problems are of this type.

Nominal

This data provides more information than nominal data as it can be rank ordered. An example would be small, medium, and large.

Ordinal

The actual probability of a Type I error

P-value

To investigate the efficacy of a diet, a random sample of 16 male patients is drawn from a population of adult male volunteers. The weight of each individual in the sample is taken at the start of the diet, and at a medical follow-up four weeks later. Assume that the population of differences in weight before versus after the diet follows a normal distribution. What would be the appropriate statistical test to conduct the hypothesis test?

Paired t test
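As a hedged illustration of the paired t test, the statistic is the mean of the before-minus-after differences divided by its standard error, with n - 1 degrees of freedom. The function name `paired_t` and the four pairs of weights are made up (the card itself describes 16 patients), and the critical value would still come from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for matched pairs."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # stdev() is the sample standard deviation (divides by n - 1).
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical weights at the start of the diet and four weeks later.
before = [200, 210, 190, 205]
after = [195, 205, 188, 200]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```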

This type of data also tells how much different. If you made an 80 on exam and your friend made 70 the difference is 10 points. There is a minimum score of zero with ratio data. The problem talks about a mean and a standard deviation.

Ratio

In order to calculate the coefficient of multiple determination r2Y.12, you would use which of the following formulas?

Regression Sum of Squares divided by Total Sum of Squares

Researchers would like the probability of which research decision outcome to be the greatest?

Statistical Power (1.00 minus Beta)

In practice, the population mean is an unknown quantity that is to be estimated.

TRUE

One assumption underlying the use of the confidence interval estimate for the proportion using the normal distribution is that both "X," and "n-X" are greater than five.

TRUE

The sample mean is a point estimate of the population mean.

TRUE

Alpha is the probability of committing a Type I error.

True

If there is no past information about the proportion, the value of pi that will never underestimate the sample size needed is equal to 0.50.

True
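A small sketch of why 0.50 is the safe choice: the sample size for a proportion is n = z^2 * pi * (1 - pi) / e^2, and pi * (1 - pi) is largest at pi = 0.50. The function name `sample_size_for_proportion` and the 5% sampling error are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(pi, e, confidence=0.95):
    """Minimum n so the sampling error for a proportion is at most e."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil(z ** 2 * pi * (1.0 - pi) / e ** 2)


# Try several candidate proportions with a 5-point sampling error.
sizes = {pi: sample_size_for_proportion(pi, e=0.05) for pi in (0.1, 0.3, 0.5, 0.7, 0.9)}
print(sizes)
```

The largest required n in the dictionary occurs at pi = 0.5, so using 0.50 never underestimates the sample size.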

In a two-way ANOVA setting, when there are more than two levels of a factor and there is no significant interaction effect, Tukey's multiple comparison procedure can be used to perform pair-wise mean comparisons.

True

In a two-way ANOVA, the interpretations of the main effects make sense only when the interaction component is not significant.

True

In order for the chi-square test for differences between more than two proportions to be valid, the expected frequency in each cell should be at least 5.

True

In the business world, sample sizes are determined prior to data collection

True

In a two-factor factorial design, factors A and B are said to have interaction if the effect of factor A is dependent on the level of factor B.

True

One of the assumptions made in the application of the one-way ANOVA F test is homogeneity of variance

True

Other things being equal, as the confidence level for a confidence interval for either the mean or the proportion increases, the width of the interval increases.

True

Repeated measurements from the same individuals is an example of data collected from two related populations.

True

The Z test statistic is a numerical quantity computed from the data of a sample and the hypothesized population parameter, and is used in reaching a decision on whether or not to reject the null hypothesis.

True

The null hypothesis for the Chi-Square test of independence should specify that the two categorical variables are independent

True

The test for the difference of two independent population means assumes that each of the two populations is normally distributed.

True

You may usually use the normal distribution to set up a confidence interval estimate of the population proportion if the sample size is "sufficiently" large.

True

If p is > α you would

fail to reject the null

A 95% confidence interval for the mean can be interpreted to mean that:

if all possible SAMPLES are taken and confidence intervals are calculated, 95% of those intervals would include the true POPULATION mean somewhere in their interval.

To meet the assumptions for simple linear regression, what type of relationship should be observed between the residual values and values of X?

if the linear model is appropriate for the data, there should be no apparent relationship between the residual values and values of X

In testing for the differences between the means of two independent populations where the variances in each population are unknown but assumed equal, the degrees of freedom are:

n1 + n2 - 2

When the population standard deviations are unknown, both samples are less than 30, and the equal variances assumption cannot be met, which test statistic should be used to test the differences between two independent means

separate-variance t-test for the difference between two means

The F test for Differences among more than two means is an extension of the

t test for the difference between two independent means

Which of the following measures how close the observed sample statistic has come to the hypothesized population parameter?

test statistic

The null hypothesis for the Chi-Square test of independence should specify

that the two categorical variables are independent

Which of the following is appropriate for conducting multiple comparisons of proportions between all pairs of groups?

the Marascuilo procedure

What is a complement of alpha?

the confidence coefficient

A confidence interval estimate is constructed around

the point estimate

When testing for differences between the means of two related populations, the null hypothesis states that:

the population mean difference is not significantly different from zero.

The coefficient of determination tells us:

the proportion of total variation in Y that is explained by X.

The hypothesis test for the equality of two population variances is based on:

the ratio of the two sample variances.

When conducting the One-Way ANOVA F test, the assumption of "homogeneity of variance" requires that

the variances of the groups are equivalent.

The standard error of the estimate is a measure of:

the variation around the regression line.

The chi-square test can be used:

to test for homogeneity of proportions.

When determining the sample size for a mean for a given level of confidence and standard deviation, if the sampling error (e) is allowed to increase, the sample size required:

will decrease.

