Analytics Chapter 15: Non-Parametric Methods

Describe the range of possible values for the chi-square distribution.

0 ≤ χ² < ∞

When doing a goodness-of-fit test to compare the distribution of sample data to a normal distribution, we need the population mean and standard deviation (known or estimated from the sample) to calculate the expected normal frequencies. Choose the statements that correctly describe df (degrees of freedom), where k is the number of categories. Select all that apply. 1. If we use the sample mean and standard deviation, then df = k - 3. 2. If the sample mean and standard deviation are used to estimate the population parameters, then df = n - k. 3. If the mean is known and the standard deviation is estimated from the sample, then df = k - 2. 4. If we use the sample mean and standard deviation, then df = k - 2. 5. If the mean and standard deviation are known, then df = k - 1.

1. If we use the sample mean and standard deviation, then df = k - 3. 3. If the mean is known and the standard deviation is estimated from the sample, then df = k - 2. 5. If the mean and standard deviation are known, then df = k - 1.

Which of the following are characteristics of the goodness-of-fit test? Select all that apply. 1. It can be used with nominal, ordinal, interval or ratio data. 2. It assumes the population is normally distributed. 3. Goodness-of-fit tests use the chi-square distribution. 4. It compares an observed frequency distribution to an expected frequency distribution. 5. It can be used with nominal or ordinal data. 6. It calculates a p-value based on the t-distribution.

3. Goodness-of-fit tests use the chi-square distribution. 4. It compares an observed frequency distribution to an expected frequency distribution. 5. It can be used with nominal or ordinal data.

Which of the following is a non-parametric test? 1. Regression Analysis 2. ANOVA 3. The Goodness-of-Fit Test 4. The Test of Two Means

3. The Goodness-of-Fit Test

What happens to the chi-square distribution as the degrees of freedom become larger?

It approaches the normal distribution in shape.
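
One way to see this numerically: the skewness of a chi-square distribution with df degrees of freedom is sqrt(8 / df). That is a standard result, not something stated in these cards. It shrinks toward 0, the skewness of a normal distribution, as df grows. A small Python sketch, with an illustrative function name:

```python
import math

def chi_square_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom."""
    if df < 1:
        raise ValueError("df must be at least 1")
    return math.sqrt(8 / df)

# Skewness falls as df rises, so the right tail becomes less pronounced.
for df in (2, 8, 32, 200):
    print(df, round(chi_square_skewness(df), 3))
```

Small df gives a strongly right-skewed curve, while large df gives a nearly symmetric one.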

How do you determine the degrees of freedom for a chi-square distribution?

It is k - 1, where k is the number of categories of data.

Explain why the chi-square statistic is never negative.

Each term is the squared difference between the observed and expected frequencies divided by the (positive) expected frequency, so every term, and therefore the sum, must be at least 0.

For all chi-square tests, the null is rejected when the observed frequencies differ substantially from the expected frequencies. What values of chi-square statistics support rejecting the null?

Large

The goodness-of-fit test can compare a distribution of data to a normal distribution. Why is this important?

Many of the tests we have studied assume that the data is from a normal population.

In statistics, what do we mean when we say that a test is "non-parametric"?

No assumption is made regarding the shape of the population distribution.

For a chi-square test with more than two categories, what is the accepted limitation on the use of a chi-square test?

No more than 20% of the cells can have expected frequencies less than 5.
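
The rule can be checked mechanically before running a test. A minimal sketch, assuming the expected frequencies are already available as a list; the function name and its parameters are hypothetical, and the two-cell branch follows the stricter rule these cards give for a data set with only two cells:

```python
def expected_counts_ok(expected, threshold=5, max_share=0.20):
    """Check the rule-of-thumb limits on small expected frequencies."""
    if len(expected) <= 2:
        # Two cells: every expected frequency must be at least the threshold.
        return all(e >= threshold for e in expected)
    # More than two cells: at most 20% of cells may fall below the threshold.
    small = sum(1 for e in expected if e < threshold)
    return small / len(expected) <= max_share

print(expected_counts_ok([10, 8, 6, 4, 12]))  # True: 1 of 5 cells (20%) is below 5
print(expected_counts_ok([10, 3, 4, 12, 9]))  # False: 2 of 5 cells (40%) are below 5
print(expected_counts_ok([4, 20]))            # False: two cells, one below 5
```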

How many chi-square distributions are there?

One for each possible degree of freedom (1, 2, 3, etc.).

How do the goodness-of-fit tests for equal, unequal, and normal frequency distributions differ from one another?

Only in the table of expected frequencies.

What is the shape of the chi-square distribution when the number of degrees of freedom is small?

Positively skewed (a long "tail" on the right).

Suppose we have a contingency table that shows gender (female, male) by favorite movie type (comedy, drama, action, horror). If the table summarizes a total of 120 people, there are 60 females, and 50 people preferred comedy, what would the expected frequency be for the female/comedy cell? 1. 119 2. 25 3. 110 4. 20

Solution: 2. 25. Why? Expected frequency = (row total × column total) / grand total = (60 × 50) / 120 = 25.

Suppose we have a contingency table that shows gender (female, male) by favorite movie type (comedy, drama, action, horror). What would the degrees of freedom be for this contingency table analysis?

Solution: 3. Why? df = (r - 1)(c - 1) = (2 - 1)(4 - 1) = 3.

A chi-square test should not be used if too many cells have expected frequencies below 5. What is the reason for this?

Too much weight is given to categories with relatively low expected frequencies.

If the chi-square statistic has a p-value of 0.82 in a contingency table analysis, what would you conclude concerning the two variables (rows and columns)?

We cannot reject the null that the variables are not related.

A contingency table test of independence gives a chi-square value to the right of the cutoff value. What is the conclusion?

We reject H0. There is evidence of a relationship between the two variables.

If a contingency table has r rows and c columns, how do you calculate the degrees of freedom for the chi-square statistic?

df = (r - 1)(c - 1)

If the sample mean and standard deviation are used as estimates of the population parameters in order to perform a chi-square goodness-of-fit test to check the normality assumption, what are the degrees of freedom for the χ² distribution?

df = k - 1 - 2 = k - 3
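
The pattern behind this answer is that one degree of freedom is lost for each parameter estimated from the sample. A short sketch of that bookkeeping, with a hypothetical function name and k = 8 bins chosen purely for illustration:

```python
def goodness_of_fit_df(k, estimated_params=0):
    """Degrees of freedom: k categories, minus 1, minus each estimated parameter."""
    df = k - 1 - estimated_params
    if df < 1:
        raise ValueError("too few categories for the number of estimated parameters")
    return df

k = 8  # illustrative number of frequency bins
print(goodness_of_fit_df(k))                      # 7: mean and standard deviation known
print(goodness_of_fit_df(k, estimated_params=1))  # 6: only the standard deviation estimated
print(goodness_of_fit_df(k, estimated_params=2))  # 5: both estimated from the sample
```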

A sample consists of n observations which are sorted into k frequency bins. What are the degrees of freedom for a chi-square goodness-of-fit test when no parameters are estimated from the sample?

k - 1

What will happen in a goodness-of-fit test if too many of the cells have expected frequencies lower than 5?

χ² will be inflated (larger), making it more likely that we will reject H0.

Suppose n students are asked to indicate which of the following sports they like best: basketball, baseball, football, or soccer. It is believed that 20% prefer basketball, 10% prefer baseball, 40% prefer football, and 30% prefer soccer. How would expected frequencies be calculated?

For each sport, the expected proportion is multiplied by n (for example, 0.20 × n for basketball).
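
As a sketch of that calculation in Python: the percentages come from the question, while n = 200 is a made-up sample size used only for illustration.

```python
# Hypothesized percentages from the question.
percent_by_sport = {"basketball": 20, "baseball": 10, "football": 40, "soccer": 30}

n = 200  # hypothetical number of students surveyed

# Expected frequency = hypothesized proportion * sample size.
expected = {sport: pct * n / 100 for sport, pct in percent_by_sport.items()}
print(expected)
# {'basketball': 40.0, 'baseball': 20.0, 'football': 80.0, 'soccer': 60.0}
```

The expected frequencies always add up to n, which is a quick sanity check on the table.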

When calculating the chi-square statistic, the difference in the observed frequency for a particular category and its expected frequency is squared and then divided by what?

The expected frequency for the category.
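
Summing those terms over all categories gives the chi-square statistic. A minimal sketch with invented counts for four categories; the observed values are hypothetical and the function name is illustrative:

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 15, 75, 60]  # hypothetical observed frequencies
expected = [40, 20, 80, 60]  # expected frequencies for the same categories

print(chi_square_statistic(observed, expected))  # 4.0625
```

Each term is non-negative, and a category whose observed count equals its expected count contributes 0.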

In a data set with only two cells (two frequencies), what is the accepted limitation on the use of a chi-square test?

The expected frequency in each cell must be at least 5.

How is the normality assumption that underlies tests such as the test of two means, ANOVA, and regression related to goodness-of-fit tests?

The goodness-of-fit test can check to see if the normality assumption is met.

If the chi-square statistic has a p-value of 0.008 in a contingency table analysis, what would you conclude concerning the two variables (rows and columns)?

The two variables are related.

For all chi-square tests, the rejection region is located where?

The upper tail

What do we mean when we say that there is a family of chi-square distributions?

There is a chi-square distribution for 1 degree of freedom, for 2 dof, for 3 dof, etc. but they all have similar properties.

Suppose we have a contingency table that shows gender (female, male) by favorite movie type (comedy, drama, action, horror). What would the null hypothesis be for this contingency table analysis?

There is no relationship between gender and movie type.

When the chi-square statistic is used to test the variables in a contingency table (i.e. rows and columns) for independence, what is the null hypothesis?

There is no relationship between the two variables.
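
Putting the pieces together, a test of independence could be sketched as below. The observed counts are invented for the gender by movie type example (chosen so the row totals are 60 and the comedy column totals 50), the function name is hypothetical, and 7.815 is the tabulated chi-square critical value for df = 3 at the 0.05 significance level.

```python
def independence_test(table, critical_value):
    """Return (chi-square statistic, df, reject_h0) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    # The rejection region is the upper tail only.
    return chi2, df, chi2 > critical_value

# Hypothetical counts; columns are comedy, drama, action, horror.
table = [
    [30, 15, 10, 5],   # female
    [20, 10, 20, 10],  # male
]
chi2, df, reject = independence_test(table, critical_value=7.815)
print(round(chi2, 3), df, reject)
```

With these made-up counts the statistic is about 8.0 on 3 degrees of freedom, just past the cutoff, so H0 (no relationship) would be rejected at the 0.05 level.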

