STA 2023: Ch 9/10

10.3 List five properties of the​ F-distribution.

1. The F-distribution is a family of curves, each determined by two types of degrees of freedom: those of the numerator, denoted d.f.N, and those of the denominator, denoted d.f.D.
2. The F-distribution is positively skewed and therefore not symmetric.
3. The total area under each F-distribution curve is equal to 1.
4. All values of F are greater than or equal to 0.
5. For all F-distributions, the mean value of F is approximately equal to 1.

10.3 List the three conditions that must be met in order to use a​ two-sample F-test.

1. The samples must be randomly selected. 2. The samples must be independent. 3. Each population must have a normal distribution.

10.4 What conditions are necessary in order to use a​ one-way ANOVA​ test?

1. There must be at least 3 samples. 2. Each population must have the same variance. 3. The samples must be randomly selected from a​ normal, or approximately​ normal, population. 4. The samples must be independent of each other.

9.2 What is a​ residual? Explain when a residual is​ positive, negative, and zero.

A residual is the difference between the observed​ y-value of a data point and the predicted​ y-value on a regression line for the​ x-coordinate of the data point. A residual is positive when the point is above the​ line, negative when it is below the​ line, and zero when the observed​ y-value equals the predicted​ y-value.
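
As a small illustration of the definition above, the sketch below computes residuals for a few points against a given line. The data and the line ŷ = 2x + 1 are hypothetical, chosen only so the three cases (positive, negative, zero) all appear.

```python
# Hypothetical data and regression line, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [3.0, 4.5, 7.0, 10.0]
slope, intercept = 2.0, 1.0  # assumed line: y_hat = 2x + 1


def residual(x, y, m, b):
    """Observed y minus the y predicted by the line y_hat = m*x + b."""
    return y - (m * x + b)


residuals = [residual(x, y, slope, intercept) for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, residuals):
    if e > 0:
        position = "above the line"
    elif e < 0:
        position = "below the line"
    else:
        position = "on the line"
    print(f"x={x}, y={y}, residual={e:+.1f} ({position})")
```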

9.1 Explain how to determine whether a sample correlation coefficient indicates that the population correlation coefficient is significant

A table can be used to compare the absolute value of r with a critical value, or a hypothesis test can be performed using a t-test.
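
The t-test option can be sketched as follows. The formula used here, t = r / sqrt((1 - r²)/(n - 2)) with n - 2 degrees of freedom, is the standard statistic for testing whether the population correlation is zero; it is not spelled out in the card above. The values of r and n are hypothetical, and the critical value is read from a standard t-table rather than computed.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


r, n = 0.6, 18  # hypothetical sample correlation and sample size
t = correlation_t_statistic(r, n)

# Critical value from a standard t-table: d.f. = 16, two-tailed, alpha = 0.05.
t_critical = 2.120

significant = abs(t) > t_critical
print(f"t = {t:.3f}, significant: {significant}")
```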

9.1 Value of home and life span are two variables that have been shown to have a positive correlation but no​ cause-and-effect relationship. Describe at least one possible reason for the correlation.

A. Greater wealth allows people to afford more valuable homes and to spend more money on health​ care, and greater health care spending generally enables people to live longer. B. Exercise tends to increase life​ spans, people who live within walking distance of amenities tend to walk more than those who do​ not, and homes that are within walking distance of amenities tend to be more valuable than homes that are not.

10.3 Test the claim about the differences between two population variances

A. If the alternative hypothesis (Ha) contains the less-than (<) or greater-than (>) inequality, a right-tailed test is needed. B. If the alternative hypothesis (Ha) contains the not-equal-to symbol (≠), a two-tailed test is needed. Use (n1 - 1) d.f. in the numerator and (n2 - 1) d.f. in the denominator.

9.3 A. The coefficient of determination r² is the ratio of which two types of​ variations? B. What does r² measure? C. What does (1-r²) measure?

A. The coefficient of determination is the ratio of the explained variation to the total variation. B. The coefficient of determination is the percent of variation of y that is explained by the relationship between x and y. C. The value (1-r²) is the percent of the variation that is unexplained.

10.3 Explain how to determine the values of d.f.N and d.f.D when performing a two-sample F-test.

A. The variable d.f.N represents the degrees of freedom of the​ numerator, and the variable d.f.D represents the degrees of freedom of the denominator. B. The value of d.f.N is equal to (n₁ - 1), and the value of d.f.D is equal to (n₂ - 1), where n₁ and n₂ represent the sample sizes of the numerator and denominator​ (respectively).
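
A minimal sketch of how the degrees of freedom pair up with the samples. It assumes the common textbook convention that the larger sample variance goes in the numerator of F, which the card does not state explicitly; the function name and the data are hypothetical.

```python
import statistics


def two_sample_f(sample_a, sample_b):
    """Return (F, d.f.N, d.f.D) with the larger sample variance on top.

    Assumption: the larger variance is placed in the numerator, so F >= 1.
    """
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical samples.
first = [2, 4, 6, 8]
second = [1, 5, 9, 13, 17]

f_stat, df_n, df_d = two_sample_f(first, second)
print(f"F = {f_stat:.2f}, d.f.N = {df_n}, d.f.D = {df_d}")
```

Here the second sample has the larger variance, so its size determines d.f.N.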

9.3 A. Calculate the coefficient of determination given r B. What does this tell you about the explained variation of the data about the regression​ line? C. About the unexplained​ variation?

A. Coefficient of determination = r². EX: r = 0.033; squaring r gives 0.001 after rounding. B. Convert 0.001 to a percentage (0.1%): about 0.1% of the variation is explained. C. Subtract 0.1% from 100% to get 99.9%: about 99.9% of the variation is unexplained.
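
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the same three steps, using the r = 0.033 value from the card.

```python
r = 0.033  # value from the worked example

# Step A: square r and round to three decimal places.
r_squared = round(r ** 2, 3)

# Step B: express as a percentage (explained variation).
explained_pct = r_squared * 100

# Step C: the remainder is the unexplained variation.
unexplained_pct = 100 - explained_pct

print(f"r^2 = {r_squared}")
print(f"explained: {explained_pct:.1f}%, unexplained: {unexplained_pct:.1f}%")
```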

10.4 Which statement below describes the hypotheses for a​ two-way ANOVA​ test?

A​ two-way ANOVA test has three null​ hypotheses, one for each main effect and one for the interaction effect.

10.2 Expected frequency

Expected frequency: (row total * column total) / grand total. Step 1. Find horizontal row total Step 2. Find vertical column total Step 3. Multiply these and divide by the grand total.

10.2 Explain how to find the expected frequency for a cell in a contingency table.

Find the sum of the row and sum of the column in which the cell is located. Find the product of these two sums. Divide the product by the sample size.
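
The steps above can be sketched in Python for every cell at once. The 2x2 table of observed counts is hypothetical.

```python
# Hypothetical observed frequencies (rows x columns).
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
sample_size = sum(row_totals)  # grand total

# Expected frequency = (row total * column total) / sample size
expected = [
    [r_total * c_total / sample_size for c_total in col_totals]
    for r_total in row_totals
]

for row in expected:
    print(row)
```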

10.1 Find the expected​ frequency, for the given values of n and pi.

Formula: Ei = n*pi, where n is the sample size and pi is the assumed probability of the ith category.
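
A short sketch of this formula, with a check of the "each expected frequency is at least 5" condition used by the goodness-of-fit test. The sample size and category probabilities are hypothetical.

```python
n = 200  # hypothetical sample size
probabilities = {"A": 0.5, "B": 0.3, "C": 0.2}  # hypothetical claimed distribution

# Expected frequency of each category: E_i = n * p_i
expected = {category: n * p for category, p in probabilities.items()}

# The goodness-of-fit test requires every expected frequency to be at least 5.
condition_met = all(e >= 5 for e in expected.values())

print(expected)
print("All expected frequencies >= 5:", condition_met)
```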

10.4 State the null and alternative hypotheses for a​ one-way ANOVA test.

H₀: All population means are equal. Ha: At least one population mean is different from the others.

10.2 Calculate marginal frequencies and sample size.

Marginal relative frequency: calculated by dividing a row total or a column total by the sample size. Sample size: calculated by adding all numbers in the table.
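
As an illustration, the sketch below computes the sample size and the marginal relative frequencies for a hypothetical 2x2 table.

```python
# Hypothetical contingency table (rows x columns).
table = [
    [10, 20],
    [30, 40],
]

# Sample size: the sum of every entry in the table.
sample_size = sum(sum(row) for row in table)

# Marginal relative frequencies: row or column total divided by sample size.
row_marginals = [sum(row) / sample_size for row in table]
col_marginals = [sum(col) / sample_size for col in zip(*table)]

print("sample size:", sample_size)
print("row marginals:", row_marginals)
print("column marginals:", col_marginals)
```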

10.2 Decide whether to reject or fail to reject H0.

Reject H0 when the test statistic (χ²) is in the rejection region. Fail to reject H0 when the test statistic (χ²) is NOT in the rejection region.
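
A minimal sketch of the decision for an independence test. It uses the standard chi-square statistic, the sum of (O - E)²/E over all cells, which is not written out in the card. The observed table is hypothetical, and the critical value comes from a standard chi-square table rather than being computed.

```python
# Hypothetical observed frequencies (rows x columns).
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / total
        chi_square += (o - e) ** 2 / e

# d.f. = (r - 1)(c - 1) = 1; critical value from a chi-square table, alpha = 0.05.
critical_value = 3.841

decision = "reject H0" if chi_square > critical_value else "fail to reject H0"
print(f"chi-square = {chi_square:.3f}: {decision}")
```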

10.3 Explain how to find the critical value for an​ F-test.

Specify the level of​ significance, α. Determine the degrees of freedom for the​ numerator, d.f.N​, and​ denominator, d.f.D. Find the critical value of F using technology or the​ F-distribution table.

10.2 Relative Frequency (%)

Step 1. Count the total number of items (the sum of all frequencies). Step 2. Divide each single frequency by the total. EX: Cats: 5, Dogs: 5, Total: 10. Divide 5/10 to get 0.5, or 50%.

10.2 Conditional Relative Frequency (%)

Step 1. Find row (horizontal) total Step 2. Find single row entry and divide by row total.
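
The two steps can be sketched for one row of a table. The row entries are hypothetical.

```python
# Hypothetical single row of a contingency table.
row = [10, 30]

# Step 1: row total.
row_total = sum(row)

# Step 2: each entry divided by the row total.
conditional = [entry / row_total for entry in row]

for entry, freq in zip(row, conditional):
    print(f"{entry} / {row_total} = {freq:.2f} ({freq:.0%})")
```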

10.4 Describe the difference between the variance between samples MSB and the variance within samples MSW.

The MSB measures the differences related to the treatment given to each sample. The MSW measures the differences related to entries within the same sample.
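
To make the distinction concrete, the sketch below computes MSB, MSW, and their ratio F for three small groups. It uses the standard one-way ANOVA formulas with k - 1 and N - k degrees of freedom; the function name and the data are hypothetical.

```python
def one_way_anova(groups):
    """Return (MSB, MSW, F) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between samples: how far each sample mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within samples: how far each entry is from its own sample mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    msb = ss_between / (k - 1)        # d.f. between = k - 1
    msw = ss_within / (n_total - k)   # d.f. within = N - k
    return msb, msw, msb / msw


# Hypothetical samples.
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
msb, msw, f_stat = one_way_anova(groups)
print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_stat:.2f}")
```

A large F means the sample means differ by much more than the spread inside each sample would suggest.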

9.3 How can the coefficient of determination be​ interpreted?

The coefficient of determination is the fraction of the variation in money spent that can be explained by the variation in money raised. The remaining fraction of the variation is unexplained and is due to other factors or to sampling error.

9.1 Two variables have a positive linear correlation. Does the dependent variable increase or decrease as the independent variable​ increases?

The dependent variable increases.

9.3 Describe the explained variation about a regression line in words and in symbols.

The explained variation is the sum of the squares of the differences between the predicted y-values and the mean of the y-values of the ordered pairs, or ∑(ŷi - ȳ)².

9.1 "Correlation does not imply causation"

The fact that two variables are strongly correlated does not in itself imply a cause-and-effect relationship between the variables.

10.1 What conditions are necessary to use the​ chi-square goodness-of-fit​ test?

The observed frequencies must be obtained randomly and each expected frequency must be greater than or equal to 5.

9.2 Determine if the point is influential. The change in slope or intercept is significant if it is larger than​ 10%.

The point is not an influential point because the slopes with the point included and without the point included are not significantly​ different, and the intercepts are not significantly different.

9.1 Describe the range of values for the correlation coefficient.

The range of values for the correlation coefficient is -1 to​ 1, inclusive.

9.2 Two variables have a positive linear correlation. Is the slope of the regression line for the variables positive or​ negative?

The slope is positive. As the independent variable increases the dependent variable also tends to increase.

9.3 Standard error of estimate

The standard error of estimate is the square root of the unexplained variation divided by its degrees of freedom. d.f.: n - 2, where n is the sample size (the number of ordered pairs).
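
In its standard form, the standard error of estimate is sqrt(∑(yi - ŷi)² / (n - 2)), where ∑(yi - ŷi)² is the unexplained variation. The sketch below fits a least-squares line to hypothetical data and then applies that formula.

```python
import math

# Hypothetical (x, y) data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Least-squares slope and intercept.
x_mean = sum(xs) / n
y_mean = sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean

# Unexplained variation: sum of squared residuals.
unexplained = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Standard error of estimate with n - 2 degrees of freedom.
standard_error = math.sqrt(unexplained / (n - 2))
print(f"standard error of estimate = {standard_error:.3f}")
```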

9.3 Describe the total variation about a regression line in words and symbols.

The total variation is the sum of the squares of the differences between the y-values of each ordered pair and the mean of the y-values of the ordered pairs, or ∑(yi - ȳ)².

9.3 Describe the unexplained variation about a regression line in words and in symbols.

The unexplained variation is the sum of the squares of the differences between the observed y-values and the predicted y-values, or ∑(yi - ŷi)².
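
The three variations can be computed side by side. The sketch below uses hypothetical data and a least-squares line, for which the explained and unexplained variations add up to the total variation and r² is the ratio of explained to total.

```python
# Hypothetical (x, y) data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Least-squares regression line y_hat = slope * x + intercept.
x_mean = sum(xs) / n
y_mean = sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean
predicted = [slope * x + intercept for x in xs]

total = sum((y - y_mean) ** 2 for y in ys)                       # sum of (y - y_bar)^2
explained = sum((p - y_mean) ** 2 for p in predicted)            # sum of (y_hat - y_bar)^2
unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # sum of (y - y_hat)^2

r_squared = explained / total

print(f"total = {total:.2f}")
print(f"explained = {explained:.2f}, unexplained = {unexplained:.2f}")
print(f"r^2 = {r_squared:.2f}")
```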

10.2 T/F: If the test statistic for the​ chi-square independence test is​ large, you​ will, in most​ cases, reject the null hypothesis.

True

9.3 Explain what it means for two variables to have a bivariate normal distribution.

Two variables have a bivariate normal distribution when for any fixed values of x the corresponding values of y are normally​ distributed, and for any fixed values of y the corresponding values of x are normally distributed.

9.1 Give examples of two variables that have a perfect positive linear correlation and two variables that have a perfect negative linear correlation.

Two variables that have perfect positive linear correlation are the price per gallon of gasoline and the total cost of gasoline. Two variables that have perfect negative linear correlation are the distance from a door and the height of a wheelchair ramp.

9.3 What is the coefficient of determination for two variables that have perfect positive linear correlation or perfect negative linear​ correlation? Interpret your answer.

Two variables that have perfect positive or perfect negative linear correlation have a correlation coefficient of 1 or −1, respectively. In either case the coefficient of determination is​ 1, which means​ 100% of the variation in the response variable is explained by the variation in the explanatory variable.

9.1 A farmer wants to determine if the amount of sunlight received by similar crops can be used to predict the harvest of the crop. What is the explanatory variable? What is the response variable?

Explanatory variable: amount of sunlight. Response variable: harvest of the crop.

10.2 Explain how the​ chi-square independence test and the​ chi-square goodness-of-fit tests are similar. How are they​ different?

Chi-square independence test:
A. Has d.f. = (r-1)(c-1)
B. Expected frequency: Er,c
C. Tests whether two variables are independent
Chi-square goodness-of-fit test:
A. Has d.f. = (k-1)
B. Expected frequency: Ei = npi
C. Tests whether a frequency distribution fits an expected distribution
Both:
A. Data are obtained from a random sample
B. Each expected frequency is at least 5
C. Test a claim about data that are in categories

10.2 Degrees of freedom for chi-square contingency table

d.f = (r-1)(c-1) where r is the number of rows and c is the number of columns. (only counting rows and columns with data values)

Degrees of freedom for ANOVA

dfb = (k - 1): the number of groups minus 1.
dfw = (N - k): the total number of participants minus the number of groups.

9.3 Find the coefficient of determination (r²) using x and y table values

https://exploringfinance.com/coefficient-of-determination-r-squared-calculator/

10.3 Find the critical​ F-value for a​ two-tailed test using the indicated level of significance α and degrees of freedom.

https://mathcracker.com/f-critical-values Always the second crit value

9.3 Constructing Prediction Interval

https://mathcracker.com/prediction-interval-calculator-regression-prediction

10.3 Find the critical​ F-value for a​ right-tailed test using the indicated level of significance α and degrees of freedom.

https://www.danielsoper.com/statcalc/calculator.aspx?id=4

10.1 Determine the critical​ value, and the rejection region. DF: (k-1) where k is the number of categories in the table (Chi-squared dist.)

https://www.omnicalculator.com/statistics/critical-value

10.2 Chi-squared independent critical value/rejection region

https://www.omnicalculator.com/statistics/critical-value

10.2 Chi-squared independent test statistic

https://www.socscistatistics.com/tests/chisquare2/default2.aspx

9.2 Finding Correlation Coefficient : r =

https://www.socscistatistics.com/tests/pearson/default2.aspx

9.2 Finding line of regression and estimates ŷ_____x + (________)

https://www.socscistatistics.com/tests/regression/default.aspx

9.1 What does (1 - r²) ​measure?

The percent of the variation that is unexplained.

9.1 Discuss the difference between r and ρ.

r: represents the sample correlation coefficient. ρ: represents the population correlation coefficient.

9.1 Describe the explained variation about a regression line in words and in symbols.

The explained variation is the sum of the squares of the differences between the predicted y-values and the mean of the y-values of the ordered pairs, or ∑(ŷi - ȳ)².

9.1 What does r² ​measure?

The percent of variation of y that is explained by the relationship between x and y.

