Stats Exam

Large MST

Large between-group variability

not enough information given to make a decision on the null hypothesis

In ANOVA, the sample size is 500 and eight groups are compared. The total sum of squares is 10,000. The decision should be:
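The decision cannot be made because F depends on how the total sum of squares splits into treatment and error parts, and the card gives only the total. A minimal sketch of that point follows; the two treatment sums of squares are hypothetical values chosen for illustration, not part of the question.

```python
def f_statistic(ss_treatment, ss_total, n_total, k_groups):
    """Return F = MST / MSE for a given (hypothetical) split of SSTotal."""
    ss_error = ss_total - ss_treatment
    mst = ss_treatment / (k_groups - 1)      # df1 = k - 1
    mse = ss_error / (n_total - k_groups)    # df2 = N - k
    return mst / mse

# Values from the card: N = 500, k = 8, SSTotal = 10,000.
# The treatment sums of squares (100 and 5,000) are made up.
f_small = f_statistic(100, 10_000, 500, 8)
f_large = f_statistic(5_000, 10_000, 500, 8)
print(f_small, f_large)
```

The same total gives an F below 1 in one case and a very large F in the other, so the total alone settles nothing.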

data are quantitative

In contrast to a chi-square test, an ANOVA F test is most appropriate when the

89 degrees of freedom

In multiple regression analysis involving 10 independent variables and 100 observations, the critical value t for testing individual coefficients in the model will have __ degrees of freedom?

Individual Confidence Level

Success rate of a procedure for constructing a single confidence interval.

Any number greater than or equal to zero but smaller than 1

If none of the data points for a multiple regression model with two independent variables were on the regression plane, then the multiple coefficient of determination would be:

we can conclude that at least two of the means differ; it does not follow that all the means are different from one another

If the null hypothesis that the means of four groups are all the same is rejected using ANOVA at a 5% significance level, then

0.810

In a multiple regression analysis involving 20 observations and 5 independent variables, the total variation SSTotal=250 and SSError=35. The multiple coefficient of determination adjusted for degrees of freedom is:
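The 0.810 figure can be checked with a short calculation. This sketch uses the usual adjusted R² formula, 1 − [SSE/(n − k − 1)] / [SSTotal/(n − 1)]; the function name is illustrative only.

```python
def adjusted_r_squared(ss_error, ss_total, n, k):
    """Adjusted R^2: each sum of squares is divided by its degrees of freedom."""
    return 1 - (ss_error / (n - k - 1)) / (ss_total / (n - 1))

# Values from the card: n = 20, k = 5, SSTotal = 250, SSError = 35.
r2 = 1 - 35 / 250
adj_r2 = adjusted_r_squared(35, 250, 20, 5)
print(round(r2, 3), round(adj_r2, 3))  # 0.86 0.81
```

The unadjusted value is 0.86; the adjustment for degrees of freedom lowers it to 0.81.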

n - k - 1

In a multiple regression analysis involving k independent variables and n data points, the degrees of freedom associated with the SSE is:

Familywise Confidence Level

Success rate of procedure for constructing a family of confidence intervals, where a "successful" usage is one in which all intervals in the family capture their parameters.
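To see why the familywise level falls below the individual level, consider the simplified case where the intervals are independent. That independence is an assumption made here for illustration; pairwise comparisons in ANOVA share data and are generally not independent.

```python
def familywise_level_if_independent(individual_level, m):
    """Chance that all m intervals capture their parameters,
    ASSUMING the m intervals are independent of one another."""
    return individual_level ** m

for m in (1, 3, 6, 10):
    print(m, round(familywise_level_if_independent(0.95, m), 3))
```

With six independent 95% intervals the familywise level is already only about 0.735.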

In ANOVA, an interaction between two factors means that:

the effect of one factor depends on the level of the other factor

(ANOVA) If the null hypothesis is true,

we would expect all the sample means to be close to one another (and as a result, close to the grand mean (i.e. overall sample mean)).

(ANOVA) A two-independent sample t-test (with equal variances assumed)

will produce the same p-value as a one-way ANOVA test on two groups.

df1 = k − 1, df2 = N − k

(Anova) The F test statistic has an F Distribution with

SSE = Error sum of squares

(a measure of variation of each observation around its group mean)

Assumptions of Two-Way ANOVA

(i) Independent random samples (ii) normal population distributions for each possible group (iii) equal standard deviations

multicollinearity

- is a condition that exists when the independent variables are highly correlated with one another - if it exists => regression coefficients will be difficult to interpret and the standard errors of the regression coefficients for the correlated independent variables will increase

interaction term.

Can "change" the slope of the regression line

Error Anova

It represents the variability in the measurements within groups

SSTotal = Total sum of squares

(a measure of variation of the data around the overall mean)

Grand Mean

overall sample mean of all the data (sum of all the data divided by N)
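When group sizes differ, the grand mean is not the same as the average of the group means. A small sketch with made-up data:

```python
# Hypothetical data: two groups of unequal size.
groups = [[4, 6], [10, 12, 14, 16]]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)      # sum of all data / N

group_means = [sum(g) / len(g) for g in groups]
mean_of_means = sum(group_means) / len(group_means)  # ignores group sizes

print(grand_mean, mean_of_means)
```

Here the grand mean is 62/6, about 10.33, while the unweighted average of the two group means is 9.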

Small MSE

Small within-group variability

Bonferroni and Tukey-Kramer methods are used to

-adjust for multiple comparisons between pairs of treatment groups.

ANOVA Rationale of the F test statistic

-Measure variability between sample means. -Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means. -Therefore, even though sample means may markedly differ from one another, variability between sample means must be judged relative to the "within samples variability".

Assumptions of ANOVA

-Normality is not critical. Extremely long-tailed or skewed distributions only cause problems if sample size in a group is < 30. -The assumptions of independence both within and between groups are critical. -The assumption of equal standard deviations in the population is crucial. Rule of thumb: Check if largest sample standard deviation divided by smallest sample standard deviation is < 2

Relationship between ANOVA and 2 sample t−test

-Suppose we are interested in comparing two treatment/group means using two independent samples. -For the test of H0 : µ1 = µ2 (equal variances assumed), the test statistic for ANOVA will be the square of the test statistic of the corresponding t−test.
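The F = t² relationship can be checked numerically. The sketch below computes the pooled-variance t statistic and the one-way ANOVA F statistic by hand; the two samples are made-up numbers and the function names are illustrative.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with equal variances assumed."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """One-way ANOVA F statistic: MST / MSE."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    sst = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (sst / (k - 1)) / (sse / (n_total - k))

# Hypothetical samples.
a = [5.1, 4.8, 6.0, 5.5]
b = [6.2, 7.1, 6.8, 7.4, 6.5]

t = pooled_t(a, b)
f = one_way_f([a, b])
print(t ** 2, f)  # the two values agree up to rounding error
```

Because F equals t² and the degrees of freedom match (1 and N − 2), the two tests give the same two-sided p-value.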

Tukey-Kramer Approach

-The greatest chance of making an error (type I) arises in comparing the largest mean with the smallest. -Hence, we compute a test statistic that depends on x̄max − x̄min (the largest sample mean minus the smallest). -Using this test statistic, we can determine how each confidence interval (in a family of intervals) should be modified. -If we can protect against an error in this "worst case", all other comparisons are also protected; put another way, if this comparison is not significant, neither is any other one!

multiple comparisons procedures (Tukey/Bonferroni)

-We can examine all pairwise parameter comparisons to determine which parameters differ from which, using ____. -These are methods of carrying out tests so that the familywise type I error rate is controlled (at 0.05, for example).
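As a rough sketch of the Bonferroni version of this idea: list every pair of groups and divide the familywise error rate evenly across the comparisons. The group labels and function name are hypothetical; Tukey-Kramer uses a different adjustment and is not shown.

```python
from itertools import combinations

def bonferroni_plan(group_names, familywise_alpha=0.05):
    """Return all pairwise comparisons and the per-comparison alpha."""
    pairs = list(combinations(group_names, 2))
    return pairs, familywise_alpha / len(pairs)

pairs, alpha_each = bonferroni_plan(["A", "B", "C", "D"])
print(len(pairs), alpha_each)
```

Four groups give k(k − 1)/2 = 6 comparisons, so each one is tested at 0.05/6 to keep the familywise type I error rate at or below 0.05.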

interaction effect

-exists when differences on one factor depend on the level of another factor. In a regression, an interaction can be included as a product term of the two variables (for example, x1·x2). -Two variables interact if a particular combination of variables leads to results that would not be anticipated on the basis of the main effects of those variables.
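A minimal sketch of an interaction in a regression setting, assuming a model of the form y = b0 + b1·x1 + b2·x2 + b3·x1·x2. The coefficient values are hypothetical and chosen only to make the effect visible.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Prediction from a model with a product (interaction) term.
    All coefficients are made-up illustration values."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_in_x1(x2):
    """Change in the prediction per unit change in x1, at a fixed x2."""
    return predict(1, x2) - predict(0, x2)

print(slope_in_x1(0), slope_in_x1(1))
```

The slope in x1 is b1 when x2 = 0 and b1 + b3 when x2 = 1, which is what "the effect of one factor depends on the level of the other" means here.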

Main effects

-this is the impact on the response (dependent variable) of varying levels of that factor, regardless of the other factor (i.e., pooling together the levels of the other factor). There are two main effects, one for each factor. P-value for factor A, P-value for factor B

The coefficient b1 is interpreted as the:

the change in the average value of y per unit change in x1, holding x2 constant

Two-sample t statistic

A ___ test assuming equal variance and an ANOVA comparing only two groups will give you the exact same p-value (for a two-sided hypothesis).

indicator variables

Categorical variables are encoded as ____ (i.e., 0-1 variables) when used as explanatory variables in multiple regression. The ____ variable for a particular category is binary: it equals 1 if the observation falls into that category and it equals 0 otherwise. -Conceptually, they can "change" the y-intercept of the regression line
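A small sketch of the 0-1 encoding follows. It assumes the common convention of leaving one baseline category without its own column (k − 1 indicators for k categories); the category names and helper function are hypothetical.

```python
def indicator_columns(values, baseline):
    """Build one 0-1 column per category, skipping the baseline category."""
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Hypothetical categorical explanatory variable.
treatment = ["placebo", "low", "high", "low"]
cols = indicator_columns(treatment, baseline="placebo")
print(cols)
```

An observation in the baseline category has 0 in every column, so its predicted value uses the intercept alone; each indicator's coefficient then shifts the intercept for its category.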

ANOVA Rule of Thumb

Check if largest sample standard deviation divided by smallest sample standard deviation is < 2
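The rule of thumb is easy to check directly. A sketch with made-up data:

```python
from statistics import stdev

def sd_ratio(groups):
    """Largest sample standard deviation divided by the smallest."""
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)

# Hypothetical samples from three groups.
groups = [[10, 12, 14], [20, 21, 25], [30, 33, 36]]
ratio = sd_ratio(groups)
print(ratio, ratio < 2)
```

Here the sample standard deviations are 2, about 2.65, and 3, so the ratio is 1.5 and the equal-standard-deviation assumption looks reasonable by this rule.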

Multiple regression

Extension of simple linear regression. It is used when we want to predict the value of a variable based on the value of two or more other variables. The variable we want to predict is called the dependent variable

Partial regression coefficient.

For each x term in the multiple regression equation, the corresponding β is referred to as a

ANOVA Hypothesis

H0 : µ1 = µ2 = · · · = µk Ha : at least two of the population means are unequal.

ANOVA Assumptions

(i) Independent random samples; (ii) k normal population distributions; (iii) equal standard deviations; all effect sizes must be greater than or equal to zero

MST (also written MSG) = mean treatment/group sum of squares

MST = SST / (k − 1)

Interaction

P-value for interaction effect of A and B

MSE = mean error sum of squares

MSE = SSE / (N − k)

Regression models

Specify categories of a categorical explanatory variable using artificial variables called indicator variables.

The r-squared multiple coefficient of determination is defined as:

The percentage of the variability in the y values that is explained by the regression: R² = 1 − SSE/SSTotal
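A minimal sketch of the calculation, using observed values and fitted values. Both lists are made-up numbers used only to show the arithmetic; they do not come from a fitted model in this set.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE / SSTotal."""
    y_bar = sum(y) / len(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - ss_error / ss_total

# Hypothetical observed and fitted values.
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [2.5, 3.5, 6.5, 7.5]
print(r_squared(y, y_hat))
```

With SSTotal = 20 and SSE = 1, the result is 0.95, read as 95% of the variability in y being explained.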

at least two of the population means are not equal

The alternative hypothesis for an ANOVA F test is that

all the population means are equal.

The null hypothesis for an ANOVA F test is that

the F−test

To test the validity of a multiple regression model, we test the null hypothesis that the regression coefficients are all zero. We apply:

family of confidence intervals

When several 95% confidence intervals are considered simultaneously

The population standard deviations must be equal

Which of the following is an assumption in the ideal model for comparing several populations used for one-way ANOVA?

A large MST and a small MSE will

lead to a large value of the F test statistic. This will lead to a small p−value, which in turn leads to a decision to reject H0.

SST = Treatment sum of squares

a measure of variation of the group/treatment means from the overall mean
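The sums of squares on these cards fit together as SSTotal = SST + SSE, and F = MST / MSE. A sketch of the full calculation on a tiny made-up data set (the function and key names are illustrative):

```python
def anova_table(groups):
    """One-way ANOVA quantities computed from raw data."""
    all_values = [x for g in groups for x in g]
    n_total, k = len(all_values), len(groups)
    grand = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Treatment SS: group means around the grand mean, weighted by group size.
    sst = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Error SS: each observation around its own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Total SS: each observation around the grand mean.
    ss_total = sum((x - grand) ** 2 for x in all_values)
    mst = sst / (k - 1)
    mse = sse / (n_total - k)
    return {"SST": sst, "SSE": sse, "SSTotal": ss_total,
            "MST": mst, "MSE": mse, "F": mst / mse}

# Hypothetical data: three groups of three observations.
table = anova_table([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(table)
```

For this data SST = 54, SSE = 6, SSTotal = 60, MST = 27, MSE = 1, and F = 27.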

If the alternative hypothesis is true

at least some of the sample means would differ.

B0, the Y-intercept

can be interpreted as the value you would predict for Y if both X1 = 0 and X2 = 0. However, this is only a meaningful interpretation if it is reasonable that both X1 and X2 can be 0, and if the dataset actually included values for X1 and X2 that were near 0. If either of these conditions fails, then _____ really has no meaningful interpretation. It just anchors the regression line in the right place. In our case, it is easy to see that X2 sometimes is 0, but if X1, our bacteria level, never comes close to 0, then our intercept has no real interpretation.

Which of the following changes the analysis of variance results

Each value in one of the samples is multiplied by the same constant C; the same constant is added to each value in one of the samples.
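A quick numerical check of this card, using made-up data. For contrast, the sketch also applies the same shift and the same scaling to every sample, which leaves F unchanged.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic: MST / MSE."""
    all_values = [x for g in groups for x in g]
    n_total, k = len(all_values), len(groups)
    grand = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]
    sst = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (sst / (k - 1)) / (sse / (n_total - k))

# Hypothetical data.
base = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_base = one_way_f(base)

# Change only the first sample: F changes.
f_one_shifted = one_way_f([[x + 10 for x in base[0]], base[1], base[2]])
f_one_scaled = one_way_f([[x * 2 for x in base[0]], base[1], base[2]])

# Change every sample the same way: F stays the same.
f_all_shifted = one_way_f([[x + 10 for x in g] for g in base])
f_all_scaled = one_way_f([[x * 2 for x in g] for g in base])

print(f_base, f_one_shifted, f_one_scaled, f_all_shifted, f_all_scaled)
```

Shifting or rescaling a single sample moves its mean (and, for scaling, its spread) relative to the others, so F moves from 27 to 37 and 6.5 in this example.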

An experiment has a one-way (or completely randomized) design

if several levels of one factor are being studied and the individuals are randomly assigned to its levels. (There is only one way to group the data.)

Two-Way Analysis of Variance (ANOVA)

is a technique for studying the relationship between a quantitative dependent variable and two qualitative independent variables

Analysis of variance (ANOVA)

is the technique used to determine whether more than two population means are equal

