7007


ANOVA assumptions

- Continuous outcomes and independent groups
- Independent observations
- Normally distributed outcomes
- Equal variance of the outcome

One-sample t-test assumptions

- Continuous variable
- Independent observations
- Normal distribution

Dependent-samples t-test assumptions

- Continuous variable and two dependent groups
- Independent observations
- Normal distribution of differences

Independent-samples t-test assumptions

- Continuous variable and two independent groups
- Independent observations
- Normal distribution in each group
- Equal variances for each group

Correlation Assumptions

- Observations are independent
- Both variables are continuous
- Both variables are normally distributed
- Relationship between the two variables is linear (linearity)
- Variance is constant with the points distributed equally around the line (homoscedasticity)

Multiple linear regression

- Observations are independent
- Outcome is continuous
- Relationship between the outcome and each continuous predictor is linear (linearity)
- Variance is constant with the points distributed equally around the line (homoscedasticity)
- Residuals are independent
- Residuals are normally distributed
- No perfect multicollinearity

Linear regression assumptions

- Observations are independent
- Outcome is continuous
- Relationship between the two variables is linear (linearity)
- Variance is constant with the points distributed equally around the line (homoscedasticity)
- Residuals are independent
- Residuals are normally distributed

Chi-squared test Assumption

- Variables must be nominal or ordinal (usually nominal)
- Expected values should be 5 or higher in at least 80% of groups
- Observations must be independent

Measures to help identify outliers and influential observations

- Standardized residuals
- df-betas
- Cook's distance
- Leverage

Which of the following would be considered a very strong negative correlation? - .89 - -.09 - -.89 - .09

-.89

What percentage of the variance is shared if two variables are correlated at .4? - 40% - 4% - 8% - 16%

16%
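
The 16% comes from squaring the correlation coefficient. A minimal sketch of that arithmetic in Python; the function name `shared_variance` is made up for illustration and is not part of the study set:

```python
def shared_variance(r: float) -> float:
    """Return the proportion of variance two variables share, given Pearson's r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return r ** 2


# A correlation of .4 corresponds to about .16, or roughly 16% shared variance.
pct = round(shared_variance(0.4) * 100)
print(f"{pct}% of the variance is shared")
```

The sign of r does not matter here: a correlation of -.4 shares the same proportion of variance as .4.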

How many pairwise comparisons would there be for an ANOVA with four groups? - 16 - 4 - 12 - 6

6
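
The count of pairwise comparisons among k groups is k(k - 1)/2. A short sketch in Python that enumerates the pairs for four groups; the group labels are hypothetical placeholders:

```python
from itertools import combinations
from math import comb

groups = ["A", "B", "C", "D"]  # hypothetical labels for the four groups

# Every unordered pair of distinct groups is one pairwise comparison.
pairs = list(combinations(groups, 2))
n_pairs = comb(len(groups), 2)  # k choose 2 = k * (k - 1) / 2

for first, second in pairs:
    print(f"{first} vs. {second}")
print(f"{n_pairs} pairwise comparisons")
```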

left skewed

A density curve where the left side of the distribution extends in a long tail. (Mean < median.)

Chi-square test

A statistical method of testing for an association between two categorical variables. Specifically, it tests for the equality of two frequencies or proportions.

standard error

An estimate of the standard deviation of the sampling distribution of a statistic.

Graphs for one continuous and one categorical variable

- Bar
- Point
- Boxplot
- Violin

Which of the following is appropriate to graph a single categorical variable? - Histogram - Bar chart - Boxplot - Scatterplot

Bar chart

ANOVA post hoc tests

- Bonferroni: conducts a t-test for each pair of means and adjusts for the number of comparisons
- Tukey's honestly significant difference (HSD): a less conservative test than Bonferroni

Alternative for failing homogeneity assumption (ANOVA)

- Brown-Forsythe
- Welch's

Correlation effect size

Coefficient of determination (r-squared, or R^2)

T-test Effect Size Statistic

Cohen's d
- .2 to <.5 = small
- .5 to <.8 = medium
- .8+ = large
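
The cutoffs above can be written as a small lookup. A sketch in Python, assuming the sign of d is ignored when judging size; the function name `cohens_d_label` and the label for values under .2 are illustrative choices, not from the study set:

```python
def cohens_d_label(d: float) -> str:
    """Classify Cohen's d using the cutoffs listed on this card."""
    size = abs(d)  # assumption: direction of the difference is ignored
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below the small cutoff"  # the card gives no label under .2


for d in (0.1, 0.35, -0.6, 1.2):
    print(d, cohens_d_label(d))
```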

What is the primary purpose of ANOVA? - Comparing means across three or more groups - Comparing medians across three or more groups - Examining the relationship between two categorical variables - Identifying normally distributed data

Comparing means across three or more groups

What is the primary purpose of the three t-tests? - Comparing means among groups - Comparing medians among groups - Examining the relationship between two categorical variables - Identifying normally distributed data

Comparing means among groups

The results of running R code show in which pane? - Source - Environment - History - Console

Console

Chi-squared effect sizes

- Cramér's V
- Phi coefficient
- Odds ratio

Which t-test would you use to compare mean BMI in sets of two brothers? - One-sample t-test - Independent-samples t-test - Chi-squared t-test - Dependent-samples t-test

Dependent-samples t-test

Custom functions are useful when doing which of the following? - Loading a library - Visualizing the distribution of one variable - Working with continuous variables - Doing the same thing multiple times

Doing the same thing multiple times

What is the primary purpose of Pearson's and Spearman's correlation coefficients? - Examining the relationship between two noncategorical variables - Identifying deviations from normality for continuous variables - Examining the relationship between two categorical variables - Comparing means across groups

Examining the relationship between two noncategorical variables

ANOVA statistic

F(4, 1404) = 43.3; p < .05

Which R data type is most appropriate for a categorical variable? - Numeric - Factor - Integer - Character

Factor

Alternative test for two-way ANOVA

Friedman test

Which of the following is appropriate to graph a single continuous variable? - Waffle chart - Histogram - Bar chart - Pie chart

Histogram

Graphs for single continuous variables

- Histogram
- Density plot
- Boxplot

Which of the following assumptions does not apply to all three t-tests? - Independent observations - Normal distribution of continuous variable - Homogeneity of variances - Inclusion of one continuous variable

Homogeneity of variances

Which of the following measures would be most appropriate for describing the spread of a variable that is extremely right-skewed? - Standard deviation - Range - IQR - Mode

IQR

Which of the following assumptions does not apply to ANOVA? - Independent observations - Normal distribution of continuous variables - Homogeneity of variances - Inclusion of one bivariate variable

Inclusion of one bivariate variable

Which of the following is true about the adjusted R2? - It is usually larger than the R2 - It is only used when there is just one predictor - It is usually smaller than the R2 - It is used to determine whether residuals are normally distributed

It is usually smaller than the R2

Alternative for failing the normality assumption (ANOVA)

Kruskal-Wallis test

Graphs for two continuous variables

- Line
- Scatterplot

Data transformations

- Linear transformations: keep existing linear relationships between variables, often by multiplying or dividing one or both of the variables by some amount
- Nonlinear transformations: increase (or decrease) the linear relationship between two variables by applying an exponent (power transformation) or other function to one or both of the variables

Alternative to the independent-samples t-test

- Mann-Whitney U test
- Kolmogorov-Smirnov test

When an independent-samples t-test does not meet the assumption of normality, what is an appropriate alternative test? - Sign test - Levene's test - Mann-Whitney U test - Dependent-samples t-test

Mann-Whitney U test

Violating independent observations assumption (Chi-squared)

- McNemar's test
- Cochran's Q-test

Which of the following measures would be most appropriate for describing the central tendency of a variable that is continuous and normally distributed? - Mean - Variance - Median - Mode

Mean

The normal distribution depends on which of the following? - Mean and standard deviation - Sample size and probability of success - Standard deviation and number of successes - Mean and probability of success

Mean and standard deviation

Which of the following is not an assumption for the Pearson's correlation analysis? - Normally distributed variables - Monotonic relationship - Linear relationship - Constant variance

Monotonic relationship

Graphs for two categorical variables

- Mosaic
- Bar

Which of the following is not an assumption for simple linear regression? - Normally distributed variables - Multicollinearity - Linear relationship - Constant variance - Normally distributed residuals

Multicollinearity

Apply a Bonferroni adjustment to a p-value of .01 if the analyses included six pairwise comparisons. If the threshold for statistical significance were .05, would the adjusted p-value be significant? - Yes - No

No
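
The Bonferroni adjustment multiplies the unadjusted p-value by the number of comparisons. A minimal sketch of the arithmetic in Python; the function name `bonferroni_adjust` is illustrative, and capping the result at 1 is a common convention assumed here rather than something stated in the study set:

```python
def bonferroni_adjust(p: float, n_comparisons: int) -> float:
    """Multiply p by the number of comparisons, capped at 1 (assumed convention)."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return min(1.0, p * n_comparisons)


alpha = 0.05
adjusted = bonferroni_adjust(0.01, 6)  # about .06
significant = adjusted < alpha

print(f"adjusted p = {adjusted:.2f}; significant at {alpha}: {significant}")
```

With six comparisons the adjusted p-value is about .06, which is above .05, so the result is no longer statistically significant.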

Which of the following is not an assumption for binary logistic regression? - Normally distributed variables - No multicollinearity - Linearity - Independence of observations

Normally distributed variables

Which of the following tests would be used to test the mean of a continuous variable to a population mean? - One-sample t-test - Independent-samples t-test - Chi-squared t-test - Dependent-samples t-test

One-sample t-test

Which test is used to determine whether a correlation coefficient is statistically significant? - Paired samples t-test - Chi-squared test - One-sample t-test - P-value

One-sample t-test

Graphs for single categorical variables

- Pie
- Waffle
- Bar
- Point

Which of the following is not a recommended type of graph? - Pie chart - Bar chart - Waffle chart - Density plot

Pie chart

For a categorical predictor in a logistic regression model, what is the group that other groups are compared to called? - Null group - Independent group - Standard group - Reference group

Reference group

The chi-squared distribution often has what type of skew? - Left - Right - It depends - It is not skewed

Right

The binomial distribution depends on which of the following? - Mean and standard deviation - Sample size and probability of success - Standard deviation and number of successes - Mean and probability of success

Sample size and probability of success

Alternative to one sample t-test

Sign test - examines the median instead of the mean

Alternative test for Correlations

Spearman's rho

A significant odds ratio of 2.5 for BMI as a continuous predictor of heart disease in a binary logistic model would indicate which of the following? - The odds of heart disease increase 2.5% for every 1-point increase in BMI. - Those with heart disease have 2.5 times higher odds of having an increasing BMI compared to those without heart disease. - The odds of heart disease are 2.5 times higher for every 1-point increase in BMI. - There are 2.5 times as many people with heart disease as without among those with higher BMI.

The odds of heart disease are 2.5 times higher for every 1-point increase in BMI.
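
Because an odds ratio is multiplicative, the effect compounds across units of the predictor. A sketch in Python, assuming the usual logistic-model form in which each 1-point increase multiplies the odds by the odds ratio; the function name `odds_multiplier` is hypothetical:

```python
def odds_multiplier(odds_ratio: float, increase: float) -> float:
    """Factor by which the odds change for a given increase in the predictor."""
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    return odds_ratio ** increase


one_point = odds_multiplier(2.5, 1)   # odds are 2.5 times higher
two_points = odds_multiplier(2.5, 2)  # 2.5 * 2.5 = 6.25 times higher

print(one_point, two_points)
```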

True or False? In R, categorical variables are best represented by the factor data type and continuous variables are best represented by the numeric data type. - True - False

True

Violating expected values assumption (Chi-squared)

Use Fisher's exact test

t-test used when the variances in two groups are unequal

Welch's t-test

In which situation would you use planned comparisons? - After a significant ANOVA to compare each pair of means - Instead of an ANOVA when the data did not meet the normality assumption - When you have to choose between two categorical variables - When you conduct an ANOVA and have hypotheses about which sets of means are different from one another

When you conduct an ANOVA and have hypotheses about which sets of means are different from one another

Alternative to dependent-sample t-test

Wilcoxon signed-ranks test

right skewed

a distribution with a tail that extends to the right (Mean > Median)

Two-way ANOVA

a hypothesis test that includes two nominal independent variables, regardless of their numbers of levels, and a scale dependent variable

monotonic

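a relationship that goes in only one direction: as one variable increases, the other consistently increases or consistently decreases, though not necessarily along a straight line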

Significance for the coefficients (b) is determined by - an F-test. - an R2 test. - a correlation coefficient. - a t-test.

a t-test.

Ordinary least squares

a type of linear least squares method for estimating the unknown parameters in a linear regression model

Durbin-Watson test

can be used to determine whether the model violates the assumption of independent residuals

ANOVA effect size tests

eta-squared, omega-squared
- .01 to <.06 = small
- .06 to <.14 = medium
- .14+ = large

Density plots, histograms, and boxplots can all be used to... - examine frequencies in categories of a factor. - examine the relationship between two categorical variables. - determine whether two continuous variables are related. - examine the distribution of a continuous variable.

examine the distribution of a continuous variable

Pearson's partial correlation

examining how multiple variables share variance with each other

Deterministic

relationships that have exactly one value of y for each value of x

A confidence interval indicates a significant odds ratio when - it includes 1. - it includes 0. - it does not include 1. - it does not include 0.

it does not include 1.

Which of the following opens the ggplot2 library? - install.packages("ggplot2") - library(package = "ggplot2") - summary(object = ggplot2) - open(x = ggplot2)

library(package = "ggplot2")

y = mx+b

- m is the slope of the line
- b is the y-intercept
- x and y are the coordinates of each point along the line

Computing the percent correctly predicted by the model is one way to determine... - model fit. - model significance. - predictor significance. - if assumptions are met.

model fit.

Platykurtic

curves that are shorter and more dispersed (broader) than the normal curve

Leptokurtic

curves that are taller and thinner than the normal curve, with only a few scores in the middle of the distribution having a high frequency

In a data frame containing information on the age and height of 100 people, the people are the _____________ and age and height are the _____________. - observations, variables - variables, observations - data, factors - factors, data

observations, variables

Chi-squared is computed by first squaring the differences between... - observed frequencies and expected frequencies. - observed frequencies and the total sample size. - observed frequencies and observed percentages. - expected values and observed percentages.

observed frequencies and expected frequencies.
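
The full statistic sums the squared differences, each divided by its expected frequency. A sketch in Python using only the standard library; the 2 x 2 table of counts is made-up example data and the function names are illustrative:

```python
def expected_counts(observed):
    """Expected frequency per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


observed = [[30, 10], [20, 40]]  # hypothetical counts, not real data
print(expected_counts(observed))
print(round(chi_squared(observed), 2))
```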

Which of the following is not an effect size for chi-squared? - Cramér's V - Odds ratio - Phi - p-value

p-value

The block of text at the top of a code file that introduces the project is called - library. - summary. - prolog. - pane.

prolog.

Covariance

quantifies whether two variables vary together

Correlation coefficients

- r = 0: no relationship
- r = .2: weak relationship
- r = .5: moderate relationship
- r = .8: strong relationship
- r = 1: perfect relationship

F-statistic

ratio of explained information (in the numerator) to unexplained information (in the denominator)

Continuous predictors influence the ______ of the regression line, while categorical predictors influence the _____________. - slope, intercept - intercept, slope - R2, p-value - p-value, R2

slope, intercept

omnibus test

tests for an overall effect, but does not provide info on which means are unequal

Residuals

the difference between an observed value of the response variable and the value predicted by the regression line

A sampling distribution shows... - the distribution of means from multiple samples. - the distribution of sample sizes over time. - the distribution of scores in the population. - the distribution of observations from a single sample.

the distribution of means from multiple samples.

The z-score is... - the number of standard errors between the mean and some observation. - the difference between the sample mean and population mean. - the width of the 95% confidence interval. - the number of standard deviations an observation is from the mean.

the number of standard deviations an observation is from the mean.
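
As a formula, z = (x - mean) / standard deviation. A minimal sketch in Python; the numbers are invented for illustration and the function name `z_score` is not from the study set:

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


# Hypothetical example: an observation of 130 with mean 100 and sd 15.
z = z_score(130, 100, 15)
print(z)
```

A negative z-score means the observation is below the mean.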

A mosaic plot is used when graphing... - the relationship between two continuous variables. - the relationship between one continuous and one categorical variable. - the relationship between two categorical variables. - data that are not normally distributed by group.

the relationship between two categorical variables.

standard deviation versus standard error

the standard deviation is a measure of the variability in the sample, while the standard error is an estimate of how closely the sample represents the population

To learn which cells are contributing the most to the size of a chi-squared statistic, compute... - the standardized residuals. - the p-value. - the odds ratio. - Cramér's V.

the standardized residuals.

Wald test

tests the statistical significance of the slope in linear regression

predicted values

the values of y predicted by the model for a given value of x

Chi-squared can be used to understand the relationship between... - any two variables. - two categorical variables. - two continuous variables. - one categorical and one continuous variable.

two categorical variables.

Stochastic

relationships that include some randomness, so you are unable to predict or explain y exactly from x

In a normal distribution, 95% of observations are... - within one standard deviation of the mean. - included in computing the mean. - within two standard deviations of the mean. - divided by the sample size to get the standard deviation.

within two standard deviations of the mean.

The R2 is the squared correlation of which two values? - y and the predicted values of y - y and each continuous x - b and t - b and se

y and the predicted values of y
