Midterm 3
Level 2
"Macro level" equations examine trends at the level of the cluster/group. Clusters in any dataset are assumed to be a random sample of all possible clusters in the population.
Level 1 of MLM
"Micro-level" equations that examine trends at the level of the individual. This is the lowest level of MLM. Due to clustering there is one equation for each group in the dataset.
different people have different amounts of change between groups! It varies - different people have different levels of effect of the intervention.
(πα)_ij term in WS design
Dummy Codes in Mixed Interactions
-tests differences in slopes compared to a reference group - the simple slope analysis is straightforward (i.e., run multiple models with each group as the reference group)
GLM & MR frameworks -Only advantage of MR is hierarchical tests; must be computed by hand w/ GLM
Categorical-continuous interactions are easily handled in ______________ frameworks
using the Satterthwaite approximation method •This is the only df adjustment method available in SPSS MIXED
For multilevel, how are df calculated?
Assumptions of the ANCOVA
-Independence of Covariate -Homogeneity of regression slopes
the "usual" within-cells error term
In 1bs/1ws designs, The main effect of A is tested against...?
The value of P(Y) can range between
0-1. Where 0 is no chance that the outcome occurred, and 1 is 100% chance that the outcome is absolutely occurring. The goal of maximum likelihood is to adjust the parameters to get P(Y) as close to 1.0 as possible.
Two Repeated Measures Factors -Each factor of interest is fully crossed with other factor AND subjects
A X B X S describes...
Omega squared (w^2)
A good estimate of how much variance in the observed data is accounted for by a particular effect
a unique estimator can be specified in terms of combinations of covariances and variances, given other parameters in model
A parameter is identified in path analyses when...?
goodness of fit
A pseudo-R^2 estimate of a particular logistic regression model against a null (intercept-only) model. Pseudo-R^2 values are typically lower than OLS R^2 values, so goodness of fit is judged by comparing them across similar models, not by how close the value is to 1.
Yes -- •Use orthogonal contrasts on categorical variables •Compute product variables with centered covariates •Evaluate t-tests of product variables
Can test contrast on slopes in MR?
No, except for special circumstances. We can ask the question: if the causal hypothesis is correct, what is the magnitude of the estimated relationships?
Can we use path analyses to 'prove' or 'disprove' hypotheses?
•Try Unrestricted first •If it can't be estimated, then •AR1H or CSH plausible even if not optimal, because they allow for heterogeneous errors across conditions •However one pays a cost of heterogeneity in form of adjusted df
AR1H or CSH?
μ_j - β_y.x (x̄_j - μ_x), where x̄_j is the covariate mean in group j
Adjusted mean in ANCOVA =
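A minimal sketch of the adjusted-mean formula on this card. The function name and the numbers are hypothetical, chosen only to show the arithmetic: the group mean is moved by the pooled slope times the distance between the group's covariate mean and the grand covariate mean.

```python
def adjusted_mean(group_mean_y, group_mean_x, grand_mean_x, pooled_slope):
    """Covariate-adjusted group mean: Ybar_j - b * (Xbar_j - Xbar)."""
    return group_mean_y - pooled_slope * (group_mean_x - grand_mean_x)


# Hypothetical values: the group scored 50 on Y, but sits 2 points above
# the grand mean on the covariate, and the pooled slope is 1.5.
print(adjusted_mean(50.0, 12.0, 10.0, 1.5))  # 50 - 1.5 * 2 = 47.0
```

A group that is above average on the covariate has its mean adjusted downward when the slope is positive, and vice versa.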
AIC (Akaike information criterion)
Adjusts the goodness of fit value to account for the number of parameters that have been estimated. Essentially, it "penalizes" models that try to fit too many parameters.
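As a rough illustration of the penalty idea, here is a sketch using the standard textbook definitions AIC = -2LL + 2k and BIC = -2LL + k ln(n). The function names and input values are hypothetical; software may report these on slightly different scales.

```python
import math


def aic(neg2ll, k):
    """Akaike information criterion: deviance plus 2 per estimated parameter."""
    return neg2ll + 2 * k


def bic(neg2ll, k, n):
    """Schwarz's Bayesian criterion: deviance plus ln(n) per estimated parameter."""
    return neg2ll + k * math.log(n)


# Hypothetical model: -2LL = 100, 3 parameters, 100 cases.
print(aic(100.0, 3))       # 106.0
print(bic(100.0, 3, 100))  # larger than the AIC, because ln(100) > 2
```

Smaller values indicate better fit after the penalty, and the values are only meaningful when comparing models fitted to the same data.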
effect coding
All slopes that are tested are tested against mean in...?
continuous predictor variable
Are added into designs for reasons such as, continuous variables related to time (age, time to complete task, or time of day), continuous variables that capitalize on individual differences in participants (IQ, working memory capacity)
No - error terms are not homogeneous, with more error variance for linear, quadratic, and cubic terms...
Are error terms homogeneous?
LR test
As N grows asymptotically large, Wald test equals ...?
•Assumes that Js are ordered, 1 to J, and that there is a first-order association so that adjacent levels are more highly correlated than separate levels, with the correlation being equal to a standardized regression of a variable at J onto the variable at J-1. •Also assumes invariant (stationary) first-order auto-correlation, ρ
Autoregressive, Equal Variances:
-Unequal variances across conditions but autoregressive correlation structure
Autoregressive, heterogeneous
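To make the two autoregressive structures concrete, the sketch below builds the implied covariance matrix. It is an illustration only: `ar1_cov` is a hypothetical helper, not an SPSS routine. Passing equal standard deviations gives the equal-variance AR(1) structure, and passing unequal ones gives the heterogeneous version.

```python
def ar1_cov(sds, rho):
    """Covariance matrix with first-order autoregressive correlations.

    The correlation between levels i and j is rho ** |i - j|, so adjacent
    levels correlate rho, levels two apart correlate rho ** 2, and so on.
    """
    J = len(sds)
    return [[sds[i] * sds[j] * rho ** abs(i - j) for j in range(J)]
            for i in range(J)]


# Equal variances (AR1): every SD is 1, rho = 0.5.
for row in ar1_cov([1, 1, 1], 0.5):
    print(row)

# Heterogeneous variances (AR1H): same correlations, different SDs.
for row in ar1_cov([1.0, 2.0, 3.0], 0.5):
    print(row)
```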
Difference in slope from psych department from salary
B weight:
In 1bs/1ws designs
Better to use η_G^2 in what?
Linear Relationship
Binary data does NOT have a _____ relationship.
N gets large
Binomial only approximates the normal distribution as _________?
-J variance parameters, different for each of the J variables -One constrained equal correlation ρ (implies different covariances given that the J variances vary): •σ_jj' = ρ * SQRT(σ_jj * σ_j'j')
CS, Heterogeneous
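A small sketch of what heterogeneous compound symmetry implies: each covariance is the common correlation times the product of the two standard deviations. The helper name and values are hypothetical.

```python
def csh_cov(sds, rho):
    """Heterogeneous compound symmetry: J variances, one common correlation."""
    J = len(sds)
    return [[sds[i] * sds[j] * (1.0 if i == j else rho) for j in range(J)]
            for i in range(J)]


# Hypothetical SDs of 1 and 2 with a common correlation of 0.5.
print(csh_cov([1.0, 2.0], 0.5))  # [[1.0, 1.0], [1.0, 4.0]]
```

With equal standard deviations this reduces to ordinary compound symmetry: one variance and one covariance.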
Main effects: Continuous Variables
Center by subtracting the mean of all the scores (mean-centering).
Simple Variance components
Common covariance structure assumes all random effects are independent and that their variances are the same. This is the default for SPSS.
Unstructured
Common covariance structure that assumes that covariances are completely unpredictable and do not conform to a systematic pattern. This is a default option for random effects in SPSS
Simple AR
Common covariance structure for repeated measures: a first-order autoregressive structure. The correlation between measurements changes in a systematic way with how far apart they are, i.e., the rho values close to the diagonal show that successive measurements are the most highly correlated, and measurements that are not beside each other are less correlated.
-Only two parameters •One variance parameter, equal for all J variables; •One covariance parameter, equal for all off-diagonal elements
Compound symmetry (CS)
Nagelkerke
Corrects the Cox-Snell R^2 such that the statistic can reach a theoretical maximum of 1.0
Main effects: Categorical Variables
Create contrasts to capture group effects (dummy, effects, and orthogonal contrasts)
•1 within-subjects factor B and one between-subjects factor A •A X (B X S) People can only be in one level of A, so the persons are NESTED within levels of A.
Describe 1 between, 1 within factor in repeated measures design
The diagonal elements of Ξ are the variances of the contrast variables in K. The off-diagonal elements are the covariance between contrasts.
Describe the diagonals in a sphericity matrix for repeated measures
WS Design
Design is A x S with S = subjects
multiple alternative error structures (not sphericity) and to compute df adjustments appropriate for those error structures •Model will estimate covariance structure for residuals and generate proper error terms for F tests of fixed effects
Developments in multilevel regression for RM allow the analyst to specify...?
-Unlikely with repeated measures (recall power advantage for within-subjects over between-subjects designs given correlated errors - consistent individual differences)
Diagonal (uncorrelated)
η_G^2 typically includes more error terms than η_p^2
Difference between η_G^2 and η_p^2
log-odds
Because of the logit transform, b-values are interpreted in terms of increases/decreases in __________ per unit increase in the IV
Logistic Regression
For data with binary outcomes (0 or 1) we use _________.
repeated measures design
Each individual experiences 2+ different levels of a factor. (Ex. in longitudinal studies, taste tests, etc.)
the population contrast has a mean of 0
Each test of a contrast in repeated measures involves a test of the idea that ....?
Two Repeated Measures Factors (Like in the 10 point extra credit)
Effects are scaled as deviations from the grand mean, μ in...?
The idea that there are typically positive correlations across persons' scores on the WS variables
Efficiency in WS designs depends on...
-Variables 'inside' the path model, that serve as both predictors and outcomes
Endogenous variables in path analysis
SS subjects
Error SS for the intercept term in the "Between-subjects" output is actually...?
the grand mean of the covariate ("centroid" of the multiple covariates)
Estimated means are by default estimated and tested at the what..?
-Variables that never serve as dependent variables -We are not seeking to 'explain' their variance; they only serve as predictors
Exogenous variables in path analysis
a manipulated independent variable (for efficiency!!!)
Experimental psychologists often use within-subjects designs for __?
Multinomial logistic regression
For data with more than 2 categorical outcomes.
J-1 different dependent variables -- contrast variables -- by multiplying orthogonal contrast weights and the scores on Y for that person •In the case of J = 5, each person can have a set of computed linear, quadratic, cubic, and quartic contrast variables
For person-contrasts in repeated measures, each person's data yields what?
error SS for intercept term in "Between-subjects" paragraph is actually the SSSubjects
For repeated measures in SPSS, where is the "S" term?
J-1 transformed linear combinations of the J dependent variables, weighted by C.
For sphericity, We can think of the matrix contrast variables as...?
estimate a variance around those parameters
For these random effects, we don't separately estimate each group's intercept and slope, we simply do what?
Y_ij = μ + π_i + α_j + (πα)_ij + ε_ij
Full ANOVA Model equation
More straightforward than continuous-continuous case with respect to computing simple slopes, testing linear combinations of simple slopes
GLM is more powerful for ANCOVA because...
rows of transposed M matrix using integer weights
Helmert Contrasts:
Cohen's d: d = (M1 - M2) / SQRT(pooled MS_Error) -Where pooled MS_Error = (SS_Subjects + SS_AXS) / {(n-1) + (n-1)(J-1)}
How can we obtain SD-unit mean differences against between-person variance for repeated measures?
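The pooled-error version of Cohen's d on this card can be sketched as follows. The function name and the sums of squares are hypothetical; the point is that the subjects and A X S sums of squares are pooled over their combined degrees of freedom before taking the square root.

```python
import math


def rm_cohens_d(m1, m2, ss_subjects, ss_axs, n, J):
    """Mean difference in SD units, using pooled between-person + A x S error."""
    pooled_df = (n - 1) + (n - 1) * (J - 1)
    pooled_ms_error = (ss_subjects + ss_axs) / pooled_df
    return (m1 - m2) / math.sqrt(pooled_ms_error)


# Hypothetical design: n = 10 people, J = 3 levels.
# Pooled MS = (60 + 48) / (9 + 18) = 4, so the SD is 2.
print(rm_cohens_d(13.0, 10.0, 60.0, 48.0, n=10, J=3))  # 3 / 2 = 1.5
```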
•Law of linear combinations: •D = M1 - M2 •s_D^2 = s_1^2 + s_2^2 - 2 s_1 s_2 r_12 •Thus we can compute the t-test without actually computing the difference scores for each person •t = ((M1 - M2) - k) / SQRT{(s_1^2 + s_2^2 - 2 s_1 s_2 r_12) / N}
How could we compute SD and t-test from summary statistics?
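A minimal sketch of the summary-statistics route, assuming we know only the two means, the two standard deviations, their correlation, and N. The function name and example numbers are hypothetical. The resulting t has N - 1 degrees of freedom.

```python
import math


def paired_t_from_summary(m1, m2, s1, s2, r12, n, k=0.0):
    """Paired t-test from summary statistics via the law of linear combinations.

    Returns (sd_of_differences, t). k is the hypothesized mean difference.
    """
    var_d = s1 ** 2 + s2 ** 2 - 2 * s1 * s2 * r12  # variance of D = Y1 - Y2
    sd_d = math.sqrt(var_d)
    t = ((m1 - m2) - k) / math.sqrt(var_d / n)
    return sd_d, t


# Hypothetical data: both SDs are 2, r = .5, N = 16, mean difference of 1.
sd_d, t = paired_t_from_summary(11.0, 10.0, 2.0, 2.0, 0.5, 16)
print(sd_d, t)  # 2.0 2.0
```

Note how a larger positive correlation shrinks the variance of the difference scores, which is where the efficiency of within-subjects designs comes from.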
•Multilevel regression programs divide parameters into fixed effects and random effects •Fixed effects: regression parameters assumed to be equal across a set of replicated levels
How do multilevel models handle within-subjects analyses?
compute a likelihood-ratio chi-square (χ^2) test, sometimes denoted G: •G = D_Null - D_Model
How do we do a test of model fit with logistic regression?
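The G statistic can be sketched from its definition. This is an illustration under stated assumptions: the "model" probabilities below are made-up stand-ins for the fitted values of an estimated logistic regression (in practice they come from software), and the null model simply predicts the base rate for everyone.

```python
import math


def deviance(y, p):
    """-2 log-likelihood of binary outcomes y given predicted probabilities p."""
    return -2 * sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                    for yi, pi in zip(y, p))


def lr_chi_square(y, p_model):
    """G = D_null - D_model, where the null model predicts the base rate."""
    base_rate = sum(y) / len(y)
    d_null = deviance(y, [base_rate] * len(y))
    d_model = deviance(y, p_model)
    return d_null - d_model


# Hypothetical outcomes and hypothetical fitted probabilities.
y = [0, 0, 1, 1]
p_model = [0.2, 0.2, 0.8, 0.8]
print(round(lr_chi_square(y, p_model), 3))
```

G is referred to a chi-square distribution with df equal to the number of predictors added beyond the intercept.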
Examine hierarchical test by computing deviance for model with all k predictors versus model dropping a given predictor
How do we use HR in LR?
•Nesting is explicitly represented and declared in problem setup •Level 1 vs. Level 2 (larger nesting hierarchies can be coded, not treated in this class) -e.g., Level 1 = students, Level 2 = classrooms (and can generalize to Level 3 = schools, Level 4 = districts, Level 5 = cities.....)
How does nesting work in multilevel models?
Wald test is often used to test regression coefficients (although the likelihood-ratio test is preferred)
How does the Wald Test come into play with simple regression?
as predictors are added to evaluate improvement in fit It scales the total amount of 'information' available in the dependent variable
How is deviance used in logistic regression?
•Experimental psychologists often aggregate data over clusters (average RT, proportion correct within a condition, etc.) and use the aggregated statistics as dependent variables •Psychologists estimating learning may treat all students as individuals, ignoring nesting within classrooms
How do we treat clustered/grouped data?
slope for Psychology vs average of Sociology & History
How would you use GLM & LMATRIX to test the complex contrast illustrated in CCWA?
A is not crossed with Subjects, but B is •Within-Subjects Sources: •SS_B = Jn Σ b_k^2 •SS_AB = n ΣΣ (ab)_jk^2 •SS_(S w/A X B) = SS_total - SS_S/A - SS_B - SS_AB
If A is the between-subjects variable,
by simple main effects or partial interaction contrasts. Main effects evaluated by contrasts on marginal means (or pair-wise comparisons on marginal means)
If exploring for mean differences after significant ANOVA F-test In two-repeated measures factors, Interactions can be evaluated HOW? What about main effects?
Interaction: Categorical by Categorical
If factorial ANOVA, then create cross-factor interaction contrasts
No
If the Greenhouse-Geisser epsilon is .504, is the sphericity assumption satisfied?
independent (additive)
If interaction is not significant, then any main effects are ___?
•The main effect of A is tested against "usual" within-cells error term •The main effect of B and the A X B interaction are tested against the B X S error term, like in one-factor repeated measures
In 1bs/1ws designs, how are the main effects tested?
similar covariate-pretest correlations between the groups •Hence any group X covariate interaction should (indirectly) reflect differential relevance of predictors to training gains •Example: training of spatial imagery for memory encoding might benefit those higher in imagery ability
In ANCOVA, Given random assignment, one is justified in arguing for __?
use pooled error term (MSAXS) as the denominator for the comparison F-test •If not, one needs to use a specific error term
If sphericity assumption is tenable, one can what?
inflated, sometimes severely so (so it needs to be attended to!)
If the sphericity assumption is violated, then Type I error rate is ..?
where the LS means are evaluated
If there is no covariate X group interaction, it doesn't matter ___?
sum of the diagonal elements (trace) in the Var-Cov matrix
If we create an estimate of the Var-Cov matrix for variables, the overall mean square is the...?
the B X S error term, like in one-factor repeated measures
In 1bs/1ws designs the main effect of B and the A X B interaction are tested against...?
significant; smaller
In ANCOVA, if regression of Y on covariate is ________________, the reduction in MS_residual will lead to a ____________ denominator (1 - R^2 [residual variance] shrinks as R^2 grows)
Treatment Group
In ANCOVA, _____________________- means are adjusted for the covariate
covariate
In GLM, estimated means adjust for__?
•only captures average individual differences, and we generically expect that σ_Y^2 > 0 •Likewise, we aren't interested in testing for the A X S interaction (& couldn't!)
In WS designs, the Main effect of Subject...
ANCOVA
In ________________, factorial experiment is evaluated with one or more covariate(s) that influences the dependent variable
Fixed factor (condition) and random factor (subjects)
In a simple repeated measures design we have one __________ and one __________.
Wald Statistic
In logistic regression we use the ______. It is similar to running a t-test on regression coefficients in linear regression, but instead of t we use the z statistic.
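A minimal sketch of the Wald z-test for one coefficient, assuming the estimate and its standard error have already been obtained from software. The function name and numbers are hypothetical; the two-sided p-value comes from the standard normal distribution.

```python
import math


def wald_z(b, se):
    """Wald test of a single coefficient: z = b / SE, two-sided normal p-value."""
    z = b / se
    p = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| >= |z|) for standard normal Z
    return z, p


# Hypothetical coefficient of 1.96 with a standard error of 1.0.
z, p = wald_z(1.96, 1.0)
print(z, round(p, 3))  # 1.96 0.05
```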
diagonal elements (equal error variances for each contrast). •Sphericity: σ_1^2 = σ_2^2 = ... = σ_(J-1)^2
In repeated measures, The sphericity assumption is that the matrix CΦC' has equal ..?
there are three different interaction error terms involving Subjects, one for each test (e.g., A X S source used for A main effect, etc.)
In the A X B X S design, how many interaction terms would there be?
compute separate variables for each main effect source: •1) aggregating over B produces an A X S design on the 'marginal means' of A •2) aggregating over A produces a B X S design on the marginal means of B
In two-repeated measures factors, because each person receives all within-subjects cells, it is possible to...?
type of expected MS
In two-repeated measures factors, because participants are randomly sampled, each within-subjects factor has the same____________________ as with the one-way repeated measures ANOVA
Yes, the sphericity assumption needed for standard F-test applies separately to each error term -It's still only a concern if J > 2, or K > 2, (or [J-1]*[K-1] > 2) •Can independently satisfy assumption for 1 source, not other 2 •Violation of assumption matters for Type I error rate control, so an adjustment might be needed
In two-repeated measures factors, is sphericity a concern?
evaluation of ANOVA F-tests
In two-repeated measures factors, planned comparisons can proceed in lieu of...?
error term should include all interactions of manipulated factors with covariates and relevant "subject" error terms
In two-repeated measures factors, the error term should include what?
each source is tested against its interaction w/ subjects (as in the one-way repeated measures design)
In two-repeated measures factors, what's the implication implied because each source interacts with the subjects?
contrast codes
Interpretation of interaction regression coefficients will depend on the _________ that were used for the categorical predictors
No
Is the interaction test typically included in ANCOVA?
Hosmer and Lemeshow
Is the simplest to compute and interpret. It is the proportional reduction in the absolute value of the log-likelihood. It measures the exact changes in -2LL when adding predictors.
Level 1
Level _______ intercepts from various individual groups vary randomly in the value around the fixed population intercept
Y_ij = μ + α_j + β_y.x (x_ij - μ_x) + e_ij
Linear Model for ANCOVA with 1 Between-Subjects Factor
•Typical homogeneity of variance assumptions, but also •Assume no true covariate X treatment interaction (homogeneity of regression) -The idea is that all the covariate is doing is reducing error variance •'independence' of covariate & independent variable (treatment)
List the Assumptions for ANCOVA:
Usual regression assumptions: -1) independence of errors across persons -2) homoscedasticity of residual variance -3) normal distribution of errors -4) linearity of functional form. Additional path-analysis assumptions: -1) self-containment of regression equations •All 'relevant' causes included in model -Justifies assumption that residual terms contain other causes that are orthogonal [in effect on d.v.s] to variables in the model -2) orthogonal residuals across equations
List the assumptions for Path analysis:
deviance
Log-likelihood is often converted to ________ also known as -2 log likelihood.
•Define units of measurement •Declare fixed effects of interest •Declare the random effects (for example, were persons randomly sampled? Schools? Etc.) -Are effects likely to vary with random source? •At minimum we need to define the proper sources of 'error variance' for evaluating fixed effects
MLM procedural steps
•In regular ANOVA, all individual differences are part of 'experimental error' and are assigned to the MSwithin •In ANCOVA, we create a new MSresidual that has variance associated with the covariate removed from MSwithin
MS within differences between ANCOVA and anova?
Analyst can specify a flow of relationships, specifying which variables are proximal causes versus distal causes
Specification of path analyses
Cox and Snell
More complicated to compute, but is considered a "close" analog to the linear regression R^2. This is probably the most widely used.
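The three pseudo-R^2 statistics in this deck (Hosmer and Lemeshow, Cox and Snell, Nagelkerke) can all be computed from the null and model deviances. The sketch below uses the standard textbook formulas; the function name and input values are hypothetical.

```python
import math


def pseudo_r2(d_null, d_model, n):
    """Return (Hosmer-Lemeshow, Cox-Snell, Nagelkerke) pseudo-R^2 values.

    d_null and d_model are deviances (-2LL) of the intercept-only and
    fitted models; n is the sample size.
    """
    r2_hl = (d_null - d_model) / d_null               # proportional reduction in -2LL
    r2_cs = 1 - math.exp(-(d_null - d_model) / n)     # Cox and Snell
    r2_cs_max = 1 - math.exp(-d_null / n)             # ceiling of Cox-Snell (< 1)
    r2_n = r2_cs / r2_cs_max                          # Nagelkerke rescales to max 1
    return r2_hl, r2_cs, r2_n


# Hypothetical deviances: null = 100, model = 80, with n = 100 cases.
print([round(v, 3) for v in pseudo_r2(100.0, 80.0, 100)])
```

Nagelkerke's value is always at least as large as Cox and Snell's, because it divides by a ceiling that is below 1.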
OLS regression •This requires writing likelihoods for parameters based on data and iteratively solving for the parameters using statistical optimization methods
Multilevel regression use more advanced and tricky forms of parameter estimation (typically, variations on Maximum Likelihood estimation) rather than...?
-1 and +1 but doesn't usually dip below 0 (only when F-value is less than 1).
Omega^2 ranges between...?
error variance in simple repeated measures
Repeated responses across participants are going to be highly correlated, this adds a new source of.....
Simple Repeated Measures
One within-subjects IV
Simple Mixed Models
One within-subjects IV and one Between Subjects IV
SE for simple slopes
Run separate dummy coded MRs to get __?
Path analysis
Specification, Estimation, and Testing are three stages in what?
the error term (Efficiency in WS design!)
Positive correlations across WS design reduces...
add covariate to factorial design is to get a more powerful test of mean differences
Purpose of ANCOVA:
restricted maximum likelihood generically appropriate when random effect variance components not the focus of interest but merely sources of error variance, as in repeated measures problems
REML
variation of regression parameters across units of analysis within a level. Can be divided into random intercepts and random slopes
Random effects:
groups differ in level of dependent variable; common slope
Random intercepts:
groups differing in relationship of X to Y
Random slopes:
path analysis
Regression model depicts estimated regression weights for a model specified to capture the 'causal flow' in a set of variables in what?
more than 2 levels of within-subjects factor
Repeated Measures- General ANOVA model generalize to...?
In repeated measures, by old convention we use S (for Subjects) instead of B (A X S), and denote subjects as π_i •Y_ij = μ + π_i + α_j + (πα)_ij + ε_ij
Repeated measures formula for more than TWO levels:
1bs/1ws designs
Sphericity assumption applies (only) to the B X S error term in...?
•Adjusted degrees of freedom to account for heterogeneity in cluster size w/o assuming equal variance in all clusters •Analog to unequal-variance t-test
Satterthwaite approximation
No, never!
Should we ever standardize categorical variables?
BIC
Similar to the AIC, but has harsher penalties for the number of parameters being estimated. This should be used when N is high and the number of free parameters is low.
Comparisons. •If G-G ε < .90, use separate error term (and comparison as computed in SPSS directly) •IF G-G ε >= .90, use pooled error term, computed by hand •This applies SEPARATELY to each of the 3 error terms, A X S, B X S, & A X B X S
Sphericity assumptions matter for WHAT in A X B X S designs?
individuals have been randomly selected from a random population.
Subjects is random factor because?
methods for computing likelihood-ratio chi-square tests or approximations for different random covariance structure models
Test for random effects in MLM involve what?
Effect Codes in mixed interaction
Tests differences in slopes compared to the grand mean
Paired-Observation t-test
The _______________ computes a pretest-posttest difference score for every person
intraclass correlation
The ___________________ is the proportion of overall variance captured by clustering into groups
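As a quick worked example of this definition, the sketch below computes the proportion from two variance components. The function name and values are hypothetical; in practice the components would come from a random-intercept model.

```python
def intraclass_correlation(var_between, var_within):
    """Proportion of total variance that lies between clusters."""
    return var_between / (var_between + var_within)


# Hypothetical components: between-group variance 2, within-group variance 6.
print(intraclass_correlation(2.0, 6.0))  # 2 / 8 = 0.25
```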
Fixed. They are assumed by the model to be the same for all units of observation in the model
The average regression coefficients at Level 1 or from Level 2 to Level 1 are what?
K = LβM
The contrast equation in GLM is:
Deviance, or -2 Log-Likelihood
The log-likelihood estimation is often converted to ___________ or ____________.
SQRT(Σ c_j^2)
To get orthonormalized contrast weights in repeated measures, one divides by what??
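A short sketch of the normalization step, using the usual integer weights for a linear trend across J = 5 levels as the example. The helper name is hypothetical.

```python
import math


def orthonormalize(weights):
    """Rescale contrast weights so that their squared values sum to 1."""
    norm = math.sqrt(sum(c * c for c in weights))
    return [c / norm for c in weights]


# Linear trend weights for J = 5; the sum of squares is 10.
linear = orthonormalize([-2, -1, 0, 1, 2])
print([round(c, 3) for c in linear])
print(round(sum(c * c for c in linear), 6))  # 1.0
```

Normalizing puts every contrast variable on a common scale, which is what allows their variances to be compared when evaluating sphericity.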
•1) LR test •2) Wald test
To get tests of each regression coefficient, two options:
-We predict the probability of the outcome occurring •b0 and b1 -Can be thought of in much the same way as multiple regression -Note the standard regression equation is embedded in the logistic regression equation's exponent
The outcome of logistic regression...?
Reciprocal Odds
The odds of an event NOT occurring. One gets this by dividing 1 by the original odds (not the log odds).
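A small worked example of odds and their reciprocal, with hypothetical helper names. The reciprocal of the odds for an event equals the odds of the event not happening.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1 - p)


def reciprocal_odds(p):
    """Odds against the event: 1 divided by the odds for it."""
    return 1 / odds(p)


# Hypothetical probability of .75: odds of 3 to 1 for, 1 to 3 against.
print(odds(0.75))             # 3.0
print(reciprocal_odds(0.75))  # same value as odds(0.25)
```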
the 2-way analysis
The same effect size statistics, η^2 and ω^2, can be calculated in ...?
BIC (Schwarz's Bayesian Criterion)
This is very similar to the AIC, but imposes harsher penalties for the number of parameters being estimated. Field suggests that this statistic should be used when N is high and the number of free parameters is low.
Wald test
This test tells you how much a SEM model could be improved if parameters were deleted from the model by removing paths or constraining variables.
Chi ^2
To assess logistic regression model fit we use the X^2 distribution.
it estimates the group differences in intercepts as a variance component
To estimate ICC, The random intercept for groups estimates what...?
Bernoulli Series -Binary outcome (0, 1) -probability of 1 = p, probability of 0 = q -r = # of successes
Two-outcome variable in series of N independent trials is a...?
-Full covariance matrix of repeated measures error terms, freely estimated-J variances, J*(J-1)/2 covariances -With J = 5 levels (5 variables), this involves estimating 5 variances and 10 covariances (15 parameters) while also estimating the fixed effects regression parameters
Unrestricted error structures (multilevel):
the null hypothesis that the odds behave exactly as modeled except for sampling error in the dependent variable
What null hypothesis is tested with logistic regression?
the grouping variables
We can test whether a continuous predictor variable (covariate) interacts with __?
Covariate and IVs
We must make sure that the ________ and _________ are independent of each other
If the interaction accounts for a significant amount of variance in the observed data How the sum of squares is partitioned to each of the different effects (main effects + interaction)
We use hierarchical regressions to determine?
Get simple slope of reference group (with coded 0) •Interaction effects are differences in slopes against the slope of the reference group •Can run a different dummy coding model to get SE of simple slope for all groups
What advantages does dummy coding have with categorical-continuous interactions?
-To reduce within-group variance -statistical control of confounds
What are the 2 reasons to include covariates?
the population intercept and population regression slope
What are the fixed parameters of the population regression equation
categorical variables
What are the only variables of interest in an ANCOVA?
•H0: all αj = 0 •H0: all βk = 0 •H0: all (αβ)jk = 0
What are the three major hypotheses we care about with a 1bs/1ws design?
•Wald test- form a z-test of est. parameter/SE -Interpret as a standard normal deviate •Use model comparison (for 1 or more parameters)
What are the two methods to test hypothesis that a random variance component = 0 in MLM?
MS within
What cannot be the error term in a one-factor repeated measures design
R^2 and ω^2 are proportion of variance effect size metrics
What do R^2 and ω^2 have in common?
•If Sphericity applies to a specific test, use of the pooled error term (MSAXS) provides a more powerful test of the contrast •F = SSK/ MSAXS
What do we do if sphericity is satisfied in repeated measures tests?
intercept estimate at mean slope
What does effect coding result in?
how much variance we account for with BOTH fixed and random effects
What does the pseudo R^2 tell us?
You run the risk of making a Type I error (false alarm); if the structure is too complex, you run the risk of a Type II error (a miss)
What happens if your covariance structure is too simple?
The repeated measures analog of ω^2: Est. ω^2 = (J-1)(MS_A - MS_AXS) / (SS_Total + MS_Subj)
What in repeated measures provides an unbiased estimate of the population variance accounted for by the repeated measures factor?
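The estimator on this card can be sketched directly from the ANOVA table entries. The function name and the numbers are hypothetical.

```python
def rm_omega_squared(ms_a, ms_axs, ss_total, ms_subj, J):
    """Repeated-measures omega squared from ANOVA summary-table values."""
    return (J - 1) * (ms_a - ms_axs) / (ss_total + ms_subj)


# Hypothetical table: J = 3, MS_A = 20, MS_AXS = 5, SS_Total = 140, MS_Subj = 10.
print(rm_omega_squared(20.0, 5.0, 140.0, 10.0, J=3))  # 2 * 15 / 150 = 0.2
```

When MS_A is smaller than MS_AXS (F below 1) the estimate is negative, which matches the card stating that omega squared can dip below 0.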
classic ANCOVA
What is a special case of moderated regression analysis in GLM?
The bonferroni correction
What is needed for nonorthogonal contrasts in two repeated measures factors?
Partial derivatives of likelihood fcn w/ respect to unknown parameters required
What is required for estimation models for multilevel regression?
First perform "main effects" model that included covariate(s) and categorical coded variables • IF factorial ANOVA, include standard A X B interaction •Then, add product variables w/ covariate(s) to test for moderation •If nonsignificant increase in R2, trim product variables & evaluate "main effects"
What is the difference between MR and categorical-continuous interactions?
R^2. •R^2 can still be computed as •R^2 = SS_A / SS_Total •For specific comparisons: -R^2_K = SS_K / SS_Total
What is the effect size calculation for repeated measures designs?
Sphericity - complex analog to homogeneous error variances assumption in between-subjects ANOVA
What is the new within-subjects (repeated measures) design assumption?
to merely serve to provide statistical control and/or to increase power on the estimate.
What is the purpose of continuous variables in ANCOVA?
within subjects experiments
What method of analyses has a new error term?
η_G^2 and η_p^2; these proportion of variance stats are greatly affected by the relative number of levels (J & K)
What tends to obscure differences in effect size stats in two-repeated measures factors?
We take the log of it to make the logit transform. We do this to get a simple regression equation: Logit = B0 + B1X
What do we do with the odds ratio in logistic regression?
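To show how the linear equation on the logit scale maps back to a probability, here is a minimal sketch. The coefficient values are hypothetical and the function names are illustrative only.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))


def predicted_probability(b0, b1, x):
    """Logistic function applied to the linear predictor b0 + b1 * x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))


# Hypothetical coefficients: b0 = -1, b1 = 0.5.
for x in (0, 2, 4):
    p = predicted_probability(-1.0, 0.5, x)
    print(x, round(p, 3), round(logit(p), 3))  # logit rises by b1 per unit of x
```

The logit column increases by a constant 0.5 per unit of X, while the probability column changes by unequal amounts: the model is linear in the log-odds, not in the probability.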
no interactions possible that involve subjects with A (because subjects are nested w/i A)
What's a consequence of "nesting" in repeated measures?
In WS, we care about the variance of transformed variable
What's different about sphericity with repeated measures vs other homogeneity of variance assumptions?
As with A X S design, for A X B X S design check G-G estimate of epsilon (ε), but this time independently for each error source -( A X S; B X S; A X B X S) •If estimated G-G ε < .90, use G-G df-adjusted F-test rather than standard test
What's different with checking G-G estimate of epsilon (ε) in a two-repeated measures factors model (A X B X S design)?
1. First model is null; just intercept 2. Then we add one or more IVs
What's the order of models for logistic regression?
ALWAYS! Unless otherwise specified!
When do we assume that you need to mean center the continuous IV?
When there is no meaningful "0" value.
When do we choose NOT to center variables?
Violations of independence assumption •Case 1: fluke (bad) randomization - covariate happens to differ between treatments -Helps to adjust for bad random assignment •Case 2: biased assignment - covariate used as basis for decision to treat -ANCOVA OK but homogeneity of covariance assumption must be met •Case 3: reactive effects of treatment on covariate - if covariate administered after treatment -ANCOVA logic flawed because it won't just passively 'reduce error' but will partial out some treatment variance •Case 4: nonrandom assignment / intact groups -ANCOVA helpful but careful assessment of relationships of covariate & treatment needed
When is ANCOVA more than just reducing MSResidual?
Use the Greenhouse-Geisser correction
When n < 20 in repeated measures, what do we do?
(J-1) (n-1) •When sphericity cannot be assumed and specific error-term tests needed, df = n-1 for each comparison
When sphericity is assumed in repeated measures, df = what?
Clustering
When you have data that exhibit dependencies or correlations among subsets of cases.
multilevel model
When you have heterogeneity of slopes
Log-likelihood sums the probabilities associated with the predicted and actual outcomes. It is an estimate of how much unexplained information is left over after fitting the model.
Why do we want smaller rather than larger log-likelihood values?
Everybody is given every level= the design has participants CROSSED with A, that has J levels
Why in WS designs does every person experience all "experimental treatments"?
The OLS standard errors of the regression coefficients become negatively biased (too small). Thus, the CIs are too narrow and the test statistics too large, which leads to an overestimation of significance (alpha inflation).
Why is clustered data bad?
due to dominance of systematic individual differences as component of total variance
Why is η^2 often unacceptably small in two-repeated measures factors?
-True nominal scales -Experimental designs (like a 2 X 2)
With categorical variables, we can have:
Compound Symmetry
_______________________ is a relatively implausible error structure except in randomized experiments where manipulations might affect only means but not individual differences
specify contrast weights in the M matrix (using SPSS syntax)
With interactions in two repeated measures factors , to get specifically tailored post hoc partial interactions, one may need to do what?
logistic regression
Y: 0 = not promoted, 1 = promoted could be in...?
path analysis
a form of multivariate analysis in which the causal relationships among variables are presented in a graphic format
it actually correlates w/ the dependent variable & reduces error variance. Otherwise you are wasting the df you invest in estimating the effect of the covariate
You only want to use a covariate in ANCOVA if
Log-likelihood
_______ is summing the probabilities associated with the predicted and actual outcomes. It is an estimate of how much unexplained information is leftover after fitting the model.
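A minimal sketch of the log-likelihood for a binary outcome, assuming the usual Bernoulli form LL = sum of [y ln p + (1 - y) ln(1 - p)]. The function name and the example probabilities are illustrative assumptions, not values from the course materials.

```python
import math

def log_likelihood(y, p):
    """Sum the log-probabilities the model assigned to the observed 0/1 outcomes."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

y = [1, 0, 1, 1]                  # observed outcomes (hypothetical)
good = [0.9, 0.1, 0.8, 0.7]       # predictions close to the outcomes
poor = [0.5, 0.5, 0.5, 0.5]       # uninformative predictions

ll_good = log_likelihood(y, good)
ll_poor = log_likelihood(y, poor)

# -2LL (the deviance) is what is usually reported; a smaller value means
# less unexplained information is left over after fitting the model.
dev_good = -2 * ll_good
dev_poor = -2 * ll_poor
```

The log-likelihood itself is always zero or negative, so "smaller" here refers to its magnitude (equivalently, to a smaller -2LL).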
Replicates (items, trials) are administered to people, sometimes in within-subjects factors Example: semantic RT task with different types of items (concrete vs. abstract words)
__________ (items, trials) are administered to people, sometimes in within-subjects factors
Logistic function
___________ are used to reformulate a regression equation to handle probabilities.
logistic regression
_______________ transforms the dependent variable using a logit transform, and then estimates a linear regression equation using the logistic regression equation
logistic regression
________________ uses regression methods to predict binary outcome variables
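The logit transform and its inverse can be sketched as follows. The coefficients b0 and b1 are hypothetical values chosen for illustration; a real model would estimate them by maximum likelihood.

```python
import math

def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Map a value on the linear (log-odds) scale back to a probability."""
    return 1 / (1 + math.exp(-z))

# Hypothetical intercept and slope on the logit scale.
b0, b1 = -2.0, 0.5

def predicted_probability(x):
    """P(Y = 1) for predictor value x under the assumed coefficients."""
    return inv_logit(b0 + b1 * x)

p_at_4 = predicted_probability(4)    # linear predictor is 0 here
round_trip = inv_logit(logit(0.3))   # inverse undoes the transform
```

The regression is linear in the logit, which is why the predicted probability always stays between 0 and 1.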
•Planned comparisons can proceed in lieu of evaluation of ANOVA F-tests •Logic of comparisons used in 2-way ANOVA for generating interaction terms still applies (but can be computed as separate variables, as w/ one-way RM)
__________________ can proceed in lieu of evaluation of ANOVA F-tests in 1bs/1ws designs
Analysis of Covariance
_____________________ is classical application of categorical X continuous design w/ a "main effects" model
Binomial Distribution
a frequency distribution of the possible number of successful outcomes in a given number of trials in each of which there is the same probability of success is a...?
Logistic Regression
a nonlinear regression model that relates a set of explanatory variables to a dichotomous dependent variable
Mauchly's test
a test of the assumption of sphericity. If this test is significant then the assumption of sphericity has not been met and an appropriate correction must be applied to the degrees of freedom of the F-ratio in repeated-measures ANOVA. The test works by comparing the variance-covariance matrix of the data to an identity matrix; if the variance-covariance matrix is a scalar multiple of an identity matrix then sphericity is met.
Greenhouse-Geisser estimate
an estimate of the departure from sphericity. The maximum value is 1 (the data completely meet the assumption of sphericity) and the minimum is the lower bound. The correction varies between 1/(k-1), where k is the number of repeated measurements, and 1.
Huynh-Feldt estimate
an estimate of the departure from sphericity. The maximum value is 1 (the data completely meet the assumption of sphericity). Values below this indicate departures from sphericity and are used to correct the degrees of freedom associated with the corresponding F-ratios by multiplying them by the value of the estimate. It is less conservative than the Greenhouse-Geisser estimate, but some say it is too liberal. It is typically used when the sphericity estimate is greater than .75.
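A rough sketch of how the Greenhouse-Geisser estimate can be computed from the covariance matrix of the repeated measures, assuming the standard formula ε = (trace of the double-centered matrix)² / ((k-1) × sum of its squared elements). The function names and the two example matrices are illustrative assumptions.

```python
def covariance_matrix(data):
    """Sample covariance matrix of an n-subjects x k-conditions list of lists."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    return [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in data) / (n - 1)
             for b in range(k)] for a in range(k)]

def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric k x k covariance matrix."""
    k = len(cov)
    grand = sum(sum(r) for r in cov) / (k * k)
    row_means = [sum(r) / k for r in cov]
    # Double-center the matrix (remove row means and column means).
    dc = [[cov[a][b] - row_means[a] - row_means[b] + grand for b in range(k)]
          for a in range(k)]
    trace = sum(dc[a][a] for a in range(k))
    sum_sq = sum(dc[a][b] ** 2 for a in range(k) for b in range(k))
    return trace ** 2 / ((k - 1) * sum_sq)

# Compound symmetry (equal variances, equal covariances) satisfies sphericity.
eps_cs = greenhouse_geisser_epsilon([[2, 1, 1], [1, 2, 1], [1, 1, 2]])
# An extreme violation reaches the lower bound 1/(k-1).
eps_low = greenhouse_geisser_epsilon([[1, 0, -1], [0, 0, 0], [-1, 0, 1]])

def adjusted_df(eps, k, n):
    """Multiply both F-test df by epsilon, as the correction describes."""
    return eps * (k - 1), eps * (k - 1) * (n - 1)
```

With k = 3 conditions the estimate runs from 0.5 (the lower bound) to 1, matching the range given in the Greenhouse-Geisser card.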
multiple logistic regression
an extension of logistic regression in which two or more continuous or categorical independent variables are included in the model
ANCOVA
analysis of variance with an extra continuous variable (the covariate)
Large log likelihood values
are bad! This is because they mean that there is a large amount of unexplained variance. A smaller LL means that there is a better fit.
Population intercept and population regression slope in level 2 equations
are fixed parameters of the population regression equation
Categorical variables
are mixed in for any number of reasons, including: naturally occurring groups (ethnicity/gender); naturally occurring variables that are artificially divided into groups (age in terms of young/old); and control and manipulation groups in an experimental design.
maximum likelihood methods
are used in MLM, which means that MLMs do not use "true" OLS fit diagnostics like R^2.
OLS Regression
assumes that observations are independent of each other. (1 participant = 1 observation).
- this shows the reduction in error due to removal of component of Y by regression of Y on X (think compact vs augmented model) - if βy.x <> 0 then eij < e*ij
e*ij = βy.x (xij - μx) + eij
•The biggest single advantage of this approach is that one does not need COMPLETE DATA on all units of observation •One can model incomplete data based on MISSING AT RANDOM assumptions
biggest advantage of estimation models with multilevel regression?
Orthogonal contrasts in mixed interactions:
can test more complex hypothesis related to mean differences between groups (ie. mean of one group vs. pooled mean of two groups)
Diagonal
common covariance structure that assumes that random effects are independent, but their variances are heterogeneous. This is typically the default covariance structure for repeated measures.
Interaction: Categorical by Continuous
compute product variables: contrasts x covariate (continuous IV). If it is a factorial ANOVA, then you will need to create more sets of product variables by crossing each factor and interaction contrast with the covariate.
the full mixed model equation in MLM
contains level 1 (person level) predictor xij and level 2 (group level) predictor Wj of intercept and slope and the cross-level interaction.
Odds Ratio
compares the odds of an event occurring across groups or across a one-unit change in a predictor. The odds themselves are calculated by dividing the probability of an event occurring by the probability of the event not occurring.
Slope Homogeneity
estimated slopes for the separate groups are roughly similar
ICC = 0
groups do not significantly differ.
MSWithin
in ANCOVA, Covariate reduces ___________, which can make for more powerful tests of A, B, & A X B interaction
evaluate the nonparallel lines (regression of Y on X for A1; regression of Y on X for A2, etc.) and try to interpret the interaction
in ANCOVA, If there is an interaction then one would need to
to test for it: First evaluate the covariate X factor interaction. If assumption is met, interpret categorical effects
in ANCOVA, it's advisable not just to assume homogeneity of covariance but to..?
"residual"
in MLM, with no Level 2 predictors, all [random] variance in regression coefficients is _____?
main effect of Subject
in WS analyses, we are not interested in testing for the...?
the variance due to means; this should be numerator for the F-test
in WS designs, MSA = SSA/J-1, which is...?
a component due to random sampling of the values of the random factor. The expected value for A has error variance + the S x A interaction (σ2SxA, persons' differing improvement between levels) + the effect of A
in WS designs, When a random factor is crossed with a fixed factor, the expected value or E(Fixed factor) contains what?
bigger mean effect (and more N) to detect it
in WS designs, if there is lots of variability in persons' change over time, then we need a
nonparallel lines (differences in change from pretest to posttest, as measured by sD2)
in WS designs, the MSAxS interaction reflects what?
•Subjects is a random factor - we randomly sample persons into our design (or treat it as such) •A is a fixed factor: we have only pretest and post test and no sampling is involved on levels of A
in WS designs, why is MSAxS the right error source to match with MSA?
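A small sketch of the one-way within-subjects F-test, computing MSA = SSA/(J-1) and testing it against MSAxS as described in the cards above. The function name and the pretest/posttest scores are made-up illustrations.

```python
def one_way_ws_anova(data):
    """data: n subjects x J levels. Returns (MS_A, MS_AxS, F)."""
    n, J = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * J)
    level_means = [sum(row[j] for row in data) / n for j in range(J)]
    subject_means = [sum(row) / J for row in data]

    # Variance due to the means of A (numerator of the F-test).
    ss_a = n * sum((m - grand) ** 2 for m in level_means)
    # A x S interaction: what is left after removing level and subject means,
    # i.e. how much persons differ in their change across levels.
    ss_axs = sum((data[i][j] - level_means[j] - subject_means[i] + grand) ** 2
                 for i in range(n) for j in range(J))

    ms_a = ss_a / (J - 1)
    ms_axs = ss_axs / ((J - 1) * (n - 1))
    return ms_a, ms_axs, ms_a / ms_axs

# Hypothetical pretest/posttest scores for three persons.
scores = [[1, 3], [2, 5], [3, 4]]
ms_a, ms_axs, f_ratio = one_way_ws_anova(scores)
```

With only two levels, this F equals the square of the paired-samples t computed from the difference scores, which is one way to check the arithmetic.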
use adjusted test
in repeated measures, if est. ε > .90 one can use standard repeated measures F-test; if not then ...?
Serial dependency
is a specific case of clustering that is due to repeated observations within individuals. (i.e. repeated measures).
x0 in logistic function
is the midpoint of the sigmoid
Logistic regression
is used to predict the probability of an outcome Y, given a predictor X, or P(Y).
A path analysis
it's a MR system where there are multiple DVs and variables are both potentially predictors and outcomes-- what is it?
Smaller standard errors
lead to increased power
moderation analysis
looks at interactions between continuous variables
L in logistic function
max possible value
e in logistic function
natural logarithm base
below 1 = negative relationship; above 1 = positive relationship. Example (above 1): perceived benefits increase the likelihood of women getting a mammogram
odds ratio below / above 1?
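The odds and odds ratio can be illustrated with a short sketch. The probabilities are hypothetical, and the exp(b) conversion assumes b is a logistic regression coefficient on the logit scale.

```python
import math

def odds(p):
    """Odds = P(event occurring) / P(event not occurring)."""
    return p / (1 - p)

def odds_ratio(p_group, p_reference):
    """Ratio of the odds in one group to the odds in a reference group."""
    return odds(p_group) / odds(p_reference)

def odds_ratio_from_coefficient(b):
    """Odds ratio for a one-unit change in a predictor with logit coefficient b."""
    return math.exp(b)

or_above = odds_ratio(0.8, 0.5)               # event more likely than in reference
or_below = odds_ratio(0.2, 0.5)               # event less likely than in reference
or_null = odds_ratio_from_coefficient(0.0)    # no relationship
```

A coefficient of zero gives an odds ratio of exactly 1, which is why 1 is the dividing line between a negative and a positive relationship.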
MR or GLM analog
one needs to test explicitly for interaction of covariate with factor using___________________ in ANCOVA
tau^2
the amount of variance in a variable attributable to differences among groups in the population.
estimated variance components and one can estimate random effects for different sources
random effects factors in multilevel analysis are treated as __?
the intercepts and slopes of the level 1 equation become
random variables because we assume that all groups are randomly sampled from all possible groups in the population
more power to reject null hyp
smaller error term =
k in logistic function
steepness of the function's curve
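Putting the L, k, x0, and e cards together, the general logistic function is f(x) = L / (1 + e^(-k(x - x0))). A minimal sketch, with default parameter values chosen only for illustration:

```python
import math

def logistic(x, L=1.0, k=1.0, x0=0.0):
    """General logistic function.

    L  : maximum possible value of the curve
    k  : steepness of the curve
    x0 : midpoint of the sigmoid (where f(x) = L / 2)
    """
    return L / (1 + math.exp(-k * (x - x0)))

at_midpoint = logistic(3.0, L=1.0, k=2.0, x0=3.0)   # half of L at the midpoint
shallow = logistic(1.0, k=0.5)
steep = logistic(1.0, k=5.0)                        # larger k rises faster past x0
near_max = logistic(100.0, L=10.0)                  # approaches L for large x
```

For probabilities, L is fixed at 1 so the curve stays between 0 and 1.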
Sphericity in repeated measures ANOVA
the assumption that the variances of the differences between pairs of treatment levels are equal.
adjusted df does (like the independent samples t-test under the unequal variance assumption)
the critical value of F in repeated measures is based on ...?
binomial distribution. Instead there is an S-shaped function in which small changes matter more for the probability near the mean of the continuous predictor
the probability regression surface is not linear in the continuous predictor variable in...?
Intraclass Correlation Coefficient (ICC)
used to estimate the degree of clustering in a dataset. The degree of clustering is positively related to alpha inflation, such that higher clustering leads to higher alpha inflation.
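As a sketch of how the ICC relates to alpha inflation, assume the usual definition ICC = tau^2 / (tau^2 + sigma^2), with tau^2 the between-group variance and sigma^2 the within-group variance. The design-effect formula 1 + (m - 1) × ICC is a standard approximation that is not stated in these notes; it is included here as an assumption, and the variance values are hypothetical.

```python
def icc(tau_sq, sigma_sq):
    """Proportion of total variance attributable to differences among groups."""
    return tau_sq / (tau_sq + sigma_sq)

def design_effect(cluster_size, icc_value):
    """Approximate factor by which sampling variance is understated when
    clustering is ignored (assumed formula: 1 + (m - 1) * ICC)."""
    return 1 + (cluster_size - 1) * icc_value

no_clustering = icc(0.0, 5.0)             # ICC = 0: groups do not differ
some_clustering = icc(2.0, 6.0)           # a quarter of the variance is between groups
deff = design_effect(11, some_clustering)
```

A design effect above 1 means the naive OLS standard errors are too small, so higher clustering produces more alpha inflation.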
Random effects
variables that are selected at random from a probability distribution (e.g., the intercept and slope in a simple regression).
fixed effects
variables that take on a predetermined set of values. The intercept and slope in simple regression are thought to reflect the true slope/intercept of a population.
1 bs and 1 ws model of repeated measures
we are not typically interested in inferences about the subject-related sources of variances in...?
maximum likelihood
we want to maximize the probability of a set of observed datapoints given in a model
multivariate tests
what is a competitor to the adjusted df tests for repeated measures?
The issue with interpreting the ηp2 in a repeated measures design is that SSSubjects is ignored as a source of 'error'. •ηp2 from output is problematic for repeated measures and I do not recommend it
why is ηp2 for repeated measures problematic?
•1) X correlates with Y •2) X precedes Y in time •3) Non-spuriousness: the XY relationship holds after eliminating (statistically controlling for) other influences on Y
•Inferring causality from correlation in path analysis:
-Mj is the sample mean in the jth group -Mxj is the mean covariate in the jth group -Mx is the grand mean for the covariate
•Mj - by.x (Mxj - Mx), where
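The adjusted-mean formula Mj - by.x (Mxj - Mx) can be sketched as below, assuming by.x is the pooled within-group slope of Y on the covariate X. The function names and the two small groups of scores are hypothetical.

```python
def pooled_within_slope(groups):
    """groups: list of (x_values, y_values), one pair per group.
    Returns b_y.x pooled across groups (assumes homogeneous slopes)."""
    sp = ss = 0.0
    for xs, ys in groups:
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sp += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        ss += sum((x - mx) ** 2 for x in xs)
    return sp / ss

def adjusted_means(groups):
    """Apply Mj - b_y.x * (Mxj - Mx) to every group."""
    b = pooled_within_slope(groups)
    all_x = [x for xs, _ in groups for x in xs]
    grand_mx = sum(all_x) / len(all_x)            # Mx: grand mean of covariate
    return [sum(ys) / len(ys) - b * (sum(xs) / len(xs) - grand_mx)
            for xs, ys in groups]

# Two groups that differ on the covariate (e.g. fluke randomization).
groups = [([1, 2, 3], [2, 4, 6]),
          ([3, 4, 5], [7, 9, 11])]
raw = [sum(ys) / len(ys) for _, ys in groups]
adjusted = adjusted_means(groups)
```

In this made-up example the raw means differ by 5 points but the adjusted means differ by only 1, because most of the raw difference reflects the groups' difference on the covariate.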