Research Methods 2 Final
time series design
researchers take multiple measures over the course of a longitudinal study (same as single-group pretest-posttest w/o manipulation)--typical of longitudinal design
questionnaire
respondents read instructions and write/mark answers to questions -mostly context questions
confidence interval (CI)
roughly speaking, the range of scores (that is, the scores between an upper and lower value) that is likely to include the true population mean; more precisely, the range of possible population means from which it is not highly unlikely that you could have obtained your sample mean.
Assumptions of ANOVA
same as t test for independent means 1. pops follow normal curve 2. pops have equal variances 3. scores in grps independent of each other (not paired/matched)
outliers
scores with an extreme (very high or very low) value in relation to the other scores in the distribution
raw score effect size
shown as the difference between the Population 1 mean and the Population 2 mean
Single-subject (N-of-one) experimental designs
Pre vs post-treatment behavior. Evaluating change in a single P. -time-series -Skinner/clinical psych
Frows
S^2rows/S^2within or MSrows/MSwithin
Assumptions of factorial ANOVA
Same as for one-way ANOVA pop normality/equal variance for pops that go w each cell
correlated groups (within-subjects/matched subjects) design
Same or closely matched participants in each group.
One-way repeated measures ANOVA
Same procedure as two-way but the rows, instead of another factor, are Ps. One P per row, each P has a score in each column. 2 way ANOVA in which 1 factor is experimental condition & other is P. Interested in differences over & above individual overall differences among Ps.
demographic questions (surveys)
Seek descriptive info about respondents -factual items: can be verified independently
100-power
beta
opposite of power is
Beta, the probability of not getting a significant result if the research hypothesis is true, is the opposite of power, the probability of getting a significant result if the research hypothesis is true. Beta is 100% minus power
how can we unbias the estimate of population variance?
by slightly changing the formula (dividing by N-1) Dividing by a slightly smaller number makes the result slightly larger.
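A minimal Python sketch of the N-1 correction, using hypothetical scores:

```python
# Dividing the sum of squared deviations by N-1 instead of N
# gives a slightly larger (unbiased) estimate of the pop variance.
scores = [4, 6, 8, 10, 12]          # hypothetical sample
N = len(scores)
mean = sum(scores) / N              # M = 8
ss = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations = 40
biased = ss / N          # SD^2: tends to underestimate pop variance
unbiased = ss / (N - 1)  # S^2: divides by df = N-1, slightly larger
```

Here biased = 8.0 and unbiased = 10.0, showing how dividing by the smaller number inflates the estimate.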
statistical power
probability that the study will give a significant result if the research hypothesis is true.
two-way repeated measures ANOVA
Set up like a 3-way between-subjects ANOVA w 2 repeated measures variables & one dimension for Ps.
variance of a distribution of means
S^2M = sum((M - GM)^2) / dfbetween
inverse transformation
researcher changes each score to 1 divided by score. Corrects distributions w strongest skew to right (positive skew)
ordinary path analysis has largely given way to
structural equation modeling
matrix of cells
structure of cells in a factorial design
SSwithin
sum of squared deviations of each score from its group's mean SSwithin = sum((X - M)^2); S^2within (MSwithin) = SSwithin/dfwithin
ex post facto study
"after the fact" Research observes current behavior and attempts to relate it to earlier experiences cannot infer causal relationships (confounding) /eliminates rival hypotheses
R^2columns
((S^2columns)(dfcolumns))/((S^2columns)(dfcolumns)+(S^2within)(dfwithin))
R^2interaction
((S^2interaction)(dfinteraction))/((S^2interaction)(dfinteraction)+(S^2within)(dfwithin))
R^2rows
((S^2rows)(dfrows))/((S^2rows)(dfrows)+(S^2within)(dfwithin))
What determines the power of the study ?
(1) how big an effect (the effect size) the research hypothesis predicts (2) how many participants are in the study (the sample size). Power is also affected by the significance level chosen, whether a one-tailed or two-tailed test is used, and the kind of hypothesis-testing procedure used
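The effect of predicted effect size and sample size on power can be sketched for a one-tailed z test (the helper name and the illustration values are hypothetical; this is the standard normal-approximation formula, power = 1 - Phi(z_crit - d*sqrt(N))):

```python
from statistics import NormalDist

def power_one_tailed_z(d, n, alpha=0.05):
    """Approximate power of a one-tailed z test for a mean,
    given predicted effect size d and sample size n."""
    z_crit = NormalDist().inv_cdf(1 - alpha)      # e.g. about 1.64 for alpha = .05
    # Under the research hypothesis the distribution of means shifts by d*sqrt(n)
    return 1 - NormalDist().cdf(z_crit - d * n ** 0.5)

small_n = power_one_tailed_z(d=0.5, n=25)    # medium effect, 25 participants
large_n = power_one_tailed_z(d=0.5, n=100)   # same effect, more participants
```

With the same predicted effect size, the larger sample gives noticeably higher power, matching point (2) above.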
cohens conventions for effect size
(a) small effect size: d = .2 (b) medium effect size: d = .5 (c) large effect size: d = .8
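A small sketch of Cohen's d and the conventions above, on hypothetical means and SD:

```python
def cohens_d(m1, m2, pop_sd):
    """Cohen's d: difference between means in population SD units."""
    return (m1 - m2) / pop_sd

d = cohens_d(105, 100, 10)   # hypothetical: means 105 vs 100, pop SD 10
# Cohen's conventions: .2 small, .5 medium, .8 large
size = ("large" if abs(d) >= .8 else
        "medium" if abs(d) >= .5 else
        "small" if abs(d) >= .2 else "negligible")
```

Here d = 0.5, which falls exactly at the "medium" convention.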
SScolumns
sum((Mcolumn - GM)^2)
SSrows
sum((Mrow - GM)^2)
SSinteraction
sum(((X - GM) - (X - M) - (Mrow - GM) - (Mcolumn - GM))^2)
SSwithin (factorial ANOVA)
sum((X - M)^2) where X = score, M = mean of score's cell
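The factorial SS formulas can be checked on a tiny hypothetical 2x2 design with two scores per cell (all data invented for illustration); the four components should add back up to SStotal:

```python
# cells[r][c] is the list of scores in that cell of a 2x2 factorial
cells = [[[4, 6], [8, 10]],
         [[6, 8], [12, 14]]]

scores = [(r, c, x) for r in range(2) for c in range(2) for x in cells[r][c]]
GM = sum(x for _, _, x in scores) / len(scores)                       # grand mean
row_mean = [sum(x for r, _, x in scores if r == i) / 4 for i in range(2)]
col_mean = [sum(x for _, c, x in scores if c == j) / 4 for j in range(2)]
cell_mean = [[sum(c) / len(c) for c in row] for row in cells]

SS_rows = sum((row_mean[r] - GM) ** 2 for r, c, x in scores)
SS_cols = sum((col_mean[c] - GM) ** 2 for r, c, x in scores)
SS_within = sum((x - cell_mean[r][c]) ** 2 for r, c, x in scores)
SS_inter = sum(((x - GM) - (x - cell_mean[r][c])
                - (row_mean[r] - GM) - (col_mean[c] - GM)) ** 2
               for r, c, x in scores)
SS_total = sum((x - GM) ** 2 for r, c, x in scores)
# SS_total == SS_rows + SS_cols + SS_inter + SS_within
```

For these numbers SS_total = 78 splits into rows 18, columns 50, interaction 2, and within 8.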
Major categories of sampling procedures
-Nonprobability (convenience) sampling: Uses respondents who are readily available w little attempt to represent any pop--convenient but weak. -Probability sampling: greater confidence sample adequately represents pop. Random sampling: every member of pop has equal chance of being selected -> lower selection bias.
Typical Program Evaluation Designs
-Randomized control-grp design (max control) -Nonequivalent control-grp design (natural control grp likely similar to grp being evaluated) -single-grp, time-series design (repeated measures on DV before, during, & after program to control threats to internal validity if control grp not possible) -pretest-posttest design (weak, least control)
Assumptions for the Significance Test of a Correlation Coefficient
-The population of each variable (X and Y) follows a normal distribution. -There is an equal distribution of each variable at each point of the other variable. - linear
Uses of factor analysis (video)
-Variance/covariance btwn set of observed characteristics in terms of a simpler structure. (less unobserved than observed factors) -testing a theory in psych -dimensionality reduction: less unobserved than observed variables (machine learning)
Basic designs in survey research
-cross-sectional survey design (administering survey once to a sample, obtaining data on measured characteristics as they exist at time of survey) -longitudinal survey design (within-subjs survey research design in which same grp/panel of Ps surveyed at several times to assess changes within individuals over time) -sequential survey design (cross-sectional survey repeated over time, giving sense of how things are changing longitudinally)
assumptions for chi-square tests
-each score must not have any special relation to any other scores -scores cant be based on same ppl being tested more than once.
systematic between-groups variance
-experimental variance (due to IVs) -extraneous variance (due to confounding variables) (contrast w nonsystematic within-groups error variance)
in experimentation, each study is designed to
-maximize experimental variance -control extraneous variance -minimize error variance
how does a t dist differ from a normal curve
-slightly more spread out than a normal curve, but very similar -Also higher tails —> more extreme. This means it takes a slightly more extreme sample mean to get a significant result when using a t dist vs a normal curve
z scores to know
.05 --> 1.64 (one-tailed), 1.96 (two-tailed); .01 --> 2.33 (one-tailed), 2.58 (two-tailed)
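These cutoffs can be recovered from the standard normal distribution with Python's stdlib (a sketch, not part of the card):

```python
from statistics import NormalDist

z = NormalDist()                       # standard normal: mean 0, SD 1
cut_05_one = z.inv_cdf(1 - .05)        # one-tailed .05 cutoff
cut_05_two = z.inv_cdf(1 - .05 / 2)    # two-tailed .05 cutoff
cut_01_one = z.inv_cdf(1 - .01)        # one-tailed .01 cutoff
cut_01_two = z.inv_cdf(1 - .01 / 2)    # two-tailed .01 cutoff
```

Rounded to two decimals these are 1.64, 1.96, 2.33, and 2.58, matching the card.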
z scores will always have a mean of _____ and a standard deviation of _______
0, 1
Major reasons for conducting experiments in a field setting
1. Test external validity of lab findings 2. Determine effects of events that occur in the field (Piaget) 3. Improve generalization across settings
three possible directions of causality
1. X could be causing Y. 2. Y could be causing X. 3. Some third factor could be causing both
Types of Generalization
1. of results from Ps in a study to a larger pop 2. of results of study over time (ie testing stability) 3. of results from study settings to other field settings (naturalistic settings)
most common use of chi square
2 nominal variables, each w several categories (chi-square test for independence)
Multivariate (factorial) design
2 or more IVs in a single study
A distribution of differences between means is, in a sense, ______ steps removed from the populations of individuals
2 steps removed First, there is a distribution of means from each population of individuals; second, there is a distribution of differences between pairs of means, one of each pair of means taken from its particular distributions of means
estimation approach always uses
2 tailed
How many F-ratios in a two-way ANOVA
3 (column/row main effects, interaction effect)
effect size
standardized measure of difference (lack of overlap) between populations. Effect size increases with greater differences between means.
Levene's test
An F-test used to determine if the variances from data sets are significantly different (homogeneity of variances)
Repeated Measures ANOVA
Appropriate ANOVA for within-subjs design. Modified to factor in correlation of conditions in within-subjs design.
randomized within blocks
Arranging trials into blocks of trials (includes 1 trial from each condition) and then randomizing order of presentation of trials within each of these blocks. No possibility of one condition having most of trials at beginning/end of testing.
Surveys
Ask participants in their natural environments about their experiences, attitudes, or knowledge. Not a single research design -status survey and survey research
Correlated group designs
Assure group equivalence by using same Ps in all grps or closely matched Ps. -within-subjects (single-subject an extension) -matched-subjects
within-cells variance estimate
Average of pop variance estimates made from scores in each of the cells (denominator in factorial f-value)
sample size and power
Basically, the more people there are in the study, the more power there is. Sample size affects power because, the larger the sample size is, the smaller the standard deviation of the distribution of means becomes. If these distributions have a smaller standard deviation, they are narrower, so there is less overlap between them and more power
Mean Squares (MS)
Between-groups & within-groups variances estimates Divide each sum of squares by associated df
Mixed Design Factorial ANOVA
Can refer to a factorial that includes both btwn-subjs & within-subjs factors. Can also be used to refer to a factorial that includes both manipulated & nonmanipulated factors. Mixed in both ways: Classified on btwn vs within to select appropriate ANOVA, then manipulated vs nonmanipulated to interpret results
Solomon's 4 group design
Combine randomized, pretest-posttest, control grp design & posttest-only, control grp design. Way to control interaction effects.
Frustration effect
Common clinical phenomenon that occurs whenever new procedures initiated (ex peak shortly after training begins) -problem for interrupted time series design
within-subjects design/ANOVA
Comparison within, not between the diff Ps/subjects ex: repeated measures
approximate randomization tests
Computer randomly selects large # of possible divisions of sample. Considered representation of using every possible division
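A sketch of an approximate randomization (permutation) test on hypothetical two-group data, where a large number of random divisions stands in for all possible divisions (all numbers invented for illustration):

```python
import random

random.seed(1)
group_a = [5, 7, 8, 9, 11]
group_b = [2, 3, 4, 6, 7]
observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

pooled = group_a + group_b
n_a = len(group_a)
count = 0
reps = 10_000
for _ in range(reps):
    random.shuffle(pooled)                       # one random division of the sample
    diff = (sum(pooled[:n_a]) / n_a
            - sum(pooled[n_a:]) / (len(pooled) - n_a))
    if abs(diff) >= abs(observed):               # as extreme as observed (two-tailed)
        count += 1
p_value = count / reps
```

The proportion of random divisions at least as extreme as the observed difference approximates the exact randomization p-value.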
experimental designs (generally)
Control groups and randomization distinguish them from most nonexperimental designs. They control most sources of confounding. Randomize whenever possible & ethically feasible
crossover effect
Control grp does not change, but experimental grp does, even going beyond level of control grp. -considerable confidence in causal inferences.
Advantage in correlation/regression over t-test/ANOVA
Correlational gives you direct info on relationship btwn variable that divides the grps & measured variable as well as permitting a significance test. T-test/ANOVA gives only statistical significance.
Latin squares design
Counterbalanced arrangements that involves arranging letters in rows and columns so each letter appears only once in each row and once in each column. Result is all letters appears in each position in sequence equal # of times, and all letters follow each other equal # of times.
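A minimal sketch generating a cyclic (standard) Latin square. Note this simple cyclic form guarantees each condition once per row and once per column, but a fully balanced square (each condition following every other equally often, as the card describes) needs a more elaborate construction:

```python
def latin_square(conditions):
    """Cyclic Latin square: condition k in row r, column c is
    conditions[(r + c) % n], so each appears once per row and column."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]

square = latin_square(["A", "B", "C", "D"])
# Row 0: A B C D; Row 1: B C D A; etc.
```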
Status survey
Describe current characteristics of a pop
independent groups (between-subjects) design
Different participants appear in each group
Assumptions for a t test of independent means
Each of the population distributions is assumed to follow a normal curve. Assume two populations have the same variance —> homogeneity of variance. The scores are independent from each other, meaning they are not paired up or matched in any way (that would make it a dependent means test)
Within-subjects (repeated measures) design
Each p serves as his or her own control. Test Ps under all conditions. Drawback: sequence effects (confounding). Benefit: reduced error variance, so larger F values
Kramer
Emotional contagion: leading ppl to experience the same emotions without their awareness. Positive posts omitted: higher % of negative words, lower % of positive words in person's posts. Vice versa. Withdrawal effect: fewer emotional posts, less expressive overall in following days. small effect size
Reporting F values
F(dfbtwn,dfwithin)=?,p<?
Difference between F-tests and the structural model
F-tests focus directly on what contributes to overall pop variance; structural model focuses directly on what contributes to divisions of deviations of scores from Grand mean.
Survey research
Goes beyond status surveys, seeking relationships among variables -most familiar form of research in social sciences -Basic rule in survey research: instrument should have clear focus & be guided by hypotheses held by researcher -open-ended, mc, and likert scale (continuum of agreement) items -heart of survey research: selection of representative samples
mixed design ANOVA
Have in same analysis both repeated measures & ordinary btwn-subjs factors.
design notation factorial
IVs & how many levels of each variable are included (# of levels of IV1 x # of levels of IV2...)
Factors
IVs in a factorial design
variance of a distribution of differences between means
If the null is true, we assume they have the same variance, so when estimating the variance of each pop you are getting two separate estimates of what should be the same number —> in reality they are going to be slightly different, so we use a pooled estimate of the population variance
single-group, pretest-posttest study
Improvement over posttest-only approach bc includes pretreatment evaluation; still cannot infer causal relationship (confounding) -time-series design
Statistical significance
Indicates that a result is unlikely if there really is no effect of the independent variable.
error bars
Indication of statistical uncertainty showing true mean of pop falls somewhere within range represented. Standard error = most commonly used measure of variability for error bars
Why is the normal curve (or at least a curve that is symmetrical and unimodal) so common in nature?
It is common because any particular score is the result of the random combination of many effects, some of which make the score larger and some of which make the score smaller. Thus, on average these effects balance out near the middle, with relatively few at each extreme, because it is unlikely for most of the increasing and decreasing effects to come out in the same direction
level of significance tells you
It tells you how confident you can be that these results could not have been found if the null hypothesis were true. The more extreme (lower) the p level is, the stronger the evidence is for a nonzero effect size.
Huynh-Feldt correction
Least conservative correction for violation of sphericity assumption (lowest p-value). Widely agreed to be quite accurate; most used correction.
Cramer's phi
Measure of effect size for a chi-square test for independence with a contingency table that is larger than 2x2. Also known as Cramer's V. Cramer's phi = sqrt(X^2/(N*dfsmaller)); dfsmaller = df for smaller side of contingency table; N = total number of Ps
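A sketch computing the chi-square statistic and Cramer's phi from a hypothetical 2x3 contingency table (counts invented for illustration):

```python
# rows = groups, columns = categories of the second nominal variable
table = [[10, 20, 30],
         [20, 20, 20]]

row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]
N = sum(row_tot)

# chi-square: sum of (observed - expected)^2 / expected over all cells,
# with expected = row total * column total / N
chi2 = sum((table[r][c] - row_tot[r] * col_tot[c] / N) ** 2
           / (row_tot[r] * col_tot[c] / N)
           for r in range(len(table)) for c in range(len(table[0])))

df_smaller = min(len(table), len(table[0])) - 1   # df for the smaller side
cramers_phi = (chi2 / (N * df_smaller)) ** 0.5
```

Here chi2 is about 5.33 and Cramer's phi about 0.21, a smallish association.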
data transformations usually mentioned in
Methods/results section of research article.
Heterogeneous Population
More diversity must be represented in sample; need larger sample to represent diversity of pop
randomized, posttest-only, control group design
Most basic experimental design, includes randomization (selection/assignment) and a control group. The 2 groups must be equivalent at the beginning of the study (statistically equal) bc after random assignment to groups, any small differences that might exist are results of sampling error
Macro-level variables have relation w micro-level variables (multi-stage)
Most common situation in psych research -Z -> Y (macro-to-micro) -Z -> Y / X -> Y (relationship btwn Z & Y, given effect of X on Y taken into account) -X -> Y with Z connecting to middle of arrow (relation btwn X & Y dependent on Z)--cross-level macro-micro interaction -X -> Z (micro-macro proposition)
context questions
Most items on a questionnaire. Ask about respondents' opinions, attitudes, knowledge, & behavior
dftotal (Factorial ANOVA)
N-1, or dfrows+dfcolumns+dfinteraction+dfwithin
df for independent means
N-2 (dftotal = df1 + df2)
pretest-posttest, natural control group study
Naturally occurring groups are used, only one of which receives the treatment. Participants not randomly assigned to groups as in experimental design No procedure to ensure the 2 groups are equivalent at start of study (confounding) close to experimental design
dfinteraction (factorial ANOVA)
Ncells-dfrows-dfcolumns-1
dfcolumns
Ncolumns-1
Hilgard
Neither game difficulty nor violence predicted aggressive behavior. Low 2D:4D ratio did not either (individual biological differences)
Assumptions (one-way repeated measures ANOVA)
Normal distribution & equal variances of pops for each condition/condition combo Sphericity: Pop variances for each condition are same & pop correlations among the diff conditions are same
dfrows
Nrows-1
dfrows(one-way repeated measures ANOVA)
Number of Ps-1
Repeated Measures ANOVA
Often used to analyze repeated measures designs
Single-variable (univariable) designs
One IV
Don't get mixed up. The distributions of means can be narrow (and thus have less overlap and more power) for two very different reasons. These are:
One reason is that the population of individuals may have a small standard deviation; this has to do with effect size. The other reason is that the sample size is large. This reason is completely separate. Sample size has nothing to do with effect size. Both effect size and sample size affect power.
Counterbalancing
Order of presentation of conditions is systematically varied. Complete counterbalancing=all possible orders of conditions occur equal # of times. (X!; where X=# of conditions). Partial better for large # of conditions, complete for small #. 1. each P exposed to all conditions of experiment 2. each condition presented equal # of times in each position 3. each condition precedes & follows each other condition an equal # of times.
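A sketch of complete counterbalancing with 3 hypothetical conditions, showing the X! orders and the equal-appearances property:

```python
from itertools import permutations

conditions = ["A", "B", "C"]
orders = list(permutations(conditions))   # all X! possible orders
n_orders = len(orders)                    # 3! = 6

# In complete counterbalancing each order is used equally often, so each
# condition appears in each position the same number of times:
times_A_first = sum(1 for o in orders if o[0] == "A")   # (X-1)! = 2
```

With many conditions X! grows fast (e.g. 6 conditions -> 720 orders), which is why partial counterbalancing is preferred for large X.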
parametric test
Ordinary hypothesis testing procedure that requires assumptions about the shape or other parameters of the populations
randomized pretest-posttest, control group design
Participants randomly assigned to grps. All ps tested on DV (pretest), experimental group is administered the treatment, then both groups retested on DV (posttest). Critical comparison btwn experimental & control grps on posttest measures. Adds level of confidence w pretest (way to check if grps actually equivalent on DV at start of experiment)
Problem with addition of a pretest
Possibility that pretest will affect Ps response to treatment Confounding: effect of pretest may not be constant for grps, but vary depending on level of IV
random order of presentation
Randomly assign each P to a different order of the conditions. Effective w large number of subjects
Single-subject direct replication
Repeating a single-subject experiment with the same participant or other participants and with the same target behavior
Unbiased assignment
Required of experimental designs. Can be achieved thru free random assignment or matched random assignment.
1st step of hypothesis testing for two-way ANOVA
Research/null hypothesis for each main effect and the interaction effect. Same process for testing each effect.
single-group, posttest-only study
Researcher manipulates IV with a single group and then group is measured. cannot infer causal relationships (confounding)
multilevel, completely randomized, between-subjects design
Researcher randomly assigns Ps to 3 or more conditions Controls for same confounding as simple 2-grp designs may/may not include pretests
three things ab the distribution of means (central limit theorem)
Rule 1: The mean of a distribution of means is the same as the mean of the population of individuals Rule 2a: The variance of a distribution of means is the variance of the population of individuals divided by the number of individuals in each sample Rule 2b: The standard deviation of a distribution of means is the square root of the variance of the distribution of means. Rule 3: The shape of a distribution of means is approximately normal if either (a) each sample is of 30 or more individuals or (b) the distribution of the population of individuals is normal.
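Rules 1 and 2a can be illustrated by simulation on a hypothetical skewed population (all values invented; random sampling, so results are approximate):

```python
import random

random.seed(0)
population = [1] * 500 + [2] * 300 + [10] * 200   # skewed pop, mean = 3.1
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

n = 30   # scores per sample (Rule 3: n >= 30, so means approx normal)
means = [sum(random.choices(population, k=n)) / n for _ in range(20_000)]
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)
# Rule 1:  mean_of_means is close to pop_mean
# Rule 2a: var_of_means is close to pop_var / n
```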
PRE (proportionate reduction in error)
(SStotal - SSerror) / SStotal tells you how good of a job your X has done in predicting Y
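A sketch of proportionate reduction in error, (SStotal - SSerror)/SStotal, on hypothetical actual and predicted Y scores:

```python
actual    = [4, 6, 8, 10]   # hypothetical criterion scores
predicted = [5, 5, 9, 9]    # hypothetical predictions from some model
mean_y = sum(actual) / len(actual)

# Error when predicting from the mean vs. error using the model's predictions
ss_total = sum((y - mean_y) ** 2 for y in actual)
ss_error = sum((y - yhat) ** 2 for y, yhat in zip(actual, predicted))
pre = (ss_total - ss_error) / ss_total   # proportion of error eliminated
```

Here ss_total = 20 and ss_error = 4, so the model eliminates 80% of the prediction error.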
S^2(MS)columns
SScolumns/dfcolumns
S^2(MS)interaction
SSinteraction/dfinteraction
SStotal (factorial ANOVA)
SSrows+SScolums+SSinteraction+SSwithin 0r (sum(X-GM))^2
S^2(MS) rows
SSrows/dfrows
S^2(MS) within (Factorial ANOVA)
SSwithin/dfwithin
F value (ANOVA)
S^2between/S^2within or MSbetween/MSwithin
Fcolumns
S^2columns/S^2within or MScolumns/MSwithin
Finteraction
S^2interaction/S^2within or MSinteraction/MSwithin
There is some debate in the psychological research literature about whether research studies should present their results in terms of unstandardized regression coefficients, standardized regression coefficients, or both types of coefficients. one solution? problems?
Some researchers recommend listing only regular unstandardized regression coefficients when the study is purely applied, only standardized regression coefficients when the study is purely theoretical, and both the regular and standardized regression coefficients in all other cases. Although this appears to be a straightforward solution to the issue, researchers do not always agree as to when a study is purely applied or purely theoretical.
Type of hypothesis-testing procedure and power
Sometimes the researcher has a choice of more than one hypothesis-testing procedure to use for a particular study, and some procedures have more power than others.
Hurlbert
Argued "statistically significant" should be disallowed in scientific literature (except for abstracts). Studies conducting statistical analyses are too diverse in type, size, objective, statistical procedures used, & other summary statistics accompanying p-values for generally applicable specific guidelines to be possible
t statistic formula
t = (sample mean - population mean) / standard deviation of the distribution of means
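A small sketch of the t formula on hypothetical numbers (a sample of N = 16, estimated pop variance S^2 = 16):

```python
def t_statistic(sample_mean, pop_mean, s_m):
    """t = (M - mu) / S_M, where S_M is the standard deviation of the
    distribution of means (S_M = sqrt(S^2 / N))."""
    return (sample_mean - pop_mean) / s_m

s_m = (16 / 16) ** 0.5          # S_M = sqrt(S^2 / N) = 1
t = t_statistic(24, 22, s_m)    # hypothetical M = 24, mu = 22
```

With these numbers t = 2.0, which would be compared to the t cutoff for df = N - 1 = 15.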
power and effect size
The bigger the effect size is, the greater the power. Means: if there is a big mean difference in the populations, you have more chance of getting a significant result in the study. The difference in the means between populations we saw earlier is part of what goes into effect size. Standard deviation: the smaller the standard deviation is, the bigger the effect size is, so the smaller the SD, the higher the power. This is because a smaller SD means less variation, so narrower distributions, so less overlap, so higher chance of getting a stat sig result.
formula for correlation coefficient (r)
The correlation coefficient is the sum, over all the people in the study, of the product of each person's two Z scores, then divided by the number of people.
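This definition translates directly to code; a sketch on hypothetical paired scores, with Z scores computed using the population SD (dividing by N), as in Z = (X - M)/SD:

```python
def z_scores(xs):
    """Convert raw scores to Z scores: (X - M) / SD."""
    n = len(xs)
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return [(x - m) / sd for x in xs]

x = [1, 2, 3, 4, 5]    # hypothetical predictor scores
y = [2, 4, 5, 4, 10]   # hypothetical criterion scores
# r = sum of cross-products of Z scores, divided by number of people
r = sum(zx * zy for zx, zy in zip(z_scores(x), z_scores(y))) / len(x)
```

For these numbers r is about .84, a strong positive correlation, and dividing by N keeps r inside -1 to +1.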
How is the general linear model different from multiple regression? Why?
The general linear model is for the actual (not the predicted) score on the criterion variable, and it includes a term for error. To predict the actual score, you have to take into account that there will be error
main effects
The impact of each independent variable on the dependent variable; the result for a grouping variable, averaging across the levels of the other grouping variable(s) -factorial design
intercept on regression line
The intercept is the predicted score on the criterion variable Y when the score on the predictor variable (X) is 0. It turns out that the intercept is the same as the regression constant. This works because the regression constant is the number you always add in—a kind of baseline number, the number you start with.
z scores
The number of standard deviations (SD) a score (X) is from the mean (M).
What is the relationship between the regression line and the linear prediction rule?
The regression line is a visual display of the linear prediction rule
What number is used to indicate the accuracy of an estimate of the population mean? Why?
The standard error (or standard deviation of the distribution of means) is used to indicate the accuracy of an estimate of the population mean. The standard error (or standard deviation of the distribution of means) is roughly the average amount that means vary from the mean of the distribution of means.
What is the difference between correlation as a statistical procedure and a correlational research design?
The statistical procedure of correlation is about using the formula for the correlation coefficient, regardless of how the study was done. A correlational research design is any research design other than a true experiment.
Single-subjects, randomized, time-series design
Time-series design for single P w 1 additional element: point at which treatment begins determined randomly. Measure DV several times over long pd, w experimental intervention occurring at selected pt during observations.
Purpose of factor analysis (video)
To estimate a model which explains variance/covariance btwn set of observed variables (in a pop) by a set of (fewer) unobserved factors +weightings.
why do we use squared error?
Unlike unsquared errors, squared errors do not cancel each other out. Using squared errors penalizes large errors more than small errors; they also make it easier to do more advanced computations.
matched-subjects design
Uses diff Ps in each condition, but closely matches Ps before assigning them to conditions (matched random assignment). Matched on relevant variables likely to have an effect on the DV w significant variability in pop.
One- versus two-tailed tests and power
Using a two-tailed test makes it harder to get significance on any one tail.
Strictly speaking, consider what we are figuring to be, say, a 95% confidence interval?
We are figuring it as the range of means that are 95% likely to come from a population with a true mean that happens to be the mean of our sample. However, what we really want to know is the range that is 95% likely to include the true population mean. This we cannot know.
stratified random sampling
When important to ensure subgrps within pop adequately represented in sample. Divide pop into strata (subgrps) and take random sample from each stratum.
Central principle of ANOVA
When the null hypothesis is true, the ratio of the between-groups variance estimate to the within-groups variance estimate should be about 1. When the research hypothesis is true, the ratio should be greater than 1.
nominal (dummy) coding
When there are more than 2 groups, instead of trying to make the nominal variable that divides the groups into a single variable, you can make it into several numerical predictor variables w 2 levels each avoids problem of creating arbitrary rankings used in multiple regression for predictors for same results as ANOVA
predictor variable
X in prediction, variable that is used to predict
Chi-square results
X^2(df)=?, p__
When figuring the correlation coefficient, why do you divide the sum of cross-products of Z scores by the number of people in the study?
You divide the sum of cross-products of the Z scores by the number of people in the study, because otherwise the more people in the study, the bigger the sum of the cross-products, even if the strength of correlation is the same. Dividing by the number of people corrects for this. (Also, when using Z scores, after dividing the sum of the cross-products by the number of people, the result has to be in a standard range of -1 to 0 to +1.)
z scores have a greater magnitude when
Your score is further from the mean OR the standard deviation (variability) of the distribution is smaller
Eta squared (n^2)
a common name for R^2 measure of effect size for ANOVA (aka correlation ratio)
Shapiro-Wilk test.
a formal statistical test of normality
scatter diagram
a graph that shows the relation between two variables. One axis is for one variable; the other axis, for the other variable. The graph has a dot for each individual's pair of scores. The dot for each pair is placed above that of the score for that pair on the horizontal axis variable and directly across from the score for that pair on the vertical axis variable
results of Shapiro-Wilk test
a probability of less than .05 tells us that the distributions differ pretty greatly from normality, which is NOT what we want! We want the probabilities to be high, indicating that the distributions are pretty similar to a normal distribution.
interrupted time series design
a single group of participants is measured several times both before and after some event or manipulation -variation of within-subjs design -can take advantage of data already gathered over long time, in studies in which presumed causal event occurs for all members of pops -change must be sharp to be interpreted as anything other than normal fluctuation, occur immediately after intervention -flexible design, can use existing data, can be improved by using one or more comparison grps
t test
a special case of ANOVA (use only when 2 grps)
bivariate correlation
a special case of multiple correlation
bivariate prediction
a special case of multiple regression
Single-subject clinical replication
a specialized form of replication for single-subject designs, which is used primarily in clinical settings
grouping variable
a variable that separates groups in analysis of variance
Kinds of propositions for study of hierarchical/multilevel systems having 2 distinct layers (multi-stage)
about micro units (X -> Y) about macro units (Z->Y) about macro-micro relations (Z->Y/X->Y)
Loosely speaking, statistical significance is
about the probability that we could have gotten our pattern of results by chance if the null hypothesis were true
population parameter
actual value of the mean, standard deviation, and so on, for the population; usually population parameters are not known, though often they are estimated based on information in samples.
multilevel modeling
advanced type of regression analysis that handles a research situation in which people are grouped in some way that could affect the pattern of scores -lower-level variable: variable about ppl within each grouping -upper-level variable: variable about grouping as a whole
When both a main effect and interaction effect occur
always interpret the interaction first
ratio scale
an equal-interval variable is measured on a ratio scale if it has an absolute zero point, meaning that the value of zero on the variable indicates a complete absence of the variable
factorial analysis of variance
analysis of variance for a factorial research design
one-way analysis of variance
analysis of variance in which there is only one grouping variable
history
any number of other events might account for changes in DV. -must rule out/identify confounding caused by history in interrupted time-series design
Significant at p<.05 is interpreted as
as "there's less than a 5% probability that a finding this extreme would occur when the null hypothesis is really true."
variance
average of the squared deviation of each score from the mean. Tells you about how spread out the scores are (that is, their variability).
split-half reliability
based on correlation of scores of items from 2 halves of test -cronbach's alpha: widely used measure of test's internal consistency reliability (extent to which items of a measure assess a common characteristic) that reflects avg of split-half correlations from all possible splits into halves of the items on the test. at least 0.6 is good, preferably closer to 0.9.
within-groups pop variance estimate (one-way repeated measures ANOVA)
based only on deviation of scores within conditions after adjusting for each P's overall mean.
sequence effects
bc each P is exposed to all conditions, exposures to earlier conditions may affect performance on later conditions. Practice effects: due to growing experience w procedures over successive conditions (positive=get better, negative=worse). ex: Carryover effects: due to influence of a particular condition/combo of conditions on responses to later conditions.
spurious correlation
correlation between two variables that is actually caused by a third variable
rank-order transformation
changing a set of scores to ranks
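A minimal sketch of turning scores into ranks (assumes no tied scores; ties would get the average of their ranks):

```python
# Minimal sketch: rank-order transformation (1 = lowest score).
# Assumes no ties; tied scores would need averaged ranks.
scores = [12, 3, 47, 8]            # hypothetical scores
order = sorted(scores)
ranks = [order.index(x) + 1 for x in scores]
print(ranks)  # [3, 1, 4, 2]
```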
Analysis of variance table
chart showing the major elements in figuring an analysis of variance using the structural model approach
Assent
child's agreement to participate in an experiment.
planned contrasts
comparison in which the particular means to be compared and in what direction they will differ were decided in advance dfwithin is same, dfbetween=1
primary goal of experimental research designs
control extraneous variables
methods of carrying out hypothesis tests when samples appear to come from nonnormal pops
-data transformation
-rank-order test
square root transformation
data transformation using the square root of each score. Positively (right) skewed ->less skewed.
not at beginning of results section if
data treated as ranked/nonparametric
Just how much the t distribution differs from the normal curve depends on the
degrees of freedom
mutatis mutandis
dependency of observations on micro-units within macro-units of focal interest
program evaluation research
determines how successfully a program meets its goals -needs several dependent measures (convergent validity) to evaluate how well program meets each of goals -minimize bias by using objective measures whenever possible, ppl not involved in administration of program to gather data
dfwithin (factorial ANOVA)
df1+df2+...+dflast (add up df for all the cells)
df (chi-square test for independence)
df=(Ncolumns-1)(Nrows-1)
between-groups (numerator) df (ANOVA)
dfbetween=Ngroups-1
Rules for btwn-subjects factorial design df
dftotal = N - 1
df for each main effect = number of levels of that factor minus 1
df for interaction = product of the dfs for the main effects
df for within-grps variance = N minus # of grps
# of grps = product of the # of levels of each factor
within-groups (denominator) df (ANOVA)
dfwithin=df1+df2+...+dflast where df1=df of group 1
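The between- and within-groups df formulas can be sketched for a hypothetical one-way ANOVA:

```python
# Minimal sketch of the ANOVA df formulas; group sizes are hypothetical.
group_sizes = [10, 10, 10]                    # three groups of 10
n_groups = len(group_sizes)
df_between = n_groups - 1                     # Ngroups - 1 = 2
df_within = sum(n - 1 for n in group_sizes)   # df1 + df2 + ... + dflast = 27
print(df_between, df_within)
```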
biased estimate
estimate of a population parameter that is likely systematically to overestimate or underestimate the true value of the population parameter. For example, SD^2 would be a biased estimate of the population variance (it would systematically underestimate it).
between-groups estimate of population variance (ANOVA)
estimate of the variance of the population of individuals based on the variation among the means of the groups studied: S^2between or MSbetween = (S^2M)(n), where S^2M is the estimated variance of the sample means and n is the number of participants in each sample
Nonexperimental designs from most to least control for extraneous variance
-pretest-posttest, natural control group study
-single-group, pretest-posttest study
-single-group, posttest-only study
-ex post facto study
symmetrical distribution
distribution in which the pattern of frequencies on the left and right side are mirror images of each other
skewed distribution
distribution in which the scores pile up on one side of the middle and are spread out on the other side; distribution that is not symmetrical
comparison distribution for t test for independent means
distribution of differences between means
distribution of means
distribution of means of samples of a given size from a population (also called a sampling distribution of the mean); comparison distribution when testing hypotheses involving a single sample of more than one individual.
Dichotomizing
dividing the scores for a variable into two groups. Also called median split -can then do factorial ANOVA -lose info when reduce range of scores to just high & low (treat results w skepticism, more likely Type I/II errors) -appropriate when 2 v diff grps of ppl -common, but dying out
experimental variance
due to effects of IV(s) on DV(s)
why do researchers like z scores
-easier comparison of findings from diff studies
-easier communication about findings
phi coefficient (circle w line in it)
effect size measure for a chi-square test for independence with a 2x2 contingency table; square root of the chi-square statistic divided by N: phi = sqrt(X^2/N). Min of 0, max of 1. .1=small, .3=medium, .5=large
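A one-line sketch of the phi formula, with a hypothetical chi-square value and N:

```python
# Minimal sketch: phi = sqrt(chi-square / N). Numbers are hypothetical.
chi_square = 4.5
N = 100
phi = (chi_square / N) ** 0.5
print(round(phi, 3))  # 0.212 -> between small (.1) and medium (.3)
```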
The correlation coefficient itself is a measure of
effect size.
computer-intensive assumptions
do not require either of the 2 assumptions for parametric tests (normality/equal variances)
alpha
probability of making a Type I error; same as significance level
beta
probability of making a Type II error.
within-groups estimate of the population variance (ANOVA)
estimate of the variance of the population of individuals based on the variation among the scores in each of the actual groups studied. Based on chance (or unknown factors). S^2within or MSwithin = (S1^2 + S2^2 + ... + Slast^2)/Ngroups, where S1^2 is the estimated pop variance of grp 1
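The two population-variance estimates (and the F ratio built from them) can be sketched with hypothetical equal-sized groups:

```python
# Minimal sketch: within- and between-groups variance estimates and F.
# Three hypothetical groups of equal size.
groups = [[6, 8, 10], [4, 5, 6], [1, 2, 3]]
n = len(groups[0])                       # participants per group

def est_var(xs):
    # unbiased estimate: divide by n - 1
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

s2_within = sum(est_var(g) for g in groups) / len(groups)  # avg of group estimates
means = [sum(g) / len(g) for g in groups]
s2_between = est_var(means) * n          # variance of the means, times group size
F = s2_between / s2_within
print(s2_within, s2_between, F)          # 2.0 27.0 13.5
```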
manipulation check
evaluate whether manipulation actually had its intended effect on Participants. Assessment of effectiveness of experimental manipulation
ABA reversal design
evaluates the effects of an independent variable on a dependent variable by measuring the dependent variable several times, during which the treatment is applied and then removed. ABA: no treatment, treatment, no treatment
Kurtosis
extent to which a frequency distribution deviates from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve -peaked: more scores in tails, flat: fewer in tails
Estimation of population variance also leads to slightly more ______ means than in an exact normal curve
extreme
two-way factorial research design
factorial research design in analysis of variance with two variables that each divide the groups Use two-way ANOVA
Type 2 error
failing to reject the null hypothesis when in fact it is false; failing to get a statistically significant result when in fact the research hypothesis is true.
misuse of frequency tables/histograms
-failure to use equal interval sizes
-exaggeration of proportions
new in chi-square test of independence
figure differences btwn expected and observed for each combo of categories (each cell) independent: proportions up/down cells of each column should be same.
figuring power involves
figuring out the area of the shaded portion of the upper distribution
least squares criterion
finding the regression line that gives the lowest sum of squared errors between the actual scores on the criterion variable (Y) and the predicted scores on the criterion variable
To look at a main effect
focus on marginal means for each grouping variable
to look at the interaction effect
focus on pattern of individual cell means
link between research designs & statistical analysis
focusing on control of variance
Linear prediction rule
formula for making predictions; that is, formula for predicting a person's score on a criterion variable based on the person's score on one or more predictor variables
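The linear prediction rule, predicted Y = a + (b)(X), as a tiny sketch with made-up values of a and b:

```python
# Minimal sketch: linear prediction rule. a and b are hypothetical.
a = 2.0   # regression constant
b = 0.5   # regression coefficient

def predict(x):
    # predicted score on the criterion variable
    return a + b * x

print(predict(10))  # 7.0
```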
rectangular distribution
frequency distribution in which all values have approximately the same frequency
unimodal distribution
frequency distribution with one value clearly having a larger frequency than any other -most psych studies
bimodal distribution
frequency distribution with two approximately equal frequencies, each clearly larger than any of the others
multimodal distribution
frequency distribution with two or more high frequencies separated by a lower frequency; a bimodal distribution is the special case of two high frequencies
error term
function of both variability within groups and size of samples. As sample size increases, error term gets smaller
general linear model
general formula that is the basis of most of the statistical methods covered in this text; describes a score as the sum of a constant, the weighted influence of several variables, and error. central logic behind all these methods
test-retest reliability
give test to grp of ppl twice; correlation btwn scores from the 2 testings. -often not practical/appropriate
non-equivalent control group design
groups that already exist in natural environment used in research. 2 major problems: -groups may differ on dependent measure at beginning of study (use pretest) -may be other differences btwn grps (confounding caused by selection); use similar grps or identify/eliminate confounding differences btwn grps
quasi-experiment
have essential form of experiments, control for some confounding, but do not have as much control as experiments. -cannot always manipulate IV -usually cannot assign Ps to grps, must accept existing grps
Confounding factors interrupted time-series design
-history
-instrumentation
Ways to control sequence effects
-hold extraneous variables constant
-vary order of presentation of conditions
frequencies
how many people/observations fall into diff categories
T test for a single sample
hypothesis testing procedure in which a sample mean is being compared to a known population mean and the population variance is unknown
distribution-free tests
hypothesis-testing procedure that makes no assumptions about the shape of pops (nonparametric tests) ex: rank-order
randomization test
hypothesis-testing procedure (usually a computer intensive method) that considers every possible reorganization of the data in the sample to determine if the organization of the actual sample data were unlikely to occur by chance.
Analysis of Variance (ANOVA)
hypothesis-testing procedure for studies with three or more groups
chi-square test for goodness of fit
hypothesis-testing procedure that examines how well an observed frequency distribution of a nominal variable fits some expected pattern of frequencies
chi-square test of independence
hypothesis-testing procedure that examines whether distribution of frequencies over categories of one nominal variable is unrelated to distribution of frequencies over categories of a 2nd nominal variable
rank-order test
hypothesis-testing procedure that uses rank-ordered scores
Chi-square test
hypothesis-testing procedure used when the variables of interest are nominal (categorical) variables
bootstrap tests
hypothesis-testing procedures (computer-intensive) that allow you to create multiple estimates of sample statistic by creating large # of randomly selected samples from data & seeing how consistent sample statistic is across all estimates
relationship between r and t w correlations
if r = 0, t = 0. This is because the numerator would be 0 and the result of dividing 0 by any number is 0. Also notice that the bigger the r, the bigger the t
If results are stat sig and small sample size
important result
observed frequency
in a chi-square test, number of individuals actually found in the study to be in a category or cell
expected frequency
in a chi-square test, number of people in a category or cell expected if the null hypothesis were true. A cell's expected frequency is the # in its row divided by the total N, multiplied by the # in its column: E=(R/N)*C
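The E=(R/N)*C rule can be sketched for a hypothetical 2x2 table:

```python
# Minimal sketch: expected frequencies E = (R)(C)/N (same as (R/N)*C).
# Observed counts are hypothetical.
observed = [[20, 10],
            [10, 10]]
row_totals = [sum(row) for row in observed]        # [30, 20]
col_totals = [sum(col) for col in zip(*observed)]  # [30, 20]
N = sum(row_totals)                                # 50
expected = [[r * c / N for c in col_totals] for r in row_totals]
print(expected)  # [[18.0, 12.0], [12.0, 8.0]]
```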
marginal means
in a factorial design in analysis of variance, mean score for all the participants at a particular level of one of the grouping variables
cell
in a factorial research design, particular combo of levels of the variables that divide the group
regression constant (a)
in a linear prediction rule, particular fixed number added into the prediction
pooled estimate of the population variance (S2 Pooled)
in a t test for independent means, weighted average of the estimates of the population variance from two samples (each estimate weighted by the proportion consisting of its sample's degrees of freedom divided by the total degrees of freedom for both samples).
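The weighting described above can be sketched with hypothetical variance estimates and sample sizes:

```python
# Minimal sketch: pooled population-variance estimate, each group's estimate
# weighted by its share of the total degrees of freedom. Numbers hypothetical.
s2_1, n1 = 10.0, 11        # estimate and sample size, group 1
s2_2, n2 = 16.0, 31        # estimate and sample size, group 2
df1, df2 = n1 - 1, n2 - 1  # 10 and 30
df_total = df1 + df2       # 40
s2_pooled = (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2
print(s2_pooled)  # 14.5
```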
error
in prediction, the difference between a person's predicted score on the criterion variable and the person's actual score on the criterion variable.
latent variable
in structural equation modeling, unmeasured variable assumed to be underlying cause of several variables actually measured in study (represents combo of several variables)
factorial design
include multiple independent variables. Allow us to study interactive effects of IVs on DVs. Appropriate statistical analysis=ANOVA
if results are not stat sig and small sample size
inconclusive
decision errors
incorrect conclusions in hypothesis testing in relation to the real (but unknown) situation, such as deciding the null hypothesis is false when it is really true.
row means
means for all the people in a row in a factorial design
if the CI in a correlation test doesn't include 0
it's statistically significant
studies using difference scores have
large effects sizes/power BUT weak research design: many alternative explanations for significant difference.
significance level and power
less extreme significance levels (such as p < .10 or p < .20) mean more power (higher probability of getting a statistically significant result)
regression line
line on a graph such as a scatter diagram showing the predicted value of the criterion variable for each value of the predictor variable; visual display of the linear prediction rule
line graphs interaction
lines not parallel
sampling frame
list of all ppl in pop for limited pop -must be workable sampling frame for random sampling.
survey instruments
lists questions & provides instructions/methods for answering them -questionnaire, interview schedule
max power for chi-square
lots of Ps, fewer categories in each direction (smaller df). Should be AT LEAST 5x as many individuals as cells; otherwise risk Type II error.
primary units= secondary units= (multi-stage)
macro-units (ex schools); micro units (ex teachers)
A factorial design tests 2 types of hypotheses
-main effects
-interaction effects
ecological validity
making accurate generalizations of lab findings to real-world settings
data transformation
mathematical procedure (such as taking the square root) used on each score in a sample, usually done to make the sample distribution closer to normal. Hypothesis testing only after transformations. Order of scores must stay same, do transformations to all scores/variables -square root -log -inverse
F distribution
mathematically defined curve that is the comparison distribution used in an analysis of variance Not symmetrical, has long tail on right F ratio's distribution cannot be lower than 0 and can rise quite high
chi-square distribution
mathematically defined curve used as the comparison distribution in chi-square tests; distribution of the chi-square statistic df=Ncategories-1 skewed to right: cannot be less than 0 but can have very high values.
To get larger F-scores (statistical significance)
maximize experimental variance and minimize error variance
best prediction of y in regression
mean yields the smallest sum of squared errors in the absence of a predictor
cell mean
mean of a particular combination of levels of the variables that divide the groups in a factorial design in analysis of variance
column means
means for all of the people in a column in a factorial design
t test for dependent means
hypothesis-testing procedure in which there are two scores for each person and the population variance is not known; it determines the significance of a hypothesis that is being tested using difference or change scores from a single group of people.
comparison dist for t test for dependent means
means of difference scores
clinical significance
means that the result is big enough to make a difference that matters in treating people
proportionate reduction in error
measure of association between variables that is used when comparing associations. Also called proportion of variance accounted for.
correlation coefficient (r) (also called Pearson correlation coefficient)
measure of degree of linear correlation between two variables ranging from -1 (a perfect negative linear correlation) through 0 (no correlation) to +1 (a perfect positive correlation).
fit index (structural equation modeling)
measure of how well correlations in sample correspond to correlations expected based on hypothesized pattern of causes & effects among these variables, usually ranges from 0 to 1 w 1=perfect. 0.9=good fit -RMSEA: widely used. Low values=good fit. -significant=theory does not fit.
partial correlation coefficient
measure of the degree of correlation between two variables, over and above the influence of one or more other variables
Homogeneous Population
members similar to one another. Small sample size will represent pop effectively
path analysis
method of analyzing correlations among a grp of variables in terms of a predicted pattern of causal relations; usually predicted pattern diagrammed as pattern of arrows from causes to effect. -path: arrow that shows cause/effect connections btwn variables -path coefficient: degree of relation associated w an arrow in a path analysis (same as standardized regression coefficient)
Scheffe test
method of figuring the significance of post hoc comparisons that takes into account all possible comparisons that could be made Scheffe=F/dfbetween
null in rank-order test
middle rank is same for the 2 pops
the biggest complaint today against significance tests has to do with...
misinterpretation
post hoc comparisons
multiple comparisons, not specified in advance; procedure conducted as part of an exploratory analysis after an analysis of variance
repeated measures design
multiple groups, but all scores are from the same group
most general technique out of all four types
multiple regression
most general to most specific tests
multiple regression --> bivariate regression / correlation / ANOVA (all equal) --> T tests
Bonferroni procedure
multiple-comparison procedure in which the total alpha percentage is divided among the set of comparisons so that each is tested at a more stringent significance level
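Dividing alpha among the comparisons can be sketched with made-up numbers:

```python
# Minimal sketch: Bonferroni correction divides total alpha among comparisons.
# alpha and the number of comparisons are hypothetical.
alpha = 0.05
n_comparisons = 5
per_test_alpha = alpha / n_comparisons
print(round(per_test_alpha, 4))  # 0.01 -> each test uses this stricter level
```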
sampling error
natural variability among means
general linear model
nearly the same as multiple regression model
Quasiexperimental designs
-non-equivalent control group design
-interrupted time-series design
regression
predicting values on one variable from another
Null & research hypothesis in chi-square test for independence
null: the 2 variables are independent (unrelated) research: Not independent
regression coefficient (b)
number multiplied by a person's score on a predictor variable as part of a linear prediction rule
between-groups variance estimate
numerator in factorial f-value Based on groupings being compared for particular main/interaction effect
rank-order variable
numeric variable in which the values are ranks, such as class standing or place finished in a race. Also called ordinal variable.
effect of restriction in range
often to drastically reduce the correlation compared to what it would be if people more generally were included in the study (presuming there would be a correlation among people more generally).
grand mean (GM)
overall mean of all the scores, regardless of what group they are in; when group sizes are equal, mean of the group means
illusory correlation
overestimate of strength of relationship btwn 2 variables -paired distinctiveness (2 unusual events/grps) -prejudices
cell
particular combo of categories for 2 variables in a contingency table
Mediational Analysis
particular type of ordinary path analysis that tests whether a presumed causal relationship btwn 2 variables is due to some particular intervening (mediating) variable. 4 steps:
-X -> Y
-X -> M
-M -> Y, with X included as a predictor
-X no longer predicts Y (or predicts it more weakly) once M is included
Sobel's test. Partial mediation = X still significant but weaker.
frequency distribution
pattern of frequencies over the various values; what a frequency table, histogram, or frequency polygon describes
multi-stage sample
pop of interest consists of subpops, & collection takes place via those subpops. -two-stage sample: only 1 subpop level -multi-stage sampling leads to dependent observations
null hypothesis for t test for independent means
population 1 = population 2
assumptions are about
populations, not samples
interaction effect
pretest may sensitize Ps to nature of the research; sensitization may interact w new info & change way Ps respond.
proportion of variance accounted for (R^2)
proportion of the total variation of scores from the grand mean that is accounted for by the variation between the means of the groups
R^2=((S^2between)(dfbetween))/((S^2between)(dfbetween)+(S^2within)(dfwithin))
R^2=((F)(dfbetween))/((F)(dfbetween)+dfwithin)
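The F-based version of the formula can be sketched with hypothetical ANOVA results:

```python
# Minimal sketch: R^2 from F and the degrees of freedom. Values hypothetical.
F, df_between, df_within = 4.5, 2, 27
R2 = (F * df_between) / (F * df_between + df_within)
print(round(R2, 2))  # 0.25
```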
variance commonality (factor analysis)
proportion of variance explained by set of factors common to other observed variables
unique variance (factor analysis)
proportion of variance unique to specific variables; not caused by common set of factors. -unique factors correlated: proportion of covariances due to unique factors
When predicting scores on a criterion variable from scores on one predictor variable, the standardized regression coefficient has the same value as what other statistic?
r
Cohen's conventions for correlation
r = 0.1 --> small, r = 0.3 --> medium, r = 0.5 --> large
experimental designs (specific types)
randomized, posttest-only, control group design randomized pretest-posttest, control group design (extra level of confidence for confounding) multilevel, completely randomized, between-subjects design
transformation for left skew
reflect (subtract from some high # so all reversed) scores
Type I error
rejecting the null hypothesis when in fact it is true; getting a statistically significant result when in fact the research hypothesis is not true.
curvilinear correlation
relation between two variables that shows up on a scatter diagram as dots following a systematic pattern that is not a straight line
linear correlation
relation between two variables that shows up on a scatter diagram as the dots roughly following a straight line.
partialing out
removing the influence of a variable from the association between other variables (holding constant)
if results are not stat sig and large sample size
research hypothesis probably false
repeated measures design
research strategy in which each person is tested more than once; same as within-subjects design
log transformation
researcher takes the logarithm of each score. Good for strongly right (positive) skewed; stronger version of square-root.
Mean of the Distribution of Differences Between Means
zero bc if null is true, two pops have the same mean
Standardized regression coefficient (beta)
shows the predicted amount of change, in standard deviation units of the criterion variable, if the value of the predictor variable increases by one standard deviation. Useful and practical way of making predictions on a criterion variable from scores on a predictor variable
interrater reliability
similarity of ratings by 2 or more raters of each P's behavior or spoken/written material.
ceiling effect
situation in which many scores pile up at the high end of a distribution (creating skewness to the left--negative) because it is not possible to have a higher score
floor effect
situation in which many scores pile up at the low end of a distribution (creating skewness to the right--positive) because it is not possible to have any lower score
restriction in range
situation in which you figure a correlation but only a limited range of the possible values on one of the variables is included in the group studied.
independence
situation of no relationship btwn 2 variables; term usually used regarding 2 nominal variables in a chi-square test for independence.
Cohen's conventions for R^2
small: .01 medium: .06 large: .14
The variance of a sample will generally be ______ than the variance of the population from which it is taken
smaller
do you want a larger or smaller CI?
smaller, bc then you're more precise about what the true mean is
hierarchical linear modeling
sophisticated type of multilevel modeling that handles a research situation in which ppl grouped in some way that could affect pattern of scores
structural equation modeling
sophisticated version of path analysis that includes paths w latent, unmeasured, theoretical variables & permits a kind of significance test and provides measures of overall fit of data to hypothesized causal modeling
normal curve
specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it
t value
square root of F (when there are two groups, so dfbetween = 1); the t cutoff is the square root of the F cutoff
To compare correlations with each other, most researchers
square the correlation coefficient (r)
standard deviation of the distribution of means is also
standard error
a larger effect size means greater
statistical significance (for a given sample size, a larger effect is more likely to be statistically significant)
Chi-square statistic (x^2)
statistic that reflects the overall lack of fit between the expected and observed frequencies; sum, over all the categories or cells, of the squared difference between observed and expected frequencies divided by the expected frequency: X^2 = sum of [(O-E)^2/E]
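The sum-over-cells formula can be sketched with hypothetical observed and expected frequencies:

```python
# Minimal sketch: chi-square = sum over cells of (O - E)^2 / E.
# Observed and expected counts are hypothetical.
observed = [20, 10, 10, 10]
expected = [18.0, 12.0, 12.0, 8.0]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 3))  # 1.389
```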
computer-intensive methods
statistical methods, including hypothesis-testing procedures, involving large numbers of repeated computations -flexible (use when no test exists) -new
factor analysis
statistical procedure applied in situations where many variables measured & that identifies grps of variables that tend to be correlated w each other and not w other variables -factor: grp of variables that tend to correlate w each other and not w other variables -factor loading: correlation of a variable w a factor. Ranges from -1 to 1. +/- 0.3 -> contributes meaningfully to factor.
probe
statistically testing the differences between the group means when there are three or more groups to compare
slope
steepness of the angle of a regression line in a graph of the relation of scores on a predictor variable and predicted scores on a criterion variable; number of units the line goes up for every unit it goes across.
SStotal (structural model)
sum of squared deviations of each score from the overall mean of all scores, completely ignoring the group a score is in. SStotal=SSbetween+SSwithin: sum(X-GM)^2 = sum(M-GM)^2 + sum(X-M)^2
SSbetween
sum of squared deviations of each score's group mean from the grand mean: SSbetween = sum(M-GM)^2; S^2between or MSbetween = SSbetween/dfbetween
to evaluate how good a prediction rule is, we use
sum of squared errors
sum of the squared errors
sum of the squared differences between each predicted score and actual score on the criterion variable.
SSError (sum of the squared errors)
sum of the squared differences between each score on the criterion variable and its predicted score.
F table (ANOVA)
table of cutoff scores on the F distribution
chi square table
table of cutoff scores on the chi-square distribution for various degrees of freedom and significance levels
Repeated measures (within-subjects) factorial ANOVA
takes into account correlated nature of grps. Should be avoided when sequence effects strong.
interview schedule
telephone/in-person interviews in which researcher reads questions to respondent & records answers
causal modeling
test whether pattern of correlations among variables in a sample fits w some specific theory about which variables are causing which. -path analysis -limitations: Correlation does not demonstrate causation, take only linear relationships directly into account, results distorted (smaller path coefficients) if restriction in range
Single-subject systematic replication
testing for generalization of a procedure to other conditions, persons, and target behaviors
null hypothesis in t test for dependent means
that there is NO DIFFERENCE between the two groups of scores (this means that the mean of the population of difference scores is 0)
the greater the confidence is
the broader is the confidence interval.
Reliability
the degree of consistency/stability of a measure -test-retest -split-half -interrater -almost always discussed when talking about creation of a new measure
multiple baseline design
the effects of treatment demonstrated on different behaviors successively -across behaviors: diff behaviors of same individual -across individuals: same behavior in diff individuals -across settings & time: Treatment for one individual in diff settings/time
Spearman's rho
the equivalent of a correlation coefficient for rank-ordered scores. can be used in certain situations when the scatter diagram suggests a curvilinear relationship between two variables. can be used in certain situations to figure a correlation when the original scores are not based on true equal-interval measurement. less affected by outliers than the regular correlation coefficient
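Spearman's rho is just the Pearson r computed on ranks; a sketch with hypothetical scores (assumes no ties):

```python
# Minimal sketch: Spearman's rho = Pearson r on the ranked scores.
# Assumes no tied scores; data are hypothetical.
x = [10, 20, 30, 40]
y = [1, 3, 2, 4]

def ranks(xs):
    order = sorted(xs)
    return [order.index(v) + 1 for v in xs]

def pearson_r(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = sum((ai - ma) ** 2 for ai in a) ** 0.5
    sb = sum((bi - mb) ** 2 for bi in b) ** 0.5
    return cov / (sa * sb)

rho = pearson_r(ranks(x), ranks(y))
print(round(rho, 2))  # 0.8
```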
The degrees of freedom for the t test for a correlation
the number of people in the sample minus 2 (N-2)
least squared errors principle
the principle that the best prediction rule is the rule that gives the lowest sum of squared errors between the actual scores on the criterion variable and the predicted scores on the criterion variable
Attenuation
the reduction in a correlation due to unreliability of measures
cross-product of Z scores
the result of multiplying a person's Z score on one variable by the person's Z score on another variable very useful for finding correlation bc we get a large positive number if there is a positive linear correlation, a large negative number if there is a negative linear correlation, and a number near 0 if there is no linear correlation.
hypothesis testing for a regression coefficent
the same as for a correlation: is it statistically significantly different from zero?
standard deviation
the square root of the average of the squared deviations from the mean. Tells you approximately the average amount that scores differ from the mean
standard error of the mean (standard error)
the standard deviation of the distribution we would get if we took every possible sample of a given size from a pop & graphed the means of those samples.
standard error is
the standard deviation of the sampling distribution of the sample mean
a contrast is considered to be proper if
the sum of its codes is zero
sets of contrast codes are considered to be orthogonal if
the sum of the cross products of the contrast codes also adds up to zero - multiply the contrast codes from each group, add them up, see if they're 0
extraneous variables
those between-group variables, other than the IV, that have effects on groups as a whole, possibly confounding results
standardized effect size
to divide the raw score effect size for each study by its respective population standard deviation.
T test for Independent Means
two separate groups of people are tested and the population variance is not known
contingency table
two-dimensional chart showing frequencies in each combination of categories of two nominal variables 2x2/3x2 esp common
levels of measurement
types of underlying numerical information provided by a measure, such as equal-interval, rank-order, and nominal (categorical)
in a t test for a single sample the population variance is
unknown
estimation approach
uses a confidence interval: SAMPLE mean +/- (z score)(standard deviation of the distribution of means). If the confidence interval does not include the mean of the null hypothesis distribution, the result is statistically significant
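The CI formula can be sketched with a hypothetical sample mean and standard error:

```python
# Minimal sketch: 95% CI = sample mean +/- (z)(SD of the distribution of means).
# Numbers are hypothetical.
sample_mean = 100.0
sd_of_means = 2.0          # standard error
z = 1.96                   # z cutoff for 95% confidence
lower = sample_mean - z * sd_of_means
upper = sample_mean + z * sd_of_means
print(round(lower, 2), round(upper, 2))  # 96.08 103.92
```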
How do researchers carry out the factorial ANOVA
using structural model approach
the null hypothesis in hypothesis testing for a correlation is
usually that in the population the true relation between the two variables is no correlation
contrast codes
values associated with groups of data that we want to compare if we want to make more comparisons, we use more than one set of contrast codes
continuous variable
variable for which, in theory, there are an infinite number of values between any two values
equal-interval variable
variable in which the numbers stand for approximately equal amounts of what is being measured
discrete variable
variable that has specific values and that cannot have values between these specific values.
numeric variable
variable whose values are numbers (as opposed to a nominal variable).
nominal variable
variable with values that are categories (that is, they are names rather than numbers). Also called categorical variable
Structural Model
way of understanding the analysis of variance as a division of the deviation of each score from the overall mean into two parts: the variation in groups (its deviation from its group's mean) and the variation between groups (its group's mean's deviation from the overall mean); an alternative (but mathematically equivalent) way of understanding the analysis of variance
results of Levene's test
we'd like to see high p values, indicating that the estimated variances of the two populations are pretty similar. A low probability suggests that they're pretty different, that they're NOT homogeneous.
instrumentation
when ppl initiate new approaches/programs, might also be changes in recording procedures. -confounding must be ruled out in interrupted time-series design
Interaction effect
when the effect of one independent variable differs depending on the level of a second independent variable; a combined effect over and above (or different from) the sum of the separate effects of the variables. Most important issue in factorial research
unreliability of measurement
when the procedures used to measure a particular variable are not perfectly accurate. The effect is to make the correlation smaller than it would be if perfectly accurate measures were used (presuming there would be a correlation if perfectly accurate measures were used).
randomize
whenever possible; the most basic and single most important control procedure
error variance
within-groups variance due to chance factors and individual factors
criterion variable
in prediction, the variable that is predicted (Y)
if results are stat sig and large sample size
you must consider the effect size directly, because it is quite possible that the effect size is too small to be useful.