PSY 110 exam 3
critical significance guidelines
t < -2 or t > 2 (that is, |t| > 2), and F > 4
An independent-measures experiment with three treatment conditions has a sample of n = 10 scores in each treatment. If all three treatments have the same total, T1 = T2 = T3, what is SSbetween?
0
If the null hypothesis is true, what value is expected on average for the repeated-measures t statistic?
0
A chi-square test for goodness of fit has df = 2. How many categories were used to classify the individuals in the sample?
3
pooled variance
= (SS1 + SS2) / ((n1 - 1) + (n2 - 1)), that is, the combined SS divided by the combined df
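A quick way to check the pooled variance formula is a minimal Python sketch like the one below. The function name and the example SS and n values are made up for illustration.

```python
def pooled_variance(ss1, n1, ss2, n2):
    """Pooled variance: combined SS divided by combined degrees of freedom."""
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))

# Hypothetical samples: SS1 = 50 with n1 = 6, SS2 = 30 with n2 = 4
print(pooled_variance(50, 6, 30, 4))  # 80 / 8 = 10.0
```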
If a repeated-measures study shows a significant difference between two treatments with α = .01, then what can you conclude about measures of effect size?
A significant effect does not necessarily mean that the effect size will be large.
For which of the following situations would a repeated-measures research design be appropriate?
Comparing pain tolerance with and without acupuncture needles
Under what circumstances is the phi-coefficient used?
When both X and Y are dichotomous variables
Assuming that other factors are held constant, which of the following would tend to increase the likelihood of rejecting the null hypothesis?
Increase the sample mean difference
What is indicated by a positive value for a correlation?
Increases in X tend to be accompanied by increases in Y
In general, if the variance of the difference scores increases, what will happen to the value of the t statistic?
It will decrease (move toward 0 at the center of the distribution).
In a hypothesis test using a t statistic, what is the influence of a large sample variance?
Larger variance tends to lower the likelihood of rejecting the null hypothesis.
on average, what value is expected for the F-ratio if the null hypothesis is false?
Much greater than 1.00
In a repeated-measures experiment, each individual participates in one treatment condition and then moves on to a second treatment condition. One of the major concerns in this type of study is that participation in the first treatment may influence the participant's score in the second treatment. What is this problem called?
Order effects
within studies order effects
progressive effects: participants are altered by the sequence of conditions they encounter. carryover effects: participants' responses in one condition are affected by the prior condition. these can be solved by counterbalancing, which uses all possible sequences at least once. in partial counterbalancing, a subset of the total sequences is used. reverse counterbalancing presents conditions in one order and then again in reverse. block randomization has every condition occur before any condition is repeated
sum of products (SP)
SP = sum of (x - meanx)(y - meany). similar to the sum of squared deviations (SS); measures the amount of covariability between two variables in a correlation
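The SP definition can be sketched in a few lines of Python. The function name and the x and y scores are invented for illustration.

```python
def sum_of_products(xs, ys):
    """SP = sum of (x - mean of x) * (y - mean of y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Hypothetical scores: mean of x is 3, mean of y is 4
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(sum_of_products(x, y))  # products of deviations are 4, 0, 0, 0, 2, so SP = 6.0
```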
SStotal - SSwithin =
SSbetween
For a group of graduating college seniors, a researcher records each student's rank in his/her high school graduating class and the student's rank in the college graduating class. Which correlation should be used to measure the relationship between these two variables?
Spearman correlation
What happens to the critical value for a chi-square test if the size of the sample is increased?
The critical value depends on the number of categories, not the sample size.
Which of the following describes the effect of increasing sample size?
There is little or no effect on measures of effect size, but the likelihood of rejecting the null hypothesis increases.
As sample variance increases, what happens to measures of effect size such as r2 and Cohen's d?
They tend to decrease
When comparing more than two treatment means, why should you use an analysis of variance instead of using several t tests?
Using several t tests increases the risk of a Type I error.
If there is a positive correlation between X and Y, then the regression equation Y = bX + a will have
a b > 0
t distribution
a family of distributions, one for each value of degrees of freedom
Which set of sample characteristics is most likely to produce a large value for the estimated standard error?
a small sample size and a large sample variance
factorial design
a study that combines two or more factors (ex: how gender influences movie preferences)
confidence intervals
an alternative technique for describing effect size. based on the reasonable assumption that M should be near m, it estimates m from the sample M: m = M +/- t(SEM). a wider interval means you are less likely to make an error (it is more likely to contain m). if the interval for a mean difference does not include 0, then the result is significant and the null can be rejected. more confidence desired increases interval width. a larger sample equals a smaller interval
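The interval m = M +/- t(SEM) can be sketched as follows. This is a minimal illustration, assuming the critical t is looked up by hand in a t table and passed in; the function name is hypothetical. The sample values (n = 10, M = 58, s2 = 160) are borrowed from another card in this set, and 2.262 is the two-tailed table value for df = 9 at 95% confidence.

```python
import math

def confidence_interval(sample_mean, sample_variance, n, t_critical):
    """Return (low, high) for M +/- t * SEM, where SEM = sqrt(s2 / n)."""
    sem = math.sqrt(sample_variance / n)
    margin = t_critical * sem
    return sample_mean - margin, sample_mean + margin

# SEM = sqrt(160 / 10) = 4, so the margin is 2.262 * 4 = 9.048
low, high = confidence_interval(58, 160, 10, 2.262)
print(round(low, 3), round(high, 3))  # 48.952 67.048
```

Passing a larger critical t (more confidence) widens the interval, and a larger n shrinks the SEM and narrows it.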
ANOVA
an analysis of variance. good for when you have more than two groups (in your categorical independent variable) that you're comparing on one continuous outcome or DV, or when you have more than one factor that you are considering (whether there's an interaction between one factor like age and another like GPA). used to evaluate mean differences between two or more treatments, and uses sample data as the basis for drawing general conclusions about populations
chi square test
appropriate when you have nominal independent variables but only group membership (frequencies) as your raw data. test assumptions: each observed frequency is generated by a different individual (independence of observations), and the test should not be performed when the expected frequency of any cell is less than 5 (size of expected frequencies).
nonparametric tests
are needed when the research situation does not conform to the requirements of parametric tests (normal distribution in population, homogeneity of variance in population, and numerical score for each individual). chi square and other nonparametric tests do not state hypotheses in terms of a specific population parameter. participants are usually classified into categories, nominal or ordinal scales are used and the data from these tests are frequencies
For an ANOVA comparing three treatment conditions, what is stated by the alternative hypothesis (H1)?
at least one of the three population means is different from another mean
between treatments variance vs within treatments variance
between-treatments variance measures differences caused by systematic treatment effects plus random factors; within-treatments variance measures only random factors
special applications of chi square
chi square and pearson correlation both evaluate relationships between two variables. the type of data obtained determines which one is appropriate. chi square is used instead of a t test or anova when counts rather than means of categories are being compared. chi square can evaluate significance, but parametric tests measure strength and effect with greater precision
for an independent-measures research study, the value of Cohen's d or r2 helps to describe
how much difference there is between the two treatments
for an independent measures test, the value of cohens d or r squared helps describe
how much difference there is between two treatments
chi square test for independence
chi square can test for the existence of a relationship between two variables. each individual is classified on each variable, counts are presented in the cells of a matrix, and research may be experimental or non-experimental. frequency data from a sample are used to evaluate the relationship of the two variables in the population. h0: the two variables are independent. single-population version: there is no relationship between the two variables in this population. two-population version: there is no difference between the distributions of a variable in the two populations (defined by a nominal variable). variables are independent when there is no consistent, predictable relationship between them. frequencies in the sample = fO, and frequencies predicted by the null hypothesis of the same proportions in each category (population) = fE
A researcher is using a chi-square test for independence to evaluate the relationship between birth-order position and self esteem. Each individual is classified as being 1st born, 2nd born, or 3rd born, and self-esteem is categorized as either high or low. For this study, what is the df value for the chi-square statistic?
chi square df = (r - 1)(c - 1) = (3 - 1)(2 - 1) = 2
measuring effect size of chi square
a chi square hypothesis test indicates that a difference did not occur by chance, but does not indicate effect size. for a 2x2 matrix, the phi coefficient φ measures the strength of the relationship: φ = √(χ²/n). for a larger matrix, use cramer's v: V = √(χ²/(n × df*)), where df* is the smaller of (R-1) or (C-1)
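Both effect-size measures are one-line calculations once chi square is known. The sketch below is a rough illustration; the function names and the chi-square, n, and table-size values are hypothetical.

```python
import math

def phi_coefficient(chi2, n):
    """Phi for a 2x2 table: sqrt(chi-square / n)."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi-square / (n * df*)), df* = smaller of (R - 1), (C - 1)."""
    df_star = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * df_star))

# Hypothetical 2x2 result: chi-square = 8 with n = 200, so phi = sqrt(0.04)
print(round(phi_coefficient(8, 200), 3))   # 0.2
# Hypothetical 3x4 result: chi-square = 18 with n = 100, df* = 2, so V = sqrt(0.09)
print(round(cramers_v(18, 100, 3, 4), 3))  # 0.3
```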
Which of the following best describes the possible values for a chi-square statistic?
chi square is always positive but can contain fractions or decimal values
cohort sequential design
combines cross sectional and longitudinal, ex: study one group at age 8 and 10, and another at age 10 and 12
correlations and causation
correlations do not provide proof of causation, that requires an experiment in which one variable is manipulated and others carefully controlled
repeated measures
data from the same or related participant groups, also called within subjects. this is what longitudinal studies are
independent measures design
data from two completely different, independent participant groups, also called between subjects design. this is what a cross sectional study is
standard error
describes how much difference is reasonable to expect between M and m, = SD/square root of n
percentage of variance
measures the amount of variability in scores determined by the treatment effect (r2). .01 is small, .09 is medium, .25 is large
computing expected frequencies
fE = (fC)(fR)/n. fC is frequency total for column and fR is frequency total for row
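The expected-frequency rule can be applied to a whole table at once. This is a minimal sketch; the function name and the observed counts are invented for illustration.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[f_r * f_c / n for f_c in col_totals] for f_r in row_totals]

# Hypothetical 2x2 table: row totals 30 and 70, column totals 40 and 60, n = 100
observed = [[10, 20],
            [30, 40]]
print(expected_frequencies(observed))  # [[12.0, 18.0], [28.0, 42.0]]
```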
if p is greater than .05
fail to reject null hypothesis
regression
finds the equation describing the best fitting line for a set of data (the line that best fits the actual data by minimizing prediction errors). makes the relationship indicated by the pearson correlation obvious by putting a line through the scatterplot data. this makes the relationship easier to see, shows the central tendency of the relationship, and can be used for prediction. Ŷ (y-hat) is the value of y predicted by the regression line for each value of x. y - Ŷ is the distance each data point is from the line (error of prediction). when calculating the predicted y (Ŷ) for a provided x when you have a and b, use the equation for the line. the regression procedure produces the line that minimizes the total squared error of prediction (the method is called the least squared error solution)
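A least-squares line can be computed directly from SP and SS. The sketch below assumes the standard textbook solution, b = SP/SSx and a = (mean of y) - b(mean of x), which this card does not spell out; the function name and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for predicted y = b * x + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x            # slope
    a = mean_y - b * mean_x  # intercept
    return b, a

# Hypothetical data: SP = 6 and SSx = 10, so b = 0.6 and a = 4 - 0.6 * 3 = 2.2
b, a = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b, 2), round(a, 2))  # 0.6 2.2
print(round(b * 3 + a, 2))       # predicted y for x = 3 is 4.0
```

The regression line always passes through the point (mean of x, mean of y), which is why the prediction for x = 3 equals the mean of y here.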
Ten years ago, only 20% of the U.S. population consisted of people more than 65 years old. A researcher plans to use a sample of n = 200 people to determine whether the population distribution has changed during the past ten years. If a chi-square test is used to evaluate the data, what is the expected frequency for the older-than-65 category?
for expected frequency, multiply n (200) with provided percentage (20 % or .2). 200 x .2 = 40
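The same multiplication extends to a full goodness-of-fit calculation. In this sketch the 20%/80% proportions and n = 200 come from the question above, but the observed counts (52 and 148) and the function name are invented for illustration.

```python
def goodness_of_fit(observed, proportions):
    """Return expected frequencies, chi-square, and df for a goodness-of-fit test."""
    n = sum(observed)
    expected = [p * n for p in proportions]
    chi2 = sum((f_o - f_e) ** 2 / f_e for f_o, f_e in zip(observed, expected))
    df = len(observed) - 1
    return expected, chi2, df

# Hypothetical observed counts: 52 older than 65, 148 not, out of n = 200
expected, chi2, df = goodness_of_fit([52, 148], [0.2, 0.8])
print([round(f_e, 1) for f_e in expected])  # [40.0, 160.0]
print(round(chi2, 2), df)                   # 144/40 + 144/160 = 4.5, df = 1
```

With df = 1 the table critical value at alpha = .05 is 3.84, so these made-up counts would lead to rejecting the null hypothesis.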
statistical hypotheses for anova
h0: m1 = m2 = m3, h0 can be wrong if all means are different from each other or if only some means differ from each other while others are the same
hypotheses for repeated measures
h0: mean difference = 0 h1: mean difference does not equal 0
parametric tests
hypothesis tests test hypotheses about population parameters. parametric tests share several assumptions (normal distribution in population, homogeneity of variance in population, and numerical score for each individual).
determining significance with variability
if between variability is low and within variability is high, then the difference is not significant
f ratio
if h0 is true, the treatment effect is zero or negligible and f is near 1 (variance between is similar to variance within). if h1 is true, the treatment effect is large and f is noticeably bigger than 1
equal variances
if p is less than .05 then the null hypothesis (that the variances are equal) is rejected and equal variances are not assumed
calculation of expected
if the expected frequencies are based purely on chance (random model), then you can calculate them based on how many categories you have in your analysis
chi squared distribution
includes the chi square values (all are 0 or greater) for all possible random samples when h0 is true. the null hypothesis should be rejected if the discrepancy between observed (fO) and expected (fE) values is large (aka if x2 is large). the distribution is positively skewed. it is a family of distributions (determined by df = C-1, where C is the number of categories), with a slightly different shape for each df value.
independent vs dependent variables
in t tests and anova, the independent variable is categorical and the dependent variable is continuous
levels
individual conditions or values that make up a factor
data for goodness of fit
individuals in each category are counted, observed frequencies in each category are measured, and each individual is counted in only one category. compares the observed frequencies (fO) of the data with the assumptions of the null hypothesis. fO is just frequency counts and can't be fractional. construct expected frequencies (fE = frequency value that is predicted from h0 and sample size) that are in perfect agreement with the null hypothesis. chi squared = x2
when the n is small, the t distribution
is flatter and more spread out than the normal z distribution
Which of the following accurately describes the chi-square test for goodness of fit?
it uses one sample to test a hypothesis about one population
The Pearson and the Spearman correlations are both computed for the same set of data. If the Spearman correlation is rS = +1.00, what can you conclude about the Pearson correlation?
it will be positive
In the observed frequencies for a chi-square test for independence, how often is each participant counted?
just once
An analysis of variances produces dfbetween = 3 and dfwithin = 24. If each treatment has the same number of participants, then how many participants are in each treatment?
k = groups of participants. dfbetween = 3 = k - 1 therefore k = 4. dfwithin = 24 = N - k. therefore N = 28. total participants/groups of participants = participants in each treatment = 28/4 = 7
between treatments degrees of freedom
k-1
what combo of mean and variance leads to a decision that there is a significant treatment effect?
large mean difference and small sample variance
which combination of factors is most likely to produce a large value for f ratio?
large mean differences and small sample variances
in general, what factors are most likely to reject the null hypothesis for an ANOVA?
large mean differences and small variances
factors that influence hypothesis test outcome
a larger sample mean difference increases t, as does a larger sample size; larger variance and standard deviation decrease t
null hypothesis for independent measures
m1 - m2 = 0
alternative hypothesis for independent measures
m1 - m2 does not equal 0
correlations
measures and describes the relationship between two variables. good for when you have one group of people and more than one continuous measure (DV) that may be related, and when you want to know whether you can predict one continuous (not categorical; examples of continuous: height, age, GPA) variable from another. can vary in direction (positive or negative), form (linear is most common), and strength (magnitude varies from 0 to 1). a correlation may seem to imply causation, but the relationship can be due to a third variable
estimated cohens d
measures effect size, computed using the sample standard deviation. 0.2 is small, 0.5 is medium, and 0.8 is large
point-biserial correlation
measures the relationship between 2 variables when one variable has only 2 values (called dichotomous or binomial). point-biserial r2 has the same value as r2 computed from the t statistic (measures effect size). useful when trying to figure out whether performance on a single question (right or wrong) correlates with the overall score. the dichotomous variable is coded as 0 and 1 (when both x and y are dichotomous and coded 0 and 1, the result is the phi coefficient). the regular pearson formula is used to calculate r. r2 measures effect size (proportion of variability in one score predicted by the other)
correlation coefficient
measures the degree of a relationship; its magnitude ranges from 0 to 1 and its sign gives the direction (so r runs from -1.00 to +1.00). r is not a proportion, but the squared correlation may be interpreted as the proportion of shared variability (called the coefficient of determination; represents the proportion of variability in y that can be predicted from x). the value is affected by the range of scores in the data: a severely restricted range may produce a different correlation than a broader range of scores
partial correlation
measures the relationship between two variables while mathematically controlling the influence of a third variable by holding it constant. rxy•z = (rxy - (rxz)(ryz)) / √((1 - r²xz)(1 - r²yz))
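The standard partial correlation formula, (rxy - rxz·ryz) / √((1 - r²xz)(1 - r²yz)), is easy to check numerically. The function name and the three correlation values below are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical case: all three pairwise correlations are .50
# (0.50 - 0.25) / sqrt(0.75 * 0.75) = 0.25 / 0.75
print(round(partial_correlation(0.5, 0.5, 0.5), 3))  # 0.333
```

If z is unrelated to both x and y (rxz = ryz = 0), the partial correlation is just rxy.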
z scores form a normal distribution if
n is greater than 30 or the original distribution is approximately normally distributed
degrees of freedom
n-1. influences the shape of the distribution for small samples but not large ones. lower df means a larger critical value. for one-sample (and paired-sample) tests df is n-1; for multiple groups (independent samples) df is n-1 for each group
independent samples t-tests involve two groups so the df is
n-2
within-treatments degrees of freedom
N-k (N is the total number of scores, k is the number of treatment conditions)
Assuming that there is a 5-point difference between the two sample means, which set of sample characteristics is most likely to produce a significant value for the independent-measures t statistic?
n1 = n2 = 100 and small sample variances
non directional vs one directional test
non directional is most commonly used, but sometimes a one directional test is. in a one directional test, the critical region is defined in just one tail of the distribution; a one tailed test makes it easier to reject the null
sample mean difference for more than two samples
not possible to calculate; instead use the f ratio (mean square between, the variance between treatments, divided by mean square within, the variance within treatments). mean square between is the numerator and measures how far the sample means are from the grand mean (SS between divided by df between). mean square within measures the differences between individual scores and their own sample mean (SS within divided by df within)
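The F-ratio described here can be built up step by step from SS and df. This is a minimal sketch for equal or unequal group sizes; the function name and the three groups of scores are made up for illustration.

```python
def f_ratio(groups):
    """One-way ANOVA F = MS between / MS within."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    # SS between: how far each sample mean is from the grand mean, weighted by n
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # SS within: how far each score is from its own sample mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1           # k - 1
    df_within = len(scores) - len(groups)  # N - k
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical treatments with means 2, 3, 7 (grand mean 4):
# SS between = 42 (df 2), SS within = 6 (df 6), so F = 21 / 1
print(f_ratio([[1, 2, 3], [2, 3, 4], [6, 7, 8]]))  # 21.0
```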
n1
number of scores in each treatment
k
number of treatment conditions
ANOVA assumptions
observations within each sample must be independent, population from which sample is selected must be normal and have equal variance
scheffe test
one of the safest possible post hoc tests; uses an f ratio to evaluate the significance of the difference between two treatment conditions. F(A vs B) = MSbetween/MSwithin, with SSbetween calculated from only the two groups being compared. another popular post hoc test is Tukey's honestly significant difference, or HSD
what is needed for each test
one sample: must know the test value. independent measures: must be sure the groups are equivalent, using random sampling or random assignment. repeated measures: must test the same group twice and control for learning and order effects
directional hypotheses and one tailed test
a one tailed test is only used when predicting a specific direction for the treatment effect is justified
correlations and outliers
outlier is an extremely deviant individual. produce disproportionately large impact on the correlation coefficient
pearson correlation
r = covariability of x and y / variability of x and y separately, AND r = SP/√((SSx)(SSy)), AND sample: r = sum of (zx)(zy) / (n - 1) OR population: r = sum of (zx)(zy) / N. measures the degree and direction of the linear relationship between two variables. in a perfect linear relationship every change in x corresponds with a change in y, and r will be +1.00 or -1.00. usually computed for sample data, but used to test hypotheses about the relationship in the population. developed for data having linear relationships, with data from interval or ratio measurement. alternative correlations are used for data having non linear relationships and for data from nominal or ordinal measurement scales
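The SP formula and the z-score formula for a sample give the same r, which a short sketch can confirm. The function names and data are hypothetical; `statistics.stdev` is the sample standard deviation (n - 1 in the denominator), matching the sample z-score version.

```python
import math
import statistics

def pearson_r_from_sp(xs, ys):
    """r = SP / sqrt(SSx * SSy)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def pearson_r_from_z(xs, ys):
    """Sample version: r = sum of (zx * zy) / (n - 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    z_products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Hypothetical data: SP = 6, SSx = 10, SSy = 6, so r = 6 / sqrt(60)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r_from_sp(x, y), 4))  # 0.7746
print(round(pearson_r_from_z(x, y), 4))   # 0.7746
```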
paired samples t test
repeated measures. good for comparing scores (on a continuous DV) in one group of participants at two different time points, usually before and after treatment
repeated measures vs independent measures
repeated pros: requires fewer subjects, able to study change over time, reduces the influence of individual differences, and has substantially less variability in scores. cons: factors besides treatment may cause a change in score, order effects (participation in the first treatment may influence the score in the second)
homogeneity assumption
requires equal population variances for an independent measures test; sample variances just need to be similar
chi square statistic for test of independence
same equation as chi square for goodness of fit: χ² = Σ (fO - fE)² / fE. degrees of freedom = (R - 1)(C - 1), where R is the number of rows and C the number of columns
effect size
should be determined if the null hypothesis is rejected. cohen's d = mean difference/square root of pooled variance. .2 is small, .5 is medium, and .8 is large
critical region for chi square test
the significance level is determined, then the critical value for chi square is located in a table of critical values according to df and the significance level chosen. if the chi square is higher than the critical value, reject the null
post hoc tests
significant f ratio means at least one difference in means is statistically significant but does not indicate which means are different. post hoc tests help determine exactly which means are different. they compare two means at a time (pairwise comparison). each comparison includes a risk for type 1 error that accumulates and is called the experimentwise alpha level. these posttests use special methods to try to control alpha level (only used when anova is significant)
t test for repeated measures
similar in structure to the other t tests, but compares difference scores instead of raw scores
testing regression significance
similar to analysis of variance, uses f ratio of two mean square values. each MS is SS divided by its df. H0 = the slope of the regression line (b or beta) is zero (a flat line). h1: the line slopes to the left or the right. MSregression = SS regression/df regression. MSresidual = SS residual/df residual. F = MSregression/MSresidual
matched subjects design
similar to repeated measures, but uses two separate samples in which each individual in one sample is matched one to one with an individual in the other sample. the tests are very similar, but a matched design has twice as many participants
One sample has a variance of s2 = 20 and a second sample has a variance of s2 = 30. The pooled variance for these two samples will be
somewhere between 20 and 30
standard error of the difference
square root of (variance1 divided by n1 plus variance2 divided by n2), OR square root of (pooled variance divided by n1 plus pooled variance divided by n2) (better if sample sizes are different)
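Both versions of the standard error are the same calculation with different variances plugged in. The function name is hypothetical; the example reuses the pooled variance of 120 with n = 5 and n = 10 from another card in this set.

```python
import math

def standard_error_of_difference(var1, n1, var2, n2):
    """sqrt(var1 / n1 + var2 / n2); pass the pooled variance twice for the pooled version."""
    return math.sqrt(var1 / n1 + var2 / n2)

# Pooled variance of 120 with n1 = 5 and n2 = 10: sqrt(24 + 12) = sqrt(36)
pooled = 120
print(standard_error_of_difference(pooled, 5, pooled, 10))  # 6.0
```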
for two independent samples, either
t or f can be used
t tests and anova
t tests are just a special case of anova where only two conditions are being compared. that is why t squared = F when there are two conditions
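The t squared = F relationship can be checked numerically with two small groups. This is a rough sketch using the pooled-variance t statistic and a one-way ANOVA; the function names and scores are invented for illustration.

```python
import math

def independent_t(g1, g2):
    """Independent-measures t using the pooled variance."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((x - m1) ** 2 for x in g1)
    ss2 = sum((x - m2) ** 2 for x in g2)
    pooled = (ss1 + ss2) / ((n1 - 1) + (n2 - 1))
    se = math.sqrt(pooled / n1 + pooled / n2)
    return (m1 - m2) / se

def anova_f(groups):
    """One-way ANOVA F = MS between / MS within."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical groups with means 2 and 4
a, b = [1, 2, 3], [3, 4, 5]
t = independent_t(a, b)
print(round(t ** 2, 4), round(anova_f([a, b]), 4))  # 6.0 6.0
```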
factor
the independent/quasi independent variable that designates the groups being compared
advantage of anova
the more t tests you run, the higher chance that the significance is due to chance. by evaluating them all simultaneously anova eliminates this issue
a basic assumption for a chi square hypothesis test is
the observations must be independent
What is indicated by a large value for the chi-square statistic?
the sample data (observed values) do not match the hypothesis
In an independent-measures hypothesis test, if t = 0, then
the two means must be equal
A sample of n = 10 scores has M = 58, s2 = 160, and an estimated standard error of 4 points. Which of these values will probably decrease if the sample size is increased to n = 100?
the value of the standard error
critical values
to find the critical value of t, look up the alpha level (a) and df in the t table. a computed t is significant when it is more extreme than that critical value
N
total number of scores
if r = 0.58, the linear regression equation predicts about 1/3 of the variance in y scores
true, when r = .58, r2 = .336 (roughly 1/3)
nonparametric equivalents of popular parametric tests
typically used for ordinal data where rank orders might be different between groups, similar to spearman's correlation. paired t test -> wilcoxon, independent t test -> mann-whitney U, one way anova -> kruskal-wallis
for independent measures = between subjects
use independent samples test (two groups, df=n-2)
for repeated measures = within subjects
use paired samples/dependent samples/pretest-posttest (one group, df=n-1)
spearman correlation
used for ordinal scales; also used for interval or ratio data that do not have a linear relationship (ex: curved or with an asymptote). if scores are tied, they still need to be ranked. to assign ranks: list scores from lowest to highest, assign a rank to each position on the list, compute the mean of the ranked positions for 2 (or more) tied scores, and assign that mean as the rank of each tied score.
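The tie-handling steps above can be sketched as a small ranking function. The function name and scores are hypothetical; the ranks it returns could then be fed into the regular Pearson formula to get the Spearman correlation.

```python
def rank_with_ties(scores):
    """Rank from lowest to highest; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Find the end of the run of scores tied with the one at this position
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # positions are 1-based
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks

# Hypothetical scores: the two 1s occupy positions 1 and 2, so each gets rank 1.5
print(rank_with_ties([3, 1, 4, 1, 5]))  # [3.0, 1.5, 4.0, 1.5, 5.0]
```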
one sample t test
useful when you have data about a continuous measure and you want to see whether the sample as a whole differs from one specific value, or when you are trying to see if a continuous set of data differs from chance. the t statistic is an approximate z. used to test hypotheses about an unknown population mean when the population standard deviation is also unknown. is significant if t is large enough and p is small enough.
An independent-measures study uses
a different sample for each of the different treatment conditions being compared
chi square for goodness of fit
uses sample data to test hypotheses about the shape or proportions of a population distribution. tests how well the proportions obtained in the sample fit the hypothesized proportions of the population. h0 specifies the proportion of the population in each category: either no preference among categories, or no difference from the proportions in another known population
hypothesis testing with z scores
uses the sample mean to estimate and approximate the population mean. the problem with z scores is that they require knowledge of the population standard deviation, which researchers sometimes don't have. no measure of effect size is included, and something with a small effect can be statistically significant; therefore results should be accompanied by a measure of effect size
when a treatment has a consistent effect
variability is low
dependent variable
what you are measuring
Which of the following Pearson correlations shows the greatest strength or consistency of relationship?
the correlation with the largest absolute value, whether negative or positive (the one closest to +1.00 or -1.00)
linear equations
y = bX + a (similar to y = mx + b). x and y are variables, while a and b are fixed constants. b is the slope (change in y/change in x). a is the y-intercept, the value of y where the line crosses the y axis (so when x equals 0).
A linear regression equation has b = 3 and a = - 6. What is the predicted value of Y for X = 4?
y = bx + a. y = 3(4) - 6 = 6
effect size for anova
η² = SS between/SStotal, small = .01, medium = .06, large = .14
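Eta squared and its size labels can be sketched in a few lines. The cutoffs are the ones on this card; the function names and SS values are hypothetical.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variability accounted for by the treatments."""
    return ss_between / ss_total

def effect_label(eta2):
    """Label using the card's cutoffs: .01 small, .06 medium, .14 large."""
    if eta2 >= 0.14:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "negligible"

# Hypothetical ANOVA: SS between = 30 out of SS total = 100
eta2 = eta_squared(30, 100)
print(eta2, effect_label(eta2))  # 0.3 large
```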
A researcher computes the pooled variance for two samples and obtains a value of 120. If one of the samples has n = 5 scores and the second has n = 10 scores, then what is the value of the estimated standard error for the sample mean difference?
√(120/5 + 120/10) = √(24 + 12) = √36 = 6