Stats Final Exam

within-treatment variance measures differences caused by:

1. random, unsystematic factors

SP =

ΣXY − (ΣX)(ΣY) / n
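
A minimal Python sketch of this computational formula (the scores are made up for illustration, not from these cards):

```python
# Sum of products of deviations: SP = sum(XY) - sum(X)*sum(Y)/n
X = [1, 2, 4, 5]   # hypothetical X scores
Y = [3, 6, 4, 7]   # hypothetical Y scores
n = len(X)

SP = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
print(SP)  # equals the definitional form: sum of (X - Mx)(Y - My)
```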

How to formulate hypotheses?

H0: "All population (treatment) means are the same" H0: µ1 = µ2 = µ3 H1: "At least one population mean is different" e.g.: H1: µ1 ≠ µ2 ≠ µ3 H1: µ1 = µ2 , but µ3 differs. etc.

F =

MSbetween / MSwithin

Correlation

describes the situation in which both X and Y are random variables. In this case, the values for X and Y vary from one replication to another and thus sampling error is involved in both variables.

In a repeated-measures design, the same people are tested in each treatment, so

differences between treatment groups cannot be due to individual differences

interaction effect

the effects of one IV differ at different levels of another IV

factor

independent (or quasi-independent) variable

Most commonly used correlation:

pearson correlation

fe formula =

pn (p = the proportion specified by H0, n = the sample size)

Direction:

positive (+) or negative (-).

Tukey's HSD:

q · √(MSwithin / n)
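
A minimal sketch of the HSD computation; q, MSwithin, and n below are hypothetical placeholders (q comes from the studentized-range table using k and dfwithin):

```python
import math

# Tukey's HSD = q * sqrt(MSwithin / n)
q = 3.77           # e.g., tabled q for k = 3 treatments, dfwithin = 12, alpha = .05
MS_within = 4.50   # hypothetical
n = 5              # scores per treatment

HSD = q * math.sqrt(MS_within / n)
print(round(HSD, 2))  # any pair of treatment means farther apart than this differs significantly
```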

independent measures f =

(treatment effect + individual differences + chance) / (individual differences + chance)

A repeated-measures design removes

variability due to individual differences, and gives us a more powerful test

Unplanned comparisons (post hoc tests)

• Exploring the data after H0 has been rejected • Specific tests that control the experiment-wise error rate

What is ANOVA?

A hypothesis-testing procedure used to evaluate mean differences between two or more treatments (or populations).

What does ANOVA stand for

Analysis of Variance

df for chi-square =

C - 1

Outliers:

Correlation is particularly susceptible to a few extreme scores--always look at the plot

Another way of expressing ANOVA formula

F = (treatment effect + chance) / chance

Dependent measures design

Groups are samples of dependent measurements (usually same people at different times; also matched samples) "Repeated measures"

Independent measures design

Groups are samples of independent measurements (different people)

Main effect

Mean differences along the levels of one factor (a one-way F-ratio)

dftotal =

N - 1

Omnibus ANOVA

Overall significance test

Testwise Error

Probability of a type I error on any one statistical test

Experiment-wise Error

Probability of a type I error over all statistical tests in an experiment

measure of covariability

SP ("sum of products" of deviations)

our previous measure of variability is

SS ("sum of squared deviations")

MSbetween =

SSbetween / dfbetween

SStotal =

SSwithin + SSbetween (likewise, dftotal = dfwithin + dfbetween)

MSwithin =

SSwithin / dfwithin

Hypothesis Testing with ANOVA

Step 1: Hypotheses • H0: all equal; H1: at least one is different Step 2: Critical region • Need: dfB and dfW Step 3: Calculations • SSB and SSW • MSB and MSW • F Step 4: Decision and conclusions • And maybe a source table
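
A compact sketch of those steps on hypothetical data (k = 3 treatments, n = 4 scores each), using the SS, df, and MS formulas from the other cards:

```python
# One-way ANOVA on made-up data.
groups = [[1, 2, 3, 2], [4, 5, 4, 3], [6, 7, 6, 5]]

k = len(groups)                  # number of treatments
n = len(groups[0])               # scores per treatment
N = k * n                        # total number of scores
G = sum(sum(g) for g in groups)  # grand total
sum_x2 = sum(x * x for g in groups for x in g)

SS_total   = sum_x2 - G**2 / N
SS_between = sum(sum(g)**2 / n for g in groups) - G**2 / N
SS_within  = SS_total - SS_between

df_between, df_within = k - 1, N - k
MS_between = SS_between / df_between
MS_within  = SS_within / df_within
F = MS_between / MS_within
print(F)  # Step 4: compare to the critical value for F(df_between, df_within)
```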

Taraban & McClelland

Tested reading times varying attachment and expectation

Nonlinearity:

The data may be consistently related, but not in a linear fashion

Random Variable

The values of the variable are beyond the experimenter's control; we don't know what the values will be until we collect the data

Fixed Variable

The values of the variable are determined by the experimenter. A replication of the experiment would produce same values

studentized range statistic

a table of critical values of the SRS is provided in appendix G

More than one factor

factorial design

the critical value for Chi-square actually ___ with larger df rather than decreasing

increases

Remember: variance (="noise") in the samples

increases the estimated standard error and makes it harder for a treatment-related effect to be detected

Non-parallel lines =

interaction

regression

involves predicting a random variable (Y) using a fixed variable (X). In this situation, no sampling error is involved in X, and repeated replications will involve the same values for X (This allows for prediction)
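
A sketch of that prediction step, assuming the usual least-squares line (slope b = SP/SSx, intercept a = MY − b·MX); the scores are made up:

```python
# Least-squares regression line: Y-hat = b*X + a
X = [1, 2, 4, 5]   # fixed predictor values (hypothetical)
Y = [3, 6, 4, 7]   # observed random variable (hypothetical)
n = len(X)

SP  = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
SSx = sum(x * x for x in X) - sum(X) ** 2 / n
b = SP / SSx                       # slope
a = sum(Y) / n - b * sum(X) / n    # intercept
print(b * 3 + a)                   # predicted Y for X = 3
```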

When rejecting H0 in an ANOVA test, we

just know there is a difference somewhere...we need to do some detective work to find it

You look up the value for q from your text using

k (# of treatment groups) and dfwithin

dfbetween =

k - 1

The differences among the levels of one factor are referred to as the ____ of that factor.

main effect

Factorial Design can cause what

main effects of either factor, and an interaction effect between the factors

In ANOVA, variance =

mean square (MS)

Pearson correlation

measures the degree and direction of a linear relationship between variables.
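
A minimal sketch of the usual computational form, r = SP / √(SSx · SSy), on made-up scores:

```python
import math

X = [1, 2, 4, 5]   # hypothetical scores
Y = [3, 6, 4, 7]
n = len(X)

SP  = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
SSx = sum(x * x for x in X) - sum(X) ** 2 / n
SSy = sum(y * y for y in Y) - sum(Y) ** 2 / n

r = SP / math.sqrt(SSx * SSy)
print(r)  # sign gives the direction, magnitude (0 to 1) gives the degree
```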

HSD is the

minimal mean difference for significance

Parametric tests

must make assumptions about the distribution of the population, and estimate parameters of a theoretical probability distribution from statistics

Parallel lines =

no interaction

C =

number of columns

k =

number of levels of the factor (i.e. number of treatments)

n =

number of scores in each treatment

Levels

number of values used for the independent variable

fo =

observed frequency

interaction between two factors

occurs whenever mean differences between individual treatment conditions (combinations of two factors) are different from the overall mean effects of the factors

Advantages of Repeated-Measures

reduces or limits the variance, by eliminating the individual differences between samples.

One factor

single-factor design

q is the

studentized range statistic

Smith & Ellsworth (1987)

studied the effect of asking a misleading question on accuracy of eyewitness testimony. Subjects viewed a video of a robbery and were then asked about what they saw Factor 1: Type of question (unbiased/misleading) Factor 2: Questioner's knowledge of crime (naïve/ knowledgeable)

Remember the t statistic:

t = actual difference between sample means / difference expected by chance

in a graph, lines that are nonparallel indicate

the presence of an interaction between two factors.

Two-factor ANOVA consists of

three hypothesis tests

N =

total number of scores in the entire study

repeated measures f =

(treatment effect + chance) / chance (individual differences have been removed from the error term)

Pearson Correlation- Points to keep in mind:

• Correlation does NOT imply causation (direction and third-variable problems). • Correlation values can be greatly affected by the range of scores in the data. • Outliers can have a dramatic effect on a correlation value (see the sketch below).
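
A small illustration of the outlier point with hypothetical scores: adding one extreme pair pushes a near-zero correlation close to +1.

```python
import math

def pearson_r(X, Y):
    """Pearson r via the computational formula SP / sqrt(SSx * SSy)."""
    n = len(X)
    SP  = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
    SSx = sum(x * x for x in X) - sum(X) ** 2 / n
    SSy = sum(y * y for y in Y) - sum(Y) ** 2 / n
    return SP / math.sqrt(SSx * SSy)

X = [1, 2, 3, 4, 5]
Y = [3, 1, 4, 2, 3]
print(pearson_r(X, Y))                # weak correlation
print(pearson_r(X + [20], Y + [25]))  # one outlier: now close to +1
```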

in independent measures, differences within groups could be due to

• Individual differences • Error or chance

In independent measures, differences between groups could be due to

• Treatment effect • Individual differences • Error or chance (tired, hungry, etc)

SSbetween =

∑(T²/n) - G²/N

SSwithin =

∑SS inside each treatment (equivalently, SSwithin = SStotal - SSbetween)

T =

∑X for each treatment condition

SStotal =

∑X² - G²/N

dfwithin =

∑df in each treatment = N - k
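
A quick bookkeeping check for a hypothetical design with k = 3 treatments and n = 10 scores each:

```python
k, n = 3, 10          # hypothetical single-factor design
N = k * n             # 30 scores in the entire study

df_between = k - 1    # 2
df_within  = N - k    # 27
df_total   = N - 1    # 29
assert df_between + df_within == df_total
```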

We have to make sure we do not exceed a

.05 chance of a Type I error while we "investigate"

Steps in Hypothesis Testing for chi

1) State hypotheses (no preference vs. preference) 2) Determine the critical value (chance model) 3) Calculate the Chi-square statistic 4) Decision and conclusions
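
A sketch of those steps for a goodness-of-fit test on hypothetical preference counts (C = 4 categories), using fe = pn and Chi-square = ∑(fo - fe)²/fe with df = C - 1:

```python
f_o = [19, 31, 27, 23]               # observed frequencies (hypothetical)
n = sum(f_o)                         # 100 people
p_null = [0.25] * 4                  # "no preference" null hypothesis
f_e = [p * n for p in p_null]        # expected frequencies: fe = p * n

chi_square = sum((o - e) ** 2 / e for o, e in zip(f_o, f_e))
df = len(f_o) - 1                    # C - 1 = 3
print(chi_square, df)                # compare to the tabled critical value (7.81 at alpha = .05)
```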

Advantages to ANOVA

1)Can work with more than two samples. 2)Can work with more than one independent variable

Two interpretations for ANOVA

1)Differences are due to chance. 2)Differences are real

Note about F values

1)F-ratios must be positive. 2) If H0 is true, F is around 1.00. 3)Exact shape of F distribution will depend on the values for df.

steps of ANOVA calculation

1. SSwithin 2. dfwithin 3. Variance within treatments (SSwithin / dfwithin) 4. F = variance between treatments / variance within treatments (the between-treatments variance is found the same way: SSbetween / dfbetween)

ANOVA Hypotheses

1. There really are no differences between the populations (or treatments). The observed differences between the sample means are caused by random, unsystematic factors (sampling error). 2. The populations (or treatments) really do have different means, and these population mean differences are responsible for causing systematic differences between the sample means

between-treatment variance measures differences caused by:

1. systematic treatment effects 2. unsystematic, random factors

If there is no effect due to treatment: F ≈

1.00

G =

"grand total" of all the scores

Nonparametric tests

(distribution free) make no parameter assumptions; parameters are built from data, not the model

Two-factor ANOVA will do three things:

- Examine differences in sample means for humidity (factor A) - Examine differences in sample means for temperature (factor B) - Examine differences in sample means for combinations of humidity and temperature (factor A and B).
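
These cards don't tie the test to any software, but as a hedged sketch, a two-factor ANOVA in Python's statsmodels runs all three tests at once (the data and column names below are made up):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical 2 x 2 design: factor A = humidity, factor B = temperature.
data = pd.DataFrame({
    "humidity":    ["low"] * 4 + ["high"] * 4,
    "temperature": ["cool", "cool", "warm", "warm"] * 2,
    "score":       [3, 4, 6, 7, 5, 6, 9, 12],
})

# C(humidity) * C(temperature) expands to both main effects plus the interaction.
model = ols("score ~ C(humidity) * C(temperature)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))   # one F test each for A, B, and A x B
```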

Looking at the data, there are two kinds of variability (variance):

-Between treatments -Within treatments

Variance between treatments can have two interpretations

-Variance is due to differences between treatments. -Variance is due to chance alone. This may be due to individual differences or experimental error.

We might want a nonparametric test if:

- We don't meet the assumptions for a parametric test - We want to do tests on medians (Note: they aren't just tools to break out when assumptions fail)

error term

-the denominator of the F-ratio -it measures only unsystematic variance

ANOVA test statistic (F-ratio) is similar to t-stat why?

actual variance between sample means / variance expected by chance

Bonferroni correction

alpha is divided across the number of comparisons: per-comparison significance level = α / k
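
For example, with α = .05 and four comparisons (numbers chosen for illustration):

```python
alpha = 0.05
k_comparisons = 4                      # hypothetical number of comparisons
alpha_per_test = alpha / k_comparisons
print(alpha_per_test)                  # 0.0125: each comparison is tested at this level
```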

Planned comparisons (a priori tests)

are based on theory and are planned before the data are collected • More powerful tests • Can be more liberal with our error rate

Chi-Square statistic

comparing observed frequencies to those expected by chance

r =

degree to which X and Y vary together / degree to which X and Y vary separately

Correlation measures the

direction and degree of the relationship between X and Y

f e =

expected frequency

If there is a significant effect due to treatment

F > 1.00

