Inferential Statistics
Type 1 Error
(alpha): reject null when null is true
P value interpretation
- <1%: overwhelming evidence supporting the alternative hypothesis
- 1-5%: strong evidence supporting the alternative hypothesis
- 5-10%: weak evidence supporting the alternative hypothesis
- >10%: no evidence supporting the alternative hypothesis
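The rule of thumb above can be sketched as a small lookup. This is only an illustration: the function name `interpret_p_value` is made up here, and the handling of values that sit exactly on a boundary (0.01, 0.05, 0.10) is an assumption, since the card does not say which band they belong to.

```python
def interpret_p_value(p: float) -> str:
    """Map a p-value to the rule-of-thumb evidence labels from the card."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Assumption: each boundary value is assigned to the weaker band.
    if p < 0.01:
        return "overwhelming evidence for the alternative hypothesis"
    if p < 0.05:
        return "strong evidence for the alternative hypothesis"
    if p < 0.10:
        return "weak evidence for the alternative hypothesis"
    return "no evidence for the alternative hypothesis"


print(interpret_p_value(0.03))
```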
Two independent sample t test
- compares the means of two populations
- hypotheses can be left-tailed, right-tailed, or two-tailed
- a t test statistic is used
ANOVA Assumptions
1. The DV is normally distributed in each of the populations
2. The population variances of the DV are equal (homogeneity of variance)
3. The cases are random samples from the populations, and the scores on the DV are independent of each other (groups are mutually exclusive)
Steps for hypothesis testing
1. Establish hypotheses (null and alternative)
2. Significance level: choose the alpha that will determine statistical significance
3. Statistical test: determine which test to use
4. Compare the observed test statistic to its distribution under the null and report the p-value
5. Decision rule: reject or fail to reject the null
Procedure for a t test
1. State the null and alternative hypotheses
2. Specify the significance level
3. Calculate the t statistic
4. Determine the critical value
5. Compare the t value with the critical value
6. State the conclusion
Two Sample t test assumptions
1. Both parent populations are normally distributed
2. Random samples were taken from each population
3. The population variances of the two populations are equal
Procedure for a paired t test
1. Calculate the difference between the observations of each pair (di = yi - xi)
2. Calculate the mean difference (dbar)
3. Calculate the standard deviation of the differences and use it to calculate the standard error of the mean difference
4. Calculate the t statistic
5. Use t distribution tables to compare your t value to the t distribution with n-1 degrees of freedom; this gives the p-value for the test
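Steps 1 to 4 of the paired procedure can be sketched with the Python standard library. The function name `paired_t_statistic` and the sample scores are invented for illustration; the p-value lookup in step 5 is left to a t table, as the card describes.

```python
import math
import statistics


def paired_t_statistic(x, y):
    """Return (t, df) for paired observations x (first) and y (second)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [yi - xi for xi, yi in zip(x, y)]  # step 1: di = yi - xi
    n = len(diffs)
    d_bar = statistics.mean(diffs)             # step 2: mean difference
    s_d = statistics.stdev(diffs)              # step 3: SD of the differences
    se = s_d / math.sqrt(n)                    #         standard error of dbar
    # Step 4: t statistic (undefined when all differences are identical).
    return d_bar / se, n - 1


# Hypothetical pre/post scores, for illustration only.
pre = [10, 12, 9, 11]
post = [12, 15, 10, 14]
t, df = paired_t_statistic(pre, post)
print(round(t, 2), df)  # 4.7 3
```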
ANOVA degrees of freedom
between-group variance: no. groups - 1; within-group variance: no. subjects - no. groups; total variance: no. subjects - 1
Sum of Squares for Total Variation
equal to the sum of the squared deviations of each data point in all the groups from the grand mean
Sum of Squares between
examines how each group mean varies from the grand mean
ANOVA violations
non-normality; heterogeneity of variance among groups; non-independence
df for a paired t test
number of pairs - 1 (n - 1); determines the critical region
a priori contrasts
planned comparisons (as opposed to post hoc comparisons); based on hypotheses stated before the data are collected
Procedure for a two sample t test
1. State the hypotheses
2. Set the significance level
3. Compute the test statistic
4. Determine the critical region
5. Reject the null if the test statistic falls in the critical region; fail to reject if it falls outside it
6. Calculate the exact p-value
7. Express the results in terms of p-values
8. Construct a 95% or 99% CI
9. State the conclusions
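Steps 3 to 5 can be sketched as follows, assuming the equal-variance (pooled) form of the two-sample t statistic, which is a standard formula but is not spelled out in these notes. The function names and data are hypothetical, and the critical value is passed in from a t table rather than computed.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Return (t, df) for two independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    # Pooled variance: df-weighted average of the two sample variances.
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, n1 + n2 - 2


def in_critical_region(t, t_crit):
    """Two-tailed decision rule: reject the null when |t| exceeds t_crit."""
    return abs(t) > t_crit


# Made-up data; 2.447 is the tabled two-tailed 5% critical value for df = 6.
t, df = pooled_t_statistic([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(round(t, 3), df, in_critical_region(t, 2.447))  # -1.095 6 False
```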
Paired t test assumptions
1. Paired differences are independent
2. Paired differences are identically normally distributed (same mean and variance)
Steps in Test of Hypothesis
1. State the hypothesis
2. Establish the significance level alpha
3. Determine the appropriate test
4. Calculate the test statistic
5. Determine the df
6. Compare the computed test statistic against a tabled critical value (CV), or compare the p-value corresponding to the test statistic with the alpha level
Advantages ANOVA vs t Tests
1. Yields an accurate and known type 1 error rate
2. Can assess the effects of two or more independent variables simultaneously
3. More powerful when alpha is held constant
4. Robust
T test
A group of statistical tests used to determine whether a significant difference exists between the means of two sets of data; suited to small n (<30)
Parameter
A measurable characteristic of a population (mu, sigma, ...)
Statistic
A measurable characteristic of a sample (xbar, s, z, t, ...)
Hypothesis testing
A procedure, based on sample evidence (statistics) and probability, used to test claims about a population characteristic (parameter); a hypothesis is a statement about the parameters, which the sample evidence either supports or contradicts
F ratio
ANOVA: variance between treatments divided by the variance within treatments; or F = (treatment effect + differences due to chance) / (differences due to chance); F near 1: no treatment effect; F > 1: some treatment effect (not necessarily significant)
Population
All subjects or objects possessing some common specified characteristic. Arbitrarily defined by naming its unique properties
Significance level
Alpha is the probability of making a type 1 error; specified as a small probability, usually 0.05.
Type II Error
Beta: we don't reject the null when the null is false
One sample t test
Compares the mean score of a sample to a known value, usually the population mean
One sample t test data requirement
Continuous data, small sample, known population mean (mu)
One sample t test assumptions
Data is normally distributed, individual observations are independent, sample is randomly drawn
One way ANOVA data requirement
The dependent variable (DV) must be continuous and must fulfill the assumptions; the independent variable (IV) can have two or more levels (e.g., male, female; religion; race)
Sampling distribution
Describes probabilities associated with a statistic when a random sample is drawn from a population
Point estimation
Estimate a population parameter
Confidence interval
Estimate a population parameter with confidence
Fc
Calculated F statistic: MSB/MSW
ANOVA F statistic
F= MSB/MSW
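A minimal sketch of the F calculation, assuming MSB and MSW stand for the mean squares between and within groups, each obtained by dividing a sum of squares by its ANOVA degrees of freedom. The function name `f_statistic` and the input numbers are illustrative only.

```python
def f_statistic(ss_between, ss_within, n_groups, n_subjects):
    """Return F = MSB / MSW from the sums of squares and the group counts."""
    df_between = n_groups - 1          # no. groups - 1
    df_within = n_subjects - n_groups  # no. subjects - no. groups
    msb = ss_between / df_between      # mean square between groups
    msw = ss_within / df_within        # mean square within groups
    return msb / msw


# Made-up sums of squares for 3 groups and 9 subjects.
print(f_statistic(26.0, 6.0, n_groups=3, n_subjects=9))  # 13.0
```

The calculated value would then be compared with the tabled critical value for (df between, df within).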
Inferential statistics
Making inferences, hypothesis testing, determining relationship
Degrees of freedom
N-1; the number of data points that are free to vary
Sample
Smaller group of subjects or objects selected from a larger group (population)
CI for t test
The t value varies with the degrees of freedom; the (1-alpha)100% CI for mu is xbar +/- t(s/sqrt(n)); if the hypothesized value falls inside the CI, the result is not significant (fail to reject the null)
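The interval and the decision rule on this card can be sketched as below. The t critical value is supplied by the caller from a t table, since the standard library has no t distribution; the function names and sample numbers are hypothetical.

```python
import math


def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (low, high) for xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def is_significant(hypothesized_mu, interval):
    """Significant only if the hypothesized mean lies outside the CI."""
    low, high = interval
    return not (low <= hypothesized_mu <= high)


# Made-up sample: xbar = 5, s = 2, n = 16; 2.131 is the tabled value
# for a 95% CI with df = 15.
ci = t_confidence_interval(x_bar=5.0, s=2.0, n=16, t_crit=2.131)
print(is_significant(4.5, ci), is_significant(3.0, ci))  # False True
```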
Confidence interval
The level of certainty that the true score falls within a specific range. The smaller the range the less the certainty.
Alternative hypothesis
The relationship or association or difference that the researcher actually believes to be present; research hypothesis
Null hypothesis
There is no difference or association between variables that is any greater than expected by chance
Interval estimate
When a parameter is being estimated as a range of scores
Point estimate
When a parameter is being estimated as a single number
Why t test instead of Z test?
Z requires knowledge of the population standard deviation sigma, which is usually unknown; use the t test in that case
Test statistic Z
z = (xbar - mu) / (sigma / sqrt(n))
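As a sketch, the z statistic and a two-tailed p-value can be computed with `statistics.NormalDist` from the standard library. The function name `z_test` and the numbers are invented for illustration, and the two-tailed p-value is an addition that the card itself does not mention.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-tailed p) when the population sigma is known."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Two-tailed p: probability of a standard normal at least as extreme as z.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Made-up values: xbar = 105, mu0 = 100, sigma = 15, n = 36.
z, p = z_test(105, 100, 15, 36)
print(z, round(p, 4))  # 2.0 0.0455
```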
ANOVA terminology
a) factor: independent variable
b) independent measures: separate sample for each treatment
c) level: individual treatment conditions that make up a factor
Two sample t test with unequal variances
the degrees of freedom are adjusted downward to account for the unequal variances (Welch's t test)
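One common form of this df adjustment is the Welch-Satterthwaite approximation. The notes do not name the method, so treating it as the intended adjustment is an assumption; the function name `welch_t` and the data are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, adjusted df) without assuming equal population variances."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    se_sq = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# With equal n and equal sample variances the adjusted df is n1 + n2 - 2.
t, df = welch_t([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(round(t, 3), round(df, 1))  # -1.095 6.0
```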
Post Hoc testing
required after a significant ANOVA result to determine which group differences are significant; compares the groups two at a time
T test formula
t = (xbar - mu) / (s / sqrt(n)), where mu is the population mean, xbar is the sample mean, and s is the estimator of the population standard deviation.
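A minimal sketch of this formula using the standard library, where `statistics.stdev` provides the n - 1 sample standard deviation. The function name `one_sample_t` and the data are made up for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for testing whether the population mean equals mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample SD, divides by n - 1
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Made-up data with sample mean 5, tested against mu0 = 4.
t, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
print(round(t, 3), df)  # 1.323 7
```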
Paired t test test statistic formula
t = (dbar - 0) / (sd / sqrt(n)), where dbar is the mean of the differences between x and y across cases, and sd is the estimate of the standard deviation of the differences
F alpha
tabled critical value (F table); if Fc>Falpha reject null
Grand Mean
the mean of all observations across groups; equals the mean of the condition means when group sizes are equal
Sum of Squares within
the total variation within the groups; compute the sum of squared deviations for each group separately, then sum the results
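The three sums of squares can be computed side by side to show that the total splits into a between part and a within part. This is a sketch with a hypothetical function name and made-up data; the grand mean here is the mean of all observations.

```python
import statistics


def sums_of_squares(groups):
    """Return (SS total, SS between, SS within) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = statistics.fmean(g)
        # Between: squared distance of the group mean from the grand mean,
        # counted once for every observation in the group.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Within: squared deviations from the group's own mean.
        ss_within += sum((v - group_mean) ** 2 for v in g)
    return ss_total, ss_between, ss_within


total, between, within = sums_of_squares([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(total, 6), round(between, 6), round(within, 6))  # 32.0 26.0 6.0
```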
Paired t test
compares two population means when the observations in the samples can be paired (e.g., pre- and post-test scores)
Post Hoc Comparison Error
Type 1 (alpha) error rate: the probability of committing a type 1 error in an individual test. Experimentwise (or familywise) error rate: the probability of committing at least one type 1 error across a set of statistical tests in the same experiment
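To see how quickly the experimentwise rate grows, here is a sketch under the simplifying assumption that the tests are independent, in which case the rate is 1 - (1 - alpha)^m for m tests. Real post hoc comparisons on the same data are not independent, so this is only a rough guide; the function names are hypothetical.

```python
import math


def n_pairwise_comparisons(n_groups):
    """Number of distinct pairs among n_groups groups."""
    return math.comb(n_groups, 2)


def experimentwise_error_rate(alpha, n_tests):
    """P(at least one type 1 error), assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


m = n_pairwise_comparisons(3)  # 3 groups give 3 pairwise tests
print(m, round(experimentwise_error_rate(0.05, m), 4))  # 3 0.1426
```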
95% CI formula
mu = xbar +/- 1.96(sigma/sqrt(n))
99% CI formula
mu = xbar +/- 2.58(sigma/sqrt(n))
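The multipliers 1.96 and 2.58 are rounded quantiles of the standard normal distribution. A small sketch, with a hypothetical function name and made-up inputs, recovers them with `statistics.NormalDist.inv_cdf` and builds the interval for any confidence level when sigma is known.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (low, high) for xbar +/- z * sigma / sqrt(n)."""
    # Two-sided critical value: upper (1 - confidence) / 2 tail of N(0, 1).
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


print(round(NormalDist().inv_cdf(0.975), 2))  # 1.96
print(round(NormalDist().inv_cdf(0.995), 2))  # 2.58
low, high = z_confidence_interval(100, 15, 36, confidence=0.95)
print(round(low, 1), round(high, 1))  # 95.1 104.9
```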
One way ANOVA
used to test whether the mean of a dependent variable differs among three or more groups (e.g., drug treatments vs. control)