Stats
grand variance
the variance within an entire set of observations
Type 1 error
Rejecting the null hypothesis when it is true (i.e., concluding there are differences or relationships when there are none). There is always some probability of making a Type I error, and that probability is never 0.
What is the difference between a one-tailed and a two-tailed test of significance?
With a two-tailed test we look at the whole curve: the result can go in either direction. With a one-tailed test we look at only one side of the curve: the result is expected in one direction only.
population
what we are generalizing to.
sampling
when samples vary because they contain different members of the population; the extent to which a statistic (e.g., the mean, median, t, F) varies across samples taken from the same population
Are type 1 and type II errors related in anyway
Yes, they are inversely related: if we decrease the Type I error rate, we increase the Type II error rate. When we decrease one, we increase the other.
grand mean
The grand mean is the average of all scores across all groups.
how does the size of a sample affect the standard error of the mean
The larger the sample, the smaller the standard error, and the less error we should have in our estimate of the population mean.
what is the null hypothesis for a one-sample t test?
The mean difference (MD) between the two means equals zero: H0: MD = 0
what is the null hypothesis for Levene's test of the equality of (error) variances?
The null hypothesis for Levene's test is that the variances of the two groups are equal. When Levene's test is not significant, we may assume equal variances; when Levene's test is significant, we may not assume equal variances.
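A minimal sketch of the decision rule on the card above, using scipy's Levene's test on made-up data (one tight group, one spread-out group):

```python
# Levene's test for equality of variances (illustrative, made-up data).
from scipy import stats

group1 = [4, 5, 6, 5, 4, 6, 5, 5]      # small spread
group2 = [2, 8, 1, 9, 3, 7, 2, 8]      # much larger spread

stat, p = stats.levene(group1, group2)
if p < 0.05:
    print("significant: we may NOT assume equal variances")
else:
    print("not significant: we may assume equal variances")
```

Here the spreads differ sharply, so the test comes out significant.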
what is the null hypothesis for a paired-samples t test?
The null hypothesis is that there is no difference between the mean well-being score of the sample at time 1 (pretest) and the mean of the sample at time 2 (posttest): H0: M_T1 = M_T2. Equivalently, the mean difference (MD) between the two means equals zero: H0: MD = 0. Thus we want to know if there is a significant change in scores from time 1 to time 2; the value of interest is the change in score (posttest − pretest).
beta level
The probability of making a Type II error
confidence interval
Confidence intervals provide another measure of effect size: an educated prediction about the approximate value of the population parameter. A 95% CI corresponds to a p value of .05; a 99% CI corresponds to a p value of .01. It also contains a margin of error, which forms the upper and lower limits around the sample statistic (values greater than and less than the sample statistic).
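A minimal sketch of building a 95% CI around a sample mean from the t distribution, as the card above describes (scores are made up):

```python
# 95% confidence interval around a sample mean (made-up data).
import numpy as np
from scipy import stats

scores = np.array([10, 12, 9, 11, 13, 10, 12, 11])
n = len(scores)
mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(n)          # standard error of the mean
margin = stats.t.ppf(0.975, df=n - 1) * se    # margin of error
lower, upper = mean - margin, mean + margin
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

The margin of error forms the upper and lower limits around the sample mean, exactly as the card says.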
why would we want to apply a Bonferroni adjustment when evaluating the statistical significance of many bivariate correlations within a dataset
Researchers use the Bonferroni adjustment to adjust their level of significance. The purpose is to decrease the chance of a Type I error when multiple tests are conducted (experiment-wise error).
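The adjustment itself is just alpha divided by the number of tests; a one-line sketch:

```python
# Bonferroni adjustment: divide alpha by the number of tests run,
# which caps the familywise Type I error rate.
alpha = 0.05
n_tests = 10
adjusted_alpha = alpha / n_tests
print(adjusted_alpha)   # 0.005
```

Each of the 10 correlations would then be evaluated at .005 rather than .05.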
one-tailed test
a statistical hypothesis test in which the values for which we can reject the null hypothesis are located entirely in one tail of the probability distribution; we pick a side
test statistic
a statistic for which we know how frequently different values occur. The observed value of such a statistic is typically used to test a hypothesis.
two-tailed test
a test in which the values for which we can reject the null hypothesis are located in both tails of the distribution, whether the difference is higher or lower. You always want a two-tailed test: we test/analyze everything.
Test statistic as a ratio of signal (effect) to noise (error)
Test statistic = model / error. The model is basically the signal; the error is noise. When error is low, our model is better.
what is the null hypothesis for an independent-samples t test?
That there is no difference between the means: H0: M_G1 = M_G2. That the mean difference (MD) between the two group means equals zero: H0: MD = 0.
Coefficient of Determination
The coefficient of determination tells us how much of the variance in scores on one variable can be understood, or explained, by the scores on a second variable. The idea is that when two variables are correlated they share a certain percentage of their variance: the stronger the correlation, the greater the shared variance, and the larger the coefficient of determination.
standard error of the mean (SE)
the standard deviation of the sampling distribution of the mean; you can reduce it by increasing your sample size. It refers to the average difference between the expected value (population mean) and the individual sample mean: does the sample mean represent the actual population mean? It provides a measure of how much error we can expect when we say that a sample mean represents the mean of the larger population.
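The shrinking effect described in the card above follows from SE = SD / √n; a minimal sketch with a made-up SD of 15:

```python
# The standard error of the mean shrinks as the sample grows
# (SE = SD / sqrt(n)), for a fixed standard deviation.
import math

sd = 15.0
for n in (25, 100, 400):
    print(n, sd / math.sqrt(n))   # 3.0, 1.5, 0.75
```

Quadrupling the sample size halves the standard error.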
What is the appropriate scale of measurement for the outcome or dependent variable in ANOVA
continuous scales
omnibus F test
does not tell us where the differences exist among our groups
What is the appropriate scale of measurement for an independent variable in ANOVA
nominal, with 2 or more categories/groups/levels
degrees of freedom (df)
relates to the number of values free to vary when computing a statistic; the minimum amount of data needed to calculate a statistic. A number (or numbers) used to approximate the number of observations in the dataset for the purposes of determining statistical significance. It is necessary for interpreting a chi-square statistic, F ratio, or t value. Sample size minus one (N − 1).
power
the ability to detect differences (relationships); the probability of rejecting the null when it is false; the likelihood of finding differences between conditions when the conditions are truly different; the probability of finding an effect when an effect exists. Power = 1 − β. The .80 level is the cutoff (there should be at least an 80% chance of not making a Type II error).
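Power can be estimated by simulation: repeat an experiment many times with a true effect present and count how often the test detects it. A minimal sketch (scenario numbers are made up; d = 0.5 with n = 64 per group is a classic ~.80-power setup):

```python
# Estimating power by Monte Carlo simulation (made-up scenario).
# Power = proportion of repeated experiments in which a true effect
# is detected at alpha = .05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, d = 0.05, 64, 0.5          # per-group n, true effect size
reps, hits = 2000, 0
for _ in range(reps):
    a = rng.normal(0.0, 1.0, n)      # control group
    b = rng.normal(d, 1.0, n)        # treatment group shifted by d
    if stats.ttest_ind(a, b).pvalue < alpha:
        hits += 1
power = hits / reps
print(f"simulated power ≈ {power:.2f}")
```

The simulated value lands near the conventional .80 cutoff for these settings.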
independent variable
the cause; what is being manipulated
deviance (deviation)
the difference between the observed value of a variable and the value of that variable predicted by a statistical model
Residual
the difference between the value a model predicts and the value observed in the data on which the model is based. Basically an error
how does the standard deviation of the sample affect the standard error of the mean
The larger the standard deviation, the larger the standard error of the mean: the bell curve gets fatter and wider (it gets wings). The smaller the standard deviation, the smaller the standard error: the curve gets more narrow, with values closer together.
dependent variable
the outcome the effect
What is the relationship between the probability value (p-value), alpha, and Type 1 error
The probability (p-value) of a statistical hypothesis test is the probability of getting a value of the test statistic as extreme as, or more extreme than, the one observed by chance alone, if the null hypothesis (H0) is true. Alpha is the probability of wrongly rejecting the null hypothesis when it is in fact true: a false positive, i.e., a Type I error. Small p-values suggest that the null hypothesis is unlikely to be true.
Alpha (a) level
the probability of making a Type I error (usually this value is .05); the probability of rejecting the null when it is true (i.e., wrongly concluding there are differences between groups). It is a well-known criterion for decision making in statistical data evaluation.
experiment-wise error / familywise error
the probability of making a Type I error in an experiment involving one or more statistical comparisons when the null hypothesis is true in each case. The more tests we run, the higher our chance of error, because we are using the same data to run many tests (error increases with testing): 1 − (0.95)^n when alpha = .05.
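The 1 − (0.95)^n formula from the card above, evaluated for a few test counts:

```python
# Familywise error rate: 1 - (1 - alpha)^n, i.e. 1 - (0.95)^n at alpha = .05.
alpha = 0.05
for n in (1, 5, 10, 20):
    print(n, round(1 - (1 - alpha) ** n, 3))
```

With 10 tests the chance of at least one Type I error is already about .40, which is why corrections like Bonferroni exist.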
What do we get when we partition the sum of squares for the dependent variable
the proportion of the sum of squares due to regression and the proportion of the sum of squares due to error, or residual
sample
the smaller collection of units drawn from a population
standard error
the standard deviation of the sampling distribution of a statistic (e.g., the mean). It tells us how much variability there is in the statistic across samples from the population. Large values therefore indicate that a statistic from a given sample may not be an accurate reflection of the population from which the sample came.
when would you compute spearman's rho
Spearman's rho (r_s) should be used to calculate the correlation between two variables that use ranked data (ordinal data).
when would you compute a point-biseral correlation
A point-biserial correlation coefficient (r_pb) should be calculated when one of the variables is continuous and the other is a discrete dichotomous variable.
Be able to interpret a 95% confidence interval around a sample correlation coefficient
...
F- Ratio
...
If given a 95% confidence interval around a sample correlation coefficient, be able to test the null hypothesis that no relationship exists between the two variables (X & Y)
...
matched/paired dependent samples
...
standard error of the difference between means
...
the method of least squares using the mean as the statistical model
...
What are the conventions for interpreting the size of eta squared (η²)?
.01 = small, .06 = medium, .14 = large
Predicted Value
A variable that is used to try to predict values of another variable known as an outcome variable
Type II error
Accepting the null hypothesis when it is false (i.e., there really is a difference between groups or a relationship).
When would you conduct an independent-samples t test?
It is used when you want to compare the means of two independent samples on a given variable. It requires one categorical or nominal IV with two levels or groups and one continuous DV (i.e., interval or ratio scale). We want to know whether the average scores on the DV differ according to which group one belongs to.
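A minimal sketch of the design on the card above with scipy (the scores are made up; the grouping variable is the nominal IV, the scores are the continuous DV):

```python
# Independent-samples t test on two made-up groups.
from scipy import stats

group_a = [23, 25, 21, 22, 24, 26, 23]
group_b = [30, 29, 31, 28, 32, 30, 29]

t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.4f}")
```

A small p value tells us the average scores differ by group; the sign of t reflects which group's mean is higher.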
when would you compute a biserial correlation
A biserial correlation (r_b) should be calculated when one of the variables is continuous and the other is a dichotomized continuous variable (a continuous variable split into two categories).
continuous (interval-scaled) variable
A continuous variable is one that gives us a score for each person and can take on any value on the measurement scale that we are using. An interval variable is measured on a scale along the whole of which intervals are equal: the distance from 1 to 2 is the same as the distance from 4 to 5.
Bonferroni correction
A correction applied to the alpha level to control the overall Type I error rate when multiple significance tests are carried out. The Bonferroni correction allows us to control for Type I error; however, you lose statistical power when you use it.
Correlation Coefficient
A measure of the strength of the association or relationship between two variables, e.g., Pearson's correlation coefficient (Pearson's r).
what is a perfect negative correlation?
A perfect negative correlation (−1.00) indicates that for every member of the sample or population, a higher score on one variable is related to a lower score on the other variable.
what is a perfect positive correlation?
A perfect positive correlation (+1.00) indicates that for every member of the sample or population, a higher score on one variable is related to a higher score on the other variable. The closer a correlation is to 1, whether positive or negative, the stronger it is.
when would you compute a phi coefficient
A phi coefficient should be calculated when we want to know if two dichotomous variables are correlated
What are the three assumptions of a one-way ANOVA
the assumption of independence, homogeneity of variance, and the normality assumption
what is the effect size statistic for an independent t test discussed in class? what does this statistic tell us? in other words, how is it interpreted?
Cohen's d. Cohen's d provides estimates of differences in standard deviation units; it tells us the effect size.
Covariance
Covariance is a measure of the average relationship between two variables. It is the average cross product deviation
Cross-Product Deviations
Cross-product deviations are a measure of the total relationship between two variables: one variable's deviation from its mean multiplied by the other variable's deviation from its mean.
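The covariance and cross-product cards above can be tied together numerically: summing the cross-product deviations and averaging (with n − 1) gives the sample covariance. A minimal sketch on made-up data:

```python
# Covariance as the average cross-product deviation (made-up data).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 7.0])

cross_products = (x - x.mean()) * (y - y.mean())
cov = cross_products.sum() / (len(x) - 1)       # sample covariance
print(cov, np.cov(x, y)[0, 1])                  # both ≈ 2.5
```

The hand computation agrees with numpy's built-in `np.cov`.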
things that weaken or attenuate the correlation coefficient
Curvilinear relationships: they may result in a correlation coefficient that is quite small, suggesting a weaker relationship than may actually exist. Outliers: can weaken correlation coefficients. Truncated range or restricted variance: occurs when the scores on one or both of the variables in the analysis do not have much range in the distribution of scores.
Outcome Variable
Dependent variable
dichotomous variable
A dichotomous variable is a variable that consists of only two categories (e.g., the variable gender is dichotomous because it consists of only two categories: male and female).
direct replication and systematic replication
Direct replication refers to an attempt to repeat an experiment exactly as it was conducted originally; ideally, the conditions and procedures of the replication and the original experiment are identical. Systematic replication refers to repetition of the experiment while systematically allowing features to vary; the conditions and procedures of the replication are deliberately designed only to approximate those of the original experiment. It is useful to consider direct and systematic replication as opposite ends of a continuum.
What are the two fundamental characteristics of correlation coefficients? Be able to define each.
Direction and strength. Direction: in a positive correlation the variables move in the same direction; in a negative correlation they move in opposite directions. Strength ranges from −1.00 to +1.00.
null hypothesis
H0 (the subscript is a zero; it means an effect is absent). When we accept / fail to reject it, we say that there are no relationships between these variables, no effect, and no differences between groups. When we fail to reject, we accept; when we reject the null, there are differences in our sample and there is a relationship between variables.
alternative hypothesis
H1 (an effect is present). A statement of what a statistical hypothesis test is set up to establish: differences.
experimental (or alternative) hypothesis
H1 is a statement of what a statistical hypothesis test is set up to establish (what we are trying to establish/test).
If the regression coefficient is zero, what does the resulting regression line look like? What does the predicted Y value (Y′) equal, i.e., what is our best estimate of the predicted value of Y?
The regression line is flat (horizontal), and the predicted value Y′ equals the mean of Y for every value of X. If the confidence interval around the coefficient includes 0, there is no significant linear relationship between X and Y, and we do not reject the null hypothesis.
curvilinear relationship
It can weaken Pearson's r and can result in a correlation coefficient that is quite small, suggesting a weaker relationship than may actually exist.
truncated range
It can weaken Pearson's r and reduce the size of the correlation coefficient; it occurs when the scores on one or both variables in the analysis do not have much range in the distribution of scores.
When would you conduct a paired samples t test?
It is used to compare the means of a single sample in a longitudinal design with only two time points (e.g., pretest and posttest). It is also used to compare the means of two variables measured within a single sample (e.g., depression and quality of life).
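A minimal sketch of the pretest/posttest design on the card above with scipy (scores are made up):

```python
# Paired-samples t test on made-up pretest/posttest scores.
from scipy import stats

pretest  = [10, 12, 11, 9, 13, 10, 12]
posttest = [14, 15, 13, 12, 16, 13, 15]

t, p = stats.ttest_rel(pretest, posttest)
print(f"t = {t:.2f}, p = {p:.4f}")
```

The test operates on the change scores (posttest − pretest), which is exactly the "value of interest" the null-hypothesis card describes.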
Levene's test
Not significant: assume equal variances. Significant: we cannot assume equal variances.
parameter
Parameters describe the relations between variables in the population. In other words, they are constants believed to represent some fundamental truth about the measured variables. We use sample data to estimate the likely value of parameters because we don't have direct access to the population. Some examples of parameters with which you might be familiar are the mean and median (which estimate the centre of the distribution) and the correlation and regression coefficients (which estimate the relationships between variables).
computing correlation coefficients
Pearson's r: two continuous variables. Phi coefficient: two dichotomous variables. Spearman's rho: ordinal data. Point-biserial: one continuous, the other dichotomous. Biserial: one continuous, the other a dichotomized continuous variable.
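A few of the coefficients listed above are available directly in scipy; a minimal sketch on made-up data (pearsonr for two continuous variables, spearmanr for ranked data, pointbiserialr for one dichotomous and one continuous variable):

```python
# Three correlation coefficients from the card above (made-up data).
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]
group = [0, 0, 0, 0, 1, 1, 1, 1]    # dichotomous variable

r, _ = stats.pearsonr(x, y)          # continuous vs continuous
rho, _ = stats.spearmanr(x, y)       # ordinal (ranked) data
r_pb, _ = stats.pointbiserialr(group, y)   # dichotomous vs continuous
print(round(r, 2), round(rho, 2), round(r_pb, 2))
```

All three land in strongly positive territory for these made-up values.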
What is the difference between a post-hoc test and a planned comparison or contrast
A planned comparison is a priori: it happens before, and it is theory driven. A post-hoc test is a posteriori: after the fact, and it is data driven.
Regression Coefficients b and B
Regression coefficients b and B (beta) indicate the effect of the IV on the DV: specifically, for each unit change in the IV there is an expected change equal to the size of b or B in the DV. The regression coefficient is the average amount the dependent variable increases when the independent variable increases one unit; it is the slope of the regression line. The larger the b, the steeper the slope, and the more the dependent variable changes for each unit change in the independent variable. Unstandardized (b) is interpreted when the predictors are measured using the same units of measurement. Standardized (beta, B) is used when predictors are measured in different units; it represents the change, in standard deviations, in the dependent variable that results from a change of one standard deviation in an independent variable.
sum of squares
SSB: sum of squares between groups. SSW: sum of squares within groups, also known as error.
central limit theorem
If samples have at least 30 observations, the sampling distribution of the mean takes on an approximately normal distribution; aim for at least 30 in your sample.
three tools that provide ways to assess how meaningful the results of statistical analyses are (indexes)
Statistical significance (p-value), confidence intervals (95% or 99% CI), and magnitude of the effect (effect size). These tools provide indexes of how meaningful the results of statistical analyses are.
When would you conduct a one-sample t test?
To compare the mean of a test variable (DV) with a constant "test value", such as the midpoint of a variable or the average of a variable (DV) based on past research.
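A minimal sketch of the one-sample case above with scipy (made-up scores tested against a hypothetical midpoint of 3):

```python
# One-sample t test against a constant "test value" (made-up data).
from scipy import stats

scores = [4, 5, 3, 4, 5, 4, 3, 5]
test_value = 3   # e.g., the midpoint of the scale

t, p = stats.ttest_1samp(scores, test_value)
print(f"t = {t:.2f}, p = {p:.4f}")
```

A significant positive t here would mean the sample mean sits reliably above the test value.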
when would you compute a Pearson product-moment correlation
To compute Pearson's r, both variables must be measured on interval or ratio scales, otherwise known as continuous variables.
Why would we conduct a one-way ANOVA rather than an independent t-test
We would conduct a one-way ANOVA rather than an independent t-test when we have more than 2 levels/conditions that we want to compare.
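A minimal sketch of the three-level case above with scipy (made-up data for three conditions, which an independent t-test could not handle in one analysis):

```python
# One-way ANOVA across three made-up groups.
from scipy import stats

low    = [3, 4, 3, 5, 4]
medium = [6, 7, 6, 8, 7]
high   = [9, 10, 9, 11, 10]

f, p = stats.f_oneway(low, medium, high)
print(f"F = {f:.2f}, p = {p:.4f}")
```

Note this is the omnibus F test: a significant result says the group means differ somewhere, but not where, which is why post-hoc tests follow.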
What are the three things to consider when choosing a post hoc test
Whether you have an equal number of cases in each subgroup; whether the test controls for Type I error; and whether the test controls for Type II error, i.e., has good statistical power.
What is the null hypothesis for Levene's test of the equality of error variances
The null hypothesis is that the error variances of the groups are equal. When Levene's test is not significant, we can assume equal variances; when Levene's test is significant, we cannot assume equal variances.
When the correlation coefficient is zero, what is the relationship between x and y variable
When the correlation coefficient is zero, there is no relationship: H0: r = 0.
effect size
When we measure the size of an effect, be that an experimental manipulation or the strength of the relationship between variables, it is known as the effect size. Most effect size formulas are ratios that divide the numerator by the standard deviation. If we divide the difference between means by the standard deviation, we get a signal-to-noise ratio, but we also get a value that is expressed in standard deviation units and can therefore be compared across different studies that use different measures. Ex: Cohen's d: d = .2 small effect, d = .5 medium effect, d = .8 large effect.
categorical nominal variable
A categorical variable is made up of categories (you can be one thing, a human or a cat; you can't be a bit of a cat and a bit of a human). A nominal variable is one where things that are equivalent in some sense are given the same name or number, but there are more than two possibilities.
pairwise comparisons
a comparison of a pair of means
Simple Regression
a linear model in which an outcome variable is predicted from a single predictor variable (p. 863 for the formula)
If given a 95% confidence interval around two group means for an independent-samples t test, be able to test the null hypothesis that the difference between the group means is zero (or that the two group means are equal).
If the mean difference is significantly different from zero, we can reject the null.
Predictor Variable
independent variable
eta squared
An effect size: η² = SS between / SS total. It is like R squared, and it tells us how much of the variance is explained.
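The η² = SS between / SS total formula above can be computed by hand; a minimal sketch on three made-up groups:

```python
# Eta squared from sums of squares (made-up data, three groups).
import numpy as np

groups = [np.array([3., 4., 3., 5., 4.]),
          np.array([6., 7., 6., 8., 7.]),
          np.array([9., 10., 9., 11., 10.])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()   # mean of every score across all groups

ss_total = ((all_scores - grand_mean) ** 2).sum()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
eta_sq = ss_between / ss_total
print(round(eta_sq, 3))
```

For these well-separated groups most of the total variance is between groups, so η² comes out very large.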
sum of squares
the sum of the squared deviations from the mean; dividing it by the degrees of freedom gives the variance (mean square)
What is the purpose of a one-way ANOVA
It allows us to test more than one pairwise comparison in one analysis. It compares the means of more than two groups.
T-statistic
It is a test statistic with a known probability distribution (the t-distribution). In the context of regression it is used to test whether a regression coefficient b is significantly different from zero; in the context of experimental work it is used to test whether the difference between two means is significantly different from zero.
Brown-Forsythe F
It is designed to be accurate when the assumption of homogeneity of variance has been violated.
sampling distributions
It is used as a model of what would happen if the experiment were repeated an infinite number of times. An inferential statistic; a distribution of a sample statistic; a theoretical distribution. A sampling distribution is the frequency distribution of sample means from the same population (using lots of samples, repeating the sampling).
what is the relationship between the coefficient of determination and the sum of squares due to regression?
The stronger the correlation between the X and Y variables, the larger the coefficient of determination; the larger the coefficient of determination, the larger the percentage of variance in the DV that may be explained using the variance in the IV. Thus the larger the sum of squares due to regression, and the smaller the amount of error in our regression equation.
Cohen's d
It is the difference between the means divided by the standard deviation: an effect size that expresses the difference between two means in standard deviation units. Because it is expressed in standard deviation units, it can be compared across studies. Ex: Cohen's d: d = .2 small effect, d = .5 medium effect, d = .8 large effect.
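A minimal sketch of Cohen's d using a pooled standard deviation (one common variant; the groups are made up):

```python
# Cohen's d with a pooled standard deviation (made-up groups).
import numpy as np

a = np.array([23., 25., 21., 22., 24., 26., 23.])
b = np.array([30., 29., 31., 28., 32., 30., 29.])

# Pool the two sample variances, then divide the mean difference by it.
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
d = (b.mean() - a.mean()) / pooled_sd
print(round(d, 2))
```

Because the result is in standard deviation units, it can be compared against the .2 / .5 / .8 benchmarks on the card above.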