stats test 2

a correlation coefficient can be evaluated for significance (T/F)

T

what do subscripts mean in a cross tabulation chi square table

different subscript letters indicate that the proportions are significantly different from each other; they refer to proportions, NOT counts

a calculated value of chi square compares

the frequencies of categories of items in a sample to the frequencies that are expected in the population

normal distribution of residuals

the residuals of the model are random and normally distributed; the mean difference between the model and the observed data is close to zero

linear regression means that for an increase in the X variable

there will be a constant change to Y

homoscedasticity

at each level of the predictor variable, the variance of the residual terms should be constant

SS T, SS R, SS M

SS T: total variability (between the scores and the mean); SS R: residual/error variability (between the regression model and the actual data); SS M: model variability (difference in variability between the model and the mean)

interpret p<0.05, d=0.59

difference between Brandeis students and the general college students is greater than chance. This difference is statistically significant and has a moderate functional effect.

what do you need to know to calculate r(critical)

directionality, alpha level, and df

the chi square statistic is a __ of distributions. The shape of the distribution depends on the ___. Chi square statistic is __ and __. the chi square will be small when ___

family; df; nonnegative; positively skewed; the null hypothesis is true

how to compute sample statistic

first find standard error. then calculate z(observed)=(Msample-mu)/SE

independence

for any two observations, the residual terms should be uncorrelated

a test that uses the sample data to test a hypothesis about the proportions in the general population

goodness of fit chi square

asymp sig

p value

if a significant result is obtained, how might you interpret the findings more thoroughly for a chi square test of independence?

refer to percentage totals in contingency table

total deviation

regression deviation plus residual deviation

what is a decision rule

reject H0 if z(observed) <= -z(critical) or if z(observed) >= +z(critical)

assumptions for a z-test

N = 30 or greater for the sample; the distribution of the raw variable does not have to be normal, because with a sample this large the sampling distribution will still be normal

r^2 tells you

% of variance accounted for

degrees of freedom for chi square test of independence

(r-1)(c-1)

how are degrees of freedom calculated for a chi square test

(r-1)(c-1)

correlation varies between __ and __; what value means no relationship

-1 and +1; 0 means no relationship

chi square assumptions

1) independence of observations, i.e. independent groups: each person, item, or entity contributes to only one cell of the contingency table; 2) size of expected cell frequencies: a chi square test should not be performed when an expected cell frequency is less than 5

regression in SPSS conclusions

1. Overall model fit is good (p=0.047), suggesting that the model is a significantly better fit to the data compared to the mean. 2. Anxiety score was a significant predictor of exam performance (b=-1.16, 95%CI, [-2.31, -0.02], p=0.047). 3. In addition, 41% of the variation in exam performance is explained by variation in anxiety scores (R^2=0.407)

how do you calculate covariance

1. calculate the error between the mean and each subject's score for the first variable (x) 2. calculate the error between the mean and their score for the second variable (y) 3. multiply these error values 4. add these products to get the sum of cross product deviations 5. the covariance is the average cross product deviation (the sum divided by N - 1)
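
The five steps above can be sketched in Python. This is a minimal illustration, not course material: the function name covariance and the two short data lists are made up, and the sum is divided by N - 1 (the usual sample convention).

```python
def covariance(x, y):
    """Sample covariance of two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Steps 1-3: deviation of each score from its mean, multiplied pairwise
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    # Steps 4-5: sum the cross product deviations and average over N - 1
    return sum(cross_products) / (n - 1)


# Hypothetical scores for five participants
x_scores = [5, 4, 4, 6, 8]
y_scores = [8, 9, 10, 13, 15]
cov_xy = covariance(x_scores, y_scores)
print(round(cov_xy, 2))
```

A positive value means the two variables tend to deviate from their means in the same direction; the size of the value depends on the units of measurement, which is why it is usually standardized.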

Z test- formal hypothesis testing steps

1. convert the research question into a statistical hypothesis 2. set the decision criteria: select the significance level (alpha), calculate the critical statistic (z critical), establish the decision rule 3. collect data and compute the sample statistic (z observed) 4. make the decision: compare the observed and critical statistics, draw an inference, calculate the effect size 5. report the result (APA)
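
A minimal Python sketch of steps 2 to 4, using only the standard library. The function name z_test and the example values (sample mean 110, mu 100, sigma 15, n 36) are made up for illustration and are not the data behind any card in this set; the one-tailed branch assumes an upper-tail (greater than) hypothesis.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05, two_tailed=True):
    """One-sample z-test: returns z observed, z critical, decision, Cohen's d."""
    se = sigma / sqrt(n)                    # standard error of the mean
    z_observed = (sample_mean - mu) / se    # sample statistic
    # Critical value from the standard normal distribution
    tail_area = alpha / 2 if two_tailed else alpha
    z_critical = NormalDist().inv_cdf(1 - tail_area)
    if two_tailed:
        reject_h0 = abs(z_observed) >= z_critical
    else:
        reject_h0 = z_observed >= z_critical  # assumed upper-tail test
    cohens_d = (sample_mean - mu) / sigma     # effect size
    return {
        "z_observed": z_observed,
        "z_critical": z_critical,
        "reject_h0": reject_h0,
        "d": cohens_d,
    }


# Hypothetical example values
result = z_test(sample_mean=110, mu=100, sigma=15, n=36)
print(result)
```

With alpha = 0.05 the two-tailed critical value is about 1.96 and the one-tailed value about 1.645, which matches the rounded values used on the other cards.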

parts of a hypothesis test (11)

1. research question 2. statistical question 3. null and alternative hypothesis 4. alpha level 5. critical z 6. decision rule 7. observed z 8. decision 9. effect size 10. APA style report 11. conclusion

list three assumptions that must hold for correlation to be valid. In addition, list at least two additional misleading factors and/or possible problems with correlation

Assumptions: 1. there is a linear relationship between the variables 2. there is a pair of values for each participant or observation 3. there are no outliers in either variable. Misleading factors/problems: 1. a third variable (measured or not) could be affecting the results, so causality can't be assumed between the variables 2. the correlation coefficient does not tell us which variable causes the change in the other variable

what is the critical z value for one-tailed test alpha level 0.05

1.65

contamination with protein will give an A260/A280 ratio slightly less than

1.8

what is the minimum number of observations required for the expected frequencies in chi square

5

how much variance is explained when there is a correlation of .9 in your analysis

81%

APA style reporting (z-test brandeis creativity scores )

A two-tailed (non-directional) z-test revealed that typical Brandeis creativity scores (M = 110) were significantly higher than those of the typical college population (mu = 100), z(two-tailed) = 3.23, p < 0.05, d = 0.59

b. Below is an image of the gel that you ran to give you the data described above. Lane 1 is 0 minutes, lane 2 is 10 minutes, lane 3 is 20 minutes, and lane 4 is 30 minutes.

ALWAYS run a molecular weight ladder to determine size unfolded control to make sure you are looking at the right band.

what is the ordinary least squares method

b1 = r(xy) x (sy / sx); b0 = My - b1 x Mx; estimated model: Y = b0 + b1X
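
The slope and intercept formulas on this card can be sketched in Python. This is an illustration only: the function names and the five data pairs are made up, and r is computed as the covariance divided by the product of the standard deviations.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the SDs."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


def ols_line(x, y):
    """Return (b0, b1) for the estimated model Y = b0 + b1 * X."""
    b1 = pearson_r(x, y) * (stdev(y) / stdev(x))  # slope
    b0 = mean(y) - b1 * mean(x)                   # intercept
    return b0, b1


# Hypothetical predictor and outcome scores
x_scores = [5, 4, 4, 6, 8]
y_scores = [8, 9, 10, 13, 15]
b0, b1 = ols_line(x_scores, y_scores)
print(round(b0, 3), round(b1, 3))
```

The fitted line always passes through the point (Mx, My), which is what the intercept formula enforces.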

r and d effect size ranges

r = .1, d = .2 (small effect); r = .3, d = .5 (medium effect); r = .5, d = .8 (large effect)

how to find chi square goodness of fit statistic on SPSS

Data -> Weight Cases -> Weight cases by -> drag frequency into Frequency Variable -> OK; Analyze -> Nonparametric Tests -> Legacy Dialogs -> Chi-square -> drag frequency into Test Variable List -> OK

You would like to use Ni-NTA chromatography to purify the ω RNA Polymerase subunit from the pellet. How would you change the protocol we used in the lab in order to purify active, correctly-folded protein?

Denature with urea/guanidine HCl and then refold. The protein needs to be soluble to be run on a column. You could also purify it using a hydrophobic solvent

a correlation of .9 has less error than a correlation of .3 (T/F)

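T, if "error" means unexplained variance: r = .9 leaves 1 - .81 = 19% of the variance unexplained, whereas r = .3 leaves 1 - .09 = 91% unexplained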

suppose you have a non-directional hypothesis and you are testing at an alpha level at 0.05. Your observed z is 0.25. Can you reject the null hypothesis?

No, because z observed (0.25) is less than z critical (1.96)

measures of association

Phi: accurate for 2 x 2 contingency tables; may not lie between 0 and 1 for larger tables. Contingency coefficient: seldom reaches its upper limit of 1. Cramer's V: when both variables have only two categories, phi and Cramer's V are identical; however, when variables have more than two categories, Cramer's statistic can still attain its max of 1

B. Another student shakes their tube very vigorously, instead of gently, after cells are lysed, and neutralization buffer was added. How might this affect their sample? Why?

This vigorous shaking may shear the genomic DNA, meaning it would be unable to precipitate properly, and many small pieces of genomic DNA would be isolated in the plasmid sample

what kinds of protein controls should you have in your experiment; why are controls important in experimental analysis

WT and a known aggregator - important to know what both look like to be able to interpret results and assess if your experimental procedure has worked.

Y(i)= b0 + b1Xi + e(i) describe each variable

Yi = outcome variable; b0 = intercept, the value of Y when X = 0 (the point at which the regression line crosses the y-axis); b1 = slope of the regression line, the regression coefficient for the predictor (direction/strength of the relationship); Xi = predictor variable; e(i) = the residual (error), what's left in Y(i) that cannot be explained by X(i)

what type of table is used to summarize purely categorical data

a contingency table

goodness of fit APA reporting example (student's soft drinks preference)

a chi square test for goodness of fit showed that students do not have a preference among the tested soft drinks, X^2 (df, n=sample size) = observed chi square value, p > 0.05, phi=0.135

how to report results using chi square test of independence (relationship between personality type and color preference).

a chi square test of independence showed that there was a significant association between the personality type and preferred color, X^2 (df, n =200) = 35.6, p < 0.05, V=0.422.

chi square test of independence

a test that uses frequencies found in sample data to test a hypothesis about the relationship between two variables in the population; the test determines whether the distribution in one variable depends on the distribution of the other variable in the population

what is regression

a way of predicting the value of one variable from another; it is a hypothetical model of relationship between two variables; the model used is a linear one; therefore, we describe the relationship using the equation of a straight line.

wash lane

all nonspecific binders

SDS-PAGE separates strictly by linear size because the SDS coats the denatured protein and gives all proteins a uniform charge to mass ratio. By which of the following protein characteristics will a native gel separate?

amino acid charge; shape; overall length

use ___ gel for gel electrophoresis in lab 3 to separate plasmids; use __ for Sanger sequencing; use __ for SDS-PAGE

agarose gel; acrylamide; polyacrylamide/bisacrylamide

lysate

all protein with crystals

unbound lane

all soluble protein - cry and nonspecific binders

CL

all soluble protein including crystals

how do you find critical chi square value for goodness of fit

alpha and df

chi square is a test of

association

a test statistic is used to determine

whether the effect in the observed data is larger than would be expected by chance alone

A. You transformed your plasmid, plated onto LB-Amp and the next day counted your colonies. You know all the volumes you used during the procedure. What other information would you absolutely need in order to calculate transformation efficiency?

b. Number of colonies on the LB Amp plate AND THE concentration of plasmid you transformed

why does sample size affect goodness of fit

because chi square sums (fo - fe)^2 / fe: for the same proportions, the squared differences grow faster than the expected frequencies as the sample gets larger, so the statistic gets bigger

bootstrapping in SPSS

the bootstrap CI tells us the interval within which the population b1 is likely to fall; the table tells us: the bias of the bootstrap estimate (the closer the bias is to zero, the better), the p value, and the lower and upper bounds of the 95% CI

what is special about calculating Cramers V for chi square test of independence

calculate both (r-1) and (C-1) and then use the smallest value for df
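
A short Python sketch of that rule. The formula V = sqrt(chi square / (n x smaller of (r - 1) and (c - 1))) is the standard one; the function name cramers_v is made up, and the 2 x 3 table shape in the usage line is an assumption, since this set does not state the table dimensions. The chi square value (35.6) and n (200) are the ones from the personality/color reporting card.

```python
from math import sqrt


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V using the smaller of (rows - 1) and (columns - 1)."""
    smaller_df = min(n_rows - 1, n_cols - 1)
    return sqrt(chi_square / (n * smaller_df))


# Assumed 2 x 3 table, so the smaller value is 1
v = cramers_v(chi_square=35.6, n=200, n_rows=2, n_cols=3)
print(round(v, 3))
```

Under that assumption the result rounds to the V = 0.422 reported on the card; for a 2 x 2 table the same formula reduces to phi.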

the test determines whether the distribution in one variable depends on the distribution of the other variable in the population

chi square test of independence

calculate the transformation efficiency

colonies that grew on LB/Amp divided by total colonies

direction of causality

correlation coefficients say nothing about which variable causes the other to change

how do you calculate cohens d effect size

d = (M sample - mu) / sigma

how to do test of independence in SPSS

Data -> Weight Cases: select "Weight cases by", drag "frequency" into the "Frequency Variable" line, and press "OK". Analyze -> Descriptive Statistics -> Crosstabs: drag "personality" into "Rows" and "color" into "Columns"; click "Statistics", check "Chi-square" and "Phi and Cramer's V", then "Continue"; click "Cells", check counts "Observed" and "Expected" and percentages "Row" and "Total", then "Continue"; check "Display clustered bar charts"; "OK"

regression in SPSS output

descriptive stats; correlations: Pearson's r and its significance (the probability of getting the particular observed r; we don't have r critical but we can still tell its significance from p); model fit output: model summary (absolute value of r; R^2: how much variation in the outcome variable can be explained by variation in the predictor variable, i.e. how much improvement we make by using our model compared to using no predictors and just using the mean) and ANOVA; predictor fit output: coefficients (b0; b1; p value for t, testing the significance of b1; with a single predictor, standardized beta is the same as r; when the effect is significant, the 95% CI for b1 does not include 0)

how do you standardize covariance

divide by the product of the standard deviations of both variables. The standardized version of covariance is known as the correlation coefficient.
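
As a sketch of this standardization in Python: the covariance is divided by the product of the two standard deviations, which gives Pearson's r. The function name correlation_from_covariance and the data are made up for illustration.

```python
from statistics import mean, stdev


def correlation_from_covariance(x, y):
    """Standardize the covariance: r = cov(x, y) / (s_x * s_y)."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))


# Hypothetical scores for five participants
x_scores = [5, 4, 4, 6, 8]
y_scores = [8, 9, 10, 13, 15]
r = correlation_from_covariance(x_scores, y_scores)
print(round(r, 2))      # correlation coefficient
print(round(r * r, 2))  # proportion of shared variance
```

Because the units cancel, r always lies between -1 and +1 regardless of how the variables were measured.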

non-parametric tests

do not make any assumptions about the distribution of the population; can operate on nominal data

b0=143 what does this mean

if someone had a score of 0 on X, then their predicted value of Y would be 143 (hypothetical)

the third variable problem

in any correlation, causality between two variables cannot be assumed because there may be other measured or unmeasured variables affecting the results.

what is correlation

it is a way to measure the extent to which two variables are related (strength of association); describes linear relationships; used to generate new research questions, without answering cause-effect questions; a high positive correlation (r) indicates a strong association but does not prove a causal relationship

what extra info does SPSS give you if you say percentages "row" "total" "continue" versus percentages "column" "total" "continue"

it labels the percentages, for example "% within personality". Row percentages tell you, e.g., that 60% of the people who are extraverted prefer the color red; column percentages tell you, e.g., that 90% of the people who picked red were extraverts.

what are the regression assumptions

linearity; no outliers; independence; homoscedasticity; normality

correlation in SPSS check what assumptions

linearity and outliers and normality

how to determine if the regression model fits the observed data

compare it with the mean: if there were no relationship between advertising budget and album sales, then the regression model would be a flat line equal to the mean.

how do you find expected frequencies for chi square test of independence

(marginal row frequency x marginal column frequency) / total number of people
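
A minimal Python sketch of this calculation, extended to the chi square statistic and its degrees of freedom. The function name and the 2 x 2 table of counts are made up for illustration.

```python
def chi_square_independence(observed):
    """Return (expected table, chi square, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency = (row total x column total) / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi_square = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi_square, df


# Hypothetical counts: rows are two groups, columns are two preferences
table = [[60, 40], [30, 70]]
expected, chi_square, df = chi_square_independence(table)
print(expected)
print(round(chi_square, 2), df)
```

Every expected frequency here is well above 5, so the expected-frequency assumption for the test would be met for this made-up table.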

when analyzing categorical variables, the __ of a categorical variable is meaningless; the numeric values we attach to different categories are ___. Therefore, we analyze frequencies and we can tabulate these frequencies in a __

mean; arbitrary; contingency table

frequency expected =

n x proportion expected

degrees of freedom for correlation

n - 2, because there are two variables

df for goodness of fit chi square

number of categories -1

what does a b1=-1.16 represent

on average, a one point increase in X is related to a 1.16 point decrease in exam score.

__ can influence the value of a correlation

one extreme data point (outlier)

in the regression equation what does b1 denote

one unit change in X is related to b1 change in Y

assumptions for regression

outliers/linearity: the means of each distribution of y at a given x can be joined by a straight line; independence: for any two observations, the residual terms should be uncorrelated; homoscedasticity: at each level of the predictor variable, the variance of the residual terms should be constant; normal distribution of residuals: the residuals of the model are random and normally distributed, with the difference between the model and the observed data close to zero

Why do we see the fluorescence of the samples decrease after being heated to 50-55 deg. C

protein aggregates or dye dissociates

what does an SDS-PAGE gel tell you about your protein

purity size pre

misleading factors for pearson moment correlation r

restricted range, outliers, curvilinearity.

how to calculate chi square goodness of fit

chi square = sum of (fo - fe)^2 / fe
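
A minimal Python sketch of this formula, with expected frequencies computed as n x proportion expected. The function name, the observed counts, and the expected proportions are made up for illustration.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return (chi square, df) for observed counts and expected proportions."""
    n = sum(observed)
    # Expected frequency for each category = n x expected proportion
    expected = [n * p for p in expected_proportions]
    chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    return chi_square, df


# Hypothetical counts for three categories and their expected proportions
chi_square, df = chi_square_goodness_of_fit([20, 50, 30], [0.25, 0.5, 0.25])
print(chi_square, df)
```

The statistic would then be compared with the critical chi square value for the chosen alpha and df.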

durbin Watson test

tests for independence of the residuals; you want the Durbin-Watson value to be between 1 and 3, ideally close to 2
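
For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal Python sketch, with a made-up function name and made-up residuals:

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, in the order the observations were collected
example_residuals = [0.5, -0.3, 0.2, -0.4, 0.1]
dw = durbin_watson(example_residuals)
print(round(dw, 2))
```

The statistic ranges from 0 to 4; values near 2 suggest uncorrelated residuals, while values near 0 or 4 suggest positive or negative autocorrelation.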

goodness of fit chi square

tests whether frequencies in a sample match frequencies in a population

chi square test of independence

tests whether two variables are associated

what does significance mean

that you can conclude, at your chosen alpha level, that the observed difference is bigger than would be expected from sampling error alone

residual

the difference between each observation and the model fitted to the data

residual (chi square goodness of fit)

the error between what the model predicts (expected frequency) and the observed data (observed frequency); residual = observed - model; model = (row total x column total) / grand total

linearity

the means of each distribution of y at a given x can be joined by a straight line

a small X^2 statistic means

the observed frequencies are close to the expected frequencies, so the data are consistent with the null hypothesis

in the regression equation what does X denote

the predictor variable

if you reject the null then

there is a difference

R^2 (regression)

the proportion of variance accounted for by the regression model; the Pearson correlation coefficient squared; R^2 = SS M / SS T
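
A minimal Python sketch of how the three sums of squares relate and how R^2 follows from them. The function name and the data are made up; the line is fitted with the usual least-squares slope (sum of cross product deviations divided by the sum of squared X deviations).

```python
from statistics import mean


def sums_of_squares(x, y):
    """Fit Y = b0 + b1 * X by least squares and return (SS_T, SS_R, SS_M)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    predicted = [b0 + b1 * xi for xi in x]
    ss_t = sum((yi - my) ** 2 for yi in y)                      # total
    ss_r = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # residual
    ss_m = sum((pi - my) ** 2 for pi in predicted)              # model
    return ss_t, ss_r, ss_m


# Hypothetical predictor and outcome scores
ss_t, ss_r, ss_m = sums_of_squares([5, 4, 4, 6, 8], [8, 9, 10, 13, 15])
r_squared = ss_m / ss_t
print(round(ss_t, 2), round(ss_r, 2), round(ss_m, 2), round(r_squared, 3))
```

SS T equals SS M plus SS R, so R^2 can equally be written as 1 - SS R / SS T.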

what is represented by the R^2 Statistic

the proportion of variance in the outcome variable accounted for by the predictor variable.

when discussing the concept of covariance, the distance from a specific observation to that of the mean of a given variable is known as

the residual

when interpreting correlation coefficient it is important to know

the significance of the correlation coefficient; the magnitude of the correlation coefficient; the sign of the correlation coefficient

what if you get an SPSS output that says .000 under "asymp sig (2-sided)"

this is your p value and you should report it as p < 0.001

assumptions of a test serve what function

to minimize sources of bias

how to report correlation in APA (exam performance and anxiety)

A two-tailed Pearson correlation was computed to evaluate the association between anxiety and exam performance. The correlation was negative and significantly different from 0, r(df) = -.638, p < 0.05. The analysis shows that an increase in anxiety is associated with poorer exam performance. The calculated R^2 = .407 indicates that 41% of the variation in exam performance can be explained by variation in anxiety.

variance versus covariance

variance tells us how much scores deviate from the mean for a single variable whereas covariance tells us by how much scores on two variables differ from their respective means.

what is the covariance useful for

it shows whether, as one variable increases, the other increases, decreases, or stays the same. We look at how much each score deviates from its mean; if both variables deviate from their means in a similar way, they are likely to be related.

what do we look for when checking for normality (regression)

we want the dots to fall on the line or close to the line

what is homoscedasticity .

we want the shape of the scatterplot to be rectangular

when do you use correlation

when there are two dependent (measured) variables and no independent variables.

when can we use a z test

when you are comparing a sample mean to a population whose mean and standard deviation are known, and the sample mean comes from a sample with n > 30

does sample size affect goodness of fit chi square

yes, based on the sample size, the significance you calculate will be different

E. You then unfold the protein overnight in 4 M guanidine HCl. In the morning you come in and immediately add trypsin to the tube to digest. You are surprised to see that the full-length protein is not degraded over the thirty minute time course. a. Provide a scientific reason that justifies this result.

you did not dilute so the 4M GdnHCl denatured the trypsin

if all the bands for the different trypsin time points look the exact same, what could have gone wrong

you forgot to add trypsin; you forgot to add PMSF

what happens by squaring the correlational coefficient

you get the proportion of variance in one variable shared by the other (coefficient of determination)

your band looks like it's smiling; what could have gone wrong

you may have run at too high a voltage; you may have loaded your samples unevenly; you may have run out of buffer during your run; you may have had the wrong running buffer composition

what is the difference between a z-score and a z-statistic

z-score: (X - Xbar) / s; z-statistic: (Xbar - mu) / SE (compares the mean of a sample to the mean and standard deviation of a population)

how to do goodness of fit in SPSS

• Data->Weight Cases... • Select "Weight cases by" and drag "Frequency" into "Frequency Variable:" line and "OK" • Analyze->Nonparametric Tests->Legacy Dialogs->Chi-square... • Drag "Frequency" into "Test Variable List:" • "OK"

correlation is an effect size

• It is an effect size • ±.1 = small effect • ±.3 = medium effect • ±.5 = large effect

