Psych Stats Exam #2
Sampling distributions
-A theoretical frequency distribution of a statistic (e.g., the mean) based on an infinite number of samples of size n. -Useful in determining the probability of obtaining a particular sample result.
Confidence intervals...
-A confidence interval gives us a range, not just a yes-or-no answer about whether something is equal to or not equal to a value. -The confidence level (e.g., 95%) is the percentage of intervals, built the same way from repeated samples, that would contain the true population parameter; it is not the chance that the interval contains the null. -Confidence interval: a range of scores intended to capture a population parameter with a stated level of confidence. Unlike hypothesis testing, which only says different or not, a confidence interval gives us a range estimating where the true value falls.
Type 1 error correction for pairwise comparisons: GOAL
-Goal: to control chances of Type 1 error over the course of multiple comparisons
Types of Inferential Procedures
-Hypothesis Testing (Point Estimation): "Is there a difference?" -Interval Estimation: "How much is the difference?" (a.k.a. confidence intervals)
If confidence interval contains null......or...doesn't contain null...
-If a confidence interval contains the null: you retain the null--> NOT significant! -If a confidence interval does not contain the null: you reject the null--> Significant! -If you make the confidence level higher, the confidence interval gets wider (larger range). -If you make the confidence level lower, the confidence interval gets narrower (smaller range).
The Central Limit Theorem (what would we find?)
-If we drew an infinite number of samples from the population, we would find that... 1. The mean of the sample means would equal the population mean. 2. Standard Error of the Mean: average deviation of sample means from the true population mean -reflects the accuracy with which sample means estimate the population mean. 3. Regardless of the shape of the population of scores, the sampling distribution of means (or mean differences) will tend toward normality. This tendency increases as sample size increases (n>30 is usually adequate)
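A small simulation can make points 1 and 2 concrete. This is only a sketch: the skewed population, the sample size of 36, and the 4,000 repetitions (a stand-in for "an infinite number" of samples) are made-up choices, not values from the course.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, clearly non-normal population: exponential scores with mean near 10.
population = [random.expovariate(1 / 10) for _ in range(50_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 36               # size of each sample (assumed)
num_samples = 4_000  # finite stand-in for "infinite" samples

# Draw many samples and keep the mean of each one.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should sit near pop_mean
empirical_se = statistics.pstdev(sample_means)  # spread of the sample means
theoretical_se = pop_sd / n ** 0.5              # sigma / sqrt(n)

print(f"population mean      = {pop_mean:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"SD of sample means   = {empirical_se:.2f}")
print(f"sigma / sqrt(n)      = {theoretical_se:.2f}")
```

The mean of the sample means lands close to the population mean, and the spread of the sample means lands close to sigma / sqrt(n), which is the standard error of the mean.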
What happens when you have a bigger effect? What happens to the T-value when you have a lower denominator (standard error) or lower variability?
-If you have a bigger effect, the numerator is larger and the result is more likely to be statistically significant. If you have lower variability, the denominator (standard error) gets smaller, and that increases your T-value.
independent vs. dependent samples T-test
-Independent samples: "Between Groups Design" (different, independent groups of participants) -Dependent samples: "Within Subjects Design" (same participants over different times) (issues with using this in red-bull car experiment: order effects, practice effects-the more you do it the better you get, gets easier to figure out what the hypothesis is..)
Common Dependent Samples Designs (2 levels of IV)
-Pre-post design. Example: effects of ECT on depression (two means, before and after). -Different measures/tasks/raters. Example: effects of task complexity on sustained attention. Example: math vs. reading teacher ratings. The same participants provide means on multiple tasks. -Matched pairs. Example: effects of instructional method on learning (matched on IQ). Participants are paired up based on some characteristic.
Finding T-critical
-Sample size (degrees of freedom: n-1), -Alpha Level -One-tailed (you are only interested in differences in one direction) or Two-tailed? (you are interested in both directions).
what happens to T-critical value when sample size......
-Sample size gets larger: T-critical gets smaller. -Sample size is infinite: T-critical is 1.96 for a two-tailed test at alpha = .05 (the same cutoff as the z-distribution). -If sample size is large enough, the t-test gives essentially the same result as the Z-test. -It becomes easier to find a significant difference when you have larger samples.
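The shrinking cutoff can be seen by listing a few table values next to the z cutoff. A sketch only: the t values below are copied from a standard two-tailed, alpha = .05 t-table (the choice of df rows is arbitrary), and the z cutoff is computed with Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

# Two-tailed critical t values at alpha = .05, copied from a standard t-table.
T_CRIT_05_TWO_TAILED = {5: 2.571, 10: 2.228, 15: 2.131, 30: 2.042, 60: 2.000, 120: 1.980}

# z cutoff for the same alpha, computed from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

for df, t_crit in T_CRIT_05_TWO_TAILED.items():
    print(f"df = {df:>3}: t-critical = {t_crit:.3f}")
print(f"df = inf: z-critical = {z_crit:.3f}")
```

Every t cutoff is larger than the z cutoff, and the gap closes as df grows.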
Making decisions with F-critical
-If the F-statistic is GREATER than Fcrit, you reject the null (meaning there IS a significant difference)
One way ANOVA VS. T-Test
-T-test: compares two means -ANOVA- compares 2 or more means (F)
why is the F statistic positively skewed?
-The F-statistic is positively skewed because it is a ratio of two variances and cannot be less than 0; you cannot have a negative variance.
Treatment effect on F-statistic
-The bigger the treatment effect, the farther above 1 the F-statistic is. -If the treatment effect is 0, the F-statistic is around 1; when the null is true, expect F to be around 1. -If there is some treatment effect (in the class example, a treatment effect of 10), the F-statistic rises above 1 (around 2 in that example).
For treatment effect, random error...
-The more random error you have, the harder it is to detect an effect --small effect: hard to detect.
how to get confidence intervals
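-To get the Lower Limit: sample mean (or mean difference) minus T-critical times the standard error. -To get the Upper Limit: sample mean (or mean difference) plus T-critical times the standard error.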
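As a quick sketch, the limits come out as mean ± T-critical × standard error. The sample values below (M = 50, SD = 10, n = 16) and the null value of 45 are hypothetical numbers chosen for illustration; 2.131 is the two-tailed table value for df = 15 at alpha = .05.

```python
import math

# Hypothetical one-sample summary statistics (not from the course data).
sample_mean = 50.0
sample_sd = 10.0
n = 16
t_crit = 2.131      # two-tailed, alpha = .05, df = n - 1 = 15 (from a t-table)
null_value = 45.0   # hypothetical population mean under the null

standard_error = sample_sd / math.sqrt(n)   # estimated standard error of the mean
margin = t_crit * standard_error

lower_limit = sample_mean - margin
upper_limit = sample_mean + margin
contains_null = lower_limit <= null_value <= upper_limit

print(f"95% CI: [{lower_limit:.2f}, {upper_limit:.2f}]")
print("retain null" if contains_null else "reject null")
```

Here the null value falls inside the interval, so the null would be retained.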
What happens to T-critical when alpha...
-When alpha level gets smaller, T-critical goes up. -Anything that moves the critical values farther into the tails makes it harder to find a significant difference. A smaller sample size makes it harder.
ETA2 for T-Test
-When the null is true, T=0 -When there is no difference between means, your effect size is 0!
Confidence interval effects on interval width
-Wider intervals represent less precise estimates (an indicator of sampling error). -Increasing confidence (for example, from 95% to 99%) increases interval width. ***Whenever it's a comparison, the interval is built around the difference between the two means.
Dependent samples T-test
-Within samples (same subjects over different times) -compares two dependent means -tells us if 2 means are significantly different from each other
Statistical significance is affected by:
-Actual difference (IV effect), referred to as effect size. -Variability (error term). -Sample size (df): all other things being equal, as sample size increases, it becomes easier to reject the null. -Statistical significance is not the same as practical significance.
Characteristics of the F-Distribution
-clusters around 1.0 -F cannot be negative -positively skewed -varies based on Degrees of Freedom (Dfbetween and Dfwithin)
independent samples t-test
-compares two independent means, and tells us if the two means are significantly different from each other.
Logic Of Dependent Samples T-Test
-decision process is same as Independent Samples T-Test -Based on Difference Scores (D) --If the Null was true (meaning there was no difference), the D scores should be close to 0
Kind of Type 1 error: Bonferroni Correction
-how much do you have overall, and how will you divide it equally? -Error Rate Per Comparison: probability that a particular comparison will be falsely declared significant (type 1 error) -Experiment-wise Error Rate: probability of 1 or more Type 1 errors in a set of comparisons (Q).
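The "divide it equally" idea can be sketched in a few lines. The function names are made up for illustration, and the experiment-wise formula 1 - (1 - alpha)^c assumes the comparisons are independent, which is a simplification.

```python
def bonferroni_alpha(alpha_experimentwise: float, num_comparisons: int) -> float:
    """Split the overall alpha equally across the comparisons."""
    return alpha_experimentwise / num_comparisons


def experimentwise_error(alpha_per_comparison: float, num_comparisons: int) -> float:
    """Chance of at least one Type 1 error, assuming independent comparisons."""
    return 1 - (1 - alpha_per_comparison) ** num_comparisons


num_comparisons = 3  # e.g. three groups compared pairwise: A-B, A-C, B-C

uncorrected = experimentwise_error(0.05, num_comparisons)   # each test at .05
alpha_pc = bonferroni_alpha(0.05, num_comparisons)          # corrected per-comparison alpha
corrected = experimentwise_error(alpha_pc, num_comparisons)

print(f"uncorrected experiment-wise error = {uncorrected:.3f}")
print(f"Bonferroni alpha per comparison   = {alpha_pc:.4f}")
print(f"corrected experiment-wise error   = {corrected:.3f}")
```

With three uncorrected comparisons the experiment-wise error rate climbs to about .14; testing each comparison at .05 / 3 keeps it just under .05.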
Making decisions in hypothesis testing
-how unusual does a result have to be? -decision criterion: alpha level--> if α = .05, we will reject the null hypothesis if a particular sample result occurs less than 5% of the time by chance (when the null is true). -probabilistic in nature (always some chance of error) -decision always refers to the null: retain H0 or reject H0
limitations of the one-sample Z-test
-In order to do a one-sample Z-test, you need the population standard deviation; you need to know the variability of the population.
Pairwise comparisons (Uncorrected).. What changes about alpha?
-instead of saying p is less than alpha, you would say that P is less than Alpha PC.
One-way ANOVA: SSw and dfW
-measuring how different is each person(Xi) from the group that they are in (Xj). Xi = Individual Score Xj = Mean of Group J dfW: N-K (N=total sample size, K=number of groups)
Other terms for Dependent Samples T-Test
-paired samples T-Test -correlated samples T-Test -Matched Pairs T-test
Ways things have effect on statistical significance
-reducing sample size will decrease significance -increasing sample size will increase significance -the more variability you have, the less likely to be significant
Effect size
-Tells us how large the effect is, not just whether it is significant. -Many different types; all convey the relative strength of the relationship. -r² (coefficient of determination): for correlation analysis. -η² (eta squared; Gravetter & Wallnau): for means comparisons; the percent of variance in the DV accounted for by the IV. -Can range from 0 (no variance accounted for) to 1.0 (100% accounted for).
When to use Z-test vs when to use T-test
-use Z-test when population standard deviation is KNOWN. -use T-test when population standard deviation is UNKNOWN.
Challenges with estimating standard error of the mean
-When sample size increases, we get a lot less variability from sample to sample; the sample mean is more accurate and a better predictor. When sample size is low, we get a lot more variability from sample to sample, and we have less confidence in how well the sample estimates the population. -We don't have the population standard deviation, but we can use the sample standard deviation to estimate it. The accuracy of this estimate varies with sample size, so we cannot use the standard normal distribution (we use the t-distribution instead).
Limitation of saying statistically significant...
-When something is statistically significant, it means the result is unlikely to have occurred by chance alone. Significance by itself does not say how big the difference is, so you also have to report HOW MUCH: how big is the difference (effect size)?
What affects statistical power?
1. Alpha Level: as alpha increases, T-critical gets lower, making it easier to reject the null, so power goes up. When power goes up, Type 2 error goes down; because Type 1 and Type 2 error trade off, Type 1 error goes up. 2. Sample Size: when the null is true, T is expected to be around zero. The bigger the T-value, the more likely you are to reject the null. A larger sample size makes the denominator (standard error) smaller, so T is larger. 3. Effect Size: a bigger effect makes the numerator larger, so T is larger; it is easier to find bigger things and easier to miss smaller things.
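The three factors can be checked numerically. To keep this to the standard library, the sketch below swaps in a one-tailed, one-sample z-test (not the t-tests used in class); the function name and the example values of d, n, and alpha are made up for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(effect_size_d: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-tailed, one-sample z-test.

    effect_size_d is the true mean difference in population-SD units.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centred on d * sqrt(n),
    # so power is the area of that shifted curve beyond z_crit.
    return 1 - STD_NORMAL.cdf(z_crit - effect_size_d * n ** 0.5)


baseline = z_test_power(0.5, n=25, alpha=0.05)
bigger_alpha = z_test_power(0.5, n=25, alpha=0.10)
bigger_sample = z_test_power(0.5, n=50, alpha=0.05)
bigger_effect = z_test_power(0.8, n=25, alpha=0.05)

print(f"baseline      = {baseline:.3f}")
print(f"bigger alpha  = {bigger_alpha:.3f}")
print(f"bigger sample = {bigger_sample:.3f}")
print(f"bigger effect = {bigger_effect:.3f}")
```

Each change (higher alpha, larger n, larger effect) raises power relative to the baseline; with an effect of zero, the function returns alpha itself, which is the Type 1 error rate.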
Assumptions/consequences of the Independent Samples T-Test
1. Random Sampling: assumes that samples are representative of the populations from which they are drawn. -Consequences: external validity is questionable; cannot generalize past the sample. 2. Normality: assumes that the populations from which samples are drawn are normally distributed. -Assess by: graphs, skewness index, mean vs. median. -Consequences of violating normality: per the Central Limit Theorem, regardless of the shape of the population of scores, the sampling distribution of means (or mean differences) will tend toward normality, and this tendency increases as sample size increases. As long as your sample size is big enough, violating the normality assumption has little consequence. 3. Independence: assumes that scores within a sample are independent of each other. -This is violated if a person provides more than one score per condition, or if participants influence each other's performance (like diffusion of treatment). 4. Homogeneity of Variance: assumes that the variances of the 2 populations are equal. -Assess by comparing the highest and lowest sample variances; if one is 4x greater, the assumption is violated. -Can also use Levene's test (SPSS): if significant, reject its null and the assumption is violated (look at the BOTTOM line, equal variances NOT assumed); if not significant, retain its null and the assumption is met (look at the TOP line, equal variances ARE assumed).
-Why not use multiple T-tests?
Alpha inflation (type 1 error)->You accumulate a higher probability of a type 1 error (false positives)
Factors affecting the F-Ratio
Between groups: IV (treatment) effect plus Random Error (individual differences, such as how much you sleep or how much you studied; measurement error, such as the types of questions on the test, people responding better or worse to different kinds of questions, the way the test is constructed). Within groups: Random Error only (individual differences, measurement error). -Random errors (individual differences and measurement error) affect everybody, while the IV treatment affects some groups and doesn't affect others.
Making decisions regarding null hypothesis: Dependent samples t-test, if T is... if P (SPSS) is...
Compare T-statistic to Tcrit: If t IS in the rejection region, REJECT null (significant). If T is NOT in the rejection region, you RETAIN null (not significant) in SPSS: If P is LESS than alpha, REJECT null (significant). If P is GREATER than alpha, RETAIN null (not significant).
-What does One-Way ANOVA do? What does it tell us?
Compares 2 or more means -between subjects (one-way ANOVA) --it tells us, "is there a significant difference anywhere among the group means?"
What test would you use? State the null/alternative hypothesis. -for this, write results in APA format. A researcher wanted to determine whether being read to encourages children to be less violent. A class of 3rd grade students is not read to and their violence score is calculated below. The next month their teacher reads to them in class and their violence score is calculated below. Did being read to have a significant effect on the violence level of the 3rd graders? Pre: 10, 7, 3, 14, 9, 4. Post: 8, 8, 3, 7, 4, 1.
Dependent Samples T-test. H0: µbefore - µafter = 0; H1: µbefore - µafter ≠ 0. Mean pre = 7.83, Mean post = 5.17, t = 2.17, df = 5, tcrit = ±2.571, SD (standard deviation of difference scores) = 3.011, SE (standard error of the mean difference) = 1.229. There is no significant difference in the 3rd graders' violence level before and after being read to, t(5) = 2.17, p > .05, tcrit = ±2.571.
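The numbers in this answer can be reproduced by hand from the difference scores. A minimal sketch using only the pre/post data from the problem; the critical value 2.571 is the two-tailed table value for df = 5 at alpha = .05.

```python
import math
import statistics

pre = [10, 7, 3, 14, 9, 4]
post = [8, 8, 3, 7, 4, 1]

# Difference score (D) for each child: pre minus post.
diffs = [a - b for a, b in zip(pre, post)]

n_pairs = len(diffs)
mean_d = statistics.fmean(diffs)
sd_d = statistics.stdev(diffs)        # sample SD of the difference scores
se_d = sd_d / math.sqrt(n_pairs)      # standard error of the mean difference
t_stat = mean_d / se_d                # null hypothesis: mean difference = 0
df = n_pairs - 1

t_crit = 2.571                        # two-tailed, alpha = .05, df = 5
reject_null = abs(t_stat) > t_crit

print(f"mean D = {mean_d:.3f}, SD = {sd_d:.3f}, SE = {se_d:.3f}")
print(f"t({df}) = {t_stat:.2f}, reject null: {reject_null}")
```

Because |t| is smaller than the critical value, the null is retained.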
What test would you use? State the null/alternative hypothesis In order to test the effects of having a researcher present in the room on participants cheating, researchers have 5 participants come to the lab. The researchers ask them to fill out ten math problems. The researcher tells the participants the answers are on the back, and asks them not to look until they are done. The researcher then leaves the room until participants are finished, and observes the participants using a hidden camera to see how often they cheat. The researcher comes back to the room and gives the same participants a similarly easy test of verbal questions, again with the answers on the back. This time the researcher stays in the room and sits behind a desk. Again, cheating is observed with a hidden camera. Did the researcher's presence have an effect on cheating?
Dependent Samples t-test. H0: µ1 = µ2; HA: µ1 ≠ µ2
One-Sample T-tests
H0 = no effect (equal to each other). H1 = an effect (not equal to each other). -If the null is TRUE (no effect), the T-statistic is expected to be around 0.
Independent Samples T-test (when T is...)
If the absolute value of T is greater than the critical value, you reject the null hypothesis--> significant! If the absolute value of T is less than the critical value, you retain the null hypothesis--> not significant!
Which of the following does NOT increase power?
Increasing variability
What test would you use? State the null/alternative hypothesis In order to test which major has the most creative Halloween costume, I get 10 psychology majors and 10 biology majors to come to the lab dressed in a costume. I have 50 people from across majors rate them on how creative their costumes are. The average creativity rating for psych majors (on a 10-point scale) is 7, standard deviation of 2, and the average creativity rating for bio majors is 5, standard deviation 1. Are psych majors significantly more creative?
Independent Samples t-test (one-tailed, because the question asks whether psych majors are MORE creative). H0: µp ≤ µb; HA: µp > µb
What test would you use? State the null/alternative hypothesis. -for this, write results in APA format A sample of 30 Muhlenberg students study an average of 22 hours a week with a standard deviation of 5 hours. The 30 students in Psi Chi study an average of 28 hours a week with a standard deviation of 6.5. Are they significantly different from the Muhlenberg population?
Independent Samples T-test. H0: µ1 = µ2; H1: µ1 ≠ µ2. t = -4.01, df = 58, tcrit = ±2.00. Muhlenberg student study hours (M = 22, SD = 5) were significantly lower than Psi Chi study hours (M = 28, SD = 6.5), t(58) = -4.01, p < .05, tcrit = ±2.00.
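The t-value here can be rebuilt from the summary statistics alone. A sketch assuming the pooled-variance (equal variances assumed) form of the test; the helper name `pooled_t` is made up for illustration.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t from summary statistics (equal variances assumed)."""
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


# Muhlenberg sample vs. Psi Chi sample, values from the problem.
t_stat, df = pooled_t(22, 5, 30, 28, 6.5, 30)
t_crit = 2.00  # two-tailed, alpha = .05, nearest table row for df = 58

print(f"t({df}) = {t_stat:.2f}")
print("reject null" if abs(t_stat) > t_crit else "retain null")
```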
Rejection region (Tcrit) Independent vs. dependent
Independent: need alpha level, df (N1 + N2 - 2), is it one or two-tailed Dependent: need alpha level, df (np - 1), is it one or two-tailed?
Independent samples T-test: Finding T-critical value
Need: Alpha level, Degrees of freedom (n1+n2-2), One-tailed vs two-tail? --If the absolute value of the t-value is greater than the critical value, you reject the null hypothesis--> significant! --If the absolute value of the t-value is less than the critical value, you fail to reject (so you retain) the null hypothesis--> not significant!
Hypothesis testing and ANOVA
Null = all of the group means are equal. Alternative = there is a difference somewhere among the means (some effect).
What test would you use? State the null/alternative hypothesis The average 5th grader reads 10 books per year with a standard deviation of 2.75. Thirty students enrolled in a gifted and talented program at the local elementary school report reading 20 books per year, with a standard deviation of 2.25. Do the students in the gifted and talented program read significantly more books than the average fifth grader?
One-Sample Z-test. H0: µ ≤ 10; HA: µ > 10
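A sketch of the calculation behind this answer. The problem does not state an alpha level, so alpha = .05 (one-tailed) is an assumption here; the population SD of 2.75 is used in the standard error because it is known.

```python
import math
from statistics import NormalDist

pop_mean, pop_sd = 10, 2.75   # known population values from the problem
sample_mean, n = 20, 30       # gifted and talented sample

standard_error = pop_sd / math.sqrt(n)             # sigma / sqrt(n)
z_stat = (sample_mean - pop_mean) / standard_error

alpha = 0.05                                       # assumed; not given in the problem
z_crit = NormalDist().inv_cdf(1 - alpha)           # one-tailed cutoff
reject_null = z_stat > z_crit

print(f"z = {z_stat:.2f}, z-critical = {z_crit:.3f}, reject null: {reject_null}")
```

The z statistic is far beyond the cutoff, so the null would be rejected.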
Which of the following tests does NOT compare means between samples?
One-Sample t-test
What test would you use? State the null/alternative hypothesis The national average on the math section of the SAT for this year's current college freshmen was 515. The average score on the math section of the SAT for Muhlenberg's 551 freshmen was 610, with a standard deviation of 120. How does the average SAT math score of Muhlenberg freshmen compare to the national SAT math average?
One-Sample t-test. H0: µ = 515; HA: µ ≠ 515
What test would you use? State the null/alternative hypothesis We measured enjoyment of the dining hall on a 1 to 50 scale using a survey created by Sodexo and compared the data between 100 freshmen, 100 sophomores, 100 juniors, and 100 seniors. The freshmen had the highest enjoyment ratings with a mean of 38 (SD = 2), sophomores had a mean score of 31 (SD = 4), juniors had a mean score of 34 (SD = 3), and seniors had a mean score of 28 (SD = 1). Is there a significant difference between any of these groups?
One-Way ANOVA. H0: µ1 = µ2 = µ3 = µ4; HA: µj ≠ µj' for at least one pair of groups
Hypothesis Testing: Research vs. Statistical Hypothesis
Research hypothesis: explanation in words.. Statistical hypothesis: null/alternative (uses parameters).
One-way ANOVA: SSb and dfB
SSb: looks at how different is each group from the grand mean. Rough estimate of how much difference there is between each of the group means relative to the grand mean. Nj = Sample size of a specific group Xj = Mean of Group j XG = Mean of ALL scores (Grand Mean) dfB--> K-1 (K= number of conditions)
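The SSb and dfB definitions, together with SSw and dfW, combine into the F-statistic. A sketch with made-up scores for three small groups (the data are hypothetical and chosen only so the arithmetic is easy to follow).

```python
import statistics

# Hypothetical scores for K = 3 groups (not from the course).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_scores = [x for g in groups for x in g]
grand_mean = statistics.fmean(all_scores)   # XG: mean of ALL scores
k = len(groups)                             # K: number of groups
n_total = len(all_scores)                   # N: total sample size

# SSb: how far each group mean is from the grand mean, weighted by group size.
ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
# SSw: how far each person's score is from their own group mean.
ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)

df_between = k - 1
df_within = n_total - k

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
eta_squared = ss_between / (ss_between + ss_within)

print(f"SSb = {ss_between:.2f}, SSw = {ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, eta squared = {eta_squared:.4f}")
```

F is compared to Fcrit for (dfB, dfW) from an F-table; eta squared is the share of total variability that sits between the groups.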
ETA2 APA FORMAT CONCLUSION:
t(337) = 4.41, p < .001, η² = .055 (η² shows how big the effect was)
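For a t-test, η² can be recovered from t and df with η² = t² / (t² + df). A sketch; the function name is made up, and the reported η² = .055 with df = 337 is consistent with t = 4.41.

```python
def eta_squared_from_t(t_stat: float, df: int) -> float:
    """Proportion of variance accounted for, computed from a t-test result."""
    return t_stat ** 2 / (t_stat ** 2 + df)


eta_sq = eta_squared_from_t(4.41, 337)
print(f"eta squared = {eta_sq:.3f}")

# When t = 0 (no difference between means), the effect size is 0 as well.
no_effect = eta_squared_from_t(0.0, 337)
```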
Rejection region in T-tests
T-score above 2.131 or below -2.131 you REJECT the null.
Errors in hypothesis testing
Type 1 error: reject H0 (null) when it is in fact true -false positive, found something and it wasn't there -Type 2 error: retain H0 (null) when it is in fact false -a miss, you missed something that was there
What is Type I and Type II error? How is power related to significance level?
Type I: Rejecting the null when you should have retained it (false positive). Type II: Retaining the null when you should have rejected it (miss). Power is the probability of correctly rejecting the null hypothesis (thus finding a significant difference). Raising the significance level (alpha) makes it easier to reject the null, so power goes up, but so does the risk of a Type I error.
When comparing more than 2 treatment means, why should you use ANOVA instead of several t tests?
Using several t tests increases risk of Type I error
What is "random error?" What are the two components it is comprised of? What kind of variance does it affect?
Variability that is due to random (unsystematic) differences among the scores in our samples. It is comprised of individual differences and errors in measurement. It affects both within groups variance and between groups variance.
Making decisions: when F statistic is.....
When F statistic is greater than Fcrit, you can reject the null hypothesis(significant!) -When F statistic is less than Fcrit, you retain the null hypothesis(not significant!)
Independent samples T-test (when p is.....) (SPSS, because only gives you P not T)
When P is LESS than Alpha, you reject null-->Significant When P is GREATER than alpha, you retain null--->NOT significant
What test do you use to find how rare is the sample result
Z test (find Z statistic)
What are confidence interval levels?
confidence level represents the frequency (i.e. the proportion) of possible confidence intervals that contain the true value of the unknown population parameter.
-estimating standard error of the mean when it is not known
Estimate the standard error of the mean from the sample: divide the sample standard deviation by the square root of the sample size (s/√n).
making decision from SPSS (P value not T)
If the P value is LESS than Alpha, REJECT the null. If the P value is GREATER than or EQUAL to Alpha, RETAIN the null.
-T-distributions vs Z-distributions
-In T-distributions, you have fewer scores in the middle and it is a little more common to get scores out in the tails (not unusual to get really high or really low scores). -In T-distributions, when df (sample size) goes up, the curve looks more and more like a Z-distribution. -Denominator: estimated standard error of the mean
Conditional Probability
The probability of one event (A) given that another event (B) has occurred
-Degrees of freedom
The number of scores that are free to vary; closely tied to sample size (for example, n - 1 for a one-sample t-test)
How to read T-table
Two-tailed tests: non-directional. One-tailed tests: directional.
Can't do a Z test when.....
when we don't know the population standard deviation!
Bennett and Kaelyn did a study on whether plate color at the dining hall affects your enjoyment of the food. One group ate off red plates, one group used blue plates, and one group used green plates. Results showed that there were significant differences in enjoyment across all three plate types, with η² = .434. What does η² represent, and what does η² = .434 mean in terms of the problem?
η² represents effect size. η² = .434 means that 43.4% of the variability in food enjoyment can be attributed to plate color.
