Midterm 2
Cohen's d (effect sizes) for difference between two means
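d = (mean 1 - mean 2) / pooled standard deviation. By convention, d values run from about .20 (small) through .50 (medium) to .80 (large).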
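A minimal sketch of the calculation, assuming the pooled-standard-deviation form of Cohen's d for two independent groups. The function name `cohens_d` and the scores are made up for illustration.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd


# Hypothetical scores, not real data
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
print(round(d, 2))  # well above .80, so a large effect by the usual benchmarks
```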
Steps in performing the t-test
1) One-tailed or two-tailed? 2) What are the degrees of freedom? (N - 1) 3) Find the critical value at the .05 level. 4) Is the t-test result higher than the critical value? 5) Is it also higher than the .01 critical value? 6) Whether the t value is negative or positive does not decide significance; compare its absolute value (distance from the mean) to the critical value.
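The steps above can be sketched for a one-sample, two-tailed test. The scores and the helper `one_sample_t` are hypothetical. The critical values are the standard t-table entries for df = 9 (two-tailed), hard-coded here rather than computed.

```python
import math
import statistics


def one_sample_t(sample, mu):
    """Return (t, df) for a one-sample t-test against population mean mu."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu) / standard_error
    return t, n - 1


# Hypothetical scores, tested against an assumed population mean of 50
scores = [52, 55, 49, 58, 60, 54, 51, 57, 56, 58]
t, df = one_sample_t(scores, mu=50)

# Two-tailed critical values for df = 9 from a standard t table
critical = {0.05: 2.262, 0.01: 3.250}

# The sign of t does not matter for a two-tailed test: compare |t|
significant_05 = abs(t) > critical[0.05]
significant_01 = abs(t) > critical[0.01]
print(df, round(t, 2), significant_05, significant_01)
```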
Hypothesis Testing - What is it and Why
The act in statistics whereby an analyst tests an assumption regarding a population parameter. Used to infer from sample data whether a hypothesis holds for the larger population.
Law of Large Numbers
as a sample size grows, its mean gets closer to the average of the whole population.
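A small simulation can illustrate this, assuming a fair six-sided die whose true mean is 3.5. The seed and sample sizes are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def mean_of_rolls(n):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(random.randint(1, 6) for _ in range(n)) / n


small = mean_of_rolls(10)        # can land well away from 3.5
large = mean_of_rolls(100_000)   # should sit very close to 3.5
print(small, large)
```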
t-test for independent samples - Difference in calculations
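2 key differences: (1) the standard error is the standard error of the difference between two means, computed from both samples; (2) degrees of freedom are N1 + N2 - 2 instead of N - 1.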
Dividing the normal curve into proportions/probabilities
Draw a normal distribution curve, standard deviation dotted lines on either side.
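The proportions under the curve between the standard deviation lines can be checked with the standard library, as a rough sketch:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion of the area within 1, 2 and 3 standard deviations of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, proportion in within.items():
    print(f"within {k} SD: {proportion:.4f}")
```

This gives roughly 68%, 95% and 99.7% of the area.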
Null and alternative hypotheses
Null: Any observed difference between samples is regarded as a chance occurrence resulting from sampling error alone. e.g. There is NO difference between the suicide rates of Protestants and Catholics. Alternative: Opposite of the null hypothesis; we test whether the observed difference is large enough to reject the null. e.g. There IS a difference between the suicide rates of Protestants and Catholics.
Directional hypotheses - One-tailed vs. two-tailed tests
One tailed test: the region of rejection is entirely within one tail of the distribution. Two tailed test: an extreme test statistic in either tail of the distribution (positive or negative) will lead to the rejection of the null hypothesis of no difference. In practice, you should use a one‐tailed test only when you have good reason to expect that the difference will be in a particular direction. A two‐tailed test is more conservative than a one‐tailed test because a two‐tailed test takes a more extreme test statistic to reject the null hypothesis.
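To see why the two-tailed test is more conservative, compare the critical values at the same alpha. This sketch uses z (normal curve) values for simplicity; the t-table values follow the same pattern.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()

# One-tailed: all of alpha sits in one tail
one_tailed = z.inv_cdf(1 - alpha)
# Two-tailed: alpha is split, half in each tail
two_tailed = z.inv_cdf(1 - alpha / 2)

print(round(one_tailed, 3), round(two_tailed, 3))
```

The two-tailed cutoff (about 1.96) is farther out than the one-tailed cutoff (about 1.645).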
Relationship between samples and populations in terms of probability
The probability of obtaining a given sample mean decreases as we move farther away from the true population mean.
Statistical significance vs. practical significance
Statistical significance: Refers to the unlikelihood that mean differences observed in the sample have occurred due to sampling error. Given a large enough sample, even a trivially small population difference can turn out statistically significant. Practical significance: Looks at whether the difference is large enough to matter.
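A worked example with made-up numbers, assuming a z-test with a known standard deviation of 15: a half-point difference is statistically significant with 10,000 people per group, yet the effect size is tiny.

```python
import math
from statistics import NormalDist

# Hypothetical values chosen for illustration
mean_diff = 0.5        # difference between the two group means
sd = 15.0              # assumed known population standard deviation
n_per_group = 10_000

se_diff = sd * math.sqrt(2 / n_per_group)   # standard error of the difference
z = mean_diff / se_diff
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
d = mean_diff / sd                           # effect size (Cohen's d)

print(round(z, 2), round(p_two_tailed, 3), round(d, 3))
```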
Random Sampling/basics of other methods
Simple random sample: Like drawing slips of paper from a hat, so every member has an equal chance of selection. Commonly done using a table of random numbers. Systematic sampling: Lists of population members are sampled at fixed intervals. Stratified random sampling: Involves dividing the population into more homogeneous subgroups and sampling from each. Cluster/Multistage sampling: Used to minimize the cost of surveys; samples are chosen from randomly selected subgroups.
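A rough sketch of the four methods on a hypothetical population of 100 numbered members. The strata and cluster boundaries are arbitrary assumptions.

```python
import random

random.seed(1)
population = list(range(1, 101))  # members numbered 1..100

# Simple random sample: every member has an equal chance
simple = random.sample(population, 10)

# Systematic sample: random start, then every k-th member on the list
k = len(population) // 10
start = random.randrange(k)
systematic = population[start::k]

# Stratified sample: split into subgroups, then sample within each
strata = {"low": population[:50], "high": population[50:]}
stratified = [m for group in strata.values() for m in random.sample(group, 5)]

# Cluster sample: randomly pick whole subgroups and take everyone in them
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [m for c in random.sample(clusters, 2) for m in c]
```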
Sampling Error/Standard Error - What is it and why?
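Sampling error: The inevitable difference between a sample and the population it was drawn from, due to chance. Standard error: The standard deviation of the sampling distribution of means, i.e. the typical size of the sampling error. https://www.youtube.com/watch?v=uGuWrPFStdg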
Repeated measures (Dependent samples) design in general
Time 1 <---------> Time 2: see if there is change over time in the same participants.
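A minimal sketch of a repeated-measures t-test, assuming it is computed from each participant's Time 2 minus Time 1 difference score. The scores and the name `paired_t` are hypothetical.

```python
import math
import statistics


def paired_t(time1, time2):
    """Return (t, df) from the difference scores of the same participants."""
    diffs = [after - before for before, after in zip(time1, time2)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical scores for five participants measured twice
time1 = [10, 12, 9, 14, 11]
time2 = [12, 15, 10, 15, 14]
t, df = paired_t(time1, time2)

critical_05 = 2.776  # two-tailed t-table value for df = 4
print(round(t, 2), df, abs(t) > critical_05)
```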
Type I and Type II errors
Type I: Rejecting the null hypothesis when it is TRUE. If we use a .01 cutoff, the chance of a Type I error is 1 out of 100. With a .05 level of significance, we are taking a bigger gamble: there is a 1/20 (5 out of 100) chance that we reject a true null and conclude that our treatment (or predictor variable) matters when it really doesn't. Type II: Retaining (failing to reject) the null hypothesis when it is FALSE.
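The 5-out-of-100 figure can be checked by simulation. This sketch assumes a two-tailed z-test with a known sigma and a null hypothesis that really is true, so every rejection is a Type I error. The population values, sample size and seed are arbitrary.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(42)
MU, SIGMA, N, TRIALS = 100, 15, 25, 4000
critical_z = NormalDist().inv_cdf(0.975)  # two-tailed cutoff for alpha = .05

false_rejections = 0
for _ in range(TRIALS):
    # The sample is drawn from the null population, so the null is true
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU) / (SIGMA / math.sqrt(N))
    if abs(z) > critical_z:
        false_rejections += 1  # rejected a true null: Type I error

type1_rate = false_rejections / TRIALS
print(type1_rate)  # should land near .05
```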
When to use each type of t-test
Z-test: When you know the population's standard deviation (https://www.cliffsnotes.com/study-guides/statistics/univariate-inferential-tests/one-sample-z-test). Independent samples: Comparing two different groups on the same variable, e.g. the GPA difference between men and women. Repeated measures: Trying to see if a program affects the performance of students over time (the same participants are tested more than once).
Z-test vs. t-test - why use the t-test instead of Z?
Z-tests are statistical calculations that can be used to compare population means to a sample's. They are MORE USEFUL when the standard deviation is KNOWN. However, t-scores are used when you DON'T KNOW the population standard deviation; You make an estimate by using your sample.
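For comparison with the t-test, a one-sample z-test can be sketched as below. The numbers are invented, and sigma is assumed to be the KNOWN population standard deviation, which is what separates this from a t-test.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, mu, sigma, n):
    """z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))


# Hypothetical: sample of 36 with mean 104; population mean 100, sigma 15
z = one_sample_z(sample_mean=104, mu=100, sigma=15, n=36)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_two_tailed, 3))
```

With a t-test, the sample standard deviation would replace sigma, and the critical value would come from the t table at N - 1 degrees of freedom.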
Interpreting Cohen's d/effect sizes
d of about .20 is small, about .50 is medium, and .80 or higher is large.
Distribution of sample means and sampling distributions
A frequency distribution of a large number of sample means that have been drawn from the same population. * Approximates a normal curve * Its mean is equal to the true population mean * The SD of the sampling distribution of means is SMALLER than the SD of the population
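These properties can be illustrated by simulation. The sketch assumes a skewed (exponential) population with mean 1 and standard deviation 1; the sample size, number of samples and seed are arbitrary.

```python
import math
import random
import statistics

random.seed(7)
N = 30          # size of each sample
SAMPLES = 5000  # number of samples drawn from the same population

# Skewed population: exponential with mean 1 and standard deviation 1
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)  # close to population mean (1)
sd_of_means = statistics.stdev(sample_means)    # much smaller than population SD (1)
expected_se = 1 / math.sqrt(N)                  # sigma / sqrt(n)
print(round(mean_of_means, 3), round(sd_of_means, 3), round(expected_se, 3))
```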
Nonrandom sampling
Accidental Sampling: Based on what is convenient for the researcher. The sample is close at hand (e.g. college surveys). Quota Sampling: Takes a tailored sample that is in proportion to some characteristic or trait of the population. Judgment or Purposive Sampling: Logic and sound judgment are used to select a sample from the larger population.
Expected value of sample means
The mean of the sample means equals the population mean: the average sampling error falls close to zero because positive and negative differences cancel each other out.
Importance of Type I and II errors - Effects on research & interpretation of stats
False-positive and false-negative results can also occur because of bias (observer, instrument, recall, etc.). (Errors due to bias, however, are not referred to as type I and type II errors.) Such errors are troublesome, since they may be difficult to detect and cannot usually be quantified.
t-test for independent samples - Typical research situations
First-year graduate salaries by gender. Differences in anxiety by educational level.
How sample size and sample variance/standard deviation affects t results
Sample size and sample variability enter the calculation through the standard error. As you can see from its equation, SE = s / √n, it is an estimate of a parameter, σ (which should become more accurate as n increases), divided by a value that always increases with n, √n. The standard error represents the variability of the means or effects in your calculations. The smaller it is, the larger the t value and the more powerful your statistical test.
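A quick arithmetic sketch with invented numbers: holding the mean difference and sample standard deviation fixed, a larger n shrinks the standard error and raises t.

```python
import math

# Hypothetical values held constant across sample sizes
mean_diff = 2.0   # sample mean minus hypothesized population mean
s = 10.0          # sample standard deviation

results = {}
for n in (25, 100, 400):
    standard_error = s / math.sqrt(n)
    results[n] = (standard_error, mean_diff / standard_error)

for n, (se, t) in results.items():
    print(n, se, t)
```

Quadrupling n halves the standard error and doubles t. A smaller s has the same kind of effect.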
Central Limit Theorem
If you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed.
p < .05, .01 and .001 - What do these really mean?
In the behavioral and social sciences, the general practice is to use either .05 or .01 as the cutoff. The one chosen is called the level of significance. If the probability associated with an inferential statistic is equal to or less than .05, then the result is said to be significant at the .05 level. If the .01 cutoff is used and the probability is equal to or less than .01, then the result is significant at the .01 level.
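As a small illustration, the cutoffs can be applied to a p value like this. The helper name `significance_level` and the example p values are made up.

```python
def significance_level(p):
    """Return the strictest conventional cutoff that p meets, or None."""
    for cutoff in (0.001, 0.01, 0.05):
        if p <= cutoff:  # "equal to or less than" the cutoff
            return cutoff
    return None


# Hypothetical p values
labels = {p: significance_level(p) for p in (0.20, 0.04, 0.008, 0.0004)}
print(labels)
```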
Degrees of freedom calculation - one-sample vs. Independent (two) samples
One sample: df = N - 1. Independent (two) samples: df = N1 + N2 - 2, i.e. total N - 2 instead of N - 1.
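A minimal sketch of an independent-samples t-test showing where N1 + N2 - 2 comes in. It assumes the pooled-variance form; a textbook may write the standard error slightly differently. The scores and the name `independent_t` are hypothetical.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) using the pooled variance of two independent groups."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2  # one degree of freedom lost per group mean
    pooled_var = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se_diff
    return t, df


# Hypothetical scores for two separate groups
group_a = [6, 7, 8, 9, 10]
group_b = [4, 5, 6, 7, 8]
t, df = independent_t(group_a, group_b)

critical_05 = 2.306  # two-tailed t-table value for df = 8
print(round(t, 2), df, abs(t) > critical_05)
```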