Stats Test 2
sampling error
the extent to which sample means selected from the same population differ from one another
treatment
any unique characteristic of a sample or any unique way that a researcher treats a sample Can change value of a dependent variable Associated with variability in a study
To construct the sampling distribution, you must
(1) identify the mean of the sampling distribution (2) compute the standard error of the mean (3) distribute the possible sample means three SEM above and below the mean
Factors that decrease standard error
-As the population standard deviation decreases, standard error decreases -As sample size increases, standard error decreases
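A minimal sketch of the two factors above, assuming the usual formula for the standard error of the mean (population standard deviation divided by the square root of n). The function name and the numbers are hypothetical and chosen only for illustration.

```python
import math

def standard_error(population_sd: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return population_sd / math.sqrt(n)

# Hypothetical values: baseline, larger sample, smaller population SD.
print(standard_error(10, 25))    # baseline: sigma = 10, n = 25
print(standard_error(10, 100))   # larger n gives a smaller standard error
print(standard_error(5, 25))     # smaller sigma gives a smaller standard error
```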
Characteristics of a Normal Distribution (3 and 4)
3. Mean, median, and mode are all located at the 50th percentile Half of the data (50%) in a normal distribution fall above the mean, median, and mode, and half (50%) fall below 4. Normal distribution is symmetrical Distribution of data above the mean is exactly the same as below the mean Folding test (fold distribution in half; perfect overlap)
Characteristics of a Normal Distribution (5 and 6)
5. The mean can equal any value Mean can equal any number from positive infinity (+∞) to negative infinity (−∞) 6. The standard deviation can equal any positive value Data can vary (SD > 0) or not vary (SD = 0) Describing variability as negative is meaningless
Characteristics of Normal Distribution (7 and 8)
7. Total area under the curve is equal to 1.0 Area under the curve varies between 0 and 1 and can never be negative Proportions of the area are used to determine the probabilities for normally distributed data 8. The tails of a normal distribution are asymptotic Tails of the distribution are always approaching x-axis but never touch it Allows for possibility of outliers in a data set
one sample t test
A statistical procedure used to test hypotheses concerning a single group mean in a population with an unknown variance
Locating Proportions
Area at each z score is given as a proportion in the unit normal table Can use the unit normal table to locate the proportion or probability for a score To locate the proportion Step 1: Transform a raw score (x) into a z score Step 2: Locate the corresponding proportion for the z score in the unit normal table
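The two steps above can be sketched in Python, using the standard library's NormalDist in place of a printed unit normal table. The function name and the example score (x = 115 in a population with mean 100 and SD 15) are hypothetical.

```python
from statistics import NormalDist

def locate_proportion(x: float, mu: float, sigma: float):
    """Step 1: convert x to a z score. Step 2: look up the areas for that z."""
    z = (x - mu) / sigma
    area_below = NormalDist().cdf(z)   # proportion of scores below x
    area_above = 1 - area_below        # proportion of scores above x
    return z, area_below, area_above

z, below, above = locate_proportion(115, 100, 15)
print(f"z = {z:.2f}, below = {below:.4f}, above = {above:.4f}")
```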
Step 3 of Hypothesis Testing
Compute the test statistic The value of test statistic can be used to make a decision regarding null hypothesis Helps determine how likely the sample outcome is if the population mean stated in the null is true The larger the value of the test statistic, the further a sample mean deviates from the population mean stated in null hypothesis
cohens d
Measures the number of standard deviations an effect shifted above or below the population mean stated by the null hypothesis Value for Cohen's d is 0 when there is no difference between the two means and increases as differences get larger
omega squared
Omega-squared gives a more conservative estimate
The Unit Normal Table
Probability distribution table displaying a list of z scores and the corresponding probabilities (or proportions of area) associated with each z score listed The unit normal table has three columns: A, B, and C
The mean: central limit theorem
Regardless of the distribution of scores in a population, the sampling distribution of sample means selected from that population will be approximately normally distributed. At least 95% of the possible sample means (M) one could select fall within two standard errors of μ, where the standard error is the SD of the sampling distribution (Empirical Rule)
Type 1 error
Rejecting null hypothesis when it is true
Step 2 of hypothesis testing
Set the criteria for a decision -Done by stating the level of significance -Criterion of judgment upon which a decision is made regarding the value stated in a null hypothesis -Typically the level is set (minimally) at 5% in behavioral research studies -Based on the probability of obtaining a statistic measured in a sample if the value stated in the null hypothesis were true -When the probability of obtaining a sample mean is less than 5% if the null were true, we reject the null
step 1 of hypothesis testing
State the null and alternative hypothesis
To locate the proportion, and therefore probability of obtaining a sample mean:
Step 1: Transform a sample mean (M) into a z score Step 2: Locate where in the distribution this mean lands using the unit normal table
degrees of freedom
The df for a t distribution are equal to the df of the sample variance: n - 1. -As the sample size increases, sample variance more closely resembles population variance -df of the sample variance increases, the shape of the t distribution changes -The probability of outcomes in the tails becomes less likely and the tails approach the x-axis faster as n increases
the variance: skewed distribution rule
The sampling distribution of sample variances tends toward a positively skewed distribution, regardless of the distribution in the population
Locating Scores
The unit normal table can be used to locate scores that fall within a given proportion or percentile To locate the score Step 1: Locate a z score associated with a given proportion in the unit normal table Step 2: Transform the z score into a raw score (x)
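A short sketch of the reverse lookup, assuming Python's NormalDist.inv_cdf stands in for reading the unit normal table backwards. The function name and the population values (mean 100, SD 15) are hypothetical.

```python
from statistics import NormalDist

def locate_score(proportion_below: float, mu: float, sigma: float) -> float:
    """Step 1: find the z score for a proportion. Step 2: convert z to a raw score."""
    z = NormalDist().inv_cdf(proportion_below)
    return mu + z * sigma

# Raw score at the 97.5th percentile of the hypothetical population.
print(round(locate_score(0.975, 100, 15), 1))
```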
Normal Distribution
The theoretical distribution with data that are symmetrically distributed around the mean, median, and mode -Scores closer to the mean are more probable, or likely, than scores further from the mean -Behavioral data that researchers measure often tend to approximate a normal distribution
the t statistic
Used to determine the number of standard deviations in a t distribution that a sample deviates from the mean value or difference stated in the null. t_obt = (M − μ) / s_M, where s_M = SD / √n
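As a worked illustration of the t statistic, here is a minimal sketch with made-up data. The sample scores and the null value of 3 are hypothetical; statistics.stdev divides SS by n - 1, which matches the sample SD used in the estimated standard error.

```python
import math
import statistics

def one_sample_t(sample, mu_null):
    """Return (t_obt, df) for a one-sample t test."""
    n = len(sample)
    m = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample SD, computed with n - 1
    s_m = sd / math.sqrt(n)              # estimated standard error
    return (m - mu_null) / s_m, n - 1

scores = [4, 6, 5, 7, 8]                 # hypothetical sample
t_obt, df = one_sample_t(scores, mu_null=3)
print(f"t({df}) = {t_obt:.2f}")
```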
one sample z test
Used to test hypotheses concerning the mean in a single population with known variance
An alternative was proposed by Gosset
We substitute the population variance with the sample variance in the formula for standard error -This substitution, called the estimated standard error, is the denominator of the test statistic for a t test -This is acceptable because sample variance is an unbiased estimator of the population variance
Estimated Standard Error
an estimate of the standard deviation of a sampling distribution of sample means selected from a population with an unknown variance. It is an estimate of the standard error or standard distance that sample means deviate from the value of the population mean stated in the null hypothesis. s_M = SD / √n
goal of hypothesis testing
determine the likelihood that a hypothesized value of a population parameter, such as the mean, is true
effect
difference between sample mean and population mean stated in null hypothesis Insignificant when null is retained Significant when null is rejected
to estimate the size of the effect in the population,
effect size is computed (typically only calculated after finding a significant effect (reject null))
step 4 requires
making a decision on whether to retain or reject the null. There are four possible outcomes: (1) the decision to retain the null is correct; (2) the decision to retain the null is incorrect (Type II error); (3) the decision to reject the null is correct; (4) the decision to reject the null is incorrect (Type I error)
Estimated Cohen's d—
measure of effect size in terms of the number of SDs that mean scores shift above or below the population mean stated by the null hypothesis The larger the value of d, the larger the effect
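A minimal sketch of estimated Cohen's d, assuming the common one-sample form: the mean difference divided by the sample standard deviation. The data and the null value are hypothetical.

```python
import statistics

def estimated_cohens_d(sample, mu_null):
    """Number of sample SDs that the sample mean lies from the null value."""
    m = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample SD, computed with n - 1
    return (m - mu_null) / sd

scores = [4, 6, 5, 7, 8]                 # hypothetical sample
d = estimated_cohens_d(scores, mu_null=3)
print(f"d = {d:.2f}")
```

The sign of d shows the direction of the shift; its absolute value is what is compared against effect size conventions.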
Proportion of Variance—
measure of effect size in terms of the proportion or percent of variability in a dependent variable that can be explained or accounted for by a treatment
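For a t test, proportion of variance is often computed directly from the test statistic. The sketch below assumes the common textbook formulas eta-squared = t² / (t² + df) and omega-squared = (t² − 1) / (t² + df); these formulas are not stated in the cards, and the values t = 3.0, df = 9 are hypothetical.

```python
def eta_squared(t: float, df: int) -> float:
    """Proportion of variance: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

def omega_squared(t: float, df: int) -> float:
    """More conservative estimate: (t^2 - 1) / (t^2 + df)."""
    return (t ** 2 - 1) / (t ** 2 + df)

t, df = 3.0, 9                           # hypothetical test result
print(f"eta-squared   = {eta_squared(t, df):.2f}")
print(f"omega-squared = {omega_squared(t, df):.2f}")
```

Under these formulas omega-squared is always smaller than eta-squared for the same t and df, because 1 is subtracted in the numerator.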
Type II error
Retaining the null hypothesis when it is not true
effect size
size of an effect in a population How far scores have shifted in the population Percent of variance that can be explained by a given variable Most meaningfully reported with significant effects (decision to reject null)
cohen's effect size conventions
small: d < .2; medium: .2 < d < .8; large: d > .8
Sampling design
specific plan or protocol for how individuals will be selected or sampled from a population of interest. Must address the following two questions: Does the order of selecting participants matter? Do we replace each selection before the next draw?
Purpose of hypothesis testing
to test claims or ideas about a group or population
Law of large numbers
Increasing the number of observations or sample size will decrease standard error. The smaller the standard error, the closer a distribution of sample means will be to the population mean
Z score
value on the x-axis of a standard normal distribution. The numerical value specifies the distance, in standard deviations, of a value from the mean
Sampling without replacement
when sampling, each participant selected is not replaced before the next selection. Most common method used in behavioral research. Example: suppose you place eight squares on a desk in front of you, with two marked A, two marked B, two marked C, and two marked D. First draw: probability of selecting a square marked A is p(square marked A on 1st draw) = 2/8 = .25. Second draw: do not replace square A, therefore the probability of selecting a square marked A is p(square marked A on 2nd draw) = 1/7 = .14
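The eight-squares example can be checked with exact fractions. This is a small sketch; the function name is hypothetical, and it assumes a square marked A was removed on the first draw.

```python
from fractions import Fraction

def p_select(marked: int, total: int) -> Fraction:
    """Probability of drawing one of `marked` squares out of `total` squares."""
    return Fraction(marked, total)

# First draw: 2 squares marked A among 8 squares.
first_draw = p_select(2, 8)
# Second draw without replacement: 1 square marked A left among 7 squares.
second_draw = p_select(1, 7)

print(first_draw, round(float(first_draw), 2))
print(second_draw, round(float(second_draw), 2))
```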
Power
—the probability of rejecting a false null hypothesis It is the likelihood that we will detect an effect, assuming an effect exists
Characteristics of a Normal Distribution (1 and 2)
1. Normal distribution is mathematically defined However, rarely do behavioral data fall exactly within limits of the above formula 2. Normal distribution is theoretical Behavioral data typically approximates a normal distribution
three assumptions of one sample t test
1. Normality—assume that data in the population being sampled are normally distributed 2. Random Sampling—assume that the data were obtained using a random sampling procedure 3. Independence—assume that probabilities of each measured outcome in a study are independent
Directional, One-Tailed Tests (H1: >) or (H1: <):
Alternative hypothesis is stated as greater than (>) or less than (<) null Interested in specific alternative from the null hypothesis Upper-tail critical test (H1 > H0), level of significance placed in the upper tail of distribution Lower-tail critical test (H1 < H0), level of significance placed in the lower tail of distribution
Non-Directional, Two-Tailed Tests (H1: ≠):
Alternative hypothesis is stated as not equal to (≠) the null Interested in any alternative from null hypothesis
Relationship between effect size and power
As effect size increases, power increases
null hypothesis
H0: a statement about a population parameter (such as the mean) that is assumed to be true. Example: Children in the United States watch an average of 3 hours of TV per week
three measures of effect size
Estimated Cohen's d Eta-Squared (proportion of variance) Omega-Squared (proportion of variance)
eta-squared
Eta-squared tends to overestimate proportion of variance explained by treatment
One tailed tests
Greater power If value stated in null hypothesis is false, then this test will make it easier to detect and reject
alternative hypothesis
H1: a statement that contradicts the null hypothesis. We think the null is wrong, and H1 allows us to state what we think is wrong. Example: Children in the United States watch more or less than 3 hours of TV per week. In any case, can predict H1 to be <, > or ≠ H0 (Directionality!)
Setting the criteria: critical values
In setting your significance level at a particular level, for example 5%, you will need to find the critical values that establish your rejection region. These critical values are looked up in the table that lists critical values for the test statistic you are calculating. Critical values are "the number(s) to exceed"
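For a z test, the critical values that a table would list can be sketched with NormalDist.inv_cdf from Python's standard library. The function name is hypothetical, and the sketch covers only the z distribution, not the t table.

```python
from statistics import NormalDist

def z_critical(alpha: float = 0.05, tails: int = 2) -> float:
    """Positive critical z value; for a two-tailed test alpha is split across both tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

print(round(z_critical(0.05, tails=2), 2))   # two-tailed: cutoffs at plus and minus this value
print(round(z_critical(0.05, tails=1), 3))   # one-tailed, upper-tail critical
```

For a lower-tail critical test the cutoff is the negative of the one-tailed value.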
other ways to increase power
Increase effect size, sample size, and alpha level. Decrease beta, population standard deviation, and standard error
sample size and power
Increase in sample size decreases standard error, thereby increases power
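To make the link between sample size and power concrete, here is a sketch for one specific case: an upper-tail one-sample z test, where power = 1 − Φ(z_crit − d·√n) and d is the true effect size in population SD units. That formula and the values d = 0.5, n = 16 and n = 64 are assumptions chosen for illustration, not taken from the cards.

```python
import math
from statistics import NormalDist

def power_upper_tail_z(d: float, n: int, alpha: float = 0.05) -> float:
    """Power of an upper-tail z test for a true effect of d population SDs."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is centered at d * sqrt(n).
    return 1 - NormalDist().cdf(z_crit - d * math.sqrt(n))

for n in (16, 64):
    print(n, round(power_upper_tail_z(0.5, n), 3))
```

With no true effect (d = 0) the function returns alpha, which is the Type I error rate rather than power in the usual sense.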
Step 5 of Hypothesis Testing
Interpret your statistical decision as it relates to YOUR hypothesis Think about what makes a Results section of an empirical paper readable!
Using the estimated standard error in the denominator of the test statistic led to a new sampling distribution known as the t distribution
It is like a normal distribution, but with greater variability in the tails
Column B
Lists the area between a z score and the mean. First value is .0000 (area between the mean and z = 0). As the z score moves away from the mean, the proportion of area between the score and the mean increases closer to .5000
Column C
Lists the area from a z score toward the tail As z score increases, area between that score and the tail decreases closer to .0000
Column A
Lists the z scores Only lists positive z scores (for negative z scores, recall that distribution is symmetrical) Listed from z = 0 at the mean to z = 4.00 above the mean
Step 4 of hypothesis testing
Make a decision Based on the probability of obtaining a sample mean, given that the value stated in the null is true (represented by p value) 1. Reject the null hypothesis—the sample mean is associated with low probability of occurrence if the null is true p value <.05; reached significance 2. Retain the null hypothesis—the sample mean is associated with high probability of occurrence when null is true p value >.05; failed to reach significance
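The decision rule can be sketched for a two-tailed one-sample z test. The function name and all numbers (a null mean of 3, a population SD of 2, and n = 16) are hypothetical; the p value is computed from the standard normal distribution.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(m, mu_null, sigma, n, alpha=0.05):
    """Return (z_obt, p_value, decision) for a two-tailed one-sample z test."""
    standard_error = sigma / math.sqrt(n)
    z_obt = (m - mu_null) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_obt)))
    decision = "reject the null" if p_value < alpha else "retain the null"
    return z_obt, p_value, decision

for sample_mean in (4.0, 3.5):
    z_obt, p, decision = two_tailed_z_test(sample_mean, mu_null=3, sigma=2, n=16)
    print(f"M = {sample_mean}: z = {z_obt:.2f}, p = {p:.4f}, {decision}")
```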
two tailed tests
More conservative More difficult to reject null hypothesis Eliminates possibility of Type III error
Answering both questions leads to two strategies for sampling
Theoretical sampling- sampling strategy used in development of statistical theory Experimental sampling- sampling strategy most commonly used in experimental research
Type III error
Type of error for one-tailed tests Occurs when we retain null hypothesis because rejection region was located in wrong tail.
Sampling and conditional probabilities
To avoid bias, researchers use a random procedure to select a sample from a population -To be selected at random, all individuals in a population must have an equal chance of being selected -The probability of selecting each participant must be the same
going from z to t
To compute a zobt score, population variance must be known In behavioral science it is rare that the variance in a population is known
Reading the t Table
To locate probabilities and critical values in a t distribution, a t table is used. We will use the t distribution and critical values listed in this table to compute t tests Need to know: n, α, and location of rejection region
Summarize results in APA format
To report results of a z test, report the test statistic, p value, and effect size Do not state that we reject or retain null Instead, report whether a result is significant or insignificant Not required to report exact p value, but is recommended Often necessary to include a figure or table to illustrate significant effect and effect size associated with it
Sampling Distributions: the mean
To see how well a sample mean estimates the value for a population mean, construct a sampling distribution The sample mean is related to the population mean in three ways The sample mean is an unbiased estimator It follows central limit theorem It has a minimum variance
the variance
We can also use the sampling distribution to characterize the sample variance. The characteristics of the sample variance are: -The sample variance is an unbiased estimator when we divide SS by df (or n - 1) -Distribution of the sample variances follows the skewed distribution rule -Distribution of sample variances has no minimum variance when we divide SS by df
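The unbiasedness claim can be checked by brute force on a tiny population. This sketch assumes theoretical sampling (order matters, sampling with replacement) and a hypothetical population of three scores; exact fractions avoid rounding error.

```python
from fractions import Fraction
from itertools import product

def sample_variance(sample):
    """Sample variance: SS divided by df (n - 1)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    ss = sum((x - mean) ** 2 for x in sample)
    return ss / (n - 1)

def population_variance(population):
    """Population variance: SS divided by N."""
    size = len(population)
    mean = Fraction(sum(population), size)
    return sum((x - mean) ** 2 for x in population) / size

population = [2, 4, 6]                            # hypothetical population
samples = list(product(population, repeat=2))     # all 9 ordered samples of n = 2
variances = [sample_variance(s) for s in samples]
average_sample_variance = sum(variances) / len(variances)

print(average_sample_variance, population_variance(population))
```

The sample variances themselves are 0, 2, or 8 here, with most values low and a few high, which is in line with the positive skew described by the skewed distribution rule.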
Sampling Distributions
a distribution of all possible sample means or variances that could be obtained in samples of a given size from the same population -You can then compare the statistics obtained in the samples to the value of the mean and variance in the hypothetical population
Standard Normal Transformation
a formula used to convert any normal distribution to a standard normal distribution with a mean of 0 and standard deviation of 1. z = (x − μ) / σ for a population of scores; z = (x − M) / SD for a sample of scores. Use the transformation to locate where a score would fall in the standard normal distribution. Once you know the location, use the standard normal distribution to find the probability
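A quick sketch of the sample version of the transformation, z = (x − M) / SD, applied to every score in a small hypothetical data set. After the transformation the z scores have a mean of 0 and a standard deviation of 1.

```python
import statistics

def to_z_scores(scores):
    """Convert each raw score to a z score using the sample mean and sample SD."""
    m = statistics.mean(scores)
    sd = statistics.stdev(scores)        # sample SD, computed with n - 1
    return [(x - m) / sd for x in scores]

scores = [4, 6, 5, 7, 8]                 # hypothetical sample
z_scores = to_z_scores(scores)
print([round(z, 2) for z in z_scores])
print(round(statistics.mean(z_scores), 2), round(statistics.stdev(z_scores), 2))
```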
Standard Normal Distribution
a normal distribution with a mean equal to 0 and a standard deviation equal to 1 Distributed in z score units along the x-axis
A sampling distribution can be
converted to a standard normal distribution by applying the z transformation z transformation is used to determine the likelihood of measuring a particular sample mean from a population with a given mean and variance
alpha level
level of significance or criterion of a hypothesis test Researchers control for Type I error by stating a level of significance (also called an alpha level) The significance level is the largest probability of committing Type I error that we will allow and still decide to reject the null hypothesis Criterion is usually set at .05 α level is compared to the p value in making a decision
hypothesis testing
method for testing a hypothesis about a population, using data measured in a sample
Sampling with replacement
sampling in which each participant selected is replaced before the next selection. Ensures probability for each selection is the same. Typically not necessary in behavioral research because the populations are large. Example: given a population of 100 women, the probability of selecting the first woman is p = 1/100 = .01, and the probability of selecting a second woman without replacement is p = 1/99 ≈ .01
the mean: unbiased estimator
The sample mean is an unbiased estimator: when M = Σx/n, then M = μ on average
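That "M = μ on average" can be verified by listing every possible sample from a tiny population. The sketch assumes theoretical sampling (order matters, with replacement) and a hypothetical population of three scores.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]                   # hypothetical population
population_mean = Fraction(sum(population), len(population))

# Every ordered sample of size n = 2, drawn with replacement.
sample_means = [Fraction(sum(s), len(s)) for s in product(population, repeat=2)]
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(mean_of_sample_means, population_mean)
```

Individual sample means range from 2 to 6, but their average equals the population mean exactly.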