Exam 4


Shape of Sampling Distribution of the Mean: The Central Limit Theorem states that...

"Regardless of the shape of the Population, as sample size (N) increases, the shape of the sampling distribution of the mean approaches a Normal Distribution." • When N ≥ 25-30, the Sampling Distribution is assumed to be approximately normal (mesokurtic). • If a Population is normally distributed, the sampling distribution is normally shaped and the sample can be any size. • If the Population has - a mild skew, the sample size needs N ≥ 10-12 - a moderate skew, it needs N ≥ 18-20 - a strong to extreme* skew, it needs N ≥ 25-30. *With extreme scores, this technique should not be used.

Conditions/Assumptions of Hypothesis Testing

- Can define the mean & standard deviation of the population & its related sampling distribution. - Population is finite, empirical. - Sampling Distribution of the Mean is finite. - Sample Ps are randomly sampled from the population. - Behaviour we study (DV) is normally distributed in the Population (but OK if sample N ≥ 25-30). - Randomness is key. - Data will be scores (interval or ratio).

The sample is..

ALWAYS EMPIRICAL

Sampling Distribution of the Mean when σ Known

- Given: μ = 90.00 seconds, σ = 10.20 and z = ±1.96, compare the raw-score axis for X (N = 1) with the axis for X̄ (N = 50). ‣ The z-values stay the same for the area under the curve, but the equivalent raw values change as sample size increases from 1 to 50. - When N = 1, the critical regions are 70.00 or lower and 110.00 or higher (mean = 90.00). - When N = 50, the critical regions are about 87.20 or lower and 92.80 or higher (mean = 90.00). - The population σ is fixed; it is the standard error, σX̄ = σ/√N, that decreases as N increases. With N = 50 the sample mean only needs to be about 2 standard errors (roughly 2.8 seconds) from μ to be significant, because with more subjects the possible means cluster closer to μ. - Conceptually: if you take 3 people from a massive population, their mean will vary a lot from sample to sample; if you add more people, extreme scores balance out and there is less variability among the possible means.
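
The critical raw values on this card can be checked with a short calculation. This is a minimal sketch that uses only the numbers given on the card (μ = 90.00, σ = 10.20, z = ±1.96); the function name `critical_means` is made up for illustration.

```python
import math

MU, SIGMA, Z_CRIT = 90.00, 10.20, 1.96  # values given on the card

def critical_means(n):
    """Lower and upper cut-offs for the sample mean when sigma is known."""
    se = SIGMA / math.sqrt(n)  # standard error of the mean
    return MU - Z_CRIT * se, MU + Z_CRIT * se

for n in (1, 50):
    lower, upper = critical_means(n)
    print(f"N = {n:2d}: reject H0 if mean <= {lower:.2f} or mean >= {upper:.2f}")
```

The z cut-off is identical in both cases; only the standard error, and therefore the raw-score cut-offs, changes with N.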

Sampling Distribution of the Mean ‣ A Definition

- If N = 1, the Sampling Distribution is simply the standardized Population Distribution. - If N > 1, the Sampling Distribution gives all the possible values of the Mean for sample size N, along with the probability of getting each value. - Procedure when N > 1 (e.g., N = 2): go to the population -> randomly draw 2 scores -> compute their mean -> place that mean in the sampling distribution -> repeat for every possible sample. ex. with a sample size of 30 people, take their mean and add it to the distribution; the resulting distribution will differ depending on whether N = 20 or N = 40. • This assumes random sampling from the Population. - Five steps are needed to create the Probability Distribution of Means. - The Sampling Distribution of the Mean is a Probability Distribution, which is the Null Distribution for your study.

‣ Define three (3) distributions for Hypothesis Testing (HT)

- Population (Step 1) - Sampling distribution of the Mean for a given N (Step 2) - Sample (Step 4) (step 3= probability of alpha)

Changes in Definition of 3 Distributions: Population

- Still an empirical distribution - Parameters symbolized by Greek letters & numerically defined Central tendency (μ) & Variability (σ). - Shape: Any shape from rectangular to a normal curve; but best if not skewed. - Data: Distribution of all values (X) for members of the population - Statistic of mean becomes sample mean, not all individual values - shape= NA dealing w two points

The t-Table

- Probabilities (α levels) are listed across the top of the table. - At .05 with df = 1 (two people in the study), the critical t-score is 12.7062, so to be significant the sample mean must fall at least 12.71 estimated standard errors from μ. - df 1-30: most studies shouldn't go beyond this range. - Always be conservative: if your df falls between table rows, use the next lower row (df up to 34 -> use 30; df = 59 -> use 50). The larger critical value decreases the probability of a Type 1 error. - Minimize Type 1 error; would rather make a Type 2 error. - If σ is known, use the z-table; if σ is estimated, use the t-table.

The sampling distribution is always based on?

- the population

Comparison of the t-Test to the z-Test -> differences

- z can only be single sample ‣ This t-test differs from the z-test because... - Population parameter σ is unknown. - This means the shape of the Sampling Distribution is unknown. - The sample standard deviation (S) is the best point estimate of the unknown parameter σ. - We use the value of SX̄ to estimate σX̄. - Critical values differ for each df, so you need a special Table of Critical Values to find the value for each df. ‣ Problems with the Single Sample t-Test of Inference - Poor design: Still no baseline or control group - Limited ability to interpret the outcome meaningfully

Note, the SPSS probability (Sig.) automatically defaults to a..

2-tailed test... to get a p(obs) for a 1-tailed test, it is simply, Sig ÷ 2 p = .246

Step 5 Compare our observed and critical values

Compare p(obs) with p(a), or z obs with z crit (draw the observed score and the critical regions so you know where it falls) - p(obs) = 0.004 < p(a) = 0.05 - |z obs| = 2.91 > |z crit| = 1.96 (z obs = −2.91 falls beyond −1.96) ‣ Statistical Decision about the Null hypothesis - Reject the Null Hypothesis that "racing speeds for 50 THC'ed hedgehogs do not differ from the population".

Application: Single Sample Design ‣ Step 1:

Define H1 & H0 - H1: "Following the ingestion of .5mg of THC twice a day for 2 weeks, the mean racing speed of the 50 skateboarding hedgehogs will differ from the average racing speed of the population." Symbolically: H1 : X ≠ μX, H1 : μX ≠ 90 - H0: "Following the ingestion of .5mg of THC twice a day for 2 weeks, the mean racing speed of the 50 skateboarding hedgehogs will not differ from the average racing speed of the population" Symbolically: H0 : X = μX, H0 : μX = 90

Select the A Priori Criterion: Step 3: Define Cut-off (Critical) Region

For a 2-tailed test: p(a) = .05, z = ±1.96 - Because we know the population parameters, we can calculate the raw-score means that correspond to z crit. ‣ Mean score (X̄) values that define the region of rejection - To get σX̄, divide σ by the square root of N: σX̄ = 10.20/√50 = 1.4425 - Lower: X̄ = μX̄ + z(σX̄) = 90.00 + (−1.96)(1.4425) = 87.1727 - Upper: X̄ = μX̄ + z(σX̄) = 90.00 + (1.96)(1.4425) = 92.8273 -> so the sample mean must be below 87.17 or above 92.83 to be significant

Generating a Sampling Distribution of the Mean

Generating a sampling distribution based on a sample mean ‣ This is an extension of probability theory from... - Sampling distributions for a single value of X to... - Sampling distributions for the sample mean. - We move beyond a single participant: the sample is summarized by its mean score. - Population = individual scores -> sample = summarized by its mean -> so we create a sampling distribution of means. - How do I compare a mean? The possible means depend on sample size: - smaller sample size = more variability in potential means - bigger sample size = less variability ‣ We will apply this to two research designs - Value for σ is known or can be computed, so the sampling distribution is generated from an Empirical population, or... - Value for σ is not known & thus must be estimated, so the sampling distribution is generated from a Theoretical population. - If the population parameters are not known, use the t-test.

Steps for Creating a Sampling Distribution

Given: A Population of 4 scores & a sample size of N=2 ‣ Step 1: List all possible outcomes of a random draw of 2 values for X * Apply sampling with replacement * Draw 1 is independent of Draw 2 * When N=2, total number of outcomes is 4 power 2= 16 ‣ Step 2: Compute the Mean for each pair of scores. ‣ Step 3: Compute the probability for each outcome by * applying the Multiplication Rule ‣ Step 4: Create a Probability (Frequency) Table by * applying the Addition Rule ‣ Step 5: Create a Graphic Figure of the distribution of possible means.

Step 2. Choose a Sampling Distribution

Model of HT: Random sampling Research design: Single sample design Type of data: Scores (Racing speed in seconds) Statistic for Analysis: X (mean racing speed after THC) Population Parameters: μ= 90.00, σ= 10.20 (known) Shape of distribution: Normally distributed Sample size: N= 50 ‣ Choose the Sampling Distribution of the Mean for N= 50 when population parameters are known(can use z-score) - Will use the z-score distribution as population parameters are known.

Single Sample Design when σ Unknown

Population Data: Scores (X's) Mean: μ SD: σ Shape: Unknown (best if ND) Sampling Distribution of the Mean when σ unknown Data: Means for df___ Mean: μX SD: SX Shape: Kurtotic (symmetrical, the more sample the closer to ND it is) Sample (N=___) Data: Scores Mean: X (bar) SD: S Shape: Not applicable

Three Distributions for a single sample design

Population: Data: 4 scores Mean: μ = 2.50 SD: σ = 1.12 Shape: Rectangular Sampling Distribution of the Mean: Data: 16 Means of N = 2 Mean: μX̄ = 2.50 SD: σX̄ = 0.79 Shape: Normal distribution Sample: Data: 2 scores (X1, X2) Mean: ? SD: ? Shape: Not applicable

Research Design: Single Sample

Population: Data: Scores (500 X's) Mean: μ = 90.00 secs SD: σ = 10.20 secs Shape: Known to be Normal distribution Sampling Distribution of the Mean: Data: Means Mean: μX̄ = 90.00 secs SD: σX̄ = 1.44 secs Shape: Normal distribution Sample: Data: Scores (50 X's) Mean: X̄ = 85.80 secs SD: 8.00 secs Shape: Not applicable

Step 4 Gather Data, Compute Descriptives, Do Test: Generate a Test Statistic

Population: μ = 90.00 secs, σ = 10.20 secs Sample Data: N = 50, X̄ = 85.80 secs, SD = 8.00 secs - σX̄ = σ/√N = 10.20/√50 = 1.4425 - zX̄ = (X̄ − μX̄)/σX̄ = (85.8000 − 90.0000)/1.4425 = −2.9116, so z = −2.91 -> The area under the curve beyond z = −2.91 is p = (1.0000 − 0.9982) = 0.0018 - Because we are conducting a non-directional (2-tailed) test, we must double the p (to account for the fact that the outcome may have occurred at the other end); therefore p(obs) = 0.0018 + 0.0018 = 0.0036 - p = .004
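
The arithmetic on this card can be reproduced as follows. This is a minimal sketch that assumes a normal sampling distribution and uses `math.erfc` in place of the z-table, so the p-value differs slightly from the table-rounded figure; `z_test_mean` is a hypothetical helper name.

```python
import math

def z_test_mean(sample_mean, mu, sigma, n):
    """Single-sample z-test of a mean when sigma is known."""
    se = sigma / math.sqrt(n)                         # standard error of the mean
    z = (sample_mean - mu) / se                       # test ratio
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))   # area in both tails
    return se, z, p_two_tailed

se, z, p = z_test_mean(85.80, 90.00, 10.20, 50)
print(f"SE = {se:.4f}, z = {z:.4f}, two-tailed p = {p:.4f}")
print("Reject H0" if p < 0.05 else "Retain H0")
```

`erfc(|z|/√2)` equals twice the upper-tail area of the standard normal curve, which is the doubling step described on the card.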

Changes in Definition of 3 Distributions: sample

Sample: Statistic of interest is the Sample mean (Xbar) - Varies in size (N>1) - Always finite - Statistics symbolized by Roman letters & numerically defined Central tendency (Xbar ) & Variability: (SD) - Shape: Not really applicable as we are concerned with the mean

Example of Creating a Sampling Distribution

Sampling Distribution of all possible outcomes when N=2 from a population of 4 scores (X=1,2,3,4) ‣ Step 1: List all possible outcomes of a random draw of 2 values (pairs) for X (ex. you could get a 1 and a 1, a 1 and a 2, a 1 and a 3, or a 1 and a 4) ‣ Step 2: Compute the Mean for each pair of scores (ex. for the pair 1,1: (1+1)/2 = 1, so mean = 1) ‣ Step 3: Compute the probability for each outcome by applying the Multiplication Rule: each score (1, 2, 3 or 4) has a probability of 1/4 = 0.25 on each draw, so each pair has p = (0.25)(0.25) = 0.0625, and the 16 pairs together sum to p = 1.0000 ‣ Step 4: Create a frequency table (smallest mean on the bottom): count how many times each mean occurs, then divide by the total number of possible outcomes (16) to make a Probability Table by applying the Addition Rule (= relative frequency, sums to 1.00; now we can say what the probability of getting a mean of 2.50 is) ‣ Step 5: Create a Graphic Figure of the distribution of all outcomes: the Sampling Distribution of the Mean (N=2) - looks like a normal distribution ‣ Step 6: Population parameters

Descriptive Measures: Sampling Distribution

Sampling distribution of the mean for N = 2: the total number of outcomes equals 4 (the scores 1, 2, 3 or 4) to the power of 2 (because we draw 2 scores from the population and find their mean), which = 16 possible outcomes. If N > 1, the Sampling Distribution gives all the possible values of the Mean for sample size N, along with the probability of getting each value - Data: 16 means; values for the mean are from the table - Mean: μX̄ = (sum of means)/16 = 40/16 = 2.5000 - Variance: σ²X̄ = (sum of squared deviations of the means from μX̄)/16 = 10.0000/16 = 0.6250 - Std Dev.: σX̄ = square root of 0.6250 = 0.7906
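
The five steps and the descriptive measures can be reproduced by brute-force enumeration. This is a minimal sketch for the population of scores 1, 2, 3, 4 with N = 2 and sampling with replacement; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
import math

population = [1, 2, 3, 4]
n = 2

# Step 1: every ordered draw of n scores, sampling with replacement.
samples = list(product(population, repeat=n))

# Step 2: the mean of each draw.
means = [Fraction(sum(s), n) for s in samples]

# Step 3: Multiplication Rule, each draw has probability (1/4) * (1/4).
p_each = Fraction(1, len(population)) ** n

# Step 4: Addition Rule, add the probabilities of draws sharing a mean.
distribution = {m: count * p_each for m, count in sorted(Counter(means).items())}

# Descriptive measures of the sampling distribution.
mu_mean = sum(m * p for m, p in distribution.items())
var_mean = sum((m - mu_mean) ** 2 * p for m, p in distribution.items())
se = math.sqrt(var_mean)

for m, p in distribution.items():
    print(f"mean = {float(m):.2f}  p = {float(p):.4f}")
print(f"mu = {float(mu_mean):.4f}, variance = {float(var_mean):.4f}, SE = {se:.4f}")
```

Exact fractions are used so that the probabilities sum to exactly 1 and the variance comes out as exactly 10/16.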

Changes to Steps of Hypothesis Testing step 2:

Select Sampling Distribution Same Model of Hypothesis Testing: Random Sampling (ALWAYS) Type of Data: Score (interval/ratio) Population Parameters: μ & σ are known Different: Research Design: Single Sample Statistic for Analysis: Sample Mean (X) Sample Size: N>1 Select the "Sampling Distribution of the Mean when N=50 and population parameters are known"-> then can use Z-test

Steps of HT: Single Sample t-Test step 2:

Select sampling distribution and define its parameters Model of hypothesis testing: Random sampling Type of data: Score (interval/ratio) Research design: Single sample Statistic for analysis: Sample mean (X̄) Sample size: N > 1 (should be N ≥ 7) Different: Population parameters: μ known and σ unknown Select the "sampling distribution of the mean for t-scores for df=____" OR "sampling distribution of the mean when σX̄ is unknown for df____"

Changes to Steps of Hypothesis Testing step 1:

State hypotheses in verbal & symbolic form Sample mean is compared to the mean of the Sampling Distribution of the Mean H1 : X ≠ μX, H0 : X = μX

Steps of HT: Single Sample t-Test step 1:

State the Alternative and Null Hypotheses H1 : X ≠ μX, H0 : X = μX

Measures of the Sampling Distribution: Measure of Central Tendency

Symbol: μX̄ (μ mean). It represents the "Mean of the Sampling Distribution of the Mean", which is an estimate of the Population Mean (μ). The variability among means occurs because of errors of estimation via random sampling.

Measures of the Sampling Distribution: Measure of Variability

Symbol: σX̄ (σ mean). It represents the "Standard Deviation of the Sampling Distribution of the Mean", also called the "Standard Error of the Mean" or "Standard Error". - The standard error is a measure of error among the means and is an estimate of Population variability modified by sample size. Formula: σX̄ = σ/√N

Step 6 Interpret and Report your outcome

The racing speed of the hedgehogs who had ingested THC in their brownie every day for 2 weeks was reliably faster (X̄ = 85.80 secs, SD = 8.00) than the average racing speed of the colony of hedgehogs (μ = 90.00 secs, σ = 10.20), SE = 1.44, z = −2.91, p = .004. These results, however, must be treated with caution because... ‣ There is a tiny possibility (p = .004) that the sample was a very fast group anyway. It is more probable that the brownies had an effect, but this is still a flawed study because a Single Sample Research Design (drawn from a known population) has NO BASELINE.

Relationship Between N and Standard Error

Two Principles Apply 1. σmean varies directly with σ and is reduced by the size of the sample, N 2. As the value for N increases, the value for σmean decreases relative to σ (SD of population) • Thus, as sample size INCREASES, we make smaller errors when trying to estimate σ (populations SD) from the values in the sampling distribution. ‣ Explanation: Mathematical & Conceptual - As N increases, the probability of extreme values for the means is smaller, and thus we have smaller deviations from the mean. - Random error always present: XO= XT + XE but its influence on a measure of variability is reduced by N.
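
The two principles can be seen numerically. This is a small sketch that reuses σ = 10.20 from the hedgehog example; the sample sizes are chosen arbitrarily for illustration, and `standard_error` is a made-up helper name.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of N."""
    return sigma / math.sqrt(n)

SIGMA = 10.20  # population SD from the hedgehog example
for n in (1, 4, 25, 50, 100):  # illustrative sample sizes
    print(f"N = {n:3d}: standard error = {standard_error(SIGMA, n):.4f}")
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.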

Assumptions of the t-Test

‣ Assumptions of applying the t-test to Single Sample Research Design in Hypothesis Testing 1. Random sampling of participants from population. - Assumed but rarely done. - OK... Just do not generalize outcome to population. 2. Dependent variable (scores) is normally distributed in population*. 3. Data are scores: Interval or ratio scale of measurement (a must!). 4.N ≥ 7 *The t-test is moderately robust to violations of the normality assumption (#2)... EXCEPT when scores are very skewed in population & you have a directional hypothesis.

Issue #2: Degrees of Freedom (df)

‣ Definition - "The number of 'units' or pieces of information (N) on which a statistic is based (e.g., SD) minus the number of population parameters being estimated". ‣ Value for df - Gives the number of scores that are free to vary ‣ For a Single Sample design: df = N − 1 - 1 population parameter (σ) is estimated, so we lose 1 df

t-Statistic Sampling Distribution of the Mean Definition, characteristics, shape

‣ Definition (t-score distribution) - "The distribution of all possible means (X) for each value of df when σX is being estimated by SX" ‣ Characteristics of the t-distribution: - mean: μX = μ (or 0 in standard score units) - SD: SX estimated standard error of the mean. SX= S/ square root of N - X-axis: Values range from: −∞ to +∞ ‣ Shape is kurtotic (unimodal and symmetrical) - low values of df (platykurtic); high values of df (mesokurtic)

Issue #1: Need value for σ and σX

‣ Distribution of the Mean: When σ is known... - If we randomly sample from the population • μX is an unbiased estimate of μ; that is, μX= μ • X is a slightly biased estimate of μ (may be higher or lower) - We use σ to estimate σX using the formula: σX= σ/ square root of N ‣ Distribution of the Mean: When σ is not known... - We can use SD as an estimate of σ - But, SD is a biased estimator of σ (it underestimates it) - We can reduce this bias by using a "degrees of freedom" adjustment

Sampling Distribution of the Mean

‣ Generated from the raw scores of population - Represents the probability distribution of all possible means of sample size N ‣ Is a distribution of standard scores ( Z-scores). ‣ The Mean & Standard Deviation of Sampling Distribution of the Mean are estimates of its Population. ‣ Descriptives of Sampling Distribution combine Greek & Roman Central tendency μxmean & Variability (standard error) =σ xmean - Variance of Sampling Distribution of the Mean is σxbar squared ‣ Shape: Normal Distribution ‣ Distribution used to determine whether our experimental outcome is of statistical significance.

Degrees of Freedom (df) Adjustment

‣ How does a Degrees of Freedom (df) adjustment work? - Given: N = 4 and X̄ = 5.00, once three of the scores are known, the fourth will be a unique (fixed) value... no other value works. ex. we have X's of 8, 4, 5, ? and they must sum to N × X̄ = 20.00, so ? = 20 − 8 − 4 − 5 = 3 -> it couldn't be any other number! Only N − 1 = 3 scores are free to vary.
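
The same constraint can be written as a one-line calculation. This is a minimal sketch using the scores from the card; `last_score` is a hypothetical helper name.

```python
def last_score(known_scores, n, mean):
    """Return the one score that is forced once n - 1 scores and the mean are fixed."""
    if len(known_scores) != n - 1:
        raise ValueError("exactly n - 1 scores must be known")
    return n * mean - sum(known_scores)

print(last_score([8, 4, 5], n=4, mean=5.00))  # the fourth score is not free to vary
```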

Why the Change?

‣ In a Single Participant design... - A single value from the sample (X) is compared to a sampling distribution of single z-scores (SND). ‣ But... Research studies typically analyze groups of Ps, not a single P. - Problem: Cannot compare a sample to a (Sampling) Distribution of single values of z. - Solution: Generate a new Sampling Distribution based on a distribution of means - Before, we could see where one score was in relation to all other scores; now we use a sampling distribution based on the means. - It is unique to every single study -> it represents chance. ‣ How many values should make up the mean? - Problem: What about the effect of sampling variation? That is, different size samples produce different possible means. - Solution: A Sampling Distribution of means is generated based on the sample size (N) of the study - The sampling distribution is unique because of N, so the amount of variability changes when N does.

Summary

‣ Issues when estimating population parameter - If σ (σX) is unknown, shape of Sampling Distribution is unknown. - If the shape of Sampling Distribution is unknown, we cannot use the standard normal curve as Sampling Distribution ( z-scores) ‣ Must know shape, so we must compute an estimate of σ so that a value for σX can be estimated (S is an estimate of σ, not σX). - Use S to compute SX, which is an estimate of σX • SX is a biased estimate of σX • Degrees of freedom: Reduce bias when one (or more) population parameters are estimated from the sample statistics • If df < 1000, need unique Sampling Distribution for t-statistic ‣ Table of Critical Values for Student's t - Gives unique Sampling Distribution for each df value for tcrit - Critical values for t vary for df & α

Comparison of the t-Test to the z-Test -> similarities

‣ This t-test is like the z-test because... - Both are standardized scores. - Applied to a single sample research design. - Statistic for analysis is the Sample Mean (X̄). - DV represents scores that are normally distributed in the population. - μ (population mean) is either known or can easily be estimated • can be estimated from pilot studies, previous research, etc. - The t-test ratio examines treatment effect reduced by error.

Example: Single Sample Design Given SX (Estimated standard error of the Sampling Distribution of the Mean)

‣ Preference for "new improved" crib mobile: N = 25 6-week-old infants, μ = 5.00 s, X̄ = 6.75 s, S = 3.00, SX̄ = 0.60 ‣ H1: Infants' looking time at the crib mobile will differ from the average looking time of 6-week-old infants: H1: X̄ ≠ μX̄ or H1: μX̄ ≠ 5.00 ‣ H0: Infants' looking time at the crib mobile will not differ from the average looking time of 6-week-old infants: H0: X̄ = μX̄ or H0: μX̄ = 5.00 ‣ t(24)crit = ±2.0639 (go to df = 24 in the t-table and, because it is a two-tailed test, look under .05 to find the critical value 2.0639) ‣ t = (6.75 − 5.00)/0.60 = 2.9167 (the observed t-score, which is read like a z-score, is 2.92; this is bigger than 2.06, which means it is significant), ∴ Reject H0 ‣ Report: The average time spent by 6-week-old infants looking at the mobile (X̄ = 6.75, S = 3.00) was longer than that of the population of infants (μ = 5.00), SE = 0.60, t(24) = 2.92, p < .01 (p is less than .01 because 2.92 beats the critical values 2.49 and 2.79 but cannot beat 3.0905)
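
The "which critical values can it beat" reasoning can be sketched in code. The critical values below are the ones quoted on the card for df = 24. The two-tailed α labels attached to 2.4922 and 3.0905 are assumptions taken from a standard t-table, so check them against the course table. `smallest_alpha_beaten` is a made-up helper name.

```python
# Two-tailed critical values for df = 24 (alpha labels for .02 and .005 are
# assumed from a standard t-table; verify against the course table).
T_CRIT_DF24_TWO_TAILED = {0.05: 2.0639, 0.02: 2.4922, 0.01: 2.7969, 0.005: 3.0905}

def smallest_alpha_beaten(t_obs, crit_by_alpha):
    """Smallest alpha whose critical value |t_obs| meets or exceeds, or None."""
    beaten = [alpha for alpha, crit in crit_by_alpha.items() if abs(t_obs) >= crit]
    return min(beaten) if beaten else None

t_obs = (6.75 - 5.00) / 0.60  # (sample mean - mu) / estimated standard error
alpha = smallest_alpha_beaten(t_obs, T_CRIT_DF24_TWO_TAILED)
print(f"t(24) = {t_obs:.4f}, report p < {alpha}")
```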

Sampling Distribution when σ is Unknown/ 3 issues

‣ Problem: What if σ is unknown? - We will not know the shape of the sampling distribution so... - cannot use the z-table for probability values, and thus... - cannot apply the z-test ratio for hypothesis testing. ‣ Issue #1: Problem: Value of σ not known - Solution: Use the value for the SD of the sample ‣ Issue #2: Problem: SD is a biased estimate of σ - Solution: Reduce bias by making a 'degrees of freedom' adjustment to SD (S is then used to compute SX̄) ‣ Issue #3: Problem: Using SX̄ in the test ratio changes the shape of the Sampling Distribution, so NO z distribution! - Solution: New "Table of Critical Values for t-statistic"

Issue #3: Estimating Standard Error, SX, problem

‣ S is used to estimate σ, because we need a value for σX̄ ‣ Our estimate for the standard error of the Sampling Distribution becomes SX̄ = S/√N - SX̄ is the Estimated standard error of the Sampling Distribution of the Mean ‣ Problem!!! - The distribution of z-scores requires σ to be a constant in order to have a single (normal) curve for the Sampling Distribution & fixed z-scores. - S (and SX̄) is subject to sampling variation and is not a constant! - S can be anything; it's not constant, it's estimated - We can no longer use the z-Table

Descriptive Measures: Sample

‣ Sample Statistics (N=2)-> know a priori - The sample is based on randomly sampling two scores (with replacement) from the population, and computing the Mean and Standard Deviation - There are a total of 16 possible draws, so there are 16 possibilities of Means and Standard Deviations. - These possibilities are illustrated in the table - for each draw can compute a mean and SD but dont know until compute - mean of sampling distribution= mean of population - know that theres 16 possible outcomes-> but only see one at a time

A Probability (Sampling) Distribution

‣ Sampling Distributions are based on the sample size (N) & the statistic of interest which is the mean from the Sample being applied in the analysis. ‣ We identify the measures of central tendency and variability of the Sampling Distribution by combining the symbols for central tendency and variability from the Population with the symbol for the statistic of interest from the Sample ‣ In the case where the statistic of interest from the sample is the Mean... - Population symbol for central tendency is μ, statistic of interest is (mean) • Sampling Distribution symbol is μ mean - Population symbol for variability is (σ), statistic of interest is (mean) • Sampling Distribution symbol is (σ mean)

Single Sample Hypothesis Testing with an Empirical Population

‣ Single Sample Research Design: Empirical Population - Statistic for analysis: Sample Mean (Xbar or M) - Application of the Zmean test ratio

Example: Single Sample Design Given S

‣ Statistics professors' level of ecstasy at end of term using the well standardized Yahoo Euphoria Scale (YES). ‣ N = 16, μ = 100.00, X̄ = 106.00, S = 12.87, t(15)crit = ±2.1314 - First calculate: SX̄ (standard error, SE) = S/√N = 12.87/√16 = 3.2175 - Then calculate: tX̄ = (X̄ − μX̄)/SX̄ = (106.00 − 100.00)/3.2175 = 1.8648 ‣ At the end of term, statistics professors' feelings of jubilation, as measured by the YES, were not different from the population μ = 100, SE = 3.22, t(15) = 1.86, p > .05 (tX̄ = 1.86, which is less than the 2.1314 in the t-table)
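
The two calculation steps on this card can be checked with a short function. This is a minimal sketch for the case where S is given directly; `t_from_summary` is a hypothetical helper name, and the critical value is the t(15) figure from the card.

```python
import math

def t_from_summary(sample_mean, mu, s, n):
    """Single-sample t-test ratio from summary statistics."""
    se = s / math.sqrt(n)          # estimated standard error of the mean
    t = (sample_mean - mu) / se    # test ratio
    return se, t

T_CRIT = 2.1314  # two-tailed, alpha = .05, df = 15 (from the card)

se, t = t_from_summary(106.00, 100.00, 12.87, 16)
decision = "Reject H0" if abs(t) > T_CRIT else "Retain H0"
print(f"SE = {se:.4f}, t(15) = {t:.4f} -> {decision}")
```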

Changes to Steps of Hypothesis Testing step 3 and 4:

‣ Step 3: No changes... still need to determine your p(a) ‣ Step 4: New Test Ratio symbolized by zX̄ (z-test of means). Variation of the z-test ratio for the Single Participant design: zX̄ = (X̄ − μX̄)/σX̄ Criteria for using this test ratio 1. Statistic for Analysis = Sample Mean (X̄) 2. Population Parameters are known 3. Sampling Distribution is normally shaped -> Uses μX̄ and σX̄ because we compare the outcome of the Sample to values in the Sampling Distribution, NOT to values in the Population Distribution.

Steps of HT: Single Sample t-Test step 3, 4, 5, 6

‣ Step 3: Set the a priori criterion for p(a). Determine t crit from the table based on df & α ‣ Step 4: Carry out the Experiment. Graph your data (apply the interocular trauma test). Calculate descriptive statistics: X̄, S, SX̄. Apply the new test ratio (t-test) to the data - can't adjust after this step ‣ Step 5: Make your Statistical Decision. Compare p(obs) to p(a), or t(obs) to t(crit) ‣ Step 6: Report your outcome and interpret your data. Report both in APA format and in practical terms.

Test Ratio for Distribution of t-Scores

‣ Test Ratio - We use this with the Random Sampling Model of Hypothesis Testing - We still require three distributions: Population, Sampling Distribution, Sample ‣ Single sample research design when σ is unknown and estimated from S: tX̄ = (X̄ − μ)/SX̄

The t-Table of Critical Values

‣ The Table of Critical Values for the distribution of the t-statistic - is a Numerical representation for a series of sampling distribution of Student's t (the t-statistic). ‣ These critical t-values are used in both - Hypothesis testing & Estimation: when σX is unknown & must be estimated by SX (SD estimate of population SD mean) ‣ Each Sampling Distribution of the t-statistic is unique for each df - At small df, the critical values are high! - As df increases, the critical values become smaller until they eventually equal the z-distribution - Distribution of t-scores is a "family of curves" -> bigger the sample size is (N)-> less variability= more mesokurtic-> more towards a standard normal distribution as df or N increase (look at table: family of curves)-> big variability when t(2) vs when t(60)

Shape and Critical Values for t

‣ Values for t crit change systematically with df. - The smaller the df, the larger is value for t crit ‣ Value for df based on how many estimates needed of the population parameters (usually estimates of variability). ‣ Sampling distribution of t-scores is numerically represented as... "Table of Critical Values for t-statistic". ‣ Based on mathematical theory and a priori probability of possible outcomes.

Example: Data Provided as Raw Scores

‣ Was the mean score for the Final Exam of 9 students given extra time higher than the mean of the Final Exam (μ = 72.00) for all Psyc 300A students? (Why is this 1-tailed? Because we don't care if it was lower.) t(8)crit = +1.8595, X̄ = 75.0000, SS = 1250.0000 - S² = SS/(N − 1) = 1250.0000/8 = 156.2500 - S = √S² = 12.5000 - SX̄ = S/√N = 12.5000/3.0000 = 4.1667 ‣ Compute the test statistic: tX̄ = (X̄ − μX̄)/SX̄ = (75.0000 − 72.0000)/4.1667 = +0.7200 ‣ Make your statistical decision: t obs = +0.7200 < t crit = +1.8595 (it has to be more than t crit to be significant), p(obs) ≈ 0.20 (one-tailed) > p(α) = .05 (one-tailed; .20 is not less than .05) - retain the Null hypothesis ‣ Formal Report: Allowing the nine students extra time to complete the Final Exam resulted in a mean score (X̄ = 75.00, S = 12.50) no higher than the average Final Exam score for all students (μ = 72.00), SE = 4.17, t(8) = 0.72, p > .05
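
The chain from SS to the test statistic can be traced step by step. This is a minimal sketch using the summary values on the card; the raw scores themselves are not given, so the calculation starts from SS.

```python
import math

# Summary values from the card (the raw scores are not listed).
n, sample_mean, mu, ss = 9, 75.0, 72.0, 1250.0
T_CRIT_ONE_TAILED = 1.8595  # t(8), alpha = .05, upper tail

variance = ss / (n - 1)          # S^2, with the df adjustment
s = math.sqrt(variance)          # S, estimate of sigma
se = s / math.sqrt(n)            # estimated standard error of the mean
t_obs = (sample_mean - mu) / se  # test ratio

# Directional (upper-tail) test: only a large positive t counts.
decision = "Reject H0" if t_obs > T_CRIT_ONE_TAILED else "Retain H0"
print(f"S^2 = {variance:.4f}, S = {s:.4f}, SE = {se:.4f}, t(8) = {t_obs:.4f} -> {decision}")
```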

New Formula, New Symbol

‣ We have been using the formula SD = √(SS/N) - To compute SD as a Descriptive Statistic only - When we are doing Inferential Analysis with an Empirical Population (that is, we know σ so WE CAN USE Z-SCORES) ‣ When we are doing Inferential Analysis and using SD to estimate σ, we use the following formula: S = √(SS/(N − 1)) - We use the symbol S (or s) for Inferential Analysis when σ is unknown; it comes from the sample - S is the estimate of the population σ and is used to compute the estimated standard error - Lower sample sizes = S varies more
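
The difference between the two formulas can be seen on a small data set. This is a sketch only: the four scores are hypothetical (borrowed from the degrees-of-freedom card purely for illustration), and Python's `statistics.pstdev` and `statistics.stdev` are used because they divide by N and N − 1 respectively.

```python
import math
import statistics

scores = [8, 4, 5, 3]  # hypothetical scores, for illustration only

mean = statistics.fmean(scores)
ss = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations

sd_descriptive = math.sqrt(ss / len(scores))       # SD: divide by N
s_inferential = math.sqrt(ss / (len(scores) - 1))  # S: divide by N - 1

# The standard library gives the same two values.
assert math.isclose(sd_descriptive, statistics.pstdev(scores))
assert math.isclose(s_inferential, statistics.stdev(scores))

print(f"SS = {ss:.2f}, SD = {sd_descriptive:.4f}, S = {s_inferential:.4f}")
```

S is always a little larger than SD for the same data, and the gap shrinks as N grows.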

SD as a Biased Estimator of σ

‣ What do we mean by SD is a biased estimator of σ? - SD is based on the mean of our sample X - The mean of our sample X is a slightly biased estimator of the population mean (μ)- it may be higher or lower ‣ Let's assume that our population μ = 6 and our sample is as follows: (look at example) -

Distribution of the t-Statistic

‣ What's new? - The critical value, t crit, is adjusted for each df and (a) - The t-Table has critical values for 43 unique Sampling Distributions - random sampling model - used when estimating and dont know sigma - small df= critical values are high - solid line= z - dotted line= quite high

What Happens if Population is Theoretical?

‣ When our Population is Theoretical (essentially Infinite), we may or may not know the values of the mean (μ) and/or standard deviation (σ), or the shape of the distribution. This means that our Sampling Distribution is also Theoretical, and we will not know its mean, standard deviation, or shape. - Not knowing μ is typically not a problem (we can estimate it using X̄) - Why do we need to know σ and the shape of the distribution? • We use σ to estimate σX̄ • Knowing the shape of the Sampling Distribution allows us to... • Use the z-table for probability values, and thus... • Apply Null Hypothesis Testing and Estimation

Sampling Distribution of the Mean — Review

‣ When the Population is Finite, it means it is also Empirical; that is, we know the value for its - Mean(μ) - Standard Deviation(σ) - Shape (can be anything, but best if mesokurtic) ‣ This means that the Sampling Distribution (of the Mean of the population) is also Finite & Empirical, and the Sampling Distribution will be fully defined in terms of - Value for Mean,(μX) & - Standard Deviation,(σX) & - Shape (will be mesokurtic if Central Limit Theorem applies).
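
The Central Limit Theorem condition mentioned here can be illustrated with a small simulation. This is a sketch under stated assumptions: the skewed population is a made-up exponential one, the seed and sample counts are arbitrary, and `skewness` and `sample_means` are hypothetical helper names. Skewness near 0 indicates a symmetrical shape.

```python
import random
import statistics

def skewness(values):
    """Skewness as the mean cubed z-score (0 for a symmetrical shape)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical strongly skewed population (exponential scores).
population = [rng.expovariate(1.0) for _ in range(20_000)]

def sample_means(n, draws=4_000):
    """Means of `draws` random samples of size n, drawn with replacement."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(draws)]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 10, 30)}
for n, skew in skew_by_n.items():
    print(f"N = {n:2d}: skewness of the sampling distribution = {skew:.2f}")
```

The skew of the distribution of means shrinks as N grows, which is why a larger N is asked for when the population is more strongly skewed.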

The Solution to the Problem

‣ When the df (sample sizes) are small... - values for S (and SX) fluctuate because of the increased probability of extreme means - the shape of the sampling distribution flattens (becomes platykurtic) and changes for each df value. ‣ When the df (sample size) is large... - values for S(estimated SD and Sx estimated SD of population) become more stable and approximates σX - the shape of the sampling distribution approaches the SND- standard normal distribution ‣ Solution - Unique Table of Critical Values based on the degrees of freedom - The distribution of the t-Statistic!

Finite Sampling Distribution

‣ When μ and σ are known & Central Limit Theorem criteria are met - Shape is mesokurtic: i.e., Normal Curve - We can use the Standard Normal Distribution (SND) as our Sampling Distribution (of the Mean) - The area under the curve is fixed for each z-value - For z = 1.96, .025 or 2.5% of the area under the curve lies to the right of the value ‣ Actual raw scores associated with z-scores vary based on sample size (N)

