Statistics Ch 9 & 10

Why can the null hypothesis only be proven FALSE and not true?

The true state of a null hypothesis CANNOT BE TESTED DIRECTLY because it describes only the POPULATION (not the sample). Based on sample evidence, you either REJECT the null or FAIL TO REJECT it; failing to reject is not proof that it is true. See: Different Types of Errors (Table 9.1)

What if my sample and population averages are zero?

Then the sample selection was ideal: the sample was drawn randomly and fairly represents the population, with no sampling error in the mean

critical value

The critical value is a threshold or cutoff point used in hypothesis testing. It is determined by the chosen significance level (α) and the degrees of freedom of the statistical test, and it marks the point beyond which you would reject the null hypothesis. If the test statistic (e.g., a t statistic or z statistic) calculated from your data exceeds the critical value, the observed results are statistically significant, and you reject the null hypothesis.

publication bias

The tendency for journals to publish positive findings but not negative or ambiguous ones

What is the difference between significance and meaningfulness?

"Significance" refers to the results of hypothesis testing: whether the observed differences or effects are likely to have occurred by random chance or are statistically significant. "Meaningfulness" goes beyond statistical significance and considers whether the observed results have practical or substantive importance in a real-world context.

Why do statisticians consider effect size more important than significance?

1) Discourages p-hacking (manipulating data or analysis to obtain statistically significant results)
2) Crucial for understanding the PRACTICAL and CLINICAL significance of findings in medicine, psychology, education, the social sciences, etc.
3) Less affected by sample size fluctuations
4) Provides a standardized, more interpretable and informative measure of impact

Overall Steps

1) Form the null and alternative hypotheses
* Null Hypothesis (H0): The new drug is equally effective as the standard treatment; there is no significant difference in cholesterol reduction between the two groups.
* Alternative Hypothesis (H1): The new drug is more effective than the standard treatment; there is a significant difference in cholesterol reduction between the two groups.
2) Conduct a randomized controlled experiment with a sample
* Run a randomized controlled trial (RCT) where you randomly assign 100 patients with high cholesterol into two groups:
** Group A: Receives the new drug.
** Group B: Receives the standard treatment.
3) Calculate the mean, standard deviation, and other statistical measurements for the t-test
* Sample Mean (Group A, new drug): 30 mg/dL reduction.
* Sample Mean (Group B, standard treatment): 20 mg/dL reduction.
4) Determine your level of confidence
* Reject the null hypothesis if the p-value is less than 0.05 (alpha of 5%), indicating that the observed difference is unlikely to have occurred by random chance when the null hypothesis is true.
5) Find evidence to support REJECTING THE NULL HYPOTHESIS (t-test)
* The calculated p-value is p = 0.03.
6) Form a conclusion based on the confidence level and p-value.
* Since the calculated p-value (p = 0.03) is less than the chosen significance level α = 0.05 (5%), there is evidence to reject the null hypothesis. The observed difference in cholesterol reduction between the two groups (30 mg/dL vs. 20 mg/dL) is statistically significant: if the null hypothesis were true, a difference this large would occur by random chance only about 3% of the time.
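The steps above can be sketched in code. This is a minimal illustration with made-up data (the individual cholesterol reductions are invented so that the group means match the example's 30 and 20 mg/dL); it computes Welch's t statistic from scratch rather than a p-value, and the critical value of roughly 2.1 assumed for a two-tailed test at α = .05 with these small samples is an approximation.

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical cholesterol reductions (mg/dL) -- invented for illustration,
# chosen so the group means are 30 (new drug) and 20 (standard treatment).
group_a = [28, 35, 31, 27, 33, 30, 28, 28]   # new drug
group_b = [18, 22, 21, 17, 23, 19, 20, 20]   # standard treatment

def welch_t(x, y):
    """Welch's t statistic for two independent samples."""
    nx, ny = len(x), len(y)
    vx, vy = stdev(x) ** 2, stdev(y) ** 2
    return (mean(x) - mean(y)) / sqrt(vx / nx + vy / ny)

t = welch_t(group_a, group_b)
# For a two-tailed test at alpha = .05 with samples this small, the
# critical t is roughly 2.1; |t| beyond it means reject H0.
print(round(t, 2), abs(t) > 2.1)
```

With these invented numbers the obtained t far exceeds the critical value, mirroring the reject-the-null conclusion in the card.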

Significance Level EXAMPLES

1) Mothers Working Experiment
* Findings occurred at the .05 level (p < .05)
** This means there is 1 chance in 20 (5%) that any differences found were due not to the hypothesized reason but to some random reason (chance)
2) A researcher is interested in seeing whether there is a difference between the academic achievement of children who participated in a preschool program and that of children who did not participate.
* Null Hypothesis: The two groups are equal to each other on some measure of achievement.
* Research Hypothesis: The mean score for the group of children who participated in the program is higher than the mean score for the group of children who did not participate in the program.
* Goal: Ensure any difference that exists between the two groups is due only to the effects of the preschool experience and no other factor or combination of factors. Through a variety of techniques, you control or eliminate all the possible sources of difference, such as the influence of parents' education, number of children in the family, and so on. Once these other potential explanatory variables are removed, the only remaining alternative explanation for differences is the effect of the preschool experience itself.
** BUT you will never be sure whether the sample accurately reflects the population of preschoolers.
*** Even if it does, you will never be sure whether influences other than the aforementioned might affect the outcome.
By concluding that the differences in test scores are due to differences in treatment, YOU ACCEPT SOME RISK. This degree of risk is the level of statistical significance at which you are willing to operate.

Z Test Statistic Steps

1) State the null and research hypotheses
* H0: xbar = mu
* Ha: xbar =/= mu
2) Set the level of risk (significance level / Type 1 error rate)
* p < 0.05
3) Select the appropriate test statistic
* See the flow chart; this example uses a one-sample z test
4) Compute the test statistic (obtained value)
* SEM = 2.5 / sqrt(36) = 0.42
* z = (100 - 99) / 0.42 = 2.38
5) Determine the value needed to reject the null hypothesis (critical value)
* Reading the z table, +-1.96 cuts off .025 probability in each tail
6) Compare the obtained value and critical value
* Our obtained value (2.38) exceeds the critical value of +-1.96 for a null hypothesis at the .05 level with 36 participants.
** Because our sample falls outside "chance" as the most likely explanation of why the sample mean and population mean differ, we can now make a decision.
7) Make a decision based on obtained value > critical value (or vice versa)
* Because the obtained value (2.38) > critical value (1.96), the null hypothesis should be rejected. There is a significant difference between the sample mean and the population mean. Although you have demonstrated a difference, a true statistician delves further and researches what factors account for these differences.
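Steps 4–7 above can be checked with a few lines of code. This sketch reproduces the card's arithmetic, including its rounding of the SEM to 0.42 before dividing (without that intermediate rounding, z comes out as 2.4 rather than 2.38).

```python
from math import sqrt

# One-sample z test from the example: sample mean 100, population mean 99,
# population SD 2.5, n = 36. The card rounds the SEM to 0.42 first.
sem = round(2.5 / sqrt(36), 2)      # 2.5 / 6 -> 0.42
z = round((100 - 99) / sem, 2)      # -> 2.38
critical = 1.96                     # two-tailed critical z at alpha = .05
print(z, z > critical)              # obtained > critical -> reject H0
```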

When is it appropriate to use the one-sample z test?

1) When the population parameters are known
* mean, standard deviation
2) When the data are NORMALLY DISTRIBUTED
3) When the sample size is large enough
* n > 30
4) When the probability of making a Type 1 error (level of significance, alpha) is determined
* Commonly .05 and .01
5) When testing whether the sample mean is significantly different from a known population mean
* H1: mu >, <, or =/= mu0

Reasons a significant difference is not 100%

1) You could just be wrong.
* Perhaps the difference between adolescent attitude and the mother working was due to a factor inadvertently unaccounted for
** Like a 'Mothers Who Work Club' speech several students attended.
** What if the genders of each adolescent group were unequal?
2) The sample might not contain a 100% accurate representation of the population
* Obvious reasons such as clerical or measurement errors, etc.
SOLUTION
* Account for a certain amount of uncontrolled error
** Establish a LEVEL OF CHANCE OR RISK you will take.
*** Express this value as the SIGNIFICANCE LEVEL

INVESTIGATING A MAGIC TRICK (P-Value)

1. Think of a p-value as a Measure of Surprise
Imagine you're investigating something unusual, like a magic trick. You want to know if the magician's performance is genuinely magical or if it could have happened by chance. The p-value is a measure of how surprised you should be if there's no magic involved.
2. Null Hypothesis as "No Magic"
In this analogy, the null hypothesis (H0) is your starting point. It's like saying, "Assume there's no magic; the magician's tricks are just random chance." You start with skepticism, assuming no effect or difference.
3. Collecting Data
Now, you collect data by observing the magician's performance or conducting experiments. This data might be how many times the magician successfully performs the trick, just like in a scientific experiment where you collect measurements.
4. The p-value
The p-value tells you how likely it is to observe the data you collected if there's no magic, only random chance. The smaller the p-value, the more surprised you should be if there's no magic. In other words, it quantifies how inconsistent your data is with the idea that there's no effect.
* If p is very small (e.g., p < 0.05 at alpha 5%), your data is quite surprising if there's no magic. You might start to doubt the "no magic" assumption.
* If p is large (e.g., p > 0.05), your data is not very surprising if there's no magic. You might conclude that the magician's performance is explainable by random chance.
5. Setting a Threshold
Researchers typically set a threshold called the significance level (α), often at 0.05.
* If p is less than α, they reject the "no magic" assumption, knowingly accepting an α-sized risk of committing a Type 1 error.
* If p is greater than α, they fail to reject the null hypothesis (no magic).
6. Drawing Conclusions
* If you reject the "no magic" assumption (p < 0.05), you might conclude that there's evidence of magic in the magician's performance (i.e., the effect is statistically significant).
* If you fail to reject it (p > 0.05), you don't have enough evidence to claim there's magic; the results could be due to random chance alone.
7. Remember the Limitations
* A small p-value doesn't prove the effect is practically significant or important.
* A large p-value doesn't prove there is no effect; it only means you lack the evidence to detect one.

Z-Score Cheat Sheet

95%
* +- 1.96
99%
* +- 2.56
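These cheat-sheet values come from the inverse normal CDF, which Python's standard library exposes directly. Note that the more precise 99% value is 2.58 (2.576); the 2.56 above is the rounding this text uses.

```python
from statistics import NormalDist

# For a two-tailed interval at confidence c, the cutoff is the
# (1 + c) / 2 quantile of the standard normal distribution.
z95 = NormalDist().inv_cdf((1 + 0.95) / 2)
z99 = NormalDist().inv_cdf((1 + 0.99) / 2)
print(round(z95, 2), round(z99, 2))  # 1.96 and 2.58 (texts sometimes print 2.56)
```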

Cohen's d

A measure of effect size
PURPOSE
* Assesses the difference between two means in terms of standard deviation, not standard error
INTERPRETATION
* Small effect size (0 to .2)
* Medium effect size (.2 to .8)
* Large effect size (.8 or above)
1-SAMPLE Z TEST
* d = (xbar - mu) / sigma
* d = (sample mean - population mean) / population standard deviation
Value of 0
* No difference between the two distributions
Value of 1
* The two groups overlap about 45%
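The one-sample formula above is a one-liner in code. This sketch reuses the numbers from the one-sample z-test card (sample mean 100, population mean 99, population SD 2.5).

```python
def cohens_d(sample_mean, pop_mean, pop_sd):
    """Cohen's d for a one-sample z test: the mean difference in SD units."""
    return (sample_mean - pop_mean) / pop_sd

# Numbers from the z-test example elsewhere in these notes.
d = cohens_d(100, 99, 2.5)
print(d)  # 0.4 -> a medium effect by the guidelines above
```

Note that d uses the standard deviation, not the standard error, so unlike the z statistic it does not grow with sample size.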

T-Test

A statistical test used to evaluate the size and significance of the difference between two means. PURPOSE * Finds evidence to REJECT THE NULL HYPOTHESIS

If .05 is good and .01 is "better", why not set your Type 1 level of risk at .0000001?

A stringent Type I error rate allows for little leeway * The research hypothesis might be true but you'd never reach that conclusion with a too-rigid Type I level of error.

Type 2 Error

Accepting a false null hypothesis
Notated as the Greek letter beta, or β
TRADE-OFF WITH POWER
* Power = the probability of rejecting a null hypothesis when it is false (1 - β)
HOW TO MINIMIZE
* Increase sample size (brings the sample's characteristics closer to the population's)
EXAMPLE
* There may really be (statistically significant) differences between the populations represented by the sample groups, but you mistakenly conclude there are not.

TESTS OF SIGNIFICANCE

Based on the fact that each null hypothesis is associated with a particular statistic and each statistic is associated with a particular distribution
* Null Hypothesis -> Statistic -> Distribution
1) Create the null hypothesis statement
* A statement of equality, assumed to be "true" given no other information
2) Set the risk level (aka level of significance, aka chance of Type 1 error)
* Decide the degree to which you could be WRONG given the null hypothesis is true (traditionally .05 or lower)
** Less risk = a smaller alpha = a stricter threshold for significance
* Risk is always required because you never know the "true" relationship between variables
3) Select the appropriate test statistic
* Each null hypothesis requires a particular approach
4) Compute the test statistic value (aka obtained value, aka observed value)
5) Determine the value needed to reject the null hypothesis using the appropriate critical values table for that statistic
* Each statistic (along with group size and risk) has a critical value associated with it
** Ex: z-score table
6) Compare the obtained value and critical value (crucial)
* The value you computed vs. the value expected by chance alone
7a) If the OBTAINED VALUE > CRITICAL VALUE, the null hypothesis CANNOT BE ACCEPTED
* The H0's statement of equality/chance IS NOT the most valid explanation for the differences found.
* The most valid explanation is the treatment, or your independent variable.
** Only if your obtained value is more extreme than what would happen by chance (meaning that the result of the test statistic is not a result of some chance fluctuation) can you say that any differences you obtained are not due to chance.
7b) If the OBTAINED VALUE < CRITICAL VALUE, the null hypothesis SHOULD BE ACCEPTED
* If you cannot show that the difference you obtained is due to something other than chance (such as the treatment), then the difference must be due to chance or something you have no control over. In other words, the null is the best explanation.

Why does the confidence interval itself get larger as the probability of your being correct increases (from, say, 95% to 99%)?

Because the wider 99% confidence interval (25.6 points, [51.2, 76.8]) compared with the 95% interval (19.6 points, [54.2, 73.8]) encompasses a larger number of possible outcomes, so you can be more confident that it captures the population mean.

Real-World Stats: Visual Representation

Been to the doctor lately? Had your test results explained to you? Know anything about the use of electronic medical records? In this study, Noel Brewer and his colleagues compared the usefulness of tables and bar graphs for reporting the results of medical tests. Using a z test, the researchers found that participants required less viewing time when using bar graphs rather than tables. The researchers attributed this difference to the superior performance of bar graphs in communicating essential information (and you well remember from Chapter 4, where we stressed that a picture, such as a bar graph, is well worth a thousand words). Also, not very surprisingly, when participants viewed both formats, those with experience with bar graphs preferred bar graphs, and those with experience with tables found bar graphs equally easy to use. Next time you visit your doc and he or she shows you a table, say you want to see the results as a bar graph. Now that's stats applied to real-world, everyday occurrences! Want to know more? Go online or to the library and find . . . Brewer, N. T., Gilkey, M. B., Lillie, S. E., Hesse, B. W., & Sheridan, S. L. (2012). Tables or bar graphs? Presenting test results in electronic medical records. Medical Decision Making, 32, 545-553.

Confidence Intervals

Best estimate of the population parameter (a range) given the sample statistic
* How much confidence can we have that the population mean will fall between two scores?
* 95% confidence interval = correct 95% of the time = +- 1.96 z scores/SDs
EXAMPLE
1) Mean (spelling score) = 64 (of 75 words), n = 100 (6th graders), std = 5
* What confidence can we have in predicting the mean spelling score for the ENTIRE POPULATION of 6th graders?
** 95% confidence interval = 64 +- 1.96(5) = range of 54.2 to 73.8
** 99% confidence interval = 64 +- 2.56(5) = range of 51.2 to 76.8
* We are 95% confident the 6th-grade population's mean spelling score is between 54.2 and 73.8, and 99% confident it is between 51.2 and 76.8
NOTE
* Most statisticians compute confidence intervals via the standard deviation of the mean (the standard error of the mean)
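The spelling-score example can be reproduced in a few lines. Following the card's own arithmetic, this multiplies the z value by the standard deviation (as the NOTE points out, most statisticians would use the standard error of the mean instead).

```python
# Confidence intervals for the spelling example: mean 64, SD 5.
mean_, sd = 64, 5

ci95 = (round(mean_ - 1.96 * sd, 1), round(mean_ + 1.96 * sd, 1))
ci99 = (round(mean_ - 2.56 * sd, 1), round(mean_ + 2.56 * sd, 1))
print(ci95)  # (54.2, 73.8)
print(ci99)  # (51.2, 76.8)
```

The 99% interval is wider than the 95% one, which is exactly the trade-off described in the card above: more confidence requires a larger range.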

Inferential Statistics: The Maternal Employment & Adolescent Attitude Experiment

Decisions made about populations based on information about samples
1) Select representative samples for two groups:
* Adolescents w/ mothers who work
* Adolescents w/ mothers who do not work
2) Administer a test to each adolescent, assessing his or her attitude
* Calculate and compare the mean scores
* Ensure the test is reliable and valid
3) Reach a conclusion regarding the scores' difference
* A result of chance (factor(s) other than mom working), or
* A statistically significant difference (results due to mom working)
4) Reach a conclusion regarding the relationship between maternal employment and adolescent attitude
* An INFERENCE, based on sample data analysis, is made about the population of all adolescents

How do I select the correct test statistic?

EXAMINING RELATIONSHIPS BETWEEN VARIABLES
* 2 variables: T TEST (significance of the correlation coefficient)
* 3+ variables: REGRESSION, FACTOR ANALYSIS, CANONICAL ANALYSIS
EXAMINING DIFFERENCES BETWEEN GROUPS ON ONE OR MORE VARIABLES
* PARTICIPANTS TESTED MORE THAN ONCE
** 2 Groups: T TEST (dependent samples)
** 2+ Groups: REPEATED MEASURES ANALYSIS OF VARIANCE
* PARTICIPANTS TESTED ONCE
** 2 Groups: T TEST (independent samples)
** 2+ Groups: SIMPLE ANALYSIS OF VARIANCE
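A toy selector can make the branching concrete. The function and argument names here are invented for illustration, and the sketch covers only the branches named above. Note that when the same participants are measured more than once the dependent-samples (paired) t test applies, while participants measured only once call for the independent-samples t test (some printings of this flow chart swap those two labels).

```python
def choose_test(relationship, n_variables=2, tested_repeatedly=False, n_groups=2):
    """Toy selector mirroring the flow chart -- a sketch, not exhaustive."""
    if relationship:  # examining relationships between variables
        return ("t test for the correlation coefficient" if n_variables == 2
                else "regression / factor analysis / canonical analysis")
    if tested_repeatedly:  # same participants measured more than once
        return ("t test (dependent samples)" if n_groups == 2
                else "repeated-measures analysis of variance")
    return ("t test (independent samples)" if n_groups == 2
            else "simple analysis of variance")

print(choose_test(relationship=False, tested_repeatedly=False, n_groups=2))
```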

Significance versus Meaningfulness

EXAMPLE
* Experiment: Traditional Teaching vs. Computer Teaching for Illiterate Adults
** Group A: 5,000 participants, 75.6 mean, about equal variance
** Group B: 5,000 participants, 75.7 mean, about equal variance
*** ONLY a 0.1 difference!
**** Yet when we t-test the difference between the means, the results are SIGNIFICANT at the 0.01 level (p < 0.01)
**** Thus computers "work better" than traditional teaching
BUT IS THIS TRULY MEANINGFUL? IS IT WORTH $300,000 OF FUNDING FOR A 0.1 DIFFERENCE?
Statistical significance alone IS NOT MEANINGFUL unless the study has a SOUND CONCEPTUAL BASE that lends plausible meaning to the significance (a 0.1 difference is not that)
Statistical significance must be interpreted in the context of the outcomes.
* As a superintendent, is it practical to retain children in Grade 1 if the retention program significantly raises their standardized test scores by one half point?
STATISTICAL SIGNIFICANCE IS NOT THE END-ALL
* It is not the sole goal of scientific research
* WE TEST OUR HYPOTHESES; WE DO NOT PROVE THEM
ACCEPTING THE NULL HYPOTHESIS IS OF EQUAL IMPORTANCE (by implication), even with a perfect study design
* If a particular treatment does not work, this is important information that others need to know about.
* If your study is designed well, then you should know why the treatment does not work, and the next person down the line can design his or her study taking into account the valuable information you have provided.
RESEARCHER VERNACULAR VARIES (.05 is merely a custom)
* "Marginally significant" might be .04
* "Nearly significant" might be .06
* .051 might be enough for significance
* Healthcare professions use "clinical significance" = meaningfulness
IT DOES NOT NEED TO BE ALL-OR-NOTHING WITH .05 OR .01
* SPSS/Excel allow one to pinpoint exact probabilities
* Do not fall for publication bias: use what you deem appropriate and explain why

True or False: A Type I error of .05 means that five times out of 100 experiments, we will reject a true null hypothesis.

False. A Type I error rate of 0.05 means that in any single test there is a 5% chance of incorrectly rejecting a true null hypothesis; it is a per-test probability, not a guarantee about the outcomes of any particular set of 100 experiments.

True or False: It is possible to set the Type I error rate to zero (for the best outcome)

False In most practical cases, it's not possible to set the Type I error rate to exactly zero while conducting hypothesis tests. Doing so would mean you never reject the null hypothesis, which is often not feasible, especially in scientific research where you aim to detect real effects. While you can make the Type I error rate very small (e.g., 0.01 or 0.001) by using stringent criteria for significance, it's rarely reduced to absolute zero.

True or False: The smaller the Type I error rate, the better the results.

False The choice of the Type I error rate (α) should be based on the specific context and the consequences of making Type I errors. The optimal choice of α depends on the research goals, the potential costs of errors, and the available sample size. While a smaller Type I error rate can reduce the chance of making false positive errors, it may increase the chance of making false negatives (Type II errors).

David Blackwell

He wrote one of the first Bayesian statistics books and developed a statistical technique called the Rao-Blackwell estimator, which provides what is often the best way to guess a population value using a sample value.

What does it mean if your data has a p-value of .02?

If the null hypothesis were true (no effect or difference in the population), the probability of observing your sample results (or more extreme ones) by RANDOM CHANCE is only 2%. CONCLUSION * With a p-value lower than .05, you would typically reject the null hypothesis. This suggests that the observed results are unlikely to have occurred due to random chance alone, providing evidence of an effect, difference, or relationship in your data.
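Where a p-value like .02 comes from can be shown with the standard normal CDF in Python's standard library. The z value of 2.33 below is an assumption chosen to illustrate, because a two-tailed z of about 2.33 happens to yield p ≈ .02.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for an observed z statistic under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# A z statistic around 2.33 yields the p ~ .02 discussed in this card.
p = two_tailed_p(2.33)
print(round(p, 3), p < 0.05)  # p ~ 0.02, below alpha -> reject H0
```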

A major research study investigated how representative a treatment group's decrease in symptoms was when a certain drug was administered as compared with the response of the entire population. It turns out that the test of the research hypothesis resulted in a z score of 1.67. What conclusion might the researchers put forth? Hint: Notice that the Type I error rate or significance level is not stated (as perhaps it should be). What do you make of all this?

If the researchers used α = 0.05 as their significance level, the critical z scores for a two-tailed test would be approximately ±1.96. A z score of 1.67 is less extreme than 1.96, suggesting that the sample mean is not significantly different from the population mean at the 0.05 level. However, with a more lenient level such as α = 0.10 (two-tailed critical value ±1.645), or with a one-tailed test at α = 0.05 (critical value 1.645), a z score of 1.67 would be statistically significant. Because the researchers did not state their Type I error rate in advance, the conclusion is ambiguous, which is exactly why the significance level should be set before the test is run.

Real-World Stats: SIGNIFICANCE DOES NOT MAKE ONE MEANINGFUL

This study appeared in a medical journal devoted to articles on anesthesia. The focus was a discussion of the relative merits of statistical versus clinical significance, and Drs. Timothy Houle and David Stump pointed out that many large clinical trials obtain a high level of statistical significance with minuscule differences between groups (just as we talked about earlier in the chapter), making the results clinically irrelevant. However, the authors pointed out that with proper marketing, billions can be made from results of dubious clinical importance. This is really a caveat emptor or buyer-beware state of affairs. Clearly, there are a few very good lessons here about whether the significance of an outcome is really meaningful or not. How to know? Look at the substance behind the results and the context within which the outcomes are found. Want to know more? Go online or to the library and find . . . Houle, T. T., & Stump, D. A. (2008). Statistical significance versus clinical significance. Seminars in Cardiothoracic and Vascular Anesthesia, 12, 5-6.

Why should we think in terms of "failing to reject" the null rather than just accepting it?

* It acknowledges the need for more empirical evidence than what was found
* H0 is the conservative default until evidence suggests otherwise
* It acknowledges the increased possibility of making a Type 2 error when lowering the odds of making a Type 1 error
* It acknowledges the observed result as "strong enough" to reject the null, not 100% sufficient or infallible
* It leaves the door open for further inquiry rather than whole-hearted acceptance

What does a large SEM imply?

It implies that sample means vary widely from sample to sample, so any single sample mean is a less precise estimate of the population mean. A large SEM typically results from a small sample size or a large population standard deviation (and can also be a warning sign of biased or nonrandom sampling). Because the variability between sample means is greater, individual sample means stray further from the true population mean. In other words, a large standard deviation of sample means suggests that the samples are not providing consistent or reliable estimates of the population mean.

Statistical Power Analysis for the Behavioral Sciences

Jacob Cohen's book A must for anyone who wants to go beyond the very general information that is presented here. It is full of tables and techniques for allowing you to understand how a statistically significant finding is only half the story—the other half is the magnitude of that effect. In fact, if you really want to make your head hurt, consider that many statisticians consider effect size more important than significance. Imagine that! Why might that be?

standard error of the mean

Measure of the variability or precision of sample means when drawing multiple random samples from the same population.
The standard deviation of all possible (sample) means selected from the population
* The value expected by CHANCE given all the variability surrounding all possible sample means of the population
PURPOSE
* Quantifies how much sample means are expected to vary from one sample to another due to random sampling variation.
USE
* In inferential statistics when making inferences about a population based on sample means.
** Sample Means -> POPULATION (induction)
FORMULA
* SEM = sigma / sqrt(n)
* SEM = (population std) / sqrt(sample size)
REMINDER
* The standard deviation describes INDIVIDUAL data points within a sample
* The SEM describes SAMPLE MEANS within a population
* Researchers aim for a smaller SEM because it indicates greater precision and confidence in the estimation of the population mean.
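The formula above is simple enough to demonstrate directly. This sketch uses the state flu SD of 15.1 from the Remulak card as an example input and shows why researchers prefer larger samples: quadrupling n halves the SEM.

```python
from math import sqrt

def sem(pop_sd, n):
    """Standard error of the mean: expected spread of sample means."""
    return pop_sd / sqrt(n)

# Quadrupling the sample size halves the SEM -- larger samples give
# more precise estimates of the population mean.
print(sem(15.1, 100), sem(15.1, 400))  # 1.51 vs 0.755
```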

Different Types of Errors

NULL HYPOTHESIS IS ACTUALLY TRUE
1) Accept the Null: Correct decision — you accept that there is no difference between the groups, and there truly is none.
2) Reject the Null: You made a TYPE 1 ERROR, rejecting the null when there is no actual difference between the groups.
* Chances = α (alpha)
NULL HYPOTHESIS IS ACTUALLY FALSE
3) Accept the Null: You made a TYPE 2 ERROR, accepting the null when there is an actual difference between the groups.
* Chances = β (beta)
4) Reject the Null: Correct decision — you reject the claim of no difference when the null really is false.
* Chances = power = 1 - β (beta)

What does chance have to do with testing the research hypothesis for significance?

Null Hypothesis (H0)
* Represents the default assumption that there is no effect, difference, or relationship in the population being studied.
* It assumes that any observed results WITHIN A CERTAIN THRESHOLD (determined via the critical value or p-value) are due to random chance or variability.
Alternative Hypothesis (H1 or Ha)
* Represents the researcher's assertion that there is a specific effect, difference, or relationship in the population.
* It stands in contrast to the null hypothesis, claiming an observed result falls OUTSIDE A CERTAIN THRESHOLD and is due to some factor(s).
P-Value
* Represents the probability of obtaining the observed results (or more extreme results) assuming the null hypothesis is true. In other words, it quantifies the role of chance in the observed data.
** If the p-value is small (e.g., less than the chosen significance level α, often 0.05), the observed results are unlikely to have occurred by random chance alone under the null hypothesis.
** Conversely, if the p-value is large (greater than α), the observed results could reasonably occur by random chance, and there isn't enough evidence to reject the null hypothesis.
"Chance" is seen as the "norm" because it aligns with the null hypothesis

Chapter 10

OBJECTIVES
• Deciding when the z test for one sample is appropriate to use
• Computing the observed z value
• Interpreting the z value
• Understanding what the z value means
• Understanding what effect size is and how to interpret it
SUMMARY
* The one-sample z test is a simple example of an inferential test
* The (very) good news is that most (if not all) of the steps we take as we move on to more complex analytic tools are exactly the same as those you saw here.

How can I intuit whether my obtained value should lead me to retain the null hypothesis, i.e., whether chance is the best explanation?

OBTAINED VALUE > CRITICAL VALUE * Only if your obtained value is more extreme than what would happen by chance (meaning that the result of the test statistic is not a result of some chance fluctuation) can you say that any differences you obtained are not due to chance

Type 1 error (alpha error)

Rejecting the null hypothesis when it is true
Notated as the Greek letter alpha, or α
Also called the level of significance
EXAMPLE
1) Null: No difference between the two sample groups
* Findings: There is a difference between the two sample groups
** Reality: There is no difference in the whole population
*** Thus, REJECTING THE NULL HYPOTHESIS here means you are MAKING A TYPE 1 ERROR
**** Next Step: Find out whether the findings reflect that ERROR or an ACTUAL DIFFERENCE

What does the z in z test represent? What similarity does it have to a simple z or standard score?

SIMILARITIES
* Both measure how far a value (z score) or a sample mean (z test) deviates from the mean of a distribution
* Both are expressed in standard deviation units
* Both standardize data onto a common scale
DIFFERENCES
* A z score is calculated as (X - μ) / σ, where X is an individual data point, μ is the population mean, and σ is the population standard deviation.
* In a z test, you compare the sample mean (X̄) to the population mean (μ), dividing by the standard error of the mean (σ / √n), which takes the sample size (n) into account.
* Z scores tell you how many standard deviations a data point is from the mean of a distribution. They are often employed for understanding the relative position of data points within a distribution.
* In a z test, you compare a sample mean to a known population mean to assess whether the difference is statistically significant. Z tests are commonly used in hypothesis testing to make inferences about populations.
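The parallel between the two formulas is easy to see side by side in code. This sketch reuses the numbers from the one-sample z-test card (100 vs. 99, σ = 2.5, n = 36); note the unrounded SEM gives z = 2.4 here, while the card's 2.38 comes from rounding the SEM to 0.42 first.

```python
from math import sqrt

def z_score(x, mu, sigma):
    """How many SDs a single value sits from the population mean."""
    return (x - mu) / sigma

def z_test(xbar, mu, sigma, n):
    """How many standard errors a sample MEAN sits from the population mean."""
    return (xbar - mu) / (sigma / sqrt(n))

# Same 1-point deviation, but the test statistic grows with n,
# because sample means vary far less than individual scores do.
print(z_score(100, 99, 2.5), round(z_test(100, 99, 2.5, 36), 2))
```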

Flu cases per school this past flu season in the Remulak school system (n = 500) were 15 per week. For the entire state, the weekly average was 16, and the standard deviation was 15.1. Are the kids in Remulak as sick as the kids throughout the state?

STEP 1
* H0: μ = 16 — flu cases in the Remulak school system are not significantly different from the state average.
* H1: μ ≠ 16 (two-tailed) — flu cases in the Remulak school system are significantly different from the state average.
STEP 2
* Sample size (n) = 500
* Sample mean (xbar) = 15 (flu cases per week in Remulak)
* Population mean (mu) for the state = 16 (flu cases per week)
* Population standard deviation (σ) for the state = 15.1
STEP 3
* Significance level (alpha) = .05
STEP 4
* z = (15 - 16) / (15.1 / sqrt(500)) = -1 / (15.1 / 22.36) = -1 / 0.675 ≈ -1.48
STEP 5
* Critical z value for a two-tailed test with .05 alpha = ±1.96
STEP 6
* |Observed value| < critical z value, so you fail to reject the null hypothesis. There is no statistically significant difference between flu cases in Remulak and the state average at the .05 level.
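Step 4's arithmetic can be verified in a couple of lines:

```python
from math import sqrt

# Remulak flu example: sample mean 15, state mean 16, state SD 15.1, n = 500.
z = (15 - 16) / (15.1 / sqrt(500))
print(round(z, 2), abs(z) < 1.96)  # |z| inside +-1.96 -> fail to reject H0
```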

T Test vs Z Test

Standard Deviation
* t test = population SD UNKNOWN (estimated from the sample)
* z test = population SD KNOWN
Critical Values
* Use different distributions (the t distribution vs. the normal distribution)

Statistical significance

THE DEGREE OF RISK you are willing to take of REJECTING the null hypothesis when it is true
* Your chance of making a TYPE 1 ERROR
* The first step toward a scientific contribution
EXPLANATION
* The null hypothesis is not a reasonable explanation for what we observed
EXAMPLE
* E. Duckett and M. Richards's "Maternal Employment and Young Adolescents' Daily Experiences in Single Mother Families"
** X = Mothers who work / Mothers who don't work
** Y = Adolescent's attitude
*** "There is a significant difference (p < 0.05) in attitude toward maternal employment between adolescents whose mothers work and adolescents whose mothers do not work."
The "significant difference" means differences between the attitudes of the two groups were due to systematic influence, NOT CHANCE.
* We assume that all other factors that might account for any differences between groups were controlled (and we can do this through some research design choices).
BUT there is more to VERIFY beyond the conclusion.
* YOU CANNOT BE 100% SURE. NO MATTER HOW SMALL, THERE IS ALWAYS A CHANCE THE CONCLUSION IS WRONG.
** Analogy: the normal curve's tails never touch the axis. The probability of an extreme event is NEVER ABSOLUTE ZERO. THERE IS ALWAYS THE CHANCE.

Significance level

The CAVEAT that you cannot be 100% confident your observation is due to the treatment (what was being tested) GOAL 1) To reduce the likelihood of NON-HYPOTHESIZED REASONS as much as possible * Remove the competing explanations for any differences you observed NOTE * You cannot control every possibility; thus, you MUST assign some level of probability and report results with this disclaimer. THERE IS ALWAYS THE POSSIBILITY OF ERROR/CHANCE

Interpreting Significance Levels

The level of significance is associated with an independent test of the null and is based on a "what if" way of thinking. If the null hypothesis is true in the population, what are the chances I would have MISTAKENLY found an unequal result (like the difference between two groups) in my sample? Statistical significance is usually represented as p < .05 * "The probability of observing my outcome due to CHANCE is less than .05" * "There is a less than 5% chance I will make a type 1 error, and my sample will not represent the population due to CHANCE." * "significant at the .05 level."

Effect Size

The magnitude of a relationship between two or more variables Can be CORRELATIONAL values or values that ESTIMATE DIFFERENCE PURPOSE * Addresses the MILLION DOLLAR QUESTION: Although a research hypothesis is statistically significant, IS IT MEANINGFUL? INTERPRETATION (Cohen's d) * Small effect size (0 to .2) * Medium effect size (.2 to .8) * Large effect size (.8 or above) * Higher value = larger difference: a d of 0 means the group distributions overlap completely (0% non-overlap), while a d of 1 corresponds to roughly 55% non-overlap NOTES * Does not take sample size into account * Does not require significance * Provides a new dimension of judgment and evaluation * Different formulas associated with different inferential tests * Uses Cohen's d for group differences
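A minimal sketch of Cohen's d for two independent groups. The means echo the cholesterol example from the overall steps (30 vs 20 mg/dL), but the standard deviations and group sizes below are hypothetical, chosen only to illustrate the pooled-SD formula:

```python
# Hypothetical sample statistics (the SDs and ns are illustrative, not real data)
mean_a, mean_b = 30.0, 20.0        # mg/dL reduction per group
sd_a, sd_b = 12.0, 11.0            # assumed group standard deviations
n_a, n_b = 50, 50                  # assumed group sizes

# Pooled standard deviation across the two groups
sp = (((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)) ** 0.5

# Cohen's d = difference between means in pooled-SD units
d = (mean_a - mean_b) / sp
print(round(d, 2))                 # about 0.87, a large effect by the table above
```

Note that d says nothing about significance; a tiny sample could produce this same d with p far above .05.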

degrees of freedom

The number of individual scores that can vary without changing the sample mean. Statistically written as 'N-1' where N represents the number of subjects. PURPOSE * Controls variability due to the nature of random sampling * More accurately represents the constraint or limitation imposed by the sample data because they are interrelated with the population. * Degrees of freedom help determine the appropriate probability distribution for a statistic (different shapes vary on n-1) * The choice affects the distribution of the test statistic for critical or p-values, which, in turn, influences the results of hypothesis tests.

p = ns

The probability is not significant OTHER NOTATIONS * p > 0.05 (the probability of the observed result under a true null exceeds .05, so it is not significant at the .05 level) ** Range: just above .05 to values approaching 1.00 * p > 0.01 ** Range: just above .01 to values approaching 1.00 P can never reach 1.00 because the curve is asymptotic (never touching the baseline, theoretically infinite)

P-Value (Probability value) or SURPRISE value

The probability of obtaining your sample mean (observed result), or something more extreme, when the null hypothesis is true. Researchers typically compare the p-value to a predetermined significance level (α), such as 0.05 or 0.01, to determine whether to reject the null hypothesis, knowingly accepting the RISK OF A TYPE 1 ERROR PURPOSE * Helps you assess the likelihood of your data under the null hypothesis. It's all about quantifying surprise and providing a basis for drawing scientific conclusions. p < 0.05 * If your data is very unlikely and SURPRISING under the assumption that H0 is true (located beyond the threshold), it raises questions and suggests that there might be something more interesting happening BEYOND RANDOM CHANCE tl;dr * Quantifies how unusual or surprising your sample mean is if the null hypothesis were correct.
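A sketch of how a two-tailed p-value comes out of an observed z statistic; the z value below is hypothetical:

```python
from statistics import NormalDist

# Hypothetical observed test statistic
z_obs = 2.2

# Two-tailed p-value: probability of a result at least this extreme
# (in either direction) when the null hypothesis is true
p = 2 * (1 - NormalDist().cdf(abs(z_obs)))
print(round(p, 3))   # 0.028 -- a "surprising" sample under H0 at alpha = .05
```

Since 0.028 < .05, this hypothetical result would fall beyond the threshold and the null would be rejected.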

Type 1 vs Type 2 Errors

Type 1: False positive (incorrectly rejecting the Null Ho) * Under your control according to the level or amount of risk you are willing to take (p < 0.05, etc) Type 2: False negative (incorrectly failing to reject the Null Ho) * Not easily controlled * SENSITIVE to factors such as sample size

Power

Type of probability statement: the probability of correctly rejecting a FALSE null hypothesis CALCULATION * Power = 1 - β, where β is the probability of a TYPE 2 ERROR
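The calculation is just one minus the Type 2 error probability; the β below is a hypothetical value (in practice β depends on sample size, effect size, and α):

```python
# Hypothetical Type 2 error rate; 0.20 is a common planning target
beta = 0.20

# Power = probability of detecting a real effect when one exists
power = 1 - beta
print(power)   # 0.8
```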

one-sample test

USE * Inferential COMPARISON between sample and population * Only ONE group being tested (vs a theoretical invisible population) * Does a sample mean (xbar) belong to or is a fair estimate of a population (mu)? PURPOSE * When the research hypothesis wants to assess whether the SAMPLE (mean) is DIFFERENT from the POPULATION (mean) collected using the same measurement FORMULA * z = (x bar - mu) / SEM * z = (sample mean - population mean)/standard error of mean CONCLUSION * If the sample statistic is or is not representative of the population parameter EXAMPLE 1) Cappelleri, J. C., Bushmakin, A. G., McDermott, A. M., Dukes, E., Sadosky, A., Petrie, C. D., & Martin, S. (2009). Measurement properties of the Medical Outcomes Study Sleep Scale in patients with fibromyalgia. Sleep Medicine, 10, 766-770. * Evaluating the Medical Outcomes Study (MOS) Sleep Scale ** Compare sample MOS scores with population (national) MOS norms *** "The treatment sample's MOS Sleep Scale scores were significantly different from normal (the population mean; p < .05)" *** The null hypothesis that the sample average and the population average were equal IS REJECTED
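The z = (xbar - mu) / SEM formula above can be sketched directly; the sample values here are hypothetical, not from the Cappelleri et al. study:

```python
from statistics import NormalDist

# Hypothetical one-sample setup: does this sample mean fit the population?
xbar, mu, sigma, n = 104.0, 100.0, 15.0, 36

sem = sigma / n ** 0.5                   # standard error of the mean
z = (xbar - mu) / sem                    # z = (xbar - mu) / SEM
p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed p-value

print(round(z, 2), round(p, 2))          # 1.6 0.11 -> fail to reject at .05
```

With p ≈ .11 > .05, this hypothetical sample mean would be judged a fair estimate of the population mean.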

Two-Tailed vs One-Tailed Test

Use a two-tailed test when you are interested in whether there is a significant difference, but you do not specify the direction of the difference * The alternative hypothesis (H1) is often expressed as a "not equal to" relationship, indicating that you are testing for any significant difference. ** H1: mu =/= mu0 Use a one-tailed test when you have a specific expectation about the direction of the effect (e.g., you expect the result to be greater or smaller than a particular value). * The alternative hypothesis (H1) is often expressed as a "greater than" or "less than" relationship, indicating that you are testing for a DIRECTIONAL difference. ** H1: mu > mu0, mu < mu0
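The practical difference shows up in the critical values: a two-tailed test splits α across both tails, a one-tailed test puts all of it in one. A quick sketch at α = .05, assuming a standard normal distribution:

```python
from statistics import NormalDist

alpha = 0.05
two_tailed = NormalDist().inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
one_tailed = NormalDist().inv_cdf(1 - alpha)       # all of alpha in one tail
print(round(two_tailed, 2), round(one_tailed, 3))  # 1.96 1.645
```

The lower one-tailed cutoff (1.645 vs 1.96) is why a directional test is easier to "pass" — but it is only legitimate when the direction was specified in advance.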

false positive

When a system incorrectly approves an action or condition instead of denying it. * Condition: Rejecting the null hypothesis

false negative

When a system incorrectly denies an action or condition instead of accepting it. * Condition: Failing to reject the null hypothesis

When is the null hypothesis (state of equality) an unlikely explanation?

When the results fall outside the threshold: * THIS COULD NOT HAVE OCCURRED BY CHANCE ALONE (something else is happening). ** Thus, the research hypothesis (state of inequality or difference) is the likely outcome

test statistic value

aka obtained value or observed value The result or product of a specific statistical calculation

standard deviation of the mean

the standard deviation of a set of measurements divided by the square root of (the number of measurements - 1) in the set "Errors" or "Approximations" in measurement that surround a "true" point

standard error of the mean (SEM)

the standard deviation of ALL the SAMPLE MEANS that could be selected from the population "Errors" or "Approximations" in measurement that surround a "true" point * Describes the spread of possible mean values, not of individual scores ** An alternative route to computing/understanding confidence intervals
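In practice the SEM is estimated from a single sample as s / sqrt(n); the scores below are a hypothetical sample:

```python
import statistics

# Hypothetical sample of scores
scores = [12, 15, 11, 14, 13, 16, 12, 15]
n = len(scores)

sd = statistics.stdev(scores)   # sample standard deviation (divides by n - 1)
sem = sd / n ** 0.5             # SEM estimate: s / sqrt(n)
print(round(sem, 2))            # 0.63
```

Because sqrt(n) is in the denominator, larger samples shrink the SEM: the mean of a big sample is a tighter estimate of the population mean.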

How Do I Interpret z = 2.38, p < .05?

z = the test statistic used 2.38 = the obtained value (using the one-sample z-test formula) p < .05 = If the null hypothesis is true, there is a less than 5% probability of obtaining a sample mean that differs from the population mean by this much or more. * A result this rare is unlikely to be mere randomness/chance
