Review for exam 2
Independent
Two samples are independent when the selection of subjects for one group does not depend on which subjects were selected for the other group.
Failed to reject their null hypothesis
There is insufficient evidence to conclude that xx is different.
Dependent variables hypothesis
When conducting hypothesis tests using dependent samples, the null hypothesis is always μd=0, indicating that there is no change between the first population and the second population. The alternative hypothesis can be left-tailed (<), right-tailed (>), or two-tailed (≠).
Two Mean Independent terms
When conducting inference using independent samples we use x¯1, s1, and n1 for the mean, standard deviation, and sample size, respectively, of group 1. We use the symbols x¯2, s2, and n2 for group 2.
Requirements for a test for a single mean with σ known
(1) The data represent a simple random sample from a large population. (2) The sample mean x¯ is normally distributed. This happens if either one of the following is true: the population is normally distributed, or the sample size is large.
ANOVA
ANOVA is used to compare the means of several groups. The hypotheses for the test are always H0: all the means are equal; Ha: at least one of the means differs. For ANOVA testing we use an F-distribution, which is right-skewed. The P-value of an ANOVA test is always the area to the right of the F-statistic. We can conduct ANOVA testing when the following two requirements are met: (1) the data are normally distributed in each group; (2) the largest variance is not more than four times the smallest variance.
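The F-statistic and the variance requirement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three groups are made-up numbers, the function names are hypothetical, and the P-value (the area to the right of F) would still come from SPSS or an F-table.

```python
from statistics import mean, variance

def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def variances_ok(groups):
    """Requirement check: largest variance is at most 4 times the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) <= 4 * min(variances)

# Hypothetical data, for illustration only
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df1, df2 = anova_f(groups)
print(f_stat, df1, df2, variances_ok(groups))
```

A large F means the group means vary a lot relative to the spread inside each group, which is why only the right tail is used for the P-value.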
Type II error
Failing to reject a false null hypothesis. For a fixed sample size, if alpha is very small, the probability of committing a Type II error will be larger. If alpha is very large, the probability of committing a Type II error will be smaller.
Automatic Language Translation Programs
Make a histogram: Analyze→Descriptive Statistics→Explore. Verify the requirements have been met: 1. A simple random sample was drawn from the population. 2. x¯ is normally distributed. Find the confidence interval: Analyze→Descriptive Statistics→Explore. (Using the Explore function is the best way to calculate confidence intervals.)
Describe the data collection procedures
Give the relevant summary statistics: x¯, σ, n. Make an appropriate graph (e.g., a histogram) to illustrate the data. Verify the requirements have been met: we assume that the individuals chosen to participate in the study represent a (simple) random sample from the population, and x¯ will be normally distributed because the sample size is large. Give the test statistic and its value: z=−6.669. Mark the test statistic and P-value on a graph (probability applet) of the sampling distribution. Find the P-value and compare it to the level of significance. State your decision: since the P-value is less than α, we reject the null hypothesis. There is sufficient evidence that xx is different.
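As a rough sketch of the test-statistic and P-value steps for one mean with σ known, the calculation can be written in Python. The summary statistics below are hypothetical (they are not the data behind z=−6.669), and the function name `z_test` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, tail="two"):
    """Return (z, p_value) for a test of H0: mu = mu0 with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)  # area to the left of z under the standard normal
    if tail == "left":
        p_value = cdf
    elif tail == "right":
        p_value = 1 - cdf
    else:  # two-tailed: double the smaller tail area
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

# Hypothetical summary statistics, for illustration only
z, p = z_test(x_bar=52, mu0=50, sigma=10, n=100)
alpha = 0.05
decision = "reject H0" if p < alpha else "fail to reject H0"
print(round(z, 3), round(p, 4), decision)
```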
Alternative hypothesis
The alternative hypothesis (Ha) is a different assumption about a population and is a statement of inequality (<, >, or ≠). Using a hypothesis test, we determine whether it is more likely that the null hypothesis or the alternative hypothesis is true. The alternative hypothesis tells us whether we look at both tails or only one.
The null hypothesis
The null hypothesis (H0) is the foundational assumption about a population and represents the status quo. It is a statement of equality (=).
Type I error
A Type I error is committed when we reject a null hypothesis that is, in reality, true. A Type II error is committed when we fail to reject a null hypothesis that is, in reality, not true. The significance level α of a hypothesis test is the probability of committing a Type I error: assuming the null hypothesis is true, α is the probability of getting a value of the test statistic that is extreme enough that the null hypothesis will be rejected. If we choose α to be small, then we rarely commit a Type I error. However, for a fixed sample size, reducing the probability of committing a Type I error increases the probability of committing a Type II error.
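A small simulation can make the trade-off concrete. The sketch below is an illustration only: the population (normal, σ = 1), the sample size, the "true" mean of 0.3, and the function name `rejection_rate` are all assumptions chosen for the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def rejection_rate(true_mean, alpha, n=30, trials=2000, seed=0):
    """Fraction of simulated two-tailed z-tests of H0: mu = 0 that reject H0.

    Samples are drawn from a normal population with the given true mean
    and sigma = 1 (assumed known).
    """
    rng = random.Random(seed)  # fixed seed: every alpha sees the same samples
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1) for _ in range(n)]
        z = mean(sample) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true (mu really is 0): every rejection is a Type I error
type_1_rate = rejection_rate(true_mean=0.0, alpha=0.05)

# H0 is false (mu is really 0.3): every failure to reject is a Type II error
type_2_at_05 = 1 - rejection_rate(true_mean=0.3, alpha=0.05)
type_2_at_01 = 1 - rejection_rate(true_mean=0.3, alpha=0.01)

print(type_1_rate, type_2_at_05, type_2_at_01)
```

The Type I rate should land near α, and the Type II rate at α = 0.01 should come out higher than at α = 0.05, since a stricter cutoff rejects less often.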
Calculate confidence interval
A confidence interval is an interval estimator used to give a range of plausible values for an unknown parameter. A point estimator is one number that is used to estimate a parameter; an interval estimator is a range of values for a parameter. By adding and subtracting the margin of error from the point estimate we get the interval estimate. About 95% of the time, the sample mean will lie within two standard deviations of its sampling distribution (σ/√n) of the population mean (μ). Cases covered: one sample with sigma known; sigma unknown; mean of differences, using dependent samples; difference of means, using independent samples; ANOVA. The width of a confidence interval depends on the chosen confidence level (and its corresponding value of z∗) as well as the sample size (n). This is the equation for calculating confidence intervals when σ is known: (x¯ − z∗ · σ/√n, x¯ + z∗ · σ/√n)
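The σ-known interval can be checked by hand or with a short Python sketch like the one below. The numbers are hypothetical and the function name is made up; in the course itself, SPSS's Explore function is what produces the interval.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for a confidence interval for mu, sigma known."""
    # z* leaves (1 - confidence)/2 in each tail of the standard normal
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical summary statistics, for illustration only
low, high = z_confidence_interval(x_bar=50, sigma=10, n=100)
print(round(low, 2), round(high, 2))

# A higher confidence level gives a wider interval
low99, high99 = z_confidence_interval(x_bar=50, sigma=10, n=100, confidence=0.99)
print(round(low99, 2), round(high99, 2))
```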
Hypothesis statement - Reject the null hypothesis
Because the P-value < alpha, we have sufficient evidence that at least one of the means is different from the rest of the means. Testing rule: compare the P-value with alpha. If the P-value is less than alpha, we reject H0. If it is greater than alpha, we fail to reject H0.
One mean, sigma unknown
Give the relevant summary statistics: x¯, s, n. Make an appropriate graph (e.g., a histogram) to illustrate the data. Verify the requirements have been met: we assume that the individuals chosen to participate in the study represent a (simple) random sample from the population, and x¯ will be normally distributed because the sample size is large. (Note: We could have also noticed that the body temperature data appears to be normally distributed, so even with a small sample size, x¯ would be normal.) Give the test statistic and its value. We will need to conduct the analysis using software, so we can report this value. Instructions for conducting a test for one mean with sigma unknown: Analyze→Compare Means→One-Sample T Test. Move the variable containing your data values into the "Test Variable(s)" box. Your null hypothesis will state that μ is equal to some number. Enter that number as the "Test Value." (This is a step that people often forget to do! It tells SPSS what the value of μ is in your null hypothesis.) The output will contain a box labeled "One-Sample Test." This will give the value of the test statistic (t), the degrees of freedom (df), and "Sig.(2-tailed)." SPSS uses the term "Sig.(2-tailed)" for the area in both tails under the t-distribution that is more extreme than the test statistic (t). If it is a one-tailed test, divide "Sig.(2-tailed)" by 2 (this applies when the test statistic falls in the direction of the alternative hypothesis). State the degrees of freedom. Find the P-value and compare it to the level of significance.
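To see what SPSS is computing, here is a minimal Python sketch of the t-statistic, the degrees of freedom, and the one-tailed adjustment. The data values, the "Sig.(2-tailed)" value of 0.04, and both function names are hypothetical; the two-tailed P-value itself is assumed to come from the SPSS output.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (t, df) for a test of H0: mu = mu0 with sigma unknown."""
    n = len(data)
    t = (mean(data) - mu0) / (stdev(data) / sqrt(n))
    return t, n - 1

def one_tailed_p(sig_two_tailed, t, tail):
    """Convert SPSS's "Sig.(2-tailed)" into a one-tailed P-value.

    Halving is only right when t points in the direction of Ha;
    otherwise the one-tailed P-value is 1 minus half of Sig.(2-tailed).
    """
    in_direction = (t < 0) if tail == "left" else (t > 0)
    half = sig_two_tailed / 2
    return half if in_direction else 1 - half

# Hypothetical body temperature readings, for illustration only
data = [98.0, 98.4, 98.2, 98.6, 97.8]
t, df = one_sample_t(data, mu0=98.6)
print(round(t, 3), df)

# Suppose SPSS reported Sig.(2-tailed) = 0.04 (hypothetical value)
print(one_tailed_p(0.04, t, tail="left"))
```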
Two Means: Independent Samples
Knowing which subjects are in group 1 tells you nothing about which subjects will be in group 2. With independent samples, there is no pairing between the groups. Step 1: Summarize the relevant background information. Step 2: State the null and alternative hypotheses and the level of significance. Step 3: Describe the data collection procedures. Step 4: Give the relevant summary statistics. Step 5: Make an appropriate graph to illustrate the data: Graphs→Legacy Dialogs→Histogram. Click on the name of the variable for which you want to generate a histogram. For the reading practices data, this is "Nights," the number of nights per week the child reads in the home. Then, click on the arrow next to the "Variable" box. This will move the highlighted variable over. To make side-by-side histograms for two or more groups, you then select the grouping variable. For the reading practices data, this is "Group," the variable that indicates if the child is in the DEV or GEN group. Then, click on the arrow next to the "[Panel by] Rows" box. This will move the highlighted variable into the area at right. Graphically illustrate each sample separately. Step 6: Give the test statistic and its value. First recode the grouping variable: Transform→Automatic Recode. Move the grouping variable (e.g., Gender) into the box labeled "Variable -> New Name". In the box labeled "New Name", enter the name that should be given to your new coded variable. As a suggestion, you might consider using the same variable name with the word "CODE" appended to the end of it. For example, if the original variable was called "Group", the new name for your variable could be "GroupCODE". Remember, variable names cannot contain any spaces or special characters. Step 7: Hypothesis test: Analyze→Compare Means→Independent-Samples T Test. Move the variable containing your data values into the "Test Variable(s)" box. In this case, that will be "Nights per week read..." Move the variable that tells to which group the data belong into the "Grouping Variable" box. This will be your GroupCODE variable. Click on [Define Groups...]. Next to Group 1, type "1" (which corresponds to DEV). Next to Group 2, type "2" (which corresponds to GEN). When conducting hypothesis tests using independent samples, the null hypothesis is always μ1=μ2, indicating that there is no difference between the two populations. The alternative hypothesis can be left-tailed (<), right-tailed (>), or two-tailed (≠). Whenever zero is contained in the confidence interval of the difference of the true means, we conclude that there is no significant difference between the two populations.
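For a sense of what the Independent-Samples T Test computes, the sketch below works out a t-statistic and degrees of freedom from x¯1, s1, n1 and x¯2, s2, n2. It assumes the unequal-variances (Welch) form of the test, which may not be the row of SPSS output your course uses; the DEV and GEN values are invented and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Return (t, df) for H0: mu1 = mu2, unequal-variances (Welch) form."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)  # s1 squared, s2 squared
    se = sqrt(v1 / n1 + v2 / n2)  # standard error of x-bar1 minus x-bar2
    t = (mean(group1) - mean(group2)) / se
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    return t, df

# Hypothetical nights-per-week values, for illustration only
dev = [2, 3, 4, 5, 6]
gen = [4, 5, 6, 7, 8]
t, df = independent_t(dev, gen)
print(round(t, 3), round(df, 2))
```

The P-value and the confidence interval for μ1−μ2 still come from the SPSS output, since they need the t-distribution with these degrees of freedom.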
Two means - Paired data
The key characteristic of dependent samples (or matched pairs) is that knowing which subjects will be in group 1 determines which subjects will be in group 2. To create the column of differences: Transform→Compute Variable. In the "Target Variable" box, type the name of the new variable. Remember, variables in SPSS cannot contain spaces or special characters. They can contain letters and numbers, and they must start with a letter. In the "Numeric Expression" box, you will tell SPSS how to determine the values for the new variable. To subtract one column of data from another, do the following: Move the first variable into the "Numeric Expression" box. (For example, for pre/post data, you might move the variable "Post".) You can type a subtraction sign "-" or you can click on the corresponding button in the dialog box. Then move the second variable into the "Numeric Expression" box. (For example, "Pre".) Now, the "Numeric Expression" box should show the subtraction operation you want to conduct. (For example, "Post - Pre".) A new variable will be created in your data file that will contain the result you defined. Spot check a few values to make sure you got the result you needed. Now, use the difference column that you calculated to conduct a hypothesis test for the mean of the differences. Step 2: Describe the data collection procedures. Step 3: Give the relevant summary statistics. Step 4: Make an appropriate graph (histogram) to illustrate the data. Step 5: Verify the requirements have been met (the data represent a simple random sample from the population; the mean of the differences follows a normal distribution). Step 6: Give the test statistic and its value. Step 7: State the degrees of freedom.
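The "Post - Pre" difference column and the resulting test statistic can be mirrored in a short Python sketch. The pre/post scores are invented and the function name `paired_t` is hypothetical; the point is only to show that a paired test is a one-sample t-test on the differences.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (d_bar, s_d, t, df) for H0: mu_d = 0 using d = post - pre."""
    d = [b - a for a, b in zip(pre, post)]  # the "Post - Pre" column
    n = len(d)
    d_bar = mean(d)   # sample mean of the differences
    s_d = stdev(d)    # sample standard deviation of the differences
    t = d_bar / (s_d / sqrt(n))
    return d_bar, s_d, t, n - 1

# Hypothetical pre/post scores for four subjects, for illustration only
pre = [10, 12, 9, 11]
post = [12, 13, 11, 14]
d_bar, s_d, t, df = paired_t(pre, post)
print(d_bar, round(s_d, 4), round(t, 3), df)
```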
Level of significance
The level of significance (α) is the standard for determining whether or not the null hypothesis should be rejected. Typical values for α are 0.05, 0.10, and 0.01. If the P-value is less than α we reject the null. If the P-value is not less than α we fail to reject the null. How do we decide if the P-value is small enough to reject the null hypothesis? We need a way to determine if there is enough evidence to reject the null hypothesis that does not depend on the data. We need a number that can be used to determine if the P-value is small enough to reject the null hypothesis. This number is called the level of significance and is often denoted by the symbol α (pronounced "alpha".) "If the P is low, Reject the null." If the P-value is less than α, we reject the null hypothesis. Conversely, if the P-value is greater than α, we fail to reject the null hypothesis.
Margin of error
The margin of error gives an estimate of the variability of responses. It is calculated as m = z∗ · σ/√n, where z∗ represents a calculated z-score corresponding to a particular confidence level.
P-value
The P-value is the probability of obtaining a test statistic at least as extreme as the one you calculated, assuming the null hypothesis is true. If we find that the P-value is large, that means that we did not find a contradiction with our assumption that the null hypothesis is true. That does not mean that the null hypothesis is true; it only means that we cannot reject it. For this reason, we never say we "accept the null hypothesis." When the P-value is large, we say that we "fail to reject the null hypothesis."
Sample size formula
The sample size formula allows us to estimate the number of observations required to obtain a specific margin of error: n = (z∗ · σ / m)²
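A quick Python sketch of the sample size calculation is shown below. The values of σ and m are hypothetical, the function name is made up, and rounding up to the next whole number is the usual convention, assumed here so that the margin of error is not exceeded.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """Return the smallest n whose margin of error is at most `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z* * sigma / m)^2, rounded up to a whole observation
    return ceil((z_star * sigma / margin) ** 2)

# Hypothetical values, for illustration only
n = required_sample_size(sigma=10, margin=2)
print(n)
```

Halving the desired margin of error roughly quadruples the required sample size, because m appears squared in the denominator.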
T-distribution
The t-distribution is similar to a normal distribution in that it is bell-shaped and symmetrical, but the exact shape of the t-distribution depends on the degrees of freedom (df). You will use SPSS to carry out hypothesis testing and create confidence intervals involving t-distributions.
Dependent variables
We use slightly different variables when conducting inference using dependent samples. Group 1 values: x1; Group 2 values: x2; Differences: d; Population mean: μd; Sample mean: d¯; Sample standard deviation: sd. In the hypothesis test, we will refer to the variable representing the differences as d. We will use this notation throughout the hypothesis test. For example, the true population mean will be labeled μd and the sample mean will be labeled d¯. The sample standard deviation of the differences is denoted sd.
To calculate confidence intervals for the true mean of the difference in SPSS
Do the following: Follow the directions given above for creating a new column containing the differences between two variables. Select the menu item Analyze→Descriptive Statistics→Explore. Click on the name of your new variable of the differences for which you want to calculate the confidence interval. Then, click on the arrow next to the "Dependent List" box. This will move the highlighted variable over. Click on [OK]. By default a 95% confidence interval is shown. To find other confidence intervals, go to Analyze→Descriptive Statistics→Explore. Move the desired variable into the "Dependent List" and then press the "Statistics" button on the right of the window. Change the percentage next to "Confidence Interval for Mean" to the desired confidence level, then press "Continue."