Math Final
Properties of Chi-Square Distribution
- takes only positive values - right-skewed - degrees of freedom is the only parameter - p-value is the area to the right of the observed chi-square statistic under the density curve
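The "area to the right" idea can be checked numerically. This is only a rough sketch: the helper names `chi2_pdf` and `chi2_right_tail` are made up for illustration, the statistic 5.99 with 2 degrees of freedom is an example value, and the midpoint rule used here assumes at least 2 degrees of freedom (with 1 degree of freedom the density is unbounded near 0).

```python
import math

def chi2_pdf(x, df):
    """Chi-square density at x > 0 with df degrees of freedom."""
    k = df / 2
    return x ** (k - 1) * math.exp(-x / 2) / (2 ** k * math.gamma(k))

def chi2_right_tail(stat, df, steps=100_000):
    """Approximate the p-value: total area (1) minus the area from 0 to stat.

    Hypothetical helper; uses a simple midpoint rule and assumes df >= 2.
    """
    width = stat / steps
    left_area = sum(chi2_pdf((i + 0.5) * width, df) for i in range(steps)) * width
    return 1 - left_area

# Example values (not from the notes): statistic 5.99 with 2 degrees of freedom.
p_value = chi2_right_tail(5.99, df=2)
print(round(p_value, 4))
```

For 2 degrees of freedom the right-tail area has the closed form exp(-x/2), which gives a quick check on the approximation.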
Properties of F in an ANOVA test
- right-skewed distribution - 2 parameters (numerator and denominator degrees of freedom) - takes on only non-negative values - takes on the value 0 only when all sample means are the same - gets larger as variation among sample means increases - larger values are evidence against the null hypothesis - upper one-sided test
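A small sketch of how the F statistic behaves, assuming the usual one-way ANOVA form F = MSG / MSE (mean square between groups over mean square within groups). The function name `anova_f` and both data sets are invented for illustration.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic = MSG / MSE (hypothetical helper)."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation among the sample means (numerator df = k - 1).
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation within the samples (denominator df = n - k).
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssg / (k - 1)) / (sse / (n - k))

# Made-up data: identical sample means versus well-separated sample means.
same_means = [[1, 2, 3], [2, 1, 3], [3, 2, 1]]
spread_means = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(anova_f(same_means))    # 0.0, all sample means are equal
print(anova_f(spread_means))  # 27.0, larger because the means differ
```

The two degrees-of-freedom values (k - 1 and n - k) are the two parameters of the F distribution.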
Properties of the chi-square statistic
- a measure of the distance of the observed counts from the expected counts - can take any value greater than or equal to 0 - equals 0 when all observed counts = expected counts - large values are evidence against the null hypothesis - one-sided
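As a quick illustration of these properties, here is a sketch using the standard formula, the sum of (observed - expected)^2 / expected over all cells. The function name `chi_square_stat` and the counts are made up.

```python
def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts for illustration.
perfect_match = chi_square_stat([25, 25], [25, 25])   # observed = expected
some_distance = chi_square_stat([30, 20], [25, 25])   # observed differs

print(perfect_match)  # 0.0
print(some_distance)  # 2.0
```

Every term is a square divided by a positive count, so the statistic can never be negative.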
Conditions of ANOVA
-all of the populations have the same Standard deviation
We have 2 independent SRSs from two distinct populations; then
- one sample has no influence on the other - we measure the same variable for both samples
What are all the possible values that a standard deviation σ can take?
0 ≤ σ
Suppose you randomly select 500 students and observe that 85 of them smoke. Estimate the probability that a randomly selected student smokes.
0.17
Eggs that are contaminated with salmonella can cause food poisoning among consumers. A large egg producer takes an SRS of 200 eggs from all the eggs shipped in one day. The laboratory reports that 11 of these eggs had salmonella contamination. Unknown to the producer, 0.2% (two-tenths of one percent) of all eggs shipped had salmonella. In this situation,
0.2% is a parameter and 11 is a statistic.
I choose a card at random from a well-shuffled deck of 52 cards. There is a 1/4 probability that the card chosen is a spade, a 1/4 probability that the card is a heart, a 1/4 probability that the card is a diamond, and a 1/4 probability that the card is a club. Both spades and clubs are black cards, while hearts and diamonds are red. What is the probability that the card chosen is black?
0.50
I choose a card at random from a well-shuffled deck of 52 cards. There is a 1/4 probability that the card chosen is a spade, a 1/4 probability that the card is a heart, a 1/4 probability that the card is a diamond, and a 1/4 probability that the card is a club. Both spades and clubs are black cards, while hearts and diamonds are red. The probability that the card chosen is not a spade is
0.75
Assume that event A occurs with probability 0.4 and event B occurs with probability 0.5. Assume that A and B are disjoint events; that is, that the probability of both A and B occurring is zero; P(A and B) = 0. The probability that either event occurs, P(A or B), is
0.9
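The last three answers all come from two rules: the addition rule for disjoint events and the complement rule. A short sketch with exact fractions (the variable names are just for illustration):

```python
from fractions import Fraction

# Suit probabilities as given in the card questions.
suit = {
    "spade": Fraction(1, 4),
    "heart": Fraction(1, 4),
    "diamond": Fraction(1, 4),
    "club": Fraction(1, 4),
}

# Addition rule for disjoint events: P(black) = P(spade) + P(club).
p_black = suit["spade"] + suit["club"]

# Complement rule: P(not spade) = 1 - P(spade).
p_not_spade = 1 - suit["spade"]

# Disjoint events A and B with P(A) = 0.4 and P(B) = 0.5.
p_a_or_b = Fraction(4, 10) + Fraction(5, 10)

print(float(p_black), float(p_not_spade), float(p_a_or_b))  # 0.5 0.75 0.9
```

The addition rule in this simple form only works because the events cannot happen together.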
The area under a density curve (density function) is always
1
The Five-Number Summary contains the:
1st and 3rd Quartiles, Median, Minimum and Maximum
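A sketch of computing the five-number summary by hand. It assumes the common textbook rule that each quartile is the median of the observations below or above the overall median; software packages often use slightly different quartile rules, so results can differ a little. The function name and the data are made up.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Hypothetical helper; needs at least two observations.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # With an odd count, the middle observation belongs to neither half.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return min(data), median(lower), median(data), median(upper), max(data)

summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)  # (1, 2.5, 5, 7.5, 9)
```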
Colleges often rely heavily on raising money for an "annual fund" to support operations. Alumni are typically solicited for donations to the annual fund, and studies suggest that annual income of an alum is a good predictor of the amount of money he or she would be willing to donate, and there is a reasonably strong, positive linear relationship between these variables. In the studies described,
Annual income is an explanatory variable. The correlation between "alum income" and "size of alum's donation" is positive. Size of alum's donation to the annual fund is the response variable.
The work habits of students who lived in the dorms vs students who lived off campus were studied by tracking the total number of hours studied by each group of students. Which type of study is this?
Observational
Suppose that from data taken each day for several weeks you calculate that the correlation between humidity and temperature is r = 1.27. What should you conclude?
You made a mistake.
What is the Alternative Hypothesis for an ANOVA test?
The alternative hypothesis is that there is some difference (not all of the means are the same) (many-sided)
A news release for a diet products company reports: "There's good news for the 65 million Americans currently on a diet." Its own study showed that people who lose weight can keep it off. The sample was 20 graduates of the company's program who endorsed the program in commercials. The results of the sample are probably
biased, overstating the effectiveness of the diet.
chi-square statistic
determine if the differences between observed and expected counts are statistically significant
standard error or estimated standard deviation
difference in sample means
I toss a penny and observe whether it lands heads up or tails up. Suppose the penny is fair, i.e., the probability of heads is 1/2 and the probability of tails is 1/2. This means
if I flip the coin many, many times, the proportion of heads will be approximately 1/2, and this proportion will tend to get closer and closer to 1/2 as the number of tosses increases.
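A rough simulation of this idea. The function name, the checkpoints, and the seed are arbitrary choices for illustration, and a simulated coin is only a stand-in for a real one.

```python
import random

def running_proportion_of_heads(n_tosses, seed=0):
    """Toss a simulated fair coin and record the proportion of heads so far."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    heads = 0
    checkpoints = {}
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5    # True counts as 1 head
        if toss in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[toss] = heads / toss
    return checkpoints

proportions = running_proportion_of_heads(100_000)
for tosses, share in proportions.items():
    print(tosses, round(share, 4))
```

The proportion does not have to move closer to 1/2 at every single step; it only tends to settle near 1/2 in the long run.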
condition in chi square
if the observed counts are far from the expected counts, that is considered evidence against the null hypothesis
Goal of two sample inference
is to compare the responses to two treatments or to compare the characteristics of two independent populations
Does correlation have units?
no units
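One way to see this: changing the units of a variable does not change r. In the sketch below the temperature and humidity readings are invented, and `correlation` is a hand-written helper rather than a library call.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up readings: temperature in Fahrenheit and relative humidity in percent.
temps_f = [50, 59, 68, 77, 86]
humidity = [80, 72, 75, 60, 55]
temps_c = [(f - 32) * 5 / 9 for f in temps_f]   # same temperatures in Celsius

r_fahrenheit = correlation(temps_f, humidity)
r_celsius = correlation(temps_c, humidity)
print(round(r_fahrenheit, 4), round(r_celsius, 4))  # the two values agree
```

The same calculation also shows why r always lands between -1 and 1, so a reported value such as 1.27 signals an arithmetic mistake.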
categorical variable
one which takes on a finite number of values or categories
Suppose we fit the least-squares regression line to a set of data. Points with unusually large values of the residuals are called
outliers
The law of large numbers states that as the number of observations drawn at random from a population with finite mean μ increases, the mean of the observed values
tends to get closer and closer to the population mean μ
What is the Null Hypothesis for an ANOVA test?
The null hypothesis is that there are no differences among the means of the populations
A researcher finds 2000 mildly overweight women who exercise regularly, have not had a heart attack, and are willing to participate in the study. She randomly assigns 500 of the women to take an appetite suppressant. The other 500 women are given a placebo. Both groups are followed for five years and the amount of weight lost after this time is recorded. The response variable in this study is
the amount of weight lost.
What are the conditions/assumptions of an ANOVA test?
the basic conditions for inference are that we have random samples from each population and that each population is Normally distributed.
Only use the large-sample confidence interval for a proportion when
the counts of successes and failures in the sample are each at least 15
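A sketch of the check in practice, assuming the usual large-sample interval (sample proportion plus or minus z* times the standard error). The function name `proportion_interval` is made up; the counts reuse the smoking (85 of 500) and egg (11 of 200) cards above.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a proportion (hypothetical helper)."""
    failures = n - successes
    # Condition from the card: both counts must be at least 15.
    if successes < 15 or failures < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(85, 500)    # 85 smokers out of 500 students
print(round(low, 3), round(high, 3))        # roughly 0.137 to 0.203

# 11 contaminated eggs out of 200 fails the check (only 11 successes):
# proportion_interval(11, 200) raises ValueError.
```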
The central limit theorem says that when a simple random sample of size n is drawn from any population with mean μ and standard deviation σ, then when n is sufficiently large
the distribution of the sample mean is approximately Normal.
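A rough simulation sketch. The population here is an exponential distribution with mean 1 and standard deviation 1, chosen only because it is strongly right-skewed; the sample size, repetition count, and seed are arbitrary.

```python
import random
from statistics import mean, stdev

def simulate_sample_means(sample_size=50, repetitions=2_000, seed=1):
    """Draw many samples from a skewed population and return their means."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(repetitions)
    ]

means = simulate_sample_means()
center = mean(means)    # should sit near the population mean, 1
spread = stdev(means)   # should sit near sigma / sqrt(n) = 1 / sqrt(50)
share_within_2sd = sum(abs(m - center) <= 2 * spread for m in means) / len(means)

print(round(center, 3), round(spread, 3), round(share_within_2sd, 3))
```

About 95% of the sample means falling within two standard deviations of the center is what a Normal shape would predict, even though the population itself is far from Normal.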
When the guessed value p* is 0.5 in the sample-size calculation for a proportion
the margin of error will be less than or equal to m, whatever the true proportion is
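A sketch of why 0.5 is the safe guess, assuming the usual sample-size formula n = (z*/m)^2 p*(1 - p*). The function name and the 3% margin are examples only.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n giving the desired margin of error (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

n_conservative = sample_size_for_proportion(0.03)           # p* = 0.5
n_other_guess = sample_size_for_proportion(0.03, p_guess=0.3)

print(n_conservative)  # 1068
print(n_other_guess)   # smaller, because 0.3 * 0.7 < 0.5 * 0.5
```

The product p*(1 - p*) is largest at p* = 0.5, so that guess asks for the largest sample and keeps the margin of error at or below m.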
For a two-sample test, both populations are Normally distributed and
the means and standard deviations of the populations are all unknown
null hypothesis of Chi Square
the percentages for one variable are the same for every level of the other variable
alternative hypothesis of Chi Square
the percentages for one variable vary over levels of the other variable
the proportion of success in a sample is measured by
the sample proportion
The variability of a statistic is described by
the spread of its sampling distribution.
The fraction of the variation in the values of a response y that is explained by the least-squares regression of y on x is
the square of the correlation coefficient (r^2).
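This can be checked on a small data set by computing r^2 directly and comparing it with 1 - SSE/SST from the fitted line. The income and donation figures below are invented, and `r_squared_two_ways` is just an illustrative helper.

```python
import math

def r_squared_two_ways(xs, ys):
    """Return (r^2, 1 - SSE/SST) for the least-squares line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y (SST)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after fitting the line (sum of squared residuals).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return r ** 2, 1 - sse / syy

# Made-up data: alum income (thousands of dollars) and donation (dollars).
income = [40, 55, 70, 85, 100]
donation = [100, 180, 210, 300, 340]

r_squared, explained_fraction = r_squared_two_ways(income, donation)
print(round(r_squared, 4), round(explained_fraction, 4))  # the two agree
```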