Econometrics Final Study Guide
1) What are the 2 requirements for random sampling?
1) i. Probability of inclusion is equal for all the population ii. Whether individual is included = independent of whether anyone else is included
What are important things to know for statistics? What are the 2 types of estimates?
1. Expected values 2. Variance 2 types: 1. Point estimates 2. Interval estimates
Problem Set 3: Part 2- https://imgur.com/gallery/DpoozJz shows the probability distribution for A and B. Complete the table and answer the questions. 1) P[A∩B] = ? 2) P[A∩B bar] = ? 3) P[A∪B] = ? 4) P[A|B] = ? 5) P[A bar∪B bar] = ?
First, complete the table: --> Remember that a marginal P(x) = sum of the joint probabilities in x's row (or column)! --> Each cell = the joint probability of its row and column events (where they intersect)!
        A      A_
B    | 0.48   0.25 | --> P(B) = 0.73
B_   | 0.09   0.18 | --> P(B_) = 0.27
--> P(A) = 0.57 --> P(A_) = 0.43
1) P(AnB) = 0.48 2) P[A∩B bar] = 0.09 3) P[A∪B] = P(A)+P(B)-P(AnB) = (0.57+0.73)-(0.48) = 0.82 4) P[A|B] = P(AnB)/P(B) = 0.48/0.73 = 0.658 5) P[A bar∪B bar] = P(A bar)+P(B bar)-P(A bar n B bar) = (0.43+0.27)-(0.18) = 0.52
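A quick way to check the table arithmetic is to store the four cells and let code sum the margins. This is only a sketch; the dictionary layout and variable names are made up for illustration.

```python
# Joint probabilities from the Part 2 table; keys are (A occurs?, B occurs?).
joint = {
    (True, True): 0.48,    # A and B
    (False, True): 0.25,   # not A, B
    (True, False): 0.09,   # A, not B
    (False, False): 0.18,  # not A, not B
}

# Marginals: add up every cell in which the event occurs.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)

p_a_and_b = joint[(True, True)]
p_a_or_b = p_a + p_b - p_a_and_b      # union rule
p_a_given_b = p_a_and_b / p_b         # conditional probability
p_not_a_or_not_b = 1 - p_a_and_b      # complement of (A and B)

print(round(p_a, 2), round(p_b, 2))
print(round(p_a_or_b, 2), round(p_a_given_b, 3), round(p_not_a_or_not_b, 2))
```

The last line uses the fact that "not A or not B" is the complement of "A and B", which gives the same 0.52 as the union formula.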
Problem Set 3: Part 4- This question uses Bayes' Rule. (It can be solved with other techniques, as well). P[A|B] = 0.55 , P[A|B bar ] = 0.72 , and P[B] = 0.09 . What is P[B|A]? Part 5- This question uses Bayes' Rule. (It can also be solved with other techniques). P[A|B] = 0.80 , P[A] = 0.76 , and P[B] = 0.60 . What is P[B|A]?
4) Bayes' Rule: P[B|A] = [P(A|B)xP(B)]/[(P(A|B)xP(B))+(P(A|Bbar)xP(Bbar))] --> The denominator is just P(A), built up from the two ways A can happen --> P(Bbar) = 1-0.09 = 0.91 --> P(B|A) = (0.55x0.09)/((0.55x0.09)+(0.72x0.91)) = 0.0495/0.7047 = 0.07 5) P[A|B] = [P(B|A)xP(A)]/P(B); let x = P(B|A) --> 0.80 = (x*0.76)/0.60 --> x = (0.80x0.60)/0.76 --> P(B|A) = 0.632
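Both versions of Bayes' Rule used in Parts 4 and 5 fit in a few lines. A minimal sketch; the function names are invented here for illustration.

```python
def bayes_total(p_a_given_b, p_a_given_not_b, p_b):
    """P(B|A) when P(A) has to be built from the law of total probability."""
    numerator = p_a_given_b * p_b
    p_a = numerator + p_a_given_not_b * (1 - p_b)
    return numerator / p_a


def bayes_simple(p_a_given_b, p_a, p_b):
    """P(B|A) when P(A) is given directly."""
    return p_a_given_b * p_b / p_a


part4 = bayes_total(0.55, 0.72, 0.09)   # Part 4 inputs
part5 = bayes_simple(0.80, 0.76, 0.60)  # Part 5 inputs
print(round(part4, 3), round(part5, 3))
```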
When is it better to use complementary probability (finding the probability of the complement, and subtracting it from 1)? 1) Weather forecast: 0.10 chance of rain each day; this is independent (rain on tue doesn't change probability of rain on wed). P(rains within 7 days?)
When it's sequential; asking for AT LEAST ONE of... 1) 1-P(no rain, 7 times in a row) = 1-(1-0.10)^7 = 0.5217
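The "at least one" pattern can be written as a tiny helper. A sketch that assumes the trials are independent with the same probability each time; the function name is illustrative.

```python
def p_at_least_once(p_each, n_trials):
    """P(event happens at least once in n independent, identical trials)."""
    p_never = (1 - p_each) ** n_trials  # the complement: it never happens
    return 1 - p_never


rain_within_week = p_at_least_once(0.10, 7)
print(round(rain_within_week, 4))
```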
1) Explain what B0 and B1 mean. 2) How much of y is explained by x?
1) B0=base level (intercept), average y when all x=0 B1=effect of x on y; the change in average y when x goes up by 1 unit (the slope) 2) r^2, or the coefficient of determination, is the fraction of the variation in y that is explained by x!
1) Arithmetic mean 2) Geometric mean 3) Harmonic mean
i. *Arithmetic mean* = sum of all observations/# of observations --> Doesn't always work well --> ie- to beach = 75 mph (150 mi distance; 2 hours) - from beach = 25 mph (150 mi distance; 6 hours) --> Arithmetic mean of the two speeds = (75+25)/2 = 50 mph, which is wrong --> Total distance = 300 miles in 8 hours --> 300/8 = *actual average speed is 37.5 mph* --> this is actually the harmonic mean!
ii. *Geometric mean* = ((x1)(x2)...(xn))^(1/n) --> For growth rates (R = decimal form): R bar = [(1+R1)(1+R2)...(1+Rn)]^(1/n) - 1 --> Comes from $100(1+R1)(1+R2)...(1+Rn) = $100(1+R bar)^N --> R bar = average rate --> *1+R bar = Nth root of [(1+R1)(1+R2)...(1+Rn)]* --> N = # of observations --> ie- I'm saving for college. I put $100 into a mutual fund with a variable rate of return. I gain 20% in year 1; lose 20% in year 2. What is the average rate of return? --> Year 1 = +20% (100 + (0.2x100) = now have $120 in account) --> Year 2 = -20% (120 - (0.2x120) = now have $96 in account) --> $100(1+0.2)(1-0.2) = $100(1+R bar)(1+R bar) --> 0.96 = (1+R bar)^2 --> R bar = 0.96^0.5 - 1 = *-0.0202, or about -2% per year* (not the 0% the arithmetic mean would give)
iii. *Harmonic mean* = sum of all reciprocals of values/# of observations, take reciprocal of that --> [(1/X1 + 1/X2 + ...1/Xn)/N]^(-1) --> N = # observations --> X = each observed value --> Not uncommon; used for averaging ratios (ie- fuel efficiency, CPI for inflation, etc.) --> ie- to beach = 75 mph (150 mi distance; 2 hours) - from beach = 25 mph (150 mi distance; 6 hours) --> Harmonic mean = [(1/75 + 1/25)/2]^(-1) = *37.5 mph* --> ie- Car A = 20 miles per gallon; Car B = 40 miles per gallon Average miles/gallon = ? --> I drive each car for 80 miles --> Car A uses 80/20 = 4 gallons; Car B uses 80/40 = 2 gallons --> 160 miles total on 6 gallons total --> *avg miles/gallon = 160/6 = 26.67 mpg* OR using harmonic mean: --> [(1/20 + 1/40)/2]^(-1) = 26.67 mpg!
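Python's standard `statistics` module has all three means built in, which makes the examples above easy to check. A short sketch (requires Python 3.8+ for `geometric_mean`); the variable names are illustrative.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Beach trip: equal distances at 75 mph and 25 mph.
speeds = [75, 25]
naive_speed = mean(speeds)           # arithmetic mean, misleading here
true_speed = harmonic_mean(speeds)   # matches 300 miles / 8 hours

# Mutual fund: +20% then -20%, expressed as growth factors.
avg_return = geometric_mean([1.20, 0.80]) - 1

# Two cars driven the same distance: 20 mpg and 40 mpg.
avg_mpg = harmonic_mean([20, 40])

print(naive_speed, true_speed)
print(round(avg_return, 4), round(avg_mpg, 2))
```

The two-car example works out to 160/6, or about 26.67 mpg, with either method.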
Problem Set 3: Part 3- https://imgur.com/gallery/0p5eEwY shows the probability distribution for A and B. Complete the table, assuming that A and B are independent, with P[A] > P[B] , and answer the questions. 1) P[A∩B] = ? 2) P[A∩B bar] = ? 3) P[A∪B] = ? 4) P[A|B] = ? 5) P[A bar∪B bar] = ?
i. Remember that if A and B are independent: --> P(A|B) = P(A) --> P(AnB) = P(A)*P(B), so every cell in the table = (row total) x (column total)! ii. Also keep in mind that P(A bar) = 1-P(A) iii. Given that P[A] > P[B]: P(A) = 0.60; P(B) = 0.30 --> P(Abar)=0.4; P(Bbar)=0.7
Completed table:
        A      A_
B    | 0.18   0.12 | --> P(B) = 0.30
B_   | 0.42   0.28 | --> P(B_) = 0.70
1) P[A∩B] = P(A) x P(B) = 0.60 x 0.30 = 0.18 2) P[A∩B bar] = 0.60 x (1-0.30) = 0.42 3) P[A∪B] = [P(A)+P(B)]-(P(AnB)) = [0.60+0.30]-[0.18] = 0.72 4) P[A|B] = [P(AnB)/P(B)] = 0.18/0.30 = 0.60 5) P[A bar∪B bar] = [P(A bar)+P(B bar)]-(P(A bar n B bar)) = [0.40+0.70]-[0.40 x 0.70] = 0.82
1) Linear transformations: ☆ E[a + b⋅X] = ? ☆ Var(a + b⋅X) = ? ☆ Cov(a + b⋅X,Y) = ? ☆ Corr(a + b⋅X,Y) = ? 2) Linear combos: ☆ E[X +Y] = ? ☆ Var(X +Y) = ? ☆ Cov(X +Y,Z) = ? 3) Multiplying RVs: ☆ E[X ⋅Y] = ?
1) ☆ E[a + b⋅X] = a + b⋅E[X] ☆ Var(a + b⋅X) = b^2⋅Var(X) ☆ Cov(a + b⋅X,Y) = b⋅Cov(X,Y) ☆ Corr(a + b⋅X,Y) = Corr(X,Y) when b > 0 (the sign flips when b < 0) 2) ☆ E[X +Y] = E[X]+ E[Y] ☆ Var(X +Y) = Var(X) + Var(Y) + 2⋅Cov(X,Y) ☆ Cov(X +Y,Z) = Cov(X,Z) + Cov(Y,Z) 3) ☆ E[X ⋅Y] = E[X]⋅E[Y], when X and Y are independent! --> This parallels P(XnY) = P(X)xP(Y), which also holds when X and Y are independent
Midterm 2 12) Calculate the standard error in X , in a sample of N = 17 with s^2 = 21.3 13) When the size of a sample doubles, how does the standard error for a proportion change (everything else the same)? 14) Calculate a 92% confidence interval for µ , in a (large) sample of N = 100 with mean = 75 and s^2 = 25
12) SEx = s/N^0.5 --> (21.3/17)^0.5 = 1.1193 13) SEp = [p(1-p)/N]^0.5 --> N sits under the square root, so every time the sample doubles, SE is divided by 2^0.5 (it shrinks to about 71% of its old value) 14) 92% CI for µ = sample mean +/- (Z of alpha/2 x SE) i. sample mean = 75 ii. alpha = 1-0.92 = 0.08 --> Z of alpha/2 = Z of 0.04 --> reverse trace p-area = 0.04 for Z score --> 1.75 iii. SE = s/N^0.5 --> 5/100^0.5 = 0.5 iv. 92% CI for µ = 75+/-(1.75x0.5) --> *(74.125, 75.875)*
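The standard error and the large-sample confidence interval can be reproduced without a Z table by using `statistics.NormalDist`. A sketch only; the helper names are made up, and the interval uses the exact Z value rather than the rounded 1.75.

```python
from math import sqrt
from statistics import NormalDist


def se_mean(sample_var, n):
    """Standard error of the sample mean: s / sqrt(N)."""
    return sqrt(sample_var / n)


def z_interval(xbar, sample_var, n, confidence):
    """Large-sample CI for the mean: xbar +/- Z(alpha/2) * SE."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * se_mean(sample_var, n)
    return xbar - margin, xbar + margin


se_q12 = se_mean(21.3, 17)
low, high = z_interval(75, 25, 100, 0.92)
shrink = se_mean(1.0, 2) / se_mean(1.0, 1)  # effect of doubling N on the SE

print(round(se_q12, 4), round(low, 3), round(high, 3), round(shrink, 3))
```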
Suppose you are drawing 2) Suppose two seals are trying to get pregnant within 3 months. P(Pregnancy) = 1/3 each month; stays constant. What is the probability of getting pregnant within 3 months?
Tree: Total possibilities - 8 sodas - 2 beers --> All items = equally likely to be gotten --> Purchase of 2 items; probability for getting 0, 1, 2 beers? P(1st purchase): Coke = 8/10 Beer = 2/10 P(2nd purchase, if you got coke the first time): Coke = 7/9 (1 less coke) Beer = 2/9 P(2nd purchase, if you got beer the first time): Coke = 8/9 Beer = 1/9 P(1st purchase = coke; 2nd purchase = coke): (8/10)(7/9) = 56/90 (0 beers) P(1st purchase = coke; 2nd purchase = beer): (8/10)(2/9) = 16/90 P(1st purchase = beer; 2nd purchase = coke): (2/10)(8/9) = 16/90 P(1st purchase = beer; 2nd purchase = beer): (2/10)(1/9) = 2/90 Thus, probability for 0-2 beers--> # Beers: 0; 1; 2 P(# Beers): 56/90; 32/90; 2/90
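The tree can be double-checked by brute force: list every ordered pair of distinct items and count the beers. A sketch using exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

items = ["soda"] * 8 + ["beer"] * 2

# Every ordered way to buy 2 different items out of 10 (10 x 9 = 90 ways).
counts = Counter()
for first, second in permutations(range(len(items)), 2):
    beers = (items[first] == "beer") + (items[second] == "beer")
    counts[beers] += 1

total = sum(counts.values())
dist = {k: Fraction(v, total) for k, v in sorted(counts.items())}

for beers, prob in dist.items():
    print(beers, counts[beers], "/", total, "=", prob)
```

`Fraction` reduces automatically, so 56/90 prints as 28/45; the raw counts are printed next to it to match the tree.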
State the confidence interval formula for the following: 1) Mean 2) Proportion 3) Difference in means 4) Difference in means, assuming equal variance
--> CIs always use sample estimators; that's the whole point of an interval estimator: it estimates population parameters 1) *CI of mean* = mean^ +/- [tα/2,df x (s/N^0.5)] --> mean^ = sample mean --> df = N-1 --> s = SD of sample --> [tα/2,df x (s/N^0.5)] = margin of error --> For large samples, Zα/2 = approximately correct 2) *CI of proportion* = p^ +/- [Zα/2 x (p^(1-p^)/N)^0.5] --> [Zα/2 x (p^(1-p^)/N)^0.5] = margin of error 3) *CI of diff. in means* = Δmean^ +/- [tα/2,df x ((s1^2/N1)+(s2^2/N2))^0.5] --> Δmean^ = difference in sample means --> s1^2, s2^2 = sample variances --> df = N1+N2-2 --> [tα/2,df x ((s1^2/N1)+(s2^2/N2))^0.5] = margin of error 4) *CI of diff. in means, if variances are the same* = Δmean^ +/- [tα/2,df x sp x ((1/N1)+(1/N2))^0.5] --> This substitutes the pooled variance sp^2 for both s1^2 and s2^2 (bc the variances are assumed equal), so sp factors out of the square root --> df = N1+N2-2 --> sp = pooled calculation of SD = {[s1^2(N1-1) + s2^2(N2-1)]/[N1+N2-2]}^0.5 --> For large samples, Zα/2 can be used to approximate tα/2,df
What is the standard error of the: 1) Sample mean (x)? 2) Sample proportion (p)? 3) x1-x2? 4) p1-p2? What is proportion? --> ie- you're interested how many red m&ms there exist out of every m&m that exists. there are 170,000 m&ms total, 17,000 red m&ms. what is the population proportion?
--> SE = SD of the estimator across repeated samples (how much the estimate bounces around from sample to sample) 1) SEx=s/(n^0.5) --> s=sample estimate of population SD 2) SEp=[p(1-p)/n]^0.5 3) SEx1-x2=[(s1^2/n1)+(s2^2/n2)]^0.5 4) SEp1-p2=([(p1(1-p1))/n1]+[(p2(1-p2))/n2])^0.5 Proportion = number of subjects that possess the trait you are investigating, divided by the total number of subjects --> the population proportion = 17,000/170,000=0.1
Problem Set 3: Part 1- https://imgur.com/gallery/qaI0vxB shows the probability distribution for A and B. 1) P[A∩B] = ? 2) P[A∩B bar] = ? 3) P[A∪B] = ? 4) P[A|B] = ? 5) P[A bar∪B bar] = ?
--> Since joint distribution is given, AnB = 0.19; AuB = P(A)+P(B)-P(AnB), etc.--> n = in table!!!! WOW 1) P(AnB) = 0.19 2) P[A∩B bar] = 0.46 3) P[A∪B] = P(A)+P(B)-P(AnB) = (0.65+0.4)-0.19 = 0.86 4) P[A|B] = P(AnB)/P(B) = 0.19/0.4 = 0.475 5) P[A bar∪B bar] = (0.35+0.6)-(0.14) = 0.81
--> what is p hat? --> what is parameter? --> what is statistic? 1) What is the E(p^)? 2) What is var(p^)? 3) What is SE(p^)? 4) What are the 3 requirements for p hat? 5) What is Bayes' formula? 6) What is a "space" in regards to probability? --> event=? 7) Simple vs. complex events
-->p hat=sample proportion; proportion of interest within sample --> parameter = descriptor of population --> statistic = descriptor of sample 1) E(p^)=p 2) Var(p^)=(p(1-p))/N 3) SE(p^)=[(p(1-p))/N]^0.5 --> SE is the SD of p^ across samples; it's just the sq root of the variance! 4) unbiased; accurate; consistent 5) P(B|A)=[P(A|B)xP(B)]/[(P(A|B)xP(B))+(P(A|Bbar)xP(Bbar))] 6) "Space"- describes all possible outcomes --> Event=subset/part of the space 7) --> Simple = 1 desired outcome --> Complex = multiple/intersecting desired outcomes (drawing a 6 of spades)
1) Define the following: i. Type I error --> What is the probability of a Type I error/the frequency of its occurrence? ii. Type II error --> What is the probability of a Type II error/the frequency of its occurrence? 2) What are the 3 ways to calculate 2 sided hypotheses? b) What is the critical value?
1) i. Type I error = rejecting a true H0 --> ie- H0 = person is innocent--> result = imprisoning an innocent person --> Frequency of Type I error occurrence = α --> Say α=0.05--> probability of rejecting a true H0=5%; we only reject when p<0.05, so a true H0 gets rejected at most 5% of the time--> we can reject the H0 without too much worry ii. Type II error = failing to reject a false H0 --> ie- H0 = person is innocent--> result = setting a guilty person free --> Frequency of Type II error occurrence = β --> Say β=0.50--> probability of failing to reject a false H0=50% --> Power = 1-β 2) a) *Confidence Interval method:* i. Construct (1-α)x100% confidence interval for population parameter in question (ie- hypothesizing about population mean, variance, SD, etc) ii. If hypothesized value of population parameter falls OUTSIDE the confidence interval, we REJECT the H0 --> If hypothesized value is OUTSIDE the CI, hypothesized value of H0 is not a likely value for the parameter --- b) *Test Statistic method:* --> Test Statistic=(observed - hypothesized)/SE of sample statistic (another version of Z score, but for a sample statistic) i. Construct test statistic (TS) using formula ii. Compare test statistic to critical value from the relevant distribution --> Critical value = largest absolute value the TS could take, and still be consistent with H0 (the boundary of the rejection region) --> Critical value = either tα/2,df or Zα/2--> depending on application iii. Reject H0 if: |test statistic| > critical value iv. Rejection rules: 1) For 2 sided test, population mean: --> Reject H0 if... |(mean^-hypothesized µ)/(s/N^0.5)| > tα/2,N-1 2) For 2 sided test, diff in means (often ok to use pooled variance in SE): --> Reject H0 if... |(Δmean^- hypothesized Δµ)/(sp x [(1/N1)+(1/N2)]^0.5)| > tα/2,N1+N2-2 3) For 2 sided test, population proportion: --> Reject H0 if... |(p^-hypothesized p)/[(p^(1-p^))/N]^0.5| > Zα/2 4) For 2 sided test, diff in proportions: --> Reject H0 if... |(Δp^- hypothesized Δp)/[((p1^(1-p1^))/N1)+((p2^(1-p2^))/N2)]^0.5| > Zα/2
--- c) *P-Value method:* --> p-value = chance of getting a test statistic at least as extreme as the one observed, IF H0 is true (it is NOT the chance of H0 being true); compared to α (the chance of rejecting a true H0)--> lower p-value = reject H0; higher p-value = fail to reject H0 i. If p<α = reject H0; if p>α = fail to reject H0 --> Lower p = sample would have been unlikely to be generated, had H0 been true ii. Calculate as 2xP(t>|test statistic|) or 2xP(Z>|test statistic|)--> *this is for TWO SIDED TEST! for one sided test, just use the area in one tail!* --> Area of t that's above the test statistic; area of Z that's above the test statistic iii. Rejection rules: 1) For 2 sided test, population mean: --> Reject H0 if... 2xP[t > |(mean^- hypothesized µ)/(s/N^0.5)|] < α 2) For 2 sided test, diff in means (often ok to use pooled variance in SE): --> Reject H0 if... 2xP[t > |(Δmean^- hypothesized Δµ)/(sp x [(1/N1)+(1/N2)]^0.5)|] < α 3) For 2 sided test, population proportion: --> Reject H0 if... 2xP[Z > |(p^- hypothesized p)/[(p^(1-p^))/N]^0.5|] < α 4) For 2 sided test, diff in proportions: --> Reject H0 if... 2xP[Z > |(Δp^- hypothesized Δp)/[((p1^(1-p1^))/N1)+((p2^(1-p2^))/N2)]^0.5|] < α
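The three methods always give the same decision for a two sided test, which is easy to see by running them side by side. A sketch for the large-sample (Z) case only; the function name, the dictionary keys, and the hypothesized value of 76 are made up for illustration.

```python
from statistics import NormalDist


def two_sided_z_test(estimate, hypothesized, se, alpha=0.05):
    """Run the CI, test statistic, and p-value methods for one hypothesis."""
    std_normal = NormalDist()
    z_stat = (estimate - hypothesized) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    return {
        "z": z_stat,
        "p": p_value,
        "ci": ci,
        "reject_by_statistic": abs(z_stat) > z_crit,
        "reject_by_p_value": p_value < alpha,
        "reject_by_ci": not (ci[0] <= hypothesized <= ci[1]),
    }


# Hypothetical example: sample mean 75, SE 0.5, H0: mu = 76.
result = two_sided_z_test(75, 76, 0.5)
print(round(result["z"], 2), round(result["p"], 4))
print(result["reject_by_statistic"], result["reject_by_p_value"], result["reject_by_ci"])
```

For small samples the same structure applies, with t critical values and t tail areas in place of the Z ones.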
Midterm 1: 1) A sample has a mean of 50 and a SD of 10. Approximately what fraction of observations is between 40 and 70? 2) What word is used to describe a distribution that is unbalanced or asymmetric? 3) What is the range of possible values for a correlation coefficient? 4) During 2 years, the interest rate was 5% and 35%. Calculate the average interest rate during this period. 5) ln(5)=1.61, ln(10)=2.3. What is ln(200)? 6) Type I error? Type II error? 7) How do you know whether to reject the null hypothesis, using... a) Test statistic (not to be confused with t-value) b) p-value c) Confidence interval
1) Approximately=empirical rule=2/3-95-99.7% rule for 1-2-3 SDs from the mean --> 40 is 1 SD below 50 --> one side is 0.5x(2/3)=0.333 --> 70 is 2 SDs above 50 --> one side is 0.5x0.95=0.475 --> Observations between 40 and 70=0.3333+0.475=0.8083=80.83%! 2) Skewed 3) -1<=r<=1 (-1 to 1) 4) Mean of rates= [(1+r1)(1+r2)...(1+rn)]^(1/n)-1 --> [(1.05x1.35)^(1/2)]-1 --> 1.190588-1=*0.190588* (about 19.06%) 5) Apply properties of logs to this! ln(ab)=ln(a)+ln(b); ln(a/b)=ln(a)-ln(b); ln(a^n)=ln(a) x n --> ln(200)=ln([10/5]x10^2) --> [2.3-1.61]+(2.3x2)=*ln(200)=5.29* 6) Type I error=reject true hypothesis --> Imprisoning innocent person (H0=they're innocent) Type II error=fail to reject false hypothesis --> Acquitting guilty person (H0=they're innocent) 7) a) |Test statistic| ((o-h)/SE) > critical value (Z score of alpha/2) = reject H0 b) p<alpha = reject H0 (alpha=1-confidence level; usually alpha=0.05) --> So if confidence level=90%, if p<0.1 = reject H0 c) H0 value falls outside of confidence interval = reject H0
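Answers 1, 4, and 5 are quick to verify numerically. A sketch; the "exact" share assumes the data are exactly normal, which the question does not state, so it is shown only for comparison with the rule of thumb.

```python
from math import sqrt
from statistics import NormalDist

# 1) Share of observations between 40 and 70 (mean 50, SD 10).
dist = NormalDist(mu=50, sigma=10)
share_exact = dist.cdf(70) - dist.cdf(40)     # if exactly normal
share_rule = 0.5 * (2 / 3) + 0.5 * 0.95       # rule-of-thumb version from the notes

# 4) Average of 5% and 35% growth: geometric mean of the growth factors.
avg_rate = sqrt(1.05 * 1.35) - 1

# 5) ln(200) = ln(10/5) + 2*ln(10), using the rounded logs given in the question.
ln_200 = (2.3 - 1.61) + 2 * 2.3

print(round(share_exact, 4), round(share_rule, 4))
print(round(avg_rate, 6), round(ln_200, 2))
```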
1) For a study involving one population and a sample size of 18 (assuming you have a t-distribution), what row of the t-table will you use to find the right-tail ("greater than") probability affiliated with the study results? 2) For a study involving a paired design with a total of 44 observations, with the results assuming a t-distribution, what row of the table will you use to find the probability affiliated with the study results? 3) A t-value of 2.35, from a t-distribution with 14 degrees of freedom, has an upper-tail ("greater than") probability between which two values on the t-table?
1) Degrees of freedom (df) = the row to use on the t-table --> df=n-1 --> n=18--> n-1=17; row 17 is the one you'd use to find the probability! 2) For a paired study with 44 observations, there are 44/2=22 pairs, so n=22 --> df=n-1=22-1=21; use row 21 3) Reverse trace row 14 df--> 2.35 --> There is no exact 2.35 value; it falls between 2.145 and 2.624 --> Column headings for these values = 0.025 and 0.01 Thus, upper tail probability = between 0.025 and 0.01
1) What is an estimator that is unbiased and efficient called? i. What is the statistic vs. the parameter? 2) What is the estimator for the µx? p? σ^2? 3) What are the two types of guesses of a parameter? --> What is the confidence interval? Margin of error? 4) How do you calculate α? i. What is the condition for Zα/2?
1) Estimator that is unbiased and efficient = *Minimum Variance Unbiased Estimator* (sample estimates to approximate population parameters) i. Statistic = sample approximation of population value --> Parameter = population value 2) Estimator AKA sample version/approximation for µx = sample mean (x bar) Estimator AKA sample version/approximation for p = sample p^ Estimator AKA sample version/approximation for σ^2 = s^2 3) Two types of guesses of parameter: i. Point estimator = single guess of parameter (sample mean is a good example of this) ii. Interval estimator = range of plausible values that parameter could fall within --> ie- Confidence interval = range of values that parameter could fall within [typically point estimator +/- (margin of error)] --> Margin of error = *reliability factor (z or t value) x SE of point estimator* 4) α = 1-Q (just the area outside the confidence range under bell curve) --> Q = decimal of confidence level (95% = 0.95) i. Condition to remember for Zα/2: P(Z>Zα/2) = α/2 --> ie- Q=0.95; α=0.05; Zα/2--> Z0.025=1.96 --> Thus, P(Z>1.96)=0.025 --> Essentially, the area above 1.96 (which is the Z score for the boundary of 0.95) = 0.025 = one of the 2 tails left outside the confidence interval
Part 1: 1) 2) Suppose that β1 = 10 with a standard error of 4. What is the probability that the estimated effect β1^ is negative? 3) Suppose that β1 = 13 with a standard error of 10. What is the probability that the estimated effect β1^ appears statistically significant? 4) Suppose that a regression produces β^ = 0.57 with a standard error of 6.97. Would these results be unusual if the true β = 0? Explain. 5) Suppose that a regression contains N = 23 observations and two β coefficients, β0 and β1. The estimate of β1 is 73.3 with a standard error of 9.2 . Construct a 95% confidence interval for β1. (Hint: use tα/2,N−K) 6) Suppose that we estimate the relationship between a consumer's QUANTITY(demanded) and PRICE of a good, using a double-log form: ln(Qd)=B0+B1(ln(P))+u We estimate β1^ = −0.86 with a standard error of 0.05 . The sample has N = 100 observations. a) Test the hypothesis that demand is unit elastic (against the alternative that it is not unit elastic). b) Calculate the probability that demand is elastic (that is, β1 < −1) given the available information.
1) For t=2 and alpha=0.025 (95% CI), df=60--> df=n-1--> *n=61* 2) This is a sampling distribution question, not a hypothesis test: B1^ is approximately normal, centered on the true B1=10, with SD = SE = 4 i. Standardize the cutoff of 0 --> Z = (0-10)/4 = -2.5 ii. P(B1^<0) = P(Z<-2.5) = p-area to the LEFT of -2.5 (same as the area to the RIGHT of 2.5) = *p=0.006 that B1^<0* iii. So the estimate comes out negative only about 0.6% of the time. 3) Same idea: B1^ is approximately normal, centered on the true B1=13, with SE = 10. B1^ "appears statistically significant" (two sided, alpha=0.05 assumed) when |B1^/SE| > 1.96, ie- when B1^ > 19.6 or B1^ < -19.6 i. Upper tail: Z = (19.6-13)/10 = 0.66 --> P(Z>0.66) = 0.2546 ii. Lower tail: Z = (-19.6-13)/10 = -3.26 --> P(Z<-3.26) = 0.0006 iii. Add both tails--> *about a 0.255 chance that B1^ appears statistically significant* (so about 3 times out of 4 a real effect of 13 would be missed) 4) Test stat=(obs-hyp)/SE --> H0: B=0; Ha: B=/=0 (two sided, find area to the RIGHT of test stat, x 2 bc both sides) i. Z = (0.57-0)/6.97 = 0.082 ii. p-area to the right of Z=0.08 is 0.468--> 0.468x2 = 0.936 iii. Since p = 0.936, which falls way above 0.05, we fail to reject the H0. Therefore, these results would NOT be unusual if the true B=0. 5) N=23, B1^=73.3, SE of B1^=9.2, construct 95% CI for B1 --> statistic +/- [tα/2,N-K x SE of statistic] i. df = N-K = 23-2 = 21 (N=# observations; K=# estimated parameters, here B0 and B1) ii. CI = B1^ +/- [t(0.025, 21) x SE of B1^] --> 73.3 +/- [2.08 x 9.2] --> margin of error = 19.136 --> *(54.164, 92.436)* 6) ln(Qd)=B0+B1(ln(P))+u B1^=-0.86; SE=0.05; N=100 a) H0: B1=-1 and Ha: B1=/=-1 (demand unit elastic = -1) (two sided test, so 2 x the area to the right of test stat) i. Test statistic = (-0.86-(-1))/0.05--> Z = 2.8 ii. P-area for Z of 2.8 = 0.003 x 2 = 0.006 --> *Since p=0.006, which is lower than 0.05, we would reject the H0 that demand is unit elastic.*
b) Treat B1 as approximately normal around the estimate -0.86 with SE = 0.05, and find the area below -1 i. Z = (-1-(-0.86))/0.05 = -2.8 ii. P-area to the LEFT of Z=-2.8 (same as the area to the RIGHT of 2.8)--> *0.003 chance that B1<-1 or that demand is elastic* --> Since p=0.003, which is lower than 0.05, we would reject the H0 that B1<-1, ie- that demand is elastic.
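These regression questions can be checked with the normal distribution directly. A sketch with loud assumptions: question 3 is read here as "how often does |B1^/SE| exceed 1.96 when the true B1 is 13" (a 5% two sided test), and the t value of 2.08 for 21 df is hard-coded from the t-table rather than computed. All variable names are illustrative.

```python
from statistics import NormalDist

# Q2: true B1 = 10, SE = 4 -> chance that the estimate comes out negative.
p_negative = NormalDist(mu=10, sigma=4).cdf(0)

# Q3 (assumed reading): true B1 = 13, SE = 10 -> chance that |estimate / SE| > 1.96.
sampling = NormalDist(mu=13, sigma=10)
cutoff = 1.96 * 10
p_significant = (1 - sampling.cdf(cutoff)) + sampling.cdf(-cutoff)

# Q5: 95% CI for B1 with N - K = 21 df; 2.08 is the t-table value (hard-coded).
t_crit = 2.08
ci = (73.3 - t_crit * 9.2, 73.3 + t_crit * 9.2)

# Q6a: two sided test of unit elasticity, H0: B1 = -1.
z_stat = (-0.86 - (-1)) / 0.05
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(round(p_negative, 4), round(p_significant, 3))
print(tuple(round(c, 3) for c in ci), round(z_stat, 2), round(p_value, 4))
```

The unrounded p-value for 6a is slightly below the 0.006 obtained from doubling a rounded table area; the decision to reject is the same.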
Quiz 3: 1) The probability of A is 0.60, the probability of B is 0.45, and the probability of A or B is 0.80. The probability that both A and B occur is = ? 2) The probability of A is 0.60, the probability of B is 0.45, and the probability of both is 0.30. --> True or False: A and B are independent. 3) Use this following probability distribution for X to solve the next four problems. A repairperson replaces broken windshield glass throughout the day. The number of windshields replaced is: X: 1; 2; 3; 5 P(X): 0.42; 0.26; 0.04; 0.28 4) The expected number of windshields replaced is µ = A. µ = 1 * 0.42 + 2 * 0.26 + 3 * 0.04 + 5 * 0.28 B. µ = (1 + 2 + 3 + 5) / 4 C. µ = (0.42 + 0.26 + 0.04 + 0.28) / 4 D. µ = (1 * 0.42 + 2 * 0.26 + 3 * 0.04 + 5 * 0.28) / 4 5) Using µ to represent the answer to the previous question, the variance in the number of windshields replaced is σ^2 = A. σ^2 = (1 - µ)^2 * 0.42 + (2 - µ)^2 * 0.26 + (3 - µ)^2 * 0.04 + (5 - µ)^2 * 0.28 B. σ^2 = (1^2 * 0.42 + 2^2 * 0.26 + 3^2 * 0.04 + 5^2) / (4 - 1) C. σ^2 = ((1 - µ) * 0.42 + (2 - µ) * 0.26 + (3 - µ) * 0.04 + (5 - µ) * 0.28)^2 D. σ^2 = ((1 - µ) * 0.42 + (2 - µ) * 0.26 + (3 - µ) * 0.04 + (5 - µ) * 0.28)^2 / (4 - 1) E. σ^2 = ((1 - µ)^2 + (2 - µ)^2 + (3 - µ)^2 + (5 - µ)^2) / (4 - 1) F. σ^2 = ((1 - µ)^2 * 0.42 + (2 - µ)^2 * 0.26 + (3 - µ)^2 * 0.04 + (5 - µ)^2 * 0.28) / (4 - 1) 6) The worker is paid a salary depending on his outcome: $S = 10 + 2*X. Let µ represent the expected value of X and σ^2 the variance in X, from Questions 2 and 3. --> E(S) = ? --> Var(S) = ? 7) A firm engages in two research projects. The returns from each project are random. In millions of dollars, the expected profit from Project X is E[X] = 3.7, and the variance is Var(X) = 9.9. The expected profit from Project Y, also in millions of dollars, is E[Y] = 8.3, and the variance is Var(Y) = 12.2. Because there is some spillover in the knowledge acquired during research between the two projects, Cov(X,Y) = 5.3. --> Expected total profit = ? --> Variance in total profits = ? --> Correlation between 2 profits = ? 8) Find cov(x,y) for the following distribution:
          x=2    x=8
y=-3  |  0.43   0.22 |
y=5   |  0.19   0.16 |
1) Union rule: P(AuB) = P(A)+P(B)-P(AnB) --> 0.80 = (0.6+0.45)-(P(AnB))--> P(AnB) = 0.25 2) Independence test: P(A|B) = P(A) --> P(A|B) = P(AnB)/P(B)--> (0.3)/0.45 = 0.667; P(A) = 0.6--> they're NOT independent! (also, P(A)xP(B) = 0.27, not 0.30) --> False 3) 0.58 4) E(x) = sum of all (x*P(x)) --> A. µ = 1x0.42 + 2x0.26 + 3x0.04 + 5x0.28 = 2.46 5) Var(x) = sum of all ((x-mean)^2)*(P(x)) --> A. σ^2 = (1 - µ)^2x0.42 + (2 - µ)^2x0.26 + (3 - µ)^2x0.04 + (5 - µ)^2x0.28 = 2.77 6) Asking for linear transformations, with E(x)=2.46; Var(x)=2.77 i. E(S) = E(10+2x) = 10+2(E(x)) --> 10+2(2.46)=*14.92* ii. Var(S) = Var(10+2x) = (2^2)(Var(x)) --> (2^2)(2.77)=*11.08* (11.07 if Var(x) isn't rounded first) iii. Sidenote: Cov(a+bx, y) = (b)(cov(x,y)) 7) i. E(x+y) = E(x)+E(y) --> 3.7+8.3 = *12* ii. Var(x+y) = var(x)+var(y)+(2)(cov(x,y)) --> 9.9+12.2+10.6 = *32.7* iii. Corr(x,y) = Cov(x,y)/(SDx*SDy) --> 5.3/(3.1464x3.4929) = *0.48* 8)
          x=2    x=8
y=-3  |  0.43   0.22 | --> P(y=-3) = 0.65
y=5   |  0.19   0.16 | --> P(y=5) = 0.35
--> P(x=2) = 0.62 --> P(x=8) = 0.38
Cov(x,y) for joint distr. table = sum of [(x-E(x))(y-E(y))*P(x,y)] i. E(x) = (2x0.62)+(8x0.38) = 4.28 ii. E(y) = (-3x0.65)+(5x0.35) = -0.2 --> Go through each sq and apply this to each sq as follows: [(2-4.28)(-3+0.2)(0.43)]+[(8-4.28)(-3+0.2)(0.22)]+[(2-4.28)(5+0.2)(0.19)]+[(8-4.28)(5+0.2)(0.16)] = *Cov(x,y) = 1.296*
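The expected value, variance, linear transformation, and covariance answers can all be recomputed from the two tables. A minimal sketch; the data structures and names are illustrative.

```python
from math import sqrt

# Windshield distribution: value -> probability.
dist = {1: 0.42, 2: 0.26, 3: 0.04, 5: 0.28}
mu = sum(x * p for x, p in dist.items())
var = sum((x - mu) ** 2 * p for x, p in dist.items())

# Salary S = 10 + 2X: the constant shifts the mean only; the slope is squared in the variance.
e_s = 10 + 2 * mu
var_s = 2 ** 2 * var

# Two research projects.
var_total = 9.9 + 12.2 + 2 * 5.3
corr = 5.3 / (sqrt(9.9) * sqrt(12.2))

# Joint table for question 8: (x, y) -> probability.
joint = {(2, -3): 0.43, (8, -3): 0.22, (2, 5): 0.19, (8, 5): 0.16}
e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
cov = sum((x - e_x) * (y - e_y) * p for (x, y), p in joint.items())

print(round(mu, 2), round(var, 4), round(e_s, 2), round(var_s, 4))
print(round(var_total, 1), round(corr, 3), round(cov, 3))
```

Using the unrounded variance gives Var(S) of about 11.07; the 11.08 above comes from rounding Var(x) to 2.77 first.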
1) What is the null hypothesis? Alternative hyp? 2) What is the significance level? 3) What do all hypothesis tests focus on? 4) Define the following types of hypothesis tests: i. One sided (is this one common?) ii. Two sided (is this one common?)
1) Null (H0) = the hypothesis we will test and possibly/want to reject Alternative (Ha) = the other option that must be true if the Null is false (never state that you accept the Alt tho! can only reject or fail to reject the Null) 2) Sig level = α = probability of falsely rejecting the H0 (rejecting the H0 when it is actually true = Type I error) --> α = typically 0.05 --> *H0 is rejected IF p-value < α* --> this keeps the probability of falsely rejecting a true H0 low (at most α)! --> Sometimes expressed as a confidence level = 95%, 99%, 90%--> corresponding α's = 1-Q = 0.05, 0.01, 0.10 3) All hypothesis tests focus on testing hypotheses about a population parameter (values in population) --> About hypothesizing the characteristics of the population and testing the hypothesis 4) Types of hypotheses: i. *One sided* = test hypothesis of population parameter being > or < hypothesized value (not commonly used in economics) --> x=some population parameter --> 2.13 = hypothesized value that we hypothesize population parameter will take *ie1:* - H0: x>=2.13, Ha: x<2.13 OR - H0: x<=2.13, Ha: x>2.13 *ie2:* - H0: x=2.13, Ha: x<2.13 OR - H0: x=2.13, Ha: x>2.13 --> *ie1 and ie2 are equivalent hypothesis packages--> whether H0 is written as equality or weak inequality = irrelevant* ii. *Two sided* = test hypothesis of population parameter being = or =/= hypothesized value (commonly used in economics) --> x=some population parameter --> 2.13 = hypothesized value that we hypothesize population parameter will take *ie:* - H0: x=2.13, Ha: x=/=2.13 --> *Often, hypothesis is whether or not population parameter = 0*
Midterm 2: 1) P[A] = 0.48 , P[B] = 0.31, and P[A∩B] = 0.17 . What is the probability that neither event occurs? In other words, what is P[A'∩B'] ? 2) In today's election, the probability that the Republican Party wins a majority of the Senate is 5/6, and the probability that the Republicans win a majority of the House of Representatives is 1/8. To calculate the probability that they win majorities in both, what additional information would you need?
1) P(A'nB') = 1-P(AuB) = 1-[P(A)+P(B)-P(AnB)] = 1-[(0.48+0.31)-0.17] = 1-0.62 = 0.38 --> P(A'nB') = whole venn diagram minus the union --> P(A'nB') = 1-P(AuB) 2) Senate majority=5/6; House majority=1/8 To find P(majorities in both), what do you need? --> Need to know any one of these: i. If the two events are independent (then just multiply) ii. Conditional probability of Republicans winning majority in senate given they win House majority--> P(A|B)--> P(win the senate, given they win the house) iii. The P(B|A)--> P(win the house, given they win the senate)
Problem Set 3: Part 6- State whether each of the following statements is true or false, and explain. 1) The complement of the union of two events is the intersection of their complements. 2) If A and B are mutually exclusive, then they are independent. 3) If P[A|B] = P[B|A], then P[A] = P[B]. 4) If an event and its complement are equally likely to occur, the probability of that event must be 0.5. 5) If A and B are independent, then A bar and B bar are independent. 6) If A and B are mutually exclusive, then A bar and B bar are mutually exclusive 7) P[A∪B] is greater than or equal to P[A∩B]. 8) P[A∪B] is greater than or equal to P[A]+ P[B]. 9) P[A∩B] is less than or equal to both P[A] or P[B] . 10) P[A]+ P[B] is less than or equal to 1. 11) P[A|B] is greater than or equal to P[A∩B] . 12) P[A|B] is greater than or equal to P[A].
1) The complement of the union of two events is the intersection of their complements. --> True; *complement of (AuB) = (Abar n Bbar)*, so P(Abar n Bbar) = 1-P(AuB) 2) If A and B are mutually exclusive, then they are independent. --> False, mutually exclusive=A cannot happen if B happens (so B happening changes the chance of A to 0) vs. independence=A happening does NOT affect chance of B happening 3) If P[A|B] = P[B|A], then P[A] = P[B]. --> True as long as P(AnB) > 0: P(AnB)/P(B) = P(AnB)/P(A) forces P(A) = P(B) (if A and B are mutually exclusive, both conditionals are 0 and P(A), P(B) can differ) 4) If an event and its complement are equally likely to occur, the probability of that event must be 0.5. --> True, P(A) = 1-P(A) only works when P(A) = 0.5 5) If A and B are independent, then A bar and B bar are independent. --> True; independence transfers over to complements, remember dat! 6) If A and B are mutually exclusive, then A bar and B bar are mutually exclusive --> FALSE; mutual exclusivity doesn't necessarily transfer over to complements. 7) P[A∪B] is greater than or equal to P[A∩B]. --> True, draw venn diagram (the union is the whole area of A+B minus the double-counted AnB sliver, so it always contains AnB!) 8) P[A∪B] is greater than or equal to P[A]+ P[B]. --> False, it's less than or equal to, since P(AuB) = (P(A)+P(B))-P(AnB) 9) P[A∩B] is less than or equal to both P[A] or P[B] . --> P(AnB) = P(A|B)*P(B) --> TRUE! bc P(A|B) is at most 1; so P(AnB) can't be bigger than P(B) (same argument for P(A)) :) 10) P[A]+ P[B] is less than or equal to 1. --> False, A and B aren't necessarily mutually exclusive. If P(A)=0.9 and P(B)=0.9, then P(A)+P(B)=1.8, which is bigger than 1! 11) P[A|B] is greater than or equal to P[A∩B] . --> True, P(A|B) = P(AnB)/P(B)--> divided by a number that is at most 1 = greater or equal :) 12) P[A|B] is greater than or equal to P[A]. --> False, this varies depending on B
List the formulas for the following: 1) Unions 2) Complements 3) Conditional Probability 4) Intersections 5) Bayes Rule 6) You're a doctor. Past data tells you 10% of patients entering clinic = have liver disease. 5% of clinic's patients = alcoholics. Among those diagnosed with liver disease, 7% = alcoholics. What is the probability of a patient having liver disease if they're an alcoholic?
1) Unions: P(AuB) = P(A)+P(B)-P(AnB) 2) Complements: P(Abar)=1-P(A) 3) Conditional Probability: P(A|B) = P(AnB)/P(B) 4) Intersections: P(AnB) = P(A|B)*P(B) 5) Bayes Rule: used for conditional probability Bayes formula = P(A|B) = P(B|A) * P(A) / P(B) - Conversely, P(B|A) = [P(A|B)*P(B)]/[[P(A|B)*P(B)]+[P(A|Bbar)*P(Bbar)]] (--> ie- probability that patient has the disease, GIVEN a positive result; working backwards almost) 6) --> A = patient has liver disease --> B = patient is alcoholic --> P(A) = 0.10 --> P(B) = 0.05 --> P(B|A) or P(patient is alcoholic, if they have/given that they have liver disease) = 0.07 --> P(A|B) = (0.07 * 0.1)/0.05 = 0.14 Another example (screening test): 1/1000 has disease; if disease, P(positive) = 0.99; if no disease, P(positive) = 0.005 --> P(positive|disease) = "sensitivity"
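Both clinic examples can be run through one small Bayes helper. A sketch; the function and variable names are made up, and the screening-test result assumes the trailing notes are asking for P(disease | positive), which the notes do not say outright.

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Liver disease (A) given that the patient is an alcoholic (B).
p_liver_given_alcoholic = bayes(p_b_given_a=0.07, p_a=0.10, p_b=0.05)

# Screening test: P(positive) comes from the law of total probability.
prevalence = 1 / 1000
sensitivity = 0.99            # P(positive | disease)
false_positive_rate = 0.005   # P(positive | no disease)
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
p_disease_given_positive = bayes(sensitivity, prevalence, p_positive)

print(round(p_liver_given_alcoholic, 2), round(p_disease_given_positive, 3))
```

Under that assumption, only about 1 in 6 positive results would belong to someone who has the disease, because the disease is so rare to begin with.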
1. Expected value! 2. Linear transformations! Knowing for x its E(x) and var(x) Want to know: for (a+bx), what is E(a+bx), and var(a+bx) --> a,b = constant
1. E(x) = integral over all x of x*f(x)dx Var(x) = integral over all x of [(x-E(x))^2]f(x)dx --> discrete: sum over all x; uses P(x) --> continuous: integral (∫) over x; uses f(x) 2. E(a+bx) = a+b(E(x)) --> ie- Avg temp < 72 degrees F; avg in C? var(a+bx) = b^2(var(x))
1. Random variables! 2. What is discrete rv? 3. What is continuous rv? 4. What are expected values? 5. What is the variance in relation to probability? 6. What is a binomial distribution? 7. What is the covariance/joint distribution (discrete)? 8. What is correlation (discrete)? --> What if x and y are statistically independent?
1. Random variable = variable whose outcomes are quantitative (often called x/y/z) 2. Discrete random variables = limited set of values --> Described with probability distribution or table 3. Continuous = any value in some range 4. Expected value = mean = E(x) = SUM of all (x*P(x)) --> so each term = (x)*(probability of x) 5. Variance = sum of all (x-E(x))^2(P(x)) --> ux = mean of population = mu 6. Binomial distribution: Way to measure probability of specific successes/failures --> N = # attempts; P = probability of success; x = number of successes --> P(x successes) = N!/[x!(N-x)!]*(P^x)(1-P)^(N-x) 7. cov(x,y) = sum of all [(x-E(x))(y-E(y))*P(x,y)] --> x and y = the values; P(x,y) = probability of both or where the cells join in joint distribution 8. corr(x,y) = [cov(x,y)]/[SDx*SDy] = ρxy --> range = from -1 to 1 --> If x and y are independent, then cov = 0, so corr = 0 (the reverse isn't guaranteed: corr = 0 doesn't prove independence; and independent is NOT the same thing as mutually exclusive)
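The binomial formula in item 6 translates directly into code with `math.comb`. A sketch using exact fractions; the values N = 3 and P = 1/3 are chosen purely for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent attempts with success chance p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 3, Fraction(1, 3)   # illustrative values
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
p_at_least_one = 1 - pmf[0]                        # complement of "zero successes"
expected = sum(x * px for x, px in enumerate(pmf)) # E(x) = sum of x * P(x)

print(pmf, p_at_least_one, expected)
```

The probabilities across x = 0..N always sum to 1, and the expected number of successes equals N x P.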
Problem Set 3: Part 10- A firm receives widgets from two sources. The first manufacturer is more reliable, producing defective widgets 5% of the time. This manufacturer provides 2/3 of the firm's widgets. The second manufacturer produces defective widgets 15% of the time. This manufacturer provides the remaining 1/3 of widgets. Given that a randomly selected widget is defective, what is the probability that it came from the more reliable firm?
10) Classic problem asking for application of Bayes' Rule! --> P(B|A)=it came from firm 1, given it's defective=? --> P(B)=came from firm 1=2/3; P(Bbar)=came from firm 2=1/3 --> P(A)=defective --> P(A|B)=defective given it came from firm 1=0.05 --> P(A|Bbar)=defective given it came from firm 2=0.15 i. P(B|A)=[P(A|B)xP(B)]/[(P(A|B)xP(B))+(P(A|Bbar)xP(Bbar))] --> (0.05x0.667)/[(0.05x0.667)+(0.15x0.333)] = *0.4*
Midterm 2: 10) The random variable X has a normal distribution with a mean of 85.0 and a standard deviation of 10.0. In a sample with N = 25 observations, what is the chance that the sample mean would be less than 88.3? 11) A single M&M candy has a 0.13 chance of being red, independent of the colors of the other candies in the bag. Let p^ denote the red proportion in a bag of N = 55 candies. 75% of bags will have p^ less than what value?
10) z for sample mean=(x-mean)/(SD/N^0.5) P(x bar<88.3) = ? i. Standardize 88.3 --> z=(88.3-85)/(10/5) = 1.65 ii. Find 1.65 on chart --> P(x bar<88.3) = *0.9505* 11) Asking for the 75th percentile of p^, i.e. a ONE-sided cutoff! (First attempt used a two-sided 75% CI: Z of alpha/2 = Z of 0.125 = 1.15 --> 0.13+(1.15x0.045) = 0.18. That is wrong here, because 87.5% of bags fall below the upper limit of a two-sided 75% interval, not 75%.) i. Find one-sided Z limit for 0.75 --> Z score with area (1-0.75) = 0.25 above it = 0.67 ii. Find SEp^ using the true p = 0.13 --> [(p(1-p))/N]^0.5 --> [(0.13(0.87))/55]^0.5 = 0.04535 --> This means that 75% of bags have a p^ that is 0.67 SE (or less) above the mean iii. Cutoff = p + Z x SEp^ --> 0.13+(0.67x0.04535) = 0.1604 iv. *P(p^ < 0.1604) = 0.75*
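Both answers can be checked without a z table by using `statistics.NormalDist` from the Python standard library. This is a sketch that assumes the normal approximation for p^ in question 11; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1

# 10) P(sample mean < 88.3) with population mean 85, SD 10, N = 25
se_mean = 10 / sqrt(25)
p_below = Z.cdf((88.3 - 85) / se_mean)

# 11) 75th percentile of p-hat with p = 0.13, N = 55 (one-sided cutoff)
se_p = sqrt(0.13 * (1 - 0.13) / 55)
cutoff = 0.13 + Z.inv_cdf(0.75) * se_p

print(round(p_below, 4), round(cutoff, 4))
```

`inv_cdf(0.75)` returns the unrounded one-sided z value (about 0.674), so the cutoff can differ from the table-based answer in the third decimal.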
Problem Set 3: Part 11: Sue calls classmates to ask for help with homework. Each classmate has an independent 0.3 chance of answering their question. Sue continues calling classmates until they get an answer. Then they stop. 1. What is the probability that Sue receives an answer from the fourth classmate they call? 2. What is the probability that they must call at least three classmates? 3. What is the probability that they receive an answer from the N-th classmate they call? A. (0.7)^N B. (0.7)^(N-1) * (0.3) C. 1 - (0.3)^N D. 1 - (0.7)^(N-1) Part 12. A firm hires 3 workers from an applicant pool of 5 men and 5 women. What is the probability that all three employees are male? Part 13. & 14. For these questions, calculate: 1. The probability distribution for the random variable. (For example, P[X = 0], P[X = 1], P[X = 2].) 2. The expected value of the random variable. 3. The variance in the random variable. Part 13. The Acme Manufacturing Company produces a large batch of widgets, of which 20% are purple. Two widgets are chosen at random from the batch. Let X denote the number of purple widgets. Part 14. The Acme Manufacturing Company produces a batch of N = 20 widgets, of which 4 are purple. Two widgets are chosen at random from the batch. Let X denote the number of purple widgets.
11: 1) REMEMBER THAT SUE KEEPS CALLING UNTIL THEY GET AN ANSWER; SO FOR 4TH CLASSMATE TO ANSWER MEANS PREVIOUS THREE DID NOT ANSWER! i. (1-P)^(N-1)x(P) --> 0.7^3x0.3 = *0.1029* 2) Must call AT LEAST 3 classmates = the first 2 classmates do NOT answer = 1-[P(answer on 1st call)+P(answer on 2nd call)] --> At least ALWAYS includes the number itself (at least 3 = includes 3; its complement includes 1, 2 but NOT 3!) i. P(answer on 1st call)=0.7^0x0.3=0.3 ii. P(answer on 2nd call)=0.7^1x0.3=0.21 iii. Complement of P(1)+P(2) = 1-(0.3+0.21) = *0.49* (same as 0.7^2: the first two don't answer) 3) B. (0.7)^(N-1) * (0.3) --> Probability of student NOT answering (necessitating Sue to continue calling on students)^(N-1)*Probability of success on the Nth time --> ie- P(n=3) = 0.7*0.7*0.3 = 0.7^(3-1)*0.3 :-) 12: Picking without replacement: P(women) = 0.5; P(men) = 0.5 at the start First pick = man = 5/10 Second pick = man = 4/9 Third pick = man = 3/8 --> 0.5*(4/9)*(3/8) = *0.083* 13: Asking for simple binomial distribution calculation! --> Binomial distr =[n!/(k!(n-k)!)]xP^kx(1-P)^(n-k) k = x = # successes desired = 0,1,2 n = # trials = 2 P = probability of 1 success/proportion of desired = 0.2 1. a) P[X = 0] = [2!/(0!(2-0)!)]x(0.2^0)x(1-0.2)^(2) = 0.64 b) P[X = 1] = [2!/(1!(2-1)!)]x(0.2^1)x(1-0.2)^(1) = 0.32 c) P[X = 2] = [2!/(2!(2-2)!)]x(0.2^2)x(1-0.2)^(0) = 0.04 2. The expected value of the random variable = sum of all (x)(P(x)) --> (0x0.64)+(1x0.32)+(2x0.04) = *0.4* 3. The variance in the random variable = sum of all [(x-mean)^2(P(x))] --> (((0-0.4)^2)*0.64)+(((1-0.4)^2)*0.32)+(((2-0.4)^2)*0.04) = *0.32* 14: NOT a plain binomial! The batch is small (N=20), so the two widgets are drawn without replacement and the second draw depends on the first (4 purple, 16 not purple). 1. The probability distribution for the random variable: --> P(X=0) = (16/20)x(15/19) = 0.6316 --> P(X=1) = [(4/20)x(16/19)]+[(16/20)x(4/19)] = 0.3368 --> P(X=2) = (4/20)x(3/19) = 0.0316 2. The expected value of the random variable = (0x0.6316)+(1x0.3368)+(2x0.0316) = *0.4* 3. The variance in the random variable = [(0-0.4)^2(0.6316)]+[(1-0.4)^2(0.3368)]+[(2-0.4)^2(0.0316)] = *0.303*
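Parts 11, 12, and 14 can be cross-checked with two short formulas. The sketch below uses the geometric pattern (failures, then one success) for Sue's calls and a counting formula for draws without replacement, which is the hypergeometric setup that applies to the 10-applicant pool and to the small batch of 20 widgets. The function names are made up for illustration.

```python
from math import comb


def geometric_pmf(n, p):
    """P(first success happens on attempt n)."""
    return (1 - p) ** (n - 1) * p


def hypergeom_pmf(k, population, successes, draws):
    """P(k successes in `draws` picks without replacement)."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


p_fourth = geometric_pmf(4, 0.3)                                      # Part 11.1
p_at_least_three = 1 - geometric_pmf(1, 0.3) - geometric_pmf(2, 0.3)  # Part 11.2
p_all_male = hypergeom_pmf(3, population=10, successes=5, draws=3)    # Part 12

# Part 14: 20 widgets, 4 purple, 2 drawn without replacement
pmf_14 = [hypergeom_pmf(k, 20, 4, 2) for k in range(3)]
mean_14 = sum(k * p for k, p in enumerate(pmf_14))
var_14 = sum((k - mean_14) ** 2 * p for k, p in enumerate(pmf_14))
print(p_fourth, p_at_least_three, p_all_male, pmf_14, mean_14, var_14)
```

The mean matches the binomial answer in Part 13, while the variance comes out slightly smaller because drawing without replacement reduces the spread.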
Midterm 2: 12) Calculate the standard error in X , in a sample of N = 17 with sx^2 = 21.3 13) When the size of a sample doubles, how does the standard error for a proportion change (everything else the same)? 14) Calculate a 92% confidence interval for µ , in a (large) sample of N = 100 with X = 75 and s^2 = 25 15) Use the Stata output to calculate the values of β0 and β1 for the regression line Y^ = β0 +β1X (You might not need all information.)
12) SEx = (variance/N)^0.5 = SD/N^0.5 --> (21.3/17)^0.5 = *1.119* 13) When the size of a sample doubles, how does the standard error for a proportion change (everything else the same)? SEp^ = [(p^(1-p^))/N]^0.5, so doubling N --> *SE is divided by 2^0.5 (shrinks to about 71% of its old value)* 14) Calculate a 92% confidence interval for µ , in a (large) sample of N = 100 with mean = 75 and s^2 = 25 CI = mean +/- Z limit of CI x SEmean i. Find mean --> Mean = 75 ii. Find Z limit of CI (Z of alpha/2) = Z of (1-0.92)/2 --> Z limit = from area of (0.08/2) or 0.04 = 1.75 iii. Find SEmean = SD/N^0.5 --> 5/10 = 0.5 iv. Plug values into CI formula --> 75 +/- 1.75x0.5 = 75 +/- 0.875 = *74.125, 75.875* 15) Use the Stata output to calculate the values of B0 and B1 for the regression line Y^ = B0 +B1X (You might not need all information.) i. Find B1 = cov(x,y)/var(x) --> 45992.2/70578.2 = 0.6516 ii. Find B0 = mean of y - B1(mean of x) --> 640.656-(1041.428x0.6516) = -37.938 iii. Regression line = B0+B1x --> *y = -37.938+0.6516x*
Midterm 2: 15) Use the Stata output to calculate the values of B0 and B1 for the regression line Y = B0+B1X (You might not need all information.) 16) The following table gives the joint probability distribution for two random variables, X and Y. (https://imgur.com/gallery/Jkmafn0) a) Calculate the mean of X and the mean of Y. b) Calculate the variance in X and the variance in Y. c) Calculate the covariance between X and Y.
15) i. B1=cov(x,y)/var(x) --> 45992.2/70578.4 = 0.6516 ii. B0=meany-meanx(B1) --> 640.656-(1041.428x0.6516) = -37.938 iii. Plug values into regression formula --> *y = -37.938+0.6516x* 16) a) Mean of x=E(x)=sum of all x(P(x)) i. Find P(x=0); P(x=1) for E(x) --> P(x=0)=0.5; P(x=1)=0.5 ii. E(x)=(0x0.5)+(1x0.5)=*0.5* iii. Find P(y=0); P(y=1) for E(y) --> P(y=0)=0.6; P(y=1)=0.4 iv. E(y)=(0x0.6)+(1x0.4)=*0.4* b) Don't forget to SQUARE the deviations! i. Var(x)=sum of (x-E(x))^2(P(x)) --> [(0-0.5)^2(0.5)]+[(1-0.5)^2(0.5)] = *0.25* ii. Var(y)=sum of (y-E(y))^2(P(y)) --> [(0-0.4)^2(0.6)]+[(1-0.4)^2(0.4)] = *0.24* c) Cov(x,y) = sum of all [(x-mean)(y-mean)(P(x,y))] (0-0.5)(0-0.4)(0.21)+ (0-0.5)(1-0.4)(0.29)+ (1-0.5)(1-0.4)(0.11)+ (1-0.5)(0-0.4)(0.39) = *-0.09*
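The question 16 calculations all follow one pattern: weight some function of (x, y) by the joint probability and add up. Here is a sketch of that pattern. The cell values are read off the covariance calculation in this card, since the linked table image is not reproduced here, so treat the `joint` dictionary as an assumption.

```python
# Joint distribution P(x, y), inferred from the covariance terms in the card.
joint = {(0, 0): 0.21, (0, 1): 0.29, (1, 1): 0.11, (1, 0): 0.39}


def expect(g):
    """Expected value of g(x, y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)   # deviations are squared
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))
print(mean_x, mean_y, var_x, var_y, cov_xy)
```

A variance of exactly 0 for a variable that takes two different values is a sign that the squaring step was skipped.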
Midterm 2: 17-no integration on final) X is a continuous random variable that can take values between 0 and 5. The cumulative density function is F(x) = (1/25)x^2 a) Calculate P(2<=x<=4) b) Find the probability density function for x. c) Calculate the expected values of x. 18) A dataset contains info on the salaries of N=8,369 workers, separated into female and male. a) Construct a 90% confidence interval for the proportion of the population that is female. b) Construct a 95% confidence interval for the diff in average salaries between female and male workers.
17) a) To find P(2<=x<=4)--> CDF at 4 - CDF at 2 i. CDF at 4 = plug x=4 into F(x) equation --> 16/25=0.64 ii. CDF at 2 = plug x=2 into F(x) equation --> 4/25=0.16 iii. Area, voila! --> 0.64-0.16 = *0.48* b) PDF = derivative of the CDF --> f(x) = (2/25)x for 0<=x<=5 c) E(x) = S x*f(x)dx from 0 to 5 = S (2/25)x^2 dx = (2/75)(5^3) = *3.33* 18) a) 90% confidence interval for proportion female--> p^+/-(Zalpha/2 x SEp^) i. p^=proportion interested in/N --> 4114/8369 = 0.4916 ii. Zalpha/2 = Z(0.1/2) = Z0.05--> reverse trace to find Z score for area of 0.05 (ALWAYS REVERSE TRACE FOR Z(WHATEVER NUMBER)) --> Zscore for p-area 0.05 = 1.64 iii. SEp^=[(p^(1-p^))/N]^0.5 --> [(0.4916(1-0.4916))/8369]^0.5 = 0.00546 (don't round this to 0.005, it shifts the interval) iv. 90% CI = 0.4916+/-(1.64 x 0.00546) --> *(0.4826, 0.5006)* b) 95% confidence interval for difference in avg salaries between female(2) and male(1) workers--> (x1-x2)+/-(Zalpha/2 x [(s1^2/N1)+(s2^2/N2)]^0.5) i. (x1-x2)=43897-39807 = 4090 ii. Zalpha/2 = Z0.05/2 = Z0.025 = reverse trace to find Z score from p-area of 0.025 --> Z0.025 = 1.96 iii. SE(x1-x2) = [(s1^2/N1)+(s2^2/N2)]^0.5 --> [(32517^2/4255)+(23044^2/4114)]^0.5 = 614.471 iv. 95% CI = 4090+/-(1.96 x 614.471) --> *(2885.64, 5294.36)*
Midterm 2: 3) A firm hires 3 workers from an applicant pool of 6 men and 4 women. What is the probability that all three workers are male? 4) P[A|B]= 0.90 , P[A|Bbar]= 0.45 , and P[B]= 0.25 . Calculate P[B|A]. 5) X and Y are two random variables. Each can take values of 0 or 1. P[X = 1]= 0.65 , P[Y = 1]= 0.48 , and the variables are statistically independent. What is Cov(X,Y)?
3) Total applicant pool = 10; 3 picks without replacement --> 6/10 x 5/9 x 4/8 = *0.16667* 4) Since we're not given P(A) but given P(A|Bbar), apply Bayes' Rule: P[B|A]=[P(A|B)xP(B)]/P(A) = [P(A|B)xP(B)]/[P(A|B)xP(B)+P(A|Bbar)xP(Bbar)] i. [0.9x0.25]/[(0.9x0.25)+(0.45x0.75)] = *0.4* 5) Cov(x,y)=[sum of all (x-E(x))(y-E(y))(P(x,y))] i. Find expected values: --> E(x) = sum of all x*(P(x)) = (0x0.35)+(1x0.65)=0.65 --> E(y) = sum of all y*(P(y)) = (0x0.52)+(1x0.48)=0.48 ii. Find P(x,y) for every combo of x=0,1 and y=0,1 Remember P(x,y)=P(xny)! --> Because x and y are independent here, P(x,y) = P(x)xP(y) (this shortcut ONLY works under independence) P(x=0 n y=0) = 0.35x0.52 = 0.182 P(x=0 n y=1) = 0.35x0.48 = 0.168 P(x=1 n y=0) = 0.65x0.52 = 0.338 P(x=1 n y=1) = 0.65x0.48 = 0.312 iii. Plug values into Cov(x,y) formula! (Remember that x, y, P(x,y) are the only values that change!) --> Cov(x,y) = [(0-0.65)(0-0.48)(0.182)]+ [(0-0.65)(1-0.48)(0.168)]+ [(1-0.65)(0-0.48)(0.338)]+ [(1-0.65)(1-0.48)(0.312)] = *0* *Alternatively, you could state that Cov(x,y) = 0, since x and y are independent variables!*
Midterm 1: 6) In a sample of N=100 observations, D is a dummy variable that takes a value of 1 for 36% of the sample. Calculate the mean and standard deviation in this variable. 7) A sample has N = 23 observations of the variables X and Y. ∑(Xi −Xm)^2 = 213 , ∑(Yi −Ym)^2 = 373 , and ∑(Xi −Xm)(Yi −Ym) = 87. What is the correlation between X and Y? (Xm=mean of X; Ym=mean of Y) 8) In a sample with N = 400 observations, the correlation between two variables is 0.13. Does this correlation suggest a real relationship between the variables? Explain your answer. 9) X and Y are two variables in a sample. X = 17 , Y = 28 , sX^2 = 9 , sY^2 = 16 , and sXY = −7 . a) Mean of (X +Y) = 45 b) Var(X +Y) = 11
6) Mean (or expected value) = sum of all x*(p(x))--> (0*0.64)+(1*0.36) = *0.36* SD = [sum of all [(x-mean)^2*(p(x))]]^0.5 [[(0-0.36)^2(0.64)]+[(1-0.36)^2(0.36)]]^0.5 = *0.48* 7) Correlation (r) = cov(x,y)/(SDxSDy) i. Cov(x,y)=[∑(Xi −Xm)(Yi −Ym)]/(N-1) --> 87/22 = 3.9545 ii. SDx=[∑(Xi −Xm)^2/(N-1)]^0.5 --> (213/22)^0.5 = 3.112 iii. SDy=[∑(Yi −Ym)^2/(N-1)]^0.5 --> (373/22)^0.5 = 4.118 iv. Correlation (r) = 3.9545/(3.112x4.118) = *0.309* 8) Real relationship between variables = when |r|>2/N^0.5 --> 0.13 vs. 2/20--> 0.13>0.10 --> *This IS suggestive of a real relationship!* 9) a) Mean of (X+Y) = mean of X + mean of Y = 17+28 = *45* b) Var(X+Y) = sX^2+sY^2+2(sXY) = 9+16+2(-7) = *11*
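Question 7 only gives sums of squared deviations, so the correlation has to be built from those. A short sketch follows, with a hypothetical helper name, which also applies the 2/N^0.5 rule of thumb from question 8.

```python
from math import sqrt


def correlation_from_sums(sxx, syy, sxy, n):
    """Sample correlation from sums of squared/cross deviations and sample size."""
    cov = sxy / (n - 1)
    sd_x = sqrt(sxx / (n - 1))
    sd_y = sqrt(syy / (n - 1))
    return cov / (sd_x * sd_y)


r = correlation_from_sums(sxx=213, syy=373, sxy=87, n=23)   # question 7
threshold = 2 / sqrt(400)                                   # question 8 rule of thumb
print(round(r, 3), 0.13 > threshold)
```

The (N-1) terms cancel, so r is the same as 87/(213x373)^0.5; dividing by N-1 only matters when the covariance or SDs are needed on their own.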
Midterm 2: 6) Calculate these probabilities for the standard normal distribution. a) P[Z<−1.84]= ? b) P[1.43 < Z < 1.95]= ? 7) Suppose that X ~ N(50,25). a) P[58 < X < 61]= ? b) P[X < ? ]= 0.316 8) X and Y are two random variables. E[X]= 81.9, E[Y]= 7.14, Var(X) = 29.7, Var(Y) = 9.7, and Cov(X,Y) = 3.8 a) E[X +Y]= ? b) Var(X +Y) = ? 9) What two conditions should a random sample satisfy?
6) a) P[Z<−1.84]= ? Standard normal chart tells you area (or P) that falls to LEFT of z --> *0.0329* b) P[1.43 < Z < 1.95]= P(Z<1.95)-P(Z<1.43) P(Z<1.95)=0.9744 P(Z<1.43)=0.9236 0.9744-0.9236=*0.0508* 7) Remember, X~N(mean, variance), so SD=25^0.5=5; z=(x-mean)/SD a) P[58 < X < 61]= ? --> (58-50)/5=1.6 --> (61-50)/5=2.2 --> P(Z<2.2)-P(Z<1.6) = 0.9861-0.9452 = *0.0409* b) P[X < ? ]= 0.316--> just asking for Z score of p-area=0.316, converted to X!--> this area is to the LEFT of the Z score and is less than 0.5, so the Z score will be negative i. An upper-tail area of 0.316 corresponds to Z=0.48 [P(Z>0.48)=0.316] --> Since the area of 0.316 is to the LEFT of our Z score, reflect it: Z score for P(X<?)=-0.48 ii. Convert Z score to X --> -0.48=(X-50)/5-->*47.6* 8) X and Y are two random variables. E[X]= 81.9, E[Y]= 7.14, Var(X) = 29.7, Var(Y) = 9.7, and Cov(X,Y) = 3.8 a) E[X +Y] = E(x)+E(y) = 81.9+7.14 = *89.04* b) Var(X +Y) = Var(x)+Var(y)+2xCov(x,y) = 29.7+9.7+2x3.8 = *47* 9) i. Every member of the population has equal chance of being included in the sample ii. Chance of one member being included is independent of another being included
How do you control for variables in regression? 7. Dr. Jones gives two exams in his Introduction to Archaeology course. We know the chance of getting a grade of A on his exams: P[A on the midterm] = 0.40 P[A on the final] = 0.35 P[A on the final | A on the midterm] = 0.80 What is the probability of getting an A on at least one of his exams? 17. Ashton needs to purchase gadgets. Each store that Ashton visits has a probability p of selling a gadget; a probability 1 − p if not. Ashton continues visiting stores until he or she has all the gadgets needed. [Note: this question does not use the binomial, Poisson, or Bernoulli distribution. You should use more general techniques to solve it.] a) Ashton needs to purchase 1 gadget. What is the probability that Ashton needs to visit exactly 4 stores to get them? b) Ashton needs to buy 1 gadget. What is the chance that he or she finds it in 3 visits or fewer? c) Ashton needs to purchase 2 gadgets. What is the probability that Ashton needs to visit exactly 4 stores to get them?
Just include them in the regression as additional independent variables (x) lmao 7) P(A on at least one of his exams)=P(A on the midterm OR final)=P(MuF), where M = A on midterm, F = A on final --> At least=1 OR more--> OR=u--> find P(MuF) --> P(MuF)=P(M)+P(F)-P(MnF) i. P(MnF)=P(F|M)xP(M)--> 0.8x0.4=0.32 ii. 0.35+0.4-0.32=*0.43* 17. This is just a sequence probability problem! a) P(success on 4th)=no/no/no/yes -->(1-p)^3(p) b) P(success on 1st)+P(success on 2nd)+P(success on 3rd)=(1-p)^0(p)+(1-p)^1(p)+(1-p)^2(p) =D c) In order to get 2 gadgets in exactly N=4 visits, the sequence will contain: 2 yeses, 2 no's, and the LAST visit must be a yes Possible sequencing (remember that it has to be 4 visits, so cannot be yes/yes/no/no as that would only entail 2 visits): no/no/yes/yes yes/no/no/yes no/yes/no/yes --> Probability of each sequence = (1-p)^2(p^2) --> 3 possible sequences = (1-p)^2(p^2)+(1-p)^2(p^2)+(1-p)^2(p^2)=*3(p^2)(1-p)^2* =D
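The counting in 17c can be verified by brute force: list every yes/no sequence of store visits and keep the ones where the last visit is a success and the total number of successes equals the number of gadgets needed. This is a sketch with a made-up function name, and p = 0.5 is an arbitrary test value, not a number from the problem.

```python
from itertools import product


def prob_exactly_n_visits(n_visits, n_needed, p):
    """P(the n_needed-th gadget is found on visit number n_visits)."""
    total = 0.0
    for seq in product([True, False], repeat=n_visits):
        # Shopping stops at the last needed gadget, so the final visit
        # must be a success and there are exactly n_needed successes.
        if seq[-1] and sum(seq) == n_needed:
            prob = 1.0
            for hit in seq:
                prob *= p if hit else (1 - p)
            total += prob
    return total


p = 0.5  # assumed value for illustration only
print(prob_exactly_n_visits(4, 2, p), 3 * p ** 2 * (1 - p) ** 2)
```

With `n_needed=1` the same function reproduces the (1-p)^3(p) answer from part a).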
Problem Set 3: Part 7- During the semester, a student takes N = 6 quizzes. The probability of failing any individual quiz is 0.05. What is the probability that the student fails at least one quiz? Part 8- This question is the same as before, except that the professor drops the lowest quiz. What is the probability that the student still fails at least one of the remaining five quizzes? Part 9- Ginkgo trees are dioecious, meaning that they are either male or female(sounds fake but okay lol), with equal probability. Producing seeds requires at least one of each sex. When purchasing 5 trees, what is the probability of at least one male and one female?
Binomial formula: P = probability of success; x = number of successes; N = number of trials --> P(x successes) = N!/[x!(N-x)!]*(P^x)(1-P)^(N-x) 7) N=6; P(failing individual quiz)=0.05 P(AT LEAST 1 quiz)--> find P(0); 1-P(0) = P(at least 1) --> P(fail >= 1 quiz) = 1-P(fail=0 quizzes); plug into formula: 1 - [6!/[0!(6-0)!]*(0.05^0)(1-0.05)^(6-0)] = 1-0.735 = *0.26491* 8) Lowest quiz is dropped, so the student still has a failed quiz only if they failed 2 or more of the 6 --> P(fail>=2 quizzes) = 1-[(P(fail=1 quiz))+(P(fail=0 quizzes))] P(fail=0 quizzes) = 0.735 from (7) P(fail=1 quiz) = [6!/[1!(6-1)!]*(0.05^1)(1-0.05)^(6-1)] = 0.23213 --> 1-(0.735+0.23213) = *0.03277* 9) P(F) = 0.5; P(M) = 0.5; N = 5 --> Producing seeds = AT LEAST 1 M + 1 F --> Easiest with the complement: the only ways to fail are all 5 male or all 5 female i. P(all male) = 0.5^5 = 0.03125; P(all female) = 0.5^5 = 0.03125 ii. P(at least 1 M AND at least 1 F) = 1-(0.03125+0.03125) = *0.9375* --> Careful: "at least 1 F" and "at least 1 M" are NOT independent (every tree is one or the other), so you cannot just multiply 0.96875x0.96875
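All three parts are "at least one" questions, so each reduces to 1 minus a few binomial terms. Here is a sketch using `math.comb`; the helper name `binom_pmf` is made up. For Part 9 the complement of "both sexes present" is "all male or all female", which is what the last line computes.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Part 7: fail at least one of 6 quizzes
p_fail_any = 1 - binom_pmf(0, 6, 0.05)

# Part 8: lowest quiz dropped, so at least 2 failures are needed
p_fail_two_plus = 1 - binom_pmf(0, 6, 0.05) - binom_pmf(1, 6, 0.05)

# Part 9: at least one male and one female among 5 trees
p_both_sexes = 1 - binom_pmf(0, 5, 0.5) - binom_pmf(5, 5, 0.5)

print(p_fail_any, p_fail_two_plus, p_both_sexes)
```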
Continuous distribution! 1. What is the probability density function (PDF)? --> What is the f(x) that gives you the bell curve? 2. What is the area under the curve? 3. What is the cumulative density function (CDF)?
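P(x=a) = 0 for any single generic value of a continuous variable--> instead find P(a<=x<=b) 1. PDF = f(x), the density curve; it is NOT a probability by itself --> P(a<=x<=b) = area under f(x) from a to b Example of PDF (normal): f(x) = [1/(2*pi*sigma^2)^0.5]*e^[-(x-mu)^2/(2*sigma^2)] --> This function gives you the bell curve! 2. Area under curve = probability; total area under the whole curve = 1 3. CDF = F(x) = probability that x<=some value = antiderivative of the PDF (remember cumpdf on the calc? lmao) --> f(x) = 1-e --> antiderivative example: f(x) = x^3 --> antiderivative = 1/4(x^4) --> f(x) = e^(-x^2) has no closed-form antiderivative, which is why normal probabilities come from tables/software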
Continuous Random Variables What are the PDFs of the following? 1) x 2) E(x) 3) Var(x) 4) Mean = 100; variance = 225 --> P(110<=n<=120) = ? 5) What is the value of z, given that there is an area of 0.20 that falls to the right of the z score? 6) Rules to remember for normal distribution? (3)
The PDF is f(x); areas under it give probabilities: P(a<=x<=b) = S f(x)dx (S = integral from a to b) E(x) = S (all poss x) x*f(x)dx Var(x) = S ((x-E(x))^2)*f(x)dx Uniform distribution = places equal likelihood on all values inside some range; usually described from lower limit (l) to upper limit (u) Uniform PDF: f(x) = 1/(u-l) if l<=x<=u; otherwise = 0 1) Probability for x: to find P(a<=x<=b), integrate from a to b--> S f(x)dx--> S 1/(u-l)dx--> find antiderivative with respect to x--> x/(u-l) --> Upper limit = b; lower limit = a --> Plug in for x!--> (b/(u-l))-(a/(u-l)) = *P(a<=x<=b) = (b-a)/(u-l)* --> (u-l) = entire range; (b-a) = range we're interested in 2) E(x): E(x) = S all poss x (x*f(x)dx) = S (x*(1/(u-l))dx)--> antiderivative with respect to x = (1/(u-l))(0.5x^2) --> Plug in upper limit = u; lower limit = l (plug into x) (1/(u-l))*0.5*((u^2)-(l^2)) = 0.5[(u^2-l^2)/(u-l)] = 0.5[((u+l)(u-l))/(u-l)] = *E(x) = 0.5(u+l)--> the average of u and l!* 3) Var(x): Var(x) = S all poss x (x-E(x))^2(f(x)dx) = S (x-0.5(u+l))^2(1/(u-l))*dx = ? --> Will end up being a HW problem where we have to derive the actual result! --- In theory: probabilities come from integration In practice: We use numerical approximations; come up with them from tables; or use software! i. Standard normal distr: mean=0; variance=1; SD=1 4) P(110<=x<=120)--> z = (x-mean)/SD --> Mean = 100; variance = 225, so SD = 15 --> Standardizing = transitioning from P(a<=x<=b)--> P(st. value of a<=z<=st. value of b) --> Find probabilities for st. values of distribution ii. Standardize values of 110 and 120! --> P[((110-100)/15)<=z<=((120-100)/15)] = P[0.67<=z<=1.33]--> these are the z scores! iii. Find the corresponding decimals in the Standard Norm Distr Table (this table gives the area that falls to the RIGHT of the value) --> area right of 0.67 = 0.251; area right of 1.33 = 0.092 --> 0.251-0.092=*0.159* 5) P-area=0.20 that falls above Z score, reverse tracing it in normal distribution table renders a Z score=*0.84* 6) Rules to remember(TMS): i. TOTAL AREA: All area under curve = 1 ii. MEAN: Mean sits in middle of distribution (area above 0 = 1/2; area below 0 = 1/2) iii. SYMMETRY: bell curve is symmetric, so P(z>=c) = P(z<=-c) --> Bell curve is perfectly symmetric, that's literally all this is saying lmao
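The uniform-distribution results can be sanity-checked numerically. The sketch below codes the probability and mean formulas from the card and approximates the variance integral with a midpoint sum. The closed form it is compared against in the usage line, (u-l)^2/12, is the standard textbook result and is not derived in these notes; the limits 0 and 12 are arbitrary illustration values.

```python
def uniform_prob(a, b, l, u):
    """P(a <= x <= b) for x uniform on [l, u]; the interval is clipped to [l, u]."""
    lo, hi = max(a, l), min(b, u)
    return max(hi - lo, 0.0) / (u - l)


def uniform_mean(l, u):
    """E(x) = average of the two limits."""
    return 0.5 * (u + l)


def uniform_var_numeric(l, u, steps=100_000):
    """Midpoint-rule approximation of S (x - E(x))^2 * f(x) dx over [l, u]."""
    dx = (u - l) / steps
    mean = uniform_mean(l, u)
    total = 0.0
    for i in range(steps):
        x = l + (i + 0.5) * dx
        total += (x - mean) ** 2 * (1 / (u - l)) * dx
    return total


print(uniform_prob(2, 4, 0, 10), uniform_mean(0, 10))
print(uniform_var_numeric(0, 12), (12 - 0) ** 2 / 12)
```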
Problem Set 3.5 Part 1: 1) Calculate E(x) and E(y) 2) Calculate Var(x) and Var(y) 3) Calculate Cov(x,y) Part 2: 4) Calculate E(x) and E(y) 5) Calculate Var(x) and Var(y) 6) Calculate Cov(x,y)
Part 1: --> P(x=0) = 0.35 --> P(x=1) = 0.65 --> P(y=0) = 0.65 --> P(y=1) = 0.35 1) E(x) = sum of all [(x)(P(x))] = [(0)(0.35)]+[(1)(0.65)] = *0.65* E(y) = sum of all [(y)(P(y))] = [(0)(0.65)]+[(1)(0.35)] = *0.35* 2) Var(x) = sum of all[(x-E(x))^2(P(x))] = [(0-0.65)^2(0.35)]+[(1-0.65)^2(0.65)] = *0.2275* Var(y) = sum of all[(y-E(y))^2(P(y))] = [(0-0.35)^2(0.65)]+[(1-0.35)^2(0.35)] = *0.2275* 3) Cov(x,y) = sum of all[(x-E(x))(y-E(y))(P(x,y))] = [(0-0.65)(0-0.35)(0.20)]+[(0-0.65)(1-0.35)(0.15)]+[(1-0.65)(0-0.35)(0.45)]+[(1-0.65)(1-0.35)(0.20)] = *-0.0275 (inversely related!)* Part 2: --> P(x=-1) = 0.50 --> P(x=1) = 0.50 --> P(y=3) = 0.85 --> P(y=5) = 0.15 4) E(x) = sum of all [(x)(P(x))] = [(-1)(0.50)]+[(1)(0.50)] = *0* E(y) = sum of all [(y)(P(y))] = [(3)(0.85)]+[(5)(0.15)] = *3.3* 5) Var(x) = sum of all[(x-E(x))^2(P(x))] = [(-1-0)^2(0.50)]+[(1-0)^2(0.50)] = *1* Var(y) = sum of all[(y-E(y))^2(P(y))] = [(3-3.3)^2(0.85)]+[(5-3.3)^2(0.15)] = *0.51* 6) Cov(x,y) = sum of all[(x-E(x))(y-E(y))(P(x,y))] = [(-1-0)(3-3.3)(0.40)]+[(-1-0)(5-3.3)(0.10)]+[(1-0)(3-3.3)(0.45)]+[(1-0)(5-3.3)(0.05)] = *-0.1 (inversely related!)*
Problem Set 3.5 Part 3: Z ~ N(0,1) 7) P[Z > 1.44] = ? 8) P[Z > 2.31] = ? 9) P[Z < 1.59] = ? 10) P[0.13 < Z < 0.34] = ? 11) P[−2.02 < Z < −1.44] = ? Part 4: X ~ N(50,100)??? 12) P[X > 64] = ? 13) P[X < 73] = ? 14) P[58 < X < 67] = ? Part 5: Z ~ N(0,1) . 15. P[Z < ??] = 0.6844 16. P[Z < ??] = 0.7939 17. P[Z > ??] = 0.0934 Part 6: Z ~ N(0,1) . Find the symmetric intervals that have these probabilities. (In other words, the upper and lower limits are the same value with different signs.) 18. P[?? < Z < ??] = 0.5407 19. P[?? < Z < ??] = 0.8638 20. P[?? < Z < ??] = 0.9815
Part 3: Z ~ N(0,1)--> mean=0; SD=1 (no need for conversion bc this is the standard normal distribution) --> Look up the Z-score in the normal table to find the area under the curve (in this table, all areas fall to the *right* of the z score) 7) P[Z > 1.44] = 0.075 8) P[Z > 2.31] = 0.01 9) P[Z < 1.59] = 1-0.056 = 0.944 10) P[0.13 < Z < 0.34] = P(Z>0.13)-P(Z>0.34) = 0.448-0.367 = 0.081 11) P[−2.02 < Z < −1.44] = P(Z>1.44)-P(Z>2.02) = 0.075-0.022 = 0.053 --> Remember that normal distribution = symmetric, so just find the area between 1.44 and 2.02 Part 4: X ~ N(50,100)--> mean=50; variance=100, so SD=10 12) P[X > 64] = 0.081 --> Convert X to z=(64-50)/10 = 1.4--> find on normal distribution 13) P[X < 73] = 0.989 --> (73-50)/10 = 2.3--> area to right = 0.011--> find complement for Z<2.3 = 1-0.011=0.989 14) P[58 < X < 67] = 0.167 --> (58-50)/10=0.8; (67-50)/10=1.7 --> P(Z>0.8)=0.212; P(Z>1.7)=0.045--> area=0.212-0.045=0.167 Part 5: Z ~ N(0,1) . 15. P[Z < ??] = 0.6844 --> 1-0.6844 = 0.3156--> reverse trace: P(Z>0.48)=0.3156 --> ??=0.48 16. P[Z < ??] = 0.7939 --> 1-0.7939 = 0.2061--> Z=0.82; P(Z>0.82)=0.2061 --> ??=0.82 17. P[Z > ??] = 0.0934 --> ??=1.32 Part 6: Z ~ N(0,1) . Find the symmetric intervals that have these probabilities. (In other words, the upper and lower limits are the same value with different signs.) --> Just asking for bordering Z score, find this through deriving the Z score for p-area=alpha/2 18) P[?? < Z < ??] = 0.5407 --> (1-0.5407)/2 = 0.22965--> reverse trace to find Z score --> ?? = -0.74; 0.74 19) P[?? < Z < ??] = 0.8638 --> (1-0.8638)/2 = 0.0681--> reverse trace to find Z score --> ?? = -1.49; 1.49 20) P[?? < Z < ??] = 0.9815 --> (1-0.9815)/2 = 0.00925--> reverse trace to find Z score --> ?? = -2.36; 2.36
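Table lookups in both directions (z to area, and area back to z) can be reproduced with `statistics.NormalDist`. A sketch for a few of the questions, treating N(50,100) as mean 50 and variance 100, so `sigma=10`:

```python
from statistics import NormalDist

Z = NormalDist()                  # standard normal
X = NormalDist(mu=50, sigma=10)   # N(50, 100): second number is the variance

p7 = 1 - Z.cdf(1.44)                    # 7) P[Z > 1.44]
p14 = X.cdf(67) - X.cdf(58)             # 14) P[58 < X < 67]
z15 = Z.inv_cdf(0.6844)                 # 15) P[Z < ??] = 0.6844
z18 = Z.inv_cdf(1 - (1 - 0.5407) / 2)   # 18) upper limit of the symmetric interval

print(round(p7, 3), round(p14, 3), round(z15, 2), round(z18, 2))
```

`cdf` always returns the area to the LEFT, so right-tail answers need the `1 - ...` step, which is the opposite of the right-tail table used in the card.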
Problem Set 3.5 Part 7: Calculate the standard error (that is, the square root of the variance) in the following estimates. 21) Standard error in X, in a sample of N = 100 with sX^2 = 25 22) Standard error in X, in a sample of N = 20 with sX^2 = 18 23) Standard error in pˆ, in a sample of N = 400 with pˆ= 0.5 24) Standard error in pˆ, in a sample of N = 188 with pˆ = 0.62 Part 8: Calculate the upper and lower limits of these confidence intervals. 25) The 95% confidence interval for µ, in a (large) sample of N = 100 with X = 75 and sX^2 = 25 26) The 95% confidence interval for µ, in a (small) sample of N = 20 with X = 8.3 and sX^2 = 18 27) The 85% confidence interval for p, in a (large) sample of N = 400 with p^ = 0.5
Part 7: Calculate the standard error (that is, the square root of the variance) in the following estimates. 21) Standard error in X, in a sample of N = 100 with sX^2 = 25 --> SE=s/(n^0.5)=5/10=*0.5* 22) Standard error in X, in a sample of N = 20 with sX^2 = 18 --> SE=s/(n^0.5)=(18^0.5)/(20^0.5)=4.2426/4.472=*0.9487* (take the square root of the variance first!) 23) Standard error in pˆ, in a sample of N = 400 with pˆ= 0.5 --> SEp=[p(1-p)/n]^0.5=[(0.5(1-0.5))/400]^0.5=*0.025* 24) Standard error in pˆ, in a sample of N = 188 with pˆ = 0.62 --> SEp=[(0.62(1-0.62))/188]^0.5=*0.0354* Part 8: Calculate the upper and lower limits of these confidence intervals. --> Formula for CIx=mean +/- z(SD/n^0.5)=mean +/- (z x standard error of mean) --> CIp=proportion +/- [z x [(p(1-p))/n]^0.5] --> z=z value for the confidence interval; reverse trace from the tail area of one side not included in the confidence interval! --> ie- if 95% confidence interval, end tail of one side not included = (0.05/2) = 0.025--> corresponding z=1.96 =D 25) The 95% confidence interval for µ, in a (large) sample of N = 100 with mean = 75 and sX^2 = 25 -: 75-1.96(5/10)=74.02 +: 75+1.96(5/10)=75.98 --> CIx=75±0.98--> *(74.02, 75.98)* 26) The 95% confidence interval for µ, in a (small) sample of N = 20 with mean = 8.3 and sX^2 = 18 -: 8.3-1.96(4.2426/4.472)=6.4405 +: 8.3+1.96(4.2426/4.472)=10.1595 --> CIx=8.3±1.8595--> (6.44, 10.16) --> Since N=20 is small, the t distribution with 19 degrees of freedom is the better choice: t=2.093 --> 8.3±(2.093x0.9487)=8.3±1.986--> *(6.31, 10.29)* 27) The 85% confidence interval for p, in a (large) sample of N = 400 with p^ = 0.5 --> individual tail area=0.15/2=0.075--> z=1.44 --> CI of sample p=p±[z x ((p(1-p))/n)^0.5] -: 0.5-[1.44x((0.5(1-0.5))/400)^0.5]=0.464 +: 0.5+[1.44x((0.5(1-0.5))/400)^0.5]=0.536 --> CIp=0.5±0.036--> *(0.464, 0.536)*
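The large-sample intervals in Part 8 all share the shape center +/- z x SE, so one helper covers them. This is a sketch with a made-up function name; it only handles the z-based (large sample) case, so it is applied to questions 25 and 27 and not to the small-sample question 26.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(center, se, level):
    """Two-sided confidence interval: center +/- z * se for the given level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return center - z * se, center + z * se


ci25 = z_interval(75, sqrt(25 / 100), 0.95)            # mean, N = 100, s^2 = 25
ci27 = z_interval(0.5, sqrt(0.5 * 0.5 / 400), 0.85)    # proportion, N = 400

print([round(v, 3) for v in ci25], [round(v, 3) for v in ci27])
```

The half-width is z x SE, not SE alone, which is an easy slip when the SE has just been computed.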
Sample statistics: E[x bar] = mean of population Var(x bar) = variance of x/N Type of distribution: -->
Sample statistics: E[x bar] = mean of population Var(x bar) = variance of x/N Type of distribution: --> If X ~ N, then x bar ~ N (exactly) --> If N is large, x bar is approximately normal (central limit theorem) Sample proportion: p hat = # observations with characteristic/# total observations E[p hat]=p Var(p hat) = p(1-p)/N Type? --> p hat is approximately normal when the sample is large (p = population proportion; p hat = sample proportion) SD in x bar = SD/(N^0.5) SD in p hat = (p(1-p)/N)^0.5 St. error in x bar = Sx/(N^0.5) St. error in p hat = [(p hat(1-p hat))/N]^0.5
Random variable x = distributed normally with population mean of 12.46, population variance of 13.11. --> N(12.46, 13.11) --> N(mean, variance) a) P(X is between 12 and 13)=? b) Sample of n=5, P(X bar, the sample mean, is between 12 and 13)=? c) Sample of n=5, P(all 5 observations are between 12 and 13)=?
a) z=(x-mean)/SD--> just asking for area between 12-13 --> Normal dist. table gives area to the LEFT of value i. z=(12-12.46)/13.11^0.5 = -0.12704468 --> area to left = 0.4495 ii. z=(13-12.46)/13.11^0.5 = 0.14913941 --> area to left = 0.5593 iii. P(between 12-13) = 0.5593-0.4495 = *0.1098* b) sample z=(x-mean)/(SE)--> SE=SD/N^0.5 --> Normal dist. table gives area to the LEFT of value i. z=(12-12.46)/(13.11/5)^0.5 = -0.28408054 --> area to left = 0.3882 ii. z=(13-12.46)/(13.11/5)^0.5 = 0.33348585 --> area to left = 0.6306 iii. P(between 12-13) = 0.6306-0.3882 = *0.2424* c) P(event happens n times) = [P(event)]^n --> This is bc it's essentially asking for intersectional probabilities=P(1n2n3n4n5) are all between 12 and 13 (given this is independent) --> Each INDIVIDUAL observation is between 12 and 13 with the probability from (a), NOT the sample-mean probability from (b) --> (0.1098)^5 = *0.000016*
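The three parts differ only in which distribution is being asked about: a single observation, the mean of five, or five single observations at once. Here is a sketch with `statistics.NormalDist`; the variable names are illustrative, and part c) relies on the observations being independent.

```python
from math import sqrt
from statistics import NormalDist

single = NormalDist(mu=12.46, sigma=sqrt(13.11))       # one observation
mean5 = NormalDist(mu=12.46, sigma=sqrt(13.11 / 5))    # mean of n = 5

p_single = single.cdf(13) - single.cdf(12)   # a)
p_mean = mean5.cdf(13) - mean5.cdf(12)       # b)
p_all5 = p_single ** 5                       # c) five independent single observations

print(round(p_single, 4), round(p_mean, 4), p_all5)
```

The sample mean is more likely to land near 12.46 than any single observation is, which is why b) is larger than a), while c) is far smaller than either.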