stat107 midterm 2
You roll two 8-sided dice. What is the probability of both dice rolling odd values?
16/64
If we apply linear regression on the data to predict the tip amount based on the number of eating out, what would be the root mean squared error (RMSE)?
sqrt(1-r^2)*SDy
Select all of the following that are true.
(c) If the point is exactly on the regression line, the residual is zero. (d) For any regression line, the SUM of the errors is always zero.
Suppose you have a bag with 5 red, 3 blue, and 3 white marbles. Write a simulation in Python to simulate drawing two marbles with replacement. Run your simulation 10000 times and store your results as a DataFrame df with the columns draw1 and draw2. Values in both draw1 and draw2 must be integers between 1 and 11 (the total number of marbles, with ranges of numbers representing the different colors). With 5 red, 3 blue, and 3 white marbles, the mapping is: 1-5 represents red, 6-8 represents blue, and 9-11 represents white.
import random
import pandas as pd
# Write your simulation here:
def simulation1():
    # First draw: one of the 11 marbles, chosen uniformly
    draw1 = random.randint(1, 11)
    return draw1
def simulation2():
    # Second draw: with replacement, so all 11 marbles are available again
    draw2 = random.randint(1, 11)
    return draw2
# Make sure to store your simulation result as a DataFrame `df`:
data = []
for i in range(10000):
    d = {"draw1": simulation1(), "draw2": simulation2()}
    data.append(d)
df = pd.DataFrame(data)
Suppose jurors make the right decisions about guilt and innocence 80% of the time and that 85% of all defendants are truly guilty. What is the chance of a convicted (found guilty) defendant being truly guilty?
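We can find the number of defendants convicted by adding the correctly convicted guilty defendants (0.8 * 0.85 * 1000) and the innocent defendants incorrectly convicted ((1 - 0.8) * (1000 - 0.85 * 1000)), and we get P(Convicted) = 710.0/1000. So P(Guilty | Convicted) = 0.8 * 850.0 / 710.0, or 0.9577464788732394.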
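The count-based reasoning for the juror question can be checked with a short sketch of Bayes' rule. The function name and parameters below are illustrative (not from the course materials), and the sketch assumes jurors are equally accurate for guilty and innocent defendants, as the question implies.

```python
def p_guilty_given_convicted(accuracy, p_guilty):
    """Hypothetical helper: P(guilty | convicted) via Bayes' rule."""
    # Guilty defendants who are correctly convicted.
    true_convictions = accuracy * p_guilty
    # Innocent defendants who are wrongly convicted.
    false_convictions = (1 - accuracy) * (1 - p_guilty)
    # Share of all convictions that involve a truly guilty defendant.
    return true_convictions / (true_convictions + false_convictions)


# Jurors are right 80% of the time; 85% of defendants are truly guilty.
print(p_guilty_given_convicted(0.8, 0.85))
```

Subtracting the result from 1 gives P(Innocent | Convicted) for the same inputs.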
You roll a 10-sided die. What is the probability of the die rolling a 7 OR the die rolling a value less than or equal to 7?
(1 + 7 - 1)/10 = 7/10
You roll two 6-sided dice. What is the probability of both dice rolling the value 2 OR both dice rolling odd values?
(1 + 9 - 0)/36 = 10/36.
You roll two 8-sided dice. What is the probability of both dice rolling even values OR both dice rolling the value 3?
(16 + 1 - 0)/64 = 17/64 (there are 64 possible outcomes)
You roll two 8-sided dice. What is the probability of both dice rolling even values OR the sum of both dice being equal to 5?
(16 + 4 - 0)/64 = 20/64
This question pertains to a standard 52-card deck (52 total cards, 4 suits each with 13 possible values: ace, two, king, etc). You draw two cards at random with replacement. What is the probability that both cards are not tens?
(48/52)^2
You roll a 10-sided die. What is the probability of the die rolling a value less than or equal to 6 OR the die rolling a value strictly less than 3?
(6 + 2 - 2)/10 = 6/10.
You roll a 10-sided die. What is the probability of the die rolling a value less than or equal to 7 OR the die rolling an odd value?
(7 + 5 - 4)/10 = 8/10
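The addition rule used in these OR questions, P(A or B) = P(A) + P(B) - P(A and B), can be sanity-checked by listing outcomes. This is a minimal sketch for the 10-sided die question just above; the variable names are illustrative.

```python
from fractions import Fraction

sides = range(1, 11)  # faces of a 10-sided die

at_most_7 = {v for v in sides if v <= 7}  # event A: value <= 7
odd = {v for v in sides if v % 2 == 1}    # event B: odd value

# Addition rule: count A, count B, subtract the overlap counted twice.
by_rule = Fraction(len(at_most_7) + len(odd) - len(at_most_7 & odd), len(sides))

# Direct count of the union, as a cross-check.
by_union = Fraction(len(at_most_7 | odd), len(sides))

print(by_rule, by_union)
```

Both values agree, which confirms that the overlap (odd values no larger than 7) is subtracted exactly once.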
Select all of the following that are true about correlation coefficient r.
(a) The correlation coefficient is always between -1 and 1. (d) The closer the points hug a line with a positive slope, the closer r is to 1. Changing units does NOT change r.
In another one of your classes, you are taking an exam with 5 multiple choice questions. Each question has five possible answers, (A), (B), (C), (D), and (E) and only one of the five answers is correct. If you guess randomly on each question, what is the probability that you answer all of the questions correctly?
0.00032 (the probability that a randomly selected answer is correct is 0.2, so for 5 questions it is 0.2^5)
In a recent study, a group of 1,191 American parents were asked the following question: Do you think basic coding should be a mandatory class in middle school? 41% answered "Yes" and everyone else answered "No". The parents in the poll were chosen as a simple random sample.
0.014251548 (SE = SD/sqrt(n), with SD = sqrt(0.41 * 0.59) and n = 1191)
This question pertains to a standard 52-card deck (52 total cards, 4 suits each with 13 possible values: ace, two, king, etc). You draw two cards at random with replacement. What is the probability that at least one card is a queen?
0.14792899408
This question pertains to a standard 52-card deck (52 total cards, 4 suits each with 13 possible values: ace, two, king, etc). You draw two cards at random without replacement. What is the probability that at least one card is a two?
0.1493212
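Both "at least one" card answers come from the complement rule: 1 - P(none). A small sketch, assuming a hypothetical helper `p_at_least_one` (not part of the course code), shows how the with-replacement and without-replacement cases differ only in how the deck shrinks.

```python
from fractions import Fraction


def p_at_least_one(targets, deck, draws, replace):
    """Hypothetical helper: P(at least one target card in `draws` draws)."""
    others = deck - targets  # cards that are NOT the target value
    p_none = Fraction(1)
    for i in range(draws):
        if replace:
            # The deck is restored after each draw.
            p_none *= Fraction(others, deck)
        else:
            # One non-target card has left the deck after each earlier draw.
            p_none *= Fraction(others - i, deck - i)
    return 1 - p_none


print(float(p_at_least_one(4, 52, 2, replace=True)))   # at least one queen
print(float(p_at_least_one(4, 52, 2, replace=False)))  # at least one two
```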
A box has 15 tickets; 3 blue, 6 red, and 6 yellow. What is the chance of drawing 2 tickets without replacement and getting no reds?
0.3429 (9/15 * 8/14)
There are 12 birth stones. Assume that each of the birth stones is equally likely. In a group of 5 random people, what is the chance that there is at least 1 birth stone match? (i.e., at least 2 people share a birth stone.)
0.618055555555 The easiest approach is to calculate the probability that none of these people share a birth stone. The first person has some birth stone, which leaves 11 different birth stones, so the probability that the second person does not match the first is 11/12. That leaves 10 birth stones that match neither of the first two people, so the third person has a 10/12 chance of not matching either of them. In general, the Nth person has a (13-N)/12 chance of not sharing a birth stone with the people before them. Because this condition must hold for every person, we multiply these probabilities together to find the probability that nobody shares a birth stone: (11*10*9*8)/12^4 = 7920/20736. The probability that at least two people DO share a birth stone is the complement: 1 - P(no match) = (20736 - 7920)/20736 = 12816/20736.
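The multiply-then-complement argument for the birth stone question can be written as a loop. This is a sketch with an illustrative function name; it assumes every category is equally likely, as the question states.

```python
def p_shared(categories, people):
    """Hypothetical helper: P(at least two people share a category)."""
    p_no_match = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k categories already taken.
        p_no_match *= (categories - k) / categories
    return 1 - p_no_match


print(p_shared(12, 5))   # 12 birth stones, 5 people
print(p_shared(12, 10))  # same idea with 12 birth months and 10 people
```

The same function covers any "at least one match" question of this kind by changing the two arguments.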
This question pertains to screening for lung cancer. Suppose 90% of people who get tested have lung cancer. If someone has cancer the test will correctly give a positive result 60% of the time and if they don't have cancer the test will correctly give a negative result 60% of the time. If someone tests negative, what is the probability they really have cancer?
0.857 (negatives who have cancer / total negatives = 0.9*0.4 / (0.9*0.4 + 0.1*0.6) = 0.36/0.42)
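The screening answer is the share of negative results that come from people who actually have the disease. A minimal sketch, with a hypothetical function name and parameter names chosen for illustration:

```python
def p_disease_given_negative(prevalence, sensitivity, specificity):
    """Hypothetical helper: P(has disease | test is negative)."""
    # People with the disease whom the test misses (false negatives).
    false_negatives = prevalence * (1 - sensitivity)
    # People without the disease whom the test correctly clears.
    true_negatives = (1 - prevalence) * specificity
    return false_negatives / (false_negatives + true_negatives)


# 90% of those tested have cancer; the test is right 60% of the time either way.
print(p_disease_given_negative(0.9, 0.6, 0.6))
```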
In another one of your classes, you are taking an exam with 5 true/false questions. If you guess randomly on each question, what is the probability that at least one question has an answer of false?
0.96875 (1 - P(none are false) = 1 - (0.5)^5)
You roll two 8-sided dice. What is the probability of both dice rolling odd values?
(4/8)^2 = 16/64, or 0.25.
Below are survey results about student handedness and gender. If we select 2 students randomly with replacement, what is the probability that not all are left-handed?
1 - (51/602)^2 = 0.9928229268992615.
In another one of your classes, you are taking an exam with 3 true/false questions. If you guess randomly on each question, what is the probability that at least one question has an answer of true?
1-(.5^3)
There are 12 birth months. Assume that each of the birth months is equally likely. In a group of 10 random people, what is the chance that there is at least 1 birth month match? (i.e., at least 2 people share a birth month.)
1 - 12!/(2! * 12^10), or about 0.9961 (P(no match) = 12/12 * 11/12 * ... * 3/12)
Below are survey results about student handedness and gender. If we select 4 students randomly with replacement, what is the probability that not all are right-handed?
P(not all) = 1 - P(all) = 1 - P(right-handed)^4 = 1 - (575/674)^4 = 0.4702977805672721.
Based on the 2017-2018 Average Starting Teacher Salaries by State measured by NEA, the average starting salary for teachers in Idaho is $34,801, the sample size is 4,356 and the standard deviation is $278.41. What is the margin of error for a 99% confidence interval for the average teacher starting salaries in Idaho (in USD)? Your answer must be the error that needs to be added/subtracted. Specifically, provide the value for X: $34,801 ± X. If your answer is in decimal, make sure to include at least 2 NON-ZERO digits after the decimal point.
10.8656 (find SE = SD/sqrt(n) and the z-value for the confidence level, then multiply z by SE)
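The margin-of-error recipe (z times SD/sqrt(n)) can be sketched with the standard library. The function name is illustrative, and using `statistics.NormalDist` to look up z is a choice made here rather than something from the course, which may use a z-table with rounded values.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence):
    """Hypothetical helper: z * SD / sqrt(n) for a two-sided interval."""
    # z leaves (1 - confidence)/2 in each tail of the standard normal curve.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = sd / sqrt(n)
    return z * standard_error


# Idaho starting salaries: SD = 278.41, n = 4356, 99% confidence.
print(margin_of_error(278.41, 4356, 0.99))
```

The same call with sd=0.4, n=1500, confidence=0.99 reproduces the margin in the GPA confidence interval question.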
There are 12 zodiac signs. Assume that each of the zodiac signs is equally likely. Randomly pick 2 people from our class. What is the chance that NONE of the 2 people share your zodiac sign?
11/12*11/12
A survey is carried out at a university to estimate the percentage of undergraduate students who drive to campus to attend classes. If the SD of the population was 35%, then how many people would we need to poll to get a 80% confidence interval with a 4% margin of error?
126 --> n = (z*sd/MoE)^2
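The formula n = (z*sd/MoE)^2 can be turned into a small calculator. This sketch uses illustrative names and takes z from `statistics.NormalDist`, which is an assumption; a rounded table value of z can shift the result slightly near a whole-number boundary.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sd, moe, confidence):
    """Hypothetical helper: smallest n with margin of error <= moe."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Solve moe = z * sd / sqrt(n) for n, then round up to a whole person.
    return ceil((z * sd / moe) ** 2)


# SD = 35 percentage points, 80% confidence, 4-point margin of error.
print(required_sample_size(35, 4, 0.80))
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.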
1,600 college freshmen were randomly selected for a national survey. Among survey participants, the mean grade-point average (GPA) was 3, and the margin of error was ±3% for the 80% confidence interval. If we want a 99% confidence interval with a margin of error of ±6%, how many students do you need to select?
1615 (use the given information and the MoE formula to find the SD, then plug that SD into the MoE formula to find n)
A roulette wheel has 18 reds, 18 blacks, and 2 greens. If the ball lands on red, you win and if it lands on another color, you lose. You play roulette 80 times in a row and count how many times you win. What is the expected value of the number of wins?
18/38*80
Suppose we want to estimate the average GPA of a highschool student in Illinois. We draw a random sample of 1500 students from a population of 1,000,000 students and recorded their GPAs. We find that the average GPA in our sample is 3.2, and the standard deviation of the sample is 0.4. What is the 99% confidence interval?
3.2 +/- 0.027 (find SE = SD/sqrt(n) and the z-value for the confidence level, then multiply z by SE)
This question pertains to a standard 52-card deck (52 total cards, 4 suits each with 13 possible values: ace, two, king, etc). You draw two cards at random with replacement. What is the probability that the first card is not a heart and the second card is a club?
39/52*13/52
This question pertains to a standard 52-card deck (52 total cards, 4 suits each with 13 possible values: ace, two, king, etc). You draw two cards at random with replacement. What is the probability that the first card is a queen and the second card is not a nine?
4/52*48/52
You draw a single card from a standard 52-card deck (52 total cards, 4 suits each with 13 possible values: ace, two, king, etc). What is the probability of drawing a jack OR drawing a club?
4/52+13/52-1/52
In a recent study, a group of 1,572 American parents were asked the following question: Do you think basic coding should be a mandatory class in middle school? 58% answered "Yes" and everyone else answered "No". The parents in the poll were chosen as a simple random sample.
58% (answer is given)
We have been rolling dice so many times in this class! But let's do this one more time! Suppose a die is rolled 43 times, and the number of 4's appearing is counted. What are the expected value (EV) and standard error (SE) of the number of 4's? What is the expected value of the number of 4's?
7.17
Full time students at UIUC usually register for between 12 and 18 credit hours. Let X be the credit hours taken this semester and P(X) be the fraction of students registered for that number of credit hours. What is the probability a randomly selected student is registered for at least 16 credit hours?
A student taking at least 16 credit hours is taking 16 OR 17 OR 18 hours. These outcomes are mutually exclusive, so we can use the Addition Rule to find the probability: P(16) + P(17) + P(18).
Let X be the random variable that looks at the number of workouts that I will do in a week. The distribution of X is shown in the table below. Find the expected value of X. Find the standard error of X.
EV = sum of all x*p(x); SE = square root of ((sum of all x^2*p(x)) - EV^2)
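The expected value and spread of a discrete random variable follow directly from its table. In this sketch the function name is illustrative and the workout distribution is MADE UP for demonstration, since the table from the question is not reproduced in these notes.

```python
from math import sqrt


def ev_and_se(pmf):
    """Hypothetical helper: (EV, SE) for a dict mapping value -> probability."""
    ev = sum(x * p for x, p in pmf.items())
    # Mean of the squares, needed for the shortcut formula.
    ev_of_squares = sum(x ** 2 * p for x, p in pmf.items())
    # Shortcut: spread = sqrt(E[X^2] - (E[X])^2). Note that EV is squared.
    return ev, sqrt(ev_of_squares - ev ** 2)


# Made-up example distribution (NOT the table from the exam question).
workouts = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.3}
ev, se = ev_and_se(workouts)
print(ev, se)
```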
This question pertains to tossing a fair coin. What is the chance of either getting all tails or no tails on 5 tosses?
Each coin toss has two possible outcomes, meaning there are 32 possible sequences. The sequence of all tails is just one of these possible outcomes, while the sequence of no tails is a separate outcome, so the probability of getting either sequence is 2/32.
In another one of your classes, you are taking an exam with 5 true/false questions. If you guess randomly on each question, what is the probability that you answer all of the questions correctly?
For each individual question, the probability of randomly selecting the correct answer is 0.5. So for 5 questions, the probability of answering all of the questions correctly is 0.5^5 = 0.03125.
Select all of the following that are true about hypothesis test.
If you have a small sample (n < 25) drawn from a Normal population with an unknown SD, use a t-test. If you have a large sample (n > 25) drawn from a Normal population with an unknown SD, use a z-test.
Luther loves to decide what's for dinner according to the weather. He uses a table to describe the probability of each choice under different weather.
P(choose Sakanya) = P(sunny) * P(Sakanya | sunny) + P(rainy) * P(Sakanya | rainy) + P(cloudy) * P(Sakanya | cloudy) = ...
A box has 13 tickets; 2 blue, 1 red, and 10 yellow. What is the chance of drawing 2 tickets with replacement and not getting all reds? (Not all = 1 - all.)
The probability of not drawing all reds is 1 - P(drawing all reds) = 1 - (1/13)^2, or 0.9940828402366864.
If we increase the size of the sample and the standard deviation remains the same, what will happen to the size of the 95% confidence interval?
The size of confidence interval will become smaller (narrower).
A roulette wheel has 18 reds, 18 blacks, and 2 greens. If the ball lands on red, you win and if it lands on another color, you lose. You play roulette 100 times in a row and count how many times you win. What is the standard error of the number of wins?
The standard error is found by the square root of (P(win) * P(lose) * number of plays). So SE = (18/38 * 20/38 * 100) ^ 0.5 = 4.9930699897395465.
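The count of wins in repeated independent plays has expected value n*p and standard error sqrt(n*p*(1-p)). A short sketch with an illustrative function name, applied to the roulette question:

```python
from math import sqrt


def count_ev_and_se(n, p):
    """Hypothetical helper: EV and SE of the number of successes in n tries."""
    ev = n * p
    se = sqrt(n * p * (1 - p))
    return ev, se


# 100 plays, winning on 18 of the 38 pockets each time.
ev, se = count_ev_and_se(100, 18 / 38)
print(ev, se)
```

The same helper applies to counting 4's in die rolls (p = 1/6) or correct guesses on a multiple-choice test (p = 1/number of choices).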
Suppose jurors make the right decisions about guilt and innocence 70% of the time and that 80% of all defendants are truly guilty. What is the chance of a convicted (found guilty) defendant being actually innocent?
We can find the number of defendants convicted by adding correctly convicted guilty defendants (0.7*0.8*1000) and the number of innocent defendants incorrectly convicted ( (1 - 0.7) * (1000 - 0.8* 1000)), and we get P(Convicted) = 620.0/1000. So P(Innocent | Convicted) = (1-0.7) * 200.0 / 620.0, or 0.09677419354838711.
Suppose Spongebob has 5 coins: 2 Dimes (10 cents) and 3 Quarters (25 cents). Let X denote the amount Patrick gets if he steals TWO coins at random. Find the expected value of the amount Patrick gets, E(X) (in dollars).
The pmf of X (in cents, drawing without replacement) is: P(X=20) = P(both dimes) = (2/5)*(1/4) = 1/10; P(X=35) = P(one dime and one quarter) = 2*(2/5)*(3/4) = 6/10; P(X=50) = P(both quarters) = (3/5)*(2/4) = 3/10. Hence E(X) = sum of x*P(x) = 20*1/10 + 35*6/10 + 50*3/10 = 38 cents = $0.38.
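For the question as stated (2 dimes and 3 quarters, two coins taken), the expected value can be cross-checked by listing every equally likely pair. This sketch assumes the two coins are taken without replacement; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

coins = [10, 10, 25, 25, 25]  # values in cents: 2 dimes, 3 quarters

# Every unordered pair of distinct coins is equally likely to be taken.
pairs = list(combinations(range(len(coins)), 2))

# Average the total value over all pairs.
total_cents = sum(coins[i] + coins[j] for i, j in pairs)
expected_cents = Fraction(total_cents, len(pairs))

print(len(pairs), expected_cents, float(expected_cents) / 100)
```

Indexing the coins by position keeps the two dimes (and the three quarters) distinct, so each physical pair is counted once.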
A study is conducted on a group of 80 freshmen at a university. Each student is asked whether he/she drinks coffee or not. The results are recorded in the following table: Draw 3 students WITH replacement. What is the probability that ALL 3 students are coffee drinkers?
(coffee drinkers/total)^3
Write a simulation in Python to simulate rolling three 12-sided dice. Run your simulation 10000 times and store your results as a DataFrame df with the columns die1, die2, and die3 that each contain a value between 1-12 representing the 12 possible rolls of the 12-sided die.
import random
import pandas as pd
data = []
for i in range(10000):
    # Each die is rolled independently
    die1 = random.randint(1, 12)
    die2 = random.randint(1, 12)
    die3 = random.randint(1, 12)
    d = {"die1": die1, "die2": die2, "die3": die3}
    data.append(d)
# Make sure to store your simulation result as a DataFrame `df`:
df = pd.DataFrame(data)
Write a simulation in Python to simulate spinning two wheels each with all of the numbers 1 through 16. Run your simulation 5000 times and store your results as a DataFrame df with the columns wheel1 and wheel2.
import random
import pandas as pd
data = []
for i in range(5000):
    # Each wheel is spun independently
    wheel1 = random.randint(1, 16)
    wheel2 = random.randint(1, 16)
    d = {"wheel1": wheel1, "wheel2": wheel2}
    data.append(d)
# Make sure to store your simulation result as a DataFrame `df`:
df = pd.DataFrame(data)
Write a simulation in Python to simulate spinning a new roulette-like wheel. The new, modified roulette wheel has 17 reds, 17 blacks, and 4 greens. If the ball lands on red, you win; if it lands on another color, you lose. You play "modified roulette" 140 times in a row and count how many times you win. Create 3000 simulations of the above spinning process and store your results as a DataFrame df with the column winTimes that contains the number of wins for each set of 140 plays.
import random
import pandas as pd
def playRoulette(games):
    winTimes = 0
    for i in range(games):
        # Pockets 1-17 are red; the other 21 pockets (black and green) lose
        wheel = random.randint(1, 38)
        if wheel <= 17:
            winTimes = winTimes + 1
    return winTimes
# Write your simulation here:
data = []
for i in range(3000):
    d = {"winTimes": playRoulette(140)}
    data.append(d)
# Make sure to store your simulation result as a DataFrame `df`:
df = pd.DataFrame(data)
A simulation has already been done of a bag with 5 red, 5 blue, and 5 white marbles where two marbles are drawn and stored in DataFrame columns draw1 and draw2. The values in the DataFrame columns draw1 and draw2 are integers that represent a color. Specifically: 1-5 represents red, 6-10 represents blue, and 11-15 represents white. Using the results of this simulation that is provided to you in the DataFrame df, write the Python code to calculate the estimated probability of drawing exactly one red marble and one white marble and store that probability in prob.
# Success: red then white, OR white then red
df_success = df[((df["draw1"] <= 5) & (df["draw2"] >= 11)) | ((df["draw1"] >= 11) & (df["draw2"] <= 5))]
prob = len(df_success) / len(df)
900 college freshmen were randomly selected for a national survey. Among survey participants, the mean grade-point average (GPA) was 3.3, and the margin of error was ±3% for the 99% confidence interval. If we want a 95% confidence interval with a margin of error of ±3%, how many students do you need to select?
Find the SD from the first survey (SD = MoE * sqrt(n) / z for the 99% interval), then find n for the new interval using n = (z * SD / MoE)^2 with the 95% z-value.
A simulation has already been done of a bag with 6 red, 5 blue, and 4 white marbles where two marbles are drawn and stored in DataFrame columns draw1 and draw2. The values in the DataFrame columns draw1 and draw2 are integers that represent a color. Specifically: 1-6 represents red, 7-11 represents blue, and 12-15 represents white. Using the results of this simulation that is provided to you in the DataFrame df, write the Python code to calculate the estimated probability of drawing exactly one red marble and one white marble and store that probability in prob.
# Success: red then white, OR white then red
howie = df[((df["draw1"] <= 6) & (df["draw2"] >= 12)) | ((df["draw1"] >= 12) & (df["draw2"] <= 6))]
prob = len(howie) / len(df)
A complete simulation of an event has already been written and run for you. The simulation is of spinning two wheels each with all of the numbers 1 through 10. Each spin is stored as a row (observation) in the DataFrame df under the columns wheel1 and wheel2. Using the simulation results in df, find an estimate of the probability that the first value is less than 6 and the second value is greater than 6.
# Success: wheel1 below 6 AND wheel2 above 6
howie = df[(df["wheel1"] < 6) & (df["wheel2"] > 6)]
prob = len(howie) / len(df)
Company A is developing a new pregnancy test. Based on their experiments, the test gives the correct result about 94% of the time, meaning the test result is positive when the person is pregnant, it is negative when the person is not pregnant. Suppose that 65% of women who take the test are pregnant. If the test result is positive, what is the probability that the person is actually NOT pregnant?
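Make a 2x2 table of pregnant / not pregnant against positive / negative, then divide the false positives by all positives: P(Not pregnant | Positive) = (0.35 * 0.06) / (0.65 * 0.94 + 0.35 * 0.06) = 0.021/0.632, or about 0.0332.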
Karle walks to campus or takes the bus to campus each day. After many observations, we find the probability that Karle arrives via bus is 0.35. What is the probability Karle arrives on the bus for the FIRST time on the fourth day (Karle did NOT take the bus before then)?
P(Karle arrives by bus for the first time on the 4th day) = P(Karle does not arrive by bus on the first 3 days and arrives by bus on the 4th day) = (1-0.35)^3 * (0.35) = 0.09611875
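The "first time on day k" pattern is k-1 failures followed by one success. A minimal sketch, assuming the days are independent as the question implies; the function name is illustrative.

```python
def p_first_success_on(p, k):
    """Hypothetical helper: P(first success happens exactly on trial k)."""
    # k - 1 independent failures, then a single success.
    return (1 - p) ** (k - 1) * p


# Karle takes the bus with probability 0.35 each day; first bus ride on day 4.
print(p_first_success_on(0.35, 4))
```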
Suppose jurors make the right decisions about guilt and innocence 50% of the time and that 70% of all defendants are truly guilty. What is the chance of an acquitted (found not guilty) defendant being actually guilty?
P(Guilty | Acquitted) = P(Guilty) * P(Acquitted | Guilty) / P(Acquitted). Out of 1000 defendants, 700 are guilty and jurors wrongly acquit (1-0.5) * 700 = 350 of them; they correctly acquit 0.5 * 300 = 150 innocent defendants, so 500 are acquitted in total. So P(Guilty | Acquitted) = (1-0.5) * 700.0 / 500.0, or 0.7.
A multiple-choice test has 48 questions. Each question has 3 choices. If you guess at random on all the questions, what are the expected value (EV) and standard error (SE) of the number of correct answers on your test? What is the standard error?
sqrt(n*p*q) --> sqrt(48 * 1/3 * 2/3), or about 3.27