Statistics Final Exam Study Guide


Open the Slavic data and use the appropriate sorting to answer the following questions: 1. What is the average salary for the employee with ID number = 797976? 2. What is the standard deviation of the salary for the employee with ID number = 797976?

1. 1090 2. 48

Refer to the box plot to answer the questions: 1. Which of the following dates has the largest interquartile range? 2. Which of the following dates has the highest median? 3. Which of the following dates is most symmetric?

1. 15 July 2. 12 July 3. 11 July

Open the hockey data. SETUP: Some experts think that players in the USHL league, on average, play fewer games than players in the WHL league. Given the data, your job is to prove or disprove this belief. 1. What test did you perform? 2. Statistical interpretation 3. Conclusion

1. One sided t-test 2. None of these 3. No, we cannot confirm this belief

Open the cars04 data. SETUP: Use the highway mileage to construct the appropriate histogram (use bin size starting with 20). Answer the questions: 1. The chart you created resembles? 2. Which interval has the highest frequency? 3. How many cars have highway gas mileage below 30? 4. For this data, which of the following is true?

1. Skewed shape 2. 20-25 3. 167 4. StDev < Median < Average

Imagine an experiment where three dice are tossed and the number on each die is recorded under Die1, Die2, Die3. Answer the following questions about the sum of the three numbers recorded from: Die1+Die2+Die3 1. What is the probability that Die1+Die2+Die3=10 2. What is the probability that Die1+Die2+Die3> 10 3. What is the probability that Die1+Die2>9

1. The probability is approximately 10% 2. The probability is app. 50% 3. The probability is app. 15%
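The three-dice card above can be checked by brute-force enumeration. The following is a minimal sketch, assuming fair six-sided dice; the names `rolls` and `prob` are illustrative and not part of the course material. The exact values (12.5%, 50%, about 16.7%) are close to, but not identical with, the rounded answer choices listed on the card.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 6**3 = 216 equally likely outcomes of three fair dice.
rolls = list(product(range(1, 7), repeat=3))

def prob(event):
    """Exact probability of an event over the 216 outcomes."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

p_sum_eq_10 = prob(lambda r: sum(r) == 10)    # Die1 + Die2 + Die3 = 10
p_sum_gt_10 = prob(lambda r: sum(r) > 10)     # Die1 + Die2 + Die3 > 10
p_two_gt_9 = prob(lambda r: r[0] + r[1] > 9)  # Die1 + Die2 > 9

for label, p in [("sum = 10", p_sum_eq_10),
                 ("sum > 10", p_sum_gt_10),
                 ("Die1 + Die2 > 9", p_two_gt_9)]:
    print(f"{label}: {p} = {float(p):.3f}")
```

Enumeration is used here instead of simulation because the sample space is small enough to count exactly.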

A random experiment was conducted where Person A rolled two dice and recorded the product of the two dice. Person B just rolled one die, and recorded the number. 1. What is the probability that Person A observes number 7 2. What is the probability that person B obtains the number 5 or 6 3. Which of the two persons will more likely obtain the number 6

1. Zero 2. about 32% 3. Person B
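As a rough check of the card above, the product of two dice can also be enumerated exactly. This sketch assumes fair dice; the variable names are hypothetical. A product of 7 is impossible because 7 is prime and larger than any single face.

```python
from fractions import Fraction
from itertools import product

# Person A: product of two fair dice (36 equally likely pairs).
products = [a * b for a, b in product(range(1, 7), repeat=2)]

def p_person_a(value):
    """Exact probability that the product of two dice equals value."""
    return Fraction(products.count(value), len(products))

def p_person_b(values):
    """Exact probability that one fair die lands on any face in values."""
    return Fraction(len(set(values) & set(range(1, 7))), 6)

p_a_seven = p_person_a(7)       # Person A observes 7
p_b_five_or_six = p_person_b([5, 6])
p_a_six = p_person_a(6)         # pairs (1,6), (6,1), (2,3), (3,2)
p_b_six = p_person_b([6])

print(p_a_seven, p_b_five_or_six, p_a_six, p_b_six)
```

The exact value for Person B rolling a 5 or 6 is 1/3; the "about 32%" on the card is presumably a simulated estimate.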

Use houseprice data and via multiple regression select three variables that predict the house selling price the best. Make another table with these three variables and answer the questions 1. one of the coefficients should be negative and very close to zero. What is its value and what is the interpretation of the negative sign 2. Which of the three variables has the best P-value and what is the P-value 3. Using this second table predict the selling price of a house that is 5 years old, has 2 and 1/2 bathrooms and 3 rooms 4. Based on the table would you characterize the regression fit and the prediction as Poor, good, very good, or excellent

1. -0.047; this is how much the selling price of the house decreases, in hundred thousands, for each additional year of age 2. #bathrooms has the best P-value; P-value = 9.17E-05 3. 11.5 hundred thousand 4. Very good

Three coins were tossed. Answer the questions: 1. What is the probability that the first coin has a "tail" as an outcome? 2. What is the probability that the first coin is "head" and the second coin is "tail"? 3. What is the probability that the number of "tails" on these three coins is equal to 3?

1. 1/2 2. 1/4 3. 1/8

Sort the Slavic data (from smallest to largest) by two keys, CompanyID and EmployeeID. 1. After this sorting, what is the number in cell A22? 2. After sorting, use data from the company with ID=71 and create a scatter plot where the X-axis is 401KContribution and the Y-axis is matching. Statement: Dots on the left side of the chart lie approximately on a straight line 3. For the same chart. Statement: Dots on the right side of the chart follow an almost perfect line 4. On this same chart, the dot with the highest Y-axis value also has the highest X-axis value

1. 137445 2. True 3. False 4. False

Use the Us_Crime.txt data to extract 4 columns of data: Murder, Robbery, Larceny, and Burglary for West Virginia. Your task is to use multiple regression to predict the Murder count based on the other three variables. Create the appropriate regression table and answer the questions. 1. What is the standard error and what would be a one-sentence interpretation of this number? 2. What is the coefficient next to burglary and what would be a one-sentence interpretation of this number? 3. What is the statistical interpretation of the number 57.77? 4. Give an interval prediction of the murder count in West Virginia if we recorded 5000 counts of robbery, 15000 counts of larceny, and 236600 counts of burglary

1. 17.53; it is the number used to construct the prediction interval for the number of murders 2. None of these 3. It is the predicted number of murders when there are no robberies, larcenies, or burglaries 4. [1803.03, 1838.09]

Use the SAT data and create a multiple regression table, but this time input only two variables: High school GPA and Letters. 1. If an incoming student has a high school GPA = 2.5 and Letters = 7.5, what would his predicted college GPA be? 2. What is the approximate error of this prediction? 3. If this student decreases his high school GPA by 1.5 points, how much would the student's predicted college GPA change? 4. If two students have identical high school GPAs but one has a Letters score 2 points higher than the other, how much higher will his predicted college GPA be?

1. 2.24 2. 0.63 3. -0.83 4. 0.014

Use the SAT data and via multiple regression select the two variables that predict college GPA the best. Make another table with these two variables and answer the questions based on this table. 1. What is the predicted college GPA for the student with high school GPA = 4, SAT = 1000, and Letters = 10? 2. What is the standard error of this prediction? Give the interval prediction 3. What is the coefficient of SAT and what is the interpretation? 4. If student A's high school GPA is 0.75 points higher than student B's, and, on the other hand, student B's SAT score is higher than student A's by 300 points, which of the two students will have the higher predicted college GPA and by how much?

1. 2.76 2. 0.587; [2.17, 3.34] 3. 0.001217; for an additional 100 points scored on the SAT, a student's predicted college GPA increases by 0.1217 points 4. Student B by 0.06 points
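The arithmetic behind answers 2 and 3 of the card above can be sketched as follows. This assumes, based only on how the answer key's numbers line up, that the interval prediction is the point prediction plus or minus one standard error; that is an inference from this guide, not a general rule. The variable names are illustrative.

```python
# Values taken from the answer card above.
prediction = 2.76         # predicted college GPA
std_error = 0.587         # standard error from the regression table
sat_coefficient = 0.001217

# Assumed convention: interval = prediction +/- one standard error.
interval = (prediction - std_error, prediction + std_error)

# Effect of 100 extra SAT points on the predicted GPA.
gain_per_100_sat_points = 100 * sat_coefficient

print(f"interval: [{interval[0]:.3f}, {interval[1]:.3f}]")
print(f"gain per 100 SAT points: {gain_per_100_sat_points:.4f}")
```

The upper bound comes out at 3.347, so the key's second endpoint appears to be cut off rather than rounded.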

The above two tables are related to the variable WEIGHT. The first is a regression table with 5 variables predicting WEIGHT, while the second table contains the summary statistics for the WEIGHT data. 1. How many people participated in this study? 2. Which of these quantities is the least influenced by outliers? 3. If you need to pick two variables and increase their values by one unit, which two would you pick so that the prediction changes by the largest amount?

1. 252 2. Median 3. Variable 1 and Variable 3

Use houseprice data and via multiple regression select three variables that predict the house selling price the best. Make another table with these three variables and answer the questions 1. Identify the largest coefficient. What is its value and what is the interpretation of this number? 2. Which of the three variables has the second-best P-value and what is this P-value? 3. Using this second table, predict the selling price of a house that is 10 years old, has 2 bathrooms and 3 rooms 4. What is the Y-intercept and its interpretation?

1. 4.411; For each additional bathroom the selling price of the house increases by 4.41 hundred thousand dollars 2. Age has the second best P-value; P-value= 0.018 3. 9.06 hundred thousand 4. -1.437; it has no meaningful interpretation in this case
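The two houseprice cards in this guide can be cross-checked against each other. The sketch below assumes both cards refer to the same three-variable regression table, which the guide does not state explicitly. It uses only the bathroom coefficient (4.411) and the age coefficient (-0.047) quoted in the answers; the rooms coefficient is not given, but it cancels because both houses have 3 rooms.

```python
# Coefficients quoted in the answer cards (units: hundred thousands).
coef_bathrooms = 4.411
coef_age = -0.047

# House A: 5 years old, 2.5 bathrooms, 3 rooms (answer: 11.5).
# House B: 10 years old, 2 bathrooms, 3 rooms (answer: 9.06).
house_a = {"age": 5, "bathrooms": 2.5}
house_b = {"age": 10, "bathrooms": 2.0}

# Intercept and rooms terms are identical for both houses, so they drop out.
predicted_difference = (
    coef_bathrooms * (house_a["bathrooms"] - house_b["bathrooms"])
    + coef_age * (house_a["age"] - house_b["age"])
)

reported_difference = 11.5 - 9.06
print(f"{predicted_difference:.4f} vs reported {reported_difference:.2f}")
```

The two differences agree to within rounding, which supports reading both cards against one table.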

1. If a person rolls two dice, what is the probability that the sum of the numbers is equal to 8? 2. If a person tosses three coins what is the probability that he will have three tails?

1. 5/36 2. 1/8

Use the tornadoes data. Your task is to use the months of April and May to predict the tornado activity in July. 1. If April had 100 tornadoes and May had 200 tornadoes, what would be your prediction for the number of tornadoes in July? 2. What is the approximate error of this prediction? 3. The intercept here has a value of 79.32. What is the statistical interpretation of this number?

1. 91.4 2. 45.05 3. This is the predicted number of tornadoes in July when May and April have zero tornadoes

Open file: Slavic data. Use only company 6354 from the state of Alabama. Create a scatter plot where the X-axis is 401k Contribution and the Y-axis is Paycheck. Label the chart properly. 1. What is the maximum 401KContribution? 2. There is a cluster of "dots" whose X values are less than 200. Among these dots there are a few standing out with the highest Y-axis value. What is this value? 3. The chart has one isolated point in the lower right corner. What are the approximate (X and Y) coordinates of this point?

1. 950 2. 6500 3. 900 & 4000

Open the file: US_crime.txt. Produce a scatter plot where the X-axis contains the number of burglaries in Vermont and the Y-axis contains the number of burglaries in Wyoming. Fit the trend line. 1. What does each dot represent? 2. What is the slope of the trend line? 3. What is the highest number of burglaries recorded for Wyoming? 4. What is the approximate average number of burglaries for Vermont?

1. A year 2. 0.47 3. 4728 4. 4200

For a healthy human, body temperature follows a normal distribution with a mean of 98.2 degrees F and a standard deviation of 0.26 degrees F. For an individual suffering from the common cold, the average body temperature is 100.6 degrees F with a standard deviation of 0.54 degrees F. Simulate 10000 healthy and 10000 unhealthy individuals and answer the questions 1. What is the approximate probability that a randomly picked healthy individual will have a body temperature below 98.4 degrees F? 2. What would be a range that contains 68% of individuals having the common cold? 3. What is the approximate probability that a randomly picked unhealthy individual would have a body temperature below 99 degrees F?

1. About 77% 2. between 100.06 and 101.14 3. about 1%
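The card asks for a simulation, but the same quantities can be computed directly from the normal distribution, which is useful for checking a simulated answer. The sketch below uses Python's standard `statistics.NormalDist`; the variable names are illustrative. The 68% range is taken as mean plus or minus one standard deviation, the usual rule of thumb.

```python
from statistics import NormalDist

healthy = NormalDist(mu=98.2, sigma=0.26)
cold = NormalDist(mu=100.6, sigma=0.54)

# 1. P(healthy temperature < 98.4)
p_healthy_below_98_4 = healthy.cdf(98.4)

# 2. About 68% of a normal population lies within one
#    standard deviation of the mean.
cold_68_range = (cold.mean - cold.stdev, cold.mean + cold.stdev)

# 3. P(cold temperature < 99)
p_cold_below_99 = cold.cdf(99)

print(f"P(healthy < 98.4) = {p_healthy_below_98_4:.3f}")
print(f"68% range for cold: {cold_68_range[0]:.2f} to {cold_68_range[1]:.2f}")
print(f"P(cold < 99) = {p_cold_below_99:.4f}")
```

The third probability comes out well under 1%, so "about 1%" on the card is presumably the nearest available answer choice.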

Your insurance company has coverage for three types of cars. The annual cost for each type of car can be modeled using a Gaussian distribution with the following parameters. - Car Type 1: mean = $520 and standard deviation = $110 - Car Type 2: mean = $720 and standard deviation = $170 - Car Type 3: mean = $470 and standard deviation = $80 Use a random number generator and simulate 1000-long columns for each of these three cases. 1. What is the approximate probability that Car Type 1 has an annual cost less than $550? 2. Which of the three types of cars is most likely to cost more than $1000? 3. For which of the three types do we have the highest average cost?

1. Between 55% and 70% 2. Type 2 3. Type 2
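As a cross-check on the simulated columns, the insurance card can be worked with exact normal probabilities. This is a sketch using `statistics.NormalDist` with the parameters stated on the card; the dictionary keys and variable names are made up for illustration.

```python
from statistics import NormalDist

# Annual cost models, using the parameters stated on the card.
cars = {
    "Type 1": NormalDist(mu=520, sigma=110),
    "Type 2": NormalDist(mu=720, sigma=170),
    "Type 3": NormalDist(mu=470, sigma=80),
}

# 1. P(Type 1 annual cost < $550)
p_type1_below_550 = cars["Type 1"].cdf(550)

# 2. P(annual cost > $1000) for each type, and the type where it is largest.
p_above_1000 = {name: 1 - dist.cdf(1000) for name, dist in cars.items()}
most_likely_above_1000 = max(p_above_1000, key=p_above_1000.get)

# 3. Type with the highest average cost.
highest_mean = max(cars, key=lambda name: cars[name].mean)

print(f"P(Type 1 < 550) = {p_type1_below_550:.3f}")
print(most_likely_above_1000, highest_mean)
```

A simulated column of 1000 values should land near these figures, with some run-to-run variation.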

Upload the US_Crime data and focus on the state of California. Your job is to cut and paste the data to create 7 columns and label them Murder, Rape, Robbery, Assault, Burglary, Larceny, and Car Theft, respectively. Next, perform a regression to predict the murder count using the remaining 6 variables. Finally, pick the two variables with the best P-values and create a new table, this time using these two variables to predict the murder count 1. What are the two variables you picked? 2. After getting the new table, choose the variable with the larger coefficient and describe its interpretation 3. How good is the fit and why? 4. Which of the two variables has the better P-value?

1. Car Theft and robbery 2. None of the above 3. Excellent; R-square is very close to 1 4. Robbery

Upload the CARS data. SETUP: It is believed that BMW prices are higher than 40000. Given the data, your job is to confirm or disprove this belief. 1. What test did you perform? 2. What is the P-value/margin of error? 3. Statistical interpretation 4. Conclusion

1. Confidence Interval 2. 8594.31906 3. None of these 4. No, we cannot claim that the above assertion is correct

Use the Brains data. SETUP: Is it reasonable to claim that the average IQ of the population is less than 110? 1. What test did you perform? 2. What is the P-value/margin of error? 3. Statistical interpretation 4. Conclusion

1. Confidence Interval 2. None of these 3. We are 95% certain that the 110 is not within the confidence interval 4. Yes, I am confident that the above claim is correct

Use the Brains data. SETUP: is it reasonable to claim that the average head circumference is less than 56 1. What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Confidence Interval 2. None of these 3. None of these 4. No, we cannot claim that the above claim is correct

Upload CARS data. SETUP: it is believed that Lincolns are heavier than 3500 pounds. Given the data your job is to confirm or disprove this belief. 1. What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Confidence interval 2. 339.06234 3. None of these 4. Yes, I am confident that the above assertion is correct

Open the hockey data. SETUP: Can one claim that players from the SUPER Elit league, on average, create more than 15 assists? Given the data, your job is to answer this question. 1. What test did you perform? 2. Statistical interpretation 3. Conclusion

1. Confidence interval 2. None of these 3. No, we cannot confirm the claim

The above two tables are related to the variable DEALER COST. The first is a regression table with 4 variables predicting DEALER COST, while the second table contains the summary statistics for the DEALER COST data. Answer the questions 1. If variable 3 increases by 10 units, according to the table what would happen to the prediction? 2. If all 4 variables increase by 1 unit, according to the table what would happen to the prediction? 3. Which two numbers are designed to describe the center of the variable DEALER COST? 4. Which number should be used to answer the question: Is variable 2 related to DEALER COST? 5. Once a prediction is made, we often provide a range where we believe the true value will lie. What number is used to construct this range? 6. Which number best describes the variation of the variable DEALER COST?

1. Dealer cost will increase by 1731.4 2. The dealer cost would increase by approximately 597.97 units 3. 27385.8 & 23792 4. 0.0006 5. 7532 6. 14579

Use hurricanes data and via multiple regression select three input variables: Min-pressure, gender, category in order to predict the All-deaths. Pick one of the three variables that has the WORST P-value and re-do the regression using ONLY this one variable to predict the All-deaths. 1. What variable did you pick 2. What is the P-value associated to the variable in the new table 3. Based on the second table, how would you characterize the regression fit 4. The intercept has coefficient of 14.23 . Which is the best interpretation of this number

1. Gender 2. 0.372813918 3. Poor 4. The Y-intercept; it is how many deaths are expected if the variable is zero

Upload the MANBODY data. Check which of the following four categories (Age, Weight, Height, and Knee) is negatively correlated with the BODYFAT category. Make a scatter plot with X representing the most negatively correlated category and Y representing BODYFAT, plot the line, and compute the R-square 1. What category did you choose for X? 2. According to the convention table, how would you describe the strength of the relation between the X and Y coordinates? 3. What does each dot represent? 4. Given the plot, what is the estimated slope?

1. Height 2. Poor 3. Person 4. None of the above

Upload the US_Crime data and make a chart with the X-axis being the robbery count in Vermont and the Y-axis being the robbery count in Connecticut: 1. Plot the chart and fitted line. What is the interpretation of the slope? 2. The fitted line exhibits a positive slope. Is this expected or not, and why? 3. Create a regression table. How does this P-value affect your slope interpretation?

1. It predicts the increase in the number of robberies in Connecticut for each additional robbery in Vermont 2. Yes, this is expected, since in years when crime is higher throughout the US, one would expect crime counts to increase in these two states as well 3. It is very small, thus we are confident that crime in one state is related to crime in the other

Use hurricanes data and via multiple regression select three input variables: Min-pressure, gender, category in order to predict the All-deaths. Pick one of the three variables that has the best P-value and re-do the regression using ONLY this one variable to predict the All-deaths. 1. What variable did you pick 2. What is the P-value associated to the variable in the new table 3. Based on the second table, how would you characterize the regression fit 4. The intercept has coefficient of 3618.001. Which is the best interpretation of this number

1. Min-pressure 2. 9.98769E-05 3. Poor 4. The number does not make sense, since we cannot have an atmospheric pressure of zero

Use the Tornadoes data to find the two months that are the most correlated with the total annual tornadoes 1. This observed positive trend means that if we observe more tornadoes in a given month, we will also see fewer tornadoes in the following month 2. This observed positive trend does not mean that as years go by we see more and more tornadoes 3. Which are the two months that are most correlated with the total number of tornadoes?

1. No 2. Yes 3. May and June

Four friends took their final exam and the probability that each of them would get an "A" is 1/4. Use binomial distribution to answer the following questions. 1. What are the n and p in this model 2. What is the probability that 3 out of 4 friends got A's on the exam 3. Which of these is most likely to happen

1. None of the above 2. 0.046875 3. Only one friend got an A

Six friends took their final exam and the probability that each of them would get an "A" is 1/3. Use binomial distribution to answer the following questions. 1. What are the n and p in this model 2. What is the probability that 3 out of 6 friends got A's on the exam 3. Which of these is most likely to happen

1. None of the above 2. 0.2194 3. Only two friends got A's
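The two binomial cards above (four friends with p = 1/4, six friends with p = 1/3) can be checked with the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The sketch below assumes n is the number of friends and p is the stated chance of an "A"; the answer key lists "None of the above" for n and p, so that reading is an assumption. The helper `binom_pmf` is a hypothetical name.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

four_friends = binom_pmf(4, Fraction(1, 4))
six_friends = binom_pmf(6, Fraction(1, 3))

p_three_of_four = four_friends[3]
p_three_of_six = six_friends[3]

# Most likely number of A's in each scenario.
mode_four = max(range(5), key=lambda k: four_friends[k])
mode_six = max(range(7), key=lambda k: six_friends[k])

print(float(p_three_of_four), mode_four)
print(float(p_three_of_six), mode_six)
```

Using `Fraction` keeps the probabilities exact, so the results can be compared digit by digit with the answer cards.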

Your insurance company has coverage for three types of cars. The annual cost for each type of car can be modeled using a Gaussian distribution with the following parameters. - Car Type 1: mean = $520 and standard deviation = $110 - Car Type 2: mean = $720 and standard deviation = $170 - Car Type 3: mean = $470 and standard deviation = $80 Use a random number generator and simulate 1000-long columns for each of these three cases. 1. What is the approximate probability that Car Type 2 has an annual cost less than $550? 2. Which of the three types of cars is least likely to cost more than $700? 3. For which of the three types do we have the highest probability that it will cost between $400 and $500?

1. None of these 2. Type 3 3. Type 3

Use Tornados data and your statistical expertise to answer the questions: Is it reasonable to claim that on average October has less tornado-related deaths than April per year: 1. What test did you perform 2. What is the P-value 3. Statistical Interpretation 4. Conclusion

1. One sided t test 2. 0.003555059 3. None of the above 4. Yes, I am confident that the above claim is reasonable

Use Tornados data and your statistical expertise to answer the questions: Is it reasonable to claim that there are more tornados in April than in August 1.What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. One sided t test 2. 4.08753E-06 3. None of these 4. Yes, I am confident that the above assertion is reasonable

Upload CARS data. SETUP: it is believed that Jaguars' highway mileage is above that of Mercedes . Given the data your job is to confirm or disprove this belief 1. What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. One-sided T-test 2. 0.050539587 3. None of these 4. No, we cannot claim that the above claim is correct

A random experiment was conducted where Person A tossed five coins and recorded the number of "heads". Person B rolled two dice and recorded the sum of the two numbers. Simulate this scenario (use 10000-long columns) and answer the questions 1. Which of the two persons is more likely to get the number 4? 2. Which of the two persons will have the higher median among their outcomes? 3. What is the probability that Person B obtains the number 5 or 6? 4. Which of the persons has the higher probability of getting the number 3 or smaller?

1. Person A 2. Person B 3. about 25% 4. Person A
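The simulation this card asks for can be sketched in a few lines. The version below is one possible setup, assuming fair coins and dice and a fixed seed so the run is repeatable; the helper `share` and the variable names are illustrative, and a spreadsheet simulation would give slightly different numbers on each run.

```python
import random
from statistics import median

rng = random.Random(42)  # fixed seed (arbitrary choice) for repeatability
N = 10_000               # length of each simulated column

# Person A: number of heads in five fair coin tosses.
person_a = [sum(rng.random() < 0.5 for _ in range(5)) for _ in range(N)]
# Person B: sum of two fair dice.
person_b = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(N)]

def share(values, event):
    """Fraction of simulated outcomes for which event(value) is true."""
    return sum(1 for v in values if event(v)) / len(values)

p_a_four = share(person_a, lambda v: v == 4)
p_b_four = share(person_b, lambda v: v == 4)
median_a, median_b = median(person_a), median(person_b)
p_b_five_or_six = share(person_b, lambda v: v in (5, 6))
p_a_at_most_3 = share(person_a, lambda v: v <= 3)
p_b_at_most_3 = share(person_b, lambda v: v <= 3)

print(p_a_four, p_b_four, median_a, median_b)
print(p_b_five_or_six, p_a_at_most_3, p_b_at_most_3)
```

For comparison, the exact values are 5/32 and 1/12 for getting a 4, 1/4 for Person B getting 5 or 6, and 13/16 versus 1/12 for getting 3 or smaller.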

For a healthy human, body temperature follows a normal distribution with a mean of 98.2 degrees F and a standard deviation of 0.26 degrees F. For an individual suffering from the common cold, the average body temperature is 100.6 degrees F with a standard deviation of 0.54 degrees F. Simulate 10000 healthy and 10000 unhealthy individuals and answer the questions 1. If person A is healthy and person B has a cold, which of the events is least likely to happen? 2. What would be a range [A to B] that contains 95% of healthy individuals? 3. What is the approximate probability that a randomly picked unhealthy individual would have a body temperature below 100 degrees F?

1. Person A will have temperature lower than 97.5 degrees 2. Between 97.68 and 98.72 3. about 13%

Open the MathCreditsSalary data. SETUP: It is believed that the more math credits students take during their college years, the higher their salary will be. Given the data, your job is to confirm or disprove this assertion. 1. What test did you perform? 2. What is the P-value/margin of error? 3. Statistical interpretation 4. Conclusion

1. Regression 2. 0.001661 3. Since the P-value is very small, we are confident that the slope of regression line is not zero 4. Yes, I am confident that the above assertion is correct

Upload the data HearthRate_Exercise. These data are based on 45 randomly chosen high school students. SETUP: Is it reasonable to claim that students who exercise more have a lower resting heart rate? Given the data, your job is to confirm or disprove this claim. 1. What test did you perform? 2. What is the P-value/margin of error? 3. Statistical interpretation 4. Conclusion

1. Regression 2. 0.014702 3. Since the P-value is very small, we are confident that the slope of regression line is not zero 4. Yes, I am confident that the above assertion is correct

Use Tornados data and your statistical expertise to answer the questions: Is it reasonable to claim that the number of tornados in April will affect the number of Tornados in August 1.What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Regression 2. 0.05370006 3. Since P-value is not small we cannot claim that the tornadoes in April will influence the tornadoes in August 4. No, I cannot claim that the above assertion is reasonable

Use Tornados data and your statistical expertise to answer the questions: Is it reasonable to claim that years with more observed tornados in January would be the years with more overall tornado deaths 1. What test did you perform 2. What is the P-Value/Margin of error 3. Statistical Interpretation 4. Conclusion

1. Regression 2. 0.406778889 3. Since the P-value is not small, we cannot claim that the number of tornadoes in January will influence the tornado deaths 4. No, I cannot claim that the above assertion is correct

It is commonly believed that as men get older they accumulate more body fat. Use ManBody data and your statistical expertise to answer the questions: Is it a reasonable belief? 1. What test did you perform? 2. Statistical Interpretation? 3. Conclusion?

1. Regression 2. Since the P-value is small, we are confident that the slope is not zero 3. Yes, I am confident that the above belief is accurate

Open pollution data. SETUP: One would expect that the more populated the city the more SO2 will be recorded. Given the data your job is to decide if this is a reasonable expectation 1. What test did you perform 2. Statistical interpretation 3. Conclusion

1. Regression 2. Since P-Value is small we are confident that the slope is not zero 3. Yes, this is a reasonable expectation

open Brain Data. SETUP: it is believed that the heavier the person, the larger the brain volume should be. Given the data your job is to confirm or disprove this assertion 1. What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Regression 2. 0.379045702 3. None of these 4. No, we cannot claim that the above assertion is correct

Upload the CARS data and make a histogram using the dealership cost column (questions 1 and 2), then make a histogram using the weight column (questions 3 and 4). Use the shape of each histogram to answer the following questions: 1. The histogram should resemble the shape presented as 2. Compute the median, standard deviation, and average for this whole data set. Choose the correct statement 3. The histogram should resemble the shape presented as 4. Compute the median, standard deviation, and average for this whole data set. Choose the correct statement

1. Skewed 2. StDev< Median <Average 3. Double bell shape 4. StDev<Average<Median

Upload the CARS data and make a histogram using the retail cost column (questions 1 and 2), then upload the US_crime data and make a histogram using the aggravated assault column (questions 3 and 4). Use the shape of each histogram to answer the following questions: 1. The histogram should resemble the shape presented as 2. Compute the median, standard deviation, and average for this whole data set. Choose the correct statement 3. The histogram should resemble the shape presented as 4. Compute the median, standard deviation, and average for this whole data set. Choose the correct statement

1. Skewed 2. StDev< Median <Average 3. Skewed 4. Median< Average< StDev

Open the wages data. For the WAGES column, compute three quantities: average, median, and standard deviation. 1. Which of the following is true? 2. Among these three numbers, which one describes the amount of randomness or variation in the data? 3. Among these three quantities, which one(s) are NOT sensitive to outliers?

1. Standard deviation < median < average 2. Standard Deviation 3. None of these

The above two tables are related the variable DEALER COST. The first is a regression table with 4-variables predicting the DEALER COST while the second table contains the summary statistics related to DEALER cost data. Answer the questions 1. If variables 1 and 2 increase by 10 units, according to the table what would happen to the prediction 2. If all 4 variables are equal to zero according to the table what would happen to the prediction 3. What was the average observed dealer cost 4. Which number should be used in order to answer the question: Is variable 3 related to DEALER COST 5. Once a prediction is made, we often provide a range where we believe the true value will lie. What number is used to construct this range? 6. What was the range of the variable, DEALER COST?

1. The dealer cost will go up 2. The dealer cost would be 208.93 units 3. 27386 4. None of these 5. None of these 6. None of these

Open the Hurricane data. SETUP: Test if there is a significant difference in deaths between hurricanes of category 1 and category 4. Answer the questions 1. What would be the correct null hypothesis? 2. The P-value is 0.128060089. What can be statistically concluded? 3. What would be an appropriate comment? 4. Which of the following is NOT one of the Rules of Thumb?

1. The population averages are equal 2. We cannot reject the null hypothesis 3. We cannot conclude that Death by Hurricanes with Cat 1 and cat 4 are different 4. The Null hypothesis is always stated with a NOT "equal" sign. Either: the population averages are not equal or the slope of the regression line is NOT equal to zero

Open the Hurricane data. SETUP: Is it reasonable to assume that average hurricane deaths in the 1960s are not different from those in the 1970s? Given the data, your job is to check whether this assertion is indeed reasonable 1. What would be the correct null hypothesis? 2. The P-value is 0.58. What can be statistically concluded? 3. Write a one-line additional comment

1. The population averages are equal 2. We cannot reject the null hypothesis 3. We cannot conclude that hurricane deaths for the 60s and 70s are different

Open Weight_vs_IQ data. SETUP: is it reasonable to assume that the IQ and weight of a person should not be related? Given the data, your job is to check if this assertion is indeed reasonable or not 1. What would be the correct Null-hypothesis 2. The P-value is 0.74. What can be statistically concluded? 3. Write a one-line additional comment

1. The slope of the regression line is equal to zero 2. We cannot reject the Null hypothesis 3. We cannot conclude that IQ and Weight are related

Use the Cars04 data. Your task is to use engine size, width, and weight to predict the retail price 1. The overall prediction fit is 2. What is the approximate error of this prediction? 3. Two cars have the same width, but car A has an engine one liter larger while car B is heavier by 250 pounds. Which of the two cars would have the larger predicted price?

1. Very good 2. 8870.54 3. Car B

Upload the MANBODY data. Create a scatter plot with X representing age and Y representing the percentage of body fat; plot the line and compute the R-square: 1. If a person is 40 years old, according to the chart and the analysis, what is his predicted body fat percentage? 2. If person A is 30 years old and person B is 32 years old, according to the chart and the analysis, which of the following can be concluded? 3. How strong is the relationship between age and BODYFAT?

1. about 18% 2. None of these 3. Poor

Open the Weight_vs_IQ data. SETUP: Common sense dictates that a person's IQ and weight should not be related. However, one never knows until one examines the data. Given the data, your job is to check whether the common-sense assumption is reasonable 1. What test did you perform? 2. What is the P-value/margin of error? 3. Statistical interpretation 4. Conclusion

1. regression 2. None of these 3. Since P-value is large we cannot claim that IQ and weight are related 4. I have no statistical evidence to claim that IQ and weight are related

Use the tornadoes data and find the month Y that has the strongest positive correlation with month X = April. 1. What is the R-squared on your chart? 2. The slope of the line is 3. The Y-intercept should be 10.71. What is the meaning of this number?

1. 0.21 2. 0.13 3. There will be 10.71 tornadoes in month Y if there are no tornadoes in April

Use the box plot to answer the questions: 1. What is the approximate median for the 60+ group? 2. Which of these groups is skewed? 3. Which group has the outliers?

1. About 162 2. The 46-59 group 3. The 30-50 group

Upload US_Crime data and compute the following numbers: Using the sorting feature find the state and the year that had the highest number of recorded robberies 1. The average robbery numbers for all the states in 1960 is 2. The standard deviation of all robberies for all the states in 1960 is 3. The Median of all robberies for all the states in 1960 is 4. The state is 5. The year is 6. The Highest recorded number of robberies is

1. 1711 2. 3684 3. 563 4. California 5. 1992 6. 130897

Open file: Flyers_Customers data. Produce the scatter plot chart in which x-axis is the number of Flyers and Y-axis is the number of customers. 1. Two dots on the left have approximately Y-axis values 2. What is the app. average for x-axis 3. What is the app. X-axis for a dot with highest Y-axis?

1. 350 2. 400 3. 650

Open BRAIN data. Claim: Male and female subjects have different IQ scores 1. What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Two sided T-test 2. 0.870921057 3. Since P-value is too large we cannot conclude that the two variables have different averages 4. No, I cannot claim that the average IQ's are different

Upload CARS data. SETUP: It is believed that Lincolns and Cadillac have different weights . Given the data your job is to confirm or disprove this belief 1. What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Two-sided T-test 2. 0.598587029 3. Since P-Value is large we cannot claim that the averages are different 4. No, we cannot claim that the above assertion is correct

Use Tornados data and your statistical expertise to answer the questions: Is it reasonable to claim that the average number of observed tornadoes per year is different from the average number of tornado related deaths per year 1.What test did you perform 2. What is the P-value/margin of error 3. Statistical interpretation 4. Conclusion

1. Two-sided t-test 2. 4.07497E-24 3. Since P-value is very small we are confident that the averages of these two data are different 4. Yes, I am confident that the above assertion is correct

It is commonly believed that cities with wind speeds of 10 mph or more have a different average temperature from cities with wind speeds of less than 10 mph. Use the Pollution data and your statistical expertise to answer the questions: Is this a reasonable belief? 1. What test did you perform? 2. Statistical interpretation 3. Conclusion

1. Two-sided t-test 2. Since the P-value is too large, the test is inconclusive 3. No, I cannot claim that the above belief is correct

