STAT 226 exam 3/4

Imagine 75 students taking one of the quizzes with a maximum possible score of 4. I determine the median score. After grading the quiz I realize that 5 students with 4 points each did exceptionally well and decide to award them a bonus point. Fill in the blank: The median of the new score distribution will be _____________ that of the original score distribution.

Equal to
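
A quick check of why the median does not move, sketched with made-up scores. Only the class size (75), the maximum score (4), and the five bonus points come from the question; the score list itself is hypothetical.

```python
from statistics import median

# Hypothetical quiz scores for 75 students (maximum score 4); not real data.
scores = [0] * 5 + [1] * 10 + [2] * 20 + [3] * 25 + [4] * 15

# Give one bonus point to five of the students who scored 4.
bonus = sorted(scores)
for i in range(-5, 0):      # the five largest values, all equal to 4 here
    bonus[i] += 1

old_median = median(scores)
new_median = median(bonus)
print(old_median, new_median)
```

The median of 75 values is the 38th value in sorted order, and changing only the five largest values leaves that position untouched.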

For a simple linear regression model fit to a set of data, even if the R²-value (i.e., the coefficient of determination) is close to 1 (or 100%), we cannot conclude that the model provides an adequate fit to the data.

True

If men always marry women who are 2 years younger than them, then the correlation between the age of a husband and his wife would be equal to 1.

True

If the standard error increases, the p-value for a test of hypotheses for the population mean increases.

True

TRUE or FALSE: For a simple linear regression model fit to a set of data, even if the R²-value (i.e., the coefficient of determination) is close to 1 (or 100%), we cannot conclude that the model provides an adequate fit to the data.

True

TRUE or FALSE: If we calculate 1000 90% confidence intervals for a population mean μ, approximately 900 of them will contain μ.

True
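
The long-run coverage idea can be sketched with a simulation. Everything here is an assumption made for illustration: the population (normal, mean 50, standard deviation 10), the sample size of 25, the seed, and the use of a z-interval with σ treated as known to keep the code in the standard library.

```python
import random
from statistics import NormalDist, mean

random.seed(226)                       # arbitrary seed, for reproducibility only
mu, sigma, n = 50.0, 10.0, 25          # hypothetical population and sample size
z_star = NormalDist().inv_cdf(0.95)    # critical value for a 90% interval
margin = z_star * sigma / n ** 0.5     # sigma treated as known in this sketch

covered = 0
for _ in range(1000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    if x_bar - margin <= mu <= x_bar + margin:
        covered += 1

print(covered)   # typically close to 900, but it varies from run to run
```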

TRUE or FALSE: The t-distribution is symmetric and unimodal.

True

TRUE or FALSE: The total area under a t−distribution curve is 1.

True

TRUE or FALSE: For different values of an explanatory variable the simple linear regression model describes the change in the predicted value (or mean) of the response variable.

True

The assumption of independence of observations is required for the correct conclusions of a statistical test of hypotheses.

True

The correlation indicates both the direction and strength of a linear relationship.

True

The distinction between explanatory and response variables is not important for correlation

True

The t-statistics for a test of hypotheses follows a t-distribution under the null hypothesis.

True

The test statistic for a test of hypotheses is unitless.

True

The total area under a t−distribution curve with 5 degrees of freedom is 1.

True

The variable that is being predicted in regression analysis is the response variable.

True

For a Student t-distribution with 30 degrees of freedom we have that P(t<5) is (choose the correct answer from those provided below):

Approximately 1

The alternative hypothesis (Ha)...

contradicts the null hypothesis.

A test in which the alternative hypothesis allows any value of a parameter larger than a specified value is...

A one-sided test

When we increase the sample size, the p-value generally decreases.

True

Every second of our lives we are using oxygen to keep our bodies alive, even though we don't think about it very often. When we exercise, we need more oxygen than usual. After consulting with a statistician, the YMCA designed a survey to see how oxygen uptake is related to pulse (heartbeat) while running. The data for thirty-one individuals are displayed in a scatterplot below, along with some summary statistics corresponding to the simple linear regression model fit to these data. Use this information to calculate the linear correlation coefficient between these two variables. Report your answer to TWO decimal places.

*On one note doc*

Correlation Detective. The figure below consists of graphs for four different data sets, labeled as A, B, C, and D. For each of these graphs, you are required to identify the correct value of the linear correlation coefficient from the list provided. Match the correct correlation for each plot.

*On one note doc*

Refer to the context of the NFL data set described earlier. Construct a 95% confidence interval for the population slope. Report here ONLY the lower bound of this interval.

*double check one note doc* -.0895

Refer to the context of the NFL data set described earlier. The degrees of freedom for sampling distribution of the t-ratio calculated as part of a test statistic involving the population slope are [df]

*double check one note doc* 1,000

A liquor wholesaler is interested in measuring how the price per bottle (in dollars) of a premium scotch whiskey affects sales (in quantity sold). Data for eight randomly selected weeks are displayed below together with the corresponding JMP output from fitting a least-squares regression line to the data. Use the fitted least-squares regression line presented in the JMP output above to predict the number of bottles sold per week when the price of a bottle is $19.22. Keep FOUR decimals in your intermediate calculations, and report your answer to TWO decimal places.

*on one note doc*

A liquor wholesaler is interested in measuring how the price per bottle (in dollars) of a premium scotch whiskey affects sales (in quantity sold). Data for eight randomly selected weeks are displayed below together with the corresponding JMP output from fitting a least-squares regression line to the data. Use the fitted least-squares regression line presented in the JMP output above to predict the number of bottles sold per week when the price of a bottle is $20.76. Keep FOUR decimals in your intermediate calculations, and report your answer to TWO decimal places.

*on one note doc*

A liquor wholesaler is interested in measuring how the price per bottle (in dollars) of a premium scotch whiskey affects sales (in quantity sold). Data for eight randomly selected weeks are displayed below together with the corresponding JMP output from fitting a least-squares regression line to the data. Use the output given below to identify the explanatory variable and report the correlation coefficient.

*on one note doc*

Central Limit Theorem in Practice. The figure below presents four histograms, labeled with upper case letters A through D. One of these graphs displays the population of interest for a study. The other three graphs represent the sample means for 400 random samples of size 5, 12, and 60. Answer the following question using this information and the graphical displays. The histogram corresponding to samples of size 5 is shown in the graph labeled

*on one note doc*

Consider the five scatterplots that are shown below: Select the scatterplot that shows the strongest linear relationship between the X and Y variables (Select all that apply).

*on one note doc*

How does the cost of a movie depend on its length? The scatterplot below provides information on a random sample of 120 films. Data were collected on the cost (millions of dollars) and the running time (minutes) for each film. The analysts fit a simple linear regression of cost vs. length and obtained a coefficient of determination (R²) of 0.101. Using all the information provided in this problem, report the value of the correlation coefficient up to four decimal places.
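
In simple linear regression the correlation is the square root of R², with its sign taken from the direction of the relationship. A minimal sketch of the magnitude only, since the scatterplot that determines the sign is not reproduced here:

```python
import math

r_squared = 0.101                     # coefficient of determination from the question
r_magnitude = math.sqrt(r_squared)    # |r|; the sign must be read from the scatterplot
print(round(r_magnitude, 4))
```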

*on one note doc*

NFL scores Consider the question on how the Home score can be used to predict the Visit score for NFL games. We have obtained a data set which includes information on the Home and corresponding Visit scores for all games played in the NFL during the 2001 through 2008 seasons. That is, for every game we know the home team's achieved Home score and the corresponding Visit score achieved by the visiting team. We provide below the JMP output obtained from fitting a simple linear regression which predicts the Visit score from the Home score. Use this JMP output to answer all the questions below which refer to the NFL data set. For this question, report the estimated slope. Report this value exactly as it appears in JMP.

*on one note doc*

When finding the Least Squares Regression line it does not matter which variable is used as the response variable and which one as the explanatory variable, the LS regression line will be the same.

False

Refer to the context of the NFL data set described earlier, and the confidence interval computed in JMP. Choose the correct terms to describe the interpretation of the 95% confidence interval for the population slope calculated above. [ Select ] that for each additional point in the [ Select ] score, the [ Select ] score will decrease by between [ Select ] and [ Select ] points.

*on one note doc*

Refer to the context of the NFL data set described earlier. For this question, review the information obtained for the test of hypothesis and the confidence interval. Select the correct statements from those provided below.

*on one note doc*

Refer to the context of the NFL data set described earlier. The test statistic corresponding to the test for a linear relationship is [t]

*on one note doc*

Refer to the context of the NFL data set described earlier. To test whether there is statistically significant evidence of a linear relationship between Home and Visit scores, we need to set up the null and alternative hypotheses. Select, from the answers provided below, the correct pair: H0: [ Select ] H1: [ Select ]

*on one note doc*

Refer to the context of the T-bills and Inflation problem above. Calculate and report the predicted T-bills value when there is no inflation (i.e. inflation is 0). Use four decimal places for all your intermediate calculations and report your answer to 2 decimal places.

*on one note doc*

Refer to the context of the T-bills and Inflation problem above. In this question, you will need to comment on the evidence from the data regarding possible violations of the assumptions necessary to make valid inferences (such as conclusions from tests of hypotheses or construct correct confidence intervals) for population parameters. Make the correct choices to complete the statements below. Using graph(s) [ Select ] we conclude that the linear model assumption [ Select ] violated. Using graph(s) [ Select ] we conclude that the constant variance assumption [ Select ] violated. Using graph(s) [ Select ] we conclude that the normality assumption [ Select ] violated. To identify if the assumption of independence is violated, we use [ Select ] .

*on one note doc*

Refer to the context of the T-bills and Inflation problem above. Report the margin of error for a 95% confidence interval for the population intercept. Use four decimal places for all your intermediate calculations and report your answer also to 4 (FOUR) decimal places.

*on one note doc*

Since the Federal Reserve Rate is strongly tied to the T-bill rates, we may be able to predict the T-bill rate from the Federal Fund rate. Use the JMP output below to calculate the linear correlation coefficient between these two variables. Report your answer to TWO decimal places.

*on one note doc*

T-bills and inflation. When inflation is high, lenders require higher interest rates to make up for the loss of purchasing power of their money while it is loaned out. In this problem, we will be using data on the return (%) of one-year Treasury bills (T-bills) and the rate of inflation (%) as measured by the change in the government's Consumer Price Index in the same year. The data include a random sample of 40 years. The figure below presents the JMP output resulting from fitting a simple linear regression model to these data, including graphs, parameter estimates and inferential quantities. Please note that some quantities were removed from the JMP output. The questions in this quiz will all refer to this output and ask you to compute some of the missing values as well as identify and comment on the graphs and features of this model. If you prefer to save this figure for easier reference, you can do so by downloading it from here: TBill_Summaries_Quiz14.png For this question, you are asked to match the descriptions or concepts on the left with the values or graph names on the right. Note: not all terms from the dropdown menus are used, and no term is used twice.

*on one note doc*

Toyota Corolla. A large Toyota car dealership offers purchasers of new Toyota cars the option to buy their used cars as part of a trade-in. The dealership then sells the used cars for a small profit. To ensure a reasonable profit the dealers need to predict the price based on the age of the car. For that reason, data were collected on all previous sales of used Toyota Corollas at the dealership. The data includes the sales price of used cars (recorded in US dollars) and the age of the cars (recorded in months). The JMP output provided below presents the results obtained by the dealer when fitting a least square regression model to the data collected for 654 cars. Use the JMP output provided below to calculate the value of the correlation coefficient. Report your answer to TWO decimal places.

*on one note doc*

Toyota Corolla. A large Toyota car dealership offers purchasers of new Toyota cars the option to buy their used cars as part of a trade-in. The dealership then sells the used cars for a small profit. To ensure a reasonable profit the dealers need to predict the price based on the age of the car. For that reason, data were collected on all previous sales of used Toyota Corollas at the dealership. The data includes the sales price of used cars (recorded in US dollars) and the age of the cars (recorded in months). The JMP output provided below presents the results obtained by the dealer when fitting a least-squares regression model to the data collected for 654 cars. Use this JMP output to answer the following question. Based on the information provided in the JMP output, are all the residuals obtained from the least-squares regression model positive?

*on one note doc*

Toyota Corolla. A large Toyota car dealership offers purchasers of new Toyota cars the option to buy their used cars as part of a trade-in. The dealership then sells the used cars for a small profit. To ensure a reasonable profit the dealers need to predict the price based on the age of the car. For that reason, data were collected on all previous sales of used Toyota Corollas at the dealership. The data includes the sales price of used cars (recorded in US dollars) and the age of the cars (recorded in months). The JMP output provided below presents the results obtained by the dealer when fitting a least square regression model to the data collected for 654 cars. Use this JMP output to identify the explanatory variable and report the correlation coefficient.

*on one note doc*

Choose all that apply. In statistical inference for the population mean μ when σ is unknown, we use the t-distribution which has heavier tails than the normal distribution. This is necessary because using the sample standard deviation instead of the population standard deviation

- adds more variability to the distribution of the test statistic. - adds more uncertainty to the estimation of the population mean μ.

p-values can be positive or negative, depending on the null hypothesis and the sample mean.

False

Operating expenses in U.S. private and public colleges are funded through individual, corporation, and foundation contributions (a.k.a. donations). Much of this money is put into an endowment fund, and the college spends only the interest earned by the fund. A random sample of sixteen college endowments was drawn from the list of endowments in the Chronicle of Higher Education Almanac (Sept. 2, 1996). The endowments (in millions of dollars) were recorded and provided to users to be analyzed. Analysts calculated a confidence interval for the mean college endowment across all U.S. private and public colleges. The interval calculations were done using JMP, and are shown in the rightmost part of the output provided to you (under the heading Confidence Intervals). Using this JMP output (below), report the following quantities: sample size; the distribution used to calculate the interval (the _____ distribution); sample mean; margin of error; interval width (length); lower bound; confidence level used.

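Sample size: 16 Distribution: Student's t Sample mean: 625.79 Margin of error: 78.697 Interval width: 157.393 Lower bound: 547.090 Confidence level: 99%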

Retirement savings. Research indicates that many Americans do not save enough for retirement, on average. A group of economists at the Federal Reserve Bank of St. Louis are interested in conducting a study to examine whether Americans between the ages of 55 and 65 have saved too little. The analysts obtain a random sample of Americans in this age group and proceed to test if there is evidence of insufficient savings from these data. The variable considered is the total amount of savings for each individual, reported in thousands of US dollars. For this question, assume that financial specialists suggest that the minimum level of retirement savings should be 1.5 million US dollars. Use the JMP output provided below to report the value of the test statistic used to gather evidence against the null hypothesis. Report your answer as a number (no symbols) and round to two decimal places. Test Mean Hypothesized Value 1500 Actual Estimate 1613.07 DF 251 Std Dev 782.626 t Test Prob > |t| 0.0226*

2.29 (unrounded 2.2935) t = (1613.07 − 1500)/(782.626/√252) *calculation on one note doc*
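
The calculation can be reproduced directly from the JMP summary values. This is a sketch that assumes the sample size is DF + 1 = 252, as in the formula above.

```python
import math

x_bar, mu_0 = 1613.07, 1500.0    # sample mean and hypothesized mean (thousands of USD)
s, df = 782.626, 251             # sample standard deviation and degrees of freedom
n = df + 1                       # 252 observations

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error
print(round(t_stat, 2))
```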

Laundry detergent use. The amount of laundry detergent that customers use has a right-skewed distribution with a mean μ = 26 milliliters (mL) and a standard deviation σ = 17 mL. Suppose we conducted a study in which we measured the laundry detergent use of 64 randomly sampled US adults. What is the value (in mL) that corresponds to the 95th percentile of the sampling distribution of the sample mean of the 64 adults? (Round your answer to 2 decimal places.)

29.5
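
Because n = 64 is large, the sampling distribution of the sample mean is approximately normal with mean 26 and standard error 17/√64. A minimal sketch of the percentile calculation using the standard library's `NormalDist`:

```python
from statistics import NormalDist

mu, sigma, n = 26.0, 17.0, 64
standard_error = sigma / n ** 0.5                 # 17 / 8 = 2.125 mL
sampling_dist = NormalDist(mu, standard_error)    # approximate, by the Central Limit Theorem
p95 = sampling_dist.inv_cdf(0.95)
print(round(p95, 2))
```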

Page Loading Speed. widgetwarehouse.com is a popular e-commerce site that processes thousands of transactions every day. Tom, product manager at the site is analyzing some historical user data that widgetwarehouse.com has collected over several months to gain insights into how fast the site loads for each of its users. Tom knows that the time it takes each user to load the site follows a normal distribution with mean μ= 341 milliseconds and standard deviation σ= 7 milliseconds. Tom doesn't have a table or calculator with statistical capabilities, but he does know about the Normal distribution. Tom wants to know more about the typical user's experience rather than any individual. He investigates the sample mean for samples of size n= 160,000, which is a typical number of visitors the site has each minute. Tom knows that the sampling distribution of the sample means also follows a normal distribution. Because the sampling distribution is normal, you can use the empirical rule (68-95-99.7 rule) to calculate approximate probabilities for sample means as well. Use this fact to find the time that is greater than 99.85% of average loading times for samples of 160,000 users. (Report your answer in milliseconds rounding to 2 decimal places.)

341.05
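
The empirical rule places 99.7% of sample means within three standard errors of μ, so 0.15% lie above μ + 3·SE and 99.85% lie below it. The arithmetic, as a short sketch:

```python
mu, sigma, n = 341.0, 7.0, 160_000
standard_error = sigma / n ** 0.5      # 7 / 400 = 0.0175 ms
cutoff = mu + 3 * standard_error       # upper end of the 99.7% band
print(round(cutoff, 2))
```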

Companies designing furniture for elementary school classrooms produce a variety of sizes for kids of different ages. Suppose the heights of kindergarten children can be described by a Normal distribution with mean 37.85 inches and standard deviation 1.82 inches. At least how tall are the tallest 8% of all kindergartners? Report your answer as a number and round to TWO decimal places.

40.42
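
The tallest 8% start at the 92nd percentile. The recorded answer matches a z-value rounded to 1.41, as one would read from a printed z-table; that rounding is an assumption about how the answer was obtained. The unrounded quantile gives a cutoff about 0.01 inch lower. A sketch showing both:

```python
from statistics import NormalDist

mu, sigma = 37.85, 1.82
exact_cutoff = NormalDist(mu, sigma).inv_cdf(0.92)   # 92nd percentile, unrounded z
table_cutoff = mu + 1.41 * sigma                     # z rounded to two decimals (assumed)
print(round(exact_cutoff, 2), round(table_cutoff, 2))
```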

A dataset has a mean of 11.21 and a variance of 40. Suppose we add the value 69 to each of the observations in the dataset. Report the standard deviation of the resulting dataset. Report your answer to 2 decimal places.

6.32
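
Adding a constant shifts every observation by the same amount, so the spread is unchanged and the standard deviation is still √40. A sketch with hypothetical observations (the five data values are invented):

```python
from statistics import pstdev

data = [3.0, 8.5, 11.0, 14.2, 19.3]      # hypothetical observations
shifted = [x + 69 for x in data]

sd_before = pstdev(data)
sd_after = pstdev(shifted)
print(sd_before, sd_after)               # the same up to floating-point rounding

answer = round(40 ** 0.5, 2)             # variance 40 from the question
print(answer)
```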

The weekly salary paid to each employee of a small company is normally distributed with a mean of $767 and a standard deviation of $92. This small company has 36 employees. What is the average weekly salary for all 36 employees such that the probability of being above this weekly average salary is only 10%? (Report your answer rounding to 2 decimal places.)

786.63
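
Here the cutoff is the 90th percentile of the sampling distribution of the mean, which has standard error 92/√36. The recorded answer matches z rounded to 1.28 (assumed to come from a z-table); the unrounded quantile is about two cents higher.

```python
from statistics import NormalDist

mu, sigma, n = 767.0, 92.0, 36
standard_error = sigma / n ** 0.5                             # 92 / 6
exact_cutoff = NormalDist(mu, standard_error).inv_cdf(0.90)   # unrounded z
table_cutoff = mu + 1.28 * standard_error                     # z rounded to two decimals (assumed)
print(round(exact_cutoff, 2), round(table_cutoff, 2))
```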

A statistical hypothesis is...

A claim about a parameter of a population

A test in which the null hypothesis allows any value of a parameter larger than a specified value is...

A one-sided test

For a Student t-distribution with 100 degrees of freedom we have that P(t>5) is (choose the correct answer from those provided below):

Approximately 0

For a Student t-distribution with 30 degrees of freedom we have that P(t<-5) is (choose the correct answer from those provided below):

Approximately 0

For a Student t-distribution with 10 degrees of freedom we have that P(t<5) is (choose the correct answer from those provided below):

Approximately 1

Quality assurance. In the food industry, quality assurance is an important business practice. One manufacturer produces packages of potato chips that are advertised to contain 16 ounces of product. The actual weight of contents in a package has a mean of 16.5 ounces and a population standard deviation of 0.6 ounces. The mean amount filled, μ, is set high so that under-filling will not occur frequently. What is the sampling distribution of the sample mean for samples of size 35?

Approximately Normal

Residuals should never be negative.

False (Residual=Observed Value−Predicted Value)

Data regarding quarterly sales (in thousands of Dollars) were collected for a random sample of n=10 Armand's restaurants located near college campuses. The size of the student population(in thousands) was thought to be related to the sales of the Pizza Franchises. Would you use this model to predict Armand's quarterly sales in a town with a student population of 50,000 students?

No

A college student wishes to study if herbal tea can improve the health of nursing home patients. She makes weekly visits to a local nursing home, visiting and talking with the residents, and serving them herbal tea. After six months, the residents who drank tea on more occasions were observed to have fewer days of illness and more cheerful attitudes. Identify the explanatory variable from the choices given above. Also, indicate what your findings would be for the correlation.

Explanatory variable: number of weeks drinking herbal tea Correlation: Negative

Suppose you were to collect data for blood alcohol level and reaction time (in minutes). You want to make a scatterplot but need to determine what should be the proper x and y axes. Which variable should you use as the explanatory variable? Furthermore, would you expect a positive, negative, or no linear association between the two variables?

Explanatory variable: Blood alcohol level Correlation: Positive

A negative correlation between the response variable y and the explanatory variable x indicates that large values of x are associated with small values of y.

True

A p-value for a two-sided test of hypotheses is always double the p-value for a one-sided test.

False

A p-value is only meaningful for a two-sided test of hypotheses.

False

A scatterplot shows the relationship between two qualitative variables.

False

Correlation implies causation as long as we can ensure that the collected data are a random sample from the population.

False

Correlation near ±1 always implies a cause and effect relationship between both variables

False

Daniel is studying to be a string music teacher. As part of his student teaching assignment, he has to collect and analyze data relating the length of musical education to talent. He is working with middle schoolers, so he asks his 100 students to provide data on the following two variables: Variable 1: The number of years each student has played their instrument. Variable 2: Musical talent index. This is a measurement provided by a group of independent listeners/judges. A value closer to 1 indicates a beginner and a value closer to 100 indicates a master musician. Although Daniel took an intro level statistics class, he didn't pay close attention during class. After studying the relationships between these variables, Daniel reported the following statement in his student teaching report. Is the statement correct? "The correlation between variable 1 and variable 2 is .89. This implies that more years playing an instrument causes an increase in the talent of the musician."

False

Data have been published which indicates that the more children a couple has, the less likely the couple is to get a divorce. Therefore we can conclude that having fewer or no children causes couples to divorce.

False

Deleting outliers from a data set is considered good statistical practice.

False

For regression, statisticians want to minimize the distance between the observations (data) and the fit line in both the horizontal and vertical directions.

False

If we are only given the value of r² and the least squares prediction line we cannot tell whether the relationship is positive or negative without looking at a scatterplot.

False

Influential observations in a regression analysis always have large residuals.

False

TRUE or FALSE: Statistical significance always implies practical significance.

False

TRUE or FALSE: The best way to get a representative sample is to handpick the sample carefully.

False

TRUE or FALSE: Keeping the level of confidence, the sample mean and the standard deviation the same, a confidence interval for the population mean becomes wider as the sample size increases.

False

TRUE or FALSE: The larger the p-value the stronger the evidence against the null hypothesis.

False

The correlation coefficient between two quantitative variables is zero. This implies that there is no association between the two variables.

False

The p-value for a statistical test of hypotheses is the same as the confidence level for a confidence interval.

False

The standard error of the sample mean increases as the sample size increases.

False

The value of the correlation contains only information about the strength of the linear relationship but not the direction.

False

Unlike the standard deviation or the mean, the correlation is not influenced by outliers.

False

University of Louisville researchers examined the process of filling plastic pouches of dry blended biscuit mix (Quality Engineering, Vol. 91, 1996). The current fill mean of the process is set at μ = 407 grams. Operators monitor the process by randomly sampling 36 pouches each day and measuring the amount of biscuit mix in each. Suppose that on one particular day, two operators, Finny and Quinn, observe a sample mean of 400 grams and a standard deviation of 10.1 grams. Operator Finny believes that this indicates that the true process fill mean μ is off-target, i.e. the true process mean is actually different from 407 grams. Operator Quinn argues that μ = 407, and the small value of x̄ observed is due to random variation in the fill process. Which operator do you agree with? (Hint: think of this in the context of hypothesis testing).

Finny is correct
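
The reasoning behind this answer can be sketched as a one-sample t-test of H0: μ = 407 against a two-sided alternative, using only the numbers in the question:

```python
import math

x_bar, mu_0 = 400.0, 407.0     # observed sample mean and target fill mean (grams)
s, n = 10.1, 36                # sample standard deviation and sample size

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error
print(round(t_stat, 2))
```

With 35 degrees of freedom the two-sided 5% critical value is about 2.03, so a statistic more than four standard errors below zero is very unlikely to arise from random variation alone.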

A retailer maintains a Web site that it uses to attract shoppers. The average purchase amount is $80. The retailer is evaluating a new Web site that would, it hopes, encourage shoppers to spend more. Select the appropriate alternative hypothesis from those provided below, or indicate that none of the options are correct.

Ha: The true mean purchase of all shoppers using the new Web site is greater than $80.

A negative correlation coefficient between the response variable y and the explanatory variable x indicates that

Large values of x are associated with small values of y (smaller(larger) values of the explanatory variable are associated with larger(smaller) values of the response variable.)

Which of the following are examples of people mistaking correlation for causation? Choose all correct answers.

- More books at home make you better at reading - Wearing a lucky shirt to a game helps the team win - Diet coke drinking leads to weight gain

Baseball Game Length. Baseball has long been America's pastime, but the appeal of the game seems to be fading away. One of the reasons is that the length of games seems to be growing over time. In 2017, the average length of Major League Baseball games was 185 minutes with a standard deviation of 24 minutes. The distribution of length of Major League Baseball games is known to be skewed. Can you find the probability that a randomly chosen game had a length of less than 3 hours (180 minutes)?

No

White Sharks. White sharks are known to be a migratory species of shark, meaning they do not tend to remain in the same location year-round. However, there are usually a couple of white sharks together around the same location for some amount of time. Suppose that the amount of time they spend in Southern California follows a normal distribution with mean μ = 62 days and standard deviation σ = 5.4 days. Suppose we obtained the time spent in Southern California for 9 randomly sampled white sharks. The corresponding sampling distribution of the sample mean, i.e. the distribution of all sample means based on all samples of size 9 from all white sharks, has the following attributes

Normal

Game of Thrones. Last July, a new episode of the hit HBO series "Game of Thrones" (GOT) was released every Sunday at 8 PM CST for 7 weeks. Many fans of the show started streaming the new episodes soon after 8 PM through the streaming service HBO Now or HBO Go. However, other fans waited longer to start watching the new episodes for various reasons, such as having to wait for their kids to go to bed or their parents to stop hogging the TV with reruns of The Golden Girls. Overall, pretend that the distribution of the minutes that a typical GOT fan waits (after 8 PM) to stream a new episode is right skewed with mean μ = 30.8 minutes and standard deviation σ = 20.4 minutes. Suppose HBO could provide us with the times it took 9 randomly sampled GOT watchers to start watching the 6th episode of last season: "Beyond the Wall." The corresponding sampling distribution of the sample mean, i.e. the distribution of all sample means based on all samples of size 9 from all GOT watchers, has the following shape:

Not Normal

Laundry detergent use. The amount of laundry detergent that customers use has a right-skewed distribution with a mean of 30 milliliters (mL) and a standard deviation of 25 mL. Suppose we conducted a study in which we measured the laundry detergent use of 4 randomly sampled US adults. The corresponding sampling distribution of the sample mean, i.e.the distribution of all sample means based on all samples of size 4 from all US adults, has the following distribution

Not Normal

A random sample of 100 salaries from the Des Moines insurance sector is recorded and found to have a median of $110,000 and a mean of $150,000. Based on this information, a frequency distribution of these 100 salaries is most likely:

Not enough information is provided to establish the mode of the distribution

Suppose you were to collect data for grip strength of adults as well as their age. You want to make a scatterplot but need to determine what should be the proper x and y axes. The results of the study found that younger adults tended to have slightly higher grip strength. Which variable would you use as the response variable? Furthermore, would you expect a positive, negative, or no linear association between the two variables?

Response: Grip strength Correlation: Slight Negative

Emergency Room Waiting Time. The administrator at the Iowa Health Department claims that on weekends the average wait time for emergency room visits in Iowa is 10 minutes. Based on discussions you have had with friends who have complained on how long they waited to be seen in the ER over a weekend in your local hospital, you got interested to know whether the average wait time in your local hospital is actually different than the state average. Over the course of a few weekends you record the wait time for 40 randomly selected patients in your local hospital. Relevant JMP output for conducting a test of hypothesis to investigate whether the waiting room is indeed 10 minutes is provided below. Summary Statistics Mean 12.9 Std Dev 6.5272604 Std Err Mean 1.0320505 Upper 95% Mean 14.987519 Lower 95% Mean 10.812481 N 40 t Test Test Statistic 2.8099 Prob > t 0.0039* Prob < t 0.9961 Answer the following questions by selecting the correct answer from those provided for each situation:

The value of the population mean under the null hypothesis equals: 10 The number of degrees of freedom for the sampling distribution of the t-ratio in this problem equals: 39 The p-value associated with this test equals: .0078
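
These answers can be checked against the JMP summary. The alternative is two-sided ("different than"), so the one-sided tail probability reported by JMP is doubled. A minimal sketch using the values printed in the output:

```python
mean_wait, mu_0 = 12.9, 10.0     # sample mean and hypothesized mean (minutes)
std_err, n = 1.0320505, 40       # "Std Err Mean" and N from the JMP output

df = n - 1
t_stat = (mean_wait - mu_0) / std_err
one_sided_p = 0.0039             # "Prob > t" from the JMP output
two_sided_p = 2 * one_sided_p
print(df, round(t_stat, 4), round(two_sided_p, 4))
```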

Skinny jeans. A clothing company makes skinny jeans and has a design specification for the elasticity of these jeans set to 57 Pascals (Pa). They would like to know if the mean elasticity is below this specification. The figure below presents relevant JMP output for conducting a test of hypothesis to investigate this question. Answer the questions below using this output. Summary Statistics Mean 55.533796 Std Dev 4.5528105 Std Err Mean 1.0180394 Upper 95% Mean 57.664577 Lower 95% Mean 53.403015 N 20 t Test Test Statistic Prob > t 0.9170 Answer the following questions by filling in the blanks for each situation. Report each answer as a number and round to two (2) decimal places. Please note that answers that do not follow this exact specification will not receive any credit.

The value of the population mean under the null hypothesis equals: 57 The number of degrees of freedom for the sampling distribution of the t-ratio in this problem equals: 19 The p-value associated with this test equals: .0830
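
The alternative here is one-sided (mean below 57), so the p-value is the lower-tail probability, which is the complement of the "Prob > t" value in the output. A sketch that also recovers the test statistic left blank in the JMP output:

```python
mean_elasticity, mu_0 = 55.533796, 57.0    # sample mean and specification (Pa)
std_err, n = 1.0180394, 20                 # "Std Err Mean" and N from the JMP output

df = n - 1
t_stat = (mean_elasticity - mu_0) / std_err
prob_greater = 0.9170                      # "Prob > t" from the JMP output
p_value = 1 - prob_greater                 # "Prob < t", the lower tail
print(df, round(t_stat, 2), round(p_value, 4))
```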

Soccer Referees. A manager for a well-known soccer club in England has complained that his players are injured more often because the match referees are allowing opposing players to commit fouls without being penalized. The manager for a rival club claims the opposite, that more fouls are being assessed than necessary due to players simulating injuries. We know that in 2016-17, there was an average of 21.5 fouls per match across all matches played. To see which manager is correct, you are asked to test the claim that referees called fewer fouls, on average, during the entire 2017-18 English Premier League (soccer) season than the previous season. The JMP output presented in the figure below presents partial JMP output from the statistical analysis conducting the test of hypotheses. Answer the following questions using this output. Please report the values exactly as they appear in JMP so you get credit for your correct answers. If you need to compute something, round to 3 decimal places.

The value of the population mean under the null hypothesis: 21.5 The value of the test statistic: -2.2811 The correct p-value for the claim that referees called fewer fouls, on average (see the problem for the complete claim): .0116

A third variable that is not originally included in a study but may help to explain relationships between other variables is called a lurking variable.

True

Correlation does not imply causation even when the correlation between two variables is very high.

True

The null hypothesis (H0)...

is the default belief that we accept in the absence of data.

