Module 6 Review Test: Statistics


A social media app wants to study the relationship between the total shares of an item that is shared on the app ( y ) given the number of friends connected to by the original sharer ( x ). They calculate the linear regression equation for data they collected to be y=25181.5x−762,920.7 , where the number of friends ( x ) ranges from 0 to 250 . What is the predicted total number of shares for an item that was originally shared by a user with 100 friends on the app? Round to the nearest integer. a) 1,755,229 total shares b) 762,921 total shares c) 2,105,716 total shares d) 3,281,071 total shares

The answer is a. To calculate the predicted total number of shares when the original sharer has 100 friends on the app, substitute x=100 into the regression equation and solve for y: y=25181.5(100)−762,920.7, so y=1,755,229.3, which rounds to 1,755,229 total shares.
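
The substitution can be checked with a short Python sketch. The slope and intercept come from the question; the function name `predict_shares` is only an illustrative choice.

```python
# Regression equation from the question: y = 25181.5x - 762,920.7
SLOPE = 25181.5
INTERCEPT = -762920.7


def predict_shares(friends: float) -> float:
    """Return the predicted total shares for a sharer with `friends` friends."""
    return SLOPE * friends + INTERCEPT


predicted = predict_shares(100)
print(round(predicted))  # rounded to the nearest integer, as the question asks
```

The value x=100 lies inside the observed range of 0 to 250 friends, so this is a prediction within the range of the data.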

What correlation coefficient represents a perfectly linear positive correlation? a) −1 b) 0 c) 1

The answer is c. A perfectly linear positive correlation has a correlation coefficient of 1 .

What technique is used to estimate the BMI for a female aged 35 , if a line of best fit is created to estimate BMI for females aged 20−30 ?

The answer is c. Extrapolation is used to estimate an unknown value that is located outside of the observed values.

The b in y=mx+b is what? a) The point at which the least squares line crosses the y-axis b) The value of y when x=0. c) Neither A nor B d) Both A and B

The b in y=mx+b is the point at which the least squares line crosses the y -axis and the value of y when x=0 .

In the study of the effects of carbon emissions on global warming, the quantity of carbon released in the atmosphere is an explanatory variable for changes in global temperature (response variable). True or False?

The correct answer is a. This is a true statement. The quantity of carbon released into the atmosphere is an explanatory variable for changes in global temperatures (response variable).

Researchers looked at the relationship between Fat-Free Mass and Total Daily Energy intake, to better understand the metabolics of weight gain, appetite, and exercise. The scatterplot shows the result of their research, the simple linear regression line, y=mx+b , that best fit their data. Based on the graph alone, what would you estimate the EI to be of an individual with a Fat-Free Mass of 60 kg? a) 11 b) 9 c) 7 d) Cannot be determined.

The correct answer is b. First, find the place on the x -axis where Fat-Free Mass equals 60 kg and slide your finger up to where it meets the regression line. Then slide your finger across horizontally until you reach the y -axis. You should get a reading of approximately 9 EI.

Researchers looked at the relationship between Fat-Free Mass and Total Daily Energy intake, to better understand the metabolics of weight gain, appetite, and exercise. The scatterplot shows the result of their research, the simple linear regression line, y=mx+b , that best fit their data. Based on the graph what would you estimate b , the y -intercept, to be? a) 8 b) 0 c) 12 d) 6

The answer is d. From 40 to 80 on the x-axis, the line rises a little more than 2 units on the y-axis, so extending it back to x=0 puts the crossing of the y-axis at approximately 6.

Using the scatterplot Body weight vs. Systolic blood pressure, what would be a good estimate of systolic blood pressure for an individual who weighs 250 pounds? a) Around 165 b) Around 190 c) Around 135 d) Around 100

The answer is a. First, find the place on the x-axis where body weight is 250. Then slide your finger up to where it meets the regression line. Next, slide your finger across horizontally until you reach the y-axis. You should get a systolic blood pressure reading of approximately 165.

Consider the following equation y=1−2.6x . What is the slope of this line? a) −1 b) 1 c) 2.6 d) −2.6

The answer is d. The slope-intercept equation in simple linear regression is y=mx+b. The equation y=1−2.6x can be rewritten as y=−2.6x+1. In this equation, m represents the slope of the line. Therefore, the slope of the line with equation y=−2.6x+1 is −2.6.

The scatterplot below was passed onto an RN who was helping a patient manage metabolic issues. The patient was concerned about finding an appropriate total daily energy intake (EI) with an FFM of 32 (kg). Based on the data, what should she suggest to the patient? a) There is too much scatter to draw a conclusion b) The RN should seek other sources to provide advice based on the data. c) Plug 32 into x and recommend y . d) None of the above

The correct answer is b. The data do not cover FFM values below 40 kg, so the research might not apply to this patient's case. Proceed carefully.

A lurking variable is a variable that identifies the true relationship between variables. True or False?

The answer is b. This is a false statement. A lurking variable can hide the true relationship between variables. It affects the analysis because it is not included as either an explanatory variable or a response variable.

A pattern or relationship between two variables is called a(n): a) scatterplot b) response variable c) association d) causation

The answer is c. A pattern or relationship between two variables is called an association.

How will the removal of the outlier affect the relationship between x and y? a) It will strengthen the negative relationship. b) It will weaken the negative relationship. c) It will strengthen the positive relationship. d) It will weaken the positive relationship.

The answer is c. The dots move up from left to right (positive relationship). Removing the outlier will make the rest of the dots relatively closer to the imaginary line running through them, which means a stronger relationship.
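
A small Python sketch can illustrate the idea. The data points below are hypothetical (the original scatterplot is not available), and `pearson_r` is a helper written here from the definition of the correlation coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical points that rise from left to right...
xs = [1, 2, 3, 4, 5]
ys = [1.1, 1.9, 3.2, 3.8, 5.0]
# ...plus one hypothetical outlier far below the trend.
outlier_x, outlier_y = 6, 0.0

r_with = pearson_r(xs + [outlier_x], ys + [outlier_y])
r_without = pearson_r(xs, ys)
print(f"with outlier: {r_with:.3f}, without outlier: {r_without:.3f}")
```

In this made-up data set, the correlation is positive in both cases but much closer to 1 once the outlier is removed, which is what "strengthening the positive relationship" means.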

A researcher took a sample of 10 years and found that the average annual profits (in millions of dollars, y ) of insurance companies in the United States can be predicted by the equation y=342.6−2.10x where x is the number of major calamities (such as tornadoes, hurricanes, earthquakes, floods, etc.) that occurred during that year. The number of calamities recorded ranged from 10 to 35 . A randomly selected year had 24 major calamities. What are the expected profits of US insurance companies for that year? a) $342.6 million b) $2.10 million c) $292.2 million d) $393 million

The answer is c. To find the expected profits when there are 24 calamities, substitute x=24 into the regression equation and solve for y: y=342.6−2.10(24), so y=292.2. Since y is measured in millions of dollars, the expected profit is $292.2 million.

A company wants to study the relationship between the price of a widget and the demand for that widget. They collect data on the demand corresponding to different prices and come up with the regression equation y=−6.01x+82.18 where x is the price of the widget and y is the demand. Data was collected for prices between $ 2 to $ 13 per widget. What is the predicted demand for a widget priced at $ 6 ? Round to the nearest integer. a) 82 widgets b) 118 widgets c) 52 widgets d) 46 widgets

The answer is d. Use the regression equation and substitute x=6 and solve for y . y=−6.01(6)+82.18 , so y=46.12 , which rounds to 46 widgets.

A lurking variable can affect the explanatory variable or response variable. True or False?

The correct answer is a. A lurking variable can have an effect on both the explanatory and response variables.

Which of the following best describes a scatterplot that has a correlation coefficient of −0.3 ? a) The points loosely follow a line that is moving down and to the right. b) The points closely follow a line that is moving down and to the right. c) The points roughly move up and to the right, but do not follow that close to a linear pattern. d) None of the above.

The correct answer is a. If a scatterplot has a correlation coefficient of −0.3 , that indicates there is a weak negative correlation. Therefore, the points would roughly move down and to the right, but not follow that close to a linear pattern.

A social media app wants to study the relationship between the total shares of an item that is shared on the app ( y ) given the number of friends connected to by the original sharer ( x ). They calculate the linear regression equation for data they collected to be y=25181.5x−762,920.7 , where the number of friends ( x ) ranges from 0 to 250 . What is the correct interpretation of the slope of the regression line? a) For every increase of 1 friend for the original sharer, there is a corresponding increase in total shares of 25181.5. b) For every increase of 1 friend for the original sharer, there is a corresponding decrease in total shares of 25181.5. c) For every increase of 1 friend for the original sharer, there is a corresponding increase in total shares of 762,920.7. d) For every increase of 1 friend for the original sharer, there is a corresponding decrease in total shares of 762,920.7.

The correct answer is a. Using the regression equation, when we plug in x=0, we get y=−762,920.7, and when x=1, y=−737,739.2. So when x increases by 1, y increases by 25181.5.
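
The same reasoning can be sketched in Python: for a straight line, the difference between the predictions at x+1 and x equals the slope, whatever x is chosen. The function name `predict` is only illustrative.

```python
def predict(x: float) -> float:
    """Regression equation from the question: y = 25181.5x - 762,920.7."""
    return 25181.5 * x - 762920.7


# The change in predicted shares for one additional friend.
change = predict(1) - predict(0)
print(round(change, 1))

# The one-unit change is the same at other starting points in the 0-250 range.
for x in (0, 100, 249):
    print(x, round(predict(x + 1) - predict(x), 1))
```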

Research has proven that excessive speed results in an increase in car accidents. Excessive speed is one explanatory variable for car accidents (response variable). True or False?

The correct answer is a. This is a true statement. Excessive speed is one explanatory variable for car accidents. There are several others.

A study of men aged 20 to 40 investigated the relationship between the average number of hours of exercise per week x and average number of hours of sleep per day y . The regression line was found to be y=8.5−0.2x . Which of the following is a true statement? a) For every additional hour of weekly exercise there is a corresponding 0.2 decrease in daily hours of sleep b) We can correctly predict that a 60 year-old man who exercises 6 hours per week sleeps on average 7.3 hours per day c) For every additional hour of weekly exercise there is a corresponding 8.5 increase in daily hours of sleep. d) Increasing amount of weekly exercise causes a decrease in daily hours of sleep.

The correct answer is a. We always interpret the slope of the line of best fit as: "as x increases by one unit, y changes according to the slope." The slope is −0.2, meaning that as x (weekly hours of exercise) increases by 1 hour, y (average daily hours of sleep) decreases by 0.2 hours.

Suppose there is a linear regression equation y=10+.5x where x is equal to Body Mass Index (BMI) percentage and y is equal to blood cholesterol levels measured in mg/dL. Which of the following is the correct interpretation of this slope? a) For every Body Mass Index (BMI) percentage point increase, there is a corresponding .5 point increase in blood cholesterol levels. b) For every .5 point reduction in blood cholesterol level, there is 10 percentage point increase in Body Mass Index (BMI) percentage. c) For every 10 Body Mass Index (BMI) percentage point increase, there is a corresponding .5 point decrease in blood cholesterol levels.

The correct answer is a. When solving y=10+.5x algebraically, when x=0 , y=10 , when x=1 , y=10.5 . Therefore, for every Body Mass Index (BMI) percentage point increase, there is a corresponding .5 point increase in blood cholesterol levels.

Please select the correct definition for response variable: a) A variable whose data is used to predict the data of another variable b) A variable whose data values are predicted by the values of another variable. c) A variable is only plotted on a scatterplot. d) All of the above.

The correct answer is b. A response variable is a variable whose data values are predicted by the values of another variable.

An RN with a geriatric specialty is studying age and the incidence of hip fractures. She is examining the correlation between a person's age, in years, and their increase in risk factor of suffering a hip fracture. The following scatterplot shows this relationship: Based on this information what would be a good estimate of risk factor level for a patient who is 52 years of age? a) 1 b) 1.5 c) 2 d) 3

The correct answer is b. Based on an age of 52 , we can estimate that patient's risk factor to be approximately 1.5 .

Which of the following can help prevent Simpson's Paradox from occurring? a) Having the greatest number of subjects in the lowest performing trial. b) Having an equal number of subjects exposed to each of the treatments in each trial. c) Having the greatest number of subjects in the highest performing trial. d) Having each subject be exposed to each treatment in the trial.

The correct answer is b. Simpson's paradox is avoided by having an equal number of subjects exposed to each of the treatments in each trial.
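
A minimal Python sketch of the idea, using hypothetical counts that are not taken from any study: with unequal group sizes, treatment A can win in every trial yet lose when the trials are pooled, while equal group sizes with the same success rates remove the reversal.

```python
# Each entry is (successes, subjects) for one trial. All counts are hypothetical.
unbalanced = {
    "A": [(9, 10), (30, 100)],   # 90% and 30% success
    "B": [(80, 100), (2, 10)],   # 80% and 20% success
}
balanced = {
    "A": [(45, 50), (15, 50)],   # same rates, equal group sizes
    "B": [(40, 50), (10, 50)],
}


def pooled_rate(trials):
    """Overall success rate after combining all trials."""
    return sum(s for s, _ in trials) / sum(n for _, n in trials)


def a_wins_every_trial(data):
    """True if treatment A has the higher success rate in each trial."""
    return all(sa / na > sb / nb
               for (sa, na), (sb, nb) in zip(data["A"], data["B"]))


for name, data in (("unbalanced", unbalanced), ("balanced", balanced)):
    print(name, a_wins_every_trial(data),
          round(pooled_rate(data["A"]), 3), round(pooled_rate(data["B"]), 3))
```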

Using the scatterplot Total Cholesterol vs. BMI, what would the total cholesterol level be for an individual with a BMI of 21 ? Round your answer to the nearest whole number. a) 154 b) 155 c) 153 d) 156

The correct answer is b. Solving algebraically using the linear regression equation y=6.4957x+18.514, we can substitute 21 for x to obtain y: y=6.4957(21)+18.514, so y=154.9237. Therefore, an individual with a BMI of 21 will have a cholesterol level around 155.

Consider the following equation y=7.5x+9.3 . What is the slope of a line with this equation? a) 9.3 b) 7.5 c) −9.3 d) −7.5

The correct answer is b. The slope intercept equation in simple linear regression is y=mx+b . In this equation, m represents the slope of the line. Therefore, the slope of the line with equation y=7.5x+9.3 is 7.5 .

Research has linked environment lead exposure to hypothyroidism in pregnant women. Given this statement, hypothyroidism in pregnant women is an explanatory variable for environmental lead exposure (response variable). True or False?

The correct answer is b. This is a false statement. Exposure to lead in the environment raises the likelihood of hypothyroidism; therefore lead exposure is the explanatory variable.

There is strong evidence that pollution levels and vision health problems are positively correlated. Vision health problems is an explanatory variable for pollution levels (response variable). True or False?

The correct answer is b. This is a false statement. Pollution levels will predict the rate of visual health problems. Therefore, pollution level is the explanatory variable, and vision health problems is the response variable.

Extrapolation is always appropriate. True or False?

The correct answer is b. This is a false statement. There are applications of extrapolation, and times in which it is necessary. However, it is incorrect to assume that extrapolation is always appropriate. Be mindful of the situation and try to avoid inappropriate extrapolation by considering the context.

A pharmaceutical company is testing a new respiratory therapy drug. Specifically, they are interested in determining what dosage works best for elderly patients aged 80 years and older. One researcher suggests, to save money, that they base the dosage amounts for seniors on the findings from a study that included adults aged 25−50 . That study found that there is a fairly strong, positive correlation between the age of the patient and the proper dosage level. Should the company use this study to determine the dosage for individuals 80 years and older? Yes or no? a) Yes, it is appropriate to extrapolate the value for older patients using a line of regression. b) No, this is an example of inappropriate extrapolation.

The correct answer is b. This is an inappropriate extrapolation, as the study only examined the changes in necessary dosage for people between the ages of 25 and 50. These results do not tell us what a proper dosage would be for patients aged 80 years and older.

A car company wanted to study the relationship between the weight of the car and the car's average gas mileage, so they collected data from several of their cars and wrote down their weights and average gas mileage and plotted this data on the scatterplot below. The least squares regression line for the data is y=−0.0084x+48.8 , where x is the weight of the car in pounds and y is the average gas mileage (in miles per gallon). What is the correct interpretation of the slope of the least squares regression line? a) For every increase by 1 pound in a car's weight, there will be a corresponding increase in the car's miles per gallon by 0.0084 . b) For every increase by 1 pound in a car's weight, there will be a corresponding decrease in the car's miles per gallon by 0.0084 . c) For every increase by 1 pound in a car's weight, there will be a corresponding increase in the car's miles per gallon by 48.8 . d) For every increase by 1 pound in a car's weight, there will be a corresponding decrease in the car's miles per gallon by 48.8 .

The correct answer is b. Using the regression equation, we see that when x=0 , y=48.8 and when x=1 , y=48.7916 , so the miles per gallon ( y ) decreased by 0.0084 .

Consider the following CDC data on alcohol consumption by region in the United States. The gap in the data between regions has narrowed considerably over the years. What might be needed to make better predictions of the data? a) Consideration given to lurking variables b) Additional research c) Both A and B d) None of the above

The correct answer is c. Both additional research and consideration given to lurking variables that might be present are needed in order to make better predictions of the data.

Why is it valuable to fit a line, in particular, the least squares simple regression line, to paired data? a) All values of x, the predictor variable, may not be covered in the data, and with a line, you can predict them. b) If the data is correlated, you can systematically get better predictions of y when you use y=mx+b than when basing the prediction on the mean of y alone. c) Both A & B d) None of the above

The correct answer is c. It is valuable to fit a line, in particular, the least squares simple regression line, to paired data because all values of x , the predictor variable, may not be covered in the data; and since the data is correlated, you can systematically get better predictions of y when you use y=mx+b than when basing the prediction on the mean of y alone.

Please select the correct definition for least squares. a) A method that minimizes the squared distances of data points from a line that captures the trend in paired data. b) The criteria by which the best-fit line is selected for paired data c) Both A & B d) None of the above

The correct answer is c. Least squares is the criteria by which the best-fit line is selected for paired data. It is a method that minimizes the squared distances of data points from a line that captures the trend in paired data.
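
As a rough illustration, the sketch below fits a least squares line to a small hypothetical data set using the standard closed-form formulas, then checks that nudging the slope or intercept makes the sum of squared vertical distances larger. The data and function names are assumptions made for this example.

```python
# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least squares estimates for y = m*x + b.
m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - m * mean_x


def sse(slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


print(f"m={m:.2f}, b={b:.2f}, SSE={sse(m, b):.2f}")
print(f"SSE with a steeper slope: {sse(m + 0.1, b):.2f}")
print(f"SSE with a higher intercept: {sse(m, b + 0.1):.2f}")
```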

Physicians looked at the relationship between prescribing opioids and addiction to heroin. What would you estimate r , the correlation coefficient, to be? a) Between −.25 and −.75 b) Between −.25 and −.50 c) Between +.25 to +.75 d) Lower than −.75 or higher than +.75

The correct answer is c. The correlation in this graph is positive, and a good estimation is that the correlation falls somewhere between +.25 to +.75

Due to budget cuts, a scientist had her funding reduced for a research project. Her support only enabled her to collect 2 data points, (2,3.26) and (4,6.52) . The data points are plotted below. What is your estimate of the correlation coefficient? a) −1 b) Zero c) 1 d) Cannot be determined

The correct answer is c. The correlation indicated by these two data points would be +1 , though no conclusions should be drawn as more data would need to be collected.
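
This can be verified with a short Python sketch that computes the correlation coefficient from its definition for the two points in the question. The helper name `pearson_r` is an illustrative choice.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# The two data points from the question: (2, 3.26) and (4, 6.52).
r = pearson_r([2, 4], [3.26, 6.52])
print(round(r, 6))
```

Any two points with different x values and different y values lie exactly on a line, so the coefficient is always +1 or −1 in that case, which is why two points alone support no real conclusion.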

A middle school principal wants to compare the effectiveness of two different methods for teaching spelling. She selects two teams of students with similar academic abilities. Four teaching sessions are held. After each session, the two teams compete in a spelling match. Spelling Match Scores by Team: Which conclusion can be made? a) Team A performed better. b) Team B performed better. c) There is not a significant difference in performance between the teams. d) There is a significant difference in performance between the teams.

The correct answer is c. The winning team alternates for each match and the difference in scores is roughly the same for matches 1 and 4, and for matches 2 and 3. Therefore, we cannot say that one team outperformed the other.

Which of the following is the best estimate of the correlation coefficient for the above scatterplot? a) .90 b) .5 c) −.95 d) −.3

The correct answer is c. This scatterplot illustrates a strong negative relationship, indicating a correlation coefficient close to −1 . Therefore, −.95 is a good estimate.

A car company wanted to study the relationship between the weight of the car and the car's average gas mileage, so they collected data from several of their cars and wrote down their weights and average gas mileage and plotted this data on the scatterplot below. The least squares regression line for the data is y=−0.0084x+48.8 , where x is the weight of the car in pounds and y is the average gas mileage (in miles per gallon). What is the predicted gas mileage for a car that weighs 2500 pounds? a) 29.18 miles per gallon b) 26.4 miles per gallon c) 27.8 miles per gallon d) 69.8 miles per gallon

The correct answer is c. To find the predicted gas mileage for a car that weighs 2500 pounds, substitute x=2500 into the regression equation and solve for y . y=−0.0084(2500)+48.8 , so y=27.8 .

What can be inferred from the data in the scatterplot Body Weight vs. Systolic Blood Pressure? a) There is a causal relationship between body weight and systolic blood pressure. b) There is a negative association between body weight and systolic blood pressure. c) There is no association between body weight and systolic blood pressure. d) There is a positive association between body weight and systolic blood pressure.

The correct answer is d. From the scatterplot, it can be determined that there is a positive association between body weight and systolic blood pressure. It cannot be determined from this scatterplot if this is a causal relationship.

Based on the results shown, what do you think the relationship is between age and depression? a) It is positively correlated b) It is negatively correlated c) No correlation d) Can't tell from this display

The correct answer is d. We cannot determine the relationship between age and depression because the display does not show how the data pair up. To assess correlation, we need to see which age value goes with which depression value, so none of the other answer choices can be supported.

Which of the following is a primary takeaway message from the lesson on Simpson's paradox? a) Do not use samples of different sizes. b) Everything about research and sampling is tricky c) Simpson's paradox is one of the main ways in which statistical methods confuse results. d) All of the above

The issues raised from Simpson's Paradox stem from a design flaw, and this can be avoided by making sure the sample is broken into test groups of equal size.

The average gestation period, or time of pregnancy, of an animal is closely related to its longevity (the length of its lifespan). Data on the average gestation period and longevity (in captivity) of 40 different species of animals have been examined, with the purpose of examining how the gestation period of an animal is related to (or can be predicted from) its longevity. How will the removal of the outlier affect the relationship between longevity and gestation? a) It will weaken a negative relationship. b) It will strengthen a positive relationship. c) It will strengthen a neutral relationship. d) It will weaken a positive relationship.

The answer is d. The presence of the outlier gives the illusion of a stronger positive relationship than actually exists once the outlier is removed. Removing the outlier will leave the rest of the dots relatively further from the imaginary line running through them, which means a weaker relationship.

A study was conducted in the United States between the years 1999 and 2006 to compare the prevalence of sensory impairment between two different age groups: people aged 70 to 79 and people aged 80 or above. Which conclusion can be made? a) The prevalence of sensory impairment is less among people aged 80 or above. b) The prevalence of sensory impairment is greater among people aged 70 to 79. c) There is a significant difference between the two age groups. d) There is no significant difference between the two age groups.

The prevalence of sensory impairment is greater for people aged 80 and above in every category.

Which of the following best describes the relationship between the variables in the scatterplot below. a) As the values on the x -axis increase, there is a decrease in the values on the y -axis. b) As the values on the x -axis increase, there is a slight increase in the values on the y -axis. c) As the values on the x -axis increase, there is an increase in the values on the y -axis. d) As the values on the x -axis increase, there is a slight decrease in the values on the y -axis.

The relationship in this scatterplot indicates that an increase in one variable is associated with a decrease in the other variable, absent other variables.

Suppose there is a linear equation y=10x+2 where x is equal to hours of weight training per day and y is equal to the percentage increase in BMI, what is the correct interpretation of this slope? a) For every 10 hours of weight training, there is a corresponding 2 percent increase in BMI. b) For every 2 hours of weight training, there is a 10 percent increase in BMI. c) For every hour of weight training, there is a 10 percent increase in BMI. d) For every hour of weight training, there is a 2 percent increase in BMI.

The answer is c. Solving y=10x+2 algebraically, when x=0, y=2, and when x=1, y=12. Therefore, for every hour of weight training, there is a 10 percent increase in BMI.

A study was conducted in Korea to investigate the relationship between vitamin C intake and osteoporosis in adults aged 50 and over. The subjects were grouped according to physical activity level. Within the low physical activity group it was found that subjects with higher vitamin C intake levels tended to have lower risk of osteoporosis. Which type of relationship is demonstrated? a) Positive causation b) Negative association c) Positive association d) Negative causation

The answer is b. A higher intake of vitamin C is associated with a lower risk of osteoporosis. When an increase in one variable is associated with a decrease in the other, this is a negative association. The relationship is not a causation as there can be lurking variables such as overall diet and other mineral levels.

Consider the scatterplot above and then answer the following question: True or false? Treating a large number of patients enables a hospital to have lower mortality rates.

This is a false statement. The scatterplot shows a relationship between the number of patients treated and mortality rate. It cannot be stated, however, that treating more patients causes a hospital to have a lower mortality rate.

Using a line of best fit in slope-intercept form, y=mx+b , what must be true if there is a positive correlation?

The correct answer is c: m must be greater than 0. When using a line of best fit in slope-intercept form, y=mx+b, if there is a positive correlation, m will always be greater than zero.

A researcher collected data that described a linear relationship of blood pressure in men aged 50−55 who have had triple bypass surgery. A cardiac nurse is using this research to predict the blood pressure of her 53 -year-old male patient who is in recovery from triple bypass. Her prediction is a ________________? a) Linear extrapolation b) Inappropriate extrapolation c) Linear interpolation d) None of the above

The correct answer is c, linear interpolation. This is a linear interpolation because 53 years of age is inside the range of the known data points, 50 to 55 years of age.
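
The distinction can be expressed as a simple range check. The sketch below uses a hypothetical helper, `classify_prediction`, and applies it to age ranges that appear in this set.

```python
def classify_prediction(x, observed_min, observed_max):
    """Label a prediction by whether x falls inside the observed x range."""
    if observed_min <= x <= observed_max:
        return "interpolation"
    return "extrapolation"


# 53-year-old patient, data collected on men aged 50-55.
print(classify_prediction(53, 50, 55))
# Patient aged 80, study covered adults aged 25-50.
print(classify_prediction(80, 25, 50))
# Female aged 35, line of best fit built from ages 20-30.
print(classify_prediction(35, 20, 30))
```

The check only says whether a value lies outside the observed range. Whether a particular extrapolation is appropriate still depends on the context.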

If there is a negative correlation, using a line of best fit in slope-intercept form, m , must be less than zero. True or False? a) True b) False

The answer is a. This is a true statement. In a negative correlation, the response variable decreases as the explanatory variable increases. Therefore, the slope of the line is also negative.
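
One way to see why the signs agree: the least squares slope and the correlation coefficient share the same numerator (the sum of products of deviations), and their denominators are always positive. The sketch below checks this on a small hypothetical data set where y falls as x rises.

```python
from math import sqrt

# Hypothetical data with a downward trend.
xs = [1, 2, 3, 4]
ys = [8, 6, 5, 1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

m = sxy / sxx              # least squares slope
r = sxy / sqrt(sxx * syy)  # correlation coefficient

print(f"slope m = {m:.2f}, correlation r = {r:.3f}")
```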

Please select the correct definition for regression equation: a) An equation, based on least squares fit that offers the predicted value for y for a value of x. b) y=mx+b, where m and b are defined by the sum of the least squares criteria. c) Both A and B d) Neither A nor B

The correct answer is c. The regression equation is an equation based on least squares fit that offers the predicted value for y for a value of x. The formula is y=mx+b, where m and b are defined by the sum of the least squares criteria.

To better understand the metabolics of weight gain, appetite, and exercise researchers looked at the relationship between Fat-Free Mass and Total Daily Energy Intake. The scatter plot shows one result of their research. What trend do you see? a) As FFM increases, EI tends to increase. b) There is no linear relationship c) As Energy Intake goes up, it increases Fat-Free Mass d) All of the above

The answer is a. This scatterplot indicates a positive correlation: as FFM increases, EI tends to increase.

How will the removal of the outlier in the following graph affect the relationship between x and y ? a) It will weaken the negative relationship. b) It will strengthen the positive relationship. c) It will strengthen the negative relationship. d) It will weaken the positive relationship.

The correct answer is c. The dots move down from left to right (negative relationship). Removing the outlier will make the rest of the dots relatively closer to the imaginary line running through them, which means a stronger relationship.

