AP Statistics Test Questions


Autos parked in student and staff lots at a large university were classified by country of origin, as seen in this table.
                    DRIVER
Country of Origin   Student   Staff
American            94        81
European            38        22
Asian               66        53
What percent of the cars were foreign?

(60+119)/354 = .506 or approximately 50.6%

A school board study found a moderately strong negative association between the number of hours high school seniors worked at part-time jobs after school hours and the students' grade point averages. Hoping to improve student performance, the school board passed a resolution urging parents to limit the number of hours students be allowed to work. Discuss the school board's reasoning. DO YOU AGREE OR NOT?

(Half-credit) Yes, I do agree with the school board's reasoning, as the statement above states that they found a moderately strong negative association between the number of hours high school seniors worked at part-time jobs after school hours and the students' grade point averages. The moderately strong negative association implies that the more hours they worked after school, the lower their GPA would be, since the explanatory variable is number of hours worked after school and the response variable is the students' GPA. I believe that the school board's decision to urge parents to limit the number of hours students are allowed to work will strongly affect GPAs once the resolution takes effect. General Feedback Association does not imply causation. Only experiments can establish causation.

A company manufactures polypropylene rope in six different sizes. To assess the strength of the ropes they test TWO samples (S1 and S2) of each size to see how much force (in kilograms) the ropes will hold without breaking. The table shows the results of the tests. We want to create a model for predicting the breaking strength from the diameter of the rope.
Diameter (mm)   Strength (kg) S1   Strength (kg) S2
4               60                 76
7               157                153
10              254                262
12              334                388
15              551                529
20              938                893
AVERAGE the two samples' response variable values to create the needed response variable. Use the (logx, logy) transformation to straighten the scatterplot and state the appropriate model (equation).

sqrt(strength-hat) = 2.64 + 1.37(diameter) General Feedback The (logx, logy) power model on (4, 68) (7, 155) (10, 258) (12, 361) (15, 540) (20, 915.5) results in yhat = (6.93)(x^1.61), that is, strength of rope = (6.93)(diam^1.61)
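
The power model in the feedback can be checked with a short sketch. It assumes base-10 logs and ordinary least squares on (log x, log y); the helper name `fit_line` is made up for illustration.

```python
import math

# Diameter (mm) and the average of the two strength samples (kg).
diameters = [4, 7, 10, 12, 15, 20]
strengths = [68, 155, 258, 361, 540, 915.5]

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Straighten the data with the (log x, log y) re-expression (base 10 assumed).
log_x = [math.log10(d) for d in diameters]
log_y = [math.log10(s) for s in strengths]

power, log_coefficient = fit_line(log_x, log_y)
coefficient = 10 ** log_coefficient  # undo the log on the intercept

print(f"strength-hat = {coefficient:.2f} * diameter^{power:.2f}")
```

The slope of the straightened line becomes the exponent, and 10 raised to the intercept becomes the leading coefficient.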

What is true of a data distribution having the BULK of the data at the lower numbers? I. The distribution is skewed to the right. II. The mean is smaller than the median. III. We should summarize with mean and standard deviation.

(wrong; guessing) I only General Feedback Clearly skewed to the right

Doctors studying how the human body assimilates medication inject some patients with penicillin, and then monitor the concentration of the drug (in units/cc) in the patients' blood for several hours. The data are shown in the table.
Time Elapsed (Hours)   Concentration (Units/cc)
1                      42
2                      28
3                      19
4                      13
5                      9
6                      6
7                      4
First they tried to fit a linear model. The regression analysis is shown.
Regression of Concentration on Time Elapsed
Dependent Variable: Concentration
R-squared = 90.05%
Variable   Coeff     S.E. of Coeff
Constant   41.2857   2.1602
Time       -6        1.8766
Find the correlation (r) between time and concentration.

-0.9489 General Feedback Remember that r = sqrt(.9005), which can be either +.949 or -.949. Because the slope is negative, r = -.949.
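
As a quick check of the sign rule in the feedback, here is a minimal sketch. It only uses the two numbers from the regression output.

```python
import math

r_squared = 0.9005  # R-squared from the regression output
slope = -6          # coefficient of Time

# r has the same sign as the slope of the least-squares line.
r = math.copysign(math.sqrt(r_squared), slope)

print(round(r, 4))
```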

A study performed by a psychologist determined that a person's score on a resiliency test is linearly related to the person's score on an inventory measuring the person's overall health. The equation of the least-squares regression line is resiliency(hat) = -49 + 1.8(health). What is the residual for an individual with an overall health score of 110 and a resiliency score of 140?

-9
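
The residual is observed minus predicted. A small sketch of that arithmetic (the function name is chosen here for illustration):

```python
def predicted_resiliency(health):
    """Least-squares line from the question: resiliency-hat = -49 + 1.8(health)."""
    return -49 + 1.8 * health

observed = 140
predicted = predicted_resiliency(110)  # -49 + 198 = 149
residual = observed - predicted        # observed minus predicted

print(round(residual, 2))
```

A negative residual means the line overpredicts this person's resiliency score.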

Ch10_2d What is the answer to part d?

0.77 General Feedback yhat = 1.2(0.8)^x; yhat = 1.2(0.8)^2; yhat = 1.2(.64); yhat = .768, which rounds to .77

As a 4-H project, Billy is raising chickens. He feeds and waters them every day, and collects the eggs every other day, selling them to people in the neighborhood. He has found that each hen's nest will contain from 0 to 2 eggs. Based on past experience he estimates that there will be no eggs in 10% of the nests, one egg in 30% of the nests, and 2 eggs in the other 60%. Conduct a simulation to estimate how many nests Billy will have to visit to collect a dozen eggs.
1) Describe how you will use a random number table to conduct this simulation.
2) Give your results for three trials using the random number table below.
37542 04805 64894 74296 24805 24037 20636 10402 00822 91665
08422 68953 19645 09303 23209 02560 15953 34764 35080 33606
99019 02529 09376 70715 38311 31165 88676 74397 04436 27659
12807 99970 80157 36147 64032 36653 98951 16877 12171 76833
3) State your conclusion after performing 3 trials. Give as many details as possible.

1) You will use a random number table to conduct this simulation. This can be done by labeling 0 as no eggs, as it is represented by 10%; 1, 2, 3 as one egg, as it is represented by 30%; and 4, 5, 6, 7, 8, 9 as two eggs, as it is represented by 60%. You will go across the row to find how many eggs are represented by each digit, and since he has to collect a dozen eggs, you stop once the total reaches 12 or more. The number of nests visited is the number of digits it took to reach 12 or more. 2) In the first trial he visited 8 nests, in the second trial he visited 7 nests, and in the third trial he visited 7 nests. I performed the simulation by starting the next trial right from where I left off; however, you can also start each trial on a new row and calculate the average the same way. 3) According to the simulation, Billy will have to visit an average of about 7 nests to collect a dozen eggs. This is found by taking the average of all of the trials from the simulation. General Feedback Use digits 0 - 9. Let 0 be a nest with no eggs. Let 1, 2, 3 be nests with 1 egg. Let 4 - 9 be nests with 2 eggs. Go through the random digit table one digit at a time and record how many eggs came from each nest. STOP at 12 eggs and count the nests. Repeat 3 times and average: should be approximately 7 nests needed.
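
The digit assignment described above can be traced with a short sketch. It assumes only the first row of the table is used and that each trial continues where the previous one stopped, as in the answer; the function names are made up for illustration.

```python
# First row of the random number table from the question, spaces removed.
ROW = "".join("37542 04805 64894 74296 24805 24037 20636 10402 00822 91665".split())

def eggs_in_nest(digit):
    """Map one random digit to an egg count: 0 -> 0 eggs, 1-3 -> 1 egg, 4-9 -> 2 eggs."""
    if digit == 0:
        return 0
    return 1 if digit <= 3 else 2

def run_trials(digits, trials=3, target=12):
    """Read digits left to right; each trial starts where the previous one stopped."""
    results = []
    position = 0
    for _ in range(trials):
        eggs = 0
        nests = 0
        while eggs < target:
            eggs += eggs_in_nest(int(digits[position]))
            position += 1
            nests += 1
        results.append(nests)
    return results

nests_per_trial = run_trials(ROW)
average_nests = sum(nests_per_trial) / len(nests_per_trial)

print(nests_per_trial, round(average_nests, 2))
```

Three trials are far too few for a stable estimate; the point is only to show the bookkeeping for each trial.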

Ch 11_4 Many kinds of low tech board games people play rely on randomness. (These are not video games.) Cite three (3) different methods commonly used in the attempt to achieve this randomness, and discuss the effectiveness of each.

1. Rolling one die or two dice: If the dice are fair, each face from 1 to 6 should be equally likely. 2. Spinning a spinner: Each outcome should be equally likely; however, due to factors such as friction and the way the spinner was made, it may land on some sections more often than others. 3. Cards: If the cards are shuffled properly and fairly, each card will be about equally likely. One way to shuffle the cards properly is to riffle shuffle 7 times. General Feedback one or two dice, spinners, deck of playing cards

Doctors studying how the human body assimilates medication inject some patients with penicillin, and then monitor the concentration of the drug (in units/cc) in the patients' blood for several hours. The data are shown in the table.
Time Elapsed (Hours)   Concentration (Units/cc)
1                      42
2                      28
3                      19
4                      13
5                      9
6                      6
7                      4
First they tried to fit a LINEAR model. The regression analysis is shown.
Regression of Concentration on Time Elapsed
Dependent Variable: Concentration
R-squared = 90.05%
Variable   Coeff     S.E. of Coeff
Constant   41.2857   2.1602
Time       -6        1.8766
Not being satisfied with those results, the researchers try a new model, using the re-expression log(Concentration). Examine the regression analysis below.
Dependent Variable: log(Concentration)
R-squared = 99.98%
Variable   Coefficient   S.E. of Coefficient
Constant   1.7891        0.045
Time       -0.1688       0.010
Is this model better than the original linear model? Why or why not? Using the new nonlinear model, estimate the concentration of penicillin after 4 hours. Remember that the transformation used is (x, logy).

13 units/cc
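
A minimal sketch of the back-transformation, assuming the log in the output is base 10 (which is consistent with the data, since 10^1.7891 is close to the early concentrations). The linear model is included for comparison with the observed value of 13 at 4 hours.

```python
def linear_model(hours):
    """Original model: concentration-hat = 41.2857 - 6(hours)."""
    return 41.2857 - 6 * hours

def log_model(hours):
    """Re-expressed model: log10(concentration-hat) = 1.7891 - 0.1688(hours)."""
    log_concentration = 1.7891 - 0.1688 * hours
    return 10 ** log_concentration  # undo the base-10 log (assumed base)

observed_at_4_hours = 13

print(round(linear_model(4), 4), round(log_model(4), 1), observed_at_4_hours)
```

The re-expressed model lands almost exactly on the observed concentration, while the linear model overshoots by more than 4 units/cc.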

Chapter 3 Exercises #15d What percent of the white graduates are planning to attend a 2-year college?

13.4% General Feedback total white students: 268; planning on 2 yr college: 36; 36/268 = .134 or 13.4%

Ch8_1c What is the value of ybar in this question?

152 General Feedback 152 from yhat = 200 - 4x, or equivalently ybar = 200 - 4(xbar), and knowing xbar = 12 we substitute to get ybar = 200 - 4(12) or 200 - 48 = 152

Doctors studying how the human body assimilates medication inject some patients with penicillin, and then monitor the concentration of the drug (in units/cc) in the patients' blood for several hours. The data are shown in the table.
Time Elapsed (Hours)   Concentration (Units/cc)
1                      42
2                      28
3                      19
4                      13
5                      9
6                      6
7                      4
First they tried to fit a linear model. The regression analysis is shown.
Regression of Concentration on Time Elapsed
Dependent Variable: Concentration
R-squared = 90.05%
Variable   Coeff     S.E. of Coeff
Constant   41.2857   2.1602
Time       -6        1.8766
Using the LINEAR model suggested in the summary, estimate what the concentration of penicillin will be after 4 hours.

17.2857 Units/cc General Feedback Use the model suggested by the computer output (Constant 41.2857, Time -6) and substitute hours = 4. The equation model would be predicted concentration of penicillin = 41.2857 - 6(hours elapsed), so 41.2857 - 6(4) = 17.2857

An airline records data on several variables for each of its flights: model of plane, amount of fuel used, time in flight, number of passengers, and whether the flight arrived on time. The number and type of variables recorded are a. 2 categorical, 3 quantitative (2 discrete and 1 continuous) b. 2 categorical, 3 quantitative (1 discrete and 2 continuous) c. 1 categorical, 4 quantitative (2 discrete and 2 continuous) d. 1 categorical, 4 quantitative (1 discrete and 3 continuous) e. 3 categorical, 2 quantitative (1 discrete and 1 continuous)

2 categorical, 3 quantitative (1 discrete and 2 continuous)

Ch8_9b What price would you predict for a 3000 square foot house?

230.82 thousand or $230,820

Ch 11_8 After simulating the spread of a disease, a researcher wrote, "24% of the people contracted the disease." What should the correct conclusion be? (Give this some thought before responding) a. 24% of the people who were exposed to the disease contracted it b. 24% of the people will contract the disease. c. 24% of the people may contract the disease. d. It is not possible to draw a conclusion from a simulation.

24% of the people may contract the disease. General Feedback Yes...the use of the word "MAY" makes this the correct selection

Chapter 3 Exercises #5b What percent of deaths were from causes not listed in the table?

26.3% General Feedback total (30.3+23+8.4+7.9+4.1) then subtract from 100%

In order to plan the design of a school spirit shirt, the student council conducted a survey. They asked students which color they prefer (blue, white, maroon) and which type of shirt (t-shirt, sweatshirt). The table summarizes the responses.
         t-shirt   sweatshirt   total
blue     25        10           35
white    15        15           30
maroon   5         10           15
total    45        35           80
What percent of those who prefer sweatshirts chose maroon?

28.6% General Feedback 35 total prefer sweatshirts; 10 prefer maroon; 10/35 = .2857 or approximately 28.6%
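
The denominator is what separates the conditional questions on this table. A small sketch, with function names chosen here for illustration, computes both directions from the same counts:

```python
# Survey counts from the table: color -> shirt type -> number of students.
counts = {
    "blue": {"t-shirt": 25, "sweatshirt": 10},
    "white": {"t-shirt": 15, "sweatshirt": 15},
    "maroon": {"t-shirt": 5, "sweatshirt": 10},
}

def percent_color_given_shirt(color, shirt):
    """Of those who prefer this shirt type, what percent chose this color?"""
    column_total = sum(row[shirt] for row in counts.values())
    return 100 * counts[color][shirt] / column_total

def percent_shirt_given_color(shirt, color):
    """Of those who prefer this color, what percent chose this shirt type?"""
    row_total = sum(counts[color].values())
    return 100 * counts[color][shirt] / row_total

maroon_among_sweatshirts = percent_color_given_shirt("maroon", "sweatshirt")  # 10/35
sweatshirt_among_maroon = percent_shirt_given_color("sweatshirt", "maroon")   # 10/15

print(round(maroon_among_sweatshirts, 1), round(sweatshirt_among_maroon, 1))
```

The same cell count (10) gives two different percents because the group being conditioned on changes.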

Ch 11_18 A new electronics store holds a contest to attract shoppers. Once an hour someone in the store is chosen at random to play the Music Game. Using the winning amounts given in the book what is the average dollar amount of music the store will give away? a. 15-25 b. 30-40 c. 10-20 d. 50-60

30-40 General Feedback randInt(1,5,5); Ace: 1; other cards: 2, 3, 4, 5; record the amount won on each trial; take the average over the number of trials

Students in a political science course were asked to describe their politics as "Liberal", "Moderate", or "Conservative." Here are the results:
         Liberal   Moderate   Conservative
Female   38        41         11
Male     51        48         20
What percent of the class considers themselves to be "liberal"?

42.6% General Feedback (38+51)/209 = .4258 or approximately 42.6%

In order to plan the design of a school spirit shirt, the student council conducted a survey. They asked students which color they prefer (blue, white, maroon) and which type of shirt (t-shirt, sweatshirt). The table summarizes the responses.
         t-shirt   sweatshirt   total
blue     25        10           35
white    15        15           30
maroon   5         10           15
total    45        35           80
What percent of the students prefer blue?

43.75% General Feedback 80 students total; 35 prefer blue (row); 35/80 = .4375 or 43.75%

Autos parked in student and staff lots at a large university were classified by country of origin, as seen in this table.
                    DRIVER
Country of Origin   Student   Staff
American            94        81
European            38        22
Asian               66        53
What percent of the staff drove American cars?

52% General Feedback (81/156) = .5192 or approximately 52%

Autos parked in student and staff lots at a large university were classified by country of origin, as seen in this table.
                    DRIVER
Country of Origin   Student   Staff
American            94        81
European            38        22
Asian               66        53
What percent of the European car drivers were students?

63% General Feedback (38/60) = .6333333 or approximately 63%

In order to plan the design of a school spirit shirt, the student council conducted a survey. They asked students which color they prefer (blue, white, maroon) and which type of shirt (t-shirt, sweatshirt). The table summarizes the responses.
         t-shirt   sweatshirt   total
blue     25        10           35
white    15        15           30
maroon   5         10           15
total    45        35           80
What percent of those who prefer maroon chose sweatshirt?

66.7% General Feedback 15 total preferred maroon; 10 of those preferred sweatshirts; 10/15 = .666667 or approximately 66.7%

It's easy to measure the circumference of a tree's trunk, but not so easy to measure its height. Foresters developed a model for ponderosa pines that they use to predict the tree's height (in feet) from the circumference of its trunk (in inches): height = 1.46 (circumference) + 5 A lumberjack finds a tree with a circumference of 60 inches. How tall does this model estimate the tree to be? (Round to the nearest whole number) a. 83' b. 5' c. 11' d. 19' e. 93'

93' General Feedback substitute and evaluate

x   2   5    6    8    10
y   5   27   35   65   102
Find both an exponential and power model for this data. Which model is the MOST appropriate for the data? Justify your answer.

After finding both the exponential and power model for the data... the model that is the most appropriate for the data is the power model. My reasoning is that after transforming the points for both, the exponential and power model, the exponential model showed slight curve in the graph while the power model did not. I also calculated the Least Square Regression line for both of the data points, and the Power Model was the most appropriate.

Twelve babies spoke for the first time at the following ages (in months): 15 26 10 9 15 20 18 11 8 20 12 13 Based ONLY on the values of the mean and the median, what can you say about the shape of the distribution of the data set? Explain!

Based ONLY on the values of the mean and median, you can say that the shape of the data set is symmetric. From the previous question, the mean is 14.75 and the median is 14, meaning that the mean and the median are closely related denoting a symmetric, fairly balanced shape.

List the variables. Indicate whether each variable is categorical or quantitative. If the variable is quantitative, tell the units. In June 2003 Consumer Reports published an article on some sport-utility vehicles they had tested recently. They reported some basic information about each of the vehicles and the results of some tests conducted by their staff. Among other things, the article told the brand of each vehicle, its price, and whether it had a standard or automatic transmission. They reported the vehicle's fuel economy, its acceleration (number of seconds to go from zero to 60 mph), and its braking distance to stop from 60 mph. The article also rated each vehicle's reliability as much better than average, better than average, average, worse, or much worse than average.

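Brand of each vehicle: Categorical. Its price: Quantitative; dollars (units). Standard or automatic transmission: Categorical. Vehicle's fuel economy: Quantitative; miles per gallon (units assumed, not stated in the question). Its acceleration: Quantitative; seconds to go from zero to 60 mph (units). Braking distance: Quantitative; feet or meters (units not stated). Reliability rating: Categorical (ordinal).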

The Mars candy company starts a marketing campaign that puts a plastic game piece in each bag of M&Ms. 25% of the pieces show the letter "M," 10% show the symbol "&," and the rest just say "Try again." When you collect a set of three symbols "M," "&," and "M" you can turn them in for a free bag of candy. About how many bags will a consumer have to buy to get a free one? Let's use a simulation to find out. (This is the second part to a question you may have already had) Carefully conduct and label your simulation using the digits provided, show your results, and state your conclusion. 57821 76309 63508 29418 13026 34993 54636 17877 00987 23401

In my simulation, reading the digits in pairs, the first M appears at "17" in the first set, the second M appears shortly after at "09", and the & appears near the end at "34". In conclusion, you would need to buy about 24 bags of M&Ms in order to get the entire set and a free bag. (I included some of this information in the first question.) General Feedback M: 00-24, &: 25-34, Try again: 35-99. Or, using only 20 two-digit numbers: 5/20 = 25%, so assign 01-05 as M; 2/20 = 10%, so assign 06 and 07 as "&"; and assign the remaining numbers (08-20) as try again.
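
The pair assignment from the feedback (M: 00-24, &: 25-34, Try again: 35-99) can be traced with a short sketch. It assumes the digits are read left to right in non-overlapping pairs and that one run is enough to illustrate the procedure; the function names are made up here.

```python
# Random digits from the question, spaces removed.
DIGITS = "".join("57821 76309 63508 29418 13026 34993 54636 17877 00987 23401".split())

def game_piece(pair):
    """Map a two-digit number to a game piece: 00-24 -> M, 25-34 -> &, 35-99 -> Try again."""
    if pair <= 24:
        return "M"
    if pair <= 34:
        return "&"
    return "Try again"

def bags_until_free(digits):
    """Count bags bought until two M pieces and one & piece have been collected."""
    m_count = 0
    ampersand_count = 0
    bags = 0
    for start in range(0, len(digits) - 1, 2):
        bags += 1
        piece = game_piece(int(digits[start:start + 2]))
        if piece == "M":
            m_count += 1
        elif piece == "&":
            ampersand_count += 1
        if m_count >= 2 and ampersand_count >= 1:
            return bags
    return None  # ran out of digits before completing the set

bags_needed = bags_until_free(DIGITS)
print(bags_needed)
```

A single run is only one trial; repeating the procedure with fresh digits and averaging would give a steadier estimate.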

In which one of the following distributions is the mean most likely greater than the median?

C

Which of the following represent a mosaic graph?

C

A medical researcher finds that the more overweight a person is, the higher his blood pressure tends to be. That is not very specific but we can say something about ONE of the following descriptors, FORM, DIRECTION, or SHAPE. What do we know about the correlation between weight and blood pressure?

Direction, since correlation measures strength and direction of quantitative variables. It has a positive correlation, because as weight increases, so does blood pressure.

With regard to regression, which of the following statements about outliers are true? I. Outliers have large residuals. II. A point may not be an INFLUENTIAL outlier even though its x-value is an outlier in the x-variable and its y-value is an outlier in the y-variable. III. Removal of an outlier sharply affects the regression line.

I and II

Chapter 2 Exercises #12 Which of these variables are quantitative? I. age II. days absent III. current grade level

I and II General Feedback Age and days absent are quantitative. Grade Level is not.

Which statement about influential points MUST be true? I. Removal of an influential point changes the slope of the regression line. II. Data points that are outliers in the horizontal direction are more likely to be influential than points that are outliers in the vertical direction. III. Influential points will always have large residuals.

I and II only General Feedback I: is the definition of influential (p 169). II: outliers in the x-direction have high "leverage," pulling the line towards themselves and possibly having small residuals (p 168)

Consider the scatterplot of midterm and final exam scores for a class of 15 students. Which of the following are true statements? Read each statement and decide on its validity before selecting your answer. I. The same number of students scored 100 on the midterm exam as scored 100 on the final exam. II. Students who scored higher on the midterm exam tended to score higher on the final exam. III. The scatterplot shows a moderate negative correlation between midterm and final exam scores.

I and III

In order to rate TV shows, phone surveys are sometimes used. Such a survey might record several variables, some of which are listed below. Which of these variables are categorical? I. The type of show being watched II. The number of persons watching the show III. The ages of persons watching the show IV. The name of the show being watched V. The number of times the show has been watched in the last month

I and IV General Feedback II and III are quantitative

Here are the weights (in pounds) of 20 steers on an experimental feed diet: 183 140 136 142 172 153 172 156 70 167 192 166 113 171 159 129 112 135 174 155 How would you describe this data's overall shape?

I would describe this data's overall shape as skewed left with an outlier at 70. The shape is skewed left because most of the data lie at the higher values, with the exception of the value 70.

Chapter 4 Exercises #11 Which of the following statements is true about the stemplot pictured in the question. I. contains a gap II. skewed right III. outliers are present IV. symmetric

I, II, III General Feedback I, II, and III are true. Outliers are present according to the (1.5)(IQR) formula

Which of the following are true statements? I. The standard deviation is the square root of the variance. II. The standard deviation is zero only when all values are the same. III. The standard deviation is strongly affected by outliers

I, II, III General Feedback All are true, including II.

Chapter 3 Exercises #12 Although the pie chart looks appropriate, it has many flaws. I. total is not 100% II. no variables are named III. sizes do not accurately reflect the % Which of these are flaws in the data display?

I, II, and III General Feedback all are flaws with the data display

Which of the following statements are true? I. A histogram is exactly the same as a bar graph. II. Stem plots preserve the actual data values. III. Categorical data can be shown using a stem plot.

II General Feedback Option I is false: a histogram is NOT the same as a bar graph. Option II is true: stem plots preserve actual data. Option III is false: stem plots are only used for quantitative data

Which statement(s) is/are true? I. Random scatter in the residuals indicates a model with high predictive power. II. If two variables have a very strong linear relationship, then the correlation between them will be near +1.0 or -1.0. III. The higher the correlation between two variables the more likely the association is based in cause and effect.

II only General Feedback I: residuals do not tell about predictive power simply that linear model is suitable or not II. TRUE statement III. NEVER cause and effect

Chapter 2 Exercises #16 Which of these are categorical variables? I. size II. # of years in existence III. varieties of grapes grown

III General Feedback Size, measured in acres, is quantitative (numerical). # of years is definitely numerical. Varieties of grapes is NOT numerical, so it is categorical

Here are the weights (in pounds) of 20 steers on an experimental feed diet: 183 140 136 142 172 153 172 156 70 167 192 166 113 171 159 129 112 135 174 155 Where is the center of the distribution of this data set?

At a glance, you can estimate that the center of the distribution is around the 150 pound area, taking into account features such as the low outlier as well as the high values in the data set. If you calculate the mean, you get 149.85 pounds. If you calculate the median, you get 155.5. Thus, going with our original observation, the center of the distribution is in the 150 pound area.

Students in a political science course were asked to describe their politics as "Liberal", "Moderate", or "Conservative." Here are the results:
         Liberal   Moderate   Conservative
Female   38        41         11
Male     51        48         20
What are the MARGINAL DISTRIBUTIONS by gender? Remember that these are reported as percents.

Male: 56.94%; Female: 43.06% General Feedback 209 total students surveyed; 90 females; 119 males; 90/209 = 43% females; 119/209 = 57% males
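
A marginal distribution uses the row totals over the grand total. A minimal sketch of that calculation, with variable names chosen here for illustration:

```python
# Counts from the table: gender -> political description -> number of students.
counts = {
    "Female": {"Liberal": 38, "Moderate": 41, "Conservative": 11},
    "Male": {"Liberal": 51, "Moderate": 48, "Conservative": 20},
}

# Grand total of all students surveyed.
grand_total = sum(sum(row.values()) for row in counts.values())

# Marginal distribution by gender: each row total as a percent of the grand total.
marginal_by_gender = {
    gender: 100 * sum(row.values()) / grand_total
    for gender, row in counts.items()
}

for gender, percent in marginal_by_gender.items():
    print(f"{gender}: {percent:.2f}%")
```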

Consumers' Union measured the gas mileage in miles per gallon of 38 1978-1979 model automobiles on a special test track. The pie chart below provides information about the country of manufacture of the model cars used by Consumers Union. Based on the pie chart, we may conclude that: a. Swedish cars get gas mileages that are between those of Japanese and U.S. cars. b. More than half of the cars in the study were from the United States. c. U.S cars get significantly higher gas mileage than cars from other countries. d. Japanese cars get significantly lower gas mileage than cars of other countries. e. Mercedes, Audi, Porsche, and BMW represent approximately a quarter of the cars tested.

More than half of the cars in the study were from the United States.

A school board study found a moderately strong negative association between the number of hours high school seniors worked at part-time jobs after school hours and the students' grade point averages. Explain in this context what "negative association" means.

Negative association in this context means that the number of hours high school seniors working at part-time jobs after school hours was above average, and the students' grade point averages were below average. General Feedback When hours worked increase....then GPA tends to decrease

Ch9_6a Does the value of rsquare mean that a LINEAR model is NOT appropriate. EXPLAIN completely!!

No, it just means that the model doesn't explain more than 13% of the variation of the response variable. General Feedback Only the residual plot indicates the appropriateness of using a linear model. r-squared does not say whether a linear model is appropriate or not but rather tells the % of variation of the response variable that is attributable to the variation in the explanatory variable.

Here are the weights (in pounds) of 20 steers on an experimental feed diet: 183 140 136 142 172 153 172 156 70 167 192 166 113 171 159 129 112 135 174 155 Make a stemplot of this data.

Stemplot: Weights of 20 steers on an experimental feed diet
07 | 0
08 |
09 |
10 |
11 | 2 3
12 | 9
13 | 5 6
14 | 0 2
15 | 3 5 6 9
16 | 6 7
17 | 1 2 2 4
18 | 3
19 | 2
Key: 11 | 2 = 112 pounds

In order to plan the design of a school spirit shirt, the student council conducted a survey. They asked students which color they prefer (blue, white, maroon) and which type of shirt (t-shirt, sweatshirt). The table summarizes the responses.
         t-shirt   sweatshirt   total
blue     25        10           35
white    15        15           30
maroon   5         10           15
total    45        35           80
What are the MARGINAL Distributions by shirt type? Remember these should be reported as percents.

T-Shirt: 56.25%; Sweatshirt: 43.75% General Feedback 45/80 = 56.25% t-shirt; 35/80 = 43.75% sweatshirt

Twelve babies spoke for the first time at the following ages (in months): 15 26 10 9 15 20 18 11 8 20 12 13 Find the five number summary. Identify each value with number AND statistical name.

The Five Number Summary: Minimum: 8 Quartile 1: 10.5 Median: 14 Quartile 3: 19 Maximum: 26
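
The summary above can be reproduced with a short sketch. It assumes the textbook convention that the quartiles are the medians of the lower and upper halves of the sorted data (software packages often use other quartile rules and may give slightly different values).

```python
from statistics import mean, median

ages = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 12, 13]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, leave the middle value out of both halves.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary(ages)
average_age = mean(ages)

print(summary, average_age)
```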

All but one of these statements contain a blunder. Which could be true? a. The correlation between a football player's weight and the position he plays is 0.54 b. There is a high correlation (1.09) between height of a corn stalk and its age in weeks. c. The correlation between the amount of fertilizer used and the yield of beans is 0.42. d. There is a correlation of 0.63 between gender and political party. e. The correlation between a car's length and its fuel efficiency is 0.71 miles per gallon.

The correlation between the amount of fertilizer used and the yield of beans is 0.42. General Feedback Correlation can only be measured using two quantitative variables.

It would be best to answer the question and then go on and supply the logic that produced that result. Is it easier for the best National Basketball Association (NBA) players to get a rebound or an assist? The data was taken during the 1994-1995 season.
The top 20 rebounding leaders averaged the following numbers of rebounds per game: 16.8 (Rodman), 12.6, 11.4, 11.1, 11.0, 10.9, 10.9, 10.9, 10.8, 10.8, 10.6, 10.6, 10.6, 10.4, 10.3, 9.9, 9.7, 9.7, 9.6, and 9.4
The top 20 assist leaders averaged the following number of assists per game: 12.3 (Stockton), 9.4, 9.3, 8.8, 8.7, 8.3, 8.2, 7.9, 7.7, 7.7, 7.6, 7.5, 7.3, 7.2, 7.1, 6.9, 6.4, 6.2, 6.1, and 5.7
* Compare these data using a pair of distributions. It is not necessary to show your graph.
* Comment on shape and outliers.
You must comment enough so that your response can be followed WITHOUT a picture! Make sure you answer the original question: which is easier, rebound or assist?

The data regarding the 1994-1995 NBA season are listed above. The rebounding data have a mean of 10.9, a median of 10.7, a minimum of 9.4, a maximum of 16.8, and a range of 7.4. Looking at the data for the rebounding leaders, it is clear that the values 16.8 and 12.6 are outliers. From a graph standpoint, the distribution is roughly symmetric, although it is slightly skewed right. The IQR of the rebound data set is .85, a relatively small interquartile range. The assist data have a mean of about 7.8, a median of 7.65, a minimum of 5.7, and a maximum of 12.3. Looking at the data for the assist leaders, it is clear that 12.3 is an outlier and that the graph is about symmetric. The assist data set has an IQR of about 1.5, making it relatively larger. Comparing the two data sets, it is clear that rebounding is easier than assisting on the basis of the IQR. The IQR of the assist data is 1.5, indicating a larger spread in the middle 50% of the data values. Rebounding is also easier based on the means: the mean for rebounding (10.9) is greater than the mean for assists (7.8). General Feedback It would be best to answer the question and then go on and supply the logic that produced that result. "Is it easier for the best National Basketball Association (NBA) players to get a rebound or an assist?"
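
The outlier claims in the answer can be checked with the (1.5)(IQR) rule. This is a sketch under the assumption that quartiles are the medians of the lower and upper halves of the sorted data; the function names are made up for illustration.

```python
from statistics import median

rebounds = [16.8, 12.6, 11.4, 11.1, 11.0, 10.9, 10.9, 10.9, 10.8, 10.8,
            10.6, 10.6, 10.6, 10.4, 10.3, 9.9, 9.7, 9.7, 9.6, 9.4]
assists = [12.3, 9.4, 9.3, 8.8, 8.7, 8.3, 8.2, 7.9, 7.7, 7.7,
           7.6, 7.5, 7.3, 7.2, 7.1, 6.9, 6.4, 6.2, 6.1, 5.7]

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the sorted data."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(upper)

def outliers(values):
    """Values beyond 1.5 * IQR below Q1 or above Q3, in increasing order."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in sorted(values) if v < low_fence or v > high_fence]

rebound_outliers = outliers(rebounds)
assist_outliers = outliers(assists)

print(rebound_outliers, assist_outliers)
```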

A study was conducted on the weights of three different species of fish found in a lake in Finland. These three fish (bream, perch and roach) are commercial fish. Their weights are displayed in the boxplots above. Which of the following statements comparing these boxplots is NOT correct? a. The distributions of weights are approximately symmetric for all three species. b. The spread of roach is less than the spread of the other two species. c. There are no outliers in weight for the three species. d. The median weights of the three species differ. e. The variability in the weights for the three species exceeds the variation in the three species' means.

The distributions of weights are approximately symmetric for all three species. General Feedback look at the boxplots again

Ch10_8 What is the equation for your model to describe the length of a planet's year based on its distance from the sun? How did you decide what type of model is best???

The equation is predicted log(year) = -2.955 + 1.501 log(distance). It fits perfectly. General Feedback (logx, logy) is the needed transformation, suggesting a POWER model is best. [predicted length of the year] = (.001)(distance from the sun)^1.5, or yhat = (.001)x^1.5

What does the square of the correlation (r^2) measure?

The fraction of the variation in the values of y that is explained by least-squares regression on the other variable

After conducting a marketing study to see what consumers thought about a new tinted contact lens they were developing, an eyewear company reported, "Consumer satisfaction is strongly correlated with eye color." Comment on this observation.

The marketing study "Consumer satisfaction is strongly correlated with eye color", this states that there is an association between customer satisfaction and eye color. However, this wouldn't be considered correlated since both of the variables are categorical, hence not being able to clarify if one increases/decreases the other will increase/decrease. General Feedback correlation is only suitable for quantitative variables and these are NOT, they are categorical

Twelve babies spoke for the first time at the following ages (in months): 15 26 10 9 15 20 18 11 8 20 12 13 Find the mean and median and indicate which is which.

The mean is 14.75. The median is 14.
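The arithmetic can be checked with a short sketch using Python's standard `statistics` module and the twelve ages from the question.

```python
from statistics import mean, median

# Ages (in months) at which the twelve babies first spoke
ages = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 12, 13]

average_age = mean(ages)    # sum of the ages divided by 12
middle_age = median(ages)   # average of the 6th and 7th sorted values

print(average_age, middle_age)
```

Because the mean (14.75) is larger than the median (14), the high value of 26 months is pulling the mean upward.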

Ch8_8b Interpret the meaning of rsquare in this problem.

The meaning of R^2 in this problem is that the number of wins explains 33.3% of the variation in attendance. General Feedback The variation in the # of wins accounts for about 33% of the variation in attendance...and 67% comes from other sources.

Chapter 3 Exercises #8b What is true about the percentages listed? a. The percentages should be divided by 3, since there are 3 years represented. b. The percentages total more than 100%. c. The percentages are unreliable since there are too few categories. d. The percentages are unreliable since there are too many categories

The percentages total more than 100%. General Feedback No....The percentages total more than 100%.

A researcher reports that, on average, the participants in his study lost 10.4 pounds after two months on his new diet. A friend of yours comments that she tried the diet for two months and lost no weight, so clearly the report was a fraud. Which of the following statements is correct? a. Since your friend did not lose weight, the report must not be correct. b. Your friend is an outlier. c. In order for the study to be correct, we must now add your friend's results to those of the study and recompute the new average. d. The report only gives the average. This does not imply that all participants in the study lost 10.4 pounds or even that all lost weight. Your friend's experience does not necessarily contradict the study results. e. Your friend must not have followed the diet correctly, since she did not lose weight.

The report only gives the average. This does not imply that all participants in the study lost 10.4 pounds or even that all lost weight. Your friend's experience does not necessarily contradict the study results.

A study of the fuel economy for various automobiles plotted the fuel consumption (in liters of gasoline used per 100 kilometers traveled) vs. speed (in kilometers per hour). A least-squares regression line was fitted to the data and the RESIDUAL PLOT is displayed to the right. What does the pattern of the residuals tell you about the linear model? a. The residual plot confirms the linearity of the data. b. The residual plot clearly contradicts the linearity of the data. c. The evidence is inconclusive. d. The residual plot suggests a different line would be more appropriate.

The residual plot clearly contradicts the linearity of the data. General Feedback You can't assume another line would be appropriate. Look only at what is given. The correct response should be The residual plot clearly contradicts the linearity of the data.

You are conducting an experiment to test the density of a new recipe for chocolate cake. You will test the recipe with 3 different baking temperatures (350, 400 and 450) and 3 different baking times (45 minutes, 55 minutes and 60 minutes). What is the response variable?

The response variable is the density of the new recipe.

As reported in the Journal of the American Medical Association (June 13, 1990), for a study of ten nonagenarians (90+yrs old), the following data shows a measure of strength versus a measure of functional mobility Strength (kg) 7.5 6 11.5 10.5 9.5 18 4 12 9 3 Walk time (s) 18 46 8 25 25 7 22 12 10 48 Find the LSRL and tell what the slope signifies?

The slope is negative, signifying that the greater the strength, the shorter the walk time (the measure of functional mobility). General Feedback The slope is negative, signifying that the greater the strength, the shorter the walk time (the measure of functional mobility).
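The question also asks for the LSRL itself. A minimal sketch of that calculation follows, assuming strength is the explanatory variable and walk time the response (the question does not say which is which); the helper `least_squares_line` is a hypothetical name.

```python
# Data from the question
strength = [7.5, 6, 11.5, 10.5, 9.5, 18, 4, 12, 9, 3]   # kg
walk_time = [18, 46, 8, 25, 25, 7, 22, 12, 10, 48]      # seconds


def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = least_squares_line(strength, walk_time)
print(f"predicted walk time = {intercept:.2f} + ({slope:.2f}) * strength")
```

Under that assumption the line comes out to roughly walk time = 44.26 - 2.43(strength), so each additional kilogram of strength is associated with a walk time about 2.4 seconds shorter.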

The following table shows the number of hot dogs eaten by contestants in a hot dog eating contest in 2005. James 45 Susan 20 Maggie 28 Richard 15 Wes 56 Amy 48 Daniel 36 What type of display would you use to display this data? Don't forget that the names are also a variable. Defend your choice.

The type of display that I would use to display this data is a bar graph. I would use a bar graph since the names are a variable, each bar would represent the name as a category, as many bar graphs are often used to display categorical data. On the x-axis, the labels would represent the names, and on the y-axis, the number of hot dogs eaten by contestants in the competition would be the numerical values, corresponding to each person. General Feedback There are two variables to be represented...names and # hot dogs consumed. The best graph for this data would be a bar graph. A pie graph COULD work if all values are converted into % using the total hot dogs consumed.

Which variable is NOT quantitative? a. Final scores for every Falcons game last season b. The types of cars in the parking lot. c. Number of books in your locker d. Teacher's ages

The types of cars in the parking lot. General Feedback numerical IS quantitative

In order to plan the design of a school spirit shirt, the student council conducted a survey. They asked students which color they prefer (blue, white, maroon) and which type of shirt (t-shirt, sweatshirt). The table summarizes the responses. Identify the variables and tell whether each is categorical or quantitative.

The variables are color (blue, white, or maroon) and type of shirt (t-shirt or sweatshirt). Both of these variables are categorical, even though the table reports a count of how many responses fall in each category.

Describe, in words, how a WEAK but POSITIVE LINEAR association would look. If the scatterplot is not possible, say so.

A weak but positive linear association would trend upward, but the points would be widely scattered rather than close together, indicating low correlation. Both variables tend to increase together, but the relationship is not very strong. The correlation r would often be roughly 0.1 to 0.4. General Feedback widely scattered data points trending upward from left to right

Describe, in words, how a STRONG LINEAR association with r near 0 would look. If the scatterplot is not possible, say so.

This scatterplot is not possible. Strong linear associations have a correlation r near 1 or -1, with exactly 1 or -1 representing a perfect straight line. An r near 0 would show that there is no linear association between the two variables. General Feedback not possible...r should be near 1 or -1 to be strong (at least >.5)

Listed below are 10 of the students in your class. Use the random numbers listed to choose a random sample of 3 of them. Clearly explain your method. Lucy, Charlie, Cindy, Bobby, Jan, Marcia, Peter, Greg, Oliver, Arnold 04905 83852 29350 91397 19997 65142 05087 11232

To use the random numbers listed to choose a random sample of 3 of them, you would first assign each person a two-digit number from 00-09, in order from left to right. For example, Lucy would be assigned 00, Charlie - 01, Cindy - 02, etc. You will ignore values 10-99 as they do not correspond to any student on the list. To begin the random sample, the first selection would be Jan (04), since 04 appears first on the list. For the second selection, you would have to skip many numbers as they fall in the 10-99 range, but reading carefully two digits at a time you will eventually reach Arnold (09). The last selection would be Marcia (05), who appears later after skipping many more of the numbers. General Feedback NUMBER 00-09 or 01-10....look through random digits two at a time until 3 legitimate names are selected

Listed below are names of the 20 pharmacists on the hospital staff. Use the random numbers listed to select three of them to be in the sample. Clearly explain your method. Pastore, Back, Spiridonov, Ahi, Hedge, MacDowell, Schissel, Novelli, Lavine, Kaplan, Highland, Roundy, Grubb, Markowitz, Glass, Davies, Golkowski, Reeves, Janis, Yen 04905 83852 29350 91397 19997 65142 05087 11232

To use the random numbers listed to select three of them to be in the sample, you would first assign each individual a number from 00-19, since there are 20 individuals listed above. For example, Pastore would be 00 as the first individual in the list, Back would be 01 as the second individual in the list, Spiridonov - 02, etc. To select three of them to be in the sample, you will go two digits at a time, but if the digits are in the 20-99 range, you will skip them until you find the number of a pharmacist. The first selected is Hedge, since 04 shows up first on the list. The second would be Kaplan, after you skip many numbers in the random number list and eventually reach 09. The last selection is Markowitz, whose number, 13, is the pair right after Kaplan's in the random number list. General Feedback NUMBER 00-19 or 01-20....look through random digits two at a time until 3 legitimate names are selected
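The labeling method can be sketched in a few lines. This is only an illustration of the procedure described in the answer (labels 00, 01, ... in list order, digits read two at a time); the function name `sample_from_digits` is hypothetical.

```python
def sample_from_digits(names, digits, k):
    """Pick k distinct names by reading a digit string two digits at a time.

    Labels are 00, 01, ... in list order. Pairs with no matching label,
    or whose name was already chosen, are skipped.
    """
    digits = digits.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if label < len(names) and names[label] not in chosen:
            chosen.append(names[label])
        if len(chosen) == k:
            break
    return chosen


RANDOM_DIGITS = "04905 83852 29350 91397 19997 65142 05087 11232"

pharmacists = [
    "Pastore", "Back", "Spiridonov", "Ahi", "Hedge", "MacDowell", "Schissel",
    "Novelli", "Lavine", "Kaplan", "Highland", "Roundy", "Grubb", "Markowitz",
    "Glass", "Davies", "Golkowski", "Reeves", "Janis", "Yen",
]

print(sample_from_digits(pharmacists, RANDOM_DIGITS, 3))
# ['Hedge', 'Kaplan', 'Markowitz']
```

The same function applied to the ten-student list from the previous question (labels 00-09) selects Jan, Arnold, and Marcia.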

Give two major differences between the mean and median as measures of the center of the distribution.

Two major differences between the mean and median as measures of the center of a distribution are as follows. First, the mean is sensitive to outliers, while the median is resistant to them because it depends only on the middle value. Second, the median is a positional average, the midpoint of the ordered data set, while the mean is the numerical average and marks the center of gravity of the distribution. Because the mean takes every value into account, it is pulled toward extreme values.

Ch 11_10 Cereal Problem Again... Suppose you really want the Tiger Woods picture card. How many boxes of cereal do you need to buy to be pretty sure of getting at least one? Your simulation should use at least 10 runs. Include a short explanation of your method.

You will need 5 boxes of cereal to be pretty sure of getting at least one. We can run the trial (opening boxes) until we receive at least one Tiger Woods picture card. This can be done by looking at each random digit and indicating whether it is the Tiger Woods picture card or not. The simulation will have at least 10 runs. The response variable is the number of boxes of cereal opened to get the card (about 5). General Feedback 20% are Tiger Woods cards...use two digits, maybe 0 and 1, to represent Tiger...produce random digits using randInt(0,9) until 0 or 1 shows up...count how many tries...repeat 10 times and average the # of tries
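The simulation in the feedback can be sketched as follows. Python's `random.Random.randint` stands in for the calculator's randInt(0,9); the function names and the `seed` parameter are hypothetical.

```python
import random


def boxes_until_tiger(rng):
    """Open boxes until a Tiger Woods card appears; return the number opened."""
    boxes = 0
    while True:
        boxes += 1
        digit = rng.randint(0, 9)   # one random digit per box
        if digit in (0, 1):         # digits 0 and 1 stand for Tiger (20%)
            return boxes


def average_boxes(runs=10, seed=None):
    """Average number of boxes needed over the given number of runs."""
    rng = random.Random(seed)
    counts = [boxes_until_tiger(rng) for _ in range(runs)]
    return sum(counts) / runs


print(average_boxes(runs=10))
```

With only 10 runs the average varies quite a bit from one simulation to the next; with many runs it settles near 1/0.20 = 5 boxes.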

The Mars candy company starts a marketing campaign that puts a plastic game piece in each bag of M&Ms. 25% of the pieces show the letter "M," 10% show the symbol "&," and the rest just say "Try again." When you collect a set of three symbols "M," "&," and "M" you can turn them in for a free bag of candy. About how many bags will a consumer have to buy to get a free one? Let's use a simulation to find out. Explain how you will use the random numbers listed below to conduct your simulation. 57821 76309 63508 29418 13026 34993 54636 17877 00987 23401

You would use the random numbers listed to conduct the simulation. First you would read the digits two at a time as values from 00-99. Then assign 00-24 as "M," since 25% of the pieces show that letter, and 25-34 as the symbol "&," since 10% show the symbol. The rest, 35-99, would be "Try again," which is the remaining 65%. In this simulation, you would need to buy about 24 bags of M&M's before you win a free one. This is found by letting each two-digit pair represent one bag and reading pairs until you have collected "M," "&," and "M." General Feedback M: 00-24, &: 25-34, Try again: 35-99 or using only 20 numbers 5/20 = 25% assign 01-05 as M, 2/20 = 10% assign 06 and 07 as "&" and the remaining numbers (08-20) as try again.
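One run of this simulation can be traced with a short sketch that uses the 00-24 / 25-34 / 35-99 assignment and the random digits given in the question. The function names are hypothetical.

```python
RANDOM_DIGITS = "57821 76309 63508 29418 13026 34993 54636 17877 00987 23401"


def piece_for(pair):
    """Map a two-digit value (0-99) to the game piece it represents."""
    if pair <= 24:
        return "M"          # 00-24: 25% of pieces
    if pair <= 34:
        return "&"          # 25-34: 10% of pieces
    return "Try again"      # 35-99: remaining 65%


def bags_until_free(digits):
    """Count bags (two-digit pairs) until two M's and one & are collected."""
    digits = digits.replace(" ", "")
    have = {"M": 0, "&": 0}
    for bag, i in enumerate(range(0, len(digits) - 1, 2), start=1):
        piece = piece_for(int(digits[i:i + 2]))
        if piece in have:
            have[piece] += 1
        if have["M"] >= 2 and have["&"] >= 1:
            return bag
    return None  # ran out of digits before completing the set


print(bags_until_free(RANDOM_DIGITS))  # 24
```

This is a single run; the question's "about how many bags" would normally be answered by repeating the run with fresh random digits and averaging.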

Vocabulary: Two variables that are actually not related to each other may nonetheless have a very high correlation because they both result from SOME OTHER, possibly HIDDEN, factor. This is an example of ...

a lurking variable. General Feedback look up the definition of outlier...not suitable here

FREE RESPONSE A simple random sample of adults living in a suburb of a large city was selected. The age and annual income of each adult in the sample were recorded. The resulting data are summarized in the table below. a) What is the probability that a person chosen at random from those in this sample will be in the 31-45 age category? b) What is the probability that a person chosen at random from those in this sample whose incomes are over $50,000 will be in the 31-45 age category? Show your work. c) Based on your answers to parts (a) and (b), is annual income independent of age for those in this sample? Explain.

a) The probability is 43%. This can be calculated by taking the number of individuals in the 31-45 age category and dividing it by the total number of individuals in the sample. (89/207) b) The probability is 36.46%. This can be calculated by taking the number of individuals in the 31-45 age category within the over $50,000 annual income category and dividing it by the total number of individuals within the over $50,000 annual income category. (35/96) c) No, I do not believe annual income and age are independent for those in this sample. If they were independent, the values from part (a) and part (b) would be equal; they are not, so the variables are not independent. It can then be inferred that, in this sample, the 31-45 age group is underrepresented among those with incomes over $50,000. General Feedback a) (89/207) = .42995 or approximately 43%. The probability that a person chosen at random from those in this sample will be in the 31-45 age category is approximately 43%. b) Conditional probability: P(31-45 age | >$50,000 income) = P(31-45 AND >$50,000)/P(>$50,000) or (35/96) = .36458 or approximately 36.5%. The probability that a person chosen at random from those in this sample whose incomes are over $50,000 will be in the 31-45 age category is approximately 36.5%. c) For annual income and age to be independent variables they cannot exert any influence over each other. Statistically, independence would exist if P(age) and P(age|income level) were equal. From parts (a) and (b) we see that the values are different, .43 and .365; therefore the variables of age and income in this sample are NOT independent.
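The independence check can be written out as a short sketch. The full table is not reproduced here, so only the four counts quoted in the answer are used; the variable names are hypothetical.

```python
# Counts quoted in the answer (the full table is not shown in this set)
n_total = 207        # everyone in the sample
n_age_31_45 = 89     # people in the 31-45 age category
n_over_50k = 96      # people with income over $50,000
n_both = 35          # 31-45 AND income over $50,000

p_age = n_age_31_45 / n_total               # part (a): P(31-45)
p_age_given_income = n_both / n_over_50k    # part (b): P(31-45 | income > $50,000)

# Part (c): independence would require the two probabilities to be equal
independent = abs(p_age - p_age_given_income) < 1e-9

print(round(p_age, 4), round(p_age_given_income, 4), independent)
```

Since the conditional probability differs from the unconditional one, the check reports that age and income are not independent in this sample.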

The coefficient of determination for the scatter plot pictured is approximately...(More than one answer may apply) a. 0.88 b. 0.35 c. -0.80 d. 0.65 e. 0

a. 0.88 d. 0.65 General Feedback question is asking for rsquare...approximate r from the picture then square it...an estimate of either .88 or .65 would be acceptable

Chapter 4 #14 The type of graph pictured is ________________ Select the BEST answer choice to describe the distribution. OBSERVE THE STEMS!!! a. stemplot b. back to back stem plot c. dot plot d. back to back stem plot with split stems

back to back stem plot with split stems General Feedback YES...back to back stem plot with split stems

Chapter 3 Exercises #13b Which graph type works better for this specific data, bar graph or pie chart? Explain.

bar graph because it makes comparisons easier

Ch11_16 A person with type O-positive blood can receive blood only from other type O donors. About 44% of the U.S. population has type O blood. At a blood drive, how many potential donors do you expect to examine in order to get three (3) units of type O blood? a. between 2 and 5 b. between 10 and 20 c. between 6 and 10 d. between 9 and 12

between 6 and 10 General Feedback randInt(0,99,10)...assign 00-43 as type O...count # needed to find 3 values in the correct range...repeat at least 10 trials...should be between 6 and 10
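A rough sketch of the simulation outlined in the feedback, with Python's `random.Random.randint` standing in for the calculator's randInt and 00-43 representing type O. The function names and the `seed` parameter are hypothetical.

```python
import random


def donors_until_type_o(rng, needed=3):
    """Examine donors one at a time until `needed` type O donors are found."""
    donors = 0
    found = 0
    while found < needed:
        donors += 1
        if rng.randint(0, 99) <= 43:   # 00-43 stands for type O (44%)
            found += 1
    return donors


def average_donors(runs=10, seed=None):
    """Average number of donors examined over the given number of runs."""
    rng = random.Random(seed)
    return sum(donors_until_type_o(rng) for _ in range(runs)) / runs


print(average_donors(runs=10))
```

Over many runs the average approaches 3/0.44, a little under 7, which falls in the "between 6 and 10" choice.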

Chapter 2 Exercises #15 In this context the country of origin is what type of variable? a. binary b. quantitative c. none of the above... country of origin is not a variable d. categorical

categorical General Feedback NO...country of origin is a categorical variable taking on many values (countries)

A distribution shows the __________________ of a data set. a. center, shape, and spread b. specific entries c. height, weight, and length d. mean and mode

center, shape, and spread General Feedback lost in intervals

Chapter 3 Exercises #24 These percentages are ...

column percentages General Feedback the column totals are 100 so COLUMN PERCENTAGES

Chapter 7 Exercises #11 Which scatterplot is best described as having a correlation factor of -.487?

d General Feedback YES... -.487 indicates a moderate negative association

A scatterplot x versus log y shows a strong positive LINEAR pattern although the original data do not. Which type of non-linear model (exponential or power) would be appropriate for the original data?

exponential General Feedback if (x, log y) straightens the original data...then an exponential model is the most appropriate model to use.
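The reason is that a straight line in (x, log y), log y = a + bx, back-transforms to y = (10^a)(10^b)^x, which is an exponential model. A small sketch with made-up data (y = 5 * 2^x, purely hypothetical) shows the raw values curving while the logged values rise in equal steps.

```python
import math

# Hypothetical exponential data: y = 5 * 2**x at equally spaced x values
xs = [0, 1, 2, 3, 4, 5]
ys = [5 * 2 ** x for x in xs]

log_ys = [math.log10(y) for y in ys]

# For equally spaced x, a straight line means equal steps between points
raw_steps = [b - a for a, b in zip(ys, ys[1:])]          # keep growing: curved
log_steps = [b - a for a, b in zip(log_ys, log_ys[1:])]  # all equal: linear

print(raw_steps)
print([round(step, 4) for step in log_steps])
```

Every step in log y equals log10(2), the log of the growth factor, so the (x, log y) plot is a straight line even though the original data are not.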

The center of a histogram... a. cannot be more than 50. b. is the point in the middle of the x-axis of the distribution. c. is not present in skewed data. d. is the point in the graph where about half of the data lies above and half of the data lies below.

is the point in the graph where about half of the data lies above and half of the data lies below. General Feedback no such rule

A company's sales grow by the SAME FIXED AMOUNT each year. That means the increase is the same year over year. This growth is ...

linear General Feedback nope

Chapter 3 Exercises #29d What were the delay rates at the SMALL hospital for each kind of surgery?

major: 20% minor: 8% General Feedback major: 10/50 or .20 = 20%, minor: 20/250 or .08 = 8%

Does regular exercise decrease the risk of cancer? A researcher finds 200 women over 50 who exercise regularly, pairs each with a woman who has a similar medical history but does not exercise, then follows the subjects for 10 years to see which group develops more cancer. This is a ...

prospective study General Feedback into the future is not retrospective...retrospective looks at past events

The correlation between X and Y is r = 0.17. If we add 10 to each X value, divide each Y value by 2, and interchange the X and Y variables, the new correlation is

r = 0.17
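Correlation is unchanged by adding a constant, by rescaling with a positive factor, or by swapping the roles of the two variables. The sketch below checks this on a small made-up data set (hypothetical values, not from the question); `pearson_r` is a hand-written helper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, chosen only for illustration
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 1.5, 3.5, 3.0, 5.0, 4.0]

r_original = pearson_r(x, y)

# Add 10 to each x, divide each y by 2, and interchange the variables
r_transformed = pearson_r([v / 2 for v in y], [v + 10 for v in x])

print(r_original, r_transformed)
```

The two printed values agree up to floating-point rounding, which is why the answer stays r = 0.17.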

Chapter 7 Exercises #3 part d Which variable is the explanatory variable? a. reaction time b. arbitrary assignment c. the driver d. blood alcohol level

reaction time General Feedback the book says the answer is reaction time...so credit for this...this question can be viewed two ways...(1) blood alcohol level is the explanatory variable and reaction time is dependent...drinking comes first...impaired reaction time comes next...(2) reaction time is the explanatory variable which suggests that blood alcohol level is increased

Chapter 2 Exercises #2 The HOW of this question is____________

sampling General Feedback YES..the method was SAMPLING

The table below shows how a company's employees commute to work. Transportation Car Bus Train Managers 26 20 44 Labor 56 106 168 What kind of display would be best to demonstrate an association, or not, between job classification and method of transportation?

side by side segmented bar graphs General Feedback there is no such graph type...bar graphs are used for categorical data and side by side segmented bar graphs clearly demonstrate whether the same percents appear in both variables

Chapter 4 Exercises #15 The SHAPE of this hurricane data distribution is _______ a. skewed left b. skewed right c. symmetric d. uniform

skewed right General Feedback YES skewed...but...Skewed right with longer whisker at the higher end and a clear outlier at 7...mean = 2.3 and median = 2 so skewed RIGHT

In a frequency distribution of 3000 scores, the mean is 78 and the median is 95. One would expect this distribution to be_______________. a. skewed to the left b. symmetrical and uniform c. symmetrical and mound-shaped d. bimodal

skewed to the left General Feedback Recall, the mean is always pulled toward the tail. Here 78 < 95, SO the distribution is skewed LEFT (LOWER)

Chapter 4 Exercises #28 Using the data and your graphs for the smokers and ex-smokers distributions...select the best description from these choices. a. smokers' distribution is less symmetric with a slightly lower mode than ex-smokers and has an outlier b. the two distributions are almost identical in shape and characteristics c. ex-smokers' distribution has a mode (most frequent data value) at approximately 200 d. ex-smokers' distribution has no gaps or outliers

smokers' distribution is less symmetric with a slightly lower mode than ex-smokers and has an outlier General Feedback no...there are many differences apparent from looking at the graphs

Chapter 7 Exercises #23 part a The mistake being made in this statement is __________ a. correlation is never negative b. the correlation indicates a moderate association between the variables c. infant mortality rate is not a quantitative variable so we cannot talk about correlation d. GDP is not a variable

the correlation indicates a moderate association between the variables General Feedback NO...the correlation indicates a moderate association between the variables

Vocabulary: Residuals are ...

the difference between observed responses and values predicted by the model. General Feedback the correct answer is present in the options

An agriculture research firm is testing a new type of technique for growing plants. They keep track of the number of seeds in each plot and the number of healthy plants that grow for the plot. The data is given in the table. What is the equation for the Least Squares Regression Line obtained from the calculator?

y hat = 0.82353 + 0.40196x General Feedback Enter # seeds into L1...Enter # healthy plants into L2...Stat/Calc/LinReg y = a + bx...using the a and b from the result

