Statistics Midterm

Curiously, during months when sales of beer are above average, sales of ice cream also tend to be above average; during months when sales of beer are below average, sales of ice cream also tend to be below average. Which of the following can we conclude from these facts? 1. The correlation between "beer sales" and "ice cream sales" is negative 2. For a lot of people, drinking beer causes a desire for ice cream, or vice versa 3. A scatterplot of monthly ice cream sales versus monthly beer sales would show that a straight line describes the pattern in the plot, but it would have to be a horizontal line. 4. None of these are correct

4

Calculate the median for the given data set: 3.0, 3.5, 3.5, 4.1, 4.8, 5.2, 7.1, 11.2

4.45

Calculate the mean for the given data set: 3.0, 3.5, 3.5, 4.1, 4.8, 5.2, 7.1, 11.2

5.3
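
A minimal sketch for checking the median and mean cards above with Python's standard `statistics` module. The variable names are illustrative and not part of the original card set.

```python
import statistics

# Data set shared by the median and mean cards
data = [3.0, 3.5, 3.5, 4.1, 4.8, 5.2, 7.1, 11.2]

# With 8 values, the median is the average of the 4th and 5th
# sorted values (4.1 and 4.8)
median = statistics.median(data)

# The mean is the sum of the values divided by the count
mean = statistics.mean(data)

print(round(median, 2), round(mean, 2))
```

The mean (5.3) sits above the median (4.45) because the large value 11.2 pulls the mean upward.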

Define Statistics

A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data. "The science of data."

Pie Chart

A circular chart divided into sectors (wedges) whose areas are proportional to the percentages of the whole. Categorical data only.

The graph below compares the average yards per game for Quarterbacks from the AFC and NFC conferences. This data represents 78 different Quarterbacks from the 2017 NFL regular football season. 1. What type of variables are in the plot? 2. From the plot which of the following statements is FALSE? a) The interquartile range for the AFC is smaller than the NFC. b) The 75th percentile for NFC quarterbacks is approximately 220 yards.

1. Average yards per game is quantitative and conference is categorical. 2. b

In a survey interested in studying the habits of seniors and internet usage, a question asked: "By which method do you typically access the internet?" a. cell phone b. personal laptop or computer c. another person's laptop or computer d. public computer (library or other) e. I don't access the internet Identify the following in this scenario: 1. Individual Unit or Subject 2. The variable being measured 3. The variable type

1. The individual unit or subject is a senior taking the survey. 2. The variable being measured is the method by which a senior accesses the internet. 3. The variable type is categorical.

Wine bottles at a small winery are sampled and tested for quality. One measurement is volume; the bottles should be filled to 750 ml, with some variation expected. 1. What shape would you use to describe the distribution? 2. Based on this histogram, does 750 ml seem to be a reasonable center for the data? 3. Approximately how many bottles of wine have more than 752 ml? 4. Approximately what percent of bottles have less than 752 ml?

1. Symmetric 2. Yes the center is approximately 750 ml 3. 6 4. 50%

Located in Yellowstone National Park in Wyoming, Old Faithful is a spectacular geyser spewing water several hundred feet in the air on a regular basis. The following is a regression analysis on the relationship between length of eruption in minutes and the waiting time until the next eruption for 272 randomly selected eruptions. 1. Waiting time between eruptions in minutes is the ___________ variable. 2. Length of eruptions in minutes is the ___________ variable. 3. The value that makes the most sense for the correlation coefficient is r = ___________. 4. The least squares regression equation is:

1. response 2. explanatory 3. 0.90 4. y-hat = 33.4744 + 10.7296x

In a study of human development, investigators showed two different types of movies to groups of children. Crackers were available in a bowl, and the investigators compared the number of crackers eaten by children watching both movie types. One movie was shown at 8 A.M. (right after the children had breakfast) and the other at 11 A.M. (right before the children had lunch). It was found that during the movie shown at 11 A.M., more crackers were eaten than during the movie shown at 8 A.M. The investigators concluded that the different types of movies had different effects on appetite. 1. The response variable in this experiment is 2. The treatment in this experiment is 3. The results cannot be trusted because

1. the number of crackers eaten 2. the different kinds of movies 3. the time each movie was shown is a confounding variable

A researcher studying the effect of price cuts on consumers' expectations makes up two different histories of the store price of a hypothetical brand of laundry detergent for the past year. Eight students in a class view one or the other price history on a computer. Some students see a steady price, whereas others see regular sales that temporarily cut the price. Students are asked the price they would expect to pay. 1. The response is 2. The experimental units are 3. This is an example of

1. the price they would expect to pay 2. the eight business students who participated 3. a randomized comparative experiment

The graph depicts total points scored per basketball team for four schools in the first 18 games of the 2018-2019 season. From the plot compute the approximate interquartile range for Utah.

12 (IQR = Q3-Q1 = 81 - 69 = 12)

Married respondents in a survey were asked how they met their spouse. According to the bar chart below, approximately how many respondents met their spouse through a dating service?

30

Suppose the office of campus safety at Oregon State University would like to know to what degree students feel safe on campus. A student intern interviews 15 student volunteers she meets walking on campus between 9:00PM and 11:00PM. What is the method of sampling used?

Convenience sample

Describing Distributions: Shape

Symmetric (both sides are mirror images of each other), Left (Negatively-skewed; tail of the distribution is longer to the left with bulk of the data on the right), and Right (Positively-skewed; tail of the distribution is longer to the right with bulk of the data on the left).

Quantitative Variable

Takes numerical values for which it makes sense to find an average. Display types: Histogram, stemplot, box plot.

Suppose a survey asked "How many times have you accessed the Internet this week?" to which individuals responded with "never", "once or twice", "three or four times" or "more than four times". The following is a bar chart of the results. What could be done to make this chart better represent the data?

The categories should be reordered to follow their natural order: "never", "once or twice", "three or four times", "more than four times".

For the wine volume data below, which should you choose to describe the center of the data?

Either. Because the distribution is roughly symmetric, the mean and median should be about the same.

True or False? Bar graphs are more flexible than pie charts. Both graphs can display the distribution of a categorical variable, but a bar graph can also compare any set of quantities that are measured in the same units.

True

True or False? When making inference from a sample to a population a larger sample size will always have more accurate results.

False

Study Symbols in your notes

In Goodnotes

Distribution

What values a variable takes and how often it takes these values.

The volume of oxygen consumed (in liters per minute) while a person is at rest and while he or she is exercising (running on a treadmill) were both measured for 50 subjects. The goal is to determine if the volume of oxygen consumed during aerobic exercise can be estimated from the amount consumed at rest. The results are plotted below. If the outlier is removed, the correlation coefficient r will:

Neither increase nor decrease.

Some researchers have noted that adolescents who spend a lot of time playing video or computer games are at greater risk of depression and violence. This is an example of

an observational study with lurking variables that may explain the association.

Choose the correct expression for the standard deviation of the following three numbers: 10 10 13

See photo
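
The photo with the answer is not available here. As a hedged reconstruction, assuming the course uses the sample standard deviation (dividing by n − 1), the expression would be:

```latex
\bar{x} = \frac{10 + 10 + 13}{3} = 11, \qquad
s = \sqrt{\frac{(10-11)^2 + (10-11)^2 + (13-11)^2}{3-1}} = \sqrt{3} \approx 1.73
```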

For data involving answers to the following question, what is the BEST method of displaying the data? How many times have you accessed the Internet this week? (1) never (2) once or twice (3) three or four times (4) more than four times. Options: histogram, boxplot, bar graph, or scatterplot.

bar graph

The volume of oxygen consumed (in liters per minute) while a person is at rest and while he or she is exercising (running on a treadmill) were both measured for 50 subjects. The goal is to determine if the volume of oxygen consumed during aerobic exercise can be estimated from the amount consumed at rest. The results are plotted below. The scatter plot suggests:

both a positive association between volume of oxygen consumed at rest and while running, and there is an outlier in the plot.

After taking an exam, your professor tells you your test score is equal to the 3rd quartile for the class. Which of the following is a correct statement according to this information?

You scored better than 75% of your class

Sickle-cell disease is a painful disorder of the red blood cells that in the United States affects mostly African Americans. To investigate whether the drug hydroxyurea can reduce the pain associated with sickle-cell disease, a study by the NIH gave the drug to 150 sickle-cell sufferers and placebo to another 150. Neither doctors nor patients were told who received the drug. The number of episodes of pain reported by each subject was recorded. This is an example of

a double-blind experiment.

A researcher obtained the average SAT scores of all students in each of the 50 states, and the average teacher salaries in each of the 50 states. She found a negative correlation between these variables. The researcher concluded that a lurking variable must be present. By lurking variable she means

a variable that is not among the variables studied, but that affects the response variable

An apple farmer samples 300 apples from a large cart and marks whether or not the apples have blemishes or bruises. The variable on whether or not an apple has blemishes or bruises is? Categorical or Quantitative?

Categorical

An apple farmer samples 300 apples from a large cart and tastes each of them. He rates each apple as "poor", "fair", "good", or "excellent". Is the variable recording the apples' taste rating categorical or quantitative?

Categorical

Based on the distribution of the data approximately which values represent the mean and median cost of a haircut?

mean = $24, median = $19

According to some studies, people that drive expensive sports cars have lower blood pressure and fewer cardiovascular health problems. We can't conclude that driving expensive sports cars lowers blood pressure or improves cardiovascular health because the studies described are

observational studies, not experiments—lurking variables may explain the association.

Two variables in a study are said to be confounded if

one cannot separate their effects on a response variable

The following is the distribution in dollars for the price of a haircut for 200 randomly selected students. This plot can be described as... positively skewed, negatively skewed, or bimodal?

positively skewed

What three features can best describe the distribution of a quantitative variable?

shape, center, spread

A student organization wanted to study voting preferences in its student body during the recent presidential election. They selected 120 students at random from each class, freshmen through seniors. The sampling technique being used is

stratified random sampling.

Does exposure to classical music (through instrument lessons or concert attendance) improve children's scholastic performance? In a study, researchers measured the amount of exposure to classical music for many children, along with their scores on the state's academic proficiency exam. The explanatory variable in this study is:

the amount of exposure to classical music. (The child's score on the state's proficiency exam is the response variable.)

A scatterplot can be used to illustrate the relationship between:

two quantitative variables

A statistic summarizes data from what?

A sample

Each month, the census bureau mails survey forms to 250,000 households. They ask questions about those living in the household and about things like motor vehicle and housing costs. In one month, responses were obtained from 240,000 of the households contacted. 1. What is the population of interest? 2. The sample is... 3. What proportion of those randomly sampled responded?

1. All U.S. Households 2. the 240,000 households that respond. 3. 96%

The regression equation below relates the scores students in an advanced statistics course received for homework completed and the subsequent midterm exam. Homework scores are based on assignments that preceded the exam. The maximum homework score a student could obtain was 500 and the maximum midterm score was 350. The regression line that was obtained is given by y-hat = −84.4 + 0.91x. If a student had a homework score of 420, the midterm score would be predicted to be (rounded to an integer)

298
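
The prediction can be checked by plugging the homework score into the regression line. This is a small sketch; the function name `predict_midterm` is hypothetical and only wraps the equation given in the card.

```python
def predict_midterm(homework_score: float) -> float:
    """Evaluate the card's regression line: y-hat = -84.4 + 0.91 * x."""
    intercept = -84.4
    slope = 0.91
    return intercept + slope * homework_score


# Homework score of 420 from the card: -84.4 + 0.91 * 420 = 297.8
predicted = predict_midterm(420)
print(round(predicted))
```

Rounding 297.8 to the nearest integer gives the card's answer of 298.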

Intrigued by a 2013 study at the University of Nebraska, which suggested that marijuana smokers may be thinner than other adults, a group of students did a project to explore the relationship between body mass index (BMI) and amount of time spent under the influence of marijuana (measured in hours per week). Based on a random sample of 33 students at their university, they used BMI as the explanatory variable. The equation of the least-squares regression line is: Hours per month under influence = 49.2 − 1.15 BMI Calculate the residual for a student with an observed BMI of 25 and an observed hrs/month under the influence of 18.8. (Round your answer to two decimal places.)

-1.65
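
A residual is the observed value minus the value predicted by the regression line. The sketch below works through the card's numbers; the function name `predicted_hours` is an illustrative assumption.

```python
def predicted_hours(bmi: float) -> float:
    """Regression line from the card: hours = 49.2 - 1.15 * BMI."""
    return 49.2 - 1.15 * bmi


observed = 18.8                # observed hours under the influence
fitted = predicted_hours(25)   # 49.2 - 1.15 * 25 = 20.45

# Residual = observed - predicted; negative means the point lies below the line
residual = observed - fitted
print(round(fitted, 2), round(residual, 2))
```

The negative residual means this student's observed hours fall below what the line predicts for a BMI of 25.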

A study of king penguins looked for a relationship between how deep the penguins dive to seek food and how long they stay under water. For all but the shallowest dives, there is a linear relationship that is different for different penguins. The study report gives a scatterplot for one penguin titled "The relation of dive duration (DD) to depth (D)." Duration DD is measured in minutes and depth D is in meters. The report then says, "The regression equation for this bird is: DD = 2.57 + 0.0148D." 1. What is the slope of the regression line? 2. Explain in specific language what this slope says about this penguin's dives. 3. According to the regression line, how long does a typical dive to a depth of 207 meters last?

1. 0.0148 2. The slope tells us the additional minutes spent under water, on average, if the depth of dive is increased by one meter. 3. 5.63 minutes
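
A short sketch of the penguin regression line, showing both the prediction at 207 meters and what the slope means. The function name `dive_duration` is hypothetical.

```python
def dive_duration(depth_m: float) -> float:
    """Regression line from the report: DD = 2.57 + 0.0148 * D (minutes)."""
    return 2.57 + 0.0148 * depth_m


# Predicted duration for a 207 m dive: 2.57 + 0.0148 * 207 = 5.6336 minutes
duration_207 = dive_duration(207)

# The slope is the change in predicted duration per extra meter of depth
extra_per_meter = dive_duration(208) - dive_duration(207)

print(round(duration_207, 2), round(extra_per_meter, 4))
```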

Match the following features with their letter from the graph.

A. Minimum (non-outlier) B. First Quartile C. Median D. Third Quartile E. Maximum (non-outlier) F. Outliers

The following is a scatter plot for profits versus sales (in tens of thousands of dollars) for a random sample of 11 large companies. The correlation between sales and profits is 0.949. Which of the following statements is FALSE? 1. There is a positive association between x and y 2. If we removed the outlier, r would not change much 3. If we removed the outlier and recreated the scatterplot, we would have the same basic impression of the relationship between sales and profit. 4. None

2

Which of the following statements is FALSE about simple linear regression? 1. The regression line will only model a straight-line relationship 2. It is not necessary to make distinction between the response variable and the explanatory variable. 3. The slope represents the average change in y with a change in x. 4. The explanatory variable can only be quantitative

2

Calculate the standard deviation for the given data set: 3.0, 3.5, 3.5, 4.1, 4.8, 5.2, 7.1, 11.2

2.71
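
The answer 2.71 corresponds to the sample standard deviation, which divides by n − 1. The sketch below computes it step by step and compares it with `statistics.stdev`; the variable names are illustrative.

```python
import math
import statistics

data = [3.0, 3.5, 3.5, 4.1, 4.8, 5.2, 7.1, 11.2]

# Step 1: the mean
mean = sum(data) / len(data)

# Step 2: the sum of squared deviations from the mean
squared_deviations = sum((x - mean) ** 2 for x in data)

# Step 3: divide by n - 1 (sample variance), then take the square root
sample_sd = math.sqrt(squared_deviations / (len(data) - 1))

# The standard library gives the same sample standard deviation
library_sd = statistics.stdev(data)

print(round(sample_sd, 2), round(library_sd, 2))
```

Dividing by n instead (the population formula, `statistics.pstdev`) would give about 2.54, so the card is using the sample formula.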

Intrigued by a 2013 study at the University of Nebraska, which suggested that marijuana smokers may be thinner than other adults, a group of students did a project to explore the relationship between body mass index (BMI) and amount of time spent under the influence of marijuana (measured in hours per week). Based on a random sample of 33 students at their university, they used BMI as the explanatory variable. The equation of the least-squares regression line is: Hours per month under influence = 49.2 − 1.15 BMI For a student with a BMI of 25, what is the predicted number of hours under the influence? (Round your answer to two decimal places.)

20.45

Histogram

A diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval. Divides data into classes or "bins" of equal widths. Bar heights are equivalent to the number of observations in each "bin." For quantitative data.

Bar Chart

A form of graph in which numeric values are represented by horizontal or vertical rectangles. Category counts must add to total or percents must add to 100%. Categorical data only.

Box Plot

A graphic way of showing a summary of data using the median, quartiles, and extremes of the data.

Stemplot

A graphical representation of a quantitative data set. Leading values of each data point are presented as stems and second digits are given as leaves.

A parameter summarizes data from what?

A population

Inference is used to make conclusions about a characteristic from what?

A population

Suppose you have the following data: 3 3 4 5 7 8 The mean and standard deviation are 5 and 2.1. If you were to subtract 2 from every observation, what change, if any, would you see in the transformed mean and standard deviation?

For the transformed data, the mean would decrease but the standard deviation would stay the same.
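
A quick check of this shift rule using the card's data. Subtracting a constant moves every value by the same amount, so the center shifts while the spread does not change. Variable names are illustrative.

```python
import statistics

data = [3, 3, 4, 5, 7, 8]

# Subtract 2 from every observation
shifted = [x - 2 for x in data]

mean_before = statistics.mean(data)
mean_after = statistics.mean(shifted)

# Sample standard deviation before and after the shift
sd_before = statistics.stdev(data)
sd_after = statistics.stdev(shifted)

print(mean_before, mean_after)
print(round(sd_before, 1), round(sd_after, 1))
```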

Categorical Variable

Places an individual (an object described by data) into one of several groups or categories. These cannot be described with arithmetic. Display types: Pie chart, bar graph.

An apple farmer samples 300 apples from a large cart and weighs each of them. The variable of apple weight is? Categorical or Quantitative?

Quantitative

Identify which of the following is the explanatory variable and which is the response. How sensitive to changes in water temperature are coral reefs? To find out, we can measure the growth of corals in aquariums where the water temperature is controlled at different levels. Growth is measured by weighing the coral before and after the experiment.

The explanatory variable is changes in water temperature. The response variable is growth of corals.

Intrigued by a 2013 study at the University of Nebraska, which suggested that marijuana smokers may be thinner than other adults, a group of students did a project to explore the relationship between body mass index (BMI) and amount of time spent under the influence of marijuana (measured in hours per week). Based on a random sample of 33 students at their university, they used BMI as the explanatory variable. The equation of the least-squares regression line is: Hours per month under influence = 49.2 − 1.15 BMI Which of the following would be a valid conclusion to draw from the study?

There is a weak negative relationship between hours per month under the influence and BMI

In an experiment what is the combination of factors imposed on an experimental unit?

Treatment

True or False? Standard deviation is inflated by outliers.

True
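
A small illustration, reusing the data set from the standard deviation card and treating its largest value, 11.2, as the outlier (an assumption made only for this example). Dropping that one value cuts the standard deviation roughly in half, while the median barely moves.

```python
import statistics

data = [3.0, 3.5, 3.5, 4.1, 4.8, 5.2, 7.1, 11.2]

# Same data with the unusually large value 11.2 removed
without_outlier = [x for x in data if x != 11.2]

sd_with = statistics.stdev(data)
sd_without = statistics.stdev(without_outlier)

# The median is a resistant measure, so it changes much less
median_with = statistics.median(data)
median_without = statistics.median(without_outlier)

print(round(sd_with, 2), round(sd_without, 2))
print(round(median_with, 2), round(median_without, 2))
```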

What type of issue might we expect if we survey only undergraduate students but want to make inferences about all Oregon State students?

Undercoverage

The magazine High Times has a website that once asked visitors whether recreational marijuana use should be legal. This is an example of

Voluntary response sampling

We often describe our emotional reaction to social rejection as "pain." A clever study asked whether social rejection causes activity in areas of the brain that are known to be activated by physical pain. If it does, we really do experience social and physical pain in similar ways. Subjects were first included and then deliberately excluded from a social activity while changes in brain activity were measured. After each activity, the subjects filled out questionnaires that assessed how excluded they felt. A scatterplot shows a moderately strong linear relationship. The figure below shows regression output from software for these data. From the software output what is the least squares regression equation?

brain = -0.1261+0.0608(distress)

