Introduction to Probability and Statistics

In a class of 25 students, the following scores were received on a quiz: 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9. For this distribution, what is the most appropriate measure of central tendency (the middle)?

Either the mean or the median
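
As a quick check on this answer, the mean and median of the quiz scores can be computed directly. This is a minimal sketch using Python's standard statistics module; the variable names are illustrative. Both values come out to 6, which is consistent with either measure being appropriate here.

```python
from statistics import mean, median

# Quiz scores copied from the question above
scores = [3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6,
          7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9]

quiz_mean = mean(scores)      # sum of 150 divided by 25 scores
quiz_median = median(scores)  # the 13th of the 25 sorted scores

print(quiz_mean, quiz_median)
```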

Mean

The average of a set of observations.

Two groups of shoppers, group A and group B, were asked how much time they spent in the store on their last shopping trip. The following side-by-side box plots display the results of the survey. What can be concluded from these box plots?

The range for group A is smaller than the range for group B. The range runs from the minimum to the maximum, both of which are part of the five-number summary.

The following box plots show the time in seconds that it takes assembly-line workers on three different shifts to assemble a part. Which shift has the greatest interquartile range?

The three shifts have the same interquartile range, because the IQR measures only the middle section of the data (the box), which is the same for all three.

A study is conducted to determine if there is a difference in final exam scores in high school classrooms when different types of instruction are used. The two types of instruction included in the study are direct instruction and computer-based instruction. What are the explanatory (X) and response (Y) variables for this study?

The type of instruction is the explanatory variable (X) and the test score is the response variable (Y).

Midpoint

The value that divides the distribution so that approximately half the observations take smaller values, and approximately half the observations take larger values.

Over the past 50 years for a particular city there was a fairly strong positive linear relationship between its crime rate and the number of churches it has. Which two conclusions can be made from this? Choose 2 answers.

The variable "number of churches" can be used only to predict the "crime rate." The population increase for the city over the past 50 years is a lurking variable that could cause both the crime rate and the number of churches to increase.

IQR for outliers

An observation is a suspected outlier if it is less than Q1 - 1.5(IQR) or more than Q3 + 1.5(IQR).
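
The 1.5(IQR) rule can be sketched in a few lines. This is a minimal illustration, not a library routine: the function names and the example quartiles (Q1 = 10, Q3 = 20) are hypothetical, and Q1 and Q3 are passed in directly because different textbooks compute quartiles in slightly different ways.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_suspected_outlier(x, q1, q3):
    """True if x lies strictly below the lower or above the upper cutoff."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


# Hypothetical quartiles, chosen only to illustrate the rule
lower, upper = outlier_fences(10, 20)
print(lower, upper)
print(is_suspected_outlier(40, 10, 20))
```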

Five number summary

Minimum, Q1, median, Q3 and maximum

Visitors at a hospital were asked whether they ate at the hospital cafeteria, a restaurant, or brought a sack lunch. The results are shown in the bar graph. What percentage of visitors brought a sack lunch?

21%. Divide the 123 visitors who brought a sack lunch by the total number of visitors: 123 ÷ (208 + 256 + 123) = 123 ÷ 587 ≈ 0.21.
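
The percentage is the sack-lunch count divided by the total of all three bars. The sketch below assumes, as the answer states, that 123 is the sack-lunch count; the card does not say which of the other two counts belongs to the cafeteria and which to the restaurant, so they are kept as an unlabeled list.

```python
sack_lunch = 123
other_counts = [208, 256]  # cafeteria and restaurant bars (order not given on the card)

total = sack_lunch + sum(other_counts)
sack_lunch_pct = 100 * sack_lunch / total

print(f"{sack_lunch_pct:.1f}% of {total} visitors brought a sack lunch")
```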

Box Plot

A graph that displays the highest and lowest quarters of data as whiskers, the middle two quarters of the data as a box, and the median.

The following graph for Yellowstone National Park's Old Faithful geyser shows the relationship between the duration of the geyser's eruption (x) and the wait time until the next eruption (y). Both times are in minutes. The least squares regression line for the data is y = 11.80x + 31.68, with a correlation coefficient r = 0.93. Which two statements are true about this situation? Choose 2 answers.

An eruption duration of 2.4 minutes predicts a wait time of 60.0 minutes. There is a strong linear relationship between the wait time and eruption duration for Old Faithful geyser.
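
The predicted wait time follows from plugging the duration into the regression line given in the question. A minimal sketch, with an illustrative function name:

```python
def predict_wait(duration_min):
    """Predicted wait (minutes) from the line y = 11.80x + 31.68."""
    return 11.80 * duration_min + 31.68


predicted = predict_wait(2.4)  # 11.80 * 2.4 = 28.32, plus 31.68
print(round(predicted, 1))
```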

A student surveys 200 people in a statistics lecture class to find out how many hours per night the class members typically spend sleeping. The data showed a normal distribution with a mean of 6.8 hours per night and a standard deviation of 0.75 hours. What range of hours of sleep covers the middle 68% of the data?

Between 6.05 and 7.55 hours
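
For a normal distribution, the middle 68% lies within about one standard deviation of the mean. The sketch below computes that interval and, as a sanity check, uses the standard library's NormalDist to confirm the share of the distribution it covers; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 6.8, 0.75  # mean and standard deviation from the question

low = mu - sigma   # one standard deviation below the mean
high = mu + sigma  # one standard deviation above the mean

# Share of a normal(6.8, 0.75) distribution between low and high
sleep = NormalDist(mu=mu, sigma=sigma)
share = sleep.cdf(high) - sleep.cdf(low)

print(round(low, 2), round(high, 2), round(share, 3))
```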

Inter-Quartile Range

Measures the variability of a distribution by giving the range covered by the MIDDLE 50% of the data. Q1 is the median of the lower half of the data and Q3 is the median of the upper half. IQR = Q3 - Q1.
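
The median-of-halves method can be sketched as follows. This assumes one common convention: when the number of observations is odd, the overall median is left out of both halves. Other textbooks and software use slightly different quartile rules, so results may differ. The function name is hypothetical, and the data are the 25 quiz scores from the first card in this set.

```python
from statistics import median


def quartiles_by_halves(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of values, the middle value is
    excluded from both halves.
    """
    xs = sorted(data)
    n = len(xs)
    lower_half = xs[: n // 2]
    upper_half = xs[(n + 1) // 2:]
    return median(lower_half), median(upper_half)


scores = [3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6,
          7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9]
q1, q3 = quartiles_by_halves(scores)
iqr = q3 - q1
print(q1, q3, iqr)
```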

The faculty at a large university wanted to know what proportion of the students at the university thought foreign language classes should be required. The statistics department offered to cooperate in conducting a survey, and a simple random sample of 500 students was selected from all the students enrolled in statistics classes. A survey form was sent by email to these 500 students. Which statement is true about this study?

Even though the sample is random, it is not representative of the population of interest.

A local ice cream shop kept track of the number of cans of cold carbonated beverages it sold and the temperature for each day during two months of the summer. The data are displayed in the following scatterplot. The one outlier corresponds to a day the refrigerator for the carbonated beverages was broken. Which statement is true about this scenario?

If the outlier were removed, r would increase.

Median

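Midpoint of the distribution. With the n observations sorted: if n is odd, the median is the value at position (n+1)/2; if n is even, the median is the average of the values at positions n/2 and n/2 + 1.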
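
The position rule for the median can be written out directly. A minimal sketch with an illustrative function name; positions on the card are 1-based, so each is shifted by one for Python's 0-based indexing.

```python
def median_by_position(data):
    """Median via the (n+1)/2 rule for odd n, or the two middle values for even n."""
    xs = sorted(data)
    n = len(xs)
    if n % 2 == 1:
        # Odd n: the value at 1-based position (n + 1) / 2
        return xs[(n + 1) // 2 - 1]
    # Even n: average of the values at 1-based positions n/2 and n/2 + 1
    return (xs[n // 2 - 1] + xs[n // 2]) / 2


print(median_by_position([5, 1, 3]))     # odd n
print(median_by_position([4, 1, 3, 2]))  # even n
```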

A study compared the overall college GPAs (upon graduation) of 1,000 traditional students who took math their entire senior year of high school and 1,000 traditional students who did not take math during their senior year of high school. The numerical summary of the GPA data is as follows:

Group                             Minimum   Q1    Median   Q3    Maximum
Took math during senior year      2.3       2.7   3.0      3.5   4.0
Took no math during senior year   1.9       2.3   2.6      3.1   3.9

Using side-by-side box plots, which statement is true about the interpretation of the data?

It appears that students who took math as seniors in high school graduated college with higher GPAs than those who did not take math, but there is also greater spread in the GPAs of college students who did not take math as high school seniors.
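
The comparison can be checked from the two five-number summaries in the question. In this sketch (the dictionary layout and names are illustrative), the no-math group has the lower median and the larger range, while the two interquartile ranges come out equal, so the "greater spread" in the answer appears to refer to the overall range.

```python
# Five-number summaries from the question: (min, Q1, median, Q3, max)
summaries = {
    "math": (2.3, 2.7, 3.0, 3.5, 4.0),
    "no_math": (1.9, 2.3, 2.6, 3.1, 3.9),
}

spread = {}
for group, (lo, q1, med, q3, hi) in summaries.items():
    spread[group] = {
        "median": med,
        "range": round(hi - lo, 2),  # rounded to hide floating-point noise
        "iqr": round(q3 - q1, 2),
    }

for group, stats in spread.items():
    print(group, stats)
```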

In a class of 25 students, the following test scores were obtained: 51, 59, 59, 59, 67, 68, 68, 69, 69, 71, 73, 73, 75, 75, 78, 79, 82, 84, 85, 85, 87, 91, 92, 93, 93 A histogram of interval width 10 starting with the interval [50, 60] was constructed using the above data. Which statement is true about this distribution as depicted by the histogram?

It is a unimodal symmetric distribution
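
Counting the scores in each interval shows the shape without drawing the histogram. This sketch assumes each interval includes its left endpoint and excludes its right one (50 to 59, 60 to 69, and so on); no score in this data set falls exactly on a boundary, so the assumption does not change the counts.

```python
from collections import Counter

scores = [51, 59, 59, 59, 67, 68, 68, 69, 69, 71, 73, 73, 75,
          75, 78, 79, 82, 84, 85, 85, 87, 91, 92, 93, 93]

# Map each score to the start of its width-10 interval, then count
bin_counts = Counter((s // 10) * 10 for s in scores)
counts = [bin_counts[start] for start in range(50, 100, 10)]

for start, count in zip(range(50, 100, 10), counts):
    print(f"{start}-{start + 9}: {'*' * count}")
```

The counts rise to a single peak in the 70s and mirror each other on either side, which matches the "unimodal symmetric" description.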

A correspondent for a news program appears evenings on television throughout the country. This correspondent asked viewers to call in with their answer to this question: "If you had it to do all over again, would you vote for the president?" Of the more than 10,000 viewers who responded, 70% said no. What can be concluded about the viewers?

No meaningful conclusion is possible due to voluntary response bias.

High school seniors in a certain city are to be surveyed regarding their post-high school plans. What would be a cluster sample for this study?

Obtain a list of all high schools in the city, choose four high schools at random, then survey all seniors at the four high schools.

Mode

A distribution with one mode (peak) is unimodal; a distribution with two modes (peaks) is bimodal.

A local ice cream shop kept track of the number of cans of cold carbonated beverages it sold each day (Y) and the temperature that day (X) for 2 months during the summer. The data are displayed in the following scatterplot. Which statement describes the relationship between X and Y as it appears in the scatterplot?

Positive linear relationship with outlier(s)

A small student group at a large high school wants to know the opinions of students regarding a new school policy. Which survey method is likely to produce the least amount of bias?

Randomly select several classes from all classes at the school, then randomly select 10 students from each of these classes to survey.

School administrators wish to determine if a student's score on a standardized statistics exam (out of 100 points possible) is related to or affected by the method of instruction (face-to-face classroom, typical online course, or massive open online course). The massive open online course (MOOC) had the largest spread and the typical online course had the highest median. Which graph represents this?

See picture

A store asked 350 of its customers whether they were satisfied with their service. The responses were also classified according to the gender of the customers. If a person wanted to study whether the level of satisfaction is related to or affected by gender, what is the appropriate table of conditional probabilities?

See the picture

A university wants to poll students regarding their opinions about a new policy. It is expected that men and women will have different views on the policy. Fifty-five percent of students at the university are men and 45% are women. Which sampling method should be used in this situation?

Stratified sampling

When describing the shape of a distribution, we should consider

Symmetry or skewness of the distribution, and peakedness (modality): the number of peaks the distribution has.

Spread

The distribution can be described by the approximate range covered by the data. Minimum and Maximum.

In a class of 25 students, the following test scores were obtained: 31, 46, 49, 52, 55, 67, 68, 68, 69, 69, 71, 73, 73, 75, 75, 78, 79, 82, 84, 84, 85, 87, 91, 92, 97 Which statement is true about this distribution?

The distribution is skewed to the left
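
One informal check for skew is to compare the mean with the median and to look at how far each tail reaches. The sketch below is a rough heuristic rather than a formal test: a mean below the median, together with a longer low tail, is a common sign of left skew. The variable names are illustrative.

```python
from statistics import mean, median

scores = [31, 46, 49, 52, 55, 67, 68, 68, 69, 69, 71, 73, 73,
          75, 75, 78, 79, 82, 84, 84, 85, 87, 91, 92, 97]

score_mean = mean(scores)
score_median = median(scores)

# Distance from the median to each extreme
low_tail = score_median - min(scores)
high_tail = max(scores) - score_median

print(score_mean, score_median, low_tail, high_tail)
```

Here the few low scores (31, 46, 49) pull the mean below the median and stretch the lower tail.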

Customers were asked to rank their satisfaction on a scale of 1 to 5. The following are the results: 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5 Which diagram correctly shows a histogram of the data set and includes the correct description of its shape?

The distribution is symmetric. The satisfaction ratings are placed on the x-axis and the count for each rating is on the y-axis.

Observational study

The explanatory variable's values are allowed to occur naturally. Because of the possibility of lurking variables, it is difficult to establish causation. If possible, control for suspected lurking variables by studying groups of similar individuals separately. Some lurking variables are difficult to control for; others may not be identified.

Experiments

The explanatory variable's values are controlled by researchers. Randomized assignment to treatments automatically controls for all lurking variables. Making subjects blind avoids the placebo effect. Making researchers blind avoids conscious or subconscious influences on their subjective assessment of responses. A randomized controlled double-blind experiment is generally optimal for establishing causation. A lack of realism may prevent researchers from generalizing experimental results to real-life situations. Noncompliance may undermine an experiment; a volunteer sample might solve this problem. It is impossible, impractical or unethical to impose some treatments.

A group of kindergartners were asked their favorite color. The results are summarized in the two bar graphs. What is the difference between the two graphs?

The first graph displays the number of students favoring each color while the second displays the percentages for each color.

The following five-number summaries were obtained for two distributions, A and B:
A: 3.2, 4.9, 5.7, 5.9, 8.2
B: 2.7, 5.1, 5.7, 6.9, 7.5
The distribution of A is skewed to the right, whereas the distribution of B is skewed to the left. How are the means of the two distributions related?

The mean of A is higher than the mean of B

At the same time, one group of patients drank a glass of orange juice (OJ) while a second group of patients took a vitamin C capsule (VC). The following box plots show the percentage of remaining vitamin C in the patients' bloodstream after 24 hours for the two groups. How should the results of the study be interpreted?

The median level of vitamin C in the bloodstream is higher for those who drank orange juice.

Data on the acceptance rates for professional programs at a university were collected over a five-year period. The results are summarized in the following two-way table: From the table, it can be calculated that approximately 44% of male applicants were accepted, while 39% of female applicants were accepted. Similarly, the acceptance rates were calculated for both men and women for three of the university's professional programs and are summarized below. Which statement describes this data?

The variable "program" might be a lurking variable since the percentage of female applicants accepted is smaller than that of males overall, but when the program is considered, there is a reversal in the association.

The following table shows the age of a child in months (x) and the hours of sleep the child needs during nap time each day (y). The equation of the least squares regression line is y = -0.1351x + 5.2787. Which two statements about the situation are true?

There is a strong curvilinear relationship between the two variables. The regression line predicts that an 11-month-old child would need about 3.79 hours of nap time each day.
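
The 3.79-hour figure comes from substituting x = 11 into the regression line from the question. A minimal sketch, with an illustrative function name:

```python
def predict_nap_hours(age_months):
    """Predicted daily nap hours from the line y = -0.1351x + 5.2787."""
    return -0.1351 * age_months + 5.2787


predicted = predict_nap_hours(11)  # -1.4861 + 5.2787
print(round(predicted, 2))
```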

Overall, Airline A completes 78% of its flights on time, while Airline B completes 68% of its flights on time. When examining the percentage of on-time flights for each airline based on the weather conditions of the flight destinations, the given data is found. Which two statements are true about this scenario? Choose 2 answers.

This is an example of Simpson's paradox.

Why is it important for everyone in a population to have an equal chance of being selected for the sample when a survey is conducted?

To avoid bias and to get a representative sample

