Statistics Midterm

significantly low or significantly high

A data value is considered​ _______ if its​ z-score is less than −2 or greater than 2.
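
The cutoff in this card can be sketched as a small check. The function name `classify_z` is hypothetical and chosen only for illustration; the thresholds of −2 and 2 come from the card itself.

```python
def classify_z(z: float) -> str:
    """Apply the card's rule: z < -2 is significantly low, z > 2 is significantly high."""
    if z < -2:
        return "significantly low"
    if z > 2:
        return "significantly high"
    return "neither significantly low nor significantly high"


# 4.81 is the basketball player's z score from a later card.
print(classify_z(4.81))
print(classify_z(-2.5))
print(classify_z(1.0))
```

Note that a z score of exactly −2 or 2 falls in the "neither" category under this rule, because the inequalities are strict.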

The term average is not used in statistics. The term mean should be used for the result obtained by adding all of the sample values and dividing by the total number of sample values.

A defunct website listed the​ "average" annual income for Florida as​ $35,031. What is the role of the term average in​ statistics? Should another term be used in place of​ average?

​No, the value of 27.3 cents is not the mean because the 50 amounts are all weighted equally in the​ calculation, but some states consume more gas than​ others, so the mean amount of state sales tax should be calculated using a weighted mean.

A magazine published a list consisting of the state tax on each gallon of gas. If we add the 50 state tax amounts and then divide by​ 50, we get 27.3 cents. Is the value of 27.3 cents the mean amount of state sales tax paid by all U.S.​ drivers? Why or why​ not?
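
A minimal sketch of why the weighted mean differs from the simple mean. The three tax amounts and consumption weights below are made up for illustration (they are not from the magazine's list), and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical data: tax per gallon (cents) and relative gas consumption.
taxes_cents = [20.0, 30.0, 40.0]
gallons = [1.0, 1.0, 8.0]

unweighted = sum(taxes_cents) / len(taxes_cents)   # every state counts equally
weighted = weighted_mean(taxes_cents, gallons)     # big consumers count more

print(unweighted, weighted)
```

Because the hypothetical high-tax state consumes the most gas, the weighted mean is pulled above the simple mean.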

Either the sample has more than 30 grade-point averages, or the population of grade-point averages has a normal distribution.

A researcher collects a simple random sample of​ grade-point averages of statistics​ students, and she calculates the mean of this sample. Under what conditions can that sample mean be treated as a value from a population having a normal​ distribution?

a. is not; is b. is; is c. is not; is not

A simple random sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen. (A simple random sample is often called a random sample, but strictly speaking, a random sample has the weaker requirement that all members of the population have the same chance of being selected.) Determine whether each of the following is a simple random sample and a random sample. Complete parts a through c. a. In Major League Baseball, there are 30 teams, each with an active roster of 25 players. The names of the teams are printed on 30 separate index cards, the cards are shuffled, and one card is drawn. The sample consists of the 25 players on the active roster of the selected team. This sample ______ a simple random sample. It ______ a random sample. b. For the same Major League Baseball population described in part (a), the 750 names of the players are printed on 750 separate index cards, and the cards are shuffled. Twenty-five different cards are selected from the top. The sample consists of the 25 selected players. This sample ______ a simple random sample. It ______ a random sample. c. For the same Major League Baseball population described in part (a), a sample is constructed by selecting the 25 youngest players. This sample ______ a simple random sample. It ______ a random sample.

4.81

A successful basketball player has a height of 6 feet 10 ​inches, or 208 cm. Based on statistics from a data​ set, his height converts to the z score of 4.81. How many standard deviations is his height above the​ mean? The player's height is ________ standard deviation(s) above the mean.

The minimum radiation would be a particularly helpful​ statistic, but none of these statistics is helpful for selecting a cell phone for purchase.

Are any of the resulting statistics helpful in selecting a cell phone for​ purchase?

The number of girls is significantly low.

Assume that 900 births are randomly selected and 4 of the births are girls. Use subjective judgment to describe the number of girls as significantly​ high, significantly​ low, or neither significantly low nor significantly high.

Although actresses include the oldest​ age, the boxplot representing actresses shows that they have ages that are generally lower than those of actors.

Compare the two boxplots. Choose the correct answer below.

The data are quantitative because they consist of counts or measurements.

Determine whether the data described below are qualitative or quantitative and explain why. The lengths (in minutes) of movies.

Observational study

Determine whether the description corresponds to an observational study or an experiment. Research is conducted to determine if there is a relation between colon cancer and fat consumption. Does the description correspond to an observational study or an​ experiment?

Statistic because the value is a numerical measurement describing a characteristic of a sample.

Determine whether the underlined number is a statistic or a parameter. A sample of professors is selected and it is found that 50% own a vehicle.

Parameter because the value is a numerical measurement describing a characteristic of a population.

Determine whether the underlined number is a statistic or a parameter. In a study of all 1541 seniors at a college, it is found that 55% own a computer.

Yes, because the frequencies start​ low, proceed to one or two high​ frequencies, then decrease to a low​ frequency, and the distribution is approximately symmetric.

Does the frequency distribution appear to have a normal​ distribution? Explain. Chart (with Temperature degrees F and Frequency) (#25)

​No, there does not appear to be a correlation because there is no general pattern to the data.

Does there appear to be a correlation between the​ president's height and his​ opponent's height? (#37)

All of the weights end in​ 00, so they all appear to be rounded to the nearest 100 grams. This suggests that the mean and median should also be rounded.

Examine the list of birth weights to make an observation about those numbers. How does that observation affect the way that the results should be​ rounded? (#41)

P(56 or more girls); is not; greater than

For 100 births, P(exactly 56 girls)=0.0390 and P(56 or more girls)=0.136. Is 56 girls in 100 births a significantly high number of girls? Which probability is relevant to answering that question? Consider a number of girls to be significantly high if the appropriate probability is 0.05 or less. The relevant probability is ______, so 56 girls in 100 births ______ a significantly high number of girls because the relevant probability is ______ 0.05.
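
The two probabilities quoted in this card can be reproduced with a binomial model. This is a sketch that assumes girls and boys are equally likely (p = 0.5), which the card does not state explicitly; the helper names `binom_pmf` and `binom_tail` are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(k or more successes in n independent trials)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


p_exact = binom_pmf(56, 100)    # P(exactly 56 girls)
p_tail = binom_tail(56, 100)    # P(56 or more girls): the relevant probability

# Significantly high only if the "or more" probability is 0.05 or less.
is_significantly_high = p_tail <= 0.05

print(round(p_exact, 4), round(p_tail, 3), is_significantly_high)
```

The decision uses the "56 or more" probability rather than the "exactly 56" probability, because any single exact count is unlikely when there are many possible outcomes.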

(look at the number of males to find critical values & don't forget the positive and negative sign!!!) ±0.950; in the right tail above the positive critical value; is

For a data set of brain volumes (cm³) and IQ scores of four males, the linear correlation coefficient is r=0.975. Use the table available below to find the critical values of r. Based on a comparison of the linear correlation coefficient r and the critical values, what do you conclude about a linear correlation? The critical values are ______. Since the correlation coefficient r is ______, there ______ sufficient evidence to support the claim of a linear correlation. (#38)

(look at the number of males to find critical values & don't forget the positive and negative sign!!!) ±0.576; between the critical values; is not

For a data set of brain volumes (cm³) and IQ scores of twelve males, the linear correlation coefficient is r=0.132. Use the table available below to find the critical values of r. Based on a comparison of the linear correlation coefficient r and the critical values, what do you conclude about a linear correlation? The critical values are ______. Since the correlation coefficient r is ______, there ______ sufficient evidence to support the claim of a linear correlation.
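
The comparison used in the two brain-volume cards can be sketched as follows. The critical values 0.950 (four males) and 0.576 (twelve males) are the table lookups quoted in the answers above; the function name `supports_linear_correlation` is hypothetical.

```python
def supports_linear_correlation(r: float, critical: float) -> bool:
    """True when r lies in either tail beyond the critical values -critical and +critical."""
    return abs(r) > critical


# n = 4 males: r = 0.975 against critical values of +/-0.950
print(supports_linear_correlation(0.975, 0.950))

# n = 12 males: r = 0.132 against critical values of +/-0.576
print(supports_linear_correlation(0.132, 0.576))
```

Taking the absolute value handles both signs at once, which is why the critical values are always written with a ± sign.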

There appears to be an upward​ trend, unlike​ drive-in movie​ theaters, which have a downward trend.

Given below are the numbers of indoor movie theaters, listed in order by row for each year. Use the given data to construct a time-series graph. What is the trend? How does this trend compare to the trend for drive-in movie theaters? (#35)

Chebyshev's Theorem

Go over #47.

Bell-shaped

Heights of adult males are normally distributed. If a large sample of heights of adult males is randomly selected and the heights are illustrated in a​ histogram, what is the shape of that​ histogram?

ordinal; Such data should not be used for calculations such as an average​ (mean).

Identify the level of measurement of the data, and explain what is wrong with the given calculation. In a set of data, car rankings are represented as 10 for first, 20 for second, and 30 for third. The average (mean) of the 692 car rankings is 25.4. The data are at the ______ level of measurement. What is wrong with the given calculation?

nominal; Such data are not counts or measures of​ anything, so it makes no sense to compute their average​ (mean).

Identify the level of measurement of the data, and explain what is wrong with the given calculation. In a survey, the favorite foods of respondents are identified as 100 for Italian food, 200 for Mexican food, 300 for Chinese food, and 400 for anything else. The average (mean) is calculated for 785 respondents and the result is 256.1. The data are at the ______ level of measurement. What is wrong with the given calculation?

nominal; Such data are not counts or measures of​ anything, so it makes no sense to compute their average​ (mean).

Identify the level of measurement of the​ data, and explain what is wrong with the given calculation. In a​ survey, the hair colors of respondents are identified as 0 for brown hair, 1 for blond hair, 2 for black hair, and 3 for anything else. The average​ (mean) is calculated for 598 respondents and the result is 1.1. The data are at the __________ level of measurement. What is wrong with the given​ calculation?

nominal; Such data are not counts or measures of​ anything, so it makes no sense to compute their average​ (mean).

Identify the level of measurement of the​ data, and explain what is wrong with the given calculation. In a​ survey, the responses of respondents are identified as 0 for a "yes", 1 for a "no", 2 for a "maybe", and 3 for anything else. The average​ (mean) is calculated for 693 respondents and the result is 1.1. The data are at the _________ level of measurement. What is wrong with the given​ calculation?

cross-sectional

Identify the type of observational study. A researcher plans to obtain data by interviewing offspring of victims who perished in a bombing to see how they're coping now.

Systematic sampling

Identify the type of sampling used​ (random, systematic,​ convenience, stratified, or cluster​ sampling) in the situation described below. A researcher selects every 240th social security number and surveys the corresponding person. Which type of sampling did the researcher ​use?

Stratified

Identify which of these types of sampling is​ used: random,​ systematic, convenience,​ stratified, or cluster. To determine her blood pressure​, Miranda divides up her day into three​ parts: morning,​ afternoon, and evening. She then measures her blood pressure at 4 randomly selected times during each part of the day. What type of sampling is​ used?

The z score of 2.00 is most preferable because it is 2.00 standard deviations above the mean and would correspond to the highest of the five different possible test scores.

If your score on your next statistics test is converted to a z​ score, which of these z scores would you​ prefer: −​2.00, −1.00, ​0, 1.00,​ 2.00? Why?

Yes, it is reasonably close.

In a genetics experiment on peas, one sample of offspring contained 429 green peas and 162 yellow peas. Based on those results, estimate the probability of getting an offspring pea that is green. Is the result reasonably close to the value of 3/4 that was expected? (#71)
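
The estimate asked for in this card is a relative frequency. A quick sketch using the counts from the card (the variable names are illustrative only):

```python
green = 429
yellow = 162

# Relative-frequency estimate of P(green).
p_green = green / (green + yellow)

# Distance from the expected value of 3/4.
difference = abs(p_green - 3 / 4)

print(round(p_green, 3), round(difference, 3))
```

Whether the gap counts as "reasonably close" is a judgment call; the card's answer treats it as close.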

The group sample sizes are all large so the researchers could see the effects of the treatment.

In a study designed to test the effectiveness of a medication as a treatment for lower back​ pain, 1643 patients were randomly assigned to one of three​ groups: (1) the 547 subjects in the placebo group were given pills containing no​ medication; (2) 550 subjects were in a group given pills with the medication taken at regular​ intervals; (3) 546 subjects were in a group given pills with the medication to be taken when needed for pain relief. In what specific way was replication applied in the​ study?

The subjects in the study did not know whether they were taking a placebo or the new​ medication, and those who administered the pills also did not know.

In a​ double-blind experiment designed to test the effectiveness of a new medication as a treatment for lower back​ pain, 1643 patients were randomly assigned to one of three​ groups: (1) the 547 subjects in the placebo group were given pills containing no​ medication; (2) 550 subjects were in a group given pills with the new medication taken at regular​ intervals; (3) 546 subjects were in a group given pills with the new medication to be taken when needed for pain relief. What does it mean to say that the experiment was​ "double-blind"?

outlier

In modified​ boxplots, a data value is​ a(n) _______ if it is above Q3+​(1.5)(IQR) or below Q1−​(1.5)(IQR).
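
The fences in this card can be written out directly. This sketch takes Q1 and Q3 as given, since the way quartiles are computed varies between textbooks and software; the function names and the quartile values in the usage lines are hypothetical.

```python
def outlier_fences(q1: float, q3: float) -> tuple[float, float]:
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x: float, q1: float, q3: float) -> bool:
    """A value is an outlier if it lies strictly outside the fences."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high


# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
print(outlier_fences(10, 20))
print(is_outlier(36, 10, 20))
print(is_outlier(35, 10, 20))
```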

The term linear refers to a straight​ line, and r measures how well a scatterplot fits a​ straight-line pattern.

In this section we use r to denote the value of the linear correlation coefficient. Why do we refer to this correlation coefficient as being​ linear?

discrete; there are a finite number of values

Is the random variable given in the accompanying table discrete or continuous? Explain. The random variable given in the accompanying table is ______ because ______.

The probability that the polygraph indicates lying given that the subject is actually telling the truth.

Let event A=subject is telling the truth and event B=polygraph test indicates that the subject is lying. Use your own words to translate the notation P(B|A) into a verbal statement.

The data set is too small for a dotplot to reveal important characteristics of the data. A​ time-series graph would be most​ effective, since the data are listed in order over a period of several years.

Listed below are the numbers of unprovoked shark attacks worldwide for the last several years. Why is it that a dotplot of these data would not be very effective in helping us understand the data? Which of the following graphs would be most effective for these data: dotplot, stemplot, time-series graph, Pareto chart, pie chart, frequency polygon? 70, 54, 68, 82, 79, 83, 76, 73, 98, 81

descriptive

Methods used that summarize or describe characteristics of data are called​ _______ statistics.

It appears that weights of U.S. Army males increased from 1983 to 2020. (it always increases no matter what the years are; if not sure, look at graph #51)

Refer to the accompanying boxplots that are drawn on the same scale. The top boxplot represents weights​ (kg) of a sample of male U.S. Army personnel in 1983​, and the bottom boxplot represents weights​ (kg) of a sample of male U.S. Army personnel in 2020. What story is told by these​ boxplots?

No. The data values in each class could take on any value between the class​ limits, inclusive.

Refer to the table summarizing service times (seconds) of dinners at a fast food restaurant. How many individuals are included in the summary? (Add the frequencies in the chart to get the number of individuals included in the summary.) Is it possible to identify the exact values of all of the original service times?

The study is an experiment because subjects were given treatments.

Researchers conducted a study to determine whether magnets are effective in treating back pain. Pain was measured using the visual analog scale, and the results given below are among the results obtained in the study. Higher scores correspond to greater pain levels. Is this study an experiment or an observational study? Explain. Reduction in Pain Level After Magnet Treatment: n=20, x̄=0.485, s=0.963. Reduction in Pain Level After Sham Treatment: n=20, x̄=0.435, s=1.41.

is not; greater than (first get the range rule of thumb estimate of s in cm, then take the absolute value of the difference between that estimate and the standard deviation provided to answer the wordy part)

The approximation _____ accurate because the error of the range rule of thumb's approximation is _______________ 1.9 cm.

The waiting line represented by the bottom boxplot is better because the times have much less variation, so fewer customers have to wait a significantly longer time. (real answers: bottom; much less variation; fewer customers have to wait a significantly longer time) (the smallest boxplot is the answer & it comes with the same two other answers)

The boxplots shown below represent customer waiting times​ (minutes) for two different waiting lines. Which line would you​ prefer, or does it not make a​ difference? Explain. (#50)

The z scores are numbers without units of measurement.

The original pulse rates are measured with units of​ "beats per​ minute". What are the units of the corresponding z​ scores?

With a data set that is so​ small, the true nature of the distribution cannot be seen with a histogram.

The population of ages at inauguration of all U.S. Presidents who had professions in the military is​ 62, 46,​ 68, 64, 57. Why does it not make sense to construct a histogram for this data​ set?

0, 1, 2, 3, ...; is not; discrete

The random variable x represents the number of phone calls an author receives in a day, and it has a Poisson distribution with a mean of 6.2 calls. What are the possible values of x? Is a value of x=2.6 possible? Is x a discrete random variable or a continuous random variable? A value of x=2.6 ______ possible because x is a ______ random variable.
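
A small sketch of the Poisson probability formula, P(x) = μ^x · e^(−μ) / x!, for the mean of 6.2 given in this card. The function name `poisson_pmf` is hypothetical, and rejecting non-integer inputs is an illustrative design choice that mirrors the card's point that x counts calls.

```python
from math import exp, factorial


def poisson_pmf(x: int, mean: float) -> float:
    """P(exactly x occurrences) for a Poisson distribution with the given mean."""
    if not isinstance(x, int) or x < 0:
        raise ValueError("x must be a nonnegative integer count")
    return mean**x * exp(-mean) / factorial(x)


mean_calls = 6.2

# Probabilities exist only for whole-number counts 0, 1, 2, 3, ...
for calls in range(4):
    print(calls, round(poisson_pmf(calls, mean_calls), 4))

# A value such as 2.6 is not a possible count, so the function rejects it.
try:
    poisson_pmf(2.6, mean_calls)
except ValueError as err:
    print("rejected:", err)
```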

The selections are​ dependent, because the selection is done without replacement. Yes, because the sample size is less than​ 5% of the population.

There are 15,958,866 adults in a region. If a polling organization randomly selects 1235 adults without replacement, are the selections independent or dependent? If the selections are dependent, can they be treated as independent for the purposes of calculations?
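
The 5% guideline in the answer can be checked with the numbers from this card. The variable names are illustrative only.

```python
population_size = 15_958_866
sample_size = 1235

# Fraction of the population that is sampled.
sampled_fraction = sample_size / population_size

# Guideline from the card's answer: a sample under 5% of the population
# can be treated as independent selections for calculations.
treat_as_independent = sampled_fraction < 0.05

print(f"{sampled_fraction:.6%}", treat_as_independent)
```

The sample is far below the 5% threshold, so the dependence introduced by sampling without replacement is negligible here.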

sample space

The​ _______ for a procedure consists of all possible simple events or all outcomes that cannot be broken down any further.

The histogram appears to depict a normal distribution. The frequencies generally increase to a maximum and then​ decrease, and the histogram is roughly symmetric.

Use the frequency distribution to construct a histogram. Does the histogram appear to depict data that have a normal distribution? Why or why not? (#31)

No; the original population is normally​ distributed, so the sample means will be normally distributed for any sample size.

Weights of golden retriever dogs are normally distributed. Samples of weights of golden retriever​ dogs, each of size n=​15, are randomly collected and the sample means are found. Is it correct to conclude that the sample means cannot be treated as being from a normal distribution because the sample size is too​ small? Explain.

Since the probability of each digit being selected is​ equal, lottery digits have a uniform​ distribution, not a normal distribution.

What's wrong with the following​ statement? ​"Because the digits​ 0, 1,​ 2, . . .​ , 9 are the normal results from lottery​ drawings, such randomly selected numbers have a normal​ distribution."

z-score

When a data value is converted to a standardized scale representing the number of standard deviations the data value lies from the​ mean, we call the new value a​ _______.
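
The conversion described in this card is z = (x − mean) / (standard deviation). A minimal sketch follows; the function name `z_score` and the sample numbers are hypothetical and not taken from any data set in these cards.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical test scores: mean 70, standard deviation 10.
print(z_score(85, 70, 10))   # above the mean, so positive
print(z_score(60, 70, 10))   # below the mean, so negative
```

The result has no units, because the units in the numerator and denominator cancel.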

probability of selecting an adult with blue eyes; probability of selecting an adult who does not have blue eyes.

When randomly selecting an adult, A denotes the event of selecting someone with blue eyes. What do P(A) and P(Ā) represent? P(A) represents the: P(Ā) represents the:

The probability of getting a​ male, given that someone with blue eyes has been selected. No, because P(B|M) represents the probability of getting someone with blue​ eyes, given that a male has been selected.

When randomly selecting adults, let M denote the event of randomly selecting a male and let B denote the event of randomly selecting someone with blue eyes. What does P(M|B) represent? Is P(M|B) the same as P(B|M)?

the corresponding z-score is negative.

Whenever a data value is less than the​ mean, _______.

Variation

Which characteristic of data is a measure of the amount that the data values​ vary?

Range

Which measure of variation is most sensitive to extreme​ values?

Number of suitcases on a plane.

Which of the following consists of discrete​ data?

Quantitative

Which of the following is NOT a level of​ measurement?

Mean

Which of the following is NOT a value in the​ 5-number summary?

Data that were obtained from an entire population.

Which of the following is associated with a​ parameter?

Square root of 2, 5/3, −0.58, 1.23

Which of the following values cannot be probabilities? 1, square root of 2, 0, 0.06, 1.23, 5/3, −0.58, 3/5
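
The rule behind this card is that a probability must lie between 0 and 1 inclusive. A short sketch that applies the rule to the eight values listed in the question (the dictionary labels are just for display):

```python
from math import sqrt

candidates = {
    "1": 1,
    "sqrt(2)": sqrt(2),
    "0": 0,
    "0.06": 0.06,
    "1.23": 1.23,
    "5/3": 5 / 3,
    "-0.58": -0.58,
    "3/5": 3 / 5,
}

# Keep the labels of values that fall outside the interval [0, 1].
cannot_be_probabilities = [
    label for label, value in candidates.items() if not 0 <= value <= 1
]

print(cannot_be_probabilities)
```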

