Statistics Final Ch. 1-3


A​ _______ helps us understand the nature of the distribution of a data set.

Frequency Distribution

​A(n) _______ uses line segments to connect points located directly above class midpoint values.

Frequency Polygon

Which of the following is NOT a value in the​ 5-number summary?

mean

When drawings of objects are used to depict​ data, false impressions can be made. These drawings are called​ _______.

pictographs

In modified boxplots, a data value is a(n) _______ if it is above Q3 + 1.5(IQR) or below Q1 − 1.5(IQR).

outlier
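As a minimal sketch of the fence rule above, the Python below flags values that fall above Q3 + 1.5(IQR) or below Q1 − 1.5(IQR). The function name `find_outliers` and the data are made up for illustration, and quartile conventions differ between textbooks and software, so the fences may not match a calculator exactly.

```python
import statistics

def find_outliers(data):
    """Return the values that fall outside the modified-boxplot fences."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical data: nine ordinary values and one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(find_outliers(sample))
```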

A​ _______ is a plot of paired data​ (x,y) and is helpful in determining whether there is a relationship between the two variables.

scatterplot

The bars in a histogram​ _______.

touch

When calculating standard deviation

use the calculator and the handouts for ch. 3

The square of the standard deviation is called the​ _______.

variance (variance = standard deviation^2)

Is it possible to identify the exact values of all of the original service​ times?

No, the data values in each class could take on any value between the class limits, inclusive.

The population of ages at inauguration of all U.S. Presidents who had professions in the military is​ 62, 46,​ 68, 64, 57. Why does it not make sense to construct a histogram for this data​ set?

With a data set that is so​ small, the true nature of the distribution cannot be seen with a histogram.

Methods used that summarize or describe characteristics of data are called​ _______ statistics.

descriptive

The heights of the bars of a histogram correspond to _____________ values.

frequency

What is a scatterplot and how does it help​ us?

A scatterplot is a graph of paired​ (x, y) quantitative data. It provides a visual image of the data plotted as​ points, which helps show any patterns in the data.

A histogram aids in analyzing the​ _______ of the data.

The shape of the distribution

Whenever a data value is less than the​ mean, _______.

the corresponding z-score is negative

The measure of center that is the value that occurs with the greatest frequency is the​ _______.

mode

A frequency table of grades has five classes​ (A, B,​ C, D,​ F) with frequencies of 3​, 10, 14​, 8​, and 2 respectively. What are the relative frequencies of the five​ classes?

0.08, 0.27, 0.38, 0.22, 0.05 (each frequency divided by the total of 37)
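The arithmetic behind this answer can be checked with a short Python sketch: each class frequency is divided by the sum of all frequencies and rounded to two decimal places. The variable names are illustrative only.

```python
grades = ["A", "B", "C", "D", "F"]
frequencies = [3, 10, 14, 8, 2]

total = sum(frequencies)  # 37 grades in all
# Relative frequency = class frequency / total of all frequencies.
relative = [round(f / total, 2) for f in frequencies]

for grade, rel in zip(grades, relative):
    print(grade, rel)
```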

Which of the following is always​ true?

A. In a symmetric and bell-shaped distribution, the mean, median, and mode are the same. <-- Correct
B. The mean and median should be used to identify the shape of the distribution.
C. For skewed data, the mode is farther out in the longer tail than the median.
D. Data skewed to the right have a longer left tail than right tail.

Which sampling method divides the population up into​ sections, randomly selects some of those​ sections, then chooses all the members from the selected sections to​ study?

Cluster

A magazine published a list consisting of the state tax on each gallon of gas. If we add the 50 state tax amounts and then divide by​ 50, we get 27.3 cents. Is the value of 27.3 cents the mean amount of state sales tax paid by all U.S.​ drivers? Why or why​ not?

No, the value of 27.3 cents is not the mean because the 50 amounts are all weighted equally in the​ calculation, but some states consume more gas than​ others, so the mean amount of state sales tax should be calculated using a weighted mean.

Which measure of variation is very sensitive to extreme​ values?

Range

For data sets having a distribution that is approximately​ bell-shaped, _______ states that about​ 68% of all data values fall within one standard deviation from the mean.

The empirical rule

In this section we use r to denote the value of the linear correlation coefficient. Why do we refer to this correlation coefficient as being​ linear?

The term linear refers to a straight​ line, and r measures how well a scatterplot fits a​ straight-line pattern.

When a data value is converted to a standardized scale representing the number of standard deviations the data value lies from the​ mean, we call the new value a​ _______.

Z-score

A​ _______ is a graph of each data value plotted as a point.

dot plot

We utilize statistical​ _______ to look for features that reveal some useful or interesting characteristics of the data set.

graphs

Which of the following is NOT a characteristic of the​ mean?

A. The mean is sensitive to outliers.
B. The mean is relatively reliable.
C. The mean takes every data value into account.
D. The mean is called the average by statisticians. <-- Correct

Identify the symbols used for each of the​ following: (a) sample standard​ deviation; (b) population standard​ deviation; (c) sample​ variance; (d) population variance.

a. The symbol for sample standard deviation is s.
b. The symbol for population standard deviation is σ.
c. The symbol for sample variance is s^2.
d. The symbol for population variance is σ^2.

Which of the following is NOT true about statistical​ graphs?

A. They utilize areas or volumes for data that are one-dimensional in nature. <-- Correct
B. They can be used to identify extreme data values.
C. Similar graphs can be constructed in order to compare data sets.
D. They can be used to consider the overall shape of the distribution.

Identify which type of sampling is used: random, systematic, convenience, stratified, or cluster. To determine customer opinion of their check-in service, American Airlines randomly selects 60 flights during a certain week and surveys all passengers on the flight.

Cluster

A study of an association between which ear is used for cell phone calls and whether the subject is​ left-handed or​ right-handed began with a survey​ e-mailed to 5000 people belonging to an otology online​ group, and 717 surveys were returned.​ (Otology relates to the ear and​ hearing.) What percentage of the 5000 surveys were​ returned? Does that response rate appear to be​ low? In​ general, what is a problem with a very low response​ rate?

717/5000 ≈ 14.3%, so about 14% of the surveys were returned. Yes, that response rate appears to be low. A very low response rate creates a serious potential for getting a biased sample that consists of those with a special interest in the topic.

Explain the difference between a​ single-blind and a​ double-blind experiment.

In a​ single-blind experiment, the subject does not know which treatment is received. In a​ double-blind experiment, neither the subject nor the researcher in contact with the subject knows which treatment is received.

Refer to the accompanying data set and use the 30 screw lengths to construct a frequency distribution. Begin with a lower class limit of 0.720 ​in., and use a class width of 0.010 in. The screws were labeled as having a length of 3/4 in.

Length (in.)    Frequency
0.720-0.729     2
0.730-0.739     3
0.740-0.749     11
0.750-0.759     11
0.760-0.769     3
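The 30 screw lengths themselves are not reproduced in this set, so the following Python is only a sketch of how such a frequency distribution could be tallied from a first lower class limit and a class width. The function `frequency_distribution` and the sample lengths are hypothetical. Values are converted to whole thousandths before binning to avoid floating-point edge cases, and the sketch assumes no value lies below the first lower class limit.

```python
from collections import Counter

def frequency_distribution(values, first_lower_limit, class_width, decimals=3):
    """Tally values into classes of equal width, returned as (label, count) pairs."""
    scale = 10 ** decimals
    start = round(first_lower_limit * scale)   # e.g. 0.720 -> 720
    width = round(class_width * scale)         # e.g. 0.010 -> 10
    # Class index of each value, computed on integers.
    counts = Counter((round(v * scale) - start) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = (start + k * width) / scale
        upper = (start + (k + 1) * width - 1) / scale
        label = f"{lower:.{decimals}f}-{upper:.{decimals}f}"
        table.append((label, counts.get(k, 0)))
    return table

# Hypothetical lengths in inches (NOT the data set from the exercise).
lengths = [0.722, 0.735, 0.741, 0.748, 0.750, 0.759, 0.765]
for label, count in frequency_distribution(lengths, 0.720, 0.010):
    print(label, count)
```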

Identify which of these designs is most appropriate for the given​ experiment: completely randomized​ design, randomized block​ design, or matched pairs design. A drug is designed to treat insomnia. In a clinical trial of the​ drug, amounts of sleep each night are measured before and after subjects have been treated with the drug.

Matched pairs design

Refer to the table summarizing service times​ (seconds) of dinners at a fast food restaurant. How many individuals are included in the​ summary? Is it possible to identify the exact values of all of the original service​ times?

No. The data values in each class could take on any value between the class​ limits, inclusive.

If we find that there is a linear correlation between the concentration of carbon dioxide in our atmosphere and the global​ temperature, does that indicate that changes in the concentration of carbon dioxide cause changes in the global​ temperature?

No. The presence of a linear correlation between two variables does not imply that one of the variables is the cause of the other variable.

Suppose that you need to create a list of n values that have a specific known mean. Some of the n values can be freely selected. How many of the n values can be freely assigned before the remaining values are​ determined? (The result is referred to as the number of degrees of​ freedom.)

Of the n​ values, n−1 can be freely selected because the remaining​ value(s) can be expressed in terms of the assigned values and the known mean.
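To illustrate why only n − 1 values are free, here is a small Python sketch with made-up numbers: once the mean and n − 1 of the values are fixed, the last value is forced, because the sum of all n values must equal n times the mean. The function name is hypothetical.

```python
def determined_value(free_values, known_mean):
    """Return the one remaining value forced by the known mean."""
    n = len(free_values) + 1            # total number of values in the list
    required_sum = n * known_mean       # the n values must add up to this
    return required_sum - sum(free_values)

# Hypothetical example: n = 4, mean = 6, and three freely chosen values.
free = [3, 5, 10]
last = determined_value(free, 6)
print(last)
```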

​_______ are sample values that lie very far away from the majority of the other sample values.

Outliers

Identify the type of sampling used (random, systematic, convenience, stratified, or cluster sampling) in the situation described below. A man experienced a tax audit. The tax department claimed that the man was audited because he was randomly selected from all the taxpayers.

Random

Identify which of these types of sampling is​ used: random,​ systematic, convenience,​ stratified, or cluster. A large company wants to administer a satisfaction survey to its current customers. Using their customer​ database, the company randomly selects 60 customers and asks them about their level of satisfaction with the company.

Random

Identify the type of sampling used (random, systematic, convenience, stratified, or cluster sampling) in the situation described below. A woman is selected by a marketing company to participate in a paid focus group. The company says that the woman was selected because she was randomly chosen from all adults.

Random Sampling

______ is used when subjects are assigned to different groups through a process of random selection.

Randomization

Below are the jersey numbers of 11 players randomly selected from a football team. Find the​ range, variance, and standard deviation for the given sample data. What do the results tell​ us? 26, 49, 12, 77, 55, 59, 40, 92, 70, 99, 27

Range = 87; sample standard deviation = 27.9; sample variance = 776.5. Jersey numbers are nominal data that are just replacements for names, so the resulting statistics are meaningless.
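The three statistics can be reproduced with Python's standard `statistics` module, as in the sketch below. Note that the unrounded computation gives a sample variance of about 776.5, while squaring the already rounded standard deviation (27.9) gives 778.4, so the two figures can differ depending on when rounding happens. The numbers are still meaningless for nominal data. This only checks the arithmetic.

```python
import statistics

jersey_numbers = [26, 49, 12, 77, 55, 59, 40, 92, 70, 99, 27]

data_range = max(jersey_numbers) - min(jersey_numbers)
sample_variance = statistics.variance(jersey_numbers)   # divides by n - 1
sample_sd = statistics.stdev(jersey_numbers)            # square root of the variance

print("range:", data_range)
print("sample variance:", round(sample_variance, 1))
print("sample standard deviation:", round(sample_sd, 1))
```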

In a​ _______ distribution, the frequency of a class is replaced with a proportion or percent.

relative frequency

Which of the following corresponds to the case when every sample of size n has the same chance of being​ chosen?

Simple Random Sample

Identify which of these types of sampling is used: random, systematic, convenience, stratified, or cluster. To determine her breathing rate, Carrie divides up her day into three parts: morning, afternoon, and evening. She then measures her breathing rate at 4 randomly selected times during each part of the day.

Stratified

Which sampling method subdivides the population into categories sharing similar characteristics and then selects a sample from each​ subdivision?

Stratified

Identify the type of sampling used (random, systematic, convenience, stratified, or cluster sampling) in the situation described below. A researcher selects every 221st social security number and surveys the corresponding person.

Systematic
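The systematic, stratified, and cluster methods named in these cards can be contrasted with a rough Python sketch. All function names and data here are hypothetical, chosen only to mirror the card scenarios (every k-th item, a few picks from each part of the day, whole flights).

```python
import random

def systematic_sample(population, k, start=0):
    """Take every k-th member, beginning at index `start`."""
    return population[start::k]

def stratified_sample(strata, per_stratum, rng):
    """Randomly pick `per_stratum` members from EVERY stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep ALL members of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(1)  # seeded only so the sketch is repeatable

# Systematic: every 3rd ID starting from the third one.
print(systematic_sample(list(range(1, 11)), k=3, start=2))

# Stratified: 4 measurement times from each part of the day.
day = {"morning": list(range(6, 12)),
       "afternoon": list(range(12, 18)),
       "evening": list(range(18, 24))}
print(stratified_sample(day, 4, rng))

# Cluster: 2 whole flights, every passenger on them is surveyed.
flights = {"F1": ["a", "b"], "F2": ["c", "d", "e"], "F3": ["f"]}
print(cluster_sample(flights, 2, rng))
```

The key contrast: stratified sampling draws some members from every group, while cluster sampling draws every member from some groups.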

In a study designed to test the effectiveness of a medication as a treatment for lower back​ pain, 1643 patients were randomly assigned to one of three​ groups: (1) the 547 subjects in the placebo group were given pills containing no​ medication; (2) 550 subjects were in a group given pills with the medication taken at regular​ intervals; (3) 546 subjects were in a group given pills with the medication to be taken when needed for pain relief. In what specific way was replication applied in the​ study?

The group sample sizes are all large so the researchers could see the effects of the treatment.

Heights of statistics students were obtained by a teacher as part of an experiment conducted for the class. The last digit of those heights are listed below. Construct a frequency distribution with 10 classes. Based on the​ distribution, do the heights appear to be reported or actually​ measured? What can be said about the accuracy of the​ results?

The heights appear to be reported because there are disproportionately more 0s and 5s. They are likely not very accurate because they appear to be reported.

The table shows the magnitudes of the earthquakes that have occurred in the past 10 years. Use the frequency distribution to construct a histogram. Does the histogram appear to be​ skewed? If​ so, identify the type of skewness.

The histogram has a longer right tail, so the distribution of the data is skewed to the right.

Listed below are the jersey numbers of 11 players randomly selected from the roster of a championship sports team. What do the results tell us?

The jersey numbers are nominal data and they do not measure or count​ anything, so the resulting statistics are meaningless.

One common system for computing a grade point average (GPA) assigns 4 points to an A, 3 points to a B, 2 points to a C, 1 point to a D, and 0 points to an F. What is the GPA of a student who gets an A in a 3-credit course, a B in each of two 2-credit courses, a C in a 3-credit course, and a D in a 2-credit course?

The grade point average is about 2.7 (32 grade points divided by 12 credits).
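A GPA is a weighted mean, with credits as the weights. The Python sketch below works through the calculation, assuming the A is earned in a 3-credit course (the reading that reproduces the stated answer of 2.7). The variable names are illustrative.

```python
points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# (grade, credits) for each course: one A, two Bs, one C, one D.
courses = [("A", 3), ("B", 2), ("B", 2), ("C", 3), ("D", 2)]

total_points = sum(points[grade] * credits for grade, credits in courses)
total_credits = sum(credits for _, credits in courses)

# Weighted mean = sum(weight * value) / sum(weights).
gpa = total_points / total_credits
print(round(gpa, 1))
```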

If we collect a large sample of blood platelet counts and if our sample includes a single​ outlier, how will that outlier appear in a​ histogram?

The outlier will appear as a bar far from all of the other bars with a height that corresponds to a frequency of 1.

If your score on your next statistics test is converted to a z​ score, which of these z scores would you​ prefer: −2.00, −​1.00, ​0, 1.00,​ 2.00? Why?

The z score of 2.00 is most preferable because it is 2.00 standard deviations above the mean and would correspond to the highest of the five different possible test scores.

For a data set of brain volumes (cm^3) and IQ scores of four males, the linear correlation coefficient is found and the P-value is 0.336. Write a statement that interprets the P-value and includes a conclusion about linear correlation.

The P-value indicates that the probability of a linear correlation coefficient that is at least as extreme is 33.6%, which is high, so there is not sufficient evidence to conclude that there is a linear correlation between brain volume and IQ score in males.

Which of the following is a common distortion that occurs in​ graphs?

Using a​ two-dimensional object to represent data that are​ one-dimensional in nature

Which characteristic of data is a measure of the amount that the data values​ vary?

Variation

Which of the following is NOT a property of the standard​ deviation?

When comparing variation in samples with very different​ means, it is good practice to compare the two sample standard deviations.

In a​ graph, if one or both axes begin at some value other than​ zero, the differences are exaggerated. This bad graphing method is known as​ _______.

a non-zero axis

a. A statistics class with 36 students is arranged so that there are 6 rows with 6 students in each​ row, and the rows are numbered from 1 through 6. A die is rolled and a sample consists of all students in the row corresponding to the outcome of the die. b. For the same class described in part​ (a), the 36 student names are written on 36 individual index cards. The cards are shuffled and six names are drawn from the top. c. For the same class described in part​ (a), the six youngest students are selected.

a. This sample is not a simple random sample. It is a random sample. b. This sample is a simple random sample. It is a random sample. c. This sample is not a simple random sample. It is not a random sample.
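The difference between parts (a) and (b) can be seen in a short Python sketch. In the die-roll scheme every student has the same 1-in-6 chance of selection (a random sample), but only 6 of the many possible groups of six can ever be drawn, so it is not a simple random sample. Drawing shuffled cards makes every group of six equally likely. The seat layout and function names are assumptions made for illustration.

```python
import random

# 36 students identified by (row, seat), rows and seats numbered 1 through 6.
students = [(row, seat) for row in range(1, 7) for seat in range(1, 7)]

def row_sample(rng):
    """Part (a): roll a die and take the whole row that comes up."""
    die = rng.randint(1, 6)
    return [s for s in students if s[0] == die]

def card_sample(rng, k=6):
    """Part (b): draw k names from shuffled cards, without replacement."""
    return rng.sample(students, k)

rng = random.Random(0)  # seeded only so the sketch is repeatable
print(row_sample(rng))   # always six students from a single row
print(card_sample(rng))  # six students who may come from any rows
```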

Which of the following is NOT a measure of​ center?

census

A​ _______ histogram has the same shape and horizontal scale as a​ histogram, but the vertical scale is marked with relative frequencies instead of actual frequencies.

relative frequency

The Range Rule of Thumb roughly estimates the standard deviation of a data set as​ _______.

s ≈ range/4

A data value is considered _______ if its z-score is less than −2 or greater than 2.

significantly low or significantly high

Class width is found by​ _______.

subtracting a lower class limit from the next consecutive lower class limit

A study is conducted to measure​ children's growth rates without any treatment applied to the children. What best classifies this​ study?

Observational

Find the mean of the data summarized in the given frequency distribution. Compare the computed mean to the actual mean of 51.1 miles per hour.

The computed mean is not close to the actual mean because the difference between the means is more than 5%.

Are the data reported or​ measured?

If the heights occur with roughly the same frequency, the data appear to be measured. If certain heights occur a disproportionate number of times, the data appear to be reported.

For a data set of weights​ (pounds) and highway fuel consumption amounts​ (mpg) of six types of​ automobile, the linear correlation coefficient is found and the​ P-value is 0.025. Write a statement that interprets the​ P-value and includes a conclusion about linear correlation.

The​ P-value indicates that the probability of a linear correlation coefficient that is at least as extreme is 2.5%, which is low, so there is sufficient evidence to conclude that there is a linear correlation between weight and highway fuel consumption in automobiles.

A z score (or standard score or standardized value) is the number of standard deviations, s or σ, that a given value x is above or below the mean, x̄ or μ. The z score is calculated by using one of the equations shown below.

Sample: z = (x − x̄)/s. Population: z = (x − μ)/σ.
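As a hedged illustration of the z score calculation, the Python sketch below computes z = (x − mean)/(standard deviation) for a made-up set of test scores and applies the cutoffs of −2 and 2 used elsewhere in this set for significantly low or high values. The function names and data are hypothetical.

```python
import statistics

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def classify(z):
    """Apply the cutoffs: below -2 is significantly low, above 2 is significantly high."""
    if z < -2:
        return "significantly low"
    if z > 2:
        return "significantly high"
    return "not significant"

# Hypothetical sample of test scores.
scores = [70, 75, 80, 85, 90]
mean = statistics.mean(scores)
s = statistics.stdev(scores)   # sample standard deviation

for x in (60, 75, 100):
    z = z_score(x, mean, s)
    print(x, round(z, 2), classify(z))
```

A value below the mean, such as 75 here, always produces a negative z score.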

A value at the center or middle of a data set is​ a(n) _______.

measure of center

p-values

Only a small P-value, such as 0.05 or less (a 5% chance or less), suggests that the sample results are not likely to occur by chance when there is no linear correlation, so a small P-value supports a conclusion that there is a linear correlation between the two variables.
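One way to see what "at least as extreme by chance" means is an exact permutation test, sketched below in Python. This is a swapped-in technique for illustration, not necessarily how a textbook or calculator obtains its P-values (those are typically based on a t distribution), so the numbers will not match such output exactly. The data and function names are hypothetical. Enumerating every ordering is only practical for very small samples like the four or six pairs in these cards.

```python
from itertools import permutations
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(xs, ys, tol=1e-12):
    """Share of all re-pairings of ys whose |r| is at least the observed |r|."""
    observed = abs(pearson_r(xs, ys))
    hits = total = 0
    for shuffled in permutations(ys):
        total += 1
        if abs(pearson_r(xs, shuffled)) >= observed - tol:
            hits += 1
    return hits / total

# Hypothetical, perfectly linear data for six items.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 12]
print(pearson_r(x, y), permutation_p_value(x, y))
```

With perfectly linear data only the original ordering and its exact reverse reach |r| = 1, so the P-value is tiny, which is consistent with concluding that a linear correlation exists.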

