Statistics Elementary
Find the range of the data set. Here are costs (in dollars) of 12 electric smooth top ranges. 860 1000 650 560 1460 1100 700 760 870 1300 550 1060 a) $890 b) $930 c) $910 d) $920 e) $900
$910. The range of a data set is the difference between the largest and smallest values. In this case, the largest value is $1460 and the smallest value is $550. So the range of this data set is $1460 - $550 = $910.
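The subtraction above can be checked with a minimal Python sketch; the variable names `costs` and `data_range` are illustrative only.

```python
# Costs (in dollars) of the 12 electric smooth-top ranges from the question.
costs = [860, 1000, 650, 560, 1460, 1100, 700, 760, 870, 1300, 550, 1060]

# Range = largest value minus smallest value.
data_range = max(costs) - min(costs)
print(data_range)  # 910
```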
Listed below are the measured radiation absorption rates (in W/kg) corresponding to 11 cell phones. Use the given data to construct a boxplot and identify the 5-number summary. 1.13 1.34 0.54 1.26 0.81 0.74 1.33 0.83 1.48 0.58 0.87 The 5-number summary is ____, all in W/kg.
0.54 0.775 0.87 1.295 1.48
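A short sketch of how this 5-number summary might be reproduced in Python. It assumes the "inclusive" quartile convention of the standard library's `statistics.quantiles`, which happens to match the values on this card; other quartile conventions can give slightly different values for Q1 and Q3.

```python
import statistics

# Radiation absorption rates (W/kg) for the 11 cell phones in the question.
rates = [1.13, 1.34, 0.54, 1.26, 0.81, 0.74, 1.33, 0.83, 1.48, 0.58, 0.87]

# Assumption: the "inclusive" method, which interpolates between sorted values.
q1, q2, q3 = statistics.quantiles(rates, n=4, method="inclusive")

# Minimum, Q1, median, Q3, maximum.
summary = (min(rates), q1, q2, q3, max(rates))
print([round(value, 3) for value in summary])
```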
The heights (in inches) of ten randomly chosen American males are shown below. 71 67 67 72 76 72 73 68 72 72 a. 3 b. 2.87 c. 2.72 d. 70
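2.72. This is the standard deviation of the ten heights when the sum of squared deviations (74) is divided by n = 10. Dividing by n - 1 = 9 instead gives the sample standard deviation, 2.87.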
Find the (a) mean, (b) median, (c) mode, and (d) midrange for the data and then (e) answer the given question. Listed below are the jersey numbers of 11 players randomly selected from the roster of a championship sports team. What do the results tell us? 26 20 44 93 61 54 85 38 58 95 28 Find the mean.
54.7. (26 + 20 + 44 + 93 + 61 + 54 + 85 + 38 + 58 + 95 + 28) / 11 = 602 / 11 ≈ 54.7
Determine the Five Number Summary of the data set. The heights (in inches) of ten randomly chosen American males are shown below. 71 67 67 72 76 72 73 68 72 72 a) 10, 71, 2.9, 67, 73 b) 10, 71, 72, 67, 73 c) 67, 71, 72, 72, 76 d) 67, 68, 72, 72, 76
67, 68, 72, 72, 76 TI-84: Press [STAT], select [EDIT] and enter the data values into the L1 column. Press [STAT], scroll right to [CALC] and select 1-Var Stats.
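Determine the five-number summary of the data set. The heights (in inches) of ten randomly chosen American males are shown below. 71 67 67 72 76 72 73 68 72 72 a. 10, 71, 2.9, 67, 73 b. 10, 71, 72, 67, 73 c. 67, 71, 72, 72, 76 d. 67, 68, 72, 72, 76
67, 68, 72, 72, 76. Sorted, the heights are 67, 67, 68, 71, 72, 72, 72, 72, 73, 76. The minimum is 67, Q1 (the median of the lower five values) is 68, the median is 72, Q3 (the median of the upper five values) is 72, and the maximum is 76.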
Lengths of pregnancies of humans are normally distributed with a mean of 265 days and a standard deviation of 10 days. Use the Empirical Rule to determine the percentage of women whose pregnancies are between 255 and 275 days. a. 95% b. 99.7% c. 68% d. 50%
68%
The body temperatures of a group of healthy adults have a bell-shaped distribution with a mean of 98.35°F and a standard deviation of 0.62°F. Using the empirical rule, find each approximate percentage below. What is the approximate percentage of healthy adults with body temperatures between 97.11°F and 99.59°F?
95%. The interval runs from 98.35 - 2(0.62) = 97.11 to 98.35 + 2(0.62) = 99.59, so it covers the values within 2 standard deviations of the mean.
Find the mode. The heights (in inches) of ten randomly chosen American males are shown below. 71 67 67 72 76 72 73 68 72 72 a) 73 b) 76 c) 72 d) 67
72. The mode is the value that appears most frequently in a data set. In this case, the height that appears most frequently among the ten randomly chosen American males is 72 inches, which appears four times.
Find the mean score. The heights (in inches) of ten randomly chosen American males are shown below. 71 67 67 72 76 72 73 68 72 72 a) 72 b) 67 c) 68 d) 71
71. To find the mean height of the ten randomly chosen American males, add up all the heights and then divide by the number of heights. The sum of the heights is 71 + 67 + 67 + 72 + 76 + 72 + 73 + 68 + 72 + 72 = 710. Since there are 10 heights, the mean height is 710 / 10 = 71 inches.
Find the median score. The heights (in inches) of ten randomly chosen American males are shown below. 71 67 67 72 76 72 73 68 72 72 a) 71 b) 67 c) 72 d) 73
72. To find the median height of the ten randomly chosen American males, you need to arrange the heights in ascending order and then find the middle value. If there is an odd number of heights, the median is the middle value. If there is an even number of heights, the median is the average of the two middle values. The heights in ascending order are 67, 67, 68, 71, 72, 72, 72, 72, 73, and 76. Since there are an even number of heights (10), the median is the average of the two middle values: (72 + 72) / 2 = 72.
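The three measures of center for the heights data can be computed with Python's standard `statistics` module. This is only a checking sketch; the variable names are illustrative.

```python
import statistics

# Heights (in inches) of the ten randomly chosen American males.
heights = [71, 67, 67, 72, 76, 72, 73, 68, 72, 72]

mean_height = statistics.mean(heights)      # sum of the values divided by 10
median_height = statistics.median(heights)  # average of the two middle sorted values
mode_height = statistics.mode(heights)      # most frequent value

print(sum(heights), mean_height, median_height, mode_height)
```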
The body temperatures of a group of healthy adults have a bell-shaped distribution with a mean of 98.35°F and a standard deviation of 0.62°F. Using the empirical rule, find each approximate percentage below. What is the approximate percentage of healthy adults with body temperatures within 3 standard deviations of the mean, or between 96.49°F and 100.21°F?
99.7%
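The empirical-rule intervals for the body temperature cards can be tabulated with a few lines of Python. This sketch uses the mean 98.35 and standard deviation 0.62 from the cards; the dictionary names are illustrative.

```python
# Mean and standard deviation (degrees Fahrenheit) from the body temperature cards.
mean, sd = 98.35, 0.62

# Empirical rule: approximate percentage within k standard deviations of the mean.
coverage = {1: 68.0, 2: 95.0, 3: 99.7}

# Lower and upper bounds of each interval, rounded to two decimals.
intervals = {
    k: (round(mean - k * sd, 2), round(mean + k * sd, 2))
    for k in coverage
}

for k, (low, high) in intervals.items():
    print(f"within {k} sd: {low} to {high} -> about {coverage[k]}%")
```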
boxplot
A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile Q1, the median, and the third quartile Q3.
data set
A data set is a collection of related data points or values. These values can represent measurements, observations, or other information collected for a specific purpose. Data sets can be organized in various ways, such as in tables, arrays, or lists. They can also be analyzed using statistical methods to draw conclusions or make predictions.
measure of center
A measure of center is a value that represents the central or typical value in a data set. Common measures of center include the mean, median, and mode. These measures provide information about the location of the data and can be used to summarize and compare different data sets. The choice of measure of center depends on the nature of the data and the purpose of the analysis.
sampling method
A sampling method is a technique used to select a sample from a population for the purpose of conducting research or making inferences about the population. There are many different sampling methods.
survey
A survey is a research method used to collect data from a sample of individuals through the use of questionnaires or interviews.
z score
A z score (or standard score or standardized value) is the number of standard deviations that a given value x is above or below the mean. The z score is calculated by using one of the following: Sample: \( z = \frac{x - \bar{x}}{s} \) or Population: \( z = \frac{x - \mu}{\sigma} \)
If we find that there is a linear correlation between the concentration of carbon dioxide in our atmosphere and the global temperature, does that indicate that changes in the concentration of carbon dioxide cause changes in the global temperature? A. No. The presence of a linear correlation between two variables does not imply that one of the variables is the cause of the other variable. B. Yes. The presence of a linear correlation between two variables implies that one of the variables is the cause of the other variable.
A. No. The presence of a linear correlation between two variables does not imply that one of the variables is the cause of the other variable.
The graph to the right compares teaching salaries of women and men at private colleges and universities. What impression does the graph create? Does the graph depict the data fairly? If not, construct a graph that depicts the data fairly. Histogram: Salaries: Women $55,000 Men $71,000 What impression does the graph create? A. The graph creates the impression that men have salaries that are more than twice the salaries of women. B. The graph creates the impression that men have salaries that are slightly higher than that of women. C. The graph creates the impression that men and women have approximately the same salaries. D. The graph creates the impression that women have salaries that are slightly higher than that of men.
A. The graph creates the impression that men have salaries that are more than twice the salaries of women.
Which of the statements below is true concerning bar graphs? A. The height of each bar represents the category's frequency or relative frequency. B. The largest bar must be displayed first, with the remaining bars in decreasing order. C. Bars must be touching and all the same width. D. Bars must be displayed vertically.
A. The height of each bar represents the category's frequency or relative frequency.
Which of the following correctly describes the relationship between a parameter and a statistic? a. A statistic is calculated from sample data and is generally used to estimate a parameter. b. Statistics are a group of subjects selected according to the parameters of a study. c. A parameter is calculated from sample data and is generally used to estimate a statistic. d. A parameter and a statistic are not related.
a. A statistic is calculated from sample data and is generally used to estimate a parameter.
Does the graph depict the data fairly? A. No, because the vertical scale does not start at zero. B. No, because the data are two-dimensional measurements. C. Yes, because the bars accurately represent each average. D. Yes, because the vertical scale is appropriate for the data.
A. No, because the vertical scale does not start at zero.
A high school student took two college entrance exams and scored 1120 on the SAT and 27 on the ACT. Suppose that SAT scores have a mean of 950 and a standard deviation of 140 while the ACT scores have a mean of 22 and a standard deviation of 4. Assuming the performance on both tests follows a normal distribution, determine which test the student did better on. a. SAT b. ACT
ACT
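The comparison rests on converting each score to a z score, z = (x - mean) / standard deviation, and picking the larger one. A minimal sketch, with `z_score` as an illustrative helper name:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

sat_z = z_score(1120, mean=950, sd=140)  # 170 / 140, about 1.21
act_z = z_score(27, mean=22, sd=4)       # 5 / 4 = 1.25

# The higher z score marks the better relative performance.
better_test = "ACT" if act_z > sat_z else "SAT"
print(round(sat_z, 2), round(act_z, 2), better_test)
```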
experiment
An experiment is a type of research where the researcher manipulates one or more independent variables to observe their effect on a dependent variable while controlling for other variables.
observational study
An observational study is a type of research where the researcher observes and collects data from a sample without manipulating or controlling any variables.
Use z scores to compare the given values. In a recent awards ceremony, the age of the winner for the Best Actor award was 40 and the age of the winner for the Best Actress award was 55. For all recipients of Best Actor, the mean age is 45.9 years and the standard deviation is 5.8 years. For all recipients of Best Actress, the mean age is 38.6 years and the standard deviation is 10.4 years. (All ages are determined at the time of the awards ceremony.) Relative to the award category, who had the more extreme age when winning the award, the winner of Best Actor or the winner of Best Actress? Since the z score for the winner of Best Actor is z = ____ and the z score for the winner of Best Actress is z = ____, the winner of ____ had the more extreme age.
Answer: -1.02, 1.58, Best Actress. Best Actor: z = (40 - 45.9) / 5.8 ≈ -1.02. Best Actress: z = (55 - 38.6) / 10.4 ≈ 1.58. The Best Actress z score is farther from 0, so that age is the more extreme one.
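When the question is which value is "more extreme", the sign of the z score does not matter; the absolute values are compared. A small sketch using the ages, means, and standard deviations from the card:

```python
# z = (x - mean) / standard deviation for each award winner.
actor_z = (40 - 45.9) / 5.8      # winner is younger than the mean, so z is negative
actress_z = (55 - 38.6) / 10.4   # winner is older than the mean, so z is positive

# "More extreme" means farther from the mean in either direction.
more_extreme = "Best Actress" if abs(actress_z) > abs(actor_z) else "Best Actor"
print(round(actor_z, 2), round(actress_z, 2), more_extreme)
```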
What is a scatterplot and how does it help us? A. A scatterplot is a graph of paired (x, y) qualitative data. It provides an organized display of the data, which helps show patterns in the data. B. A scatterplot is a formula that fits a straight line to data points, which helps plot the data. C. A scatterplot is a table of paired (x, y) quantitative data sorted from least to greatest, which helps show the range of the data. D. A scatterplot is a graph of paired (x, y) quantitative data. It provides a visual image of the data plotted as points, which helps show any patterns in the data.
D. A scatterplot is a graph of paired (x, y) quantitative data. It provides a visual image of the data plotted as points, which helps show any patterns in the data.
Which two graphs allow the reader to retrieve the original list of data? A. Histograms and frequency polygons B. Frequency polygons and ogives C. Stem-and-leaf plots and histograms D. Stem-and-leaf plots and dotplots
D. Stem-and-leaf plots and dotplots
descriptive
Descriptive statistics summarize and organize characteristics of a data set.
"Loaded" Survey
If survey questions are not worded carefully, the results of a study can be misleading. Survey questions can be "loaded," or intentionally worded to elicit a desired response.
inferential
Inferential statistics is a field of statistics that uses analytical tools for drawing conclusions about a population by examining random samples. The goal of inferential statistics is to make generalizations about a population.
Survey questions may be misleading if they are "loaded." To what does "loaded" refer?
Intentionally worded to elicit a desired response
Practical Significance
It is possible that some treatment or finding is effective, but common sense might suggest that the treatment or finding does not make enough of a difference to justify its use or to be practical.
qualitative
Qualitative refers to data or information that is non-numerical and describes qualities or characteristics.
Random sampling
Random sampling is a sampling method that selects a sample of observations from a population by random selection, where every member of the population has an equal chance of being selected. Random sampling can be done by using a random number generator, rolling dice, using a calculator, or pulling names from a hat.
sample standard deviation
Sample standard deviation is a measure of the spread of a sample of data. It is calculated in the same way as the population standard deviation, but with one key difference: when calculating the variance, the sum of squared differences between each value and the mean is divided by n-1 instead of n, where n is the sample size. This is known as Bessel's correction; it removes the bias in the sample variance as an estimate of the population variance.
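The effect of dividing by n-1 rather than n can be seen on the ten heights used elsewhere in this set. The sketch below relies on `statistics.stdev` (divisor n-1) and `statistics.pstdev` (divisor n) from the Python standard library.

```python
import statistics

# Heights (in inches) of the ten randomly chosen American males.
heights = [71, 67, 67, 72, 76, 72, 73, 68, 72, 72]

mean = statistics.mean(heights)
squared_deviations = sum((h - mean) ** 2 for h in heights)

sample_sd = statistics.stdev(heights)       # sqrt(squared_deviations / (n - 1))
population_sd = statistics.pstdev(heights)  # sqrt(squared_deviations / n)

print(squared_deviations, round(sample_sd, 2), round(population_sd, 2))
```

The sample value is always at least as large as the population value because the same sum is divided by a smaller number.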
simulation
Simulation is the process of creating a model that mimics the behavior of a real-world system or process. It is used to study and analyze the behavior of systems, test different scenarios, and make predictions about future outcomes.
standard deviation
Standard deviation is a measure of the amount of variation or dispersion in a set of values. It is calculated as the square root of the variance, which is the average of the squared differences between each value and the mean of the data set. A low standard deviation indicates that the values in the data set are close to the mean, while a high standard deviation indicates that the values are spread out over a wider range.
statistical analysis
Statistical analysis is the science of collecting, exploring, and presenting large amounts of data to discover underlying patterns and trends.
When testing a new treatment, what is the difference between statistical significance and practical significance? Can a treatment have statistical significance, but not practical significance?
Statistical significance is achieved when the result is very unlikely to occur by chance. Practical significance is related to whether common sense suggests that the treatment makes enough of a difference to justify its use. It is possible for a treatment to have statistical significance, but not practical significance.
Listed below are the measured radiation emissions (in W/kg) corresponding to cell phones: A, B, C, D, E, F, G, H, I, J, and K respectively. The media often present reports about the dangers of cell phone radiation as a cause of cancer. Cell phone radiation must be 1.6 W/kg or less. Find the a. mean, b. median, c. midrange, and d. mode for the data. 0.83 1.25 1.31 1.58 0.45 0.84 0.65 1.55 1.01 0.72 0.32 The mean is = The median is = The midrange is = Find the mode =
The mean is = 0.955 (10.51 / 11) The median is = 0.84 The midrange is = 0.95 Find the mode = No mode
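A sketch for checking the four measures on the emissions data listed in the question. The "no mode" test assumes the convention stated elsewhere in this set: if no value occurs more than once, the data set has no mode.

```python
import statistics
from collections import Counter

# Radiation emissions (W/kg) for the 11 cell phones in the question.
emissions = [0.83, 1.25, 1.31, 1.58, 0.45, 0.84, 0.65, 1.55, 1.01, 0.72, 0.32]

mean_emission = statistics.mean(emissions)
median_emission = statistics.median(emissions)
midrange = (min(emissions) + max(emissions)) / 2

# No mode when every value occurs exactly once.
has_mode = max(Counter(emissions).values()) > 1

print(round(mean_emission, 3), median_emission, round(midrange, 2), has_mode)
```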
the mean
The mean, also known as the average, is a measure of central tendency that represents the typical value in a data set. It is calculated by adding up all the values in the data set and dividing by the number of values. The mean is sensitive to outliers, which are extreme values that can significantly affect the calculation. In such cases, other measures of central tendency, such as the median or mode, may be more appropriate.
the median
The median is a measure of central tendency that represents the middle value in a data set when the values are arranged in ascending order. If the data set has an odd number of values, the median is the middle value. If the data set has an even number of values, the median is calculated as the average of the two middle values. The median is less sensitive to outliers than the mean and can be a more appropriate measure of central tendency for skewed distributions.
the mode
The mode is a measure of central tendency that represents the most frequently occurring value in a data set. A data set can have more than one mode if there are multiple values that occur with the same highest frequency. If no value occurs more than once, the data set has no mode. The mode can be useful for describing categorical or nominal data, where the mean and median may not be appropriate measures of central tendency.
A magazine ran a survey about a web site for downloading music. Readers could register their responses on the magazine's web site. Identify what is wrong.
The sample is a voluntary response sample, so there is a good chance that the results do not reflect the population.
If all the data values in a set are identical, what can you conclude about the standard deviation? a. The standard deviation is zero. b. The standard deviation is not defined if all the data values are identical. c. The standard deviation cannot be calculated without knowing the value in the data set. d. The standard deviation is equal to the data value.
a. The standard deviation is zero.
The median LSAT score reported by a Law School is 170. Which of the following is not correct? a. half of the students have an LSAT score of 170 b. half of the students have an LSAT score of 170 or below c. the fiftieth percentile of students' LSAT score is 170 d. half of the students have an LSAT score higher than 170
a. half of the students have an LSAT score of 170
Listed below are the measured radiation emissions (in W/kg) corresponding to cell phones: A, B, C, D, E, F, G, H, I, J, and K respectively. The media often present reports about the dangers of cell phone radiation as a cause of cancer. Cell phone radiation must be 1.6 W/kg or less. Find the a. mean, b. median, c. midrange, and d. mode for the data. 0.83 1.25 1.31 1.58 0.45 0.84 0.65 1.55 1.01 0.72 0.32 The mean is = 0.955 The median is = 0.84 The midrange is = 0.95 Find the mode = No mode If you are planning to purchase a cell phone, are any of the measures of center the most important statistic? Is there another statistic that is most relevant? If so, which one? a. The mean of the data set is the most important statistic because cell phones that have values close to it have the safest emissions. b. The maximum data value is the most relevant statistic, because it is closest to the limit of 1.6 W/kg and that cell phone should be avoided. c. The midpoint of the data set is the most important statistic because cell phones that have values close to it have the safest emissions. d. The minimum data value is the most relevant statistic, because it is closest to the limit of 1.6 W/kg and that cell phone should be purchased.
b. The maximum data value is the most relevant statistic, because it is closest to the limit of 1.6W/kg and that cell phone should be avoided.
A study was conducted to determine how people get jobs. The table lists data from 400 randomly selected subjects. Construct a Pareto chart that corresponds to the given data. If someone would like to get a job, what seems to be the most effective approach? Job source / frequency: Help-wanted ads (H), 271; Executive search firms (E), 29; Networking (N), 44; Mass mailing (M), 56. What does a correlation coefficient of 0 indicate? a. It indicates a non-linear relationship between the two quantitative variables. b. There is no linear relationship between the two quantitative variables. c. There is a strong relationship between the two quantitative variables. d. There is a weak relationship between the two quantitative variables. e. It indicates a calculation error, as the correlation coefficient cannot be 0.
b. There is no linear relationship between the two quantitative variables.
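A correlation coefficient of 0 rules out a linear relationship only; the variables can still be related in another way. The sketch below uses made-up points (not data from this set) and a hand-written `pearson_r` helper to show r = 0 for points on a parabola and r = 1 for points on a line.

```python
import math

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustration data.
xs = [-2, -1, 0, 1, 2]
r_curved = pearson_r(xs, [x ** 2 for x in xs])    # y = x^2: related, but not linearly
r_line = pearson_r(xs, [2 * x + 1 for x in xs])   # y = 2x + 1: perfectly linear

print(r_curved, r_line)
```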
The ages of a group of patients being treated at one hospital for osteoporosis are summarized in the frequency histogram below. Histogram: Age of patient 10-20, freq 25 age of patient 20-30, freq 50 age of patient 30-40, freq 100 age of patient 50-60, freq 375 age of patient 60-70, freq 475 age of patient 70-80, freq 550 age of patient 80-90, freq 500 a. Skew left, the median would provide a better measure of center. b. Skew left, the mean would provide a better measure of center. c. Skew right, the median would provide a better measure of center. d. Skew right, the mean would provide a better measure of center. e. Multimodal
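a. Skew left, the median would provide a better measure of center. The frequencies rise with age and the long tail extends toward the younger ages, so the distribution is skewed left.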
The ____________________ is the difference between two consecutive lower class limits or two consecutive upper class limits.
class width
At a local community college, five statistics classes are randomly selected out of 20 and all of the students from each class are interviewed. a) systematic b) random c) stratified d) convenience e) cluster
cluster. Five statistics classes are randomly selected out of 20 at a local community college and all of the students from each selected class are interviewed. This is cluster sampling because the population is divided into groups (clusters), a random sample of clusters is selected, and every member of the selected clusters is included.
categorical data
consist of names or labels (not numbers that represent counts or measurements).
A community college student interviews everyone in a statistics class to determine the percentage of students that own a car. a) convenience b) cluster c) random d) stratified e) systematic
convenience. A community college student interviews everyone in a statistics class to determine the percentage of students that own a car. This is convenience sampling because the sample is selected based on ease of access or availability.
A fitness instructor measured the heart rates of the participants in a yoga class at the conclusion of the class. The data is summarized in the histogram below. There were fifteen people who participated in the class between the ages of 25 and 46. Histogram: bpm 90-100, frequency 2 bpm 110-120, frequency 2 bpm 120-130, frequency 3 bpm 130-140, frequency 5 bpm 140-150, frequency 2 bpm 150-160, frequency 1 What percent of the participants had a heart rate between 130 and 150 bpm? a. 33% b. 13% c. 5% d. 47%
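d. 47%. According to the histogram, 5 participants had heart rates between 130 and 140 bpm and 2 participants had heart rates between 140 and 150 bpm, so 5 + 2 = 7 of the 15 participants had heart rates between 130 and 150 bpm. (7/15) * 100% = 46.67%, which rounds to 47%.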
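The percentage can be read off the histogram counts with a short calculation. In this sketch the dictionary keys are simply labels for the bins listed in the question.

```python
# Histogram counts from the question: heart-rate bin (bpm) -> number of participants.
heart_rate_counts = {
    "90-100": 2,
    "110-120": 2,
    "120-130": 3,
    "130-140": 5,
    "140-150": 2,
    "150-160": 1,
}

total = sum(heart_rate_counts.values())
in_range = heart_rate_counts["130-140"] + heart_rate_counts["140-150"]
percent = 100 * in_range / total

print(total, in_range, round(percent))
```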
A study was done to determine the average commute time per week to and from class for an SPC student attending a lecture class. For convenience, a sample of 60 students was taken by randomly selecting a lecture class on the Gibbs campus (30 students), and also a lecture class on the Clearwater campus (30 students). The students selected for the sample were then asked where they were commuting from, when, and how often. The information was then entered into a GPS system in order to obtain estimated commuting times. a. All SPC students. b. All full-time SPC students. c. All part-time SPC students. d. All SPC students who attend lecture classes.
d. All SPC students who attend lecture classes.
Was it possible for anyone to have the average number of feet? a. No. Everyone has exactly 2 feet. b. Yes. The average always represents the typical individual, so someone will have that number of feet. c. Yes. Because it's the average, someone will have the average number of feet. d. No. It is not possible for anyone to have the number of feet reported as the average.
d. No. It is not possible for anyone to have the number of feet reported as the average.
Among fatal plane crashes that occurred during the past 50 years, 324 were due to pilot error, 62 were due to other human error, 316 were due to weather, 639 were due to mechanical problems, and 667 were due to sabotage. Construct the relative frequency distribution. What is the most serious threat to aviation safety, and can anything be done about it? Cause / relative frequency: Pilot error, 16.1%; Other human error, 3.1%; Weather, 15.7%; Mechanical problems, 31.8%; Sabotage, 33.2%. a. Pilot error is the most serious threat to aviation safety. Pilots could be better trained. b. Mechanical problems are the most serious threat to aviation safety. New planes could be better engineered. c. Weather is the most serious threat to aviation safety. Weather monitoring systems could be improved. d. Sabotage is the most serious threat to aviation safety. Airport security could be increased.
d. Sabotage is the most serious threat to aviation safety. Airport security could be increased.
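A relative frequency is each category's count divided by the total count. The following sketch rebuilds the distribution from the crash counts given in the question; the variable names are illustrative.

```python
# Counts of fatal plane crashes by cause, as given in the question.
crash_counts = {
    "Pilot error": 324,
    "Other human error": 62,
    "Weather": 316,
    "Mechanical problems": 639,
    "Sabotage": 667,
}

total = sum(crash_counts.values())

# Relative frequency (percent) = count / total * 100, rounded to one decimal.
relative_frequency = {
    cause: round(100 * count / total, 1) for cause, count in crash_counts.items()
}

# The category with the largest relative frequency.
largest_cause = max(relative_frequency, key=relative_frequency.get)
print(total, relative_frequency, largest_cause)
```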
A community college faculty is negotiating a new contract with the school board. The distribution of faculty salaries is skewed right by several faculty members who make over $100,000 per year. If the faculty want to give the community the impression that they deserve higher salaries, should they advertise the mean or median of their current salaries? a. The faculty should use the median to make their argument. The median will be higher than the mean since the median is influenced by the few high salaries. b. The faculty should use the mean to make their argument. The mean will be lower than the median since it will be influenced by the few high salaries. c. The faculty should use the mean to make their argument. The mean will be higher than the median since the mean is influenced by the few extremely high salaries. d. The faculty should use the median to make their argument. The median will be lower than the mean since the mean is influenced by the few extremely high salaries.
d. The faculty should use the median to make their argument. The median will be lower than the mean since the mean is influenced by the few extremely high salaries.
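The pull of a few high values on the mean can be illustrated with invented numbers. The salaries below are hypothetical and not taken from the question; they only show the direction of the effect in a right-skewed data set.

```python
import statistics

# Hypothetical right-skewed salaries (dollars): most are moderate, two exceed $100,000.
salaries = [48_000, 50_000, 52_000, 55_000, 58_000,
            60_000, 62_000, 110_000, 125_000]

mean_salary = statistics.mean(salaries)      # pulled upward by the two high salaries
median_salary = statistics.median(salaries)  # middle value, unaffected by their size

print(round(mean_salary), median_salary)
```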
Listed below are the jersey numbers of 11 players randomly selected from the roster of a championship sports team. 26 20 44 93 61 54 85 38 58 95 28 What do the results tell us? a. The mean and median give two different interpretations of the average (or typical) jersey number, while the midrange shows the spread of possible jersey numbers. b. The midrange gives the average (or typical) jersey number, while the mean and median give two different interpretations of the spread of possible jersey numbers. c. Since only 11 of the jersey numbers were in the sample, the statistics cannot give any meaningful results. d. The jersey numbers are nominal data and they do not measure or count anything, so the resulting statistics are meaningless.
d. The jersey numbers are nominal data and they do not measure or count anything, so the resulting statistics are meaningless.
True or false? A histogram and a relative frequency histogram, constructed from the same data, always have the same basic shape. a. False. The two histograms can have very different shapes depending on the distribution of the data. b. True. They will both have the same shape and the same vertical scale. They will differ only in the scale used on the horizontal axis. c. False. The histograms use different scales on the y-axis, resulting in completely different shapes. d. True. A relative frequency histogram will have a different scale on the y-axis but the same shape as a regular histogram.
d. True. A relative frequency histogram will have a different scale on the y-axis but the same shape as a regular histogram.
Which branch of statistics deals with the organization and summarization of collected information? a) descriptive b) inferential
descriptive. Descriptive statistics deals with the organization and summarization of collected information. It involves the use of graphical and numerical methods to describe and summarize data. Inferential statistics, on the other hand, involves using sample data to make inferences about a population.
double blind
Double-blinding occurred at two levels: (1) The subject being injected didn't know whether they were getting a vaccine or a placebo, and (2) the doctors who gave the injections and evaluated the results did not know either. Codes were used so that the researchers could objectively evaluate the effectiveness of the vaccine.
If your score on your next statistics test is converted to a z score, which of these z scores would you prefer: -2.00, -1.00, 0, 1.00, 2.00? Why? a. The z score of 1.00 is most preferable because it is 1.00 standard deviation above the mean and would correspond to an above average test score. b. The z score of -1.00 is most preferable because it is 1.00 standard deviation below the mean and would correspond to an above average test score. c. The z score of 0 is most preferable because it corresponds to a test score equal to the mean. d. The z score of -2.00 is most preferable because it is 2.00 standard deviations below the mean and would correspond to the highest of the five different possible test scores. e. The z score of 2.00 is most preferable because it is 2.00 standard deviations above the mean and would correspond to the highest of the five different possible test scores.
e. The z score of 2.00 is most preferable because it is 2.00 standard deviations above the mean and would correspond to the highest of the five different possible test scores.
A study where a drug was given to 23 patients and a placebo to another group of 23 patients to determine if the drug has an effect on a patient's illness. a) simulation b) survey c) experiment d) observational study
experiment. In this study, a drug was given to one group of patients and a placebo to another group to determine if the drug has an effect on a patient's illness. This is an example of an experiment because the researchers are manipulating one variable (the treatment) to see its effect on another variable (the patient's illness). In an experiment, the researcher manipulates one or more independent variables to observe their effect on a dependent variable.
Cluster sampling
first divide the population area into sections (or clusters). Then we randomly select some of those clusters and choose all the members from those selected clusters
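The two steps of cluster sampling (randomly pick clusters, then take everyone in them) can be sketched as below. The class names, roster sizes, and seed are hypothetical, chosen only to mirror the "5 classes out of 20" example in this set.

```python
import random

# Hypothetical population: 20 classes (clusters) with 25 students each.
classes = {
    f"class_{i}": [f"student_{i}_{j}" for j in range(1, 26)]
    for i in range(1, 21)
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Step 1: randomly select 5 of the 20 clusters.
chosen_classes = rng.sample(sorted(classes), k=5)

# Step 2: include every member of each selected cluster.
sample = [student for name in chosen_classes for student in classes[name]]

print(chosen_classes, len(sample))
```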
From past figures, it is predicted that 19% of the registered voters in California will vote in the June primary. a) descriptive b) inferential
inferential. The prediction that 19% of the registered voters in California will vote in the June primary is an example of inferential statistics. Inferential statistics involves using sample data to make inferences about a population. In this case, past figures (sample data) are used to make a prediction about the population of registered voters in California.
parameter
is a numerical measurement describing some characteristic of a population.
statistic
is a numerical measurement describing some characteristic of a sample.
Statistical Significance
is achieved in a study when we get a result that is very unlikely to occur by chance. A common criterion has been this: We have statistical significance if the likelihood of an event occurring by chance is 5% or less.
blind
is used when the subject doesn't know whether he or she is receiving a treatment or a placebo. Blinding is a way to get around the placebo effect, which occurs when an untreated subject reports an improvement in symptoms. (The reported improvement in the placebo group may be real or imagined.)
Listed below are pulse rates (beats per minute) from samples of adult males and females. Find the mean and median for each of the two samples and then compare the two sets of results. Does there appear to be a difference? Male: 59 87 52 86 68 61 61 66 54 73 53 53 59 79 97 Female: 79 81 94 86 83 89 89 62 95 80 72 85 76 69 84 Find the means. The mean for males is ____ beats per minute and the mean for females is ____ beats per minute.
male = 67.2 female = 81.6 (The male values sum to 1008 and the female values sum to 1224; each sum is divided by 15.)
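The means and medians for the two samples can be checked directly from the listed values. A minimal sketch using Python's standard `statistics` module (the variable names are just for this example):

```python
from statistics import mean, median

male = [59, 87, 52, 86, 68, 61, 61, 66, 54, 73, 53, 53, 59, 79, 97]
female = [79, 81, 94, 86, 83, 89, 89, 62, 95, 80, 72, 85, 76, 69, 84]

# Mean: sum of the values divided by the number of values.
print(mean(male), mean(female))      # 67.2 81.6

# Median: middle value of the sorted data (15 values, so the 8th).
print(median(male), median(female))  # 61 83
```

Both measures are noticeably higher for the female sample, so there does appear to be a difference.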
When an odd number of data values are arranged in order, the _________ is the middle value.
median
If the maximum and minimum values in a data set are averaged, the result is the
midrange
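A short sketch of the midrange, applied to the male pulse rates listed earlier in this set (the function name `midrange` is chosen for the example):

```python
def midrange(values):
    """Average of the maximum and minimum values."""
    return (max(values) + min(values)) / 2

male = [59, 87, 52, 86, 68, 61, 61, 66, 54, 73, 53, 53, 59, 79, 97]
print(midrange(male))  # 74.5, from (97 + 52) / 2
```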
Five Number Summary
minimum, first quartile (Q1), second quartile (Q2, the median), third quartile (Q3), maximum
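The sketch below computes a five-number summary for the male pulse rates listed earlier in this set. Quartile conventions differ between textbooks and software; this example assumes one common convention, in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when the number of values is odd. Other conventions can give slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

male = [59, 87, 52, 86, 68, 61, 61, 66, 54, 73, 53, 53, 59, 79, 97]
print(five_number_summary(male))  # (52, 54, 61, 79, 97)
```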
The data value that occurs with the greatest frequency is called the
mode
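A data set can have one mode, several modes, or none. A small sketch using `statistics.multimode` on the pulse rates listed earlier in this set:

```python
from statistics import multimode

male = [59, 87, 52, 86, 68, 61, 61, 66, 54, 73, 53, 53, 59, 79, 97]
female = [79, 81, 94, 86, 83, 89, 89, 62, 95, 80, 72, 85, 76, 69, 84]

# multimode returns every value tied for the greatest frequency.
print(multimode(female))        # [89]
print(sorted(multimode(male)))  # [53, 59, 61]: three values each occur twice
```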
A(n) _______ distribution has a "bell" shape.
normal
A study where a political pollster wishes to determine if his candidate is leading in the polls for an upcoming election. a) simulation b) observational study c) experiment
observational study. The pollster wants to determine whether his candidate is leading, so he observes and collects data without manipulating any variables. That is what distinguishes an observational study from an experiment.
______ is/are the entire group of individuals or items being studied.
population
Data come in two basic types
qualitative and quantitative. Qualitative (or categorical) data consist of values that can be placed into nonnumerical categories. Quantitative data consist of values representing counts or measurements.
Eye Color a) qualitative b) quantitative
qualitative. Eye color is a qualitative variable because it cannot be measured numerically and is instead described by categories or labels. Quantitative variables, on the other hand, are variables that can be measured on a numerical scale.
The cost of a Statistics textbook. a) qualitative b) quantitative
quantitative. The cost of a Statistics textbook is a quantitative variable because it can be measured numerically. Quantitative variables are variables that can be measured on a numerical scale. Qualitative variables, on the other hand, are variables that cannot be measured numerically and are instead described by categories or labels.
quantitative data
refers to data or information that is numerical and can be measured or counted.
continuous data
result from infinitely many possible quantitative values, where the collection of values is not countable. (That is, it is impossible to count the individual items because at least some of them are on a continuous scale, such as the lengths of distances from 0 cm to 12 cm.)
discrete data
result when the data values are quantitative and the number of values is finite, or "countable." (If there are infinitely many values, the collection of values is countable if it is possible to count them individually, such as the number of tosses of a coin before getting tails.)
______ is/are a subset of the population that is being studied.
sample
Systematic sampling
select some starting point and then select every kth (such as every 50th) element in the population.
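Systematic sampling maps naturally onto list slicing. In this sketch the population of 100 numbered passengers, the helper name `systematic_sample`, and the choice of a random starting point are assumptions made for illustration.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k elements, then take every kth element."""
    start = rng.randrange(k)
    return population[start::k]

# Hypothetical population: passengers numbered 1 to 100 in boarding order.
passengers = list(range(1, 101))
sample = systematic_sample(passengers, 5, random.Random(1))
print(len(sample))  # 20: every 5th passenger out of 100
```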
A histogram aids in analyzing the _______ of the data.
shape of the distribution
Convenience sampling
simply use data that are very easy to get.
50 sophomores, 30 juniors and 20 seniors are randomly selected from 500 sophomores, 300 juniors and 200 seniors at a certain high school. a. stratified b. cluster c. convenience d. systematic e. random
stratified
50 sophomores, 30 juniors and 20 seniors are randomly selected from 500 sophomores, 300 juniors and 200 seniors at a certain high school. a) stratified b) cluster c) convenience d) systematic
stratified. The population is divided into groups (strata) by grade level, and a random sample is then taken from each group: 50 of 500 sophomores, 30 of 300 juniors, and 20 of 200 seniors. Stratified sampling involves dividing the population into strata based on some characteristic and then taking a random sample from each stratum.
Stratified sampling
subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics (such as gender). Then we draw a sample from each subgroup (or stratum).
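The high school example in this set (500 sophomores, 300 juniors, 200 seniors) can be sketched as follows. The student labels, the helper name `stratified_sample`, and the fixed seed are assumptions for illustration. The per-stratum sample sizes are passed in explicitly, since the definition only requires that a sample be drawn from each stratum.

```python
import random

def stratified_sample(strata, sizes, rng):
    """Draw a simple random sample of the requested size from each stratum."""
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}

# Hypothetical student labels for each grade level (stratum).
school = {
    "sophomores": [f"sophomore{i}" for i in range(500)],
    "juniors": [f"junior{i}" for i in range(300)],
    "seniors": [f"senior{i}" for i in range(200)],
}
sizes = {"sophomores": 50, "juniors": 30, "seniors": 20}

sample = stratified_sample(school, sizes, random.Random(2))
print({name: len(chosen) for name, chosen in sample.items()})
# {'sophomores': 50, 'juniors': 30, 'seniors': 20}
```

Unlike cluster sampling, every stratum contributes to the sample, and only some members of each stratum are chosen.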
Every fifth person boarding a plane is searched thoroughly. a) cluster b) random c) convenience d) systematic
systematic. In this example, every fifth person boarding a plane is searched thoroughly. This is an example of systematic sampling because the sample is selected by choosing every kth element from the population. Systematic sampling involves selecting every kth element from the population after a random start.
The ___________ is found by adding all the data values and dividing by the total number of values.
the mean
distribution
the way that a set of data is spread out over a range of values. It describes the frequency of different values or ranges of values in the data set. There are many different types of distributions, including normal, uniform, and skewed distributions. The shape of a distribution can provide important information about the underlying data and can be used to make inferences or predictions.
What is the symbol used to represent the population mean?
μ (mu)