STAT CHAP 1
Which measure of center must be equal to an actual data value? Explain why.
Since the mode is the most frequent observation that occurs in the data set, it must be an actual value from the data set.
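A minimal sketch of the point above, using a small hypothetical data set (not from the course material) chosen so that the mean and median fall between observed values while the mode does not:

```python
from statistics import mean, median, mode

# Hypothetical data set, chosen so the mean and median are not observed values.
data = [1, 1, 4, 9]

center_mean = mean(data)      # 15 / 4, a value that never occurs in the data
center_median = median(data)  # average of the two middle values, 1 and 4
center_mode = mode(data)      # most frequent observation, always an actual value

# Report whether each measure of center is an actual data value.
print(center_mean in data, center_median in data, center_mode in data)
```

The mean and median can coincide with a data value for other data sets; only the mode is guaranteed to.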
Julie, who cuts and styles hair in her home salon, had 23 customers last week
descriptive statistics
The average exam score for my statistics exam was an 88
descriptive statistics
The average Amazon.com rating of the book The Complete Idiot's Guide to Statistics by 26 reviewers is 4.6 on a scale of 1 to 5
descriptive statistics
_____ statistics consists of organizing and summarizing the collected data, while _____ statistics uses methods that generalize results obtained from a sample to the population.
descriptive; inferential
Karen wants to determine the effectiveness of a medicine to lower blood pressure. In order to do this, she gives a placebo to half of the study group and the blood pressure medicine to the other half and observes the results. What method is Karen using for her study?
experiment
A recent poll showed that 57% of Americans had a favorable opinion of the president of the United States
inferential statistics
A parameter is a
measurement from the population.
The value that divides a histogram into two equal areas is called the ____________. The value that serves as a balancing point for a histogram is the ____________.
median; mean
A ________________ variable classifies individuals based on some attribute or characteristic.
qualitative
Number of respondents: Black 757, White 345, Other 227
qualitative/ nominal
Monthly temperatures
quantitative/interval
Your IQ score
quantitative/interval
the number of boxes of frosted flakes on the shelf of a grocery store
quantitative/ratio
What is the symbol used to represent the sample standard deviation?
s
Name a feature of a distribution that is more easily seen in a histogram than a boxplot.
The shape of the distribution
What is the symbol used to represent the sample mean?
x̄
Any measurement from a sample is considered a _____.
statistic.
Which is true concerning bar graphs?
the height of each bar represents the category frequency or relative frequency
What is the symbol used to represent the population mean?
μ
What is the symbol used to represent the population standard deviation?
σ
What is the symbol used to represent the population variance?
σ^2
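As a hedged illustration of how the population variance differs from the sample variance, the sketch below treats the same hypothetical values first as an entire population (dividing by N) and then as a sample (dividing by n - 1). The data are made up for the example.

```python
from statistics import pvariance, variance

# Hypothetical values with mean 5; the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = pvariance(data)  # 32 / 8: divides by N
sample_variance = variance(data)       # 32 / 7: divides by n - 1

print(population_variance, sample_variance)
```

Dividing by n - 1 makes the sample variance slightly larger than the population formula would give for the same numbers.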
Which two graphs allow the reader to retrieve the original list of data?
Stem-and-leaf plots and dotplots
When studying a group of individuals, four techniques that are available are "Direct observation", "Focus group", "Experiment", and "Survey." Which of the following is a correct definition of the term "Direct observation"?
A method of gathering data while the subjects of interest are in their natural environment, often unaware they are being watched.
The _________________ is/are the entire group of individuals or items being studied.
population
Describe the difference between a statistic and a parameter.
A parameter is a descriptive numerical measure, such as an average or a proportion, computed from an entire population. A statistic is the corresponding measure computed from a sample.
When studying a group of individuals, four techniques that are available are "Direct observation", "Focus group", "Experiment", and "Survey." Which of the following is a correct definition of the term "Survey"?
A technique where individuals are directly asked a series of questions.
Identify the statement below as either descriptive or inferential statistics. A certain website sold an average of 242 books per day last week.
It is an example of descriptive statistics because it summarizes the information in a sample.
Suppose the equation of a least-squares regression line is ŷ = −3.17 − 2.4x. What can be said about the correlation coefficient?
It is negative, but its exact value cannot be determined from the given information.
Which information can you obtain from a stem-and-leaf plot but not from a histogram?
Minimum and maximum data values
The correlation coefficient measures the strength and direction of the linear relationship between two variables.
The two variables must be quantitative.
What is the definition of the correlation coefficient?
The correlation coefficient is a measure that describes the direction and strength of the linear relationship between two quantitative variables.
The type of cars driven by students in your class
qualitative/nominal
What does a correlation coefficient of 0 indicate?
There is no linear relationship between the two quantitative variables.
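The definition of the correlation coefficient can be sketched as follows. This is a hand-written Pearson r under the usual textbook formula, with hypothetical data sets picked to give r = 1, r = -1, and r = 0; the function name `pearson_r` is an assumption for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two quantitative variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: perfect positive and perfect negative linear relationships.
r_up = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
r_down = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])

# Hypothetical data: y = x^2 is a strong relationship, but not a linear one.
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_up, r_down, r_curved)
```

The curved example shows why r = 0 only rules out a linear relationship, not every relationship.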
Identify the type of data (qualitative/quantitative) and the level of measurement for the education level of survey respondents. Explain your choice.
Level              Number of respondents
High school        169
Bachelor's degree  751
Master's degree    229
What is the data set's level of measurement?
Qualitative, because descriptive terms are used to measure or classify the data. Ordinal, because the data are categories or labels that can be ranked.
The average temperature (in °C) of water samples taken in the lakes in a region. What is the data set's level of measurement?
Quantitative, because numerical values, found by either measuring or counting, are used to describe the data. Interval, because the differences in the data can be meaningfully measured, but the data do not have a true zero point.
Explain why city altitude is a variable at the interval and not ratio level of measurement.
Differences in altitudes can be found and are meaningful. However, there is no natural zero starting point since some cities have altitudes below sea level. A value of zero does not mean that a city has no altitude. It only means that it is even with sea level. Since there is no natural zero starting point for this variable, it is not at the ratio level of measurement.
Explain why the mean should not be found for a sample of zip codes. Which measure of center should be used instead?
Even though they are numeric data, zip codes are qualitative since they do not measure or count anything. The mean cannot be found since adding zip codes would be meaningless. For qualitative data, the mode is the only measure of center that can be found.
In regression, what is predicting outside the range of the x-values from the sample data called?
Extrapolation
Researchers wondered if brain size has an effect on a person's IQ. From a sample of 20 individuals, the equation of the least-squares regression line is ŷ = 71.8 + 0.0286x, where x represents the size of a brain in cubic centimeters and y represents IQ. What is the interpretation of the slope?
IQ is predicted to increase by 0.0286 for every 1 cubic centimeter increase in brain size.
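The slope interpretation can be checked numerically with the regression line from the question, ŷ = 71.8 + 0.0286x. In this sketch the function name and the brain size of 1000 cubic centimeters are hypothetical, chosen only for illustration.

```python
def predicted_iq(brain_size_cc):
    """Predicted IQ from the least-squares line y-hat = 71.8 + 0.0286x."""
    return 71.8 + 0.0286 * brain_size_cc

# Hypothetical brain size used only to illustrate the slope.
size = 1000

# Increasing x by 1 cubic centimeter changes the prediction by the slope.
change_per_cc = predicted_iq(size + 1) - predicted_iq(size)

print(predicted_iq(size), change_per_cc)
```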
In a typical boxplot, the length of the box indicates which measure of spread?
Interquartile range (IQR)
On a website that sells books, the average rating of a certain statistics book by 20 reviewers is less than the average rating of that same book by 20 reviewers last year.
It is descriptive statistics because it summarizes the information in two samples.
The average salary of a random sample of 50 high school teachers in 2010 was less than the average salary of a random sample of 50 high school teachers in 2009.
It is descriptive statistics because it summarizes the information in two samples.
Based on a random sample, it was concluded that the average cost of a hotel room in City A has decreased over the past year.
It is inferential statistics because it uses a sample to make a claim about a population.
Based on a random sample, it was concluded that the average cost of a hotel room in City A was $169.99 for a single night.
It is inferential statistics because it uses a sample to make a claim about a population.
A study has concluded that the average credit card debt of college graduates is less than the average credit card debt of college undergraduates. Is the above an example of descriptive or inferential statistics?
It is inferential statistics because it uses two samples to make a claim about two populations.
Suppose the equation of a least-squares regression line is ŷ = −3.17 − 2.4x. What can be said about the y-intercept?
It is −3.17.
Suppose you want to know if more technical service calls are made to homes with cable television or with satellite dish television. Should you use frequencies or relative frequencies to make the comparison? Why?
Relative frequencies should be used since there is likely a difference in the number of users of cable and satellite television. If you make comparisons using frequencies, the results can be very misleading for different population sizes.
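A rough sketch of why raw frequencies can mislead here. All counts below are hypothetical (the question gives no data), and relative frequency is taken as service calls divided by the number of homes in each group.

```python
# Hypothetical counts: the question itself provides no numbers.
service_calls = {"cable": 120, "satellite": 45}
homes = {"cable": 4000, "satellite": 1000}

# Relative frequency: calls per home within each group.
relative = {kind: service_calls[kind] / homes[kind] for kind in service_calls}

more_calls_by_count = max(service_calls, key=service_calls.get)
more_calls_by_rate = max(relative, key=relative.get)

print(more_calls_by_count, more_calls_by_rate)
```

With these made-up numbers, cable has more calls in total, but satellite homes call more often relative to the group size.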
Suppose a pediatrician is wondering whether there is more variability in the heights or weights of the 2-year-old boys that he sees and collects the data below for a sample of 100 2-year-old boys in his practice. He concludes that the boys' weights vary more than their heights since the standard deviation is greater for weight than for height. What is wrong with this conclusion? Heights: mean=30.2 in., standard deviation=1.9 in. Weights: mean=29.4 lb, standard deviation=2.1 lb
Since the standard deviations have different units, he cannot compare them directly. The coefficient of variation should be used instead.
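The coefficient of variation (standard deviation divided by the mean, expressed as a percentage) can be computed from the figures in the question. A minimal sketch, with the function name chosen for illustration:

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean (unit-free)."""
    return std_dev / mean * 100

# Summary statistics given in the question.
cv_height = coefficient_of_variation(1.9, 30.2)  # inches cancel out
cv_weight = coefficient_of_variation(2.1, 29.4)  # pounds cancel out

print(round(cv_height, 2), round(cv_weight, 2))
```

Because the units cancel, the two percentages can be compared directly. Here the weights do show more relative variability, but that follows from the coefficients of variation, not from comparing the raw standard deviations.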
April calculated a correlation coefficient between sex and GPA as −0.25. She said there is a weak correlation between a person's sex and their GPA. Which of the following is an appropriate comment about April's statement?
The correlation coefficient does not make sense to describe the relationship between a categorical and quantitative variable.
The table below shows the median weekly earnings of full-time workers by age range over a five-year period. Identify the cross-sectional data in the table.
Year  16-24 years  25+ years
2004  $573         $582
2005  $585         $587
2006  $600         $587
2007  $614         $623
2008  $638         $644
The cross-sectional data are the five rows because they each show values taken from two different subjects in a single year. The time series data are the two earnings columns because they both show values taken from a single subject across five years.
The table below shows the price of a car at three major car dealerships on July 29, 2009. Identify the data as either time series or cross-sectional.
Car Dealership  Car Price
Car Lot 1       $35,000
Car Lot 2       $32,000
Car Lot 3       $41,000
The data are cross-sectional because the data are taken from situations that vary over time but are measured during a single time period.
Both the Empirical Rule and Chebyshev's Inequality can be used to determine the percentage of data that lie within a certain range. What assumptions must be made about the underlying distribution before using these rules?
The distribution must be roughly bell-shaped for the Empirical Rule to apply, but Chebyshev's Inequality holds for distributions of any shape.
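For comparison, Chebyshev's Inequality guarantees that at least 1 - 1/k^2 of the data lie within k standard deviations of the mean (for k > 1), while the Empirical Rule gives the approximate figures 68%, 95%, and 99.7% for bell-shaped data. A small sketch, with names chosen for illustration:

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Inequality is informative only for k > 1")
    return 1 - 1 / k**2

# Approximate Empirical Rule fractions, valid only for bell-shaped data.
empirical_rule = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(k, chebyshev_min_fraction(k), empirical_rule[k])
```

Chebyshev's bounds are lower because they must hold for any shape of distribution.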
Which measure of center (mean or median) is resistant? Explain what it means for that measure to be resistant.
The median is resistant because it is not sensitive to extreme values in the data set. If the largest observation were doubled, for example, the median would not change since that largest value does not factor into its computation.
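The doubling example can be tried directly. The data values below are hypothetical:

```python
from statistics import mean, median

# Hypothetical data set.
data = [10, 12, 14, 16, 18]

# Same data with the largest observation doubled.
altered = data[:-1] + [data[-1] * 2]

print(median(data), median(altered))  # the median stays the same
print(mean(data), mean(altered))      # the mean is pulled upward
```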
A(n) ______________ is a sample that does not represent the intended population and can lead to distorted findings.
biased sample
A qualitative variable involves _______ as opposed to numeric measurements that count or measure something.
categories
The numbers used to separate the classes of a frequency distribution, but without the gaps created by class limits, are called ______
class boundaries
The ____________________ is the difference between two consecutive lower class limits or two consecutive upper class limits.
class width
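A short sketch of class width and class boundaries, using hypothetical class limits 10-19, 20-29, and 30-39. It assumes the common textbook convention that each boundary sits halfway across the gap between one class's upper limit and the next class's lower limit.

```python
# Hypothetical class limits for three classes: 10-19, 20-29, 30-39.
lower_limits = [10, 20, 30]
upper_limits = [19, 29, 39]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Gap between the upper limit of one class and the lower limit of the next.
gap = lower_limits[1] - upper_limits[0]

# Class boundaries close that gap by splitting it in half.
boundaries = [lower_limits[0] - gap / 2] + [u + gap / 2 for u in upper_limits]

print(class_width, boundaries)
```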
Remember that a variable is not necessarily quantitative just because it has numbers. It must _________.
count or measure something.
cross-sectional
data values that correspond to a specific time period
time-series
data values that correspond to a specific measurement over a range of time periods
Seventy-eight percent of customers at the Holiday Inn hotel in Dover, Delaware arrived before 6 pm last week
descriptive statistics
A correlation coefficient close to 1 is evidence of a cause-and-effect relationship between the two variables. T/F
false
Allie calculated a correlation coefficient of −0.5. She made a mistake in her calculation since the correlation coefficient cannot be negative. T/F
false
Heather was investigating the relationship between outside temperature and type of activity people were engaged in (indoor versus outdoor). She can use the correlation coefficient to describe the strength of this relationship as long as the relationship is linear. T/F
false
A(n) ____________________ is a bar graph in which the height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same, and the rectangles touch each other.
histogram
Based on a random sample of hotels in Chicago and a random sample of hotels in Atlanta, it was concluded that the average cost of a hotel room in Chicago was greater than in Atlanta
inferential statistics
Households with children under the age of 18 are more likely to have access to the internet (77%) than family households with no children (68%)
inferential statistics
the genders of the respondents in a survey
qualitative/nominal
the marital status of survey respondents single: 38 married: 189 divorced: 62
qualitative/nominal
the state in which the respondents in a survey reside
qualitative/nominal
the voting intentions of the respondents in a survey classified as republican, democrat, or undecided
qualitative/nominal
The letter grade earned in your statistics course
qualitative/ordinal
a list of graduating high school seniors by class rank
qualitative/ordinal
movie ratings: G, PG, PG-13, R
qualitative/ordinal
the performance rating of employees classified as above expectations, meets expectations or below expectations
qualitative/ordinal
A ________________ variable counts or measures something and has numeric values.
quantitative
The price for one gallon of gasoline
quantitative/ ratio
the average monthly rainfall in inches for the city of wilmington throughout the year
quantitative/ ratio
What is the symbol used to represent the sample variance?
s^2
The _________________ is/are a subset of the population that is being studied.
sample
Suppose that a researcher is interested in the average standardized test score for fifth graders in a local school district. The fifth graders at a specific school would comprise a ___________ and their average test score would be a ___________.
sample; statistic
The line that fits best between the points in a scatterplot is the line that gives the _______ sum of the squared _______ distances between each point and the line.
smallest; vertical
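A minimal sketch of the least-squares idea, using the standard textbook formulas for slope and intercept. The data and function names are hypothetical. The fitted line is compared against two slightly shifted lines to show that it gives a smaller sum of squared vertical distances.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_vertical(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical scatterplot data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
best = sum_squared_vertical(xs, ys, slope, intercept)
steeper = sum_squared_vertical(xs, ys, slope + 0.1, intercept)
higher = sum_squared_vertical(xs, ys, slope, intercept + 0.1)

print(best, steeper, higher)
```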
A z-score represents how many ______________ a data value is above or below the ______________.
standard deviations; mean
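The z-score definition can be written as a one-line function. The mean of 88 and standard deviation of 4 below are hypothetical values chosen for illustration:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Hypothetical distribution: mean 88, standard deviation 4.
z_above = z_score(94, 88, 4)  # 6 points above the mean
z_below = z_score(82, 88, 4)  # 6 points below the mean

print(z_above, z_below)
```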
A correlation coefficient can be 0. T/F
true
Alex calculated a correlation coefficient of −1.5. He made a mistake in his calculation since the correlation coefficient has to be between −1 and 1. T/F
true