ISDS Final
Calculate the interquartile range from the following data: 1, 2, 4, 5, 10, 12, 18.
5 6 *10* 17
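The answer above can be checked with a short Python sketch. It assumes quartiles are located with the (n + 1)p position rule, which is what the standard library's `exclusive` method uses; other quartile conventions can give slightly different values. The variable names are illustrative only.

```python
import statistics

data = [1, 2, 4, 5, 10, 12, 18]

# method="exclusive" places the p-th quantile at position (n + 1) * p
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# Interquartile range = third quartile minus first quartile
iqr = q3 - q1
print(q1, q3, iqr)
```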
A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following ANOVA table shows a portion of the regression results.
54,500
Approximate the percentage of houses that sold for more than $500,000.
60%. Draw a vertical line from about 500 on the x axis; this crosses the ogive at approximately 0.4. So about 40% of the houses sold for less than $500,000, which implies that about 60% sold for more than $500,000.
A college professor collected data on the number of hours spent by his 100 students over the weekend to prepare for Monday's Business Statistics exam. He processed the data by Excel and the following incomplete output is available.
The median is most likely to be ________________. *Less than 7 hours*
When using a polygon to graph quantitative data, what does each point represent?
The midpoint of a particular class and its associated frequency or relative frequency
What is the percentage of people who live below the poverty level in the West or Midwest?
The percent frequency is the percent of observations in a category (or categories), and it equals the frequency divided by the total number of observations and multiplied by 100.
How do the tdf and z distributions differ?
There is no difference between the tdf and z distributions. The z distribution has broader tails (it is flatter around zero). *The tdf distribution has broader tails (it is flatter around zero).* The z distribution has asymptotic tails, while the tdf distribution does not.
What is the purpose of calculating a confidence interval?
To provide a range of values that, with a certain measure of confidence, contains the population parameter of interest.
What is(are) the most widely used measure(s) of dispersion?
Variance and standard deviation
A 90% confidence interval is constructed for the population mean. If a 95% confidence interval had been constructed instead (everything else remaining the same), the width of the interval would have been ________ and the probability of making an error would have been _________.
Wider, smaller
Suppose the average price for new cars in 2012 has a mean of $30,100 and a standard deviation of $5,600. Based on this information, what interval of prices would we expect at least 89% of new car prices to fall within?
$24,500 to $35,700 $18,900 to $41,300 *$13,300 to $46,900* $7,700 to $52,500
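Chebyshev's theorem says at least 1 - 1/k² of observations lie within k standard deviations of the mean, whatever the shape of the distribution. For k = 3 the bound is 8/9, or about 89%. A minimal Python sketch of the arithmetic (variable names are illustrative):

```python
mean, sd = 30_100, 5_600
k = 3  # number of standard deviations on each side of the mean

# Chebyshev lower bound on the proportion within k standard deviations
bound = 1 - 1 / k**2

lower, upper = mean - k * sd, mean + k * sd
print(round(bound, 4), lower, upper)
```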
The starting salary of an administrative assistant is normally distributed with a mean of $50,000 and a standard deviation of $2,500. We know that the probability of a randomly selected administrative assistant making a salary between μ - x and μ + x is 0.7416. Find the salary range referred to in this statement.
$42,825 to $52,825 $42,825 to $57,175 *$47,175 to $52,825* $47,175 to $57,175
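One way to reproduce this range: a central probability of 0.7416 leaves (1 - 0.7416)/2 in each tail, so the upper endpoint sits at the 0.8708 quantile of the standard normal. The sketch below uses Python's `statistics.NormalDist` and works with the unrounded z value, so it lands within a dollar or so of the table-based answer.

```python
from statistics import NormalDist

mu, sigma = 50_000, 2_500
central_prob = 0.7416

# z value that leaves central_prob in the middle of the standard normal
z = NormalDist().inv_cdf(0.5 + central_prob / 2)

# Distance x from the mean, then the salary range mu - x to mu + x
x = z * sigma
lower, upper = mu - x, mu + x
print(round(z, 2), round(lower), round(upper))
```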
Suppose that, on average, electricians earn approximately µ = $54,000 per year in the United States. Assume that the distribution for electricians' yearly earnings is normally distributed and that the standard deviation is σ = $12,000. What is the probability that the average salary of four randomly selected electricians exceeds $60,000?
*0.1587* 0.3085 0.6915 0.8413
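The key step is that the sample mean of n = 4 draws has standard error σ/√n = 6,000, which puts $60,000 exactly one standard error above the mean. A small Python sketch, assuming the normal model stated in the question:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 54_000, 12_000, 4

# Standard error of the sample mean
se = sigma / sqrt(n)

# P(sample mean > 60,000) under a normal sampling distribution
prob = 1 - NormalDist(mu, se).cdf(60_000)
print(se, round(prob, 4))
```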
Suppose the round-trip airfare between Boston and Orlando follows the normal probability distribution with a mean of $387.20 and a standard deviation of $68.50. What is the probability that a randomly selected airfare between Boston and San Francisco will be more than $450?
*0.1788* 0.0788 0.3369 0.2033
The labor force participation rate is the number of people in the labor force divided by the number of people in the country who are of working age and not institutionalized. The BLS reported in February 2012 that the labor force participation rate in the United States was 63.7% (Calculatedrisk.com). A marketing company asks 120 working-age people if they either have a job or are looking for a job, or, in other words, whether they are in the labor force. What is the probability that fewer than 60% of those surveyed are members of the labor force?
*0.2005* 0.7995 0.8400 0.9706
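This one uses the normal approximation for a sample proportion, with standard error √(p(1 - p)/n). The sketch below computes the probability twice: once with the unrounded z, and once with z rounded to two decimals, which is how a printed z table is usually read and is the version that matches the marked answer.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.637, 120

# Standard error of the sample proportion
se = sqrt(p * (1 - p) / n)
z = (0.60 - p) / se

prob_exact = NormalDist().cdf(z)            # unrounded z
prob_table = NormalDist().cdf(round(z, 2))  # z rounded as in a z table
print(round(z, 2), round(prob_exact, 4), round(prob_table, 4))
```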
The height of the probability density function f(x) of the uniform distribution defined on the interval [a, b] is _______.
*1/(b - a) between a and b, and zero otherwise.* (b - a)/2 between a and b, and zero otherwise. (a + b)/2 between a and b, and zero otherwise. 1/(a + b) between a and b, and zero otherwise.
Suppose the wait to pass through immigration at JFK Airport in New York is thought to be bell-shaped and symmetrical with a mean of 22 minutes. It is known that 68% of travelers will spend between 16 and 28 minutes waiting to pass through immigration. The standard deviation for the wait time through immigration is _________.
*6 minutes* 8 minutes 9 minutes 10 minutes
The national average for an eighth-grade reading comprehension test is 73. A school district claims that its eighth-graders outperform the national average. In testing the school district's claim, how does one define the population parameter of interest?
*The mean score on the eighth-grade reading comprehension test* The number of eighth graders who took the reading comprehension test The standard deviation of the score on the eighth-grade reading comprehension test The proportion of eighth graders who scored above 73 on the reading comprehension test
A professional sports organization is going to implement a test for steroids. The test gives a positive reaction in 94% of the people who have taken the steroid. However, it erroneously gives a positive reaction in 4% of the people who have not taken the steroid. What are the probabilities of Type I and Type II errors, given the null hypothesis "the individual has not taken steroids"?
*Type I: 4%, Type II: 6%* Type I: 6%, Type II: 4% Type I: 94%, Type II: 4% Type I: 4%, Type II: 94%
A Type I error occurs when we ___________.
*reject the null hypothesis when it is actually true* reject the null hypothesis when it is actually false do not reject the null hypothesis when it is actually true do not reject the null hypothesis when it is actually false
Simple linear regression analysis differs from multiple regression analysis in that ___________________________________________________________.
*simple linear regression uses only one explanatory variable* the coefficient of correlation is meaningless in simple linear regression goodness-of-fit measures cannot be calculated with simple linear regression the coefficient of determination is always higher in simple linear regression
It is generally believed that no more than 0.50 of all babies in a town in Texas are born out of wedlock. A politician claims that the proportion of babies born out of wedlock is increasing. When testing the two hypotheses, H0: p ≤ 0.50 and HA: p > 0.50, p stands for _____________.
*the current proportion of babies born out of wedlock* the mean number of babies born out of wedlock the number of babies born out of wedlock the general belief that the proportion of babies born out of wedlock is no more than 0.50
In a simple linear regression model, if the points on a scatter diagram lie on a straight line, which of the following is the coefficient of determination?
+1
The probability P(Z < -1.28) is closest to _______.
-0.10 *0.10* 0.20 0.90
The probability P(Z > 1.28) is closest to _______.
-0.10. *0.10.* 0.20. 0.90.
A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following values: formula127.mml= 440, formula128.mml, formula129.mml= 1,120, n = 11. The value of the slope b1 is ____.
-1.29
Consider the following data: x̄ = 20, sx = 2, ȳ = -5, sy = 4, and b1 = -0.8. The sample correlation coefficient, rxy, is equal to ____.
-0.40
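In simple linear regression the slope and the sample correlation are linked by b1 = rxy · sy / sx, so rxy = b1 · sx / sy. The means are not needed. A short Python sketch of that arithmetic with the values from the question, which gives -0.40:

```python
s_x, s_y = 2, 4   # sample standard deviations of x and y
b1 = -0.8         # estimated slope

# Rearranging b1 = r_xy * s_y / s_x
r_xy = b1 * s_x / s_y
print(r_xy)
```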
A regression equation was estimated as ŷ = -100 + 0.5x. If x = 20, the predicted value of y is _____.
-90
Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 76 and a standard deviation of 12. What is the probability that a class of 36 students will have an average greater than 70 on Professor Elderman's final exam?
0.0014 0.3085 0.6915 *0.9986*
A random sample of size 100 is taken from a population described by the proportion p = 0.60. What are the expected value and the standard error for the sampling distribution of the sample proportion?
0.006 and 0.0024 0.060 and 0.049 0.600 and 0.0024 *0.600 and 0.049*
Find the probability P(-1.96 ≤ Z ≤ 0).
0.0250 0.0500 *0.4750* 0.5250
Find the probability P(-1.96 ≤ Z ≤ 1.96).
0.0500 *0.9500* 0.9750 1.9500
A nursery sells trees of different types and heights. These trees average 60 inches in height with a standard deviation of 16 inches. Suppose that 75 pine trees are sold for planting at City Hall. What is the standard deviation for the sample mean?
1.85
What is zα/2 for a 95% confidence interval of the population mean?
1.96
Suppose the average math SAT score for students enrolled at local community college is 490.4 with a standard deviation of 63.7. A random sample of 49 students has been selected. The standard error of the mean for this sample is _______.
10.6 *9.1* 63.7 28.4
An analyst is forecasting net income for Excellence Corporation for the next fiscal year. Her low-end estimate of net income is $250,000, and her high-end estimate is $350,000. Prior research allows her to assume that net income follows a continuous uniform distribution. The probability that net income will be greater than or equal to $337,500 is _______.
12.5%
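For a continuous uniform distribution on [a, b], the probability of an interval is its length divided by b - a. A minimal Python sketch with the figures from the question (variable names are illustrative):

```python
low, high = 250_000, 350_000   # bounds of the uniform distribution
threshold = 337_500

# P(X >= threshold) = length of [threshold, high] over length of [low, high]
prob = (high - threshold) / (high - low)
print(prob)
```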
Consider a population with data values of 12 8 28 22 12 30 14. The median is ____.
14
A bowler's scores for a sample of seven games were 172 168 188 190 172 182 174. The bowler's median score is _____.
174
An analyst studies a data set of the 2011 year-end book value per share for all companies listed on the New York Stock Exchange. This data set is best described as
time series data. cross-sectional data. neither time series nor cross-sectional data. a combination of time series and cross-sectional data. CROSS-SECTIONAL DATA
Consider the following simple linear regression model: y = β0 + β1x + ε. The random error term is ____.
y x *ε* β0
The central limit theorem states that, for any distribution, as n gets larger, the sampling distribution of the sample mean _______.
is closer to a normal distribution
The ordinal scale of data measurement is
less sophisticated than the nominal scale. more sophisticated than the interval scale. more sophisticated than the nominal scale. as equally sophisticated as the nominal scale. MORE SOPHISTICATED THAN THE NOMINAL SCALE
The table below gives the deviations of a portfolio's annual total returns from its benchmark's annual returns, for a 6-year period ending in 2011.
mean = -1.67% and median = -0.56%.
A respondent of a survey is asked whether the Philadelphia Flyers' performance in the last game was excellent, good, fair, or poor. The person indicates that the performance was "good." This is an example of
nominal data *ordinal data* interval data ratio data
Using the central limit theorem, applied to the sampling distribution of the sample proportion, what conditions must be met?
np ≥ 5 and n(1 - p) ≥ 5
Over the entire six years that students attend an Ohio elementary school, they are absent, on average, 28 days due to influenza. Assume that the standard deviation over this time period is σ = 9 days. Upon graduation from elementary school, a random sample of 36 students is taken and asked how many days of school they missed due to influenza. The probability that the sample mean is less than 30 school days is _______.
0.0918 0.4129 0.5871 *0.9082*
According to the 2011 Gallup daily tracking polls (www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 53.4 percent of its residents identifying themselves as conservative. What is the probability that at least 100 but fewer than 115 respondents of a random sample of 200 Mississippi residents identify as conservative?
0.1685 0.3370 *0.7085* 0.8770
A university administrator expects that 25% of students in a core course will receive an A. He looks at the grades assigned to 60 students. The probability that the proportion of students who receive an A is not between 0.20 and 0.30 is _______.
0.1867 *0.3734* 0.6266 0.8133
The time to complete the construction of a soapbox derby car is normally distributed with a mean of three hours and a standard deviation of one hour. Find the probability that it would take between 2.5 and 3.5 hours to construct a soapbox derby car.
0.3085 *0.3830* 0.6170 0.6915
Suppose the average price of gasoline for a city in the United States follows a continuous uniform distribution with a lower bound of $3.50 per gallon and an upper bound of $3.80 per gallon. What is the probability a randomly chosen gas station charges more than $3.70 per gallon?
0.3333
A random sample of size 36 is taken from a population with mean µ = 17 and standard deviation σ = 6. What are the expected value and the standard deviation for the sampling distribution of the sample mean?
0.425 and 1.00 0.425 and 2.83 *17 and 1.00* 17 and 2.83
Suppose the round-trip airfare between Boston and Orlando follows the normal probability distribution with a mean of $387.20 and a standard deviation of $68.50. What is the probability that a randomly selected airfare between Boston and San Francisco will be less than $300?
0.4286 *0.1020* 0.2005 0.3192
In the estimation of a multiple regression model with two explanatory variables and 20 observations, SSE = 550 and SST = 1000. Which of the following is the correct value of R2?
0.45
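The coefficient of determination is R² = 1 - SSE/SST, the share of total variation that the regression explains. A quick Python sketch with the values given:

```python
sse, sst = 550, 1000

# Coefficient of determination: explained share of total variation
r_squared = 1 - sse / sst
print(round(r_squared, 2))
```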
The sample data below shows the number of hours spent by five students over the weekend to prepare for Monday's Business Statistics exam. 3 12 2 3 5. The 75th percentile of the data is the closest to _________.
8.5 hours
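The 8.5 figure comes from the (n + 1)p position rule: with n = 5 the 75th percentile sits at position 4.5, halfway between the 4th and 5th sorted values (5 and 12). The sketch below assumes that rule, which Python's `exclusive` quantile method also uses.

```python
import statistics

hours = [3, 12, 2, 3, 5]

# Third cut point of the quartiles is the 75th percentile;
# method="exclusive" uses position (n + 1) * p in the sorted data
p75 = statistics.quantiles(hours, n=4, method="exclusive")[2]
print(p75)
```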
Which of the following meets the requirements of a stratified random sample?
A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six people who volunteer for the sample. A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six people chosen at random, without regard to age. A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six males chosen at random, without regard to age. *A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include two people chosen at random under the age of 25 and four people chosen at random over 25.*
What type of relationship is indicated in the scatterplot?
A positive linear or curvilinear relationship. When looking at the plotted points, the variables have a positive relationship (y tends to increase as x increases), and the relationship looks linear or curvilinear.
If the p-value for a hypothesis test is 0.027 and the chosen level of significance is α = 0.05, then the correct conclusion is to ________________.
Reject the null hypothesis
Let X be normally distributed with mean μ and standard deviation σ > 0. Which of the following is true about the z value corresponding to a given x value?
A positive z = (x - μ)/σ indicates how many standard deviations x is above μ. A negative z = (x - μ)/σ indicates how many standard deviations x is below μ. The z value corresponding to x = μ is zero. *All of the above.*
A correlation coefficient r = −0.85 could indicate a _______________________________.
A strong negative linear relationship
Which of the following is true when using the empirical rule for a set of sample data?
Almost all observations are in the interval [formula missing] *Approximately 68% of all observations are in the interval x̄ ± s* Approximately 95% of all observations are in the interval [formula missing] Approximately 68% of all observations are in the interval [formula missing]
Chebyshev's theorem is applicable when the data are___________________.
Any shape
Which of the following identifies the range for a correlation coefficient?
Any value less than 1. Any value greater than 0. Any value between 0 and 1. None of these choices is correct. None of these are correct.
How do we find the median if the number of observations in a data set is odd?
By taking the middle value in the sorted data set
Which of the following statements is the least accurate concerning correlation analysis?
Correlation does not imply causation. The correlation coefficient captures only a linear relationship. The correlation coefficient may not be a reliable measure when outliers are present in one or both of the variables. *The correlation coefficient describes both the direction and strength of the relationship between two variables only if the two variables have the same units of measurement.*
Statistics are used to estimate population parameters, particularly when it is impossible or too expensive to poll an entire population. A particular value of a statistic is referred to as a(n) _______.
Estimate
What is the name of the variable that is used to predict another variable?
Explanatory
In order to summarize qualitative data, a useful tool is a __________.
FREQUENCY DISTRIBUTION
What is the decision rule when using the p-value approach to hypothesis testing?
Reject H0 if the p-value > α. *Reject H0 if the p-value < α.* Do not reject H0 if the p-value < 1 - α. Do not reject H0 if the p-value > 1 - α.
A ______ is a series of rectangles where the width and height of each rectangle represent the class width and frequency (or relative frequency) of the class, respectively.
HISTOGRAM
Which of the following variables is qualitative?
Height Gender Weight Temperature GENDER
A university is interested in promoting graduates of its honors program by establishing that the mean GPA of these graduates exceeds 3.50. A sample of 36 honors students is taken and is found to have a mean GPA equal to 3.60. The population standard deviation is assumed to equal 0.40. To establish whether the mean GPA exceeds 3.50, the appropriate hypotheses are ______________.
H0: μ ≤ 3.50, HA: μ > 3.50
Expedia would like to test if the average round-trip airfare between Philadelphia and Dublin is less than $1,200. The correct hypothesis statement would be __________________________.
H0: μ ≥ 1,200, HA: μ < 1,200
Which of the following are two-tailed tests?
H0: μ = 10, HA: μ ≠ 10
This formula is used when we construct a frequency distribution or a histogram for quantitative data.
(MAXIMUM VALUE - MINIMUM VALUE) / NUMBER OF CLASSES
The mode is defined as the _____________.
Most frequent value in the data set
In general, the null and alternative hypotheses are __________.
Mutually exclusive
An ________ is a graph that plots the cumulative frequency (or the cumulative relative frequency) of each class against the upper limit of the corresponding class.
OGIVE
A continuous random variable has the uniform distribution on the interval [a, b] if its probability density function f(x) _______.
Provides all probabilities for all x between a and b. Is bell-shaped between a and b. *Is constant for all x between a and b, and 0 otherwise.* Asymptotically approaches the x axis when x increases to +∞ or decreases to -∞.
The following scatterplot indicates that the relationship between the two variables x and y is ______________.
Strong and positive
When using the empirical rule, which of the following assumptions is made?
The data only comes from a sample. The data only comes from a population. The data are exactly symmetric and bell-shaped. *The data are approximately symmetric and bell-shaped.*
The Boston public school district has had difficulty maintaining on-time bus service for its students ("A Year Later, School Buses Still Late," Boston Globe, October 5, 2011). Suppose the district develops a new bus schedule to help combat chronic lateness on a particularly woeful route. Historically, the bus service on the route has been, on average, 12 minutes late. After the schedule adjustment, the first 36 runs were an average of eight minutes late. As a result, the Boston public school district claimed that the schedule adjustment was an improvement—students were not as late. Assume a population standard deviation for bus arrival time of 12 minutes. At the 5% significance level, does the evidence support the Boston public school district's claim?
Yes, because the p-value is less than α.
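The claim is that the mean lateness fell below 12 minutes, so this is a left-tailed z test (σ is given) with H0: μ ≥ 12 and HA: μ < 12. A minimal Python sketch of the test statistic and p-value, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sigma, n = 12, 8, 12, 36
alpha = 0.05

# z statistic for a test of the mean with known population sigma
z = (xbar - mu0) / (sigma / sqrt(n))

# Left-tailed p-value: P(Z <= z)
p_value = NormalDist().cdf(z)
reject_h0 = p_value < alpha
print(z, round(p_value, 4), reject_h0)
```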
Is it possible for a data set to have no mode?
Yes, if there are no observations that occur more than once
When testing whether the correlation coefficient differs from zero, the value of the test statistic is t20 = 1.95 with a corresponding p-value of 0.0653. At the 5% significance level, can you conclude that the correlation coefficient differs from zero?
Yes, since the p-value exceeds 0.05. Yes, since the test statistic value of 1.95 exceeds 0.05. *No, since the p-value exceeds 0.05.* No, since the test statistic value of 1.95 exceeds 0.05.
We draw a random sample of size 25 from the normal population with variance 2.4. If the sample mean is 12.5, what is a 99% confidence interval for the population mean?
[11.2600, 13.7400] [11.3835, 13.6165] *[11.7019, 13.2981]* [11.7793, 13.2207]
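Note that the question gives the variance (2.4), not the standard deviation, so σ = √2.4. With σ known, the interval is x̄ ± zα/2 · σ/√n. A Python sketch under those assumptions:

```python
from math import sqrt
from statistics import NormalDist

variance, n, xbar = 2.4, 25, 12.5
conf_level = 0.99

# Critical value z_{alpha/2} for the chosen confidence level
z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

# Margin of error uses sigma / sqrt(n) = sqrt(variance / n)
margin = z * sqrt(variance / n)
lower, upper = xbar - margin, xbar + margin
print(round(lower, 4), round(upper, 4))
```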
Population parameters are difficult to calculate due to
cost prohibitions on data collection. the infeasibility of collecting data on the entire population. the fact that samples are difficult to draw due to the nature of the data. both cost prohibitions on data collection and the infeasibility of collecting data on the entire population. BOTH COST PROHIBITIONS ON DATA COLLECTION AND THE INFEASIBILITY OF COLLECTING DATA ON THE ENTIRE POPULATION
Horizontal bar charts are constructed by placing _________________________________________________________.
each category on the vertical axis and the appropriate range of values on the horizontal axis each category on the horizontal axis and the appropriate range of values on the vertical axis each interval of values on the vertical axis and the appropriate range of values on the horizontal axis None of the above EACH CATEGORY ON THE VERTICAL AXIS AND THE APPROPRIATE RANGE OF VALUES ON THE HORIZONTAL AXIS
In inferential statistics, we calculate statistics of sample data to
estimate unknown population parameters. conduct tests about unknown population parameters. Both of these choices are correct. Neither of these choices is correct. BOTH OF THESE CHOICES ARE CORRECT
When a characteristic of interest differs among various observations, then it can be termed a
parameter. variable. data. information. VARIABLE
A sample statistic is an estimate of
population parameter. population statistic. sample parameter. descriptive statistic. POPULATION PARAMETER
Your business statistics class had a test last week. The average score for the class is an example of
secondary data qualitative data descriptive statistics inferential statistics DESCRIPTIVE STATISTICS
A newly hired basketball coach promised a high-paced attack that will put more points on the board than the team's previously tepid offense historically managed. After a few months, the team owner looks at the data to test the coach's claim. He takes a sample of 36 of the team's games under the new coach and finds that they scored an average of 101 points with a standard deviation of 6 points. Over the past 10 years, the team had averaged 99 points. What is the value of the appropriate test statistic to test the new coach's claim at the 1% significance level?
t35 = 2.00
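Because only the sample standard deviation is known, the test uses a t statistic with n - 1 degrees of freedom: t = (x̄ - μ0)/(s/√n). A short Python sketch of the calculation:

```python
from math import sqrt

xbar, mu0, s, n = 101, 99, 6, 36

# t statistic when the population standard deviation is unknown
t_stat = (xbar - mu0) / (s / sqrt(n))
df = n - 1
print(df, t_stat)
```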
When conducting a hypothesis test concerning the population mean, and the population standard deviation is unknown, the value of the test statistic is calculated as __________.
tdf = (x̄ - μ0) / (s / √n), with df = n - 1
Bias can occur in sampling. Bias refers to _______.
the division of the population into overlapping groups the creation of strata, which are proportional to the stratum's size the use of cluster sampling instead of stratified random sampling *the tendency of a sample statistic to systematically over- or underestimate a population parameter*
A fast-food franchise is considering building a restaurant at a busy intersection. A financial advisor determines that the site is acceptable only if, on average, more than 300 automobiles pass the location per hour. The advisor tests the following hypotheses: H0: μ ≤ 300, HA: μ > 300. The consequences of committing a Type I error would be that ____________________________.
the franchiser builds on an acceptable site *the franchiser builds on an unacceptable site* the franchiser does not build on an acceptable site the franchiser does not build on an unacceptable site
The study of statistics can be defined as
the language of data. the art and science of getting information from data. the study of collecting, analyzing, presenting, and interpreting data. All of these choices are correct. All of these are correct.
Consider the following sample regression equation ŷ = 200 + 10x, where y is the supply for Product A (in 1,000s) and x is the price of Product A (in $). The slope coefficient indicates that if ___________.
the price of Product A increases by $1, then on average, supply increases by 10,000
The ages of MBA students at a university are normally distributed with a known population variance of 10.24. Suppose you are asked to construct a 95% confidence interval for the population mean age if the mean of a sample of 36 students is 26.5 years. If a 99% confidence interval is constructed instead of a 95% confidence interval for the population mean, then _______________________________________________.
the resulting margin of error will increase and the risk of reporting an incorrect interval will increase the resulting margin of error will decrease and the risk of reporting an incorrect interval will increase *the resulting margin of error will increase and the risk of reporting an incorrect interval will decrease* the resulting margin of error will decrease and the risk of reporting an incorrect interval will decrease
The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 800 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 750. Assume that the population standard deviation is 350. The value of the test statistic is ____________.
z = -1.429
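With σ known, the statistic is z = (x̄ - μ0)/(σ/√n). The sample mean (750) is below the hypothesized mean (800), so the statistic is negative. A minimal Python sketch:

```python
from math import sqrt

xbar, mu0, sigma, n = 750, 800, 350, 100

# z statistic for a test of the mean with known population sigma
z = (xbar - mu0) / (sigma / sqrt(n))
print(round(z, 3))
```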
Find the z value such that P(-z ≤ Z ≤ z) = 0.95.
z = 1.96
Find the z value such that P(Z ≤ z) = 0.9082.
z = 1.33
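Both of the last two z values can be recovered with an inverse normal CDF instead of a printed table. For the symmetric case, P(-z ≤ Z ≤ z) = 0.95 means P(Z ≤ z) = 0.975. A Python sketch using the standard library (variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Symmetric interval: 0.95 in the middle leaves 0.025 in each tail
z_two_sided = std_normal.inv_cdf(0.5 + 0.95 / 2)

# Left-tail probability given directly
z_left = std_normal.inv_cdf(0.9082)
print(round(z_two_sided, 2), round(z_left, 2))
```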
