Unit 1 Test STA 2023 McGraw Hill
A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table shows a portion of the regression results.
            df   SS          MS        F
Regression   1     555,420   555,420   7.64
Residual    27   1,962,873    72,699
Total       28   2,518,293
            Coefficients   Standard Error   t-stat   p-value
Intercept   784.92         322.25           2.44     0.02
Service       9.19           3.20           2.87     0.01
Which of the following is the predicted monthly salary of an employee who has worked for 48 months at the bank? $441. $785. $1,050. $1,226. Use the regression equation ŷ = b0 + b1x with the given coefficients and x to compute the value of ŷ.
$1,226
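The arithmetic behind this answer can be sketched in a few lines of Python. The function name `predict_salary` is only illustrative; the coefficients are the ones reported in the regression output.

```python
# Coefficients from the regression output for Salary = b0 + b1 * Service.
B0 = 784.92  # intercept
B1 = 9.19    # slope: change in predicted monthly salary per extra month of service


def predict_salary(months_of_service: float) -> float:
    """Return the predicted monthly salary (in $) from the fitted line."""
    return B0 + B1 * months_of_service


predicted = predict_salary(48)
print(f"Predicted monthly salary: ${predicted:,.2f}")  # rounds to about $1,226
```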
Amounts spent by a sample of 200 customers at a retail store are summarized in the following frequency distribution.
Amount Spent (in $)   Frequency
0 up to 10            15
10 up to 20           75
20 up to 30           55
30 up to 40           55
The mean amount spent by customers is the closest to _____. $22.50 $20.00 $50.00 $17.50 The mean for a frequency distribution for grouped data is defined as x̄ = Σ mi fi / n, where mi is the midpoint and fi the frequency of class i.
$22.50 Find the midpoint of each interval, multiply it by the frequency given in the table, sum the products, and divide by the total number of observations n: (5 × 15 + 15 × 75 + 25 × 55 + 35 × 55)/200 = 4,500/200 = 22.50.
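As a quick check, the grouped-data mean can be computed directly from the class limits and frequencies. This is a minimal sketch; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the table.
classes = [(0, 10, 15), (10, 20, 75), (20, 30, 55), (30, 40, 55)]

n = sum(freq for _, _, freq in classes)  # total number of observations

# Multiply each class midpoint by its frequency and add the products.
weighted_sum = sum(((lower + upper) / 2) * freq for lower, upper, freq in classes)

grouped_mean = weighted_sum / n
print(n, grouped_mean)
```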
Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. With the beginning of the next academic year, all professors will get a 2% raise. What will be the average and the standard deviation of their new salaries?
$81,600 and $6,120.
When estimating ŷ = b0 + b1x1 + b2x2, the following regression results using ANOVA were obtained.
            df   SS      MS      F
Regression   2   210.9   105.5   114.7
Residual    17    15.6    0.92
Total       19   226.5
            Coefficients   Standard Error   t-stat   p-value
Intercept   −1.6           0.57             −2.77    0.0132
x1          −0.5           0.04             −15.11   2.77E-11
x2           0.1           0.07               1.89   0.0753
Which of the following is the prediction of ŷ if x1 = 1 and x2 = 2?
-1.9
Consider the following data: x-bar = 20, sx = 2, y-bar = −5, sy = 4, and b1 = 0.40. The sample correlation coefficient, rxy, is equal to ____.
0.20 Use the formula rxy = b1(sx/sy) = 0.40(2/4) = 0.20.
Costco sells paperback books in its retail stores and wanted to examine the relationship between price and demand. The price of a particular novel was adjusted each week and the weekly sales were recorded in the following table.
Sales   Price
7       $12
4       $11
5       $10
9       $9
8       $8
8       $7
7       $7
Management would like to use simple regression analysis to estimate weekly demand for this novel using the price of the novel. The coefficient of determination for this sample is equal to ________.
0.273 The coefficient of determination is the proportion of the variation in the response variable that is explained by the sample regression equation. It is computed as R^2 = 1 − SSE/SST, where the error sum of squares is SSE = Σ(yi − ŷi)^2 and the total sum of squares is SST = Σ(yi − ȳ)^2.
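The Costco numbers can be reproduced with a least-squares fit written out by hand. This is a sketch using only the data in the table; the variable names are illustrative.

```python
price = [12, 11, 10, 9, 8, 7, 7]  # explanatory variable x
sales = [7, 4, 5, 9, 8, 8, 7]     # response variable y

n = len(price)
x_bar = sum(price) / n
y_bar = sum(sales) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(price, sales))
s_xx = sum((x - x_bar) ** 2 for x in price)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# SSE: squared residuals; SST: squared deviations from the mean of y.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(price, sales))
sst = sum((y - y_bar) ** 2 for y in sales)
r_squared = 1 - sse / sst

print(f"b1 = {b1:.3f}, SSE = {sse:.2f}, R^2 = {r_squared:.3f}")
```

The same run also gives the error sum of squares asked for in the companion Costco card.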
Consider the following frequency distribution. Class Frequency 12 up to 15 3 15 up to 18 6 18 up to 21 3 21 up to 24 4 24 up to 27 4 What proportion of the observations are at least 15 but less than 18?
0.3 Six observations of the 20 total observations fall in the class of 15 up to 18: 6/20 = 0.30.
Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table: 1 4 4 5 5 3 4 3 4 1 5 5 4 4 2 3 3 2 3 3 4 5 5 5 5 3 2 3 3 2 What is the relative frequency of the students who gave Professor Smith an evaluation of 3?
0.3 Nine of the 30 students gave Professor Smith a 3. The relative frequency is thus 9/30 = 0.3.
Over the past 30 years, the sample standard deviations of the rates of return for Stock X and Stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. The correlation of the rates of return between X and Y is the closest to ____.
0.40 The correlation coefficient is computed as rxy = sxy/(sx sy) = 0.0096/(0.20 × 0.12) = 0.40.
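A short sketch of the same calculation, dividing the sample covariance by the product of the two sample standard deviations (variable names are illustrative):

```python
s_x = 0.20     # sample standard deviation of Stock X returns
s_y = 0.12     # sample standard deviation of Stock Y returns
s_xy = 0.0096  # sample covariance between X and Y

# Dividing by both standard deviations removes the units,
# so the result always lies between -1 and 1.
r_xy = s_xy / (s_x * s_y)
print(round(r_xy, 2))
```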
In the estimation of a multiple regression model with two explanatory variables and 20 observations, SSE = 550 and SST = 1000. Which of the following is the correct value of R2?
0.45 R^2 = 1 − SSE/SST = 1 − 550/1,000 = 0.45
The following data represent monthly returns (in percent): -7.24 1.64 3.48 -2.49 9.30 The geometric mean return is the closest to ________.
0.78% The geometric mean return is defined as GR = [(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) − 1. Wrong answers include the arithmetic mean, the median, using positive values of growth rates instead of negative ones, and not converting percentages to fractions.
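The geometric mean return is easy to get wrong by hand, so here is a minimal Python sketch. The helper name `geometric_mean_return` is an assumption made for illustration; it expects returns as decimals, not percentages.

```python
import math


def geometric_mean_return(returns):
    """Geometric mean of periodic returns given as decimals (0.05 means 5%)."""
    growth = math.prod(1 + r for r in returns)  # compound growth factor
    return growth ** (1 / len(returns)) - 1


# Monthly returns from this card, converted from percent to decimals.
monthly_pct = [-7.24, 1.64, 3.48, -2.49, 9.30]
monthly_g = geometric_mean_return([p / 100 for p in monthly_pct])

# The Adidas growth rates from a later card are already decimals.
adidas_g = geometric_mean_return([0.5196, 0.0213, 0.0485, -0.0387])

print(f"{monthly_g:.2%}", f"{adidas_g:.2%}")
```

Note the two usual traps the card mentions: the negative returns keep their sign, and percentages are divided by 100 before adding 1.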
The following table shows the number of cars sold last month by six dealers at Centreville Nissan dealership and their number of years of sales experience.
Years of Experience   Sales
1                     7
2                     9
2                     9
4                     8
5                     14
8                     14
Management would like to use simple regression analysis to estimate monthly car sales using the number of years of sales experience. The standard error of the estimate is equal to _______.
1.84 The standard error of the estimate, se, is a point estimate of the standard deviation of the random error, and it is computed as se = sqrt(SSE/(n − k − 1)) = sqrt(MSE), where k is the number of explanatory variables.
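The standard error of the estimate for the car-sales data can be verified with a hand-rolled simple regression. This is a sketch with illustrative variable names, assuming one explanatory variable (k = 1).

```python
import math

years = [1, 2, 2, 4, 5, 8]     # explanatory variable x
sales = [7, 9, 9, 8, 14, 14]   # response variable y
n, k = len(years), 1           # k = number of explanatory variables

x_bar = sum(years) / n
y_bar = sum(sales) / n

# Least-squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))
      / sum((x - x_bar) ** 2 for x in years))
b0 = y_bar - b1 * x_bar

# Error sum of squares, then se = sqrt(SSE / (n - k - 1)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(years, sales))
se = math.sqrt(sse / (n - k - 1))
print(f"SSE = {sse:.2f}, se = {se:.2f}")
```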
Sales for Adidas grew at a rate of 0.5196 in 2006, 0.0213 in 2007, 0.0485 in 2008, and -0.0387 in 2009. The average growth rate for Adidas during these four years is the closest to _______.
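11.83% Use the geometric mean growth rate: Gg = [(1 + g1)(1 + g2)...(1 + gn)]^(1/n) − 1 = (1.5196 × 1.0213 × 1.0485 × 0.9613)^(1/4) − 1 ≈ 0.1183.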
Thirty students at Eastside High School took the SAT on the same Saturday. Their raw scores are given next. 1,450 1,620 1,800 1,740 1,650 1,710 1,900 1,910 1,950 1,820 1,800 2,010 1,780 1,840 1,490 1,590 2,350 2,260 1,870 1,530 1,620 1,480 2,390 1,640 1,830 1,950 2,000 1,830 1,980 2,100 Consider a frequency distribution of the data that groups the data in classes of 1400 up to 1600, 1600 up to 1800, 1800 up to 2000, and so on. How many students scored at least 1800 but less than 2000?
12
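Counting by hand over 30 scores is error-prone; a small sketch of the binning follows. Each score is mapped to the lower limit of its class, so a score equal to a boundary (such as 1,800 or 2,000) lands in the class that starts there. Variable names are illustrative.

```python
from collections import Counter

scores = [1450, 1620, 1800, 1740, 1650, 1710, 1900, 1910, 1950, 1820,
          1800, 2010, 1780, 1840, 1490, 1590, 2350, 2260, 1870, 1530,
          1620, 1480, 2390, 1640, 1830, 1950, 2000, 1830, 1980, 2100]

start, width = 1400, 200

# Key each score by the lower limit of its "lower up to upper" class.
counts = Counter(start + ((s - start) // width) * width for s in scores)

for lower in sorted(counts):
    print(f"{lower} up to {lower + width}: {counts[lower]}")
```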
Costco sells paperback books in its retail stores and wanted to examine the relationship between price and demand. The price of a particular novel was adjusted each week and the weekly sales were recorded in the following table.
Sales   Price
7       $12
4       $11
5       $10
9       $9
8       $8
8       $7
7       $7
Management would like to use simple regression analysis to estimate weekly demand for this novel using the price of the novel. The error sum of squares for this sample is equal to ________.
13.70 The error sum of squares (SSE) is computed as SSE = Σ(yi − ŷi)^2.
Consider a population with data values of 12 8 28 22 12 30 14. The median is ____.
14
Consider a population with data values of 12 8 28 22 12 30 14. The population mean is ____.
18
A real estate analyst believes that the three main factors that influence an apartment's rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment's square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results.
            df   SS          MS          F
Regression   3   5,694,717   1,898,239   50.88
Residual    36   1,343,176      37,310
Total       39   7,037,893
            Coefficients   Standard Error   t-stat   p-value
Intercept   300            84.0             3.57     0.001
Bedroom     226            60.3             3.75     0.0006
Bath         89            55.9             1.59     0.1195
Sqft          0.2           0.09            2.22     0.0276
The standard deviation of the difference between actual rent and the estimate of rent is ____.
193 The standard error of the estimate, se, is a point estimate of the standard deviation of the random error, and it is computed as se = sqrt(MSE) = sqrt(37,310) ≈ 193.
The price to earnings ratio, also called the P/E ratio of a stock, is a measure of the price of a share relative to the annual net income per share earned by the firm. Suppose the P/Es for a firm's common stock during the past four quarters are 10, 12, 15, and 11, respectively. The standard deviation of the P/E ratio over the four quarters is ______.
2.16
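The answer of 2.16 corresponds to the sample standard deviation (dividing by n − 1). A minimal sketch with Python's standard `statistics` module shows both the sample and the population versions for comparison:

```python
import statistics

pe_ratios = [10, 12, 15, 11]

sample_sd = statistics.stdev(pe_ratios)       # divides by n - 1
population_sd = statistics.pstdev(pe_ratios)  # divides by N

print(round(sample_sd, 2), round(population_sd, 2))
```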
Amounts spent by a sample of 200 customers at a retail store are summarized in the following frequency distribution.
Amount Spent (in $)   Frequency
0 up to 10            15
10 up to 20           75
20 up to 30           55
30 up to 40           55
The median amount will fall in the following class interval ____________.
20 up to 30 Given 200 observations, the median will be between the 100th and the 101st observations in the sorted data. The cumulative frequencies are 15, 90, and 145, so both observations fall in the class 20 up to 30.
A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table shows a portion of the regression results.
            df   SS          MS        F
Regression   1     555,420   555,420   7.64
Residual    27   1,962,873    72,699
Total       28   2,518,293
            Coefficients   Standard Error   t-stat   p-value
Intercept   784.92         322.25           2.44     0.02
Service       9.19           3.20           2.87     0.01
The coefficient of determination indicates that __________. 22.06% of the variation in Service is explained by the variation in Salary 81.61% of the variation in Service is explained by the variation in Salary 81.61% of the variation in Salary is explained by the variation in Service 22.06% of the variation in Salary is explained by the variation in Service The coefficient of determination is the proportion of the variation in the response variable that is explained by the sample regression equation, and it is computed as R^2 = SSR/SST = 555,420/2,518,293 ≈ 0.2206.
22.06% of the variation in Salary is explained by the variation in Service
The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city. 187 125 165 170 230 139 195 229 239 135 188 210 228 172 127 139 122 181 196 237 115 199 170 239 Suppose the data on house prices will be grouped into five classes. The width of the classes for a frequency distribution or histogram is the closest to _______.
25 Width of class = (max value − min value)/(# of classes) Width = (239 − 115)/5 = 24.8; so round up to 25.
Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table: 1 4 4 5 5 3 4 3 4 1 5 5 4 4 2 3 3 2 3 3 4 5 5 5 5 3 2 3 3 2 What is the most common score given in the evaluations?
3
In the accompanying stem-and-leaf diagram, the values in the stem and leaf portions represent 10s and 1s digits, respectively. [Stem-and-leaf diagram not shown.] Which of the following numbers appears in the stem-and-leaf diagram?
38
The following is a list of five of the world's busiest airports by passenger traffic for 2010.
Name                    Location                                  # of Passengers (in millions)
Hartsfield-Jackson      Atlanta, Georgia, United States           89
Capital International   Beijing, China                            74
London Heathrow         London, United Kingdom                    67
O'Hare                  Chicago, Illinois, United States          66
Tokyo                   Tokyo, Japan                              64
The percentage of passenger traffic in the five busiest airports that occurred in Asia is the closest to ___________.
38% The two Asian airports (Beijing and Tokyo) handled 74 + 64 = 138 million of the 360 million passengers: 138/360 ≈ 38%.
The following data represent scores on a pop quiz in a statistics section: 45 66 74 72 62 44 55 70 33 82 56 56 84 16 16 47 32 32 17 37 Suppose the data are grouped into five classes, and one of them will be "30 up to 44," that is, {x; 30 ≤ x < 44}. The frequency of this class is _____.
4 There are four data values that are at least 30 but less than 44. They are 32, 32, 33, and 37.
The accompanying relative frequency distribution represents the last year car sales for the sales force at Kelly's Mega Used Car Center. Car Sales Relative Frequency 35 up to 45 0.07 45 up to 55 0.15 55 up to 65 0.31 65 up to 75 0.22 75 up to 85 0.25 If Kelly's employs 100 salespeople, how many of these salespeople have sold at least 45 but fewer than 65 cars in the last year?
46 (0.15 + 0.31) × 100 = 46 salespeople.
The sample data below shows the number of hours spent by five students over the weekend to prepare for Monday's Business Statistics exam. 3 12 2 3 5. The interquartile range of the data is the closest to ________.
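6 hours The interquartile range is computed as IQR = Q3 − Q1. For the sorted data 2 3 3 5 12, the quartile locations are L25 = (n + 1)(25/100) = 1.5 and L75 = (n + 1)(75/100) = 4.5, so Q1 = 2.5, Q3 = 8.5, and IQR = 8.5 − 2.5 = 6.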
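Quartile conventions differ between tools, so the result depends on the method. The sketch below uses `statistics.quantiles` from the Python standard library, whose default `'exclusive'` method places the p-th percentile at position (n + 1)p, which appears to be the convention this card relies on (an assumption based on the L25/L75 notation).

```python
import statistics

hours = [3, 12, 2, 3, 5]

# Default method='exclusive' interpolates at position (n + 1) * p.
q1, q2, q3 = statistics.quantiles(hours, n=4)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other conventions (for example `method='inclusive'`) give different quartiles on a sample this small, so the choice matters when matching a textbook answer.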
Automobiles traveling on a road with a posted speed limit of 65 miles per hour are checked for speed by a state police radar system. The following is a frequency distribution of speeds. The standard deviation of this distribution is closest to
6.81
The following histogram represents the number of pages in each book within a collection. What is the frequency of books containing at least 250 but fewer than 300 pages?
7 Use frequencies shown on the histogram for different number of pages in the book.
Recent home sales in a suburb of Washington, D.C., are shown in the accompanying ogive. [Ogive not shown.] Approximate the percentage of houses that sold for less than $600,000.
80% Draw a vertical line from the tick mark for 600 on the x axis; this crosses the ogive at approximately 0.8.
The following stem-and-leaf diagram shows the speeds in miles per hour (mph) of 14 cars approaching a toll booth on a bridge in Oakland, California. [Stem-and-leaf diagram not shown.] How many of the cars were traveling faster than 25 mph but slower than 40 mph?
9
You buy 50 stocks of Company A, 30 of Company B, and 20 of Company C. The annual returns of these companies are 8%, 12%, and 10% respectively. The average return for one year is the closest to
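9.6% The mean for a frequency distribution for grouped data is defined as x̄ = Σ mi fi / n. Here the returns play the role of mi and the numbers of shares the role of fi: (8 × 50 + 12 × 30 + 10 × 20)/100 = 9.6.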
Use the following data to construct a scatterplot. What type of relationship is implied? x 3 6 10 14 18 23 y 34 28 20 12 5 0
A negative relationship
Which of the following relationships can be concluded from examining the correlation coefficient?
All of the above: a positive relationship, a negative relationship, or no relationship.
How does an ogive differ from a polygon?
An ogive is a graphical depiction of a cumulative frequency or cumulative relative frequency distribution, while a polygon is a graphical depiction of a frequency or relative frequency distribution.
A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following ANOVA table shows a portion of the regression results.
              df   SS       MS      F
Regression     1    78.53   78.53   3.58
Residual      23   504.02   21.91
Total         24   582.55
              Coefficients   Standard Error   t-stat   p-value
Intercept     40.1           14.08            2.848    0.0052
Advertising    2.88           1.52            1.895    0.0608
Which of the following is the prediction of Sales for a firm with Advertising of $500? $1,480 $148,000 $40,100 $54,500 Use the regression equation ŷ = b0 + b1x with the given coefficients and x to compute the value of ŷ.
$54,500 Advertising is measured in $100s, so $500 corresponds to x = 5: ŷ = 40.1 + 2.88(5) = 54.5. Sales is measured in $1,000s, which gives $54,500.
In a marketing class of 60 students, the mean and the standard deviation of scores was 70 and 5, respectively. Use Chebyshev's theorem to determine the number of students who scored less than 60 or more than 80.
At most 15 According to Chebyshev's theorem, for any data set, the proportion of observations that lie within k standard deviations from the mean is at least 1 − 1/k^2. If k = 2, at least 75% of the observations fall in the interval defined by x̄ ± 2s = 70 ± 2(5), that is, between 60 and 80. At most 25% of the 60 students, or 15 students, can therefore fall outside it.
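The bound can be sketched in Python as follows. The helper name `chebyshev_min_within` is hypothetical, introduced only for this illustration.

```python
def chebyshev_min_within(k: float) -> float:
    """Lower bound on the proportion of observations within k standard deviations."""
    return 1 - 1 / k ** 2


n_students, mean, sd = 60, 70, 5

# 60 and 80 are both (80 - 70) / 5 = 2 standard deviations from the mean.
k = (80 - mean) / sd

min_within = chebyshev_min_within(k)          # at least this share lies in [60, 80]
max_outside = (1 - min_within) * n_students   # so at most this many lie outside
print(k, min_within, max_outside)
```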
The accompanying chart shows the numbers of books written by each author in a collection of cookbooks. What type of chart is this? [Chart not shown.]
Bar chart for qualitative data
What is an advantage of the correlation coefficient over the covariance?
Both answers are correct: it falls between −1 and 1, and it is a unit-free measure.
What is(are) the characteristic(s) of the coefficient of variation?
Both: it adjusts for differences in the magnitude of means, and it allows for direct comparisons across different data sets.
Which of the following Excel functions returns the correlation coefficient?
CORREL
Sampling is used heavily in manufacturing and service settings to ensure high-quality products. In which of the following areas would sampling be inappropriate?
Custom cabinet making Custom cabinets are not meant to be standardized in their characteristics. Therefore, sampling would make no sense.
Typically, it is possible to examine every member of the population.
False
A qualitative variable assumes meaningful numerical values.
False A quantitative variable assumes meaningful numerical values, while values of a qualitative variable are typically described in words.
The geometric mean is greater than the arithmetic mean.
False The geometric mean is always less than or equal to the arithmetic mean, and it is less sensitive to outliers.
When summarizing quantitative data it is always better to have up to 30 classes in a frequency distribution.
False It depends on the size of the data set. The recommended number of classes usually ranges from 5 to 20.
When constructing a scatterplot for two quantitative variables, we usually refer to one variable as x and another one as y. Typically, we graph x on the vertical axis and y on the horizontal axis.
False When constructing a scatterplot for two quantitative variables, we usually refer to one variable as x and another one as y. Typically, we graph x on the horizontal axis and y on the vertical axis.
Cross-sectional data contain values of a characteristic of one subject collected over time.
False Cross-sectional data contain values of a characteristic of many subjects collected at the same or similar point of time.
A professor's gender (male, female) as well as rank (assistant, associate, full) represent ordinal data.
False Professor's gender is nominal and rank is ordinal. The categories for nominal data do not have any natural ordering, while such an ordering exists for ordinal data.
The variance and standard deviation are the most widely used measures of central location.
False The variance and standard deviation are the most widely used measures of dispersion
Over the past 30 years, the sample standard deviations of the rates of return for stock X and Stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. To determine whether the correlation coefficient is significantly different from zero, the appropriate hypotheses are: ____________.
H0: ρxy = 0, HA: ρxy ≠ 0
Which of the following best describes a frequency distribution for qualitative data?
It groups data into categories and records the number of observations in each category.
What is(are) characteristic(s) of the geometric mean?
It is always less than or equal to the arithmetic mean.
Which of the following statements about the mean absolute deviation (MAD) is the most accurate?
It is denominated in the same units as the original data. The sample mean absolute deviation is MAD = Σ|xi − x̄|/n; the population mean absolute deviation is MAD = Σ|xi − μ|/N.
Which scale of data measurement is appropriate for the names of companies listed on the Dow Jones Industrial Average?
Nominal Scale There is not any natural ordering of the names of these 30 companies.
A recent survey of 200 small firms (annual revenue less than $10 million) asked whether an increase in the minimum wage would cause the firm to decrease capital spending. Possible responses to the survey question were: "Yes," "No," or "Don't Know." This data is best classified as
Nominal Scale With a nominal scale we can only categorize or group the data.
For which of the following data sets will a pie chart be most useful?
Percentage of net sales by product for Lenovo in 2011 Only percentage of net sales by product for Lenovo in 2011 looks at multiple categories of a single qualitative variable, in which the percentage of net sales by product may be meaningfully displayed.
Which of the following is an example of cross-sectional data?
Results of market research testing consumer preferences for soda Cross-sectional data refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.
The teachers' union in California wants to know the average salary for high school teachers throughout the country. What is the teachers' union presumably planning to calculate?
Sample Statistic The teachers the California union can survey are only a sample of all high school teachers in the country, so a sample statistic will be calculated.
A stem-and-leaf diagram is constructed by separating each value of a data set into two parts. What are these parts?
Stem consisting of the leftmost digits and leaf consisting of the last digit
Which of the following statements is the least accurate concerning correlation analysis?
The correlation coefficient describes both the direction and strength of the relationship between two variables only if the two variables have the same units of measurement. The sample correlation coefficient is a unit-free measure.
A company wants to estimate the mean price of oil over the past 10 years. What kind of data does the company need?
Time series data Time series data refers to data collected by recording a characteristic of a subject over several time periods.
A knowledge of statistics provides the necessary tools to differentiate between sound and questionable conclusions.
True
In a data set, an outlier is a large or small value regarded as an extreme value in the data set.
True
The branch of statistical studies called descriptive statistics summarizes important aspects of a data set.
True
The empirical rule is only applicable for approximately bell-shaped data.
True
A polygon connects a series of neighboring points where each point represents the midpoint of a particular class and its associated frequency or relative frequency.
True A polygon is a graphical depiction of a frequency or relative frequency distribution. It connects a series of neighboring points where each point represents the midpoint of a particular class and its associated frequency or relative frequency.
The formula for a z-score is z = (x − x̄)/s.
True z = (x − x̄)/s
The variance is the arithmetic mean of the squared deviations from the mean.
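True The population variance is σ^2 = Σ(xi − μ)^2/N, the average of the squared deviations from the mean; the sample variance divides by n − 1 instead.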
The estimation of which of the following requires sampling?
U.S. unemployment rate
Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. The salary distribution is approximately bell-shaped. What can be said about the percentage of salaries that are at least $74,000?
about 84% Since the data are bell-shaped, we apply the Empirical Rule. According to the Empirical Rule, approximately 68% of the salaries will lie within one standard deviation of the mean ($74,000 to $86,000). Therefore, half of these salaries (34%) will lie between $74,000 and $80,000. Since 50% of salaries will lie above $80,000, this means 34% + 50% = 84% will be at least $74,000.
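The Empirical Rule is a rounded statement about the normal curve. Assuming the salaries were exactly normally distributed (an assumption made only for this check; the card says "approximately bell-shaped"), the standard library's `statistics.NormalDist` gives a closely matching figure:

```python
from statistics import NormalDist

salaries = NormalDist(mu=80_000, sigma=6_000)

# P(X >= 74,000): one standard deviation below the mean and everything above it.
share_at_least_74k = 1 - salaries.cdf(74_000)
print(f"{share_at_least_74k:.1%}")  # close to the Empirical Rule's 84%
```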
The two branches of the study of statistics are generally referred to as
descriptive and inferential statistics.
A car dealership created a scatterplot showing the manufacturer's retail price and profit margin for the cars they have on their lot. [Scatterplot not shown.] As the manufacturer's suggested retail price increases, the profit margin tends to ___________.
increase The graph shows that the higher the MSRP, the higher the profit margin.
Consider the following sample regression equation ŷ = 200 + 10x, where y is the supply for Product A (in 1,000s) and x is the price of Product A (in $). If the price of Product A increases by $3, then we expect the supply for Product A to _______________.
increase by 30,000. In the simple linear regression model, the coefficient b1 measures the change in the predicted value of the response variable ŷ given a one-unit increase in the associated explanatory variable. Here the change is 10 × 3 = 30 in 1,000s, that is, 30,000.
The interval scale of data measurement is
less sophisticated than ratio scale The ratio scale represents the strongest level of measurement.
A college professor collected data on the number of hours spent by his 100 students over the weekend to prepare for Monday's Business Statistics exam. He processed the data by Excel and the following incomplete output is available. Mean 7 Sample Variance 7.84 Skewness 1.17 The median is most likely to be ________________.
less than 7 hours Data are positively skewed if the skewness is positive. For positively skewed distributions the median is likely to be less than the mean.
The R2 of a multiple regression of y on x1 and x2 measures the __________.
percent variability of y that is explained by the variability of x1 and x2 R2 quantifies the sample variability in the response variable y that is explained by changes in the explanatory variable(s), that is, by the sample regression equation. It is computed as the ratio of the explained variation of the response variable to its total variation and is expressed as a percentage.
In the accompanying stem-and-leaf diagram, the values in the stem-and-leaf portions represent 10s and 1s digits, respectively. [Stem-and-leaf diagram not shown.] The stem-and-leaf diagram shows that the distribution is ___________.
positively skewed
What type of relationship is indicated in the scatterplot? [Scatterplot not shown.]
positive curvilinear relationship
In its standard form, Chebyshev's theorem provides a lower bound on __________________________________
the proportion (or percentage) of observations lying within a certain interval
Over the past 30 years, the sample standard deviations of the rates of return for stock X and Stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. When testing whether the correlation coefficient differs from zero, the value of the test statistic is t28 = 2.31. At the 5% significance level, the critical value is t0.025,28 = 2.048. The conclusion to the hypothesis test is to ___________.
reject H0; we can conclude that the correlation coefficient differs from zero. Using the critical value approach for a two-tailed test, the decision rule is to reject the null hypothesis if the absolute value of the test statistic exceeds the critical value, and not to reject it otherwise. Here 2.31 > 2.048.
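The test statistic quoted in the question can be reproduced from the correlation coefficient with t = r sqrt(n − 2)/sqrt(1 − r^2) on n − 2 degrees of freedom. The sketch below takes the critical value 2.048 from the card rather than computing it.

```python
import math

s_x, s_y, s_xy = 0.20, 0.12, 0.0096
n = 30

r = s_xy / (s_x * s_y)  # sample correlation coefficient

# Test statistic for H0: rho = 0, with n - 2 = 28 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

critical_value = 2.048  # t(0.025, 28), as given in the question
reject_h0 = abs(t_stat) > critical_value
print(round(t_stat, 2), reject_h0)
```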
Consider the following simple linear regression model: y = β0 + β1x + ε. β0 and β1 are __________________.
unknown parameters
A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following values: formula156.mml= 440, formula157.mml, formula158.mml= 1,120, n = 11. Which of the following is the sample regression equation?
ŷ = 60.80 − 1.28x The slope b1 and the intercept b0 of the simple regression equation are computed as b1 = Σ(xi − x̄)(yi − ȳ)/Σ(xi − x̄)^2 and b0 = ȳ − b1x̄. The simple linear regression equation is ŷ = b0 + b1x.
The _______ identifies the number of standard deviations a particular value is from the mean of its distribution.
z-score The z-score measures the distance of a given sample value from the mean in standard deviations.
Consider the sample regression equation ŷ = 10 − 5x, with an R2 value of 0.65. Which of the following is the value of the sample correlation between y and ŷ?
−0.81 The magnitude is the square root of R2, sqrt(0.65) ≈ 0.81, and the answer key takes the sign from the regression coefficient b1 = −5. Strictly, that signed value is the sample correlation between y and x; the correlation between y and ŷ itself is +sqrt(R2) ≈ 0.81.