EXAM 1
Compare the results from Questions 10 and 13. Did Adjusted R Square increase, decrease or remain the same in Question 13?
Increased
The scatter chart below displays the residuals verses the dependent variable, x. Which of the following conclusions can be drawn from the scatter chart given below?
The residuals have a increasing variance as the dependent variable increases.
_____________ refers to the data set used to compare model forecasts and ultimately pick a model for predicting values of the dependent variable.
Validation set
A variable used to model the effect of categorical independent variables in a regression model which generally takes only the value zero or one is called
a dummy variable.
Use the attached data to create a scatter diagram to show the relationship between Market Capitalization and Profit. Add a trendline. The trendline generally indicates that there is
a positive relationship.
A graphical presentation that uses vertical bars to display the magnitude of quantitative data is known as a
column chart.
Assessing the regression model on data other than the sample data that was used to generate the model is known as
cross-validation.
Corporate-level managers use ______ to summarize sales by region, current inventory levels, and other company-wide metrics all in a single screen.
data dashboards
Data dashboards are a type of _________ analytics.
descriptive
The dummy variable for Smoker is significant at .005 level of significance.
false
Data-ink is the ink used in a table or chart that
is necessary to convey the meaning of the data to the audience.
A disadvantage of stacked-column charts and stacked-bar charts is that
it can be difficult to perceive small differences in areas.
Regression analysis involving one dependent variable and more than one independent variable is known as
multiple regression.
Fitting a model too closely to sample data, resulting in a model that does not accurately reflect the population is termed as
overfitting.
Making visual comparisons between different values of a categorical variable is difficult in a
pie chart.
A forecast that helps direct police officers to areas where crimes are likely to occur based on past data is an example of
predictive analytics.
Considering the results in Question 2, what is the estimated coefficient for Line Speed? Round your answer to three decimal places.
-0.148
What would be the coefficient of determination if the total sum of squares (SST) is 23.29 and the sum of squares due to regression (SSR) is 10.03?
0.43
Johnson Filtration, Inc. provides maintenance service for water filtration systems throughout Southern Florida. Customers contact Johnson with requests for maintenance service on their water filtration systems. To estimate the service time and the service cost. Johnson's managers want to predict the repair time necessary for each maintenance request. Hence, repair time in hours is the dependent variable. Repair time is believed to be related to three factors: the number of months since the last maintenance service, the type of repair problem (mechanical or electrical), and the repairperson who performs the repair (Donna Newton or Sob Jones). Data for a sample of 10 service calls are reported below: Repair Develop the simple linear regression equation to predict repair time given the number of months since the last maintenance service. What is the coefficient of determination? Enter your answer as a decimal with four decimal places?
0.5342
Using the attached LineSpeed data, develop an estimated regression equation to predict the number of defective parts found given the line speed. How much of the variation in defective parts is explained by your model? Enter your answer as a decimal rounded to three decimal places.
0.739
The attached data are the results of a survey on upscale accommodations. The data show the percentages of respondents who rated the locations as excellent or very good on Comfort, Amenities, In-House Dining and Overall. Develop an estimated regression equation to predict the Overall rating using Comfort, Amenities and In-House Dining. How much of the variation in the Overall rating is explained by your model? Enter your answer as a decimal rounded to three decimal places.
0.75
Using the attached data, develop an estimated regression equation to predict the Total Points Earned based on the Hours Spent Studying. How much of the variation in Total Points Earned is explained by your model? Enter your answer as a decimal rounded to three decimal places.
0.828
Consider the previous question. Create a new dummy variable that is equal to zero if the type of repair is mechanical and one if the type of repair is electrical. Develop the multiple regression equation to predict repair time, given the number of months since the last maintenance service and the type of repair. What is the coefficient of determination for this model? Enter your answer as a decimal rounded to four decimal places.
0.8592
A recent 10-year study conducted by a research team at the Great Falls Medical School was conducted to assess how age, systolic blood pressure, and smoking relate to the risk of strokes. Assume that the following data are from a portion of this study. Risk is interpreted as the probability (times 100) that the patient will have a stroke over the next 10-year period. For the smoking variable, define a dummy variable with 1 indicating a smoker and 0 indicating a nonsmoker. Stroke Develop an estimated multiple regression equation that relates risk of a stroke to the person's age. systolic blood pressure, and whether the person is a smoker. What percentage of the variation in Risk is not explained by this model? Enter your answer as a percent with two decimal places?
12.65
The attached file shows the number of U.S. locations for the top 20 U.S. franchises. Create a PivotTable and group the number of locations starting at zero, ending at 39,999 by 10,000. How many franchises are in the 0 to 9999 category?
13
What would be the value of the sum of squares due to regression (SSR) if the total sum of squares (SST) is 25.32 and the sum of squares due to error (SSE) is 6.89?
18.43
Considering the results from Question 6, what is the test statistic associated with the coefficient for Hours Spent Studying? Round your answer to three decimal places.
27.2
Using a PivotTable and the MajorSalary data, determine the average starting salary of management majors. Round your answer to two decimal places and do not include a dollar sign.
3,180
Consider the previous question. Finger Lakes Investments has developed a new electronic trading system and would like to predict overall customer satisfaction assuming they can provide satisfactory service levels (3) for both trade price and speed of execution. Use the estimated regression equation to predict overall satisfaction level for Finger Lakes Investments if they can achieve these performance levels. Enter your answer with one decimal place.
3.1
What is the probability of a stroke over the next 10 years for Art Speen, a 68-year-old smoker who has a systolic blood pressure of 175? Enter your answer as percent with one decimal place.
34.3
Dixie Showtime Movie Theaters, Inc. owns and operates a chain of cinemas in several markets in the southern United States. The owners would like to estimate weekly gross revenue as a function of advertising expenditures. Data for a sample of eight markets for a recent week follow: DixieShowtime Develop an estimated regression equation with the amount of television advertising as the independent variable. How much of the variation in the sample values of weekly gross revenue does the model explain? Enter your answer as a percent with two decimal places.
55.52
The American Association of Individual (AA1I) On-Line Discount Broker Survey polls members on their experiences with electronic trades handled by discount brokers. As part of the survey, members were asked to rate their satisfaction with the trade price and the speed of execution, as well as provide an overall satisfaction rating. Possible responses (scores) were no opinion 1.0), unsatisfied (1), somewhat satisfied (2), satisfied (3), and very satisfied (4). For each broker, summary scores were computed by computing a weighted average of the scores provided by each respondent. A portion of the survey results follow: Broker Develop an estimated regression equation using trade price and speed of execution to predict overall satisfaction with tire broker. What percentage of the variation in Overall Satisfaction is explained by this model? Enter your answer as a percentage with two decimal places.
68.27
A study investigated the relationship between audit delay (the length of time from a company's fiscal year-end to the date of the auditor's report) and variables that describe the client and the auditor. Some of the independent variables that were included in this study follow: Industry A dummy variable coded 1 if the firm was an industrial company or 0 if the firm was a bank, savings and loan, or insurance company. Public A dummy variable coded 1 if the company was traded on an organized exchange or over the counter; otherwise coded 0. Quality A measure of overall quality of internal controls, as judged by the auditor, on a 5-point scale ranging from "virtually none" (1) to "excellent" (5). Finished A measure ranging from 1 to 4, as judged by the auditor, where 1 indicates "all work performed subsequent to year-end" and 4 indicates "most work performed prior to year-end." A sample of 40 companies provided the following data: Audit Develop the estimated regression equation using all of the independent variables included in the data. Enter the value of the intercept rounded to three decimal places.
80.429
Considering the results from Question 6, use the model to predict the number of points a student would earn if the student studied 95 hours. Round your answer to one decimal place.
84.8
Jensen Tire & Auto is deciding whether to purchase a maintenance contract for its new computer wheel alignment and balancing machine. Managers feel that maintenance expense should be related to usage, and they collected the following information on weekly usage (hours) and annual maintenance expense (in hundreds of dollars). Jensen Use the data to develop an estimated regression equation that could be used to predict the annual maintenance expense for a given number of hours of weekly usage. How much of the variation in sample values of annual maintenance expense does the model you estimated in explain? Enter your answer as a percent with two decimal places.
85.62
Consider the previous question. Develop an estimated regression equation with both television advertising and newspaper advertising as the independent variables. How much of the variation in the sample values of weekly gross revenue does the model explain? Enter your answer as a percent with two decimal places.
93.22
In the file, MajorSalary, data have been collected from 111 College of Business graduates on their monthly starting salaries. Create a PivotTable. Which major has the greatest number of graduates?
Accounting
Use the attached data and Excel to create sparklines for revenue for each company and a heat map for revenue of the six companies. Which company exhibited the most consistent growth over the six months? For discussion, which tool helped you the most in answering this question and why?
Allen and Davis, LLC
The company identified in Chapter 7, Analytics in Action, is
Alliance Data Systems
________________ is used to test the hypothesis that the values of the regression parameters B1, B2, ... Bq are all zero.
An F test
Using a PivotTable and the MajorSalary data, determine the highest starting salary for Info Systems majors. Round your answer to two decimal places and do not include a dollar sign.
Between 5,030 and 5,030
Considering the results from Question 10, which if any of the independent variables are not significant?
Comfort
Compare the results from Questions 10 and 13. Did the Standard Error increase, decrease or remain the same in Question 13?
Decreased
Recreate the analysis of Question 10 dropping the independent variable Comfort. Compare the results. Did R Square increase, decrease or remain the same?
Decreased
The software package most commonly used for creating simple charts is
Excel
Considering the results from Question 2, the coefficient for Line Speed is significant at a 1% level.
False
Rerun the the analysis done in Question 3 without the independent variable identified in Question 4. Which independent variable is significant at the 1% level of significance?
Industry
Use the attached data to create a bubble chart. Expected rate of return is the horizontal axis, risk estimate is the vertical axis and size of the bubble is the capital investment. Identify whether each investment is on the efficient frontier. Any investment that has a smaller rate of return for the equivalent or higher risk than another project cannot be on the efficient frontier.
Investment 1 No Investment 2 Yes Investment 3 Yes Investment 4 No Investment 5 Yes Investment 6 Yes
DJ needs to display data over time. Which of the following charts should he use?
Line chart
_____________ refers to the scenario in which the relationship between the dependent variable and one independent variable is different at different values of a second independent variable.
Multicollinearity
Use the attached data to create a stacked-column chart and a clustered column chart. Age category is on the horizontal axis for both charts. What can you say about the relationship of age and cell phone ownership, and which graph displays this better?
Older respondents are more likely to not own a cell phone. Stacked-column chart.
Which one of the following statements is not true concerning PivotTables in Excel?
PivotTables can only be used if one variable is categorical and the other is quantitative data.
_______________ analytics are techniques that use models, constructed from past data, to predict the future or to ascertain the impact of one variable on another.
Predictive
_______________ analytics use techniques that take input data and yield a best course of action.
Prescriptive
Consider the result in Question 3. At a 5% level of significance, which variable is not significant?
Public
__________ is the data set used to build the candidate models.
Training set
Age is significant at a .05 level of significance.
True
Blood Pressure is significant at a .01 level of significance.
True
Consider the previous question for the next four questions. The model is significant.
True
Considering the results from Question 10, we can conclude that the model is significant.
True
Considering the results from Question 2, the coefficient for Line Speed is significant at a 5% level.
True
Considering the results from Question 6, the coefficient for Hours Spent Studying is significant at a 1% level.
True
A chart that is recommended as an alternative to a pie chart is a
bar chart.
The charts that are helpful in making comparisons between categorical variables are
bar charts and column charts.
In order to visualize three variables in a two-dimensional graph, we use a
bubble chart.
An alternative for a stacked column chart when comparing more than a couple of quantitative variables in each category is a
clustered column chart.
The ___________ is a measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation.
coefficient of determination
Fields may be chosen to represent all of the following except ____________ in the body of a PivotTable.
filters
An effective display of trend and magnitude is achieved by using a combination of a
heat map and sparklines.
A two-dimensional graph representing the data using different shades of color to indicate magnitude is called a
heat map.
Deleting the grid lines in a table and the horizontal lines in a chart
increases the data-ink ratio.
In a linear regression model, the variable (or variables) used for predicting or explaining values of the response variable are known as the ________________. It(they) is(are) denoted by x.
independent variable
Which of the following regression models is used to model a nonlinear relationship between the independent and dependent variables by including the independent variable and the square of the independent variable in the model?
quadratic regression model
In many cases, white space in a chart can improve
readability
The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is known as the
residual
A _____________ is a graphical presentation of the relationship between two quantitative variables.
scatter chart
A regression analysis involving one independent variable and one dependent variable is referred to as a
simple linear regression.
A line chart that has no axes but is used to provide information on overall trends for time series data is called a
sparkline.
The graph of the simple linear regression equation is a(n)
straight line.
Tables should be used instead of charts when
the values being displayed have different units or very different magnitudes.
A _____________ is a line that provides an approximation of the relationship between the variables.
trendline
The ratio of the amount of ink used in a table or chart that is necessary to convey information to the total amount of ink used in the table and chart is known as data-ink ratio. Using additional ink that is not necessary to convey information has what effect on the data-ink ratio?
reduces