BA 352 Final
You are identifying a linear relationship between arrival delay time in minutes (Y) and airtime in minutes (X). Your regression result is expressed as Y = 1 + 0.7X. The slope of this model can be interpreted as:
One more minute of airtime is associated with a 0.7-minute increase in arrival delay
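As a quick check of the slope interpretation, the sketch below evaluates the fitted line Y = 1 + 0.7X at two airtimes one minute apart. The function name `predicted_delay` and the airtimes 100 and 101 are illustrative assumptions, not part of the question.

```python
# Fitted line from the question: delay = 1 + 0.7 * airtime (both in minutes).
def predicted_delay(airtime_minutes: float) -> float:
    return 1 + 0.7 * airtime_minutes

# Arbitrary illustration values: two flights whose airtimes differ by one minute.
slope_effect = predicted_delay(101) - predicted_delay(100)
print(round(slope_effect, 2))  # 0.7
```

The difference equals the slope regardless of which starting airtime is chosen, which is what "per one more minute" means.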
Which of the following is a correct interpretation of Dell's beta?
A 1% increase in the market is linked to a 1.76% increase in Dell's stock price during a month
You are asked to find the distance between two cities. Suppose cells C10:J17 are selected and named "distances". You try the formula in D26 as shown in the picture. What result will you get?
2052
Set the random state to 42. You need to partition the data into training and validation datasets. You name the training datasets x_train and y_train and the validation datasets x_test and y_test. You do not set a specific test or training size, so the default option is used. What percentage of observations will be randomly selected for x_test?
25%
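In scikit-learn's `train_test_split`, the test share defaults to 0.25 when neither `test_size` nor `train_size` is given. The sketch below only mimics that 75/25 behavior with the standard library. The helper `split_indices` is hypothetical, and its shuffle will not reproduce the exact rows scikit-learn would pick for `random_state=42`.

```python
import math
import random

def split_indices(n_rows, test_size=0.25, seed=42):
    """Hypothetical helper: shuffle row indices and hold out a test share."""
    # Rounding the test count up is an assumption meant to mirror scikit-learn.
    n_test = math.ceil(test_size * n_rows)
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # First n_test shuffled rows form the test set, the rest the training set.
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(100)
print(len(train_idx), len(test_idx))  # 75 25
```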
Using worksheet "Sheet4", you are determining the least expensive way to produce and ship products to your customers while meeting demand. Here is what is given: Your company produces the product at its Los Angeles, Atlanta, and New York plants. Your plants can produce up to the capacity specified in F4:F6. Your company must ship the number of pounds listed in B2:E2 to meet the demand of the four regions in the US. The cost of producing and shipping to each region is listed in B4:E6. When you use the cheapest way to deliver the quantity to each region, how much is your total cost? (Select the closest value)
$43,000
The daily closing values of the Dow Jones Industrial Average over a period of 30 days are best described as _____ data.
time-series
A population includes all elements or objects of interest in a study, whereas a sample is a subset of the population used to gain insights into the characteristics of the population.
true
A variable (or field or attribute) is a characteristic of members of a population, whereas an observation (or case or record) is a list of all variable values for a single member of a population.
true
Age, height, and weight are examples of numerical data.
true
Correlation is used to determine the strength of the linear relationship between an explanatory variable X and response variable Y.
true
If a histogram of a data set is symmetric and bell shaped, with a mean of 75 and a standard deviation of 10, then approximately 95% of the data values will be between 55 and 95.
true
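The empirical rule can be checked against an exact normal curve. The sketch below assumes the data are exactly normal with mean 75 and standard deviation 10, which is stronger than "bell shaped", so the percentages are approximations for real data.

```python
from statistics import NormalDist

# Assumed model: an exact normal distribution with mean 75 and SD 10.
scores = NormalDist(mu=75, sigma=10)

within_1sd = scores.cdf(85) - scores.cdf(65)  # 65 to 85
within_2sd = scores.cdf(95) - scores.cdf(55)  # 55 to 95

print(round(within_1sd, 2), round(within_2sd, 2))  # 0.68 0.95
```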
If the regression equation includes anything other than a constant plus the sum of products of constants and variables, the model will not be linear.
true
In every regression study there is a single variable that we are trying to explain or predict. This is called the response variable or dependent variable.
true
R^2 is the square of the correlation between the observed Y values and the fitted Y values.
true
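The identity can be verified numerically. The sketch below fits a simple least-squares line to made-up data (the `x` and `y` values are illustrative assumptions), then compares 1 - SSE/SST with the squared correlation between observed and fitted values.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8]

def mean(values):
    return sum(values) / len(values)

def corr(a, b):
    """Pearson correlation computed from sums of deviations."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

# Least-squares slope and intercept for one explanatory variable.
slope = sum((xi - mean(x)) * (yi - mean(y)) for xi, yi in zip(x, y)) / sum(
    (xi - mean(x)) ** 2 for xi in x
)
intercept = mean(y) - slope * mean(x)
fitted = [intercept + slope * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - mean(y)) ** 2 for yi in y)
r_squared = 1 - sse / sst

print(r_squared, corr(y, fitted) ** 2)  # the two values agree
```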
The effect of a logarithmic transformation on a variable that is skewed to the right by a few large values is to "squeeze" the values together and make the distribution more symmetric.
true
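A small, deliberately contrived example shows the "squeezing" effect. The values below are powers of 10 chosen as an assumption so that the logged data come out perfectly symmetric. The `skewness` helper is a plain moment-based formula, not a function from any library.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment over SD cubed."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Contrived right-skewed data: one very large value dominates.
raw = [1, 10, 100, 1000, 10000]
logged = [math.log10(v) for v in raw]  # becomes 0, 1, 2, 3, 4

print(skewness(raw), skewness(logged))  # positive skew before, about 0 after
```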
The mean is a measure of central tendency.
true
The median of a data set with 30 values would be the average of the 15th and the 16th values when the data values are arranged in ascending order.
true
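The rule for an even count can be confirmed directly. The sketch below uses the integers 1 through 30 as an assumed, already-sorted data set.

```python
from statistics import median

values = list(range(1, 31))  # 30 values in ascending order: 1..30

# Python indexes from 0, so the 15th and 16th values are at positions 14 and 15.
middle_pair_average = (values[14] + values[15]) / 2

print(median(values), middle_pair_average)  # 15.5 15.5
```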
The residual is defined as the difference between the actual and predicted, or fitted values of the response variable.
true
The residuals are observations of the error variable. Consequently, the minimized sum of squared deviations is called the sum of squared error, labeled SSE.
true
The two primary objectives of regression analysis are to study relationships between variables and to use those relationships to make predictions.
true
We should include an interaction variable in a regression model if we believe that the effect of one explanatory variable X1 on the response variable Y depends on the value of another explanatory variable X2.
true
Does the beta mean Dell's stock is more volatile than the market?
yes
If cell B2 is 60, what will this IF function give you as a result? =IF(B2>55, "yes", "no")
yes
Outliers are observations that
lie outside the typical pattern of points on a scatterplot.
For a situation described below, identify and match the target cell, changing cells, and constraints. Situation 2 How should an airline company allocate its advertising budget between different advertising formats? Target cell
maximum profit contribution of sales generated by advertising
Adding regularization will add a penalty for coefficients becoming large, which prevents _____.
overfitting
What do you need to include when you are entering text into your formula?
quotes
During a month in which the market goes up 5 percent, you are 95 percent sure that Dell's stock price will increase between which range of values?
-18.7 percent to 42 percent
Consider the die problem. Which of the following is closest to the standard deviation of the number of dots showing when a die is tossed?
1.71
Estimate the beta of Dell. Which one of the following is the correct beta?
1.76
Continuing the warehouse location example, the average distance per shipment is? (Select the closest value)
1048 miles
Using summary statistics, what is the 25 percentile, median, and 75 percentile values of MEDV?
17, 21, 25
Consider the die problem. Which of the following is closest to the variance of the number of dots showing when a die is tossed?
2.92
Consider the die problem. Which of the following is closest to the mean of the number of dots showing when a die is tossed?
3.5
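The three die answers in this set (mean, variance, standard deviation) follow from treating each face as equally likely. A minimal sketch, assuming a fair six-sided die:

```python
faces = [1, 2, 3, 4, 5, 6]  # fair die: each face has probability 1/6

mean = sum(faces) / len(faces)
# Population variance: average squared deviation from the mean.
variance = sum((f - mean) ** 2 for f in faces) / len(faces)
std_dev = variance ** 0.5

print(mean, round(variance, 2), round(std_dev, 2))  # 3.5 2.92 1.71
```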
Consider the data below. What percentage of students scored grade C? Grades and number of students: A 16; B 28; C 33; D 13; Total 90.
37%
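The percentage is the count for grade C divided by the total. A short sketch using the counts from the question (the dictionary name `grade_counts` is an illustrative choice):

```python
grade_counts = {"A": 16, "B": 28, "C": 33, "D": 13}

total = sum(grade_counts.values())  # 90 students
share_c = grade_counts["C"] / total * 100

print(total, round(share_c))  # 90 37
```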
Suppose that the weekly sales of a certain aircraft component are normally distributed with a mean of 1,000 and a standard deviation of 250. There is a 1 percent chance that fewer than what number of components will be sold during a week?
418
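This is a 1st-percentile lookup on the normal curve. One way to reproduce it, sketched here with Python's `statistics.NormalDist` rather than the Excel function the course likely used:

```python
from statistics import NormalDist

weekly_sales = NormalDist(mu=1000, sigma=250)

# Value below which 1 percent of weekly sales fall.
cutoff = weekly_sales.inv_cdf(0.01)

print(round(cutoff))  # 418
```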
Suppose cells B2:D3 are selected and named "lookup". The formula =HLOOKUP(B8, lookup, 2, True) is created in C8. If C8 is dragged through C11, what is the formula in C9?
=HLOOKUP(B9, lookup, 2, True)
Which is the best formula to return just the 5 characters on the left-hand side of the text in cell A1?
=LEFT(A1,5)
Suppose cells H11:I15 are selected and named "Lookup2". You want to obtain the product's price based on a product ID. What is the correct formula in I19 to retrieve the price for the product in H19?
=VLOOKUP(H19, Lookup2, 2, FALSE)
For a situation described below, identify and match the target cell, changing cells, and constraints. An aircraft manufacturing company has $2 billion allocated to acquiring (purchasing) component manufacturing companies. Which companies should they buy? Changing Cells
A 0-1 changing cell for each company (1 indicates buying the company, while 0 indicates not buying the company)
Which of the following criteria is important for assessing the goodness of fit of a model?
All of the above
Which one of the following is the guideline for including/excluding variables in a regression equation?
All of the above
When you use a "min" or "max" function in an optimization problem, you need to use the _____ solving method.
Evolutionary
Using the Worksheet "Sheet1", create a table of descriptive statistics. Based on the mean and median values among the five stocks, which company performed the worst (lowest mean and median)?
GM
What is the most common type of chart for showing the distribution of a numerical variable?
Histogram
Based on the results above, taking 2 cents off the price predicts which of the following?
Increase in sales of about 13 canned tomatoes
Based on the results above, having a notice on the cart predicts which of the following?
Increase in sales of about 20 canned tomatoes
What does this wildcard/sign mean in Excel? & (ampersand)
It joins two or more text strings into a single string
For a situation described below, identify and match the target cell, changing cells, and constraints. An aircraft manufacturing company has $2 billion allocated to acquiring (purchasing) component manufacturing companies. Which companies should they buy? Target Cell
Maximize value gained from purchasing companies
If a positively skewed distribution has a median of 10, which of the following statements is true?
Mean is GREATER than 10
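A tiny made-up data set illustrates why the mean exceeds the median under positive skew: one large value in the right tail pulls the mean up but leaves the median alone. The numbers below are illustrative assumptions chosen so the median is 10.

```python
from statistics import mean, median

# Made-up right-skewed data: the single value 40 forms the long right tail.
data = [6, 8, 9, 10, 10, 11, 12, 40]

print(median(data), mean(data))  # 10.0 13.25
```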
Identify the shape of the distribution in the figure below.
Skewed right
What does this wildcard/sign mean in Excel? ""
This sign matches blank cells.
Under what circumstances should we be cautious about using the mean as a measure of central tendency?
When data is positively or negatively skewed.
If the number of observations in a single-variable data set is even, the median is the
average of the two middle observations.
A data set is typically a rectangular array of data, with observations in columns and variables in rows.
false
You can select any number of cells for the Set Objective variable.
false
The percentage of variation (R^2) can be interpreted as the fraction (or percent) of variation of the
response variable explained by the regression line.
The percentage of variation (R^2) ranges from
0 to +1.
Suppose that the weekly sales of a certain aircraft component are normally distributed with a mean of 1,000 and a standard deviation of 250. What is the probability that, during a week, from 400 through 1,100 components are sold?
0.65
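The probability of an interval is the difference between two cumulative probabilities. A sketch with Python's `statistics.NormalDist`, treating sales as continuous (an assumption that ignores whole-unit rounding):

```python
from statistics import NormalDist

weekly_sales = NormalDist(mu=1000, sigma=250)

# P(400 <= X <= 1100) = CDF(1100) - CDF(400)
prob = weekly_sales.cdf(1100) - weekly_sales.cdf(400)

print(round(prob, 2))  # 0.65
```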
You want to reduce the number of predictors using LASSO (Least Absolute Shrinkage and Selection Operator). Based on the LASSO results, which one of the following variables is determined to be less important in predicting the target? (You can select multiple answers)
INDUS, CHAS, NOX
Using worksheet "Sheet3", you need to solve an optimization problem to locate your single warehouse. The location is determined to minimize the total distance the shipments are distributed. Here is what is given: The distance formula is defined as Distance = 69 * sqrt[(Lat1 - Lat2)^2 + (Long1 - Long2)^2]. The latitude of your warehouse should be between 31 and 48 degrees. The longitude of your warehouse should be between 81 and 120 degrees. You need to complete the setup by filling in the blue-shaded cells. What are the coordinates of your warehouse? Select the closest values.
34 N, 96 W
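The distance formula can be written as a small function. This sketch assumes the usual square-root (straight-line) form, with 69 as the approximate number of miles per degree. The function name `distance_miles` and the city at 37 N, 100 W are hypothetical and are not taken from the worksheet.

```python
import math

def distance_miles(lat1, long1, lat2, long2):
    """Approximate distance in miles, assuming about 69 miles per degree."""
    return 69 * math.sqrt((lat1 - lat2) ** 2 + (long1 - long2) ** 2)

# Warehouse at 34 N, 96 W (the answer above) and a hypothetical city at 37 N, 100 W.
print(distance_miles(34, 96, 37, 100))  # 345.0
```

In Solver, a formula like this would be computed for each city from the changing cells (warehouse latitude and longitude), and the shipment-weighted total would be the target to minimize.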
What is the mode of the following numbers? 1, 3, 4, 4, 5, 5, 5, 5, 6, 10
5
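The mode is the value that occurs most often. A quick check with the numbers from the question, using a frequency count alongside `statistics.mode`:

```python
from collections import Counter
from statistics import mode

numbers = [1, 3, 4, 4, 5, 5, 5, 5, 6, 10]

counts = Counter(numbers)  # how many times each value appears
print(mode(numbers), counts[5])  # 5 4
```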
Which of the following statements is correct about the dimensions of the data?
506, 13
Using worksheet "Sheet2", you need to solve an optimization problem to schedule your workforce. Here is what is given: All bank employees work four consecutive days. The number of workers needed each day is shown in row 14 (C14:I14). The number of employees needs to be an integer. You need to complete the setup by filling in the cells that are blue-shaded (B2, C12:I12). What is the minimum number of employees you need to hire while meeting all of the constraints?
58
For data having a bell-shaped distribution, approximately _____ percent of the data values will be within one standard deviation of the mean.
68
Which of the following characteristics is NOT about a normal distribution?
95% of the observations lie within +/- one standard deviation of the mean
Which of the following is an operator for inserting a not equal to condition in an IFS, COUNTIFS or SUMIFS function?
<>
First, obtain the predictions and the standardized residuals using the test data. Then plot the predictions on the x-axis and the residuals on the y-axis. Where do you see the standardized residuals become more spread out?
MEDV below 10 and above 30
Based on the p-values, which of the following factors are important in predicting the sales of canned tomatoes? (select multiple factors if applicable)
Having 2 cents off the price; having a notice on the cart
You will estimate the multiple linear regression model to predict MEDV. Include all predictors. Based on the result, which one of the following variables does not add predictive power to the model?
INDUS
Researchers may try to gain insight into the characteristics of a population by examining a(n) _____ from the population.
sample
What does this wildcard/sign mean in Excel? * (asterisk)
It represents any number of characters. For example, Ex* could mean Excel, Excels, Example, Expert, etc.
What does this wildcard/sign mean in Excel? ? (question mark)
It represents one single character. For example, Ex?el could mean Excel or Exoel.
A histogram that is positively skewed may also be described as
skewed to the right.
Which is the easiest method to assign a name to a cell or range of cells?
Select the cell(s) and then type the name in the Name Box (to the left of the formula bar)
The picture shows the distributions of three datasets, all of which follow a normal distribution. The purple line shows data 1, the pink line shows data 2, and the orange line shows data 3. Which standard deviation is the largest?
Standard deviation of Data 3
What three elements do you need to identify before running Excel Solver?
Target cell, changing cells, and constraints
Which of the following characteristics can be used to describe the skewness of a distribution?
skewness
In choosing the "best-fitting" line through a set of points in linear regression, we choose the one with the
smallest sum of squared residuals.
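The least-squares criterion can be illustrated by comparing the sum of squared residuals for the fitted line against nearby lines. The data and the size of the nudges below are made-up assumptions. The slope and intercept use the standard one-variable least-squares formulas.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

def sse(intercept, slope):
    """Sum of squared residuals for the line intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
best_slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
best_intercept = y_bar - best_slope * x_bar

print(sse(best_intercept, best_slope))        # smallest of the three
print(sse(best_intercept, best_slope + 0.1))  # steeper line: larger SSE
print(sse(best_intercept + 0.5, best_slope))  # shifted line: larger SSE
```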
For a situation described below, identify and match the target cell, changing cells, and constraints. An aircraft manufacturing company has $2 billion allocated to acquiring (purchasing) component manufacturing companies. Which companies should they buy? Constraints
Total amount spent buying companies <= (less than or equal to) $2 billion.
If a histogram of a data set is symmetric and bell shaped, with a mean of 75 and a standard deviation of 10, then approximately 68% of the data values will be between 65 and 85.
true
In an extremely right-skewed distribution, the mean is less than the median.
false
The standard error of the estimate, or standard error of the regression, is essentially the
standard deviation of the residuals.
For a situation described below, identify and match the target cell, changing cells, and constraints. Situation 1 If a city has only one airport, where should it be located? Changing cells
the longitude and latitude of the airport
Based on the results above, which of the following statements is correct about R square?
Your regression explains 88.7 percent of the variation in the sales of canned tomatoes
For a situation described below, identify and match the target cell, changing cells, and constraints. Situation 2 How should an airline company allocate its advertising budget between different advertising formats? Changing cells
amount spent on advertising in each available media
A sample, selected from a population, taken at one particular point in time is categorized as
cross-sectional.
For a situation described below, identify and match the target cell, changing cells, and constraints. Situation 2 How should an airline company allocate its advertising budget between different advertising formats? Constraints
do not spend more than the budget
Correlation is measured on a scale from 0 to 1, where 0 indicates no linear relationship between two variables, and 1 indicates a perfect linear relationship.
false
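The statement is false because correlation runs from -1 to +1, and a perfect decreasing line gives -1. A sketch with made-up data, where the `corr` helper is a plain Pearson formula written out by hand:

```python
def corr(a, b):
    """Pearson correlation computed from sums of deviations."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

# Made-up data: y falls by exactly 2 for every 1-unit rise in x.
x = [1, 2, 3, 4]
y = [8, 6, 4, 2]

print(corr(x, y))  # -1.0
```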
In a nonlinear transformation of data, the Y variable or the X variables may be transformed, but not both.
false
To help explain or predict the response variable in every regression study, we use one or more explanatory variables. These variables are also called response variables, target variables, or independent variables.
false
In linear regression, a dummy variable is used to
include categorical variables in the regression equation.
Based on the descriptive statistics of five stocks you created above, you will compare the risk levels between GE and Intel. _____ is riskier than the other stock due to its _____ level of _____.
Intel, higher, standard deviation
If there is an outlier in a dataset, it will strongly affect which measure of central tendency?
mean
For a situation described below, identify and match the target cell, changing cells, and constraints. Situation 1 If a city has only one airport, where should it be located? Target Cell
minimize the total distance passengers travel to the airport
For a situation described below, identify and match the target cell, changing cells, and constraints. Situation 1 If a city has only one airport, where should it be located? Constraints
none
If you set the regularization parameter (also referred to as a hyperparameter of the penalty term) equal to zero, the coefficients will _____, and your model will _____.
not be penalized, be multiple linear regression
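The effect of a zero penalty can be seen in a one-predictor ridge-style sketch. With centered data and an unpenalized intercept, the penalized slope is Sxy / (Sxx + penalty), so a penalty of 0 recovers the ordinary least-squares slope. The data, the helper name `ridge_slope`, and the penalty value 10 are illustrative assumptions. This is an L2 (ridge) penalty, not the LASSO used elsewhere in this set.

```python
# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

def ridge_slope(x_values, y_values, penalty):
    """Slope of a one-predictor ridge fit on centered data."""
    mx = sum(x_values) / len(x_values)
    my = sum(y_values) / len(y_values)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_values, y_values))
    sxx = sum((a - mx) ** 2 for a in x_values)
    # The penalty inflates the denominator, shrinking the slope toward zero.
    return sxy / (sxx + penalty)

ols_slope = ridge_slope(x, y, 0.0)      # no penalty: plain linear regression
shrunk_slope = ridge_slope(x, y, 10.0)  # positive penalty: smaller coefficient

print(ols_slope, shrunk_slope)
```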
In order for the characteristics of a sample to be generalized to the entire population, the sample should be _____ the population.
representative of
