MBA 6040


A popular software package for creating imaginative types of visualization techniques is

Tableau Public

Which of the following is true of Tableau Public?

Tableau Public is totally separate from Excel, although Excel data can be imported into Tableau Public.

Given the least squares regression line, Ŷ = 8 - 3X, which statement is true?

The relationship between X and Y is negative.

A scatterplot allows one to see

whether there is any relationship between two variables and what type of relationship there is. The type of relationship (linear vs. nonlinear) can also be identified through a scatterplot.

Which of the following are true statements of pivot tables?

They allow us to "slice and dice" data in a variety of ways. Statisticians often refer to them as contingency tables or crosstabs. Pivot tables can list counts, averages, sums, and other summary measures, whereas contingency tables list only counts.

State the value of the coefficient of determination, R², and interpret its meaning.

R² = 0.6839. This means that 68.39% of the variation in weekly sales can be explained by the variation in shelf space available for international food.

A moving average is the average of the observations in the past few periods, specified by the span.

True

The total area under the normal curve is equal to 1.

True

A mean absolute error value of zero means that the forecast is exactly accurate and there is no forecast error.

True - MAE is the average of the absolute errors across all periods. If it equals zero, the absolute error in every period is zero.

The residual is defined as the difference between the actual and fitted values of the response variable:

True - Residual = actual y - predicted y

The scatterplot is a graphical technique used to describe the relationship between two numerical variables.

True - A scatterplot displays values of two variables using Cartesian coordinates.

Median is a better statistic than average for measuring center of a skewed distribution.

True - Average can be easily influenced by outliers, which are common in a skewed distribution.

Both ordinal and nominal variables are categorical.

True - Categorical Data: Data sets that can be separated into different categories that can be distinguished by non-numeric characteristics (names/labels). Ordinal variable has an implicit order (e.g., Excellent, Good, Average) while nominal variable does not (e.g., gender).

A variable (or field) is an attribute, or measurement, on members of a population, whereas an observation (or case or record) is a list of all variable values for a single member of a population.

True - Correct, variable is an attribute (e.g., age) while an observation consists of all variable values for a single record of a population.

A correlation coefficient of 0 indicates no linear relationship, but there could be a nonlinear relationship between the two variables.

True - Correlation is relevant only for measuring linear relationships.

When the scatterplot appears as a shapeless swarm of points, this can indicate that there is no relationship between the response variable Y and the explanatory variable X, at least none worth pursuing.

True - Downward or upward trend indicates negative or positive relationship.

The difference between the actual data value and the forecasted data value is called the forecast error.

True - Forecast error = actual - predicted value.

A negative relationship between an explanatory variable X and a response variable Y means that as X increases, Y decreases, and vice versa

True - Negative relationship means the two variables move in the opposite direction.

A regression analysis between weight (Y in pounds) and height (X in inches) resulted in the following least squares line: Ŷ = 140 + 5X. This implies that if the height is increased by 1 inch, the weight is expected to increase on average by 5 pounds.

True - Slope shows the change in y for each unit of change in X

The standard deviation is measured in original units, such as dollars and pounds.

True - Variance is in squared units while standard deviation is in original unit (page 38 of the text).

To help explain or predict the dependent variable in every regression study, we use one or more independent variables.

True - We use independent variables (X) to explain or predict the dependent variable (Y) in a regression study.

Given a large sample of employees in a given industry, you run a regression for annual salary (in $1,000s) versus three X variables: X1, the number of years employed in the industry; X2, a dummy for college degree (1 if the employee has at least one college degree, 0 otherwise); and X3, a dummy for gender (1 for male, 0 for female). The estimated regression equation is Y = 47.9 + 2.7*X1 + 4.9*X2 + 1.8*X3. Consider two employees, not part of the sample, with the following characteristics: Jim, who has 5 years of experience in the industry and no college degree; and Mary, who has 10 years of experience in the company and a college degree. Which of the following is the prediction of the difference between their annual salaries (Jim's salary minus Mary's salary, in $1,000s)?

Y = 47.9 + 2.7*X1 + 4.9*X2 + 1.8*X3, where X1 is the number of years employed in the industry, X2 is a dummy for college degree (1 if the employee has at least one college degree, 0 otherwise), and X3 is a dummy for gender (1 for male, 0 for female).
Jim (5 years of experience, no college degree, male): Y = 47.9 + 2.7*5 + 4.9*0 + 1.8*1 = 63.2
Mary (10 years of experience, college degree, female): Y = 47.9 + 2.7*10 + 4.9*1 + 1.8*0 = 79.8
Difference (Jim minus Mary): 63.2 - 79.8 = -16.6
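
The salary comparison can be checked with a short Python sketch. The coefficients come from the question; the function name `predict_salary` and its keyword arguments are made up for illustration, and the gender coding (Jim as 1, Mary as 0) is assumed from the names.

```python
# Coefficients taken from the estimated regression equation in the question.
INTERCEPT = 47.9
B_YEARS, B_DEGREE, B_MALE = 2.7, 4.9, 1.8


def predict_salary(years, has_degree, is_male):
    """Predicted annual salary in $1,000s (dummies are 0 or 1)."""
    return INTERCEPT + B_YEARS * years + B_DEGREE * has_degree + B_MALE * is_male


# Assumption: Jim is coded male (X3 = 1) and Mary female (X3 = 0).
jim = predict_salary(years=5, has_degree=0, is_male=1)
mary = predict_salary(years=10, has_degree=1, is_male=0)
difference = jim - mary

print(round(jim, 1), round(mary, 1), round(difference, 1))  # 63.2 79.8 -16.6
```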

Which equation shows the process of standardizing?

Z = (X - μ)/σ
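
As a small illustration of standardizing, the sketch below applies the formula to one value. The helper name `standardize` is hypothetical, and the sample numbers (404 with mean 400 and standard deviation 2) are borrowed from the gasket question elsewhere in this set.

```python
def standardize(x, mean, std_dev):
    """Return the z-score: how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive to standardize")
    return (x - mean) / std_dev


z = standardize(404, mean=400, std_dev=2)
print(z)  # 2.0, so 404 lies two standard deviations to the right of the mean
```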

A continuous probability distribution is characterized by

a continuum of possible values.

When using the moving average method, you must select ______, which represents the number of terms in the moving average.

a span

A cost accountant is developing a regression model to predict the total cost of producing a batch of printed circuit boards as a linear function of batch size (the number of boards produced in one lot or batch), production plant (Kingsland, and Yorktown), and production shift (day, and evening). In this model, "production shift" is

an independent variable - The total cost is modeled "as a linear function of batch size (the number of boards produced in one lot or batch), production plant (Kingsland, and Yorktown), and production shift (day, and evening)", so production shift is one of the three independent variables.

A bimodal histogram is often an indication that the

data possibly come from two or more distinct populations

There are two types of random variables, they are

discrete and continuous

Regression analysis asks

how a single variable depends on other relevant variables

In regression analysis, the variables used to help explain or predict the response variable are called the

independent variables

The mean μ of a probability distribution is a:

measure of central location - A normal distribution is characterized by its mean and standard deviation. Changing the mean shifts the curve left or right, while changing the standard deviation makes the curve more or less spread out.

A scatterplot that appears as a shapeless mass of data points indicates:

no relationship between the two variables - A shapeless mass of data points indicates no relationship between the two variables.

The standard deviation σ of a probability distribution must be:

a nonnegative number

In linear regression, the fitted value is

predicted value of the dependent variable

When using exponential smoothing, a smoothing constant α must be used. The value for α

ranges between 0 and 1

The coefficient of determination (R²) can be interpreted as the fraction (or percent) of variation of the

response variable explained by the regression line

In choosing the "best-fitting" line through a set of points in linear regression, we choose the one with the:

smallest sum of squared residuals - The "best-fitting" line minimizes the sum of squared residuals (errors).
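
To make the least squares criterion concrete, here is a minimal sketch that fits a line with the standard closed-form formulas and compares its sum of squared residuals with that of another candidate line. The four data points are hypothetical (they are not from the course material) and were chosen so that the fitted line comes out as Ŷ = 8 - 3X; the function names are illustrative.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (actual y - fitted y)^2 for the given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data for illustration only.
xs = [1, 2, 3, 4]
ys = [6, 1, -2, -3]

intercept, slope = fit_least_squares(xs, ys)
sse_best = sum_squared_residuals(xs, ys, intercept, slope)
sse_other = sum_squared_residuals(xs, ys, 8, -2.5)  # an arbitrary competing line

print(intercept, slope)     # 8.0 -3.0
print(sse_best, sse_other)  # 4.0 11.5
```

The negative slope matches the earlier card: as X increases, the fitted Y decreases.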

In linear regression, a dummy variable is used:

to include categorical variables in the regression equation - A dummy variable is used to represent categorical variables (e.g., gender).

If the value of the standard normal random variable Z is positive, then the original score is where in relationship to the mean?

to the right of the mean - Z = (Original Value - Mean)/Standard Deviation, so if Z > 0, then Original Value > Mean.

Holt's model differs from simple exponential smoothing in that it includes a term for:

trend

Which of the following statements are true regarding the probability distribution of a random variable X?

• The probabilities must be nonnegative
• The probabilities must sum to 1

A real estate agent has gathered information on 40 houses that were recently sold in a local community. The data she collected represent the following variables: the selling price of each house (in thousands of dollars), the appraised value of each house (in thousands of dollars), the size of the house (in hundreds of square feet), and the number of bedrooms. Indicate which variable is discrete.

Number of bedrooms - Selling price, appraised value, and size of the house (square footage) are continuous. Number of bedrooms is discrete.

The data below represents sales for a particular product. If you were to use the moving average method with a span of 3 periods, what would be your forecast for period 5? Sales (in units) by period: period 1 = 90, period 2 = 120, period 3 = 110, period 4 = 100.

(120+110+100)/3 = 110

The data below represents sales for a particular product. If you were to use the moving average method with a span of 3 periods, what would be your forecast for period 4? Sales (in units) by period: period 1 = 90, period 2 = 120, period 3 = 110, period 4 = 100.

(90+120+110)/3 = 106.67
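
Both moving average forecasts above can be reproduced with a few lines of Python. This is only a sketch: the function name `moving_average_forecast` is made up, and it assumes the forecast for the next period is the plain average of the last `span` observations.

```python
def moving_average_forecast(history, span):
    """Forecast the next period as the mean of the last `span` observations."""
    if span < 1 or span > len(history):
        raise ValueError("span must be between 1 and the number of observations")
    return sum(history[-span:]) / span


sales = [90, 120, 110, 100]  # periods 1 through 4, from the question

forecast_period_5 = moving_average_forecast(sales, span=3)      # uses periods 2-4
forecast_period_4 = moving_average_forecast(sales[:3], span=3)  # uses periods 1-3

print(forecast_period_5)            # 110.0
print(round(forecast_period_4, 2))  # 106.67
```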

The mode is best described as the

most frequently occurring value

Which of the following statements are false?

A categorical variable is nominal if there is a natural ordering of its possible values. - This statement is false: a categorical variable with a natural ordering of its possible values is ordinal, not nominal.

Which of the following best describes the concept of marginal probability?

It is a measure of the likelihood that a particular event will occur, regardless of whether another event occurs. - In P(A1|B1) = P(A1 and B1)/P(B1), the marginal probability is P(B1), the joint probability is P(A1 and B1), and the conditional probability is P(A1|B1).

If P(A) = P(A|B), then events A and B are said to be

independent - P(A) = P(A|B) says the probability that event A occurs is not affected by whether event B occurs, so events A and B are independent.

The correlation coefficient is always a value between

-1 and +1 - A value of +1 indicates a perfect positive correlation and -1 a perfect negative correlation.

The correlation value ranges from

-1 to +1

If A and B are any two events with P(A) = .8 and P(B|A) = .4, then the joint probability of A and B is

.32 - P(A and B) = P(A) * P(B|A) = .8 * .4 = .32
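
The multiplication rule behind this answer, P(A and B) = P(A) * P(B|A), can be sketched as follows. The function names are illustrative, and the second helper simply inverts the rule to recover the conditional probability.

```python
def joint_probability(p_a, p_b_given_a):
    """Multiplication rule: P(A and B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a


def conditional_probability(p_joint, p_given_event):
    """P(B|A) = P(A and B) / P(A); requires P(A) > 0."""
    if p_given_event <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given_event


p_a_and_b = joint_probability(0.8, 0.4)
print(round(p_a_and_b, 2))                                # 0.32
print(round(conditional_probability(p_a_and_b, 0.8), 2))  # 0.4
```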

Given that Z is a standard normal random variable, P(-1.0 ≤ Z ≤ 1.5) is

0.7745 = NORMSDIST(1.5) - NORMSDIST(-1)
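
Outside Excel, the same probability can be computed with Python's standard library. In this sketch, `NormalDist().cdf` plays the role that NORMSDIST plays in the answer above; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# P(-1.0 <= Z <= 1.5) is the difference of two cumulative probabilities.
p_between = standard_normal.cdf(1.5) - standard_normal.cdf(-1.0)

print(round(p_between, 4))  # 0.7745
```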

If the gasket production machine is working properly, the population of gasket OD measures can be reasonably modeled by a Normal distribution with mean OD = 400 mm and standard deviation OD = 2 mm. The engineering specifications provide that a gasket should be between 394 mm and 404 mm, otherwise a gasket is defective. Find the probability that a randomly selected gasket's measure is <= 394 mm. Save your work - the next two questions build on this one.

0.00135 = NORM.DIST(394,400,2,1)

A popular retail store knows that the distribution of purchase amounts by its customers is approximately normal with a mean of $30 and a standard deviation of $9. What is the probability that a randomly selected customer will spend less than $15 at this store?

0.04779 = NORMDIST(15,30,9,1)

Consider a random variable X with the following probability distribution: P(X=0) = 0.08, P(X=1) = 0.22, P(X=2) = 0.25, P(X=3) = 0.25, P(X=4) = 0.15, P(X=5) = 0.05 Find P(X<2)

0.08 + 0.22 = 0.3
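
A discrete distribution like this one can be stored as a dictionary, which makes it easy to confirm that it is valid (nonnegative probabilities that sum to 1) before summing the cases of interest. The variable names below are illustrative.

```python
# Probability distribution of X, copied from the question.
pmf = {0: 0.08, 1: 0.22, 2: 0.25, 3: 0.25, 4: 0.15, 5: 0.05}

# A valid distribution has nonnegative probabilities that sum to 1.
is_valid = all(p >= 0 for p in pmf.values()) and abs(sum(pmf.values()) - 1.0) < 1e-9

# P(X < 2) adds the probabilities of the values strictly below 2.
p_less_than_2 = sum(p for x, p in pmf.items() if x < 2)

print(is_valid)                 # True
print(round(p_less_than_2, 2))  # 0.3
```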

Researchers studying the effects of a new diet found that the weight loss over a one-month period by those on the diet was normally distributed with a mean of 9 pounds and a standard deviation of 3 pounds. If a dieter is selected at random, what is the probability that the dieter lost at most 7.5 pounds?

0.3085 = NORMDIST(7.5,9,3,1)

If the gasket production machine is working properly, the population of gasket OD measures can be reasonably modeled by a Normal distribution with mean OD = 400 mm and standard deviation OD = 2 mm. The engineering specifications provide that a gasket should be between 394 mm and 404 mm, otherwise a gasket is defective. Find the probability that a randomly selected gasket is not defective.

0.9759 = NORM.DIST(404,400,2,1)-NORM.DIST(394,400,2,1)

If the gasket production machine is working properly, the population of gasket OD measures can be reasonably modeled by a Normal distribution with mean OD = 400 mm and standard deviation OD = 2 mm. The engineering specifications provide that a gasket should be between 394 mm and 404 mm, otherwise a gasket is defective. Find the probability that a randomly selected gasket's measure is <= 404 mm. Save your work - the next two questions build on this question.

0.9772 = NORM.DIST(404,400,2,1)
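
The three gasket questions share one distribution, so they can be worked together. The sketch below uses `statistics.NormalDist` from the Python standard library in place of Excel's NORM.DIST with the cumulative flag set to 1; the variable names are illustrative.

```python
from statistics import NormalDist

# Gasket OD modeled as Normal(mean = 400 mm, standard deviation = 2 mm).
gasket_od = NormalDist(mu=400, sigma=2)

p_at_most_394 = gasket_od.cdf(394)                # lower spec limit
p_at_most_404 = gasket_od.cdf(404)                # upper spec limit
p_not_defective = p_at_most_404 - p_at_most_394   # between the two limits

print(round(p_at_most_394, 5))    # 0.00135
print(round(p_at_most_404, 4))    # 0.9772
print(round(p_not_defective, 4))  # 0.9759
```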

The probability of an event and the probability of its complement always sum to:

1

The grades on the final examination given in a large organic chemistry class are normally distributed with a mean of 72 and a standard deviation of 8. The instructor of this class wants to assign an "A" grade to the top 10% of the scores, a "B" grade to the next 10% of the scores, a "C" grade to the next 10% of the scores, a "D" grade to the next 10% of the scores, and an "F" grade to all scores below the 60th percentile of this distribution. For a letter grade of A, find the lowest acceptable score within the established range.

82.3 = NORMINV(0.90, 72, 8)
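
The cutoff for an A is the 90th percentile of the score distribution, since the top 10% of scores lie above it. As a sketch, `NormalDist.inv_cdf` in Python's standard library does the same job as NORMINV here; the variable names are illustrative.

```python
from statistics import NormalDist

exam_scores = NormalDist(mu=72, sigma=8)

# The lowest score that still earns an A is the 90th percentile.
a_cutoff = exam_scores.inv_cdf(0.90)

print(round(a_cutoff, 1))  # 82.3
```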

Suppose that a simple exponential smoothing model is used (with alpha = 0.30) to forecast monthly sandwich sales at a local sandwich shop. After June's demand is observed at 1520 sandwiches, the forecasted demand for June is 1600 sandwiches. What would be the forecasted demand for July?

0.3*1520 + 0.7*1600 = 1576
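
The simple exponential smoothing update weights the latest observation by alpha and the prior forecast by (1 - alpha). A minimal sketch, with a hypothetical function name:

```python
def next_smoothed_forecast(alpha, actual, prior_forecast):
    """Simple exponential smoothing: alpha * actual + (1 - alpha) * prior forecast."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * actual + (1 - alpha) * prior_forecast


# June: observed demand 1520, forecast 1600, alpha = 0.30 (from the question).
july_forecast = next_smoothed_forecast(alpha=0.30, actual=1520, prior_forecast=1600)

print(round(july_forecast, 1))  # 1576.0
```

With alpha close to 1 the result tracks the latest observation; with alpha close to 0 it stays near the prior forecast.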

Given that the random variable X is normally distributed with a mean of 80 and a standard deviation of 10, P(85 ≤ X ≤ 90) is

0.1498 = NORMDIST(90,80,10,1) - NORMDIST(85,80,10,1)

Which of the following are considered measures of association?

Correlation and Covariance

A sample of a population taken at one particular point in time is categorized as:

Cross-sectional

Numerical variables can be subdivided into which two types?

Discrete and Continuous - Based on the data, we can use statistics to make conclusions about the people or elements that the data is describing. Some data sets consist of numbers, while other data sets are non-numerical, consisting of names or labels. To distinguish between the two types of data, we use the terms Numerical Data and Categorical Data.

In the case of simple exponential smoothing, a smoothing constant, alpha, close to 1 places more weight on the prior forecast.

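False - An alpha close to 1 places more weight on the most recent observation, not on the prior forecast.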

To calculate the five-period moving average for a time series, we average the values in the two preceding periods, and the values in the three following time periods.

False - Average the values in the five preceding periods.

If you are in a class where most students do reasonably well on a test but a few pull down the class average, the histogram describing the test scores will indicate a positively-skewed distribution.

False - "A few pull down the class average" indicates the tail is to the left - a negatively-skewed distribution.

The time required to drive from Detroit to Lansing is an example of a discrete random variable

False - It is an example of a continuous random variable because any value is possible within a certain range.

The median is one of the most frequently used measures of variability.

False - Median is not a measure of variability but of central tendency.

If two data sets have the same range, the smallest and largest observations in both sets should be the same

False - Range = max - min, so two data sets can have the same range while their smallest and largest observations differ.

Phone numbers, Social Security numbers, and zip codes are examples of numerical variables.

False - Phone numbers, Social Security numbers, and zip codes are considered categorical data: the digits serve as labels, not quantities. Some data sets consist of numbers, while other data sets are non-numerical, consisting of names or labels. To distinguish between the two types of data, we use the terms Numerical Data and Categorical Data.

A data sample has a mode of 140, a median of 130, and a mean of 120. The distribution of the data is positively skewed.

False - The distribution is negatively skewed because the mean is smaller than the median: the mean is pulled toward the left tail.

Interpret the meaning of the slope b.

For each increase in shelf space by one foot, there is an expected increase in weekly sales by $7.40.

The forecast error is the difference between

Forecast error = the actual value (what really happened) - forecast (predicted value from the model).

In regression analysis, the variables used to help explain or predict the response variable are called the ______ variables.

Independent

Which of the following statistics is not a measure of central location?

Interquartile Range - measures variability

The measure of forecast accuracy that is not influenced by the measurement scale of the time series data is

MAPE - In MAPE, the measurement scale of the data is eliminated by dividing the absolute error by the time series data value.

Which of the following is not one of the summary measures for forecast errors that is commonly used?

MFE - The mean forecast error is not a commonly used summary measure because the positive and negative errors cancel each other out.

Which of the following is not one of the commonly used summary measures for forecast errors?

MFE (Mean Forecast Error)
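
The sketch below computes MFE alongside three commonly used measures (MAE, RMSE, and MAPE) to show why MFE can mislead. The actual and forecast values are hypothetical, chosen so that one positive and one negative error cancel; the function name is illustrative, and forecast error is taken as actual minus forecast, as defined in this set.

```python
from math import sqrt


def forecast_error_summary(actuals, forecasts):
    """Return MFE, MAE, RMSE, and MAPE (in percent) for paired actuals and forecasts."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    n = len(errors)
    return {
        "MFE": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": sqrt(sum(e * e for e in errors) / n),
        # Dividing by the actual value removes the measurement scale.
        "MAPE": 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actuals)) / n,
    }


# Hypothetical data for illustration only.
actuals = [100, 110, 120]
forecasts = [90, 120, 120]

summary = forecast_error_summary(actuals, forecasts)
for name, value in summary.items():
    print(name, round(value, 2))
```

Here the errors are 10, -10, and 0, so MFE is 0 even though two of the three forecasts missed, while MAE, RMSE, and MAPE are all positive.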

In a normal distribution, changing the standard deviation:

Makes the curve more or less spread out

For a boxplot, the vertical line inside the box indicates the location of the

Median

For a boxplot, the box itself represents what percent of the observations?

Middle 50%

Two forecasting models are under consideration. Model 1 has an MAPE of 5.69% and Model 2 7.89%. Which model is more accurate?

Model 1 - A lower MAPE value indicates a more accurate model.

In regression analysis, if there are several explanatory variables, it is called _____ regression?

Multiple

Which metric does StatTools use for optimizing the alpha value in an exponential smoothing model?

RMSE

Which of the following statements is true?

Standard deviation cannot be negative

You play a game where the amount you win (or lose, if negative) can be $1,000, $100, $0, or -$2,000. Let X be the amount you win (or lose), and assume the distribution of X is the following: P(X = 1,000) = 0.1, P(X = 100) = 0.5, P(X = 0) = 0.2, and P(X = -2,000) = 0.2. What is the probability that you win $100 or more? Save your work - the next two questions build on this question.

Sum of P(X = 1,000) = 0.1 and P(X = 100) = 0.5, which is 0.6.

