Analytics in Operations Midterm 3
challenges of sampling
- Sample results provide only estimates of the values of the corresponding population characteristics. - Some sampling error is expected to occur because the sample contains only a portion of the population. - Statistical procedures are available to determine the reliability of sample results.
five challenges of census collection
- expensive - time consuming - misleading - unnecessary (may be excessive) - impractical
the null hypothesis is...
- the assumption you're beginning with - the opposite of the claim you're testing. Think of the U.S. legal system: innocent until proven guilty
A time series plot of a period of time (quarterly) versus quarterly sales (in $1,000s) is shown below. Which of the following data patterns best describes the scenario shown? (squiggly but trending up)
Seasonal pattern and linear trend
confidence interval is...
an estimate of a population parameter that provides an interval believed to contain the value of the parameter at some level of confidence
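A minimal sketch of how such an interval could be computed for a population mean. It assumes the sample is large enough for a normal (z) critical value; a t critical value would suit small samples better. The function name and the salary figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for the population mean using a z critical value.

    Assumption: the normal approximation is acceptable for this sample.
    """
    n = len(sample)
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * standard_error
    return point_estimate - margin_of_error, point_estimate + margin_of_error

# Hypothetical sample of monthly salaries (illustrative only).
salaries = [2900, 3100, 3050, 2950, 3000, 3200, 2800, 3000, 3100, 2900]
lower, upper = mean_confidence_interval(salaries)
print(f"95% interval estimate: ({lower:.1f}, {upper:.1f})")
```

Raising `confidence` to 0.99 widens the interval, since a larger critical value is needed to capture the parameter more often.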
______ involves using a sample to draw conclusions about a population parameter
inferential statistics
the causal forecasting method...
is based on the relationship between the dependent variable being forecast and one or more independent variables, NOT on the historical values of the same variable as in time series methods
SSE formula
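the sum of squared differences between each observed value and its predicted value: SSE = Σ (y_i - ŷ_i)^2; summing squared deviations from the mean instead gives SST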
If a greater number of past values are considered relevant, then we generally...
opt for a larger value of k to smooth out random fluctuations
A _____ is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables.
scatter chart
A time series that shows a recurring pattern over one year or less is said to follow a _____.
seasonal pattern
A regression analysis involving one independent variable and one dependent variable is referred to as a _____.
simple linear regression
In the graph of the simple linear regression equation, the parameter β1 is the _____ of the true regression line.
slope
The least squares regression line minimizes the sum of the _____.
squared differences between actual and predicted y values
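A short sketch of the least squares calculation for one independent variable, using the standard closed-form slope and intercept. The data and the function name are hypothetical.

```python
from statistics import mean

def least_squares_fit(x, y):
    """Return (b0, b1), the intercept and slope minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx           # estimated slope
    b0 = y_bar - b1 * x_bar  # estimated y-intercept
    return b0, b1

# Hypothetical sample data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_fit(x, y)
print(f"y_hat = {b0:.2f} + {b1:.2f} x")  # prints: y_hat = 2.20 + 0.60 x
```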
______ uses sample data to make inferences and answer research questions
statistical inference
The Moving Average Method uses...
the average of the most recent k data values in the time series as the forecast for the next period
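As a rough illustration, the forecast is just the mean of the last k observations. The sales figures and function name below are hypothetical.

```python
def moving_average_forecast(series, k):
    """Forecast the next period as the mean of the most recent k values."""
    if k < 1 or k > len(series):
        raise ValueError("k must be between 1 and the length of the series")
    return sum(series[-k:]) / k

# Hypothetical weekly sales (illustrative only).
sales = [17, 21, 19, 23, 18, 16, 20]
forecast_k3 = moving_average_forecast(sales, 3)  # (18 + 16 + 20) / 3 = 18.0
forecast_k5 = moving_average_forecast(sales, 5)  # averages the last five values
print(forecast_k3, forecast_k5)
```

A larger k averages over more past values, so the forecast reacts more slowly to recent changes but is less affected by random fluctuations.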
the alternative hypothesis is...
the claim that you're testing
What would be the coefficient of determination if the total sum of squares (SST) is 23.29 and the sum of squares due to regression (SSR) is 10.03?
the coefficient of determination equals the sum of squares due to regression divided by the total sum of squares (the total variation): r^2 = SSR / SST = 10.03 / 23.29 = 0.4307, so about 43.07% of the variation in the dependent variable (y) can be explained by the independent variable (x)
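The arithmetic can be checked in a few lines; the variable names are illustrative, and SSE is derived here from the identity SST = SSR + SSE.

```python
sst = 23.29  # total sum of squares, from the question
ssr = 10.03  # sum of squares due to regression, from the question

r_squared = ssr / sst  # coefficient of determination
sse = sst - ssr        # sum of squares due to error, via SST = SSR + SSE

print(f"r^2 = {r_squared:.4f}")
print(f"SSE = {sse:.2f}")
```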
Multicollinearity refers to...
the correlation among the independent variables in multiple regression analysis. As a common rule-of-thumb test, multicollinearity is a potential problem if the absolute value of the sample correlation coefficient exceeds 0.7 for any two of the independent variables
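One way to apply the rule of thumb is to compute the sample correlation for every pair of independent variables and flag the pairs beyond the threshold. This is only a sketch: the predictor names and values are hypothetical, and the check uses the absolute value so that strong negative correlations are flagged too.

```python
from itertools import combinations
from math import sqrt

def correlation(a, b):
    """Sample (Pearson) correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

def flag_multicollinearity(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for each pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = correlation(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Hypothetical independent variables (illustrative only).
predictors = {
    "miles": [100, 50, 100, 100, 50, 80, 75, 65, 90, 90],
    "gallons": [28, 12, 29, 27, 13, 21, 20, 17, 24, 25],
    "deliveries": [4, 3, 4, 2, 2, 2, 3, 4, 3, 2],
}
flagged = flag_multicollinearity(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```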
Horizontal patterns exist when...
the data fluctuate randomly around a constant mean over time
Estimated regression line is...
the graph of the estimated simple linear regression equation
exponential smoothing is...
A weighted-moving-average forecasting technique in which data points are weighted by an exponential function.
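A minimal sketch of the usual recursive form, forecast = alpha * (most recent actual) + (1 - alpha) * (most recent forecast). It assumes the common convention that the first forecast equals the first actual value; other initializations exist. The data and function name are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts for periods 2 through n+1 given n actual values.

    Assumed convention: the forecast for period 2 equals the actual
    value in period 1.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    forecasts = [series[0]]
    for actual in series[1:]:
        # Blend the newest actual value with the previous forecast.
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical sales history (illustrative only).
sales = [17, 21, 19, 23]
forecasts = exponential_smoothing(sales, alpha=0.2)
print([round(f, 3) for f in forecasts])
```

Unrolling the recursion shows why the weights are called exponential: the most recent value gets weight alpha, the one before it alpha * (1 - alpha), then alpha * (1 - alpha)^2, and so on.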
what is type 2 error?
Failing to reject the null hypothesis when it is false (false negative)
interval estimate is...
An interval, or range of values, used to estimate a population parameter; it provides information about how close the point estimate is to the value of the population parameter
Mean Absolute Percentage Error (MAPE) is the...
Average of the absolute value of percentage forecast errors
The population parameters that describe the y-intercept and slope of the line relating y and x, respectively, are _____.
β0 and β1
__________ is the amount by which the predicted value differs from the observed value of the time series variable.
Forecast error
what is the coefficient of determination
The coefficient of determination, r^2 = SSR / SST, is the proportion of the total variation in the dependent variable (y) that can be explained by the independent variable (x). In a simple linear regression with only one independent variable, it is the square of the coefficient of correlation.
A time series plot of a period of time (in years) versus revenue (in millions of dollars) is shown below. Which of the following data patterns best describes the scenario shown? (slowly trending upward)
Nonlinear trend pattern
Choose the correct mathematical model for calculating profit. Let q = production volume (quantity produced) R = revenue per unit FC = fixed cost of production MC = material cost per unit LC = labor cost per unit P(q) = total profit for producing (and selling) q units
P(q) = Rq - FC - (MC)q - (LC)q
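The model translates directly into a function. The parameter values in the example call are hypothetical.

```python
def profit(q, revenue_per_unit, fixed_cost, material_cost_per_unit, labor_cost_per_unit):
    """P(q) = R*q - FC - MC*q - LC*q for production volume q."""
    return (revenue_per_unit * q
            - fixed_cost
            - material_cost_per_unit * q
            - labor_cost_per_unit * q)

# Hypothetical inputs: R = 10, FC = 3000, MC = 2, LC = 3.
for q in (0, 600, 1000):
    print(q, profit(q, 10, 3000, 2, 3))
```

With these inputs the fixed cost is lost when nothing is produced, and profit rises by R - MC - LC = 5 for each additional unit.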
______ ______ can be used to develop forecasts for a time series that has a trend, a seasonal pattern, or both a trend and a seasonal pattern
Regression Analysis
In reviewing the graph below, which of the following inferences can be drawn about the monthly salary? (graph of bell-curve, 3000 as mean)
The average monthly salary is $3,000.
The scatter chart below displays the residuals versus the independent variable, x. Which of the following conclusions can be drawn based upon this scatter chart? (scatter makes a v about geometric mean)
The model fails to capture the relationship between the variables accurately.
what is p value?
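The probability, computed assuming the null hypothesis is true, of obtaining a sample result at least as extreme as the one observed. A p-value at or below the level of significance means the result is statistically significant (unlikely to be due to chance alone).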
level of significance
The probability of making a Type I error when the null hypothesis is true as an equality
Which of the following states the objective of time series analysis?
To uncover a pattern in a time series and then extrapolate the pattern into the future
Using the diagram below, which of the following would be a likely mathematical expression for Total Cost?
Total Cost = Fixed Cost + Total Variable Cost
The value of SSE is...
a measure of the error in using the estimated regression equation to predict the values of the dependent variable in the sample
Residual Plots are...
a scatter chart of the residuals e against the independent variable, used to assess whether the errors/residuals are symmetrically distributed and independent. This plot is important for determining whether the residuals of the regression model satisfy the conditions necessary for valid inference.
Time series is...
a sequence of observations on a variable measured at successive points in time or over successive periods of time
Stationary time series denote...
a time series whose statistical properties are independent of time: - the process generating the data has a constant mean - the variability of the time series is constant over time
Level of confidence is...
the percentage of all possible samples whose interval estimates (point estimate plus or minus the margin of error) contain the population parameter; informally, how confident we are that the interval estimate contains the population parameter
The _____ is a measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation.
coefficient of determination
______ collects data from every element in the population of interest
census
A ______ ______ exists if the time series plot shows an alternating sequence of points below and above the trendline that lasts for more than one year
cyclical pattern. Cyclical effects are often combined with long-term trend effects and referred to as trend-cycle effects.
A(n) _____ refers to a model input that can be controlled in a spreadsheet model.
decision variable
______ involves the organization, summarization and display of data
descriptive statistics - identifies patterns & relationships - does not extrapolate or generalize
For any given combination of values of the independent variables x1, x2, . . . , xq, the population of potential error terms ε is normally...
distributed with a mean of 0 and a constant variance
______ involves drawing inferences about two contrasting propositions (each called a hypothesis) relating to the value of one or more population parameters
hypothesis testing
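A sketch of one simple case, a two-tailed z test for a population mean. It assumes the population standard deviation is known; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, hypothesized_mean, sigma, n, alpha=0.05):
    """Two-tailed z test of H0: mu = hypothesized_mean vs Ha: mu != hypothesized_mean.

    Assumption: the population standard deviation sigma is known.
    Returns (z statistic, p-value, whether H0 is rejected at level alpha).
    """
    z = (sample_mean - hypothesized_mean) / (sigma / sqrt(n))
    # Probability of a result at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value <= alpha

# Hypothetical example: H0 says the mean monthly salary is 3000.
z, p_value, reject = one_sample_z_test(
    sample_mean=3080, hypothesized_mean=3000, sigma=250, n=36
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Here the p-value is slightly above 0.05, so the null hypothesis is not rejected at that level of significance; a sample mean further from 3000 would give a smaller p-value.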
The values for random variables in a Monte Carlo simulation are ______.
generated randomly from probability distributions
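A small sketch of the idea: each pass through the loop draws one value for every random variable and records the resulting output. The distributions, the profit formula, and all numbers are hypothetical.

```python
import random

def simulate_profit(n_trials, seed=42):
    """Run n_trials of a toy profit model and return the list of profits."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    profits = []
    for _ in range(n_trials):
        # Draw one value for each random variable from its assumed distribution.
        demand = rng.normalvariate(1000, 200)  # hypothetical: normal demand
        unit_cost = rng.uniform(4, 6)          # hypothetical: uniform unit cost
        # Hypothetical model: price 10 per unit, fixed cost 3000.
        profits.append((10 - unit_cost) * demand - 3000)
    return profits

profits = simulate_profit(10_000)
mean_profit = sum(profits) / len(profits)
prob_loss = sum(p < 0 for p in profits) / len(profits)
print(f"mean profit ~ {mean_profit:.0f}, P(loss) ~ {prob_loss:.3f}")
```

The output is a distribution of profits rather than a single number, which is what allows estimates such as the probability of a loss.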
confidence level is...
how frequently interval estimates based on samples of the same size taken from the same population using identical sampling techniques will contain the true value of the parameter we are estimating
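This long-run frequency reading can be illustrated by simulation: draw many samples from a population whose mean is known by construction, build an interval from each, and count how often the interval contains the true mean. The population parameters and function name are hypothetical, and sigma is treated as known so a z interval applies.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage_rate(n_samples=2000, sample_size=30, confidence=0.95, seed=1):
    """Fraction of simulated interval estimates that contain the true mean."""
    rng = random.Random(seed)
    mu, sigma = 50.0, 10.0  # hypothetical population, known only because we simulate it
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(sample_size)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / n_samples

rate = coverage_rate()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The share should land near 0.95, though any single interval either contains the parameter or does not.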
The coefficient of determination _____.
is used to evaluate the goodness of fit
A _____ decision is one in which companies have to decide whether they should manufacture a product or outsource production to another firm.
make-versus-buy
Spreadsheet models are...
mathematical and logic-based models referred to as "what-if models"
MAE is the...
mean absolute error: the average of the absolute values of the forecast errors; a measure of forecast accuracy that avoids the problem of positive and negative forecast errors offsetting one another
MSE is the...
mean squared error: the average of the squared forecast errors; a measure of forecast accuracy that avoids the problem of positive and negative errors offsetting each other
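A short sketch computing MAE and MSE, plus MAPE for comparison, from the same forecast errors. It takes forecast error as actual minus forecast and assumes no actual value is zero (MAPE divides by the actual values). The data are hypothetical.

```python
def forecast_accuracy(actual, forecast):
    """Return MAE, MSE, and MAPE (in percent) for paired actual/forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]  # forecast error per period
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n * 100
    return {"MAE": mae, "MSE": mse, "MAPE": mape}

# Hypothetical data chosen so the raw errors (-2, 1, -4, 5) sum to zero.
actual = [20, 25, 40, 50]
forecast = [22, 24, 44, 45]
measures = forecast_accuracy(actual, forecast)
print(measures)
```

The plain average of these errors is zero even though every forecast missed, which is the offsetting problem the absolute and squared versions avoid.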
A description of the range and relative likelihood of possible values of an uncertain variable is known as a _____.
probability distribution
Seasonal patterns are...
recurring patterns over successive periods of time. e.g. daily traffic volume shows within-the-day "seasonal" behavior; e.g. a manufacturer of swimming pools expects low sales activity in the fall and winter months, with peak sales in the spring and summer months, every year
what is type 1 error?
rejecting the null hypothesis when it is true (false positive)
_____ can accurately reflect the characteristics of the entire population
representative sample
The Least Squares Method uses...
sample data to find the estimated regression equation relating the independent variables to the dependent variable of a dataset
______ overcomes the potential difficulty with taking a census.
sampling
The Coefficient of Determination is...
the ratio SSR/SST used to evaluate the goodness of fit for the estimated regression equation. It gives the proportion of the variation in the dependent variable that can be explained by the estimated regression equation. The higher the ratio and the smaller the SSE, the better the fit.
The level of significance indicates...
the strength of evidence that is needed in the sample data before rejection of the null hypothesis
sum of squares due to regression (SSR)
the sum of squared differences between the predicted values and the sample mean of the dependent variable. SST = SSR + SSE. When SSR = SST (so SSE = 0), we have a perfect fit and the coefficient of determination equals 1.
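The identity SST = SSR + SSE can be checked numerically. The sketch below uses hypothetical data together with its least squares line (intercept 2.2, slope 0.6); the identity holds for the least squares line, not for an arbitrary line.

```python
# Hypothetical sample data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least squares intercept and slope for this data

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # predicted values

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained by the regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # left unexplained

r_squared = ssr / sst
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}, r^2 = {r_squared:.2f}")
```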
target population is...
the total group to be studied or described and from which samples may be drawn; the population about which we want to make inferences
total sum of squares (SST)
the total variability in a set of data; calculated by subtracting the mean from each score, squaring the differences, and summing them
Excel's Goal Seek determines...
the value of an input that will cause the value of a related output cell to equal some specified value (e.g. the quantity that satisfies the goal of zero savings due to outsourcing)
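The same kind of search can be sketched outside a spreadsheet. The version below uses bisection, which is an assumption made for illustration and not a description of how Excel implements Goal Seek. The cost figures and function names are hypothetical.

```python
def goal_seek(f, target, low, high, tol=1e-6, max_iter=200):
    """Find an input in [low, high] where f(input) is within tol of target.

    Assumption: f is continuous and the target is bracketed, i.e.
    f(low) - target and f(high) - target have opposite signs.
    """
    g_low = f(low) - target
    g_high = f(high) - target
    if g_low * g_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        g_mid = f(mid) - target
        if abs(g_mid) < tol:
            return mid
        if g_low * g_mid <= 0:
            high = mid               # the solution lies in the lower half
        else:
            low, g_low = mid, g_mid  # the solution lies in the upper half
    return (low + high) / 2

def outsourcing_savings(q):
    # Hypothetical: making costs 234000 fixed plus 2.00 per unit; buying costs 3.50 per unit.
    return (234_000 + 2.0 * q) - 3.5 * q

# Quantity at which the savings due to outsourcing equal zero.
q_star = goal_seek(outsourcing_savings, target=0.0, low=0, high=500_000)
print(f"quantity with zero savings: {q_star:.0f}")
```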
Which of the following statements is the objective of the moving averages and exponential smoothing methods?
to smooth out random fluctuations in the time series
A _____ _____ shows gradual shifts or movements to relatively higher or lower values over a longer period of time
trend pattern
A set of values for the random variables is called a(n) _____.
trial
a level of significance (α), at 0.05 means...
we accept a 5% probability of rejecting the null hypothesis when it is actually true (a Type I error); we reject the null hypothesis only when the p-value is 0.05 or less