MSIS Exam 2 Nord ok state
Random variable
A numerical description of the outcome of an experiment. May be continuous or discrete.
Standard residuals
are residuals divided by their standard deviation and describe how far each residual is from its mean in units of standard deviations.
A linear regression model
with more than one independent variable is called a multiple linear regression.
Common types of mathematical functions used in predictive analytical models include:
-Linear functions: show a steady increase or decrease over the range of X; the simplest type of function used in predictive models.
-Logarithmic functions: used when the rate of change in a variable increases or decreases quickly and then levels out, such as with diminishing returns to scale.
-Polynomial functions: a second-order polynomial is parabolic and has only one hill or valley; a third-order polynomial has one or two hills or valleys. Revenue models that incorporate price elasticity are often polynomial functions.
-Exponential functions: have the property that Y rises or falls at constantly increasing rates.
-Power functions: define phenomena that increase at a specific rate and are represented by the formula y = ax^b. Learning curves that express improving times in performing a task are often modeled with power functions having a > 0 and b < 0.
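The function families listed above can be sketched in Python. Only the power form y = ax^b comes from these notes; the other functional forms are common textbook forms assumed here, and every coefficient value is hypothetical, chosen only for illustration.

```python
import math

# All default coefficients below are hypothetical illustration values.

def linear(x, a=2.0, b=0.5):
    """Assumed form y = a + b*x: steady increase or decrease."""
    return a + b * x

def logarithmic(x, a=1.0):
    """Assumed form y = a*ln(x): changes quickly, then levels out (x > 0)."""
    return a * math.log(x)

def polynomial2(x, a=-1.0, b=4.0, c=0.0):
    """Assumed second-order form y = a*x^2 + b*x + c: one hill or valley."""
    return a * x**2 + b * x + c

def exponential(x, a=1.0, b=1.5):
    """Assumed form y = a*b^x: rises or falls at an increasing rate."""
    return a * b**x

def power(x, a=10.0, b=-0.3):
    """y = a*x^b from the notes; a > 0 and b < 0 gives a learning curve."""
    return a * x**b

# Learning-curve example: task time falls as the number of repetitions grows.
times = [power(n) for n in (1, 2, 4, 8)]
print(times)
```

With a > 0 and b < 0 the values in `times` shrink as the repetition count grows, which is the learning-curve behavior described in the notes.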
Normal distribution
A continuous distribution described by the familiar bell-shaped curve. It is characterized by two parameters, the mean and the standard deviation, and has the following properties: the distribution is symmetric, so its measure of skewness is 0; the mean, median, and mode are all equal, so half the area falls above the mean and half falls below it; the range of X is unbounded, meaning that the tails of the distribution extend to negative and positive infinity; and the empirical rules (68-95-99.7) apply exactly for the normal distribution.
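The 68-95-99.7 figures can be checked with Python's standard library. This is a minimal sketch; the helper name `prob_within` is hypothetical, and the figures quoted in the notes are rounded versions of the values it computes.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The result does not depend on the particular mean or standard deviation, because the interval is defined in units of standard deviations.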
Multiple regression
A regression model that involves two or more independent variables
Multiple R and R square
Are called the multiple correlation coefficient and the coefficient of multiple determination, respectively. They indicate the strength of association between the dependent and independent variables. As in simple linear regression, R squared gives the percentage of variation in the dependent variable that is explained by the set of independent variables in the model.
Excel's regression tool
Can be used for both simple and multiple linear regressions. With this tool, you can specify the range of dependent variable values, include labels, set a confidence level, and force the intercept to zero by checking Constant is Zero.
Expected value of a random variable
Corresponds to the notion of the mean, or average for a sample. Expected value can be helpful in making a variety of decisions.
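As a minimal sketch, the expected value of a discrete random variable can be computed as a probability-weighted sum of its outcomes. The function name and the fair-die example are assumptions chosen for illustration.

```python
def expected_value(outcomes, probs):
    """Return the probability-weighted average of the outcomes."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probs))

# Hypothetical example: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
ev = expected_value(faces, [1 / 6] * 6)
print(ev)
```

The die never actually shows its expected value of 3.5, which illustrates that the expected value is a long-run average rather than a likely single outcome.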
Continuous random variable
Has outcomes over one or more continuous intervals of real numbers.
Classical definition of probability
If the process that generates the outcomes is known, probabilities can be deduced from theoretical arguments.
Interaction
In regression, this occurs when the effect of one variable is dependent on another variable.
Standard error
In the Excel output, this is the variability of the observed Y-values from the predicted values. It is formally called the standard error of the estimate. If the data are clustered close to the regression line, the standard error will be small; the more scattered the data, the larger the standard error.
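A minimal sketch of the standard error of the estimate for a simple linear regression follows. The data values are hypothetical, and the formula assumed is the usual one for a single independent variable, sqrt(SSE / (n - 2)).

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
# Residuals: observed Y minus the Y predicted by the fitted line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)
# n - 2 degrees of freedom: two coefficients were estimated.
std_error = math.sqrt(sse / (len(x) - 2))
print(round(slope, 2), round(intercept, 2), round(std_error, 4))
```

Scattering the Y-values further from the line increases the squared residuals and therefore the standard error.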
Index
Indicators are often combined quantitatively into an index, a single measure that weights multiple indicators, thus providing a measure of overall expectation. Indexes do not provide complete forecasts but rather a better picture of the direction of change, and thus play an important role in judgmental forecasting.
Delphi method
Is a popular judgmental forecasting approach that uses a panel of experts, whose identities are typically kept confidential from one another, to respond to a sequence of questionnaires. After each round of responses, individual opinions, edited to ensure anonymity, are shared, allowing each expert to see what the others think. The Delphi method promotes unbiased exchanges of ideas and usually results in some convergence of opinion.
Regression analysis
Is a tool for building mathematical and statistical models that characterize relationships between a dependent variable and one or more independent, or explanatory, variables, all of which are numerical. Two categories: regression models of cross-sectional data, and regression models of time-series data, in which the independent variables are time or some function of time and the focus is on predicting the future.
Outlier
Is an extreme value that is different from the rest of the data
Relative Frequency Definition (probability)
Is based on empirical data. The probability that an outcome will occur is simply the relative frequency associated with that outcome.
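A minimal sketch of the relative frequency definition: count how often each outcome occurs and divide by the number of observations. The observations below are hypothetical.

```python
from collections import Counter

def relative_frequencies(observations):
    """Estimate each outcome's probability as count / total observations."""
    counts = Counter(observations)
    n = len(observations)
    return {outcome: count / n for outcome, count in counts.items()}

# Hypothetical record of which product ten customers bought.
purchases = ["A", "B", "A", "C", "A", "B", "A", "A", "C", "B"]
probs = relative_frequencies(purchases)
print(probs)
```

Because every observation is counted exactly once, the estimated probabilities always sum to 1.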
Subjective definition of probability
Is based on judgment and experience, as when sports experts predict, at the start of the football season, the probability of a specific team winning the national championship. Such estimates rest on experience and are not always right.
Discrete random variable
Is one for which the number of possible outcomes can be counted.
Normality of errors
Regression analysis assumes that the errors for each individual value of X are normally distributed, with a mean of zero. This can be verified by examining a histogram of the standard residuals and inspecting for a bell-shaped distribution, or by using more formal goodness-of-fit tests. It is usually difficult to evaluate normality with small sample sizes.
Observed errors
Residuals are the observed errors: the differences between the actual values of the dependent variable and the values estimated by the regression equation. Observed errors can be negative or positive.
Variance
The average of the squared deviations of the observations from the mean; a common measure of dispersion. The variance measures the uncertainty of the random variable; the higher the variance, the higher the uncertainty of the outcome.
Probability density functions
The distribution that characterizes the outcomes of a continuous random variable. Properties of probability density functions: the graph of the density function must lie at or above the x-axis; the total area under the density function above the x-axis is 1; and because a continuous random variable has an infinite number of possible values, probabilities are calculated for intervals, such as between two numbers or to the left or right of a number, rather than for single points. P(a ≤ X ≤ b) is the area under the density function between a and b.
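The "probability is area" idea can be sketched numerically. The density f(x) = 2x on [0, 1] is a hypothetical example chosen because its areas are easy to verify by hand; the trapezoidal rule is simply one convenient way to approximate the area.

```python
def density(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area_between(f, a, b, steps=1000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

total_area = area_between(density, 0.0, 1.0)  # should be 1 for a valid density
prob = area_between(density, 0.25, 0.5)       # P(0.25 <= X <= 0.5)
print(round(total_area, 6), round(prob, 6))
```

By hand, the area between 0.25 and 0.5 is 0.5^2 - 0.25^2 = 0.1875, which matches the numerical result.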
Binomial distribution
The probability distribution of the number of successes in n independent trials, each having two possible outcomes with a constant probability of occurrence. Can assume different shapes and amounts of skewness, depending on the parameters.
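A minimal sketch of how the shape of the binomial distribution depends on its parameters, using the standard formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The values n = 10, p = 0.5, and p = 0.2 are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# p = 0.5 gives a symmetric shape; p = 0.2 piles probability on small counts.
symmetric = [binomial_pmf(k, 10, 0.5) for k in range(11)]
skewed = [binomial_pmf(k, 10, 0.2) for k in range(11)]

print([round(v, 3) for v in symmetric])
print([round(v, 3) for v in skewed])
```

With p = 0.5 the probabilities mirror each other around k = 5, while with p = 0.2 most of the probability sits at low values of k, giving a skewed shape.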
Simple regression
a regression model that involves a single independent variable
Cyclical effects
describe ups and downs over a much longer time frame, such as several years.
Time-series model
is based on a time series, a stream of historical data such as weekly sales. Time series may exhibit short-term seasonal effects as well as longer-term cyclical effects or nonlinear trends.
Seasonal effect
is one that repeats at fixed intervals of time, typically a year, month, week, or day.
The variance of a discrete random variable
may be computed as a weighted average of the squared deviations from the expected value. A common measure of dispersion.
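The weighted-average calculation can be sketched as follows, with each squared deviation from the expected value weighted by its probability. The function name and the fair-die example are assumptions chosen for illustration.

```python
def discrete_variance(outcomes, probs):
    """Probability-weighted average of squared deviations from the expected value."""
    mean = sum(x * p for x, p in zip(outcomes, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))

# Hypothetical example: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
var = discrete_variance(faces, [1 / 6] * 6)
print(round(var, 4))
```

A random variable that always takes the same value has variance 0, consistent with the idea that variance measures uncertainty.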
Historical analogy
one judgmental approach in which a forecast is obtained through a comparative analysis with a previous situation. For example, if a new product is being introduced, the response of consumers to marketing campaigns for similar, previous products can be used as a basis to predict how the new marketing campaign might fare. A great deal of insight can be gained through an analysis of past experiences.
Excel's trendline tool
provides a convenient method for determining the best-fitting functional relationship among these alternatives for a set of data. R-squared is a measure of the "fit" of the line to the data and will have a value between 0 and 1. The larger the value of R-squared, the better the fit.
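R-squared for a linear trendline can be sketched in Python as 1 - SSE/SST. This is a hand calculation under assumed data, not a reproduction of Excel's internals; the data values and the function name are hypothetical.

```python
def r_squared_linear(x, y):
    """R-squared of the least-squares line through (x, y): 1 - SSE/SST."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    # SSE: variation left unexplained by the line; SST: total variation in y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical weekly data, chosen only for illustration.
weeks = [1, 2, 3, 4, 5, 6]
sales = [3, 5, 4, 8, 7, 10]
r2 = r_squared_linear(weeks, sales)
print(round(r2, 4))
```

Data that fall exactly on a straight line give an R-squared of 1, the best possible fit.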
The variance measures
the uncertainty of the random variable; the higher the variance, the higher the uncertainty of the outcome