C207
The two main types of research design are observational studies and experimental studies.
1. Observational studies are also known as quasi-experimental studies. An observational study is sometimes used because it is impractical or impossible to control the conditions of the study. Examples include mystery shopping and a prospective cohort study, which observes people going forward in time from the time of their entry into the study.
2. Experimental studies - all variable measurements and manipulations are under the researcher's control, including the subjects or participants. For example, a study of the impact of price changes on consumers.
There are three elements to an experimental study:
1. Experimental units - subjects or objects under observation
2. Treatments - the procedures applied to each subject
3. Responses - the effects of the experimental treatments
Decision Tree
A decision tree is a decision analysis tool that shows a number of options, the paths by which each of these options may be reached, and the possible consequences of choosing each option. A decision tree analysis is designed to establish a logical sequence for decisions, to consider the decision alternatives available, and to evaluate the results they will produce. Decision trees are valuable for risk assessment when a limited number of potential outcomes are possible.
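One common way to evaluate the results of each option is to weight every possible outcome by its probability and compare the totals. The sketch below assumes that expected-value approach; the option names, probabilities, and payoffs are hypothetical and chosen only for illustration.

```python
# Each option maps to its possible outcomes as (probability, payoff) pairs.
# All names and numbers here are made up for illustration.
options = {
    "launch product": [(0.6, 100_000), (0.4, -40_000)],
    "do nothing": [(1.0, 0)],
}


def expected_value(outcomes):
    """Probability-weighted sum of the payoffs on one branch of the tree."""
    return sum(probability * payoff for probability, payoff in outcomes)


# Evaluate every branch, then pick the option with the highest expected value.
expected_values = {name: expected_value(o) for name, o in options.items()}
best_option = max(expected_values, key=expected_values.get)

for name, value in expected_values.items():
    print(f"{name}: {value:,.0f}")
print("Best option:", best_option)
```

Because each option has only a handful of outcomes, the whole tree can be written out and compared directly, which is the situation where the notes say decision trees are most useful.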
8 quality management principles
1. A focus on customers
2. A commitment to strong leadership
3. Engaged colleagues
4. A focus on process
5. A systems approach
6. A commitment to continuous improvement
7. Dedication to fact-based decision making
8. A collaborative relationship with suppliers
A focus on customers - will lead to increased loyalty from customers, which in turn will lead to an increase in revenue and market share as customers return for repeat business.
A commitment to strong leadership - will increase the team's understanding of its objectives and minimize miscommunication among project participants.
Engaged colleagues - will be more motivated and committed to project success and continued improvement.
A focus on process - results are easier to manage and achieve. A process approach also helps to create consistent, predictable outcomes and uncovers opportunities for improvement.
A systems approach - will help to ensure consistency and efficiency among organization-wide activities.
A commitment to continuous improvement - this (EE) commitment should benefit both the individual and the organization by increasing skills and effectiveness.
Dedication to fact-based decision making - organizations can be sure that options are chosen because they are best for the project and organization. These informed decisions reduce bias and foster trust in decisions and plans.
A collaborative relationship with suppliers - will improve the capability and performance of quality processes and help you realize benefits that you might not have realized otherwise.
Attribute vs Variable data
Attribute - Quick and easy to collect. Attribute data is collected to show whether a result meets a requirement or not. Think of this information as answers to a "yes or no" question or a "pass/fail" test.
Variable - More specific data that is harder to collect. Tests how well a result meets a requirement.
For example, a company produces 5-pound bags of flour and is trying to determine the actual production weights for its bags. If the company were to use attribute data, it could determine the number of bags that have at least 4.8 pounds of flour: each bag either has at least 4.8 pounds of flour or it does not. If the company were to use variable data, it could take the weight of each bag and determine the mean and standard deviation of the production weights.
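The flour example can be sketched in a few lines. The bag weights below are hypothetical values made up for illustration; only the 4.8-pound requirement comes from the example above.

```python
from statistics import mean, stdev

# Hypothetical measured bag weights in pounds (illustrative values only).
weights = [5.1, 4.9, 4.7, 5.0, 5.2, 4.8, 4.6, 5.0]
REQUIREMENT = 4.8  # minimum acceptable weight from the flour example

# Attribute data: each bag is reduced to a pass/fail answer.
passes = [w >= REQUIREMENT for w in weights]
pass_count = sum(passes)

# Variable data: the actual measurements are kept and summarized.
average_weight = mean(weights)
weight_spread = stdev(weights)  # sample standard deviation

print(f"Attribute view: {pass_count} of {len(weights)} bags pass")
print(f"Variable view: mean = {average_weight:.4f} lb, "
      f"standard deviation = {weight_spread:.4f} lb")
```

The attribute view only says how many bags passed, while the variable view shows how far the weights sit from the 5-pound target and how much they vary.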
Common Cause Variation vs. Special Cause Variation
Common cause variation - variation that is accepted as part of the normal process because it falls within the amounts that users will tolerate.
Special cause variation - large variation that is not accepted. The results are not caused by the "natural" variation and need to be fixed.
ordinary least squares (OLS) regression
Earlier we discussed how a regression line is the "line of best fit" for a scatterplot. This is determined using an approach called "least squares." In fact, linear regression is often referred to as ordinary least squares (OLS) regression. The line of best fit minimizes the vertical distances between that line and the data points in the scatterplot. More accurately, the sum of these distances squared is minimized (i.e., least squares). The distances are squared so that points below the regression line (which have negative distances) contribute in the same way as points above the regression line (which have positive distances). Squaring also gives greater weight to points that are farther from the line.
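A minimal sketch of simple OLS with one independent variable, using the standard closed-form formulas for slope and intercept. The function names and the five data points are illustrative assumptions, not part of the course material.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical scatterplot data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = ols_fit(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)

# Any other line, such as one with a slightly different slope, does worse.
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"sum of squared distances: best line = {best:.2f}, nudged line = {nudged:.2f}")
```

Comparing the fitted line against a nudged one shows what "least squares" means in practice: no other straight line gives a smaller sum of squared vertical distances for this data.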
2. Homoscedasticity
Homoscedasticity occurs when all of the random variables have the same general finite variance. In other words, the data points in a scatterplot stay approximately the same distance away from the regression line throughout the entire dataset. For ordinary least squares (OLS) regression analysis, homoscedasticity is assumed.
Homo - same. Scedasticity - variance.
Homoscedasticity example: a line with points scattered equally above and below the line.
Time series analysis
Time series analysis is a technique where time is used as an independent variable to assess any influence it may have on an output. Recall that regression analysis allows for one or more independent variables, but requires a single dependent variable. Whether there is one independent variable (simple regression) or more than one independent variable (multiple regression), a time series analysis may be applicable when an independent variable represents time. For example, continuously monitoring heart rate.
Time series analysis data patterns
Trend - A general slope upward or downward over a long period of time.
Cyclicality - Repetition of up (peaks) or down (troughs) movements that follow or counteract a business cycle that can last several years.
Seasonality - Regular pattern of volatility, usually within a single year.
Irregularity - One-time deviations from expectations caused by unforeseen circumstances such as war, natural disasters, poor weather, labor strikes, single-occurrence company-specific surprises, or macroeconomic shocks.
Random variation - The variability of a process which might be caused by irregular fluctuations due to chance that cannot be anticipated, detected, or eliminated.
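To see how a trend and a seasonal pattern can sit in the same series, the sketch below averages over one full season at a time. The moving average is a technique brought in here for illustration, not one named in these notes, and the quarterly figures are hypothetical.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values in the series."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical quarterly sales: an upward trend of 1 unit per quarter
# plus a repeating seasonal offset of +2, -1, -2, +1 within each year.
quarterly_sales = [12, 10, 10, 14, 16, 14, 14, 18]

# A window of 4 quarters covers one full season, so the seasonal
# offsets cancel out and only the trend remains.
smoothed = moving_average(quarterly_sales, window=4)

print("raw:     ", quarterly_sales)
print("smoothed:", smoothed)
```

The raw values move up and down within each year (seasonality), while the smoothed values rise by the same amount every quarter (trend).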
3. Heteroscedasticity
When heteroscedasticity occurs, the random variables have unequal variances. In other words, the data points in a scatterplot tend to be spread to varying distances from the regression line depending on the location of the data point on the line.
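One informal way to tell homoscedasticity from heteroscedasticity is to compare how spread out the residuals (vertical distances from the regression line) are at low and high values of the independent variable. The sketch below is a rough check under that assumption, not a formal statistical test; the function name and both residual sets are hypothetical.

```python
from statistics import pvariance


def spread_ratio(xs, residuals):
    """Variance of residuals in the upper half of x divided by the lower half.

    A ratio near 1 suggests a roughly constant spread (homoscedasticity);
    a ratio far from 1 suggests the spread changes along the line
    (heteroscedasticity).
    """
    ordered = [r for _, r in sorted(zip(xs, residuals))]
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return pvariance(upper) / pvariance(lower)


xs = list(range(1, 11))

# Hypothetical residuals with the same spread everywhere.
constant_spread = [1 if x % 2 else -1 for x in xs]

# Hypothetical residuals that fan out as x grows.
growing_spread = [0.1 * x * (1 if x % 2 else -1) for x in xs]

homo_ratio = spread_ratio(xs, constant_spread)
hetero_ratio = spread_ratio(xs, growing_spread)

print(f"constant spread ratio: {homo_ratio:.2f}")
print(f"growing spread ratio:  {hetero_ratio:.2f}")
```

In a scatterplot, the second case would look like a funnel: points hug the line at one end and scatter widely at the other.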
Problems with regression analysis: 1. Autocorrelation
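Autocorrelation occurs when the values in a series are correlated with earlier values in the same series. It is a particular concern when performing time series multiple regression. For example, the temperature from one day to the next.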
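The day-to-day temperature example can be made concrete by measuring how strongly each value tracks the one before it. The sketch below computes a lag-1 autocorrelation coefficient; the function name and both temperature series are hypothetical.

```python
def lag1_autocorrelation(series):
    """Correlation between each value and the value immediately before it."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum(
        (series[i] - mean) * (series[i - 1] - mean) for i in range(1, n)
    )
    denominator = sum((value - mean) ** 2 for value in series)
    return numerator / denominator


# Hypothetical daily temperatures that drift gradually, so each day
# resembles the day before.
gradual_temps = [60, 62, 63, 65, 66, 68, 69, 71]

# Hypothetical readings that flip back and forth from day to day.
alternating_temps = [60, 70, 60, 70, 60, 70, 60, 70]

r_gradual = lag1_autocorrelation(gradual_temps)
r_alternating = lag1_autocorrelation(alternating_temps)

print(f"gradual series:     {r_gradual:.3f}")
print(f"alternating series: {r_alternating:.3f}")
```

A value well above zero, as in the gradual series, means consecutive observations are not independent of each other, which is the situation the notes flag as a problem for regression.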
Multiple linear regression
Multiple linear regression is an analysis of how multiple independent variables affect one dependent variable.