What is Data Modelling

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

Simple Regression

A predictor based on the mean of one variable for best fit from the observed data. An approach for modelling the relationship between a dependent variable y and one or more explanatory variables (or independent variables) denoted X. linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model (that is, vertical distances between the points of the data set and the fitted line) as small as possible. The slope of the fitted line is equal to the correlation between y and x corrected by the ratio of standard deviations of these variables. The intercept of the fitted line is such that it passes through the mean (x, y) of the data points.

Multipul Regression

For more than one explanatory variable, the process is called multiple linear regression.

Regression

Predicts Y from X. The relation between selected values of x and observed values of y (from which the most probable value of y can be predicted for any value of x).

Y

The outcome, dependant variable, criterion variable

X

The variable, the predictor

Residuals

The vertical distances from the line to the points they are the "left-over" variation after a regression line is fit. = observed y - predicted y The relationship between the data points and the model data = model + error

Univariant

_______ one dependant variable

Questions about the Data

different formed questions about exploration or the relationships amongst variables, use the same underling statistical approach and come to the same significant conclusions.

Causation

only can design demonstrate causality, because analyses are essentially correlational. Analysis can be used to infer causal relationships between the independent and dependent variables.However this can lead to illusions or false relationships, so caution is advisable; for example, correlation does not imply ________.

Statistical Analysis Examine

relationships among variables and seek to understand the nature of the relationship, statistical analysis are correlational in nature. is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables (or 'predictors')

Models

statistical representation of a relationship between variables in diagrams, (scatterplot). How well the line fits the data tells us how accurate the model is.

GLM

t-test, regression,ANOVA, correlation statistical procedures that fall under, all of these procedures are basically the same

Linear Regression

the goal is prediction, or forecasting, or error reduction, used to fit a predictive model to an observed data set of y and X values. After developing such a model, if an additional value of X is then given without its accompanying value of y, the fitted model can be used to make a prediction of the value of y. Statistical method for finding the "line of best fit".

Modelling

the level of measurement is irrelevant, predictors can be continuously measured variables or categorical. From a statistical view the nature of data underling the predictors is irrelevant. There is flexibility with answering an array of questions within the same basic framework.


Set pelajaran terkait

Public Speaking Final 2500 Clemson University

View Set

Series 7 - Bonds Practice Questions

View Set

Chapt. 3 & 4 Hospitality financial management

View Set

Social Psych Exam 2 (Chapters 6, 7, 8)

View Set

Good Manufacturing Practices Test 1

View Set

First 60 Elements of the Periodic Table

View Set