OBA 311 Week 2

Identifying Outliers

There is no standard definition of an outlier. A common rule of thumb: an observation whose z-score is greater than +3 or less than -3.
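
A minimal Python sketch of the z-score rule of thumb. The function name `zscore_outliers`, the sample data, and the choice of the population standard deviation are assumptions made for illustration, not part of the course material.

```python
from statistics import mean, pstdev

def zscore_outliers(data, threshold=3.0):
    """Return the values whose z-score is above +threshold or below -threshold."""
    mu = mean(data)
    sigma = pstdev(data)  # assumption: population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be flagged
    return [x for x in data if abs((x - mu) / sigma) > threshold]

# Hypothetical data: 19 values between 10 and 13, plus one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
        12, 10, 11, 13, 12, 11, 10, 12, 13, 95]
print(zscore_outliers(data))
```

In very small data sets no observation can reach a z-score of 3, so the rule is only informative when there are enough observations.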

Sample

A subset of the population, used to obtain enough information to draw a valid inference about the population.

Population

ALL items of interest for a particular investigation

Mean

The average: the sum of the observations divided by the number of observations. Outliers can affect the value of the mean.

Measures of location

Mean, Median, Mode, Midrange

Covariance

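A measure of the linear ASSOCIATION between two variables X and Y: the average of the products of their deviations from their respective means. The population covariance divides by N; the sample covariance divides by N - 1.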
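
The difference between the two denominators can be sketched in Python as follows. The function name `covariance`, its `sample` flag, and the data are hypothetical and chosen only to illustrate the definition.

```python
def covariance(xs, ys, sample=True):
    """Covariance of paired data; divides by N - 1 for a sample, N for a population."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from each mean.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(covariance(xs, ys, sample=True))   # 5.0 (20 / 4)
print(covariance(xs, ys, sample=False))  # 4.0 (20 / 5)
```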

Samples Variability

Sample statistics are sensitive to sample size. Different samples from the same population will have different histogram shapes, means, standard deviations, etc.

Median

The middle value when the data are arranged from least to greatest: half of the observations lie below it and half above. Not affected by outliers.

Interquartile Range (IQR)

The difference between the 3rd and 1st quartiles (Q3 - Q1). Includes only the middle 50% of the data and is not influenced by extreme values.
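
A short Python sketch of the IQR using the standard library. Quartile conventions differ between textbooks and software; this example assumes the default ("exclusive") method of `statistics.quantiles`, and the helper name `iqr` and the data are hypothetical.

```python
from statistics import quantiles

def iqr(data):
    """Interquartile range Q3 - Q1, using the default quartile method of statistics.quantiles."""
    q1, _q2, q3 = quantiles(data, n=4)
    return q3 - q1

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]

print(iqr(clean))         # 6.0 (Q1 = 3, Q3 = 9)
print(iqr(with_outlier))  # 6.0, the extreme value does not change the IQR
```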

Association

Two variables have a strong statistical relationship if they appear to move together. A statistical relationship can occur even when there is no cause-and-effect link (e.g., ice cream sales and murders).

Correlation

A measure of the linear RELATIONSHIP between two variables that does not depend on the units of measurement. The correlation coefficient is scaled between -1 and 1.
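
The unit independence can be checked with a small Python sketch. It assumes the Pearson correlation coefficient is the one meant here; the function name `pearson_r` and the height and weight figures are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data (always between -1 and 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 85]
heights_in = [h / 2.54 for h in heights_cm]  # same heights, different unit

r_cm = pearson_r(heights_cm, weights_kg)
r_in = pearson_r(heights_in, weights_kg)
print(round(r_cm, 6) == round(r_in, 6))  # changing the unit leaves r unchanged
```

The covariance of the same data would change when centimeters are converted to inches, which is why the correlation coefficient is easier to compare across data sets.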

Statistical Thinking

A philosophy of learning and action for improvement based on the principles that: - all work occurs in a system of interconnected processes - variation exists in all processes - better performance results from reducing variation

Variance

The average of the squared deviations from the mean. For a population the denominator is N; for a sample the denominator is N - 1.
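
A minimal Python sketch of the two variance formulas, with the standard deviation taken as the square root. The function name `variance`, its `sample` flag, and the data are assumptions for illustration.

```python
from math import sqrt

def variance(data, sample=True):
    """Variance of the data; divides by N - 1 for a sample, N for a population."""
    n = len(data)
    mu = sum(data) / n
    squared_deviations = sum((x - mu) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32

print(variance(data, sample=False))        # 4.0 (32 / 8)
print(variance(data, sample=True))         # 32 / 7, slightly larger
print(sqrt(variance(data, sample=False)))  # 2.0, the population standard deviation
```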

Skewness

Describes the lack of symmetry of the data. Distributions that tail off to the RIGHT are positively skewed; distributions that tail off to the LEFT are negatively skewed.

Chebyshev's Theorem for standard deviation

For any data set, the proportion of values that lie within k standard deviations of the mean (k > 1) is at least 1 - 1/k^2. For k = 2: 1 - 1/2^2 = 0.75, so at least 75% of values lie within 2 standard deviations of the mean.
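
The bound can be compared with an actual data set in a short Python sketch. The function names `chebyshev_bound` and `proportion_within` and the data are hypothetical; the population standard deviation is assumed.

```python
from statistics import mean, pstdev

def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def proportion_within(data, k):
    """Observed proportion of values within k standard deviations of the mean."""
    mu = mean(data)
    sigma = pstdev(data)  # assumption: population standard deviation
    inside = [x for x in data if abs(x - mu) <= k * sigma]
    return len(inside) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(chebyshev_bound(2))  # 0.75
print(proportion_within(data, 2) >= chebyshev_bound(2))  # True
```

The theorem gives a lower bound only; the observed proportion for a particular data set is often much higher.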

Coefficient of Kurtosis (CK)

measures the degree of kurtosis of a population

Sign of z-score

Negative if the observation lies to the LEFT of (below) the mean; positive if it lies to the RIGHT of (above) the mean.

Mode

The observation that occurs most frequently. Most useful for data sets that contain a small number of unique values. The mode can easily be identified from a frequency distribution or a histogram.

Coefficient of Variation (CV)

Provides a measure of the dispersion in the data relative to the mean; in finance, a relative measure of risk to return. CV = (standard deviation) / (mean)
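
A small Python sketch of the CV as a risk-to-return comparison. The two return series are invented for illustration, the function name is hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV = standard deviation / mean (sample standard deviation assumed)."""
    return stdev(data) / mean(data)

# Hypothetical annual returns (%) for two investments.
fund_a = [8, 10, 12, 9, 11]  # mean 10, small spread
fund_b = [2, 20, 5, 25, 8]   # mean 12, large spread

cv_a = coefficient_of_variation(fund_a)
cv_b = coefficient_of_variation(fund_b)
print(cv_a < cv_b)  # True: fund_a carries less risk per unit of return
```

Here fund_b has the higher mean return, but its spread is so much larger that its CV is also higher.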

Standardized value (z-score)

Provides a relative measure of how far an observation is from the mean, independent of the units of measurement. z = (observation - mean) / (standard deviation)
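
A brief Python sketch of the standardized value, showing both the sign and the independence from units. The function name `z_score` and the numbers are assumptions for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (x - mean) / std_dev

print(z_score(70, 80, 5))  # -2.0, the observation is below (left of) the mean
print(z_score(90, 80, 5))  # 2.0, the observation is above (right of) the mean

# Rescaling every quantity by the same factor (a change of units) gives the same z-score.
print(z_score(700, 800, 50))  # -2.0
```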

Kurtosis

Refers to the peakedness or flatness of a histogram. Measured by the coefficient of kurtosis (CK).

Measures of dispersion

Refers to the degree of variation in the data; that is, the numerical spread of the data. Key measures: range, interquartile range, variance, standard deviation.

Range

The DIFFERENCE between the maximum and the minimum value in a data set. Affected by outliers and often used for small data sets. Differs from the midrange: the midrange is the average of the maximum and minimum, while the range is their difference.

Midrange

The average of the greatest and least values in the data set. Caution: extreme values easily distort the result.
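
The contrast between range and midrange, and the effect of one extreme value, can be sketched in Python. The function names `data_range` and `midrange` and the data are hypothetical.

```python
from statistics import median

def data_range(data):
    """Range: the difference between the maximum and the minimum."""
    return max(data) - min(data)

def midrange(data):
    """Midrange: the average of the maximum and the minimum."""
    return (max(data) + min(data)) / 2

data = [3, 7, 8, 12, 100]  # 100 is an extreme value

print(data_range(data))  # 97
print(midrange(data))    # 51.5, pulled far above most of the data
print(median(data))      # 8, unaffected by the extreme value
```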

Proportion

the fraction of data that have a certain characteristic

Standard deviation

the square root of the variance.

