Ch 10: Descriptive Statistics

Data Transformation

- collapsing/combining similar categories
- used to understand data better
Ex: "agree" and "strongly agree" get combined into one category
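A minimal sketch of collapsing categories, assuming a 5-point agreement scale. The response list and the `COLLAPSE` mapping are made up for illustration; they are not from the chapter.

```python
# Hypothetical 5-point agreement responses (illustrative data only).
responses = ["strongly agree", "agree", "neutral", "disagree",
             "agree", "strongly disagree", "strongly agree"]

# Map each original category to the collapsed category it belongs to.
COLLAPSE = {
    "strongly agree": "agree",
    "agree": "agree",
    "neutral": "neutral",
    "disagree": "disagree",
    "strongly disagree": "disagree",
}

def collapse(values, mapping):
    """Return a new list with each value replaced by its collapsed category."""
    return [mapping[v] for v in values]

collapsed = collapse(responses, COLLAPSE)
print(collapsed)
```

Five categories become three, which usually makes the percentages easier to read.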

Descriptive statistics: step two

frequency distribution
- reported in percentages, not raw numbers
- determines what the values of your attributes are
- allows you to draw meaning from data
Ex: values: male and female. Raw numbers: male (100) and female (300). Translated to percentages: male (25% of sample) and female (75% of sample).
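The raw-number-to-percentage step can be sketched as follows, using the 100 male / 300 female counts from the example. The function name `frequency_distribution` is an assumption made for this sketch.

```python
from collections import Counter

def frequency_distribution(values):
    """Return the percentage of responses that fall in each category."""
    counts = Counter(values)          # raw numbers per category
    total = sum(counts.values())      # sample size
    return {category: 100 * n / total for category, n in counts.items()}

# 100 male and 300 female respondents, as in the example.
sample = ["male"] * 100 + ["female"] * 300
percentages = frequency_distribution(sample)
print(percentages)  # {'male': 25.0, 'female': 75.0}
```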

frequency distribution

the percentage of responses that fall in each category

Nominal level of measurement (data)

only mode is meaningful

Measures of Central Tendency: mode

(can be calculated for ALL levels of measurement: nominal, ordinal, interval AND ratio)
ADV: helps describe data by providing...
- the most frequently occurring number/answer (= mode)
- a value that isn't affected by outliers
- a description that can be more typical of the data than the mean
DISADV:
- weak form of descriptive data (used mainly for nominal data)
- by itself, doesn't reveal outliers, variance (distribution), mean, or standard deviation
Ex: Do you think online news sources are credible or NOT credible? (nominal) 25% credible; 75% not credible. This doesn't give you much info; a SCALE question (e.g. 1-10) would give you more.
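A short sketch of finding the mode with Python's standard `statistics.multimode`. The answer lists are illustrative; the first one mirrors a 25% / 75% split.

```python
from statistics import multimode

# Nominal data: one "credible" answer and three "not credible" answers.
answers = ["credible", "not credible", "not credible", "not credible"]
modes = multimode(answers)
print(modes)  # ['not credible']

# Numeric data with one extreme score: the mode ignores the outlier.
scores = [2, 3, 3, 3, 4, 100]
print(multimode(scores))  # [3]

# A tie returns every most-frequent value, in first-seen order.
tied = ["a", "a", "b", "b"]
print(multimode(tied))  # ['a', 'b']
```

`multimode` is used here instead of `mode` so that ties are visible rather than silently resolved.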

Measures of Dispersion/Variability: variance

(can be calculated only for INTERVAL and RATIO levels of measurement)
= how spread out the data are around the mean (the average squared difference between each score and the mean)
ADV: helps describe data by...
- large variance = a lot of different answers/opinions and widely scattered scores
- small variance = similar answers that sit close to the average
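A minimal sketch of variance as the average squared distance from the mean. It assumes the population form (divide by n); the sample form divides by n - 1 instead. Both score lists are made up and share the same mean.

```python
def population_variance(scores):
    """Average squared deviation of each score from the mean (divides by n)."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return sum(squared_deviations) / len(scores)

similar_answers = [4, 5, 5, 6]     # scores close to the average
scattered_answers = [1, 3, 7, 9]   # widely scattered scores

print(population_variance(similar_answers))    # 0.5
print(population_variance(scattered_answers))  # 10.0
```

Both lists average 5, so the mean alone cannot tell them apart; the variance can.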

Measures of Dispersion/Variability: standard deviation

(can be calculated only for INTERVAL and RATIO levels of measurement)
= square root of the variance
ADV: helps describe data by...
- summarizing the spread: the higher the standard deviation, the more varied the data
- allowing for easy interpretation, because it is in the same units as the scores
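A quick check that the standard deviation is the square root of the variance, using the standard library. The scores are illustrative, and the population forms (`pvariance`, `pstdev`) are an assumption; the sample forms are `variance` and `stdev`.

```python
import math
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative scores, mean = 5

variance = pvariance(scores)        # average squared deviation from the mean
std_dev = pstdev(scores)            # square root of that variance

print(variance)  # 4
print(std_dev)   # 2.0
print(math.isclose(std_dev, math.sqrt(variance)))  # True
```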

Measures of Central Tendency: median

(can be calculated for ORDINAL, INTERVAL and RATIO levels of measurement)
ADV: helps describe data by providing...
- the midpoint (= median)
- a value that extreme outliers don't affect as much as they affect the mean
Ex: ages of freshmen entering BU. A child prodigy (age 14) would skew the mean; they are an OUTLIER.
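A small sketch of the freshman-age example, comparing how one outlier moves the mean and the median. The specific ages are made up for illustration.

```python
from statistics import mean, median

# Hypothetical ages of entering freshmen.
typical_ages = [18, 18, 18, 18, 19, 19]
with_prodigy = typical_ages + [14]   # add one 14-year-old outlier

print(median(typical_ages), median(with_prodigy))
print(mean(typical_ages), mean(with_prodigy))
```

The median stays at 18 in both lists, while the mean drops from about 18.3 to about 17.7 once the 14-year-old is included.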

Measures of Central Tendency: mean

(can be calculated only for INTERVAL and RATIO levels of measurement)
ADV: helps describe data by...
- being the only measure that can be defined algebraically
DISADV:
- outliers pull the mean in their direction (skewing data)
- the average doesn't provide much information (doesn't display data distribution)
- doesn't reveal sample size
Ex: average test score = 80. Unknown: how many people took the test; the distribution (outliers).
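A brief sketch of why an average of 80 hides both sample size and distribution. The two classes are hypothetical.

```python
from statistics import mean

# Two hypothetical classes with the same average test score.
class_a = [80, 80, 80]            # 3 students, identical scores
class_b = [40, 100, 100, 80]      # 4 students, one very low score

print(mean(class_a))  # 80
print(mean(class_b))  # 80
```

Knowing only "average = 80" gives no way to tell these two classes apart.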

Measures of Dispersion/Variability: range

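(can be calculated only for INTERVAL and RATIO levels of measurement)
range = highest score - lowest score
ADV: helps describe data by...
- giving the overall spread of the scores
- marking the outer limits of the data (acts like bookends)
DISADV:
- the range alone doesn't reveal the highest and lowest answers
- unknown distribution WITHIN the range
- determined entirely by the two extreme scores, so a single outlier inflates it
- range tends to increase with sample size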
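A minimal sketch of the range formula (highest score minus lowest score). The helper name `value_range` and both score lists are assumptions for illustration.

```python
def value_range(scores):
    """Range = highest score - lowest score."""
    return max(scores) - min(scores)

# Two hypothetical sets of scores with the same bookends.
clustered = [60, 61, 62, 100]    # most scores bunched near the bottom
spread_out = [60, 75, 90, 100]   # scores spread across the interval

print(value_range(clustered))   # 40
print(value_range(spread_out))  # 40
```

Both lists have a range of 40, which shows that the range says nothing about how scores are distributed between the extremes.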

Inferential Statistics

(describes relationships between variables!)
- based on probability theory (statistical significance)
- formally tests hypotheses to see if relationships truly exist and if they're statistically significant
- does so using 2 types of analysis: 1. correlation analysis 2. multi-variate analysis

Descriptive Statistics (does what)

(helps describe variables!)
- focus is on understanding the MEANING BEHIND data, not just the data itself
- describes and organizes numerical data
- provides frequency distribution and graphical presentation of data (e.g. histograms; pie charts)
- 2 steps: 1. tabulation of data 2. frequency distribution

Statistical processing software

1. SAS 2. SPSS (easy; common for social science) 3. Excel

Measures of Central Tendency (know how they help describe data and which level(s) of measurement each can be calculated from)

1. mean (interval, ratio) 2. median (ordinal, interval, ratio) 3. mode (all levels!!!)

Measures of Dispersion/Variability (know how they help describe data and which level(s) of measurement each can be calculated from)

1. range 2. variance 3. standard deviation

How to cross tab

1. segment the entire sample population (the larger the sample, the more segments you can use)
2. look for variation between 2 variables
3. remember the rule of thumb: treat a 5 point difference as statistically significant (true significance also depends on sample size)
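The steps above can be sketched with a small cross tab that segments a sample by one variable and compares percentages on another. The survey rows, field names, and the helper `crosstab_percent` are all hypothetical.

```python
from collections import Counter, defaultdict

def crosstab_percent(rows, independent, dependent):
    """Percent of each dependent answer within each independent segment."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[independent]][row[dependent]] += 1
    table = {}
    for segment, answers in counts.items():
        total = sum(answers.values())
        table[segment] = {a: 100 * n / total for a, n in answers.items()}
    return table

# Hypothetical survey: 50 men and 50 women asked a favorability question.
survey = (
    [{"gender": "male", "favorable": "yes"}] * 30
    + [{"gender": "male", "favorable": "no"}] * 20
    + [{"gender": "female", "favorable": "yes"}] * 20
    + [{"gender": "female", "favorable": "no"}] * 30
)

table = crosstab_percent(survey, "gender", "favorable")
gap = table["male"]["yes"] - table["female"]["yes"]
print(table)
print(gap)  # 20.0 percentage points
```

The `gap` value is the point difference you would compare between segments; the sketch does not run a significance test.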

Elaboration analysis

a correlation analysis (regarding cross tabs)
- looks at a 3rd variable!
Ex: favorability by age AND gender
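A rough sketch of bringing in a 3rd variable: favorability broken down first by age, then by age AND gender. All data, field names, and helper functions are made up for illustration.

```python
from collections import defaultdict

def percent_favorable(rows, *group_keys):
    """Percent of 'yes' answers within each combination of group_keys."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in group_keys)
        totals[group] += 1
        if row["favorable"] == "yes":
            favorable[group] += 1
    return {g: 100 * favorable[g] / totals[g] for g in totals}

def make_rows(age, gender, yes, no):
    """Build hypothetical respondents for one age/gender subgroup."""
    return ([{"age": age, "gender": gender, "favorable": "yes"}] * yes
            + [{"age": age, "gender": gender, "favorable": "no"}] * no)

survey = (make_rows("18-34", "male", 15, 5)
          + make_rows("18-34", "female", 10, 10)
          + make_rows("35+", "male", 8, 12)
          + make_rows("35+", "female", 9, 11))

by_age = percent_favorable(survey, "age")
by_age_gender = percent_favorable(survey, "age", "gender")
print(by_age)
print(by_age_gender)
```

In this made-up data the 18-34 group looks more favorable overall (62.5% vs 42.5%), but splitting by gender shows that most of that gap comes from young men (75%) rather than young women (50%).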

Cross Tabs

a correlation analysis and multi-variate analysis
- every survey question is cross-tabbed
- make a hypothesis about what you want to test, then choose an INDEPENDENT VARIABLE to hold constant
*** Rule of thumb: treat a 5 point difference as statistically significant (true significance also depends on sample size).

Interval and Ratio levels of measurement (data)

mean, median, and mode are all meaningful

Multi-Variate analysis

method of Inferential Statistics
- holding 1 or more variables constant, see which variable has the most impact on the others (which independent variable is most strongly correlated with your dependent variable)
Ex: cross tabs

Correlation analysis

method of Inferential Statistics
- studies how variables are correlated; the relationships between them
Ex: cross tabs

Ordinal level of measurement (data)

mode OR median is meaningful

Descriptive statistics: step one

tabulation of data
- organize data (in a table, etc.)
- tallying = when this is done by hand
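A minimal sketch of tabulation done in code rather than by hand, using `collections.Counter` to tally answers. The raw answers are illustrative.

```python
from collections import Counter

# Hypothetical raw survey answers, in the order they were collected.
raw_answers = ["yes", "no", "yes", "yes", "undecided", "no", "yes"]

# Tally how many times each answer occurs.
tally = Counter(raw_answers)

# Print a simple table, most common answer first.
for answer, count in tally.most_common():
    print(f"{answer:<10} {count}")
```

The resulting counts are the raw numbers that a frequency distribution then converts to percentages.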

