Stats121
What is the standard deviation?
"average distance" of values from mean
What are its five steps?
1. organize and summarize; 2. discover patterns or deviations; 3. interpret patterns; 4. distribution and relationship; 5. visual and numerical summaries
Histogram organization?
Bins (intervals of the variable) on the horizontal axis; numerical frequency (how many observations fall in each bin, e.g., number of books) on the vertical axis
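A minimal Python sketch of how the bar heights of a histogram are obtained: each value is assigned to a bin and the bins are counted. The bin width of 10, the function name `bin_counts`, and the data are assumptions made up for illustration.

```python
from collections import Counter

def bin_counts(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

ages = [3, 7, 12, 15, 18, 21, 24, 25, 33, 41]  # hypothetical data

for start, freq in bin_counts(ages, 10).items():
    # The bin goes on the horizontal axis; freq is the height of its bar.
    print(f"bin {start} to {start + 10} (exclusive): frequency {freq}")
```

Unlike a stem plot, only the counts are kept, so the same code works for a data set of any size.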
What is EDA?
Exploratory data analysis
What is IQR?
Interquartile range, Q3 - Q1: the median of the second half of the data minus the median of the first half. When n is odd, the overall median is not included in either half; when n is even, the data split evenly into two halves.
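The median-of-halves rule can be sketched in Python as follows. This follows the convention on this card; statistical software may use other quartile conventions and give slightly different values. The function names and data are made up for illustration.

```python
def median(sorted_vals):
    """Median of an already sorted list."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 1:
        return sorted_vals[mid]
    # Even n: average the two middle values.
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def iqr(values):
    """IQR = median of the upper half minus median of the lower half."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # When n is odd, skip the overall median so it is in neither half.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return median(upper) - median(lower)

odd_n = [1, 3, 5, 7, 9, 11, 13]   # median 7 is left out of both halves
even_n = [1, 3, 5, 7, 9, 11]      # splits cleanly into two halves

print(iqr(odd_n))   # Q1 = 3, Q3 = 11
print(iqr(even_n))  # Q1 = 3, Q3 = 9
```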
Pictogram?
Picture-enhanced bar chart
Why use IQR over Range?
Range is highly affected by outliers
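A small Python sketch of the effect, using made-up data. Note the assumption: `statistics.quantiles` (Python 3.8+) uses its own default quartile convention, which can differ slightly from the median-of-halves rule, but the comparison comes out the same way.

```python
from statistics import quantiles

def range_and_iqr(values):
    """Return (range, IQR) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    return max(values) - min(values), q3 - q1

clean = [1, 2, 3, 4, 5, 6, 7]          # hypothetical data
with_outlier = [1, 2, 3, 4, 5, 6, 70]  # same data, one extreme value

print(range_and_iqr(clean))
print(range_and_iqr(with_outlier))
```

The single extreme value inflates the range from 6 to 69, while the IQR stays the same because it only depends on the middle half of the data.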
What is stats?
The science of extracting meaning from data; the art of persuasion through information; a methodology for using data to answer questions
What is an advantage of histograms over stem plots?
The data set can be any size
Dogma of stats
There is always variation; variation leads to uncertainty; converting data into useful information requires understanding uncertainty
individual?
an entity that is observed
Layout of bar charts?
Categories, in arbitrary order, on the horizontal axis (e.g., male or female); numerical values on the vertical axis (e.g., height)
When should you use a pie chart or bar chart?
For categorical data
What 5 things should you look for in quantitative distributions?
center, spread, shape, modes, outliers
mean?
center of gravity
variable:
characteristic that is measured on each individual
Process of stats
collect, summarize, interpret
data set?
data identified with contextual information (table and columns)
What is the spread?
The distance between the min and the max (the range)
What is the population?
entire group of interest
Choosing a sample and collecting data is called what?
The first step in the big picture (BP): producing data
When should you use the median or mean?
If there are outliers, use the median; if the distribution is symmetric, use the mean
What does the mean do to the data?
it balances it out
data?
measurements for a set of individuals
Which is stronger (more resistant): median or mean?
The median, because the mean is affected by outliers; the mean gets dragged toward the skewed tail
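A short Python sketch of this, using the standard `statistics` module and made-up data: adding one extreme value moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

incomes = [30, 32, 35, 38, 40]   # hypothetical values, roughly symmetric
with_outlier = incomes + [400]   # add one extreme value

print(mean(incomes), median(incomes))            # both are 35
print(mean(with_outlier), median(with_outlier))  # mean jumps, median is 36.5
```

With six values the median falls between 35 and 38, so it is the mean of those two middle numbers.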
median?
middle value
Why are pictograms not the best?
They can be misleading: two dimensions (height and width) grow at once, so the pictures exaggerate differences
two aspects of spread?
overall spread and degree of clustering
What is the cycle of the big picture?
population, producing data, exploratory data analysis, probability, inference
When should you use histograms or stem plots?
For quantitative data
Stem Plot
stem = all but the final digit; leaf = the final digit
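A minimal Python sketch of building a stem plot, assuming non-negative whole numbers; the function name `stem_plot` and the scores are made up for illustration.

```python
def stem_plot(values):
    """Return the rows of a stem plot for non-negative integers."""
    leaves_by_stem = {}
    for v in sorted(values):
        # stem = all but the final digit, leaf = the final digit
        leaves_by_stem.setdefault(v // 10, []).append(v % 10)

    rows = []
    # List every stem from smallest to largest, even if it has no leaves,
    # so gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = " ".join(str(d) for d in leaves_by_stem.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows

scores = [62, 67, 71, 74, 74, 78, 83, 85, 91]  # hypothetical data
print("\n".join(stem_plot(scores)))
```

Every individual value appears as a leaf, which is why stem plots suit small data sets.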
sample?
subgroup of population
5 kinds of distributions?
symmetric or bell-shaped, right-skewed, left-skewed, bimodal, flat or uniform
If the median is between two numbers?
Take the mean of the two middle numbers (the median's position ends in .5, so one value lies to its right and the other to its left)
mode?
value corresponding to a "peak"... value with highest frequency
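A brief Python sketch using `statistics.multimode` (Python 3.8+), which returns every value tied for the highest frequency; the data are made up for illustration.

```python
from statistics import multimode

one_peak = [2, 3, 3, 3, 4, 5]       # hypothetical data with a single mode
two_peaks = [2, 3, 3, 4, 7, 7, 9]   # hypothetical data with two modes

print(multimode(one_peak))   # [3]
print(multimode(two_peaks))  # [3, 7]
```

Two values tied for the highest frequency are one simple way a bimodal data set can show up.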
Measurement?
value of a variable for an individual