Ch. 11: Displaying Distributions with Graphs = HISTOGRAMS
By just looking at a histogram, can you figure out the sample size?
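-Yes, as long as the y-axis shows counts (frequencies): add up the heights of all the bars, i.e. the number of individuals within each range/group/bin ~if the y-axis shows percents instead, the sample size cannot be recovered from the graph alone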
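A minimal sketch of that counting rule in Python. The bar heights in `bin_counts` are made-up values for illustration, as if read off a y-axis labeled "count":

```python
# Hypothetical bar heights (counts), one per bin, read left to right.
bin_counts = [2, 5, 9, 6, 3]

# The sample size is the total number of individuals across all bins.
sample_size = sum(bin_counts)
print(sample_size)  # 25
```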
What is spread, as pertaining to a histogram?
-def: (aka VARIABILITY) describes the overall range of the data and where the data are most concentrated ~we can describe the spread of a distribution by giving its smallest and largest values, ignoring any outliers (and saying that you have ignored them, if any are present)
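One way to sketch this description of spread in Python. The data are hypothetical, and the value 95 is simply assumed to have been judged an outlier by eye:

```python
# Hypothetical observations; 95 is assumed to be an outlier spotted on the graph.
values = [12, 15, 15, 18, 20, 21, 24, 95]
outliers = {95}

# Describe spread by the smallest and largest values of the main body of data.
main_body = [v for v in values if v not in outliers]
spread = (min(main_body), max(main_body))
print(spread)  # (12, 24)
```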
What is center, pertaining to a histogram?
-def: approximately the midpoint of the distribution along the horizontal axis (x-axis); this is NOT necessarily the midpoint of the x-axis itself ~ex: mean, median ~it's the value with roughly half the observations taking smaller values and half taking larger values (so it may not split the x-axis equally)
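A small Python sketch of the difference between the center of the data and the midpoint of the x-axis. The data are hypothetical, and the median stands in for "center" here:

```python
from statistics import median

# Hypothetical data with most values small and one large value.
values = [1, 1, 2, 2, 2, 3, 3, 4, 10]

# Center: the value that sits in the middle of the sorted observations.
center = median(values)

# Midpoint of the x-axis: halfway between the smallest and largest values.
axis_midpoint = (min(values) + max(values)) / 2

print(center, axis_midpoint)  # 2 5.5
```

Only one of the nine observations lies above 5.5, so the axis midpoint is a poor description of where the data are centered.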
What is an outlier?
-def: in a graph, an individual observation that falls outside the overall pattern of the graph (ex: an unusually small/large value, a deviation from the overall pattern) ~outliers can affect the interpretation of data (ex: an outlier can pull the mean higher/lower than is actually representative of the data)
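A quick Python sketch of how a single outlier can shift the mean. The numbers are hypothetical, and the median is shown only for comparison:

```python
from statistics import mean, median

# Hypothetical data, first without and then with one unusually large value.
values = [10, 12, 13, 14, 16]
with_outlier = values + [85]

mean_before = mean(values)          # 13
mean_after = mean(with_outlier)     # 25
median_before = median(values)      # 13
median_after = median(with_outlier) # 13.5

print(mean_before, mean_after, median_before, median_after)
```

The mean nearly doubles, even though five of the six observations are 16 or less.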
What is shape, pertaining to a histogram? Name the 3 ways to describe shape.
-def: symmetrical or skewed? how many peaks? -3 Ways to Describe: 1. number of modes or peaks 2. symmetry 3. skewness
What is the bin-width of a histogram?
-def: the width of a histogram's bars, i.e. the range of values each bar covers ~bins allow a histogram to summarize info ~quantitative data are sorted into bins, or intervals, in a histogram
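A minimal Python sketch of sorting data into bins of a fixed width. The data and the bin width of 10 are assumptions chosen for illustration, and the data are assumed to be whole numbers so the labels can read "0-9", "10-19", and so on:

```python
from collections import Counter

# Hypothetical whole-number observations and an assumed bin width.
values = [3, 7, 12, 14, 18, 21, 22, 25, 29, 38]
bin_width = 10

# Each value is assigned to the bin that starts at the nearest lower multiple
# of bin_width, e.g. 14 goes into the bin starting at 10.
counts = Counter((v // bin_width) * bin_width for v in values)

# Print a rough text histogram, one row per bin.
for start in range(0, 40, bin_width):
    label = f"{start}-{start + bin_width - 1}"
    print(f"{label:>5} | {'#' * counts[start]}")
```

Changing `bin_width` changes how many bars appear and how tall they are, even though the data stay the same.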
Properties of Shape: What is meant by the number of modes or peaks of a distribution?
-looking at the distribution of the measured variable on the histogram, the number of modes or peaks refers to how many distinct high points (groups of tallest bars) the distribution has -can be unimodal (one peak) or bimodal (two peaks)
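A rough Python sketch of counting peaks from bar heights. The helper `count_peaks` is hypothetical, and it assumes a peak is any bar strictly taller than both of its neighbors. Real histograms are usually judged by eye, since small bumps from random noise should not count as modes:

```python
def count_peaks(counts):
    """Count bars that are strictly taller than both neighboring bars."""
    # Pad with zeros so the first and last bars can also be peaks.
    padded = [0] + list(counts) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )

# Hypothetical bar heights for illustration.
unimodal = [1, 3, 7, 4, 2]
bimodal = [2, 6, 3, 1, 4, 8, 3]

print(count_peaks(unimodal))  # 1
print(count_peaks(bimodal))   # 2
```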
How do you interpret histograms? (what do you look at?)
-first, look for an overall pattern and any notable deviation from the pattern (aka an outlier: in a graph, an individual observation that falls outside the overall pattern of the graph) -3 ways to describe the overall pattern: 1. shape 2. center 3. spread
What type of graphs are used with categorical variables VS quantitative variables?
-for categorical variables: use bar graphs and pie charts -for quantitative variables: use histograms (and line graphs)
What to do with outliers?
-once you spot an outlier, look for an explanation as to why it occurred -many outliers are due to mistakes or misinterpretations of a survey question -other outliers can be good! e.g. rare genes
Properties of Shape: What is the skewness of a histogram?
-skewness occurs in a histogram when the distribution has a long tail ~a long tail on the right of the distribution where the larger values are = "right skewed" ~a long tail on the left of the distribution where the smaller values are = "left skewed"
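A crude Python sketch of the "long tail" idea. The helper `tail_direction` is hypothetical: it assumes a single peak and no empty bins at the ends, and it just compares how many bars lie to the left and right of the tallest bar. It is a heuristic for illustration, not a formal measure of skewness:

```python
def tail_direction(counts):
    """Compare the number of bars on each side of the tallest bar."""
    peak = counts.index(max(counts))
    left_tail = peak                     # bars to the left of the peak
    right_tail = len(counts) - 1 - peak  # bars to the right of the peak
    if right_tail > left_tail:
        return "right skewed"
    if left_tail > right_tail:
        return "left skewed"
    return "no clear skew"

# Hypothetical bar heights for illustration.
print(tail_direction([2, 8, 5, 3, 1, 1]))  # right skewed
print(tail_direction([1, 1, 3, 5, 8, 2]))  # left skewed
print(tail_direction([1, 4, 9, 4, 1]))     # no clear skew
```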
Properties of Shape: What is symmetry of a histogram?
-symmetry occurs when the right and left sides of the histogram are approximately mirror images of each other
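A short Python sketch of the mirror-image check. The helper `is_roughly_symmetric` and its `tolerance` parameter are assumptions made for illustration; "approximately" has no fixed numeric cutoff in practice:

```python
def is_roughly_symmetric(counts, tolerance=1):
    """Check whether each bar is within `tolerance` of its mirror-image bar."""
    return all(
        abs(left - right) <= tolerance
        for left, right in zip(counts, reversed(counts))
    )

# Hypothetical bar heights for illustration.
print(is_roughly_symmetric([1, 4, 9, 4, 2]))     # True
print(is_roughly_symmetric([2, 8, 5, 3, 1, 1]))  # False
```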
What is a histogram?
-used to show the entire distribution of a quantitative variable ~looks like a bar graph but is DIFFERENT -> its x-axis has a continuous quantitative scale (the order of the bars along the x-axis matters; histograms include all the numbers in between the labeled x-axis intervals) ~so histograms group data together into bins in order to better display the distribution (bin = all the data within a specific interval)
Why are histograms useful?
-we can use pie charts and bar graphs to display categorical variables because they take only a few values -HOWEVER, quantitative variables often take so many values that a graph of their distribution is clearer if nearby values are GROUPED together -> a histogram accomplishes this
Compare histograms VS bar graphs (3 Differences)
1. Histograms: -bars for each interval touch each other -continuous x-axis, w/ x values in a specific order -display quantitative variables 2. Bar Graphs: -bars for each category don't touch each other; there are spaces between bars -categories on x-axis in any order (order doesn't matter; x-axis is not continuous) -display categorical variables
Anatomy of a Histogram (regarding y-axis, x-axis, bars)
1. bars of a histogram: equal width; height tells you how many individuals fall within the specific range; no spaces between bars 2. y-axis: can be labeled "count," "frequency," "percent"; does NOT represent the measured variable; DOES represent how often the values on the x-axis occur 3. x-axis: represents the different measured values of the variable
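A small Python sketch of how the same bars can be labeled with "count" or "percent" on the y-axis. The bar heights are hypothetical:

```python
# Hypothetical bar heights when the y-axis is labeled "count".
bin_counts = [2, 5, 9, 6, 3]
total = sum(bin_counts)

# The same bars when the y-axis is labeled "percent":
# each height becomes that bin's share of all individuals.
bin_percents = [100 * c / total for c in bin_counts]
print(bin_percents)  # [8.0, 20.0, 36.0, 24.0, 12.0]
```

The shape of the histogram is the same either way; only the y-axis scale changes.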
Can bar graphs be skewed? symmetric?
NO for both! Bar graphs cannot display symmetry or skewness because they do not have continuous x-axes; the categories can be placed in any order, so the graph has no fixed shape
Can a histogram be symmetric and skewed at the same time?
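NO! A skewed distribution has one tail longer than the other, so its left and right sides cannot be approximate mirror images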
