Applied Statistics Chapter 1
Categorical Variable
A categorical variable places an individual into one of several groups or categories.
Skewed Distribution
A distribution is skewed to the right if the right side of the histogram (containing the half of the observations with larger values) extends much farther out than the left side. It is skewed to the left if the left side of the histogram extends much farther out than the right side.
Symmetric Distribution
A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other.
Quantitative Variable
A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense. The values of a quantitative variable are usually recorded with a unit of measurement such as seconds or kilograms.
Variable
A variable is any characteristic of an individual. A variable can take different values for different individuals.
Outlier
An important kind of deviation is an outlier, an individual value that falls outside the overall pattern.
Bar Graph
Bar graphs represent each category as a bar. The bar heights show the category counts or percents. Bar graphs are easier to make than pie charts and also easier to read. *Bar graphs are more flexible than pie charts. Both graphs can display the distribution of a categorical variable, but a bar graph can also compare any set of quantities that are measured in the same units.*
Data
Data is a set of values of qualitative or quantitative variables about some group of individuals.
Finding the Midpoint
Arrange the observations from smallest to largest, then repeatedly remove the largest and smallest remaining observations. If there were an odd number of observations initially, you will be left with a single observation, which is the midpoint. If there were an even number of observations initially, you will be left with two observations, and their average is the midpoint.
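The procedure can be sketched in Python. This is a minimal illustration, assuming the steps are to sort the observations and then repeatedly drop the smallest and largest until one or two remain; the function name `midpoint` and the sample data are hypothetical.

```python
def midpoint(values):
    """Return the midpoint of a data set by trimming both ends (illustrative sketch)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("midpoint of an empty data set is undefined")
    # Drop the smallest and largest observations until one or two remain.
    while len(ordered) > 2:
        ordered = ordered[1:-1]
    # One observation left: it is the midpoint. Two left: average them.
    return sum(ordered) / len(ordered)


odd_result = midpoint([7, 1, 5, 3, 9])   # odd count: a single observation remains
even_result = midpoint([8, 2, 6, 4])     # even count: two observations remain
print(odd_result, even_result)           # prints 5.0 5.0
```

Trimming one pair at a time mirrors the hand method; for real work, the standard library's `statistics.median` gives the same value more efficiently.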
Individuals
Individuals are the objects described by a set of data. Individuals may be people, but they may also be animals or things.
Midpoint
One way to describe the center of a distribution is by its midpoint, the value with roughly half the observations taking smaller values and half taking larger values.
Pie Chart
Pie charts show the distribution of a categorical variable as a "pie" whose slices are sized by the counts or percents for the categories. A pie chart must include all the categories that make up a whole. Use a pie chart only when you want to emphasize each category's relation to the whole.
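As a rough illustration of how slices are sized, the sketch below converts category counts into slice angles, assuming each slice gets the same share of the 360-degree circle as its category's share of the total. The function name `slice_angles` and the counts are made up for the example.

```python
def slice_angles(counts):
    """Map each category to its pie-slice angle in degrees (illustrative sketch)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("a pie chart needs at least one observation")
    # Each slice's share of the circle equals the category's share of the whole.
    return {category: 360 * count / total for category, count in counts.items()}


angles = slice_angles({"A": 3, "B": 1, "C": 1})
print(angles)  # prints {'A': 216.0, 'B': 72.0, 'C': 72.0}
```

Because every category is included, the angles add up to the full 360 degrees, which is why a pie chart must cover all the categories that make up the whole.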
Histogram
Quantitative variables often take many values. The distribution tells us what values the variable takes and how often it takes these values. A graph of the distribution is clearer if nearby values are grouped together. The most common graph of the distribution of one quantitative variable is a histogram.
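Grouping nearby values can be sketched as counting observations in equal-width classes. This is a minimal example, assuming each class includes its lower boundary and excludes its upper boundary; the function name `histogram_counts`, its parameters, and the data are hypothetical.

```python
def histogram_counts(values, start, width, num_classes):
    """Count observations in equal-width classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * num_classes
    for value in values:
        index = int((value - start) // width)
        # Observations outside the covered range are skipped in this sketch.
        if 0 <= index < num_classes:
            counts[index] += 1
    return counts


class_counts = histogram_counts([1, 2, 2, 5, 7, 8, 12], start=0, width=5, num_classes=3)
print(class_counts)  # prints [3, 3, 1]
```

The three counts are the bar heights for the classes 0 to 5, 5 to 10, and 10 to 15.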
Exploratory Data Analysis
Statistical tools and ideas help us examine data in order to describe their main features. This examination is called exploratory data analysis. Like an explorer crossing unknown lands, we want first to simply describe what we see.
Distribution of a Variable
The distribution of a variable tells us what values it takes and how often it takes these values.
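For a small data set, a distribution can be tabulated directly. The sketch below is one way to do it, pairing each value with its count and its percent of all observations; the function name `distribution` and the sample values are invented for illustration.

```python
from collections import Counter


def distribution(values):
    """Map each distinct value to (count, percent of all observations)."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, 100 * count / total) for value, count in counts.items()}


table = distribution(["A", "B", "A", "C", "A"])
print(table)  # prints {'A': (3, 60.0), 'B': (1, 20.0), 'C': (1, 20.0)}
```

The counts or percents in this table are what a pie chart or bar graph would display for a categorical variable.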
Histogram Patterns
You can describe the overall pattern of a histogram by its shape, center, and variability. You will sometimes see variability referred to as spread.
Charts for Categorical Variables
You can use a pie chart or a bar graph to display the distribution of a categorical variable more vividly.