Statistics Chapter 9
Bias
A failure of the sample to represent the population.
Sample Statistic
A numerical characteristic of the sample. For example, the median test score of all statistics students in Block 1.
Population Parameter
A numerical characteristic of the whole population. For example, the median test score of all statistics students.
Sample
A subset of the population that is studied in order to learn about the whole population. Ideally, it is representative of that population.
An airline company wants to survey its customers one day, so they randomly select 5 flights that day and survey every passenger on those flights.
Cluster Sampling
Randomization
Each item is given a fair chance of selection.
Simple Random Sample (SRS)
Each person has an equal chance of being selected, and every possible combination of people of the same size has an equal chance of being the sample.
Undercoverage bias
Part of the population is not represented in the sample. EX: A senator wanted to know about how people in her state felt about internet privacy issues. She conducted a survey by calling 100 people whose names were randomly sampled from the phone book (note that mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 100 people chosen. The poll showed that 42% of respondents were "very concerned" about internet privacy.
Voluntary Response Bias
People are asked to volunteer for the sample, so those with strong opinions are more likely to take part. EX: David hosts a podcast and he is curious how much his listeners like his show. He decides to make an online poll. He asks his listeners to visit his website and participate in the poll. The poll shows that 86% of the 200 respondents "love" his show.
Strata
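Groups in which the people within each group are similar to one another on some characteristic, while the groups differ from each other.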
Systematic sampling
Picking every nth person on a list, beginning from a chosen starting point. (For example, starting with the fifth person and then taking every 10th person after that.)
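A minimal Python sketch of this idea, assuming the starting point is chosen at random from the first n people on the list. The function name `systematic_sample` and the roster of 100 numbered people are hypothetical, made up only for illustration.

```python
import random

def systematic_sample(items, step, rng=None):
    """Pick a random start among the first `step` items, then take every `step`-th item."""
    rng = rng or random.Random()
    start = rng.randrange(step)  # random index from 0 to step - 1
    return items[start::step]

# Hypothetical roster of 100 people, numbered 1 to 100.
roster = list(range(1, 101))
sample = systematic_sample(roster, step=10, rng=random.Random(0))
print(sample)
```

Only the starting point is random here. Once it is fixed, the rest of the sample is determined by the step size.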
Stratified Random Sampling
The population is divided into strata before the sample is selected. Then SRS is used within each stratum.
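A rough Python sketch of stratified random sampling, assuming the strata are already known. The function name `stratified_sample`, the two strata, and their sizes (50 managers, 450 regular employees) are hypothetical values chosen for illustration.

```python
import random

def stratified_sample(strata, sizes, rng=None):
    """Take a simple random sample of the requested size from every stratum."""
    rng = rng or random.Random()
    sample = {}
    for name, members in strata.items():
        # random.sample draws without replacement, so this is an SRS within the stratum.
        sample[name] = rng.sample(members, sizes[name])
    return sample

# Hypothetical company split into two strata.
strata = {
    "managers": [f"M{i}" for i in range(50)],
    "regular": [f"E{i}" for i in range(450)],
}
picked = stratified_sample(strata, {"managers": 10, "regular": 90}, rng=random.Random(1))
print(len(picked["managers"]), len(picked["regular"]))
```

Every stratum contributes to the sample, which is the key difference from cluster sampling.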
A manager gave each employee a number. He wrote each number on a piece of paper, and put them in a cup. Then he took 5 pieces of paper out of the cup without looking.
Simple Random Sampling
Non-response bias
Some of the people who are selected for a sample do not participate. EX: A school board member wanted to know how her district felt about a new school mascot. She conducted a survey by calling people using random digit dialing, where computers randomly generate phone numbers so unlisted and mobile numbers can still be reached. She called over 1,000 random phone numbers—most people didn't answer—until she had reached 100 respondents. The poll showed that 45% of the respondents were "supportive" of a new school mascot.
Cluster Sampling
Splitting the population into clusters, randomly selecting one or more clusters, and doing a census of each selected cluster.
A large company surveys 100 employees by taking random samples of 10 managers and 90 regular employees.
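A short Python sketch of cluster sampling, assuming the clusters are flights and every passenger on a chosen flight is surveyed. The function name `cluster_sample` and the 20 flights of 30 passengers are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)  # random choice of cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical day of 20 flights with 30 passengers each.
flights = {f"flight_{i}": [f"f{i}_p{j}" for j in range(30)] for i in range(20)}
surveyed = cluster_sample(flights, 5, rng=random.Random(2))
print(sorted(surveyed))
```

The randomness applies to which clusters are picked, not to which people inside a cluster are surveyed.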
Stratified Random Sampling
A manager at a pizza shop wants to start visually checking finished pizzas, but she doesn't have time to check every pizza. She decides to check a random pizza in the first 20 made each day, and then check every 20th pizza thereafter.
Systematic Sample
Convenience sample
The easiest people to find are chosen for the sample. EX: Anita hosts a YouTube channel and she is curious how much her viewers like her content. She decides to poll the next 100 viewers who subscribe to her channel. They don't all respond, but 94 of the 97 viewers who responded said they "loved" her content.
Population
The entire group of people or things we want to learn about.
Sampling Variability
The natural tendency of random samples to differ from one another.
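A small Python sketch that illustrates sampling variability, assuming a made-up population of 500 test scores. The scores, the sample size of 30, and the number of samples are all hypothetical.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population: 500 test scores between 50 and 100.
population = [rng.randint(50, 100) for _ in range(500)]

# Draw five separate simple random samples of 30 and record each sample median.
sample_medians = [statistics.median(rng.sample(population, 30)) for _ in range(5)]

print("population median (parameter):", statistics.median(population))
print("sample medians (statistics):", sample_medians)
```

The sample medians will usually differ from one another and from the population median even though every sample was drawn the same way.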
Representative
The characteristics of the sample accurately reflect those of the population.
Response bias
The way the survey is designed or conducted pushes people to respond in a certain way. This could be intentional or unintentional. EX: A high school wanted to know what percent of its students smoke cigarettes. During the week when students visited the counselors to schedule classes, the counselors asked every student in person whether they smoked cigarettes. The data showed that 3% of students smoked cigarettes.
Census
Understanding a population by collecting data from EVERYONE in it. A census is difficult to complete because it is hard to reach everyone and people move.
Multistage sampling
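Combining two or more sampling methods in stages. For example, randomly selecting clusters and then taking an SRS within each selected cluster.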
Clusters
Groups in which the people within each group are different from one another. Each group resembles the whole population, so it can be used to represent the entire population.
How do you sample at random?
1. Get a sampling frame (a list of the individuals that you can sample from). 2. Give each person a number. 3. Pick numbers randomly.
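The three steps above can be sketched in Python as follows. The names in the frame and the sample size of 3 are hypothetical, chosen only for illustration.

```python
import random

# Step 1: the sampling frame, a list of individuals we can sample from (hypothetical names).
frame = ["Ana", "Ben", "Cho", "Dev", "Eli", "Fay", "Gus", "Hal"]

# Step 2: give each person a number, starting at 1.
numbered = dict(enumerate(frame, start=1))

# Step 3: pick numbers at random, without repeats, and look up who they belong to.
rng = random.Random(4)
picked_numbers = rng.sample(sorted(numbered), 3)
srs = [numbered[n] for n in picked_numbers]

print(picked_numbers, srs)
```

Because `random.sample` draws without replacement, no one can be picked twice, and every group of 3 people is equally likely to be chosen.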