Chapter one Process of statistics
determining class width
(largest data value - smallest data value)/number of classes
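A minimal Python sketch of the class-width calculation. The scores list is made-up data, not from these notes, and rounding the result up to the next whole number is a common textbook convention that is assumed here.

```python
import math

def class_width(values, num_classes):
    """Return (largest - smallest) / num_classes, rounded up (assumed convention)."""
    raw = (max(values) - min(values)) / num_classes
    return math.ceil(raw)

# Hypothetical data for illustration only.
scores = [52, 61, 67, 70, 74, 78, 83, 88, 91, 97]
width = class_width(scores, 5)  # (97 - 52) / 5 = 9.0, so the width is 9
```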
convenience sample
find an easy-to-reach group to ask
simple random sample
examples: getting picked for jury duty; getting your name drawn out of a hat
cohort
identifies a group of individuals and observes them over a long period of time
uniform distribution
the frequency of each value of the variable is evenly spread out across the values of the variable
bell-shaped distribution
the highest frequency occurs in the middle and frequencies tail off to the left and right of the middle
upper class limit
the largest value within the class
response variables
the outcomes we are observing
1. identify the research objective 2. collect the data needed to answer the questions posed in (1) 3. describe the data (organize and summarize) 4. perform inference (draw a conclusion)
the process of statistics
skewed left
the tail to the left of the peak is longer than the tail to the right of the peak
skewed right
the tail to the right of the peak is longer than the tail to the left of the peak
population (a census, for example, collects information from the entire group)
total group
1. interviewer bias 2. misrepresented answers 3. wording of question 4. order of questions or words
types of response bias
random sampling
use chance to obtain a sample from a population
stem and leaf plot
uses digits to the left of the rightmost digit to form the stem. Each rightmost digit forms a leaf.
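One way to build a stem-and-leaf plot in Python, as a rough sketch. It assumes non-negative whole numbers, and the data list is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (the last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # split off the rightmost digit
        plot[stem].append(leaf)
    return dict(plot)

data = [12, 15, 21, 23, 23, 30, 38]  # hypothetical observations
plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```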
inferential statistics
use methods that take results from a sample, extend them to the population, and measure the reliability of the result.
observational study
watching, asking, observing; the researcher controls nothing
interval (+/-)
zero does NOT mean the absence of the quantity being measured
ratio
zero does mean the absence of the quantity being measured
Qualitative variables: have category values ... those values cannot be added, subtracted, etc.
Examples: gender, zip code, blood type, states in the United States, brands of televisions
correlation
there seems to be a relationship between the variables
sampling methods
simple random sample, stratified sample, systematic sample, cluster sample, convenience sample
sample
subset of population
descriptive statistics
tables and graphs
class width
the difference between consecutive lower class limits
individual
a person or object that is a member of the population being studied
continuous variable
a quantitative variable that has an infinite number of possible values it can take on and can be measured to any desired level of accuracy.
The value is a statistic because the 6,076 adults in public restrooms are a sample.
A study of 6,076 adults in public restrooms found that 23% did not wash their hands before exiting.
qualitative or categorical variables
allow for classification of individuals based on some attribute or characteristic
variables
are the characteristics of the individuals within the population
ordinal
can be ranked in a specific order.
classes
categories in which data are grouped
types of observational studies
cohort, cross-sectional, case-control
cross-sectional
collect information about individuals at a specific point in time or over a short period of time.
descriptive statistics
consist of organizing and summarizing data. [Numerical data used to measure and describe characteristics of groups. Includes measures of central tendency and measures of variation.]
interval
consistent interval of measurement
designed experiment
control at least one aspect
data
describe characteristics of an individual; a fact or proposition used to draw a conclusion or make a decision
statistic
a numerical summary computed from a sample
qualitative or categorical variables
description; e.g., the number on a jersey
stratified sample
divide the population into non-overlapping groups (strata), then take a simple random sample from each group
simple random sample
every possible sample of a given size has an equal chance of being selected, so each member of the population has an equal chance of being picked
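A small sketch of drawing a simple random sample with Python's standard random module. The population of 20 numbered individuals and the seed are assumptions made for illustration.

```python
import random

population = list(range(1, 21))     # hypothetical population: individuals numbered 1 to 20
rng = random.Random(0)              # fixed seed so the draw can be repeated
sample = rng.sample(population, 5)  # 5 individuals drawn without replacement
print(sample)
```

Like names drawn out of a hat, no individual can appear twice, and every group of 5 is equally likely to be drawn.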
systematic sample
every nth individual from population
continuous variable
example: you can get 20.4 gallons of gas, or 40.67
nonresponse bias
exists when individuals selected to be in the sample who do not respond to the survey have different opinions from those who do
response bias
exists when the answers on a survey do not reflect the true feelings of the respondent
confounding variable
explanatory variable that was considered in the study, but whose effect cannot be separated from that of other factors
lurking variables
factor that influences the outcome (response variable), but was not considered in the study
explanatory variables
factors that are possibly affecting the outcome
sources of bias in sampling
1. sampling bias 2. nonresponse bias 3. response bias
The population is threaded rods produced at the factory that week; the sample is the 40 threaded rods selected.
A factory overseer selects 40 threaded rods at random from those produced that week at the factory, then she tests their tensile strength.
Parameter
A parameter is a numerical summary of a population. It does not vary from sample to sample, because everyone (or everything) in the population was measured to find it. For example, you might be interested in the average age of everyone in your class. If you asked everyone and found the average age was 25, that is a parameter, because you asked everyone in the class.
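To make the contrast with a statistic concrete, here is a short sketch. The list of ages is a hypothetical class of eight students, chosen so the class average comes out to 25; the seed and sample size are also assumptions.

```python
import random
from statistics import mean

ages = [22, 25, 31, 19, 24, 27, 30, 22]  # hypothetical class: the whole population
parameter = mean(ages)                   # computed from everyone in the class

rng = random.Random(3)                   # fixed seed so the draw can be repeated
statistic = mean(rng.sample(ages, 4))    # computed from a sample of 4 students

print(parameter, statistic)
```

The parameter is the same every time it is computed for this class, while the statistic depends on which 4 students happen to be drawn.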
nonresponse bias
A polling organization conducts a study to estimate the percentage of households that have both parents sharing equally in household chores. It mails a questionnaire to 1568 randomly selected households across the country and asks the head of each household if he or she has both parents sharing equally in household chores. Of the 1568 households selected, 30 responded.
The population is all male university graduates who have a white-collar job; the sample is the 2351 male university graduates with a white-collar job who were contacted.
A polling organization contacts 2351 male university graduates who have a white-collar job and asks whether or not they had received a raise at work in the past 4 months.
The choices need to be rotated to minimize response biases.
Consider the following question from a recent poll. Thinking about how the war issue might affect your vote for major offices, would you vote only for a candidate who shares your views on war or consider a candidate's position on war as just one of many important factors? [rotated] Why is it important to rotate the two choices presented in the question?
descriptive statistics
Of 350 randomly selected people in the town of Luserna, Italy, 280 people had the last name Nicolussi. An example of descriptive statistics is the following statement: "80% of these people have the last name Nicolussi."
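The 80% figure is simply the sample proportion, 280 out of 350. A quick check in Python (the variable names are illustrative):

```python
surveyed = 350
with_name = 280
proportion = with_name / surveyed  # 280 / 350 = 0.8
print(f"{proportion:.0%}")         # prints 80%
```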
subset
a part of a larger group of related things
Quantitative variables: have numeric values ... those values can be added, subtracted
Examples: temperature, height and weight, sales of a product, number of children in a family, points achieved playing a video game
Parameter, because the data set of all 50 midterm exams in the math class is a population.
The average grade on the midterm exam in a certain math class of 50 students was 88.
The population is all of the vehicles that pass through the lane with the camera; the sample is the group of every tenth vehicle that passes through the lane.
The state Department of Transportation wants to know about out-of-state vehicles that pass over a toll bridge with several lanes. A camera installed over one lane of the bridge photographs the license plate of every tenth vehicle that passes through that lane.
cluster sampling
To determine customer opinion of their safety features, Daimler-Chrysler randomly selects 30 service centers during a certain week and surveys all customers visiting the service centers.
stratified sampling
To determine her power usage, Britney divides up her day into three parts: morning, afternoon, and evening. She then measures her power usage at 2 randomly selected times during each part of the day.
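A rough sketch of this stratified design in Python. The hour ranges chosen for morning, afternoon, and evening are assumptions (the example does not give them), as is the seed.

```python
import random

# Hypothetical strata: hours of the day split into three non-overlapping parts.
strata = {
    "morning": list(range(6, 12)),
    "afternoon": list(range(12, 18)),
    "evening": list(range(18, 24)),
}

rng = random.Random(1)  # fixed seed so the draw can be repeated
# Take a simple random sample of 2 times from each stratum.
stratified = {part: rng.sample(hours, 2) for part, hours in strata.items()}
print(stratified)
```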
The population is all 18-65 year olds with alcohol dependence. The sample is 371 men and women aged 18 to 65 years diagnosed with alcohol dependence.
To determine if topiramate is an effective treatment for alcohol dependence, researchers conducted a 14-week trial of 371 men and women aged 18 to 65 years diagnosed with alcohol dependence.
systematic sampling
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at Toshiba selects every 16th laptop that comes off the assembly line starting with the tenth until she obtains a sample of 120 laptops.
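The positions on the assembly line that end up in this systematic sample can be listed with a short sketch (the variable names are illustrative):

```python
start, step, n = 10, 16, 120  # begin with the 10th laptop, take every 16th, stop at 120 laptops
positions = [start + step * i for i in range(n)]
print(positions[:3], positions[-1])  # first three positions and the last one
```

The sample starts at laptops 10, 26, 42 and ends at laptop 10 + 16 * 119 = 1914.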
split stems
When data appear rather bunched
confounding variable
in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.
data
the information referred to in the definition of statistics.
discrete variable
is a quantitative variable that has either a finite number of possible values or a countable number of possible values.
sample
is a subset of the population that is being studied
dot plot (useful when you have little data)
is drawn by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed.
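A plain-text dot plot can be sketched in Python as below. The data are hypothetical, and as a simplification only observed values appear on the axis, so gaps between values are not drawn to scale.

```python
from collections import Counter

def dot_plot(values):
    """Return text rows: stacked '*' marks above each observed value, then the axis."""
    counts = Counter(values)
    xs = sorted(counts)                       # observations in increasing order
    width = max(len(str(x)) for x in xs)      # column width for alignment
    rows = []
    for level in range(max(counts.values()), 0, -1):
        marks = [("*" if counts[x] >= level else " ").rjust(width) for x in xs]
        rows.append(" ".join(marks).rstrip())
    rows.append(" ".join(str(x).rjust(width) for x in xs))
    return rows

data = [1, 2, 2, 3, 3, 3, 5]  # hypothetical observations
print("\n".join(dot_plot(data)))
```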
Statistics
is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.
discrete variable
example: you cannot ask for half of an ice cream or half of a DVD; the values are whole counts.
case-control
observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records
nominal (e.g., eye color category)
labeling and categorizing
lower class limit
the smallest value within the class
observational study
observes the response (outcome) without trying to influence it
sampling bias
means that the technique used to obtain the individuals to be in the sample tends to favor one part of the population over another
ordinal
movie rating from one star to five stars
undercoverage bias
occurs when the proportion of one segment of the population is lower in a sample than it is in the population
causation
outcome is caused by the factors
cluster sample
pick x groups at random, then ask everyone in each selected group
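A minimal sketch of cluster sampling in Python. The service centers and customer labels are made up for illustration, as is the seed.

```python
import random

# Hypothetical clusters: each service center with the customers who visited it.
clusters = {
    "center_a": ["a1", "a2", "a3"],
    "center_b": ["b1", "b2"],
    "center_c": ["c1", "c2", "c3", "c4"],
    "center_d": ["d1"],
}

rng = random.Random(2)                    # fixed seed so the draw can be repeated
chosen = rng.sample(sorted(clusters), 2)  # pick 2 whole clusters at random
# Survey everyone in each chosen cluster.
cluster_sample = [person for name in chosen for person in clusters[name]]
print(chosen, cluster_sample)
```

The randomness is in which groups are picked; inside a picked group, nobody is left out.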
parameter
a numerical summary of a population
Quantitative variables
provide numerical measures of individuals. - The values of a quantitative variable can be added or subtracted and provide meaningful results
case control
retrospective, looking back in time or over previous research.