Introduction to Statistics Section 1.1
Experimental Design, Guidelines for planning
1. Identify individuals of interest. 2. Specify the variable or variables. 3. Specify the population. 4. Specify the method for measuring or observing. 5. Determine the sampling method. 6. Collect data. 7. Use inference to gain knowledge. 8. Refine (problems, recommendations).
Blinding
Single blinded: the subjects do not know whether they are receiving the treatment or a placebo. Double blinded: the experimenters doing the measuring do not know either.
simple random sample
a particular type of probability sample in which every member of the population (and every possible sample of the same size) has an equal chance of being selected. ex. pulling names out of a hat; using a random number generator such as randInt(lowest number, highest number, n). ex. In a study, the sample is chosen by pulling 50 names from a hat.
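The names-from-a-hat example can be sketched in Python. This is only an illustration: the population of 200 labeled people and the seed are assumptions made up for the sketch, not part of the notes.

```python
import random

# Hypothetical population: 200 name slips in a hat (invented for illustration).
population = [f"person_{i}" for i in range(1, 201)]

rng = random.Random(42)  # fixed seed so the draw can be repeated

# random.sample draws without replacement, so no name is picked twice,
# and every group of 50 names is equally likely to be chosen.
sample = rng.sample(population, k=50)

print(len(sample), len(set(sample)))
```

Drawing without replacement matches pulling slips from a hat and not putting them back.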
What is an individual?
a person or object you are interested in finding out information about
observational study
a study based on data in which no manipulation of factors has been employed. The researchers don't change anything.
what is an experiment?
a study in which a treatment is applied and responses are observed
what is an observational study?
a study in which data is collected without anything being done to the subjects
What is a sample?
a subset of the population. Ideally it looks just like the population, but it contains less data.
Rigorously Controlled Design
carefully assign subjects to different treatment groups, so that those given each treatment are similar in ways that are important to the experiment
What is descriptive statistics?
collect and organize data
What is a cross-sectional study?
data observed, measured, or collected at one point in time.
what is a ratio?
data that is interval, but you can also divide one value by another and that ratio makes sense. You can now do all arithmetic on this data. Examples of this are height, weight, distance and time.
what is an interval?
data that is ordinal, but you can now subtract one value from another, and that subtraction makes sense. You can do arithmetic on this data, but only addition and subtraction. Examples of this are temperature and time on a clock.
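A small Python sketch of why subtraction makes sense for interval data while division does not. The two temperatures are made-up values, and the Kelvin conversion is added here only to contrast with a scale that has a true zero.

```python
# Hypothetical temperatures in degrees Celsius (interval data).
monday_c = 10.0
tuesday_c = 20.0

# The difference is meaningful: Tuesday is 10 degrees warmer.
difference_c = tuesday_c - monday_c

# The quotient is 2.0, but "twice as hot" is not a valid conclusion,
# because 0 degrees Celsius does not mean "no temperature".
naive_ratio = tuesday_c / monday_c

# Kelvin has a true zero, so it behaves as ratio data.
monday_k = monday_c + 273.15
tuesday_k = tuesday_c + 273.15
kelvin_ratio = tuesday_k / monday_k  # about 1.035, not 2

print(difference_c, naive_ratio, round(kelvin_ratio, 3))
```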
convenience sample
not statistically valid; chosen only because it is easy. ex. selecting people you know; choosing the people in the front row
Replication
repeating the essence of a research study, usually with different participants in different situations, to see whether the basic finding extends to other participants and circumstances
What is a population?
the set of all measurements in which we are interested
what is a Variable (Random variable)?
the measurement or observation of the individual
statistical significance
the results are probably not due to chance
What is Statistics?
the study of how to collect, organize, analyze, and interpret data collected from a group
Matched Pairs Design
the treatments are given to two groups that can be matched up with each other in some way. ex. having the same person complete a driving simulator while sober and while drunk. ex. Smoking cessation pill: it matters whether you are a heavy smoker or not, so subjects are matched on that.
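For the driving simulator example, a matched pairs analysis usually works with the difference inside each pair. A minimal sketch, assuming five drivers and scores that are invented purely for illustration:

```python
from statistics import mean

# Hypothetical simulator scores for the same five drivers (invented data).
sober = [82, 75, 90, 68, 77]
impaired = [70, 66, 85, 60, 69]

# Each driver is compared with himself or herself, so the
# quantity of interest is the within-pair difference.
differences = [s - d for s, d in zip(sober, impaired)]
mean_difference = mean(differences)

print(differences, mean_difference)
```

Because each person serves as their own match, differences between people (skill, age, experience) cancel out of each pair.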
Experiment
when the investigator changes or imposes a treatment to determine its effect. Intended to establish cause and effect.
hidden bias
when the questions are asked in a way that makes a person respond a certain way. ex. "Should illegal immigrants be prosecuted and deported for being in the US illegally or not?" 69% favored deportation. "Should illegal immigrants who have worked in the US for 2 years be given a chance to keep their jobs and eventually apply for legal status?" 62% favored giving them a chance.
lurking or confounding variables
when you cannot rule out the possibility that the observed effect is due to some other variable rather than the factor being studied.
Overgeneralization
when you do a study on one group and try to say that the result will hold for all groups. ex. just because it works on a rat doesn't mean it will work on a person
Voluntary response
where people are asked to respond via phone, email or online. The problem with these is that only people who really care about the topic are likely to call or email. ex. Questionnaire response: preparing for her book "Women and Love", the author sent questionnaires to 100,000 women asking about love, sex and relationships. Only 4.5% responded. The author used those responses to write her book. Question: Who is more likely to respond? A: The women who are interested in the content of the book.
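The size of the problem in the questionnaire example is easy to work out. A short sketch using only the two figures given above (100,000 sent, 4.5% returned); the variable names are made up for the illustration.

```python
questionnaires_sent = 100_000
response_rate = 0.045  # 4.5% responded

respondents = round(questionnaires_sent * response_rate)
non_respondents = questionnaires_sent - respondents

# The book rests on the first number; the second group was never heard from.
print(respondents, non_respondents)  # 4500 95500
```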
Non-response
where you send out a survey but not everyone returns the survey
What is a retrospective study?
A retrospective study uses existing data that have been recorded for reasons other than research; it looks back on past events or situations. Data are collected from the past using records, interviews and other similar artifacts. ex. The incidence rate of smoking among people who have died from heart disease is determined by investigating past medical records.
Practical (or clinical) significance
Does the effect really matter? ex. A pill decreases cholesterol level by 2 points, but it is very expensive.
what is a Qualitative or Categorical Variable?
answer is a word or name that describes a quality of the individual
randomized two-treatment experiment
Subjects are randomly assigned to one of two treatments. Control is a key concept. A placebo is often used for the control group.
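Random assignment to a treatment group and a placebo (control) group can be sketched as follows. The 20 subject labels and the seed are assumptions for the example.

```python
import random

# Hypothetical subjects (labels invented for illustration).
subjects = [f"subject_{i}" for i in range(1, 21)]

rng = random.Random(0)  # fixed seed so the assignment can be repeated
shuffled = subjects[:]  # copy so the original list is left untouched
rng.shuffle(shuffled)

# First half gets the treatment, second half gets the placebo.
half = len(shuffled) // 2
treatment_group = shuffled[:half]
placebo_group = shuffled[half:]

print(len(treatment_group), len(placebo_group))
```

Shuffling first means chance, not the experimenter, decides who lands in which group.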
What are the "Family of Numbers"?
Counting numbers Measuring numbers
What is an ordinal?
Data is qualitative. Data that is nominal, but you can put the data in order, since one value is more or less than another value. You cannot do arithmetic on this data, but you can put the data values in order. Examples of this are grades (A, B, C, D, F), place in a race (1st, 2nd, 3rd), size of a drink (small, medium, large), and the finishers of the Kentucky Derby.
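The drink-size example can be put in order in Python by giving each category a rank. The rank numbers and the list of orders are assumptions for this sketch; the ranks only record position, so adding or averaging them would not be meaningful.

```python
# Rank of each category: used only for ordering, not for arithmetic.
size_rank = {"small": 0, "medium": 1, "large": 2}

# Hypothetical drink orders (invented for illustration).
orders = ["large", "small", "medium", "small"]

sorted_orders = sorted(orders, key=size_rank.get)
print(sorted_orders)  # ['small', 'small', 'medium', 'large']
```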
What is a nominal?
Data is qualitative. Data is just a name or category, with no order or numbers. Examples of this are gender, car name, ethnicity and race, dog group (such as working group or hound group)
Example 1.1.3: A biologist wants to estimate the average height of a plant that is given a new plant food. She gives 10 plants the new plant food. State the individual, variable, population, sample, parameter, and statistic.
Individual: a plant. Variable: the height of a plant when given the new plant food (quantitative). Population: all the heights of plants when the new plant food is used. Sample: 10 heights of plants when the new plant food is used. Parameter: average height of all plants when the new plant food is used. Statistic: average height of the 10 plants.
Example 1.1 : In 2010, the Pew Research Center questioned 1500 adults in the U.S. to estimate the proportion of the population favoring marijuana use for medical purposes. It was found that 73% are in favor of using marijuana for medical purposes. State the individual, variable, population, sample, parameter, and statistic.
Individual: a U.S. adult. Variable: the response to the question of whether marijuana should be used for medical purposes (qualitative). Population: all responses of adults in the U.S. Sample: the 1500 responses of U.S. adults who were questioned. Parameter: percentage who favor marijuana for medical purposes, calculated from the population. Statistic: percentage who favor marijuana for medical purposes, calculated from the sample (73%).
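The statistic in this example can be recomputed from individual responses. In the sketch below, the count of "yes" answers is derived from the reported 73% of 1500; since 73% may be a rounded figure, that count is an assumption and not a number from the survey itself.

```python
n = 1500  # adults questioned

# Assumed count of "yes" responses, derived from the reported 73%.
favor_count = round(0.73 * n)

# Each individual's response is a word (qualitative data).
responses = ["yes"] * favor_count + ["no"] * (n - favor_count)

# The statistic summarizes the sample as a single number.
sample_proportion = responses.count("yes") / len(responses)

print(favor_count, sample_proportion)
```

The population proportion (the parameter) stays unknown; the sample proportion is only an estimate of it.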
Order of Question
Order makes people consider previous questions when answering the current one. ex. "How many dates did you have last month?" followed by "How happy are you with life in general?" In this order, the relationship between the answers is relatively high (many dates = happy, few dates = unhappy). In reverse order, the relationship between the answers is very low.
what is an experimental unit?
The people, animals or things on whom experiments are performed
sampling error
This is the difference between the sample results and the true population results. This can happen by chance, which is why replication of studies is important.
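A simulation can show sampling error arising by chance alone. Everything in this sketch is assumed for illustration: the population is 10,000 simulated plant heights drawn from a normal distribution with mean 30 cm and standard deviation 5 cm, and five samples of size 10 are taken.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run can be repeated

# Hypothetical population of plant heights in cm (simulated, not real data).
population = [rng.gauss(30, 5) for _ in range(10_000)]
parameter = mean(population)  # true population mean

errors = []
for _ in range(5):
    sample = rng.sample(population, k=10)  # simple random sample of 10
    statistic = mean(sample)               # sample mean
    errors.append(statistic - parameter)   # sampling error for this sample

print([round(e, 2) for e in errors])
```

Each sample gives a different error even though the method never changes, which is the reason a single study is not the last word.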
cause and effect
When people decide that one variable causes the other just because the variables are related. ex. As jacket sales increase, ice cream sales decrease; temperature is a lurking variable.
Randomized Block Design
a block is a group of subjects that are similar, but the blocks differ from each other. Treatments are randomly assigned to subjects inside each block.
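A minimal sketch of a randomized block design, assuming two blocks (heavy and light smokers, borrowing the smoking cessation pill idea) with four invented subjects each. The shuffle happens separately inside each block.

```python
import random

rng = random.Random(7)  # fixed seed so the assignment can be repeated

# Hypothetical blocks: subjects within a block are similar to each other.
blocks = {
    "heavy_smokers": ["h1", "h2", "h3", "h4"],
    "light_smokers": ["l1", "l2", "l3", "l4"],
}

assignment = {}
for block_name, members in blocks.items():
    shuffled = members[:]
    rng.shuffle(shuffled)  # randomize within this block only
    half = len(shuffled) // 2
    for subject in shuffled[:half]:
        assignment[subject] = "treatment"
    for subject in shuffled[half:]:
        assignment[subject] = "placebo"

print(assignment)
```

Every block ends up with the same split between treatment and placebo, so the blocking factor cannot be confused with the treatment effect.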
what is a Parameter?
a number calculated from the population. This is a fixed, unknown number that you want to find. Parameters are numbers that summarize data for an entire population.
what is a Statistic?
a number calculated from the sample. It is used to estimate the parameter value. Statistics are numbers that summarize data from a sample, i.e. some subset of the entire population.
What is inferential statistics?
analyze and interpret data
what is a Quantitative or numerical variable?
answer is a number, something that can be counted or measured from the individual. If it's a number, it's probably quantitative, unless it makes no sense to do arithmetic on it.
Branches of Statistics
descriptive and inferential
cluster sample
divide into clusters, randomly select clusters, and sample everyone in the selected clusters. ex. Checking trees for insect infestation: it is hard to check random trees throughout the forest, so divide the forest up into groups, randomly select a group, and check every tree in that group for insect infestation. ex. In a study, the sample is chosen by dividing the population into voting precincts, and sampling everyone in the precincts selected.
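The forest example can be sketched as follows. The forest of 6 plots with 5 trees each, and the choice of 2 plots, are assumptions invented for the illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the selection can be repeated

# Hypothetical forest: 6 plots (clusters), each containing 5 trees.
clusters = {
    f"plot_{p}": [f"plot_{p}_tree_{t}" for t in range(1, 6)]
    for p in range(1, 7)
}

# Step 1: randomly select whole clusters.
chosen_plots = rng.sample(sorted(clusters), k=2)

# Step 2: check every tree in the selected clusters.
sample = [tree for plot in chosen_plots for tree in clusters[plot]]

print(chosen_plots, len(sample))
```

The randomness is applied to the plots, not to individual trees, which is what makes the fieldwork easier.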
stratified sample
divide into strata, then randomly select within each stratum. ex. divide into male and female, then randomly choose from each group. ex. In a study, the sample is chosen by separating all cars by size, and selecting 10 of each size grouping.
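The cars-by-size example can be sketched like this. The three size groupings and the number of cars in each are assumptions made up for the illustration; only the "10 of each size" comes from the example.

```python
import random

rng = random.Random(5)  # fixed seed so the selection can be repeated

# Hypothetical strata: cars separated by size (counts invented).
strata = {
    "compact": [f"compact_{i}" for i in range(40)],
    "midsize": [f"midsize_{i}" for i in range(60)],
    "full_size": [f"full_size_{i}" for i in range(25)],
}

# Randomly select 10 cars from every stratum.
sample = {size: rng.sample(cars, k=10) for size, cars in strata.items()}

print({size: len(chosen) for size, chosen in sample.items()})
```

Unlike a cluster sample, every group is represented, and the random selection happens inside each group.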
census
every individual of interest is measured.
systematic sample
every nth individual is selected. ex. choose every 10th item off a production line to inspect. ex. In a study, the sample is chosen by surveying every 3rd driver coming through a tollbooth.
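The tollbooth example can be sketched with list slicing. The 30 drivers are invented, and picking a random starting point among the first 3 drivers is a common convention that is assumed here; the notes only say "every 3rd driver".

```python
import random

rng = random.Random(11)  # fixed seed so the selection can be repeated

# Hypothetical drivers in the order they reach the tollbooth.
drivers = [f"driver_{i}" for i in range(1, 31)]

step = 3                      # survey every 3rd driver
start = rng.randrange(step)   # assumed: random start among the first 3
sample = drivers[start::step]

print(sample)
```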
what is a treatment?
experimental conditions imposed on the subjects
What are measuring numbers?
fractions, decimals, scientific notation, ... For any two distinct (different) numbers, there's always one between.
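The "always one between" property can be checked with the midpoint. A small sketch using exact fractions; the two starting values are arbitrary choices for the illustration.

```python
from fractions import Fraction

# Two distinct measuring numbers (arbitrary example values).
a = Fraction(1, 3)
b = Fraction(1, 2)

# The midpoint always lies strictly between two different numbers.
between = (a + b) / 2

print(between)  # 5/12
```

The same step can be repeated on a and the midpoint without end, which is not possible for counting numbers such as 3 and 4.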
What are counting numbers?
integers, whole numbers, natural numbers; no "in-betweenies"
what is a prospective study? (longitudinal or cohort)
A prospective study looks toward events likely to happen at a future date. It watches for outcomes, such as the development of a disease, during the study period and relates these to other factors such as suspected risk or protection factors. Data are collected in the future from groups sharing common factors. ex. The college will collect data by surveying you yearly for the next 10 years.