Stats Prelim 1

statistics

a body of principles for designing the process of data collection and making inferences about the population from information in the sample

experiment

a study where a set of conditions under a SPECIFIC PROTOCOL is established to evaluate the implications for response variables; the researcher controls the conditions, so an experiment can show causality

observational study

a study where the researcher observes and CANNOT CONTROL the conditions to which the observational units are exposed

what does simple random sample result in

a) every unit of the population has the same probability of being included in the sample b) the units chosen for the sample are chosen independently from one another

double blind experiment

an experiment where neither the subjects nor the individuals measuring the response know which treatments were assigned to which subjects

sample

any subset of measurements from the population that are actually collected - data from part of population

measurement/response bias

bias incurred when a method of observation produces values different from the true value of the observational unit, e.g. an uncalibrated instrument

selection bias

bias incurred when a sample systematically excludes some part of the population, so it doesn't represent the population

nonresponse bias

bias incurred when data are not obtained from all observational units selected for the study, e.g. a self-selected sample: the people who are most motivated (good or bad) are the ones who write reviews

bar chart is good for what kind of variable (categorical or quantitative)

categorical

types of variables (2)

categorical, quantitative

categorical variable

a characteristic or trait that can only be assigned to categories

what are the three alternatives to simple random sampling

cluster sampling, stratified sampling, systematic sampling

what does relative frequency allow for

comparison of data with different sample sizes

census

data from entire population

frequency distribution

display of the frequency of each category in categorical data

cluster sampling

divide the population of units into distinct subgroups, or clusters; the CLUSTERS are randomly sampled and ALL units in each selected cluster are observed

stratified sampling

divide the population of units into distinct subgroups, or strata, and then sample independently from each stratum; sample every group but not everyone in each group

single blind experiment

an experiment where the subjects do not know which treatment they receive, but the individuals measuring the response know which treatments were assigned to which subjects

what happens if the population is systematically arranged? e.g. students standing with friends

it leads to selection bias; for systematic sampling to be unbiased, the population must be in a random arrangement

what is an example of selection bias

1. looking for the average level of happiness in Ithaca but not asking people at Cornell 2. a mail survey of all US people, which excludes homeless people

can an observational study show causality

no

convenience sampling

sampling with no random mechanism, e.g. sampling only through phone calls, which reaches only people with telephones

extraneous variable

a variable that is not an explanatory variable but affects the response variable

is the effect of marital status on the wellbeing of older adults in China an example of an observational study or an experiment?

observational study

sampling without replacement

once a unit is selected for the sample it may not be sampled again

what kinds of studies are often subject to nonresponse bias

online surveys: the only people in the sample are people who want to fill out the survey

mode

peak of histogram

experimental unit

the physical entity or subject to which a treatment is applied, independently of other units; the treatment condition affects one unit independently

example of stratified sampling

population of units = Cornell undergrads; strata = freshmen, sophomores, juniors, seniors; take a random sample from each stratum
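The class-year example above can be sketched in Python. This is only an illustration under stated assumptions: the function name `stratified_sample`, the made-up student IDs, and the per-stratum sample sizes are hypothetical and not from the course.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw an independent simple random sample from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical rosters: made-up student IDs grouped by class year.
strata = {
    "freshman": [f"fr{i}" for i in range(40)],
    "sophomore": [f"so{i}" for i in range(30)],
    "junior": [f"ju{i}" for i in range(30)],
    "senior": [f"se{i}" for i in range(20)],
}
# Sample sizes may differ between strata.
sizes = {"freshman": 4, "sophomore": 3, "junior": 3, "senior": 2}

stratified = stratified_sample(strata, sizes, seed=0)
for year, students in stratified.items():
    print(year, students)
```

Every stratum appears in the result, but only some of its units do, which matches "sample every group but not everyone in each group".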

a dot plot is useful when:

the data are quantitative and there are relatively few (fewer than 20-25) observations

what kind of data are dot plots, histograms, stem-and-leaf diagrams, and boxplots good for

quantitative variables

graphical summary of data with two continuous variables

scatter plot

graphical summary of data with one categorical, one continuous

side by side boxplot

descriptive statistics

the branch of statistics that is concerned with collecting, summarizing and describing data

inferential statistics

the branch of statistics that is concerned with making inferences about the population from data contained in a sample

variables

the characteristics that have been measured or observed

population of units

the collection of units in which we have scientific interest

observational units (cases)

the entity from which we observe and measure characteristics

number of replications

the number of experimental units to which a treatment has been independently applied

sample size (n)

the number of observations in the sample

frequency

the number of occurrences (count) of each category

observational unit

the physical entity on which a response variable is measured; it may be a sample taken from the experimental unit, e.g. for happiness on a scale of 1-5, observational unit = person

randomization

the random assignment of treatments to experimental units
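As a rough sketch of random assignment, the Python below shuffles the experimental units and deals them out to treatments in turn. The function name `randomize`, the unit labels, and the treatment names are hypothetical and chosen only for illustration.

```python
import random

def randomize(units, treatments, seed=None):
    """Randomly assign treatments to experimental units."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order of units
    # Deal the shuffled units to treatments in turn, so each treatment
    # gets the same number of units when the counts divide evenly.
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

units = [f"plant{i}" for i in range(8)]      # hypothetical experimental units
treatments = ["fertilizer", "control"]       # hypothetical treatments

assignment = randomize(units, treatments, seed=0)
for unit in units:
    print(unit, "->", assignment[unit])
```

Here each treatment is applied to four units, so the number of replications per treatment is 4.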

treatments

the set of circumstances created for the experiment in response to research hypothesis

simple random sample

the simplest way to draw a sample for statistical inference; a simple random sample of n units is a sample in which every possible set of n units has THE SAME chance of being selected
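A minimal Python sketch of drawing a simple random sample, assuming the population of units is available as a list. The helper name `simple_random_sample` and the unit IDs are hypothetical; `random.Random.sample` draws without replacement, so no unit can appear twice.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every set of n units is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(units), n)

population_units = list(range(1, 101))  # hypothetical unit IDs 1..100
srs = simple_random_sample(population_units, n=10, seed=1)
print(srs)
```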

bias in sampling

the tendency in samples to differ from the population from which they were drawn in a systematic way

response variable

the variable of primary interest in the experiment

explanatory variable

the variables that have values controlled by the experimenter (independent variable)

why repeat treatment

to show that the observed effect is a result of the treatment, not of the individual who got treated

(a) all the trees in Sapsucker Woods (b) the species of all the trees in Sapsucker Woods (c) the species of all the trees within 100 feet of Sapsucker Woods pond: which is the population of units, which is the population, and which is the sample?

trees in Sapsucker Woods = population of units; species of the trees in Sapsucker Woods = population; species of all the trees near the pond = sample

true or false: obs unit can be an experimental unit but not always

true

true or false: you can have different sample sizes in each stratum?

true

(a) all the undergraduate students at Cornell (b) the height of all undergraduate students at Cornell (c) the height of undergraduate students taking ILR Stats: which is the population of units, which is the population, and which is the sample?

undergrads at Cornell = population of units; height of undergrads = population; height of undergrads in ILRST = sample

blocking

using extraneous variables to create groups (blocks) of similar units; all experimental conditions (treatments) are then tried in each block, which accounts for the different conditions in each block (the extraneous variable is recognized), e.g. different types of soil: how does the treatment do in the wet block, and how does it do in the dry block?
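One way to picture blocking is to randomize the treatments separately inside each block, so that every treatment is tried in every block. The Python below is a sketch under the assumption that each block holds exactly as many units as there are treatments; the plot labels, the treatment names, and the function name `randomized_block_assignment` are hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, assign every treatment to one unit in random order."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in blocks.items():
        order = list(treatments)
        rng.shuffle(order)  # fresh random order for this block
        for unit, treatment in zip(units, order):
            assignment[unit] = (block, treatment)
    return assignment

# Hypothetical blocks built from an extraneous variable (soil type).
blocks = {
    "wet": ["wet_plot1", "wet_plot2", "wet_plot3"],
    "dry": ["dry_plot1", "dry_plot2", "dry_plot3"],
}
treatments = ["A", "B", "C"]  # hypothetical treatments

blocked = randomized_block_assignment(blocks, treatments, seed=0)
for unit, (block, treatment) in blocked.items():
    print(block, unit, treatment)
```

Because each block sees all three treatments, a difference between wet and dry soil is not confounded with a difference between treatments.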

quantitative variable

variable that is naturally numeric and for which arithmetic operations make sense

confounding variables

variables whose effects can't be distinguished from one another

population

(statistical) the set of all MEASUREMENTS or record of some QUALITATIVE trait corresponding to each unit in the collection of units

ch. 2

-----

what are the two types of quantitative variables

1. continuous: takes values in any interval, e.g. physical measurements (height, weight) 2. discrete: takes values that are distinct numbers (integers), e.g. counts (bacteria in a petri dish) or age in years

what are the two types of categorical data

1. ordinal: categories have an intrinsic order, e.g. movie ratings 2. nominal: categories are assigned a numerical code without any intrinsic order, e.g. blood type A=1, B=2, AB=3, O=4

what are the different kinds of biases in sampling (3)

1. selection bias 2. measurement or response bias 3. nonresponse bias

what are the 6 steps of the data analysis process

1. set clearly defined goals for the research study 2. make a plan of what data to collect and how to collect it 3. collect the data 4. data summary and preliminary analysis 5. apply appropriate methods for formal data analysis 6. interpret the information and draw conclusions

what do dotplots show

frequency of observations in a sample

relative frequency

frequency/ total number of observations (count/n)
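The count/n formula can be checked with a few lines of Python. The blood-type observations below are made up for illustration, and the helper name `relative_frequencies` is hypothetical.

```python
from collections import Counter

def relative_frequencies(observations):
    """Return count/n for each category in a list of categorical observations."""
    counts = Counter(observations)   # frequency of each category
    n = len(observations)            # sample size
    return {category: count / n for category, count in counts.items()}

blood_types = ["A", "O", "O", "B", "A", "O", "AB", "O"]  # made-up sample, n = 8
rel_freq = relative_frequencies(blood_types)
print(rel_freq)
```

Because the relative frequencies always sum to 1, samples of different sizes can be compared on the same scale.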

example of cluster sampling

freshmen at Cornell are divided into clusters based on residence hall (Dickson, Donlon, Mews, etc.); Dickson and Donlon are selected, and then all students in those two halls are sampled
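The residence-hall example can be sketched in Python as follows. The rosters are made up, and the function name `cluster_sample` is hypothetical; the point is only that the random step picks whole clusters, and then every unit in a chosen cluster is observed.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep ALL units in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical rosters keyed by residence hall.
halls = {
    "Dickson": [f"dickson_{i}" for i in range(5)],
    "Donlon": [f"donlon_{i}" for i in range(4)],
    "Mews": [f"mews_{i}" for i in range(6)],
}

chosen_halls, sampled_students = cluster_sample(halls, n_clusters=2, seed=0)
print(chosen_halls)
print(sampled_students)
```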

systematic sampling

given a list or map of the units, randomly choose a starting unit and then sample every kth unit, where k = N/n (population size divided by sample size)
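A short Python sketch of the procedure, assuming the units are stored in a list and that n is no larger than N. The function name `systematic_sample` and the unit IDs are hypothetical, and rounding k down when N/n is not a whole number is an assumption of this sketch.

```python
import random

def systematic_sample(units, n, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    k = len(units) // n        # sampling interval k = N/n, rounded down
    start = rng.randrange(k)   # random starting position in 0..k-1
    return [units[start + i * k] for i in range(n)]

listed_units = list(range(1, 101))  # hypothetical list of N = 100 unit IDs
systematic = systematic_sample(listed_units, n=10, seed=0)
print(systematic)
```

With N = 100 and n = 10 the interval is k = 10, so consecutive sampled IDs are always 10 apart.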

what is the graphical summary of data with two categorical variables

grouped or segmented bar chart

clusters should be homogeneous or heterogeneous

heterogeneous

direct control

holding extraneous variables constant so that their effects aren't confounded with those of the treatments

strata should be homogeneous or heterogeneous

homogeneous (e.g. all freshmen)

replication

independent repetition of a treatment to two or more experimental units

