Statistics Terms
GIGO
garbage in, garbage out. If data is not valid, then the statistical analysis is not useful.
Observational Study
observe and gather data without attempting to modify the subjects.
Controlling Effects of Variables
planning experiments that minimize the possibility of confounding.
Voluntary (Self-selected) Response Sample
sample in which the respondents decide whether to be included.
Randomization
the assignment of subjects to different experimental groups using random selection.
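A minimal sketch of randomization as defined above, assuming two groups of roughly equal size. The function name `randomize`, the group labels, and the subject IDs are hypothetical and chosen only for illustration.

```python
import random

def randomize(subjects, groups=("treatment", "control"), seed=None):
    """Randomly assign subjects to groups of (nearly) equal size."""
    rng = random.Random(seed)      # seeded generator, so the run is repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # random order removes any selection pattern
    assignment = {g: [] for g in groups}
    for i, subject in enumerate(shuffled):
        # Deal the shuffled subjects out to the groups in turn.
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

subjects = [f"S{i}" for i in range(1, 11)]   # hypothetical subject IDs
assignment = randomize(subjects, seed=0)
```

Shuffling first and then dealing subjects out in turn keeps the group sizes balanced while leaving the group membership to chance.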
Population
the collection of all elements to be studied.
Confounding
the inability to distinguish among effects of various factors in an experiment.
Replication
the repetition of an experiment on multiple subjects.
Statistics
the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
Blinding
the technique of withholding from the subject the information about which treatment they are receiving.
The Process of Statistics
(1) Identify the research objective. (2) Collect the data needed to answer the questions posed in step 1. (3) Describe the data. (4) Perform inference.
Descriptive Statistics
a branch of statistics that involves the organization, summarization, and display of data. This is the first third of this course.
Inferential Statistics
a branch of statistics that involves using a sample to draw conclusions about a population. This is the last third of this course, which is built upon the first two thirds.
Census
a collection of data from every member of the population.
Parameter
a numerical summary of a population.
Statistic
a numerical summary of a sample.
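The difference between a parameter and a statistic can be sketched with simulated data. The population of ages below is made up for illustration, and the mean is just one possible numerical summary.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical population: 10,000 simulated ages (not real data).
population = [rng.randint(18, 90) for _ in range(10_000)]

# Parameter: a numerical summary computed from the whole population.
parameter = mean(population)

# Statistic: the same summary computed from a sample of the population.
sample = rng.sample(population, 100)
statistic = mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The statistic will usually be close to the parameter but not equal to it, and it changes from sample to sample, while the parameter stays fixed.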
Probability Sample
a sample in which each individual member of a population has a known chance of being chosen.
Random Sample
a sample in which each individual member of a population is equally likely to be chosen.
Simple Random Sample
a sample of n subjects selected in such a way that every possible sample of size n is equally likely to be chosen.
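One way to draw a simple random sample is with the standard library's `random.sample`, which selects n distinct members without replacement. The helper name `simple_random_sample` and the member IDs are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)

population = list(range(1, 501))          # hypothetical member IDs 1..500
srs = simple_random_sample(population, 25, seed=1)
```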
Systematic Sample
a sample selected by choosing a starting point in the list of subjects and then selecting every kth subject from that point on.
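A short sketch of systematic sampling. The definition above does not say how the starting point is chosen; picking it at random among the first k subjects is an assumption made here, and the function name is hypothetical.

```python
import random

def systematic_sample(subjects, k, seed=None):
    """Pick a starting point, then take every k-th subject after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # assumption: random start among the first k
    return subjects[start::k]     # every k-th subject from the start onward

subjects = list(range(100))       # hypothetical ordered list of subjects
sys_sample = systematic_sample(subjects, k=10, seed=1)
```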
Convenience Sample
a sample selected by choosing subjects that are most easily accessed.
Stratified Sample
a sample selected by dividing the population into at least two subgroups and then randomly choosing subjects from each subgroup.
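Stratified sampling might be sketched as follows. Taking the same number of subjects from each subgroup is an assumption (proportional allocation is another common choice), and the names `stratified_sample` and `stratum_of` as well as the student data are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Split the population into strata, then sample randomly within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)   # group by subgroup
    sample = []
    for key in sorted(strata):                      # fixed order for repeatability
        sample.extend(rng.sample(strata[key], n_per_stratum))
    return sample

# Hypothetical population: 100 students, 25 in each class year.
years = ["freshman", "sophomore", "junior", "senior"] * 25
students = [(f"student{i}", year) for i, year in enumerate(years)]
strat = stratified_sample(students, stratum_of=lambda s: s[1],
                          n_per_stratum=5, seed=3)
```

Every subgroup is guaranteed to appear in the sample, which is the main difference from a simple random sample of the same size.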
Cluster Sample
a sample selected by dividing the population into at least two subgroups, randomly selecting subgroups, and then choosing all subjects from the selected subgroups.
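For contrast with stratified sampling, a cluster sample picks whole subgroups at random and keeps everyone in them. The function name `cluster_sample` and the precinct data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical population: 10 precincts with 20 voters each.
clusters = {f"precinct{i}": [f"voter{i}_{j}" for j in range(20)]
            for i in range(10)}
chosen, members = cluster_sample(clusters, n_clusters=3, seed=7)
```

In a stratified sample the randomness is applied inside every subgroup; here it is applied to the choice of subgroups.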
Cross-sectional Study
a study in which all data are observed, measured, and collected at one point in time.
Retrospective Study
a study in which all data are collected from past time periods, for example by examining records or interviewing subjects.
Prospective (Longitudinal) Study
a study in which data are observed, measured, and collected going forward over many time periods from groups (cohorts) sharing common factors.
Sample
a subset of members selected from a population.
Correlation does not imply Causation
an association between two variables does not by itself imply that one variable causes the other.
Experiment
apply some treatment, then observe the effects of the treatment on the subjects.
Variables
characteristics of the members of the population.
Data
counts, measurements, responses, etc. that have been collected for statistical analysis. Note: datum is the singular of data.
Graphical Representations
data can be misrepresented using deceptive graphical techniques.
Ratio Level Data
data that are interval level and also have a natural zero starting point (zero means none of the quantity is present), so that both differences and ratios are meaningful.
Interval Level Data
data that are ordinal level and also have meaningful differences between values, but no natural zero starting point.
Nominal Level Data
data that are qualitative only: names, labels, or categories that cannot be arranged in a meaningful order.
Ordinal Level Data
data that can be arranged in order (qualitative or quantitative), but differences between values are meaningless or cannot be computed.
Qualitative Data
data that can be separated into different categories that are distinguished by some nonnumeric characteristic.
Quantitative Data
data that consist of numbers representing counts or measurements.
Discrete Data
data that have a finite or countable number of possible values.
Continuous Data
data that have infinitely many possible values on a continuous scale, with no gaps between the possible values.