Algebra 2 Statistics Unit

Density curve

A curve that (a) is always on or above the horizontal axis, and (b) has an area of exactly 1 underneath it.

Block

A group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.

Regression line

A line that describes how a response variable y changes as an explanatory variable x changes; also known as "line of best fit"

Sampling Frame

A list of individuals from whom the sample is drawn

Stratified random sample

A method of sampling that involves dividing your population into homogeneous subgroups and taking a simple random sample in each subgroup. Internally homogeneous and externally heterogeneous.
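
As a rough illustration of the definition above, the following Python sketch takes a simple random sample inside every stratum. The function name `stratified_sample`, the grade-level strata, and the sample size of 5 are all made up for the example.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum individuals from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population, already divided into strata by grade level.
strata = {
    "grade_9": [f"g9_{i}" for i in range(40)],
    "grade_10": [f"g10_{i}" for i in range(35)],
    "grade_11": [f"g11_{i}" for i in range(25)],
}

sample = stratified_sample(strata, n_per_stratum=5, seed=0)
for grade, chosen in sample.items():
    print(grade, chosen)
```

Every stratum contributes to the sample, which is the point of stratifying.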

Sample

A relatively small proportion of people who are chosen in a survey so as to be representative of the whole.

Cluster Sample

A sample in which a simple random sample of heterogeneous subgroups of a population is selected. Internally heterogeneous and externally homogeneous.
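
For contrast with stratified sampling, here is a minimal Python sketch of a cluster sample: whole clusters are chosen at random, and everyone in a chosen cluster is included. The helper `cluster_sample` and the homeroom data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters whole clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical school with 8 homerooms of 20 students each.
clusters = {
    f"homeroom_{k}": [f"hr{k}_student_{i}" for i in range(20)]
    for k in range(1, 9)
}

picked = cluster_sample(clusters, n_clusters=2, seed=0)
for name, members in picked.items():
    print(name, len(members), "students")
```

Only some clusters appear in the sample, but each selected cluster appears in full.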

Simple Random Sample (SRS)

A sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected.
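
One way to draw an SRS in Python is `random.sample`, which selects without replacement so that every subset of size n is equally likely. The sampling frame of 30 students below is invented for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every size-n subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame of 30 students.
students = [f"student_{i}" for i in range(1, 31)]

srs = simple_random_sample(students, 5, seed=1)
print(srs)
```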

Voluntary Response Samples

A sample that consists of people who choose themselves by responding. Such samples often overrepresent people with strong opinions, which leads to bias.

Random sampling

A sampling method in which each member of the population has an equal chance of inclusion, so the sample tends to represent the population fairly

Convenience Sample

A sample which consists of members of a population that are easily accessed. Generally leads to bias.

Census

A study that attempts to collect data from every individual in the population.

Observational study

A study that merely observes conditions of individuals in a population and records information; the population is disturbed as little as possible.

Lurking variable

A variable that has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied.

Explanatory variable

A variable that may help explain or influence changes in a response variable.

Response variable

A variable that measures an outcome of a study.

Positive association

Above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.

Negative association

Above-average values of one variable tend to accompany below-average values of the other, and vice versa.

Completely Randomized Design

All experimental units have an equal chance of receiving each of the treatments

The 68-95-99.7 Rule

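Also known as the "Empirical Rule." In a Normal distribution, approximately 68% of the observations fall within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.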
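
The three percentages in the rule's name can be checked numerically. This short Python sketch uses the standard library's `statistics.NormalDist` to find the area of a Normal curve within 1, 2, and 3 standard deviations of the mean; the variable names are just for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve between -k and +k standard deviations.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```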

Sampling error

An error that occurs when a sample somehow does not represent the target population due to bad sampling methods and/or undercoverage

Factor

An explanatory variable in an experiment

Influential Point

An observation that, if removed, would markedly change the result of the calculation.

Outlier

An observation that lies outside the overall pattern of the other observations.

Confidentiality

Any information gathered about a participant must not be revealed without the participant's consent.

Random Assignment

Assigning participants to experimental and control conditions by chance, thus minimizing the effects of preexisting differences among those assigned to the different groups.
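
A minimal Python sketch of random assignment, assuming two groups of equal size: shuffle the subjects, then split the shuffled list in half. The subject labels and the helper name `randomly_assign` are hypothetical.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = subjects[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomly_assign(subjects, seed=42)
print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides who lands in each group, preexisting differences tend to balance out between the groups.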

Mean of a density curve

Balance point

Response Bias

Bias that occurs when the behavior of the respondent or of the interviewer causes inaccurate results

Elements of Experimental Design

Control, random assignment, and replication

Describing a scatterplot

Can be described by the direction, form, and strength of the relationship.

Experiment

Deliberately imposes some treatment on individuals to measure their responses. Causality can be inferred if carried out well.

Replication

Enough units in each group so that any difference in the effects of the treatments can be distinguished from chance differences between the groups. Reduces sample variability

Median of a density curve

Equal areas point

Anonymity

Even the researcher cannot link participants to their data

Placebo effect

Experimental results are caused by expectations alone; double blindness is intended to mitigate this effect.

Randomized block design

Form blocks consisting of individuals that are similar in some way that is important to the response. Random assignment of treatments is then carried out separately within each block.
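
The two steps described above (form blocks, then randomize separately within each block) might be sketched in Python as follows. The blocks, the treatment names, and the function `randomized_block_design` are assumptions made for the example.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Deal treatments out in rotation so each block is split as evenly as possible.
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks: subjects grouped by a trait expected to affect the response.
blocks = {
    "athletes": [f"athlete_{i}" for i in range(6)],
    "non_athletes": [f"non_athlete_{i}" for i in range(6)],
}

design = randomized_block_design(blocks, ["drug", "placebo"], seed=7)
for subject, (block, treatment) in sorted(design.items()):
    print(subject, block, treatment)
```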

Standard Normal distribution

Has mean 0 and standard deviation 1

Control group

In an experiment, the group that is administered a placebo treatment (an inactive treatment) or no treatment; its results are compared to those of the treatment group

Control

In an experiment, the standard that is used for comparison. Reduces the effect of lurking variables.

Normal Distribution

Is completely specified by two numbers, mean μ and standard deviation σ.

Least-squares regression line

Line that makes the sum of the squared vertical distances of the data points from the line as small as possible.
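
For a small data set the least-squares line can be computed directly. This Python sketch uses the standard formulas slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄; the five data points are invented for the example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line minimizing squared vertical distances."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: x is the explanatory variable, y the response.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x")   # y-hat = 2.20 + 0.60x
```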

describing a distribution of quantitative data

SOCS (Shape, Outliers, Center, Spread)

Double Blind

This term describes an experiment in which neither the subjects nor the experimenter knows whether a subject is a member of the experimental group or the control group.

Cumulative relative frequency graph

Used to examine location within a distribution. Completed graph shows the accumulating percent of observations

Inference about cause and effect

Using experimental results to draw conclusions about causality

Inference about the population

Using sample data to draw conclusions about the population

Nonresponse

When subjects refuse to cooperate or cannot be reached. This leads to nonresponse bias, a type of nonsampling error.

μ (mu)

a population mean

statistical question

a question that can be answered by collecting data and where there will be variability in that data

Single Blind

a study in which the participants are unaware of whether they are in the control group or the experimental group

Explanatory Variable

a variable that we think explains or causes changes in the response variable

mean

arithmetic average, measure of center, NOT RESISTANT MEASURE OF CENTER, average value

boxplot

based on 5 number summary, useful for comparing distributions, shows spread of central half of distribution

conditional distributions

describes the values of that variable among individuals who have a specific value of another variable. Can be displayed with a SIDE-BY-SIDE BAR GRAPH or a SEGMENTED BAR GRAPH

distribution

describes what values the variable takes and how often it takes them

skewed to the right

the right side of the graph (containing the larger values) is much longer than the left side

outlier

individual value that falls outside the overall pattern; it is an outlier if it is more than 1.5 x IQR above the third quartile or below the first quartile
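
The 1.5 x IQR rule can be sketched in Python as below. Quartiles are computed here as the medians of the lower and upper halves of the sorted data, which is one common classroom convention; other software may use slightly different quartile rules. The data set is made up.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves (middle value excluded)."""
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[(n + 1) // 2 :]
    return median(lower), median(upper)

def find_outliers(data):
    """Return values more than 1.5 x IQR below Q1 or above Q3."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

data = [2, 4, 5, 7, 8, 9, 10, 12, 30]   # hypothetical observations
print(quartiles(data))       # (4.5, 11.0)
print(find_outliers(data))   # [30]
```

Here IQR = 11 − 4.5 = 6.5, so the fences are −5.25 and 20.75, and only 30 falls outside them.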

dotplot

individual values on a number line; show distribution of a quantitative variable

standard deviation (s sub-x)

measures the typical distance of the observations from their mean; measures spread about the mean, always greater than or equal to 0, not resistant, use for reasonably symmetric distributions
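
A worked computation may help here. The Python sketch below follows the usual steps for the sample standard deviation: find the mean, square each deviation from it, add the squares, divide by n − 1, and take the square root. The eight data values are invented.

```python
import math

def sample_standard_deviation(data):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    squared_deviations = [(x - x_bar) ** 2 for x in data]
    variance = sum(squared_deviations) / (n - 1)
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations, mean 5
s_x = sample_standard_deviation(data)
print(round(s_x, 3))   # 2.138
```

The squared deviations sum to 32, so the variance is 32 / 7 and the standard deviation is its square root.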

IQR

measures the range of the middle 50% of the data; IQR=Q3-Q1; resistant

median (M)

midpoint of a distribution, typical value; resistant; in a skewed distribution, the mean is usually farther out in the long tail than the median

mode; modes

most frequent; major peaks

multimodal

multiple peaks

association

specific values of one variable tend to occur in common with specific values of the other

First Quartile (Q1)

one quarter up the list; resistant

two-way table

organizes data about two categorical variables; often used to summarize the large amounts of information by grouping outcomes into categories

histogram

plot the counts (frequencies) or percents (relative frequencies) of values in equal-width classes; show distribution of a quantitative variable

stemplot

separate each observation into a stem and a one-digit leaf; show distribution of a quantitative variable

numerical summary

should report at least its center and spread, or variability

unimodal

single peak

range

subtract the smallest value from largest value

mean

the average

relative frequency table

lists the categories of a categorical variable and gives the percent of individuals that fall in each category
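
As a small illustration, this Python sketch counts each category with `collections.Counter` and converts the counts to percents. The survey responses are hypothetical.

```python
from collections import Counter

def relative_frequency_table(values):
    """Map each category to the percent of individuals in that category."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical responses to "What is your favorite color?"
responses = ["red", "blue", "red", "green", "red",
             "blue", "red", "green", "blue", "red"]

table = relative_frequency_table(responses)
for category, percent in table.items():
    print(f"{category}: {percent:.0f}%")
```

The percents add to 100 because every individual falls in exactly one category.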

x-bar

the mean of a set of observations/sample (add their values and divide by the number of observations), use for reasonably symmetric distributions

symmetric

the right and left sides of the graph are approximately mirror images of each other

marginal distributions

the row totals and column totals

Experimental units

the smallest collection of individuals to which treatments are applied

Third Quartile (Q3)

three-quarters up the list; resistant

bimodal

two clear peaks

Correlation

Measures the direction and strength of the linear relationship between two quantitative variables.
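
The correlation r can be computed by hand for a small data set. This Python sketch uses r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²), which is algebraically the same as averaging the products of z-scores with a divisor of n − 1. The data are invented.

```python
import math

def correlation(xs, ys):
    """Correlation r between two quantitative variables."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical responses

r = correlation(xs, ys)
print(round(r, 3))   # 0.775
```

The positive sign shows a positive association, and r always lies between −1 and 1.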

five-number summary; summary of spread and center

Minimum, Q1, M, Q3, Maximum
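
A short Python sketch of the five-number summary follows. It assumes the convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data; calculators and software sometimes use other quartile rules. The data values are made up.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

data = [12, 1, 7, 3, 15, 4, 9, 6]   # hypothetical observations
print(five_number_summary(data))    # (1, 3.5, 6.5, 10.5, 15)
```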

Bias

Occurs when a study design favors some outcomes over others

Undercoverage

Occurs when some groups in the population are left out of the process of choosing the sample

Scatterplot

Plot that shows the relationship between two quantitative variables measured on the same individuals.

Statistically significant

Referring to a correlation, or a difference between two groups, that is larger than would be expected by chance alone.

Standardized values (z-scores)

Tells how many standard deviations a data point is from the mean, and in which direction
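
The computation is z = (value − mean) / standard deviation. A minimal Python sketch, using the sample mean and sample standard deviation of three hypothetical test scores:

```python
from statistics import mean, stdev

def z_scores(data):
    """Standardize each value: subtract the mean, divide by the standard deviation."""
    m = mean(data)
    s = stdev(data)
    return [(x - m) / s for x in data]

scores = [70, 80, 90]   # hypothetical test scores: mean 80, standard deviation 10
z = z_scores(scores)
print(z)
```

A score of 70 is one standard deviation below the mean (z = −1), and a score of 90 is one standard deviation above it (z = 1).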

Slope

The amount by which y is predicted to change when x increases by one unit.

Residual

The difference between an observed value of the response variable and the value predicted by the regression line.
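
In symbols, residual = observed y − predicted y. The Python sketch below computes residuals for five made-up points, assuming a regression line ŷ = 2.2 + 0.6x (which happens to be the least-squares line for these points).

```python
def predict(x, intercept=2.2, slope=0.6):
    """Predicted response from the assumed regression line y-hat = 2.2 + 0.6x."""
    return intercept + slope * x

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical observed responses

residuals = [y - predict(x) for x, y in zip(xs, ys)]
for x, y, res in zip(xs, ys, residuals):
    print(f"x={x}  observed={y}  predicted={predict(x):.1f}  residual={res:+.1f}")
```

A positive residual means the point lies above the line; a negative residual means it lies below. For a least-squares line the residuals sum to zero.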

Residual plot

A scatterplot of the residuals against the explanatory variable; helps us assess how well a regression line fits the data.

Confounding

The effect of some variable on the response variable cannot be separated from the effect of the explanatory variable.

Population

The entire aggregation of individuals from which samples can be drawn

Matched Pair

The most extreme form of blocking. Subjects are matched in pairs as closely as possible and each subject in a pair is randomly assigned to receive one of the treatments.

y intercept

The predicted value of y when x = 0.

Extrapolation

The use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line.

Predicted value

The value predicted by the regression model; written ŷ and read as "y hat"

Pth percentile

The value with P percent of the observations less than it.
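
Using the definition above, the percentile of a given value is the percent of observations that are less than it. A minimal Python sketch with ten hypothetical test scores:

```python
def percentile_of(value, data):
    """Percent of observations in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

scores = [52, 61, 67, 70, 74, 78, 81, 85, 90, 96]   # hypothetical test scores
print(percentile_of(85, scores))   # 70.0
```

Seven of the ten scores are below 85, so a score of 85 is at the 70th percentile of this data set.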

pie charts, bar graphs

display the distribution of a categorical variable

frequency table

lists the categories of a categorical variable and gives the count of individuals that fall in each category

