Statistics Study Guide


Factorial

n! = (n)(n-1)(n-2)...(3)(2)(1), with 0! = 1
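
As a quick check of the factorial definition, here is a minimal Python sketch. The helper name `factorial_product` is hypothetical; it multiplies the integers up to n, and the result can be compared with the standard library's `math.factorial`.

```python
import math


def factorial_product(n: int) -> int:
    """Return n! by multiplying 2 * 3 * ... * n (0! and 1! are both 1)."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


print(factorial_product(5))                        # 120
print(factorial_product(5) == math.factorial(5))   # True
```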

Formula for sample standard deviation

s = √[ ∑(xᵢ - x̄)² / (n - 1) ]

Formula for sample variance

s² = ∑(xᵢ - x̄)² / (n - 1)
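
The sample variance and sample standard deviation can be computed directly from their formulas. The sketch below assumes a small made-up dataset and a hypothetical helper `sample_variance`; it divides the sum of squared deviations by n - 1 and cross-checks against Python's `statistics` module.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative sample


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


s2 = sample_variance(data)   # sample variance
s = math.sqrt(s2)            # sample standard deviation

# Both checks should print True: the library uses the same n - 1 divisor.
print(math.isclose(s2, statistics.variance(data)))
print(math.isclose(s, statistics.stdev(data)))
```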

Sample mean

x̄ (x-bar): the mean of the sample data

Formula for sample z-score

z = (x - x̄)/s
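
A sample z-score subtracts the sample mean first and then divides by the sample standard deviation. A minimal sketch, assuming a made-up dataset and a hypothetical helper `z_score`:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative sample

x_bar = statistics.mean(data)    # sample mean
s = statistics.stdev(data)       # sample standard deviation (n - 1 divisor)


def z_score(x):
    """Number of sample standard deviations that x lies above the sample mean."""
    return (x - x_bar) / s


print(z_score(5))  # an observation equal to the mean has z = 0
print(z_score(9))  # positive: 9 lies above the mean
```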

Formula for population z-score

z = (x - µ)/σ

Formula for expected value

µ = E(X) = ∑ x p(x) = x₁p(x₁) + x₂p(x₂) + ...

Formula for population standard deviation

σ = √[ ∑(x - µ)² / N ]

Formula for variance of a discrete random variable

σ² = E[(X - µ)²] = ∑(x - µ)² p(x) = ∑x² p(x) - µ²
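
The expected value and the two equivalent variance formulas for a discrete random variable can be checked on a small example. The sketch below assumes X is the number of heads in two tosses of a fair coin (an illustrative choice, not part of the definitions) and uses exact fractions so the two variance forms can be compared exactly.

```python
from fractions import Fraction

# Assumed example: X = number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
assert sum(pmf.values()) == 1  # probabilities must sum to 1

# Expected value: sum of x * p(x)
mu = sum(x * p for x, p in pmf.items())

# Variance by definition: sum of (x - mu)^2 * p(x)
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Variance by the shortcut: sum of x^2 * p(x), minus mu^2
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

print(mu, var_definition, var_shortcut)  # 1 1/2 1/2
```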

Formula for population variance

σ² = ∑(x - µ)² / N

Descriptive statistics

A branch of statistics that uses numerical and graphical methods to look for patterns in a dataset and to present that information in a convenient form, without drawing conclusions beyond the data at hand

Inferential statistics

A branch of statistics that uses sample data to make estimates, predictions, and decisions about a population

Variable

A characteristic of interest about each individual element of a population or sample

Population

A collection of individuals or objects that is under study

Compound event

A composition of two or more events; it can be the result of a union or an intersection of events

Normal distribution

A symmetric, mound-shaped (bell-shaped) continuous distribution, determined by its mean µ and standard deviation σ, that is often used to model population distributions

Interval estimator (confidence interval)

A formula that tells us how to use the sample data to calculate an interval that estimates the target parameter.

Histogram

A graphical representation of a frequency distribution in which observations are divided into classes

Box plot

A graphical representation of the distribution using five numbers: minimum, lower quartile, median, upper quartile, maximum

Pie chart

A graphical summary of data that uses a circle partitioned into sectors to show the relative frequency of items in each class

Bar chart

A graphical summary of data that uses bars of fixed width and varying height to show the frequency or relative frequency of items in each class

Symmetric distribution

A distribution whose left and right halves mirror each other about the center; typically a mound- or bell-shaped distribution with a high frequency of observations in the middle of the range

Subjective probability

A probability assigned based on the subjective judgment of an individual

Theoretical probability

A probability obtained by defining the basic outcomes of the process, assigning probabilities to those basic outcomes, and then computing the probabilities of compound events

Experiment

A process that yields a single outcome that cannot be predicted with certainty

Bernoulli random variable

A random variable that can assume only two possible values: 1 (success) and 0 (failure)

Discrete random variable

A random variable that takes either finitely many values or a countably infinite set of values

Point estimator

A rule or formula that tells us how to use the sample data to calculate a single number that can be used as an estimate of the target parameter (example: sample mean).

Simple random sample

A sample selected so that each element in the population, and each possible sample of the same size, has an equal chance of being selected

Systematic sample

A sample in which the first element is picked at random, and then every kth element after it is picked

Cluster sample

A sample obtained by sampling some of, but not all of, the possible subdivisions within the population

Stratified random sample

A sample obtained by stratifying the sampling frame and then selecting a fixed number of elements from each stratum by simple random sampling

Countable set

A set whose elements can be put in one-to-one correspondence with the integers (many texts also count finite sets as countable)

Split stem plot

A stem and leaf plot in which some stems are split into two parts (for example, leaves 0-4 and leaves 5-9) so that no single row of leaves becomes too long

Sample

A subset of the population

Stem and leaf plot

A summary of data that divides each observation into two parts, with the leaves grouped on a stem

Relative frequency distribution

A tabular summary of data showing both the frequency and the relative frequency of items in each class

Frequency distribution

A tabular summary of data showing the frequency of items in each of several non-overlapping classes

Continuous random variable

A random variable whose set of possible values consists of one or more intervals on the number line

Random variable

A function that assigns a single numerical value to each outcome in the sample space, S

Percentile ranking

A way to express where an observation falls in a dataset using percentiles

Complementary event

All sample points that do not belong to the event

Binomial experiment

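An experiment in which n identical, independent trials are repeated, each trial has two possible outcomes (success or failure) with the same probability of success, and we are interested in the number of successes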

Outlier

An extreme observation that does not match the general pattern of a dataset

Real line

An interval of the form (-∞,∞)

Finite interval

An interval of the form (a,b) or [a,b], where a and b are finite numbers

Semi-infinite interval

An interval of the form [a,∞) or (-∞,b], for example [0,∞)

Interquartile range

A measure of spread equal to the upper quartile minus the lower quartile (Q3 - Q1); it gives the width of the middle 50% of a dataset

Event

Any subset of the sample space

Sample point

Basic outcome of an experiment

Independence

Events for which the occurrence of one does not change the probability of the other: P(A|B) = P(A), or equivalently P(AB) = P(A)P(B)

Mutually exclusive events

Events that share no sample points

Chebyshev's rule

For any number k > 1, at least a fraction (1 - 1/k²) of the data will lie within k standard deviations of the mean
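
Chebyshev's rule holds for any dataset, so it can be checked numerically. The sketch below assumes a small made-up dataset and a hypothetical helper `fraction_within`; it compares the observed fraction of data within k standard deviations of the mean with the guaranteed minimum 1 - 1/k².

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative data


def fraction_within(xs, k):
    """Fraction of observations within k standard deviations of the mean."""
    mean = statistics.mean(xs)
    sd = statistics.pstdev(xs)  # standard deviation of the data (divisor N)
    inside = sum(1 for x in xs if abs(x - mean) <= k * sd)
    return inside / len(xs)


for k in (1.5, 2, 3):
    bound = 1 - 1 / k ** 2
    observed = fraction_within(data, k)
    # The observed fraction should never fall below the Chebyshev bound.
    print(k, round(bound, 3), observed, observed >= bound)
```

The bound is usually far from tight: real datasets tend to have much more than the guaranteed fraction inside k standard deviations.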

Additive rule of probability

For any two events A and B, P(A∪B) = P(A) + P(B) - P(AB)
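
The additive rule can be verified by counting sample points. This sketch assumes the two-coin sample space {HH, HT, TH, TT} with equally likely outcomes, and takes A = "first toss is heads" and B = "second toss is heads" as illustrative events; `prob` is a hypothetical helper.

```python
from fractions import Fraction

sample_space = {"HH", "HT", "TH", "TT"}  # equally likely outcomes (assumed)
A = {"HH", "HT"}  # first toss is heads
B = {"HH", "TH"}  # second toss is heads


def prob(event):
    """Probability of an event as (points in event) / (points in sample space)."""
    return Fraction(len(event), len(sample_space))


lhs = prob(A | B)                        # P(A union B)
rhs = prob(A) + prob(B) - prob(A & B)    # P(A) + P(B) - P(AB)
print(lhs, rhs)  # 3/4 3/4
```

Subtracting P(AB) removes the double count of HH, which belongs to both A and B.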

The empirical rule

For data with a bell-shaped distribution, approximately 68% of the observations will be within one standard deviation of the mean, approximately 95% of the observations will be within two standard deviations of the mean, and approximately 99.7% of the observations will be within three standard deviations of the mean

Examples of sample points

HH, HT, TH, TT

Probability density function (PDF)

The function f(x) that gives the height of the density curve at x; probabilities are areas under this curve

Distribution

How the observations are spread over the range of the data

Area under the curve

How to determine probability using a probability histogram

Central limit theorem

If the sample size n is large, then the sampling distribution of x̄ is approximately normal with mean µ and variance σ²/n (standard deviation σ/√n)
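
A small simulation can illustrate the central limit theorem: the mean of many sample means should sit near µ, and their variance near σ²/n. The sketch below assumes the population is a fair six-sided die (µ = 3.5, σ² = 35/12), with n = 30 and 5,000 repetitions; the function name and the fixed seed are illustrative choices.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Return `reps` sample means, each computed from n rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.randint(1, 6) for _ in range(n)) / n
        for _ in range(reps)
    ]


n = 30
means = simulate_sample_means(n, reps=5000)

mean_of_means = statistics.mean(means)
var_of_means = statistics.pvariance(means)

print("mean of sample means:", mean_of_means, "vs mu =", 3.5)
print("variance of sample means:", var_of_means, "vs sigma^2/n =", (35 / 12) / n)
```

A histogram of `means` would look roughly bell-shaped even though a single die roll is uniformly distributed.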

Basic counting principle

In an experiment done in two independent stages, where Stage I has m possible outcomes and Stage II has n possible outcomes, the experiment can be performed in (m)(n) ways
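
The basic counting principle can be confirmed by listing every outcome. The sketch below assumes Stage I is a coin toss (m = 2) and Stage II is a die roll (n = 6), both chosen only for illustration, and enumerates the pairs with `itertools.product`.

```python
from itertools import product

stage_one = ["H", "T"]            # m = 2 possible outcomes
stage_two = [1, 2, 3, 4, 5, 6]    # n = 6 possible outcomes

# Every (Stage I, Stage II) pair is one way to perform the experiment.
outcomes = list(product(stage_one, stage_two))

print(len(outcomes))                      # 12
print(len(stage_one) * len(stage_two))    # 12
```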

Quartile

One of three numbers (Q1, Q2, Q3) that partition an ordered dataset into four equal parts

Permutation

Ordered arrangement

Multiplicative rule

P(AB)=P(A)P(B|A)=P(B)P(A|B)
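
The multiplicative rule can be checked by enumeration. The sketch below assumes an urn with 3 red and 2 blue balls and two draws without replacement, with A = "first ball is red" and B = "second ball is red"; the urn, the ball labels, and the helper `prob` are all illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R1", "R2", "R3", "B1", "B2"]   # assumed urn: 3 red, 2 blue
draws = list(permutations(balls, 2))      # 20 equally likely ordered pairs


def first_red(d):
    return d[0].startswith("R")


def second_red(d):
    return d[1].startswith("R")


def both_red(d):
    return first_red(d) and second_red(d)


def prob(event, given=lambda d: True):
    """P(event | given), found by counting within the draws where `given` holds."""
    pool = [d for d in draws if given(d)]
    return Fraction(sum(1 for d in pool if event(d)), len(pool))


p_ab = prob(both_red)                                         # P(AB)
via_a = prob(first_red) * prob(second_red, given=first_red)   # P(A)P(B|A)
via_b = prob(second_red) * prob(first_red, given=second_red)  # P(B)P(A|B)
print(p_ab, via_a, via_b)  # 3/10 3/10 3/10
```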

Empirical probability

P(Event) = Relative frequency of the event = Number of occurrences of an event/Number of times experiment is repeated

Range

The difference between the maximum and minimum observations

Stem

The digits of each observation, excluding the leaf (in a stem and leaf plot)

Union

The event that A or B or both occur; in a Venn diagram, everything inside either circle, including the intersection

Median

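The middle-most observation when the data are ordered (the average of the two middle observations when the number of observations is even)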

Mode

The most frequent observation

Binomial random variable

The number of successes in n trials

Intersection

The part of a Venn diagram where the circles overlap

Relative frequency

The proportion of the total number of observations belonging to the class

Leaf

The right-most digit of each observation (in a stem and leaf plot)

Leaf unit

The place value represented by each leaf digit in a stem and leaf plot, assumed to be 1 unless stated otherwise

Target parameter

The unknown population parameter that we are interested in estimating.

Qualitative data

Variables that are not numerical (in the true sense) but are categorized into various groups

Quantitative data

Variables that can assume numerical values (in the true sense)

Law of Large Numbers

When an experiment is repeated many times, then the relative frequency of a particular outcome approaches the actual probability of that particular outcome
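
The law of large numbers can be seen in a simulation. The sketch below assumes a fair coin (true probability of heads 0.5) and a fixed random seed; the function name is hypothetical. It prints the relative frequency of heads for increasing numbers of tosses.

```python
import random


def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses and return the fraction that are heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


for n in (10, 100, 1_000, 10_000, 100_000):
    # The relative frequency should tend to settle near 0.5 as n grows.
    print(n, relative_frequency_of_heads(n))
```

The approach to 0.5 is not steady from one run length to the next; the law only describes the long-run tendency.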

Parameter

Numerical descriptive measure of the population

Sample statistic

Numerical descriptive measure of the sample

Mean

The average of a group of numbers: their sum divided by how many numbers there are

Variance

The average squared distance between all the observations and the mean

Expected value (mean)

The center of the distribution of a random variable

Center

Measure of central tendency

Z score

Measures the relative position of an observation compared to the mean, expressed in units of standard deviations

Data

Numbers or information with a context

Venn diagram

Pictorial diagram in which the sample space is represented by a rectangle with sample points represented by solid dots inside the rectangle and events by circles within the rectangle

Conditional probability

The probability of an event A given that another related event B is known to have occurred: P(A|B) = P(AB)/P(B), for P(B) > 0. Probabilities of events change when such additional information is provided

Example of a sample space

S = {HH, HT, TH, TT}

Sample space

Set of all possible outcomes of an experiment, denoted by S

Negatively skewed distribution

Skewed to the left; a distribution with a long tail toward low values and a high frequency of observations in the high end of the range

Positively skewed distribution

Skewed to the right; a distribution with a long tail toward high values and a high frequency of observations in the lower end of the range

Probability distribution

Specification of the possible values and probability associated with each possible value of the discrete random variable

Variability

Spread of the data

Standard deviation

Square root of the variance; roughly, the typical distance between an observation and the mean of a dataset

Probability

Study of randomness and uncertainty; numerical measure of chance

