Statistics (PSY230) Midterm Study Guide (Definitions from Textbook)

(The) Central Limit Theorem

1. If samples of size n, where n ≥ 30, are drawn from any population with a mean µ and a standard deviation σ, then the sampling distribution of sample means approximates a normal distribution. 2. If the population itself is normally distributed, then the sampling distribution of sample means is normally distributed for any sample size n. In either case, the sampling distribution of sample means has a mean equal to the population mean: µ_x̄ = µ. The sampling distribution of sample means has a variance equal to 1/n times the variance of the population and a standard deviation equal to the population standard deviation divided by the square root of n: σ²_x̄ = σ² / n and σ_x̄ = σ / √(n).

Properties of the Standard Normal Distribution

1. The cumulative area is close to 0 for z-scores close to z = -3.49. 2. The cumulative area increases as the z-scores increase. 3. The cumulative area for z = 0 is 0.5000. 4. The cumulative area is close to 1 for z-scores close to z = 3.49.
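
These four properties can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `cumulative_area` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def cumulative_area(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return standard_normal.cdf(z)

# The area starts near 0, rises steadily, and ends near 1.
for z in (-3.49, -1.0, 0.0, 1.0, 3.49):
    print(f"z = {z:+.2f}  cumulative area = {cumulative_area(z):.4f}")
```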

Properties of Sampling Distributions of Sample Means

1. The mean of the sample means µ_x̄ is equal to the population mean µ: µ_x̄ = µ. 2. The standard deviation of the sample means σ_x̄ is equal to the population standard deviation σ divided by the square root of the sample size n: σ_x̄ = σ/√(n). The standard deviation of the sampling distribution of the sample means is called the standard error of the mean.
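
Both properties can be verified exactly on a tiny population by listing every possible sample. The sketch below assumes sampling with replacement; the population values and the sample size are illustrative only.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [1, 3, 5, 7]  # illustrative values, not from the course
n = 2                      # sample size

mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation

# Every possible sample of size n, drawn with replacement.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu_xbar = mean(sample_means)       # mean of the sample means
sigma_xbar = pstdev(sample_means)  # standard error of the mean

print(mu_xbar, mu)
print(sigma_xbar, sigma / sqrt(n))
print(isclose(mu_xbar, mu), isclose(sigma_xbar, sigma / sqrt(n)))
```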

Sample standard deviation

= s = √(s²) = √((∑(x - x̄)²)/(n-1))

Sample variance

= s² = (∑(x - x̄)²)/(n-1)
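
A minimal sketch of the sample variance and sample standard deviation formulas, written out term by term. The function names and the data set are assumptions for illustration; Python's `statistics.variance` and `statistics.stdev` compute the same quantities.

```python
from math import sqrt

def sample_variance(data):
    """s² = ∑(x - x̄)² / (n - 1); needs at least two entries."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def sample_std_dev(data):
    """s = √(s²)."""
    return sqrt(sample_variance(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative sample
print(sample_variance(data))
print(sample_std_dev(data))
```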

random variable (x)

A ______ ________ (_) represents a numerical value associated with each outcome of a probability experiment.

normal distribution

A ______ ____________ is a continuous probability distribution for a random variable x. Its graph is called the normal curve. This has the following properties. 1. The mean, median, and mode are equal. 2. The normal curve is bell-shaped and is symmetric about the mean. 3. The total area under the normal curve is equal to 1. 4. The normal curve approaches, but never touches, the x-axis as it extends farther and farther away from the mean. 5. Between µ - σ and µ + σ (in the center of the curve), the graph curves downward. The graph curves upward to the left of µ - σ and to the right of µ + σ. The points at which the curve changes from curving upward to curving downward are called inflection points.

sample

A ______ is a subset, or part, of a population.

discrete probability distribution

A ________ ___________ ____________ lists each possible value the random variable can assume, together with its probability. This must satisfy the following conditions. 1. The probability of each value of the discrete random variable is between 0 and 1, inclusive. 0 ≤ P(x) ≤ 1 2. The sum of all the probabilities is 1. ∑P(x) = 1
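
The two conditions translate directly into a check. This is a sketch; `is_valid_distribution` is a hypothetical helper name, and `isclose` is used because floating-point sums rarely equal 1 exactly.

```python
from math import isclose

def is_valid_distribution(probabilities):
    """True when every P(x) is in [0, 1] and the P(x) sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    return each_in_range and isclose(sum(probabilities), 1.0)

print(is_valid_distribution([0.1, 0.2, 0.3, 0.4]))  # both conditions hold
print(is_valid_distribution([0.5, 0.6]))            # sum is not 1
print(is_valid_distribution([-0.1, 1.1]))           # values outside [0, 1]
```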

sampling distribution

A ________ ____________ is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution is the sampling distribution of sample means. Every sample statistic has a sampling distribution.

frequency histogram

A _________ _________ is a bar graph that represents the frequency distribution of a data set. It has the following properties. 1. The horizontal scale is quantitative and measures the data values. 2. The vertical scale measures the frequencies of the classes. 3. Consecutive bars must touch.

frequency distribution

A _________ ____________ is a table that shows classes or intervals of data entries with a count of the number of entries in each class. The frequency f of a class is the number of data entries in the class.

statistic

A _________ is a numerical description of a sample characteristic.

parameter

A __________ is a numerical description of a population characteristic.

population

A __________ is the collection of all outcomes, responses, measurements, or counts that are of interest.

confounding variable

A ___________ ________ occurs when an experimenter cannot tell the difference between the effects of different factors on a variable.

cumulative frequency graph

A ___________ _________ _____, or ogive, is a line graph that displays the cumulative frequency of each class at its upper class boundary. The upper boundaries are marked on the horizontal axis, and the cumulative frequencies are marked on the vertical axis.

probability experiment

A ___________ __________ is an action, or trial, through which specific results (counts, measurements, or responses) are obtained. The result of a single trial in this is an outcome. The set of all possible outcomes of this is the sample space. An event is a subset of the sample space. It may consist of one or more outcomes.

conditional probability

A ___________ ___________ is the probability of an event occurring, given that another event has already occurred. This of event B occurring, given that event A has occurred, is denoted by P(B|A) and is read as "probability of B, given A."

ogive

A cumulative frequency graph, or _____, is a line graph that displays the cumulative frequency of each class at its upper class boundary. The upper boundaries are marked on the horizontal axis, and the cumulative frequencies are marked on the vertical axis.

uniform, rectangular

A frequency distribution is _______ (or ___________) when all entries, or classes, in the distribution have equal or approximately equal frequencies. It is also symmetric.

symmetric

A frequency distribution is _________ when a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images.

skewed left (negatively skewed)

A frequency distribution is skewed if the "tail" of the graph elongates more to one side than to the other. A distribution is ______ ____ (__________ ______) if its tail extends to the left.

skewed right (positively skewed)

A frequency distribution is skewed if the "tail" of the graph elongates more to one side than to the other. A distribution is ______ _____ (__________ ______) if its tail extends to the right.

discrete

A random variable is ________ if it has a finite or countable number of possible outcomes that can be listed.

continuous

A random variable is __________ if it has an uncountable number of possible outcomes, represented by an interval on the number line.

sampling distribution of sample means

A sampling distribution is the probability distribution of a sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the distribution is the ________ ____________ __ ______ _____. Every sample statistic has a sampling distribution.

weighted mean

A weighted mean is the mean of a data set whose entries have varying weights. It is given by x̄ = (∑(x*w)) / ∑w where w is the weight of each entry x.
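
A short sketch of the weighted mean formula. The scores and weights are made-up values, as if three graded components counted for 50%, 30%, and 20% of a grade.

```python
def weighted_mean(values, weights):
    """x̄ = ∑(x*w) / ∑w."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

scores = [86, 96, 82]      # illustrative entries x
weights = [0.5, 0.3, 0.2]  # illustrative weights w
print(weighted_mean(scores, weights))
```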

event

An _____ is a subset of the sample space. It may consist of one or more outcomes.

outlier

An _______ is a data entry that is far removed from the other entries in the data set.

Law of Large Numbers

As an experiment is repeated over and over, the empirical probability of an event approaches the theoretical (actual) probability of the event.
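
A small simulation can illustrate this. The sketch assumes a fair coin (theoretical probability 0.5) and a fixed seed so that runs are repeatable; the function name is hypothetical.

```python
import random

def empirical_probability_of_heads(trials, seed=0):
    """Relative frequency of heads in `trials` simulated fair coin flips."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

# As the number of trials grows, the relative frequency tends toward 0.5.
for trials in (10, 100, 1_000, 10_000, 100_000):
    print(trials, empirical_probability_of_heads(trials))
```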

ratio level of measurement

Data at the _____ _____ __ ___________ are similar to data at the interval level, with the added property that a zero entry is an inherent zero. A ratio of two data values can be formed so that one data value can be meaningfully expressed as a multiple of another.

nominal level of measurement

Data at the _______ _____ __ ___________ are qualitative only. Data at this level are categorized using names, labels, or qualities. No mathematical computations can be made at this level.

ordinal level of measurement

Data at the _______ _____ __ ___________ are qualitative or quantitative. Data at this level can be arranged in order, or ranked, but differences between data entries are not meaningful.

interval level of measurement

Data at the ________ _____ __ ___________ can be ordered, and meaningful differences between data entries can be calculated. At this level, a zero entry simply represents a position on a scale; the entry is not an inherent zero.

dependent

Events that are not independent are _________.

Normal Approximation to a Binomial Distribution

If np ≥ 5 and nq ≥ 5, then the binomial random variable x is approximately normally distributed, with mean µ = np and standard deviation σ = √(npq) where n is the number of independent trials, p is the probability of success in a single trial, and q is the probability of failure in a single trial.
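
The condition and the two parameters can be sketched as a single check. `normal_approximation` is a hypothetical helper that returns `None` when the rule np ≥ 5 and nq ≥ 5 is not met.

```python
from math import sqrt

def normal_approximation(n, p):
    """Return (µ, σ) of the approximating normal curve, or None if
    np < 5 or nq < 5."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        return None
    return n * p, sqrt(n * p * q)

print(normal_approximation(100, 0.5))  # np = 50 and nq = 50
print(normal_approximation(10, 0.1))   # np = 1, so the rule fails
```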

(The) Fundamental Counting Principle

If one event can occur in m ways and a second event can occur in n ways, the number of ways the two events can occur in sequence is m * n. This rule can be extended to any number of events occurring in sequence.
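
A quick way to see the principle is to list every sequence and count them. The choices below are invented for illustration.

```python
from itertools import product

shirts = ["red", "blue", "green"]  # first event: m = 3 ways
pants = ["jeans", "khakis"]        # second event: n = 2 ways

# Every (shirt, pants) sequence; there should be m * n of them.
outfits = list(product(shirts, pants))
print(len(outfits), len(shirts) * len(pants))
```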

double-blind experiment

In a ______-_____ __________, neither the experimenter nor the subjects know if the subjects are receiving a treatment or a placebo. The experimenter is informed after all the data have been collected. This type of experimental design is preferred by researchers.

Binomial Probability Formula

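In this, the probability of exactly x successes in n trials is P(x) = ₙCₓ pˣ qⁿ⁻ˣ = (n! / ((n - x)! x!)) pˣ qⁿ⁻ˣ, where p is the probability of success in a single trial and q = 1 - p is the probability of failure in a single trial.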
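
A minimal sketch, assuming the standard form of the binomial formula, P(x) = C(n, x) * p^x * q^(n - x) with q = 1 - p. `math.comb` gives the number of combinations; the function name is hypothetical.

```python
from math import comb

def binomial_probability(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure in a single trial
    return comb(n, x) * p**x * q ** (n - x)

# Example: exactly 2 heads in 3 flips of a fair coin.
print(binomial_probability(3, 2, 0.5))

# The probabilities for x = 0..n form a discrete distribution (sum is 1).
print(sum(binomial_probability(3, x, 0.5) for x in range(4)))
```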

Population Parameters of a Binomial Distribution

Mean: µ = np Variance: σ² = npq Standard deviation: σ = √(npq)

mean of a frequency distribution

The ____ __ _ _________ ____________ for a sample is approximated by x̄ = (∑(x*f)) / n where x and f are the midpoints and frequencies of a class, respectively.
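
A sketch of this approximation, where n is taken as the sum of the class frequencies. The midpoints and frequencies are illustrative.

```python
def grouped_mean(midpoints, frequencies):
    """x̄ ≈ ∑(x*f) / n, with n = ∑f."""
    n = sum(frequencies)
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n

midpoints = [5, 15, 25]  # illustrative class midpoints x
frequencies = [2, 5, 3]  # illustrative class frequencies f
print(grouped_mean(midpoints, frequencies))
```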

mode

The ____ of a data set is the data entry that occurs with the greatest frequency. A data set can have one, more than one, or none. If no entry is repeated, the data set has none.

mean

The ____ of a data set is the sum of the data entries divided by the number of entries. To find the mean of a data set, use one of the following formulas. Population Mean: µ = (∑x)/N Sample Mean: x̄ = (∑x)/n The lowercase Greek letter µ (pronounced mu) represents the population mean and x̄ (read as "x bar") represents the sample mean. Note that N represents the number of entries in a population and n represents the number of entries in a sample. Recall that the uppercase Greek letter sigma (∑) indicates a summation of values.

range

The _____ of a data set is the difference between the maximum and minimum data entries in the set. To find it, the data must be quantitative. _____ = Maximum data entry - Minimum data entry

median

The ______ of a data set is the value that lies in the middle of the data when the data set is ordered. The median measures the center of an ordered data set by dividing it into two equal parts. If the data set has an odd number of entries, the median is the middle data entry. If the data set has an even number of entries, the median is the mean of the two middle data entries.

expected value

The ________ _____ of a discrete random variable is equal to the mean of the random variable. E(x) = µ = ∑xP(x)

standard score, z-score

The ________ _____, or _-_____, represents the number of standard deviations a given value x falls from the mean µ. To find this for a given value, use the following formula. z = (Value - Mean) / Standard deviation = (x-µ) / σ
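
As a sketch, the formula is a one-line function. The value, mean, and standard deviation below are made up.

```python
def z_score(x, mu, sigma):
    """z = (x - µ) / σ: number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

# Illustrative numbers: a score of 75 when µ = 70 and σ = 10.
print(z_score(75, 70, 10))
# A negative z-score means the value lies below the mean.
print(z_score(55, 70, 10))
```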

standard normal distribution

The ________ ______ ____________ is a normal distribution with a mean of 0 and a standard deviation of 1.

relative frequency

The ________ _________ of a class is the portion or percentage of the data that falls in that class. To find this of a class, divide the frequency f by the sample size n. ________ _________ = Class frequency / Sample size = f / n

frequency f

The _________ _ of a class is the number of data entries in the class.

deviation

The _________ of an entry x in a population data set is the difference between the entry and the mean µ of the data set. _________ of x = x - µ

complement of event E

The __________ __ _____ _ is the set of all outcomes in a sample space that are not included in event E. This of event E is denoted by E' and is read as "E prime."

population standard deviation

The __________ ________ ________ of a population data set of N entries is the square root of the population variance. σ = √(σ²) = √((∑ (x-µ)²)/N)

population variance

The __________ ________ of a population data set of N entries is __________ ________ = σ² = (∑ (x-µ)²) / N The symbol σ is the lowercase Greek letter sigma.
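
A minimal sketch of the population formulas, dividing by N rather than n - 1. The data set is illustrative and is treated as an entire population; `statistics.pvariance` and `statistics.pstdev` compute the same quantities.

```python
from math import sqrt

def population_variance(data):
    """σ² = ∑(x - µ)² / N."""
    big_n = len(data)
    mu = sum(data) / big_n
    return sum((x - mu) ** 2 for x in data) / big_n

def population_std_dev(data):
    """σ = √(σ²)."""
    return sqrt(population_variance(data))

population_data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative population
print(population_variance(population_data))
print(population_std_dev(population_data))
```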

cumulative frequency

The ___________ _________ of a class is the sum of the frequencies of that class and all previous classes. This of the last class is equal to the sample size n.
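
Cumulative frequencies are running totals of the class frequencies, and relative frequencies are f / n. A sketch with illustrative class frequencies:

```python
from itertools import accumulate

class_frequencies = [2, 5, 3]  # illustrative frequencies f
n = sum(class_frequencies)     # sample size

# Running total: each class plus all previous classes.
cumulative = list(accumulate(class_frequencies))
# Portion of the data that falls in each class: f / n.
relative = [f / n for f in class_frequencies]

print(cumulative)  # the last entry equals n
print(relative)
```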

interquartile range (IQR)

The _____________ _____ (___) of a data set is a measure of variation that gives the range of the middle 50% of the data. It is the difference between the third and first quartiles. _____________ _____ (___) = Q₃ - Q₁
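
A sketch of the calculation, assuming the "median of each half" convention for quartiles (the middle entry is left out of both halves when the number of entries is odd). Other conventions exist and can give slightly different quartiles, so treat this as one possible method. The data set is illustrative.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves.
    Assumes at least two entries."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle entry when n is odd.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

def interquartile_range(data):
    """IQR = Q3 - Q1."""
    q1, _, q3 = quartiles(data)
    return q3 - q1

entries = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # illustrative data
print(quartiles(entries))
print(interquartile_range(entries))
```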

sample mean

The formula for ______ ____ is x̄ = (∑x)/n

population mean

The formula for __________ ____ is µ = (∑x)/N

normal curve

The graph of a normal distribution is called the ______ _____.

µ (pronounced mu)

The lowercase Greek letter _ (pronounced mu) represents the population mean.

midpoint

The midpoint of a class is the sum of the lower and upper limits of the class divided by two. The midpoint is sometimes called the class mark. ________ = (Lower class limit + Upper class limit) / 2

bimodal

The mode of a data set is the data entry that occurs with the greatest frequency. A data set can have one, more than one, or none. If no entry is repeated, the data set has none. If two entries occur with the same greatest frequency, each entry is a mode and the data set is called _______.

Range of Probabilities Rule

The probability of an event E is between 0 and 1, inclusive. That is, 0 ≤ P(E) ≤ 1.

(The) Addition Rule (for the Probability of A or B)

The probability that events A or B will occur, P(A or B), is given by P(A or B) = P(A) + P(B) - P(A and B). If events A and B are mutually exclusive, then the rule can be simplified to P(A or B) = P(A) + P(B). This simplified rule can be extended to any number of mutually exclusive events.

(The) Multiplication Rule for the Probability of A and B

The probability that two events A and B will occur in sequence is P(A and B) = P(A) * P(B|A). If events A and B are independent, then the rule can be simplified to P(A and B) = P(A) * P(B). This simplified rule can be extended to any number of independent events.
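
The Addition Rule and the Multiplication Rule can both be sketched with exact fractions. The examples assume a standard 52-card deck; the function names are hypothetical.

```python
from fractions import Fraction

def p_a_or_b(p_a, p_b, p_both=Fraction(0)):
    """Addition Rule: P(A or B) = P(A) + P(B) - P(A and B).
    Leave p_both at 0 for mutually exclusive events."""
    return p_a + p_b - p_both

def p_a_and_b(p_a, p_b_given_a):
    """Multiplication Rule: P(A and B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a

# Drawing one card: a king or a heart (the king of hearts is counted once).
king_or_heart = p_a_or_b(Fraction(4, 52), Fraction(13, 52), Fraction(1, 52))

# Drawing two cards without replacement: both are kings.
two_kings = p_a_and_b(Fraction(4, 52), Fraction(3, 51))

print(king_or_heart, two_kings)
```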

outcome

The result of a single trial in a probability experiment is an _______.

sample space

The set of all possible outcomes of a probability experiment is the ______ _____.

standard error of the mean

The standard deviation of the sampling distribution of the sample means is called the ________ _____ __ ___ ____.

σ

The symbol _ is the lowercase Greek letter sigma.

quartiles

The three _________, Q₁, Q₂, and Q₃, approximately divide an ordered data set into four equal parts.

second quartile Q₂

The three quartiles, Q₁, Q₂, and Q₃, approximately divide an ordered data set into four equal parts. About one half of the data fall on or below the ______ ________ __.

first quartile Q₁

The three quartiles, Q₁, Q₂, and Q₃, approximately divide an ordered data set into four equal parts. About one quarter of the data fall on or below the _____ ________ __.

third quartile Q₃

The three quartiles, Q₁, Q₂, and Q₃, approximately divide an ordered data set into four equal parts. About three quarters of the data fall on or below the _____ ________ __.

sigma (∑)

The uppercase Greek letter _____ (_) indicates a summation of values.

Transforming a Z-Score to an X-Value

To transform a standard z-score to a data value x in a given population, use the formula x = µ + zσ.
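
A one-line sketch of the transformation, using made-up values for µ and σ.

```python
def x_from_z(z, mu, sigma):
    """x = µ + zσ: the data value that corresponds to a z-score."""
    return mu + z * sigma

# Illustrative numbers: µ = 70 and σ = 10.
print(x_from_z(-1.5, 70, 10))
print(x_from_z(0, 70, 10))  # z = 0 maps back to the mean
```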

mutually exclusive

Two events A and B are ________ _________ if A and B cannot occur at the same time.

independent

Two events are ___________ if the occurrence of one of the events does not affect the probability of the occurrence of the other event. Two events A and B are independent if P(B|A) = P(B) or if P(A|B) = P(A). Events that are not independent are dependent.
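
The test P(B|A) = P(B) can be sketched by counting outcomes. The example assumes one roll of a fair six-sided die, so every outcome is equally likely; the events and function names are illustrative.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def probability(event):
    """Classical probability: outcomes in the event / outcomes in the space."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional_probability(b, a):
    """P(B|A): share of A's outcomes that are also in B (A must be nonempty)."""
    return Fraction(len(a & b), len(a))

def are_independent(a, b):
    """True when P(B|A) = P(B)."""
    return conditional_probability(b, a) == probability(b)

even = {2, 4, 6}
one_or_two = {1, 2}
two_or_four = {2, 4}

print(are_independent(even, one_or_two))   # P(B|A) = 1/3 and P(B) = 1/3
print(are_independent(even, two_or_four))  # P(B|A) = 2/3 but P(B) = 1/3
```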

Blinding

________ is a technique where the subjects do not know whether they are receiving a treatment or a placebo. In a double-blind experiment, neither the experimenter nor the subjects know if the subjects are receiving a treatment or a placebo. The experimenter is informed after all the data have been collected. This type of experimental design is preferred by researchers.

Empirical (or statistical) probability

_________ (or ___________) ___________ is based on observations obtained from probability experiments. This of an event E is the relative frequency of event E. P(E) = Frequency of event E / Total frequency = f / n

Classical (or theoretical) probability

_________ (or ___________) ___________ is used when each outcome in a sample space is equally likely to occur. This for an event E is given by P(E) = Number of outcomes in event E / Total number of outcomes in sample space.

Qualitative data

consist of attributes, labels, or nonnumerical entries.

Data

consist of information coming from observations, counts, measurements, or responses.

Quantitative data

consist of numerical measurements or counts.

Randomization

is a process of randomly assigning subjects to different treatment groups.

(The) mean of a discrete random variable

is given by µ = ∑xP(x). Each value of x is multiplied by its corresponding probability and the products are added.

Descriptive statistics

is the branch of statistics that involves the organization, summarization, and display of data.

Inferential statistics

is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of this is probability.

Replication

is the repetition of an experiment under the same or similar conditions.

Statistics

is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

(The) standard deviation of a discrete random variable

is σ = √(σ²) = √(∑(x-µ)²P(x)).

(The) variance of a discrete random variable

is σ² = ∑(x-µ)²P(x).
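
The mean, variance, and standard deviation of a discrete random variable can be sketched together. The distribution below (number of heads in two fair coin flips) is an illustrative choice.

```python
from math import sqrt

def discrete_mean(values, probabilities):
    """µ = ∑xP(x)."""
    return sum(x * p for x, p in zip(values, probabilities))

def discrete_variance(values, probabilities):
    """σ² = ∑(x - µ)²P(x)."""
    mu = discrete_mean(values, probabilities)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probabilities))

def discrete_std_dev(values, probabilities):
    """σ = √(σ²)."""
    return sqrt(discrete_variance(values, probabilities))

heads = [0, 1, 2]           # possible values x
probs = [0.25, 0.50, 0.25]  # P(x) for each value
print(discrete_mean(heads, probs))
print(discrete_variance(heads, probs))
print(discrete_std_dev(heads, probs))
```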

x̄ (read as "x bar")

represents the sample mean.

Sample standard deviation (for a frequency distribution)

s = √((∑(x - x̄)² * f)/(n-1))
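
A sketch of the grouped formula, where x is a class midpoint, f is its frequency, and n = ∑f. The midpoints and frequencies are illustrative.

```python
from math import sqrt

def grouped_sample_std_dev(midpoints, frequencies):
    """s = √(∑(x - x̄)² * f / (n - 1)), with x̄ = ∑(x*f) / n."""
    n = sum(frequencies)
    x_bar = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    total = sum((x - x_bar) ** 2 * f for x, f in zip(midpoints, frequencies))
    return sqrt(total / (n - 1))

class_midpoints = [5, 15, 25]  # illustrative midpoints x
class_counts = [2, 5, 3]       # illustrative frequencies f
print(grouped_sample_std_dev(class_midpoints, class_counts))
```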

Mean of a Binomial Distribution

µ = np

Standard deviation of a Binomial Distribution

σ = √(npq)

Variance of a Binomial Distribution

σ² = npq

