AP Statistics: Chapters 5-8 Concepts

fundamental properties of probability

1. For any event E, 0 ≤ P(E) ≤ 1. 2. If S is the sample space for an experiment, P(S) = 1. 3. If two events E and F are mutually exclusive, then P(E or F) = P(E) + P(F). 4. For any event E, P(E) + P(not E) = 1. Therefore P(not E) = 1 - P(E) and P(E) = 1 - P(not E).

power

1. Models that involve transforming only the x variable (square root, reciprocal, and log). 2. Models that involve transforming the y variable (exponential and _____________).

estimating probabilities empirically

1. Observe a very large number of chance outcomes under controlled circumstances. 2. Estimate the probability of an event to be the observed proportion of occurrence by appealing to the interpretation of probability as a long-run relative frequency and to the law of large numbers.

x

The regression line of y on x should not be used to predict ____, because it is not the line that minimizes the sum of squared deviations in the x direction.

two

The value of r does not depend on which of the ______ variables is considered x.

using simulation to approximate a probability

1. Design a method that uses a random mechanism (such as a random number generator or table, the selection of a ball from a box, the toss of a coin, etc.) to represent an observation. Be sure that the important characteristics of the actual process are preserved. 2. Generate an observation using the method from Step 1, and determine whether the outcome of interest has occurred. 3. Repeat Step 2 a large number of times. 4. Calculate the estimated probability by dividing the number of observations for which the outcome of interest occurred by the total number of observations generated.
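
The four steps above can be sketched in Python. This is only an illustration, not part of the original card: the chance experiment (rolling a fair die four times and asking whether at least one six appears), the function name, and the seed are all assumptions chosen for the example.

```python
import random

def simulate_at_least_one_six(num_rolls=4, repetitions=100_000, seed=1):
    """Estimate P(at least one six in num_rolls rolls of a fair die)."""
    rng = random.Random(seed)  # Step 1: the random mechanism (seeded so runs repeat)
    hits = 0
    for _ in range(repetitions):  # Step 3: repeat a large number of times
        # Step 2: generate one observation and check the outcome of interest
        rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
        if 6 in rolls:
            hits += 1
    # Step 4: number of observations with the outcome / total observations
    return hits / repetitions

estimate = simulate_at_least_one_six()
exact = 1 - (5 / 6) ** 4  # exact answer for this simple case, for comparison
print(estimate, exact)
```

With this many repetitions the estimate should land close to the exact value, which is what the law of large numbers leads us to expect.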

probability calculations for continuous random variables

1. a < x < b, the event that the random variable x assumes a value between two given numbers, a and b 2. x < a, the event that the random variable x assumes a value less than a given number a 3. x > b, the event that the random variable x assumes a value greater than a given number b (this can also be written as b < x)

association

A value of r close to 1 indicates that the larger values of one variable tend to be associated with the larger values of the other variable. This is different from saying that a large value of one variable causes the value of the other variable to be large. Correlation measures the extent of _______________________, but association does not imply causation.

correlation coefficient

Although there are several different correlation coefficients, Pearson's correlation coefficient is by far the most commonly used, and so the name "Pearson's" is often omitted and it is referred to as simply the ______________ ______________.

simple event

An event consisting of exactly one outcome.

outlier

An observation that has a large residual. Outlier observations fall far away from the least-squares line in the y-direction.

influential observation

An observation that is potentially influential if it has an x value that is far away from the rest of the data (separated from the rest of the data in the x direction). To determine if the observation is in fact influential, we assess whether removal of this observation has a large impact on the value of the slope or intercept of the least-squares line.

chance experiment

Any activity or situation in which there is uncertainty about which of two or more possible outcomes will result.

event

Any collection of outcomes from the sample space of a chance experiment.

variability

Choose a model that has small residuals (small se) and accounts for a large proportion of the ______________ in y (large R^2).

calculating probabilities when outcomes are equally likely

Consider an experiment that can result in any one of N possible outcomes. Denote the corresponding simple events by O1, O2, ... , ON. If these simple events are equally likely to occur, then 1. P(O1) = 1/N, P(O2) = 1/N, ... , P(ON) = 1/N. 2. For any event E, P(E) = (number of outcomes in E)/N.

y bar

Consider using the least-squares line to predict the value of y associated with an x value some specified number of standard deviations away from x bar. Then the predicted y value will be only r times this number of standard deviations from y bar. In terms of standard deviations, except when r = 1 or -1, the predicted y will always be closer to _________ than x is to x bar.

causation

Correlation does not imply ______________. A common media blunder is to infer a cause-and-effect relationship between two variables simply because there is a strong correlation between them. Don't fall into this trap! A strong correlation implies only that the two variables tend to vary together in a predictable way, but there are many possible explanations for why this is occurring besides one variable causing changes in the other.

total sum of squares

Denoted by SSTo, the total sum of squares is defined as: SSTo = sigma (y - y bar)^2.

predicted

Each residual is the difference between an observed y value and the corresponding _______ y value.

coefficient of determination

The proportion of variation in y that can be attributed to an approximate linear relationship between x and y. The coefficient of determination is denoted by r^2. The value of r^2 is often converted to a percentage (by multiplying by 100) and interpreted as the percentage of variation in y that can be explained by an approximate linear relationship between x and y.

standard deviation of a discrete random variable

The standard deviation of a discrete random variable x is the square root of the variance.

close

The standard deviation of a random variable x, denoted by σ_x, describes variability in the probability distribution. When the value of σ_x is small, observed values of x will tend to be ______ to the mean value (little variability). When the value of σ_x is large, there will be more variability in observed x values.

intercept

The value of a, sometimes called the y-intercept or vertical intercept of the line, is the height of the line above the value x=0.

slope

The value of b is the amount by which y increases when x increases by 1 unit.

measurement

The value of r does not depend on the unit of __________________ for either variable. For example, if x is height, the corresponding z score is the same whether height is expressed in inches, meters, or miles, and the value of the correlation coefficient is not affected. The correlation coefficient measures the inherent strength of the linear relationship between two numerical variables.
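
A short Python sketch can make the unit-free property concrete. It computes r as the sum of z-score products divided by n - 1, the usual textbook formula, which this card does not spell out. The height and weight values are invented for illustration.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's r as the sum of z-score products divided by (n - 1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Hypothetical data: heights in inches, weights in pounds
heights_in = [60, 62, 65, 68, 72]
weights_lb = [110, 120, 140, 155, 180]
heights_m = [h * 0.0254 for h in heights_in]  # same heights in meters

r_inches = pearson_r(heights_in, weights_lb)
r_meters = pearson_r(heights_m, weights_lb)
r_swapped = pearson_r(weights_lb, heights_in)  # x and y roles exchanged
print(r_inches, r_meters, r_swapped)
```

All three values agree up to floating-point rounding, because converting units rescales the deviations and the standard deviation by the same factor, and the formula treats x and y symmetrically.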

straight line

The value of r is a measure of the extent to which x and y are linearly related. r measures the extent to which the points in the scatterplot fall close to a ______________ ____________. A value of r close to 0 does not necessarily mean that there is no relationship between x and y. It is possible that there could still be a strong relationship that is not linear.

negative

The value of r is between -1 and +1. A value near the upper limit, +1, indicates a strong positive linear relationship, whereas an r close to the lower limit, -1, suggests a strong ______________ linear relationship.

plot

The value of the correlation coefficient as well as the values for the intercept and slope of the least-squares line can be sensitive to influential observations in the data set, particularly if the sample size is small. Because potentially influential observations are those whose x values are far away from most of the x values in the data set, it is important to look for such observations when examining the scatterplot. (Another good reason for always starting with a _________ of the data!)

variance of a discrete random variable

The variance of a discrete random variable x, denoted by σ_x^2, is computed by: 1. subtracting the mean from each possible x value to obtain the deviations 2. squaring each deviation 3. multiplying each squared deviation by the probability of the corresponding x value 4. adding these quantities.
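
The four steps translate directly into a few lines of Python. The distribution below (number of heads in two tosses of a fair coin) is an assumed example, not one taken from the card.

```python
from math import sqrt

# Hypothetical probability distribution: x = number of heads in two fair coin tosses
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean value: multiply each x by its probability, then add
mean_x = sum(x * p for x, p in distribution.items())

# Variance: deviation from the mean, squared, weighted by probability, then summed
variance_x = sum((x - mean_x) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance
std_dev_x = sqrt(variance_x)

print(mean_x, variance_x, std_dev_x)
```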

independent events

Two events E and F are said to be independent if P(E | F) = P(E)

mutually exclusive

Two events are ______________ __________________ if they have no outcomes in common. The term disjoint is also sometimes used to describe events that have no outcomes in common.

no

When E and F are mutually exclusive, the general addition rule simplifies to the previous rule for mutually exclusive events. This is because when E and F are mutually exclusive, E intersect F contains ____ outcomes and P(E intersect F) = 0.

classical approach to probability

When the outcomes in the sample space of a chance experiment are equally likely, the probability of an event E, denoted by P(E), is the ratio of the number of outcomes favorable to E to the total number of outcomes in the sample space. According to this definition, the calculation of a probability consists of counting the number of outcomes that make up an event, counting the number of outcomes in the sample space, and then dividing.

regression analysis

The collection of methods involving the fitting of lines, curves, and more complicated functions to bivariate and multivariate data.

residual

The vertical deviation of a point in the scatterplot from the regression line.

properties of the population correlation coefficient

1. ρ is a number between -1 and +1 that does not depend on the unit of measurement for either x or y, or on which variable is labeled x and which is labeled y. 2. ρ = +1 or -1 if and only if all (x, y) pairs in the population lie exactly on a straight line, so ρ measures the extent to which there is a linear relationship in the population.

correlation coefficient

A _______________ ______________________ is a numerical assessment of the strength of relationship between the x and y values in a bivariate data set consisting of (x,y) pairs.

properties of a binomial experiment

A binomial experiment consists of a sequence of trials with the following conditions: 1. There are a fixed number of trials. 2. Each trial can result in one of only two possible outcomes, labeled success (S) and failure (F). 3. Outcomes of different trials are independent. 4. The probability that a trial results in a success is the same for each trial. The binomial random variable x is defined as x = number of successes observed when a binomial experiment is performed. The probability distribution of x is called the binomial probability distribution.
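
As a sketch of what the binomial probability distribution looks like in practice, the snippet below uses the standard formula P(x = k) = C(n, k) p^k (1 - p)^(n - k). The card itself does not state this formula, and the values n = 10 and p = 0.3 are assumptions for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(x = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # hypothetical: 10 independent trials, each with P(success) = 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                   # probabilities should add to 1
mean_x = sum(k * pk for k, pk in enumerate(pmf))   # should equal n * p
print(total, mean_x)
```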

relationship

A correlation coefficient near 0 does not necessarily imply that there is no _____________ between two variables. Before such an interpretation can be given, it is important to examine a scatterplot of the data carefully. Although it may be true that the variables are unrelated, there may in fact be a strong but nonlinear relationship.

linear

A correlation coefficient of r=1 occurs only when all the points in a scatterplot of the data lie exactly on a straight line that slopes upward. Similarly, r= -1 only when all the points lie exactly on a downward-sloping line. r=1 or r= -1 indicates a perfect ___________ relationship between x and y in the sample data.

curve

A desirable residual plot is one that exhibits no particular pattern, such as curvature. Curvature in the residual plot is an indication that the relationship between x and y is not linear and that a ________ would be a better choice than a line for describing the relationship between x and y.

scatterplot

A graph of bivariate numerical data in which each observation (x, y) is represented as a point located with respect to a horizontal x-axis and a vertical y-axis.

Pearson's sample correlation coefficient

A measure of the strength and direction of a linear relationship between two numerical variables. Denoted by r.

random variable

A numerical variable whose value depends on the outcome of a chance experiment. A random variable associates a numerical value with each outcome of a chance experiment.

delete

A point whose x value differs greatly from others in the data set may have exerted excessive influence in determining the fitted line. One method for assessing the impact of such an isolated point is to ________ it from the data set, recompute the least-squares line, and evaluate the extent to which the equation of the line has changed.

continuous

A probability distribution for a ___________ random variable x is specified by a curve called a density curve. The function that defines this curve is denoted by f(x) and it is called the density function. The following are properties of all continuous probability distributions: 1. f(x) is greater than or equal to 0 (the curve cannot dip below the horizontal axis). 2. The total area under the density curve is equal to 1. The probability that x falls in any particular interval is equal to the area under the density curve and above the interval.

continuous random variable

A random variable is continuous if its set of possible values includes an entire interval on the number line.

discrete random variable

A random variable is discrete if its set of possible values is a collection of isolated points along the number line.

residual plot

A scatterplot of the (x, residual) pairs. Isolated points or a pattern of points in the residual plot indicate potential problems.

transformation

A simple function of the x and/or y variable, which is then used in a regression.

law of large numbers

As the number of repetitions of a chance experiment increases, the chance that the relative frequency of occurrence for an event will differ from the true probability of the event by more than any small number approaches 0. This tells us that, as the number of repetitions of a chance experiment increases, the proportion of the time an event E occurs gets close and stays close to the real probability of E occurring in a single chance experiment even if the value of this probability is not known. This means that we can observe the outcomes of repetitions of a chance experiment and then use the observed outcomes to estimate probabilities.

extrapolating

Be careful in interpreting the value of the slope and intercept in the least-squares line. In particular, in many instances interpreting the intercept as the value of y that would be predicted when x=0 is equivalent to _________________ way beyond the range of the x values in the data set, and this should be avoided unless x= 0 is within the range of the data.

predictions

Beware of extrapolation. It is dangerous to assume that a linear model fit to data is valid over a wider range of x values. Using the least-squares line to make predictions outside the range of x values in the data set often leads to poor ______________.

multiplication rule for k independent events

Events E1, E2, ... , Ek are independent if knowledge that any of the events have occurred does not change the probabilities that any particular one or more of the other events has occurred. Independence implies that P (E1 intersect E2 intersect .... intersect Ek) = P(E1) * P(E2) * .............* P(Ek). This means that when events are independent, the probability that all occur together is the product of the individual probabilities. This relationship also holds if one or more of the events is replaced by its complement.
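
A small Python sketch of the product rule, including the complement version. The three probabilities are invented and the events are simply assumed to be independent.

```python
from math import prod

# Hypothetical: three components that each work with these probabilities, independently
p_works = [0.90, 0.95, 0.99]

# P(all three work) = product of the individual probabilities
p_all_work = prod(p_works)

# The rule also holds when events are replaced by their complements:
# P(all three fail) = product of the individual failure probabilities
p_all_fail = prod(1 - p for p in p_works)

# P(at least one fails) follows from the complement rule
p_at_least_one_fails = 1 - p_all_work

print(p_all_work, p_all_fail, p_at_least_one_fails)
```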

sigma

For a quadratic regression, the least-squares estimates of a, b1, and b2 are those values that minimize the sum of squared deviations, _______ (y- y hat)^2, where y hat =a+b1x+b2x^2.

general multiplication rule for two events

For any two events E and F, P(E intersect F)= P(E | F)*P(F)

general addition rule for two events

For any two events E and F, P(E U F)= P(E) + P(F) - P(E intersect F).
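
The rule can be checked by counting outcomes in a small, equally likely sample space. The die roll and the two events below are assumptions chosen for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (hypothetical example)
E = {2, 4, 6}                      # the roll is even
F = {4, 5, 6}                      # the roll is greater than 3

def prob(event):
    """Classical probability: outcomes in the event / outcomes in the sample space."""
    return Fraction(len(event), len(sample_space))

direct = prob(E | F)                          # count outcomes in E union F directly
by_rule = prob(E) + prob(F) - prob(E & F)     # general addition rule
print(direct, by_rule)
```

Subtracting P(E intersect F) corrects for the outcomes 4 and 6, which would otherwise be counted twice.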

nonlinear

For quadratic regression, a measure that is useful for assessing fit is R^2 = 1 - (SSResid/SSTo) where SSResid = sigma (y - y hat)^2. The measure R^2 is defined in a way similar to r^2 for the linear regression model and is interpreted in a similar fashion. The lowercase notation r^2 is used only with linear regression to emphasize the relationship between r^2 and the correlation coefficient, r, in the linear case. For _____________ models, an uppercase R^2 is used.

steps in a linear regression analysis

Given a bivariate numerical data set consisting of observations on a dependent variable y and an independent variable x: Step 1. Summarize the data graphically by constructing a scatterplot. Step 2. Based on the scatterplot, decide if it looks like the relationship between x and y is approximately linear. If so, proceed to the next step. Step 3. Find the equation of the least-squares regression line. Step 4. Construct a residual plot and look for any patterns or unusual features that may indicate that a line is not the best way to summarize the relationship between x and y. If none are found, proceed to the next step. Step 5. Compute the values of se and r^2 and interpret them in context. Step 6. Based on what you have learned from the residual plot and the values of se and r^2, decide whether the least-squares regression line is useful for making predictions. If so, proceed to the last step. Step 7. Use the least-squares regression line to make predictions.
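
The numerical parts of Steps 3 to 7 can be sketched in plain Python. The data are made up, and the formulas used for the slope, the intercept, and se = sqrt(SSResid/(n - 2)) are the usual textbook ones, which this card assumes rather than states. The plotting steps (1, 2, and the visual part of 4) are left out.

```python
from math import sqrt

# Hypothetical bivariate data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Step 3: least-squares slope b and intercept a for y hat = a + b x
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Step 4: residuals = observed y - predicted y (these would go in a residual plot)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Step 5: se and r^2
ss_resid = sum(e ** 2 for e in residuals)
ss_to = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - ss_resid / ss_to
s_e = sqrt(ss_resid / (n - 2))

# Step 7: a prediction inside the range of the x data (no extrapolation)
prediction_at_3_5 = a + b * 3.5

print(a, b, r_squared, s_e, prediction_at_3_5)
```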

5%

If a random sample of size n is taken from a population of size N, the theoretical probabilities of successive selections calculated on the basis of sampling with replacement and on the basis of sampling without replacement differ by insignificant amounts when n is small compared to N. In practice, independence can be assumed for the purpose of calculating probabilities as long as n is not larger than _______ of N.
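
To see why the size of n relative to N matters, the sketch below compares the probability that every selected item is a "success" under the two sampling schemes. The population sizes and success counts are invented for illustration.

```python
def p_all_success_with_replacement(N, successes, n):
    """Each draw is independent, so the probability is (successes/N)^n."""
    return (successes / N) ** n

def p_all_success_without_replacement(N, successes, n):
    """Each draw removes one success from the population before the next draw."""
    p = 1.0
    for i in range(n):
        p *= (successes - i) / (N - i)
    return p

# n = 3 is 0.3% of N = 1000: the two answers are nearly identical
large_with = p_all_success_with_replacement(1000, 300, 3)
large_without = p_all_success_without_replacement(1000, 300, 3)

# n = 3 is 30% of N = 10: the two answers differ noticeably
small_with = p_all_success_with_replacement(10, 3, 3)
small_without = p_all_success_without_replacement(10, 3, 3)

print(large_with, large_without)
print(small_with, small_without)
```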

residuals

If the relationship between two variables is nonlinear, it is preferable to model the relationship using a nonlinear model rather than fitting a line to the data. A plot of the ___________ from a linear fit is particularly useful in determining whether a nonlinear model would be a more appropriate choice.

dependent events

If two events E and F are not independent, they are said to be dependent events. If P(E | F) = P(E), it is also true that P(F | E) = P(F), and vice versa.

straight

If we choose some x values and compute y=a+bx for each value, the points in the plot of the resulting (x, y) pairs will fall exactly on a __________ line.

continuous

If x is a __________ random variable, then for any two numbers a and b with a < b, P(a ≤ x ≤ b) = P(a < x ≤ b) = P(a ≤ x < b) = P(a < x < b).

independent variable

In a bivariate data set, the variable that will be used to make a prediction of the dependent variable. The independent variable is denoted by x. The independent variable is also sometimes called the predictor variable or the explanatory variable.

dependent variable

In a bivariate data set, the variable whose values we would like to predict. The dependent variable is denoted by y. The dependent variable is also sometimes called the response variable.

subjective approach to probability

In this view, a probability is interpreted as a personal measure of the strength of belief that a particular outcome will occur. A probability of 1 represents a belief that the outcome will certainly occur. A probability of 0 represents a belief that the outcome will certainly not occur, that is, that it is impossible. Other probabilities are placed somewhere between 0 and 1, based on the strength of one's beliefs.

third variable

It frequently happens that two variables are highly correlated not because one is causally related to the other but because they are both strongly related to a __________________ _____________________. Among all elementary school children, the relationship between the number of cavities in a child's teeth and the size of his or her vocabulary is strong and positive. Yet no one advocates eating foods that result in more cavities to increase vocabulary size (or working to decrease vocabulary size to protect against cavities). Number of cavities and vocabulary size are both strongly related to age, so older children tend to have higher values of both variables than do younger ones.

scatterplot

It is important to look for unusual values in the ______________ or in the residual plot. A point falling far above or below the horizontal line at height 0 corresponds to a large residual, which may indicate some unusual circumstance such as a recording error, a non-standard experimental condition, or an atypical experimental subject.

both

It is not enough to look at just r^2 or just se when assessing a linear model. These two measures address different aspects of the fit of the line. In general, we would like to have a small value for se (which indicates that deviations from the line tend to be small) and a large value for r^2 (which indicates that the linear relationship explains a large proportion of the variability in the y values). It is possible to have a small se combined with a small r^2 or a large r^2 combined with a large se. Remember to consider ________ values.

complement

Let A and B denote two events. Not A: The event that consists of all experimental outcomes that are not in event A. Not A is sometimes called the ___________ of A and is usually denoted by A^c, A', or A bar. A or B: The event that consists of all experimental outcomes that are in at least one of the two events, that is, in A or in B or in both of these. A or B is called the union of the two events and is denoted by A union B. A and B: The event that consists of all experimental outcomes that are in both of the events A and B. A and B is called the intersection of the two events and is denoted by A intersect B.

common

Let A1, A2, ... , Ak denote k events. 1. The event A1 or A2 or ... or Ak consists of all outcomes in at least one of the individual events A1, A2, ... , Ak. 2. The event A1 and A2 and ... and Ak consists of all outcomes that are simultaneously in every one of the individual events A1, A2, ... , Ak. These k events are mutually exclusive if no two of them have any __________ outcomes.

the addition rule for mutually exclusive events

Let E and F be two mutually exclusive events. One of the basic properties (axioms) of probability is P(E or F) = P(E U F) = P(E) + P(F). This property of probability is known as the addition rule for mutually exclusive events. More generally, if events E1, E2, ... , Ek are all mutually exclusive, then P(E1 or E2 or ... or Ek) = P(E1 U E2 U ... U Ek) = P(E1) + P(E2) + ... + P(Ek). In words, the probability that any of these k mutually exclusive events occurs is the sum of the probabilities of the individual events.

100%

Multiplying r^2 by 100 gives the percentage of y variation attributable to the approximate linear relationship. The closer this percentage is to ________, the more successful the linear relationship is in explaining variation in y.

continuous

The probability that a ________ random variable x lies between a lower limit a and an upper limit b is P(a < x < b) = (cumulative area to the left of b) - (cumulative area to the left of a) = P(x < b) - P(x < a).
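
The "cumulative area" subtraction can be tried with Python's standard library. The normal distribution and its parameters (mean 100, standard deviation 15) are assumptions picked for the example. The card's rule applies to any continuous distribution.

```python
from statistics import NormalDist

# Hypothetical continuous random variable: normal with mean 100 and sd 15
dist = NormalDist(mu=100, sigma=15)

a, b = 85, 115  # one standard deviation on either side of the mean

# P(a < x < b) = (cumulative area to the left of b) - (cumulative area to the left of a)
p_between = dist.cdf(b) - dist.cdf(a)

# P(x > b) uses the fact that the total area under the density curve is 1
p_above_b = 1 - dist.cdf(b)

print(p_between, p_above_b)
```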

sampling without replacement

Once selected, an individual or object is not returned to the population prior to subsequent selections.

sampling with replacement

Once selected, an individual or object is put back into the population before the next selection.

adequacy

Remember that the least-squares line may be the "best" line (in that it has a smaller sum of squared deviations than any other line), but that doesn't necessarily mean that the line will produce good predictions. Be cautious of predictions based on a least-squares line without any information about the __________ of the linear model, such as se and r^2.

standard deviation about the least-squares line

Roughly speaking, the __________________, or se, is the typical amount by which an observation deviates from the least-squares line.

residual sum of squares

Sometimes referred to as the error sum of squares, the residual sum of squares is denoted by SSResid and is defined as: SSResid = sigma (y - y hat)^2.

conditional probability

Suppose that E and F are two events with P(F) > 0. The conditional probability of the event E given that the event F has occurred, denoted by P(E | F), is P(E | F) = P(E intersect F)/P(F).
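
A counting example may help. The die roll and the events below are assumptions for illustration. The snippet computes P(E | F) from the definition and then uses the comparison P(E | F) = P(E) to check independence.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (hypothetical example)
E = {2, 4, 6}                      # the roll is even
F = {4, 5, 6}                      # the roll is greater than 3
G = {1, 2, 3, 4}                   # the roll is at most 4

def prob(event):
    return Fraction(len(event), len(sample_space))

def conditional(event, given):
    """P(event | given) = P(event intersect given) / P(given), requires P(given) > 0."""
    return prob(event & given) / prob(given)

p_E_given_F = conditional(E, F)  # knowing F occurred changes the probability of E
p_E_given_G = conditional(E, G)  # knowing G occurred does not

print(prob(E), p_E_given_F, p_E_given_G)
```

Here E and F are dependent because P(E | F) differs from P(E), while E and G are independent because P(E | G) equals P(E).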

predicted values

The ______ __________ result from substituting each sample x value into the equation for the least-squares line. This gives: y hat 1 = first predicted value = a + bx1; y hat 2 = second predicted value = a + bx2; ... ; y hat n = nth predicted value = a + bxn.

mean

The ______ value of a random variable x describes where the probability distribution of x is centered.

coefficient of determination

The _____________ _______ ________________________ is defined as: r^2 = 1 - (SSResid/SSTo).

sample space

The collection of all possible outcomes of a chance experiment.

multiplication rule for two independent events

The events E and F are independent if and only if P(E intersect F) = P(E) * P(F).

smallest

The least-squares line for predicting y from x is not the same line as the least-squares line for predicting x from y. The least-squares line is, by definition, the line that has the ________ possible sum of squared deviations of points from the line in the y direction (it minimizes sigma(y- y hat)^2). The line that minimizes the sum of squared deviations in the y direction is not generally the same as the line that minimizes the sum of the squared deviations in the x direction. So, for example, it is not appropriate to fit a line to data using y=house price and x=house size and then use the resulting least-squares line Price= a+b(Size) to predict the size of a house by substituting in a price and then solving for size. Make sure that the dependent and independent variables are clearly identified and that the appropriate line is fit.
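
The house example can be checked numerically. The sizes and prices below are invented, and `least_squares` is a hypothetical helper using the standard slope and intercept formulas. The point is only that inverting the price-on-size line does not reproduce the size-on-price line.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line for predicting ys from xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Hypothetical data: size in square feet, price in thousands of dollars
size = [1000, 1500, 2000, 2500, 3000]
price = [150, 210, 240, 330, 350]

a, b = least_squares(size, price)   # Price = a + b(Size): minimizes price-direction deviations
c, d = least_squares(price, size)   # Size = c + d(Price): minimizes size-direction deviations

# Solving Price = a + b(Size) for Size gives slope 1/b, which is not d
slope_from_inverting = 1 / b
print(b, d, slope_from_inverting)
```

The two slopes would match only if every point fell exactly on one line (r = 1 or r = -1), since the product b * d equals r^2.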

the danger of extrapolation

The least-squares line should not be used to make predictions outside the range of the x values in the data set because we have no evidence that the linear relationship continues outside this range.

least-squares line

The line that minimizes the sum of squared deviations. The least-squares line is also called the sample regression line.

mean value of a discrete random variable

The mean value of a discrete random variable x is computed by first multiplying each possible x value by the probability of observing that value and then adding the resulting quantities. The term expected value is sometimes used in place of mean value, and E(x) is alternative notation.

sum of squared deviations

The most widely used measure of the goodness of fit of a line y= a+bx to bivariate data (x1, y1),..., (xn, yn) is the sum of the squared deviations about the line.

regression analysis

The objective of ________________ ____________________ is to use information about one variable, x, to predict the value of a second variable, y. For example, we might want to predict y= product sales during a given period when the amount spent on advertising is x= $10,000.

discrete

The probability distribution of a ________ random variable x gives the probability associated with each possible x value. Each probability is the long-run relative frequency of occurrence of the corresponding x value when the chance experiment is performed a very large number of times. Common ways to display a probability distribution for a discrete random variable are a table, a probability histogram, or a formula. Notation: If one possible value of x is 2, we often write p(2) in place of P(x = 2). Similarly, p(5) denotes the probability that x = 5, and so on.

relative frequency approach to probability

The probability of an event E, denoted by P(E ), is defined to be the value approached by the relative frequency of occurrence of E when a chance experiment is performed many times. If the number of times the chance experiment is performed is quite large,

