Stats Test 3 (Ch. 7-9)

A nutrition major at State University was studying the relationship between carbohydrates (X) and calories (Y). For example, a serving of a particular brand of wheat pasta yielded 42 carbohydrates and 210 calories. After collecting X and Y data on many kinds of foods, the student determined the slope of the regression line to be 4.0 and the Y intercept to be 3.0. If a new food is tested, and the number of carbohydrates (X) is 100, what would be the predicted calories (Y')?

403
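
The answer above comes from plugging X into the regression equation Y' = (slope)(X) + Y intercept. A minimal sketch of that arithmetic; the function name `predict_y` is only an illustrative assumption, not something from the course material:

```python
def predict_y(x, slope, intercept):
    """Return the predicted score Y' = (slope)(X) + intercept."""
    return slope * x + intercept

# Values from the question: slope = 4.0, Y intercept = 3.0, X = 100
print(predict_y(100, slope=4.0, intercept=3.0))  # 403.0
```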

a _____ is the average (mean) cross product of the z-scores.

correlation
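
To see the "mean cross product of z-scores" definition in action, here is a rough sketch. The data are made up for illustration, the function name `pearson_r` is an assumption, and the standard deviations divide by N (not N - 1), which is what this definition needs:

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed as the mean of the z-score cross products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations that divide by N, not N - 1
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    # Multiply each z_x by its partner z_y, then average the products
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / n

# Made-up example data (not from the test)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))  # 0.775
```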

T/F: if correlation is +1.00, the regression line is at a 45° angle

false

know how to find a regression equation when they give data/chart (#24-26, ch. 8)

yes ma'am

the sum of the deviations of the true Y scores from the predicted Y' scores is always

zero

if the correlation is zero and you're trying to find y', using x will not be a reliable predictor. therefore, we use ___ to predict y'

ȳ

which of the following formulas represents the Y intercept of the regression line?

ȳ - (b)(x̄)

computing the standard error of the estimate (Sy) by subtracting each Y' from the corresponding Y, squaring the difference, summing the results, dividing by N, and then taking the square root (the defining formula) is difficult and time-consuming. Fortunately, the same value can be computed by multiplying the standard deviation of Y by

√(1-r²)
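
A quick check that the defining formula and the shortcut give the same value. This is only a sketch on made-up data; all variable names are illustrative assumptions, and the standard deviation of Y divides by N to match the defining formula:

```python
import math

# Made-up example data (not from the test)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x, mean_y = sum(xs) / n, sum(ys) / n
ss_x = sum((x - mean_x) ** 2 for x in xs)
ss_y = sum((y - mean_y) ** 2 for y in ys)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

slope = sp / ss_x
intercept = mean_y - slope * mean_x
r = sp / math.sqrt(ss_x * ss_y)

# Defining formula: square each (Y - Y'), sum, divide by N, take the root
predicted = [slope * x + intercept for x in xs]
defining = math.sqrt(sum((y - yp) ** 2 for y, yp in zip(ys, predicted)) / n)

# Shortcut: standard deviation of Y times sqrt(1 - r squared)
sd_y = math.sqrt(ss_y / n)
shortcut = sd_y * math.sqrt(1 - r ** 2)

print(round(defining, 3), round(shortcut, 3))  # 0.693 0.693
```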

Which of the following r-values indicates the weakest relationship between two variables? a. +.45 b. -.30 c. +.03 d. -.45

+.03

Calculate the appropriate correlation coefficient for the following data (reading speed test score = X, number of books read = Y)

+.95

Which of the following r-values indicates the strongest relationship between two variables? a. +.65 b. -.89 c. +.10 d. -.10

-.89

what is the Y intercept of the following regression equation? Y' = -4.30X - 1.72

-1.72

what is the slope of the following regression equation? Y' = -8.27X + 3.09

-8.27

a student wanted to know whether memory for items on a grocery list was better when a bizarre image was created for each item, compared with a strategy of simply trying to remember the items. the student had a randomly selected sample use bizarre imagery to remember a list of 15 items; the mean number of items remembered was 10. the student compared this mean to a population mean of 7 and a standard error of 1.20. What is the probability of obtaining a sample mean of 10 or higher?

.0062
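
The steps behind this answer are: convert the sample mean to a z score, then find the area in the tail beyond it. A small sketch using only the standard library; `upper_tail_probability` is a hypothetical helper name, and it stands in for looking the value up in a z table:

```python
import math

def upper_tail_probability(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Values from the question: sample mean 10, population mean 7, standard error 1.20
z = (10 - 7) / 1.20
p = upper_tail_probability(z)
print(round(z, 2), round(p, 4))  # 2.5 0.0062
```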

if the probability of getting a z score between the mean and +1 standard deviation is .3413, what is the probability of getting a z score lower than -1 standard deviation?

.1587

Assad wants to show Cheryl and Cindy a card trick he has learned. He first asks Cheryl to draw one card at random from a standard deck of 52 cards. Cheryl draws out the 4 of hearts, shows the card to Cindy, and then lays it face down on the table in front of her. Assad then extends the remaining cards for Cindy to make her selection. What is the probability that Cindy also will draw out a heart?

.24
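
Because Cheryl's card is not put back, both the number of hearts and the size of the deck shrink by one. A minimal sketch of the count (variable names are illustrative):

```python
from fractions import Fraction

# One heart (the 4 of hearts) has already been removed from the deck
hearts_left = 13 - 1
cards_left = 52 - 1

p_heart = Fraction(hearts_left, cards_left)
print(p_heart, round(float(p_heart), 2))  # 4/17 0.24
```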

A recent study of burnout among therapists shows a positive correlation of 0.53 between the number of clients a therapist is treating and the therapist's feelings of burnout. What proportion of the variance in feelings of burnout is accounted for by this relationship?

.28
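
The proportion of variance accounted for is just r squared, and whatever is left over is the coefficient of non-determination. A short sketch with the value from the question:

```python
r = 0.53

determination = r ** 2               # proportion of variance accounted for
non_determination = 1 - determination  # proportion not accounted for

print(round(determination, 2), round(non_determination, 2))  # 0.28 0.72
```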

one hundred tickets have been sold. you've purchased 30 tickets. what is the probability of one of your tickets being randomly selected?

.30

on a roll of a die, the probability that 3 or more dots will appear on the top side is .67, and the probability that 2 or fewer dots will appear is .33. If we know that 6 dots showed up on the last throw, what is the probability that 2 or 1 will show up on the next throw?

.33

the president of state college wants to conduct a survey of the students. he knows there are 325 freshmen, 250 sophomores, 175 juniors, 100 seniors, and 150 grad students. What is the probability that the very first person randomly selected for the survey will be a freshman?

.33

what is the probability of getting a sample mean between 500 and 520 if the population mean is 500 and the standard deviation of the sampling distribution (the standard error of the mean) is 20?

.3413
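
Areas under the normal curve like this one can be checked without a z table. A sketch under the assumption that the sampling distribution is normal; `normal_cdf` is a hypothetical helper name:

```python
import math

def normal_cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, standard_error = 500, 20

# Area between a sample mean of 500 (z = 0) and 520 (z = +1)
p_between = normal_cdf((520 - mean) / standard_error) - normal_cdf((500 - mean) / standard_error)
print(round(p_between, 4))  # 0.3413

# Area below z = -1 is what is left of the lower half: .50 - .3413
p_below_minus_one = normal_cdf(-1)
print(round(p_below_minus_one, 4))  # 0.1587
```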

your professor told the class that, typically, 20% of the class receives As, 20% Bs, 35% Cs, 10% Ds, and 15% Fs. What is the probability that the student sitting next to you will receive an A or an F?

.35

an advertising executive wanted to know whether a new ad campaign designed to deter children from smoking was working. after seeing the ads, a randomly selected sample of 36 teenagers rated what they thought of smoking on a -5 (it's terrible) to +5 (it's fantastic). the mean rating for the sample was -.70. the population mean is 0, with a population standard deviation of 2.00. what is the probability of obtaining a sample mean of -.70 or lower (more negative)?

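.0179 (the standard error is 2.00/√36 ≈ .333, so z = -.70/.333 = -2.10; .3632 is the area below z = -.35, which is what you get if you divide by the standard deviation instead of the standard error)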

Suppose you select a candy from a jar with 6 red and 4 green candies. You draw out a red candy and eat it. You then select a second candy from the jar. What is the probability you draw out a green candy this time?

.44

the probability of getting heads on a fair coin is .50. The probability of getting tails is also .50. if you toss the coin 3 times and get heads all 3 times, what is the probability of getting tails on the next toss?

.50

Suppose you select a candy from a jar with 6 red and 4 green candies, note its color, and then replace it ten times. Each time, the candy you have selected has been green. What is the probability that on the next drawing you will select a red candy?

.60

if the probability of getting a z score between the mean and +1 standard deviation is .3413, what is the probability of getting a z score lower than +1 standard deviation?

.8413

When r = 1.0, then Sy equals

0

if correlation is 0, what is the z score of y'?

0

the y' regression line intercepts the Y axis at ȳ if r = ?

0

when r = 0.0, the slope of the regression line equals

0

what's the range of values in probability?

0-1

if there is no relationship between two variables, the slope of the regression will equal

0.0

the Y intercept is the value of Y' when X equals

0.0

what do you get when you add the coefficients of determination and non-determination?

1

what 3 things must you do to establish cause-effect?

1. determine high correlation 2. have a time-order sequence 3. eliminate all plausible alternatives

steps to constructing a regression line for predicting Y from X (and also X from Y)

1. take the two extreme values of X and predict a Y score from them 2. plot the two points on a graph 3. draw a line between the two points. repeat the same but X from Y to get the other regression line

what are reasons for low correlation?

1. the two variables aren't related 2. they are related in a non-linear way (curvilinear relationship) 3. a truncated range 4. low N 5. an outlier 6. heteroscedasticity

what two things do you need to remember when calculating the pearson r?

1. x and y don't have to be the same units of measurement 2. x and y have the same mean

what is the Y intercept of the following regression equation? Y' = .56X + 2.41

2.41

what is the slope of the following regression equation? Y' = 2.69X - 3.92

2.69

how would a statistician define probability?

A mathematical statement indicating the likelihood of an event when we randomly sample a particular population

If you see the notation ∑XY, what should you do?

First multiply each X by its partner Y, then sum the results.

If you see the notation (∑X)(∑Y), what should you do?

First sum the Xs, then sum the Ys, then multiply the sums.

What does a correlation coefficient do?

It quantifies the pattern in a relationship

which of the following formulas represents the slope of the regression line?

b = [N(∑XY) - (∑X)(∑Y)] / [N(∑X²) - (∑X)²]
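
A rough sketch of this slope formula, together with the Y intercept formula ȳ - (b)(x̄), on made-up data. The function name `regression_constants` and the data are assumptions for illustration only:

```python
def regression_constants(xs, ys):
    """Return (slope, intercept) using the computational formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # multiply each pair, then sum
    sum_x2 = sum(x * x for x in xs)               # square each X, then sum
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * (sum_x / n)   # mean of Y minus b times mean of X
    return slope, intercept

# Made-up example data (not from the test)
slope, intercept = regression_constants([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(slope, 2), round(intercept, 2))  # 0.6 2.2
```

Note that ∑XY and (∑X)(∑Y) are computed separately here, which is the distinction the two notation cards above are drilling.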

Professor Johnston has found a strong positive correlation between wearing neckties and the frequency of strokes (r = .89). He thinks that the necktie reduces blood flow to the brain, preventing the brain from receiving enough oxygen. Professor Johnston and his associates claim to have proven that wearing neckties causes strokes. What error has Professor Johnston made?

Professor Johnston is drawing a causal conclusion from correlational findings.

when knowledge of a relationship is used, the average error remaining after predictions have been made based on the relationship is

Sy²

"The self-confidence of a group of students is positively correlated with their chances of getting through the course." What does this statement mean?

The chances of passing the course tend to increase as the self-confidence scores of the students increase

what is the purpose of the critical value?

To define the minimum absolute z-value required for a sample to be in the region of rejection

a "weak" relationship between two variables is represented by

a large spread of Y scores at each X score

"The more you save, the less you spend" describes

a negative linear correlation

"The bigger they are, the harder they fall" describes

a positive linear correlation

which of the following best describes knowing the relative frequency of every possible event in a population?

a probability distribution

why can we never be sure that a sample represents a population?

a random sample may poorly represent the population, or it may represent a population that is different

which of the following accurately describes a theoretical probability distribution? it is based on

a theoretical model of the relative frequency of events in a population

the "error" in a single prediction is equal to the degree to which a participant's _____ score deviates from the _____

actual; corresponding predicted score

the probability of obtaining either one of two events is equal to the sum of the separate probabilities of each event. ex. probability of rolling either a 5 or 6: (1/6) + (1/6) = (2/6)

addition rule for mutually exclusive events

ex. what is the probability of obtaining either a jack or a spade? p(A or B) = p(A) + p(B) - p(A and B) = (4/52) + (13/52) - (1/52) = (16/52)

addition rule when two events are not mutually exclusive
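
Both versions of the addition rule can be checked with exact fractions. A small sketch using the card and die examples from the two cards above (variable names are illustrative):

```python
from fractions import Fraction

# Mutually exclusive events: rolling a 5 or a 6 on one die
p_five_or_six = Fraction(1, 6) + Fraction(1, 6)
print(p_five_or_six)  # 1/3 (same as 2/6)

# Not mutually exclusive: a jack or a spade (the jack of spades is both)
p_jack = Fraction(4, 52)
p_spade = Fraction(13, 52)
p_jack_and_spade = Fraction(1, 52)
p_jack_or_spade = p_jack + p_spade - p_jack_and_spade
print(p_jack_or_spade)  # 4/13 (same as 16/52)
```

Subtracting p(A and B) keeps the jack of spades from being counted twice.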

When the correlation coefficient representing the relationship between X and Y is intermediate, then all of the following are true except: a. there is not a perfectly consistent association b. there are different Y scores associated with a single X score c. prediction of Y from a known X score has some error d. all data points fall on the regression line

all data points fall on the regression line

Using a correlational design, a researcher found a relationship between the healthiness of one's heart and the amount of fish oil in one's diet. the researcher should conclude that:

although a relationship exists, one cannot infer that changes in one variable are causing changes in the other variable.

when rolling a pair of fair dice, the probability of rolling a total point value of "7" is .17. if you rolled a pair of dice 1,000 times and the point value of "7" appeared 723 times, what would you probably conclude?

although not impossible, this outcome is so unlikely that the fairness of the dice is questionable

In general, a zero correlation means that

as the values of one variable increase, there is no tendency for the values of the other variable to change in any consistent, predictable fashion.

Professor Miller has found that the correlation between a person's "need for affiliation" (found by taking a test to determine the need to be with others) and the number of hours spent watching television is -.69. He should conclude that

as we observe people with higher and higher need for affiliation, we see a tendency for those people to spend less and less time watching television.

One assumption of linear regression is

at each X, the sample of Y scores should represent an approximately normal distribution

the "error" in all predictions made from a sample using linear regression is the

average spread of actual Y scores around the predicted Y' scores

the standard error of the estimate is defined as the

average spread of actual Y scores around the predicted Y' scores

which of the following is not true of the criterion: a. the criterion is the probability that defines samples as unlikely b. samples that meet the criterion occur more than 5% of the time c. behavioral researchers usually use .05 as their criterion probability d. sample means that occur with a probability less than that of the criterion probability are likely to represent some other population

b. samples that meet the criterion occur more than 5% of the time

can you make better predictions with +1.00 or -1.00

both perfect

At State University Medical Center, a research study has produced a very strong negative correlation between the number of years a person has smoked and that person's lung capacity. Assuming the correlation passes the appropriate inferential test, what should the researchers do next?

calculate the linear regression equation

if we calculate a correlation coefficient and we find that there is a relationship between the two variables, we

cannot conclude that changes in one variable cause changes in the other variable

a high correlation coefficient does not imply that one thing ____ the other. it shows _____ of the relationship, not cause-effect.

caused, strength

if the odds are 5 out of 100 or less that a result occurred by chance (p ≤ .05), we reject _____

chance

In a nonlinear or curvilinear relationship, as the X scores change, the Y scores

change consistently, but in more than one direction

If there is a relationship between "amount of coffee consumed" and "nervousness", then as the amount of coffee consumed increases, the amount of nervousness

changes in some consistent, predictable manner.

probability that assumes an ideal situation (ex. coin flip, throwing dice)

classical

what are the 3 types of probabilities?

classical, empirical, and subjective

how can we determine the representativeness of a sample mean for a particular population?

convert the sample mean to a z score and compare the z score to the critical value

_____ is the statistical index that tells us how much two variables are related

correlation coefficient

y' is the y score we are PREDICTING based upon an x score (and x' is the x score we predict from a y score)

dang that makes sense now

at a basic level, when deciding whether a sample is representative of a particular population, we

decide against low-probability events in favor of high-probability events

what is the basis of all inferential statistics?

deciding whether or not a sample of scores is representative of a particular population

one sample affects the other - without replacement (ex. removing a card from the deck and not putting it back)

dependent random sampling

coefficient of _____, r², tells us the amount of variation in Y accounted for by X. this is the ____ variation.

determination, true

When heteroscedasticity exists, the problem with r is that it

does not accurately describe the strength of the relationship for all Xs

this probability depends upon what has been found in past studies (ex. knowing the proportion of female students in past years, and the probability of selecting one in a random sample now)

empirical

combination of outcomes (ex. whatever we decide an _____ is with dice like even #'s)

event

when r = 0.0, the Y intercept is equal to

every predicted Y value

if the events exhaust all possible outcomes (ex. heads and tails)

exhaustive

what do you do when you have tied ranks?

find the average
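
"Find the average" means each tied score gets the mean of the rank positions the tied scores occupy. A sketch of that idea, assuming rank 1 goes to the lowest score (your class may rank from the highest instead); `average_ranks` is a hypothetical helper name:

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest); tied scores share the mean of their ranks."""
    ordered = sorted(scores)
    ranks = {}
    for value in set(ordered):
        first = ordered.index(value) + 1        # rank position of the first tied score
        count = ordered.count(value)            # how many scores share this value
        ranks[value] = first + (count - 1) / 2  # mean of the positions they occupy
    return [ranks[s] for s in scores]

# The two 20s occupy positions 2 and 3, so each gets (2 + 3) / 2 = 2.5
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```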

To predict a Y score from a given X score using the regression constants, we would

first multiply X by the slope and then add the Y intercept

the greater the correlation, the ____ y' will be from ȳ

further

Compared to a strong relationship, a weak relationship between two variables results in

greater prediction error and a larger value of Sy

when two variables have similar variability

homoscedasticity

In a linear relationship, as the X scores increase, the Y scores change

in only one direction

In general, a positive correlation means that as values of one variable _____, there is a tendency for the values of the other variable to _____.

increase; increase

occurrence of an event doesn't affect the probability of the other event

independent

you roll a die twice, and both times you roll a 6. what type of events are these two rolls

independent

choice of one sample has no effect on the choice of the next sample (ex. removing a card from the deck, then putting it back)

independent random sampling

as a general rule, when statisticians determine the probability of events, they assume that the events are ____ and sampled _____ replacement.

independent; with

is probability descriptive or inferential statistics?

inferential

with which scale ranking would we use the pearson correlation coefficient?

interval and/or ratio

what is the post hoc fallacy?

is committed when it is assumed that because one thing occurred after another, it must have occurred as a result of it. Mere temporal succession, however, does not entail causal succession.

There are 26 red cards in a playing deck and 26 black cards. The probability of randomly selecting a red card or a black card is 26/52 = 0.50. Suppose you randomly select a card from the deck five times, each time replacing the card and reshuffling before the next pick. Each of the five selections has resulted in a red card. On the sixth turn, the probability of getting a black card

is the same as it has always been if the deck is a fair deck

a cognitive psychologist tested whether people spend more time looking at and comprehending content words (ex. ball, kick) than other words (the, a) when reading a passage. The mean looking time for a sample of content words was compared to the mean looking time for the population of other words by transforming it into a z score. the z score for the sample mean was z = 3.00. with critical values of ±1.96, what should the psychologist conclude about the sample mean?

it is an unlikely sample for the population of looking times for other words and probably represents some other population (times looking for meaningful words)

which of the following is not true of the linear regression equation? a. it is the equation from which the correlation coefficient is calculated b. it defines the straight line that summarizes a relationship c. it describes two characteristics of the regression line: its slope and its Y intercept d. it is the equation that produces the value of Y' at each X

it is the equation from which the correlation coefficient is calculated

Linear regression is important because

it is used to predict unknown Y scores based on X scores from a correlated variable

the lower the correlation, the ____ the standard error of estimate

larger

in a ____ relationship, as the X scores increase, the Y scores tend to change in only one direction

linear

the scores that lie in the tails of a normal distribution have a _____ and a _____ probability of occurring

low; low

the purpose of probability and inferential statistics is to

make decisions about the population that have a good chance of being correct

when the correlation is zero, what is the best measure to use?

mean

the intersection points of your two lines are the _____ of X and Y

means

occurrence of one event affects the next event. ex. if I have 2 girls' names and 2 boys' names in a hat, the probability of drawing a girl's name is 2/4 = 1/2. if I draw a girl's name and don't put it back, the probability of drawing a girl's name the next time is 1/3. so the probability of drawing two girls' names in a row is 1/2 x 1/3 = 1/6

multiplication rule for dependent events

probability of the simultaneous or successive occurrence of two events is the product of the separate probabilities of each event: p(A and B) = p(A) p(B). ex. probability of throwing a 5 and then a 6 = (1/6)(1/6) = (1/36)

multiplication rule for independent events
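
The two multiplication rules differ only in whether the second probability changes after the first event. A sketch with exact fractions, using the die and names-in-a-hat examples from the cards above (variable names are illustrative):

```python
from fractions import Fraction

# Independent events: a 5 and then a 6 on two rolls of a die
p_five_then_six = Fraction(1, 6) * Fraction(1, 6)
print(p_five_then_six)  # 1/36

# Dependent events: two girls' names in a row from a hat with 2 girls' and
# 2 boys' names, drawing without replacement
p_first_girl = Fraction(2, 4)
p_second_girl_given_first = Fraction(1, 3)  # one girl's name left among 3 names
p_two_girls = p_first_girl * p_second_girl_given_first
print(p_two_girls)  # 1/6
```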

if two events cannot occur simultaneously (ex. can't roll a 2 and 4 at the same time)

mutually exclusive

do you have replacement in opinion polls?

no

in some cases when the correlation is not perfect, can the regression line fall through all the pairs of scores?

no

does adding or subtracting a constant affect r? why?

no; it also changes the mean, so the relative distance stays the same

What type of relationship does a horizontal line represent?

no relationship

coefficient of _____, 1-r², tells us the error variation; aka the variation not explained by the correlation between the two variables

non-determination

when a z score is not in the region of rejection, we should

not reject the idea that the sample represents the raw score population

The regression line is the best fitting line because

on average, the regression line passes through the center of the various Y scores

The strength of a relationship is indicated by the extent to which _____ paired with each individual value of the _____ variable.

one value of the Y variable is; X

with which scale ranking would we use the spearman correlation coefficient?

ordinal scale

which of the following is the criterion that psychologists usually use to determine the likelihood that a sample mean was obtained by chance?

p = .05

in a _____ _____, each individual has the same z score on both the x and y variable

perfect correlation

one purpose of correlation is to enable us to _____ an unknown value of Y from a known value of X

predict

statisticians use linear regression to

predict unknown Y scores from known X scores

When we divide the error remaining after we use the relationship to predict Y scores by the total error when we use the mean to predict the Y scores and then subtract the results from 1, the final result is

proportion of variance accounted for

the coefficient of determination is interpreted as the

proportion of variance accounted for

when we square the correlation coefficient to produce r², the result is equal to the

proportion of variance accounted for

the ______ is the proportional improvement in the accuracy of our predictions produced by using a relationship to predict Y scores, compared to our accuracy when we do not use the relationship

proportion of variance accounted for (r²)

the coefficient of alienation is interpreted as the

proportion of variance not accounted for

what's the best way to look at linearity?

put it on a scatter diagram

each sample has exactly the same chance of being chosen

random sampling

_____ contains means that are so unlikely to represent the underlying population that we reject the idea that they represent it

region of rejection

what do we call that portion of the sampling distribution in which values are considered too unlikely to have occurred by chance?

region of rejection

The best fitting line through a scatter plot is known as the _____ line

regression

the _____ line summarizes a relationship by passing through the center of the scatterplot

regression

The best-fitting line through a scatterplot is known as the

regression line

if the correlation coefficient turns out to be a relatively high value, then the value of Sy will be

relatively low

In an experimental design _____, whereas in a correlational design _____

researchers assign each person an X score and then measure the score on the Y variable; researchers measure scores on variables that a participant has already experienced.

arises when the range between the lowest and highest scores on one or both variables is limited; this will produce a coefficient that is smaller than it normally would be

restriction of range

the coefficient of determination is equal to

r²

all possible outcomes (ex. flipping a coin, _______ is heads or tails)

sample space

____ occurs when random chance produces a sample statistic that is not equal to the population parameter it represents

sampling error

suppose you take a piece of candy out of a jar, look to determine its color, then put it back into the jar before you randomly select the next piece of candy. this type of sampling is called

sampling with replacement

if you were to plot two variables on a graph, this is called a:

scatter diagram

When plotting correlational data, the appropriate graph to use is the

scatterplot

We should do a scatterplot of the data when we compute a correlation because the scatterplot allows us to

see the nature of the relationship between the two variables.

a study about the college aptitude of seniors at south city high school has resulted in a sample mean with a corresponding z score of 1.89. If the critical value for the region of rejection is ±1.96, what is the correct conclusion?

since the z value does not fall within the region of rejection, we should not conclude this sample mean represents some other population.

a study about the college aptitude of seniors at south city high school has resulted in a sample mean with a corresponding z score of 2.00. If the critical value for the region of rejection is ±1.96, what is the correct conclusion?

since the z value falls within the region of rejection, we should conclude this sample mean likely represents some other population.
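
The decision in these z score cards comes down to comparing the absolute z value with the critical value. A minimal sketch, assuming a two-tailed test with a critical value of 1.96; `in_region_of_rejection` is a hypothetical helper name:

```python
def in_region_of_rejection(z, critical_value=1.96):
    """True when |z| is beyond the critical value (two-tailed test)."""
    return abs(z) > critical_value

# z scores taken from the cards in this set
for z in (1.89, 2.00, 3.00):
    if in_region_of_rejection(z):
        print(z, "-> in the region of rejection; likely represents some other population")
    else:
        print(z, "-> not in the region of rejection; do not reject the population")
```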

the slope of a line is a number indicating the

slant of the line and the direction in which it slants

In looking at the regression constants, we know that the relationship is negative if the

slope value is negative

the __________ is the clearest way to describe the "average" error when using Y' to predict Y scores

standard error of the estimate

type of probability that we use most of the time in everyday situations (ex. asking someone on a date, getting up for class, driving)

subjective

What statistic should be used to find out whether there is a relationship between hours spent participating in sports and GPA?

the Pearson correlation coefficient

What statistic should be used to find out whether there is a relationship between years of education and annual income?

the Pearson correlation coefficient

Which correlation coefficient should we use if we want to find out whether a relationship exists between two variables that are both interval or ratio variables?

the Pearson correlation coefficient

Suppose a researcher has trained two observers to rank participants according to their level of frustration when trying to solve a puzzle. What statistic should be used to determine the extent to which the two observers agree in their rankings of frustration?

the Spearman rank-order correlation coefficient

What statistic should be used to find out whether there is a relationship between high school class rank and first-semester college GPA rank?

the Spearman rank-order correlation coefficient

Which correlation coefficient should we use if we want to find out whether a relationship exists between two variables that represent pairs of ordinal scores?

the Spearman rank-order correlation coefficient

Homoscedasticity occurs when

the Y scores at all Xs are spread out to the same degree

Heteroscedasticity occurs when

the Y scores have a different degree of spread at different Xs

Linear regression is defined as the procedure for determining

the best-fitting straight line in a linear relationship

which of the following accurately describes an empirical probability distribution? It is based on

the computed relative frequency of observed events

A regression line is usually used when

the correlation coefficient is not 0.0

what is the critical value?

the inner edge of the region of rejection

in general, the greater the proportion of variance accounted for,

the more accurately we can predict behavior

When a sample mean is different from the mean of the sampling distribution (the population mean), two alternatives must be considered: The sample mean may represent _____, or it may represent _____.

the population poorly; a different population

an event's relative frequency in the population equals

the probability of an event

a probability distribution gives us

the probability of every possible event in a population

two events are said to be independent when

the probability of one event is not influenced by the occurrence of the other event

how is the relative frequency of an event defined

the proportion of times an event occurs in the population of events

Suppose you drew a random sample from a population where the mean is 100. The standard error of the sampling distribution is 10. The mean for your sample is 80. What could you conclude about your sample?

the sample mean does not occur very often by chance in the sampling distribution of means and probably did not come from the given population.

what can we conclude when the absolute value of a z score for a sample mean is larger than the critical value?

the sample mean does not represent the particular raw score population on which the sampling distribution is based

Suppose you drew a random sample from a population where the mean is 100. The standard error of the sampling distribution is 10. The mean for your sample is 110. What could you conclude about your sample?

the sample mean occurs very often by chance in the sampling distribution of means and probably did come from the given population

what can you conclude about a sample mean that falls within the region of rejection?

the sample probably represents some population other than the one on which the sampling distribution was based

To know whether there is a relationship between two variables, you draw a line around the outer edges of a scatterplot. You can tell when there is no relationship when

the scatterplot is either circular or elliptical, with the ellipse being parallel to the X axis

the criterion determines

the size of the region of rejection

in the regression equation, the slope summarizes _____ and the Y intercept indicates _____

the steepness and direction of the regression line; the value of Y' when X=0

we calculate the proportion of variance accounted for because it is the statistical basis for evaluating

the usefulness of a relationship

when there is no relationship between two variables, the value of every Y' is equal to

the value of the Y intercept

Professor Helgin has found that the correlation between the length of a person's index finger and the person's IQ is -.09. He should conclude that

there is a very weak relationship between the length of the index finger and IQ because r is nearly 0.

Which relationship is stronger, r = +.62 or r = -.62

there is no difference in the strength of the two relationships

which of the following is correct regarding means that fall within the region of rejection when the critical values are ±1.96?

they occur with a probability of 5%

In a correlational analysis, N stands for the

total number of pairs of scores

t/f: the magnitude of r increases whenever the variability of either X or Y increases

true

More questions (like 6 of them) on calculating the correlation coefficient so idk memorize those?

uh okay

what can we conclude about a sample mean that is found to lie in the region of rejection? it is extremely ___ to have occurred by chance, and it represents ____.

unlikely; some other population

when do we use the spearman R

when the scores are ranks (ordinal data)
