Statistics, Ch. 6 - Ch. 9
What is the Y-intercept of the following regression equation? Y' = -4.30X - 1.72 (options: -1.72, 1.72, -4.30, -4.30X)
-1.72
What is the probability of getting a sample mean between 500 and 520 if the population mean is 500 and the standard deviation of the sampling distribution (the standard error of the mean) is 20? 0.5000 0.4332 0.3413 0.1915
0.3413
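The 0.3413 answer can be checked numerically. A minimal sketch using Python's standard-library `statistics.NormalDist`, assuming the sampling distribution is normal with mean 500 and standard error 20 (the variable names are illustrative only):

```python
from statistics import NormalDist

# Sampling distribution of means: mu = 500, standard error = 20
sampling_dist = NormalDist(mu=500, sigma=20)

# P(500 <= sample mean <= 520) is the area between z = 0 and z = +1
p_between = sampling_dist.cdf(520) - sampling_dist.cdf(500)
print(round(p_between, 4))  # 0.3413
```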
probability distribution
A probability distribution indicates the probability of all events in a population
sampling with replacement
When sampling with replacement, any previously selected individuals or events are replaced back into the population before drawing additional ones
Rejection rule
When a sample's z-score lies beyond the critical value, reject the idea that the sample represents the underlying raw score population reflected by the sampling distribution. When the z-score does not lie beyond the critical value, retain the idea that the sample may represent the underlying raw score population.
sampling without replacement
When sampling without replacement, previously selected individuals or events are not replaced in the population before selecting again
one tailed tests
When testing for sample means that are less than a given value, the entire region of rejection is in the left-hand tail. When testing for sample means that are greater than a stated value, the entire region of rejection is in the right-hand tail.
proportion of variance accounted for
When we do not use the relationship, we use the overall mean of the Y scores (Ȳ) as everyone's predicted Y. The error here is the difference between the actual Y scores and the Ȳ we predict they got (Y - Ȳ). So, when we do not use the relationship to predict scores, our error is S_Y^2, the variance of the Y scores.
proportion of variance accounted for
When we do use the relationship, we use the corresponding Y' as determined by the linear regression equation as our predicted value. The error here is the difference between the actual Y scores and the Y' that we predict they got (Y - Y'). When we do use the relationship to predict scores, our error is S_Y'^2, the variance of the Y scores around their Y' scores.
In a study about the relationship between their age and women's attitudes about marriage, you survey over 500 undergraduate women and calculate a Pearson correlation coefficient. What mistake have you made? You have collected too much data. There is no way a woman's attitude about marriage can affect her age. You only surveyed young women in college causing a restriction of range. You have used the wrong correlation coefficient for the type of data.
You only surveyed young women in college causing a restriction of range.
"The more you save, the less you spend" describes a positive linear correlation. a negative linear correlation. no correlation. a nonlinear correlation.
a negative linear correlation
Central Limit Theorem
a) The distribution of sample means is approximately normally distributed b) the mean of the sampling distribution will equal the mean of the underlying raw score population used to create the sampling distribution c) the variability of the sample means is related to the variability of the underlying raw score population.
The "error" in a single prediction is equal to the degree to which a participant's ______ score deviates from the ______. actual; mean predicted; mean predicted; given X actual; corresponding predicted score
actual; corresponding predicted score
nonlinear, or curvilinear, relationship
as the X scores change, the Y scores do not tend to only increase or only decrease: At some point, the Y scores change their direction of change.
positive linear relationship
as the X scores increase, the Y scores also tend to increase
linear relationship
as the X scores increase, the Y scores tend to change in only one direction.
negative linear relationship
as the X scores increase, the Y scores tend to decrease
Standard normal curve
is a perfect normal z-distribution that serves as our model of any approximately normal z-distribution
Central limit theorem
is a statistical principle that defines the mean, the standard deviation, and the shape of a sampling distribution.
Linear regression
is a statistical procedure that uses relationships to predict unknown Y scores based on the X scores from a correlated variable.
standard error of the estimate
is as close as we will come to computing the "average error" in our predicted scores (the Y' scores)
correlation coefficient
is the descriptive statistic that summarizes and describes the important characteristics of a relationship.
z-score
is the distance a raw score is from the mean when measured in standard deviations
z-distribution
is the distribution produced by transforming all raw scores in the data into z-scores
strength of a relationship
is the extent to which one value of Y is consistently paired with one and only one value of X
sampling distribution of means
is the frequency distribution of all possible sample means that occur when an infinite number of samples of the same size N are randomly selected from one raw score population.
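The definition above can be illustrated on a small scale. This sketch swaps the "infinite number of samples" for an exhaustive list of every possible sample of size 2 drawn with replacement from a tiny, hypothetical population; the population values and variable names are assumptions for illustration only.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5, 6]  # hypothetical raw score population
n = 2                            # sample size N

# Mean of every possible sample of size n (sampling with replacement)
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu = mean(population)
sigma = pstdev(population)

mean_of_means = mean(sample_means)  # matches the population mean
se_observed = pstdev(sample_means)  # SD of the sample means
se_formula = sigma / sqrt(n)        # standard error of the mean

print(mean_of_means, mu)
print(round(se_observed, 4), round(se_formula, 4))
```

The mean of the sample means equals the population mean, and their standard deviation matches sigma divided by the square root of N, which is the standard error of the mean.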
Relative Standing
reflects the systematic evaluation of a score relative to the sample or population in which the score occurs
When plotting correlational data, the appropriate graph to use is the line graph. bar graph. histogram. scatterplot.
scatterplot
The standard error of the estimate is similar to the sum of the actual Y scores around their predicted Y' scores. standard deviation of actual Y scores around their predicted Y' scores. degree to which any predicted Y' will deviate from its actual Y score. standard deviation of actual Y scores around their actual X scores.
standard deviation of actual Y scores around their predicted Y' scores.
regression line
summarizes a relationship by passing through the center of the scatterplot.
Homoscedasticity occurs when there is a nonlinear relationship between the X and Y scores. the Y scores have a different degree of spread at different Xs. the Y scores are spread out to the same degree at every X. the Y' scores are spread out to the same degree at every X.
the Y scores are spread out to the same degree at every X.
Two events are said to be independent when the probability of one event is influenced by the occurrence of the other event. you do not have knowledge about the occurrence of either event. the probability of one event is determined by the occurrence of the other event. the probability of one event is not influenced by the occurrence of the other event.
the probability of one event is not influenced by the occurrence of the other event.
A z-score of zero always means that the raw score does not exist. the raw score exists but is negligible. the raw score almost never occurs. the raw score is equal to the mean.
the raw score is equal to the mean.
When +1.645 is used as the critical value instead of ±1.96, the total probability of the region of rejection is decreased. the total probability of the region of rejection is increased. the region of rejection is all placed in the positive tail. the region of rejection is all placed in the negative tail.
the region of rejection is all placed in the positive tail.
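Both critical values come from the same criterion probability of 0.05; only the placement of the region of rejection differs. A minimal sketch with the standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, SD 1
alpha = 0.05      # criterion probability

# Two-tailed: alpha is split evenly between the two tails
two_tailed_crit = z.inv_cdf(1 - alpha / 2)
# One-tailed: all of alpha sits in the positive tail
one_tailed_crit = z.inv_cdf(1 - alpha)

print(round(two_tailed_crit, 2))  # 1.96
print(round(one_tailed_crit, 3))  # 1.645
```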
To know whether there is a relationship between two variables, you draw a line around the outer edges of a scatterplot. If there is a negative relationship, the scatterplot is simultaneously elliptical and circular. the scatterplot is elliptical and is slanted upward (left to right). the scatterplot is elliptical and is slanted downward (left to right). the scatterplot is either circular or elliptical, and the ellipse is parallel to the X axis.
the scatterplot is elliptical and is slanted downward (left to right).
In the regression equation, the slope summarizes ______ and the Y-intercept indicates ______. predictor variables; what the value of the criterion variable is the steepness and direction of the regression line; the value of Y' when X = 0 the length of the regression line; the starting point of the regression line the starting point from which the Y scores begin to change as the X scores increase; the direction and rate in which Ys change as X increases
the steepness and direction of the regression line; the value of Y' when X = 0
linear regression line
the straight line that summarizes the linear relationship in a scatterplot by, on average, passing through the center of the Y scores at each X.
standard score
z-scores are often referred to as standard scores because the transformation equates, or standardizes, different distributions.
Slope & Y intercept
The slope (b) indicates how slanted the regression line is and the direction in which it slants. The Y-intercept (a) is the value of Y' where the regression line crosses the Y axis (that is, when X equals 0).
scatterplot
is a graph that shows the location of each data point formed by a pair of X-Y scores
Strength of correlation coefficient
Perfect association: a correlation coefficient of +1 or -1 describes a perfectly consistent, maximum-strength linear relationship. Intermediate association: a correlation coefficient whose absolute value is less than 1 has less consistency in the Y scores at each value of X and, therefore, more variability among the Y scores at each value of X. Zero association: a correlation coefficient of 0 indicates that no relationship is present.
Calculate the appropriate correlation coefficient for the following data, assuming X is an interval variable and Y is a ratio variable. Participant Reading Speed Test Score (X) Number of Books Read (Y) -0.65 +0.09 +0.40 +0.59
+0.59
Your sense of direction puts you at the 79th percentile, relative to a raw score distribution with a mean of 3.5 and a standard deviation of 1. What is your z-score? +0.13 +0.50 +0.81 -0.81
+0.81
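A percentile converts to a z-score by looking up the z that has that proportion of the curve below it. A small sketch, assuming the raw scores are approximately normal; the mean of 3.5 and standard deviation of 1 are not needed for this step:

```python
from statistics import NormalDist

percentile = 0.79  # 79% of the distribution lies below the score

# inv_cdf returns the z-score with the given area to its left
z = NormalDist().inv_cdf(percentile)
print(round(z, 2))  # 0.81
```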
For the following set of sample scores, what is the z-score for a raw score of 23? 22, 23, 19, 25, 26, 22, 19, 25, 22, 20, 21, 23, 23, 24, 18, 20, 22, 24, 21, 21, 20, 24, 22, 21, 23 1.25 0.50 0.34 0.68
0.50
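The keyed answer can be reproduced from the listed scores. This sketch assumes the descriptive sample standard deviation, which divides by N (`statistics.pstdev`); dividing by N - 1 instead would give a z of about 0.49.

```python
from statistics import mean, pstdev

scores = [22, 23, 19, 25, 26, 22, 19, 25, 22, 20, 21, 23, 23,
          24, 18, 20, 22, 24, 21, 21, 20, 24, 22, 21, 23]

sample_mean = mean(scores)  # 22
sample_sd = pstdev(scores)  # 2.0 (sum of squared deviations is 100, N is 25)

# z = (raw score - mean) / standard deviation
z = (23 - sample_mean) / sample_sd
print(z)  # 0.5
```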
Using the appropriate values from the z-table in the textbook appendix, find the z-score for which the area beyond z in the tail is 0.2546. 0.66 0.2454 -0.66 -0.2454
0.66
When r = 0.0, the slope of the regression line equals the mean of all X scores in the sample. every predicted Y value. sy 0.
0
Asaad wants to show Cheryl and Cindy a card trick he has learned. He first asks Cheryl to draw one card at random from a standard deck of 52 cards. Cheryl draws out the 4 of hearts, shows the card to Cindy, and then lays it face down on the table in front of her. Asaad then extends the remaining cards for Cindy to make her selection. What is the probability that Cindy also will draw out a heart? 0.25 0.24 0.02 0.20
0.24
Suppose you select a candy from a jar with 6 red and 4 green candies, note its color, and then replace it ten times. Each time, the candy you have selected has been green. What is the probability that on the next drawing you will select a red candy? 1.0 0.48 0.90 0.60
0.60
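The two cards above contrast sampling without and with replacement. A short sketch of the arithmetic, using exact fractions (variable names are illustrative):

```python
from fractions import Fraction

# Without replacement: Cheryl's 4 of hearts stays out of the deck,
# leaving 12 hearts among the remaining 51 cards for Cindy.
p_heart = Fraction(13 - 1, 52 - 1)
print(round(float(p_heart), 2))  # 0.24

# With replacement: the jar always holds 6 red and 4 green candies,
# so the earlier green draws do not change the next draw.
p_red = Fraction(6, 6 + 4)
print(float(p_red))  # 0.6
```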
2 components of a z-score
1) either a positive or negative sign, which indicates whether the raw score is above or below the mean. 2) the absolute value of the z-score, which indicates how far the scores lie from the mean when measured in standard deviations.
The sample mean for a recent introductory psychology test was 78, and the sample variance was 9. If a student received a score of 82, what was this student's z-score? 0.44 1.33 -1.33 -0.44
1.33
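The question gives the variance, so the standard deviation has to be recovered first. A minimal sketch; the helper name `z_score` is hypothetical:

```python
from math import sqrt

def z_score(raw, mean, variance):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sqrt(variance)

# Variance of 9 means a standard deviation of 3
z = z_score(raw=82, mean=78, variance=9)
print(round(z, 2))  # 1.33
```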
For the following data, what is the predicted test score for a person with a stress level of 10? Participant Stress Level (X) Test Score (Y) 0.76 20.93 -7.16 12.43
12.43
Jason's z-score in the 10-K run was -3. If the raw score standard deviation was 5, and the mean running time for the competitors was 55 minutes, what was Jason's raw score running time? 60 minutes. 50 minutes. 40 minutes. 30 minutes.
40 minutes.
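Going from a z-score back to a raw score reverses the z-score formula. A small sketch; the helper name `raw_score` is hypothetical:

```python
def raw_score(z, mean, sd):
    """Raw score = mean + (z)(standard deviation)."""
    return mean + z * sd

# Jason is 3 standard deviations (of 5 minutes each) below the mean of 55
jason_time = raw_score(z=-3, mean=55, sd=5)
print(jason_time)  # 40
```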
If a class's scores are normally distributed, with a mean of 70 and a standard deviation of 10, what are the upper and lower limits of the middle 68% of the class? 50 and 90 40 and 100 60 and 80 32 and 68
60 and 80
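The middle 68% of a normal distribution lies within one standard deviation of the mean. A quick check with the standard-library `statistics.NormalDist`, assuming the class scores are exactly normal:

```python
from statistics import NormalDist

class_scores = NormalDist(mu=70, sigma=10)

# One standard deviation below and above the mean
lower = class_scores.mean - class_scores.stdev
upper = class_scores.mean + class_scores.stdev

# Proportion of the curve between those limits
share = class_scores.cdf(upper) - class_scores.cdf(lower)
print(lower, upper)     # 60.0 80.0
print(round(share, 2))  # 0.68
```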
Outlier
A data point that is relatively far from the majority of data points in a scatterplot
What kind of relationship is depicted in the following graph? [graph not shown] A positive linear correlation A negative linear correlation No correlation A nonlinear correlation
A nonlinear correlation
correlation coefficient characteristics
Correlation coefficients may range between -1 and +1. The closer to 1 (-1 or +1) the coefficient is, the stronger the relationship; the closer to 0 the coefficient is, the weaker the relationship. As the variability in the Y scores at each X becomes larger, the relationship becomes weaker.
Assumption 1 & 2 of linear regression
The first assumption of linear regression is homoscedasticity: when the data are homoscedastic, the standard error of the estimate (S_Y') accurately describes the average error. The second assumption of linear regression is that the Y scores at each X form an approximately normal distribution.
Professor Johnston has found a strong positive correlation between wearing neckties and the frequency of strokes (r = 0.89). He thinks that the necktie reduces blood flow to the brain, preventing the brain from receiving enough oxygen. Professor Johnston and his associates claim to have proven that wearing neckties causes strokes. What error has Professor Johnston made? An r = 0.89 is not a very large r-value. Professor Johnston is drawing a causal conclusion from correlational findings. Not everyone who wears a necktie wears it very tight. Professor Johnston should know that there are other ways for blood to reach the brain.
Professor Johnston is drawing a causal conclusion from correlational findings.
The mean of the population (µ) is 200 on a test that measures math skills of middle school students. The variance is σ_X^2 = 100.
Since the z-value does not fall within the region of rejection, we should not conclude this sample mean represents some other population.
The introductory biology class at State University is conducting a study of water quality in their local community. The population mean of a certain beneficial bacteria found in drinking water (µ) is 100, with σ_X = 25. The bacteria counts from the community are given below. Use a two-tailed rejection region with a total area of 0.05. What should you conclude? 52 67 103 59 71 86 89 48 57 92 81 75 66 85 94 77
Since the z-value falls within the region of rejection, we should conclude this sample mean likely represents some other population.
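The conclusion can be traced step by step from the listed counts. A minimal sketch of the z-test, assuming the sampling distribution of means is normal (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist, mean

counts = [52, 67, 103, 59, 71, 86, 89, 48,
          57, 92, 81, 75, 66, 85, 94, 77]
mu, sigma = 100, 25  # population mean and standard deviation
alpha = 0.05         # total area of the two-tailed rejection region

sample_mean = mean(counts)                    # 75.125
std_error = sigma / sqrt(len(counts))         # 25 / 4 = 6.25
z_obt = (sample_mean - mu) / std_error        # obtained z
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# Reject when the obtained z lies beyond the critical value in either tail
reject = abs(z_obt) > z_crit
print(round(z_obt, 2), reject)  # -3.98 True
```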
probability
The probability of an event is equal to the event's relative frequency in the population of possible events that can occur.
proportion of the total area under the normal curve
The proportion of the total area under the normal curve for scores in any part of the distribution equals the probability of those scores.
proportion of variance accounted for
The proportion of variance accounted for is the proportional improvement in accuracy when using the relationship with X to predict Y, compared to using the overall mean of Y (Ȳ) to predict Y.
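As a rough illustration, the proportion can be computed by comparing the two kinds of error: the variance of Y around its mean, and the variance of Y around the predicted Y' scores. The paired scores below are hypothetical, and the least-squares formulas for the slope and Y-intercept are the standard ones rather than anything quoted from these cards.

```python
from statistics import mean, pvariance

# Hypothetical paired scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)

# Least-squares slope (b) and Y-intercept (a)
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx
y_pred = [b * xi + a for xi in x]

# Error without the relationship: variance of Y around the mean of Y
var_y = pvariance(y)
# Error with the relationship: average squared distance of Y from Y'
var_err = mean([(yi - yp) ** 2 for yi, yp in zip(y, y_pred)])

prop_accounted = (var_y - var_err) / var_y
print(round(prop_accounted, 2))  # 0.6
```

For these scores the result equals the squared Pearson correlation, r squared.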
The region of rejection
The region of rejection contains means that are so unlikely to be representing the underlying population that we reject the idea that they represent that population.
Standard error of the mean
The standard deviation of the sampling distribution of means.
Chris wants to calculate a z-score for his own height. The average height in the class is 66 inches, and Chris's height is 62 inches. Chris calculated his z-score to be +1.5. What's wrong with his calculation? The z-score is an inappropriate calculation here. The z-score should be a higher number. He didn't have the standard deviation. The z-score should be a negative number.
The z-score should be a negative number.
Which relationship is stronger, r = +0.62 or r = -0.62? An r = +0.62 represents a stronger relationship than r = -0.62. An r = -0.62 represents a stronger relationship than r = +0.62. There is no difference in the strength of the two relationships. Without seeing a scatterplot of the data, there is no way to determine which is stronger.
There is no difference in the strength of the two relationships
dependent events
Two events are dependent events when the probability of one is influenced by the occurrence of the other
independent events
Two events are independent events when the probability of one is not influenced by the occurrence of the other
theoretical probability distribution
A theoretical probability distribution is a theoretical model of the relative frequencies of events in a population, based on how we assume nature distributes the events
empirical probability distribution
An empirical probability distribution is created by measuring the relative frequency of every event in the population, based on observation of samples from the population
representative sample
In a representative sample, the characteristics of the individuals and scores in the sample accurately reflect the characteristics of individuals and scores found in the population.
correlation coefficient research
Instead of an independent and dependent variable, we refer to X as the predictor variable and to Y as the criterion variable
Which of the following is not true of the linear regression equation? It is the equation from which the correlation coefficient is calculated. It defines the straight line that summarizes a relationship. It describes two characteristics of the regression line: its slope and its Y-intercept. It is the equation that produces the value of Y' at each X.
It is the equation from which the correlation coefficient is calculated.
Random sampling
Random sampling is selecting a sample in such a way that all elements or individuals in the population have an equal chance of being selected.
_____ occurs when random chance produces a sample statistic that is not equal to the population parameter it represents. Random sampling Sampling error Criterion probability Sampling replacement
Sampling error
Sampling error
Sampling error occurs when random chance produces a sample statistic that is not equal to the population parameter it represents.
To predict a Y' score from a given X score using the regression equation, we would first multiply X by the slope and then add the Y-intercept. first multiply X by the Y-intercept and then add the slope. first add the Y-intercept to X and then multiply by the slope. first add the slope to X and then multiply by the Y-intercept.
first multiply X by the slope and then add the Y-intercept.
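The order of operations can be written as a one-line function. In this sketch the slope and Y-intercept are borrowed from the regression equation in the first card of this set, the X score of 2 is made up, and the helper name `predict_y` is hypothetical:

```python
def predict_y(x, slope, intercept):
    """Y' = bX + a: multiply X by the slope, then add the Y-intercept."""
    return slope * x + intercept

y_prime = predict_y(x=2, slope=-4.30, intercept=-1.72)
print(round(y_prime, 2))  # -10.32
```

Passing X = 0 returns the Y-intercept itself, which is why the intercept is described as the value of Y' when X equals 0.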
criterion probability
The criterion probability is the probability that defines samples as too unlikely for us to accept as representing a particular population.
critical value
The critical value of z defines the minimum value of z a sample must have in order to be in the region of rejection. Note: A sample mean lies in the region of rejection only if its z-score is beyond the critical value.
In a nonlinear or curvilinear relationship, as the X scores change, the Y scores tend to increase. change consistently, but in more than one direction. tend to be the same as the X scores. do not change in a consistent fashion.
change consistently, but in more than one direction.
If there is a relationship between "income" and "happiness," then as the amount of income increases, the amount of happiness also increases. decreases. stays the same. changes in some consistent manner.
changes in some consistent manner.
The z-score transformation is a useful statistical tool because it enables statisticians to compare and interpret scores from virtually any distribution of interval or ratio scores. determine which interval or ratio scores are the "best" scores. transform interval or ratio data by multiplying by a constant. revise the shape of distributions to be more useful.
compare and interpret scores from virtually any distribution of interval or ratio scores.
Pearson correlation coefficient
describes the linear relationship between two interval variables, two ratio variables, or one interval and one ratio variable.
Spearman rank-order correlation coefficient
describes the linear relationship between two variables measured by ranked scores.
In a z-distribution, the standard deviation will always be greater than 1. less than 1. equal to 1. equal to 0.
equal to 1
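Transforming every raw score into a z-score always yields a distribution with a mean of 0 and a standard deviation of 1. A small sketch with hypothetical raw scores, again using the standard deviation that divides by N:

```python
from statistics import mean, pstdev

raw = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores
m, sd = mean(raw), pstdev(raw)  # mean 5, standard deviation 2.0

# Transform every raw score into a z-score
z_scores = [(x - m) / sd for x in raw]

print(mean(z_scores), pstdev(z_scores))  # 0.0 1.0
```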
Homoscedasticity
occurs when the Y scores are spread out to the same degree at every X
Heteroscedasticity
occurs when the spread in Y is not equal throughout the relationship
Which of the following is the criterion that psychologists usually use to determine the likelihood that a sample mean was obtained by chance? p = 0.005 p = 0.05 p = 0.50 p = 0.01
p = 0.05
restriction of range
problem arises when the range between the lowest and highest scores on one or both variables is limited. This will produce a coefficient that is smaller than it would be if the range were not restricted.
When we select a sample so that all elements in the population have an equal chance of being selected, we are using inferential statistics. random sampling. probability. dependent events.
random sampling
