Stats 207


Why do we do regression lines?

To understand why something varies

True or false: Error variance = sampling error?

True!

What is type I error and what is it related to?

Type I error is rejecting a null hypothesis that is actually true. Its probability is alpha (α), the significance level chosen for the test (commonly .05); we reject H0 when the p value is less than or equal to α.

What is type II error and what is it related to?

Type II error is retaining (failing to reject) the null hypothesis when it is actually false. It can only happen when the p value is above α, so H0 is retained. Its probability is beta (β), and power equals 1 − β.

The regression line always passes through the point... a. (0, 0) b. (X̄, Ȳ) c. (Ȳ, 0) d. (a, b)

b - this point is the two means (X̄, Ȳ); when X is at its mean, the least-squares regression line predicts the mean of Y
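
A small sketch can make this concrete. The data below are made up purely for illustration, and `fit_line` is a hypothetical helper name, not something from the course materials. It fits the least-squares line and checks that the prediction at the mean of X is the mean of Y.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope b, intercept a) of the least-squares line Y-hat = bX + a."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar  # this choice of intercept forces the line through (x_bar, y_bar)
    return b, a

# Made-up illustrative data (an assumption, not from the flashcards)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = fit_line(xs, ys)
predicted_at_mean = b * mean(xs) + a  # prediction when X equals its mean
```

Because the intercept is defined as a = Ȳ − bX̄, the prediction at X̄ equals Ȳ for any data set, not only this one.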

What is so important about the normal distribution?

1. Many dependent variables are assumed to be approximately normally distributed in the population. 2. If we can assume that a variable is normally distributed, then we can make a number of inferences about values of that variable. 3. If we draw a very large number of samples and plot their means, the resulting distribution (THE SAMPLING DISTRIBUTION of the mean) will be approximately normal under a wide variety of conditions. 4. Most statistical procedures we will employ have, somewhere in their derivation, an assumption that a variable is normally distributed.

If the whiskers on a boxplot are much longer on the right than on the left, we would suspect that the distribution is.... A. Positively skewed B. Negatively skewed C. Symmetric D. Distorted

A

If there is a correlation of 0.00, how will the "best-fitting" line drawn through the points most likely run? a. straight across the page b. straight down the page c. from upper left to lower right d. from lower left to upper right

A

Sometimes we reject the null hypothesis when it is true. This is technically referred to as... A. A type 1 error B. A type II error C. a mistake D. good fortune

A

The standard deviation of a sampling distribution is known as... A. The standard error B. The variance C. error. D. The sampling deviation

A

The symbols a and b are frequently referred to as... a. regression coefficients b. constants c. slopes d. regression correlations

A

The value of the test statistic that would lead us to reject the null hypothesis is called... A. The critical value B. The test value C. The rejection value D. The acceptance value

A

We are most likely to randomly pick which score from an actual data set? A. The mode B. The median C. The mean D. none of the above

A

We want to study the mean difference in autonomy between first-born and second-born children. Instead of taking a random sample of children we take a random sample of families and sort the children into first- and second-born. The dependent variable is a measure of autonomy. This experiment would most likely employ A. a repeated measures analysis. B. an independent measures analysis. C. a correlation coefficient. D. a scatterplot.

A

What is the residual? a. Error of prediction b. Regression coefficients c. Standardized data d. Variability

A

When testing the means of independent samples, the null hypothesis is best thought of as A. the mean of population 1 is equal to the mean of population 2. B. the mean of population 1 is unequal to the mean of population 2. C. the mean of a set of difference scores is equal to 0. D. the two population means don't differ by more than 10 points.

A

When variables A and B are independent, the variance of A - B is equal to A. the variance of A plus the variance of B. B. the variance of A. C. variance of A minus the variance of B. D. it cannot be determined from the information given here.

A
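
The reasoning behind this answer can be written out as a short derivation. This is the standard identity for the variance of a difference, not something specific to this course; independence makes the covariance term vanish.

```latex
\operatorname{Var}(A - B) = \operatorname{Var}(A) + \operatorname{Var}(B) - 2\,\operatorname{Cov}(A, B)
\quad\Longrightarrow\quad
\operatorname{Var}(A - B) = \operatorname{Var}(A) + \operatorname{Var}(B)
\quad \text{when } \operatorname{Cov}(A, B) = 0 .
```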

When we speak about pooling variances we are referring to A. a weighted average of the two groups' variances. B. using the variance of the larger group. C. adding up the variances in each of two groups and dividing by 2. D. taking the variance of both sets of data combined

A
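
As a minimal sketch, the weighted average can be computed as below. The function name `pooled_variance` is hypothetical, and the weights are assumed to be each group's degrees of freedom (n − 1), which is the usual convention for the independent-samples t test.

```python
def pooled_variance(var1, n1, var2, n2):
    """Average two sample variances, weighting each by its degrees of freedom."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# With equal group sizes the pooled variance is the simple average of the two variances
equal_n = pooled_variance(4, 5, 8, 5)

# With unequal group sizes the larger group pulls the result toward its own variance
unequal_n = pooled_variance(10, 11, 20, 21)
```

The denominator n1 + n2 − 2 is also the degrees of freedom for the independent-samples t test.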

When we use a regression equation to make a prediction, the errors that we make are often referred to as... a. residuals b. predictions c. standard errors

A

If we do not know X, our measure of error in predicting Y is... a. The standard deviation of Y b. The standard deviation of X c. the standard error of estimate d. the standardized residual

A (without X, the best prediction of Y is its mean, so the typical error of prediction is the standard deviation of Y)

The correlation between variables is defined as... a. The covariance of those variables divided by the product of their standard deviations b. the covariance of those variables divided by the variance of X c. The covariance of those variables divided by the variance of Y d. The cross product of all the pairs of scores

A. r = cov(X, Y) / (sX · sY), where cov(X, Y) = ∑(X − X̄)(Y − Ȳ) / (N − 1), and sX and sY are the standard deviations of X and Y
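
The definition translates almost directly into code. The sketch below assumes sample statistics (N − 1 denominators throughout), and `pearson_r` is a hypothetical helper name.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation = covariance divided by the product of the standard deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    n = len(xs)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return cov_xy / (stdev(xs) * stdev(ys))  # stdev also uses N - 1

# A perfectly linear increasing relationship and a perfectly linear decreasing one
r_up = pearson_r([1, 2, 3], [2, 4, 6])
r_down = pearson_r([1, 2, 3], [6, 4, 2])
```

Dividing by the standard deviations is what keeps r between -1 and +1 regardless of the units of X and Y.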

The sampling distribution of the sample mean is (approximately) normally distributed if... (Must list both A and B)

A. n is greater than or equal to 30 (a common rule of thumb, from the Central Limit Theorem) OR B. X itself is normally distributed

Are r and r² related? Why? a. yes b. no

A. yes.... r is the correlation coefficient and r² is the proportion of the variability in one variable that is associated with variability in the other
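
Several cards in this set rely on squaring r, so a tiny sketch may help. The function names are hypothetical; the value .50 is the one used in the cards about body image, smoking, and warm parenting.

```python
def variance_explained(r):
    """Proportion of variability in Y associated with variability in X."""
    return r ** 2

def variance_unexplained(r):
    """Proportion of variability in Y that is independent of X."""
    return 1 - r ** 2

shared = variance_explained(0.50)         # one quarter
independent = variance_unexplained(0.50)  # three quarters
```

Note that the sign of r does not matter here: a correlation of -.50 accounts for the same proportion of variability as +.50.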

The equation for a straight line is an equation of the form Y = bX + a. In regression, "Y" becomes... a. Y hat b. Y line c. X

A: Y Hat

What is a relationship between variables?

As one variable X changes another variable Y changes

A researcher was interested in examining the different levels of aggression shown by college students and prisoners with life sentences. Aggression was measured by surveys given during one session. Which analysis should the researcher perform on the mean aggression scores of the groups? A. related means t test B. independent means t test C. correlation D. regression

B

Data points at the extremes of the distribution have... A. Little effect on the variance B. More effect on the variance than scores at the center of the distribution C. Are undoubtedly correct D. Distort the usefulness of the median

B

For the following data set [1, 7, 9, 15, 33, 76, 103, 118], what is the median location? A. 5 B. 4.5 C. 33 D. 24

B
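
The median location is (N + 1) / 2, which the following sketch applies to the data set from the question. `median_location` is a hypothetical helper name.

```python
from statistics import median

def median_location(n):
    """Position of the median in the ordered data: (N + 1) / 2."""
    return (n + 1) / 2

data = [1, 7, 9, 15, 33, 76, 103, 118]
loc = median_location(len(data))  # halfway between the 4th and 5th ordered scores
med = median(data)                # average of the 4th and 5th scores, 15 and 33
```

This shows why options C and D are distractors: 33 is one of the middle scores and 24 is the median itself, while the question asks for the location.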

In using ordinal data, which measure of central tendency is probably least useful? A. Mode B. Mean C. Median D. You cannot use any measure of central tendency with ordinal data.

B

One of the Problems we face when we try to draw conclusions from data is that we have to deal with... A. means B. error variance C. population size D. hypothesis

B

Professor Neuberg found that the mean number of alcoholic drinks consumed at a party was much higher for males than for females. If the median and modal numbers of drinks consumed by males and females were both zero, how can the difference in means be explained? A. The mean number of drinks consumed by females was disproportionately increased by outliers drinking lots of drinks. B. The mean number of drinks consumed by males was disproportionately increased by outliers drinking lots of drinks. C. The difference in means gives no information that is useful and should not be explained D. The difference most likely comes from an error in calculation

B

The author in the correlation chapter draws vertical and horizontal lines at the mean of each variable to cut the graph into four quadrants. When there is a high positive correlation between two variables, we would expect most of the data points to fall... a. to the right of the vertical line b. in the upper right and lower left quadrants c. below the horizontal line d. equally in all four quadrants

B

The central feature of all hypothesis testing procedure is... A. The sample mean B. The sampling distribution C. The range of outcomes D. The type of experiment we run

B

The t statistic differs from z in that A. it applies to variances. B. it is used when population variances are not known. C. it involves an assumption of normality. D. A and B but not C

B

To go from the equation for z to the equation for t, we must A. substitute sample means for population means. B. substitute sample variances for population variances. C. substitute sample standard deviations for population variances. D. draw very large samples.

B

Using a regression equation to predict a value will always lead to... a. a horribly wrong prediction b. the most plausible prediction c. a highly accurate prediction d. a sad winding road to the dark memories of your past

B

We are most likely to reject a null hypothesis if the test statistic we compute is... A. Very small B. Quite extreme C. What we would expect if the null hypothesis were true D. Equal to the number of observations in the sample

B

When we have standardized data, the slope will be denoted as... a. b b. β c. 1.0 d. r

B

A data set of intelligence scores was collected from high school seniors. The IQ scores ranged from 82 to 113. Which of the following is probably NOT a reasonable estimate of the standard deviation? A. 6.2 B. 4.7 C. 35.4 D. All of the above are reasonable

C

A dichotomous variable is one that... a. can take on any number of values b. can take on one of only three values c. can take on one of only two values d. can take on only one value

C

For the following data set [1, 9, 9, 9, 11, 28], which of the following is false? A. The mode is 9 B. The median is 9 C. The mean is 9 D. The median location is 3.5

C

If the correlation between a body image measure and an eating disorders measure is .50, we can conclude that... a. body image has very little to do with eating disorders b. 50% of the variability in the eating disorders scales is associated with variability in body image c. one quarter of the variability in the eating disorders scores is associated with variability in body image d. overweight people eat too much

C

If the correlation between smoking and lung cancer is .50, smoking accounts for ___% of the variability in lung cancer a. 50% b. 75% c. 25% d. 33.3%

C

If we were to repeat an experiment a large number of times and calculate a statistic such as the mean for each experiment, the distribution of these statistics would be called... A. The distributional distribution B. The error distribution C. The sampling distribution D. The test outcome

C

In a boxplot the width of the box encompasses... A. All of the observed values B. All but the most extreme values C. Approximately 50% of the observed values D. The center-most 10% of the values

C

In a regression, the prediction of health symptoms from stress has beta = 0.5. This means that for every 1-point increase in stress there is a.... a. .25 increase in symptoms b. negative increase in symptoms c. .5 increase in symptoms d. no change in symptoms

C

Our assumption that population variances are equal is called the assumption of A. equal dispersion. B. heterogeneity of variance. C. homogeneity of variance. D. pooled variances.

C

People in the stock market refer to a measure called the "standard deviation," although it is calculated somewhat differently from the one discussed here. It is a good guess that this measure refers to.... A. The riskiness of the stock B. The value of the stock C. How much the stock price is likely to fluctuate D. How much money you are likely to earn from buying that stock

C

Spearman's correlation coefficient (rS) applies to... a. any data b. linear data c. data that are ranks d. only continuous data

C

The most important characteristic of two independent samples is A. we measure the same subjects on two separate times. B. the subjects in one sample are part of the same family as subjects in the other sample. C. the set of scores for one sample is uncorrelated with the scores in the other sample. D. the subjects in a group are people who know each other.

C

The sampling distributions help us test hypotheses about means by... A. Telling us exactly what the population mean is B. Telling us how variable the population is C. Telling us what kinds of means to expect if the null hypothesis is true D. Telling us what kinds of means to expect if the null hypothesis is false

C

To look at the sampling distribution of the mean we would... A. Calculate a mean and compare it to the standard deviation B. Calculate a mean and compare it to the standard error C. Calculate many means and plot them D. Look the sampling distribution up in a book

C

When we have an independent sample t test, the degrees of freedom are equal to A. N B. N1 + N2 - 1 C. N1 + N2 - 2 D. N - 1

C

When we have considerable spread of the points about the regression line, the slope of that line can be/will be ______ the slope of a similar line when there is less scatter. a. less than b. more than c. the same as d. more extreme than

C

When we say that the correlation between age and test performance is significant we mean... a. There is an important relationship between age and performance b. The true correlation between age and performance in the population is equal to 0 c. The true correlation between age and performance in the population is not equal to 0 d. Getting older causes you to do poorly on tests

C

Which r-value represents the strongest correlation? a. +.50 b. -.50 c. -.75 d. 1.65

C

The covariance measure is... a. The probability of obtaining a significant result b. the degree to which observations predict each other c. The degree to which observations vary together d. The probability of finding variance

C (covariance.. vary)

If we have a regression line predicting the amount of improvement in your performance as a function of the amount of tutoring you receive, an intercept of 12 would mean that... a. you need to have 12 hours of tutoring to get an A b. if you don't have any tutoring, the best you can do is a grade of 12 c. even without tutoring you will improve d. tutoring helps

C (key point is the amount of improvement, and not actual score)

When we have standardized data, the slope will be denoted as... a. b b. 1.0 c. β d. r

C. β (beta)

What is the standard normal distribution?

A distribution created by a z-score transformation of normally distributed data; its mean is 0 and its standard deviation is 1

A significant slope means that a. The slope is positive b. there is a significant relationship between X and Y in the population c. the slope is not equal to 0 in the population d. both b and c

D

For the data set [1, 3, 3, 5, 5, 5, 7, 7, 9], the value "5" is... A. The mode B. The median C. The mean D. all of the above

D

Hypothesis testing is necessarily a part of... A. Descriptive Statistics B. Order Statistics C. Test Construction Statistics D. Inferential Statistics

D

If the association between warm parenting practices and self-esteem is .50, then how much of the variability in self-esteem is INDEPENDENT of warm parenting practices? a. 10% b. 25% c. 90% d. 75%

D

The correlation between two variables is a measure of the degree to which... a. points cluster together around some best fitting straight line b. Differences in one variable can be predicted from differences in the other variable c. One variable varies with the other variable d. all of the above

D

The difference between s and sigma is that sigma is... A. The value of the standard deviation in a sample B. The long range average of the variance over repeated sampling C. The biased estimate of s. D. The value of the standard deviation in a population

D

The distribution of differences between means is A. normally distributed for very large samples. B. approaches normal as sample sizes increase. C. may be nonnormal for small samples. D. all of the above.

D

The population variance is... A. An estimate of sample variance B. Calculated exactly like the sample variance C. A biased estimate D. Usually an unknown that we try to estimate

D

The relationship between the cost of chocolate chip cookies and their rated quality shows a correlation that represents an even distribution among data points. What is most likely the correlation? a. -.50 b. -.80 c. .80 d. .00

D

The standard deviation for 8, 9, and 10 is... A. -3.0 B. 0.0 C. .67 D. 1.0

D
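
The answer can be checked with Python's standard library. The sketch assumes the sample formula (N − 1 in the denominator), which is what gives 1.0; the population formula is shown only for contrast.

```python
from statistics import stdev, pstdev

scores = [8, 9, 10]

# Sample standard deviation: sqrt(((-1)^2 + 0^2 + 1^2) / (3 - 1)) = sqrt(1)
sample_sd = stdev(scores)

# Population standard deviation divides by N instead: sqrt(2 / 3)
population_sd = pstdev(scores)
```

Option A can be ruled out without any arithmetic, since a standard deviation is never negative.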

What does beta (β) stand for? a. Fish b. the standardized slope c. regression coefficient d. b and c

D

What does n stand for? A. The size of the population B. The number of scores C. The size of the sample D. B and C

D

When there is only one predictor variable in a regression, beta is equal to what? a. Nothing, beta stands alone b. Equal to r² c. Equal to b d. Equal to r

D
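
One way to see this is to standardize both variables and refit the line. The sketch below uses made-up data, and `z_scores` and `slope` are hypothetical helper names; it compares the slope on standardized data (beta) with r computed as b · sX / sY.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def slope(xs, ys):
    """Least-squares slope of Y on X."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up illustrative data (an assumption, not from the flashcards)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b = slope(xs, ys)                          # raw-score slope
beta = slope(z_scores(xs), z_scores(ys))   # slope after standardizing X and Y
r = b * stdev(xs) / stdev(ys)              # correlation recovered from b
```

With a single predictor, beta and r agree; with several predictors this equality no longer holds in general.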

When we restrict the range of X or Y, we may... a. lower the correlation from what it would otherwise be b. raise the correlation from what it might be c. leave the correlation the same as it would otherwise be d. All of the above are possible

D

Which of the following can be defined algebraically? A. Mean B. Median C. Median Location D. Both a and c

D

Which of the following statements about the mode is true? A. It must be an actual score that occurred in the data set. B. It usually consists of one number C. It cannot be calculated algebraically (with a formula) D. All of the above are true

D

You would obtain a negative value for the variance if... A. All observations were at the mean B. The distribution is very negatively skewed C. The distribution is positively skewed D. You would never obtain a negative variance

D

The following is an example of a negative correlation. As height increases, so does the foot size.

False... this is a positive correlation; for a negative correlation, foot size would have to decrease as height increases

Restricted range has no effect on correlation coefficients.

False... There is an effect

Correlation coefficients closer to 0 reflect a strong relationship. T/F?

False... coefficients closer to 0 reflect a weak relationship; coefficients closer to +1 or -1 reflect a strong one

Correlations cannot be used to test associations between two dichotomous variables. T/F?

False.... They CAN be used

What is the Central Limit Theorem?

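If we draw many samples of size n from a population and take the mean of each, the distribution of those sample means approaches a normal distribution as n increases, whatever the shape of the population (provided its variance is finite)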
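
A quick simulation can illustrate the theorem. Everything specific here is an assumption made for the demonstration: the population is exponential with mean 1 and standard deviation 1 (strongly skewed), the sample size is 30, and 2000 samples are drawn with a fixed seed. `sample_means` is a hypothetical helper name.

```python
import random
from statistics import mean, stdev

def sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n from a skewed population; return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

means = sample_means(n=30, num_samples=2000)

center = mean(means)   # should sit near the population mean of 1
spread = stdev(means)  # should sit near 1 / sqrt(30), about 0.18
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the individual scores come from a heavily skewed population.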

What does µxbar = µx mean?

It means that the mean of the sampling distribution of the mean (the mean of all possible sample means) is equal to the population mean

What is a correlation?

It's a relationship between variables

The standard error of estimate is DIFFERENT from the standard deviation formula because its sum of squares is divided by...

N-2

In a standard normal distribution, what are the mean, standard deviation, and variance?

The mean is 0, the variance is 1, and the standard deviation is 1

What is the standard error of estimate equation?

sY−Ŷ = √( ∑(Y − Ŷ)² / (N − 2) ), that is, the square root of the sum of squared residuals (Y minus Y hat) divided by N − 2
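
The formula can be sketched in code as follows. The data are made up for illustration, and `standard_error_of_estimate` is a hypothetical helper name; the line is fitted by ordinary least squares before the residuals are computed.

```python
from math import sqrt
from statistics import mean

def standard_error_of_estimate(xs, ys):
    """sqrt( sum((Y - Y_hat)^2) / (N - 2) ) for the least-squares line of Y on X."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    residuals = [y - (b * x + a) for x, y in zip(xs, ys)]  # errors of prediction
    return sqrt(sum(e ** 2 for e in residuals) / (len(xs) - 2))

# Made-up illustrative data (an assumption, not from the flashcards)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

see = standard_error_of_estimate(xs, ys)
```

The divisor is N − 2 because two quantities, the slope and the intercept, are estimated from the data before the residuals are computed.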

Steps to hypothesis testing

Step 1: State the research hypothesis (H1)
Step 2: State the null hypothesis (H0), which directly conflicts with H1
Step 3: Obtain or collect data
Step 4: Construct the sampling distribution of a test statistic as if H0 is true
Step 5: Calculate the sample test statistic
Step 6: Make a decision based on the test statistic obtained (for example, a z value)
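
The steps above can be sketched for the simplest case, a two-tailed one-sample z test with a known population standard deviation. The function name `one_sample_z_test` and the numbers in the example (a sample mean of 105 from n = 36, tested against a population with mean 100 and standard deviation 15) are assumptions chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test of H0: population mean equals mu0 (sigma known)."""
    standard_error = sigma / sqrt(n)             # spread of the sampling distribution under H0
    z = (sample_mean - mu0) / standard_error     # the sample test statistic
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for a two-tailed test
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject_h0 = abs(z) > critical                # the decision
    return z, critical, p_value, reject_h0

# Hypothetical example values
z, critical, p_value, reject_h0 = one_sample_z_test(105, 100, 15, 36)
```

Rejecting H0 here does not prove H1; it only means a sample mean this extreme would be unlikely if H0 were true, and the chance of a Type I error is held at alpha.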

Correlation coefficients can range from -1 to 1. T/F?

TRUE

A correlation of .65 between depression and anxiety suggests that people who are highly depressed are reasonably likely to be highly anxious.

TRUE.... What if it was -.65?

What is the sampling distribution of the mean?

The distribution of all possible sample means (with a fixed n from a population)

How are sampling distributions and sampling errors related?

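Sampling error is the chance variability of a statistic from sample to sample; the standard error, which is the standard deviation of the sampling distribution, measures how large that variability is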

Standard error of the mean (Standard Error)

The standard deviation of a sampling distribution of a sample mean

What is σxbar= σx / √n ?

The standard deviation of the sampling distribution of sample means (the standard error) is equal to the standard deviation of the population divided by the square root of the sample size n.
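
As a minimal sketch, the formula is a one-liner. The function name `standard_error` and the example values (a population standard deviation of 15) are assumptions for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation over sqrt(n)."""
    return sigma / sqrt(n)

se_25 = standard_error(15, 25)    # 15 / 5
se_100 = standard_error(15, 100)  # 15 / 10
```

Quadrupling the sample size only halves the standard error, because n enters through its square root.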

If X and Y are standardized then what do their standard deviations equal? a. 1 b. 5 c. .5 d. none of the above

A

If a store manager wanted to stock the men's clothing department with shirts fitting the most men, which measure of central tendency of men's shirt sizes should be employed? A. Mode B. Mean C. Median D. Average

A

If the correlation between X and Y is negative, the slope of the regression equation must be... a. negative b. positive c. non-significant d. it could be a or b

A

What ranges of p-values are highly significant?

.001 to .03

What are the approximate proportions of the area under the normal curve in each one-standard-deviation band from -3 to +3?

.02, .14, .34, .34, .14, .02

If the correlation between the rating of the cookie quality and the cookie price is .30 and the probability of that result is .35 under the null hypothesis, we would say that... a. The correlation is not significant b. The correlation is significant c. The difference is too close to call d. We don't have any way to come to a conclusion

A

