2443: Exam 3

A perfect correlation

(-1.00 or +1.00) all of the data fall in a straight line -Indicates a perfectly consistent relationship -For each change in X, there is a predictable change in Y

Pearson correlation coefficient

(r) is the most commonly used measure of correlation -measures the degree and the direction of the linear relationship between 2 variables

Independent variable

(x-axis) - the variable from which you are predicting (manipulated)

Dependent variable

(y-axis) - the variable you are trying to predict (measured)

Phi coefficient- what is it? what does it measure? what is the equation? what are the effect sizes?

-A coefficient used when both variables are dichotomous and nominal; both variables MUST be dichotomous -Phi measures the strength of the relationship between the two variables (it also serves as the strength measure for a chi-square test) -To compute phi: 1. Convert each of the dichotomous variables to numerical values by assigning a 0 to one category and a 1 to the other category for each of the variables 2. Then apply the Pearson correlation formula to the converted data to get the phi coefficient Example: Researcher examining birth order (1st born/later born) & personality (introvert/extrovert)
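
The two steps above can be sketched in Python. This is only an illustration: the eight people, their category labels, and the helper name `pearson_r` are hypothetical and not from the course material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data: birth order and personality for 8 people.
birth_order = ["first", "first", "first", "first",
               "later", "later", "later", "later"]
personality = ["intro", "intro", "intro", "extro",
               "extro", "extro", "extro", "intro"]

# Step 1: code each dichotomous variable as 0/1.
x = [1 if b == "first" else 0 for b in birth_order]
y = [1 if p == "intro" else 0 for p in personality]

# Step 2: apply the Pearson formula to the coded values.
phi = pearson_r(x, y)
print(round(phi, 2))
```

Which category gets the 0 and which gets the 1 is arbitrary; swapping the coding for one variable only flips the sign of phi.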

When there is a specific prediction about the direction of the correlation, what test would you use for predicting a positive relationship? (notation + meaning)

-conduct a directional, one-tailed test -positive relationship: H0: ρ ≤ 0 (the population correlation is not positive) H1: ρ > 0 (the population correlation is positive)

When assessing a population for correlation and using a regular, nondirectional test, what does the sample correlation do to the null?

-when the sample correlation is near zero it provides support for H0--> leads us to the conclusion that the pop correlation is 0 -when the sample correlation is far from zero it refutes the null--> leads us to the conclusion that there is a real, nonzero correlation in the pop

Chapter 14

...

Alpha/critical values: list the critical z values for a one-tailed and a two-tailed test at each alpha: .10 .05 .01

.10 = 1.28 (1 tail), 1.645 (2 tail); .05 = 1.645 (1 tail), 1.96 (2 tail); .01 = 2.33 (1 tail), 2.58 (2 tail)
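
These values can be reproduced from the standard normal distribution. A minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`; the function name `z_critical` is made up for this example.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Positive critical z for a one- or two-tailed test at level alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.01):
    one = z_critical(alpha, 1)
    two = z_critical(alpha, 2)
    print(f"alpha={alpha}: 1 tail={one:.3f}, 2 tail={two:.3f}")
```

The printed values match the card up to rounding.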

What are the two main steps in calculating the linear regression equation?

1) Calculate slope (b) 2) Calculate Y-intercept (a)

Pearson's Correlation (Pearson's r) *Write the equation*

A perfect linear relationship: -Every change in X has a corresponding change in Y -Correlation must be between -1.00 and +1.00 -The formula for r is based on the idea of covariance: how much do X and Y "vary together"?
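
For reference, the equation this card asks for can be written with the sum of products (SP) and the sums of squared deviations (SS) used elsewhere in this set. M_X and M_Y denote the means of X and Y.

```latex
r = \frac{SP}{\sqrt{SS_X \, SS_Y}},
\qquad
SP = \sum (X - M_X)(Y - M_Y),
\qquad
SS_X = \sum (X - M_X)^2,
\qquad
SS_Y = \sum (Y - M_Y)^2
```

The numerator is the covariability of X and Y; the denominator is the variability of X and Y taken separately.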

When assessing a population for correlation and using a directional test, what does the sample correlation do to the null?

A positive value for the sample correlation would tend to refute a null hypothesis (the null says that the pop correlation is not positive)

Third Variable Problem- what is it and what is this issue with it? Partial correlation?

A problem that occurs when the researcher cannot directly manipulate variables; as a result, the researcher cannot be confident that another, unmeasured variable is not the actual cause of differences in the variables of interest. *The problem of a correlation between two variables being dependent on another (third) variable* Partial Correlation: A correlational technique that involves measuring three variables and then statistically removing the effect of the third variable from the correlation of the remaining two variables
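
The card does not give a formula for the partial correlation. The standard first-order formula, stated here as a supplement rather than as course content, removes the effect of a third variable Z from the correlation between X and Y:

```latex
r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{\left(1 - r_{XZ}^{2}\right)\left(1 - r_{YZ}^{2}\right)}}
```

If Z is unrelated to both X and Y (r_XZ = r_YZ = 0), the partial correlation equals the ordinary correlation r_XY.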

Experimental method vs correlation- what is the difference?

An experiment isolates and manipulates the independent variable to observe its effect on the dependent variable, and controls the environment in order that extraneous variables may be eliminated. Experiments establish cause and effect. A correlation identifies variables and looks for a relationship between them.

Magnitude/Strength of the Correlation- what are we measuring? What does a linear relationship indicate? What happens when the relationship between variables is not perfect?

An indication of the strength of the relationship between two variables -Measures the consistency of the relationship -*For a linear relationship, the data points could fit perfectly on a straight line*--> this means every time X increases by one point, the value of Y also changes by a consistent and predictable amount -*Relationships between variables are not always perfect*--> this means that although there is a tendency of Y to increase, when X does, the amount that Y changes is not always the same The degree of the relationship is measured by the numerical value of the correlation coefficient

Outliers- what is it? what can make them easier to spot?

An outlier is an individual with X and/or Y values that are substantially different (smaller or larger) from the values obtained for the other individuals in the data set -Outliers can have a dramatic influence on the value obtained for the correlation--> the outlier alters the value for the correlation & thereby can affect one's interpretation of the relationship between the X and Y variables--> *looking at a scatter plot instead of basing your interpretation on the numerical value of the correlation will help you spot outliers*

A researcher obtains a strong positive correlation between aggressive behavior for six-year-old children and the amount of violence they watch on television. Based on this correlation, which conclusion is justified?

Children who watch more TV violence tend to exhibit more aggressive behaviors

What is an example of a non-linear relationship?

Curvilinear

What is a "zero" relationship?

Exists when all of the scores on one variable are associated with a wide range of scores on another variable; a circle on a scatter plot

Example of determining significance for the Pearson correlation coefficient: from the hugs and stress example we found r = -0.67 and our n = 6.

From the table we use df = 6 - 2 = 4. With a two-tailed test at α = 0.05 we find our rcv = ±0.811 (since our r is negative, we compare against -0.811). Is robt more extreme than rcv? Nope. |-0.67| = 0.67 is smaller than 0.811, so we fail to reject the null. We can conclude in our write-up: Using a Pearson's correlation coefficient, a negative relationship was found, but it was not significant, r(4) = -0.67, p > 0.05. This fails to support the hypothesis that hugs obtained and stress level are related in this sample.

What is the null hypothesis for correlation coefficients?

H0: ρ = 0 (There is no relationship between the two variables [i.e., population r is equal to 0]) Correlation between the two variables does not exist in the population!

What is the alternative hypothesis for correlation coefficients?

H1: ρ ≠ 0 (There is a relationship between the two variables [population r is not equal to 0]) Correlation between the two variables does exist in the population!

How are the null hypothesis & alternative hypothesis interpreted for correlation? (notation + meaning)

H0: ρ = 0 (there is no pop correlation) H1: ρ ≠ 0 (there is a real, nonzero correlation)

Covariance

How much do X and Y "vary together"

What is the goal of making predictions (about relationships)?

Identifying the factors that indicate when an event will occur

If a correlation is positive

Increases in X tend to be accompanied by increases in Y

What does having a strong relationship between two variables allow us to do?

Knowing the score on one variable allows us to predict the score on the other variable within a small range

What does it mean for prediction when there is no relationship between two variables?

Knowing the score on one variable does not allow us to predict the score on the other variable with any degree of precision

Which are more common: linear or nonlinear relationships?

Linear

Draw a scatterplot that shows: -perfect, negative correlation -no linear trend -strong positive correlation -weak negative correlation Which number (approx) are associated with each?

Perfect neg: -1.00 No linear trend: 0.00 A strong pos: +0.90 A weak neg: -0.40

A researcher wants to measure the relationship between gender (male/female) and voter registration (yes/no). Which correlation is appropriate?

Phi-coefficient

Directionality

Possibility that when two variables, A and B, are correlated variable A causes variable B or variable B causes variable A. -The inference made with respect to the direction of a causal relationship between two variables

What is a scatter plot?

Provides an initial indication of how scores on two variables are associated with each other

Magnitude: -1.00 - +1.00

Rare, not likely to happen in Psychology

What is covariance?

Refers to the extent to which two variables vary together such that they have shared variance

What is a regression line?

Regression is a method of finding an equation describing the best-fitting line for a set of data •Both variables are measured at the interval level (interval data) •Data must be from a random sample •Normal data distribution (or have a large sample) •How to define a "best fitting" straight line when there are many possible straight lines?--> a line that is the best fit for the actual data that minimizes prediction errors

What is the covariance between X and Y represented by?

Sum of products (SPxy)

What does the angle of the line indicate?

That changes in one variable are associated with changes in the other variable

What does it mean when both variables contain a certain amount of variance?

That there are differences among the scores for the two variables

Assumption of Causality

The assumption that a correlation indicates a causal relationship between the two variables

When reliability is high, what do we expect about correlation?

The correlation between two measurements should be strong and positive

The formula for r (correlation) is the ratio comparing what?

The covariability of X and Y (numerator) with the variability of X and Y separately (denominator). *write the formula*

What is Pearson r a ratio of?

The covariance between two variables to the variance of the two variables

ρ^2 (rho)

The fraction of the variance in Y due to change in X (as opposed to random variation)

What are the estimates for weak, moderate, and strong correlation coefficients?

Correlation coefficient value --> strength of relationship
± .00 to .29 --> none to weak
± .30 to .69 --> moderate
± .70 to 1.00 --> strong

What is the purpose of putting a regression line through data?

The line through the data •Makes the relationship easier to see •Shows the central tendency of the relationship •Can be used for prediction *Regression analysis precisely defines the line*

What does N represent in the df formula?

The number of pairs of scores involved in the computation of the correlation

What are continuous variables? What does it mean when variables being analyzed are continuous in nature?

The possible IV values can fall along a numeric continuum; they are more related -They are measured along a numeric continuum and may be illustrated using a scatter plot

The form of the correlational relationship- what do we want our data points to do?

The relationship tends to have a linear form--> the points on the scatter plot tend to cluster around a straight line

Correlational Method

The technique whereby two or more variables are systematically measured and the naturally occurring relationship between them (i.e. how much can one be predicted from the other) is assessed.

Coefficient of Determination

The value of r^2 -It measures the proportion of variability in one variable that can be determined from the relationship with the other variable -The coefficient of determination gives you the proportion of the variance of your dependent variable (Y) that can be explained by variation in your independent variable (X) -Measures the size and strength of the correlation EXAMPLE: With a correlation of r = 0.80, r^2 = 0.64, so 64% of the variability in the Y scores can be predicted from the relationship with X

Example of third variable problem

There may be a correlation between homeless population and crime rate in that both tend to be high or low in the same locations. Is crime causing homelessness? Are homeless populations causing crimes? Third variables may be: Drug abuse Unemployment

What do you need in order to relate one variable to another?

There must be differences in the scores of BOTH variables

Direction of the relationship

an indication of which values of the dependent variable are associated with which values of the independent variable * The direction in which changes in one variable are associated with changes in another

Which correlation is used to compute data that are also suitable for an independent-measures t test?

point-biserial

What and how do you report your findings when you calculate correlation?

•Report •Whether it is statistically significant •Concise test results •Value of correlation •Sample size •p-value or level •Type of test (one- or two-tailed) • coefficient of determination (depending on question asked) •E.g., r = -0.76, n = 48, p < .01, two tails.

Positive relationship/correlation

-an increase in one variable is accompanied by an increase in the other variable - indicated by a positive correlation coefficient

What does significance take into account in relation to correlations? What must the obtained correlation do in order to be significant?

-Significance takes into account strength AND the number of participants in your study. -You may have a very strong r (close to ±1), but if you have a very small sample, this strength is meaningless (not significant). Computing r alone, only provides you a measure of strength *Obtained correlation must exceed the magnitude/strength of the critical value*

Curvilinear Relationships *draw this*

-When a correlation coefficient does not adequately indicate the degree of relationship between the variables -Have to use different statistics -(ex. Amount of Anxiety and Performance graph)

Correlations involve the comparison of two measures to see:

-Whether or not there is a relationship -The strength of the relationship magnitude -The direction of the relationship -Whether you can predict a score (Y) from a score (X)

Negative relationship/ correlation

-an increase in one variable is accompanied by a decrease in the other variable - indicated by a negative correlation coefficient

Spearman's rho (ρ) In which 2 situations is the spearman's rho used? Example?

-A coefficient used when at least one variable is measured on an ordinal scale (rank-order data) -Usually not a straight-line relationship--> learning a new skill may take time to develop: you may be horrible at the beginning but improve significantly over time as you practice, and eventually you reach a point at which you are only improving a tiny bit -Used in 2 situations: 1. When the original data are ordinal; that is, when the X and Y values are ranks. In this case, you simply apply the Pearson correlation formula to the set of ranks. 2. When a researcher wants to measure the degree to which the relationship between X and Y is consistently one-directional, independent of the specific form of the relationship--> the original scores are first converted to ranks; then the Pearson correlation is used with the ranks. This measures the degree of consistency in the relationship for the original scores Example: A teacher rank-ordering students by leadership abilities, smallest to largest (1st, 2nd, 3rd)
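
Situation 2 (convert to ranks, then use Pearson on the ranks) can be sketched in Python. The data and the helper names `to_ranks` and `pearson_r` are hypothetical; giving tied scores the average of their ranks is a common convention assumed here, not something stated on the card.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def to_ranks(values):
    """Rank scores from smallest (1) to largest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson r computed on the ranks."""
    return pearson_r(to_ranks(xs), to_ranks(ys))

# Hypothetical scores: Y always increases with X, but not along a line.
practice_hours = [1, 2, 3, 4, 5]
skill_score = [1, 8, 27, 64, 125]

r_raw = pearson_r(practice_hours, skill_score)
rho = spearman_rho(practice_hours, skill_score)
print(round(r_raw, 3), round(rho, 3))
```

Because the relationship is perfectly consistent in direction, the Spearman value is 1 even though the Pearson value on the raw scores is below 1.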

Example of coefficient of determination- when there is an imperfect prediction & when there is a perfect prediction.

-A moderate, positive correlation, r = +0.60, between IQ scores & GPA -Students with high IQs tend to have higher grades than those with low IQs--> it's not a perfect prediction--> meaning that although they tend to have higher grades, this is not ALWAYS true -In this case, knowing a student's IQ score can help explain the fact that different students have different GPAs -PART of the differences in GPA are accounted for by IQ -r^2 = 0.36, which means that 36% of the variance in GPA (Y) can be explained by IQ (X) -When r = 1.00 (perfect linear relationship), r^2 = 1.00--> there is 100% predictability--> if you know a person's monthly salary you can predict their annual salary with perfect accuracy

Restrictive Range

-A variable that is truncated and has limited variability -A variable by its very nature needs to vary. -Correlation coefficient value (size) will be affected by the range of scores in the data -Severely restricted range may provide a very different correlation than would a broader range of scores -To be safe, never generalize a correlation beyond the sample range of data

Correlations & Outliers- what is it? how to spot an outlier in a scatter plot?

-An outlier is an extremely deviant individual in the sample -Characterized by a much larger (or smaller) score than all the others in the sample -In a scatter plot, the point is clearly different from all the other points -Outliers produce a disproportionately large impact on the correlation coefficient

Would adding 2 points to each of the X values in a pearson correlation change the correlation for the resulting data? What changes the correlation?

-Because the Pearson correlation describes the pattern formed by the data points, any factor that does not change the pattern also does not change the correlation -If 5 points were added to each value of X in a set of data, every data point would move to the right; if the same were done to the Y values, every data point would move up -ALL the data points shift together, meaning that the overall pattern is not changed (the correlation coefficient does not change) -Multiplying the X and/or Y values by a positive constant stretches or compresses the pattern, but does not change the correlation -Multiplying the X values or the Y values (but not both) by a negative constant DOES change the sign of the correlation coefficient, forming a mirror image of the pattern
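
A quick numerical check of these rules, using a small made-up data set (the scores and the helper name `pearson_r` are hypothetical):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_original = pearson_r(x, y)
r_shifted = pearson_r([v + 2 for v in x], y)    # add a constant to X
r_scaled = pearson_r([v * 3 for v in x], y)     # multiply X by a positive constant
r_flipped = pearson_r([v * -1 for v in x], y)   # multiply X by a negative constant

print(round(r_original, 2), round(r_shifted, 2),
      round(r_scaled, 2), round(r_flipped, 2))
```

Only the last transformation changes r, and it changes the sign but not the magnitude.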

Causality vs Directionality Does determining a relationship assert cause? (Assumptions of a correlation?)

-Causality: The assumption that a correlation indicates a causal relationship between the two variables -Directionality: the inference made with respect to the direction of a causal relationship between two variables *Correlation describes a relationship but does not demonstrate causation*

The strength of the relationship in relation to correlation- how do you find it? what is it called?

-Correlation measures the degree of relationship between 2 variables on a scale of 0 to 1.00. Although the number provides a measure of the degree of relationship, the squared correlation provides a BETTER measure of the strength of the relationship *Squared correlation (r^2) measures the gain in accuracy that is obtained from using a correlation for prediction--> measures the proportion of variability in the data that is explained by the relationship between X and Y* Called: COEFFICIENT OF DETERMINATION

Characteristics of correlational relationships

-Direction: negative (inverse) or positive; indicated by the sign, + or - of the correlation coefficient -Shape/ Form: linear is most common -Magnitude/ Strength (varies from 0 to ± 1) *These characteristics are all independent of each other, i.e. direction has no bearing on the strength of the correlation, etc.*

Linear equation- explain each variable in the equation. How can you find y?

-General equation for a line: Y = bX + a -X and Y are the variables EXAMPLE: X = Height, Y = Weight -a and b are fixed constants: -a is the intercept of the line, the vertical height at which the line meets the Y-axis (vertical axis) -b is the slope of the line, how far the line goes up or down for every unit it goes to the right (like a stair step pattern). You can also think of slope as the increase or decrease in Y expected with each unit increase in X. •If you can find a and b from the data, you can predict Y given a value for X
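
A minimal sketch of finding a and b from data and then predicting Y. The card does not give the formulas; the standard least-squares ones are assumed here, b = SP / SS_X and a = M_Y - b * M_X. The data and the function name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (b, a) for the least-squares line Y = bX + a."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    b = sp / ss_x        # step 1: slope
    a = my - b * mx      # step 2: Y-intercept
    return b, a

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

b, a = fit_line(x, y)
predicted_y = b * 6 + a   # predicted Y for a new score of X = 6
print(round(b, 2), round(a, 2), round(predicted_y, 2))
```

Note that X = 6 lies outside the range of the sample data, so in practice a prediction like this should be treated with caution.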

Review of variability

-How much the scores in a set differ from one another, i.e. the degree to which the scores are spread out or clustered together. Ex: Two classes might have the same average exam score, but one set of scores might be much more spread out *Variability tells us distance of scores from the mean, (i.e. how much individual scores vary from the average/middle of the distribution)*

Sum of Products (SP)

-Measures the amount of covariability between two variables *SP is the sum of the products of two difference scores (the X deviation times the Y deviation for each individual)*

A correlation coefficient measures what? What is the squared correlation coefficient (what is this also called)?

-Measures the degree of relationship on a scale from 0 to 1.00 *A correlation is NOT a proportion/percent* -Coefficient of Determination: The squared correlation coefficient may be interpreted as the proportion of shared variability

A correlation of zero indicates- what does this look like on a scatterplot?

-No consistency at all between variables or no relationship -The data points are scattered randomly with no clear trend

Sum of Products (computational formula)- what's the formula and why is it helpful?

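-SP = ΣXY - (ΣX)(ΣY)/n -The computational formula works directly from the raw scores, so no deviation scores have to be calculated first -The definitional formula emphasizes SP as the sum of the products of two difference scores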
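
A short check that the computational formula gives the same SP as the definitional formula, SP = Σ(X - Mx)(Y - My). The scores and function names are made up for illustration.

```python
def sp_definitional(xs, ys):
    """SP as the sum of products of deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def sp_computational(xs, ys):
    """SP from raw scores: sum(XY) - (sum(X) * sum(Y)) / n."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xy - (sum(xs) * sum(ys)) / n

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

sp_def = sp_definitional(x, y)
sp_comp = sp_computational(x, y)
print(sp_def, sp_comp)
```

With these scores, ΣXY = 53, ΣX = 15, ΣY = 15 and n = 5, so SP = 53 - 225/5 = 8 by either route.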

Why are correlations used? (4 reasons)

1. *Prediction:* If 2 variables are known to be related in some way, it is possible to use one of the variables to make accurate predictions about the other *Example: SAT scores are positively correlated with GPA*--> *Using relationships to make predictions is called regression* 2. *Validity:* Using correlation is a common technique for demonstrating validity. Example: A psychologist develops a new test for measuring intelligence. If the test measures intelligence, then the scores on the test should be related to other measures of intelligence (such as IQ tests, performance on learning tasks, problem solving ability etc.) -The psychologist would measure the correlation between the new test and each of these other measures to demonstrate that the new test is valid. 3. *Reliability:* A measurement procedure is considered reliable to the extent that it produces stable, consistent measurements--> it will produce the same or nearly the same results when the individual is measured twice under the same conditions. 4. *Theory Verification:* Theories made about specific variables can be tested by determining the correlation between 2 variables. *Example: Theory about parents' IQs and the child's IQ.*

How to determine whether the correlation was significant? (5 steps)

1. Calculate r 2. Go to the table of critical values for the Pearson correlation coefficient 3. Find your degrees of freedom: take your sample size (n) and subtract 2 (df = n - 2) 4. Always use the column for a two-tailed test at α = 0.05 and find rcv 5. Just like with our z-test, if robt is more extreme than rcv (|robt| > rcv) you can reject the null (find significance).
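
The five steps can be sketched in Python with the numbers from the hugs and stress example in this set (r = -0.67, n = 6, rcv = 0.811 for df = 4). The lookup table below is a hypothetical stand-in that holds only that one entry; a real table has one row per df.

```python
def check_significance(r_obt, n, critical_values):
    """Compare |r_obt| with the critical r for df = n - 2."""
    df = n - 2
    r_cv = critical_values[df]           # table lookup (two-tailed, alpha = .05)
    significant = abs(r_obt) > r_cv      # must exceed the critical value
    return df, r_cv, significant

# Hypothetical one-row table: only the df = 4 value quoted in this set.
CRITICAL_R_TWO_TAILED_05 = {4: 0.811}

df, r_cv, significant = check_significance(-0.67, 6, CRITICAL_R_TWO_TAILED_05)
decision = "reject H0" if significant else "fail to reject H0"
print(f"r({df}) = -0.67, r_cv = {r_cv}, {decision}")
```

Taking the absolute value of r handles negative correlations without having to switch the sign of the critical value.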

Things you should consider when encountering a correlation? (4)

1. Correlation simply describes a relationship between two variables- it does not explain why the two variables are related--> NOT PROOF OF CAUSE AND EFFECT 2. The value of a correlation can be affected greatly by the range of scores represented in the data (restricted range) 3. One or two extreme data points, often called outliers, can have a dramatic effect on the value of the correlation 4. When judging how "good" a relationship is, do not focus on the numerical value of the correlation--> a correlation of +0.50 does not mean that you can make the prediction with 50% accuracy--> to describe how accurately one variable predicts the other, you must square the correlation: r^2 = 0.50^2 = 0.25, or 25% of the total variability

How do we describe the relationship between variables?

1. Nature 2. Strength 3. Direction

What are the 4 key assumptions of correlation?

1. Ratio or interval scale of measurement 2. Both variables are continuous 3. Each variable is more or less normally distributed 4. Linear relationship between variables

What do the guidelines for interpreting the values of Pearson r not keep in mind?

1. Sample size 2. Statistical significance

When you obtain a non zero correlation for a sample, the purpose of the hypothesis test is to decide between the following 2 interpretations:

1. There is no correlation in the pop (ρ = 0) and the sample value is the result of sampling error. Remember, a sample is not expected to be identical to the population. There is always some error between a sample stat & the corresponding population parameter 2. The nonzero sample correlation accurately represents a real, nonzero correlation in the population. This is the alternative stated in H1.

Point-Biserial Correlation- example What does it measure?

A coefficient used when one variable is interval/ratio (numerical scores) and one variable is dichotomous/nominal -Measures the strength of the relationship between the two variables- a large correlation near +/- 1 would indicate there is a significant/predictable relationship (maybe between cheating and the amount of light in the room- well lit/dimmed lights) Examples of dichotomous variables: college grad vs non college grad; first born vs later born child; success vs failure on a particular task; older than 30 yrs old vs younger than 30 EXAMPLE of point-biserial data being analyzed: number of solved puzzles in a well lit room vs a dimly lit room
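
The puzzle example can be sketched in Python by coding the lighting condition as 0/1 and applying the Pearson formula. The puzzle counts below are invented for illustration, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Dichotomous variable: 1 = well lit room, 0 = dimly lit room.
well_lit = [1, 1, 1, 1, 0, 0, 0, 0]
# Numerical variable: puzzles solved (hypothetical scores).
puzzles_solved = [8, 7, 9, 8, 5, 6, 4, 5]

r_pb = pearson_r(well_lit, puzzles_solved)
print(round(r_pb, 3), round(r_pb ** 2, 3))
```

The same two groups of scores could also be compared with an independent-measures t test, which is why the point-biserial correlation suits this kind of data.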

Linear relationship

A relationship between variables is appropriately represented by a straight line, such that increases or decreases in scores for one variable are associated with corresponding increases or decreases in scores for another variable

Nonlinear relationship

A relationship between variables which is not appropriately represented by a straight line

What does it mean if the amount of covariance is high relative to the variance of the two variables?

A strong relationship will exist between the two variables, and the value for r will be high

r, in a sample

An estimate of a population coefficient of correlation (ρ-rho)

The most common use of correlation is to...

Measure straight line relationships

What does the scatter plot look like as the relationship between two variables becomes weaker?

More spherical

Do we use Pearson r for relationships that are nonlinear?

No

Does correlation = causation? Why?

No, although there may be a causal relationship, the simple existence of a correlation does not prove it--> In order to establish a cause and effect relationship, it is necessary to conduct a true experiment in which one variable is manipulated by a researcher and other variables are rigorously controlled

Sum of Products (definitional formula)

Numerator of the Pearson Correlation equation--> Similar to SS (sum of squared deviations), which ends up as part of the denominator of our equation. SP = Σ(X - Mx)(Y - My)

What does it mean if the amount of covariance is relatively low?

There will be a weak relationship between the two variables, as reflected by a smaller value of r

Why do we use a scatter plot?

To initially see if the scores on one variable are associated with scores on another variable

What are correlational statistics based on?

Variance

Magnitude: +/- .50

Very good in Psychology

When do we use statistical procedures such as correlation?

When variables are continuous rather than categorical

Restricted Range

Whenever a correlation is computed from scores that do not represent the full range of possible values, you should be cautious in interpreting the correlation Example: Relationship between IQ and creativity--> you get a sample of limited IQ scores of your fellow college students (110-130), The correlation within this restricted range could be completely different from the correlation that would be obtained from a full range of IQ scores -so the complete correlation could show a strong positive correlation, but then the restricted range (sample group) you look at could show correlation that is near zero *DOES NOT SHOW THE BIGGER PICTURE*
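
A small made-up illustration of the effect: the nine (X, Y) pairs below are hypothetical, chosen so that the full range shows a strong positive correlation while the middle of the range shows none.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores covering the full range of X.
x_full = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_full = [1, 3, 2, 5, 4, 5, 8, 7, 9]

# Restricted sample: keep only the individuals with X between 4 and 6.
restricted = [(x, y) for x, y in zip(x_full, y_full) if 4 <= x <= 6]
x_restricted = [x for x, _ in restricted]
y_restricted = [y for _, y in restricted]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(round(r_full, 2), round(r_restricted, 2))
```

The restricted sample is drawn from the same individuals, yet its correlation says nothing about the strong relationship across the full range.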

