Research Methods Chapter #5

criterion validity

An empirical form of measurement validity that establishes the extent to which a measure is correlated with a behavior or concrete outcome that it should be related to.

discriminant validity

An empirical test of the extent to which a measure does not associate strongly with measures of other, theoretically different constructs. Also called divergent validity. See also convergent validity.

convergent validity

An empirical test of the extent to which a measure is associated with other measures of a theoretically similar construct. See also discriminant validity.

In your own words, describe the difference between categorical and quantitative variables. Describe the differences between ordinal, interval, and ratio scales.

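Categorical: levels are categories. Quantitative: levels are coded with meaningful numbers. Among quantitative scales, an ordinal scale represents a ranked order in which the distances between levels may not be equal; an interval scale has equal intervals between levels but no true zero; a ratio scale has equal intervals and a zero that truly means "nothing."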

Which requires stronger correlations for its evidence: convergent validity or discriminant validity?

Convergent validity
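
As a rough illustration of the pattern this card describes, the sketch below checks whether a measure's convergent correlations are stronger than its discriminant ones. The function name `pattern_supports_validity` is hypothetical, and the rule it applies (every convergent |r| exceeds every discriminant |r|) is a simplifying assumption rather than a formal test. The two values come from the conscientiousness example in this set (r = .50 with tidiness, r = .09 with general knowledge).

```python
def pattern_supports_validity(convergent_rs, discriminant_rs):
    """Return True if every convergent correlation is stronger (in absolute
    value) than every discriminant correlation. Illustrative heuristic only."""
    weakest_convergent = min(abs(r) for r in convergent_rs)
    strongest_discriminant = max(abs(r) for r in discriminant_rs)
    return weakest_convergent > strongest_discriminant

# Conscientiousness example: tidiness (similar construct) vs. general
# knowledge (different construct).
result = pattern_supports_validity([0.50], [0.09])
print(result)
```

Absolute values are used because the sign of r can depend on how each measure is scored, as in the coin rotation example where faster performance means a lower score.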

internal reliability

In a measure that contains several items, the consistency in a pattern of answers, no matter how a question is phrased. Also called internal consistency.

Name the three common ways in which researchers operationalize their variables.

Self-report measures, observational measures, physiological measures.

validity

The appropriateness of a conclusion or decision. See also construct validity, external validity, internal validity, statistical validity.

test-retest reliability

The consistency in results every time a measure is used.

reliability

The consistency of the results of a measure.

interrater reliability

The degree to which two or more coders or observers give consistent ratings of a set of targets.

content validity

The extent to which a measure captures all parts of a defined construct.

face validity

The extent to which a measure is subjectively considered a plausible operationalization of the conceptual variable in question.

Explain why a variable will usually have only one conceptual definition but can have multiple operational definitions:

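The conceptual definition is the researcher's single abstract definition of the variable, whereas an operational definition is the way the researcher decides to measure it, and the same concept can be measured in many ways (e.g., memory as the number of nonsense syllables recalled or as the accuracy of retelling a short story).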

slope direction

The upward, downward, or neutral slope of the cluster of data points in a scatterplot.
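
One way to make slope direction concrete is to compute the slope of the least-squares line through the points and look at its sign. This is a minimal sketch: the function name `slope_direction` and the `tolerance` parameter are hypothetical, and the sample scores are made up for illustration.

```python
def slope_direction(x, y, tolerance=1e-9):
    """Classify the least-squares slope of y on x as upward, downward, or neutral."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sum of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    slope = sxy / sxx
    if slope > tolerance:
        return "upward"
    if slope < -tolerance:
        return "downward"
    return "neutral"

# Hypothetical scores: higher x tends to go with higher y
print(slope_direction([1, 2, 3, 4], [2, 3, 5, 6]))
```

The sign of this slope always matches the sign of the correlation coefficient r, which is why a positive r implies a cloud of points sloping upward from left to right.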

What do face validity and content validity have in common?

Both rely on subjective judgment rather than on empirical data.

Dr. Johnson wants to do a study to investigate whether the physiological measure, heart rate variability, varies over time or whether it is a trait that stays stable within the same person over time. He records participants' heart rate variability once at the beginning of the semester and once at the end of the semester. He finds a high positive correlation (r = .55) between the first and second time points. What would a scatterplot of these results (heart rate variability at the beginning of the semester on the x-axis, heart rate variability at the end of the semester on the y-axis) look like? a. The cloud of points would slope upward from left to right. b. The cloud of points would have no slope at all. c. The cloud of points would slope downward from left to right. d. There isn't enough information given to determine this.

a. The cloud of points would slope upward from left to right. FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — A positive correlation coefficient means that there is an upward (from left to right) slope. The higher scores on heart rate variability at the first time point correspond to higher scores on heart rate variability at the second time point.

Which of the following would NOT be considered an operational definition of memory? a. a cognitive process to retain and restore past information b. the accuracy with which a person can retell a short story based on the number of correct details c. the number of nonsense syllables a person can recall d. how quickly a person can tell whether or not a test item appeared in the list studied

a. a cognitive process to retain and restore past information FEEDBACK: More about Conceptual and Operational Variables — A conceptual variable is at an abstract level and the operational definition is the way that the researcher decides to measure that conceptual variable.

Classify each result below as an example of face validity, content validity, convergent and discriminant validity, or criterion validity: a. A professor gives a class of 40 people his five-item measure of conscientiousness (e.g., "I get chores done right away," "I follow a schedule," "I do not make a mess of things"). Average scores are correlated (r = -.20) with how many times each student has been late to class during the semester. b. A professor gives a class of 40 people his five-item measure of conscientiousness (e.g., "I get chores done right away," "I follow a schedule," "I do not make a mess of things"). Average scores are more highly correlated with a self-report measure of tidiness (r = .50) than with a measure of general knowledge (r = .09). c. The researcher e-mails his five-item measure of conscientiousness (e.g., "I get chores done right away," "I follow a schedule," "I do not make a mess of things") to 20 experts in personality psychology, and asks them if they think his items are a good measure of conscientiousness. d. The researcher e-mails his five-item measure of conscientiousness (e.g., "I get chores done right away," "I follow a schedule," "I do not make a mess of things") to 20 experts in personality psychology, and asks them if they think he has included all the important aspects of conscientiousness.

a. criterion validity b. convergent and discriminant validity c. face validity d. content validity

Which of the following is an example of a categorical variable? a. declared major in college b. current age c. IQ score d. blood pressure reading

a. declared major in college FEEDBACK: Scales of Measurement — Categorical or nominal variables are those that fit into categories. Majors in college, such as psychology, business, or biology, are categorical.

Sun Mi is designing a questionnaire on loneliness. She is concerned that some features of loneliness are similar to depression and to low self-esteem. What type of validity does she need to show to demonstrate that her questionnaire assesses loneliness and not depression or low self-esteem? a. discriminant validity b. convergent validity c. face validity d. criterion validity

a. discriminant validity FEEDBACK: Convergent Validity and Discriminant Validity: Does the Pattern Make Sense? — Sun Mi wants to show that scores on her loneliness questionnaire diverge from scores on measures of depression and low self-esteem.

Classify each operational variable below as categorical or quantitative. If the variable is quantitative, further classify it as ordinal, interval, or ratio. a. Degree of pupil dilation in a person's eyes in a study of romantic couples (measured in millimeters) b. number of books a person owns c. a book's sales rank on amazon.com d. location of a person's hometown (urban, rural, or suburban) e. nationality of the participants in a cross-cultural study of Canadian, Ghanaian, and French students f. a student's grade in school

a. quantitative, ratio b. quantitative, ratio c. quantitative, ordinal d. categorical e. categorical f. quantitative, interval

Lorenzo is studying aggression in children. First, Lorenzo administers a questionnaire to the children that asks them about their feelings of aggression. Then Lorenzo and his lab partner observe the children while they play and record instances of aggression. What type of measure is the questionnaire? a. self-report b. physiological c. observational d. ordinal scale

a. self-report FEEDBACK: Three Common Types of Measures — The questionnaire used to ask children to report their own aggression is a self-report measure.

When using a measure to assess a trait that is expected to remain stable over time, a researcher would expect to get consistent results each time the measure is used. This type of reliability is known as which of the following? a. test-retest b. internal c. interrater d. discriminant

a. test-retest FEEDBACK: Reliability of Measurement: Are the Scores Consistent? — For traits that are expected to remain stable over time, the measurement results for these traits should remain stable over time.

Classify each of the following results as an example of internal reliability, interrater reliability, or test-retest reliability: a. A researcher finds that people's scores on a measure of extroversion stay stable over 2 months. b. An infancy researcher wants to measure how long a 3-month-old baby looks at a stimulus on the right and left sides of a screen. Two undergraduates watch a tape of the eye movements of ten infants and time how long each baby looks to the right and to the left. The two sets of timings are correlated r = .95. c. A researcher asks a sample of 40 people a set of five items that are all capturing how extroverted they are. The Cronbach's alpha for the five items is found to be .65.

a. test-retest reliability b. interrater reliability c. internal reliability

Julie has developed an intervention to improve the relationship between parents and pre-school-aged children. In order to evaluate the effectiveness of her intervention, Julie video records the parents interacting with their children at the end of the study. She has two research assistants watch the videos and rate the level of warmth in the interaction. Julie then correlates the ratings of the raters. She finds a high positive correlation (r = .87) between the two raters. What would a scatterplot of these results (ratings by the first research assistant on the x-axis, ratings of the second research assistant on the y-axis) look like? a. The cloud of points would slope downward from left to right. b. The cloud of points would slope upward from left to right. c. There isn't enough information given to determine this. d. The cloud of points would have no slope at all.

b. The cloud of points would slope upward from left to right. FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — A positive correlation coefficient means that there is an upward (from left to right) slope. The higher ratings of warmth by the first rater correspond to higher ratings of warmth by the second rater.

Some colleges no longer require the SAT I or the ACT tests, instead basing their admissions on other factors, such as high school GPA. A large reason that they have done this is that they have found a low correlation between the scores on the tests and the students' freshman year GPA. In other words, they were concerned that college entrance exams lacked which type of validity? a. face validity b. criterion validity c. discriminant validity d. content validity

b. criterion validity FEEDBACK: Criterion Validity: Does It Correlate with Key Behaviors? — The concern is the lack of criterion validity of these tests. The test scores did not predict the freshman year GPA.

Julie has developed an intervention to improve the relationship between parents and pre-school-aged children. In order to evaluate the effectiveness of her intervention, Julie video records the parents interacting with their children at the end of the study. She has two research assistants watch the videos and rate the level of warmth in the interaction. Julie then correlates the ratings of the raters. She finds a high positive correlation (r = .87) between the two raters. What type of reliability is she examining? a. test-retest b. interrater c. construct d. internal

b. interrater FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — Julie had two observers rate the same participants at the same time, and she found a strong positive correlation. This supports strong interrater reliability for the measurement of warmth.

Georgina graduated as valedictorian of her high school class because of her class ranking. What type of scale is used for the quantitative variable of class ranking? a. interval scale b. ordinal scale c. ratio scale d. nominal scale

b. ordinal scale FEEDBACK: Scales of Measurement — Class ranking is based on how you rank relative to the rest of the class, so this is an ordinal scale. The number of grade points between students ranked next to each other varies.

The Department of Motor Vehicles receives a complaint that some of their employees who administer the road test pass a much higher percentage of test-takers than other employees. In this example, what aspect of the road test is being questioned? a. the test-retest reliability of the road test b. the interrater reliability of the road test c. the internal reliability of the road test d. the measurement validity of the road test

b. the interrater reliability of the road test FEEDBACK: Introducing Three Types of Reliability — This is a question of whether two observers would have similar findings and the complaint is asserting that they wouldn't, thus it is an interrater reliability question.

What information can you learn from a scatterplot that you cannot learn from the correlation coefficient? a. the strength of the relationship b. the values for each pair of measurements c. whether the relationship is statistically significant d. the direction of the relationship

b. the values for each pair of measurements FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — Both scatterplots and correlation coefficients show the direction and strength, but only the scatterplot allows you to see each plotted point. Neither the scatterplot nor the correlation coefficient will show whether the relationship is statistically significant.

Mendoza et al. (2009) introduced a coin rotation task as a convenient test of motor dexterity. It involves timed completion of twenty 180° rotations of a nickel using the thumb, index, and middle fingers. The results were compared to the results of another widely used test of motor dexterity, the finger-tapping task, in which participants tap their index fingers as many times as possible in 10 seconds. The results indicated that there was a statistically significant relationship between the finger-tapping task and the coin rotation task (r = -.40). What would a scatterplot of these results (coin rotation scores on the x-axis, finger-tapping scores on the y-axis) look like? a. The cloud of points would slope upward from left to right. b. The cloud of points would have no slope at all. c. The cloud of points would slope downward from left to right. d. There isn't enough information given to determine this.

c. The cloud of points would slope downward from left to right. FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — A negative correlation coefficient means that there is a downward (from left to right) slope. The lower scores of the coin rotation task correspond to quicker turning, and higher scores on the tapping task correspond to faster tapping, producing a negative correlation.

Mendoza et al. (2009) introduced a coin rotation task as a convenient test of motor dexterity. It involves timed completion of twenty 180° rotations of a nickel using the thumb, index, and middle fingers. Research participants' results on the coin rotation task are compared with their results on two widely used tests of motor dexterity: the finger-tapping task and the Grooved Pegboard task. What empirical way of assessing construct validity is being used? a. divergent validity b. face validity c. convergent validity d. criterion validity

c. convergent validity FEEDBACK: Convergent Validity and Discriminant Validity: Does the Pattern Make Sense? — If a measure correlates strongly with other measures of the same construct, it shows convergent validity. This is considered evidence for the validity of the measure.

Mendoza et al. (2009) introduced a coin rotation task as a convenient test of motor dexterity. It involves timed completion of twenty 180° rotations of a nickel using the thumb, index, and middle fingers. Research participants' results on the coin rotation task are compared with their results on a test of grip strength — a measure of another construct: global upper-extremity strength. The correlation between the coin rotation task and the grip strength task was found to be not statistically significant. This comparison provides support for which type of measurement validity? a. convergent validity b. predictive validity c. divergent validity d. face validity

c. divergent validity FEEDBACK: Convergent Validity and Discriminant Validity: Does the Pattern Make Sense? — The coin rotation task should correlate strongly with other tests of motor dexterity, but should not correlate strongly with measures of other traits. Grip strength is related to motor functioning, but does not capture dexterity, so this is discriminant evidence for validity.

Dr. Kamran studies domestic violence and has designed a self-report scale that is meant to assess men's negative attitudes toward women. To validate her scale, she administers it to two groups of recently incarcerated male prisoners: prisoners convicted of domestic violence and prisoners convicted of other crimes. Dr. Kamran finds a statistically significant difference in the mean scores of the two groups. What technique is Dr. Kamran using to validate her scale? a. physiological measurements b. interrater reliability test c. known-groups paradigm d. test-retest technique

c. known-groups paradigm FEEDBACK: Known-Groups Evidence for Criterion Validity — She is using two known groups of people, some who committed domestic violence and some who didn't, to test concurrent validity of her scale.

Lorenzo is studying aggression in children. First, Lorenzo administers a questionnaire to the children that asks them about their feelings of aggression. Then Lorenzo and his lab partner observe the children while they play and record instances of aggression. The results of these two parts of the study are compared. The total number of instances of aggression for each child is used as the measure in the observational part of the study. What type of quantitative variable is this? a. categorical b. ordinal scale c. ratio scale d. interval scale

c. ratio scale FEEDBACK: Scales of Measurement — The total instances of aggression for each child is a ratio scale measurement, since a child could have a score of 0 (no aggressive instances) and it would be meaningful to say one child was twice as aggressive as another.

Which of the following is an example of a physiological measure? a. speed in solving a puzzle b. ratings by an observer c. skin conductance d. responses to a questionnaire

c. skin conductance FEEDBACK: Three Common Types of Measures — Skin conductance is the only one of these that operationalizes a variable by recording biological data.

Which statistic is used to represent the internal reliability of multiple-item self-report scales? a. s, the standard deviation b. r, the correlation coefficient c. Kappa d. Cronbach's alpha

d. Cronbach's alpha FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — Cronbach's alpha is a statistic based on the average of inter-item correlations. It is used to assess internal reliability of a scale.

Josiane has found an online test that claims to measure IQ. It consists of choosing the correct definitions for a series of words. She is concerned that it doesn't include any tests of other things that are part of IQ, such as problem solving or visual-spatial ability. Which type of validity is she questioning? a. face validity b. discriminant validity c. criterion validity d. content validity

d. content validity FEEDBACK: Face Validity and Content Validity: Does It Look Like a Good Measure? — Her concern is that the test does not capture all parts of the construct of intelligence. In her subjective judgment, parts of the construct of intelligence are not included in a vocabulary test.

Dr. Nolan gives his new anxiety measure to a group of his colleagues who are anxiety experts. They agree that the questions on the measure appear to assess anxiety symptoms. This suggests that Dr. Nolan's measure has which of the following types of measurement validity? a. criterion validity b. content validity c. discriminant validity d. face validity

d. face validity FEEDBACK: Face Validity and Content Validity: Does It Look Like a Good Measure? — Face validity means that a measure appears to be a plausible or reasonable measure of the variable.

Lorenzo is studying aggression in children. First, Lorenzo administers a questionnaire to the children that asks them about their feelings of aggression. Then Lorenzo and his lab partner observe the children while they play and record instances of aggression. The results of these two parts of the study are compared. Lorenzo runs a statistical test to find how consistent the responses are to different wordings of items on the questionnaire. What type of reliability is he examining? a. construct b. interrater c. test-retest d. internal

d. internal FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — Lorenzo has more than one question measuring the same construct, so he needs to check the internal reliability, that is, whether the children's responses are consistent across items.

In a study of aggression in children, a researcher has his undergraduate research assistants watch a group of children on the playground and record the number of instances of physical or verbal attacks. Which category of measured variable is this researcher using? a. neuropsychological measures b. self-report measures c. physiological measures d. observational measures

d. observational measures FEEDBACK: Three Common Types of Measures — The researcher is recording the children's observable behavior, namely acts of aggression.

Dr. Johnson wants to do a study to investigate whether the physiological measure, heart rate variability, varies over time or whether it is a trait that stays stable within the same person over time. He records participants' heart rate variability once at the beginning of the semester and once at the end of the semester. He finds a high positive correlation (r = .65) between the first and second time points. What type of reliability is he examining? a. interrater b. construct c. internal d. test-retest

d. test-retest FEEDBACK: Using the Correlation Coefficient r to Evaluate Reliability — Dr. Johnson has measured the same set of participants on the same measure twice and found a strong positive correlation. This is a sign of strong test-retest reliability.

Professor Morgan questions whether the ratings he receives from his students on "teaching effectiveness" indicate how much the students learn in his class or whether they are just a reflection of how much his students like him. What aspect of the ratings is he questioning? a. the statistical significance of the ratings b. the use of an interval scale c. the reliability of the ratings d. the measurement validity of the ratings

d. the measurement validity of the ratings FEEDBACK: Measurement Validity of Abstract Constructs — Professor Morgan is concerned about whether the tool used to assess his teaching effectiveness actually measures that construct or some other construct.

To establish criterion validity, researchers make sure the scale or measure is correlated with ______

some relevant behavior or outcome

Cronbach's alpha

A correlation-based statistic that measures a scale's internal reliability. Also called coefficient alpha.
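
The definition above is verbal only, so here is a minimal sketch of how alpha is often computed. It uses the common variance-based formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), which is an assumption about the formula intended here; the function name `cronbach_alpha` and the ratings are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha from a list of k items, where each item is a list
    holding every respondent's score on that item (same respondent order)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total scale score for each respondent (sum across items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical 1-5 ratings from six respondents on three items
item1 = [4, 5, 2, 3, 5, 1]
item2 = [4, 4, 2, 3, 5, 2]
item3 = [5, 5, 1, 3, 4, 1]
alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 2))
```

When respondents answer all items in a consistent pattern, the variance of the total scores is large relative to the item variances, and alpha moves toward 1.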

strength

A description of an association indicating how closely the data points in a scatterplot cluster along a line of best fit drawn through them.

known-groups paradigm

A method for establishing criterion validity, in which a researcher tests two or more groups, who are known to differ on the variable of interest, to ensure that they score differently on a measure of that variable.
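
A known-groups comparison comes down to checking whether the groups' mean scores differ. The sketch below computes the mean difference as Welch's t statistic, which is one possible choice and not necessarily the test used in the examples in this set; the function name `welch_t` and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical scale scores for two groups known to differ on the variable
known_high_group = [22, 25, 27, 24, 26]
known_low_group = [15, 18, 14, 17, 16]
t = welch_t(known_high_group, known_low_group)
print(round(t, 2))
```

A large positive t here would mean the group expected to score higher does score higher, which is the pattern the known-groups paradigm looks for. Judging statistical significance would additionally require degrees of freedom and a p-value, which this sketch omits.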

physiological measure

A method of measuring a variable by recording biological data.

observational measure

A method of measuring a variable by recording observable behaviors or physical traces of behaviors. Also called behavioral measure.

self-report measure

A method of measuring a variable in which people answer questions about themselves in a questionnaire or interview.

interval scale

A quantitative measurement scale that has no "true zero," and in which the numerals represent equal intervals (distances) between levels (e.g., temperature in degrees). See also ordinal scale, ratio scale.

ordinal scale

A quantitative measurement scale whose levels represent a ranked order, in which it is unclear whether the distances between levels are equal (e.g., a 5-star rating scale). See also interval scale, ratio scale.

ratio scale

A quantitative scale of measurement in which the numerals have equal intervals and the value of zero truly means "nothing." See also interval scale, ordinal scale.

correlation coefficient r

A single number, ranging from -1.0 to 1.0, that indicates the strength and direction of an association between two variables.
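
To show where this number comes from, the following is a minimal sketch of the Pearson formula using only the standard library. The function name `pearson_r` and the scores are hypothetical; the two lists could stand for the same participants measured at two time points, as in a test-retest reliability check.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    # Sum of cross-products divided by the root of the product of sums of squares
    cross_products = sum(a * b for a, b in zip(dx, dy))
    sum_sq_x = sum(a * a for a in dx)
    sum_sq_y = sum(b * b for b in dy)
    return cross_products / sqrt(sum_sq_x * sum_sq_y)

# Hypothetical scores for five participants measured twice
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
r = pearson_r(time1, time2)
print(round(r, 2))
```

The sign of the result gives the direction of the association, and its distance from zero gives the strength.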

categorical variable

A variable whose levels are categories (e.g., male/female). Also called nominal variable.

quantitative variable

A variable whose values can be recorded as meaningful numbers.

