Chapter 5: Evaluating Research Review
Match each type of robustness test with the scenario that best fits.
1) Split-half method
2) Test-retest method
3) Pilot testing
A) A sociologist has access to a random sample of 300 people to test a new survey on mental health that will go into the field next week
B) A sociologist needs to test a new composite measure of mental health
C) A sociologist has access to a sample group for at least a year to test a new survey on mental health
1) B 2) C 3) A
Match each dimension of internal validity of measures with the key question that best fits.
1) Does it seem appropriate and sensible?
2) Does it correlate with a measure that it should predict?
3) Does it cover all of the different meanings of a concept?
4) Do the items in a measure connect to the underlying concept?
A) Construct
B) Content
C) Criterion
D) Face
1) D 2) C 3) B 4) A
Which of the following is required to perform the test-retest method of examining robustness?
A longitudinal design *The test-retest method of testing robustness requires that a measure is administered to the same sample at two different times, thus a longitudinal design is absolutely necessary.*
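A minimal sketch of the test-retest logic, assuming hypothetical scores from the same five respondents at two waves of a panel (the numbers and the pearson helper below are invented for illustration):

```python
# Hypothetical mental-health scores for the same five respondents at two waves.
time_1 = [10, 14, 8, 12, 15]
time_2 = [11, 13, 8, 12, 16]

def pearson(xs, ys):
    """Pearson correlation between two lists of equal length."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# A high correlation across the two administrations suggests a robust measure.
print(round(pearson(time_1, time_2), 2))
```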
The split-half measure could only be used to evaluate robustness with which type of measure?
Composite variable *The split-half method is appropriate when dealing with a measure consisting of multiple items, as in the CBCL and CES-D. The researcher randomly splits the set of items for a measure into two sets to create two separate measures instead of one. These two measures are then tested in a sample of individuals--the same people respond to each of the items in the first measure and then respond to each of the items in the second. Comparing how the same people respond to the two measures gives some idea of how reliable the overall measure would be. A reliable measure would mean that the average scores for each measure were similar.*
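A minimal sketch of the split-half idea, assuming a handful of hypothetical respondents answering six made-up items (the data below are invented, not taken from the CBCL or CES-D):

```python
import random

# Hypothetical responses: each row is one respondent's answers to six items on a 0-3 scale.
responses = [
    [2, 3, 2, 3, 2, 3],
    [1, 1, 0, 1, 1, 0],
    [3, 3, 3, 2, 3, 3],
    [0, 1, 1, 0, 0, 1],
]

# Randomly split the six items into two halves to form two separate measures.
items = list(range(6))
random.shuffle(items)
half_a, half_b = items[:3], items[3:]

# Score each respondent on each half (the average of the items in that half).
scores_a = [sum(r[i] for i in half_a) / len(half_a) for r in responses]
scores_b = [sum(r[i] for i in half_b) / len(half_b) for r in responses]

# Similar scores on the two halves suggest the overall measure is reliable.
for a, b in zip(scores_a, scores_b):
    print(f"half A: {a:.2f}   half B: {b:.2f}")
```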
The two dimensions of criterion-related validity are predictive validity and ____.
Concurrent validity *Criterion-related validity can be broken down into two different but closely connected subsets: concurrent validity and predictive validity. A measure with concurrent validity will correlate strongly with some preexisting measure of the construct that has already been deemed valid. When a measure has predictive validity, it will correlate strongly with a measure that it should predict.*
Dr. White has discovered that one of his indicators of popularity--how often you speak on the phone with your friends--is not highly correlated with other indicators of popularity. Which type of validity is Dr. White concerned with?
Construct validity *Construct validity refers to how well multiple indicators are connected to some underlying factor. This type of validity is important because some phenomena are not directly observable and can only be identified by cataloging their observable symptoms. An indicator that is not highly correlated with other indicators would reduce the construct validity of the overall measure.*
Dr. Gonzales is concerned that including income, education, and occupational status, but not wealth, in a measure of socioeconomic status is problematic. Which type of validity is Dr. Gonzales concerned with?
Content validity *Content validity is all about coverage, or how well a measure encompasses the many different meanings of a single concept. Does the measure represent all of the facets of the concept? A multidimensional concept is a major challenge to valid measurement. A measure may capture one or several dimensions well but not tap into all dimensions and, therefore, be less valid than one that captures all dimensions. Extremely rich people may not have an income (or even an occupation!) if they have inherited wealth. Thus, they may score low on dimensions of socioeconomic status, a distortion that would be corrected if we included wealth in our measure.*
Dr. Jackson has designed a survey to examine the mental health of college students. One measure he includes in the survey is a composite measure of happiness, and he is concerned about its reliability. Which would be appropriate to examine in order to address Dr. Jackson's concern?
Cronbach's alpha *Cronbach's alpha (α) is a straightforward calculation that measures a specific kind of reliability for a composite variable; that is, a variable that averages a set of items (typically from a survey) to measure some concept. Internal reliability concerns the degree to which the various items lead to a consistent response and, therefore, are tapping into the same concept. The higher the alpha, the more internally reliable the composite is, meaning that people respond to the various items in a consistent way.*
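A minimal sketch of the standard Cronbach's alpha formula applied to invented item responses (the item wordings and numbers below are hypothetical):

```python
def cronbach_alpha(items):
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total scores)."""
    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    k = len(items)
    item_variance_sum = sum(variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total score
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical happiness items; each list holds one item's scores across five respondents.
items = [
    [4, 2, 5, 3, 4],   # "I feel cheerful"
    [4, 1, 5, 3, 5],   # "I am satisfied with my life"
    [3, 2, 4, 2, 4],   # "I laugh often"
]
print(round(cronbach_alpha(items), 2))  # higher alpha = more internally reliable composite
```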
To achieve reliable results, researchers make sure that their measures are capturing a concept in ways that allow for comparison across ____.
Data collections *To know how dependable a single measure is, researchers must make comparisons, and the broadest comparison they can make is across data collections.*
Dr. Pedulla applies to jobs with hypothetical candidates by sending résumés to employers and waiting to see if they respond and ask the candidate to come in for an interview. He does this to see how employers respond to different employment histories (e.g., stretches of unemployment, work history) listed on each résumé. What is Dr. Pedulla's manipulation in this field experiment?
Employment histories listed on each résumé *A manipulation is something that is done to some subjects of an experiment but no others so that the outcomes of the two groups can be compared.*
Which of the following is usually considered the most internally valid method?
Experiments *Internal validity of a study refers to the degree to which a study establishes a causal effect of the independent variable on the dependent variable. Experimental designs tend to be the most internally valid method, as their primary advantage over other methods is that they can establish causality.*
____ is the first, and most shallow, assessment of validity that a researcher can make.
Face validity *The most basic and simple dimension of internal validity is face validity: literally, whether something looks valid on the face of it. Face validity is the first, and most shallow, assessment of validity that a researcher can make. It works best when it leads to more rigorous assessments of more complex dimensions of validity.*
External validity involves two basic questions: (1) How representative is the group being studied? and (2) which of the following questions?
How "real" is the study? *This question is most commonly asked about experiments, which tend to maximize the internal validity of the study (establishing cause and effect) but are vulnerable to issues of external validity. Some experiments might seem contrived to subjects--would people act in the lab the same way that they do in the real world? People might be self-conscious knowing that they are being observed or simply think that the whole situation seems artificial. In these cases, experiments lose external validity because we do not know if the same results would emerge outside the lab.*
Dr. Larsen has employed multiple observers to record the race of people portrayed in different pictures. Which type of reliability should Dr. Larsen be concerned about with this measure?
Intercoder reliability *When multiple observers or coders are used in data collection, it is important to report measures of intercoder reliability. Intercoder reliability, which can be calculated in several ways, reveals how much different coders or observers agree with one another when looking at the same data. The more agreement among coders, the more reliable the coding protocol is, making comparison and contrasts easier.*
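A minimal sketch of the simplest intercoder-reliability statistic, percent agreement, using invented codes from two hypothetical observers (chance-corrected statistics such as Cohen's kappa are also common in practice):

```python
# Hypothetical codes assigned by two observers to the same eight photographs.
coder_1 = ["Black", "White", "Asian", "White", "Latino", "Black", "White", "Asian"]
coder_2 = ["Black", "White", "Asian", "Latino", "Latino", "Black", "White", "White"]

# Percent agreement: the share of cases on which the two coders assigned the same code.
agreement = sum(a == b for a, b in zip(coder_1, coder_2)) / len(coder_1)
print(f"percent agreement: {agreement:.0%}")  # more agreement = more reliable coding protocol
```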
Dr. Jackson has designed a survey to examine the mental health of college students. One measure he includes in the survey is a composite measure of happiness. Which type of reliability should Dr. Jackson be concerned about with this measure?
Internal reliability *Internal reliability is the degree to which the various items in a composite variable lead to a consistent response and, therefore, are tapping into the same concept.*
Dr. Washington conducts a panel study and finds that although respondents answer a particular question about poverty consistently over time, other data on the respondents suggest they may be lying about their answers. He thinks this may be an issue because respondents are embarrassed to report their true poverty status. What concept captures Dr. Washington's problem?
Low validity *Validity refers to how accurate a measure is. When a measure is accurate, it will give us the "true" answer.*
If you use an ordinary least squares (OLS) regression and find that gender, where female=1 (the independent variable) has a coefficient of -0.15 in an equation with income as the dependent variable, which of the following statements is true?
On average, women make 85 cents for every dollar men make *The coefficient of -0.15 on the binary independent variable (female = 1) indicates that being female is associated with a reduction of 15 cents for every dollar of income associated with being male. Thus, on average, women make 85 cents for every dollar men make.*
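A minimal sketch of how a binary predictor's OLS coefficient works, using invented income data scaled so that men average $1.00 and the female coefficient comes out to -0.15 (the numbers are illustrative only):

```python
import numpy as np

# Invented data: income in dollars, scaled so that men's average is $1.00.
female = np.array([0, 0, 0, 1, 1, 1], dtype=float)
income = np.array([0.95, 1.00, 1.05, 0.80, 0.85, 0.90])

# OLS with an intercept: income = b0 + b1 * female
X = np.column_stack([np.ones(len(female)), female])
b0, b1 = np.linalg.lstsq(X, income, rcond=None)[0]

# With a binary predictor, b1 equals the difference in group means:
# women's average income is b0 + b1 = 1.00 - 0.15 = 0.85 for every dollar men make.
print(f"intercept (men's mean): {b0:.2f}, female coefficient: {b1:.2f}")
```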
____ testing is a key feature of many large-scale data collections.
Pilot *Pilot testing is a key feature of many large-scale data collections. For example, when large-scale projects are proposed to the National Institutes of Health for funding, the proposal includes a section on preliminary studies in which researchers describe the pilot testing they have done to ensure reliability.*
Traditionally, researchers collect data on educational attainment by highest level achieved. For example, categories might include: high school degree or less, some college, bachelor's degree, greater than a bachelor's degree. If a research project instead collected data on educational attainment and asked where individuals went to college to create more refined rankings (e.g. bachelor's degree from flagship state university, bachelor's degree from elite private university), what concept does this highlight?
Precision *Precision is a key element of measurement that supports reliability. The more detailed and precise the measures are, the more reliable they tend to be. In this case, having more precise data on educational attainment, such as where someone went to school, would likely be useful.*
Which statement about reliability and validity is true?
Reliability and validity are independently important standards for evaluating the link between conceptualization and operationalization *Reliability and validity are not mutually exclusive, nor parts of a zero-sum game in which gains in one always come with losses in the other. It is important to remember that reliability and validity are independently important standards for evaluating the link between conceptualization and operationalization.*
Which concept(s) does the following example of a bull's-eye target capture? https://services.wwnorton.com/aws/image?u=0&file=/wwnorton.college.public/coursepacks/soc/tassr/imgs/ARTSCISOC_FIG05un1.jpg
Reliable but not valid *The classic way to illustrate the differences between reliability and validity is to use a bull's-eye target. This target sheet shows someone who is reliable but not valid. All of his shots cluster tightly together (dependable) but are not near the bull's-eye (inaccurate).*
The discussion of the changing poverty rate from 1973 to 2010 at the beginning of Chapter 5 highlights how the poverty rates calculated from the federal poverty line are ____, but not ____.
Reliable; Valid *More accurate estimates show that the federal rates are inaccurate, or have low validity, but are dependable, or have high reliability.*
Which aspect of external validity is easiest to address?
Representativeness *Scientists can take specific, concrete steps to ensure that the participants are representative of the population being studied. Questions about the reality of studies are much more challenging. Within the realm of ethical treatment, there is likely no experiment or study that is as real as real life, and, therefore, we always give up some external validity when we conduct research, experimental or not.*
If a researcher reaches a conclusion that he or she later realizes may be based on faulty statistical logic, which type of validity should he or she be concerned with?
Statistical validity *A major consideration in quantitative research, statistical validity refers to the degree to which a study's statistical operations are in line with basic statistical laws and guidelines. Assessments of statistical validity encompass many different dimensions, including questions about whether the right operational definitions were used, whether the methods designed and implemented truly follow from operational definitions, and whether the methods have been properly applied.*
Which statement about reliability and validity is true?
Validity is much more difficult to assess than reliability *Knowing how close our method is getting to the truth is very ambitious, because truth and objective reality can be quite slippery at times--a core tenet of sociology is the social construction of shared reality. For this reason, validity is much harder to assess than reliability. There is no statistical tool for measuring validity according to some quantitative standard--no Cronbach's alpha. In addition, there are no set standards or conventions for assessing validity.*