Research Methods: Ch's 4-6, & 15
what are the 2 primary types of subject bias? What are the differences between these two types?
1. social desirability bias: trying to make a good impression 2. obeying the study's demand characteristics: trying to help the researcher by producing results that support the hypothesis
a valid measure must have:
1. some degree of reliability and 2. be relatively free of both observer and subject biases
when you critique the introduction, question whether:
1. testing the hypothesis is vital 2. the hypothesis follows logically from theory and past research 3. the authors have found the best way to test the hypothesis
1) Bias refers to error that pushes scores in a particular direction. It is thus systematic. It can be due to the observer, testing conditions, or the participants themselves. Bias can have a large effect on the group's overall score. 2) Random error is not systematic and so does not push scores in one direction more than the other. For every random error that increases a score, there will probably be one that decreases another score. Unlike bias, random error usually will not have a major effect on the group's overall score.
2 basic types of measurement error
when the treatment group changes its behavior not because of the treatment itself, but because group members know they are getting special treatment. (p. 133)
Hawthorne Effect:
manipulating the variable by giving written or oral instructions. (p.134)
Instructional manipulation:
articles, books, and perhaps some websites where an author is presenting newly discovered information for the first time.
Primary sources
a useful resource that contains abstracts from a wide variety of journals. The Abstracts can be searched by year of publication, topic of article, or author. The electronic version of Psychological Abstracts is called PsycINFO. (p. 72)
Psychological Abstracts
articles, books, and websites that collect information from both primary and secondary sources. Textbooks are a classic example.
Secondary sources
a short, one-paragraph summary of a research proposal or article. The abstract should not be more than 120 words long. It comes before the introduction.
abstract
a short, one-paragraph summary of a research proposal or article. The abstract must not exceed 120 words. It comes before the introduction. (p. 480)
abstract:
can be measured directly by simple frequency, rate, duration, intensity, latency, and with respect to accuracy.
behaviors
systematic errors that can push the scores in a given direction. Bias may lead to "finding" the results that the researcher wanted. (p. 100)
bias
why is bias more serious than random error?
bias is a more serious threat to validity: random error dilutes validity, whereas bias poisons it.
an observer who is unaware of the participant's characteristics and situation. Using blind observers reduces observer bias. (p. 105)
blind (masked), blind observer:
2 ways to reduce experimenter bias
blind procedures and standardization
The effects of the treatment or combination of treatments is underestimated because the dependent measure places too high a floor on what the lowest response can be. (p. 147)
ceiling effect:
the methodologist who believes that close to 50% of published studies contain a Type 2 error.
Cohen
a study that is based on the original study, but uses different methods to better assess the true relationships between the treatment and dependent variables. In a conceptual replication, you might use a different manipulation or a different measure. The conceptual replication is the most sophisticated kind of replication and usually is used to improve construct validity. (p. 91)
conceptual replication
a study that is based on the original study, but uses different methods to better assess the true relationships between the variables examined in the original study. In a conceptual replication, you might use a different manipulation or a different measure. The conceptual replication is the most sophisticated kind of replication. (p. 472)
conceptual replication:
the degree to which an operational definition reflects the concept that it claims to reflect. Establishing content, convergent, and discriminant validity are all methods of arguing that your measure has construct validity. (p. 124)
construct validity:
can be measured through participants' own statements, actions, nonverbal behavior, and physiology.
constructs
the extent to which a measure represents a balanced and adequate sampling of relevant dimensions, knowledge, and skills. (p. 124)
content validity:
validity demonstrated by showing that the measure correlates with other measures, manipulations, or correlates of the construct. (p. 126)
convergent validity:
aspects of the study that allow the participant to figure out how the researcher wants that participant to behave. (p. 107)
demand characteristics:
a later copy of the original study. Direct replications are useful for establishing that the findings of the original study are reliable. (p. 86)
direct (exact) replication
a copy of the original study. Direct replications are useful for establishing that the findings of the original study are reliable. (p. 471)
direct replication, exact replication:
the extent to which the measure does not correlate strongly with measures of constructs other than the one you claim to be measuring. (p. 128)
discriminant validity:
the part of the article immediately following the results section that interprets the results. For example, the discussion section may explain the importance of the findings and suggest research projects that could be done to follow up on the study. (p. 85)
discussion
the part of the article, immediately following the results section, that discusses the research findings and the study in a broader context and suggests research projects that could be done to follow up on the study. (p. 478)
discussion:
a manipulation that involves changing the participant's environment rather than giving the participant different instructions. (p. 134)
environmental manipulation:
experimenters being more attentive to participants in the treatment group or giving different nonverbal cues to treatment group participants than to other participants. (p.132)
experimenter bias:
a study investigating an entirely new area of research. Unlike replications, an exploratory study does not follow directly from an existing study. (p. 469)
exploratory study:
the extent to which a measure looks valid to the ordinary person. Face validity has nothing to do with scientific validity. However, for practical or political reasons, you may decide to consider face validity when comparing measures. (p.159)
face validity:
the effect of a treatment or combination of treatments is underestimated because the dependent measure is not sensitive to values below a certain level. (p. 146)
floor effect:
an example is depression, which can be operationally defined a number of ways, such as with the DSM criteria, the BDI-II, physiology, anatomy, and others.
good construct
the degree to which all the items on a measure correlate with each other. If you have high internal consistency, all the questions seem to be measuring the same thing. If, on the other hand, answers to some questions are inconsistent with answers to other questions, this inconsistency may be due to some answers being (1) strongly influenced by random error or (2) influenced by different constructs. Internal consistency can be estimated through average correlations, split-half reliability coefficients, and Cronbach's alpha. (p. 120)
internal consistency:
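As a sketch of how Cronbach's alpha (one of the internal-consistency estimates named above) could be computed, assuming a participants-by-items score matrix (the function name and sample data are illustrative, not from the text):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_participants x k_items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    scores = np.asarray(item_scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item across participants
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of each participant's summed score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# three participants answering two items that track each other perfectly
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # -> 1.0
```

When every item measures the same thing, summed-score variance dwarfs the item variances and alpha approaches 1; items dominated by random error or by different constructs pull alpha down.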
the percentage of times the raters agree. (p. 116)
interobserver agreement:
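A minimal sketch of the computation described above, i.e. the percentage of observations on which two raters agree (the function name and codes are illustrative):

```python
def interobserver_agreement(rater_a, rater_b):
    """Percentage of observations on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("raters must code the same set of observations")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * agreements / len(rater_a)

# two raters coding four behaviors; they disagree on the third
print(interobserver_agreement(["hit", "miss", "hit", "hit"],
                              ["hit", "miss", "miss", "hit"]))  # -> 75.0
```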
like interobserver agreement, interobserver reliability is an index of the degree to which different raters rate the same behavior similarly. Low interobserver reliability probably means that random observer error is making the measure unreliable. (p. 116)
interobserver reliability:
data for which equal numerical intervals represent equal psychological intervals. That is, the difference between scoring a "2" and a "1" and the difference between scoring a "7" and a "6" is the same not only in terms of scores (both are a difference of 1), but also in terms of the actual amount of the psychological characteristic being measured. Interval scale measures allow us to compare participants in terms of how much of a quality they have. (p. 151)
interval scale data:
the part of the article that occurs right after the abstract. In the introduction, the authors tell you what their hypothesis is, why their hypothesis makes sense, how their study fits in with previous research, and why their study is worth doing. (p. 72)
introduction
after the Abstract comes the introduction to the study. In the introduction, the authors tell you what their hypothesis is, why their hypothesis makes sense, how their study fits in with previous research, and why their study is worth doing. (p.466)
introduction:
a convergent validity tactic that involves seeing whether groups known to differ on a characteristic differ on a measure of that characteristic (e.g., ministers should differ from atheists on a measure of religious beliefs). (p. 127)
known-groups technique:
a question or set of questions designed to determine whether participants perceived the manipulation in the way that the researcher intended.(p. 133)
manipulation check:
the part of the article immediately following the introduction. Whereas the introduction explains why the study was done, the method section describes what was done. For example, it will tell you what design was used, what the researchers said to the participants, what measures and equipment were used, how many participants were studied, and how they were selected. The method section could also be viewed as a "how we did it" section. The method section is often subdivided into three subsections: participants, apparatus, and procedure. (p. 76)
method section
the part of the article immediately following the introduction. Whereas the introduction explains why the study was done, the method section describes what was done. For example, it will tell you what design was used, what the researchers said to the participants, what measures and equipment were used, how many participants were studied, and how they were selected. The method section could also be viewed as a "what we did" section. The method section is usually subdivided into at least two subsections: participants and procedure. (p. 475)
method section:
numbers that substitute for names. Different numbers represent different types, kinds, categories, or qualities, but larger numbers do not represent more of a quality than smaller numbers. (p. 149)
nominal scale numbers:
bias created by the observer seeing what the observer expects to see, or selectively remembering/counting/looking for data that support the observer's point of view (also known as scorer bias). (p. 103)
observer bias:
a publicly observable way to measure or manipulate a variable; a "recipe" for how you are going to measure or manipulate your factors. (p. 98)
operational definition
numbers that can be meaningfully ordered from lowest to highest. With ordinal numbers, we know that the higher scoring participant has more of a quality than the lower scoring participant, but we don't know how much more of the quality the higher scoring participant has. (p. 150)
ordinal scale numbers:
a treatment that is known to have no effect. To reduce the impact of subject (participant) bias, the group getting the real treatment is compared to a group getting a placebo treatment—rather than to a group that knows it is getting no treatment. (p. 133)
placebo treatment:
2 ways to reduce subject bias
placebo treatments and unobtrusive measurement
using someone else's words, thoughts, or work without giving proper credit. Plagiarism is considered a serious act of academic dishonesty. Indeed, at some institutions, students convicted of plagiarism are expelled. Furthermore, concerns about plagiarism are no longer limited to colleges and universities. More and more, the world economy is based on information. Thus, more and more, businesses and individuals are concerned about the theft of ideas (intellectual property). Therefore, if you quote someone's work, use quotation marks; and if you paraphrase or in any sense borrow an idea from a source, cite that source. (pp. 475-476)
plagiarism:
concepts like the Oedipus and Electra complexes.
poor constructs
you can do a systematic replication to improve:
power, external validity, or construct validity
the chances of obtaining a certain pattern of results if there really is no relationship between the variables. (p. 485)
probability value (p value):
the research ? is more formal than the research ? and should conform to the style described in the APA Publication Manual
proposal; journal
when you look at the results section:
question any null (nonsignificant) results; the failure to find a significant result may be due to the study not having enough power
when you critique the methods section:
question the construct validity of the measures and manipulations, and ask how easy it would have been for the participants to have played along with the hypothesis
inconsistent, unsystematic errors of measurement. Carelessness on the part of the person administering the measure, the person taking the test, and the person scoring the test can cause random error. (p. 100)
random error of measurement:
3 major sources of unreliability are:
random errors in scoring the behaviors, random variations in how the measure is administered, and random fluctuations in the participants' performance
numbers having all the qualities of interval scale numbers, but that also result from a measure that has an absolute zero (zero on the measure means the complete absence of the quality). As the name implies, ratio scale numbers allow you to make ratio statements about the quality that you are measuring (Steve is two times as friendly as Tom). (p. 152)
ratio scale numbers:
refers to whether you are getting consistent, stable measurements. Reliable measures are relatively free of random error.
reliability
the extent to which a measure produces stable, consistent scores. Measures are able to produce such stable scores if they are not strongly influenced by random error. A measure can be reliable, but not valid. However, if a measure is not reliable, it cannot be valid. (p. 111)
reliable, reliability:
a diary of your research ideas and your research experiences. The research journal can be a useful resource when it comes time to write the research proposal. (p.464)
research journal:
These present the author's original scientific research in a prescribed format. The lion's share of journal articles in all of science are research reports. Your class project is to produce such a research report of your own investigation. The parts: 1) Title page 2) Abstract 3) Introduction 4) Method 5) Results 6) Discussion 7) References 8) Addenda (figures, tables, etc.)
research report
Knowledge about the natural world obtained from scientific research is communicated and stored in a vast array of books, scientific journals, and a few other outlets. A journal is a periodical, a document that is published on a regular basis (the word "magazine" is almost synonymous with journal). The two flagship English-language journals in science are Nature, published in Great Britain, and Science, published in the United States.
research reports
the part of the article immediately following the method section that reports selected statistical results and relates those results to the hypotheses. From reading this section, you should know whether the results supported the hypothesis. (p. 82)
results section
the part of the article, immediately following the method section, that reports statistical results and relates those results to the hypotheses. From reading this section, you should know if the results supported the hypothesis. (p. 477-478)
results section:
These articles attempt to summarize all the relevant research and theoretical articles on a single defined topic. They often have dozens or hundreds of references. An example would be a review article on the effectiveness of one particular form of psychotherapy.
review articles
a measure's ability to detect differences among participants on a given variable. (p. 142)
sensitive, sensitivity:
reliability is a prerequisite for ?; however, it does not guarantee ?
sensitivity
participants acting in a way that makes the participant look good. (p. 110)
social desirability bias:
treating each participant in the same (standard) way. Standardization should reduce experimenter bias. (p. 106)
standardization:
people who seem (to the real participants) to be participants, but who are actually the researcher's assistants. (p. 136)
stooges (confederates):
ways the participant can bias the results. The two main subject biases are (1)trying to help the researcher out by giving answers that will support the hypothesis, and (2) giving the socially desirable response. (p. 107)
subject (participant) biases:
a study that varies from the original study only in some minor aspect. For example, a systematic replication may use more participants, more standardized procedures, more levels of the independent variable, or a more realistic setting than the original study. (p. 88)
systematic replication
a study that varies from the original study only in some minor aspect. For example, a systematic replication may use more participants, more standardized procedures, more levels of the independent variable, or a more realistic setting than the original study. (p. 472)
systematic replication:
interobserver reliability puts a ceiling on
test-retest reliability
one way to measure the extent to which a measure is free of random error is to compute its:
test-retest reliability
a way of assessing the total amount of random error in a measure by administering the measure to participants at two different times and then correlating their results. Low test-retest reliability could be due to inconsistent observers, inconsistent standardization, or poor items. Low test-retest reliability leads to low validity. (p. 113)
test-retest reliability:
These articles attempt to outline and describe in detail a theory or theoretical perspective. They will always make reference to other articles of all three types discussed here. They usually do not contain original research results, although an author may well include her own previously published findings.
theoretical articles
by not letting participants know what you are measuring, you may be able to reduce subject biases
unobtrusive measurement
recording a particular behavior without the participant knowing you are measuring that behavior. Unobtrusive measurement reduces subject biases such as social desirability bias and obeying demand characteristics. (p. 109)
unobtrusive measurement:
refers to whether you are measuring what you claim to be measuring
validity