Research Stats: Exam 3

Repeated Measures ANOVA - concept

Stage 1:
- Total: variability between each score and the grand mean.
- Between Groups/Rounds (signal): variability due to different rounds (i.e., variability due to mean differences across rounds). The question is whether the means are significantly different from one another or differ only by chance; the larger the discrepancies between group means, the bigger the between-groups variability is expected to be.
- Within Groups/Rounds: variability not due to different rounds (i.e., not due to differences in means across rounds).
Stage 2:
- Between Persons (calculated and removed as noise): variability due to different persons (stable individual differences). Participants serve as their own controls across rounds, so this component can be computed and removed as part of the noise, making the standard error smaller.
- Error (final noise): variability not due to rounds and not due to persons. Because the between-persons noise is subtracted out, the noise component is smaller, so significance is more likely to be found than with a between-groups ANOVA.
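
A minimal numeric sketch of this two-stage partition (the scores and variable names below are invented for illustration, not from the course):

```python
import numpy as np

# Hypothetical scores: 5 participants (rows) measured across 3 rounds (columns)
scores = np.array([
    [7, 5, 4],
    [8, 6, 5],
    [6, 5, 3],
    [9, 7, 6],
    [7, 6, 4],
], dtype=float)

n, k = scores.shape                 # participants, rounds
grand_mean = scores.mean()

# Stage 1: split total variability into signal and noise
ss_total = ((scores - grand_mean) ** 2).sum()
ss_rounds = n * ((scores.mean(axis=0) - grand_mean) ** 2).sum()   # between rounds (signal)
ss_within = ss_total - ss_rounds                                  # within rounds (noise)

# Stage 2: remove stable individual differences from the noise
ss_persons = k * ((scores.mean(axis=1) - grand_mean) ** 2).sum()  # between persons
ss_error = ss_within - ss_persons                                 # final noise

f_stat = (ss_rounds / (k - 1)) / (ss_error / ((k - 1) * (n - 1)))
print(round(f_stat, 2))
```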

Repeated Measures ANOVA: Post hoc tests

- Each time a significance test is performed, there is a 5% chance of a Type I error.
- LSD (least significant difference) t test: highly similar to the dependent t test, except that the within-groups (error) component from the omnibus repeated-measures ANOVA is reused as the error term for each LSD dependent t test. Does not control for Type I error (no adjustment for multiple comparisons); least conservative.
- Bonferroni: an adjustment to control for Type I error; each pairwise comparison is evaluated at critical alpha = .05 / number of comparisons. Highly statistically conservative.
- Highest risk of Type I error (does not control for it; least conservative), with no reduction in statistical power to detect significance: LSD.
- Lowest risk of Type I error (controls for it; more conservative), with reduced statistical power to detect significance: Bonferroni.
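
A sketch of Bonferroni-corrected pairwise comparisons on hypothetical data. Note that this runs each pair's own paired t test via scipy; a true LSD test would instead reuse the omnibus ANOVA's error term, which scipy does not do automatically:

```python
from itertools import combinations
import numpy as np
from scipy import stats

# Hypothetical scores: 5 participants (rows) across 3 rounds (columns)
scores = np.array([[7, 5, 4], [8, 6, 5], [6, 5, 3], [9, 7, 6], [7, 6, 4]])

pairs = list(combinations(range(scores.shape[1]), 2))
alpha = 0.05 / len(pairs)   # Bonferroni: critical alpha = .05 / # of comparisons

for i, j in pairs:
    t, p = stats.ttest_rel(scores[:, i], scores[:, j])
    print(f"round {i} vs round {j}: t = {t:.2f}, p = {p:.4f}, "
          f"significant under Bonferroni: {p < alpha}")
```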

Dependent t Test APA style

The same participants who received personal tutoring (M = 7.00) and group tutoring (M = 6.00) did not significantly differ on vocabulary scores, dependent t(4) = 1.05, ns.

Dependent t Test APA style SPSS

The skewness values for morning mood (-0.10) and afternoon mood (0.00) indicate that the variables were within reasonable boundaries of normality. The study involved a within-subjects design. The same participants reported more positive mood scores in the morning (M = 6.90, SD = 1.37) than in the afternoon (M = 5.00, SD = 1.56), t(9) = 2.97, p < .05.
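
A sketch of how such values could be computed in Python rather than SPSS (the ratings below are invented to reproduce the reported means, not the actual study data):

```python
import numpy as np
from scipy import stats

# Hypothetical within-subjects mood ratings for the same 10 participants
morning   = np.array([7, 8, 6, 9, 5, 7, 8, 6, 7, 6], dtype=float)
afternoon = np.array([5, 6, 4, 7, 3, 5, 6, 4, 5, 5], dtype=float)

print(stats.skew(morning), stats.skew(afternoon))  # normality check
t, p = stats.ttest_rel(morning, afternoon)         # paired test, df = n - 1 = 9
print(f"t(9) = {t:.2f}, p = {p:.3f}")
```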

CBL - external validity of lab experiments

- A commonplace criticism of the social laboratory experiment concerns its artificiality, or reactivity, and the consequent impossibility of determining the adequacy of generalizations based upon experimental data - Even so, the question of the generalizability of our research findings is never definitively settled. Although the certainty of generalizations can never be attained, it can be approximated, or approached, and the approximations can become progressively more exact if we are sensitive to the forces that influence behavior within and outside the experimental laboratory.

CBL - growth curve model

- A growth curve model is conducted to determine the longitudinal trajectory, or shape, of observations for participants measured at multiple time points - In growth curve modeling, time points of measurement are nested within each participant. - Because multi-level models involve a nested design, the approach statistically controls for the clustering of scores across time points within participants - This is a desirable feature, as the technique can be used to estimate the growth pattern across time in the sample as a whole. Accounting for person-to-person variations (e.g., demographic characteristics) via the intraclass correlation helps to statistically "remove" these individual differences so that detection of the underlying growth trajectory is not obscured
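
A minimal growth-curve sketch using a mixed-effects model in statsmodels, with simulated data in which stable person-level offsets sit on top of a shared upward trajectory (all names and numbers are hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated long-format data: 20 participants, 3 time points each; every
# person has a stable offset (individual difference) around a common slope
rows = []
for pid in range(20):
    offset = rng.normal(0, 1.0)                 # stable between-person noise
    for t in range(3):
        rows.append({"participant": pid, "time": t,
                     "score": 5 + 0.8 * t + offset + rng.normal(0, 0.5)})
df = pd.DataFrame(rows)

# Random intercepts nest time points within participants; the fixed effect
# of time estimates the average growth trajectory for the sample as a whole
model = smf.mixedlm("score ~ time", data=df, groups="participant")
print(model.fit().summary())
```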

Bem - for whom should you write

- Scientific journals are published for specialized audiences who share a common background of substantive knowledge and methodological expertise. If you wish to write well, you should ignore this fact - Accordingly, good writing is good teaching. Direct your writing to the student in Psychology 101, your colleague in the Art History Department, and your grandmother. No matter how technical or abstruse your article is in its particulars, intelligent nonpsychologists with no expertise in statistics or experimental design should be able to comprehend the broad outlines of what you did and why - they should understand what was learned, and why someone—anyone—should give a damn. The introduction and discussion sections in particular should be accessible to this wider audience. - The actual technical materials—those found primarily in the method and results sections—should be aimed at a reader one level of expertise less specialized than the audience for which the journal is primarily published ***good writing is good teaching

CBL - Conducting Experiments Outside the Laboratory

- A laboratory is a designated location where participants must go to take part in the research. - A field study is research conducted outside the laboratory, in which the researcher administers the study in participants' own naturalistic environment or context. - With the advent of the Internet as a venue for research, we move the experiment even farther away from the sterile confines of the laboratory to an electronic location - In general, however, the dimension that separates laboratory and field experiments is research participants' awareness that they are involved in a scientific study. - In field settings, even when informed in advance that a research study is underway, participants are less likely to be as conscious of their behaviors and the research goals as they are in a laboratory environment - Studies conducted on the Internet can be more like laboratory experiments if respondents log on to a site knowing that they are participating in a scientific investigation, or more like a field study if the research is being conducted using ongoing, naturally occurring forums such as chat rooms or other social media sites. - To test a particular cause and effect relationship, the independent variable of interest must be dis-embedded from its natural context or causal network and translated into appropriate operations to be studied in the laboratory. - The laboratory is an ideal setting for the experimenter to take special precautions to control, and therefore prevent, the intrusion of extraneous variables as confounds. - It is important to re-embed the causal variable into its natural context, to be certain that its effect is not suppressed or reversed under circumstances in which the phenomenon normally occurs. - The advantages of field experiments are best realized when the operations of the independent and dependent variables are translated to be appropriate to the new context. Such modifications often involve a fundamental rethinking of the theoretical variables and a concern with conceptual, rather than exact, replication of the original study.

Warner - advantages of paired samples

- Advantages: smaller numbers of participants are needed, and the F test may have greater statistical power - Error terms are usually smaller because variation in scores due to systematic individual differences among persons is removed from the error term

CBL - comparison time-series design

- Assuming that comparable statistical records are available across times and places, the comparison time-series design combines features of an interrupted time-series design and a comparison group design - If a social program is introduced in one location or institution but not in some other, preexisting differences between the treatment and comparison sites make it difficult to interpret any posttreatment differences. However, if time-series data based on the same record-keeping system are available for both sites, and if both are subject to similar sources of cyclical and noncyclical fluctuations, then the time-series data from the comparison control group can serve as an additional baseline for evaluating differences in change in the experimental series. - When the time series of two groups are roughly parallel in the trend of scores prior to the introduction of the quasi-treatment (i.e., the intervention), but diverge significantly afterwards (as illustrated in Figure 10.6), many potential alternative explanations for the change in the latter series can be ruled out. - A second method of forming comparisons for the interrupted time series is to include variables in the analysis that are parallel to the critical variables but should not be affected by the interruption.

CBL - Regression artifacts and assessment of change

- Because new social programs are introduced into ongoing social systems for the purpose of altering or improving some aspect of that system, the ultimate question for evaluation research is whether or not the system or the persons in it have changed over time as a result of the program. - We will elaborate how regression to the mean can operate to undermine the validity of causal interpretations in quasi-experimental research contexts. - Regression toward the mean is an inevitable consequence of examining the association between scores on imperfectly related measures. Whenever the correlation between two measures (like parental height and offspring height) is less than 1.00, there will be some nonsystematic or random deviation between scores on the first variable and corresponding scores on the second - If the first set of scores were selected for its extremity (i.e., to represent the highest or lowest values in the distribution), there is bound to be a bias in the direction in which subsequent scores using the same measuring instrument will vary. - Deviations from the heights of extremely tall parents, for example, most often will be in the direction of decreased height, because there isn't much room for variation (ceiling effects) in the other direction. Similarly, deviations from the heights of extremely short parents most often will be in the direction of increased height because of a similar selection bias (floor effects). - An artifact is an artificial or spurious finding, a "pseudo-effect" that results inevitably from the properties of the measuring instrument or from the method of data collection employed - regression to the mean arises from the design of the experiment and acts as a confound - The extent of deviation in scores between tests reflects the degree of test-retest unreliability of the instrument. The more unreliable the test-retest association, the greater the error, and as such, the greater will be the deviations in scores from the first administration to the next. The degree of similarity in patterns of scores across different administrations of the same test is termed test-retest reliability. Because reliability involves the relationship between two sets of test scores, it is most commonly measured in terms of the Pearson correlation - Regression toward the mean can create an apparent or pseudo improvement effect if scores are selected from the lower extremes of pretest values
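
A small simulation of the artifact: when two administrations of a test correlate below 1.00 and cases are selected for extremity, the selected group drifts back toward the mean on retest even though nothing changed (all numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two administrations of the same test: shared true score plus independent
# error on each occasion, so the test-retest correlation is below 1.00
true_score = rng.normal(100, 10, size=10_000)
test1 = true_score + rng.normal(0, 8, size=10_000)
test2 = true_score + rng.normal(0, 8, size=10_000)
print(np.corrcoef(test1, test2)[0, 1])      # test-retest reliability < 1.00

# Select the lowest-scoring 10% on the pretest, as a remedial program might
low = test1 < np.percentile(test1, 10)
print(test1[low].mean())   # well below 100 (selected for extremity)
print(test2[low].mean())   # drifts back toward 100: a pseudo "improvement"
```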

Bem - common errors of grammar

- Compared with versus Compared to. Similar orders of things are compared with one another; different orders of things are compared to one another - Data. The word data is plural - Different from versus Different than. The first is correct; the second, incorrect (although, alas for us purists, very common and gaining respectability) - Since versus Because. Since means "after that." It should not be used as a substitute for because if there is any ambiguity of interpretation - That versus Which. That clauses (called restrictive) are essential to the meaning of the sentence; which clauses (called nonrestrictive) merely add additional information - While versus Although, But, Whereas. While means "at the same time" and in most cases cannot substitute for these other words.

Rosenthal - data analysis as an ethical arena

- Data dropping: analysis of data that never existed (i.e., that were fabricated). Perhaps more frequent is the dropping of data that contradict the data analyst's theory, prediction, or commitment.
- Outlier rejection: the technical issues have to do with the best ways of dealing with outliers without reference to the implications for the tenability of the data analyst's theory. The ethical issues have to do with the relationship between the data analyst's theory and the choice of method for dealing with outliers. At the very least, when outliers are rejected, that fact should be reported.
- Subject selection: a different type of data dropping, in which a subset of the data is not included in the analysis. There may be good technical reasons for setting aside a subset of the data—for example, because the subset's sample size is especially small or because dropping the subset would make the data more comparable to some other research. However, there are also ethical issues, as when just those subsets are dropped that do not support the data analyst's theory. When a subset is dropped, we should be informed of that fact and what the results were for that subset.
- Exploitation: exploitation of data is beautiful. The anti-snooping dogma makes for bad science because, although snooping does affect p values, snooping is likely to turn up something new, interesting, and important (Tukey, 1977). It makes for bad ethics because data are expensive in terms of time, effort, money, and other resources, and because the anti-snooping dogma is wasteful of those resources. If the research was worth doing, the data are worth a thorough analysis, being held up to the light in many different ways, so that our research participants, our funding agencies, our science, and society will all get their time and their money's worth.
- Meta-analysis: meta-analytic procedures use more of the information in the data, thereby yielding (a) more accurate estimates of the overall magnitude of the effect or relationship being investigated, (b) more accurate estimates of the overall level of significance of the entire research domain, and (c) more useful information about the variables moderating the magnitude of the effect or relationship being investigated.
- Retroactive increase of utilities: meta-analysis allows us to learn more from our data and therefore has a unique ability to increase retroactively the benefits of the studies being summarized - it increases the utility of all the individual studies being summarized. Other costs of individual studies—costs of funding, supplies, space, investigator time and effort, and other resources—are similarly more justified because the utility of individual studies is so increased by the borrowed strength obtained when information from more studies is combined in a sophisticated way. The failure to employ meta-analytic procedures when they could be used thus has ethical implications, because the opportunity to increase the benefits of past individual studies has been forgone. In particular, meta-analytic reviews try to explain the inevitable variation in the size of the effect obtained in different studies.
- Pseudo-controversies: the first problem is the belief that when one study obtains a significant effect and a replication does not, we have a failure to replicate. A failure to replicate is properly measured by the magnitude of the difference between the effect sizes of the two studies. The second problem is the belief that if there is a real effect in a situation, each study of that situation will show a significant effect.
- Significance testing: good meta-analytic practice shows little interest in whether the results of an individual study were significant or not at any particular critical level. Rather than recording for a study whether it reached such a level, say, p = .05, two-tailed, meta-analysts record the actual level of significance obtained. This is usually done not by recording the p value but by recording the standard normal deviate that corresponds to the p value. Signed normal deviates are an informative characteristic of the result of a study, presented in continuous rather than dichotomous form. Their use (a) increases the information value of a study, which (b) increases the utility of the study and, therefore, (c) changes the cost-utility ratio and, hence, the ethical value of the study.
- Small effects are not small: another way in which meta-analysis increases research utility, and therefore the ethical justification of research studies, is by providing accurate estimates of effect sizes - we can more accurately weigh the costs and utilities of undertaking any particular study.
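
A one-line illustration of recording the standard normal deviate that corresponds to an obtained p value, as described above (the p value is hypothetical):

```python
from scipy import stats

# Record the standard normal deviate (z) corresponding to an obtained
# one-tailed p value, rather than a dichotomous "significant or not"
p = 0.03                      # hypothetical obtained p value
z = stats.norm.isf(p)         # z such that P(Z > z) = p
print(f"p = {p} -> z = {z:.2f}")
```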

CBL - demand characteristics

- Demand characteristics are the totality of all social cues communicated in a laboratory that are not attributable to the manipulation, including those emanating from the experimenter and the laboratory setting, which alter and therefore place a demand on the responses of participants. When participants feel constrained by demands that may explicitly or implicitly suggest the preferred response, the study is said to suffer from demand characteristics. Because of the clues transmitted by the experimenter or the laboratory setting, participants believe they have gained insight into the hypothesis and thereby seek to solve the research problem by performing in a way designed to please the investigator.

CBL - protecting confidentiality of data

- One justification that researchers use for keeping participants uninformed about their inclusion in a field study is that the data collected from such sources are essentially anonymous, with no possibility of personally identifying the persons who provided the responses. - Of course, if video or other recording techniques are used that preserve identifying information about the participants, the data are not anonymous, and participants should be given the right to consent to whether their data will be used. - However, when data are recorded without any identifying information of any kind, any invasion of privacy is temporary, and the confidentiality of the data is ensured in the long run. - Even when research is not disguised, avoiding the recording of individually identifying information to maintain confidentiality of data is a good idea - Assuring participants of the confidentiality of their responses is not simply for their benefit but can also increase the likelihood that they will be open and honest in their responses - Those in the confidentiality condition provided data that were less influenced by social desirability biases than those in the control condition. This effect was obtained even though participants recorded their names on the tests that they took. - Conflicts can occur when the research involves sensitive information (e.g., testing for HIV) or potentially illegal or dangerous behavior (e.g., child abuse), where reporting to partners or authorities may be seen as an ethical or legal responsibility - Research data in situations such as these are subject to subpoena, and researchers have sometimes been put into a painful conflict between their ethical responsibilities to research participants and their legal obligations. For research on some sensitive topics in the United States, it is possible to obtain a "certificate of confidentiality" from the Public Health Service (Sieber, 1992) that protects participant information from subpoena, but most research involving human participants is not protected in this way.

CBL - Institutional Review Boards

- Every institution in which federally funded research is carried out is required to set up an institutional review board (IRB), a committee that evaluates, approves, and monitors all research projects in the institution with respect to ethical requirements and practices - The IRB is appointed by the university administration, with certain requirements for representation by members of the community and legal experts, as well as scientists from departments across the institution. - Before any program of research is begun, it is the principal investigator's responsibility to submit a complete description of the proposed research purposes and procedures to the IRB for review. Based on the information provided by the investigator, the members of the IRB evaluate the potential costs and risks to participants in the research as well as the potential benefits of the research if it is conducted as planned. - Approvals are granted for a maximum of 12 months; if the research has not been completed within that time, the project must be resubmitted for continuing approval. If the IRB does not feel that the researcher has provided sufficient information to assess the potential risks of conducting the study, or if the proposed procedures do not appear to be fully justified, the proposal will be sent back to the investigator with contingencies or changes that must be made before the research can be approved - Low-risk research that does not involve deception or issues of confidentiality usually can be handled by expedited review. But in other circumstances, most notably those involving potential danger, deception, or blatant manipulation of one form or another, the internal review committee serves a valuable function by requiring the researcher to defend the legitimacy of the research, the necessity for the questionable practices, and the cost-benefit analysis involved in conducting the investigation. - An important feature of the institutional review group is that it typically does not consist solely of the researcher's colleagues - many of whom, perhaps, have planned or conducted research similar to that under consideration - but rather of a group of impartial individuals, scientists and laypersons alike, whose primary goal is the protection of participants' rights.

CBL - regulatory context of research involving human participants

- Except for obviously dangerous or damaging actions on the part of the researcher, ethical decision-making involves a cost-benefit analysis rather than the promulgation of absolute strictures and rules - Much of the responsibility for making this assessment falls on the individual scientist, but an individual researcher alone is not always the best judge of what is valuable and necessary research and what is potentially harmful to participants - In fact, there is good evidence that biases enter into scientists' assessments of the utility of their own research (Kimmel, 1991). For that reason, the conduct of research that meets reasonable ethical standards and procedures is not just a matter of personal judgment—it is the law. - The primary directive is 45 CFR 46 in the Code of Federal Regulations, known as the "Common Rule," which stipulates certain principles for protecting the welfare and dignity of human participants in research and prescribes policies and procedures that are required of institutions in which such research is conducted

CBL - three faces of external validity

- External validity refers to the question of whether an effect (and its underlying processes) that has been demonstrated in one research setting would be obtained in other settings, with different research participants and different research procedures.
Robustness
- A result that is replicable across a variety of settings, persons, and historical contexts is said to be robust. In its narrowest sense, robustness concerns the external validity issue of the extent to which an effect obtained in one laboratory can be exactly replicated in another laboratory with different researchers. More broadly, the question is whether the general effect holds up in the face of variations in participant populations and settings.
- Technically, robustness would be demonstrated if a particular research study were conducted using a randomly selected sample of participants from a broadly defined population in a random sampling of settings. If findings do not replicate in systematically selected samples, we sometimes gain clues as to what demographic factors may be moderators of the effect in question.
- External validity, or generalizability, is related to settings as well as participant populations. The external validity of a finding is challenged if the relationship found between independent and dependent variables is altered when essentially the same research design and procedures are conducted in a different laboratory or field setting, or by experimenters with different characteristics.
Ecological validity - is it representative?
- This is the essence of ecological validity, which is concerned with the external validity issue of the extent to which an effect occurs under conditions that are typical or representative in the population. The concept of ecological validity derives from Brunswik's (1956) advocacy of "representative design," in which research is conducted with probabilistic samplings of people and situations.
- Robustness asks whether an effect can occur across different settings and people; ecological validity asks whether it does occur in the world as it is.
- Findings obtained with atypical populations (e.g., college students) in atypical settings (e.g., the laboratory) never have ecological validity until they are demonstrated to occur naturally in more representative circumstances.
- Ecological validity is too restrictive a conceptualization of generalizability for research that is designed to test causal hypotheses. It is, however, crucial for research undertaken for descriptive or demonstration purposes.
- An experimental setting may not resemble features of the real world, but may still capture processes that are representative of those underlying events in the real world.
Relevance - does it matter?
- Relevance concerns the external validity issue of the extent to which an obtained effect is pertinent to events or phenomena that actually occur in the real world. However, relevance also has a broader meaning of whether findings are potentially useful or applicable to solving problems or improving quality of life. Relevance in this sense does not necessarily depend on a physical resemblance between the research setting in which an effect is demonstrated and the setting in which it is ultimately applied.
- All social research is motivated ultimately by a desire to understand real and meaningful social behavior. But the connections between basic research findings and their application are often indirect and cumulative rather than immediate.
- Relevance is a matter of social process—the process of how research results are transmitted and used rather than what the research results are.

CBL - is external validity important

- External validity—like other issues of validity—must be evaluated with respect to the purpose for which research is being conducted - When the research agenda is essentially descriptive, ecological validity may be essential. When the purpose is utilitarian, the robustness of an effect is particularly critical. The fragility and non-generalizability of a finding may be fatal if one's goal is to design an intervention to solve some applied problem. On the other hand, it may not be so critical if the purpose of the research is testing explanatory cause-and-effect relationships, in which case internal validity is more important than satisfying the various forms of external validity. - In effect, Mook argued that construct validity is more important than other forms of external validity when we are conducting theory-testing research. Nonetheless, the need for conceptual replication to establish construct validity requires robustness across research operations and settings. This requirement is similar to that for establishing external validity. The kind of systematic, programmatic research that accompanies the search for external validity inevitably contributes to the refinement and elaboration of theory as well as generalizability.

CBL - field experiment selection of participants

- Field experiments often provide for a broader, or at least different, representation of participant populations, and these participant variations are even more evident in Internet research - Researchers sometimes select particular field sites specifically to reach participant groups of a particular age or occupation - Other settings (e.g., the city streets) do not permit such selection of a narrowly defined group, but do provide for a wider demographic range of potential participants. - It should be emphasized, however, that moving into field settings does not automatically guarantee greater representativeness of participants

CBL - Experimenter expectancy and bias

- For example, we have learned that the mere presence of the experimenter can operate as a subtle but nevertheless potentially powerful "treatment," differentially affecting participants' responses as a result not of the experimental manipulation, but of the experimenter's own expectations about the study. - In an important series of studies initiated in the 1960s, Robert Rosenthal and his associates demonstrated that the expectations held by a researcher seemingly can be transmitted to his or her participants - A placebo effect occurs when participants' belief in the efficacy of the treatment, rather than the actual effects of the treatment, is responsible for the results found in an experiment. Experimenters test this phenomenon by administering an inert "treatment" superficially resembling the real treatment, in order to deliberately mislead participants in the placebo group into believing that they are receiving the treatment. - The ratings the two groups of experimenters obtained from their participants were substantially different, in ways consistent with the expectancy hypothesis. Experimenters led to expect positive ratings from participants recorded and reported significantly more positive ratings than experimenters who expected negative scores - This result suggests that even unintentional vocal intonations and cues of experimenters may inform participants about the hypothesized and desired direction of results. - More subtle and troublesome is the possibility that a participant, through some subtle and unintentional cues from the experimenter, decides to perform "correctly," that is, in a manner that he or she thinks will please the researcher

CBL - conceptual replication

- In a conceptual replication, an attempt is made to reproduce the results of a previous study by using different operational definitions to represent the same constructs. - To establish construct validity, conceptual replications are required, in which the operationalizations of variables are dissimilar from the original study. - Conceptual replications change both context and procedure

Constructing a field experiment

- Selection of participants - Control over the independent variable - Random assignment in field settings - Assessing dependent variables in field settings

CBL - ethical issues in non-lab research

- In addition to issues related to consent to participate, researchers also must consider issues of privacy and confidentiality when research data are collected in field settings - Thus, the participants may not only be deceived about the purpose of the research, but may even be unaware that they are the subject of research in the first place - The use of "unobtrusive" measures (Webb, Campbell, Schwartz, Sechrest, & Grove, 1981) highlights this strategy, but even more traditional methods of data collection, such as the interview or questionnaire, are frequently presented in such a way as to disguise their true purpose. - Some scientists regard the practice of concealed observation or response elicitation as acceptable as long as it is limited to essentially "public" behaviors or settings normally open to public observation - All of these involve behaviors that Campbell regarded as falling within the "public domain" and thus not requiring permission from participants nor subsequent debriefing - However, there remains the question of subjective definitions of what constitute "public" behaviors, particularly in settings where social norms lead to the expectation of anonymity in public places. - if individuals in these settings do not normally expect to be observed (or, rather, expect not to be), the issue of privacy remains. - Yet collecting data about people's behaviors in these situations clearly violates the spirit of "informed consent," especially when researchers decide it is best not to debrief those who have been observed, even after the fact.

CBL - exact replication

- In an exact replication, an attempt is made to reproduce the results of a previous study by using the same procedures, particularly the same operationalizations, to represent the same constructs. That is, the operational definitions and conceptual definitions are identical to those of the original study. Only the participants, the time, and the place (and, usually, the experimenter) are changed. The purpose is to determine whether or not a given finding can be reliably repeated under slightly different circumstances - To establish the external validity of a research result, it is sufficient to demonstrate that the same independent variable has a similar effect on the dependent variable in different contexts with different types of participants - In principle, exact replications change the contextual environment of the studies while holding research procedures constant.

CBL - time series design

- In an interrupted time-series design, the relative degree of change that occurs after a quasi-experimental treatment may be examined by comparing observations at time points prior to the treatment with observations at time points occurring after it. - Of course, knowing that a meaningful and stable change in scores has occurred in a time-series analysis does not rule out sources other than the treatment intervention as possible causes of the change. Alternative explanations might be available, such as an abrupt increase in population density, changes in record-keeping procedures, or other factors related to crime rate that could have occurred simultaneously with the time at which the treatment was implemented - One problem is that "errors" (i.e., extraneous unmeasured factors) that influence the data obtained at any one time point tend to be correlated with measurements at adjacent time points. - That is, autocorrelated errors occur if random events that affect the measurement obtained at one time are more likely to carry over and be correlated with measurements taken at temporally adjacent points than with farther points in time. Such carryover errors make it more difficult to pinpoint a change in the time series at the one specific time of interest to the evaluation researcher - The second problem that plagues time-series analyses is the presence of systematic trends or cycles that affect the pattern of data over a specific time period and are unrelated to the intervention. Changes due to the treatment of interest must be separated from normal changes that occur cyclically across time. When data are obtained on a monthly basis, for instance, regular seasonal fluctuations that operate across the year must be taken into account. - Statistical procedures known as "prewhitening" can be applied to remove regularities in the time series before analyses of the experimental effect are begun
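
A sketch of both problems on simulated monthly records: autocorrelated errors at adjacent time points, and a seasonal cycle that seasonal differencing (a crude stand-in for formal prewhitening) can remove (all numbers are hypothetical; no intervention effect is simulated):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 48 months of archival records: a 12-month seasonal cycle plus noise
months = np.arange(48)
series = 100 + 10 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 3, 48)

# Problem 1: autocorrelated errors - adjacent months resemble each other
print(np.corrcoef(series[:-1], series[1:])[0, 1])   # lag-1 autocorrelation

# Problem 2: seasonal cycles - differencing against the same month a year
# earlier removes the regular 12-month fluctuation before further analysis
deseasonalized = series[12:] - series[:-12]
print(np.corrcoef(deseasonalized[:-1], deseasonalized[1:])[0, 1])
```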

CBL - Quasi-experimental methods

- In many cases, such random assignment is not politically or practically feasible. Sometimes it is impossible to control who will make use of available services or programs - At other times, programs can be delivered selectively, but the selection decision is outside the researcher's control and is based upon nonrandom factors such as perceived need, merit, or opportunity. - Quasi-experimental designs maintain many of the features of true experiments, but do not have the advantages conferred by random assignment. - The absence of random assignment along with the presence of some form of treatment is a defining feature of quasi-experiments and requires researchers to seek factors that help offset the problems that arise because of the lack of random assignment

CBL - assessing dependent variables in field settings

- In many field contexts, the design and evaluation of dependent measures is parallel to that of laboratory experiments. In the guise of a person-on-the-street interview or a market research survey, for example, field researchers may elicit self-reports of the attitudes, perceptions, judgments, or preferences of randomly selected individuals - One advantage of experimentation in field settings is the potential for assessing behaviors that are, in and of themselves, of some significance to the participant - Instead of asking participants to report on perceptions or intentions, we observe them engaging in behaviors with real consequences. - In such cases, our dependent measures are much less likely to be influenced by experimental demand characteristics or social desirability response biases - we think very few people would choose to engage in a difficult, daylong task unless there were more powerful reasons to do so.

Warner - Carry-over and counterbalancing

- Order and carryover effects: the order of presentation can be a confound.
- Counterbalancing: the type of treatment and the order of presentation are balanced, so they are unconfounded.
  - Complete counterbalancing: presenting treatments in all possible orders; not generally practical.
  - For k treatments, k treatment orders can be sufficient, provided the orders are worked out carefully.
  - Latin squares: treatment orders that control for both ordinal position and sequence of treatments. Each treatment appears once in each ordinal position, and each treatment follows each other treatment in just one of the four orders, so order is not confounded with type of treatment (see the sketch below).
- Carry-over effects: when treatments are presented in a series and the time interval is too brief, the effects of one treatment do not have time to wear off before the next is introduced; sufficient time must be allowed for effects to wear off.
- Changes in behavior as a function of time: performance may improve across time or deteriorate due to boredom or fatigue; participants may guess the purpose of the experiment, become sensitized or reactive to the measure, or show maturation if the study runs over a long period.
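
A minimal sketch of generating balanced Latin-square treatment orders (a standard Williams-design construction; the treatment labels 0-3 are arbitrary):

```python
def balanced_latin_square(k):
    """Return k treatment orders in which each treatment appears once in
    each ordinal position and, for even k, follows every other treatment
    exactly once, so order is not confounded with treatment."""
    # First row interleaves 0, 1, k-1, 2, k-2, ...; each later row adds 1 mod k
    first = [((j + 1) // 2) if j % 2 else ((k - j // 2) % k) for j in range(k)]
    return [[(t + i) % k for t in first] for i in range(k)]

# Four treatments -> four orders; assign participants evenly across orders
for order in balanced_latin_square(4):
    print(order)
```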

CBL - unobtrusive measures

- In some field settings, the kinds of dependent measures typically employed in laboratory studies might be viewed as so intrusive that they would destroy the natural flow of events. For this reason, field experiments often are characterized by the use of concealed or indirect measures of the dependent variable under study. - Indirect measures are those for which the link to the concept of interest involves a hypothetical intervening process. - Indirect measures are among a variety of techniques used by field researchers to make unobtrusive measurements of the dependent variable of interest - Some unobtrusive measures are based on observations of ongoing behavior, using methods of observation that interfere minimally or not at all with the occurrence of the behavior. For instance, voluntary seating aggregation patterns have been used as an index of racial behaviors under varied conditions of classroom desegregation - Other observational techniques may rely on the use of hidden hardware for audio or video recording of events that are later coded and analyzed, though the ethics of such research have come under heavy criticism. - Finally, some techniques make use of the natural recording of events outside the experimenter's control, such as physical traces left after an event - These results highlight the importance of pilot testing one's measures before relying on their use in a full study, as well as two potential problems with unobtrusive measures that must be considered. The first of these has to do with reliability. In general, the reliability of unobtrusive measures will not be as great as that of the more direct measures they are designed to mimic - As might be expected, therefore, the measurement validity of the dependent variable—the extent to which measures measure what they are supposed to measure—also is likely to be of greater concern with unobtrusive measurements. - This concern comes about because the farther removed the actual measure is from the concept of interest, the less likely it is to prove valid. - The reason for our continued emphasis on multiple operationalization is nowhere more evident than in studies of this type, which make use of creative unobtrusive measurement approaches.

CBL - field experiment control over independent variable

- In some instances, the researcher creates experimental conditions from scratch, controlling the background context as well as the experimental variations - In other instances, the experimenter controls less of the setting but introduces some systematic variation into the existing conditions - In other field research contexts, the experimenter neither manipulates the stimulus conditions directly nor controls participant attention, but instead selects from among naturally occurring stimuli in the field that represent the independent variable of interest. - Here the line separating experimental from non-experimental research becomes thin indeed, and the distinction depends largely on how well the selected field conditions can be standardized across participants - Significant alterations were necessary to take full advantage of the naturalistic setting - The researchers had considerably less control in the field setting. They could not control the implementation of the stimulus conditions or extraneous sources of variation. On any given night, a host of irrelevant events may have occurred during the course of the movies that could have interfered with the mood manipulation - In addition, the experimenters were unable to assign participants randomly to conditions in this field setting, and had to rely on luck to establish equivalence in initial donation rates between the two groups. - The concurrence of results in laboratory and field settings greatly enhances our confidence in the findings obtained from both sets of operationalizations of mood as measuring the same underlying concept. Had the field experiment failed to replicate the laboratory results, however, many possible alternative explanations for the discrepancy would have arisen and would have rendered interpretation very difficult.

CBL - Propensity score matching

- In the standard matching design, it would be cumbersome for the researcher to match pairs of respondents on more than a single criterion or pretest measure - Propensity score matching uses complex statistical procedures to statistically match participants on as many covariates as can be specified by the researcher, to determine differences between comparison groups. Results from propensity score matching reveal differences on an outcome measure between groups after covariates have been accounted for. - A propensity score for a participant represents the conditional probability of membership in one group (e.g., the experimental group) over another (e.g., the control group), given the pattern of that person's responses on the covariates - Propensity scores are based on the entire set of pretest measures, and are used to adjust and control for covariates statistically so that groups are initially comparable on the covariate variables - In a randomized two-group design, the propensity score of every participant is expected to be .50 (a 50% chance of being assigned to either group). In a nonexperimental or quasi-experimental design, because of nonrandomized participant selection into groups (as in our exercise campaign scenario), propensity scores should vary across participants - Notice also that people with the same propensity score have the same pattern of responses on the covariates. - This statistical property of propensity scores serves as the main criterion for balancing the groups to achieve initial comparability. - The major implication of propensity score matching is that it is possible to deduce mathematically what someone's hypothetical or counterfactual score would have been in the other group, based on their pattern of responses to the measured covariates. - The estimated or hypothetical outcome score for the group or condition that a participant was not a member of is known as the counterfactual score. - Using propensity scores, participants across groups are made statistically equivalent on these initial variables, so that any posttest observations would rule out these particular characteristics as confounds. - Although the technique allows for the matching of as many pretest covariates as possible, the extent and quality of the control is determined by the number of extraneous variables that were identified and measured in the study. - For the technique to be most effective, the researcher must consider and assess all the potentially important extraneous variables that come to mind and on which groups may differ initially - Thus, it is important to include as many pretest variables as possible, especially those that prior research and theory suggest - Owing to the number of measures on which participants across groups can potentially be matched and the many combinations of patterns of covariate responses, propensity score approaches often necessitate large sample sizes. - Even after controlling for a wide assortment of variables, however, uncertainty always remains that there are lurking variables the researcher failed to measure, and that therefore pretest equivalence was not attained. Propensity scoring is advantageous relative to other statistical techniques that control for covariates insofar as the assumptions about the data are much more relaxed in propensity analysis.
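
A minimal sketch of the first step of propensity score matching: estimating each participant's propensity score from covariates via logistic regression (the covariates, values, and group labels below are all hypothetical; matching on the resulting scores would follow):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical covariates (age, baseline fitness) and group membership
# (1 = joined the exercise campaign, 0 = comparison group)
X = np.array([[25, 3.1], [34, 2.2], [41, 4.0], [29, 2.8],
              [52, 1.9], [38, 3.5], [45, 2.4], [31, 3.9]])
group = np.array([1, 0, 1, 1, 0, 1, 0, 0])

# Propensity score = conditional probability of treatment-group membership
# given the pattern of responses on the covariates
model = LogisticRegression().fit(X, group)
propensity = model.predict_proba(X)[:, 1]
print(propensity)   # in a randomized design these would hover near .50
```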

CBL - Participant roles

- In their system, they identified four general participant types: good participants, who attempt to determine the experimenter's hypotheses and to confirm them; negative participants, who also are interested in determining the experimenter's hypotheses, but only in order to sabotage the study (Masling, 1966, referred to this type of reaction as the "screw you" effect); faithful participants, who are willing to cooperate fully with almost any demand by the experimenter, and who follow instructions scrupulously and ignore any suspicions they might have regarding the true purpose of the study; and finally, apprehensive participants, who worry that the experimenter will use their performance to evaluate their abilities, personality, social adjustment, etc., and react accordingly in the study. - We assume that almost all participants are at least partly apprehensive about taking part in an experiment, although this apprehension probably diminishes with experience - Involuntary participants are most likely to be negativistic, as are voluntary participants who learn that the cost-to-benefit ratio of their participation is not as favorable as they thought it would be when they originally signed up for the study - The good and the faithful participant roles are most likely to be assumed by voluntary participants. Nonvoluntary participants, being unaware that they are being studied, are unlikely to assume any of the roles that Weber and Cook defined. - When questioned about their unusual perseverance, they often responded with a guess that the research hypothesis was concerned with endurance, and thus their actions were quite appropriate. Respondent performances of this type have alerted researchers to the tremendous degree of behavioral control the experimenter can intentionally and unintentionally exert in the laboratory. - Nowhere was this fact more evident than in the studies of Stanley Milgram (1963, 1965). While investigating the effects of a range of variables on "obedience" behavior, Milgram's basic experimental procedure took the guise of a two-person (teacher-learner) verbal memory and learning study. In these studies, the naive participant was asked to serve as the "teacher," whose task was to shock the "learner" each time the learner committed a recall error from a list of words. In fact, a confederate played the role of the learner and received no shocks. - When the 150-volt level was reached, the learner audibly demanded to be set free, to terminate the experiment. Though the learner was unseen, his protestations, amplified from the adjacent room, were truly heartrending. At this point in the investigation, almost all the participants requested to end the study. The researcher, however, always responded, "You have no other choice, you must go on!" A substantial percentage of the participants obeyed all of the researcher's commands. In one of these studies, for example, 61% of the individuals tested continued to the very end of the shock series - The most common rationale offered by participants was that they viewed themselves as an integral part of a scientific investigation, and they were determined to fulfill their commitment to the study, even though this might entail some discomfort.

CBL - privacy on the internet

- The Internet constitutes the "public domain" and therefore can be observed and recorded without obtaining consent, although there is an expectation that information about the identity of the senders should be protected. - However, if an Internet study is designed for the purpose of collecting new data from participants (e.g., survey questionnaires and Internet experiments), informed consent should be obtained electronically from participants

CBL - datasets and archival research

- Longitudinal research might require that data records be kept about individual respondents, but these methods do not necessarily require that those individuals be personally identifiable.
- Systematic controls to screen researchers who wish to be granted access to data are designed to protect individual privacy and are not necessarily inconsistent with research aims.
- Sawyer and Schecter recommended that: only objective information be included in the dataset; individuals be given the right to review their files for accuracy, and to have errors corrected; research analyses be restricted to random samples; files of individuals be identified only by code numbers, with access to personal identification strictly restricted; and security precautions be instituted for screening data users and access to the types of information.
- These last two suggestions are related to the fact that some identification of participant records might be required for adding new information to existing records longitudinally, or for producing a file for review at the participant's own request.
- Creating linking systems of this kind can be expensive, but such costs are part of balancing the scientific usefulness of large data banks against the risks to individual privacy.
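
A minimal sketch of the code-number idea: identifying information is replaced by random codes, and the linking file is kept separately under restricted access (file names and record fields are invented for illustration):

```python
import csv
import secrets

# Hypothetical raw records containing identifying information
records = [{"name": "A. Smith", "score": 12}, {"name": "B. Jones", "score": 9}]

link, deidentified = {}, []
for rec in records:
    code = secrets.token_hex(4)          # random code number per individual
    link[code] = rec["name"]             # linking file, stored separately
    deidentified.append({"id": code, "score": rec["score"]})

# Analysts receive only the de-identified file; the link file stays restricted
with open("deidentified.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "score"])
    writer.writeheader()
    writer.writerows(deidentified)
```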

CBL - three categories of participants

- One characteristic that can be expected to influence participants' awareness of being under study, and how they respond to that awareness, is the degree of freedom that they had in agreeing to commit themselves to the research. Depending on how participants are recruited to the investigation, they can generally be classified into one of three categories, which we have termed voluntary, involuntary, and nonvoluntary participants.
Voluntary participants
- Individuals considered to be voluntary participants are aware that they are under investigation, but have made a conscious decision that the potential benefits outweigh the costs (measured in terms of time spent, privacy invaded, etc.) of being in the study. - If this positive mindset is maintained during the experiment (i.e., if the design and procedures do not force participants to revise their estimates of benefits and costs), then it is doubtful that such participants would willfully attempt to subvert the experiment - They consider this potential invasion of privacy to be part of the social contract with the experimenter, and are willing, as long as the benefit-to-cost ratio remains positive, to respond in an honest and authentic manner. - It sometimes happens that the voluntary participant proves to be too willing to cooperate, too eager to help the experimenter confirm the research hypotheses. - Rosenthal and Rosnow (1975), in an extensive review of the volunteer participant literature, found volunteers to be better educated than non-volunteers and to have higher occupational status, a higher need for approval, higher intelligence, and better adjustment than non-volunteers. - If these factors contribute to an overly cooperative volunteer, that is, a participant intent on helping confirm the research hypotheses, the generalizability of the study could be impacted.
Involuntary participants
- Individuals who are involuntary participants feel that they have been coerced to spend their time in an experimental investigation, consider it unjustifiable, and therefore may vent their displeasure by actively attempting to ruin the study. - Persons forced to comply with the demands of some higher authority can harbor considerable resentment, which can seriously affect the outcome of the study: dissatisfaction with "guinea pig" status. - Unless the experimenter can demonstrate that participation in the study is not a one-sided proposition, but rather a cooperative venture in which both parties can gain something of value, there is no reason to expect willing acquiescence to the participant role. Sometimes payment can be used to help the participant justify participation in the study. - More satisfactory is the experimenter's explanation of the reasons for and the importance of the investigation prior to the experiment. Participants' willingness to serve in studies they perceive to be of scientific importance is gratifying and surprising. - By informing participants of the importance of the research project and of the importance of their role in the research process, the experimenter can instill a sense of positive commitment in participants.
Nonvoluntary participants
- Those considered nonvoluntary participants unknowingly enter into an experimental situation and are unaware that they are part of a study until after its completion. After their responses are recorded, the investigator may (or may not) explain the nature of the study to the "participants." - Because they are unaware of their participation in a study, the reactions of nonvoluntary participants cannot be described as artificial or laboratory dependent. - Results based on the reactions of such individuals should enjoy a high degree of external validity, or generalizability. - Nonvoluntary participants are typically used in field research situations outside the laboratory, where participants do not know they are taking part in an experiment. However, even within the context of laboratory experiments, some studies have been designed so that participants entering (or leaving) an experimental laboratory are treated and tested before they realize the experiment has begun.

CBL - random assignment in field settings

- Participant self-selection problems plague field experimentation in many different ways - In other settings, too, the research may rely on the essentially haphazard distribution of naturally occurring events as equivalent to a controlled experimental design. - Thus, when comparisons are made between such naturally selected groups, the burden of proof rests on the investigator to make a convincing case that the groups were not likely initially to differ systematically on any relevant dimensions other than the naturally arising event of interest. - Often, the nature of the experimental manipulation is such that the researcher can deliver different versions or conditions to potential participants in accord with a random schedule. - Because conditions are randomly assigned to participants, rather than participants randomly assigned to conditions, the researchers can be relatively confident that no pre-existing participant differences influenced their results. - In some field research efforts, the investigator may be able to assign participants randomly to conditions but, once assigned, some participants may fail to participate or to experience the experimental manipulation. If such self-determined participant attrition occurs differentially across treatment conditions, the experimental design is seriously compromised. One way of preserving the advantages of random assignment in such cases is to include participants in their assigned experimental conditions for purposes of analysis, regardless of whether they were exposed to the treatment or not

CBL - conducting experiments online

- Perhaps even more important, having open-access sites like this provides researchers with the possibility of obtaining extremely large and diverse samples, sometimes greater than 20,000 respondents, for study - Internet results mirror lab results across a wide range of psychological effects
The first step involves designing the treatment materials and stimuli.
- The use of more complex manipulations, such as watching a video of a persuasive health message embedded into a website, is potentially problematic for participants due to technical constraints - Be aware that not everyone with Internet access has the sufficient computer requirements or bandwidth to view multimedia presentations: Participants might be required to wait patiently for the stimuli of a complex treatment to download. These awkward moments of staring at an inactive screen could contribute to higher participant dropout rates.
The second step involves finding a website capable of hosting and administering your online study.
- A major advantage of using these online programs is that participant responses are automatically recorded into a database or spreadsheet, avoiding data entry errors, and downloaded at the researcher's convenience.
The final step involves participant recruitment.
- These participants, however, represent a self-selected sample of those who actively pursue being a research participant. Just as in traditional approaches, offering an incentive will yield higher response rates, given the competition of online studies that are now being conducted. - longitudinal studies

CBL - program evaluation research

- Program evaluation is the application of social science methodology to the assessment of social programs or interventions by program evaluators. - In evaluation research, the scientist is called the program evaluator, the person responsible for evaluating and assessing the many aspects and stages of a program or intervention. The information derived from program evaluation is disseminated to other stakeholders as feedback to help revise, understand, and continue or discontinue the intervention as necessary. - The purpose of needs assessment is to judge the extent of an existing social problem and determine whether a program or intervention is needed at all. - The purpose of evaluation research at the stage of program development is to provide feedback to program designers that can lead to revisions or alterations in program materials, design, and procedures before the intervention is implemented on a larger and more costly scale - Feasibility studies are conducted on a small scale to determine if the program as planned can be delivered effectively, given the existing constraints. The purpose of this form of small-scale field testing is to decide if the program components can be implemented as intended on a wide-scale basis, and whether services will reach the targeted population. - Program efficacy studies also are conducted on a small scale to determine whether the expected effects from the planned intervention occur as planned - Summative evaluation, which is also known as impact evaluation or outcome evaluation, is conducted to assess whether a fully implemented program had an effect on the problem it was designed to alleviate. It is perhaps the primary form of evaluation research - Among the requisite conditions for effectiveness evaluation are the following: (1) The goals or objectives of the program must be sufficiently specified by the designers to allow for definable outcomes; (2) Program features must be defined well enough to determine whether the program is present or absent in a given situation or time; and (3) Some basis for observing or measuring the final outcomes in the presence or absence of the treatment program must be available. - The primary distinguishing characteristic of evaluation research is its explicitly political character. All social research may have direct or indirect political implications to some extent, but the reason for much evaluation research is political decision-making. - Random assignment, for instance, which is a relatively simple matter in the laboratory context, can be a political hot potato when special interests are at stake. - In evaluation research studies, the program evaluators responsible for research design and measurement often are not the same as the individuals responsible for the program's delivery and implementation. - Hence most evaluation projects involve a split between "research people" (program evaluators and methodologists) and "program people" (administrators, social workers, and the like), who sometimes work at cross-purposes. - A common source of conflict between program and research staff revolves around the desirability of making changes in the program or program delivery during the course of the implementation process. For experimental purposes, the treatment variable ideally remains constant throughout the study.
Program personnel, however, may be inclined to continually alter or improve aspects of the treatment or policy in response to new information - Program implementers tend to want to know how their program is doing (e.g., is it reaching the intended population, are clients happy with the services received, etc.), whereas researchers want to know what effect the program is having

Warner - Person Effects

- The reduction in size of SS error that occurs when repeated measures or paired samples are analyzed becomes greater as the correlation between scores increases - Person effects - extent to which some persons tend to have high heart rates across different situations, while others tend to have low rates - r can be interpreted as an indication of consistency or reliability of responses - When we have consistent individual differences in the outcome variable we can remove the variation related to individual differences from the error term - we can statistically control for individual differences - Person effects account for a larger share of error variance when person effects are consistent (when r is large) - remove this (see the sketch below) - Major potential advantages of repeated-measures - when each participant provides data in both of the treatment conditions, or when each participant serves as his or her own control, the variance due to stable individual differences among persons can be removed from the error term, and the resulting repeated-measures test often has better statistical power than an independent samples t test applied to the same set of scores - Uses smaller numbers of participants
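A minimal sketch of why removing person effects helps, using hypothetical standard deviations: the variance of the difference scores, var(d) = var1 + var2 - 2*r*sd1*sd2, shrinks as the correlation r between paired scores grows:

    # Hypothetical SDs for the two conditions; r varies across trials
    sd1, sd2 = 2.0, 2.0
    for r in (0.0, 0.5, 0.9):   # increasingly consistent person effects
        var_d = sd1**2 + sd2**2 - 2 * r * sd1 * sd2
        print(f"r = {r}: var(d) = {var_d:.1f}")   # prints 8.0, 4.0, 0.8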

Bem - rewriting and polishing your article

- Rewriting is difficult for several reasons. First, it is difficult to edit your own writing - lay your manuscript aside for a while and then return to it later when it has become less familiar. Sometimes it helps to read it aloud. But there is no substitute for practicing the art of taking the role of the nonspecialist reader, for learning to role-play grandma - It requires a high degree of compulsiveness and attention to detail. The probability of writing a sentence perfectly the first time is vanishingly small, and good writers rewrite nearly every sentence of an article in the course of polishing successive drafts. - Finally, rewriting is difficult because it usually means restructuring. Sometimes it is necessary to discard whole sections of an article, add new ones, go back and do more data analysis, and then totally reorganize the article just to iron out a bump in the logic of the argument. Do not get so attached to your first draft that you are unwilling to tear it apart and rebuild it. - omit needless words - every word should tell something new - avoid metacomments on writing - use repetition and parallel construction - use jargon only when the specialized term is more general/more precise/freer of surplus meaning/extremely well known - This practice produces lifeless prose and is no longer the norm. Use the active voice unless style or content dictates otherwise; and, in general, keep self-reference to a minimum. Remember that you are not the topic of your article. You should not refer to yourself as "the author" or "the investigator." (You may refer to "the experimenter" in the method section, however, even if that happens to be you; the experimenter is part of the topic under discussion there.) - It tends to distract the reader from the topic, and it is better to remain in the background. Leave the reader in the background, too. - The Publication Manual, however, emphasizes that the referent of "we" must be unambiguous; for example, copy editors will object to the sentence - Use the past or present perfect tense when reporting the previous research of others ("Bandura reported..." or "Hardin has reported...") and past tense when reporting how you conducted your study - use the present tense for results currently in front of the reader ("As Table 2 shows, the negative film is more effective ...") and for conclusions that are more general than the specific results - use descriptive terms that either identify them more specifically or that acknowledge their roles as partners in the research process - As these examples illustrate, hyphens should not be used in multiword designations, and terms such as Black and White are considered proper nouns and should be capitalized. - person-first language

CBL - methodology as ethics

- Rosenthal and Rosnow (1984), for instance, promote the philosophy that sound research design is an ethical imperative as well as a scientific ideal. Taking into account that participants' time, effort, and resources are involved in the conduct of any social research, they argue that researchers are ethically obligated to do only research that meets high standards of quality to ensure that results are valid and that the research has not been a waste of participants' time. - Rosenthal (1994) has gone so far as to suggest that IRBs should evaluate the methodological quality of research proposals as part of the cost-benefit analysis in their decisions about the ethicality of proposed research projects. - Rather than IRBs reviewing a study based strictly on methods to be undertaken, the value of a particular research project must be evaluated in terms of the contribution it will make to a body of research employing various methods, rather than as an isolated enterprise.

CBL - ethics of research practices

- Scientific interests might conflict with values placed on the rights of individuals to privacy and self-determination - informed consent, a process in which potential participants are informed about what participation in the study will involve so that they can make an informed decision about whether or not to engage in the research. - Doing so emphasizes that participation should be voluntary. It is recognized that many phenomena could not be researched at all if this ideal were met fully, and that the rights of participants must be weighed against the potential significance of the research problem - In studies where full information cannot be provided to participants, the panel report recommends that "consent be based on trust in the qualified investigator and the integrity of the research institution." - Rather, judgments as to the relative importance of research programs and researchers' responsibilities for the welfare of their participants are the fundamental bases of research ethics.

Dependent t Test

AKA - dependent groups t test - dependent samples t test - within-subjects t test - related samples t test - paired t test
- same group of participants - DV scores in one condition dependent on DV scores in other condition
IV: binary (each participant serves in both conditions/groups/rounds)
DV: quantitative (repeatedly measured: 2x)
Conceptual
- H0 (null hyp.): Difference in 2 means not sig. beyond chance
- H1 (research hyp.): Difference in 2 means is sig. beyond chance
Inferential (generalize to pop.)
- H0: μd = 0
- H1: μd ≠ 0

CBL - ethics of data reporting

- Selective reporting of some results of a study and not others often occurs, and data from some participants are dropped from analyses if they are suspect in some way. Such practices can be justified to reduce unnecessary error in understanding and interpreting results of a scientific study, but these practices can be abused if used to distort the findings in the direction of reporting only what the researcher had hoped to demonstrate. To avoid such abuses, researchers need to use systematic criteria for dropping data from their analyses and to be scrupulous about reporting how these criteria were applied. - Researchers are expected to be honest about reporting results that do not support their hypotheses as well as results that do support their predictions. In addition, researchers need to be honest about what their hypotheses were in the first place. - When unexpected findings are obtained, we usually can generate explanations post hoc about why things came out that way. - post hoc explanations become hypotheses for new research - However, in a research report, it is important to distinguish between interpretations of findings that are made after the fact and hypotheses that were made before the study began. Post hoc explanations that are reported as if they had been hypothesized beforehand are a practice that Norbert Kerr has labeled "HARKing"—Hypothesizing After the Results are Known

Warner - Paired Samples t Test

- Smaller error term and greater statistical power than independent-samples t test - Repeated-measures - requires smaller number of participants than independent samples - Each case assessed at two or more points in time or under two or more treatment conditions - Correlated samples = if each person in Group 1 has a partner in Group 2 - can occur naturally or formed by researcher - Matching on age - pair someone with other person of same age - form of experimental control - Naturally occurring pairs - different but related persons in the two samples - Correlated samples - members of two samples paired in some way - E.g. naturally occurring pairs - marriage partners - Couple - is the unit of analysis or case - Significant correlation btw husband and wife relationship satisfaction - Significant difference between means for relationship satisfaction - paired samples t test - difference between M_husband and M_wife (see the sketch below) - Matched pairs of participants can be created by a researcher as a way of making certain that an important variable is experimentally controlled and made equal or nearly equal across groups and therefore not confounded with treatment condition - Can guarantee most important participant characteristic is equal or close to equal across treatment groups - Approximately matched samples - participants matched by similar ability scores - Ranked list of students divided into pairs - within each pair, students have similar scores on ability - within each pair, one member is randomly assigned to treatment 1 and the other to treatment 2 - Mean ability scores computed and compared across groups - difference between means smaller as additional matched pairs added - Exact matching - exactly matched on age - preferable - When we have a paired-samples or repeated-measures design, we hope to see a positive correlation
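A minimal sketch of a paired-samples t test on correlated samples, assuming hypothetical satisfaction scores for six couples and the scipy library:

    import numpy as np
    from scipy import stats

    # Hypothetical relationship-satisfaction scores for 6 couples;
    # the couple is the unit of analysis (naturally correlated samples)
    husband = np.array([7, 5, 6, 8, 4, 6])
    wife = np.array([6, 5, 5, 7, 3, 6])

    r, _ = stats.pearsonr(husband, wife)    # consistency across couples
    t, p = stats.ttest_rel(husband, wife)   # paired-samples t test
    print(f"r = {r:.2f}, t = {t:.2f}, p = {p:.3f}")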

CBL - deception and participant well-being

- Sometimes place research participants in a position of psychological distress or other discomfort. - This potentially violates the other major canon of ethical research—to "do no harm" to those who participate. - As usual, the extremes are fairly well established—the risk of permanent physical or psychological harm without informed consent by the human participant is never permissible - However, consensus as to the acceptability of temporary or reversible psychological distress is more difficult to achieve - Most researchers seem to agree that the potential contribution of this line of research and the transitory nature of the psychological discomfort involved justified its undertaking. - Milgram's (1964) response to this criticism emphasized the significance of the research, particularly with regard to the unexpectedly high percentage of people who were willing to obey. He noted that great care had been exercised in post-experimental information sessions to assure that participants suffered no long-term psychological damage. - If more participants had behaved in a humane and independent fashion and refused to administer shocks at high voltage levels, the same research procedures might not have come under such intense attack - Much of the criticism, in other words, stemmed from the way participants reacted to the situation. Yet those very results—unexpected as they were—are what made the experiments so valuable to social science.

CBL - pseudo-confirmation of expectations

Experimenter expectations can produce pseudo-confirmation when researchers: - Systematically err in observation, recording, or analysis of data (whether intentionally or not), and/or - Cue the participant to the correct response through some form of verbal or nonverbal reinforcement.

CBL - Deception in the laboratory

- The extent to which participation is fully voluntary is debatable, given the social and institutional pressures to take part in research that are sometimes involved. - But generally, due to the artificial setup, participants in laboratory studies at least know that they are taking part in a research study. Beyond that, however, the information provided to participants in laboratory investigations is usually minimal in order to foster research purposes, and sometimes is intentionally misleading. - the methodological strategy of most laboratory research is directed toward motivating participants to behave spontaneously and therefore not self-consciously while involved in the conditions of the study. - Yet such deception is undeniably in violation of values of interpersonal trust and respect. - Some critics have argued that deception is never justified and should not be permitted in the interests of social research - Most researchers take a more moderate view, recognizing that there is an inevitable trade-off between the values of complete honesty in informed consent and the potential value of what can be learned from the research itself - a minimal amount of deception may be tolerated in the service of obtaining trustworthy research data - Kelman - Ethically, he argued, any deception violated implicit norms of respect in the interpersonal relationship that forms between experimenter and research participant. - In addition, the practice might have serious methodological implications as participants become less naive and widespread suspiciousness begins to influence the outcomes of all human research. - To offset these problems, Kelman recommended that social scientists (a) reduce unnecessary use of deception, (b) explore ways of counteracting or minimizing the negative consequences stemming from deception, and (c) use alternative methods, such as role-playing or simulation techniques, which substitute active participation for deception, with participants being fully informed of the purpose of the study. - remains uncertain whether the results of a role-playing simulation can be interpreted in the same way as those obtained under the real experimental conditions. - Thus, the general consensus in the research community is that some level of deception sometimes is necessary to create realistic conditions for testing research hypotheses, but that such deception needs to be justified in light of the purpose and potential importance of the research question being studied.

Bem - how should you write

- The primary criteria for good scientific writing are accuracy and clarity. If your article is interesting and written with style, fine. But these are subsidiary virtues. First strive for accuracy and clarity. - The first step toward clarity is good organization, and the standardized format of a journal article does much of the work for you. It not only permits readers to read the report from beginning to end, as they would any coherent narrative, but also to scan it for a quick overview of the study or to locate specific information easily by turning directly to the relevant section. Within that format, however, it is still helpful to work from an outline of your own - The second step toward clarity is to write simply and directly

Bem - which article should you write

- There are two possible articles you can write: (a) the article you planned to write when you designed your study or (b) the article that makes the most sense now that you have seen the results. They are rarely the same, and the correct answer is (b). - Psychology is more exciting than that, and the best journal articles are informed by the actual empirical findings from the opening sentence. Before writing your article, then, you need to Analyze Your Data. Analyzing Data - before writing anything analyze your data - The rules of scientific and statistical inference that we overlearn in graduate school apply to the "Context of Justification." They tell us what we can conclude in the articles we write for public consumption, and they give our readers criteria for deciding whether or not to believe us. But in the "Context of Discovery," there are no formal rules, only heuristics or strategies. - In the confining context of an empirical study, there is only one strategy for discovery: exploring the data. - Yes, there is a danger. Spurious findings can emerge by chance, and we need to be cautious about anything we discover in this way. Reporting the findings - When you are through exploring, you may conclude that the data are not strong enough to justify your new insights formally, but at least you are now ready to design the "right" study. If you still plan to report the current data, you may wish to mention the new insights tentatively, stating honestly that they remain to be tested adequately. Alternatively, the data may be strong enough to justify recentering your article around the new findings and subordinating or even ignoring your original hypotheses. - If your study was genuinely designed to test hypotheses that derive from a formal theory or are of wide general interest for some other reason, then they should remain the focus of your article. The integrity of the scientific enterprise requires the reporting of disconfirming results. - Contrary to the conventional wisdom, science does not care how clever or clairvoyant you were at guessing your results ahead of time. Scientific integrity does not require you to lead your readers through all your wrongheaded hunches only to show—voila!—they were wrongheaded. A journal article should not be a personal history of your stillborn thoughts. - Your overriding purpose is to tell the world what you have learned from your study. If your results suggest a compelling framework for their presentation, adopt it and make the most instructive findings your centerpiece. Think of your dataset as a jewel. Your task is to cut and polish it, to select the facets to highlight, and to craft the best setting for it. Many experienced authors write the results section first.

CBL - Ethical issues related to the products of scientific research

- This issue becomes most acute when the factor of research sponsorship is considered. When a research project is financed wholly or in part by some governmental or private agency, the researcher sometimes is obligated to report results directly—perhaps exclusively—to that agency. In such cases the purposes of the sponsoring agency will clearly determine at least the immediate application of information or technical developments derived from that research, and the scientist can hardly deny foreknowledge of such applications, whatever other potential uses the discovery may have - Given the growing costs of research in the physical and social sciences, more and more projects must rely on sources of funding other than those provided by such presumably impartial agencies as universities, and more and more scientists are facing a choice between abandoning a particular line of research or conducting it under the auspices of some private agency with special interests. - Thus, the ethical considerations of any researcher in this area must include who will be privy to this knowledge in the long run, and what are the chances that it will come under the exclusive control of one segment of the social system - What is the responsibility of the designers of these techniques when they are used by corporate personnel officers to weed out unsuspecting employees with potential anti-management values or attitudes? Or, alternatively, what is the responsibility of the researcher whose correlational study of social and attitudinal factors linked to student radicalism is used by university admissions officers to develop screening criteria for rejecting applicants? - The issue of social responsibility is made even more complex when one realizes that the conclusions to be drawn from research results or psychological tests often are grossly misperceived by naive analysts. - Similar issues are raised with respect to research yielding controversial results that reveal differences (e.g., in intelligence or personality variables) between different ethnic or racial groups. Because the ethnic variable is inextricably confounded with cultural and socioeconomic factors in contemporary Western society, the source of such differences in terms of genetic or cultural factors cannot be determined unambiguously, so researchers should report such results in a highly qualified fashion.

CBL - Regression-discontinuity design

- Unlike time-series designs, which may involve the observations across many time points for one or many comparison groups, the regression-discontinuity design involves comparing groups on either side of a cutoff point. - The regression-discontinuity design (RD) is conducted to test the existence of some systematic relationship between a pretest selection variable, used for the purpose of placing participants in comparison groups, and a posttest measure of interest - The RD design is used to determine if there is a discontinuity in scores between those immediately above and immediately below the cutoff for inclusion in the program, and if there are differences in the slopes of the regression lines on each side (see the sketch below). - The RD design is quasi-experimental because it is meant to mimic a true experiment in which a group of participants at a cutoff point are randomly assigned to a treatment or a control condition. - In the regression-discontinuity approach, if selection into the special program is associated with a specific cutoff point on the selection variable, we can use those who fall immediately below the cutoff point as a comparison to those falling immediately above the cutoff - This design is especially useful when scarce resources are used as interventions - The regression-discontinuity design is less common than comparison time-series designs in program evaluation settings, partly because assignment of participants to groups based on a criterion cutoff score is relatively rare, and partly because it requires extensive data collection for individuals across the full range of scores on the selection variable. - However, when regression-discontinuity analyses are used, they often are focused on important, socially relevant issues - In either instance, strict application of the selection rule permits cause-effect conclusions to be drawn; if the cutoff rule is not strictly enforced, the design's capacity for causal interpretation is forfeited. In addition, it is essential that the relationship between pretest and posttest is linear, or the interpretability of the ensuing results is severely compromised. The statistical power of analyses of the RD design also is low: Using RD analysis requires two to four times as many observations as experimental designs to attain the same degree of statistical power. Given these problems, the RD design obviously is not preferable to a randomized experiment, if the latter is possible.
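A minimal sketch of an RD analysis under the assumptions above (linear pretest-posttest relationship, strictly enforced cutoff), using simulated data and the statsmodels package; all names and values are invented for illustration:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    pretest = rng.uniform(0, 100, 300)          # selection variable
    cutoff = 50
    treated = (pretest < cutoff).astype(int)    # program given below cutoff
    # Simulated posttest: linear in pretest plus a 5-point program effect
    posttest = 10 + 0.5 * pretest + 5 * treated + rng.normal(0, 3, 300)

    df = pd.DataFrame({"post": posttest,
                       "pre_c": pretest - cutoff,   # center at the cutoff
                       "treated": treated})
    # 'treated' estimates the discontinuity at the cutoff;
    # 'treated:pre_c' tests for a difference in slopes on each side
    model = smf.ols("post ~ treated + pre_c + treated:pre_c", data=df).fit()
    print(model.params)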

CBL - use of archival data in longitudinal research

- Use of statistical archives has a number of advantages, but also creates some disadvantages for research purposes. First, it limits the dependent measures or outcome variables that can be assessed to the type of information on which records happen to have been kept. Records kept for administrative purposes may or may not reflect the primary goals of the particular social program being evaluated. - Another limitation imposed by reliance on archival records is the time interval covered by a given statistic, which may or may not be the unit of time ideal for research purposes. - Finally, a major worry with many archival studies is the possibility that the nature or method of record keeping has changed over time. Record-keeping systems can be altered in many ways, usually for administrative convenience. - If changes in administrative record-keeping methods occur in close proximity to the introduction of a social program, the records may become useless as measures of program-induced changes. - On a positive note, use of archived data usually provides much broader swaths of the population than is possible in laboratory-oriented research. Often archival data is representative of large groups of the populace—cities, states, or the country.

CBL- quasi experiments

- quasi-experimental studies in which some systematic manipulation has been made for the purpose of assessing a causal effect, but random assignment of participants is not employed - This could occur because the intervention involves some pervasive treatment affecting all participants in the social setting at once. Or it could be that some participants are exposed to the treatment while others are not, perhaps because of some other nonrandom assignment process - These situations bear some resemblance to the pretest-posttest control group experimental design, in that a treatment has been introduced at a specifiable point in time or space. Thus, outcomes before the presence of the quasi-experimental treatment can be compared with outcomes occurring after its introduction across time points, but as a research design the structure of the study lacks a critical element necessary for internal validity. - Although quasi-experiments often have fewer threats to internal validity than correlational studies, they are generally more susceptible to such threats than randomized experiments. The most recognized applications of quasi-experimental designs are found in evaluation research, in which the quasi-experimental treatment is usually called a program, policy, or intervention.

CBL - Methodological concern: reliability and validity of internet studies

- When using the web as a research modality, issues of reliability are the same as those we confront when using conventional laboratory methods. The measures we use in Internet research must be psychometrically sound, which is not different from the requirement of the more standard approaches. - Buchanan (2000) suggests that with its abundance of available research participants, and their diversity relative to the standard college sample, developing and refining measurement instruments might be facilitated by using the Internet. Developing scales of fact or opinion typically requires large numbers of respondents, and the necessary sample size often is not available in many traditional research settings. - Assessing measurement validity, specifically the construct validity, of measures using the Internet instead of traditional methods is sometimes difficult. One common approach to understanding possible differences between the two modalities is to conduct the investigation in the traditional laboratory and also on the web - Parallel results from laboratory and the Internet are encouraging, but may be attributable to a host of factors. Confirming the null hypothesis is not a satisfactory approach to establishing validity under any circumstance, and circumstances involving the Internet are not immune to this problem. - Removal of the experimenter from the online setting was suggested as producing less deception and therefore prompting greater honesty in participant responses.

CBL - debriefing

- Where deception is seen as necessary to the exploration of some research questions, attention turns to the best method of counteracting the negative implications of its use. - For this purpose, considerable emphasis has been given to the importance of the debriefing, a session following participation in a study in which the participants are informed of the true purpose of the research. - When such attention is devoted to the content and handling of debriefing sessions, this may serve not only to "undeceive" the participants (and thereby relieve the researcher's pangs of conscience) but also to enrich the participant's own experience in terms of understanding human behavior and the nature of social research. - Some research suggests that when used routinely or handled perfunctorily, debriefing procedures can cause more harm than good - Many researchers warn that routine debriefing may produce widespread suspicion and potential dishonesty among populations frequently tapped for participation in psychological research. - Results revealed that only the completely debriefed participants, as classified in the first experiment, reacted with suspicion to the second experiment. - Examination of these differences indicated that the previously deceived participants, in an effort to correct for a deceived self-image, responded with more favorable self-presentation biases than did the non-deceived participants. - Holmes (1967) found that the more research experience participants had, the more favorable were their attitudes toward psychological experiments, and the more they intended to cooperate in future research - They found that participants who had been in deception experiments evaluated their research experience more positively than those who had not been deceived, and that debriefing appeared to eliminate the negative feelings of those who felt they had been harmed by deceptive practices. Those who had participated in experiments using deception reported that they received better debriefing, enjoyed their experiences more, received greater educational benefits from the research, and were more satisfied with the research program - Often, deception-based studies are inherently interesting—they involve complex decision-making, arguing against a strong counter-attitudinal communication, making judgments about people and social events, and so on - With effective debriefing, the negative aspects of the deception studies might be offset, leading to more positive evaluation of the research experience on the part of participants. - This requires, however, that the debriefing be conducted in such a way and with enough thought and effort that participants leave feeling that they have learned something by the experience and that their time has not been wasted. - Effective debriefing also is important in gaining participants' commitment to the purposes of the research and their agreement not to tell potential future participants about the deception employed. - Because of this, badly done debriefings—those that do not succeed in gaining participants' confidence, trust, and cooperation—can potentially harm the scientific enterprise. - for highly concerned participants, even lengthy debriefings were not successful in removing reactions to false information feedback regarding personal adequacy.
- Particularly when the experimental manipulation involves providing false information of some kind about the participant's personality, competence, or self-esteem, an extensive form of debriefing known as "process debriefing" (Ross et al., 1975) may be required. - Process debriefing entails discussing with participants how the deception may have temporarily influenced their own self-perceptions and the processes by which these effects might occur. - Another technique for enhancing the effectiveness of debriefing is to show participants all of the experimental conditions so they can see how the information they had received was just one version of the experiment to which they had been randomly assigned.

CBL - code of ethics

- behavioral researchers also are subject to codes of ethics promulgated by scientific societies such as the National Academy of Sciences (1995) and the American Psychological Association - Thus, it is our view that the best guarantee of continued concern over ethical standards is the frequent airing of ethical issues in a way that ensures exposure to each new generation of researchers.

CBL - participant awareness

- critique: participants conscious of the fact that they are under observation - responses could be quite different from those of persons not under such experimenter scrutiny - "self-consciousness effect" - In some cases, the participant's awareness of being observed may be the most salient feature of the total experimental situation. If this is so, then the obtained results of such a study should be viewed with caution—but certainly not dismissed from consideration. - In more ideal instances, however, because the situation engages the attention of the participant, it seems likely that the participant's awareness of being studied would be greatly diminished, and thus would have negligible or no impact on the results. - In most cases, the experimenter should attempt to construct situations that reduce the self-consciousness of participants as much as possible.

CBL - Misconceptions about internet research

- Misconceptions: diversity, well-adjusted, lack of correspondence with standard methods of data collection, motivation
- First, more diverse: Although the gender gap in Internet use has largely disappeared, a small racial disparity remains with regard to Whites being more likely to have access to this medium than Latinos or Blacks, likely attributed to income differences. Still, the Internet remains impractical for recruiting from certain populations - A second misconception of Internet users is that they are less well-adjusted than those who do not use the Internet. Gosling and colleagues (2004) found that this was not true. Their Internet respondents were not drastically disparate from the usual college sample on indicators of social adjustment or mental health. - A third objection to Internet-based research has to do with the lack of correspondence between its results and those derived from standard (paper and pencil) methods of data collection. However, across many areas of investigation, especially those involving survey or scale research, Internet and traditional forms of data collection yield parallel findings. - A fourth misconception is that participants completing a study online are not necessarily as motivated as those completing a study in person (Gosling et al., 2004). The misconception that online participants tend to be unmotivated stems from the view that online studies are usually anonymous and could be completed by practically anyone, thus increasing the potential to answer questions carelessly. Because it is convenient to abandon a study in the middle if it appears uninteresting, dropout rate is indeed higher in Internet-based studies
With the internet, researchers can:
o Conduct experiments outside the laboratory.
o Draw on a much greater demographic range of participants, helping to enhance participant generalizability.
o Minimize experimenter bias and demand.

CBL - Restriction of participant populations

- individual typically used in research (college undergrad) not most ideal choice - avg. college student - more intelligent, healthier, more concerned with social forces operating on physical and social environment, sensitive to various communication media, better informed, less likely to hold strongly fixed attitudes - Western, educated, industrialized, rich, democratic societies (WEIRD) - some areas of research - minor influence - e.g., processes so basic that special peculiarities of college students could not reasonably confine the generalizability of results - e.g., judgment of attractiveness and stereotypes - Once results are determined from a restricted population, principles of behavior can then be investigated systematically in other demographic groups to test whether participant characteristics limit the generalizability of the original findings.

CBL - Research on social network activity

- online activity can itself be studied as a form of social behavior in a naturalistic setting - Research has been conducted to address questions about who uses online social networks, why people make use of this form of social communication, the content of information that is disclosed online, and the consequences of online social activity for real-life relationships - some research uses online activity as data, including rates of usage of online sites over time, content of user profiles, and demographic statistics on users. For the most part, the research conducted thus far has been descriptive or correlational, similar to observational research methods - However, it is possible to introduce experimental interventions into ongoing network activity, such as randomly assigning participants to increase their frequency of status postings on Facebook to assess effects on perceived social connectedness (Deters & Mehl, 2013), or posting experimentally designed messages to online discussion groups to assess effects on emotional content of responses

Bem - Discussion

- It forms a cohesive narrative with the introduction, and you should expect to move materials back and forth between these two sections as you rewrite and reshape the report. Topics that are central to your story will appear in the introduction and probably again in the discussion. More peripheral topics may not be brought up at all until after the presentation of the results. The discussion is also the bottom of the hourglass-shaped format and thus proceeds from specific matters about your study to more general concerns (about methodological strategies, for example) to the broadest generalizations you wish to make. The sequence of topics is often the mirror image of the sequence in the introduction. - Begin the discussion by telling us what you have learned from the study. Open with a clear statement on the support or nonsupport of the hypotheses or the answers to the questions you first raised in the introduction. But do not simply reformulate and repeat points already summarized in the results section. Each new statement should contribute something new to the reader's understanding of the problem. What inferences can be drawn from the findings? These inferences may be at a level quite close to the data or may involve considerable abstraction, perhaps to the level of a larger theory regarding, say, emotion or sex differences. - implications - It is also appropriate at this point to compare your results with those reported by other investigators and to discuss possible shortcomings of your study, conditions that might limit the extent of legitimate generalization or otherwise qualify your inferences. Remind readers of the characteristics of your participant sample, the possibility that it might differ from other populations to which you might want to generalize - As noted above, your task is to provide the most informative and compelling framework for your study from the opening sentence. If your new theory does that, don't wait until the discussion section to spring it on us. A journal article is not a chronology of your thought processes. - The discussion section also includes a consideration of questions that remain unanswered or that have been raised by the study itself, along with suggestions for the kinds of research that would help to answer them. In fact, suggesting additional research is probably the most common way of ending a research report. - end with a bang, not a whimper.

CBL - ethics APA recommendation

- that all excluded observations and the reasons for exclusion have been reported in the Methods section - that all independent variables or manipulations have been reported - that all dependent variables or measures that were analyzed for the research study have been reported - that information on how the final sample size for each study was determined has been reported

Bem - title and abstract

- they should accurately reflect the content of the article and include key words that will ensure their retrieval from a database. You should compose the title and abstract after you have completed the article and have a firm view of its structure and content. - contain the problem under investigation (in one sentence if possible); the participants, specifying pertinent characteristics, such as number, type, age, sex, and species; the experimental method, including the apparatus, data-gathering procedures, and complete test names; the findings, including statistical significance levels; and the conclusion and the implications or applications. - If the conceptual contribution of your article is more important than the supporting study, this can be reflected in the abstract by omitting experimental details and giving more space to the theoretical material

Repeated Measures ANOVA: Statistical Assumptions

1) Normal distribution of DV scores (quantitative)
- skewness index indicates approx. normal distribution (for each round)
- no extreme outliers
2) Sphericity: Variances (of difference scores) approximately equal across rounds (variant of homogeneity of variances test; see the sketch below)
- if exactly 2 rounds - don't need to test sphericity - only one diff score/variance
- aim for non-sig, indicating variances of diff scores approx. the same
- Mauchly's test of sphericity: sphericity or not
- not sig. (aim): variances of difference scores (across all pairs of rounds) are approx. the same - interpret "Sphericity Assumed" F test
- sig. p < .05: variances of difference scores are not the same - interpret Greenhouse-Geisser F test
- Greenhouse-Geisser epsilon: degree of sphericity (range: .00 to 1.00)
- high values (e.g., > .50): variances of difference scores (across all pairs of rounds) are approx. the same - interpret "Sphericity Assumed" F test
- low values: variances of difference scores are not the same - interpret Greenhouse-Geisser F test
- if 1.00, variances exactly the same
3) Non-independence of participants
- within-groups design: each participant serves in all rounds/conditions/groups
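A minimal sketch of the sphericity idea (not Mauchly's formal test, which SPSS computes), using hypothetical scores for eight participants across three rounds: compute the variance of the difference scores for every pair of rounds and check that they are roughly equal:

    import numpy as np
    from itertools import combinations

    # Hypothetical scores for 8 participants across three rounds
    rounds = {"r1": np.array([5, 6, 7, 4, 6, 5, 7, 6]),
              "r2": np.array([6, 7, 7, 5, 7, 6, 8, 6]),
              "r3": np.array([7, 8, 9, 6, 8, 7, 9, 8])}

    # Sphericity implies the variances of the pairwise difference
    # scores are approximately equal across all pairs of rounds
    for a, b in combinations(rounds, 2):
        d = rounds[a] - rounds[b]
        print(f"var(d_{a}-{b}) = {d.var(ddof=1):.2f}")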

Dependent t Test: Statistical Assumptions

1) Normal distribution of DV scores (quantitative)
- skewness index indicates approx. normal distribution at each round - bound by +/- 2 to have normal distribution (see the sketch below)
- no extreme outliers
2) Non-independence of participants
- within-subjects design: each participant serves in both rounds/conditions/groups
Not an assumption: homogeneity of variances
- not needed for 2 rounds, as that involves a single variance (of the difference scores): S^2_d
- compute deviation scores from the difference score of each person
- calculation performed on one set of difference scores
- when calculating the variance of the difference scores, a single variance is used throughout the calculation
- because there is one variance of difference scores, there is no other variance to compare it with
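A minimal sketch of the skewness check, assuming hypothetical mood scores and the scipy library (SPSS's skewness index uses a slightly different bias correction, so values may differ marginally):

    import numpy as np
    from scipy import stats

    # Hypothetical mood scores for the same 10 participants at two rounds
    morning = np.array([8, 7, 6, 9, 5, 7, 8, 6, 7, 6])
    afternoon = np.array([5, 6, 4, 7, 3, 5, 6, 4, 5, 5])

    # Skewness within +/- 2 at each round suggests approximate normality
    for name, scores in (("morning", morning), ("afternoon", afternoon)):
        print(f"{name}: skew = {stats.skew(scores):.2f}")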

Dependent t Test Formula

1) sum of squares (of the difference scores)
- interested in the difference within the same participant
- take each difference score and subtract the mean of the difference scores
- positive and negative deviations cancel out, so square all values to get positive values
- sum the squared deviations across participants
2) variance (of the diff. scores)
- variability per participant
- divide by N-1: degrees-of-freedom adjustment for an unbiased estimator
- don't need N-2 because there are not 2 conditions with different participants
3) standard error (of the mean of diff. scores)
- magnitude of chance - chance for that mean difference
- square root to return to the original unit of measure
4) dependent t = mean difference / standard error
(see the worked sketch below)
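The four steps above as a minimal numpy sketch, using hypothetical paired scores for five participants:

    import numpy as np

    # Hypothetical paired scores for 5 participants
    cond1 = np.array([8, 6, 7, 9, 5])
    cond2 = np.array([6, 6, 5, 8, 5])

    d = cond1 - cond2                   # difference score per participant
    ss_d = np.sum((d - d.mean()) ** 2)  # 1) sum of squares of the d scores
    var_d = ss_d / (len(d) - 1)         # 2) variance (N - 1: unbiased estimate)
    se_d = np.sqrt(var_d / len(d))      # 3) standard error of mean difference
    t = d.mean() / se_d                 # 4) dependent t = mean diff / SE
    print(f"t({len(d) - 1}) = {t:.2f}")  # t(4) = 2.24 for these values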

CBL - ethic research suggestions for sponsored research

1. That research programs that currently rely on exclusive sources of support instead be multiply sponsored, or receive support from a combined scientific research fund supported by budget allotments from several different agencies, 2. That to the maximum extent possible, all research reports, techniques, and summaries be made available for public distribution, 3. That emphasis be given to the social responsibility of individual scientists or groups of scientists to educate the public on the nature of their research results in ways that will enhance understanding of both the conclusions and the qualifications and limitations that must be placed on the generalization of those conclusions.

Mixed Within-Subjects and Between-Subjects ANOVA SPSS

A mixed within-subjects (room temperature: cold and hot) and between-subjects (gender: male or female) ANOVA was performed. The skewness index discloses that the anger scores in the cold room (0.00) and hot room (0.00) were relatively normally distributed. The room temperature main effect for participants who served in both the cold room (M = 5.00, SD = 2.20) and hot room (M = 5.00, SD = 4.31) was not significantly different on anger, F(1, 6) = 0.00, ns. The gender main effect was significant, as males (M = 7.00) compared to females (M = 3.00) scored significantly higher on anger, F(1, 6) = 17.46, p < .05. The interaction effect attained significance, F(1, 6) = 27.43, p < .05. Specifically, the same males in the hot room compared to when they were in the cold room scored higher on anger. However, the same females in the cold room compared to when they were in the hot room scored higher on anger. The interaction effect is graphed in Figure 1.

Bem - Results

Results: Setting the Stage - First, you should present evidence that your study successfully set up the conditions for testing your hypotheses or answering your questions. If your study required you to produce one group of participants in a happy mood and another in a depressed mood, show us in this section that mood ratings made by the two groups were significantly different. - Here is also where you can put some of the data discussed in "The Method Section": Reliabilities of testing instruments, judges, and observers; return rates on mail surveys; and participant dropout problems. - The second preliminary matter to deal with is the method of data analysis. First, describe any overall procedures you used to convert your raw observations into analyzable data. - Next, tell us about the statistical analysis itself. If this is standard, describe it briefly (e.g., "All data were analyzed by two-way analyses of variance with sex of participant and mood induction as the independent variables"). If the analysis is unconventional or makes certain statistical assumptions your data might not satisfy, however, discuss the rationale for it, perhaps citing a reference for readers who wish to check into it further. - And finally, if the results section is complicated or divided into several parts, you may wish to provide an overview of the section Results: Presenting the findings - Begin with the central findings, and then move to more peripheral ones. It is also true within subsections: State the basic finding first, and then elaborate or qualify it as necessary. - Remind us of the conceptual hypothesis or the question you are asking - Remind us of the operations performed and behaviors measured - Tell us the answer immediately and in English - Now, and only now, speak to us in numbers - Now you may elaborate or qualify the overall conclusion if necessary - End each section of the results with a summary of where things stand - Lead into the next section of the results with a smooth transition sentence - by announcing each result clearly in prose before wading into numbers and statistics, and by summarizing frequently, you permit a reader to decide just how much detail he or she wants to pursue at each juncture and to skip ahead to the next main point whenever that seems desirable. - Whenever possible, state a result first and then give its statistical significance, but in no case should you ever give the statistical test alone without interpreting it substantively Figures and Tables - Figures and Tables. Unless a set of findings can be stated in one or two numbers, results that are sufficiently important to be stressed should be accompanied by a figure or table summarizing the relevant data. The basic rule of presentation is that a reader be able to grasp your major findings either by reading the text or by looking at the figures and tables - Within the text itself, lead the reader by the hand through a table to point out the results of interest

Mixed Within-Subjects and Between-Subjects ANOVA

AKA - mixed ANOVA - mixed within-groups and between-groups ANOVA - repeated measures ANOVA with a between-subjects factor (see the sketch below)
Statistical assumptions
1) normal distribution of DV scores at each round (check skewness)
2) sphericity (Mauchly's test and Greenhouse-Geisser epsilon) - only evaluate if 3 or more levels
- if two levels for within-subjects IV, sphericity is not an assumption, as this only involves a single variance of difference scores
- if only 2 within-subjects levels, interpret "sphericity assumed" F tests
3) independence of participants for the between-subjects IV, and non-independence of participants for the within-subjects IV
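A minimal sketch of a mixed ANOVA on long-format data, assuming the third-party pingouin package is available; all data values and column names are invented for illustration:

    import pandas as pd
    import pingouin as pg   # third-party package; assumed installed

    # Hypothetical long-format data: each subject measured in both rooms
    # (within factor), with gender as the between factor
    df = pd.DataFrame({
        "subject": [1, 1, 2, 2, 3, 3, 4, 4],
        "room": ["cold", "hot"] * 4,
        "gender": ["m", "m", "m", "m", "f", "f", "f", "f"],
        "anger": [4, 8, 5, 9, 5, 2, 4, 1],
    })
    aov = pg.mixed_anova(data=df, dv="anger", within="room",
                         subject="subject", between="gender")
    print(aov)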

Repeated Measures ANOVA

AKA - within-groups ANOVA - within-subjects ANOVA
IV: Each participant serves in 2 or more rounds/conditions/groups
DV: Quantitative (repeatedly measured 2x or more)
Considered an omnibus test if 3 or more conditions/groups/rounds
- follow up with post hoc (pairwise) tests if sig. - that is, to see which rounds/groups significantly differ for every combination (see the sketch below)
- with exactly two rounds, don't need post hoc comparisons
Conceptual
- H0: Main effect not sig. (all means are same)
- H1: Main effect sig. (at least 2 means are sig. different)
Inferential (generalizing to pop.)
- H0: μ1 = μ2 = ... = μk
- H1: at least two μ's significantly differ
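A minimal sketch of Bonferroni-adjusted post hoc pairwise comparisons following a significant omnibus test, assuming hypothetical scores and the scipy library:

    import numpy as np
    from itertools import combinations
    from scipy import stats

    # Hypothetical scores for 8 participants across three rounds
    data = {"round1": np.array([5, 6, 7, 4, 6, 5, 7, 6]),
            "round2": np.array([6, 7, 7, 5, 7, 6, 8, 6]),
            "round3": np.array([7, 8, 9, 6, 8, 7, 9, 8])}

    pairs = list(combinations(data, 2))
    alpha = 0.05 / len(pairs)   # Bonferroni: .05 / number of comparisons
    for a, b in pairs:
        t, p = stats.ttest_rel(data[a], data[b])
        print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f}, sig = {p < alpha}")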

Quasi-Experiments: Comparison Time-Series Design

AKA comparison (groups) time-series design - extension of pretest-posttest control group design - also extension of interrupted time-series design to permit many pretests, many posttests, and 2 or more groups - e.g., increasing the age limit to buy alcohol from 18 to 21 - see a decrease in car crashes for younger age groups but not for older age groups - suggests raising the legal age decreased accidents - potential confounds - better car-safety laws, like required use of seatbelts - but one could argue these impact all age groups

Bem - shape of an article

An article is written in the shape of an hourglass. It begins with broad general statements, progressively narrows down to the specifics of your study, and then broadens out again to more general considerations. - intro begins broadly - it becomes more specific - until ready to introduce own study in conceptual terms - method and results sections are most specific - neck of hourglass - discussion section begins with implication of study - becomes broader

Warner - assumptions for paired samples t test

Assumptions for paired samples t test - assumptions about behavior of d scores
o Dependent variable must be quantitative
o Must have info about the same variable measured in the same units for the two situations
o Change scores must be independent of each other
o Change scores should be approximately normally distributed and should not have extreme outliers
o Sample 1 and Sample 2 scores should be linearly related and at least moderately positively correlated
o No person x treatment interaction effect - should not find that some people show very different responses to pain than others

Carry-Over (Order) Effects

Exposure to an earlier condition carries over and affects performance in the next condition
Practice effects - performance improvement in DV scores from one condition to the next - e.g., vocab test, treatment, vocab test - exposed to a similar DV vocab test, second performance probably better and score higher - same with intelligence tests
Fatigue effects - performance decline in DV scores from one condition to the next

Repeated Measures ANOVA (F test)

Dependent t test vs. repeated measures F test - the dependent t test is only appropriate for 2 groups/rounds due to the positive versus negative values of this distribution - the repeated measures F test could be used for 2 or more groups/rounds due to the analysis of variance across group means
F sampling distribution - graphically appears to be one-tailed - but statistically always tested as two-tailed F test - range: 0 to positive infinity - more likely to reject null hypothesis the larger the calculated F - if exactly 2 rounds, dependent t^2 = repeated measures F - e.g., dependent t = 5, repeated measures F = 25, and both have the same p value - t test compares 2 rounds, M1-M2 or M2-M1 - problematic when 2+ rounds: which round is subtracted from which? - F test positively skewed - because squared t - F distribution - 2+ rounds - analyzing variance attributed to differences in group means - see if sig. beyond chance - if 2 rounds, doesn't matter which test is used
F test = signal/noise = variance due to group differences in means / variance not due to group differences in means - variance due to differences in means across rounds (signal) - noise - variance not attributed to differences among group means (see the sketch below)
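A minimal numpy sketch of the partition described above, using hypothetical scores (8 participants x 3 rounds): SS total is split into between-rounds (signal), between-persons (removed), and error (final noise):

    import numpy as np
    from scipy import stats

    # Hypothetical scores: rows = 8 participants, columns = 3 rounds
    X = np.array([[5, 6, 7], [6, 7, 8], [7, 7, 9], [4, 5, 6],
                  [6, 7, 8], [5, 6, 7], [7, 8, 9], [6, 6, 8]], float)
    n, k = X.shape
    grand = X.mean()

    ss_total = ((X - grand) ** 2).sum()
    ss_rounds = n * ((X.mean(axis=0) - grand) ** 2).sum()    # signal
    ss_persons = k * ((X.mean(axis=1) - grand) ** 2).sum()   # removed noise
    ss_error = ss_total - ss_rounds - ss_persons             # final noise

    df_rounds, df_error = k - 1, (n - 1) * (k - 1)
    F = (ss_rounds / df_rounds) / (ss_error / df_error)
    p = stats.f.sf(F, df_rounds, df_error)
    print(f"F({df_rounds}, {df_error}) = {F:.2f}, p = {p:.4f}")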

Warner - disadvantages of paired samples

Disadvantages: order effects, carryover effects, fatigue, practice effects - When a repeated-measures study compares two different types of treatment or dosages of treatment, order effects can occur, in which outcomes differ depending on the order of presentation of those treatments - Counterbalancing - order-effect problems can be reduced by presenting treatments in different orders or sequences - e.g., if participants taste sweet and then dry wine, taste is confounded with order - To control for this confound, the researcher needs to present the wines in both possible orders - Participants randomly assigned to one of the two orders - Carryover effect occurs when the effect of the first treatment has not yet worn off when the second treatment is administered - wait long enough for the effects of treatment 1 to wear off - When participants are assessed at multiple points in time, they may change - threats to internal validity - potential confounds with type or amount of treatment - impossible to make causal attributions about manipulated treatment variables o If points in time are close together and the task is repetitive, participants may become bored or fatigued o If the outcome variable involves skill, there may be practice effects such that participants get better at the task with repetition o Experience with two diff treatments may enable participants to guess the research question o If points in time are far apart, participants become older - changes in responses because of maturation o Very far apart - enough time for events outside the study to impact participants o Repeatedly experiencing the same measurement - the process of being measured may itself cause a change in attitudes or behaviors o Drop out of studies that require two or more sessions - attrition - results may be generalizable only to people who are more cooperative and motivated or find the treatment easy to tolerate o Series of treatments administered - can be cumulative effects

Institutional Review Board (IRB): 3 General Categories of Review

EEF Exempt Research - no/minimal risk - shortest time to review (still need to submit an application for the IRB to decide) - e.g., public datasets, content analysis of websites or social media, unobtrusive observation of public behavior, questionnaires on nonsensitive topics (beliefs/attitudes), no-risk experiments - e.g., observe people in a food court, what type of food they're eating, not interviewing/no confederate Expedited Research - medium risk - moderate time to review - e.g., questionnaires on sensitive topics (e.g., sexual assault, illegal use of substances), medium-risk experiments Full Review Research - high risk - longest time to review (full board IRB - all committee members meet to discuss your application) - e.g., risky intervention, intrusive or invasive treatment (e.g., ingest substance, extract biological specimen - blood, saliva, breathalyzer) - e.g., Lac - breathalyzer outside frats - could have legal ramifications if participants drink and drive Risk - potential psych/physical distress or harm to participants, legal ramifications if participants identified, degree of deception, etc.

Quasi-Experiments: Examples

Experiment 1: False memories - pretest-posttest design - fake photograph is the treatment - embedding false memories - impart false memories using digitally altered photos - 30 students shown pics from childhood - shown fake photo of a hot air balloon ride - by end of week think they really rode a hot air balloon - shown 4 photos from childhood, 1 fake - 50% think they were in a hot air balloon - believe the photo is real Experiment 2: strange situation to assess attachment styles - pretest-posttest design (strange situation is treatment) - assess quality of child's attachment to caregiver - unfamiliar setting, how child reacts to a stranger, what happens when mother leaves and returns - interested in reunion - when mother returns, how baby reacts - comforted, not comforted, ignores Experiment 3 - split brain - posttest-only control group design (split-brain participants vs. normal-brain participant [show's host] on cognitive performance) - given tasks to perform - each hand does what its half of the brain sees - when a word is flashed to the right brain, participant reports not seeing it - but can draw the image with the left hand - the speaking left brain saw "stool"; the participant denied seeing "toad" (flashed to the right brain) yet could draw a toad - right brain sees faces, left brain sees fruit

3 Designs

Randomized experiment - essentially equalizes groups on background variables, so any difference can be attributed to the treatment - can more conclusively state IV caused DV Quasi - only as good as the variables matched on - other potential confounds at play - don't know if they are responsible for diff between groups Non-experimental - no matching - e.g., questionnaire study and data on SAT scores - potential confounds responsible for SAT scores - little basis for inferring a causal relationship and effect of IV on DV

Between-Subjects vs. Within-Subjects Designs: 5 Major Differences

ISPSC Between-subjects (N = 100, Scores = 100) 1) Independence of participants across conditions: should not be violated. Each participant serves in a single condition/group 2) Sample size: Requires larger total N (because each participant provides a score in only one condition/group) 3) Preexisting group differences: Initial differences across conditions/groups are a problem if random assignment not used 4) Standard error: Standard error (noise) is larger in statistical formulas (e.g., between-subjects ANOVA) - participants serving in different conditions, participant differences within groups 5) Carry-over effects: None, participants serve in only 1 condition Within-subjects (N = 50, Scores = 100) 1) Independence of participants across conditions: is violated by design. Each participant must serve in 2 or more conditions/groups 2) Sample size: Requires smaller total N (because each participant provides a score in every condition/group) 3) Preexisting group differences: Initial differences across conditions/groups not a problem, as the same individuals serve as their own "controls" by participating in all conditions (participants have the same race, ethnicity, personality, etc. in all conditions they serve) 4) Standard error: Standard error (noise) is smaller in statistical formulas (e.g., repeated-measures ANOVA) due to removal of the "stable individual differences" component 5) Carry-over effects: Performance in one condition may carry over to the next condition (if no counterbalancing) - how one performs in condition A, and the knowledge/experience gained there, can carry over and influence DV scores in condition B in within-subjects designs: conditions = groups = treatments = rounds = time points

Mixed Within-Subjects and Between-Subjects Design

IV1 (within-subjects): 2 or more rounds (each participant serves in all rounds) IV2 (between-subjects): 2 or more levels (each participant serves in only 1 level) DV: Quantitative 3 possible effects 1) Main effect for IV1 on DV 2) Main effect for IV2 on DV 3) IV1 x IV2 interaction on DV

Within-Subjects Design

Independent variable is a within-subjects factor: same participants are assessed 2x or more (on the same measure or DV) across conditions AKA - longitudinal design - some rare within-subjects designs require participants to repeatedly provide DV scores in a single testing occasion (but these are analyzed using longitudinal techniques) - repeated-measures design - DV repeated across diff conditions - within-groups design - time-series design (if many time points) - analysis of change (refers to change across time) A within-subjects design can be a randomized experiment (randomly assigned to receive condition A then B or B then A), quasi-experiment, or nonexperiment common types of within-subjects designs - pretest-posttest control group design (compare pretest vs. posttest) - pretest-posttest design - interrupted time-series design - comparison time-series design

Rosenthal - reporting of psych research

Intentional misrepresentation - The most blatant intentional misrepresentation is the reporting of data that never were - A somewhat more subtle form of intentional misrepresentation occurs when investigators knowingly allocate to experimental or control conditions those participants whose responses are more likely to support the investigators' hypothesis - investigators record the participants' responses without being blind to the participants' treatment condition, or research assistants record the participants' responses knowing both the research hypothesis and the participants' treatment condition. Unintentional misrepresentation - Recording errors, computational errors, and data analytic errors can all lead to inaccurate results that are inadvertent misrepresentations - errors in the data decrease the utility of the research and thereby move the cost-utility ratio (which is used to justify the research on ethical grounds) in the unfavorable direction. - The use of causist language, discussed earlier, is one example. Even more subtle is the case of questionable generalizability. Questionable generalizability Problems of authorship - it seems that we could profit from further empirical studies in which authors, editors, referees, students, practitioners, and professors were asked to allocate authorship credit to people performing various functions in a scholarly enterprise. Problems of priority - Problems of priority are usually problems existing between research groups. - Priority problems also occur in psychology, where the question is likely to be not who first produced a virus but rather who first produced a particular idea. Self-censoring - Some self-censoring is admirable. When a study has been really badly done, it may be a service to the science and to society to simply start over. Some self-censoring is done for admirable motives but seems wasteful of information - I would argue that such data should indeed be cited and employed in meta-analytic computations as long as the data were well collected. - Failing to report data that contradict one's earlier research, or one's theory or one's values, is poor science and poor ethics. - A good general policy—good for science and for its integrity—is to report all results shedding light on the original hypothesis or providing data that might be of use to other investigators External censoring - The first is evaluation of the methodology employed in a research study. I strongly favor such external censorship. If the study is truly terrible, it probably should not be reported. - Censoring or suppressing results we do not like or do not believe to have high prior probability is bad science and bad ethics - the ethical quality of our research is not independent of the scientific quality of our research.

Rosenthal - Ethics and research design

Poor quality of research design, poor quality of data analysis, and poor quality of reporting of the research all lessen the ethical justification of any type of research project. - Because students', teachers', and administrators' time will be taken from potentially more beneficial educational experiences. Because the poor quality of the design is likely to lead to unwarranted and inaccurate conclusions that may be damaging to the society that directly or indirectly pays for the research. In addition, allocating time and money to this poor-quality science will serve to keep those finite resources of time and money from better quality science in a world that is undeniably zero-sum. - think of our human participants as another "granting agency"—which, we believe, they are, since they must decide whether to grant us their time, attention, and cooperation. Part of our treating them as such is to give them information about the long-term benefits of the research. In giving prospective participants this information, we have a special obligation to avoid hyperclaiming. - Hyperclaiming is telling our prospective participants, our granting agencies, our colleagues, our administrators, and ourselves that our research is likely to achieve goals it is, in fact, unlikely to achieve. Causism refers to the tendency to imply a causal relationship where none has been established - Characteristics of causism include (a) the absence of an appropriate evidential base; (b) the presence of language implying cause (e.g., "the effect of," "the impact of," "the consequence of," "as a result of") where the appropriate language would have been "was related to," "was predictable from," or "could be inferred from"; and (c) self-serving benefits to the causist. - if because of the poor quality of the science no good can come of a research study, how are we to justify the use of participants' time, attention, and effort and the money, space, supplies, and other resources that have been expended on the research project? Payoffs for doing research - When individual investigators or institutional review boards are confronted with a questionable research proposal, they ordinarily employ a cost-utility analysis in which the costs of doing a study, including possible negative effects on participants, time, money, supplies, effort, and other resources, are evaluated simultaneously against such utilities as benefits to participants, to other people at other times, to science, to the world, or at least to the investigator. - Any study with high utility and low cost should be carried out forthwith. Any study with low utility and high cost should not be carried out. Studies in which costs equal utilities are very difficult to decide about. Payoffs for failing to do research - However, Rosnow and I have become convinced that this cost-utility model is insufficient because it fails to consider the costs (and utilities) of not conducting a particular study - In the examples considered so far, the costs of failing to conduct the research have accrued to future generations or to present generations not including the research participants themselves. But sometimes there are incidental benefits to research participants that are so important that they must be considered in the calculus of the good.

Bem - Introduction

Introduction Opening statement - The first task of the article is to introduce the background and nature of the problem being investigated 1. Write in English prose, not psychological jargon. 2. Do not plunge unprepared readers into the middle of your problem or theory. Take the time and space necessary to lead them up to the formal or theoretical statement of the problem step by step. 3. Use examples to illustrate theoretical points or to introduce unfamiliar conceptual or technical terms. The more abstract the material, the more important such examples become. 4. Whenever possible, try to open with a statement about people (or animals), not psychologists or their research (This rule is almost always violated. Don't use journals as a model here.) - The structure of the writing itself adequately defines the relationships among these things and provides enough context to make the basic idea of the study and its rationale clear. Introduction: Literature review - After making the opening statements, summarize the current state of knowledge in the area of investigation. What previous research has been done on the problem? What are the pertinent theories of the phenomenon? Although you will have familiarized yourself with the literature before you designed your own study, you may need to look up additional references if your results raise a new aspect of the problem or lead you to recast the study in a different framework. - avoid nonessential details; instead, emphasize pertinent findings, relevant methodological issues, and major conclusions. Refer the reader to general surveys or reviews of the topic if they are available. - In general, you should use form A, consigning your colleagues to parentheses. - If you take a dim view of previous research or earlier articles in the domain you reviewed, feel free to criticize and complain as strongly as you feel is commensurate with the incompetence you have uncovered. But criticize the work, not the investigators or authors. Ad hominem attacks offend editors and reviewers; moreover, the person you attack is likely to be asked to serve as one of the reviewers of your article - End the introduction with a brief overview of your own study. This provides a smooth transition into the method section, which follows immediately.

Quasi-Experiments

Manipulation/treatment but no random assignment - e.g., participants self-select themselves into treatment or control Ethical reasons - unethical to randomly assign students in the same class to receive a TA or not - thus, morning vs. night class on the same topic (1 class receives a TA vs. the other class no TA) - DV: compare the 2 classes on exam scores - this example is a "posttest-only control group design" - different groups (no random assignment) on posttest - may be threats to internal validity/confounds - students who sign up for the morning class may be more motivated - could be responsible for those students performing better on the test - other factors at play besides the manipulation of the IV Practical reasons - impractical to randomly assign people to receive a freeway billboard campaign or not - everyone who drives through the freeway is exposed to the PSA billboard campaign - DV: compare store condom sales before vs. after the PSA is introduced - this example is a pretest-posttest design - all participants administered pretest & posttest - threats to internal validity - e.g., confound: the year prior was closer to the beginning of the COVID pandemic, when fear of interpersonal contact reduced sexual activity - now, with vaccines, more sexual activity may be the true reason for the increase in condom sales (history effect) Internal validity - ability to infer a causal relation of IV on DV - low = non-experiments - middle = quasi-experiments - high = randomized experiments

Quasi-Experiments: Matching Design

Matching Designs = participants from different nonrandomized groups are matched on pretest scores to rule out potential confounds responsible for initial group differences and then measured on the posttest - so diff in pretest is not responsible for the posttest outcome Example: participants found to use a test prep company or not, compared on SAT verbal scores - use initial pool of N = 500 high school seniors who will take the SAT - review literature to identify potential confounds on the DV (to match across groups): gender and extraversion - researcher able to match 200 participants (treatment and control groups) on the 2 pretest variables (gender & extraversion) - the remaining 300 participants could not be matched (discarded) - after participants are matched on all relevant variables across both groups, compare the posttest scores - the idea: afterward, diff in mean SAT verbal scores can be attributed to test prep vs. no test prep Matching - hand matching (spreadsheet) = manually performed - an advanced form of matching performed with software is called "propensity score matching" - criticism: potentially many confounding variables that the researcher failed to match on
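A minimal hand-matching sketch in Python (pandas assumed; the column names, tolerance, and data are all hypothetical): each treatment participant is paired with a not-yet-used control participant of the same gender and a similar extraversion score, and anyone unmatched is discarded.

```python
import pandas as pd

# Hypothetical participant pool: id, group, gender, extraversion (1-10)
df = pd.DataFrame({
    "id":           [1, 2, 3, 4, 5, 6],
    "group":        ["treatment", "treatment", "control",
                     "control", "control", "treatment"],
    "gender":       ["F", "M", "F", "M", "F", "F"],
    "extraversion": [7.0, 4.5, 6.8, 4.4, 2.0, 9.0],
})

def match_groups(df, tol=0.5):
    treated = df[df["group"] == "treatment"]
    controls = df[df["group"] == "control"].copy()
    pairs = []
    for _, t in treated.iterrows():
        candidates = controls[
            (controls["gender"] == t["gender"])
            & ((controls["extraversion"] - t["extraversion"]).abs() <= tol)
        ]
        if candidates.empty:
            continue  # no suitable match: this participant is discarded
        best = candidates.iloc[0]
        pairs.append((t["id"], best["id"]))
        controls = controls.drop(best.name)  # each control used only once
    return pairs

print(match_groups(df))  # [(1, 3), (2, 4)]; ids 5 and 6 are discarded
```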

Bem - Method

Method - If you conducted a fairly complex experiment in which there was a sequence of procedures or events, it is helpful to lead the reader through the sequence as if he or she were a participant. First give the usual overview of the study, including the description of participants, setting, and variables assessed, but then describe the experiment from the participant's vantage point. Provide summaries or excerpts of what was actually said to the participant, including any rationale or "cover story" that was given - The purpose of all this is to give your readers a feel for what it was like to be a participant - Name all groups, variables, and operations with easily recognized and remembered labels. Do not use abbreviations (the AMT5% group) or empty labels (Treatment 3). Instead, tell us about the success group and the failure group, the father-watching condition and the mother-watching condition, the teacher sample and the student sample, and so forth. It is also better to label groups or treatments in operational rather than theoretical terms - The method and results sections share the responsibility for presenting certain kinds of data that support the reliability and validity of your substantive findings, and you must judge where this information fits most smoothly into the narrative and when the reader can most easily assimilate it. - Discuss participant dropout problems and other difficulties encountered in executing the study only if they might affect the validity or the interpretation of your results. - Manipulations and procedures that yielded no useful information should be mentioned if they were administered before you collected your main data; their presence could have affected your findings. Usually it will be sufficient to say that they yielded no information and will not be discussed further. - After presenting the methods you used in your study, discuss any ethical issues they might raise. If the research design required you to keep participants uninformed or even misinformed about the procedures, how did you tell them about this afterwards? - End the method section with a brief summary of the procedure and its overall purpose. Your grandmother should be able to skim the method section without reading it; the final paragraph should bring her back "on line."

Solutions to problems of experimenter expectancy bias

Monitoring - more careful observation of experimenters to ensure that their data transcription and analysis are accurate. This could be accomplished by recording the experimenter-participant interaction and comparing the results found by the experimenter with those obtained by unbiased observers of this interaction. Unfortunately, this process does not preclude the possibility of subtle, nonverbal cuing of the participant by the experimenter. While a more rigorous observation of the social exchange between experimenter and participant is certainly worthwhile, it does not completely solve the potential expectancy problem. Blind procedures - A more effective solution is the use of double-blind procedures, in which participants as well as the researcher responsible for administering the study are made unaware of which treatment condition participants are in. This is used to control for the effects of both researcher bias and participant cuing, and borrows heavily from pharmacological research. The most obvious way this control can be achieved is through the simple expedient of not informing the researchers about the aims of the research hypotheses or the condition to which the participants were assigned. This control is useful and widely used, because if experimenters do not know what is expected, they will be unlikely to pass on any biasing cues to participants. Similarly, any recording or calculation errors that might be made should be unbiased and thus would not systematically influence results. - Unfortunately, experimenters, even hired experimenters specifically shielded from information about the theory under development, have been shown to be hypothesis-forming organisms that over the course of an investigation might evolve their own implicit theory of the meaning and nature of the work they are performing. - Thus, the particular application of the experimental blind procedure would not seem to offer a real solution to the experimenter expectancy problem, unless the entire experiment could be completed before a researcher could realistically develop a series of implicit hypotheses - A slight variant of this procedure does not presume to eliminate knowledge of the research hypotheses from the investigators, but rather limits their information about the research condition to which any participant or group of participants has been assigned. Often, in a variant of the double-blind experiment, it is possible to test both treatment and control participants in the same setting, at the same time, without the experimenter's or participants' knowledge of the specific condition into which any individual falls Mechanized procedures - Instructions, manipulations, and procedures could be presented to participants via audiotape, videotape, or a computer program, with respondents' answers collected using these electronic devices, and data analyzed by an impartial computer. Procedures where data are collected "untouched by human hands" help render implausible the alternative hypothesis of experimenter expectancy if these devices are employed intelligently and not subverted by inappropriate actions on the part of the investigator. - As experiments administered via computer technologies usually involve using a mouse (or touchscreen) and a keyboard as a proxy for social interactions, such procedures can prove uninvolving, uninteresting, or unreal to the participant. These reactions in turn can give rise to a host of contaminating features. This need not always be the case. With adequate preparation, a realistic, interesting, and sometimes even educational experimental situation can be devised.
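A minimal sketch of the coded-condition idea in Python (the file name and condition codes are entirely hypothetical): the session script shows the experimenter only opaque condition codes, and the key that decodes them is set aside until after data collection.

```python
import json
import random

# Opaque codes stand in for condition names; this key is hypothetical
# and would be kept from the experimenter who runs the sessions.
codes = {"X7": "treatment", "Q3": "control"}
participants = [f"P{i:02d}" for i in range(1, 21)]

random.shuffle(participants)  # randomize who falls in each condition
half = len(participants) // 2
assignments = {p: "X7" for p in participants[:half]}
assignments.update({p: "Q3" for p in participants[half:]})

with open("condition_key.json", "w") as f:
    json.dump(codes, f)  # opened only after data collection ends

print(assignments)  # the session script displays only X7 / Q3
```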

Deception

NPA No deception - purpose of study is as it appears to participants - e.g., cocaine use questions are actually intended to assess cocaine use Passive deception - withhold information from participants about the real purpose of the study - e.g., cocaine use questions are actually assessing antisocial personality - e.g., place hand in a bucket of ice water as long as possible - participants not informed that the # of seconds they can keep a hand in ice-cold water is actually measuring pain tolerance Active deception - present misinformation to intentionally mislead participants about the real purpose of the study - e.g., place hand in a bucket of ice water as long as possible "to measure your enjoyment of aquatic activities" - misinforms participants, because the # of seconds they can keep a hand in ice-cold water is actually measuring pain tolerance - e.g., Asch (1956) line judgment studies in which confederates (by intentionally giving misleading responses) are actively deceiving the participant

Counterbalancing Always Needed?

Not necessary if: 1) no anticipated carry-over effects 2) long enough temporal gap from condition to condition (so that any potential carry-over effects have dissipated) - e.g., participants administered aspirin and assessed on headache relief, and a week (or longer) later administered Tylenol and assessed on headache relief 3) Researcher is interested in examining temporal effects - e.g., aging, growing, development and maturation across time - in these scenarios, impossible to counterbalance anyway 1) Quasi-experiments and randomized experiments - repeated measurements (with various interspersed treatments) - usually counterbalancing used (because not interested in order effects) - experience from one condition may carry over to the other - each condition occurs 1st, 2nd, 3rd in one of the sequences - hope to cancel out carry-over effects - randomly assign participants to one of the sequences - compare conditions collapsed across sequences 2) Nonexperiments - repeated measurements (no treatments) - usually counterbalancing not used (because interested in order/temporal effects: aging, growing, development, maturation) - interested in growth/developmental effects across time

Informed Consent

Participants are informed about what participation in the study will involve - to make the decision about whether or not to engage in the research - administered at the beginning - usually required - to waive it, must justify (e.g., reading the consent form would ruin the effect of the study, nonexperimental field observation)

Debriefing

Participants are informed of the true purpose of the research - administered at the end - usually required - to waive it, must justify (e.g., no deception, or deception is minimal risk) - more likely required if active deception is used or participants are placed at risk - mandated if deception is involved so participants know the true reason/purpose of the study

Common quasi-experimental designs

Pretest-posttest control group design (NR) - pretest used to verify that groups are initially equivalent, given no random assignment - strongest quasi-experimental design - e.g., mindfulness therapy vs. not - participants self-select to receive treatment, measured on depressive symptoms to make sure groups are similar in symptoms pre-treatment - one group receives therapy, the other does not - measured again on depressive symptoms Posttest-only control group design (NR) - no pretest used, as groups are presumed to be initially equivalent - might not be true in quasi-experiments - don't know if a difference is due to treatment - groups could have differed in depressive symptoms prior to treatment Pretest-posttest design - only 1 group in the study - compare pretest vs. posttest - all participants pretested, all receive treatment, all given posttest - perform paired samples t test to determine if pretest scores sig. diff from posttest scores (see the sketch below) NR = no random assignment (although a treatment is administered)
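A minimal pretest-posttest sketch in Python (SciPy assumed; the scores are hypothetical): one group is tested before and after treatment, and the two sets of scores are compared with a paired samples t test.

```python
from scipy import stats

# Hypothetical depressive-symptom scores for one group, pre and post
pretest = [14, 12, 16, 11, 15, 13]
posttest = [9, 10, 12, 8, 11, 10]

t, p = stats.ttest_rel(pretest, posttest)
print(f"t = {t:.2f}, p = {p:.3f}")  # sig. p suggests pre-to-post change
```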

Repeated Measures ANOVA: SPSS

The statistical assumptions for repeated-measures ANOVA were evaluated. The skewness index shows that the academic self-efficacy variables for freshman (0.00), sophomore (0.00), and junior (0.00) exhibited normal distributions. Mauchly's test of sphericity was not significant (p = .44) and the Greenhouse-Geisser epsilon was .75, with both indicating that the sphericity assumption was satisfied. The study involved a within-subjects design. The repeated measures ANOVA for the main effect of school year on academic self-efficacy was significant, F(2, 10) = 105.00, p < .05. Thus, post hoc (LSD) tests were performed. Specifically, students in their junior year (M = 7.00, SD = 2.68) compared to their freshman year (M = 2.00, SD = 1.79) scored significantly higher on academic self-efficacy, p < .05. Students attending their junior year scored significantly higher than their sophomore year (M = 3.00, SD = 2.76) on academic self-efficacy, p < .05. The remaining post hoc comparison was not significant. The results are graphed in Figure 1.

Counterbalancing

Use different sequences (preferably, participants randomized to different orderings) to control for (cancel out) carry-over effects e.g., half receive condition A then B and the other half receive B then A - cancels out practice/fatigue effects (see the sketch below) After data collection, statistically compare mean scores of condition A vs. B (each collapsed across its respective sequences) using a dependent t test or repeated-measures ANOVA
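A minimal randomization sketch in Python (participant labels hypothetical): half the sample receives the A-then-B sequence and the other half B-then-A, so practice/fatigue effects are balanced across orders.

```python
import random

participants = [f"P{i:02d}" for i in range(1, 11)]
random.shuffle(participants)  # randomize who gets which sequence
half = len(participants) // 2

sequences = {p: ("A", "B") for p in participants[:half]}
sequences.update({p: ("B", "A") for p in participants[half:]})
print(sequences)

# After data collection, compare condition A vs. B means (each collapsed
# across both sequences) with a dependent t test or RM ANOVA.
```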

Quasi-Experiments: Interrupted Time-Series Design

extension of the "pretest-posttest design" to permit many pretests and many posttests - relative degree of change due to treatment examined by comparing the many scores before and after treatment e.g., California's three-strikes law (life sentence for 3 "serious" crimes) - instrumental crimes: more likely premeditated (preplanning - robbery, burglary, grand theft auto) - after the law, crime rates for instrumental crimes decreased - suggests the three-strikes law was effective in reducing crime - violent crimes: less likely premeditated (crimes of passion - homicide, assault, rape) - no decrease after the law; with little planning involved, the law was not effective for violent crimes

