Research Methods Exam 2


endogenous change

change that is unrelated to experiment; natural development due to other factors - evaluating mentorship program for troubled youth might be hard bc it's natural for most youths to age out of delinquency anyway, whether they get outside help or not - three types of endogenous change a) testing: pre-test can influence post-test scores b) maturation: aging and increases in emotional maturity c) regression: cyclical or episodic trends

Hawthorne effect

changes that occur in experimental or control group simply as a function of being monitored by the research staff - might make them feel special or self-conscious

Panel Design (before&after)

-simplest b&a design, one group measured before a treatment/event/intervention and then again after - very weak & extremely difficult to rule out spuriousness - all kinds of things may have happened between pre&post tests - ways to strengthen: - use multiple samples (multiple-group b&a design) - use multiple pre&post tests on same sample (repeated-measures panel designs) - these designs can be an improvement, but having no control group is still problematic

Constructing Questions

A good strategy—when feasible and warranted—is to follow up closed-ended questions with open-ended ones - Open-ended questions allow for more detail and nuance, and the contrast between the answers to closed- and open-ended questions can be very interesting

What is a Survey?

A questionnaire containing single or multiple items tapping into concepts or events of interest to a researcher ◦Some of the items are straightforward ◦Others are complex (e.g., 10 items measuring different aspects of respondents' attitudes about the police (respectfulness, fairness, bias, etc.)) ◦Remember that we have talked about survey items before (single items, indexes, operationalization, etc.). You will need to know these previous concepts for the current section.

NON-EXPERIMENTAL DESIGN

A way of collecting data that doesn't attempt to test an intervention, treatment, or event - survey research to determine whether small-business owners would consider hiring convicted felons if the govt paid those employees' salaries - use of secondary data to determine the proportion of all felony assault cases that end in trials versus guilty pleas - qualitative research to learn about the ways youths living in high-crime areas try to avoid violent victimization

Omnibus Surveys

An example of an omnibus survey is the General Social Survey ◦May use a split-ballot design ◦Survey has different versions in which unique questions or different versions of questions are administered to randomly selected subsets ◦ Allows for: ◦ The inclusion of more questions without increasing the cost ◦ Comparisons of different versions of questions ◦A disadvantage may be "the limited depth that can be achieved in any one substantive area"

In-Person Surveys or Interviews

Researcher goes to respondents' homes and administers survey in person ◦Benefits ◦Can build good rapport to maximize response rates ◦Can use longer questionnaires ◦Can probe for details or clarify items when needed ◦Drawbacks ◦Cost and time ◦Potential for presence of interviewer to influence responses ◦Response rates if people in sample are rarely home or don't want to allow researcher in

causality in non-experimental example

causality in non-experimental designs - researcher is studying the relationship between violent VGs (IV) and aggressive behavior (DV) - but many other factors could contribute (age, gender, hours/day spent playing) - must control for these to isolate the possible relationship between VG and aggression • statistical controls, then, are independent variables that the researcher includes in the statistical analysis as a way of isolating the impact of the primary IV on the DV - getting rid of the influence of 3rd (4th, 5th, etc.) variables that are a threat to the nonspuriousness of the IV-DV relationship
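
A minimal sketch of what statistical control looks like in practice, assuming Python with pandas/statsmodels; the variable names (hours_gaming, age, male, aggression) and the simulated data are hypothetical, not from the textbook:

```python
# Minimal sketch (not from the textbook): statistical control via multiple regression.
# Variable names and the simulated data are hypothetical illustrations only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "hours_gaming": rng.uniform(0, 5, n),   # primary IV: daily violent video game play
    "age": rng.integers(12, 18, n),         # control variable
    "male": rng.integers(0, 2, n),          # control variable (0 = female, 1 = male)
})
# Simulated DV: aggression score influenced by the IV, the controls, and noise
df["aggression"] = (0.3 * df["hours_gaming"] + 0.2 * df["male"]
                    - 0.1 * df["age"] + rng.normal(0, 1, n))

# Adding age and male to the model "controls" for them: the hours_gaming
# coefficient now reflects its association with aggression holding age
# and sex constant, which helps rule out those spurious explanations.
model = smf.ols("aggression ~ hours_gaming + age + male", data=df).fit()
print(model.params)
```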

Mixed-Mode Surveys

To boost response rates, researchers might use multiple delivery modes ◦Send a mail survey but also provide a telephone number and web address if respondents prefer to do the survey by phone or online ◦Send both a mail survey and call and/or email a sampling unit (though be careful that each person only fills out one questionnaire)

stakeholders

people or groups who have control or influence over the program •Congress or another federal office •A federal funding agency (e.g., National Institute of Justice) •School-district superintendents and/or parents of schoolchildren •Stakeholders affect ER by setting the overarching agenda for the program •This program prevents children from joining gangs •This program reduces drug abuse among probationers •This program reduces crime in certain areas of the city •Stakeholders might also have personal or emotional investment in a program or policy

expectancies of experimental staff

the researcher's enthusiasm for the experimental condition (and lack thereof for the control) can influence his/her behavior & interaction w/ subjects - a physician who knows her patient is getting an active experimental drug (not a placebo) is extra positive & encouraging to that patient •solution: double-blind procedures

MORE CAUSALITY IN NON-EXPERIMENTAL DESIGNS

• association can be established but other criteria of causality are more challenging - TIME ORDER: can be difficult to establish that change in the IV preceded change in DV - NONSPURIOUS RELATIONSHIPS: can use statistical controls, meaning one variable is held constant so relationship between two or more other variables can be examined apart from influence of "control" variable, to reduce threat of spuriousness

before-and-after experiment (quasi)

• basic idea: assumption that most phenomena move through time in a random, nonsystematic fashion - but if an intervention/treatment or event occurs that impacts a certain phenomenon, noticeable change will occur - status before will be sharply different from status after *** distinguishing feature: no control group - same/multiple groups b&a an event or treatment/intervention (i.e. pre/post tests) - the idea is that if pre & post test scores are different, then the difference is attributable to the intervention/treatment/event

NONEQUIVALENT CONTROL GROUP DESIGNS (quasi)

• difference from experimental: no random assignment used to select treatment group/control group - instead, the treatment group is already selected and researcher employs matching technique to create control group • individual matching: researcher finds each person in the treatment group a counterpart who is similar to them but did not receive treatment • aggregate matching: researcher finds comparison group similar to treatment group on average rather than person-by-person basis

association & causal effects

• established by random assignment - control group provides info on what would have happened w/o intervention. (ceteris paribus)

nonspuriousness & causal effects

• random assignment eliminates many extraneous influences that can create spurious relationships

Note on these

• remember the first 3 are necessary though insufficient conditions for causation; if they aren't present, causation hasn't been established - the other two help a researcher build a logical argument in favor of causation. they don't prove anything, but they help: - circumstantial evidence

before & after design reviews

• they are good at allowing association to be tested - significant change on the post-test relative to the pre-test indicates association - time-ordering is built into design - but spuriousness is lingering threat

Evaluation Research

•Not just outcomes, though; evaluation research also focuses on processes •That is, not just the intended (or unintended) effects of the program, but also the implementation of it •Was the protocol followed? Were the staff members qualified? Were there unanticipated problems during implementation? •Process can affect outcome!

Why Evaluate?

•Programs to prevent or reduce crime cost a lot of money—need to know if they work •State or federal funding agencies might require evaluation of new or trial programs •Many programs have been proposed, too—need to know which ones to keep and which to discontinue •Don't want to keep using an ineffective program •Effective or promising programs should be transplanted to other areas or populations •Sometimes programs can have unanticipated (negative) side effects •Boot camps that actually increase youths' aggression •Prisoner rehabilitation programs that encourage antisocial attitudes or behaviors •Need to make sure not only that a program is having its intended effect, but also that it is not having any unforeseen consequences

selection bias

•Selection bias occurs when program participation is voluntary, the result being that there are unknown but probably important differences between those who do the program and those who don't •Selection bias is a big deal, something that good researchers work hard to prevent

evaluation & researcher's control

•Thus, any given ER design is not totally under the researcher's control; the research strategy is affected by stakeholders, and stakeholders might affect the program process as it develops, which then impacts the design •However, researchers are generally not powerless. They can help educate stakeholders on what is needed for a valid program evaluation. A healthy stakeholder-researcher relationship is one where both parties respect and accommodate each other's needs

TESTING FOR CAUSAL EFFECTS

•experimental research/designs: research designs that meet the criteria of association, time ordering, and nonspuriousness - experiments have three features: - an experimental group and one or more control groups - random assignment to groups - assessment of change in the dependent variable after the intervention (pre-tests and post-tests) •HARD TO ADDRESS MECHANISM AND CONTEXT ELEMENTS IN EXPERIMENTAL DESIGN

threats to internal validity & generalizability

•internal validity (causal validity): the truthfulness of an assertion that A causes B. •experiments are the best way to minimize threats to internal validity, but not perfect - experiments suffer from questions of generalizability and how well they will apply in the real world •quasi-experiments have less internal validity but better generalizability and are more realistic, but don't do a good job at isolating causal effects - threats to internal validity revolve around the comparability & quality of the experimental & control groups (selection bias, endogenous change, external events/history, contamination, treatment misidentification)

Quasi-experimental designs

•not as rigorous as experiments & usually fail to rule out spuriousness (bc of researcher's lack of control over the manipulation of the experimental condition) - but allow research in situations where experiments are impossible or unethical, so they are a good substitute - also, bc they usually take place in "real world" they might be better at establishing context - controlled studies can be too sanitized, divorced from context

5. CONTEXT

•sometimes, certain relationships are dependent upon a given set of circumstances - that is, there are situational factors that must be present for a cause-and-effect sequence to occur - context is a way of adding layers/nuances to a causal relationship - a lot of times, the IV's impact on the DV is assumed to hold for every single person (or case) in the sample - but sometimes there have to be other factors present for the IV's effect to play out

generalizability

in particular, experimental subjects are usually recruited rather than sampled, so it's often not clear how representative they are • generally, results produced by carefully controlled studies are said to be at least roughly generalizable

quantitative (nomothetic) causation

involves the belief that variation in an independent variable will be followed by variation in the dependent variable when all other things are equal

TIME-SERIES DESIGNS (before&after)

measuring the same sample repeatedly b&a an event, generally 30+ measurements before and after - useful for policy analysis; can find out the impact of a large-scale event, such as a new law/policy
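
A minimal sketch of the before-and-after comparison at the core of a simple time-series design, using hypothetical simulated monthly counts (36 months on each side of a new law); a real analysis would also model trend and seasonality rather than just comparing means:

```python
# Minimal sketch: the before-and-after logic of a simple time-series comparison.
# The monthly counts are simulated and purely hypothetical; a real analysis
# would also account for trend and seasonality rather than just comparing means.
import numpy as np

rng = np.random.default_rng(1)
pre = rng.poisson(lam=50, size=36)   # 36 monthly counts before the new law
post = rng.poisson(lam=42, size=36)  # 36 monthly counts after the new law

print(f"mean per month before: {pre.mean():.1f}")
print(f"mean per month after:  {post.mean():.1f}")
print(f"change:                {post.mean() - pre.mean():+.1f}")
```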

Constructing Clear and Meaningful Questions

"All hope for achieving measurement validity is lost unless survey questions are clear and convey the intended meaning to respondents" (p. 213) - The data you collect from a survey project will be extremely useful or totally worthless, depending on how good a job you do at the questionnaire design ◦You must use the same questions with all respondents (one survey instrument) ◦Different people must understand each question in the same way ◦You won't be there to rephrase a question if someone doesn't understand it ◦Don't assume respondents know the same phrases, expressions that you do

Omnibus vs. Topic Surveys

- Most surveys are topic surveys; that is, the researcher has a specific topic/question/idea in mind and builds the entire survey around that central theme - Attitudes toward police, knowledge about local court or sentencing practices, drug use, violent victimization experiences, etc. - The other option is an omnibus survey, where the questionnaire contains items about a variety of topics (i.e., there is no central theme)

time order & causal effects

- can be established by pretests and posttests

systematic differences

- could arise in the absence of random assignment - say you are testing whether violent VGs encourage aggressive behavior in male adolescents, so you set up an experimental condition w/ a violent VG & a control condition w/ a non-violent VG. gather a group of subjects; do NOT let them choose which group they are in. solution: flip a coin - this eliminates systematic differences

CAUSATION: FIVE CRITERIA

- each criterion is important or necessary but not SUFFICIENT for causation to be proven - first 3 are absolutely required and other 2 are helpful in lending confidence to causal assertions •WHEN ONE OR MORE CRITERIA ARE NOT PRESENT, CAUSAL CONCLUSIONS ARE IMPOSSIBLE OR, AT LEAST, HIGHLY SUSPECT

1. EMPIRICAL ASSOCIATION

- first & foremost, the two variables have to be associated - the relationship can have a few different looks/forms - the point is that the DV does something predictable in response to a change in the IV - that is if the IV changes (increases or decreases), the DV does too •association is the 1st necessary BUT insufficient condition for causation bc to claim that an IV causes DV, must first show they are related

Nonexperimental tests:

- lack all 3 elements of experimental designs (no control group, no random assignment, no pre/post assessment of an intervention) - often not concerned w/ testing effects of interventions/events - study phenomena as they naturally occur - studying attitudes, behaviors, experiences, etc.

THE MINNEAPOLIS DOMESTIC VIOLENCE STUDY (field experiment)

- looked at the effectiveness of methods used by police to reduce domestic violence - misdemeanor assault calls where victim & offender were present when police arrived (51 POs; 330 cases; 17-month-long study) - used 1 of 3 approaches for handling DV calls in cases where officers had probable cause to believe an assault had occurred - send the abuser away for 8 hrs, advice and mediation of disputes, or make an arrest - 6-month follow-up period - interviews w/ victims & offenders, official records check - arrest was found to be the most effective police response STRENGTHS: study design LIMITATIONS: generalizability; 6-month follow-up; arrest policy (overnight stay in jail); based on deterrence theory - replicated in 6 diff locations - showed results varied by locale

causation in CCJ research

- most applicable to explanatory research and program evaluation - research where the primary interest is in demonstrating that a particular independent variable causes a particular dependent variable - not important, usually, to exploratory or descriptive research - but is hard to prove; many things are related to each other, but this doesn't mean one causes the other

RANDOM ASSIGNMENT

- this is NOT the same as random sampling - random assignment in no way ensures representativeness or generalizability • randomization is useful for ensuring internal validity, not generalizability - random sampling is useful for ensuring that the research subjects are representative of a larger population, not internal validity - essentially, what random assignment does is help make sure there are no systematic differences between the experimental and control groups
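
A minimal sketch of random assignment (as opposed to random sampling), assuming a hypothetical pool of 40 recruited subjects and the video-game example above; the shuffle is just the "coin flip" done in code:

```python
# Minimal sketch: random assignment of a recruited (not randomly sampled) pool
# of subjects. The subject IDs and group sizes are hypothetical.
import random

random.seed(42)
subjects = [f"subject_{i}" for i in range(1, 41)]  # 40 recruited volunteers

random.shuffle(subjects)             # the "coin flip," done all at once
experimental_group = subjects[:20]   # e.g., violent video game condition
control_group = subjects[20:]        # e.g., non-violent video game condition

# Groups are now equivalent on average (internal validity), but nothing here
# makes the 40 volunteers representative of any larger population.
print(len(experimental_group), len(control_group))
```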

Purposes of Survey Research

- type of nonexperimental research design ◦intended to do things like: ◦Measure the incidence and prevalence of certain phenomena (descriptive research) ◦Examine relationships (empirical associations) between two or more phenomena ◦Test hypotheses or examine research questions - Key commonality here: Researcher is not manipulating the units (people, places, etc.) or the conditions. Instead, observing experiences, conditions, and events as they occur naturally

Causation (Causal Effect)

- when variation in the independent variable is followed by variation in the dependent variable, when all other things are equal - means the IV causes the DV - there is more than association

Avoid Confusing Phrasing or Vagueness

"Be brief and stick to the point" ◦When possible, give respondents a specific timeframe or reference period (e.g., "Within the past month" or "In the past 6 months") ◦Define terms or avoid jargon altogether ◦No: "Have you been robbed?" ◦Yes: "In the past 6 months, has anyone taken money or other valuable items from you by hurting or threatening to hurt you?"

Likert-Type Response Categories

Best to avoid being overly restrictive in the options you give respondents—allow them nuance ◦"Yes or No" or "Agree or Disagree" are very blunt; they force people to take a strong position on something that they might actually not have a strong opinion about ◦Better is to offer a Likert-Type response scale, where they get a range (4 or more) of options ◦"Strongly agree / Agree / Disagree / Strongly disagree"

Outcomes

Impact on the cases; measurement of effectiveness

Outputs

Direct products of service delivery

Avoid Double-Barreled Questions

Double-barreled questions contain more than one concept or idea ◦No: "I believe we should stop spending so much money building prisons and put it into building more schools" (p. 216) ◦Yes: "I believe we spend too much money on prisons" and "We should spend more money on schools" ◦The answer options can also be problematic ◦No: "Do you know anyone who has ever used cocaine?" (p. 217) Yes No I have used cocaine

Exhaustive

Exhaustive: Every possible response option is represented; all respondents have an option that applies to them ◦No: "What is your education level?" High school diploma Some college Bachelor's degree ◦Yes: "What is your education level?" Less than high school HS diploma Some college Bachelor's degree or higher

Tips for Questionnaire Development

Maintain focus: Stick to the subject matter at hand. Every item should be directly related to the central topic ◦Build on existing instruments, No need to reinvent the wheel ◦ Also, sometimes a researcher has to use previously established scales; there are certain areas of research that have set scales for measuring certain concepts (e.g., self-control) ◦ Existing scales may have already been assessed for validity and reliability—new scales are untested and potentially untrustworthy ◦ Makes for consistency and facilitates comparisons across studies if all researchers studying a certain theoretical concept use the same scale to measure that concept ◦Consider translation ◦ This goes back to knowing your population. To ensure the survey is accessible to all possible respondents, you may need one or more non-English versions ◦ Translation should be a three-step process ◦ Translate from English into the non-English language ◦ Have a different person translate the non-English back into English ◦ Compare the original English and the re-translated English, and edit the translation where needed

other organizational guidelines

Major topics should be divided into sections, with short introduction to each one ◦Instructions should be minimal, concise, and neutral ◦Font, font size, and spacing should make for an easy-to-read document; spread things out to ensure people don't get confused or frustrated ◦When possible, include numbers along with response options to facilitate coding later ◦Also a good idea to periodically remind respondents that their answers are anonymous and/or confidential —try to make sure they trust you so they will answer all the questions and will do so honestly ◦Mailed questionnaires often have a cover letter that explains who is conducting the survey and assures respondents of anonymity and/or confidentiality ◦The cover letter often suffices as the informed consent ◦Remember this is human subjects research and governed by all applicable laws, policies, and ethical guidelines

Pre-test vs Post-test

PRE-TEST: measures each subject's level/rating on the DV before the treatment is applied POST-TEST: measures that level/rating after the treatment • PRE-TEST: not required in a true experiment but helpful to indicate change over time

3. NONSPURIOUSNESS

SPURIOUSNESS: an apparent association between two variables is actually untrue; both variables are caused by a third variable - even if the first two criteria are established, the researcher must still rule out spuriousness - establishing causation is about ruling out alternative explanations (i.e. controlling for or eliminating a 3rd variable that might be the true cause of the apparent relationship) • a researcher could try to reduce the threat of spuriousness in an evaluation of the rehab program by randomly assigning inmates to participate in the program or not (rather than basing it on volunteers) - could test for causal effects of rap music on youth violence by gathering a sample of both male and female children who listen to rap - males commit more aggression than females, so if the rap-aggression relationship holds for both males/females, we have at least ruled out that alt explanation

2. TEMPORAL ORDERING

TEMPORAL: having to do with time, one event happening before the other •TIME ORDER: the change in the IV happened BEFORE the change in the DV, not after or at the same time •failing to prove time ordering undercuts causal hypotheses • youth w/ aggression problems might be drawn to rap and violent vid games - the aggressive behavior happened first and then the music/games

4. MECHANISM

THE SPECIFIC WAY THAT THE IV CAUSES THE DV •children of incarcerated parents are more likely to commit crime later as adults. why? - having one parent incarcerated reduces the remaining parent's ability to supervise the child, to keep him out of trouble - having a parent incarcerated removes an important role model from the child's life/disrupts the socialization process - an incarcerated parent is a bad influence on a child, encourages anger and antisocial behavior

Process

The delivery of the prescribed service

Inputs

The persons or units that enter (or are put through) the program; also the staff and others who run the program

external events/history effect

events outside of the study that influence the DV - different from endogenous change bc history effects are unpredictable & have nothing to do w/ the respondents themselves - researcher is conducting a one-year study of the effects of nutritional counseling on people's eating habits. 6 months in, the nation is swept by a kale fad - the more carefully controlled the study, the less likely external events will invalidate the causal conclusions of an experiment

experimental vs control group

experimental: group receiving treatment/intervention control: group not receiving treatment/intervention - subjects in experiments may not know which group they are in or that there are even multiple groups (deception)

qualitative (idiographic) causation

involves explanation that is concerned with context and is deterministic

FIELD EXPERIMENTS

research designs that use the 3 features of experiments (experimental/control groups; random assignment; assessment of change) but that are conducted in an uncontrolled setting (the "real world") rather than controlled one - helpful because - generalizability of study results is sometimes questionable (to what extent does something in a lab play out in real life) - controlled experiments are often not possible; only way to conduct a study might be in the field

what if an experiment isn't possible?

sometimes experiments are difficult to conduct bc of resource limitations or bc the phenomenon under study simply cannot be tested experimentally - can't randomly assign children to be abused in order to determine the effect on later criminality - Quasi-experimental designs: research studies that contain one or two - but not all 3 - elements of experimental designs

Statistical Control (causality in non-experimental)

using statistical analyses to compensate for a lack of design controls; using data, rather than design, to help establish time ordering and/or nonspuriousness - statistical controls are less rigorous and trustworthy than design controls, but they are all that can be done in non-experimental designs - quantitative only; qualitative is different

contamination

when one of the groups affects or is affected by the other - two types: - compensatory rivalry: members of the control group find out what the experimental group is getting (e.g., through a leak of information, or when group members are mistakenly allowed to contact one another) and try harder or attempt to get what the experimental group is getting - demoralization: someone from the control group finds out the experimental group is getting something good, feels deprived, & deliberately performs worse

placebo effect

when ppl in control condition feel improvement even absent the experimental treatment - solution: multiple control groups

treatment misidentification

when subjects receive an unintended treatment A) expectancies of experimental staff B) placebo effect C) Hawthorne effect

Selection Bias

when there are systematic differences between the experimental/control groups - minimized w/ random assignment, but subjects might still drop out of one condition more than the other (differential attrition) - hard to get ppl to stay in a 6-month drug-treatment program - thus, groups that started out comparable can become non-comparable over time - not a real issue for lab studies but matters for long-term studies - also problematic for quasi-experimental designs (i.e. nonequivalent control groups) - a pretest helps determine whether selection bias exists & control for it

association

where two variables are related but there is no causal effect

Outcome Evaluation ("Did the Program Work?")

•Also called impact evaluation or analysis •If you know the process went smoothly, proceed to evaluating outcomes •This will generally be quantitative methods or mixed methods (combination of both quantitative and qualitative designs) •First, pick an outcome measure (i.e., the variable you will use to operationalize "effectiveness") •# of self-reported arrests in 6 months; number of actual rearrests in 6 months; number of self-reported instances of drug use per month •Second, pick a design •Experimental is always best, when possible •Quasi-experimental is second best •Non-experimental if other types are unfeasible or inappropriate •The trick is to build a design with strong internal validity (internal validity is prized over generalizability in ER) •You want to be able to say that any improvements seen in the treatment group are for sure the result of the program •That is, rule out problems with temporal ordering and spuriousness

Black Box vs. Program Theory Approaches

•Basically, do you care why a program works (or doesn't)? •Recall empirical observations and theoretical explanations: Theory is a proposed explanation for an observed, empirical phenomenon •The researcher can either stick to empirical observations (program worked or not) or can attempt to explain those observations using theory •If researcher opts to stick to empirical observations, s/he will merely measure whether or not the cases changed "between the time they entered the program as inputs and when they exited the program as outputs" (p. 332) •e.g., pre-test vs. post-test, or experimental group vs. matched controls •No attempt to figure out why it worked (or didn't) •This is the black box approach •Or, a researcher can attempt to form an explanation for why the change occurred (or didn't occur) •What is (in)effective about this program? Approach? Protocol? Implementation? Characteristics of the people/places in the treatment group?

CAUSALITY IN NON-EXPERIMENTAL DESIGNS

•CROSS-SECTIONAL RESEARCH: design in which all data are collected at once •LONGITUDINAL RESEARCH: design in which data are collected at two or more points in time - most CCJ research is cross-sectional and does not use features of experimental or quasi-experimental designs (i.e. controls), so it must compensate for the lack of internal devices that guard against problems of time ordering and spuriousness - association is usually easy to establish in non-experimental designs - cross-sectional is easier, fewer resources needed - but longitudinal has many advantages - get to see how things play out over the long run - can possibly conduct some before-and-after testing

Evaluation Research vs. Data Collection

•Data-collection designs gather data for a variety of purposes •Description, exploration, explanation/hypothesis testing •One study can produce a data set that is used for multiple purposes (e.g., to describe a sample and also to test a hypothesis) •Evaluation research (ER) data are collected for one and only one purpose: To find out if the program or policy works •Also, ER is different because it can employ the different designs •That is, ER can be experimental, quasi-experimental, and/or non-experimental •ER can be quantitative, qualitative, or both •Any given ER design might include one or multiple elements of these three •Random assignment to treatment and control groups with a quantitative outcome measure (experimental), followed by in-depth interviews with members of each group to learn about their experiences (non-experimental) •In fact, strong ER designs do use more than one research approach •ER is also heavily applied. This means that it has direct implications for policy and real-world practice •By contrast, much other CCJ research is theoretical or somewhat removed from policy •Do multiple-homicide offenders specialize in homicide, or is their offending pattern as diverse as single-homicide offenders'? •Do inner-city youths who feel alienated from police engage in violence as a self-help strategy? •Of course, other CCJ research can (and should) be used to inform policy •But ER is applied directly

Feedback Process

•ER research might be ongoing as a program progresses •If so, then a feedback process might be used •Constant assessment of inputs, outputs, and outcomes that might be used to adjust one or more aspects of the program •Feedback processes also distinguish ER from other types of research: ER is dynamic, meaning it moves and is ongoing. It can be used to make changes to the program as the program progresses (no need to let a dysfunctional program continue being ineffective if it can be improved!)

Design Decisions

•Each ER project requires the researcher to make decisions about how to evaluate the program •Will researcher use a black box or a program theory approach? •Whose goals matter more: researcher's or stakeholders'? •Should the methods be quantitative or qualitative (or both)?

Efficiency Evaluation ("Is the Program Worth It?")

•If a program is found to have positive outcomes, the next question is, "Are the outcomes good enough to justify the cost?" •Effective doesn't mean best or only; even when a program is found to work, there might be a more effective or less expensive one out there. Do comparisons! •Summary: Effectiveness at achieving desired outcome is only part of the story; need to also find out if •The benefits are at least equal to (if not greater than) the costs •There is no alternative that would produce a better outcome for the same or fewer costs •Cost-benefit analysis (p.331) •Cost-effectiveness analysis (p.331)
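
A minimal sketch of the benefit-cost arithmetic behind these comparisons, using entirely made-up dollar figures, just to show what "benefits at least equal to costs" means numerically:

```python
# Minimal sketch of benefit-cost arithmetic with entirely made-up numbers.
program_cost = 200_000   # hypothetical annual cost of running the program
benefits = 350_000       # hypothetical monetized benefits (e.g., crimes avoided)

net_benefit = benefits - program_cost        # 150,000
bc_ratio = benefits / program_cost           # 1.75

print(f"net benefit: ${net_benefit:,}")
print(f"benefit-cost ratio: {bc_ratio:.2f}")  # > 1 means benefits exceed costs
```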

Evaluation Alternatives: Process Evaluation

•Is the Program Working as Planned? •This is process evaluation: •"Systematic attempt by researchers to examine program coverage and delivery" (p. 326) •Before you tackle the question of outputs or outcomes, you need to know if the program is actually being implemented correctly. Process evaluation is important in and of itself, but it is also imperative for understanding the final evaluation of the outputs and outcomes •If a program appears to not work, is this because •It's a bad program? OR •It was implemented badly? •Don't want to throw out a potentially good program just because someone messed up the implementation •Process evaluation often relies upon qualitative methods •Participant observation, intensive interviewing, focus groups •Also the collection of secondary data •Criminal histories of the inmates in a program; college education level of staff members; etc.

Evaluation Basics

•Just like with other research, ER requires the researcher to decide how to conduct the study and what to measure •Whether clients are drug free at 6mos. following the program? 1 year? •Whether clients are rearrested? Reconvicted?

Causality w/ mechanism & context

•MECHANISM: the use of intervening variables can help determine how variation in the IV affects variation in the DV •CONTEXT: can be developed, especially when surveys are administered in different settings & with different individuals

Questionnaire Development and Assessment

◦"The questionnaire is the central feature of the survey process" ◦The quality of survey research hinges upon 1. The appropriateness of the sampling technique 2. The quality of the questionnaire ◦The number and type of questions and the overall length of the questionnaire will depend upon the research topic being examined ◦a written questionnaire should take no more than 20 minutes to fill out, and a phone survey should take up to 30 but absolutely no more than 45 minutes to complete ◦Surveys that are too long will fatigue the respondents; they will stop paying attention, start answering questions in a sloppy way, or give up entirely

Group-Administered Surveys Benefits/Drawbacks

◦Benefits: ◦High response rate (harder for people to refuse when they are stuck in a room with the researcher right there) ◦Efficient (can get lots of responses relatively quickly) ◦Cost effective ◦Drawbacks: ◦Can be hard to determine what the population is ◦Possibility of selection bias ◦Possibility that respondents feel coerced or pressured ◦Respondents might not trust that researcher is independent; might not be fully honest in their answers

open vs. closed ended Qs

◦Closed-ended: Most common type. Respondents get a set of answer options and have to pick one. ◦Open-ended: Used in more qualitative surveys. Allowing respondents to write in their own answers and offer as much detail as they want. Less common and often less practical— respondents' answers can be very long, and the number of questions you can ask is limited.

Demographic Questions

◦Demographics are personal characteristics of the respondents ◦Such as age, race, sex, marital status, education level, income level, political affiliation, religious affiliation ◦They are important control variables in statistical analyses (recall Ch. 6 discussion) ◦They help isolate the impact of the independent variable (IV) of interest on the dependent variable (DV)

Survey Research Designs: Which one is Best?

◦Depends: ◦Goals of research ◦Knowledge of the population ◦Amount of funding and time researcher has ◦Researchers have to weigh all factors and options

Web-Based Surveys

◦Generally, a bad idea ◦No sampling frame, hard or impossible to get a representative sample, often don't even know who the population is, selection bias is almost certainly present. ◦Standard problems such as non-completion, and researcher cannot control who fills out the questionnaire. ◦But, in some cases, they can be useful ◦An identified population of which most members have known email addresses ◦e.g., work-climate surveys where all employees can be emailed the link ◦Web-based might become more useful and mainstream as the U.S. population moves toward 100% home/personal access

Ethical Issues in Survey Research

◦Generally, survey research poses minimal risk ◦No manipulation or experimentation taking place ◦Little or no possibility of psychological or physical harm ◦Little or no coercion ◦But respondents still have to be protected, especially if you are asking them to divulge personal information

Combining Questions into an Index

◦Multiple items are usually better than single items ◦Capture nuances; doesn't force people into a corner ◦Better content validity (Ch. 4): capture entire range of a concept, not just a single piece ◦Better reliability (Ch. 4): minimize idiosyncratic variation for a more stable, consistent, trustworthy estimate across all respondents ◦Researchers who use multiple items will later sum or average those items to form a scale (or index) ◦A 4-point Likert-type response option (strongly agree, agree, disagree, strongly disagree) gets assigned numbers (such as, strongly agree = 4; agree = 3; disagree = 2; strongly disagree = 1). Numerically, then, higher numbers represent stronger agreement ◦Then the researcher sums the scores for the entire set of items
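
A minimal sketch of scoring and summing Likert-type items into an index, assuming Python/pandas and a hypothetical 3-item police-attitudes scale (the item names and responses are made up; the 4-to-1 coding follows the example above):

```python
# Minimal sketch: recoding Likert responses and summing them into an index.
# The item names and responses are hypothetical; the 4-to-1 coding follows
# the example in the notes.
import pandas as pd

coding = {"strongly agree": 4, "agree": 3, "disagree": 2, "strongly disagree": 1}

responses = pd.DataFrame({
    "police_fair":       ["agree", "strongly agree", "disagree"],
    "police_respectful": ["agree", "agree", "strongly disagree"],
    "police_unbiased":   ["disagree", "strongly agree", "disagree"],
})

# Recode each item numerically, then sum across the items for each respondent;
# higher index scores represent more positive attitudes toward the police.
numeric = responses.replace(coding)
responses["police_attitude_index"] = numeric.sum(axis=1)
print(responses["police_attitude_index"].tolist())  # [8, 11, 5]
```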

Mutually Exclusive

◦Mutually exclusive: No overlap between the categories ◦No: "How many miles do you drive per week?" 0 - 3 3 - 6 6 - 9 9+ ◦Yes: "How many miles do you drive per week?" 0 - 3 4 - 7 8 - 11 12+

Group-Administered Surveys

◦Not always possible, but sometimes a good way to gather data ◦Idea: Researcher brings questionnaires to a setting where people are a captive audience, distributes them, and collects them on the spot ◦Police officer roll calls, student classes, jails, etc.

Telephone Surveys

◦Researcher calls respondents and administers the survey over the phone, recording the respondents' answers ◦Often used in conjunction with random-digit dialing sampling method ◦Benefits: ◦Higher response rates than in mail surveys ◦Less chance for respondents to skip items ◦Lower cost than mail surveys (no printing, postage, etc.) ◦Easier to accommodate those who prefer a non-English version ◦Greater control over the administration process (compared to mail surveys) ◦Two primary drawbacks or concerns ◦Reaching the sampling units (i.e., respondents): How to contact people both via landlines (if that is their primary household phone) and cell phone (if the house has no landline)? ◦ This is usually addressed using random digit dialing ◦ But even RDD will miss many people (e.g., those whose cell doesn't have a local prefix) ◦Getting good response rates ◦ Need people to: (1) answer the phone; and (2) complete the entire survey ◦ Researchers use a call-back rule (e.g., 5 call-backs per number) ◦ Try to make the survey as interesting and non-threatening as possible

Filter Questions

◦Researchers can make surveys longer and more detailed/complex by making certain items applicable only to certain subsets of respondents ◦Filter questions discover whether respondents possess a particular characteristic of interest. This creates a skip pattern where some respondents answer contingent questions and other respondents skip a certain section and move on to the next. ◦If you want to know about robbery victims' satisfaction with the police response, then filter out respondents who have not been victimized ◦"Question 1. In the past 6 months, has anyone taken valuable items or money from you by hurting or threatening to hurt you? Yes (please fill out the following questions) No (please skip to Question 2)"

Combining Questions into an Index

◦Single items versus multiple items: How to choose? ◦Single items are easier and faster ◦But we have seen the problems with items like this ◦Forces people to take a strong stance on an issue they might have complex feelings about ◦ What if they support the death penalty in some cases but not all? ◦Doesn't give people enough information to make knowledgeable decisions ◦ Do they know that sex-offender residency restrictions don't work? ◦Subject to idiosyncratic variation: respondents' answers are their own interpretations of the item, or the emotional reaction they have (you don't want this in your survey items!!!) ◦ When Respondent A reads the sentencing question, he visualizes a hardened, violent offender who preys on innocent victims. Respondent B thinks about her 20-year-old nephew who has always been a nice kid, but got into drugs and is having a rough patch, including some run-ins with the system

Avoid Making all Answer Options Disagreeable

◦Sometimes a question is worded in a way that makes all answer options uncomfortable or inapplicable ◦Classic example: "Do you still beat your wife?" ◦Or: "Police officers should reduce the amount of excessive force they use." Agree Disagree ◦Instead: "If a police officer uses excessive force, he or she should be charged with a crime." Agree Disagree ◦Reduce the likelihood of agreement bias by presenting both sides of attitude scales in the question ◦The question should also be phrased so as to avoid embarrassing the respondent ◦"Have you ever taken an item from the store without paying for it?" rather than "Have you ever shoplifted?" ◦Keep language neutral; avoid stigmatizing or judging respondents

Ethical Issues: Protection of Respondents

◦Survey respondents need ◦Informed consent. Especially if the questionnaire has items that could trigger distressful thoughts or bad memories (e.g., asking people about sexual-assault victimization) ◦Anonymity and confidentiality. Generally, survey research does not collect identifiers (name, address, phone number, etc.). This keeps answers anonymous. The researcher also maintains confidentiality by keeping original documents private and not distributing the data in an unauthorized manner. ◦Protection can be enhanced by considering the necessity of certain items ◦Don't ask sensitive or embarrassing questions unless they are absolutely critical to the research

Organization Matters

◦The researcher has to choose the order of questions and the order of sections (groups of items or sets of multiple-item indexes) ◦Ordering can matter a great deal! Earlier questions can set a tone or frame of reference that then influences respondents' answers to later questions ◦Question 5. "Recently, a widely circulated video showed surveillance footage of a football player punching his girlfriend in the face, knocking her out. This player received deferred prosecution for this offense, and must undergo counseling for domestic violence. Do you think this sentence is harsh enough?" Question 8. "Generally speaking, do you think that courts are too lenient on men accused of physically assaulting their female partners?" ◦Question 8 has been contaminated by Question 5 ◦All sensitive or controversial items should be immediately preceded by neutral items ◦If there are multiple sensitive items, they should be far apart and separated by several neutral items ◦Ease respondents in by putting neutral, non-threatening items at the front of the survey. Then introduce a few that are somewhat personal, then delve into the sensitive ones ◦You have to earn their trust

Survey Research Designs (aka, Administration Options)

◦There are multiple modalities (or manners of administration) in survey research ◦The administration method a researcher chooses depends on the circumstances ◦Funding/costs, time, size of population and size of sample, access to sampling frame, etc

Additional Guidelines for Fixed-Response Questions

◦There are several rules for constructing fixed-response (closed-ended) questions ◦These rules not only help make sure the questionnaire is clear and concise, but they help boost validity and reliability

Fence-Sitting vs. Floating

◦There is a dilemma surrounding including a response option that allows respondents to express no opinion on a subject ◦"Unsure, don't know, undecided" ◦On one hand, this can be a cop out—easy way for respondents to reply to the question without giving it any thought ◦On the other hand, you don't want to force people who genuinely are undecided or unknowledgeable to choose a category that is inapplicable to them (or to leave the question blank, meaning that you lose data) ◦First, try defining terms or providing background about the issue ◦"The Juvenile Justice Bill going before Congress this month will do x, y, and z... Do you think this will help reduce crime committed by juveniles?" ◦Second, consider the substance of the question itself. If it is pure opinion, then maybe you don't need a neutral category; if it requires knowledge, maybe you do need one ◦"During my most recent contact with a police officer, the officer was polite" probably doesn't need a neutral category ◦"Police in the city where I live are polite to people they encounter" might need a "don't know" category

Mail Surveys

◦These are self-administered: respondents have total control over when they complete it, how long it takes, and when they send it back. They read the questions themselves (hopefully carefully!) and answer them (hopefully honestly!) ◦Main drawbacks to mail method ◦Response rate (percentage of people in the sample who complete and return the questionnaire). Hard to get people to fill the survey out and return it. ◦Missing data (items or sections that respondents skip either accidentally or on purpose). Sometimes people miss or skip items. This can cause minor or major loss of data. ◦Also, researcher has little control over who fills out the questionnaire and how long it takes them to return it

Avoid Negatives and Double Negatives

◦Try to phrase everything in positive form; negatives and, especially, double negatives are confusing ◦Negative: "People in my neighborhood don't get along very well." Agree Unsure Disagree ◦Positive: "People in my neighborhood generally get along." Agree Unsure Disagree ◦Double-negative: "Do you disagree with a policy of not letting people charged with felonies plead guilty to misdemeanors?" Yes No ◦Instead: "Would you agree with a policy prohibiting people charged with felonies from being allowed to plead guilty to misdemeanors?" Yes No

Survey Research: Overview

◦Usually, the researcher can only gather a sample, but her/his real interest is in the population ◦Sample of juveniles -> all juveniles ◦Survey research designs usually aim for generalizability, meaning that they rely heavily upon random sampling and good representativeness of those samples ◦The basic idea: Gather a (random) sample of people (respondents) and give each respondent a survey questionnaire to fill out

Benefits of Survey Research

◦Versatility: lots of options. All types of topics, questions, and so on ◦Can get information that can't be collected in experiments or quasi-experiments ◦ Might not be feasible or ethical ◦ Researcher's interest might be in something not subject to manipulation ◦Efficiency: Relatively cost-effective; not free, but often more affordable than other types of research ◦Quick—questionnaires are generally designed to take no more than 20 - 30 minutes for respondents to fill out (though some last longer) ◦Generalizability: Proper sampling techniques can be used to get good samples from any population of interest ◦Generalizability is easy to establish, as long as sampling is good (compare this to experimental and quasi-experimental research) ◦Only method to get at people's attitudes, beliefs, expectations, experiences, and natural (i.e., not in the lab) behaviors ◦If you want to know what someone thinks or has experienced, you have to ask

