POLI SCI 170 FINAL EXAM


theory

a reasoned and precise speculation about the answer to a research question, including a statement about why the proposed answer is correct

disproportionate stratified sampling

a sampling procedure in which strata are sampled disproportionately to population composition

sampling without replacement

a sampling procedure whereby once a case is selected, it is NOT returned to the sampling frame, so that it cannot be selected again

sampling with replacement

a sampling procedure whereby once a case is selected, it is returned to the sampling frame, so that it may be selected again

random selection

a selection process that gives each element in a population an equal or known chance of being selected

observable implications

empirical patterns you'd expect to see if your hypothesis is correct

how we filter new information

expectations, preconceptions, motivations, and prior beliefs influence our interpretation of new information. information consistent with our beliefs is taken at face value, while we discount information that contradicts our beliefs. ambiguous info: interpreted in a manner that fits our preconceptions. unambiguous info: we feel compelled to apply more scrutiny to it, dissect it, question it, etc.

factorial design

experiment in which 2+ variables are manipulated. multiply (# levels factor 1) x (# levels factor 2) x (# levels factor 3) x etc. to figure out how many total experimental conditions. ex: an experiment with 3 variables, each with 3 conditions, has 3x3x3 = 27 conditions
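
A minimal Python sketch (not from the course materials) of how the condition count multiplies in a factorial design; the factor names and levels are hypothetical.

```python
from itertools import product

# hypothetical factors for a survey experiment, each with 3 levels
factors = {
    "regime_type": ["democracy", "anocracy", "autocracy"],
    "wealth": ["poor", "middle-income", "rich"],
    "alignment": ["ally", "neutral", "rival"],
}

# every experimental condition is one combination of levels
conditions = list(product(*factors.values()))
print(len(conditions))  # 3 x 3 x 3 = 27 conditions
```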


codebook

explains the rules for establishing each variable's values or scores, the source of the data for that variable, and what all of the possible scores or values mean

inductive reasoning

explanation emerges from the study of specific cases and is then generalized. 1. get data 2. look for patterns 3. formulate theory. pitfall: since you built the theory to fit existing data, you need new data to test it: identify cases to develop hypotheses, then find new cases to test your hypotheses on

descriptive statistics

procedures for organizing and summarizing data, ex: number of wars per year

measurement

process of assigning numbers or labels to units of analysis (people, objects, events, etc.) to represent their conceptual properties

respect for persons

the Belmont principle that individuals must be treated as autonomous agents who have the freedom and capacity to decide what happens to them, and researchers must protect those with diminished autonomy

beneficence

the Belmont principle that researchers have an obligation to secure the well-being of participants by maximizing possible benefits and minimizing possible harms

sampling error

the difference between an actual population value (e.g., a percentage) and the population value estimated from a sample

residuals

the difference between observed values of the dependent variable and those predicted by a regression equation

coverage error

the error that occurs when the sampling frame does not match the target population. before the survey: does the sampling frame only include people with a special characteristic? after the survey: does the sample match the population demographically? (the mismatch could be on unmeasurable characteristics like motivation, personality type, etc.) if there is coverage error: before, improve your sampling frame; after, weight your survey (higher weight to underrepresented groups)

target population

the population the researcher would like to generalize their results to

reverse causation

the possibility that Y could cause X

ceteris paribus assumption

"all other things being equal", values of less than 1 IV are changing at any given time, can't determine the effect we observe is from changes to IV1 or IV2

robustness check

"self-check" models using different indicators of key concepts; check to see if the findings are robust to different operationalizations of the variable

assumptions of mill's methods

- assumes that we have a full list of candidate causes to begin with - assumes multiple causation is not a problem (one cause for each effect) --> if these don't hold (which they often don't!), we are learning about associations, not causes

problems with survey research

- respondent dishonesty (perhaps due to SDB); solutions: guarantee anonymity, use forgiving wording, prime honesty - inattention, which can lead to a "response set" where respondents favor one choice or click things randomly; solutions: time the responses and throw out ones that are too fast, use attention checks - people aren't aware of their own reasons/preferences - imperfect recall

4 hurdles to establishing causality

1. is there a correlation between X & Y? 2. can we eliminate reverse causation? 3. is there a credible causal mechanism that links X to Y? 4. have we controlled for a confounding Z that makes the connection between X & Y spurious?

4 attributes of science

1. theory: interconnected set of propositions that explains how/why something occurs 2. data: information recorded from observation 3. systematic observation & analysis: methodical, organized, orderly, minimizes bias & error, easier to replicate 4. logical reasoning: inductive & deductive reasoning

common problems with archives

1. access to collections: difficult to access records in non-democracies 2. incomplete records: individuals/governments may choose not to record sensitive information, or actors can later claim that records are incomplete 3. redaction: to redact - select or adapt (as by obscuring or removing sensitive info) for publication or release 4. interpretation of information: you need to know a lot of background to interpret a document correctly & scholars can have motivated biases (possible solution: active citation)

mill's method of agreement

1. "agreement" on DV (same value) 2. IVs must be dissimilar except for one, which is the same --> here, we learn about whether A causes Y (as long as we have not left out an important IV or confounder.) choose cases that have the same IV and the same DV but all other alternative explanations are different.

the process of quantitative analysis

1. choose domain & unit of analysis 2. measure variables 3. prepare/process data 4. inspect & modify data 5. preliminary hypothesis testing 6. consider potential problems 7. conduct multivariate testing 8. interpret results

attributes of a theory

1. expectation (prediction/hypothesis) 2. causal mechanism 3. assumptions 4. scope conditions

Trachtenberg's method (archives & history)

1. figure out what the most important secondary books on a topic are 2. read those books; for each book, think carefully about the logic of the argument & the evidence, and draw conclusions about who is more persuasive 3. where appropriate, go back to the original sources and think about whether they're being interpreted appropriately. limits of this method: limited by what secondary sources have been written (authors may not have covered your topic extensively, may not use all of the available archival evidence, may not have a bias-free agenda); therefore this approach is not preferable, but can be useful if accessing archives is not possible

three ways research can end up flawed

1. fraud 2. p-hacking 3. unintentional bias. the peer review process rewards findings that are novel and statistically significant

4 methods of field research

1. interviews and surveys - advantages: very direct method, ask exactly what you want to know - disadvantages: dishonesty, lack of access to the people you want 2. participant observation: researcher becomes a participant in the culture or context being observed and tries to observe the world through the eyes of a member of that culture - advantages: gain access you otherwise wouldn't, see things you weren't looking for (great for induction) - disadvantages: your presence may alter the phenomenon you're trying to study; inherently subjective, hard to be objective when you know the people 3. direct observation: unobtrusively observe a phenomenon, watch rather than take part (ex: videos, observe protests, ride subways, public meetings, etc.) - advantages: easier than participant observation, presence less likely to affect outcome 4. collection of written materials: sometimes the ones you want are unavailable, may need to travel to get them, may need to gather pamphlets, flyers, etc.

types of experiments

1. lab experiments - recruit subjects to a lab and manipulate some treatment (subjects are usually convenience samples) 2. survey experiments - randomly manipulate some aspect of the survey (but ask the SAME questions!!) 3. field experiments - randomly assign information "in the field" (like who gets information about something) 4. "natural" experiments - leverage naturally occurring randomization, cross-case comparisons

3 general features of survey research

1. large number of respondents chosen to represent a population of interest 2. structured questionnaire or interview procedures that ask a predetermined set of questions 3. the quantitative analysis of survey responses

mill's joint method

1. method of agreement --> provides evidence that C is a sufficient cause of E, aka C is enough to get E on its own 2. method of difference --> provides evidence that C is a necessary cause of E, aka can't get E without C

steps of the scientific method

1. observe some aspect of the universe 2. generate a hypothesis 3. use hypothesis to make a prediction 4. test those predictions by experiments or further data collection 5. repeat: replicate, question, redesign

mill's method of difference

1. outcome (Y) must be different across the two cases 2. IVs are similar except for one, which is different --> A is associated with Y, and -A is associated with -Y. IV and DV are different but all other alternative explanations are the same.

getting honest answers

1. promise to keep responses confidential 2. emphasize the importance of honesty for academic research ("priming" honesty) 3. signal that people have a wide range of opinions about these topics, thereby reducing social pressure to give a "correct" answer 4. in addition to asking what they think, might ask what other people would think/do 5. in a semi-structured interview, allow the respondent to bring the sensitive subject up first 6. if the sample is large enough, use something like a "list experiment"

types of natural experiments

1. randomizing device (with a known probability) divides a population (ex: lottery) 2. jurisdictional studies - make use of geographic divisions to study similar populations that find themselves by chance on opposite sides of some divide 3. omnibus "other" category - e.g., effect of bad weather on the economy

3 advantages of probability sampling

1. removes the possibility that researcher bias will affect the selection of cases 2. because of random selection, you can apply mathematical probability to estimate the expected accuracy of the measures you calculated from your sample 3. because of random selection, you can generalize to the larger population

crucial things to avoid for qualitative methods

1. selecting on the DV (or "sampling on the DV"): when you choose cases to study based on the value of the DV (ex: studying why people become successful) 2. selection effects & selection bias: (natural) selection of cases screens out cases whose values on key variables are above or below some implicit threshold; result: a pool of cases whose values are abnormal when compared to the actual underlying population. FALSE: if the process by which certain cases are "screened out" is natural, then this mitigates the problem of selection bias.

two "flavors" of natural experiments

1. some "perturbation" (treatment) is applied - initial conditions don't matter as much here - might compare treatment/no treatment (e.g., areas of Africa subjected to slave trading), or compare different types of treatment (e.g., two halves of hispanionala colonized by france/spain) 2. no exogenous treatment, but different initial conditions - pacific islands different in geography but settled by single colonizing group (same group colonizes, but colonies have different resources) --> focus is almost always on explaining differences in outcomes

when to use qualitative research

1. to generate hypotheses 2. as a plausibility probe to assess the validity of the theory/causal mechanism 3. to provide a hard test of a theory: if the theory explains this difficult case, then it gains credence 4. to provide an easy test of a theory: identify cases where the theory really should operate 5. to test a theory when the domain/scope results in just a small N of cases. also useful for inductive theory building, when the research Q involves a very rare event (small number of cases), when measuring the IV or DV is particularly difficult (like measuring beliefs of leaders), and for investigating causal mechanisms

three distinctive aspects of process tracing

CPOs (causal process observations), description ("static" description is a crucial building block in analyzing the processes being studied), and sequence (process tracing gives close attention to sequences of independent, dependent, and intervening variables)

posttest-only control design

DV is measured after the experimental manipulation

pretest-posttest control group design

DV is measured before and after the experimental manipulation

operational definition

a detailed description of the research procedures necessary to assign units of analysis to variable categories (so you can retrace steps & justify coding and so other researchers can replicate and validate the measure)

probability distribution

a distribution of the probabilities for a variable, which indicates the likelihood that each category or value of the variable will occur

longitudinal trend study

a longitudinal design in which a research question is investigated by repeated surveys of independently selected samples of the same population. draw new sample from the same population over time and ask the same question ex: presidential approval over time

longitudinal panel study

a longitudinal design in which the same individuals are surveyed more than once, permitting the study of individual and group change. same people, same questions over time ex: looking at beliefs of individuals over time

cluster sampling

a probability sampling design in which the population is broken down into natural groupings or areas called clusters, and a random sample of clusters is drawn

weighting

a procedure that corrects for the unequal probability of selecting one or more segments of the population
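
A small illustration of the idea, assuming the common post-stratification rule that a group's weight is its population share divided by its sample share; the numbers are made up.

```python
# hypothetical: a group is 30% of the population but only 15% of the sample
population_share = 0.30
sample_share = 0.15

weight = population_share / sample_share
print(weight)  # 2.0 -> each respondent from the underrepresented group counts twice
```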

confidence interval

a range (interval) within which a population value is estimated to lie at a specific level of confidence. e.g., 95% confidence that the true population statistic falls within that range.

observation

a single instance of your DV

standard error

a statistical measure of the "average" sampling error for a particular sampling distribution, which indicates how much sample estimates will vary from sample to sample. standard error goes down as sample size goes up
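
A worked Python sketch (illustrative only) of the standard error of a sample proportion and the 95% confidence interval and margin of error built from it; the poll numbers are hypothetical.

```python
import math

# hypothetical poll: 520 of 1,000 respondents favor a policy
n, favor = 1000, 520
p = favor / n

se = math.sqrt(p * (1 - p) / n)  # standard error of a sample proportion
moe = 1.96 * se                  # margin of error at 95% confidence
print(f"{p:.3f} +/- {moe:.3f}")  # 0.520 +/- 0.031
print((p - moe, p + moe))        # the 95% confidence interval
```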

domain

a study's spatial and temporal scope

sample

a subset of cases selected from a population

sampling distribution

a theoretical distribution of sample results for all possible samples of a given size

pretest

a trial run of an experiment or survey instrument to evaluate and rehearse study procedures and personnel

N

an abbreviation for the number of observations on which a statistic is based

double-blind experiment

an experiment in which neither the experimenter nor the participants know which participants received which treatment

secondary analysis

analysis of survey or other data originally collected by another researcher

open-ended questions

answer in your own words; write/say whatever you want. pros: nuanced answers. cons: more expensive, harder to get respondents, harder to analyze in an unbiased way

qualitative research question

asks about social processes or meaning and cultural significance of people's actions

quantitative research question

asks about the relationship between two or more variables

random assignment

assignment of research participants to experimental conditions by means of a random device such as a coin toss
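
A minimal sketch of random assignment using Python's random module as the "random device"; the participant labels are hypothetical.

```python
import random

participants = [f"subject_{i}" for i in range(1, 11)]  # hypothetical subjects

random.shuffle(participants)  # the coin toss, in code
half = len(participants) // 2
treatment, control = participants[:half], participants[half:]
print(treatment)
print(control)
```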

test-retest reliability

association between repeated applications of an operational definition

essential empirical evidence for causal statements

association, direction of influence, elimination of plausible rival explanations

Hawthorne effect

being studied changes the subjects' behavior (though the original study had a poor research design, and new studies show that while there is a minor effect of being observed, it is not as big as once thought)

justice

belmont principle - benefits and burdens of research should be fairly distributed. need to make sure some classes are not being systematically selected simply because of their easy availability, compromised position, or manipulability; research should not provide advantages only to those who can afford them, and it should not unduly involve people from groups unlikely to benefit from its applications.

omitted variable bias

bias that occurs when confounders are not included in the analysis, producing a spurious relationship: mistakenly attributing a causal effect to X when it was really due to Z, OR failing to detect a causal effect when one really exists! arises when we fail to control for confounders

Tuskegee syphilis study

black, illiterate sharecroppers were subjects, told they had "bad blood" when they actually had syphilis and were not given the cure even though it existed

cons of probability sampling

can be expensive/inconvenient, not possible when target population is unknown, not desirable if language access is necessary, not desirable if studying a small number of cases and need to maximize leverage

straw-in-the-wind tests

can increase the plausibility of a given hypothesis (or raise doubts about it), but are not decisive by themselves. provide neither necessary nor sufficient evidence for accepting/rejecting a hypothesis, and they only slightly weaken rival hypotheses. weakest of the 4 tests.

multivariate

caused by more than one factor

editing

checking data and correcting for errors in completed interviews or questionnaires

close-ended questions

choose responses from those provided (multiple choice, scales, rankings, etc.) pros: compare across individuals, less room for bias, time-efficient. cons: the answers you provide affect the answers you get; respondents are forced to choose answers that don't necessarily reflect how they feel

assumptions

claims or beliefs (often implicit) in your theory about how the world works. need to identify controversial/important ones

partial regression coefficient / partial slope

coefficients in a multiple regression equation that estimate the effects of each IV on the DV when all other variables in equation are held constant

standardized regression coefficients

coefficients obtained from a norming operation that puts partial-regression coefficients on a common footing by converting them to the same metric of standard deviation units

advantages of survey methods

collect data on opinions & self-reported behavior, allows research at the "micro level", representative sample --> generalize to the population of interest

institutional review board (IRB)

committee at nearly all colleges responsible for reviewing research proposals to assess provisions for the treatment of subjects. makes sure research conforms to federal, state, and university guidelines.

listwise deletion

common procedure for handling missing values in multivariate analysis that excludes cases with missing values on any of the variables in the analysis. drop all observations with any missing data.
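
A small pandas sketch of listwise deletion (illustrative; the survey data are made up).

```python
import numpy as np
import pandas as pd

# hypothetical survey data with scattered missing values
df = pd.DataFrame({
    "age": [25, 41, np.nan, 60],
    "ideology": [3, np.nan, 5, 2],
    "turnout": [1, 1, 0, 1],
})

# listwise deletion: drop every row with a missing value on any variable
complete_cases = df.dropna()
print(complete_cases)  # only rows 0 and 3 survive
```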

index

composite measure of a concept constructed by adding or averaging the scores of separate indicators; differs from a scale which uses less arbitrary indicators

scale

composite measure of a concept constructed by combining separate indicators according to procedures designed to ensure unidimensionality or other desirable qualities

doubly decisive tests

confirms one hypothesis while eliminating all others (rare in social science). provides a necessary and sufficient criterion for accepting an explanation, giving strong inferential leverage: passing the test both confirms one hypothesis and eliminates all the others.

conflict of interest

conflict between research goals and other motives like financial gain, political interests, etc.

consistency checking

data cleaning involving checking for unreasonable patterns of responses, such as a 120-year-old who voted

confidentiality

data obtained cannot be shared with others without the participant's permission

wild-code checking

data cleaning involving checking for out-of-range or other "illegal" codes among the values recorded for each variable
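
A short pandas sketch (hypothetical data and code ranges) showing a wild-code check alongside the kind of consistency check described above.

```python
import pandas as pd

# hypothetical coded data: party is documented as codes 1-4, age in years
df = pd.DataFrame({"age": [34, 120, 57], "party": [1, 3, 9]})

# wild-code check: flag codes outside the documented range
bad_party = df[~df["party"].between(1, 4)]

# consistency check: flag implausible patterns, e.g. a 120-year-old voter
bad_age = df[df["age"] > 110]
print(bad_party)
print(bad_age)
```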

conceptualization

defining and clarifying the meaning of concepts

factual/procedural questions

describe facts of the world ex: "who was the 14th president?"

manipulated operations

designed to change the value or category of a variable


data cleaning

detection and correction of errors in a computer file that may have occurred during data collection, coding and/or entry



ordinal measurement

different numbers indicate the rank order of cases on a variable

validity

does the indicator capture the phenomenon we are interested in? does it measure what it is supposed to?

construct validation

does the measure correspond theoretically to what you are trying to measure? measurement validation based on an accumulation of research evidence indicating that a measure is related to other variables as theoretically expected

temporal domain

does theory hold in a particular time period?

spatial domain

does theory hold only in a particular place?

validity

does your indicator actually capture the concept you're interested in, and nothing else?

nonresponse bias

due to incomplete data collection; occurs when nonrespondents (sampled individuals who do not respond) differ systematically from respondents. to diagnose: follow up with people who refused or couldn't be contacted at first; are their answers and demographics similar to those of people who responded the first time?

within-subjects design

each person serves as both a treatment and a control; we compare outcomes for each individual, so we learn about individual-level causal effects

reactive measurement effect

effect in which participants' awareness of being studied produces changes in how they ordinarily would respond

ecological fallacy

erroneous use of data describing an aggregate unit; a failure in reasoning when you draw an inference about an individual based on aggregate data for a group. ex: assuming someone in a lucrative profession (i.e., doctor) makes a lot of money; assuming people in LA like ice cream more than people in Seattle because sales are higher there.

measurement operations

estimate existing values or categories - sources of measurement operations: verbal, observational, archival records

informed consent

ethical principle that individuals should be given enough information about a study, especially its potential risks and benefits, to make an informed decision about whether to participate

anonymity

ethical safeguard against invasion of privacy in which data cannot be identified with particular participants

field pretesting

evaluation of a survey instrument that involves trying it out on a small sample of persons

unidimensionality

evidence that a scale/index is measuring only a single dimension of a concept

internal validity

evidence that rules out the possibility that factors other than the manipulated IV are responsible for the measured outcome. threats to internal validity: selection, maturation, history, double-barreled treatments, information leakage. key question: could the treatment have manipulated something other than the intended IV?

strategic selection bias

ex: audience costs --> if leaders are punished for making threats and then backing out, they presumably will try to avoid doing that, so audience costs might exist but we never actually observe them

social desirability bias

tendency to exaggerate on questions where there is a clear socially acceptable answer, or to avoid the question; the tendency to present ourselves in a socially desirable way. stronger in more personal interactions; affects Qs about politics and attitudes

audit study

examines racial and other forms of discrimination by sending matched pairs of individuals to apply for jobs, purchase a car, etc.

inter-rater reliability

extent to which different observers or coders get equivalent results when applying the same measure

inter-coder reliability

extent to which different observers or coders get equivalent results when applying the same measure; have multiple people make the same measurement

external validity

extent to which experimental findings may be generalized to other settings, measurements, populations and time periods ex: testing on students when trying to study cops

convergent validation

extent to which independent measures of the same concept are associated with one another. compare your measure against other measures that try to measure the same thing

types of questions

factual/procedural, hypothetical, normative, empirical

data matrix

form of a computer datafile, with rows as cases and columns as variables; each cell representing the value of a particular variable (column) for a particular case (row)

histogram

graphic display in which the height of the vertical bar represents the frequency or % of cases in each category of an interval/ratio variable

relationship between reliability and validity

for a measure to be valid it must be reliable, but reliability doesn't ensure validity

inferential statistics

procedures for determining the extent to which one may generalize beyond the data at hand

deterministic relationships

if some cause occurs, effect will occur with certainty

regression line

geometric representation of a bivariate regression equation that provides the best linear fit to the observed data by minimizing the sum of the squared deviations from the line. does not tell you how closely two variables correspond; only tells you the slope of the line: how much a 1-unit increase in A changes B

measurement validity

goodness of fit between an operational definition and the concept it's supposed to measure

unstructured interview

guided by broad objectives, questions developed as interview proceeds

interval measurement

has the qualities of ordinal plus equal distances (intervals) between assigned numbers

quantitative methods and hurdles to causation

helpful for hurdles 1 & 4 (correlation and controlling for confounding variables), but less good for 2 & 3 (case studies, experiments, and interviews are better there)

ratio measurement

highest level of a measurement, which has the features of other levels plus an absolute (non arbitrary) zero point

causal process observations (CPOs)

highlights the contrast between (a) the empirical foundation of qualitative research and (b) the data matrices analyzed by quantitative researchers (which may be called data-set observations, DSOs)

structured interview

highly specific objectives, all questions written beforehand and asked in the same order, interviewer's remarks are standardized

case study

holistic analysis of a single person, group, or event by one or more research methods

degrees of freedom

how many independent pieces of information we have to use in calculating a statistic, or, in a qualitative context, in explaining an outcome. degrees of freedom = # of observations - (# of IVs + 1). want "positive" degrees of freedom for better estimates

empirical question

how the world is; how the world works (our focus). questions that can be answered by observing things in the real world (ex: what is the most common justification provided for going to war?)

normative questions

how the world should be (focus of political theory/philosophy)

null hypothesis

hypothesis that an observed relationship is due to chance; a significance test rejects the null hypothesis

double-barreled treatments

ideally, the wording would make it such that ONLY ONE THING changes between experimental conditions. ex of a double-barreled treatment: if some respondents see "prosperous democracy" and others see "poor authoritarian country", both wealth and regime type change at once

internal consistency

if using a composite measure like an index or scale, is there a general agreement among items? consistency of "scores" across all items of a composite measure (i.e., index or scale)

complexity

in a system (units/elements interconnected), chains of consequence extend over time and many areas. examples: diminishing returns to scale; the effect of one variable might depend on another (interactive effect); behavior changes the environment in which we act. implications: most behaviors will have multiple effects; actions might impact the environment, but actors will then respond to the new environment

difference between observational studies and experiments

in observational studies, nature defines the value of the IV. in experiments, you randomly assign the value of the IV, so you know the level of the IV is independent of any other factors, allowing you to rule out confounding. the use of experiments is limited by whether you can manipulate the key IV (treatment)

saturation

in purposive and theoretical sampling, the point at which new data cease to yield new information or theoretical insights

selecting cases for qualitative research

in a small-n context, non-probability methods of case selection are often better for addressing confounding. you can usually learn the most by comparing across two or more cases. in small-n research we cannot control using multiple regression, so we must compare cases that are similar on confounders but have different values on the main IV, the same as the method of difference

probabilistic relationships

increases in X are associated with increases (or decreases) in the probability of Y occurring

regression coefficient (slope)

indicates how much the DV increases or decreases for every unit change in the IV; the slope of the regression line. ex: for each percent increase in FSS, injury deaths increase 0.8 (per 100k people)

between-subjects design

individuals get assigned to groups: one group gets the treatment, one gets the control. compare across groups of people. we learn about an "on average" causal effect, but not about individuals

cover story

information presented to research participants to obtain their cooperation while disguising the research hypothesis

intervening variable

intermediate between 2 other variables in a causal relationship

explanatory survey

investigates relationships between two or more variables, often attempting to explain them in cause-and-effect terms

order effects

issue related to question ordering: people are more likely to pick the first option. solution: randomize the order

common rule

label given to the federal policy for the protection of human subjects

measurement error

lack of correspondence between a concept and measure that is due to problems with an operational definition or its application

smoking-gun tests

lack of a smoking gun doesn't imply innocence, but possession of one is damning (rare in social science). provides a sufficient but not necessary criterion for accepting the causal inference. more demanding than the hoop test. if a given hypothesis passes, it substantially weakens the rival hypotheses

field research

leaving your institution to collect data or information for a research project. key: trying NOT to influence the human subjects you are studying

statistical significance

likelihood that the association between variables could have arisen by chance. reported using the "p-value" -- "p" stands for the "probability" that the association occurred randomly, given the size of the effect, the variability of your data, and your sample size. the convention is that a p-value of .05 or smaller is "statistically significant at conventional levels"


human subjects

living individual about whom an investigator conducting research obtains: 1. data through intervention or interaction with the individual, or 2. identifiable private information

conceptual definition

meaning of a concept expressed in words that is derived from theory and/or observation (also called theoretical definition)

r2 (r squared)

measure of fit in multiple regression that indicates approximately the proportion of the variation in the dependent variable that is predicted or "explained" by the IVs. ex: r2 = 0.42 --> 42% of the variation in the DV is explained by the IVs

correlation coefficient

measure of the strength and direction of a linear relationship between two variables (ranges from -1 to 0 to 1). how closely two variables are associated: how close are X and Y to a perfect relationship?
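
A compact numpy sketch (hypothetical data) tying together the regression line, the slope, the correlation coefficient, and r2 from the cards above.

```python
import numpy as np

# hypothetical bivariate data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, deg=1)  # best-fit regression line
r = np.corrcoef(x, y)[0, 1]                 # correlation coefficient

print(slope, intercept)  # change in y per 1-unit increase in x, and the y-intercept
print(r, r ** 2)         # strength of the linear association, and r-squared
```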

standard deviation

measure of the variability or dispersion that indicates the average "spread" of observations about the mean

test-retest reliability

measure the same thing or person on different days; e.g., have the same person take the same survey on different days. rule of thumb: there should be a correlation of at least .80 between measures of the same person/unit

multiple regression

method for determining the simultaneous effects of several IVs on a DV
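
A bare-bones numpy sketch of multiple regression via least squares (hypothetical data); the fitted coefficients are the partial slopes described earlier.

```python
import numpy as np

# hypothetical data: two IVs and one DV, n = 5 cases
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 10.9])

# design matrix with a column of 1s for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])

# least-squares fit: intercept plus one partial slope per IV
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)  # [intercept, effect of x1 holding x2 constant, effect of x2 holding x1 constant]
```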

nonprobablity sampling

methods of case selection other than random selection (i.e., a convenience sample)


cross-sectional design

most common survey design, in which data are gathered from a sample of respondents at essentially one point in time ex: feeling thermometer about current class

hoop tests

must "jump through the hoop" in order to still be under consideration. set more demanding standard than straw-in-the-wind. does not provide sufficient criterion for accepting the explanation, but is necessary. can't confirm a hypothesis, but can lead you to reject one.

spurious relationship

non-causal statistical association between 2 variables produced by a common cause

percentage distribution

norming operation that facilitates interpreting and comparing frequency distributions by transforming each frequency to a common yardstick of 100 units (percentage points) in length; the number of cases in each category is divided by the total and multiplied by 100

nominal measurement

numbers serve only to label categories of a variable; categories must be both exhaustive and mutually exclusive. exhaustive: includes all possible values or categories of a variable so that every case can be classified. mutual exclusivity: measurement requirement that each case can be placed in only one category of a variable

antecedent variable

occurs before, and may be a cause of, the DV or IV

counterfactual logic

reasoning of the form: if it had been the case that C (or not C), it would have been the case that E (or not E), judged with the help of other causal theories, laws, regularities, or principles. starting with your hypothesis, suppose you believe C was a cause of E: 1. you can research actual cases that resemble your case, except that in these new cases C is absent (or different), and then check the correspondence between C & E in all cases 2. you can imagine that C was absent, and ask whether E would have occurred in that counterfactual case

valuable theory characteristics

parsimony: theory explains more using less; predictiveness: can help us understand cases beyond those from which we derived it; falsifiability: we can logically identify types of evidence that would be inconsistent with the theory; fertility: suggests other observable implications or hypotheses

advantages to qualitative methods

particularly good at hurdles 2 & 3 (reverse causation and credible causal mechanism); not as good at hurdles 1 & 4 because of large sampling error (when N is small, you can draw wrong conclusions)

semi-structured interview

permits interviewer some freedom in meeting specific objectives

falsifiers

pieces of evidence you'd expect to find if your hypothesis is incorrect

leading question

a possible answer is suggested by the question. common in political campaigns; can be unintentional. ex: "would you agree, as most Americans do..." solutions: use neutral wording, pre-test questions to make sure you aren't accidentally leading people, use experimental techniques

deductive theorizing

potential explanations emerge from abstract analysis outside of the context of any specific case. 1. develop theory 2. generate hypotheses 3. get data 4. test the theory on those data. caveat: in practice, "deductive" theorizing usually has an inductive element

p-hacking

practice of reanalyzing data in many different ways and only presenting the preferred results (aka "fishing expeditions") many of these relationships are spurious. "slicing and dicing" the data

y-intercept

predicted value of the dependent variable in the regression when the independent variable has a value of 0

data processing

preparation of data for analysis: prepare data for computerized analysis --> inspect & modify data --> carry out preliminary hypothesis testing --> conduct multivariate testing

simple random sample

probability sampling design in which every case and every possible combination of cases has an equal chance of being included in the sample

stratified random sample

probability sampling design in which the population is divided into strata (or variable categories) and independent random samples are drawn from each stratum
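
A minimal sketch of both designs with Python's random module; the population of 100 cases and the two strata are hypothetical.

```python
import random

population = list(range(100))  # hypothetical sampling frame

# simple random sample: every case has an equal chance (without replacement)
srs = random.sample(population, k=10)

# stratified random sample: independent random draws within each stratum
strata = {"stratum_a": population[:50], "stratum_b": population[50:]}
stratified = {name: random.sample(cases, k=5) for name, cases in strata.items()}

print(srs)
print(stratified)
```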

imputation

procedure for handling missing data in which missing values are assigned based on other information, such as the sample mean of the known values of the variable. fill in missing values with data from other observations (for example, using the mean for all missing values)
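
A tiny pandas sketch of mean imputation, the simplest variant described above (hypothetical data).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"income": [40.0, np.nan, 55.0, 61.0]})  # hypothetical

# mean imputation: fill the missing value with the mean of the observed values
df["income_imputed"] = df["income"].fillna(df["income"].mean())
print(df)  # the NaN is replaced by 52.0
```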

manipulation check

procedure used to provide evidence that participants interpreted the manipulation of the IV in the way intended - ex: might ask respondents to recall portions of the experiment to make sure that they remember what they read

operationalization

process of identifying a valid observable indicator for your unobservable concept. specifying empirical indicators, spelling out procedures.

response rate

proportion of people in the sample from whom completed interviews or questionnaires are obtained

pros and cons of face-to-face interviews

pros: can clarify and restate Qs, response rates are higher compared to other modes, helpful for long interviews or complicated questions. cons: skewed sample (older, more female; an example of non-response bias), very expensive, social desirability bias can invalidate answers (especially if confidentiality is a concern)

pros and cons of internet surveys

pros: cheap, anonymity, much lower risk of SDB cons: hard to get representative samples (especially in some countries)

pros and cons of telephone surveys

pros: cheaper than mail, easy to monitor and record, lower SDB than face-to-face cons: expensive, can be hard to ask complicated questions, can be hard to get representative samples (nonresponse bias and coverage error), SDB because lack of anonymity

pros and cons of mail surveys

pros: cheaper than other modes, can be made anonymous, can get very large samples. cons: only the most enthusiastic will answer (nonresponse bias), people often skip questions, making it difficult to analyze data. rarely used anymore

causal mechanism

provides the specific chain of steps, a series of links, accounting for how or why changes in the causal variable (IV) affect the DV

pre-analysis plan (pre-registration)

publicly posting a set of plans for research and analysis; helps protect against p-hacking and garden-of-forking-paths problems

double-barreled question

question in which two separate ideas are presented together as a unit. ex: "would you support increased funding for the military in order to do more humanitarian intervention?"

replication

repetition of a study using a different sample of participants and often involving different settings and methods

the garden of forking paths

researchers have a lot of flexibility in how to analyze results. researchers often do not have a clear path going in and could justify lots of different choices; they often muddle their way through, trying lots of options. but without a clear research plan, they could end up with cherry-picked results.

marginal frequencies

row and column totals in a contingency table (cross-tabulation) that represent the univariate frequency distributions for the row and column variables

probability sampling

sampling based on a process of random selection that gives each case in the population an equal or known chance of being included in the sample

theoretical sampling

sampling process used in qualitative research in which observations are selected in order to develop aspects of an emerging theory

purposive sampling

sampling that involves the careful and informed selection of typical cases or of cases that represent relevant dimensions of the population (also called judgment sampling)

random-digit dialing

sampling-frame technique in which dialable telephone numbers are generated randomly

motivation

we see what we expect to see and what we want to see --> "motivational biases" (as opposed to purely cognitive ones). one theory suggests that we hold these beliefs because they satisfy important psychological needs (like positive self-esteem), while others suggest a purely cognitive basis for self-serving beliefs (we overweight our own efforts) --> both of these probably work in tandem

convenience sample

selection of cases that are conveniently available

debriefing

session at the end of a study in which the investigator tells the participant the real purpose of the study and the nature of deception (if any)

sample statistic

single measure of some attribute of a sample (mean, median, etc)

empirical indicators

single, concrete proxy for a concept, such as a questionnaire item in a survey (a single empirical indicator of a concept may be inadequate because it 1. may contain errors of classification and 2. is unlikely to capture all the meaning of a concept)

computer-assisted personal interviewing

software program that aids interviewers by providing appropriate instructions, question wording, data-entry supervision

natural experiments

some exogenous factor creates a facsimile of random assignment; a combination of experimental logic with either case study or quantitative analysis. experiments in which the intervention is NOT under the control of the researcher, also called "found experiments": "sometimes nature acts in a way that is close to how a researcher, given the choice, would have intervened." described as "quasi-experimental" because they occur such that the analyst can separate observations into treatment and control groups. payoff: much stronger causal identification than traditional case studies or quantitative analyses. e.g., hurricane Katrina

coding

sorting of data into numbered or textual categories

reliability

stability/consistency of an operational definition (a measure) over time. does the indicator produce the same results if different people use it? more specific questions give you more reliable answers

ethics

standards of moral conduct that distinguish right from wrong

cronbach's alpha

statistical index of internal consistency reliability that ranges from 0 (unreliable) to 1 (perfectly reliable)
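
An illustrative Python computation of Cronbach's alpha from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the item scores are hypothetical.

```python
import numpy as np

# hypothetical item scores: rows are respondents, columns are scale items
items = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1).sum()  # sum of the item variances
total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale
alpha = (k / (k - 1)) * (1 - item_vars / total_var)
print(alpha)  # ~0.99 here -> highly internally consistent items
```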

margin of error (MOE)

statistic that tells you how much sampling error to expect, given your sample size (useful, since we don't usually know true sampling error!) MOE goes down as random sample size goes up

bivariate analysis

statistical analysis of the relationship between 2 variables

regression analysis

statistical method for analyzing bivariate and multivariate relationships among interval- or ratio-scale variables. to do regression, calculate the line that best fits your data

scope conditions

temporal and spatial domain in which theories are expected to operate

social desirability effect

tendency of respondents to bias answers to self-report measures so as to project socially desirable traits and attitudes

proportionate stratified sampling

strata are sampled proportionately to population composition

longitudinal design

survey design in which data are collected at more than one point in time

mixed-mode survey

survey that uses more than one mode of data-collection, either sequentially or concurrently, to sample and/or collect the data

process tracing

systematic examination of diagnostic evidence selected and analyzed in light of research questions and hypotheses posed by the investigator. a method for analyzing single cases, also called "within-case analysis": you attempt to observe the causal process in action, paying particular attention to the sequence of events (this requires generating additional implications of your theory). purpose: test hypotheses and causal mechanisms (not correlation)

research

systematic investigation designed to yield generalizable knowledge

sampling frame

the set of all cases from which you will select the sample

inferential power in experiments

the treatment and control groups are identical on every dimension except the intervention. this means they are similar on observed variables (like age), but the inferential power comes from their being similar on unobserved variables (like emotional states)

kitchen sink approach

throwing in all the control variables you can think of. major problems: (1) statistical estimates become less precise (lots of "noise"), and (2) post-treatment bias

population

total membership of a defined class of people, objects, or events; the entire group of people/things

descriptive survey

undertaken to provide estimates of the characteristics of a population

unit of analysis

unit of observation: what a single case/observation looks like. across what units do the IV & DV vary? the thing that constitutes a single observation; tells you where your inferential leverage is coming from (how we're learning the things we're learning)

snowball sampling

use chain referral, where each contact is asked to ID additional members of the target population, who then ID others, etc.

chi-square test for independence

used to assess the likelihood that an observed association between two variables could have occurred by chance
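
A minimal scipy sketch of the test on a hypothetical 2x2 cross-tabulation.

```python
from scipy.stats import chi2_contingency

# hypothetical cross-tab: rows are party ID, columns are vote choice
table = [[30, 10],
         [15, 25]]

chi2, p_value, dof, expected = chi2_contingency(table)
print(p_value)  # here p < .05, so the association is unlikely to be chance
```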


archives

using files to better understand the phenomenon you're trying to study (Ex: the Vatican, Iraq memory foundation, U.S. presidential archives, etc.)

confounding variable

variable Z that is correlated with both X and Y and that somehow alters the relationship between the two (example: Z causes both X & Y). failing to control for these can produce spurious relationships

extraneous variable

variable not part of hypothesized relationship

dummy variable

variable or set of variable categories recoded to have values of 0 and 1. dummy coding may be applied to nominal or ordinal scale variables for the purpose of regression or other numerical analysis, often for ease of interpretation (a "1" indicates that the characteristic is present, a "0" indicates it is absent; often used for things like gender)
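
A quick pandas sketch of dummy coding a nominal variable (hypothetical data).

```python
import pandas as pd

df = pd.DataFrame({"region": ["west", "south", "west", "east"]})  # hypothetical

# recode the nominal variable into 0/1 dummy columns for use in regression
dummies = pd.get_dummies(df["region"], prefix="region", dtype=int)
print(dummies)  # region_east, region_south, region_west columns of 0s and 1s
```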

hypothetical questions

what might be in the future. ex: "what will the EU look like if Brexit happens?"

selection effects

when all cases do not have an equal chance of being in the sample


simultaneity bias

when the IV causes the DV AND the DV causes the IV

when to reject the null hypothesis?

when you have a low p-value (less than .05). a larger (insignificant) p-value means that changes in the predictor are not associated with changes in the response

information leakage

you are manipulating some feature of the world in a survey experiment, but what if that one feature (e.g., regime type) changes other beliefs? for example, does saying a country is an autocracy make you think it's in a particular part of the world? information leakage is an internal validity problem: your treatment manipulates other, unspecified beliefs in addition to the ones you are trying to move.

