PSU PLSC 308: EXAM 1
KKV and Bachner
?
Quantitative research
Advantages- -High external validity -Can be more objective (or at least appears to be) -Cost-effective and quick -Analysis is easily replicable Disadvantages- -Low internal validity -Hard to control for every single confounding variable -Not very detailed -Statistical models can be advanced and difficult
Qualitative research
Advantages- -High internal validity -Rich, nuanced perspective of the phenomenon of interest -Narrative, so it is often interesting to read Disadvantages- -Low external validity -Easily subjective and prone to researcher bias -Can be time-consuming and resource-intensive
Empirical questions
Ask how the world does work -Descriptive questions ask what happens in the world -Causal questions ask why things happen the way they do
Two general types of variables
Categorical variable -Made up of 2+ groups/categories -Examples: presidential vs. parliamentary systems, regime type (democracy, autocracy, etc.), race, gender Numerical variable -Measured on a scale -Examples: GDP, Democracy Index, age, percent of something
Threats to internal validity
Concept -Selection bias: when the sample was not randomly selected -Omitted variable bias: when not all confounding variables were controlled for -Researcher bias: often unintentional choices that lead the results in one direction -Attrition: when participants drop out before the study is complete
Correlation vs. causation
Correlation -Means there is a relationship between X and Y -As X goes up, so does Y (or, in a negative correlation, Y goes down) -But we don't necessarily know why -Correlation does not imply causation Causation -Not only is there a relationship, but we know that X causes Y -Much harder to establish than correlation
Bibliography citations examples
Gilens, Martin. 2012. Affluence and Influence: Economic Inequality and Political Power in America. Princeton: Princeton University Press. Pattern: Author. Year. "Article title." Journal Volume (issue): Pages. More than 1 author: the first author is "Last, First" and all others are "First Last"
Scientific explanations have predictive validity
If a scientific theory consistently fails to predict future events, its usefulness is limited
Example
In text -(Geddes 1999). -Geddes (1999) -(Prior 2019, 105). More than 1 author: write out the last names of the authors, separated by "and". -(Cusack, Iversen, and Soskice 2007). 4 or more authors: write the first author's last name and then "et al." -(Enns et al. 2014) Multiple sources: if referencing more than one source in the same sentence, separate them with a semicolon. -(Hammons 1999; Elkins, Ginsburg, and Melton 2009). Multiple sources by the same author in the same year: alphabetize them by title and place a lowercase letter after the year. -Huntington (1991a; 1991b)
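The author-count rules above can be sketched as a small helper. This is only an illustration of the pattern in these notes, not an official APSA tool; the function name `in_text_citation` and its parameters are made up for the example.

```python
def in_text_citation(last_names, year, page=None):
    """Build a parenthetical in-text citation from author last names.

    Hypothetical helper that follows the rules in the card above:
    4 or more authors -> first author plus "et al."; otherwise list them all.
    """
    if len(last_names) >= 4:
        authors = f"{last_names[0]} et al."
    elif len(last_names) == 1:
        authors = last_names[0]
    elif len(last_names) == 2:
        authors = " and ".join(last_names)
    else:
        # Three authors: commas between names, "and" before the last one
        authors = ", ".join(last_names[:-1]) + f", and {last_names[-1]}"
    suffix = f", {page}" if page is not None else ""
    return f"({authors} {year}{suffix})"

print(in_text_citation(["Geddes"], 1999))
print(in_text_citation(["Prior"], 2019, page=105))
print(in_text_citation(["Cusack", "Iversen", "Soskice"], 2007))
```

The three calls reproduce the first three parenthetical examples from the card.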
Validity of a study
Internal validity -How well the study shows cause and effect -Given our research design, can we be sure that X is what caused Y in our data? External validity -How generalizable and applicable the study is to the real world -In other contexts, can we be sure that X causes Y in general?
Studies must be replicable
Methods and data should be clear so that other scholars can reproduce the work to ensure consistent findings
How generative AI works
Natural language processing (NLP) -A branch of AI focused on getting computers to "understand" language Generative Pre-Trained Transformer 3 (by OpenAI) -One of the most powerful and popular language models -Produces human-like text based on a prompt -Trained on webpages, books, and Wikipedia
Science is probabilistic, not absolute
No scientific explanation is going to be right 100% of the time, and that's okay
Classifying research designs
Political Science Research = Experimental or Observational -Observational = Quantitative or Qualitative -Quantitative = Cross-sectional (regression) or Longitudinal (time series)
Question order effects
Previous questions may "prime" people to give different responses than they would otherwise give
Establishing causality
Requirements -Correlation must exist between X and Y -Causal mechanism: a logical explanation of how X causes Y -Temporal order: X must change before Y -No confounding variables: must make sure no other variables are causing both X and Y
Social desirability bias
Respondents may give dishonest answers that are more socially acceptable than their true opinions
Inattentiveness
Respondents sometimes speed through the survey and give low-quality answers
Scientific work is cumulative and ongoing
Science should build on previous work and continuously expand knowledge
Scientific knowledge is usually explanatory
Scientists generally care more about "why" questions - causes, not just correlations -Explanatory research assesses causes -Descriptive research identifies patterns
Parsimony is the best policy
Simple explanations are preferred over complex explanations
Theory vs. hypothesis
Theory -An explanation of how something works or why something happens -Built on the literature and logic -Stated in terms of general real-world concepts -Example: Economic development leads to democratization Hypothesis -An observable, testable implication of a theory -Based on the theory -Stated in terms of measurements in data -Example: Increasing levels of GDP per capita should predict increases on a democracy index
Specification error
When a question is vague and can be interpreted differently by respondents
Acquiescence bias
When asked to agree/disagree with a statement, people will often agree without really thinking about it
Nonresponse bias
When people in the sample refuse to do the survey and there is a significant difference between them and the people who responded
Regression
is a statistical technique for estimating the relationships between at least one independent variable and a dependent variable
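As a minimal sketch of the simplest case (one independent variable), the block below fits a line by ordinary least squares. The data are made up, and `fit_line` is a hypothetical helper written for this example; real research would normally use a statistics package.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one independent variable:
    Y = intercept + slope * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariation of X and Y divided by the variation in X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data: x is the independent variable, y is the dependent variable
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = fit_line(x, y)
print(f"Y = {intercept:.2f} + {slope:.2f} * X")
```

The slope is the estimated change in Y for a one-unit increase in X.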
Validity
it measures what it's supposed to measure (accuracy)
Reliability
the measure consistently produces the same results every time it's measured (precision) Threats to reliability -Vague conceptualization or operationalization -e.g., unclear definitions or coding rules -Inconsistent instrument -e.g., a bathroom scale shows a different weight each time -Change over time/space -e.g., the US Census has changed how it defines race several times, so early racial population numbers are not comparable to today's Assessing reliability -Measure it again -e.g., redo a poll with a different sample -Compare it to similar measures -with graphs, tables, and correlations -Form two separate measures of the same concept, each with half of the indicators -e.g., calculating an ideology score from even-numbered survey questions and another score from odd-numbered ones to ensure consistency
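The last check above (two half-scores from odd- and even-numbered questions) can be sketched as follows. The responses are invented, and `pearson_r` is a hypothetical helper; a correlation near 1 between the two half-scores would suggest the items measure the concept consistently.

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Invented data: each row is one respondent's answers (1-5 scale)
# to six ideology questions, in the order Q1..Q6
responses = [
    [1, 2, 1, 1, 2, 1],
    [2, 2, 3, 2, 2, 3],
    [3, 3, 3, 4, 3, 3],
    [4, 5, 4, 4, 5, 4],
    [5, 5, 4, 5, 5, 5],
]

# One score from the odd-numbered questions (Q1, Q3, Q5),
# another from the even-numbered questions (Q2, Q4, Q6)
odd_scores = [sum(row[0::2]) for row in responses]
even_scores = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(odd_scores, even_scores)
print(odd_scores, even_scores, round(split_half_r, 3))
```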
Random vs. systematic error
(add)
Focus on independent variable
-What effects does X have? -Example: What effects does legalizing marijuana have on drug use and crime?
Probability sampling
-When each person in the target population has a known, nonzero probability of being selected (an equal probability in simple random sampling) -Needed for most survey research
Scientific claims must be falsifiable
If a conspiracy theorist has an excuse for every piece of evidence presented against them, then their belief is not falsifiable. There is no way to prove it false to them.
Findings are generalizable
The results from a study should apply to the relevant population, not just the sample
Scientific research is verified rigorously
Through peer review, replication, feedback, etc.
Normative questions
ask how the world should work
Sampling
-A case is a single country, state, individual, or object of interest -A sample is a group of cases that we're studying -Its size is abbreviated as N -The population of interest includes all possible cases that exist -Example: a survey of US adults -Population: all US adults -Sample: 500 US adults -Case: a single adult
Dimensions
-A dimension is an aspect or component of a concept -Many concepts have more than one dimension -An individual observable variable can only measure one dimension -Example: ideology is often considered to have 2+ dimensions -For most issues in US politics, a single dimension is sufficient to capture nearly the full range of attitudes
Margins of error (MOE)
-A margin of error expresses the uncertainty of a statistic -Based on standard deviation, sample size, and how sure we want to be about it -Usually uses a 95% confidence level -Loosely means we can be 95% sure the parameter is within that range -Technically means that if we repeatedly re-ran the survey, 95% of the resulting intervals would contain the parameter
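As a rough illustration of how those three ingredients combine, the sketch below computes a margin of error for a sample proportion. The function name `margin_of_error`, the poll numbers, and the use of the normal approximation (z = 1.96 for a 95% confidence level) are assumptions for this example, not part of the course notes.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p_hat: proportion observed in the sample (a statistic)
    n:     sample size
    z:     critical value; 1.96 corresponds to a 95% confidence level
    """
    # Standard error of a proportion: sqrt(p(1 - p) / n)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return z * standard_error

# Hypothetical poll: 52% support among 1,000 respondents
moe = margin_of_error(0.52, 1000)
low, high = 0.52 - moe, 0.52 + moe
print(f"52% +/- {moe:.1%} (interval from {low:.1%} to {high:.1%})")
```

Because the sample size sits under a square root, quadrupling the sample only halves the margin of error.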
What surveys help us estimate
-A parameter is the true value of something in the population -A statistic is an estimate of a parameter based on a sample
Examples of Bad Questions
-A secret society of super-rich elites controls the entire US government -More intelligent leaders produce better policy than less intelligent leaders -The US Constitution is the best way to organize a political system
Definitions of theory
-A social science theory explains how or why something happens -It does not have to be widely confirmed or accepted to be called a theory -But it does need to rely on existing research -This is usually what we mean in this class when we say theory Not to be confused with: -Scientific theories in the natural sciences require repeated testing and widespread acceptance to be called a theory -Theoretical frameworks such as rational choice theory -Political theory usually refers to normative political thought
What is a literature review?
-A summary of the state of the literature on a topic -Can be any of the following: a section of an article, a chapter of a book, or a whole review article
Other ChatGPT notes to keep in mind
-AI-generated writing is usually not very deep, analytical or creative -AI bots usually remain neutral on controversial issues -AI technology is constantly changing, and so are norms around AI use
Aggregating indicators
-Additive index: all indicators are added together to measure the latent variable -Multiplicative index: indicators are multiplied together -Weighted models: an algorithm figures out a more complex formula that weights each variable
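A small sketch of the difference between these approaches, using invented indicator values. The indicator names are hypothetical, and the weights in the last function are set by hand purely for illustration; a real weighted model would estimate them from data.

```python
import math

# Invented indicators for one country, each already rescaled to 0-1
indicators = {"elections": 0.9, "civil_liberties": 0.8, "press_freedom": 0.0}

def additive_index(values):
    """Add the indicators; a low score on one can be offset by the others."""
    return sum(values)

def multiplicative_index(values):
    """Multiply the indicators; a zero on any one drives the index to zero."""
    return math.prod(values)

def weighted_index(values, weights):
    """Weighted sum; the weights here are hand-picked, not estimated."""
    return sum(v * w for v, w in zip(values, weights))

vals = list(indicators.values())
print(additive_index(vals))
print(multiplicative_index(vals))
print(weighted_index(vals, [0.5, 0.3, 0.2]))
```

The choice matters most when one indicator is very low: the additive index still gives this country a fairly high score, while the multiplicative index gives it zero.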
The field of political science (subfields)
-American politics -International relations -Comparative politics -Political theory -Public policy -Public administration
Observational
-Analyze existing data using statistical techniques -In political science: regression models or more advanced statistical techniques -Low internal validity, high external validity
Latent variables
-Are abstract concepts that cannot be directly measured -Example: standard of living
The scientific method
-Ask a question -Do background research -Construct a hypothesis -Test with an experiment -If the hypothesis is true: report results -If the hypothesis is false or partially true: try again
Cross-sectional studies
-Assess cases at one point in time -In political science: regression or single-wave surveys -Low internal validity, high external validity (in general)
Observable variables
-Can be directly measured -aka: indicators, observed variables -Examples: income, wealth -Indicators of standard of living: life expectancy, educational attainment, literacy rate, GDP per capita, median income
Qualitative methods (aka small-N research)
-Case studies examine one case or a sample of just a few cases (a country, event, person, etc.); they usually use qualitative data but can also use quantitative data -Ethnographies describe behavior or culture based on the researcher's direct observation -Most political science research nowadays is quantitative -Most political science faculty at Penn State focus on quantitative research
How ChatGPT works
-ChatGPT cannot look up information -Instead, it generates text that could plausibly appear in the real world after the question or prompt
Requirements for causality
-Correlation must exist between X and Y -Causal mechanism: a logical explanation of how X causes Y -Temporal order: X must change before Y -No confounding variables: must make sure no other variables are causing both X and Y
Scientific research is empirical, not normative
-Empirical scholarship examines how things work -Normative scholarship examines how things should work
Advantages of experiments
-Experiments provide the strongest possible evidence of causality -Controlled environment -Measurement is precise -More objective
Validity of a measure
-Face validity: the measure on its face seems to reflect the concept -Content validity: it captures all the necessary components of the concept -Construct validity: it measures the construct that it's supposed to measure -Convergent validity: it is consistent with related measures of the concept -Discriminant validity: it is not consistent with measures of unrelated concepts
Longitudinal studies
-Follow a case or many cases over time (people, countries, etc.) -In political science: time series analysis or multi-wave surveys -High internal validity, low external validity (in general)
A good literature review
-Gives a bird's eye view of the topic but focuses in on the specific topic -Flows like a story -Critically analyzes existing research -Identifies gaps in our knowledge -Builds up to the theory
Examples of Bad Questions
-How much should the US government spend on defense? -Is there a correlation between government healthcare spending and life expectancy? -Which party cares more about ordinary people? -Why do some government publications use Oxford commas and others do not?
What does science mean
-In everyday language, science usually refers to the natural sciences -But more broadly, it can refer to systematic inquiry in any field
Critiques of political science
-Interpretivism: human behavior is subjective, so scholars can only interpret behavior, not directly observe behavior -Constructivism: humans construct their own identities and social structures, so determining cause and effect is difficult -Practical limitations: society is too complex for scientists to discern objective facts about political behavior and institutions
Anatomy of a typical article
-Introduction -Literature Review -Theory/Hypotheses -Data/Analysis -Conclusion
What makes a good research question?
-It is empirical -An answer could lead to novel descriptions, explanations, or predictions -Its premise is falsifiable -The topic is useful, important, and interesting to you (and ideally to others too)
What makes a good theory?
-It is falsifiable -It has many observable implications -It is as concrete as possible
What makes a good hypothesis?
-It should be plausible from theory -It must include the direction of the relationship between variables -If using categorical variables, be specific about which group(s) you are talking about -Use probabilistic language -Avoid language that implies a process over time unless you will be using models that account for time -Make the unit of analysis clear -Mention ceteris paribus in some way
Disadvantages of experiments
-Low external validity -Convenience sampling isn't always appropriate -Hawthorne effect: people change their behavior when they know they are being observed
Controlling for variables
-Main idea of regression: if we can account for all possible variables that could affect Y, then if X still appears to have a relationship with Y, it must affect Y -Control variables are possible confounders included in regression models -Let's add an independent variable, exposure to an ad
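To see why controls matter, the simulation below builds a confounder Z that drives both X and Y, while X has no effect on Y at all. It then "controls for" Z by removing the part of X and Y that Z predicts and correlating what is left (a partial correlation, which stands in here for a full regression with a control variable). All data, names, and the seed are invented for this sketch.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, zs):
    """What is left of ys after a simple linear regression on zs."""
    my, mz = mean(ys), mean(zs)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

random.seed(308)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]   # confounder
x = [zi + random.gauss(0, 1) for zi in z]    # X depends on Z only
y = [zi + random.gauss(0, 1) for zi in z]    # Y depends on Z only, not on X

raw_r = pearson_r(x, y)
controlled_r = pearson_r(residuals(x, z), residuals(y, z))
print(f"raw: {raw_r:.2f}, controlling for Z: {controlled_r:.2f}")
```

In this setup the raw correlation should come out near 0.5 even though X does not cause Y, and it should shrink to roughly zero once Z is accounted for. An omitted confounder would leave the misleading raw relationship in place.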
Examples of Good Questions
-More highly educated people are more likely to vote than less educated people. -As a country's middle class grows, its likelihood of becoming a democracy increases. -Presidential systems tend to have higher democracy scores than parliamentary systems.
How to Cite
-Most political science research uses APSA (American Political Science Association) style
Quantitative research designs
-Most quantitative research uses some type of regression model -Umbrella term for statistical models that estimate relationships among variables -The specific type of model usually depends on: -Dependent variable type (binary, ordinal, frequency, duration, etc.) -Whether time matters (cross-sectional if no, longitudinal if yes)
Case selection strategies
-Most similar: compare 2+ similar cases where only the independent variable is different -Most different: compare 2+ cases that are different on everything except for the independent variable -Typical case: pick a case that is very representative of the theory that the study is assessing -Deviant case: pick a case that contradicts the theory
Qualitative
-Narrative, non-numerical -In political science: case studies, interviews, and observation -High internal validity, low external validity
Quantitative
-Numbers, data, statistics -In political science: experiments or regression -Low internal validity, high external validity (in general)
Hypotheses
-Once you have a theory, think through as many observable implications as possible -If the theory is correct, then we'd expect to see -Focus on the implication(s) that are: important, measurable, interesting, falsifiable, concrete, etc.
Experimental studies
-Randomly assign subjects to groups and then impose a treatment -In political science: survey experiments or field experiments -High internal validity, low external validity
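Random assignment itself is simple to sketch: shuffle the subject pool and split it. The subject labels, group sizes, and seed below are made up for illustration.

```python
import random

random.seed(42)

# Hypothetical subject pool of 20 people
subjects = [f"subject_{i}" for i in range(1, 21)]

# Shuffle a copy, then split it in half: chance alone decides
# who ends up in the treatment group and who in the control group
shuffled = subjects[:]
random.shuffle(shuffled)
half = len(shuffled) // 2
treatment, control = shuffled[:half], shuffled[half:]

print(len(treatment), len(control))
```

Because chance decides group membership, the two groups should be similar on average in every other respect, which is what lets a difference in outcomes be attributed to the treatment.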
Probability sampling
-Simple random sampling (SRS) is straightforward; the sample is randomly chosen from the population -Stratified sampling breaks the population up into groups ("strata" such as racial/ethnic groups) and collects an SRS of each group -Cluster sampling randomly selects groups ("clusters" such as districts) and all members of that group are in the sample
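The three approaches can be compared side by side on a toy population. Everything here (the population, the district labels, the sample sizes, the seed) is invented for the sketch, and districts play the role of both strata and clusters only to keep it short.

```python
import random

random.seed(1)

# Toy population: 1,000 people spread evenly over 10 districts
population = [{"id": i, "district": i % 10} for i in range(1000)]

# Simple random sample: every person has the same chance of selection
srs = random.sample(population, 100)

# Stratified sample: an SRS of 10 people within each district (stratum)
stratified = []
for d in range(10):
    stratum = [p for p in population if p["district"] == d]
    stratified.extend(random.sample(stratum, 10))

# Cluster sample: randomly pick 2 districts and keep everyone in them
chosen = random.sample(range(10), 2)
cluster = [p for p in population if p["district"] in chosen]

print(len(srs), len(stratified), len(cluster))
```

Stratifying guarantees every district appears in the sample, while clustering concentrates the sample in a few districts, which is usually cheaper to field.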
Types of experiments
-Survey experiments present different questions or framings to the treatment and control groups -Lab experiments are usually done in person with a "game" or other activity so researchers can observe participants' behavior -Field experiments are implemented in the real world -Natural experiments are observational studies that leverage conditions that are already randomized by nature or other forces
Examples of Bad Questions
-Tall people are more likely to vote than short people. -The size of the middle class affects the likelihood of a country becoming a democracy. -Presidential systems tend to have higher democracy scores. -Presidential systems have higher democracy scores than parliamentary systems.
Experiments
-The X variable is administered by the researcher rather than observed -Cases are randomly assigned to groups -Treatment group(s) receive an intervention of some sort -Control group(s) get all the same
Dependent variable
-The outcome of interest -Other names: response variable -Notated as Y
Independent variable
-The predictor or cause of interest -Other names: explanatory variable, predictor, regressor -Notated as X
Operationalization
-The process of defining how to measure an abstract concept with observable variables -A hypothesis is an operationalized version of the theory -Example: how should we measure democracy?
Conceptualization
-The process of defining what an abstract concept means -The lit review and/or theory section lays out how the author is conceptualizing any abstract concepts -Example: what does "democracy" mean?
General types of longitudinal studies
-Time series analysis follows one case (N = 1) over time (usually at least 100 time points) -Example: Assessing the dynamics and drivers of national polarization in the US, 1920-2020 -Panel studies follow many cases (N > 30) at a few points in time (usually 3-5 time points) -Example: Analyzing the causes of polarization across the 50 states in 1990, 2000, and 2010 -Time series cross-sectional (TSCS) studies follow many cases (N > 30) at many points in time (usually at least 30) -Example: Assessing the dynamics and drivers of state-level polarization in the 50 states, 1970-2020
Guidelines
-Use AI-generated text to supplement, not replace, your own writing -Avoid asking AI to generate more than one paragraph of text at a time -Make sure you have permission to use AI -Be transparent about the role of AI in your writing -Credit any AI bot you use in a footnote or cite it in your bibliography. -Fact-check all information -Don't trust any sources cited by AI -Check for plagiarism -Make sure you understand what all the text means -Edit the writing style to be more readable for the target audience -Be aware of biases in the AI's algorithm
Dichotomous
-Variables (aka binary or dummy) have 2 categories -Yes or No, True or False
Qualitative data
-We usually use the word "data" to refer to numbers -But data can be non-numerical -Non-numerical data: interviews, text (especially primary sources), direct observation by researchers (field work)
Examples of Good Questions
-Wealthy Americans tend to hold greater influence on politics than poorer Americans -More experienced leaders produce better economic performance than less experienced leaders -Presidential systems tend to have higher standards of living than parliamentary systems
Adjusting unrepresentative samples
-Weighting: respondents from groups that are underrepresented in the sample are counted extra -Surveys are often weighted by gender, age, education, race, etc.
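One common way to do this is to weight each group by its population share divided by its sample share. The sketch below uses invented shares and approval rates and a single weighting variable; real surveys usually weight on several characteristics at once.

```python
# Invented shares: women are 50% of the population but only 40% of the sample
population_share = {"women": 0.50, "men": 0.50}
sample_share = {"women": 0.40, "men": 0.60}

# Weight = population share / sample share, so underrepresented
# groups get a weight above 1 and overrepresented groups below 1
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Invented approval rates among respondents in each group
approval = {"women": 0.60, "men": 0.40}

unweighted = sum(sample_share[g] * approval[g] for g in approval)
weighted = sum(sample_share[g] * weights[g] * approval[g] for g in approval)

print(weights)
print(round(unweighted, 3), round(weighted, 3))
```

Here the unweighted estimate understates approval because the group that approves more is underrepresented; weighting pulls the estimate back toward what a representative sample would show.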
Focus on dependent variable
-What causes Y? -Example: Why have the parties become polarized?
Examples of Good Questions
-What effect does defense spending have on a country's safety and stability? -What effect does government healthcare spending have on life expectancy? -Which party accepts more contributions from wealthy individuals, corporations, and PACs? -How does a candidate's vocabulary level affect their chance of winning their election?
Non-probability sampling
-When sampling is non-random and not representative of the population -A convenience sample is a sample of people who are easy to reach and not necessarily representative of the population
Nominal
-(aka categorical) variables classify things with 2 or more categories -Examples: race, form of government
Ratio variables
-are numerical and 0 means that there is nothing of that variable -Examples: GDP, age
Interval variables
-are numerical but 0 doesn't really mean anything -Examples: year, temperature
Ordinal variables
-have ordered categories such as rankings or a scale of 1-5 -Examples: agreement with a statement, regime type