Designing Randomized Evaluations
How do you calculate discount factor?
(value today / value in the future)^(1/t), where t is the horizon in years: the exponent is 1 for one year, 1/10 for ten years, and 12 for one month (t = 1/12)
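A minimal sketch of the calculation above, assuming the discount factor is expressed per year. The function name, the parameter `years`, and the amounts 90 and 100 are hypothetical and chosen only for illustration.

```python
def annual_discount_factor(value_today: float, value_future: float, years: float) -> float:
    """Annualized discount factor implied by valuing `value_future`,
    received after `years` years, at `value_today` now."""
    if value_future <= 0 or years <= 0:
        raise ValueError("value_future and years must be positive")
    # (today / future) raised to 1/t, where t is the horizon in years
    return (value_today / value_future) ** (1.0 / years)


# Illustrative amounts only: 90 today versus 100 in the future.
one_year = annual_discount_factor(90, 100, 1)        # exponent 1
ten_years = annual_discount_factor(90, 100, 10)      # exponent 1/10
one_month = annual_discount_factor(90, 100, 1 / 12)  # exponent 12
```

The same trade-off implies a much smaller annual discount factor when it is observed over one month than over ten years, because the exponent annualizes the ratio.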
In a trust game, Person A receives $50, and can give any amount to Person B. The amount Person B receives is tripled, and then Person B can choose to give any amount back to Person A. If Person A believes that people act solely in their own self-interest, what amount would Person A give to Person B to maximize the money he (Person A) receives?
0
In a cluster randomized trial where the intra cluster correlation (rho) is 0, we estimate that our Minimum Detectable Effect (MDE) is 0.15. If we increase the total sample size, from 1400 to 2800, what is the best description of what will happen to our MDE?
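The MDE will decrease to about 0.106 (0.15/sqrt(2)), because doubling the sample size divides the MDE by sqrt(2)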
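The answer above follows from the MDE being proportional to 1/sqrt(N) when the intra-cluster correlation is 0. A small sketch, assuming power, significance level, allocation ratio, and outcome variance all stay fixed; `rescale_mde` is a hypothetical helper name.

```python
import math


def rescale_mde(mde_old: float, n_old: int, n_new: int) -> float:
    """Rescale an MDE when only the total sample size changes.

    Assumes ICC = 0 and that everything else in the power
    calculation is held constant, so MDE scales with 1/sqrt(N).
    """
    if n_old <= 0 or n_new <= 0:
        raise ValueError("sample sizes must be positive")
    return mde_old * math.sqrt(n_old / n_new)


new_mde = rescale_mde(0.15, 1400, 2800)
print(round(new_mde, 3))  # 0.106
```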
If the minimum detectable effect for a 50:50 allocation was 0.20 SD, and we were to change the allocation ratio (sample size in treatment : sample size in control) to 90:10, about what would be the new MDE?
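About 0.33. The allocation term in the MDE formula is sqrt(1/(P(1-P))), where P is the share of the sample in treatment. For 50:50, sqrt(1/(0.5*0.5)) = 2. For 90:10, sqrt(1/(0.9*0.1)) = 3.333. The ratio of the two is 3.333/2 = 1.667, so the MDE under the 90:10 allocation is 1.667 times 0.20, or about 0.33 SD.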
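The arithmetic above can be checked with a short sketch. It assumes that only the allocation ratio changes (same total sample, power, and significance level); the function names are hypothetical.

```python
import math


def allocation_factor(share_treated: float) -> float:
    """Allocation term sqrt(1 / (P * (1 - P))), with P the treated share."""
    if not 0 < share_treated < 1:
        raise ValueError("share_treated must be strictly between 0 and 1")
    return math.sqrt(1.0 / (share_treated * (1.0 - share_treated)))


def mde_after_reallocation(mde_old: float, share_old: float, share_new: float) -> float:
    """Rescale an MDE when only the treatment share changes."""
    return mde_old * allocation_factor(share_new) / allocation_factor(share_old)


new_mde = mde_after_reallocation(0.20, share_old=0.5, share_new=0.9)
print(round(new_mde, 2))  # 0.33
```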
What is a double barreled question?
A questionnaire item that is unclear because it combines two questions.
What is unambiguously a sign of empowerment in developing countries?
Ability to decide when and whom to marry
What area on the graph shows power?
The area under the alternative-hypothesis curve that lies to the right of the right-most significance (critical value) line of the null distribution
How do you calculate intention to treat?
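Compare the average outcome of everyone assigned to the treatment group with the average outcome of everyone assigned to the control group, regardless of whether they actually took up the treatment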
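A minimal sketch of the intention-to-treat comparison, assuming outcomes and assignment indicators are stored in two parallel lists. The data values and the function name are made up for illustration.

```python
from statistics import mean


def intention_to_treat(outcomes, assigned):
    """Difference in mean outcomes by ASSIGNMENT (1 = treatment, 0 = control).

    Actual take-up is deliberately ignored: units are analyzed
    in the group they were randomized to.
    """
    treat = [y for y, z in zip(outcomes, assigned) if z == 1]
    control = [y for y, z in zip(outcomes, assigned) if z == 0]
    if not treat or not control:
        raise ValueError("both groups must be non-empty")
    return mean(treat) - mean(control)


# Hypothetical data: three units assigned to each arm.
outcomes = [12, 10, 14, 8, 9, 7]
assigned = [1, 1, 1, 0, 0, 0]
itt = intention_to_treat(outcomes, assigned)  # 12 - 8 = 4
```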
Consider a sample of 250 districts that you would like to randomly assign to two groups. Your implementing partner has the funding and mandate to conduct the intervention in exactly 150 villages, leaving 100 for the control. An allocation method that will certainly achieve this goal is:
Complete randomization (sorting a list randomly and assigning the top 60% from that list to the treatment)
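Complete randomization as described above can be sketched as follows. The unit identifiers, the seed, and the function name are hypothetical; the point is that shuffling the list and taking a fixed number from the top guarantees the exact group sizes, which simple coin-flip randomization would not.

```python
import random


def complete_randomization(units, n_treat, seed=0):
    """Shuffle the list of units and assign the first `n_treat` to treatment."""
    units = list(units)
    if not 0 <= n_treat <= len(units):
        raise ValueError("n_treat must be between 0 and the number of units")
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    shuffled = units[:]
    rng.shuffle(shuffled)
    treated = set(shuffled[:n_treat])
    return {u: ("treatment" if u in treated else "control") for u in units}


# 250 units, exactly 150 (60%) to treatment and 100 to control.
assignment = complete_randomization(range(250), n_treat=150, seed=42)
n_treated = sum(1 for arm in assignment.values() if arm == "treatment")
```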
Project check
Computing attrition rates by treatment status
Information security
Control and protection against unauthorized access, use, disclosure, disruption, modification, or destruction of information
The government recently passed legislation that all 500 districts in the country must have a hospital that can provide basic emergency care. Currently, only 20% of districts have such hospitals. Because the construction of hospitals is expensive and can take up to two years to build, the government plans to phase this program in over 10 years. It is willing to randomize and wants to know the short-run (1 year) impact of this program on health outcomes. However, individuals from neighboring districts will likely use the hospital if one does not exist in their own district, and therefore they will likely see improved health outcomes as well, even if they are not in a "treatment" district. What strategy would best manage the spillover (described above)?
Create buffer
Learning outcomes are often reported in terms of standard deviations rather than raw test scores primarily because:
Doing so allows us to compare results across studies that use different tests
A bank in the Philippines has just opened a new "commitment savings account" for which there are penalties if clients withdraw money before a pre-specified date, or before they reach a certain pre-specified target amount. This helps clients who want to save resist the temptation of withdrawing for unnecessary expenses. They want to measure the account's impact on overall savings, but also the potential side effect of being more vulnerable to shocks. Due to internal policies, they are not allowed to deny anyone access to this account. Looking at the above evaluation question, what is the appropriate method of randomization?
Encouragement design
As part of a survey administered on women in developing countries, the indirect method of polling booth surveys could be used to:
Estimate what proportion of women use birth control pills without their husband's knowledge; determine whether women have a choice in deciding whether to use birth control pills. It could NOT be used to identify which individual women used birth control pills without their husband's permission.
The Progress out of Poverty Index (PPI):
Estimates the probability that a household is above or below the poverty line
As part of our training on financial literacy for microenterprises, we teach entrepreneurs how to keep financial diaries. This also allows us to obtain accurate data on their profits. For members of the control group (who are not given training, and do not keep financial diaries), we conduct a monthly survey on revenues and costs to measure profits. Which assumption is violated by this example?
Excludability
The above image represents a:
Gantt chart
If respondents were asked about drug use using a polling booth method with a dozen respondents per polling booth, what methods would allow us to disaggregate responses by gender?
Have different boxes for different genders; ensure each group contains only one gender
What are the disadvantages of Interactive Voice Response (IVR) and Short Messaging Services (SMS) techniques vis-à-vis paper and digital data collection?
IVR and SMS may be difficult to implement for long surveys; respondents cannot easily seek clarification on a question or the survey
What is an issue relating to top coding?
If a researcher asks each node to only report two friends
What is a true positive?
In this case, the intervention has a true impact, and we are able to reject the null hypothesis, and therefore conclude that the intervention has an impact. This is a true positive.
How do you calculate incidence?
Incidence rate = # new cases / # people at risk
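A short sketch of the incidence formula above. The counts are hypothetical and the time period is whatever window the new cases were counted over.

```python
def incidence_rate(new_cases: int, population_at_risk: int) -> float:
    """New cases in a specified period divided by the people at risk
    (those not already infected) at the start of that period."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return new_cases / population_at_risk


# Hypothetical example: 25 new cases in one year among 5,000 people at risk.
rate = incidence_rate(25, 5000)
per_thousand = rate * 1000  # 5 new cases per 1,000 people at risk per year
```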
What is true about consumption?
Includes goods that are produced and consumed at home; captures accumulated past income and expectations of future income
An evaluation of a financial literacy intervention shows that the intervention has an impact on the revenues of a family's small business. What information do we need to see if the intervention improved overall household income?
Income from other sources; a measure of total business costs; how much of the (remaining) profits is used by the household as income; whether the household pays itself a salary from the business, and if so, how much
If a study subject refuses to participate in the survey because she doesn't want to reveal private information, we should:
Mark refusal on the survey sheet
Which of the following is true regarding the MET indices of teacher effort:
Measures vary highly in their predictive validity depending on the subject in school; measures of teacher effort predict student achievement; measures vary highly in their predictive validity depending on which tool is used
Which group is used to estimate the counterfactual when difference-in-differences is used for the identification strategy?
Non-participants before and after the program has been implemented
What is generally true regarding paper data collection?
Paper typically means waiting longer to access data after data collection
How do you detect survey execution errors?
Questions involving specific notes or instructions to the surveyors
Type 1 error
Rejecting the null hypothesis when it is true (a false positive)
What do we expect to happen as our sample size increases?
Sampling distribution will approach a bell curve
A recent study titled "The White Man Effect" found that when measuring altruism in Sierra Leone through a "dictator game", the amount "dictators" gave "the recipient" in the game varied significantly depending on who was present. In particular, they found that when white, foreign, observers were present, respondents tended to give more of their share to the recipient, signaling higher levels of altruism. However, they found that if the players in the game were from communities that received significant foreign aid, this effect went down, perhaps in an attempt to signal the inability to simply "give money away". The above is an example of which type of bias?
Social desirability bias
What is true about conducting a survey in-house as opposed to a survey firm?
Sometimes it requires more investment in understanding the field site
recall bias
Systematic error due to differences in accuracy or completeness of recall to memory of past events or experiences.
negative predictive value formula
TN/(TN+FN)
positive predictive value formula
TP/(TP+FP)
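The two predictive-value formulas above can be computed directly from the four cells of a confusion matrix. The counts below are hypothetical.

```python
def predictive_values(tp: int, fp: int, tn: int, fn: int):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of positive tests that are true cases.
    NPV = TN / (TN + FN): share of negative tests that are truly disease-free.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative test result")
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical screening results for 1,000 people.
ppv, npv = predictive_values(tp=90, fp=30, tn=870, fn=10)
```

With these counts PPV is 0.75 and NPV is about 0.989. Unlike sensitivity and specificity, both values depend on how common the disease is in the tested population.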
To check whether someone has contracted a highly contagious viral infection, there are three common test options. Test A has a sensitivity of 99% and specificity of 95%. Test B has a sensitivity of 95% and specificity of 99%. Test C has a sensitivity of 98% and specificity of 98%. Which test would we use if our largest concern was detecting the disease in those who have it?
Test A
If the surveyor records a respondent's response incorrectly, at what point will this likely be caught in the data entry process?
The double data entry process is designed to catch data entry error, not surveyor error
Compared to other methods of randomization, what are the main limitations of the encouragement design?
The incentive to take up treatment may have a direct impact on outcomes; it measures the impact only on those who change their behavior because of the incentive to take up treatment
Primary outcome
The main outcome of interest in a research study.
In a cluster randomized trial where the ICC is 0, we estimate that our Minimum Detectable Effect Size (MDE) is 0.4 Standard Deviations. If we increase the total sample size, from 1500 to 3000 (randomly selected) individuals from the same population, what is the best description of what will happen to our MDE?
The new MDE will decrease to 0.28 standard deviations
Consider an intervention to inform physicians of the dangers of drug-resistant bacteria and overprescription of antibiotics. Prescriptions are ordered through an electronic system. The intervention works by creating alerts about the dangers each time a prescription is filled. The outcome is number of antibiotic prescriptions. Considering the above problem, which of the following arguments is the most convincing for the appropriate unit of randomization?
The physician level, because physicians may learn about the dangers of drug resistance after the alert has notified them a few times, and this may affect their decisions about future patients
What criteria might suggest to an IRB that they should also consider the ethics of the program being evaluated when reviewing the ethics of research?
The researcher designed and implemented the program The program design has been altered to facilitate the evaluation
The central limit theorem suggests that:
The sampling distribution converges to a bell curve as the sample gets larger
Given non-compliance in the above example, what is the correct inference of the Complier Average Causal Effect (CACE, also known as Local Average Treatment Effect--LATE) estimate?
This is the impact on those who took up because they were assigned to treatment
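One common way to obtain this estimate is the ratio of the intention-to-treat effect to the difference in take-up rates between the two groups. A sketch under the usual assumptions (random assignment, no defiers, and assignment affecting outcomes only through take-up); the numbers and the function name are hypothetical.

```python
def complier_average_causal_effect(itt_effect: float,
                                   takeup_treatment: float,
                                   takeup_control: float) -> float:
    """CACE / LATE as ITT divided by the difference in take-up rates."""
    first_stage = takeup_treatment - takeup_control
    if first_stage == 0:
        raise ValueError("assignment did not change take-up; CACE is undefined")
    return itt_effect / first_stage


# Hypothetical: ITT effect of 2.0; 60% take-up in treatment, 10% in control.
late = complier_average_causal_effect(2.0, takeup_treatment=0.6, takeup_control=0.1)
```

Here the estimate is about 4.0, twice the ITT effect, because assignment changed the behavior of only half of the sample (the compliers).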
Which of the following ingredients would you use to calculate incidence of a disease?
Time period; total population of uninfected individuals; new cases in a specified time period
Sometimes, survey companies (to whom we have outsourced data collection) may ask for additional payments outside those mentioned in the contract. What are some of the reasons it may be appropriate to make additional payment?
Training and field piloting had to be redone because of last-minute changes to the research design
True or False: Intelligence, intrinsic teacher motivation, and interest in learning are all constructs.
True
What are some of the reasons why it may be useful to measure subjective well-being?
Useful to get a measure of what an individual 'feels' about his/her consumption or income level
Where on the graph do we reject the null hypothesis?
We reject the null hypothesis for any estimate in the critical region (the tails) which is depicted by these areas
If not properly contained, and we were unaware of this potential for spillover, how might our results be affected?
We would likely underestimate the impact
Suppose that in our impact analysis of the program described above we are only comparing endline outcomes, without any controls or covariates. If in the control group (but not the treatment group), some of the particularly poor and disadvantaged districts refused to participate in the study, and we did nothing to correct for the attrition, what might that do to our results?
We would likely underestimate the impact
Which of the following is true regarding measuring savings?
When collecting primary data, asking for changes in savings in the last 90 days is more accurate than asking for specific amount saved in those days
difference in differences
a method for identifying causal effects by comparing the average change over time in the outcome variable for the treatment group with the average change over time for a control group
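The comparison in this definition reduces to a difference of two differences in group means. A minimal sketch with hypothetical group averages:

```python
def difference_in_differences(treat_before: float, treat_after: float,
                              control_before: float, control_after: float) -> float:
    """(Change in the treatment group) minus (change in the control group).

    The control group's change stands in for what would have happened
    to the treatment group without the program (parallel trends).
    """
    return (treat_after - treat_before) - (control_after - control_before)


# Hypothetical mean test scores before and after the program.
did = difference_in_differences(treat_before=50.0, treat_after=62.0,
                                control_before=48.0, control_after=53.0)
print(did)  # 7.0
```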
Skip Patterns in Surveys
a question or series of questions associated with a conditional response
regression discontinuity
a regression analysis study in which the assignment to treatment is based on a cut point for a single quantitative assignment variable
subgroup
a small number of people within a larger group who function separately from the group
anchoring bias
a tendency to fixate on initial information, from which one then fails to adequately adjust for subsequent information
measurement effects
a threat to external validity because various procedures used to collect data in the study changed the results of that study
sensitivity
ability of a test to correctly identify those who actually have the disease
specificity
ability of a test to correctly identify those who do not actually have the disease
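Sensitivity and specificity can be written as simple ratios of confusion-matrix counts. The counts below are hypothetical, chosen to give a test with 99% sensitivity and 95% specificity.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of people WITH the disease whom the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of people WITHOUT the disease whom the test flags as negative."""
    return tn / (tn + fp)


# Hypothetical: 100 people with the disease and 100 without.
sens = sensitivity(tp=99, fn=1)  # 0.99
spec = specificity(tn=95, fp=5)  # 0.95
```

When the main concern is not missing true cases, the test with the highest sensitivity is preferred, even at some cost in specificity.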
The primary purpose of survey tracking is to ensure that:
all willing respondents are surveyed
perturbation
altering data to help with privacy
always taker
always takes treatment regardless of assignment
How do you identify problems with the questionnaire?
asking questions about important variables to determine accuracy
Cost Effectiveness Analysis:
can help us decide whether piloting a new, untested innovation is worth the effort
Enumerator check
checking for time taken to complete an interview
spot checks/random checks
checks done without warning
Device testing
checks hardware and software function
What is theory of change?
comprehensive description and illustration of how and why a desired change is expected to happen in a particular context
Information privacy
control and protection over the extent and circumstances of information collection, sharing, and use
covariate
control variable
what is the solution to reproducibility
data publication
What is a process evaluation?
determines whether program activities have been implemented as intended and resulted in certain outputs
Consider an evaluation that assigns some individuals to receive financial counseling at a specific credit union branch. Measuring financial health i.e. variables such as investments made, amount saved for retirement etc. using data from that same branch could lead to:
differential coverage bias
Defier
does the opposite of assignment
Complier
does whatever they are assigned
Bench testing
ensures skips and validations are working
type 2 error
Failing to reject the null hypothesis when it is false (a false negative)
What is an impact evaluation?
focuses on immediate observable effects of a program
Generalization
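replacing specific values with broader, summarized categories (for example, an exact age with an age range) to help with privacy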
local suppression
hiding individual data points
selection bias
in an experiment, unintended differences between the participants in different groups
How do you calculate prevalence?
Prevalence = incidence rate x average duration of the disease (an approximation)
Logic check
include checking for errors of omission and errors of commission (like missing skip patterns)
"In this research, a baseline survey was conducted on a [random] sub-sample of households on income, health status, accident occurrence and risks, and smoking habits. In the second stage, a local rural bank offered hospitalization insurance to both surveyed and non-surveyed individuals, on average, two months after the [baseline] survey. Insurance take-up was measured using administrative data from the bank and found that those surveyed were 3.4 percentage points (6 percent) more likely to buy the insurance on average." This observation most likely describes which issue?
measurement effects
Never taker
never takes treatment regardless of assignment
pre-post design
participants are measured on the dependent variable before and after an intervention
When thinking about whether lessons generalize to a new context, the main reason we worry about general behaviors is because:
people may not respond the same way in a different context
telescoping bias
people perceive recent events as being more remote than they are and distant events as being more recent than they are
phase in design
phased in over time
solution to specification searching
pre-analysis plan
"Do girls in public schools receive bicycles as promised by the new state government program?" Which of the following evaluation types could be used to answer the above research question?
process evaluation
negative predictive value definition
proportion of people who tested negative who actually don't have the disease
What should be included in the coding files header?
purpose of the code; inputs and outputs of the code
How do you detect errors in fraud?
questions such as materials used
Researchers are interested in evaluating the benefits of nutrition counselling on improving iron levels among adolescent girls. 100 Public schools in Busia, Kenya were randomly assigned to treatment and control. In each treatment school, 50 female students through grades 8-10 were randomly selected to receive counselling. At baseline, the survey team visited treatment and control schools to implement a blood test and baseline survey. Teachers in several schools, both treatment and control, requested the team to treat students that they felt were severely anemic so that they can get free blood tests as part of the study. When conducting analysis, including the students recommended by teachers in the treatment group violates which assumption?
random assignment
If survey respondents are subject to a "White Man Effect" (as described in the previous question), we would expect to find that some deliberately claim higher levels of wealth when the surveyor appears rich. This effect may be in the reverse direction if the respondent believes the surveyor is linked to aid decisions, in which case the respondent may claim lower levels of wealth. This type of bias likely occurs at which stage in the response process?
reporting the answer
estimation bias
simple bias from estimation
What is a needs assessment?
the process of identifying, analyzing, and prioritizing the needs of a priority population
positive predictive value definition
the proportion of people who tested positive who actually have the disease
Retrieval Bias
the result of employing a procedure that makes some information easier to recall than other information
aggregation
to combine individual records into group-level totals or averages
data flow testing
to ensure the integrity of data is maintained when stored
what is the solution to publication bias
trial registration
Information Harm
use of results or data to learn about an individual as a result of their participation in the research, and then violate their rights or negatively impact their interests
Encouragement designs
used to test the effect of providing an incentive for participation
Factorial design study
uses 2 different interventions in 2 or more groups