Chapter 9
Per-student spending has risen enormously
Between 1960 and 2010, per-student expenditures grew in real terms by almost 300%
Krueger and Whitmore Results
The percentage of students taking the SAT and other pre-college tests increased, mostly for Black students and free-lunch-eligible students. Other persistent changes: small class size was associated with a 1.6 percentage point lower teen birth rate for White females and with a 1.2 percentage point reduction in criminal convictions (though the latter is imprecisely measured).
Relationship between teacher Quality and Teacher Characteristics (Rivkin, Hanushek, and Kain)
A question of primary policy importance is how we can identify and hire more high-value-added teachers. It turns out that the vast majority of what we can observe about teachers is uncorrelated with value-added. The one exception is experience: new teachers in particular tend to perform worse than more experienced teachers. In Texas, Rivkin, Hanushek, and Kain show that teachers with a graduate degree do not have higher value-added, but novice teachers perform 0.03-0.07 standard deviations worse than teachers with six or more years of experience. This experience effect is 30-70% of the effect of a 1 standard deviation increase in value-added.
Kane, T.; J. Rockoff and D. Staiger (2007) Photo Finish: Certification Does Not Guarantee a Winner
- Setting: New York City Public Schools, under the NCLB mandate (July 2006) to have a "highly qualified" teacher in every classroom (hold a BA, be state-certified, and prove they know the subjects they teach).
Results:
- Simply put, a teacher's certification status matters little for student learning. They find no difference between teaching fellows and traditionally certified teachers, or between uncertified and traditionally certified teachers, in their impact on math achievement.
- Classrooms of students assigned to TFA teachers actually scored 2 percent of a standard deviation higher than students assigned to traditionally certified teachers.
- Teacher experience (how many years on the job) does matter. Teaching fellows, TFA corps members, and uncertified teachers may fare slightly worse as rookie teachers than certified teachers, but they quickly make up the lost ground; by their third year of teaching, teaching fellows elicit student achievement as well as third-year traditionally certified teachers.
- Caveat: this catch-up might not be because they are learning on the job. It could be that the very bad TFA/uncertified teachers leave teaching, so the average of those who remain increases (the comparison does not follow the teachers who leave).
- The sum of the observables predicts better than any single credential.
- Bottom line: the greatest potential for school districts to improve student achievement seems to rest not in regulating minimum qualifications for new teachers but in selectively retaining those who are the most effective.
Coleman Report (1966) (Equality of Educational Opportunity)
- Money per student did not increase educational attainment, and school resources had no effect on student achievement; family background and peers did.
- Education increases income (with gains plateauing at the doctorate).
- Schools also provide socialization (the "hidden curriculum") and the obvious stuff: functional literacy and numeracy.
- Although they do find evidence of a stronger correlation between school resources and academic achievement among racial and ethnic minority students relative to White students, this effect is small in relation to the importance of family characteristics and socioeconomic status.
Hanushek Critique Caveats
A critique of the critique: the composition of U.S. students has changed dramatically over time. There are more immigrants, and schools serve more children with disabilities (following policy changes requiring schools to serve students with disabilities). Demographics have also shifted, with more children in poverty and in single-parent families.
Score_{igsty} = α + θ_t + δ_g + ρ_s + τ_y + σ_i + βX_{gsty} + μZ_{iy} + ϵ_{igsty}
A final way that economists often estimate value-added models is to employ student fixed effects instead of (or in addition to) lagged test scores. Here, the student fixed effects (σ_i) control for fixed differences across students in their test scores. This model accounts for fixed family background factors as well as educational inputs that occurred prior to the start of the data. Because the fixed effects typically use more than two years of data, models like this control for student ability over a longer period of time.
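A minimal sketch of estimating this student fixed-effects model on simulated data (the column names and the simulation are hypothetical; statsmodels is just one way to run it):

```python
# Hedged sketch: value-added with student fixed effects (sigma_i), estimated on
# simulated data. Grade and school effects are omitted to keep the example short.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_students, n_years, n_teachers = 200, 4, 20
df = pd.DataFrame({
    "student": np.repeat(np.arange(n_students), n_years),
    "year": np.tile(np.arange(n_years), n_students),
})
df["teacher"] = rng.integers(0, n_teachers, len(df))     # teacher t
df["frl"] = rng.integers(0, 2, len(df))                  # time-varying control (Z_iy)
ability = rng.normal(0, 1, n_students)                   # fixed student differences (sigma_i)
teacher_va = rng.normal(0, 0.1, n_teachers)              # true teacher effects (theta_t)
df["score"] = (ability[df["student"]] + teacher_va[df["teacher"]]
               + 0.05 * df["frl"] + rng.normal(0, 0.3, len(df)))

# Teacher, year, and student fixed effects; the C(teacher) coefficients are the
# value-added estimates, identified relative to an omitted reference teacher.
fit = smf.ols("score ~ C(teacher) + C(year) + C(student) + frl", data=df).fit()
```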
Project STAR outcomes
A summary of Project STAR findings in terms of effect sizes is reported in Table 9.3. Overall, assignment to a small class was associated with an effect size of about 0.2. Interestingly, the estimates were largest in kindergarten and first grade, fading out somewhat in the higher two grades. This fade-out has led some researchers to question the conclusions drawn from the experiment, because if smaller classes raise student achievement, repeated exposure should produce higher and higher test scores; that does not appear to be the case in Project STAR. Table 9.3 also shows that minority students were particularly influenced by assignment to a smaller class, as were students from low-income backgrounds in second and third grades. In kindergarten, the effect is large enough to eliminate almost two-thirds of the Black-White test score gap. These sizable effects come from reductions in class size of about seven or eight students, or close to one-third of a normal-sized class.
Nonexperimental Class Size Studies
Angrist and Lavy (1999), using Maimonides' rule: The researchers find large and significant effects of class size on student achievement for fourth- and fifth-graders but not for third-graders. Large declines in class size are associated with sizable increases in reading test scores. These estimates are highly consistent with those shown in Krueger (1999) from Project STAR.
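The instrument behind designs like this is the predicted class size implied by the class-size cap (40 students in Angrist and Lavy's Israeli setting). A small sketch of Maimonides' rule, assuming that cap:

```python
# Maimonides' rule: once enrollment exceeds the cap, the cohort must split,
# so predicted average class size drops sharply at multiples of the cap.
def predicted_class_size(enrollment: int, cap: int = 40) -> float:
    """Average class size implied by splitting the cohort into equal classes."""
    n_classes = (enrollment - 1) // cap + 1
    return enrollment / n_classes

# Enrollment 40 -> one class of 40; 41 -> two classes of ~20.5; 81 -> three of 27.
print(predicted_class_size(40), predicted_class_size(41), predicted_class_size(81))
```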
Teacher Salaries have increased
At the same time as teachers are becoming more expensive, more of them are being hired. The average student-teacher ratio in U.S. public schools dropped from 26.4 in 1960 to 14.8 in 2013, a 44% decline.
Lagged test scores
Basically, the lagged test score acts as a control that accounts for selection of students to teachers based on test score levels. Prior test score levels reflect the history of educational inputs each student has received as well as differences in family circumstances and genetics. The argument behind this model is that controlling for lagged test scores allows one to account for these various difficult-to-observe factors that influence current test scores, instead of attributing all of the gain in test scores to current teachers.
NAEP scores caveats
California differs from New York, which differs from Utah, and so on. The correlation between the two variables shown in the top panel of the figure is problematic because there are many differences between states that affect student achievement beyond spending on schooling. One way states likely differ is in labor market conditions that affect the price of inputs to schooling, particularly teacher salaries. After controlling for differences across states, there is no relationship between state per-student spending and NAEP exam scores. While this evidence is unlikely to be causal, due to concerns about why per-student spending varies within states over time, the relationship in Figure 9.5 is sufficiently weak that it raises serious concerns about whether education spending has any effect on measured academic achievement.
Thomas Kane and Douglas Staiger (2008).
Can Value-Added Models Be Trusted? They conducted a randomized experiment in Los Angeles in which they first calculated value-added measures with preexperimental data (when the students were not randomly assigned to teachers), using models very similar to those just discussed. Then, using the random assignment of students to teachers from the experiment, they reanalyzed the models. They found that the value-added models that control for lagged test scores and average classroom characteristics in the prerandomization period did a good job of matching the experimental estimates. In short, their results indicate that the lagged test score model adequately controls for student sorting, such that the value-added estimates from such a model give you the same result as if there were random assignment of students to teachers.
Economists David Card and Alan Krueger (1992a)
One very important implication of the education production function is that students who are exposed to more resources (i.e., to higher school "quality") should receive more human capital from their education. Card and Krueger test whether this is the case by estimating how the returns to education vary with the quality of the K-12 schools to which students were exposed as children. Their empirical approach has two stages: they first estimate rates of return to education by state of birth and birth cohort, and they then relate these estimated rates of return to the cohort-state averages of three school quality measures workers would have been exposed to as children: the student-teacher ratio, teacher wages, and the length of the school year. Ostensibly, this method examines whether the growth in school resources over time within a state is linked to the returns to education experienced by the students who were exposed to those increased resources. Card and Krueger find that rates of return to education are higher for those who attended schools with higher resources.
Measuring Teacher Quality
Comparing average test scores across teachers in a given year is an example of a cross-sectional estimator. Even if we controlled for observed characteristics of students, such as race, gender, family income, and parental education, it is unlikely we could overcome the biases associated with nonrandom sorting of students into classrooms using cross-sectional methods. By estimating the change in scores instead, we can better isolate each teacher's contribution to her students' learning (as measured on the exam) in that year; a simple illustration follows below.
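A toy illustration (entirely made-up numbers) of why score levels and score gains can rank teachers differently:

```python
# Teacher A is assigned high-scoring students; teacher B produces larger gains.
import pandas as pd

df = pd.DataFrame({
    "teacher":     ["A", "A", "B", "B"],
    "score_prior": [80, 85, 55, 60],
    "score_now":   [82, 86, 62, 68],
})
df["gain"] = df["score_now"] - df["score_prior"]

print(df.groupby("teacher")["score_now"].mean())  # levels: A looks better
print(df.groupby("teacher")["gain"].mean())       # gains:  B looks better
```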
Why Project STAR is the gold standard
A randomized experiment could potentially establish a causal link. Control group: no treatment. Treatment group: some intervention. If assignment to treatment and control is random and the experiment is appropriately implemented, then the results indicate the causal effect of the intervention on the outcome of interest. Causal effect: the only systematic reason for a difference in outcomes between the treatment and control groups is the intervention.
Coleman Report; Does Money Matter?
Do these funding differences translate into achievement differences?
How Krueger deals with nonrandom attrition in Project STAR
First, he assigns students who leave the sample their most recent test score. Effects using these imputed scores are virtually identical to the main results, suggesting attrition from the sample is not generating the main findings. However, this method only deals with students who show up in the data at least once; students who never appear in the test score data were only about 2-4% of the population, a small enough share that it is not considered a serious issue.
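A sketch of the mechanics of that imputation on hypothetical data (an illustration of the idea, not Krueger's actual code):

```python
# Carry each student's most recent observed score forward after they leave the sample.
import pandas as pd

panel = pd.DataFrame({
    "student": [1, 1, 1, 2, 2, 2],
    "grade":   ["K", "1", "2", "K", "1", "2"],
    "score":   [50.0, 55.0, 58.0, 48.0, None, None],  # student 2 attrits after K
})
panel["score_imputed"] = panel.groupby("student")["score"].ffill()
print(panel)
```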
How Krueger deals with nonrandom reassignment in Project STAR
For the first concern, Krueger leverages the random nature of the initial assignment. While actual class size exposure might have been endogenous because of switching, the initial assignment was not. Using the initial assignment rather than the class the student actually wound up in allows Krueger to estimate the effect of being assigned to a small class, relying only on variation in class size generated by the experiment itself rather than by switching.
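A minimal intent-to-treat sketch on simulated data (variable names are hypothetical): the score is regressed on the initial random assignment rather than on the class actually attended, with school fixed effects because randomization occurred within schools.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "school": rng.integers(0, 40, n),
    "small_initial": rng.integers(0, 2, n),   # random initial assignment
})
# Some students switch, so the realized class type is endogenous.
df["small_actual"] = np.where(rng.random(n) < 0.9,
                              df["small_initial"], 1 - df["small_initial"])
df["score"] = 0.2 * df["small_actual"] + rng.normal(0, 1, n)

# Regress the outcome on the initial (random) assignment, within schools.
itt = smf.ols("score ~ small_initial + C(school)", data=df).fit()
print(itt.params["small_initial"])
```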
Kane and Staiger (2002): Randomly Accountable
Gives the example that, across states, fewer than 2% of schools recorded increases in scores for five years in a row. This would be compounded by bills that would have required annual increases for every racial subgroup, which would reduce the odds of year-over-year improvement for schools with more racial subgroups, since year-to-year fluctuations are nearly independent across racial groups. Multiple years of data are required to measure improvements in performance reliably. Incentives targeted at schools with test scores at either extreme (rewards for very high scores or sanctions for very low scores) primarily affect small schools and provide very weak incentives for large schools (we did this problem in discussion). When evaluating the impact of policies on changes in test scores over time, the natural fluctuations in test scores must be accounted for. Basically, "one year's worth of test-score data is insufficient to discern differences in a meaningful way." A small simulation of the small-school point appears below.
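A quick simulation of that small-school point (made-up numbers): even when every school has the same true quality, average scores bounce around far more in small schools, so small schools dominate both tails of the distribution in any given year.

```python
import numpy as np

rng = np.random.default_rng(2)

def school_mean(n_students):
    # Every school draws students from the same distribution (same true mean of 0).
    return rng.normal(0, 1, n_students).mean()

small = np.array([school_mean(30) for _ in range(5000)])    # small cohorts
large = np.array([school_mean(600) for _ in range(5000)])   # large cohorts
print(small.std(), large.std())   # sampling noise is far larger for small schools
```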
Teacher Quality Value Added
A teacher's value-added is his or her contribution to students' test score growth. The main strength of value-added analysis is that its data-driven nature allows us to actually produce measures of teacher quality. Its main weakness is that value-added measures necessitate a focus on specific standardized tests; our measures of teacher quality will therefore be specific to the skills tested on a given exam, and this focus might cause us to miss important ways in which teachers are contributing to their students' human capital. Open questions: (1) bias and increased variation; (2) stability: are the measures too unstable for high-stakes personnel decisions?; (3) are there links with longer-term outcomes?
Teacher Characteristics and Value-Added in Chicago (Aaronson, Barrow, & Sander, 2007)
They find that little about a teacher's background can explain the variation in value-added. The researchers examine the type of certification teachers have, the quality of the undergraduate schools teachers attended, their undergraduate major, and whether they have a master's degree. While no one characteristic can predict value-added, the sum total of teacher characteristics might be more informative. In fact, data from New York City show that changes in overall observed teacher characteristics in high-poverty schools can lead to sizable increases in student test scores (Boyd et al., 2008).
Hoxby (2000) and the class size effect on student outcomes
Hoxby uses Connecticut elementary schools subject to mandated maximum class size rules, comparing schools just above an enrollment cutoff (which had to split into more, smaller classrooms) with schools just below it (which kept larger classrooms). This is a regression discontinuity design using statewide exams in 4th and 6th grade. She finds no evidence that smaller classes improve student achievement.
ΔScore_{igsty} = α + θ_t + δ_g + ρ_s + τ_y + βX_{gsty} + μZ_{iy} + ϵ_{igsty}
This model amounts to a regression of the change in test score for student i in grade g, school s, and year y assigned to teacher t on a set of teacher fixed effects (θ_t), grade fixed effects (δ_g), school fixed effects (ρ_s), and year fixed effects (τ_y); ϵ_igsty is the regression error. The regression therefore controls for fixed differences in test score changes across schools, grades, and years. X_gsty refers to observed characteristics of the classroom in the given year and grade, such as the percentage Black, Asian, and Hispanic; the percentage who receive a free or reduced-price lunch; and perhaps even the average test score levels of students from the prior year. Typically, these means are calculated separately for each student in the classroom, using the values of all students other than the given student herself. The Z_iy term is a set of time-varying student characteristics, like free or reduced-price lunch status and parental income. The coefficients of interest are the teacher fixed effects: the θ_t estimates are the value-added measures. A teacher with a higher value of θ_t has higher test score growth in her class, conditional on the observed characteristics of the students in her class as well as the fixed characteristics of her school and grade and any characteristics common to all teachers in that year. A sketch of this regression appears below.
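A compact sketch of estimating this gain-score regression on simulated data (column names are hypothetical; grade and school fixed effects are collapsed to keep it short):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n, n_teachers = 3000, 50
df = pd.DataFrame({
    "teacher": rng.integers(0, n_teachers, n),
    "year": rng.integers(0, 3, n),
    "class_pct_frl": rng.random(n),            # classroom characteristic (X_gsty)
})
theta = rng.normal(0, 0.1, n_teachers)         # true teacher effects (theta_t)
df["score_gain"] = (theta[df["teacher"]] - 0.1 * df["class_pct_frl"]
                    + rng.normal(0, 0.4, n))

# The C(teacher) coefficients are the estimated value-added measures (theta_t),
# relative to an omitted reference teacher.
fit = smf.ols("score_gain ~ C(teacher) + C(year) + class_pct_frl", data=df).fit()
```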
Jepsen and Rivkin (2009)
In 1996, California passed the most sweeping class size reduction law in the nation, reducing K-3 class sizes by about 10 students per class. Jepsen and Rivkin measure how this reduction affected student outcomes.
Effect of Class Quality on Earnings
Increasing class quality by 1 standard deviation within schools raises earnings at age 27 by about $1,520 (9.6%). The implied lifetime earnings gain is roughly $39,100 per student, which works out to a present discounted value of about $782,000 per class of 20 students.
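The per-class figure is just the per-student lifetime gain scaled up by a class of roughly 20 students:

$39,100 per student × 20 students ≈ $782,000 per classroom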
Measurement Problems in Capturing Resource Effects
- Inequality in achievement has fallen along with spending inequality.
- There has been a compositional change in the types of students in U.S. schools over time that has made them more expensive to educate.
- Test scores provide an incomplete measure of academic achievement; once longer-run measures (such as earnings) are used, school resources appear to be very important.
- School resource increases lead to a reduction in the high school dropout rate, which affects the set of 17-year-old students taking the NAEP exam.
Schanzenbach: the $320,000 Kindergarten Classroom
Links data from the STAR experiment to U.S. tax records to analyze how class assignment in grades K-3 affects adult outcomes, testing whether early test score gains translate into gains in adulthood. There is an increase in earnings for children who had a good kindergarten teacher, but this may operate through noncognitive ("soft") skills rather than the skills measured on kindergarten tests.
Why the Hanushek Critique may be right: why is the observed link between resources and achievement weak?
- Measurement problems.
- Failure to examine funding differences ceteris paribus (e.g., family income, parental education).
- Because public schools are not for-profit, market forces may not yield efficiency in production:
  1. School districts are "local monopolies" and may face little competition.
  2. School governance may be affected by political incentives.
  3. Unions may decrease the productivity of school resources if these organizations negotiate for "organizational slack."
Other explanations:
- Schools maximize something other than student achievement.
- "Flat of the curve": marginal returns to additional spending may already be low.
- Increased spending improves outcomes not measured by standardized tests (earnings? Card-Krueger).
Explanations for Small Total Resource Effects
Spending levels are sufficiently high to put spending on the flat of the curve. That is, spending is sufficiently high that the marginal returns to additional spending are low. This explanation stems directly from diminishing marginal returns to educational expenditures. There may be political considerations driven by parents' perceptions and preferences as well as by special interest groups; these considerations can distort the use of unrestricted school funds. Lack of competition in local schooling options means principals do not face competitive pressures to produce better outputs. This leads to inefficient use of resources.
Issues with Teacher Value-Added
Students are not always randomly assigned to teachers (unlike in Project STAR), so there could be biases associated with these models: what if Jamila was given students who were expected to exhibit more test score growth than Zoe's students? There are also ceiling effects: high-achieving kids may score 100% on exams two years in a row, so measured growth is zero even though the teacher may have contributed to their learning.
Hanushek Critique
The argument that there is little correlation between the amount schools spend on students and measured academic outcomes, in the context of the observed organizational structure of schools.
- Aggregate time-series evidence: the large increases in per-student expenditures over time have not been met with gains in measured student achievement.
- Education production function evidence: research has not found consistent evidence of a positive link between total school resources and student outcomes, or between key inputs (such as teacher salaries or student-teacher ratios) and student performance.
- Hanushek uses NAEP scores to ask whether test scores have risen across America alongside increased per-student expenditures, increased teacher salaries, and decreased student-teacher ratios. Scores have mostly been stagnant for 17-year-olds and have increased for younger children, but scores for young children may matter less, since it is at 17 that students are actually making decisions about college and earnings.
- There has been only a modest increase in educational attainment over time.
- While the lack of a strong correlation between aggregate trends in student performance and school expenditures is suggestive, it is not causal evidence.
- A critique of the critique: the composition of U.S. students has changed dramatically over time (more immigrants).
Effect Size
The impact of an intervention in standard deviation units of the outcome. For Project STAR, it is the effect of small classes in terms of the standard deviation of test scores. It is computed as the change in the dependent variable divided by the standard deviation of the outcome (similar in spirit to a standardized test statistic). See the Project STAR outcomes card above for the estimated effect sizes (about 0.2 overall, largest in kindergarten and first grade, and larger for minority and low-income students).
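A tiny numerical sketch of the computation, with made-up scores:

```python
import numpy as np

rng = np.random.default_rng(4)
control = rng.normal(50, 10, 1000)   # regular classes
treated = rng.normal(52, 10, 1000)   # small classes (true effect = 2 points)

# Effect size = difference in means divided by the standard deviation of the outcome.
effect_size = (treated.mean() - control.mean()) / control.std()
print(round(effect_size, 2))         # roughly 0.2 standard deviations
```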
Project STAR
The largest randomized class size experiment in the United States, conducted in Tennessee in the mid-1980s among students in grades K-3. A key to the research design is that both students and teachers were randomly assigned to the three types of classrooms. Project STAR (for Student-Teacher Achievement Ratio) tested the effect of smaller class size on test scores (both standardized and curriculum-based).
• Started in 1985, this four-year study compared scores between three groups: smaller class size (13-17 students), regular class size (22-25 students), and regular class size with a hired aide.
• 11,600 students from inner-city, urban, suburban, and rural schools took part in the experiment.
• Both teachers and students were assigned randomly (this fact is critical to establishing causal estimates).
Testing the importance of teacher Value added estimates: Chetty, Friedman, and Rockoff (2014b) "Great Teaching", Education Next
They look at teacher quality effects in the long run: if the effect fades out, does it actually matter? They can link the students in their sample to long-run outcomes from U.S. tax data. An increase of 1 standard deviation in teacher value-added in one grade increases college enrollment by 0.49 percentage points by age 20 (1.3% of the baseline mean) and leads to a significant increase in the quality of the colleges students attend. They also estimate test score effects very similar in size to prior work: an increase of 1 standard deviation in value-added raises student test scores by 0.08-0.09 standard deviations. Their teacher mobility analysis further supports this finding.
Complementary resources and class sizes
If smaller classes are allocated to the most academically advanced students, class size acts as a complementary resource. Thus, comparing outcomes between students who are exposed to different class sizes within a school can yield biased estimates of the effect of class size on their achievement.
Student teacher ratios decreased
Thus, at least some of the spending increases in Figure 9.1 have gone to support smaller classes, as measured by the student-teacher ratio.
How Krueger deals with Hawthorne effects in Project STAR
To test for Hawthorne effects, Krueger examines the relationship between class sizes and test scores only among the control group. Variation in school size generated differences in the exact size of the large classes, and he shows that this variation led to effects similar to the experimental estimates, which is inconsistent with the main effects being driven by teachers who were influenced by their participation in an experiment. (In the STAR class size experiment, being in a small class is referred to as being in the treatment group; otherwise a student is in the control group.)
Card and Krueger (1992b)
Using a similar method to the one already discussed, they investigate whether relative improvements in the quality of schools serving Black students in the U.S. South after World War II can account for the observed convergence between 1960 and 1980 in labor market earnings between Blacks and Whites. They relate variation in school inputs across segregated Black and White schools between 1915 and 1966 in the Southern states that practiced segregation to the later-in-life earnings of individuals educated in those states, using student-teacher ratios as a proxy for resources. On the whole, the authors find that relative quality improvements in Black schools can explain about 20% of the observed convergence in the Black-White earnings gap between 1960 and 1980. These results point to the importance of school quality in reducing long-run economic inequality: higher-quality education contributed to the convergence of earnings across racial groups.
(Chetty, Friedman, & Rockoff, 2014a).
Using data from a large, unnamed school district in the United States from 1989 to 2009, comprising over 2.5 million students linked to U.S. tax data, the researchers first show that a detailed set of typically unobserved parent characteristics does not affect their value-added estimates when included in the model. They also show that sorting on prior trends does not affect value-added estimates: their results are unchanged when they control for twice-lagged scores, and their analysis based on teacher mobility confirms that students exposed to higher-value-added teachers make larger gains. When a school and grade receive a high-value-added teacher, there is a sharp increase in average test scores in that school and grade; the increase is not predicted by test scores in prior years (lagged scores), and the scores of students in the grade below the one that receives the high-VA teacher are flat, so the arrival does not affect the scores of kids in the grade below. On the whole, the evidence to date points to value-added models providing informative data on teacher quality.
Even though there is little evidence of a strong positive relationship between spending and outcomes, there are caveats
While the lack of a strong correlation between aggregate trends in student performance and school expenditures is suggestive, it is not causal evidence. A caveat to this review is that estimating the causal effect of school spending on education outcomes is an extremely difficult undertaking (researchers cannot randomly assign schools to people).
Compensatory resources and smaller classrooms
Within schools, smaller classrooms may be those that serve students with special needs; as such, this resource is compensatory.
Testing the importance of teacher Value added estimates: Rivkin, Hanushek, and Kain (2005)
They use detailed administrative data that contain test scores, schools attended, and grades attended for all students in Texas. They focus on third- through seventh-grade students in 1993-1995, leading to a dataset with over 200,000 students from over 3,000 public schools in the state. Although they cannot link students to teachers, they use the variation across school-grade cohorts in teacher turnover, which led to different cohorts being exposed to teachers of differing quality. Similar to Rockoff (2004), they find that an increase of 1 standard deviation in teacher quality increases student test scores by about 0.1 standard deviations in reading and math. According to their estimates, increasing teacher quality by 1 standard deviation is akin to reducing class size by 10 students. The range of teacher quality also is quite large: some teachers achieve test score increases equal to 1.5 years of student learning, while others only achieve increases equal to 0.5 years.
"How Does Your Kindergarten Classroom Affect Your Earnings? Chetty et al
This is examined by linking individual student data from Project STAR to long-run labor market and education outcome data obtained from U.S. tax records. The authors manage to link 95% of the children who participated in the STAR project to their later-in-life tax records. This linkage allows them to analyze how class sizes, class characteristics, and teacher assignments in grades K-3 affect later-in-life education and labor market outcomes when the individuals are 27 years old. The importance of this analysis stems from the fade-out in test scores shown by Krueger and Whitmore (2001), which suggests it is important to determine whether any effects persist into early adulthood.
The Difficulty of Running a Social Experiment: Krueger's analysis of Project STAR (1999)
Krueger shows that, across all observed characteristics, those assigned to a smaller versus a bigger classroom were identical on average within each participating school. Randomization was two-fold: first, all students were randomly assigned to one of the three class types upon initial enrollment in the school; second, students randomly assigned to one of the large class groups were randomized again after kindergarten either to receive a teacher's aide or not. Most students entered in kindergarten, but many also entered in first, second, and third grade. The experiment ended after third grade. Three difficulties complicated the experiment:
- Nonrandom reassignment: parents with children assigned to the larger classes complained to the principal and got their child reassigned. About 10% of students switched in this manner.
- Nonrandom attrition: those assigned to a larger class may have been more likely to leave the district for a private school or another public school. There was about 50% attrition over the course of the study. The worry is that the highest-ability students in the control group left, biasing the estimates upward.
- Hawthorne effects: teachers assigned to smaller classes may have responded to the fact that they were involved in the experiment, biasing the estimates upward.
Testing the importance of teacher Value added estimates: Rockoff (2004)
Rockoff uses student-teacher linked data on kindergarten through sixth-graders in a single county in New Jersey. Estimating a student fixed-effects model, he finds that an increase of 1 standard deviation in teacher value-added is associated with an increase of 0.08-0.11 standard deviations in student test scores.
Score_{igsty} = α + θ_t + δ_g + ρ_s + τ_y + βX_{gsty} + μZ_{iy} + πScore_{i,g−1,sty} + ϵ_{igsty}
π is the coefficient on the prior (lagged) test score, Score_{i,g−1,sty}. The first-difference model embeds a strong assumption, namely that the effect of the prior year's test score does not decay; put differently, it assumes that the effect of last year's test score on the current test score is 1. This is a strong assumption and one that is easily relaxed by controlling for lagged test scores instead of using the first difference. The first-difference model is akin to forcing π = 1, but a large body of research suggests that π < 1. This is called decay, as the impact of prior test scores on current test scores fades over time. Importantly, the lagged test score model is very easy to estimate, and the data requirements for this model and the first-difference model are identical. One can even augment the model by controlling for more test score lags, which allows researchers to control in a detailed manner for each student's prior achievement. A sketch contrasting the two models appears below.
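A sketch contrasting the lagged-score and first-difference models on simulated data (names are hypothetical; the true decay parameter is set to 0.6):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n, n_teachers = 5000, 40
df = pd.DataFrame({"teacher": rng.integers(0, n_teachers, n)})
theta = rng.normal(0, 0.1, n_teachers)
df["score_lag"] = rng.normal(0, 1, n)
true_pi = 0.6                                   # decay: pi < 1
df["score"] = (true_pi * df["score_lag"] + theta[df["teacher"]]
               + rng.normal(0, 0.5, n))

# Lagged-score model: estimates pi instead of imposing it.
lagged = smf.ols("score ~ score_lag + C(teacher)", data=df).fit()
print(lagged.params["score_lag"])               # recovers pi of about 0.6

# First-difference model: imposes pi = 1 by differencing the scores.
df["gain"] = df["score"] - df["score_lag"]
fd = smf.ols("gain ~ C(teacher)", data=df).fit()
```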