The EduSkeptic's Guidebook 1.0

Mar 29, 2023

My recent post on education and optimism bias received a lot of attention, mostly positive. It occurred to me that, as I’ve written so much on this topic, it may be hard for new readers to get caught up. So here I’m laying out my gloss on the overall state of American education research, policy, and discourse, and pointing you in the direction of evidence for these ideas. This is more of a series of various observations than a conventional argument. The most thorough exploration of my thoughts on the politics and philosophy of education can be found in my first book, The Cult of Smart. The most thorough exploration of the available research can be found in the posts linked here, particularly the piece below, which contains a comprehensive argument about how position in the academic performance hierarchy is largely fixed.

Freddie deBoer

Education Doesn't Work 2.0

This is the first (and may prove to be only) time that I have updated a previous post to improve it. I am doing so because the basic observation outlined here is core to my view of education and society and I was dissatisfied with the first attempt…


If audio is more your thing, this 2021 conversation with public school educator C. Derick Varn is a good summary of where I’m coming from.

My Qualifications

I hold a BA in English and philosophy from Central Connecticut State, an MA in Writing & Rhetoric from the University of Rhode Island, and a PhD in English from Purdue University. At Purdue I focused on assessment of student learning, particularly on writing assessment and education policy related to assessment. My dissertation was about the CLA+, a test that measures college learning, and the social and economic conditions that were (at the time) putting pressure on colleges to assess. At Purdue I worked in the Oral English Proficiency Program, participating in developing and rating for the Oral English Proficiency Test, a proprietary assessment of spoken English for speakers of other languages. In the last year of my PhD studies I served as the assessment coordinator of the Introductory Composition at Purdue program, where I developed and implemented an essay-rating assessment system, including creating prompts and rubrics, training raters, and collecting and analyzing data. I also served as a peer reviewer for the academic journal Language Testing. Relevant coursework from grad school includes classes in frequentist statistics and multivariate analysis, experimental design, quantitative research design for language testing, qualitative research design for education research, empirical methods for composition research, writing assessment, curriculum design for second language pedagogy, and an independent study in empirical research in writing pedagogy.

For four years I served as the Assessment Manager at Brooklyn College in the City University of New York, where I was in charge of assessing student learning, helping faculty to develop assessment systems for their departments and collecting and presenting college-level performance data. I have taught more than a dozen sections of freshman composition, classes in oral English skills for graduate students, a course in public writing, and dissertation workshops for doctoral students. I worked for 18 months in a public school district, primarily as a paraprofessional in a program for students with severe emotional disturbance and as a long-term sub in middle school social studies. I also did test prep of various kinds for four years.

I don’t know if that’s “good enough” and I’m not particularly concerned about it.

Preliminary: Educational Problems Are Recognized to the Degree That They Are Addressable by Policy

The boundaries of educational debate are set by the reach of policy. That is to say, what we allow into our discourse about education is dictated by what we can reasonably influence with the usual policy tools. The ed reform movement is famously hostile to discussion of student poverty, under the dictate of “no excuses.” Reformers don’t control poverty, so it has to be read out of the debate.

A good example: premature babies. Preemies go on to have significantly worse academic outcomes than babies born at full term. (Meta-analysis here.) How much worse? “Combined effect sizes show that very preterm and/or VLBW children score 0.60 SD lower on mathematics tests, 0.48 SD on reading tests, and 0.76 SD on spelling tests than term-born peers…. Combined effect sizes for EF revealed a decrement of 0.57 SD for verbal fluency, 0.36 SD for working memory, and 0.49 SD for cognitive flexibility in comparison to controls.” In the context of education research, these are very large numbers. With 10.5% of babies born prematurely (though not all of them very low birthweight) the scale of this issue is obviously very large. Here’s my question for those of you who aren’t professionally involved in education research: have you ever heard of this dynamic? Even once? I’m betting no, and I would contend that you haven’t for two reasons. One, this is obviously very sensitive for parents of these children. But second, and more importantly, because there’s no policy fix. There’s no lever for a state Department of Education to push to address premature birth. And when there’s no policy lever, we just don’t talk about it. This is obviously not ideal and leads to the situation where individual variation in academic ability goes undiscussed.
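To give a feel for what deficits of this size mean, here is a minimal sketch that converts the quoted effect sizes into percentile positions. It assumes that scores are normally distributed and that both groups have the same spread; those assumptions are mine, for illustration, and are not claims made by the meta-analysis. The function name is hypothetical.

```python
from statistics import NormalDist

# Effect sizes quoted from the meta-analysis above, in standard deviation units.
deficits_sd = {"mathematics": 0.60, "reading": 0.48, "spelling": 0.76}


def percentile_of_shifted_mean(deficit_sd: float) -> float:
    """Percentile (0-100), within the comparison group's distribution, of a
    group average that sits `deficit_sd` standard deviations below the
    comparison group's mean. Assumes normal scores with equal spread."""
    return 100 * NormalDist().cdf(-deficit_sd)


for subject, deficit in deficits_sd.items():
    percentile = percentile_of_shifted_mean(deficit)
    print(f"{subject}: average child at roughly percentile {percentile:.0f}")
```

Under those assumptions, the average very preterm child would land at roughly the 27th percentile of term-born peers in mathematics, the 32nd in reading, and the 22nd in spelling.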

In a document that appears to have disappeared from their website, the neoliberal reform shop RAND Education once stated

Some research suggests that, compared with teachers, individual and family characteristics may have four to eight times the impact on student achievement. But policy discussions focus on teachers because it is arguably easier for public policy to improve teaching than to change students’ personal characteristics or family circumstances.

Nothing but a policy hammer; nothing but policy nails.

The Unspoken Assumption: Variables That Affect Performance in School Are Exogenous in Nature

Late-20th and early-21st-century education debates were dominated by warring camps that nevertheless agreed on one unspoken but massive assumption: that the variables that contribute to a child’s performance in the classroom are predominantly exogenous in nature, that is, that a child’s performance in the classroom is the product of forces outside of that child. From the “reform” crowd came the insistence that poor performance was the result of bad teachers, failing schools, and wicked unions. From the side opposed to the reform movement came the insistence that poor performance was the result of poverty and socioeconomic inequality. From the social justice camp that has developed in the past several years comes the insistence that poor performance is the result of white supremacy. In this, the debate about education in America has been argued between people who share a point of view on a pivotally important question. What has gone undiscussed is the role that every child’s individual academic talent or potential plays in their outcomes.

I say “has gone undiscussed” not because people would openly insist that endogenous factors play no role, but because those factors have had almost no presence in the conversation. Typically, if they’re invoked, they’re dismissed with the repetition of “no excuses.” Privately people may admit that individual talent plays an important role - I know because I’ve asked many of them - but there was and is a code of omerta about this in public debate.

Of course exogenous factors like socioeconomic status have some influence over outcomes. The trouble has been that, without a frank conversation about individual tendencies, there’s no way to intelligently sort what we can control and what we can’t. There’s no way to set realistic goals or apportion responsibility for failure when failure inevitably happens.

The Core Argument: Three Statements in Descending Order of Confidence and Importance

Here are three statements that I have made in the past that go a long way to defining my position on these issues. The first I find essentially indisputable, the second I think is very likely thanks to the first, and the third is informed speculation about the second.

1. Students in essentially all learning contexts sort themselves into a position in the ability spectrum at an early age and stay in that position with remarkable consistency. That is, when we measure learning we find that different students perform at different levels of ability in various quantitative metrics, which creates a hierarchy of performance. What we find is that, with some wiggle and some exceptions, most people don’t move around significantly in that hierarchy of performance over the course of formal education. The students who are high-performing in early childhood education tend to remain high-performing right through college, while the students who are low-performing tend to stay in the same place as well. A remarkable number of interventions consistently fail to move students around in this performance spectrum. We can give kids skills and knowledge that they didn’t previously have. But thousands of years of formal education have not revealed consistent means to change relative performance. Again, you can see this post for many studies and datasets demonstrating this core point.
2. These static outcomes in relative performance over the course of life suggest that there is such a thing as an individual academic tendency or potential, some sort of intrinsic attribute that predisposes an individual to a particular level of ability. The degree to which this level of potential can be exceeded isn’t precisely known and likely varies with particular skills or tasks, but as noted above the remarkably static distribution of performance suggests that potential is sticky. We have personal academic tendencies that influence our performance in metrics of academic success.
3. The most parsimonious explanation for this factor is genes, or more likely and specifically, gene-environment interactions. It seems sensible to believe that individual genetic variation influences cognition in consistent ways that have consequences for learning. There’s a whole vast research literature on behavioral genetics out there; a good primer is Kathryn Paige Harden’s book.

Because discussion of human genetics is fraught, people tend to fixate on the third point. But it’s both the part that I’m least sure of and least interested in. I am much more interested in getting policy types to grapple with point one. Its social consequences are profound. If students have a strong tendency to remain in a given ability band, it means that teachers have been scapegoated for conditions out of their control; it means arbitrary performance standards and one-size-fits-all curricula are counterproductive, even cruel; and it means that the meritocratic ideal will inevitably produce inhumane results. If students have a more-or-less immutable academic potential, it follows that the frantic effort to raise test scores is pointless, that we should focus on nurturing each student so that they can reach the level of their natural potential, and that we should care less about quantitative metrics and more about making schools safe, welcoming, stimulating places where students can discover themselves.

Kids Learn, But That’s Not What Our Culture Actually Cares About

My point has never been “kids can’t learn.” Kids learn all the time. And as Covid learning loss demonstrates, if you artificially restrict children’s access to school, they can fall significantly behind. (We knew that before Covid, for the record, such as with the dismal performance of “virtual charter schools.”) However, while all kids can learn, not all kids can learn to the same level of mastery or at the same speed. My skepticism is not towards the ability of schools to increase student skills and knowledge, but rather about the ability of schools to move students dramatically in the performance hierarchy. It’s tempting to say that all we should care about is learning, not relative performance, and I’m personally on board with that. But, one, academic achievement gaps are inherently questions of relative performance, and if you want to eliminate them, you have to think in relative terms; and two, the meritocratic system that rewards the most academically accomplished is also inherently focused on relative performance. Who gets to go to Stanford is based on who performs better than who in the classroom, and who gets to work at Google depends on who goes to schools like Stanford. Economic reward is handed out based on relative performance, not absolute. As long as this is true, parents and students will find learning in and of itself less important than where they are on the totem pole.

The Difference Between Norm Referencing and Criterion Referencing is Useful for Understanding Our Goals

A useful way to consider this distinction between absolute and relative learning, which keeps asserting itself in this conversation, is by understanding the difference between norm-referenced and criterion-referenced tests. Norm-referenced tests compare test-takers to other test-takers, while tests that are criterion-referenced compare test-takers to some specific level of performance. The SAT is a norm-referenced test; the test developers work hard to ensure the tests return a certain distribution of scores, while test takers really only care about how they score relative to others, as their relative position determines how their score will affect their future. The test to get your driver’s license is a criterion-referenced test; the point is to demonstrate the ability to perform up to a certain threshold on the given construct, and you simply don’t care how you perform relative to peers. When we give people driving tests, we want to ensure that only people who possess a certain minimum of skill at driving are on the roads. There’s a specific ability that needs to be acquired and demonstrated. But when we look at PISA scores to compare countries, when we look at state achievement tests to compare schools and districts, when we look at GRE scores to compare students - this is norm referencing, comparing to the dataset as a whole. When we’re trying to use educational outcomes to decide who gets admission/a scholarship/a job, we’re using norm referencing, at least in concept.

Education rhetoric tends to fixate on criterion referencing - students need to learn X, Y, and Z. In practice education tends to fixate on norm referencing. If we were just concerned with whether kids are learning, we could already declare victory.
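The distinction can be made concrete with a minimal sketch. Every number here (the scores, the cutoff of 75, the uniform 10-point gain) is hypothetical, and the function names are mine; it illustrates the two kinds of scoring and does not describe how any real test is scaled.

```python
from bisect import bisect_left


def criterion_referenced(score: float, cutoff: float) -> bool:
    """Pass/fail against a fixed standard; other test-takers are irrelevant."""
    return score >= cutoff


def norm_referenced(score: float, cohort: list[float]) -> float:
    """Percent of the cohort scoring strictly below `score`."""
    ranked = sorted(cohort)
    return 100 * bisect_left(ranked, score) / len(ranked)


# Hypothetical cohort, then the same cohort after everyone gains 10 points.
cohort = [52, 61, 67, 70, 74, 78, 83, 88, 91, 95]
improved = [s + 10 for s in cohort]

# Follow the student who scored 70 before and 80 after the uniform gain.
before = (criterion_referenced(70, 75), norm_referenced(70, cohort))
after = (criterion_referenced(80, 75), norm_referenced(80, improved))
print(before)  # (False, 30.0)
print(after)   # (True, 30.0)
```

The student goes from failing to passing the fixed standard, while their standing relative to peers does not move at all. That is the sense in which we can win on criterion-referenced terms without changing anything on norm-referenced ones.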

American Kids Are Learning More and Faster

I have in the past been called an “edunihilist” or similar, but this misunderstands my position. My fundamental point is that we have every reason to believe that there will always be a distribution of academic performance, that we will never achieve equality in academic outcomes. (And it would break the system if we did.) This is important because there are profound personal consequences based on a given student’s level and type of academic performance. But students can and do learn, and what’s more, American students have made consistent and significant progress when compared to students of the past. That is, students of similar ages and positions in the ability spectrum can now do much more than their analogs in past decades. From students born in 1957 to students born in 2007, for example, performance in constant terms has improved by an impressive 0.95 SD in math metrics and a more modest but still real 0.20 SD in reading metrics. These gains show up, among other things, in the age at which students learn specific material - for example, decades ago students often first encountered fractions in middle school, whereas now students sometimes learn them as young as age 7. That’s about three or four years of improvement in terms of when those skills are acquired. Of course, difficulties persist. But the point remains that for American K-12 students in general and for most identifiable subpopulations of students we’re seeing greater knowledge and skill at younger ages.

As I have been fairly obsessive in pointing out, the trouble is that these absolute gains do not and cannot “solve our educational problems” because our educational problems are always defined in relative terms. Specifically, to talk of racial and socioeconomic gaps is inherently to define education in relative terms.

To Close Gaps, It’s Not Enough for Some Students to Learn; Other Students Must Learn Less

This is a very basic point, but I find that it’s consistently under-discussed: to close achievement gaps like the racial achievement gap, not only must Black and Hispanic students learn more, white and Asian students must learn less than Black and Hispanic students do over the same period. Closing any gap has to entail the poorly-performing students not just learning but learning at a sufficiently faster pace than the high-performing students that the gap closes. This is not a minor point! American students of all races have been improving over time. But gaps have persisted because… students of all races have been improving over time. As long as white and Asian students learn as much as Black and Hispanic students, the gap cannot close. This is so obvious it feels like it should go without saying, but the point is frequently obscured, for a couple of reasons. First, because “every kid can learn” is a more pleasing and simplistic narrative than “kids from disadvantaged subpopulations can not only learn but can learn sufficiently to close large gaps against competitors who are still learning more themselves.” Second, because the problem suggests a solution that is politically untenable, to put it mildly - to close gaps, we need to prevent the students who are ahead from learning at all.
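The arithmetic behind this is simple enough to sketch. The model below assumes each group gains at a constant yearly rate, which is a simplification of mine, and all of the figures are hypothetical test-score points, not real data.

```python
from typing import Optional


def gap_after(initial_gap: float, rate_ahead: float,
              rate_behind: float, years: int) -> float:
    """Gap between two groups after `years`, given constant yearly gains."""
    return initial_gap + (rate_ahead - rate_behind) * years


def years_to_close(initial_gap: float, rate_ahead: float,
                   rate_behind: float) -> Optional[float]:
    """Years until the gap reaches zero, or None if the trailing group is
    not gaining faster than the leading group."""
    catch_up_rate = rate_behind - rate_ahead
    if catch_up_rate <= 0:
        return None
    return initial_gap / catch_up_rate


# Hypothetical: a 30-point gap, with both groups gaining 2 points per year.
print(gap_after(30, 2, 2, years=10))   # 30: everyone improved, gap unchanged
print(years_to_close(30, 2, 2))        # None: the gap never closes

# Hypothetical: the trailing group gains 3 points per year to the leaders' 2.
print(gap_after(30, 2, 3, years=10))   # 20
print(years_to_close(30, 2, 3))        # 30.0
```

Only the difference between the two rates matters. Raising both rates by the same amount leaves the gap exactly where it was, however much every student is learning.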

The Racial Achievement Gap is Not Easily Decomposed

My guess is that the racial achievement gap is the result of a very large number of variables that each have small individual effect but which in aggregate result in the overall observed difference. Socioeconomic inequality and poverty are very often invoked as the cause of these gaps, and certainly there is a generic economic influence on performance. But the data simply does not support the notion that the racial achievement gap is only economic. When we match students of different races at the same income band there are still consistent gaps between racial categories, though they close somewhat. For example, Asian students whose parents make between $30K-$40K score similarly on the SAT Math to white students whose parents make $70K and up. I think that the urge to reduce the racial achievement gap only to money underestimates how profoundly multivariate racial inequality is in this country; there are innumerable environmental, social, and cultural differences between racial classes that could be influencing academic outcomes. Unfortunately, it’s harder to change many variables of small effect than one variable of great effect. Part of the reason that I’m a big supporter of just giving people money is that parents who have more monetary resources are more likely to be able to change these many variables of small effect than government would be able to directly.

The Gender Reversal in Education Serves as an Example

When talking about the racial achievement gap, it’s useful to consider the reversal in academic performance by gender. Obviously, there are major differences between racial categories and gender categories, but recent trends should at least remind us that major changes are possible in group performance. Historically, girls and women performed behind boys and men in school in almost every domain. But over the course of the past half-century, this reality has been substantially reversed, with girls now at parity or exceeding boys in most educational domains. (You’ll find a comprehensive summary of these changes in Richard Reeves’s recent Of Boys and Men.) Throughout history, many people believed that male academic advantage was biological in nature, but these recent developments demonstrate otherwise. It’s worth noting, however, that there was no specific pedagogical or administrative change that caused women to improve relative to men; rather, improving social conditions for women helped unlock their potential. And within the category of girls/women students, there’s still a performance distribution - students who perform at the highest level and students who perform very poorly. Improving group-level performance can never eliminate within-group differences.

After the Achievement Gap

The racial achievement gap has closed very slowly, despite a lot of effort, but there has been some progress. I suspect that gradually improving environments for Black and Hispanic people are responsible for this growth. And while things aren’t changing nearly fast enough, I do believe that the racial achievement gap will someday close. What I have been asking people to grapple with, particularly in my first book, is what happens after the achievement gap? Say we get the major racial categories to parity on educational metrics, cool. The singular obsession of the educational policy apparatus for 40+ years has been taken care of. What are we left with? We’re still left with a distribution of academic ability, where some students perform two standard deviations above the mean and where some perform two standard deviations below the mean. And those kids at the bottom of the distribution will face genuine challenges in their lives thanks to their poor performance. If there’s anything inherent or intrinsic to academic ability at all, is that racially-equal future any more sensible or humane? The fixation on the racial achievement gap has allowed us to avoid that question for decades, but it’s troubling and important, even existential.

As I said in The Cult of Smart,

There’s a group of students that needs the help of our policy apparatus more than any other group, yet they are rarely if ever mentioned in our policy debates. We know that they’re performing poorly in the classroom and in the working world, yet no one proposes programs to help them. They are systematically shut out of the most coveted colleges, the best-paying jobs, often even out of stable and happy marriages, yet to speak of their plight often invokes incredulity. They are certainly the most disadvantaged subset of our student population that you can name. I’m speaking, of course, of the untalented, those unfortunate enough to lack a natural aptitude for school and the types of intellectual skills that are so essential in today’s economy. They are black and white, male and female, Jew and gentile. They can be found in public schools and private, in the Northeast and the Southwest and everywhere in between. They are, depending on your point of view, anywhere from the bottom quarter to the bottom half of our educational distribution. And they are suffering in a system where the financial and social benefits of academic success are now essential to living a comfortable life.

The inescapable gravity of the achievement gap discourse has obscured this basic question: what do we owe to students at the bottom of the performance distribution after we’ve eliminated racial inequality? How do we help them find economic security and fulfilling lives? Or do we not care at all?

Here’s the Fundamental Conflict

We talk about education as a great leveler, as a promoter of equality, but we also want education to be a system that sorts good students from bad, that establishes a hierarchy of excellence. And that makes no sense. They’re fundamentally incompatible goals. See too talk of educational “equality”: if we established educational equality, there would not be such a thing as excellence. Could not be.

The Standard Account of American Education and Deliverance Through Improving It

In 2017 I described what I call the Official Dogma of Education, which you can read for my overview of the dominant perspective on education in this country. A briefer version would amount to

1. America’s economic problems, and particularly its socioeconomic race gaps, are the product of a bad education system.
2. “Fixing” education will solve these problems and close these gaps.
3. There are effectively no limits on the ability of policy and pedagogy to improve educational outcomes.
4. We must therefore spend money and bring policy pressure to bear on these problems in order to solve them.

This position assumes that we have replicable, scalable, and reliable means to meaningfully move students around in the performance spectrum, and that exogenous factors like poverty either are not influential or can be overcome by what happens in the classroom. The “no excuses” rhetoric that dominated education talk in the 2000s and early 2010s, as well as formal policy like No Child Left Behind, depended on this attitude - that sufficient will and unwavering standards could fix education. They did not.

American Education is Not the Basket Case It’s Made Out to Be

A central misconception regarding American education is that we are a uniquely terrible nation when it comes to schooling. This assumption is not defensible. It’s certainly true that our performance does not look good relative to expenditures, but then school funding is not consistently or simplistically associated with student performance. Overall, I think the evidence is strong that the United States has mediocre mean academic outcomes and that this disappointing average performance is the product of a relatively small number of schools in economically-challenged parts of the country that perform truly terribly. Our median student does alright, not great but alright, but our worst-performing students struggle dramatically compared to the rest of the developed world. Meanwhile, the top-performing American public school students are competitive with those from anywhere; I would put our top 1% or 3% or 5% of students up against those from any country. In events like the International Chemistry Olympiad and the International Mathematical Olympiad, for example, American students have excelled for decades. American high school students go on to flourish in the most elite universities in the world. Our problem isn’t at the top.

The story of American education is not of generically bad or even mediocre results but of extreme inequality. Which is the general American story.

We’ve Never Done Particularly Well in International Comparisons

The narrative of American school failure frequently compares our supposedly-poor performance of today with halcyon days of the past when we were an international leader in education. But as David E. Drew effectively demonstrated in 2012, this is an illusion: we’ve never done well on international educational comparisons. Such comparisons have only really been made in a rigorous fashion since the 1960s, and we’ve done bad-to-middling during that entire period. To the extent that there were “glory days,” they were a vestige of a time when American schools were defined by de jure segregation and where a large percentage of the poorest white students didn’t attend school. (It wasn’t until the 1970s that we first achieved near-universal participation in elementary school.)

You might note that the United States has been the world’s dominant economic, intellectual, and military power during the entire period that we’ve been struggling in international educational comparisons, which says something about how important these comparisons really are.

Education Research is Difficult and Much of it is Low Quality

A few years back, there was a lot of commentary on the replication crisis in psychology research. It’s worth saying that a lot of the research in education has the same problems, and the field has done less grappling with those problems than psychology has. There is some high-quality research. But as I laid out here, there are inherent conditions in school settings that make for profound difficulty in securing permission to study, in randomization, in the temptation to data snoop, and in terms of basic research ethics. Selection effects are everywhere. How does anyone know which studies are good and which are bad? The ones that agree with you are the good ones, obviously. Other than that, you need to rely on the aggregation of many studies over time and on meta-analyses. And you need to stay skeptical, as things often change. Pre-K was the silver bullet in education. Then it wasn’t. Don’t get married to one research finding.

But Educational Assessments are Powerfully Predictive, Highly Reliable, and Valid

Because they consistently report bad news, educational assessments (tests) get a bad rap. They are often asserted to be meritless without evidence, usually by people on the political left. In fact, modern educational assessments are valid, reliable, and predictive. Indeed, they’re the kind of education research we do best. Standardized tests are remarkably effective at predicting future academic performance, and a lot of other things too. Again, this is why people hate them - because they’re too predictive. They predict future performance so well that they make people feel like they’re foreclosing on possibility, that they’re cursing children to failure. But of course the tests can do no such thing. They can’t create inequality. They can only reveal it. And if you believe in the existence of racial inequality (economic, social, legal) then the racial stratification of standardized tests is exactly what you should expect.

Fads Come and Go

For a few years there, “grit” was everything. Then it wasn’t. This is a repetitive condition in education policy. Some new fad will look good in limited research with samples of dubious randomization, the fad will be implemented in real schools, gaps will fail to close, and we’ll move on to the next thing. Rinse, repeat.

College Exacerbates Inequality Rather Than Reduces It

Getting kids into college is frequently assumed to be the goal of all of these efforts, and college is represented as a potential great leveler in our society. But it’s difficult to understand how that would work. There are many benefits that education bestows on students, but the market benefits are inherently influenced by supply and demand. We know empirically that to a remarkable degree the college wage premium is a function of how many people have degrees compared to the number of available jobs. If you dramatically increase the number of college degrees, the people who have them will compete against each other and the value of those degrees will plummet, which is counterproductive. Today, when college degrees remain relatively rare, college education creates a class of people who have higher incomes than the norm, which is the opposite of promoting equality.

A good book-length discussion of how this has functioned in American education is The Education Trap by Cristina Viviana Groeger, which I reviewed here. As Groeger documents, the natural tendency is for those with more education to become a kind of organic cartel, using their credentials to raise their own wages, depress those of others, and defend the advantages of incumbency. Which is all perfectly predictable from first principles.

Should more people go to trade school rather than college? Probably. Are trade schools a panacea? They are not.

A More Educated Populace is Not Necessarily Richer or More Equal

The most common justification for our education reform ideology is that improving education will reduce poverty and inequality. But as Matt Bruenig has been documenting for years, there’s very little reason to believe that this is true. In the half-century between 1960 and 2010, we became a vastly better-educated country. But the working-age poverty rate, which I would argue is the relevant metric for this question, actually modestly increased. As measured by the Gini coefficient, inequality rose dramatically. If education lowers poverty and inequality… why hasn’t becoming much more educated improved the United States’s quantitative measures of poverty and inequality?

Fund Schools to Fund Them

As long as funding for schools and their programs is justified through the desire to improve quantitative academic metrics, those schools and programs will be vulnerable, as our ability to change those metrics is limited. So fund programs for the social and humanitarian good they do. Free school lunch doesn’t do much for test scores, so fund free school lunch because it’s the moral thing to do. Afterschool programs don’t do much for test scores, so fund them to give kids safe and nurturing places to go while their parents are at work. And chess lessons don’t improve grades, music lessons don’t improve test scores, and all manner of good things don’t improve test scores or grades. So fund them for their intrinsic value to students, not to juice quantitative metrics.

Giving People Money Works

This country had a crisis of poverty among the elderly and disabled. So we instituted Social Security, and those poverty rates fell off a cliff. Giving people money is simple to implement and effective at improving quality of life. Trying to improve people’s economic prospects through education is undertheorized, has delayed benefits under the most optimistic scenario, and has thus far been a series of failures. Why not give parents money so that they can improve the living conditions of their kids, and as a bonus maybe test scores will improve? Why not throw your political muscle behind redistribution?

We Need to Figure Out What We Actually Want

Again and again with schooling, what you’ll find is people with very strong views on education who nevertheless don’t actually know what they want. A few years back there was a big New York Times piece on school choice in NYC, and it reflected an incoherent perspective on what schooling is and is for. This failure to understand basic conceptual questions - relative vs. absolute, education for equality or education to foster excellence, college as leveler or college as cartel - ruins our ability to make progress. It’s time for everyone to go back to basics.

A line from one of the old pieces I linked to here that I should have lifted - mobility is necessarily antagonistic to equality.
