Federal college rankings: who are they for?

Dec 30 2014

Before the holiday, the Department of Education circulated a draft prospectus of the new college rankings they hope to release next year. That afternoon, I wrote a somewhat dyspeptic post on the way that these rankings, like all rankings, will inevitably be gamed. But it’s probably better to bury that and instead point out a couple of looming problems with the system we may be working under soon. The first is that the audience for these rankings is unresolved in a very problematic way; the second is that altogether too much weight is placed on a regression model solving every objection that has been raised. Finally, I’ll lay out my “constructive” solution for salvaging something out of this, which is that rather than use a three-tiered “excellent” - “adequate” - “needs improvement” system, everyone would be better served if we switched to a two-tiered “Good”/“Needs Improvement” system. Since this is sort of long, I’ll break it up into three posts: the first is below.

The first problem is that the draft document is unclear at its core about who the audience for these rankings is going to be. In the imagination of the authors, it seems most often to be a family sitting around their kitchen table, deciding which college offers the best “value.” The report has the germ of a ranking system designed to assess that value, using really extraordinary data from every student who’s ever taken out a loan to build a large regression model predicting “success.” (More about that below.)

As a ranking for prospective students, it’s possible to _imagine_ something useful coming out of this. The vision driving it seems to be, roughly, an antidote to US News and World Report that doesn’t include endowment size or library or alumni giving or any of the things designed to keep the same five schools at the head of the ranking. It’s essentially the federal government adding its voice to the massive chorus saying “for God’s sake, don’t pay full tuition at USC when you have in-state at UCLA.” Looking at the criteria, it seems like we’ll probably get a school like the Air Force Academy (mixed socioeconomic applicants, no tuition/debts, universal employment) as #1 overall instead of Yale. It wouldn’t be bad to have a ranking out there like that, so long as we’re clear that it’s just about some minimum threshold of employability and postgraduate debt.

But there’s another set of language that keeps creeping in, about “rewarding” colleges for the good they do. This probably comes out of the constituent meetings. The greatest beneficiaries are a different audience, the deans, who want to protect the socially valuable parts of the mission from the econometricians. (Or, more cynically, who want more metrics to demonstrate the successes of their initiatives in the cover letter for their next deanship.) So the report spends pages on the boost colleges will get for having large numbers of Pell Grant awardees enrolled, because it would be unfair to punish a college for having a large number of low-income or first-generation students. In fact, a number of these people, including myself, have been screaming bloody murder about the ways that a purely value-based ranking would reinforce existing socioeconomic disparities because of things like the gender wage gap.
The goal of this audience here is to use the rankings as tools to make colleges behave more the way that Education Department stakeholders think they should. If Harvard had more poor students, the country would be better off. Therefore, we should change the rankings to reward Harvard for having fewer rich kids.

The problem is that these two goals are deeply irreconcilable. No individual student (at least not the rational-actor, wage-oriented automaton that the report assumes all prospective applicants are) benefits from those systemic alignments. If low-income Hunter College graduates do better than Lehman College graduates but Lehman has more low-income students, it would be irresponsible for the government to tell students to go to Lehman just because its mission is more socially important. But that’s one of the things the proposal mulls over. (This is in addition to using income metrics as inputs into the magical model, below.)

The eagerness to build “movement” into the model is similarly baroque. Any complex system will show statistically significant movement from year to year; but to flag those movements in the ranking, with the implication that they’re likely to continue, is a disservice both to students and to the next set of deans, who will have to sustain whatever bubble of juked stats the previous ones set up, whether it benefits their mission or not.

It seems likely that this tension will eventually tear the reports apart into two completely separate rankings: one for students, and one for administrators. (The cleavage is already taking place in the report.) The student one will then have all of the flaws of “punishing” socially useful institutions that the report worried about; and to cover those up, they’re already insisting that this is a “rating,” rather than a ranking. My suspicion is the ultimate release will be so occluded in variable measures that it will fall like a lead balloon. Or, to use a more relevant metaphor, it will fall like the 2010 NRC graduate program rankings. Those were smothered in caveats, regression models, and error bands. Any sensible consumer will turn right back to the regularly updated US News graduate program rankings, done with a straight reputational survey. Which is, by the way, far from the worst way to handle these things. While they tend to conservatism, reputational surveys are much harder to game than data-based ones, and much more understanding of particular institutional dynamics. And conservatism in ranking isn’t a bad thing. If I remember correctly, the NRC reports ultimately scored schools (on one of the two metrics) not on their reputations but on what their predicted reputations would be based on their statistical profile: so if you ran a highly regarded program at a university with a tiny library, the ranking algorithm nudged your score back down.

But if these reports do somehow produce a ranking that manages to be clear, it will inevitably be misleading to one of the two groups. Perhaps the intention of the department is to fudge the ratings just enough that they’ll still look credible to students, while also nudging Harvard to admit more poor students. But this is a difficult balancing act to pull off, and it relies on a complicated ranking model that could put the distortionary effects of the US News ranking to shame. So that’s what the next post is about.