Sentences to ponder

When it comes to grading, Republican and Democratic professors at one unnamed elite university put their ideologies into practice, a new study finds: Republicans welcomed inequality, handing out more very high and very low grades, and Democrats’ grades grouped more tightly around the average.

Republicans also gave black students lower grades than their colleagues. In both cases, the researchers stressed, there was no way to know which approach better reflected students’ performance.

From Christopher Shea, here is more, including a link to the paper.


From reading the WSJ summary (I didn't read the actual paper), it doesn't look like they controlled for subject area. But "Econ professors assign grades according to different curves than English professors" is a less exciting headline.

If we don't know the absolute performance of the students, how do we determine who is grading ideologically?

Suppose all professors are perfectly color-blind and assign the same rank-order to any group of students. But Republican professors translate those ranks into a grade distribution that has a lower mean and a higher variance than their Democratic colleagues. If the average black student is ranked below the median by all professors, then this would generate the race result in the paper (a negative coefficient on republican_professor*black_student in a grade regression), and yet be completely contrary to the interpretation the authors provide (student characteristics causally affect grades). I only skimmed the paper, so I might have missed something, but it doesn't appear that the authors deal with this problem.
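To make the argument above concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the eight latent z-scores, the group labels, and the two grading curves (`CURVES`) are made-up numbers, not figures from the paper. Race never enters the `grade` function, yet the difference-in-differences (which equals the republican_professor*black_student coefficient in a saturated 2x2 regression) still comes out negative.

```python
from statistics import mean

# Hypothetical latent performance (z-scores) that every professor ranks identically.
# Group labels and numbers are invented for illustration; they are not from the paper.
students = [
    ("white", 1.5), ("white", 0.5), ("white", 0.0), ("white", -0.5),
    ("black", 0.5), ("black", -0.5), ("black", -1.0), ("black", -1.5),
]

# Assumed curves as (mean grade, spread): Republicans use a lower mean, wider spread.
CURVES = {"democrat": (3.0, 0.4), "republican": (2.8, 0.8)}


def grade(z, mean_grade, spread):
    """Map a latent z-score to a grade on a 0-4 scale; race is never used."""
    return max(0.0, min(4.0, mean_grade + spread * z))


def group_mean(party, race):
    """Average grade a professor of the given party assigns to a group."""
    mean_grade, spread = CURVES[party]
    return mean(grade(z, mean_grade, spread) for r, z in students if r == race)


def interaction():
    """Difference-in-differences: (black - white gap under Republicans)
    minus (black - white gap under Democrats)."""
    gap_rep = group_mean("republican", "black") - group_mean("republican", "white")
    gap_dem = group_mean("democrat", "black") - group_mean("democrat", "white")
    return gap_rep - gap_dem


print(round(interaction(), 3))  # negative, although grading is color-blind
```

With these assumed numbers the gap is simply the difference in spreads times the difference in group mean z-scores, so any design that cannot separate "wider curve" from "treats groups differently" would report the same sign.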

Very good point. Maybe the finding is just "black students tend to fall in one tail, and Republicans assign more extreme grades to the tails."

I agree. Read the paper and thought the same thing. They do not seem to control for that. It doesn't appear to even be mentioned in the paper.

Given the average grade of a black student in this study is about 2.5 (Figure 4) and Republican professors grade much more harshly on the lower tail of the distribution, this seems like the most plausible effect.

Sure they did. It is right there in the blurb. They stress they do not know which method more accurately reflected student performance. So... they are not even alleging that student characteristics affect the results. It merely follows as a consequence of their original result.

This wouldn't work for the kurtosis finding, but with the race finding you might be able to get some analytical leverage by doing three-way interactions between faculty party id, student race, and a dummy for blind grading.

Wow, that tells us nothing. Exactly what ideology is in play? Are black students being graded differently due to the fact all Democrats are subtly practicing affirmative action, or because all Republicans are racists? Both? Neither?
They're adding an awful lot of narrative to a very small set of numbers.

Grading does require mapping some fuzzy observations (e.g. an essay) onto a numerical scale. Is it possible that some people have an internal function that is more non-linear or more variance maximizing?

Jason B,

They include "department fixed effects" in some of their models, which means they not only control for major (eg, economics) but major-school (eg, economics at George Mason). Similarly they include student fixed-effects in those models that don't have student-level traits entered additively.

The paper is not yet released to the public, so of course I cannot comment on it. But one obvious caution: generally, arts subjects score on a more grouped grading curve than science subjects (for obvious reasons: there is usually no "correct answer" in many arts subjects, and so any good-faith attempt is likely to score higher than a wrong science answer). I am sure this is the kind of thing this level of competent research would have accounted for, but I thought it was worth raising given it was not included in the small linked summary (and also that it is the kind of mistake politicians / journalists routinely make in my experience).

Interesting. Just don't stereotype. I consider myself pretty liberal but I also like to spread the grades out. (And I'd support a universal grading curve to force it. My thinking is the whole point in grades is to rank students against each other, to help others determine who should receive finite opportunities like jobs, admissions, and scholarships.)

You like to spread the grades out!? Why not let the students get the grade they earn? You are already using a curve! What if you have a tutorial with 6 (out of, say, 18) exceptional students. Will three of them be bumped down to a B+ to accommodate your already-built-in bias that grades should be spread out?

I should have been more clear. I spread the grades out *on course assignments*. The final grades are assigned according to social norms.

You should realize there is a curve implicit in all grading. I can write an exam on which all students score below 10%, and I can write an exam on which all students score above 90% (at least I could do so at the place where I taught). The curve is set by the difficulty of the exam, if it is not set when raw scores are converted to grades.

Creating exams and assignments that spread scores out is a tool for making sure that students who score higher than other students do so because they in fact know more or performed better in a meaningful way.

Grading on a curve is one of the surest signs of a poor teacher. If you can't teach, and you can't write good tests, then just toss the students a bone to disguise your incompetence or laziness.

The purpose of teaching is to impart knowledge, and the purpose of grading is to measure how well the students received and retained the learning objectives. So a student's grade is relative ONLY to the standards of the course, NOT relative to other students. Every student should be able to earn an A. Every student should fail if none of them learned anything.

A class full of A's doesn't necessarily mean the teacher is "Santa Claus". Sometimes the teacher just ROCKS and everyone learned well. But a class full of students whose unadjusted scores are all F's clearly indicates a professor who cannot teach or objectives that were too demanding!

Exam writing is one of the least appreciated skills of a good teacher. Exams should adequately assess learning objectives and produce a NATURAL curve by properly weighting the difficult and easy questions. Most professors, especially the liberal ones, are just too lazy to do this, so they hide their incompetence with a curve.

I am not sure that what you describe is the only goal of testing. Another important goal is matching for jobs, which are often a scarce resource. In this sense some sort of ordering relation is essential. So no matter how clustered a group is, it might sometimes make sense to make more of an effort to deduce an ordering, to signal to potential recruiters who is better than whom.

Every year's students are different so a pure ordering would make it impossible for an employer to compare applicants who took the class in different years. It's more likely that an employer will be judging two applicants who took the same course in different years than two applicants who sat in the same exact classroom. An objective grading system allows employers to compare a wider variety of applicants.

Also jobs aren't resources, but that's a different matter.

See my response to the post above.

Your attitude is very common, I know, but I really think it is flawed. Try to think it all the way through. You say the grade should be relative to the standards of the course. Well, what sets the standards of the course? We can take a very concrete and objective subject, like mathematics. Why did your algebra course teach algebra at the pace it did, as opposed to a slower pace (covering less over the year) or a faster pace (covering more over the year)? Why were you given problems as challenging as you were, as opposed to more or less challenging problems? All of these things are subjective choices made by your teacher, and they affected your numerical score in the end. You were graded on a curve, like it or not, whether your teacher intended it or not.

A universal curve would just make this explicit. It's not perfect, because different institutions will still present different curricula. But it helps eliminate some of the uncertainty of outsiders ranking students against each other.

There are very few professors who are Republican, VERY few.
The university system was taken over beginning with BOAS in 1915, as well as by the Frankfurt school of sociology.

Why do you cite (Franz, right?) Boas here? thanks.

Republicans welcomed reality, handing out more very high and very low grades, and Democrats feared hurting students’ feelings, so their grades grouped more tightly around the average.

Also, the reality that black people aren't as smart as whites?

There are a host of reasons black students might not do as well as white ones. Also, grades =/= intelligence.

You should have read the second post. It is unknown whether the "reality" of student performance and grades is such that more students should receive very high and very low grades, or less. Indeed I don't think there even is an objective reality to speak of; it all depends on what a grade *means*, which is not absolute but cultural. (By this I mean, does an A mean you're in the top 5%, 10%, 20%, or what? If you don't think of grades like this, does an A mean you learned X amount of material, Y, or Z? These are arbitrary standards, though of course we can try to make them universal or approximately so.)

further on in the WSJ article: "Among eleven black professors in the sample, there were no Republicans, and the Democrats appeared to grade white and black students as their white-Democratic peers did. But there were too few black professors to make that finding statistically significant"

As in every other thread on this site that explores how economists (or economics professors) are measured in studies to determine their biases, we've established only one thing... that they, like members of every other walk of life, are human. Why the surprise?

Don't many professors grade blind? That should make it possible to determine whether bias plays a role.

Where did they find a Republican in academia to include in the study?

It's not a joke -- most non-hard science departments have NO Republicans.

Is that more telling about the professors or the Republicans?

Partly it is telling more about academia and the comparative advantages of the different personalities that gravitate to various majors and ideologies.

Same as most hard science departments.

At least in the 200 person organic chem courses I teach at an "elite" (and liberal) U, the TAs do the grading, under the prof's direction. Everybody with the same total score gets the same grade. I would have to work hard to incorporate race into the grade boundaries. Racial disparities in grading in small vs. large classes should be where to look for this effect if it exists.

well this study tells nothing new: republicans are racists, that's a fact.

Quiet tibu, the adults are having a discussion.

No, it's a hypothesis. The null hypothesis being that genetics are racist.

(not true of course, the null hypothesis is chance, but there are many other hypotheses.)

uuhhh....this is not the playroom?....oopppss.....sorry!!!

As a grad student, I TA'd for a large-lecture survey course for non-majors. I remember showing them the curve after an exam, which had a clear Gaussian shape with a peak around 60% and a SD of around 20%. I pointed out to the students that this was what we liked to see, because it meant the exam had good differentiation power across the whole spectrum of students. There weren't large numbers of students bumping up against 0% or 100%, nor was everyone bunched up near one score. The students were not receptive to my analysis.

if they truly grade on a curve, then the final grades should be the same regardless of the width of the histogram. My hypothesis is that Republican professors tend to understand statistics better.

Engineering professors tended to use the curve properly, at least moreso than my prior experiences. They also tend to be more conservative. My hardest ass professor I later ran into at a campaign speech for a district attorney running for higher office.

As in, ideology may be the driving force of major selection, and thus success (as professors are basically going to grade you on how similar you are to themselves).

fwiw, figure 2 of the paper shows that there is a closer relationship between SAT and grades for Republican profs than for Democrat profs. basically, the highest vs lowest SAT scores tends to be worth about a full grade to a Republican but only half a grade to a Democrat.

Report all grades on transcripts as Z-scores. Problem solved ;)
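For what the Z-score suggestion would look like in practice, here is a small sketch. The two grade lists are invented, and `z_scores` is a hypothetical helper, not anything from the paper. The point it illustrates is that a tight curve and a wide curve over the same rank-order collapse to the same standardized scores.

```python
from statistics import mean, pstdev


def z_scores(grades):
    """Standardize grades within one class: (grade - class mean) / class SD."""
    mu = mean(grades)
    sigma = pstdev(grades)
    if sigma == 0:
        # Everyone got the same grade, so there is nothing to differentiate.
        return [0.0 for _ in grades]
    return [(g - mu) / sigma for g in grades]


# Two hypothetical professors grading the same five students in the same order.
tight = [3.0, 3.2, 3.4, 3.6, 3.8]  # grades bunched near the average
wide = [1.0, 1.7, 2.4, 3.1, 3.8]   # more very high and very low grades

for a, b in zip(z_scores(tight), z_scores(wide)):
    print(round(a, 3), round(b, 3))  # the two columns match
```

This only removes differences in mean and spread. It does nothing about differences in the rank-order itself, and it assumes each class has roughly the same mix of talent.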

I think the implication here is that there is discretion in grading. My engineering professors mostly used ID#s on tests. Also, the calculations left little subjectivity. Additionally, I can hardly imagine them having the time to worry about individualizing grades. I never understood the purpose of grading on a curve until I got into later engineering courses. The engineering professors graded on the curve for the reasons cited above by others, but also because they didn't know what grade a class deserved, so they fell back on the law of large numbers and the assumption that each class had about the same talent as last year's class. Basically, the teachers chose NOT to judge the absolute talent of the class and used the individual members' competition to judge the RELATIVE talent. I imagine social science professors, who are equally humble but charged with more subjectivity in subject matter, fall back on less differentiation to make up for their lack of omniscience. They also happen to be more Democrat.

Notice the different implications of these statements which say the exact same thing:

"Republicans gave black students lower grades than their colleagues"

"Democrats gave black students higher grades than their colleagues"

Your second statement is too strong. A better one might be:

"Democrats did not give black students lower grades than their colleagues"

I think this thread needs more comments seconding the attitude of the very first. There is much that needs to be controlled, and I can't see the paper to see if they were controlled.

First is the issue of field of study. One might argue that Economics professors (say) give out more diverse grades than English professors (say) because they tend to have a higher count of Republicans. But there is more at play; for example average class size is very different between these fields, and I imagine psychology is such that one tends to give out more diverse grades to students enrolled in larger classes.

Also of importance is the type of course. I'd bet that instructors of intro-level courses assign more diverse grades than instructors of advanced courses. But the fraction of students taking intro-level courses, vs advanced courses, varies from one field of study to another. Also, it would not surprise me even if there were a correlation between political affiliation and teaching intro courses. (I don't have a reason to suggest that, but I'm open to subtle issues of psychology and self-selection.)

Were there enough Republican professors in the survey to constitute a valid statistical sample? Might be hard to obtain at some institutions of higher learning.

The consequence of greater inequality though is to lump larger numbers near the median and lose discriminating power over the majority, while the consequence of less inequality is to make finer gradations between most at the cost of losing resolution at the extremities (unless the median is so high they all end up lumped together). This raises the question of whether the focus should be on the extremities or on the center. It really depends on what you are looking for.

I was lucky enough to go to school before white people liked black people.

Walter Williams

Out of curiosity, who here has actually graded college courses?

I did it quite a bit as a TA. I had no control over the assignments and often little guidance on grading standards. It was frustrating. I remember having to grade a group project where every student in the group received the same grade (as per the professor's instructions). Some groups had 2 excellent students and two bad students. Some had four mediocre students. Do I average across students within a group? force the good students to suffer for having bad members? let the bad members coast on the good students? How about the fact that they got to pick the project? How do you rank someone who does a great job with a terrible idea to someone with a great (but difficult) idea that falters?

Regarding race, that same group project included presentations. I was instructed to incorporate their communication skills into their presentation grade. I remember listening to black kids who said things like "aks," not "ask." That's always sounded wrong to me and gave me a negative perception of the quality of communication. I remember debating with myself, "Do I give them a lower grade on communication skills? or do I take into consideration the fact that they've likely spent a vast majority of the time surrounded by a linguistic group that considers this the normal pronunciation?" One could have the same debate about "neesh" vs. "nitch" or a number of other words where the pronunciation differs across regions and different socio-economic groups. The one thing I realized is that I did indeed consider saying "aks" instead of "ask" to be more inferior than I did someone who said "nitch" instead of "neesh." Was it because the pronunciation of the first word tends to differ amongst racial lines, whereas in the second word, it doesn't? Was that racist? Would ignoring it constitute a form of affirmative action? Do I somehow try to guess the background of each student and judge them based on what I think is "normal" for their region, race, and socio-economic standing? Was it fair to use my own preferences for "proper" pronunciation as an objective standard when the dictionary considers both to be valid?

I also found that my mood, personal workload, and time constraints could easily alter my grading standards. This was especially a problem if assignments weren't graded in one sitting. My pre-dinner grades would sometimes differ from my post-dinner grades. Or, as you go through the work, you hit a long patch of bad grades, and try as you might, your standards ease up a bit, then you hit a bunch of good papers and realize you didn't leave enough room at the top of the distribution to account for the disparity amongst the good works, go back, and regrade everything...become exhausted, impatient, etc. Heck, one time, a lazy prof dropped 1,200 quizzes in my mailbox with a 24-hour turnaround time to grade them all (daily quizzes for an entire semester for a class of 60 people, that remained ungraded until the day before the university deadline for final grades). There is no doubt that by the time I got to quiz 987, I was in an entirely different frame of mind than when I graded quiz 8.

With math problems, it's common for the answer of part (a) to be an input in part (b). If a student messes up on part (a), then even with perfect work, part (b) will be wrong. Grading fairly sometimes requires re-doing the question using the wrong part (a) answer to see if the student did (b) properly, to avoid punishing a kid twice for one mistake. Merely looking at the final answer doesn't work (and when you're grading 1,200 quizzes in 24 hours, it's quite tempting to just grade based on the final answer and avoid re-solving the same question over and over again). And how do you compare someone who set up the equations correctly, but messed up basic algebra and simple derivatives, to someone who entirely messed up the setup of the problem, but got all their derivatives and algebra right?
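A toy version of that "re-do part (b) from the student's own part (a)" rule, as a sketch only. The problem itself (`part_b`), the point values, and the all-or-nothing scoring are hypothetical simplifications. Real partial credit is messier, which is rather the point of the paragraph above.

```python
def part_b(a_value):
    """Hypothetical part (b): takes the answer to part (a) as its input."""
    return 2 * a_value + 5


def grade_two_part(student_a, student_b, correct_a, points_each=5):
    """Score (a) against the key, and (b) against what the student's own
    (a) answer implies, so one mistake is not punished twice."""
    score_a = points_each if student_a == correct_a else 0
    # Re-solve (b) starting from the student's (a), right or wrong.
    expected_b = part_b(student_a)
    score_b = points_each if student_b == expected_b else 0
    return score_a + score_b


CORRECT_A = 12  # assumed answer key for part (a)

print(grade_two_part(12, 29, CORRECT_A))  # both parts right: 10
print(grade_two_part(14, 33, CORRECT_A))  # (a) wrong, (b) consistent with it: 5
print(grade_two_part(14, 30, CORRECT_A))  # (a) wrong, (b) also wrong: 0
```

Grading by final answer alone would give the second student zero on part (b), which is the double penalty described above.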

Point being, I think people who have never graded assume it's a simple task, easy to do fairly and accurately. In reality, it's not fun at all and surprisingly frustrating and challenging. There is a reason so many professors dump the task onto their TAs.

Specifically regarding distributions: there were days I'd say, "Ok, you did good: A. You didn't: C" and days I'd say, "You did good, but I've seen better, so B+, and you did bad, but I've seen worse, B-." It'd only be at the end of the semester when adding up total grades that I'd realize that some days I'd have a pretty black-or-white mentality to answers, and other days, I'd allow for shades of gray, the former leading to a wide distribution (often bi-modal) and the latter, a pretty tight grouping. Neither was intentional. It just depended on my level of patience and exhaustion on that particular day. It was the first year of a Ph.D. program. Oftentimes one only gets a few hours of sleep at night for days, if not weeks, on end. That affects mood, which in turn affects grading, especially for young grad students who do not have years of experience with these grading issues.

On the second "which approach better reflected students’ performance" issue, Heckman's 2007 PNAS paper notes that "Controlling for ability, minorities are more likely to attend college than others".

