Considerations: grading

In yesterday's posting, I explained why I am unhappy with our system of numerical grading. But I do have to live with this system. So I thought I would pose a few questions about how to use it as responsibly as possible.

1. I have noticed that different faculty convert from a 100-point (or percentage) grading scale to our 4.0 scale in different ways. Is this a problem? Students get upset: a 93% may earn them a 3.25 in one class, a 3.5 in another, a 3.75 in a third, and perhaps even a 4.0 in a fourth. The defense I heard from a math professor is that it does not matter that different professors align their scales differently, because some professors make their exams so hard that a 93% does indicate a remarkable achievement warranting the top grade of our grading scale (4.0), while others align their expectations a bit differently.

2. My own solution is to avoid all conversions altogether. I grade everything on the 4.0 scale, and then just average these grades (using weighted averages as appropriate). Here's how to round to .25 intervals: you take the raw averaged grade, multiply by 4, round this number to the nearest whole number, and divide by 4. It's easy to make a spreadsheet that does all of this for you.
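For readers who prefer code to a spreadsheet, the rounding step described above might look something like the following Python sketch. The function names, the sample grades, and the 30/30/40 weighting are made up for illustration. The sketch also assumes that an average falling exactly halfway between two grades should round up; Python's built-in round() sends exact halves to the nearest even number instead, so floor(x + 0.5) is used here.

```python
import math


def round_to_interval(grade, interval=0.25):
    """Round a grade to the nearest multiple of `interval` (exact halves round up)."""
    # For interval=0.25 this is the "multiply by 4, round, divide by 4" recipe.
    steps = grade / interval
    return math.floor(steps + 0.5) * interval


def weighted_average(grades, weights):
    """Weighted mean of grades that are already on the 4.0 scale."""
    return sum(g * w for g, w in zip(grades, weights)) / sum(weights)


# Hypothetical course: two papers worth 30% each and a final worth 40%.
grades = [3.5, 3.0, 3.75]
weights = [30, 30, 40]

raw = weighted_average(grades, weights)   # the raw averaged grade
final = round_to_interval(raw)            # the grade reported on the .25 scale
print(raw, final)
```

Passing interval=0.5 or interval=1.0 to the same function gives the coarser roundings discussed in question 3.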

3. My system of grading raises this question: Which is more appropriate for grading within the course, before the final averaging and rounding: (a) use only the .25-interval grades, (b) use even finer gradations (e.g., .125-interval grades), or (c) use coarser intervals (.5-interval grades, or even just the whole-number grades, since these are the only ones whose meanings are defined: excellent, good, satisfactory, low-pass, and fail)? Or does it not matter? (Mathematically, do the averages mean something different in these three scenarios?)

I hope someone can answer question 3 with a convincing and mathematically well-grounded rationale.

The following is adapted from my web page on grading.

In 2005, St. Lawrence changed its grading system, from the grades of 4.0, 3.5, 3.0, 2.5, etc. (0.5-interval grading) to grades of 4.0, 3.75, 3.5, 3.25, 3.0, 2.75, etc. (0.25-interval grading).

It was the students who wanted the finer gradations. They said that they wanted the grades to more accurately reflect their performance in their courses. The faculty passed this proposal (but not without debate, and not without some faculty arguing in a very different direction). The new grading system went into effect in the 2005-2006 academic year.

It is interesting to note that this change is not really just a refinement of an existing system. The two systems are in fact different enough that it is inappropriate to think that the Grade Point Averages (GPAs) computed in each system can be directly compared.

The following table shows how GPAs do in fact change if you round grades in different ways. It uses the kinds of rounding historically used at St. Lawrence University; the last row of the table is the average (the GPA) of the four grades above it. Other schools often use +/- systems, which numerically convert to grades such as 3.3, 3.7, 4.0. What my little table shows is that it is dubious to compare GPAs on 4.0 grading scales if the systems of rounding are different.

	Actual	.25 Rnd	.5 Rnd	Whole-number Rnd
Grade 1	3.35	3.25	3.5	3
Grade 2	2.8	2.75	3	3
Grade 3	3.6	3.5	3.5	4
Grade 4	2.3	2.25	2.5	2
Average (GPA)	3.0125	2.9375	3.125	3

Note that not every set of grades would round down on .25 intervals and up on .5 intervals. What is interesting is just that the GPAs are different. Two students with the same grades would have different GPAs depending on whether they started before or after the change in grading intervals – and yet those students who came in the midst of the change have both kinds of grades averaged together, as if averaging these incommensurable scales were legitimate!
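The figures in the table above can be checked with a few lines of Python. This is only a sketch: the helper name is invented, and it assumes each grade is rounded to the nearest interval (exact halves up) before the four grades are averaged with equal weight.

```python
import math
from statistics import mean


def round_to_interval(grade, interval):
    """Round a grade to the nearest multiple of `interval` (exact halves round up)."""
    return math.floor(grade / interval + 0.5) * interval


# The four raw grades from the table.
raw_grades = [3.35, 2.8, 3.6, 2.3]

# The three rounding systems compared in the table.
intervals = {".25": 0.25, ".5": 0.5, "whole": 1.0}

gpas = {"actual": mean(raw_grades)}
for label, interval in intervals.items():
    rounded = [round_to_interval(g, interval) for g in raw_grades]
    gpas[label] = mean(rounded)

for label, gpa in gpas.items():
    print(label, gpa)
```

Swapping in a different list for raw_grades is a quick way to see that the direction of the difference depends on the particular grades, while the existence of a difference generally does not.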

Here is a table showing what happens when grades from both systems are averaged together. Imagine five hypothetical students who happen to get exactly the same raw grades in their courses every year (the grades from the above table) -- but who are at different points in their time here when the grading system changes. Student 1 is graded entirely under the old .5-interval system, Student 5 entirely under the new .25-interval system, and Students 2 through 4 switch partway through. This table shows the differences in their final GPAs at the end of their four years (the yearly GPAs are taken from the table above):

	Yr 1 GPA	Yr 2 GPA	Yr 3 GPA	Yr 4 GPA	Final GPA
Student 1	3.125	3.125	3.125	3.125	3.125
Student 2	3.125	3.125	3.125	2.9375	3.078125
Student 3	3.125	3.125	2.9375	2.9375	3.03125
Student 4	3.125	2.9375	2.9375	2.9375	2.984375
Student 5	2.9375	2.9375	2.9375	2.9375	2.9375

In this case, the student lucky enough to have arrived before the change has the highest GPA. The student unlucky enough to have spent all four years under the new grading system has the lowest. Again, it is not the case that this change results in lower GPAs for all students -- the point is that the very same raw grades average out to different GPAs depending on the grading system. Worse, these GPAs are then compared to those of students from schools that may use the .3/.7 intervals (plus/minus grading) -- an altogether different system, but because it is also a 4.0 scale we think it is essentially the same!
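The final GPAs in the five-student table can be reproduced with the short Python sketch below. It takes the two yearly GPAs from the earlier table and assumes, as the table implicitly does, that every year counts equally toward the final average; the names are invented for the example.

```python
OLD_GPA = 3.125    # yearly GPA when the raw grades are rounded to .5 intervals
NEW_GPA = 2.9375   # yearly GPA when the same raw grades are rounded to .25 intervals
YEARS = 4


def final_gpa(years_under_old):
    """Average four yearly GPAs, the first `years_under_old` of them under the old system."""
    yearly = [OLD_GPA] * years_under_old + [NEW_GPA] * (YEARS - years_under_old)
    return sum(yearly) / len(yearly)


# Student 1 spends all four years under the old system, Student 5 none.
finals = {f"Student {i + 1}": final_gpa(YEARS - i) for i in range(YEARS + 1)}

for student, gpa in finals.items():
    print(student, gpa)
```

Each additional year under the new system moves the final GPA by the same fixed amount, (3.125 - 2.9375) / 4 = 0.046875, even though the underlying raw grades never change.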

We place a lot of faith in these numbers, yet they are not in fact even computed in a mathematically responsible way. Can we really say that the GPA has a stable and unambiguous meaning?

I wish we could switch to a system of grading that does not convert grades to numbers. My favorite option is to use a high-pass, pass, fail system. The "high pass" would be a grade specifically to indicate that the student did so well that you regard that student as having graduate school potential. After all, the main distinctions we wish to make are whether the student should pass, and whether the student has worked with the material so well that you would recommend them for more advanced study in that field.

But I would also be content with a return to A, B, C, D, F grading: no pluses or minuses, and no attempt to convert these grades to numbers. The meanings here are excellent, good, satisfactory, low-pass, and fail.

When I criticize the grading system, people often leap to the assumption that I want to replace it with narrative evaluations of all students. But that is not true. I see the value of our having a shorthand way of representing the quality of students' work. What I object to is converting grades to numbers, averaging these numbers, and tying so much to this average (sometimes carried out to the third decimal place) when this use of numbers is not warranted and thus highly misleading.

Considerations

Saturday, September 15, 2007

More on Grading: Living with What We Have

Friday, September 14, 2007

The Meaninglessness of Numerical Grading
