My Ungrading Mistakes
This semester in my community college General Biology II course, I decided that the laboratory component of the course (22% of the final grade) would be graded “collaboratively.” (The entire course is graded using Multiple Grading Schemes.) Collaborative grading leans into an Ungrading philosophy by shifting the power of grading toward students and emphasizing learning and effort rather than points-motivated behavior. At the end of the semester, students complete a reflection in which they gather evidence and reflect on their learning and effort, then propose a grade for themselves. In a one-on-one meeting, the student and I discuss their reflection and decide on a grade together.
I’ve implemented collaborative grading multiple times before: as a component of the course (i.e. for just the lab component), and also as the entire grading system for the course. Despite my previous experience doing collaborative grading, despite having read Susan Blum’s “Ungrading” book, despite being active in the #ungrading community on Twitter, this semester my implementation of collaborative grading was bad. I made several egregious mistakes. I’m writing about them here so you can learn from my blunders and, hopefully, avoid them!
Background context to understand why I made the decisions I did
[TL;DR – I strongly disagree with the pedagogical philosophy behind our General Biology II lab, and therefore I didn’t care much about what students learned in the lab.]
Confession time: I really dislike the way that other instructors at my institution teach General Biology II, and therefore I really dislike the lab. At my institution, Gen Bio II is taught as a “March through the Phyla,” where students learn the defining features of various taxonomic groups, like Protista, Mollusca, Arthropoda, Gymnosperms, etc. This approach to teaching biology differs significantly from the approach outlined in Vision and Change, which emphasizes core concepts and competencies. I am an ardent believer in the Vision and Change framework and use it to guide how I design my courses.
I refuse to teach a March through the Phyla. This semester, I designed my own General Biology II curriculum from scratch. (Last Fall, I redesigned my General Biology I curriculum.) In the lecture portion of my redesigned Gen Bio II, students learned about diverse organisms by exploring myriad data-centered case studies that highlight organisms from various taxa. While I technically had the freedom to redesign the lab component for my course section, redesigning a curriculum without a textbook is a lot of work and I just didn’t have the time to redesign the lab too. So, my students did the same lab as all the other “March through the Phyla” course sections.
The lab consists of students observing and drawing specimens from each taxonomic group. I hate this lab because students don’t design experiments, collect evidence, or analyze data. THERE IS NO DATA IN THIS LAB AT ALL. I think this is grossly inappropriate for a General Biology lab. Sure, it’s cool to look at some of these organisms and dissect things, and I think there is power in seeing a diversity of organisms and having “wow” moments. But I think that an introductory science lab should have students doing the things that scientists do! This lab does not do that.
Thus, I basically didn’t care about what students learned in the lab.
Therein lies the root of my problem. I chose to use collaborative grading for the lab because I didn’t want students to jump through hoops I didn’t care about. In other sections of the course, students take lab exams that consist of viewing specimens and responding to a prompt for each one, such as “What is the name of the structure labeled by the pin?” or “To what Class [or Order, or Genus] does this organism belong?” I hate memorization like this and I was not going to inflict it on my students. Instead, I added some other components to the lab: each student gave two short presentations about the taxonomic groups we were observing in lab (these were graded using points and a rubric), I incorporated a brief coloring book assignment, and we took a field trip to observe organisms in nature. Apart from the lab presentations, the components of the lab included no grades and, frankly, no evidence of learning. When I set up the course, I thought collaborative grading would be appropriate here because I didn’t care much about evidence of learning, so I thought the students could grade themselves primarily on effort.
Mistake 1: I didn’t give students a standard to judge themselves by
In the end-of-semester Reflection, I asked students to reflect on the various components of the lab, including: 1) the scores they received on their presentations, 2) whether they completed the coloring book assignment and whether it represented their best effort, 3) whether they attended the field trip and whether they found all the organisms on their scavenger hunt list, and 4) their effort on the observational/drawing portion of the lab, supported by pictures of their drawings from lab. At the end of all that, I asked them to give themselves a grade on a scale from 0 to 10.
But I didn’t give them a standard to judge themselves by.
For every other collaborative grading reflection I’ve used, there was a general guideline about what the scores mean. For example, here is the score guideline for the collaborative grading reflection for the lab component of my Gen Bio I course from this fall:
10: Outstanding effort, always committed to thoroughly completing every task and assignment; excellent skill competence and clear understanding of all concepts
9.0 – 9.5: Great effort, committed to thoroughly completing every task and assignment; very high level of skill competence and clear understanding of almost all concepts
8.0 – 8.5: Very good effort, thoroughly completed most tasks and assignments; high level of skill competence and clear understanding of most concepts
7.0 – 7.5: Good effort, completed most tasks and assignments; moderate level of skill competence and moderate understanding of most concepts
6.0 – 6.5: Some effort, completed some tasks and assignments or completed most tasks and assignments with only partial effort; low level of skill competence and partial understanding of most concepts
<6.0: Insufficient effort, did not complete most tasks; very low level of skill competence and partial understanding of some concepts
This semester, I didn’t provide guidelines because at the end of the semester, when I was writing the Reflection for the students to fill out, I was exhausted. (Did I mention that redesigning a curriculum from scratch without a textbook is a lot of work?) I didn’t take the time to think about how to have students rank themselves on participation/effort and how to combine that with the lab presentation scores that represented student learning/performance. So, after asking students to reflect on their effort and presentation performance, I just asked them to give themselves a grade out of thin air. Yikes.
Understandably, different students scored themselves very differently. Some students gave themselves low scores because they didn’t have time to draw all the organisms in each lab (I didn’t care about that). Some students gave themselves low scores because they weren’t good artists (I didn’t care about that either). Some students gave themselves low scores because they were not as focused as they could have been during lab (I did care about that). One student took the four lab components (drawings, presentations, coloring book submission, and field trip), scored each one, and averaged the four scores. In my mind, the effort put into observing and drawing specimens all semester far outweighed the single afternoon field trip, but this student’s reflection was the slap in the face that revealed just how much students needed, and didn’t have, guidance about how to grade themselves.
Mistake 2: I didn’t schedule one-on-one meetings with each student
A key component of collaborative grading is the idea that the student and I determine their grade together. In previous courses, I built into the course schedule one-on-one meetings with each student at the end of the semester. We would sit together and talk about whether the evidence they collected and their thoughts in their reflection supported the grade they proposed for themselves.
I didn’t do that this semester, in part because I couldn’t figure out how to fit the meetings into our schedule, and in part because I found in previous semesters that many meetings consisted of me saying “Yep, I agree with your proposed score,” and it felt like a waste of both of our time.
So, this semester I decided to only schedule meetings for students when I disagreed with their proposed score.
The problem with this approach, as I discovered while reading their end-of-semester Reflections, is that I had an incentive to just accept whatever score they proposed so I could save time by not having to coordinate and hold a meeting with the student. Yikes. I did, in fact, meet with several students to push their grade up, but for some students whose proposed score was close enough to a score that I thought was appropriate, I didn’t. If I had had a dedicated window of time to sit down and talk with them, I might have pushed their score up a little bit, but since I didn’t, and their proposed score was close enough, I just let their proposed score stand. I hope it’s obvious why this is not an ideal way to implement collaborative grading.
Mistake 3: Late Work Tokens + Collaborative Grading Reflection mean small mistakes can have a big impact
This semester, I gave each student 3 “Late Work Tokens” they could use to turn in any assignment late. I love Late Work Tokens because of the way that they provide flexibility within a highly structured course, but, as I’ll write about in my next post, this particular implementation of Late Work Tokens had a critical flaw: I let students use a Late Work Token to turn in any assignment up to the last day of the semester. While I had a deadline for students to complete their Lab Collaborative Grading Reflection, some students used a Late Work Token to turn in the Reflection on the last day of the semester. This was a problem for two reasons: it precluded me from scheduling a one-on-one meeting if I disagreed with their score, and if they made a mistake in completing the Reflection, there wasn’t time for them to fix it before I had to turn in grades.
One student submitted their Lab Collaborative Grading Reflection on the last day of the semester. I was reviewing it the day before grades were due. In the Reflection, I asked students to attach pictures of their specimen drawings from throughout the semester. This student didn’t attach any pictures. When I reached out to them via email, they said they were traveling in another state and didn’t have access to their drawings. Because I had not monitored their drawings at all throughout the semester, I had no other way to verify whether they had actually done the drawings. Thus, they earned a lower score on the lab component of the course. Because their other scores were borderline between two letter grades, the lower lab score pushed their grade into the lower letter grade. Any grading structure where one small mistake can result in a large impact on the grade is… not ideal. I didn’t mean for the Lab Collaborative Grading Reflection to be like this, but now I realize it was. Yikes.
What I’ll Do Next Time
I don’t think I’ll do collaborative grading again. At least, not like this. Not just because of my egregious errors (which are all fixable), but because of two bigger issues I have with Collaborative Grading (and Ungrading in general).
First, even with excellent scaffolding and instruction, students will invariably differ in their ability to self-assess and advocate for themselves. Students have neither the practice nor the bigger-picture perspective of the instructor. And we know that some students (women, people of color) tend to be harder on themselves than others (white men). If the criteria for earning a certain grade are clear, why should the onus of deciding the grade be on the novice student instead of the experienced instructor?
Second, I think Collaborative Grading is too liable to be influenced by implicit bias. Yes, the instructor can adjust a student’s proposed score up or down, but who are they more likely to do that with? How does student race, gender, ethnicity, ability, mental health, or values impact the instructor’s willingness to adjust the grade up or down? Will an instructor treat a student who stayed after class to help clean up the lab the same as a student who left as soon as they were done with lab activities? Will an atheist instructor treat a religious student the same as a self-proclaimed atheist? The possibilities for bias here are significant. I worry about this a lot.
I worry that implicit bias may make any form of collaborative grading or ungrading inequitable. Of course, all grading policies are inequitable in some form or other, but I’m left wondering: is the impact of my implicit bias less bad than the inequities in other grading systems? I worry that the answer is no.
There are a lot of principles of Ungrading and Collaborative Grading that I think make for a better classroom and learning experience, but I don’t think they require Ungrading per se. In my next post, I describe various grading policies I enacted this semester that both helped and hurt my students, and what I plan to do differently next time.