
How to transform autograding into smarter grading

Autograders are undeniably helpful for supporting student computer science assessments at scale. When paired with meaningful and personalized feedback, autograders can be an important part of personalized learning at scale.

Christine Lee

Content Manager

What is code autograding?

An autograder is a tool that lets teachers automatically assess student code, provide feedback on it, or both. Autograders can be configured to provide extensive and immediate feedback, which can promote quick and efficient student learning. They are widely used in computer science and engineering programs as well as statistics and data science programs, although they can also be used for non-computer science subject matter.

Using an autograder saves time grading, particularly for large enrollment classes, and ideally provides students with feedback in real time within their existing workflows so they can prepare for their next steps in the learning journey.

The word “automation” may feel like it removes personalization, which is the crux of teacher-student communication and learning. But because autograders require configuration and customization, they can serve as a “debugging” tool or, with the right supplements, uphold assessment with integrity, complete with feedback loops. A minimal autograder that gives only binary (correct/wrong) feedback may be useful for summative assessments like final exams, where you do not want students to see the full tests or direct hints for how to solve issues. In low-stakes assignments such as labs or homework, however, this type of autograder deters learning: students who lack the information they need for next steps may haphazardly submit adjustments until the program is deemed correct.

But when feedback accompanies autograding, autograders can uphold personalized learning and increase student engagement. Immediate feedback enables students to intelligently iterate on code and resubmit improvements before the assignment’s due date without depending on constant instructor involvement. In this way, instructors can foster self-directed learning. Additionally, an autograder with feedback can support self-paced learning.
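To make the difference between binary and feedback-rich autograding concrete, here is a minimal sketch in Python. It is an illustration only, not Gradescope's API: the `Check` class, the `grade` function, the sample `student_mean` submission, and the hint wording are all hypothetical names invented for this example. Each check carries a hint that is shown only when the check fails, so a failing student gets a next step rather than a bare "wrong".

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    """One autograder check (hypothetical structure for this sketch)."""
    name: str
    args: tuple
    expected: object
    hint: str  # guidance shown only when the check fails


def grade(student_fn: Callable, checks: List[Check]) -> List[dict]:
    """Run every check and return one result per check, with feedback on failure."""
    results = []
    for check in checks:
        try:
            actual = student_fn(*check.args)
            passed = actual == check.expected
            feedback = "" if passed else (
                f"expected {check.expected!r}, got {actual!r}. {check.hint}"
            )
        except Exception as exc:  # report a crash instead of hiding it
            passed = False
            feedback = f"raised {type(exc).__name__}: {exc}. {check.hint}"
        results.append({"name": check.name, "passed": passed, "feedback": feedback})
    return results


# Hypothetical student submission with an off-by-one bug in the divisor.
def student_mean(values):
    return sum(values) / (len(values) - 1)


CHECKS = [
    Check("mean of three numbers", ([2, 4, 6],), 4.0,
          "Check what you divide by: how many values are in the list?"),
    Check("mean of one number", ([5],), 5.0,
          "What should happen when the list has exactly one element?"),
]

for result in grade(student_mean, CHECKS):
    status = "PASS" if result["passed"] else "FAIL"
    print(f"{status}: {result['name']} {result['feedback']}")
```

A binary autograder would stop at PASS or FAIL. Here, the same run also tells the student what was expected, what the code produced, and where to look next, which is the kind of information that supports deliberate iteration rather than guessing.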

Personalized learning revolves around customizing the learning experience for each student based on their unique skills and abilities and other contextual factors. In sum, it’s a way to enact high-level pedagogy in student-teacher interactions.

One in three administrators view personalized learning as “a transformational way to improve public education,” according to a 2018 Education Week research survey. And 57% of administrators believe that digital technology is an effective tool to supplement personalization.

In a subsequent 2019 Education Week survey, 46% of teachers said they were optimistic about personalized learning, with 91% of respondents saying they are at least somewhat confident that digital tools can effectively customize learning for students (p. 4). Personalized learning is an accepted pedagogical approach toward supporting student learning.

By enacting personalized learning, educators can support every student in their learning journey. For instance, reducing the gap between higher- and lower-performing students can foster a more qualified, diverse population of students who may then go on to careers in software engineering (Aravind & Balasangameshwara, 2019).

What are the challenges of enacting personalized learning with autograders?

Teaching students software programming poses unique challenges for assessment and feedback.

Thoughtful, actionable feedback can take time. According to Scott Smith, Professor of Computer Science at Johns Hopkins University, “In courses that teach programming, we typically assign students projects that require them to write programs to solve problems. When instructors grade this type of assignment, they not only have to observe the program’s results but also the student’s approach. If the results are not correct or the program doesn’t run, we have to spend time reviewing hundreds of lines of code to debug the program to give thoughtful feedback.”
A binary system of correct/wrong doesn’t induce learning. Without thoughtful feedback, an autograder meant to support learning doesn’t provide next steps for students. As a result, students might make adjustments at random and miss learning opportunities. According to Kevin Lin, Assistant Teaching Professor at University of Washington, “The resulting autograder-driven development cycle occurs when students make minor adjustments to their code seemingly at random, submit code to the autograder, and repeat until their program passes all of the given tests.”
Over-reliance on autograders may deter self-directed learning. While autograders are helpful in supporting assessments, particularly in large classes, poorly designed autograders may result in students learning the nuances of the autograder more than they learn the concepts. According to researchers, “students [may] use autograders in place of their own careful reflection” (Baniassad, Zamprogno, Hall, & Holmes, 2021).

How can educators enable autograders to help bridge teaching and learning?

Research acknowledges the wide use of autograders in computer science courses for pragmatic reasons. One study found that when thoughtful feedback is provided, subsequent student submissions improved by 96% (Haldeman, et al., 2018). The same research group, in a follow-up study, compared data from two semesters of computer science courses. One semester provided hints and feedback for two assignments and the other did not provide such scaffolding. “Results,” they say, “show that the percentage of students who successfully complete the assignments after an initial erroneous submission is three times greater” when provided with hints and feedback (Haldeman, et al., 2021).

Suggestions to foster learning with autograders include:

Adopting tools like Gradescope to utilize autograding (regardless of class size) alongside rubric and feedback functions saves time grading, according to instructors. With Gradescope, instructors can write an autograder designed to grade a single submission and Gradescope handles the rest; instructors need not worry about designing it to manage all submissions. In turn, fast feedback lets students receive results within their workflow and take their next steps in learning.
Accelerating feedback loops increases student learning outcomes. With Gradescope, the instructor's pre-uploaded code autograder runs on the student code submissions within seconds of the student submitting. This gives students real-time feedback on what's wrong with their code, so that they can iterate immediately. In one case study, Professor Jillian Cannons, Assistant Professor of Math and Statistics at Cal Poly Pomona, utilized Gradescope to centralize near-instantaneous feedback, which then encouraged students to rework the assignment until they achieved the right answer. “Professor Cannons saw students take greater ownership over their work and their learning. Instead of settling for a 7 out of 10, they repeatedly reworked their assignments until they got a perfect grade. This self-motivation led to greater concept mastery, and kindled a sense of passion for computer programming where it was formerly lacking.”
Design your autograder to not only test student code but also assess student tests to ensure that they are actually doing test-driven development. This prompts students to think through how they should test their code and not just rely on others to write those tests. A well-designed autograder with rate limiting and selective output, along with good exam design, also makes it harder for students to rely solely on an autograder; instead, students take more thoughtful actions on their submissions.
Using tools like Gradescope to enable autograding and manual grading to occur on the same assignment can help instructors provide in-line comments and feedback to students. “After being informed that I wouldn’t have any TAs for my Principles of Programming Languages course the following semester, I was motivated to use one of Gradescope’s features, the programming assignment auto-grader platform. Being able to automatically provide grades and feedback for students’ submitted code has long been a dream of instructors who teach programming. The instructor establishes a grading script that is the basis for the analysis, providing grades and feedback for issues found in each student’s submitted program. Auto-grader [sic] was really the star feature in this case,” according to Scott Smith.
Offering frequent, low-stakes assessments increases transparency and scaffolding for student learning. Having insight into what students do or do not know allows for personalized teacher interventions. Autograding lowers the demand on instructor time so students can code more frequently. At Virginia Commonwealth University, Professor Debra Duke uses Gradescope “to save time and automate parts of the review and feedback process. This allows Duke to assign more projects and give more meaningful, individualized feedback, and her students now get 50 percent more coding practice than they did before.”
Pinpointing learning opportunities for students within the autograder provides specific guidance. Anya E. Vostinar, Assistant Professor of Computer Science at Carleton College, shared her own tips for using the Gradescope autograder to foster such learning opportunities.
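The suggestion above about assessing student tests can be sketched in a few lines of Python. One common way to do this, assumed here and not described in the sources quoted in this post, is to run the student's tests against a correct reference implementation and against several deliberately buggy ones: a good test suite accepts the reference and rejects the buggy versions. Everything in this sketch (`grade_student_tests`, `reference_abs`, the buggy variants, and the `check_*` functions standing in for student tests) is hypothetical and is not part of any Gradescope API.

```python
from typing import Callable, Dict, List

# A student test is modeled as a function that receives an implementation
# and raises AssertionError when that implementation misbehaves.
StudentTest = Callable[[Callable], None]


def passes(test: StudentTest, impl: Callable) -> bool:
    """Return True if the test completes without raising on this implementation."""
    try:
        test(impl)
        return True
    except Exception:
        return False


def grade_student_tests(tests: List[StudentTest], reference: Callable,
                        buggy_impls: Dict[str, Callable]) -> dict:
    """Score a test suite by how many known-buggy implementations it rejects."""
    # A test that fails on the correct implementation is itself wrong.
    invalid = [t.__name__ for t in tests if not passes(t, reference)]
    valid = [t for t in tests if passes(t, reference)]
    # A bug counts as caught if at least one valid test fails on it.
    caught = {desc: any(not passes(t, impl) for t in valid)
              for desc, impl in buggy_impls.items()}
    return {
        "invalid_tests": invalid,
        "bugs_caught": sum(caught.values()),
        "bugs_total": len(caught),
        "missed": [desc for desc, hit in caught.items() if not hit],
    }


# Hypothetical assignment: implement absolute value.
def reference_abs(x):
    return -x if x < 0 else x


BUGGY = {
    "an implementation that ignores the sign": lambda x: x,
    "an implementation that returns 1 for zero": lambda x: 1 if x == 0 else abs(x),
}


# Hypothetical student-written tests.
def check_positive(impl):
    assert impl(3) == 3


def check_negative(impl):
    assert impl(-3) == 3


report = grade_student_tests([check_positive, check_negative], reference_abs, BUGGY)
print(f"Caught {report['bugs_caught']} of {report['bugs_total']} known bugs.")
for desc in report["missed"]:
    print(f"Your tests did not detect {desc}. Which inputs have you not tried?")
```

Reporting a description of each missed bug, rather than the buggy code itself, is one form of the selective output mentioned above: students learn which kind of case their tests overlook without being handed the answer.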

Gradescope transforms grading into learning and enables personalized learning alongside autograding at scale. The result? Educators have time to teach.

Get started with our autograder documentation. Or take a look at Gradescope’s Community Resources Page, where instructors share their autograders with each other.
