As we wrote in a previous post on AI-Assisted Grading, we built Gradescope in order to give instructors grading superpowers. Our technology allows instructors to spend less time on grading and other administrative tasks so that they can spend more time interacting with students and improving instruction.
Gradescope is used to grade online assignments, programming projects, and scanned handwritten work, so part of what our technology needs to do is handle handwritten text. In this two-part post, we detail the challenges in addressing this problem, current State-of-the-Art (SOTA) approaches, and how our End-to-End Deep Learning system performs.
The handwriting images we handle are test submissions turned in by students. They start out as physical papers filled out by hand and are then converted to digital images from which the student’s work is automatically extracted. The resulting images (Figure 1) are partial or full-page sized and may contain multiple regions of handwritten text, math equations, tables, drawings, diagrams, side-notes, scratched-out text, text inserted using an arrow or caret, and other artifacts. The content varies widely, spanning many subjects from grade school level all the way up to postgraduate courses.
The role of our handwriting recognition Artificial Intelligence (AI) is to identify and transcribe the handwritten answers from these images. Furthermore, since we need to serve a variety of use cases, the AI must go beyond just text recognition and perform additional tasks. Specifically, it must:
We call this problem Full Page Handwriting Recognition (Full Page HTR) [1]. This problem is much harder than classical Handwritten Text Recognition (HTR), which is limited to the recognition of text in images of single words or single lines of text.
Figure 1. Data examples: (a) Full page text with drawing. (b) Full page computer source code. (c) Diagrams and text with embedded math. (d) Math and text regions, embedded math, and stray artifacts.
Academic literature and the typical approaches to this problem usually only attempt to recognize cropped images of single words or lines of text. The task of cropping said words/lines is delegated to another step, called image segmentation. An end-to-end text-recognition system is expected to chain these two steps together, followed by a third step: stitching the individually recognized units back into a passage. This approach suffers from a few problems:
First, image segmentation is usually based on hand-crafted features and heuristics that are not robust to different sources of data and may break under unexpected scanning conditions; in other words, they are brittle. [2]
Second, clean segmentation of text is not even possible in many cases, e.g., when lines are curved or interspersed with non-textual symbols and artifacts, which is very common in the data that we deal with.
Third, stitching a complete transcription from the individually transcribed text regions introduces yet another system, with its own potential for errors, and brittleness to changing data.
Fourth, in order to boost their accuracy, classical systems include closed-lexicon decoding, a technique that limits their vocabulary to a fixed set of words. This doesn’t work for us since we must cater to the terminology of many different subjects, international proper nouns, and even special cases like chemical molecular formulas.
Finally, a multi-step design fragments the end-to-end task, making it difficult to perform sub-tasks that require information from another stage, e.g., stitching back the individually recognized pieces into a passage without losing the original formatting and indentation (important when transcribing computer source code) and other auxiliary tasks such as identifying tables, drawings, etc. and skipping over them even when they contain some text.
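To make the closed-lexicon problem described above concrete, here is a minimal sketch of one simple form of closed-lexicon decoding: snapping each raw recognizer output to the nearest word in a fixed vocabulary by edit distance. This is an illustration under stated assumptions, not a description of any particular system; the lexicon, the sample words, and the function names `edit_distance` and `closed_lexicon_decode` are all hypothetical.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def closed_lexicon_decode(raw_word: str, lexicon: Iterable[str]) -> str:
    """Snap a raw recognizer output to the closest word in a fixed lexicon.

    Assumes the lexicon entries are lowercase. The output is always a
    lexicon word, even when the raw input was already correct.
    """
    return min(lexicon, key=lambda w: edit_distance(raw_word.lower(), w))


# Hypothetical lexicon and recognizer outputs, for illustration only.
LEXICON = ["glucose", "water", "carbon", "oxygen", "hydrogen"]

# A misread in-vocabulary word is repaired to the intended lexicon entry.
print(closed_lexicon_decode("glucase", LEXICON))

# A correctly read formula is not in the lexicon, so it is overwritten
# with whichever lexicon word happens to be closest.
print(closed_lexicon_decode("C6H12O6", LEXICON))
```

The first call shows why such decoding boosts accuracy on in-vocabulary text; the second shows why it fails for content like molecular formulas or proper nouns, where the correct transcription can never be produced because it is not in the fixed word list.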
In view of these problems, we designed an End-to-End Deep Learning-based model architecture, i.e., one in which all of the above steps are implicit and learned from data. Adapting the model to a new dataset or adding new capabilities is just a matter of retraining or fine-tuning with different labels or different data.
Ultimately, our model beat the performance of text-recognition Cloud APIs available from all major vendors and established a new state of the art in Full Page Handwriting Recognition. Read all about it in the follow-up to this post.