In this blog post, we explore exam design principles. If exams are an integral part of the learning process, how can we make them more effective? And how can item analysis increase teaching efficacy and assessment accuracy?
Before we discuss item analysis, let’s start with why exams exist.
In “Assessment: The Bridge Between Teaching and Learning,” Dylan Wiliam states:
“If our students learned what we taught, we would never need to assess...It is only through assessment that we can discover whether the instructional activities in which we engaged our students resulted in the intended learning. Assessment really is the bridge between teaching and learning.”
Assessment via midterms, tests, quizzes, and exams is the way in which educators gain insight into student learning; in fact, assessment accounts for well over 50% of a student’s grade in many higher education courses.
It’s also a venue through which educators can address student learning: exams are a window into student learning gaps and, consequently, a way to close them. Exams should answer a question for educators as much as they pose questions for students:
With which topics are students struggling, and why?
One way to increase visibility into student learning gaps is via item analysis.
What is item analysis?
Item analysis is the act of analyzing student responses to individual exam questions with the intention of evaluating exam quality. It is an important tool to uphold test effectiveness and fairness.
Item analysis is likely something educators do both consciously and unconsciously on a regular basis. After all, grading involves studying student responses and the pattern of student errors, whether to a particular question or to particular types of questions.
But when the process is formalized, item analysis becomes a scientific method through which tests can be improved, and academic integrity upheld.
Item analysis brings to light test quality in the following ways:
- Item Difficulty -- is the exam question (aka “item”) too easy or too hard? An item that every student gets right, or that every student gets wrong, decreases an exam’s reliability. If everyone answers an item correctly, it cannot show who understands the material deeply and who does not. Conversely, if everyone answers it incorrectly, it cannot pick out those who have learned the material deeply.
- Item Discrimination -- does the exam question discriminate between students who understand the material and those who do not? Exam questions should suss out the varying degrees of knowledge students have of the material, reflected by the percentage correct on exam questions. Desirable discrimination can be shown by comparing correct answers on the item with students’ total test scores--i.e., do students who scored high overall have a higher rate of correct answers on the item than those who scored low overall? If you separate top scorers from bottom scorers, which group is more likely to answer the item correctly?
- Item Distractors -- for multiple-choice exams, distractors (the incorrect answer options) play a significant role. Do they plausibly draw test takers away from the correct answer? For example, if a multiple-choice question has four possible answers, are two of them obviously incorrect, leaving test takers with a 50/50 chance of guessing correctly? Distractors that are obviously incorrect, rather than plausible, do little to assess student knowledge. An effective distractor will attract more test takers with a lower overall score than test takers with a higher overall score.
Item analysis entails noting the pattern of student errors to various questions in all the ways stated above. This analysis can provide distinct feedback on exam efficacy and support exam design.
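As a rough illustration of the first two measures, here is a minimal Python sketch. It assumes each student's answer to one item is recorded as 1 (correct) or 0 (incorrect) alongside their total test score. The function names and sample data are hypothetical, and comparing the top and bottom 27% of scorers is a commonly used convention rather than something this post prescribes.

```python
def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0.0 to 1.0)."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(item_scores, total_scores, group_fraction=0.27):
    """Proportion correct among top scorers minus proportion correct among bottom scorers."""
    n = len(total_scores)
    # Assumption: compare the top and bottom 27% of students by total score.
    group_size = max(1, round(n * group_fraction))
    # Rank student indices by total score, lowest first.
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom = order[:group_size]
    top = order[-group_size:]
    p_top = sum(item_scores[i] for i in top) / group_size
    p_bottom = sum(item_scores[i] for i in bottom) / group_size
    return p_top - p_bottom


# Hypothetical data for ten students: 1 = correct, 0 = incorrect on one item.
item_scores = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
total_scores = [95, 90, 88, 85, 80, 70, 65, 60, 55, 50]

difficulty = item_difficulty(item_scores)
discrimination = item_discrimination(item_scores, total_scores)
print(f"difficulty={difficulty:.2f}, discrimination={discrimination:.2f}")
# prints: difficulty=0.50, discrimination=1.00
```

In this made-up example, half the class answers correctly, and all of the top scorers do while none of the bottom scorers do, so the item separates the two groups as cleanly as it can. A value near zero, or a negative one, would suggest the item is not telling stronger and weaker students apart.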
How can item analysis inform exam design?
Student learning can be shored up through feedback, but also through exam design. The data from item analysis can drive the way in which you design future tests. As noted previously, if assessment is the bridge between teaching and learning, then exams ought to measure the student learning gap as accurately as possible.
Item analysis should raise both questions and answers as you revise or omit items from your test:
- Is the item difficulty level appropriate?
- Does the item discriminate appropriately?
- Are the distractors effective?
In doing so, item analysis can increase the efficacy of your exams by testing knowledge accurately. And knowing exactly what students know and what they don’t helps both student learning and instructor efficacy.
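The three review questions above could be turned into a simple checklist in code. The sketch below is only one way to do it: the function name `review_item` and every cutoff value are illustrative assumptions, not recommendations from this post, and should be tuned to your own course and class size.

```python
def review_item(difficulty, discrimination, distractor_counts,
                easy_cutoff=0.90, hard_cutoff=0.20, min_discrimination=0.20):
    """Return review notes for one item; an empty list means nothing was flagged.

    All cutoffs are hypothetical defaults chosen for illustration only.
    """
    notes = []
    # Is the item difficulty level appropriate?
    if difficulty > easy_cutoff:
        notes.append("too easy")
    elif difficulty < hard_cutoff:
        notes.append("too hard")
    # Does the item discriminate appropriately?
    if discrimination < min_discrimination:
        notes.append("weak discrimination")
    # Are the distractors effective? Here: flag options that nobody chose.
    unused = sorted(opt for opt, count in distractor_counts.items() if count == 0)
    if unused:
        notes.append("unused distractors: " + ", ".join(unused))
    return notes


# Hypothetical item: almost everyone got it right and two options were never picked.
notes = review_item(difficulty=0.95, discrimination=0.05,
                    distractor_counts={"B": 3, "C": 0, "D": 0})
print(notes)
# prints: ['too easy', 'weak discrimination', 'unused distractors: C, D']
```

A flag here is a prompt to look at the item again, not a verdict. An easy item may still be worth keeping if it checks a concept every student is expected to master.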
How can item analysis inform course content or the curriculum?
Not only can item analysis drive exam design, but it can also inform course content and curriculum.
When it comes to item difficulty, it’s important to note whether errors indicate a misunderstanding of the question or of the concept the item addresses. When a large number of students answer an item incorrectly, it’s notable. It may be a matter of fine-tuning a question for clarity; is the wording of the question confusing? Are the answers clear?
Or it could be that the material has to be reviewed in class, possibly with a different learning approach.
Item distractor analysis is also helpful in that it can identify misunderstandings students have about the material. If the majority of students selected the same incorrect multiple-choice answer, then that provides insight into student learning needs and opportunities. (Also--congrats on a great distractor that highlights student learning gaps and discriminates between levels of student knowledge.)
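To spot a distractor that many students share, you can tally how often each incorrect option was chosen, both overall and within the top and bottom scoring groups. The following Python sketch assumes each student's chosen option and total score are available as parallel lists. The function name, the sample data, and the 27% group size are illustrative assumptions.

```python
from collections import Counter


def distractor_table(responses, total_scores, correct_option, group_fraction=0.27):
    """Count how often each incorrect option was chosen: overall, by top scorers, by bottom scorers."""
    n = len(responses)
    group_size = max(1, round(n * group_fraction))
    # Rank student indices by total score, lowest first.
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom = Counter(responses[i] for i in order[:group_size])
    top = Counter(responses[i] for i in order[-group_size:])
    overall = Counter(responses)
    table = {}
    # Note: options that no student chose do not appear in the table.
    for option in sorted(overall):
        if option == correct_option:
            continue
        table[option] = {"all": overall[option], "top": top[option], "bottom": bottom[option]}
    return table


# Hypothetical responses from ten students to one item whose correct answer is "A".
responses = ["A", "A", "A", "B", "A", "B", "A", "B", "B", "C"]
total_scores = [95, 90, 88, 85, 80, 70, 65, 60, 55, 50]

table = distractor_table(responses, total_scores, correct_option="A")
most_chosen = max(table, key=lambda opt: table[opt]["all"])
print(most_chosen, table[most_chosen])
# prints: B {'all': 4, 'top': 0, 'bottom': 2}
```

In this made-up data, option "B" draws four students, none of them from the top group, which is the pattern an effective distractor tends to show. The concept behind "B" would be a natural candidate for review in class.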
Whether you employ item analysis manually or via software, we think data-driven exams and curricula are a great thing. And we hope this helps you out on your pedagogical journey.