Revelations that generative artificial intelligence (AI) can be used to pass high-stakes examinations have prompted a call for the revival of the oral assessment format.
Principal research fellows with the Australian Council for Educational Research, Dr Jacob Pearce and Neville Chiavaroli, make the recommendation in their paper Rethinking assessment in response to generative Artificial Intelligence (AI).
‘With the recent increase and facilitation of virtual assessment through convenient online platforms and the new challenge … posed by AI, we think the time has come for the “rehabilitation” and re-acceptance of the oral format as a highly valuable and unique form of assessment,’ they say.
While the research focuses on responding to AI in medical education, the authors see value in applying it more broadly to assessments in other learning areas.
At its launch, ChatGPT-4 performed in the top 10% on a range of well-known examinations, while other tools have been shown to be capable of passing the US Medical Licensing Exam and have interpreted radiographs well enough to achieve a reasonable performance in a Royal College of Radiologists exam.
The ability of AI to respond convincingly to assessments is so impressive, the authors of the research paper say, that we can no longer rely on the results of unsupervised assessment to verify learning and competence.
‘We need to think carefully about the kind of performance we want our assessments to elicit,’ the authors say, and how candidates can demonstrate that they really ‘know how’.
‘Genuine understanding requires some degree of autonomy in thinking and application of knowledge, as opposed to reciting facts, entering data or following algorithms’, the authors say.
The authors identify the ability to clarify a candidate’s responses in real time as a significant benefit of oral assessment, enabling ‘deep probing of genuine understanding and higher-order thinking’.
Testing student knowledge in the medical field may be categorised as either assisted or unassisted assessment, and the authors propose a clearer distinction between the competencies assessed in each.
Where students undertake assisted examinations, they are able to use a multitude of resources, such as textbooks, the internet, decision-making tools and now AI – a situation the researchers say is ‘in many ways, representative of real-life clinical practice’.
Unassisted examinations assess the clinical knowledge and reasoning that students and trainees have without access to resources, and are generally used for certification or testing after study completion.
‘In certain circumstances, the intrinsic characteristics of oral assessment – in particular its mode of direct communication, interactivity and flexibility – come to the fore and make it a particularly apt choice for unassisted assessment,’ the authors say.
The research acknowledges that oral assessments have been undervalued for some time, partly because of ‘perceived poor reliability, lack of standardisation and potential for assessment bias’.
However, those designing assessments in medical education now have access to relevant guidelines (by the same authors) that provide clarity on the different forms of prompting available to examiners, potential effects on candidates and best practice approaches.
The research supports further investigation of the use of AI in assisted assessments for medical students.
This article originally appeared on ACER Discover and has been republished with permission.