Abstract

BACKGROUND: Natural language processing is a promising technique that can be used to create efficiencies in the review of narrative feedback to learners. The Feinberg School of Medicine has implemented formal review of pre-clerkship narrative feedback since 2014 through its portfolio assessment system but this process requires considerable time and effort. This article describes how natural language processing was used to build a predictive model of pre-clerkship student performance that can be utilized to assist competency committee reviews. APPROACH: The authors took an iterative and inductive approach to the analysis, which allowed them to identify characteristics of narrative feedback that are both predictive of performance and useful to faculty reviewers. Words and phrases were manually grouped into topics that represented concepts illustrating student performance. Topics were reviewed by experienced reviewers, tested for consistency across time, and checked to ensure they did not demonstrate bias. OUTCOMES: Sixteen topic groups of words and phrases were found to be predictive of performance. The best-fitting model used a combination of topic groups, word counts, and categorical ratings. The model had an AUC value of 0.92 on the training data and 0.88 on the test data. REFLECTION: A thoughtful, careful approach to using natural language processing was essential. Given the idiosyncrasies of narrative feedback in medical education, standard natural language processing packages were not adequate for predicting student outcomes. Rather, employing qualitative techniques including repeated member checking and iterative revision resulted in a useful and salient predictive model.

DOI 10.5334/PME.40