Generating Extremely Short Summaries from the Scientific Literature to Support Decisions in Primary Healthcare: A Human Evaluation Study

Abstract

Recent advances in Natural Language Processing (NLP) based on large pre-trained neural language models have been applied to various downstream tasks, such as text generation. In primary healthcare, such systems can generate very short summaries of research papers, saving healthcare experts time when browsing literature search results, especially in scenarios where communication with a patient can be supported by the latest scientific literature immediately at the point of care. A use case scenario was explored using recent abstracts and short summaries from the Semantic Scholar platform (the baseline TLDR model; TLDR is an acronym for “too long; didn’t read”). Four state-of-the-art models (OpenAI Davinci, OpenAI Curie, Pegasus-XSum, and BART-SAMSum) were used to generate short summaries. Ten healthcare experts evaluated five short summaries generated for each of the 20 included scientific paper abstracts. Informativeness, Naturalness, and Quality were highest for the baseline TLDR model, with average scores of 4.87 (SD = 1.48), 4.94 (SD = 1.36), and 4.81 (SD = 1.5), respectively. No statistically significant differences were detected between the baseline TLDR model and the OpenAI Curie and Davinci models. The other two models, Pegasus-XSum and BART-SAMSum, scored significantly lower in Informativeness and Quality. Our study demonstrated that scientific literature abstracts can be summarized effectively even with general AI-based text generation models such as OpenAI Curie and Davinci. However, a higher variance was observed in the general models, so fine-tuning is still recommended for practical use in the clinical environment.
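As a rough illustration of how per-model averages and standard deviations like those reported above can be computed from expert ratings, here is a minimal Python sketch. The ratings, the 1-7 scale, and the `summarize` helper are all assumptions made for this example; they are not the study's data or analysis code.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical ratings: (model, criterion, score), with an assumed 1-7 scale.
# These values are invented for illustration and are NOT the study data.
ratings = [
    ("TLDR", "Informativeness", 5),
    ("TLDR", "Informativeness", 6),
    ("TLDR", "Informativeness", 4),
    ("Curie", "Informativeness", 3),
    ("Curie", "Informativeness", 7),
    ("Curie", "Informativeness", 5),
]


def summarize(rows):
    """Group scores by (model, criterion) and return (mean, sample SD)."""
    groups = defaultdict(list)
    for model, criterion, score in rows:
        groups[(model, criterion)].append(score)
    return {
        key: (round(mean(scores), 2), round(stdev(scores), 2))
        for key, scores in groups.items()
    }


summary = summarize(ratings)
for (model, criterion), (avg, sd) in sorted(summary.items()):
    print(f"{model:6s} {criterion}: mean = {avg}, SD = {sd}")
```

In this toy data both models have the same mean but the second has a larger spread, which is the kind of pattern the note on higher variance in the general models refers to.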

Publication
In Artificial Intelligence in Medicine. AIME 2022. Lecture Notes in Computer Science, 13263, pp. 373-382
Primož Kocbek
PhD Student

My research interests include statistical models and machine learning techniques with applications in healthcare. My specific areas of interest include temporal data analysis, interpretability of prediction models, stability of algorithms, and advanced machine learning methods on massive datasets, e.g. deep neural networks.

Lucija Gosak
PhD Student

My research interests are the integration of mobile applications into the care of chronic patients.

Kasandra Musović
PhD Student

My research interests include the newest pedagogical technologies in different healthcare fields and their effect on individuals. Specific areas of interest include how serious games and gamification affect physiological and psychological responses in critical situations, such as cardiopulmonary resuscitation.

Gregor Štiglic
Associate Professor and head of Research Institute

My research interests include predictive models in healthcare and the interpretability of complex models.
