Generating Extremely Short Summaries from the Scientific Literature to Support Decisions in Primary Healthcare: A Human Evaluation Study

Primož Kocbek, Lucija Gosak, Kasandra Musović, Gregor Štiglic

2022-07-13

Abstract

Large pre-trained neural language models, a recent advancement in Natural Language Processing (NLP), have been applied to various downstream tasks, such as text generation. In primary healthcare, such systems can generate very short summaries of research papers, saving healthcare experts time when they browse literature search results, especially where communication with a patient can be supported by the latest scientific literature immediately at the point of care. A use case scenario was explored using recent abstracts and short summaries from the Semantic Scholar platform (the baseline TLDR model; TLDR is an acronym for “too long; didn’t read”). Four state-of-the-art models (OpenAI Davinci, OpenAI Curie, Pegasus-XSum, and BART-SAMSum) were used to generate short summaries. Ten healthcare experts evaluated five short summaries generated for each of the 20 included scientific paper abstracts. Informativeness, Naturalness, and Quality were highest for the baseline TLDR model, with average scores of 4.87 (SD = 1.48), 4.94 (SD = 1.36), and 4.81 (SD = 1.5), respectively. No statistically significant differences were detected between the baseline TLDR model and the OpenAI Curie and Davinci models. The other two models, i.e., Pegasus-XSum and BART-SAMSum, scored significantly lower in Informativeness and Quality. Our study demonstrated that scientific literature abstracts can be summarized effectively even with general AI-based text generation models such as OpenAI Curie and Davinci. However, a higher variance was observed in the general models. Therefore, fine-tuning of the model is still recommended for practical use in the clinical environment.
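The per-model averages and standard deviations reported above come from aggregating expert ratings. A minimal sketch of that kind of aggregation follows. The model names and ratings are hypothetical placeholders, not the study's data, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev

# Hypothetical placeholder ratings (NOT the study's data): one list of
# expert scores per model for a single evaluation criterion.
ratings = {
    "model_a": [5, 6, 4, 5, 7],
    "model_b": [3, 4, 2, 5, 4],
}


def summarize(scores):
    """Return (mean, sample standard deviation), rounded to two decimals."""
    # stdev() is the sample (n - 1) estimator; this choice is an assumption.
    return round(mean(scores), 2), round(stdev(scores), 2)


# Aggregate every model's ratings into a (mean, SD) pair.
summary = {model: summarize(scores) for model, scores in ratings.items()}

for model, (avg, sd) in summary.items():
    print(f"{model}: mean = {avg}, SD = {sd}")
```

Swapping `stdev` for `statistics.pstdev` would give the population standard deviation instead, if that is the convention a given analysis follows.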

Type

Book section

Publication

In Artificial Intelligence in Medicine. AIME 2022. Lecture Notes in Computer Science, 13263, pp. 373-382

Primož Kocbek

PhD Student

My research interests include statistical models and machine learning techniques with applications in healthcare. My specific areas of interest include temporal data analysis, interpretability of prediction models, stability of algorithms, and advanced machine learning methods, e.g., deep neural networks, on massive datasets.

Kasandra Musović

PhD Student

My research interests include the newest pedagogical technologies in different healthcare fields and their effect on individuals. Specific areas of interest include how serious games in gamification affect physiological and psychological aspects in critical situations, such as cardiopulmonary resuscitation.

Gregor Štiglic

Associate Professor and head of Research Institute

My research interests include predictive models in healthcare and interpretability of complex models.

Lucija Gosak

PhD Student
