Interpreting influence of feature ranking in derivation of prediction models for screening questionnaires optimization

Abstract

Questionnaire-based screening tests are widely used in fields ranging from healthcare and psychology to business. Deploying such questionnaires online makes it possible to collect large amounts of screening test data, which can be used to study user characteristics and to apply data mining techniques that discover new patterns or build prediction models. We used a sample of 39775 complete depression, anxiety and stress scale questionnaires collected online. In practice, such questionnaires can be used to refer users to an advanced nurse practitioner specialized in mental health. Modern technology thus enables healthcare workers to make evidence-based clinical judgments in advanced health assessment. Different data mining approaches were used to build prediction models and to study user characteristics that might influence the prediction of screening test outcomes from a limited number of questionnaire items. This study focuses on building prediction models that achieve high prediction performance by positioning items according to feature ranking. Additionally, we provide insight into some characteristics of online screening test users using techniques to detect careless and insufficient effort responding. Selecting smaller sets of items can significantly reduce the time and workload for experts and the lay population who use questionnaire-based screening tests. This paper also demonstrates how large survey datasets can provide guidelines that help experts build the next generation of screening tools.
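
The abstract does not name the techniques used to detect careless and insufficient effort responding. As an illustration only, the sketch below computes one commonly used indicator, the longstring index (the longest run of identical consecutive answers). The function names, the example answers, and the threshold are hypothetical and are not taken from the paper.

```python
from itertools import groupby


def longstring(responses):
    """Return the length of the longest run of identical consecutive answers."""
    return max((len(list(run)) for _, run in groupby(responses)), default=0)


def flag_careless(rows, threshold):
    """Return indices of respondents whose longest run reaches the threshold."""
    return [i for i, row in enumerate(rows) if longstring(row) >= threshold]


# Hypothetical answers on a 0-3 scale to an eight-item questionnaire
# (illustrative values only, not data from the study).
rows = [
    [0, 1, 2, 1, 3, 0, 2, 1],  # varied answers
    [2, 2, 2, 2, 2, 2, 2, 2],  # same answer throughout
    [1, 1, 1, 0, 2, 3, 3, 1],  # short runs only
]

# The threshold is an assumption; a suitable cut-off depends on the instrument.
flagged = flag_careless(rows, threshold=6)
```

A respondent flagged this way is only a candidate for closer inspection, since a long run of identical answers can also be a genuine response pattern.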

Publication
In 20th Industrial Conference on Data Mining ICDM 2020, July 21-22, Amsterdam, pp. 67-78
Leona Cilar Budler
PhD

My research interests include mental health, nursing research, and health informatics. Specific areas of interest include adolescent mental health, psychometric testing of questionnaires, questionnaire localization, and quantitative data analysis.

Majda Pajnkihar, FAAN, FEANS
Professor

My primary research interests include pediatric nursing, nursing research, nursing theories and concepts, and nursing safety and quality.

Gregor Štiglic
Associate Professor and head of Research Institute

My research interests include predictive models in healthcare and the interpretability of complex models.