Managing uncertainty in imputing missing symptom value for healthcare of rural India


Bibliographic details
Published in:	Health information science and systems, Vol. 7, No. 1, pp. 5-15
Main authors:	Das, Sayan; Sil, Jaya
Format:	Journal Article
Language:	English
Published:	Cham: Springer International Publishing, 18.02.2019; BioMed Central Ltd; Springer Nature B.V
Subjects:	Algorithms; Analysis; Bioinformatics; Blood pressure; Body mass index; Clustering; Computational Biology/Bioinformatics; Computer Science; Computer simulation; Cost function; Data acquisition; Datasets; Error analysis; Fuzzy sets; Ground truth; Health care; Health Informatics; Heart rate; Information Systems and Communication Service; Kiosks; Monte Carlo methods; Monte Carlo simulation; Novels; Oxygen content; Patients; Physicians; Primary health care; Regression models; Shortages; Signs and symptoms; Special Issue on Application of Artificial Intelligence in Health Research; Uncertainty; India; Softmax classifier; Monte Carlo method; Fuzzification; Missing value; Regression model; Rural healthcare
ISSN:	2047-2501

Description
Abstract:	Purpose: In India, 67% of the total population lives in remote areas, where providing primary healthcare is a real challenge due to the scarcity of doctors. Health kiosks are deployed in remote villages, and basic health data such as blood pressure, pulse rate, height, weight, BMI and oxygen saturation level (SpO2) are collected. The acquired data are often imprecise due to measurement error and contain missing values. The paper proposes a comprehensive framework to impute missing symptom values by managing the uncertainty present in the data set. Methods: The data sets are fuzzified to manage uncertainty, and the fuzzy c-means clustering algorithm is applied to group the symptom feature vectors into different disease classes. The missing symptom values corresponding to each disease are imputed using multiple fuzzy-based regression models. Relations between different symptoms are framed with the help of experts and medical literature. The blood pressure symptom is handled with a novel approach because its characteristics differ from those of other symptoms. Patients' records obtained from the kiosks are not adequate, so relevant data are simulated by the Monte Carlo method to avoid over-fitting while imputing missing symptom values. The generated data sets are verified using Kullback–Leibler (K–L) distance and distance correlation (dCor) techniques, showing that the simulated data sets are well correlated with the real data set. Results: Using the data sets, the proposed model is built and new patients are provisionally diagnosed using the Softmax cost function. Multiple disease class labels are determined with about 98% accuracy and verified against the ground truth provided by the experts. Conclusions: It is worth mentioning that the system is intended for primary healthcare; in emergency cases, patients are referred to the experts.
DOI:	10.1007/s13755-019-0066-4
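
Illustrative note: the abstract mentions simulating additional records by the Monte Carlo method and checking them against the real data with the Kullback–Leibler (K–L) distance. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a single symptom (pulse rate), a Gaussian sampling distribution fitted to the observed values, histogram-based probabilities with add-one smoothing, and invented placeholder readings; the function names `bin_probabilities`, `kl_divergence` and `simulate` are hypothetical.

```python
import math
import random
import statistics


def bin_probabilities(values, edges):
    """Histogram the values over [edges[i], edges[i+1]) bins and return
    add-one-smoothed probabilities so that no bin has probability zero."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """Discrete K-L distance D(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def simulate(real, n, rng):
    """Monte Carlo draw of n values from a Gaussian fitted to the real sample.
    The Gaussian form is an assumption made for this sketch only."""
    mu = statistics.mean(real)
    sigma = statistics.stdev(real)
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Placeholder pulse-rate readings (beats per minute), invented for illustration.
real_pulse = [72, 75, 68, 80, 77, 70, 74, 79, 73, 76]

rng = random.Random(0)  # fixed seed so the run is repeatable
simulated_pulse = simulate(real_pulse, 1000, rng)

edges = [60, 65, 70, 75, 80, 85, 90]
p_real = bin_probabilities(real_pulse, edges)
p_sim = bin_probabilities(simulated_pulse, edges)

kl = kl_divergence(p_real, p_sim)
print(f"K-L distance (real || simulated): {kl:.4f}")
```

A value close to zero would suggest that the simulated sample follows the binned distribution of the real sample; the threshold for "close" and the choice of bins are left open here, since the record does not state them.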