Machine Learning to Predict Neonatal Mortality Using Public Health Data from São Paulo - Brazil

Carlos Eduardo Beluzo, Federal Institute of São Paulo
Luciana C. Alves, University of Campinas (UNICAMP)
Everton Silva, Federal Institute of São Paulo
Rodrigo Campos Bresan, Federal Institute of São Paulo
Natália M. Arruda, Federal Institute of São Paulo
Tiago Carvalho, Federal Institute of São Paulo

Infant mortality is one of the most important socioeconomic and health quality indicators in the world. In Brazil, neonatal mortality accounts for 70% of infant mortality. Despite its importance, neonatal mortality shows signs of increasing, which raises the need for efficient and effective methods to help reduce it. This paper proposes a new approach that classifies newborns who may be susceptible to neonatal mortality by applying supervised machine learning methods to public health features. The approach is evaluated on a sample of 15,858 records extracted from the SPNeoDeath dataset, which was created for this paper from the SINASC and SIM databases for the city of São Paulo (Brazil). As a result, an average AUC of 0.96 was achieved in classifying samples as susceptible to death or not with the SVM, XGBoost, Logistic Regression, and Random Forest machine learning algorithms. Furthermore, the SHAP method was used to identify the features that most influenced the algorithms' output.
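
For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed from binary labels and classifier scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are illustrative assumptions, and the pairwise formulation is used here only because it is easy to read.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    AUC equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = susceptible to death, 0 = not susceptible.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of the 6 positive/negative pairs are ranked correctly
```

An average AUC such as the one in the abstract would then be the mean of this value over the evaluated models or folds; how the authors averaged it is not stated here.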


Presented in Session 31. Mortality Predictions