Using ML to Predict Anemia Among Young Girls in Ethiopia

In an article published in the journal Scientific Reports, researchers explored the prevalence and prediction of anemia among young girls in Ethiopia using machine learning (ML) algorithms. Analyzing data from the 2016 Ethiopian Demographic and Health Survey (EDHS), the authors evaluated various predictors of anemia and identified the random forest classifier as the most effective model.

Study workflow diagram. Image Credit: https://www.nature.com/articles/s41598-024-60027-4

Key determinants included socioeconomic factors, demographic characteristics, and lifestyle choices. The findings suggested targeted interventions to address anemia among young girls in Ethiopia.

Background

Anemia, characterized by a deficiency in red blood cells or hemoglobin levels, poses significant health risks globally, particularly among young women. This demographic group, due to heightened physiological needs for essential nutrients like iron and folic acid, is particularly susceptible to anemia, exacerbated by factors such as intestinal parasitic infestations prevalent in developing countries. While anemia among reproductive-age women has been extensively studied, research specifically focusing on young women, particularly in Ethiopia, has been limited.

Previous studies in Ethiopia have primarily used traditional statistical methods to analyze anemia prevalence and its determinants, and these approaches may overlook intricate relationships within the data. This paper addressed the gap by employing ML techniques to predict anemia and identify its predictors among young girls in Ethiopia. Because ML algorithms can handle nonlinear data and capture complex interrelationships among predictors, the study aimed to provide a more nuanced understanding of anemia prevalence and its associated factors.

The research used data from the 2016 EDHS and applied eight ML classification algorithms, together with association rule mining, to predict anemia and identify its predictors. In doing so, it offered a novel approach to analyzing anemia prevalence among young women and provided insights that policymakers can use to design targeted interventions for this vulnerable group.

Predicting Anemia Among Young Girls in Ethiopia Using Advanced ML Techniques

The research utilized data from the 2016 EDHS, a nationally representative cross-sectional survey conducted in Ethiopia. Covering nine regional states and two city administrations, the survey included a diverse sample of women aged 15-49. For this study, a weighted sample of 5,642 young girls was analyzed. The primary objective was to predict anemia among these young girls and identify its predictors using advanced ML algorithms.

Various demographic, socioeconomic, and lifestyle factors were considered as potential predictors of anemia, including age group, religion, wealth index, occupation, media exposure, educational status, source of drinking water, family size, body mass index, altitude, type of residence, and region. Data preprocessing involved cleaning, handling missing values, and addressing imbalanced categories in the outcome variable. Multiple data balancing techniques were employed to improve model performance.
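One of the balancing techniques the article names later is the synthetic minority oversampling technique (SMOTE). The sketch below illustrates the core idea under simplifying assumptions: it uses only numeric features, a hypothetical function name (`smote_like_oversample`), and made-up data points rather than EDHS records. It is not the authors' implementation. In practice a library such as imbalanced-learn would normally be used, and categorical survey variables would need a variant designed for them.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical scaled numeric features; illustrative values, not study data.
minority = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.70], [0.30, 0.95]]
majority_count = 10  # assumed size of the majority class

# Generate enough synthetic minority points to match the majority class.
new_points = smote_like_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))
```

Because each synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies instead of simply duplicating existing rows.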

Eight state-of-the-art ML algorithms were applied, including decision tree, random forest, extreme gradient boost, light gradient boosting machine, support vector machine, logistic regression, k-nearest neighbor, and Gaussian Naïve Bayes. Evaluation metrics such as accuracy, sensitivity, specificity, F1 score, and area under the curve (AUC) were used to assess model performance. Additionally, 10-fold cross-validation was utilized to validate the models.
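The evaluation metrics listed above all derive from a binary confusion matrix, apart from AUC, which depends on how the model ranks cases. The following sketch computes them from scratch on a small invented set of labels and scores. The 0.5 decision threshold and the data are assumptions for illustration only.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 from binary labels (1 = anemic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # recall: share of anemic cases detected
    specificity = tn / (tn + fp)   # share of non-anemic cases correctly cleared
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as half), which equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented labels and predicted probabilities, not study results.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, round(auc_value, 3))
```

Sensitivity and specificity are reported separately because, with roughly one in four girls anemic, accuracy alone can look good for a model that rarely predicts the minority class.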

Furthermore, feature engineering techniques such as one-hot encoding and label encoding were applied, along with dimensionality reduction methods to streamline the input variables. Model interpretability was enhanced through SHapley Additive exPlanations (SHAP) analysis to understand feature importance and association rule mining to uncover patterns related to anemia.
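The two encoding schemes mentioned above can be sketched in a few lines. The article does not say which variables received which encoding, so the assignment below (integer codes for wealth index, one-hot columns for type of residence) and the category values are assumptions made purely for illustration.

```python
def label_encode(values, order):
    """Map each category to an integer code following an explicit order."""
    mapping = {category: code for code, category in enumerate(order)}
    return [mapping[v] for v in values]

def one_hot_encode(values):
    """Return the sorted category names and one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical survey responses, not EDHS records.
wealth = ["poor", "rich", "middle", "poor"]
residence = ["rural", "urban", "rural", "rural"]

# Integer codes suit categories with a natural order.
wealth_codes = label_encode(wealth, order=["poor", "middle", "rich"])

# One-hot columns avoid implying an order between unordered categories.
residence_columns, residence_onehot = one_hot_encode(residence)

print(wealth_codes)
print(residence_columns, residence_onehot)
```

The design difference matters for distance-based models such as k-nearest neighbor: integer codes imply that "rich" is further from "poor" than "middle" is, which is reasonable for wealth but would be arbitrary for a variable such as region.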

Ethical considerations were addressed, ensuring compliance with ethical standards and obtaining informed consent from respondents. Overall, the study provided valuable insights for policymakers to develop targeted interventions and mitigate the prevalence of anemia among young girls in Ethiopia.

Findings and Predictive Analysis Insights

The study analyzed data from 5,642 young girls in Ethiopia, revealing that 25.43% of them were anemic. Socio-demographic characteristics highlighted disparities, with most respondents aged 15–19, from rural areas, and with Orthodox Christian affiliation. Feature selection using the Boruta algorithm identified influential variables for predicting anemia status. Household sex and smoking status were deemed unimportant and excluded from further analysis.

Data balancing techniques, including the synthetic minority oversampling technique (SMOTE), enhanced model performance, with the random forest model outperforming the others and achieving an AUC of 0.824 (82.4%). Hyperparameter tuning via grid search was used to improve precision, recall, and F1 score. Among the selected algorithms, random forest, extreme gradient boosting, and support vector machine performed best, with AUC values of 0.824, 0.776, and 0.736, respectively.
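Grid search means trying every candidate hyperparameter value and keeping the one with the best cross-validated score. The sketch below shows the mechanism with a tiny k-nearest neighbor classifier, one of the eight algorithms listed earlier, written from scratch. The synthetic data, the candidate grid, the use of accuracy as the score, and the 5 unstratified folds are all assumptions chosen to keep the example short. The study itself reports 10-fold cross-validation, and a library routine such as scikit-learn's `GridSearchCV` would normally be used.

```python
import random

def knn_predict(train_x, train_y, point, k):
    """Majority vote among the k nearest training points (ties go to class 0)."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], point)),
    )
    votes = [label for _, label in ranked[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0

def cross_val_accuracy(xs, ys, k, n_folds=5, seed=0):
    """Average accuracy over n_folds held-out folds of shuffled data."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_x = [xs[i] for i in idx if i not in held_out]
        train_y = [ys[i] for i in idx if i not in held_out]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in fold)
    return correct / len(xs)

def grid_search(xs, ys, grid, n_folds=5):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: cross_val_accuracy(xs, ys, k, n_folds) for k in grid}
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Synthetic two-class data: two Gaussian clusters, not survey data.
rng = random.Random(42)
xs = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
xs += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(30)]
ys = [0] * 30 + [1] * 30

best_k, scores = grid_search(xs, ys, grid=[1, 3, 5, 7])
print(best_k, scores)
```

Scoring each candidate on held-out folds rather than on the training data is what keeps the search from simply rewarding the most flexible setting.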

SHAP value interpretation revealed region, media exposure, and marital status as significant predictors, while association rule mining identified key factors influencing anemia likelihood, such as age, region, and wealth index. For instance, a 96.3% likelihood of anemia was associated with young girls aged 15–19 from Dire Dawa with primary education and poor wealth index. These findings provided valuable insights for targeted interventions to address anemia among young girls in Ethiopia, emphasizing the importance of socio-demographic factors and regional disparities.
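A figure such as the 96.3% likelihood quoted above reads like the confidence of an association rule, and this sketch assumes that interpretation. Confidence is the share of records matching the rule's conditions that also show the outcome. The example computes support, confidence, and lift for one rule on a handful of invented records. The field names and values are hypothetical and do not reproduce the study's rules.

```python
def matches(record, conditions):
    """True when the record has every key/value pair in conditions."""
    return all(record.get(key) == value for key, value in conditions.items())

def rule_stats(records, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    n = len(records)
    with_antecedent = [r for r in records if matches(r, antecedent)]
    with_both = [r for r in with_antecedent if matches(r, consequent)]
    with_consequent = [r for r in records if matches(r, consequent)]
    support = len(with_both) / n
    confidence = len(with_both) / len(with_antecedent)
    lift = confidence / (len(with_consequent) / n)
    return support, confidence, lift

# Invented records for illustration only.
records = [
    {"age": "15-19", "wealth": "poor", "anemic": "yes"},
    {"age": "15-19", "wealth": "poor", "anemic": "yes"},
    {"age": "15-19", "wealth": "poor", "anemic": "no"},
    {"age": "15-19", "wealth": "rich", "anemic": "no"},
    {"age": "20-24", "wealth": "poor", "anemic": "no"},
    {"age": "20-24", "wealth": "rich", "anemic": "no"},
    {"age": "20-24", "wealth": "rich", "anemic": "yes"},
    {"age": "20-24", "wealth": "middle", "anemic": "no"},
]

support, confidence, lift = rule_stats(
    records,
    antecedent={"age": "15-19", "wealth": "poor"},
    consequent={"anemic": "yes"},
)
print(round(support, 3), round(confidence, 3), round(lift, 3))
```

A lift above 1 indicates that the outcome is more common among records matching the conditions than in the data overall. A high confidence can still rest on very few records, which is why support is usually reported alongside it.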

Insights and Implications

By evaluating eight different algorithms, including random forest and support vector machine, the authors showed that ML models can predict anemia with reasonable discriminative performance, which could support automated screening tools in healthcare. Insights from SHAP value analysis also pointed to key predictors such as media exposure and wealth status. The study had limitations: the ML models do not yield regression coefficients, so effect sizes cannot be read directly from them, and the analysis relied on a limited set of data sources. Even so, the findings offer a basis for targeted interventions and policy decisions to reduce the impact of anemia among vulnerable populations.

Conclusion

In conclusion, insights from ML analysis of anemia prevalence among young girls in Ethiopia underscored the significance of targeted interventions informed by predictive models. By leveraging advanced algorithms, socio-demographic determinants like media exposure and wealth status were identified as pivotal factors. However, challenges such as data limitations and model interpretability needed addressing.

Despite these, the study's implications for public health interventions were profound, emphasizing the transformative potential of ML in mitigating anemia's impact. Continued research and tailored interventions are crucial for effectively addressing anemia among vulnerable populations in Ethiopia and beyond.

Journal reference:
https://www.nature.com/articles/s41598-024-60027-4

Written by

Soham Nandi

Soham Nandi is a technical writer based in Memari, India. His academic background is in Computer Science Engineering, specializing in Artificial Intelligence and Machine learning. He has extensive experience in Data Analytics, Machine Learning, and Python. He has worked on group projects that required the implementation of Computer Vision, Image Classification, and App Development.

Citations

Please use the following format to cite this article in your essay, paper or report:

Nandi, Soham. (2024, April 30). Using ML to Predict Anemia Among Young Girls in Ethiopia. AZoAi. Retrieved on November 21, 2024 from https://www.azoai.com/news/20240430/Using-ML-to-Predict-Anemia-Among-Young-Girls-in-Ethiopia.aspx.
