Development of a method for differential diagnosis of iron deficiency anemia and anemia of chronic disease based on demographic data and routine laboratory tests using machine learning technologies
https://doi.org/10.17650/1818-8346-2025-20-1-171-181
Abstract
Background. Machine learning, a branch of artificial intelligence, is relevant to developing optimal screening strategies, identifying risk groups, and using less expensive and more accessible laboratory tests to assess body iron status.
Aim. To select an appropriate artificial intelligence algorithm for predicting serum ferritin (SF) levels and to evaluate its applicability to the differential diagnosis of iron deficiency anemia and anemia of chronic disease.
Materials and methods. A dataset of 9771 patients with microcytic or normocytic anemia was used to build the models. Using demographic data (gender and age), complete blood count, C‑reactive protein level, and the known SF level, two models were developed: a regression model that estimates the expected SF concentration for an individual patient and, from the same parameters, a classification model that assigns the patient to one of four SF level groups: I – < 15 μg/L; II – 15–100 μg/L; III – 100–300 μg/L; IV – ≥ 300 μg/L.
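The four SF level groups can be illustrated with a minimal Python sketch. The abstract lists 100 μg/L as a boundary of both group II and group III, so the half-open intervals used here (lower bound included, upper bound excluded) are an assumption, and the names `SF_GROUP_BOUNDS`, `SF_GROUP_LABELS`, and `sf_group` are hypothetical rather than taken from the authors' code.

```python
from bisect import bisect_right

# Group boundaries in μg/L, taken from the abstract.
# Assumption: each boundary value belongs to the higher group
# (e.g. exactly 100 μg/L is assigned to group III).
SF_GROUP_BOUNDS = (15.0, 100.0, 300.0)
SF_GROUP_LABELS = ("I", "II", "III", "IV")


def sf_group(sf_ug_per_l: float) -> str:
    """Return the SF level group (I-IV) for a ferritin value in μg/L."""
    if sf_ug_per_l < 0:
        raise ValueError("serum ferritin cannot be negative")
    # bisect_right counts how many boundaries are <= the value,
    # which is exactly the index of the group label.
    return SF_GROUP_LABELS[bisect_right(SF_GROUP_BOUNDS, sf_ug_per_l)]


if __name__ == "__main__":
    for value in (8.0, 15.0, 99.9, 100.0, 299.0, 300.0):
        print(value, sf_group(value))
```

The same mapping can be applied to a known SF value to obtain a class label, or to a regression output to compare it with the classifier's prediction.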
Results. The regression model showed moderate predictive ability (R² = 0.70; median absolute error 10.7 μg/L); the correlation coefficient between known and predicted SF levels was r = 0.854 (p < 0.05). The classification model showed high diagnostic accuracy across the SF level groups: among patients with reduced hemoglobin, ROC AUC for groups I, II, III, and IV was 0.91, 0.79, 0.84, and 0.90 in women (hemoglobin < 120 g/L) and 0.96, 0.76, 0.71, and 0.82 in men (hemoglobin < 130 g/L), respectively.
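For readers less familiar with the reported metrics, the following Python sketch shows one common way to compute each of them using only the standard library. It is not the authors' evaluation code: the function names are hypothetical, the numbers in the usage section are made-up toy values, and the per-group ROC AUC is assumed to be computed one-vs-rest (membership in a group versus all other groups), which the abstract does not specify.

```python
from statistics import mean, median


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def median_absolute_error(y_true, y_pred):
    """Median of |known - predicted| over all patients."""
    return median([abs(t - p) for t, p in zip(y_true, y_pred)])


def pearson_r(x, y):
    """Pearson correlation coefficient between two samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def roc_auc(is_positive, scores):
    """ROC AUC as the probability that a positive case scores higher
    than a negative one (ties count as one half)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values for illustration only, not data from the study.
    known_sf = [10.0, 20.0, 30.0, 40.0]
    predicted_sf = [12.0, 18.0, 33.0, 39.0]
    print(r_squared(known_sf, predicted_sf))
    print(median_absolute_error(known_sf, predicted_sf))
    print(pearson_r(known_sf, predicted_sf))
    # One-vs-rest: True marks patients who truly belong to the group.
    in_group = [True, True, False, False]
    group_score = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(in_group, group_score))
```

The pairwise AUC above is quadratic in the number of patients; it is written for clarity, and a rank-based implementation would be preferable for a dataset of several thousand records.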
Conclusion. SF levels predicted by the developed models can serve as an accurate and clinically relevant tool in routine medical practice for the differential diagnosis of iron deficiency anemia (predicted SF decreased, < 100 μg/L; normal C‑reactive protein) and anemia of chronic disease (predicted SF normal or increased, > 100 μg/L; elevated C‑reactive protein).
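The rule stated in the conclusion can be written down as a short Python sketch. The abstract gives no numeric cut-off for C‑reactive protein, so the caller is expected to pass a ready-made `crp_elevated` flag; the function name and the "indeterminate" result for combinations the abstract does not cover (including a predicted SF of exactly 100 μg/L) are assumptions made for illustration, not part of the published method.

```python
SF_CUTOFF_UG_PER_L = 100.0  # threshold quoted in the conclusion


def suggest_anemia_type(predicted_sf_ug_per_l: float, crp_elevated: bool) -> str:
    """Map predicted SF and CRP status to the category named in the abstract.

    Returns "IDA" (iron deficiency anemia), "ACD" (anemia of chronic
    disease), or "indeterminate" for combinations the abstract does not
    address (an assumption of this sketch).
    """
    if predicted_sf_ug_per_l < SF_CUTOFF_UG_PER_L and not crp_elevated:
        return "IDA"
    if predicted_sf_ug_per_l > SF_CUTOFF_UG_PER_L and crp_elevated:
        return "ACD"
    return "indeterminate"


if __name__ == "__main__":
    print(suggest_anemia_type(12.0, crp_elevated=False))
    print(suggest_anemia_type(250.0, crp_elevated=True))
    print(suggest_anemia_type(40.0, crp_elevated=True))
```

Returning an explicit third value keeps discordant cases, such as low predicted SF with elevated C‑reactive protein, visible instead of forcing them into one of the two diagnoses.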
About the Authors
N. V. Varekha
Russian Federation
Nikolay Vyacheslavovich Varekha
117198; 6 Miklukho-Maklaya St.; Moscow
N. I. Stuklov
Russian Federation
117198; 6 Miklukho-Maklaya St.; Moscow
K. V. Gordienko
Russian Federation
123007; 76A Khoroshevskoe Shosse; Moscow
R. R. Gimadiev
Russian Federation
117198; 6 Miklukho-Maklaya St.; 123007; 76A Khoroshevskoe Shosse; 119002; Build. 1, 14 Bolshoy Vlasyevsky Pereulok; Moscow
O. B. Shchegolev
Russian Federation
119002; Build. 1, 14 Bolshoy Vlasyevsky Pereulok; Moscow
S. N. Kislaya
Russian Federation
117198; 6 Miklukho-Maklaya St.; Moscow
E. V. Gubina
Russian Federation
119002; Build. 1, 14 Bolshoy Vlasyevsky Pereulok; Moscow
A. A. Gurkina
Russian Federation
117198; 6 Miklukho-Maklaya St.; Moscow
For citations:
Varekha N.V., Stuklov N.I., Gordienko K.V., Gimadiev R.R., Shchegolev O.B., Kislaya S.N., Gubina E.V., Gurkina A.A. Development of a method for differential diagnosis of iron deficiency anemia and anemia of chronic disease based on demographic data and routine laboratory tests using machine learning technologies. Oncohematology. 2025;20(1):171-181. (In Russ.) https://doi.org/10.17650/1818-8346-2025-20-1-171-181