In an article recently published in the journal PLOS Digital Health, researchers performed a systematic review and meta-analysis to determine the diagnostic accuracy of artificial intelligence (AI) in detecting fractures across several data types and imaging modalities.
Importance of AI in fracture detection
Bone fractures are a significant global public health concern, particularly for people with osteoporosis. Fractures often lead to health complications, reduced quality of life, disability, higher healthcare costs, and work absences, all of which affect individuals, families, and societies.
Different imaging modalities, such as magnetic resonance imaging (MRI), computed tomography (CT), and X-rays, have been employed in fracture detection and diagnosis. AI, including deep learning (DL) and machine learning (ML), has been used extensively to predict fracture outcomes due to accessibility and technological advancements.
Additionally, AI can predict fractures using tabular data, such as electronic medical records or other structured patient-level data. However, only a few studies have used AI with tabular data to predict fractures despite the increasing importance of AI in the past few years.
Recent meta-analyses and systematic reviews have reported high precision for AI in fracture classification and detection. However, these existing meta-analyses and systematic reviews focused primarily on image-based analyses and did not examine the full range of data types and imaging modalities. Specifically, a crucial gap exists in the current literature regarding the optimal selection of imaging modality and the choice between tabular data, image data, or a combination of both.
The study
In this study, researchers performed a systematic review and meta-analysis to evaluate the efficacy of AI in fracture detection using different data types and imaging modalities. They synthesized the existing evidence on fracture detection using AI to identify the strengths and limitations of the various data types.
Peer-reviewed studies developing and validating AI models, including ML and DL models, for fracture detection were identified by searching multiple electronic databases without time limitations, with the last search conducted on December 15, 2022. Ultimately, 66 studies were included in the systematic review and meta-analysis: 54 studies identified fractures using imaging-related data (X-ray or MRI images), nine studies used tabular data (structured electronic health records data), and three studies used both imaging and tabular data.
All selected studies were published between 2007 and 2022, with 48 of them published in the last three years. A hierarchical meta-analysis model was used to calculate pooled sensitivity and specificity. Additionally, a diagnostic accuracy quality assessment was performed to evaluate applicability and risk of bias.
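To make the two pooled metrics concrete, here is a minimal Python sketch of how sensitivity and specificity are computed from per-study counts. It is an illustration only: the counts are invented, the names `StudyCounts` and `naive_pooled` are hypothetical, and the pooling simply sums counts across studies. The review itself used a hierarchical model, which additionally accounts for variation between studies, so this sketch does not reproduce the authors' method or results.

```python
from dataclasses import dataclass


@dataclass
class StudyCounts:
    """2x2 diagnostic counts for one hypothetical study (invented numbers)."""
    tp: int  # fractures the model flagged
    fn: int  # fractures the model missed
    tn: int  # non-fractures the model correctly cleared
    fp: int  # non-fractures the model wrongly flagged


def sensitivity(s: StudyCounts) -> float:
    """Share of true fractures that were detected."""
    return s.tp / (s.tp + s.fn)


def specificity(s: StudyCounts) -> float:
    """Share of non-fractures that were correctly cleared."""
    return s.tn / (s.tn + s.fp)


def naive_pooled(studies: list[StudyCounts]) -> tuple[float, float]:
    """Pool by summing counts across studies.

    This is a simplification, not the hierarchical model used in the review.
    """
    tp = sum(s.tp for s in studies)
    fn = sum(s.fn for s in studies)
    tn = sum(s.tn for s in studies)
    fp = sum(s.fp for s in studies)
    return tp / (tp + fn), tn / (tn + fp)


# Two made-up studies, for illustration only.
studies = [
    StudyCounts(tp=90, fn=10, tn=80, fp=20),
    StudyCounts(tp=45, fn=5, tn=95, fp=5),
]
pooled_sens, pooled_spec = naive_pooled(studies)
print(f"pooled sensitivity: {pooled_sens:.3f}")
print(f"pooled specificity: {pooled_spec:.3f}")
```

Summing counts gives larger studies more weight and ignores between-study differences, which is one reason diagnostic meta-analyses typically prefer hierarchical models.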
Review findings
Vertebral fracture was the most common fracture outcome in the selected studies, followed by hip fracture. Other fracture outcomes were femoral neck fracture, scaphoid fracture, wrist fracture, thoracolumbar fracture, and multiple-fracture categories, including pelvic, spine, and rib; pelvic and limb; osteoporotic fractures; major osteoporotic fractures; hip and spine; and hip and pelvic. Hip fractures demonstrated the highest pooled sensitivity and specificity.
The convolutional neural network (CNN), a DL approach, was used extensively in the 54 studies using imaging-related data, followed by the transfer learning approach. In the nine studies that employed tabular data, the fully connected artificial neural network (ANN) was the most preferred AI technique.
Logistic regression (LR) and ensemble learning models, including XGBoost, gradient boosting, and random forest, were also commonly utilized. The three studies that used both tabular and image data primarily employed support vector machines (SVMs) with different kernel models.
AI showed high diagnostic accuracy for different fracture outcomes, which indicated the potential utility of this technology in healthcare systems for fracture diagnosis. The pooled sensitivity and specificity using image data were higher than those obtained using tabular data.
AI displayed high classification accuracy for fracture detection when using imaging data, with 92% pooled sensitivity. CNNs with transfer learning demonstrated substantially high accuracy when utilizing image data for fracture classification. Among the imaging modalities, radiographs yielded the highest pooled sensitivity and specificity, with a pooled sensitivity of 94%. The assessment of applicability and bias for the 66 studies revealed low to moderate concerns. Patient selection and reference standards were the major concerns in assessing diagnostic accuracy for bias and applicability.
Conclusion
Overall, the findings of this review aligned with other meta-analyses and systematic reviews, which demonstrated that high pooled sensitivity and specificity for fracture detection could be achieved using AI. However, inconsistent results were observed when comparing various imaging modalities in fracture detection. Although external validation demonstrates clinical utility more effectively than simple internal train/test cross-validation, this review showed that only 13 of the 66 selected studies performed external validation.
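The difference between internal and external validation can be sketched with a toy example. Everything below is hypothetical: the data are invented, the "model" is a single threshold on one made-up score, and the function names are illustrative. It is not drawn from any study in the review and only shows why a model can look stronger on held-out data from its own source than on data from a different source.

```python
# Each record is (score, label), where label 1 = fracture and 0 = no fracture.
# All values are invented for illustration.
train = [(0.2, 0), (0.3, 0), (0.4, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
internal_test = [(0.25, 0), (0.35, 0), (0.75, 1), (0.85, 1)]  # same source as train
external_test = [(0.4, 0), (0.6, 0), (0.5, 1), (0.9, 1)]      # different, shifted source


def fit_threshold(data):
    """Toy 'training': place the cut-off midway between the two class means."""
    neg = [x for x, y in data if y == 0]
    pos = [x for x, y in data if y == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2


def accuracy(data, threshold):
    """Fraction of records where thresholding the score matches the label."""
    correct = sum(1 for x, y in data if (x >= threshold) == (y == 1))
    return correct / len(data)


threshold = fit_threshold(train)
internal_acc = accuracy(internal_test, threshold)
external_acc = accuracy(external_test, threshold)
print(f"internal accuracy: {internal_acc:.2f}")
print(f"external accuracy: {external_acc:.2f}")
```

In this contrived case the external cohort's scores overlap the learned cut-off, so performance drops. That kind of gap only becomes visible when a model is tested on data from outside the setting in which it was developed.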
To summarize, the review highlighted the high accuracy and reliability of AI in diagnosing different fracture outcomes. However, greater transparency in the reporting of study designs and methods for AI validation and development is essential to ensure the clinical applicability of AI.
Journal reference:
- Jung, J., Dai, J., Liu, B., Wu, Q. (2024). Artificial intelligence in fracture detection with different image modalities and data types: A systematic review and meta-analysis. PLOS Digital Health, 3(1), e0000438. https://doi.org/10.1371/journal.pdig.0000438. https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000438