Enhanced Workflow for High-Resolution Lithology Logs Using Multiple ML Algorithms

By Muhammad Osama. Reviewed by Susha Cheriyedath, M.Sc. Dec 12 2023

In an article published in Scientific Reports, researchers from Iran proposed a novel workflow that uses an enhanced weighted average ensemble approach with error-correcting output code (ECOC) and cost-sensitive learning (CSL) techniques to produce high-resolution lithology logs from conventional well logs. They demonstrated that the developed workflow can accurately predict underground lithofacies and outperform commonly used machine learning (ML) algorithms. The research addresses the challenges of multiclass imbalanced data classification and scalability, which arise from the complex geological heterogeneities and the large volume of data.

*Study: Enhanced Workflow for High-Resolution Lithology Logs Using Multiple ML Algorithms. Image credit: Nil Kulp/Shutterstock*

Background

Lithology logs are graphical representations of the subsurface rock types encountered during drilling operations. They provide valuable information for geologists and engineers to evaluate and correlate different formations. Well logs are measurements of the physical properties of subsurface rocks, such as gamma ray, neutron porosity, density, sonic, and photoelectric factor. These properties reflect changes in the lithology, texture, and structure of the rocks, and thus can be used to infer lithofacies, that is, rock units that share similar characteristics and depositional environments.

However, well logs are also affected by other factors, such as salinity, fluid content, diagenesis, fractures, and clay composition, which make the relationship between well logs and lithofacies complex and non-linear. Moreover, the distribution of lithofacies can be highly imbalanced due to the natural heterogeneity of the subsurface. This poses a challenge for traditional ML algorithms, which assume balanced classes and optimize overall accuracy, so they tend to favor the dominant classes at the expense of rare ones.
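To illustrate why accuracy alone can mislead on imbalanced classes, here is a minimal pure-Python sketch. The label names and the 90/8/2 class split are made up for illustration and are not taken from the study. A trivial model that always predicts the dominant class scores well on accuracy but poorly on the macro-averaged F-measure and on Cohen's Kappa.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    classes = set(y_true) | set(y_pred)
    p_chance = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    if p_chance == 1.0:
        return 0.0  # convention chosen here: Kappa is undefined in this case
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical imbalanced labels: 90% majority, 8% minor, 2% rare.
y_true = ["majority"] * 90 + ["minor"] * 8 + ["rare"] * 2
# A useless model that always predicts the dominant class.
y_pred = ["majority"] * 100

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f}  macro F1={mf1:.3f}  kappa={kappa:.3f}")
```

The two rare classes are never predicted, so their F1 scores are zero and pull the macro average down, while Kappa shows that the model does no better than chance agreement.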

About the Research

The paper introduces a workflow that relies on an enhanced weighted average ensemble approach, which combines several baseline ML algorithms into a single model with better performance and stability. Widely used baseline algorithms in lithofacies classification include support vector machine (SVM), random forest (RF), decision tree (DT), logistic regression (LR), and extreme gradient boosting (XGBoost).
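The basic idea of a weighted average ensemble can be sketched as follows. This is only an illustration of plain weighted soft voting, not the authors' enhanced scheme, whose details are not given here. The probability values and the weights are hypothetical; in practice, weights are often derived from each model's validation score.

```python
import numpy as np


def weighted_average_ensemble(prob_list, weights):
    """Combine per-model class probabilities with a weighted average.

    prob_list: list of arrays, each of shape (n_samples, n_classes)
    weights:   one non-negative weight per model (normalized internally)
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(prob_list)                 # (n_models, n_samples, n_classes)
    avg = np.tensordot(weights, stacked, axes=1)  # (n_samples, n_classes)
    return avg, avg.argmax(axis=1)


# Hypothetical class probabilities from two base models for two samples.
svm_probs = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3]])
rf_probs = np.array([[0.3, 0.6, 0.1],
                     [0.1, 0.2, 0.7]])

# Assumed weights: the second model is trusted three times as much as the first.
avg, labels = weighted_average_ensemble([svm_probs, rf_probs], weights=[0.25, 0.75])
print(avg)
print(labels)
```

With these weights, the second model's opinion dominates, so both samples receive the class it prefers even though the first model disagrees.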

This study addresses the challenge of multiclass imbalanced data by using two strategies: (1) decomposing the multiclass problem into binary subproblems using ECOC and (2) applying CSL to assign different weights and penalties to different classes according to their importance and rarity.
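A rough sketch of how these two strategies can be combined in scikit-learn is shown below. It is not the authors' implementation: the data is synthetic, `OutputCodeClassifier` uses a randomly generated code book, and `class_weight="balanced"` is used here as a simple stand-in for cost-sensitive learning. The actual cost assignment in the study may differ.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC

# Synthetic, imbalanced three-class data standing in for well-log features.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, weights=[0.8, 0.15, 0.05],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# CSL stand-in: misclassifying the rarer side of each binary split costs more.
base = SVC(kernel="rbf", class_weight="balanced")

# ECOC: each class gets a binary code word, and one binary SVM is trained per bit.
# code_size=2.0 means int(2.0 * n_classes) = 6 binary subproblems.
ecoc = OutputCodeClassifier(base, code_size=2.0, random_state=0)
ecoc.fit(X_train, y_train)

y_pred = ecoc.predict(X_test)
print("binary subproblems:", len(ecoc.estimators_))
print("predicted classes:", sorted(set(y_pred)))
```

At prediction time, the outputs of the binary classifiers are compared with each class's code word, and the closest code word determines the predicted class.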

The authors used a dataset from a Middle Eastern oilfield, which consists of conventional well logs and lithology logs from five wells. They held out one well as a blind well and used the data from the other four wells to train and test the ML algorithms. The lithology logs are divided into seven classes: shale, limestone, argillaceous limestone, chalky limestone, cherty limestone, pyritic limestone, and shaly limestone. The dataset exhibits a significant imbalance among the classes, with shale being the most dominant and pyritic limestone being the rarest.
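Holding out a blind well amounts to splitting by well identifier rather than by random sample. A minimal sketch, assuming hypothetical well names (`W1` to `W5`) and toy feature values that are not from the study:

```python
import numpy as np

# Hypothetical well identifier for each depth sample (two samples per well).
well_ids = np.array(["W1", "W1", "W2", "W2", "W3", "W3", "W4", "W4", "W5", "W5"])
X = np.arange(20, dtype=float).reshape(10, 2)  # toy log readings
y = np.array([0, 1, 0, 0, 1, 2, 0, 1, 2, 0])   # toy lithology class labels

blind_well = "W5"  # assumed choice of blind well
blind_mask = well_ids == blind_well

# Samples from the four remaining wells are used for training and testing.
X_dev, y_dev = X[~blind_mask], y[~blind_mask]
# The blind well is never seen during training or tuning.
X_blind, y_blind = X[blind_mask], y[blind_mask]

print(X_dev.shape, X_blind.shape)
```

Splitting by well keeps neighboring depth samples from the same borehole out of both sets at once, which gives a more realistic estimate of performance on a new well.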

Methodologies Used

Researchers used the following three-step workflow:

Data preparation: They checked for missing values and outliers, encoded categorical features, removed unnecessary columns, standardized the data, applied linear discriminant analysis (LDA) as a noise reduction technique, and split the dataset into train, test, and blind verification sets.

Model training: The hyperparameters of the baseline algorithms were tuned using grid search and cross-validation, and the tuned algorithms were then trained with different combinations of ECOC and CSL. A voting ensemble classifier and an enhanced weighted average ensemble classifier were also trained.

Model evaluation: Two metrics suitable for imbalanced multiclass data, the mean F-measure and the mean Kappa statistic, were used to assess the performance of the models. The results of the models on both the test set and the blind set were compared to identify the optimal workflow that achieves the best performance.
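The three steps above can be sketched end to end with scikit-learn. This is an illustrative outline under several assumptions, not the authors' code: the data is synthetic, a single random forest stands in for the baseline algorithms, the hyperparameter grid is arbitrary, and the scaler and LDA are fitted inside the pipeline on training folds only.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score, f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced three-class data standing in for well-log features.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, weights=[0.8, 0.15, 0.05],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1 (data preparation): standardize, then project with LDA.
# LDA allows at most n_classes - 1 components, so 2 for three classes.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("lda", LinearDiscriminantAnalysis(n_components=2)),
    ("clf", RandomForestClassifier(class_weight="balanced", random_state=0)),
])

# Step 2 (model training): grid search with stratified cross-validation.
param_grid = {"clf__n_estimators": [25, 50], "clf__max_depth": [3, None]}
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, scoring="f1_macro", cv=cv)
search.fit(X_train, y_train)

# Step 3 (model evaluation): macro F-measure and Cohen's Kappa on held-out data.
y_pred = search.predict(X_test)
macro_f1 = f1_score(y_test, y_pred, average="macro")
kappa = cohen_kappa_score(y_test, y_pred)
print("best params:", search.best_params_)
print(f"macro F1={macro_f1:.3f}  kappa={kappa:.3f}")
```

Keeping the preprocessing inside the pipeline means each cross-validation fold refits the scaler and LDA on its own training portion, which avoids leaking information from the validation fold into the tuning step.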

Research Findings

The outcomes show that the enhanced weighted average ensemble of SVM and RF coupled with ECOC and CSL performs the best among all the models, with a mean F-measure of 91.04% and a mean Kappa statistic of 84.50% on the blind set. Moreover, the ECOC and CSL strategies were effective in handling multiclass imbalanced data, as they improved the performance of the baseline algorithms and the ensemble models.

Furthermore, the lithology log generated by the optimal workflow was reasonably similar to the original one, and the workflow could accurately identify minority classes, such as pyritic limestone, which are often neglected by other models. The study also provides a graphical comparison of the generated and original lithology logs, which demonstrates the robustness and reliability of the optimal workflow.

The study has several applications for the geo-energy industry, as it provides a reliable and automated solution for generating high-resolution lithology logs from conventional well logs. It can help characterize and evaluate subsurface reservoirs and can also be applied to other fields and domains that deal with multiclass imbalanced data, such as image processing, medical diagnosis, and fraud detection.

Conclusion

In summary, the paper presents a versatile and robust workflow for classifying lithofacies and generating high-resolution lithology logs from conventional well logs. This workflow can handle the challenges of multiclass imbalanced data and scalability. Moreover, its performance was validated on a real field dataset from a Middle Eastern oilfield with complex and heterogeneous geological formations.

For future work, the researchers suggest exploring other ML techniques, such as deep learning and transfer learning, for lithofacies classification tasks, and accounting for other sources of uncertainty and noise in the data.

Journal reference:

Jamshidi Gohari, M.S., Emami Niri, M., Sadeghnejad, S. et al. An ensemble-based machine learning solution for imbalanced multiclass dataset during lithology log generation. Sci Rep 13, 21622 (2023). https://doi.org/10.1038/s41598-023-49080-7, https://www.nature.com/articles/s41598-023-49080-7

Written by

Muhammad Osama

Muhammad Osama is a full-time data analytics consultant and freelance technical writer based in Delhi, India. He specializes in transforming complex technical concepts into accessible content. He has a Bachelor of Technology in Mechanical Engineering with specialization in AI & Robotics from Galgotias University, India, and he has extensive experience in technical content writing, data science and analytics, and artificial intelligence.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

APA
Osama, Muhammad. (2023, December 14). Enhanced Workflow for High-Resolution Lithology Logs Using Multiple ML Algorithms. AZoAi. Retrieved on July 03, 2025 from https://www.azoai.com/news/20231212/Enhanced-Workflow-for-High-Resolution-Lithology-Logs-Using-Multiple-ML-Algorithms.aspx.
MLA
Osama, Muhammad. "Enhanced Workflow for High-Resolution Lithology Logs Using Multiple ML Algorithms". AZoAi. 03 July 2025. <https://www.azoai.com/news/20231212/Enhanced-Workflow-for-High-Resolution-Lithology-Logs-Using-Multiple-ML-Algorithms.aspx>.
Chicago
Osama, Muhammad. "Enhanced Workflow for High-Resolution Lithology Logs Using Multiple ML Algorithms". AZoAi. https://www.azoai.com/news/20231212/Enhanced-Workflow-for-High-Resolution-Lithology-Logs-Using-Multiple-ML-Algorithms.aspx. (accessed July 03, 2025).
Harvard
Osama, Muhammad. 2023. Enhanced Workflow for High-Resolution Lithology Logs Using Multiple ML Algorithms. AZoAi, viewed 03 July 2025, https://www.azoai.com/news/20231212/Enhanced-Workflow-for-High-Resolution-Lithology-Logs-Using-Multiple-ML-Algorithms.aspx.