Scholarly open access journals, Peer-reviewed, and Refereed Journals, Impact factor 8.14 (Calculate by google scholar and Semantic Scholar | AI-Powered Research Tool) , Multidisciplinary, Monthly, Indexing in all major database & Metadata, Citation Generator, Digital Object Identifier(DOI)
Accurate prediction of physical properties, such as the band gap, is crucial in the design and characterization of novel materials with unique functionalities. Although calculations based on density functional theory (DFT) are considered reliable and have been used extensively in the field, they are time-consuming and computationally expensive, so they cannot be applied on a broad scale in materials screening. To address this problem, we present a hybrid machine learning methodology that predicts the properties of new materials with a focus on the reliability of those predictions.
To predict the band gap, this work evaluates several regression models: Random Forest, XGBoost, ExtraTrees, CatBoost, a Multi-Layer Perceptron (MLP), and a stacking ensemble. The training set is prepared with a feature engineering pipeline that combines multiple descriptor families, including elemental, stoichiometric, valence orbital, and ion-related features. In addition, we employ data augmentation methods that supplement the real samples with synthetic data to improve model performance.
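As a rough illustration of the stacking idea described above, the following minimal sketch trains a stacked regressor with scikit-learn. It is an assumption-laden stand-in, not the authors' pipeline: the feature matrix is random toy data rather than real materials descriptors, XGBoost and CatBoost are left out so the example depends only on scikit-learn, and the `RidgeCV` meta-learner, the hyperparameters, and all variable names are hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy stand-in for a descriptor matrix and band gap targets (NOT the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] ** 2 + 0.3 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base learners; the MLP is wrapped in a pipeline so its inputs are standardized.
base_models = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("et", ExtraTreesRegressor(n_estimators=100, random_state=0)),
    (
        "mlp",
        make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
        ),
    ),
]

# The meta-learner is fitted on out-of-fold predictions of the base learners.
stack = StackingRegressor(estimators=base_models, final_estimator=RidgeCV(), cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
r2_test = stack.score(X_test, y_test)  # coefficient of determination on held-out data
```

Fitting the meta-learner on out-of-fold predictions (`cv=5`) keeps it from simply rewarding whichever base model overfits the training set most.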
We also propose a new method for evaluating the reliability of predictions. It combines the model's uncertainty estimates with an out-of-distribution (OOD) score, which is estimated using Isolation Forest and nearest-neighbor distance algorithms.
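One possible way to combine these signals is sketched below. The abstract does not specify how the uncertainty and OOD terms are computed or weighted, so everything here is an assumption made for illustration: uncertainty is taken as the spread of per-tree Random Forest predictions, each signal is min-max normalized, and the terms are averaged with equal weights. The data are synthetic, and names such as `reliability` and `minmax` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestRegressor
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic training set and a query set: 20 in-distribution points
# followed by 20 points shifted far away from the training data.
X_train = rng.normal(size=(200, 5))
y_train = X_train[:, 0] + 0.1 * rng.normal(size=200)
X_query = np.vstack([
    rng.normal(size=(20, 5)),
    rng.normal(loc=8.0, size=(20, 5)),
])

# Assumed uncertainty estimate: standard deviation across the forest's trees.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
per_tree = np.stack([tree.predict(X_query) for tree in forest.estimators_])
uncertainty = per_tree.std(axis=0)

# OOD signal 1: Isolation Forest. score_samples is lower for anomalies,
# so it is negated to make larger values mean "more out of distribution".
iso = IsolationForest(random_state=0).fit(X_train)
iso_score = -iso.score_samples(X_query)

# OOD signal 2: mean distance to the 5 nearest training points.
nn = NearestNeighbors(n_neighbors=5).fit(X_train)
distances, _ = nn.kneighbors(X_query)
knn_score = distances.mean(axis=1)


def minmax(values):
    """Rescale to [0, 1]; the small constant guards against a zero range."""
    return (values - values.min()) / (values.max() - values.min() + 1e-12)


# Assumed equal-weight combination; 1.0 means most reliable.
ood = 0.5 * (minmax(iso_score) + minmax(knn_score))
reliability = 1.0 - 0.5 * (minmax(uncertainty) + ood)
```

In this toy setup the shifted query points receive higher OOD scores than the in-distribution ones. In practice the normalization and weights would need to be calibrated on held-out data.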
Experimental results show that the proposed framework has high predictive ability. Among the models considered, XGBoost performs best on R² and RMSE, with an R² of 0.912, an MAE of 0.250 eV, and an RMSE of 0.349 eV. Random Forest ranks second, with an R² of 0.905, an RMSE of 0.361 eV, and a marginally lower MAE of 0.248 eV. The stacking ensemble achieves an R² of 0.901, while ExtraTrees (R² = 0.878), MLP (R² = 0.869), and CatBoost (R² = 0.861) give comparable results. In contrast, the GNN model performs poorly, with a negative R², which points to the shortcomings of the structure-based approach when insufficient data are available.
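For readers less familiar with the three metrics reported above, the short sketch below computes them by hand with NumPy. The four values are made-up toy numbers chosen so the arithmetic is easy to follow; they are not taken from the paper.

```python
import numpy as np

# Toy band gaps in eV (illustrative only, not results from the study).
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 4.0])

errors = y_true - y_pred

# Mean absolute error: average magnitude of the errors.
mae = np.mean(np.abs(errors))

# Root mean squared error: penalizes large errors more strongly than MAE.
rmse = np.sqrt(np.mean(errors ** 2))

# R²: one minus the ratio of residual to total sum of squares.
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

Because RMSE weights large errors more heavily than MAE, one model can have the lower RMSE while another has the lower MAE, as with XGBoost and Random Forest above. R² becomes negative when a model's squared error exceeds that of always predicting the mean of the targets.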
In conclusion, the experimental results demonstrate the effectiveness of hybrid ensemble learning approaches for materials property prediction.
Keywords:
Materials Informatics, Band Gap Prediction, Machine Learning, Graph Neural Networks, Ensemble Learning, Uncertainty Quantification, Out-of-Distribution Detection, Materials Discovery
Cite Article:
"Verbindung - A Physics-Aware Multi-Task Learning Framework for Coupled Prediction of Structural and Electronic Properties in Materials", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN:2456-3315, Vol.11, Issue 4, page no.a758-a771, April-2026, Available :http://www.ijrti.org/papers/IJRTI2604109.pdf
ISSN:
2456-3315 | IMPACT FACTOR: 8.14 Calculated By Google Scholar| ESTD YEAR: 2016