Ashna JOSE – Active Learning approaches to Discover Spin-Crossover Metal Organic Frameworks for Carbon Capture

Supervised by Noël Jakse (SIMaP), Roberta Poloni (SIMaP) and Emilie Devijver (LIG).

Jury

Rémi EMONET, Associate Professor, Université Jean Monnet Saint-Etienne (Reviewer)
Rocio SEMINO, Associate Professor, Sorbonne Université (Reviewer)
Massih-Reza AMINI, Professor, Université Grenoble Alpes (Examiner)
Milica TODOROVIC, Associate Professor, University of Turku (Examiner)
Gian-Marco RIGNANESE, Research Director, FNRS (Examiner)
Noël JAKSE, Professor, Université Grenoble Alpes (Thesis supervisor)
Roberta POLONI, Research Scientist (HDR), CNRS (Co-supervisor, invited member)
Emilie DEVIJVER, Research Scientist, CNRS (Co-supervisor, invited member)
 

Abstract


Spin-crossover (SCO) metal-organic frameworks (MOFs) are porous materials that can switch their spin state in response to external stimuli. This ability has been shown to improve the efficiency of gas capture and release, because the gas binding energy changes with the spin state. Discovering new SCO MOFs is challenging because computational methods struggle to calculate spin energetics accurately. While machine learning approaches have been used to predict SCO thermodynamic properties, these efforts are limited to Fe(II) complexes in octahedral coordination. The complexity and size of MOFs, along with the extensive human and computational effort required to determine thermodynamic properties accurately, result in a lack of labeled data and have hindered the identification and discovery of these materials. To overcome these challenges, this thesis uses machine learning. An active learning approach named Regression Tree-based Active Learning (RT-AL) is developed, which samples informative training data using the partitions of a regression tree. Compared with state-of-the-art active learning techniques, it is found to be competitive and more consistent across different datasets and different types of features. The method’s performance is then demonstrated on databases of synthesised as well as hypothetical MOFs to predict electronic band gaps and gas adsorption properties. Descriptors suited to different properties are studied in the low-data regime, and it is found that simpler, low-dimensional descriptors are suitable for predicting these properties when little labeled data is available. A complete data-driven workflow is then designed for predicting adiabatic energy differences in MOFs and identifying those that can undergo SCO.
The MOF-2184 dataset is set up, and an extension of RT-AL developed in this work, Quantile Regression Tree-based Active Learning (QRT-AL), is used to select training samples from this set so that sampling focuses on label values in the region of interest. A workflow built on the AiiDA workflow manager, the SCO-MOF workflow, is developed to automate the spin-polarised DFT calculations and compute adiabatic energy differences. Ensemble tree-based models are trained to make predictions on unseen data, and MOFs of interest for carbon capture are identified. This work is the first in the field of data-driven discovery of SCO MOFs for carbon capture, and opens the door to many more possibilities in the future.
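
To give a rough idea of what sampling "using the partitions of a regression tree" can look like, here is a minimal Python sketch built on scikit-learn. It is not the RT-AL implementation from the thesis: the function name `select_batch`, the weighting rule (each leaf is weighted by its share of the unlabeled pool times the variance of its labeled targets) and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def select_batch(X_labeled, y_labeled, X_pool, batch_size,
                 max_leaf_nodes=8, seed=0):
    """Return indices into X_pool to label next (hypothetical helper).

    Assumed rule, for illustration only: a leaf is sampled with probability
    proportional to (number of pool points in the leaf) x (variance of the
    labeled targets in the leaf); points inside a leaf are drawn uniformly.
    """
    rng = np.random.default_rng(seed)

    # Fit a regression tree on the data labeled so far; its leaves
    # partition the feature space.
    tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes,
                                 random_state=seed)
    tree.fit(X_labeled, y_labeled)

    # Leaf membership of labeled and unlabeled points.
    leaf_labeled = tree.apply(X_labeled)
    leaf_pool = tree.apply(X_pool)

    # Variance of the labeled targets in each leaf. Every leaf contains at
    # least one labeled point because the tree was fitted on them.
    leaf_var = {leaf: np.var(y_labeled[leaf_labeled == leaf])
                for leaf in np.unique(leaf_labeled)}

    # Giving every pool point the variance of its leaf as weight makes the
    # total weight of a leaf equal to (pool count) x (variance).
    # The small constant keeps all probabilities strictly positive.
    weights = np.array([leaf_var[leaf] for leaf in leaf_pool]) + 1e-12
    probs = weights / weights.sum()

    return rng.choice(len(X_pool), size=batch_size, replace=False, p=probs)


if __name__ == "__main__":
    # Synthetic stand-in for a descriptor matrix and a target property.
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(200, 2))
    y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

    # Start from 20 labeled points and query 10 more from the pool.
    picked = select_batch(X[:20], y[:20], X[20:], batch_size=10)
    print("pool indices to label next:", sorted(picked.tolist()))
```

In a full active learning loop, the selected points would be labeled (for MOFs, by running the corresponding calculation), moved from the pool to the labeled set, and the tree refitted before the next query.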

Date
Wednesday, September 18, 2024 – 14:00
Location
Amphi Jean Besson
SIMaP, 1130 rue de la Piscine, 38400 Saint Martin d'Hères