Project deliverable

D5.2 – Report on machine learning results delivered for water systems

  • 22 Dec 2023

The TREX EU Centre of Excellence investigates implementations of Quantum Monte Carlo (QMC) calculations optimized for exascale high-performance computing. These calculations are high-accuracy quantum-chemical and materials simulations that are inherently parallelizable and computationally demanding. Thus, they are uniquely positioned to utilize and explore the upcoming exascale supercomputer architectures. TREX focuses on the development and promotion of an open-source, high-performance software platform of inter-operable flagship codes and exascale-ready libraries.

This scope includes, in work package 5, applications of these QMC methods to atomistic systems that are directly relevant for technological progress and society. One of these systems is water, the “liquid of life.” In addition, in work package 4, TREX investigates Machine Learning Potentials (MLPs) to greatly accelerate QMC dynamics simulations. MLPs make it possible to run more and longer simulations with larger unit cells, a task that will remain computationally infeasible using only QMC calculations for the foreseeable future.

This Periodic Activity Report D5.2 centers on results obtained via MLPs for water. Because the study of water (H₂O) is strongly linked to the physics and chemistry of hydrogen (H), results for hydrogen under pressure are included as well. This report is related to deliverables D4.4, D5.3, and D5.4.

Michele Casula’s group (CNRS) investigated hydrogen’s role in the hydrogen bonds of water by exploring the electronic properties that affect bond dynamics. Their study of protonated water hexamers using QMC methods revealed temperature-dependent proton behaviour. To extend these findings, they developed an MLP for water clusters. Ongoing work focuses on improving agreement, on how QMC noise affects MLP quality, and on the impact of long-range interactions in charged systems.

They also studied the phase diagrams of hydrogen (H) and hydrogen-rich materials, motivated by H’s relationship with water and by the high-temperature superconductivity found in H-rich materials. These phase diagrams are very rich, with many competing phases. Resolving them is highly challenging and requires coupling QMC calculations for the electrons with path-integral molecular dynamics or path-integral Monte Carlo for the quantum nuclei. Lower levels of theory cannot predict these phase diagrams. One of the most accurate phase diagrams of H was calculated with the TREX code TurboRVB within work package 5.

The usual strategy of training an MLP directly on QMC reference data fails because generating enough QMC training data is computationally too expensive. Instead, the MLP is trained on QMC corrections to a computationally cheaper physical baseline method, such as Density Functional Theory (DFT). This “∆-learning” approach requires less QMC training data. The group of Sandro Sorella (SISSA) developed a ∆-learning MLP, enabling them to train an accurate model using only 684 QMC calculations. They used this model to study the liquid-liquid phase transition of high-pressure hydrogen.
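The ∆-learning idea can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration: the one-dimensional functions `baseline_energy` and `reference_energy` are hypothetical stand-ins for a DFT-like baseline and a QMC-like reference, and the small kernel ridge regression is a generic regressor, not the model used in the work described here. The sketch only shows why fitting the small, smooth difference tends to need fewer reference points than fitting the full energy.

```python
import numpy as np

def baseline_energy(x):
    # Hypothetical cheap baseline (stand-in for DFT): a Morse-like curve.
    return (1.0 - np.exp(-1.5 * (x - 1.0))) ** 2

def reference_energy(x):
    # Hypothetical expensive reference (stand-in for QMC):
    # the baseline plus a small, smooth correction.
    return baseline_energy(x) + 0.05 * np.sin(2.0 * x) + 0.02 * x

def gaussian_kernel(a, b, length_scale):
    # Pairwise Gaussian kernel between two 1D arrays of points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def fit_krr(x_train, y_train, length_scale=0.5, reg=1e-8):
    # Kernel ridge regression; returns a callable predictor.
    K = gaussian_kernel(x_train, x_train, length_scale)
    alpha = np.linalg.solve(K + reg * np.eye(len(x_train)), y_train)
    return lambda x: gaussian_kernel(x, x_train, length_scale) @ alpha

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Only a few "expensive" reference evaluations are available.
x_train = np.linspace(0.6, 3.0, 6)
x_test = np.linspace(0.6, 3.0, 200)

# Delta-learning: fit only the difference reference - baseline,
# then add the cheap baseline back at prediction time.
delta_model = fit_krr(x_train, reference_energy(x_train) - baseline_energy(x_train))
delta_pred = baseline_energy(x_test) + delta_model(x_test)

# Direct learning: fit the full reference energy from the same points.
direct_model = fit_krr(x_train, reference_energy(x_train))
direct_pred = direct_model(x_test)

delta_rmse = rmse(delta_pred, reference_energy(x_test))
direct_rmse = rmse(direct_pred, reference_energy(x_test))
print(f"delta-learning RMSE:  {delta_rmse:.2e}")
print(f"direct-learning RMSE: {direct_rmse:.2e}")
```

In this toy setting both models see the same six reference points. The ∆-learning prediction still requires one baseline evaluation per new configuration, which is why the cost of the baseline method matters in practice.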

To enable further studies, the groups of Matthias Rupp (UKON, LIST) and Michele Casula (CNRS) are collaborating to determine whether so-called “ultra-fast potentials” trained on DFT reference data can be used as the baseline potential for the ∆-learning approach. This would enable a computational speed-up of several orders of magnitude, paving the way to a more extended and comprehensive study of the phase diagrams of H and H-rich materials. Further efforts were made towards improved workflows for training set generation.

Overall, nine scientific studies were published that acknowledge TREX funding, in journals including Nature Physics and Nature Communications.