Unsupervised Machine Learning Tool Could Speed Catalyst Discovery

Combining machine learning with computationally derived descriptors has allowed scientists to find new examples of a special class of catalysts using just a few experimental data points. The team, led by Franziska Schönebeck at RWTH Aachen University in Germany, developed a workflow that identified 21 phosphine ligands capable of forming dinuclear palladium(I) complexes that have a particular geometry and are more stable in air than most common palladium(0) and palladium(II) species.1

“These dimers are very promising catalysts with distinct reactivity compared to most of the commonly used palladium-based catalysts,” comments Tobias Gensch from TU Berlin, Germany, who was not involved in the study. “However, their chemistry is not yet well understood, as their synthesis was unpredictable and the influence of the ligand on dimer stability was unknown.” The new approach allowed the researchers to predict ligands that stabilize palladium(I) dimers and to synthesize several new examples of these complexes, he says.

Finding efficient catalysts is key to many innovations in chemistry, but different species have different activities and selectivities, so finding the right compounds is difficult. “Accurately predicting the speciation of a catalyst would, in principle, require precise knowledge of all the species that can form under given conditions and their relative energies – a daunting task!” points out Marc-Etienne Moret, an organometallic chemist at Utrecht University in the Netherlands.

That’s why chemists typically rely on trial and error, testing ligands that they think might work. “Scientists have also developed maps to classify ligands based on their properties; this can help them visually identify promising candidates,” adds Moret. But in some cases, these methods do not help. The new results show that machine learning can successfully predict ligands where neither intuition nor visual inspection would succeed, he says. “This could accelerate the development of new catalysts by identifying promising targets before manufacturing and testing them extensively in the laboratory.”

The scientists first used an algorithm to group 348 ligands based on their general properties, and then performed a second round of grouping using problem-specific data obtained from density functional theory calculations. This strategy allowed them to split a large data set into smaller subsets of greater similarity that are tailored to the problem at hand. The team then experimentally verified some of the predicted ligands, including one that had never been synthesized before, and used them to make new palladium(I) dimers.
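The two-step grouping described above can be illustrated with a minimal sketch. It is not the team's actual code: the article does not name the clustering algorithm, so plain k-means is used here as a stand-in, and the ligand names, descriptor values, cluster counts and the helper functions `kmeans`, `keep_clusters_with` and `two_step_candidates` are all hypothetical.

```python
import math


def kmeans(points, k, iters=50):
    """Plain k-means (a stand-in algorithm); returns one cluster label per point."""
    # Deterministic farthest-point initialisation keeps the sketch reproducible.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center, then recompute the centers.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def keep_clusters_with(names, features, known, k):
    """Cluster `names` by `features`; keep only clusters containing a known ligand."""
    labels = kmeans([features[n] for n in names], k)
    good = {lab for n, lab in zip(names, labels) if n in known}
    return [n for n, lab in zip(names, labels) if lab in good]


def two_step_candidates(general, specific, known, k1, k2):
    """Step 1 uses general descriptors; step 2 refines with problem-specific ones."""
    subset = keep_clusters_with(list(general), general, known, k1)
    refined = keep_clusters_with(subset, specific, known, k2)
    return [n for n in refined if n not in known]


# Made-up descriptor values for eight placeholder ligands (not real data).
general = {
    "L1": (0.0, 0.1), "L2": (0.1, 0.0), "L3": (0.2, 0.2), "L4": (0.1, 0.2),
    "L5": (5.0, 5.1), "L6": (5.2, 5.0), "L7": (4.9, 5.2), "L8": (5.1, 4.9),
}
# Hypothetical problem-specific descriptor, e.g. one value from a DFT calculation.
specific = {
    "L1": (1.0,), "L2": (1.1,), "L3": (3.0,), "L4": (0.9,),
    "L5": (2.0,), "L6": (2.1,), "L7": (0.5,), "L8": (1.5,),
}
known = {"L1"}  # placeholder for a ligand already confirmed by experiment

candidates = two_step_candidates(general, specific, known, k1=2, k2=2)
print(candidates)  # ['L2', 'L4']
```

In this toy example a single known ligand is enough to pick out its neighbours, which mirrors the idea of narrowing a large ligand set with very few experimental data points. Real descriptors would need to be scaled before clustering, and the number of clusters would have to be chosen with care.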

Gensch notes that the system could identify new ligands using just five experimental data points. “Other machine learning approaches such as regression modeling require a lot more input data,” he says. “The ability to work with so little data results from the combined use of a highly informative database of general-purpose ligands and problem-specific descriptors, coupled with a simple yet powerful two-step clustering approach.”

“The accuracy of the algorithm’s predictions is remarkable,” says Moret. “It suggested ligands that probably would never have been tested otherwise. This methodology could potentially help solve many related problems for which empirical or computational data exist but do not yet form an intuitively understandable picture.”
