New machine-learning application to help researchers predict chemical properties
One of the shared, fundamental goals of most chemistry researchers is the need to predict a molecule’s properties, such as its boiling or melting point. Once researchers can pinpoint that prediction, they’re able to move forward with their work yielding discoveries that lead to medicines, materials, and more. Historically, however, the traditional methods of unveiling these predictions are associated with a significant cost — expending time and wear and tear on equipment, in addition to funds.
Enter a branch of artificial intelligence known as machine learning (ML). ML has lessened the burden of molecule property prediction to a degree, but the advanced tools that most effectively expedite the process — by learning from existing data to make rapid predictions for new molecules — require the user to have a significant level of programming expertise. This creates an accessibility barrier for many chemists, who may not have the significant computational proficiency required to navigate the prediction pipeline.
To alleviate this challenge, researchers in the McGuire Research Group at MIT have created ChemXploreML, a user-friendly desktop app that helps chemists make these critical predictions without requiring advanced programming skills. Freely available, easy to download, and functional on mainstream platforms, this app is also built to operate entirely offline, which helps keep research data proprietary. The exciting new technology is outlined in an article published recently in the Journal of Chemical Information and Modeling.
One specific hurdle in chemical machine learning is translating molecular structures into a numerical language that computers can understand. ChemXploreML automates this complex process with powerful, built-in "molecular embedders" that transform chemical structures into informative numerical vectors. Next, the software implements state-of-the-art algorithms to identify patterns and accurately predict molecular properties like boiling and melting points, all through an intuitive, interactive graphical interface.
"The goal of ChemXploreML is to democratize the use of machine learning in the chemical sciences,” says Aravindh Nivas Marimuthu, a postdoc in the McGuire Group and lead author of the article. “By creating an intuitive, powerful, and offline-capable desktop application, we are putting state-of-the-art predictive modeling directly into the hands of chemists, regardless of their programming background. This work not only accelerates the search for new drugs and materials by making the screening process faster and cheaper, but its flexible design also opens doors for future innovations.”
ChemXploreML is designed to to evolve over time, so as future techniques and algorithms are developed, they can be seamlessly integrated into the app, ensuring that researchers are always able to access and implement the most up-to-date methods. The application was tested on five key molecular properties of organic compounds — melting point, boiling point, vapor pressure, critical temperature, and critical pressure — and achieved high accuracy scores of up to 93 percent for the critical temperature. The researchers also demonstrated that a new, more compact method of representing molecules (VICGAE) was nearly as accurate as standard methods, such as Mol2Vec, but was up to 10 times faster.
“We envision a future where any researcher can easily customize and apply machine learning to solve unique challenges, from developing sustainable materials to exploring the complex chemistry of interstellar space,” says Marimuthu. Joining him on the paper is senior author and Class of 1943 Career Development Assistant Professor of Chemistry Brett McGuire.
Latest MIT Latest News
- InvenTeams turns students into inventorsThe Lemelson-MIT program challenges student teams across the country to solve problems in their communities — and present their inventions at MIT.
- 3 Questions: Applying lessons in data, economics, and policy design to the real worldGevorg Minasyan MAP ’23 draws on experiences in MIT’s Data, Economics, and Design of Policy MicroMasters and master’s programs to shape policymaking at the Central Bank of Armenia.
- Robot, know thyself: New vision-based system teaches machines to understand their bodiesNeural Jacobian Fields, developed by MIT CSAIL researchers, can learn to control any robot from a single camera, without any other sensors.
- Pedestrians now walk faster and linger less, researchers findA computer vision study compares changes in pedestrian behavior since 1980, providing information for urban designers about creating public spaces.
- Scientists apply optical pooled CRISPR screening to identify potential new Ebola drug targetsCombining powerful imaging, perturbational screening, and machine learning, researchers uncover new human host factors that alter Ebola’s ability to infect.
- Astronomers discover star-shredding black holes hiding in dusty galaxiesUnlike active galaxies that constantly pull in surrounding material, these black holes lie dormant, waking briefly to feast on a passing star.