AI that can "smell" is here, and its workload is comparable to that of an odor evaluator working continuously for 70 years!

Content at a glance: Smells are all around us, yet they are hard to describe accurately. Recently, Osmo, a company spun out of Google Research, developed an odor-analysis AI based on graph neural networks that predicts how a molecule smells from its chemical structure. Using this AI, the researchers drew a principal odor map that links chemical structure to odor, which is expected to offer a new approach to research on perception.

A fundamental problem in neuroscience is how the physical properties of external stimuli map onto sensory perception.

In vision, color maps to wavelength. In hearing, pitch maps to frequency. In smell, however, the mapping between an odor and the substance that produces it is hard to establish.

At present, we can only pick out some basic smells, arrange them on an odor wheel, and then describe more complex smells as combinations of these basic ones.


Figure 1: Schematic diagram of the odor wheel

However, such a rough classification is hard to use in scientific research. Odor sensors are already used to monitor odors, but each sensor can identify only specific odors. Odor identification therefore still often depends on human odor evaluators, a process that is time-consuming and poorly repeatable.

Recently, Osmo, a company spun out of Google Research, developed an odor-analysis AI based on graph neural networks (GNN). It can describe the odor of a chemical molecule from its structure. The model described odors better than the median human evaluator for 53% of the molecules and 55% of the odor descriptors tested. Finally, the researchers used the model to draw a principal odor map (POM).

Related research has been published in Science. Link: https://www.science.org/doi/full/10.1126/science.ade4401

Experimental process:

1. The GNN model is stable across multiple architectures

Smell is essentially our perception of chemical molecules in the air, so a molecule's structure affects how it smells. In a GNN, each molecule is represented as a graph: atoms are the nodes and chemical bonds are the edges. The network combines information from neighboring atoms into a single representation of the whole molecule.

Once a molecular structure is fed into the model, the GNN learns how much weight each structural feature carries for a given odor. A prediction layer then judges the molecule's odor and outputs the corresponding odor descriptors.

Figure 2: Schematic diagram of the GNN model
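To make the graph idea concrete, here is a minimal sketch in plain Python. It is not the model from the paper: the example molecule (ethanol), the two-number atom features, and the unweighted "add your neighbors" update are all illustrative assumptions. A real GNN would use much richer atom and bond features, learned weights, and several rounds of updates.

```python
# Ethanol (CH3-CH2-OH) with hydrogens left implicit:
# atoms are the nodes, bonds are the edges.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# Hypothetical one-hot atom features: [is_carbon, is_oxygen].
FEATURES = {"C": [1.0, 0.0], "O": [0.0, 1.0]}


def build_neighbors(n_atoms, bonds):
    """Return an adjacency list for an undirected molecular graph."""
    neighbors = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def message_passing_step(node_feats, neighbors):
    """Each atom adds the feature vectors of its bonded neighbors to its own."""
    updated = []
    for i, feats in enumerate(node_feats):
        total = list(feats)
        for j in neighbors[i]:
            total = [t + x for t, x in zip(total, node_feats[j])]
        updated.append(total)
    return updated


def readout(node_feats):
    """Sum all atom vectors into one vector describing the whole molecule."""
    return [sum(column) for column in zip(*node_feats)]


neighbors = build_neighbors(len(atoms), bonds)
node_feats = [FEATURES[a] for a in atoms]
node_feats = message_passing_step(node_feats, neighbors)
molecule_vector = readout(node_feats)
print(molecule_vector)  # [5.0, 2.0]
```

In a trained model, a vector like `molecule_vector` would be passed to the prediction layer, which scores each odor descriptor.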

The researchers combined the Good Scents and Leffingwell & Associates databases (together, the GS-LF database) and selected 5,000 molecules as the model's dataset. Each molecule can carry several odor descriptors, such as cheesy or fruity.

Figure 3: Some molecules in the GS-LF database

Subsequently, the GS-LF database was divided into training and testing sets in a ratio of 8:2, and the training set was further divided into five cross-validation subsets.
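The split described above can be sketched as follows. This is a plain random split over placeholder molecule IDs. The article does not say whether the real split was stratified by odor label, so the shuffling, the seed, and the round-robin fold assignment are assumptions made for illustration.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle the items and hold out a fraction of them as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def make_folds(items, k=5):
    """Deal items round-robin into k cross-validation folds of near-equal size."""
    return [items[i::k] for i in range(k)]


molecule_ids = list(range(5000))  # placeholder IDs, not real GS-LF entries
train, test = split_train_test(molecule_ids)
folds = make_folds(train)
print(len(train), len(test), [len(f) for f in folds])
# 4000 1000 [800, 800, 800, 800, 800]
```

During cross-validation, each fold takes one turn as the validation set while the other four are used for training.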

Bayesian optimization, scored by cross-validation, was used to tune the GNN model's hyperparameters. After tuning, the model performed stably across multiple architectures, and its highest area under the receiver operating characteristic curve (AUROC) on the cross-validation set was 0.89.
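For readers unfamiliar with AUROC: for a single odor descriptor, it is the probability that the model gives a higher score to a randomly chosen molecule that has the descriptor than to a randomly chosen molecule that does not. A value of 0.5 is chance and 1.0 is perfect ranking. The sketch below computes it directly from that definition. The labels and scores are invented, and how the study averaged AUROC across descriptors is not covered here.

```python
def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = molecule is labeled "fruity", 0 = it is not.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(round(auroc(labels, scores), 3))  # 0.833
```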

2. GNN models outperform humans in odor prediction

To test how well the model handles molecules beyond those it was trained on, the researchers had both the GNN model and a human panel describe the odors of the same molecules.

Figure 4: Judgment of the odor of 2,3-dihydrobenzofuran-5-carboxaldehyde by different models

A: GNN model;

B: RF model;

C: Human group;

D: Evaluation of the odor of 2,3-dihydrobenzofuran-5-carboxaldehyde by different evaluators.

For 53% of the molecules, the GNN model's odor predictions were better than those of the median human panelist. The previous state-of-the-art approach, a random forest (RF) model trained on count-based fingerprints (cFP), beat the median panelist for only 41% of the molecules.

Figure 5: Correlation of predictions from different models with the average of the human group
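One way to picture the "better than the median panelist" comparison is sketched below. For a single molecule, each rater's descriptor ratings are correlated with the panel average, and the model counts as better if its correlation exceeds the median panelist's. The ratings are invented, and comparing each panelist against the mean of the *other* panelists is an assumption of this sketch, not a detail stated in the article.

```python
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant rating profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def column_mean(rows):
    """Average several rating profiles descriptor by descriptor."""
    return [mean(column) for column in zip(*rows)]


def model_beats_median_panelist(model_profile, panel_profiles):
    """True if the model tracks the panel mean better than the median panelist does."""
    model_r = pearson(model_profile, column_mean(panel_profiles))
    panelist_rs = []
    for i, profile in enumerate(panel_profiles):
        others = panel_profiles[:i] + panel_profiles[i + 1:]
        panelist_rs.append(pearson(profile, column_mean(others)))
    return model_r > median(panelist_rs)


# Invented ratings for one molecule on four descriptors.
panel = [
    [4, 1, 0, 2],
    [5, 0, 1, 2],
    [0, 4, 3, 1],  # a panelist who disagrees with the others
]
print(model_beats_median_panelist([3, 2, 1, 2], panel))  # True
```

Repeating this check for every test molecule and counting how often it returns `True` would give a percentage of the kind quoted above.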

The researchers then broke down the GNN model's predictions by odor descriptor. For every descriptor except musk, the model's predictions fell within the human panel's error distribution, and for 30 descriptors the model outperformed the median panelist.

Figure 6: Judgment results of GNN model, RF model and human group on different molecules

The GNN model's predictions depend on molecular structure. It is therefore most accurate for odors tied to a clear structural feature, such as the garlic smell of sulfur-containing molecules and the fishy smell of amine-containing molecules. Musk molecules, by contrast, fall into at least five structural classes (macrocyclic, polycyclic, nitro, steroidal, and linear), so the model's musk predictions were the worst.

The human group's performance was affected by familiarity: they were more consistent in their judgments of common food aromas such as nuts, garlic, and cheese, but had greater differences in their judgments of musk and hay.

How often a descriptor appears in the training set also affects the GNN model's predictions for that smell. When a descriptor appears often enough, the model can predict it accurately even when the underlying structures are complex, as with fragrant, floral, and sweet.

Figure 7: Effect of training data on the correlation between the GNN model prediction results and the human group average

For descriptors that appear less often, however, the GNN model's accuracy is polarized. It predicts fishy, mint, and camphor accurately, but does poorly on ozone, acetic acid, and fermented odors.

3. The GNN model draws the principal odor map

After verifying the performance of the GNN model, the researchers further used it in different olfactory tasks.

First, they tested how the model handles molecules with similar structures. Given a molecule whose smell is known, the model had to judge two others: one with a similar structure but a different smell, and one with a different structure but a similar smell. For these counterintuitive structure-odor relationships, the GNN model judged correctly 50% of the time, compared with only 19% for the RF model.

Figure 8: A group of triplets whose structures or smells are close to known molecules
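A triplet test of this kind can be sketched as a nearest-neighbor question: in the model's predicted odor space, which of the two candidates lies closer to the known molecule? The three-descriptor profiles below are invented, and the use of Euclidean distance is an assumption. The article does not say how the comparison was actually scored.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two predicted odor profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closer_candidate(anchor, candidate_a, candidate_b):
    """Return "a" or "b", whichever profile lies nearer to the anchor's."""
    if euclidean(anchor, candidate_a) <= euclidean(anchor, candidate_b):
        return "a"
    return "b"


# Hypothetical predicted profiles over [fruity, minty, sulfurous].
anchor = [0.9, 0.1, 0.0]                # molecule whose smell is known
structural_lookalike = [0.1, 0.8, 0.1]  # similar structure, different smell
odor_lookalike = [0.8, 0.2, 0.0]        # different structure, similar smell

print(closer_candidate(anchor, structural_lookalike, odor_lookalike))  # b
```

A model that relies only on structural similarity would tend to pick the structural lookalike, which is why these triplets are a hard test.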

With a stable structure-odor relationship in hand, the researchers set out to draw a large-scale odor map. They completed a principal odor map (POM) covering about 500,000 molecules. The odors of these molecules have never been characterized, and most of the molecules have not even been synthesized.

Their positions on the map can nevertheless be calculated directly by the GNN model, which is what makes a map of this scale possible. For a trained human evaluator, assessing the odor of all these molecules would take about 70 years of continuous work.

Figure 9: Principal odor map

In the figure, the coordinates of each molecular odor are determined by the GNN model, and the RGB value of its color corresponds to the coordinates of the first three dimensions in the predicted odor matrix.
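The coloring rule can be illustrated with a small sketch: take the first three dimensions of each molecule's coordinates, rescale each dimension to the 0 to 255 range, and read the result as red, green, and blue. The min-max rescaling and the sample coordinates are assumptions. The article only states that the RGB value corresponds to the first three dimensions.

```python
def to_rgb(coords):
    """Min-max scale the first three columns to 0-255; return one (r, g, b) per row."""
    columns = list(zip(*[row[:3] for row in coords]))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    colors = []
    for row in coords:
        channels = []
        for value, lo, hi in zip(row[:3], lows, highs):
            span = hi - lo
            scaled = 0.0 if span == 0 else (value - lo) / span
            channels.append(int(round(scaled * 255)))
        colors.append(tuple(channels))
    return colors


# Invented coordinates for three molecules; the fourth dimension is ignored.
coords = [
    [0.0, 1.0, -2.0, 7.0],
    [1.0, 3.0, 0.0, 8.0],
    [2.0, 2.0, 2.0, 9.0],
]
colors = to_rgb(coords)
print(colors)
```

With this rule, molecules with similar predicted odors get similar colors, so neighborhoods on the map show up as patches of one hue.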

4. The Proust Effect: The Linkage between Smell and Memory

A smell can bring back past memories and make them more vivid and emotional. In "Remembrance of Things Past", Marcel Proust describes how, when the narrator smells a madeleine soaked in tea, the past comes flooding back. This phenomenon is therefore known as the Proust effect.

Smell is more tightly linked to memory than any other sense. It is the only sensory system with a direct route to the brain regions that handle emotion and memory. When olfactory cells are activated, nerve impulses travel straight to the primary olfactory cortex (the piriform cortex), which is directly connected to the amygdala, responsible for fear and other emotions, and to the parahippocampal gyrus, which is involved in memory.

Figure 10: Components of the olfactory circuit

Labels in the figure: primary olfactory cortex, amygdala, hippocampus.

Because smell, memory, and emotion are so closely linked, perfume has become a staple for social occasions. The next time you meet, the other person may not remember your name, but the scent may well bring back the moment you first met.

With the help of AI, we now understand the connection between molecular structure and smell more deeply. Perhaps one day we will be able to blend the smells we know best, so that opening a bottle becomes a time machine back to the past.

Reference Links:

[1]https://perfumersupplyhouse.com/2014/01/09/fragrance-creation-wheels-for-you/

[2]https://www.slideserve.com/cora-schroeder/functional-neuroanatomy

Author | Xuecai

Editor | Sanyang

This article was first published on the HyperAI WeChat public platform.
