With a dataset of 3 million images covering 15,000 zebrafish embryos, systems biologist Patrick Müller successfully implemented AI-based embryo recognition.

Author | Jia Ling
Editor | Sanyang

During animal development, embryos undergo complex morphological changes over time. Researchers want to quantify developmental time and speed objectively, and to have standardized methods for analyzing the stages of early embryos, so that they can better understand evolutionary and developmental processes.

Until now, knowledge of embryonic stages and morphological transitions has come from microscopic observation. However, transitions between developmental stages are neither idealized nor stable. So many factors influence them that it is difficult for researchers to pin down a specific developmental state, and inferring developmental time and stage from embryo morphology remains subjective.

To relate developmental time and developmental speed objectively, systems biologist Patrick Müller led researchers from the University of Konstanz in developing a deep learning method based on a twin (Siamese) network. By comparing images, it automatically captures the course of embryonic development and identifies characteristic developmental stages without human intervention.

The results have been published in "Nature Methods".

Get the paper: https://www.nature.com/articles/s41592-023-02083-8

01 Experimental process

Dataset: Integrating a large number of embryo images

Using a high-throughput imaging pipeline and ResNet101-based image segmentation, the researchers built a dataset of 3 million images covering 15,000 zebrafish embryos and generated developmental trajectories of individual embryos. Each embryo was tracked individually and marked with a bounding box of a different color when input to the model.
A separate JSON file was created for each experiment, recording which embryos belonged to each category.

Figure: Image processing diagram

Model architecture: Siamese network model

The twin network consists of two parallel neural networks with the same structure and shared weights. It takes two images as input at the same time and compares them by computing a similarity from their feature embeddings. The structure of the twin network is shown below:

Figure: Twin network structure

The neural network that makes up each branch of the twin network is structured as follows:

Figure: ResNet50-based neural network

Backbone network: A ResNet50 architecture with weights pre-trained on the ImageNet dataset is used as the backbone network.
Embedding model head: The output of the backbone network is flattened and passed to the embedding model head, which consists of three dense layers with a batch normalization layer between each pair of layers and produces an output embedding of size (1, 256).
Transfer learning: All layers of the ResNet50 backbone network are frozen except convolutional block 5 and the model head layers.

During training, a distance layer combines the embeddings produced for the different inputs and computes the Euclidean distance between them.

Algorithm training: Triplet loss training

The training process is as follows:

Constructing image triplets: An image triplet consists of three embryo images: the anchor image, which shows an embryo at a random developmental stage t1; the positive image, which is either an image of an embryo at a stage similar to t1 (used as input to neural network 1) or the anchor image after image augmentation (used as input to neural network 2); and the negative image, which shows an embryo at a developmental stage t2 ≠ t1.
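The Euclidean distance layer and the anchor/positive/negative triplets described above are the ingredients of a margin-based triplet loss. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the function names, the margin value of 1.0, the use of unsquared distances, and the random 256-dimensional vectors standing in for embeddings are all assumptions made for illustration.

```python
import numpy as np


def euclidean_distance(a, b):
    """Row-wise Euclidean distance between two batches of embeddings."""
    return np.linalg.norm(a - b, axis=-1)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss (the margin value is an assumption).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`.
    """
    d_ap = euclidean_distance(anchor, positive)
    d_an = euclidean_distance(anchor, negative)
    return np.maximum(d_ap - d_an + margin, 0.0)


# Hypothetical stand-ins for 256-dimensional embeddings of four triplets.
rng = np.random.default_rng(0)
anchor = rng.normal(size=(4, 256))
positive = anchor + 0.01 * rng.normal(size=(4, 256))  # similar stage
negative = rng.normal(size=(4, 256))                  # different stage

loss_good = triplet_loss(anchor, positive, negative)
loss_bad = triplet_loss(anchor, negative, positive)   # roles swapped
```

With well-separated embeddings the loss vanishes, while swapping the positive and negative roles yields a large loss, which is the signal that drives training.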
Figure: Image triplet diagram

Triplet loss training: The constructed image triplets are passed to the Siamese network and the triplet loss is calculated. Training pulls the embeddings of the anchor and positive images together (maximizing their similarity) and pushes the embeddings of the anchor and negative images apart (minimizing their similarity).

Figure: Triplet loss calculation formula. A represents the anchor image, P represents the positive image, and N represents the negative image.

Iterative training: Neural network 1 was trained for 10 epochs on 300,000 zebrafish embryo image triplets; neural network 2 was trained for 2 epochs on 1 million image triplets with augmented anchor images. Training was GPU-accelerated on an NVIDIA GeForce RTX 3070 (ASUS).

Task-based training: Task-specific training was carried out for image similarity, embryonic staging, developmental speed and temperature, and drug-induced changes in embryonic development.

02 Experimental Results

Result 1: Automatic embryo staging using similarity graphs

The test image is compared with a set of reference embryo images. The cosine similarity between them is calculated, and the resulting similarity scores are used to classify the embryo image.

Figure: Similarity graph of test embryos and reference images

Comparing the test image with a time series of images of developing embryos yields a curve of similarity over time, from which two main features are extracted:

The peak of the curve indicates the developmental stage of the embryo in the test image.
The non-peak regions of the curve carry additional information, such as the peak width and the similarity to distant embryonic stages, which reflect morphological similarities between different time points.

Figure: Schematic diagram of embryo age prediction

Applied to a time series of images of one embryo, the twin network predicts the developmental stage of each image and builds a trajectory from these predictions, achieving accurate embryo staging.
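The staging step described above, where the peak of a cosine-similarity curve gives the predicted stage, can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the published pipeline: `predict_stage`, the 50-minute reference spacing, and the synthetic embeddings that drift linearly over time are all hypothetical.

```python
import numpy as np


def cosine_similarity(query, references):
    """Cosine similarity between one embedding and each reference row."""
    query = query / np.linalg.norm(query)
    references = references / np.linalg.norm(references, axis=1, keepdims=True)
    return references @ query


def predict_stage(query, references, reference_times):
    """Return the reference time at the similarity peak and the full curve."""
    curve = cosine_similarity(query, references)
    return reference_times[int(np.argmax(curve))], curve


# Hypothetical reference series: embeddings drift smoothly between two
# random vectors as development proceeds (one image every 50 minutes).
rng = np.random.default_rng(1)
reference_times = np.arange(0, 1000, 50)
start, end = rng.normal(size=256), rng.normal(size=256)
weights = np.linspace(0.0, 1.0, len(reference_times))[:, None]
references = (1 - weights) * start + weights * end

# A test embryo that resembles the ninth reference image, plus small noise.
test_embedding = references[8] + 0.01 * rng.normal(size=256)
stage, curve = predict_stage(test_embedding, references, reference_times)
```

The full `curve` is returned alongside the peak because, as noted above, the width of the peak and the similarity to distant stages also carry information.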
Result 2: Exploring the functional relationship between development speed and temperature

Previously, quantifying the temperature dependence of embryonic development required manual or semi-automatic annotation of developmental timing, which severely limited the number of experiments that could be analyzed in a reasonable time.

The twin network was used to analyze temperature-dependent changes in developmental rate automatically. The experiments covered zebrafish embryos at temperatures between 23.5 °C and 35.5 °C and medaka embryos between 18 °C and 36 °C. For each temperature condition, 100 to 200 zebrafish embryos or 20 to 100 medaka embryos were analyzed. The results are shown in the figure:

Figure: Analysis of zebrafish and medaka embryonic development at different temperatures. a, d: Schematic diagram of age estimation for zebrafish and medaka; b, e: Development of zebrafish and medaka at different temperatures; c, f: Natural logarithm of the estimated growth rate of zebrafish and medaka at different temperatures.

Temperature had a significant effect on the developmental rate of both species. Embryos developed more slowly at lower temperatures, while higher temperatures markedly increased the developmental rate. A temperature change of 10 °C changed the developmental rate roughly twofold.

The temperature dependence of the developmental rates quantified with the twin network was fitted with the Arrhenius equation. Over the species-specific temperature ranges, the slopes of the linear fits gave apparent activation energies of 65 kJ/mol for zebrafish and 77 kJ/mol for medaka. These apparent activation energies are similar to those of other poikilotherms (e.g., frogs, fruit flies, or yeast) and differ significantly from those of homeotherms (e.g., mice or humans).
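To make the Arrhenius fit concrete: the equation rate = A·exp(−Ea/(R·T)) becomes a straight line when ln(rate) is plotted against 1/T, and the slope equals −Ea/R. The sketch below recovers an activation energy from synthetic rates; the temperatures, the prefactor, and the function names are assumptions for illustration and are not data from the paper.

```python
import numpy as np

R = 8.314  # gas constant in J/(mol*K)


def fit_activation_energy(temps_celsius, rates):
    """Fit ln(rate) = ln(A) - Ea / (R * T) and return Ea in kJ/mol."""
    inv_T = 1.0 / (np.asarray(temps_celsius, dtype=float) + 273.15)
    slope, _intercept = np.polyfit(inv_T, np.log(rates), 1)
    return -slope * R / 1000.0


def fold_change_per_10_degrees(ea_kj_per_mol, t_low_celsius):
    """Rate ratio for a 10 degree increase starting at t_low_celsius."""
    t1 = t_low_celsius + 273.15
    t2 = t1 + 10.0
    return np.exp(ea_kj_per_mol * 1000.0 / R * (1.0 / t1 - 1.0 / t2))


# Synthetic rates generated from a known activation energy (not real data).
temps = np.array([23.5, 26.0, 28.5, 31.0, 33.5])
true_ea = 65.0  # kJ/mol, chosen to match the zebrafish value quoted above
rates = np.exp(20.0 - true_ea * 1000.0 / (R * (temps + 273.15)))

estimated_ea = fit_activation_energy(temps, rates)
fold_change = fold_change_per_10_degrees(estimated_ea, 25.0)
```

For an activation energy in this range, `fold_change` comes out between two and three, in line with the roughly twofold change per 10 °C reported above.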
Contrary to the idealized expectation, the developmental rate of both species stops accelerating at higher temperatures and tends to level off. At lower temperatures, zebrafish development slows down linearly and the embryo stops developing below 23 °C, whereas medaka embryos develop nonlinearly and stagnate in the primitive sac stage for a long time.

Result 3: Quantifying natural variability during embryonic development

The study found that embryos are affected by genetic variation, external perturbations, and noise and randomness in gene expression, which lead to deviations in growth rate and developmental stage, yet development is still reliably completed.

Figure: Diagram of developmental differences between embryos

The twin network was used to evaluate differences in individual phenotypes among embryos of the same age. The results are shown in the figure:

Figure: Embryonic development diagram. The left panel shows the distribution (in percent) of predicted developmental stages after 0 min (green), 400 min (blue), and 800 min (purple); the right panel shows that the average similarity value of the embryos decreased over time.

In early development, the predicted stages have a narrow distribution; with the onset of the segmentation period, the distribution becomes wider. This indicates that differences between individuals gradually increase during embryonic development, and the average similarity value correspondingly decreases over time.

Among the more than 3 million zebrafish embryo images, about 1% of embryos developed abnormally, often because of spontaneous collapse or dorsal-ventral polarity defects. Using the twin network, the researchers were able to detect abnormally developing embryos at an early stage.
These abnormal embryos showed low average similarity values that fell outside the range predicted for normal development.

Figure: Illustration of abnormal embryo development

Result 4: Identification of drug-treated embryo phenotypes

Embryonic development is coordinated by a variety of signaling molecules, and modulating their activity can change the embryonic phenotype. Seven major signaling pathways act during zebrafish development. Of these, the bone morphogenetic protein (BMP), retinoic acid (RA), Wnt, fibroblast growth factor (FGF) and Nodal pathways mainly regulate the orientation of the germ layers and the formation of the anterior-posterior and dorsal-ventral axes, while the Sonic Hedgehog (Shh) and planar cell polarity (PCP) pathways control the extension and morphogenesis of the body axis.

The researchers tested how well the twin network detects abnormal embryos. The results are shown in the figure below:

Figure: Phenotypic comparison between untreated embryos and drug-treated embryos. a: Untreated embryos were used as a reference for the phenotype of drug-treated embryos; b-i: Changes in similarity between embryos treated with different drugs and untreated embryos; j: Dependence of abnormality detection accuracy on the number of embryos.

Untreated embryos were compared with embryos treated with BMP, Nodal, FGF, Shh, PCP, and Wnt inhibitors and with RA-exposed embryos. Similarity values among untreated embryos were high, whereas similarity values between embryos treated with small-molecule drugs and untreated embryos were generally low.

A statistical analysis at each time point determines when the embryo population deviates significantly from the reference population, which makes it possible to detect embryo populations with phenotypic defects. The accuracy of the detection depends on the number of embryos analyzed and on the type of perturbation.
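One way to picture the per-time-point comparison against a reference population is a permutation test on the similarity values. The article does not say which statistical test the authors used, so the test choice, the significance level, the function names, and the synthetic similarity values below are all assumptions; treat this as a sketch of the general idea only.

```python
import numpy as np


def permutation_pvalue(reference, test, n_permutations=2000, seed=0):
    """One-sided p-value that the test mean is lower than the reference mean."""
    rng = np.random.default_rng(seed)
    observed = reference.mean() - test.mean()
    pooled = np.concatenate([reference, test])
    n_ref = len(reference)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign embryos to the two groups at random
        diff = pooled[:n_ref].mean() - pooled[n_ref:].mean()
        if diff >= observed:
            count += 1
    return (count + 1) / (n_permutations + 1)


def first_deviating_timepoint(reference, test, times, alpha=0.01):
    """Arrays have shape (n_embryos, n_timepoints); returns a time or None."""
    for i, t in enumerate(times):
        if permutation_pvalue(reference[:, i], test[:, i]) < alpha:
            return t
    return None


# Synthetic similarity values (not real data): 30 embryos, 5 time points.
times = [0, 100, 200, 300, 400]  # minutes
spread = np.linspace(-0.02, 0.02, 30)
reference = np.tile((0.9 + spread)[:, None], (1, len(times)))
treated = reference[::-1].copy()   # same values, different embryo order
treated[:, 3:] -= 0.3              # similarity drops from 300 min onward

deviation_time = first_deviating_timepoint(reference, treated, times)
```

Because the smallest p-value a permutation test can return is 1/(n_permutations + 1), the number of permutations and the number of embryos both limit how confidently a deviation can be called, which mirrors the dependence on embryo number noted above.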
The study also explored how accurately the method identifies phenotypes of different penetrance and severity. The known range of zebrafish embryo phenotypes caused by different levels of BMP pathway inhibition is shown in the figure:

Figure: Phenotypic changes of zebrafish embryos under different levels of BMP pathway inhibition

The twin network detects developmental deviations accurately. For highly penetrant or obvious phenotypes caused by high doses of small-molecule BMP pathway inhibitors, only a small number of embryos are needed for accurate detection, while mild phenotypes require about 30 embryos.

These analyses demonstrate that the Siamese network, trained only on images of normally developing embryos, is able to detect embryonic phenotypic changes in an unbiased manner.

Result 5: Automatic derivation of embryonic developmental stages

Reference embryo images are usually available for assessing the developmental timing of test embryos, but for newly discovered or uncharacterized species such reference images may not exist.

The researchers propose using the twin network to determine the developmental stage by calculating the similarity between a test image and images of the same embryo at earlier time points. The results of this similarity analysis on zebrafish embryos are shown in the figure:

Figure: Derivation of embryonic developmental stages. a: Calculating the similarity between the test embryo and images of the same embryo from previously acquired time points; b: Representative similarity matrix.

The similarity values were distributed in a characteristic way at different developmental stages. The researchers observed a common pattern: high similarity values clustered locally, while similarity values at more distant time points were low and plateaued.
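The reference-free idea above amounts to building a self-similarity matrix from the embeddings of one embryo's own time series. The sketch below shows how such a matrix exhibits locally clustered high values, and how a crude boundary detector could read stage transitions off it. The threshold-based `stage_boundaries` helper is a hypothetical simplification and is not described in the source; the embeddings are synthetic.

```python
import numpy as np


def self_similarity_matrix(embeddings):
    """Cosine similarity between every pair of time points of one embryo."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T


def stage_boundaries(similarity, threshold=0.5):
    """Indices where similarity to the previous time point drops below threshold.

    Hypothetical helper: a simple stand-in for reading transitions off the matrix.
    """
    consecutive = np.diagonal(similarity, offset=1)
    return np.flatnonzero(consecutive < threshold) + 1


# Synthetic series: three morphologically stable stages, five images each.
rng = np.random.default_rng(2)
prototypes = np.eye(16)[:3]                  # one prototype vector per stage
embeddings = np.repeat(prototypes, 5, axis=0)
embeddings = embeddings + 0.05 * rng.normal(size=embeddings.shape)

similarity = self_similarity_matrix(embeddings)
boundaries = stage_boundaries(similarity)
```

In this toy series the matrix has three bright blocks along the diagonal, and the detected boundaries fall where one block ends and the next begins.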
Interestingly, the local and global similarity statistics between pairs of images, as assessed by the Siamese network, are consistent with the order of key stages during development. Embryos that fall within a plateau have stable morphology, which highlights major periods of development such as the classic cleavage, blastula, embryonic disc, organogenesis, and segmentation stages. In contrast, embryos that fall on the border between plateaus are in transient periods of major morphological change.

Next, the researchers extended the method to other species, including medaka and three-spined stickleback. The twin network generated rich maps for these morphologically diverse embryo sequences.

Figure: Automatic detection of developmental stages and transitions in medaka and three-spined stickleback embryos

In further work, they applied the method to the more distantly related nematode Caenorhabditis elegans. The researchers used open data from independent sources, such as published papers and YouTube videos, to train and evaluate the network, and automatically identified the first division cycles of C. elegans that form the first four embryonic cells.

These results demonstrate that the Twin Network approach can automatically generate developmental atlases for different species, different biological systems, and a wide range of image datasets, without models previously trained specifically for this purpose.

03 Twin Network vs. Digital Twin Network

In the 5G era, digital twin networks are mentioned frequently. At the same time, a similarly named technology, the twin network, has emerged in the field of image recognition. These are two completely different concepts, although they have shown synergy in some fields.
Twin Network: A deep learning architecture used mainly for image retrieval, image matching, image classification and related tasks. It learns embedded representations of images in order to compare and analyze image similarity.

Digital Twin Network: A virtual model of a physical entity that interacts with that entity through real-time data updates and simulation, and can simulate the behavior and performance of the physical entity under different conditions. It is used mainly in industrial manufacturing, the Internet of Things, urban planning, aerospace and other fields.

As an AI algorithm, the Twin Network can make digital twin networks more capable and efficient. For example, in the digital twin of industrial equipment, a twin network can compare images of the equipment taken at different times to track changes in its condition; in digital twin city planning, a twin network can process image data captured by surveillance cameras to monitor and simulate traffic flow and road conditions in real time.

In summary, by combining image data with deep learning, the Twin Network provides image-related support for the Digital Twin Network and improves the information acquisition, monitoring and decision-making capabilities of digital twins. Beyond the Twin Network, other AI tools will also further empower digital twins.