The world is too complicated! How do we find the simple rules behind it?

1. Complex world, simple rules

At sunset, flocks of birds dance in the sky, now dispersing, now gathering, constantly changing their formation without ever colliding, and sweeping around obstacles without losing one another. How do flocks of birds in the air, and schools of fish in the water, coordinate such movements?

In 2021, a cargo ship ran aground in the Suez Canal and set off a "domino effect" across the global economy, holding up as much as $9 billion of trade a day. Why can a single cargo ship choke the global supply chain? Similarly, why can one online rumor trigger a large-scale storm of public opinion on the Internet? Can a butterfly flapping its wings really stir up a storm thousands of miles away?

These questions may seem unrelated, but on closer inspection the phenomena share one feature: they all occur in complex systems made up of a large number of interacting entities. The Italian physicist Giorgio Parisi shared the 2021 Nobel Prize in Physics for his pioneering contributions to the theory of complex systems. When he was young, Professor Parisi was himself fascinated by the sight of thousands of birds wheeling in the sky above the train station in Rome, and he would often stand there for a long time observing and photographing the flocks. Working from observational data on the flocks, Professor Parisi used the methods of statistical physics to uncover the mystery of flock flight [1]. It turns out that the wonders of a flock in flight can be reproduced if each bird follows just three basic principles:

(1) Move toward the neighbors in its field of vision, so as to travel together with its companions;

(2) Keep its flight direction consistent with that of the neighbors in its field of vision;

(3) When it gets too close to a neighbor, adjust its direction to avoid a collision.

The secret of flock flight therefore lies not in any individual bird but in the interactions between birds. The behavior of the flock is so complex, yet the rules behind it are so simple! This exposes a difficulty in studying systems like flocks: the reductionism we are used to relying on breaks down. [Reductionism is the philosophical idea that complex systems, things, and phenomena can be understood and described by breaking them down into a combination of their parts.]
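The three rules can be tried out in a few lines of code. The following is only a minimal sketch under stated assumptions: the "field of vision" is simplified to a circle of radius `radius`, and the function name `step`, the weights `w_cohesion`, `w_align`, `w_separate`, and all numerical values are illustrative choices, not parameters taken from [1].

```python
import math

def step(birds, radius=5.0, too_close=1.0,
         w_cohesion=0.01, w_align=0.05, w_separate=0.1, speed=1.0):
    """Advance the flock by one time step; each bird is a tuple (x, y, vx, vy)."""
    moved = []
    for i, (x, y, vx, vy) in enumerate(birds):
        # Assumption: the field of vision is a circle of the given radius.
        seen = [b for j, b in enumerate(birds)
                if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        ax = ay = 0.0
        if seen:
            n = len(seen)
            # Rule 1: steer toward the centre of the visible neighbours.
            ax += w_cohesion * (sum(b[0] for b in seen) / n - x)
            ay += w_cohesion * (sum(b[1] for b in seen) / n - y)
            # Rule 2: steer toward the neighbours' average velocity.
            ax += w_align * (sum(b[2] for b in seen) / n - vx)
            ay += w_align * (sum(b[3] for b in seen) / n - vy)
            # Rule 3: steer away from any neighbour that is too close.
            for bx, by, _, _ in seen:
                if math.hypot(bx - x, by - y) < too_close:
                    ax += w_separate * (x - bx)
                    ay += w_separate * (y - by)
        vx, vy = vx + ax, vy + ay
        norm = math.hypot(vx, vy) or 1.0       # rescale to a fixed speed
        vx, vy = speed * vx / norm, speed * vy / norm
        moved.append((x + vx, y + vy, vx, vy))
    return moved

# Two birds that start with perpendicular headings.
flock = [(0.0, 0.0, 1.0, 0.0), (2.0, 0.0, 0.0, 1.0)]
after = step(flock)
```

Each bird reads only the positions and velocities of nearby birds; nothing in the code describes the flock as a whole. Calling `step` repeatedly on a larger, randomly initialized flock is one way to watch collective motion emerge from these local rules.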

2. What is a complex system?

Although reductionism cannot explain the collective behavior of bird flocks, it works very well for airplanes. An airplane has countless parts and a dazzling range of functions, but once we understand the role of each part, we can fully understand how the airplane flies. We call systems like airplanes composite systems. For systems like bird flocks and brains, by contrast, even after studying every component (each bird, each neuron) we still cannot explain the wonders that emerge at the level of the whole system (the flight of the flock, the emergence of consciousness). Such systems are complex systems, as shown in Figure 1. The core problem of complex systems research is to uncover the simple and universal laws behind complex systems.

Figure 1 Composite system and complex system

Newton established a mechanical, deterministic kingdom of physics. A ball sliding down an inclined plane is firmly governed by Newtonian mechanics: in this deterministic kingdom, once the initial state of a system is given, everything unfolds according to fixed rules. In the winter of 1961, the meteorologist Edward Norton Lorenz built an ingenious mathematical model in the hope of predicting the weather, and unexpectedly discovered another world. A difference of only about 0.0001 in the values fed to the computer produced completely different results. As the saying goes, "a slight difference can lead to a great error." When he ran this highly nonlinear weather model on the computer, the resulting state trajectory looked like a butterfly with its wings spread, hence the familiar butterfly effect (as shown in Figure 2). This effect vividly illustrates how sensitive nonlinear systems are to their initial values, and it reflects an interesting phenomenon of complex systems: chaos.
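This sensitivity to initial values is easy to reproduce numerically. The sketch below does not use Lorenz's original weather model; it assumes the well-known three-variable Lorenz equations with the standard parameters (sigma = 10, rho = 28, beta = 8/3), integrated with a fourth-order Runge-Kutta step. The starting points, step size, and function names are illustrative choices.

```python
import math

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three-variable Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step of size dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def separations(start_a, start_b, steps=4000, dt=0.01):
    """Distance between two trajectories at every time step."""
    a, b = start_a, start_b
    out = [math.dist(a, b)]
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        out.append(math.dist(a, b))
    return out

# Two starting points that differ by 0.0001 in a single coordinate.
gaps = separations((1.0, 1.0, 1.0), (1.0001, 1.0, 1.0))
print(f"initial gap {gaps[0]:.4f}, largest gap {max(gaps):.1f}")
```

The two runs follow the same deterministic rule, yet the gap between them grows by many orders of magnitude before the simulation ends.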

Figure 2 The origin of the butterfly effect

In fact, most of the real systems we are familiar with are neither fully chaotic nor fully ordered but somewhere in between, in a state we call the edge between chaos and order. Complexity science was born on this edge. In 1984, with the support of Murray Gell-Mann, Philip Anderson, Kenneth Arrow and others, a group of scientists from physics, economics and computer science founded the Santa Fe Institute in a rented convent in Santa Fe, New Mexico. The institute has since become a world-renowned center for complexity science research. A large number of scholars, with the members of the Santa Fe Institute as their representatives, have tried to break free of the reductionist thinking that has prevailed since Newton and to understand phenomena of complex systems such as emergence and chaos.

3. How to explore simple rules in a complex world: network science

George Cowan, one of the founders of the Santa Fe Institute, once said that they were pioneering the science of the 21st century. Now that future is here! After more than 30 years of effort by scientists, complexity science has entered a new stage of development, in which complex networks are used to characterize and study complex systems, and network science has emerged. The core idea of network science is to model complex systems of all kinds as complex networks [2]. In the real world, everything from global ecosystems and global logistics systems to the protein interaction systems inside cells can be modeled as a complex network (as shown in Figure 3): nodes represent the components of the system, and edges represent the interactions between those components. By studying the network structure abstracted from a system and the dynamics that run on it, we can understand the laws of the complex system that the network represents.

Figure 3 Examples of various complex networks

[Protein interaction network from www.creative-proteomics.com]

Social networks are the networks most familiar from daily life. Everyone is a node in a social network, connected to others through online and offline relationships. To return to the original question: why can one online rumor trigger a large-scale storm of public opinion across the whole network, and where should we start in order to contain it? The key is to find the key figures in the spread of the rumor on social platforms, and to identify and cut off the important propagation paths. In short, this comes down to two key scientific questions: how to mine important nodes in a network [3], and how to mine important links in a network [4]. In network science, the study of these two questions is called network information mining (as shown in Figure 4).

Figure 4 Network information mining

3.1 How to mine important nodes in the network?

The first scientific question is how to mine, from known network information, the nodes that have a significant impact on network structure and function; in essence, it is a question of how to rank the nodes. A classic approach is to rank nodes by their core number, obtained by k-core decomposition [5], which describes how deep in the network a node sits. The process is like peeling an onion: the network is stripped away layer by layer, and the later a node is peeled off, the closer it is to the core of the network and the greater its influence. However, such methods are mostly suited to static, simple networks. The networks we face in real life are mostly large-scale, weighted, evolving, and directed. How can we compute core numbers and mine important nodes quickly and efficiently in such complex networks?
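The onion-peeling procedure can be sketched for a small undirected, unweighted network as follows. This is a straightforward illustration rather than an optimized implementation; the helper names `build_adj` and `core_numbers` and the example edges are made up for this sketch.

```python
def build_adj(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def core_numbers(adj):
    """Peel the network layer by layer and return each node's core number."""
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1                      # nothing left in this layer, go deeper
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue                # already peeled via another neighbour
            remaining.discard(v)
            core[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1      # removing v lowers its neighbours' degree
                    if degree[u] <= k:
                        peel.append(u)
    return core

# A triangle a-b-c with a two-node tail c-d-e.
adj = build_adj([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")])
print(core_numbers(adj))
```

In this example the tail nodes `d` and `e` are peeled first and get core number 1, while the triangle survives until the second layer and gets core number 2.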

Inspired by the H-index used to evaluate scientists, we defined a local H operator [6]. Applied to a finite sequence of real numbers, the operator returns y = H(x1, x2, ..., xn), where y is the largest integer such that at least y of the numbers x1, x2, ..., xn are not less than y (as shown in Figure 5). This is exactly the idea behind the H-index. Take the degree of a node as its zeroth-order H-index. Applying the H operator to the degrees of a node's neighbors gives the first-order H-index of that node; applying it again to the first-order H-indices of the neighbors gives the second-order H-index, and so on. Repeating this yields an H-index sequence for each node. Interestingly, it can be rigorously proved that this sequence converges to the node's core number.
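Read as "the largest y such that at least y of the values are not less than y", the H operator takes only a few lines. The function name `h_operator` is chosen for this sketch.

```python
def h_operator(values):
    """Largest integer y such that at least y of the values are >= y."""
    h = 0
    for rank, x in enumerate(sorted(values, reverse=True), start=1):
        if x >= rank:
            h = rank        # the top `rank` values are all >= rank
        else:
            break
    return h

print(h_operator([5, 3, 3, 1]))   # three values are >= 3, but not four >= 4
```

For a scientist, the input would be the citation counts of their papers; for a network node, it is the list of values carried by its neighbors.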

Figure 5 Schematic diagram of H operator definition

Through the H operator, we have thus linked three indicators that were long considered unrelated: degree, H-index and core number. We call this result the DHC theorem of networks [6] (as shown in Figure 6). The theorem also holds for evolving, weighted, and directed networks. Based on it, the core number of a node can be computed in a distributed manner from the local information of the nodes alone, so that important nodes in complex networks can be mined quickly and accurately.
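The iteration from degree through H-index to core number can be sketched for a small undirected, unweighted network. The code below is an illustration only: it updates all nodes synchronously and stops at the first fixed point, and the names `h_index_sequence` and `build_adj` are assumptions of this sketch rather than the implementation used in [6].

```python
def h_operator(values):
    """Largest integer y such that at least y of the values are >= y."""
    h = 0
    for rank, x in enumerate(sorted(values, reverse=True), start=1):
        if x >= rank:
            h = rank
        else:
            break
    return h

def build_adj(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def h_index_sequence(adj):
    """Return [degrees, 1st-order H-indices, ...] up to the fixed point."""
    current = {v: len(nb) for v, nb in adj.items()}    # order 0: degree
    history = [current]
    while True:
        # Each node only looks at the current values of its own neighbours.
        nxt = {v: h_operator(current[u] for u in adj[v]) for v in adj}
        if nxt == current:
            return history
        history.append(nxt)
        current = nxt

# A path a-b-c-d-e: every node has core number 1.
path = build_adj([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")])
for order, values in enumerate(h_index_sequence(path)):
    print(order, values)
```

Because every update uses only a node's immediate neighbors, each node could in principle run it independently, which is what makes a distributed calculation possible.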

Figure 6 DHC theorem

Applying the DHC theorem to identify key users in the Weibo network, we found that monitoring fewer than one in 40,000 Weibo users is enough to track more than 95% of major food-safety public opinion events. The method can also be applied in many other fields, such as national innovation analysis [7], identification of important brain regions [8], and analysis of the influence of city media [9].

3.2 How to mine hidden links in the network?

The second scientific question is how to estimate the likelihood of a connection between two unconnected nodes from the known network structure and any available node attributes. This problem is called link prediction, and "friend recommendation" in social networks is a typical application. In link prediction, the data and the algorithm together determine the prediction accuracy. When a prediction turns out poorly, we usually look for a better algorithm, but in doing so we overlook a critical question: is the data itself predictable? That is, how can the predictability of network data be characterized?
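To make "friend recommendation" concrete, here is one of the simplest link prediction baselines, the common-neighbors index: two unconnected people are scored by how many friends they share. It is shown only as an illustration of the problem and is not the method developed in this article; the names and the friendship edges are invented for the example.

```python
from itertools import combinations

def build_adj(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbor_scores(adj):
    """Score every unconnected pair by its number of shared neighbours."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            shared = len(adj[u] & adj[v])
            if shared:
                scores[(u, v)] = shared
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

friends = build_adj([("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
                     ("bob", "dan"), ("cat", "dan"), ("dan", "eve")])
for pair, score in common_neighbor_scores(friends):
    print(pair, score)
```

The pair at the top of the list is the most likely "missing" friendship under this index. How well any such ranking can work depends on how regular the network is, which is the predictability question raised above.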

Our view is that if randomly removing a small number of links has only a slight effect on the eigenvector space of the network, then the network is regular and highly predictable. Building on this idea, we apply a method similar to first-order perturbation of the Hamiltonian in quantum mechanics: we assume that the perturbation caused by removing or adding a small number of links changes only the eigenvalues and not the eigenvectors. We can then compare the adjacency matrix reconstructed under this assumption with the real adjacency matrix. We propose an indicator that measures the difference, the structural consistency of the network [10]: the stronger the consistency, the more predictable the network. On this basis we further propose a link prediction model based on structural perturbation of the network (as shown in Figure 7). The method significantly outperforms the classic hierarchical structure model and stochastic block model, both in predicting missing links and in identifying noisy links added to the network. The algorithm can be used not only for relationship prediction in social settings but also to predict disease-causing genes for conditions such as breast cancer, lung cancer, and heart failure, with higher accuracy than traditional systems biology methods [11].
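The perturbation idea can be written out in a few lines. The notation below is chosen for this sketch and should be checked against [10]: a small random set of links ΔE (with adjacency matrix ΔA) is removed from the network A, leaving A^R with eigenvalues λ_k and eigenvectors x_k; the eigenvalue shifts are estimated to first order with the eigenvectors held fixed, and the perturbed matrix Ã is rebuilt from them.

```latex
\begin{align}
A &= A^{R} + \Delta A, \qquad A^{R} = \sum_{k=1}^{N} \lambda_k \, x_k x_k^{T} \\
\Delta\lambda_k &\approx \frac{x_k^{T} \, \Delta A \, x_k}{x_k^{T} x_k} \\
\tilde{A} &= \sum_{k=1}^{N} \left( \lambda_k + \Delta\lambda_k \right) x_k x_k^{T} \\
\sigma_c &= \frac{\lvert E^{L} \cap \Delta E \rvert}{\lvert \Delta E \rvert}
\end{align}
```

Here E^L denotes the L = |ΔE| node pairs that are not linked in A^R and receive the highest scores in Ã. The structural consistency σ_c is the fraction of removed links that the perturbed matrix recovers, so a value close to 1 indicates a network whose missing links are easy to predict.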

Figure 7 Network structure consistency calculation

Network information mining has a very wide range of applications. Some research results have already been applied in real systems such as online public opinion monitoring, prediction of disease-causing genes, detection of medical insurance fraud, and e-commerce services, generating social and economic value. The report of the 20th National Congress of the Communist Party of China emphasized the importance of industrial chains and supply chains to national security and called for efforts to improve their resilience and security. The methods of network information mining can contribute to research on this topic as well. Industrial chains and supply chains are networks by nature and can be described and characterized as complex networks (as shown in Figure 8). A supply chain is a network of production and sales relationships formed by upstream and downstream enterprises in delivering products or services to end users. An industrial chain is an interconnected network formed between industries on the basis of economic and technological ties and spatial layout. Once the network is built, identifying important nodes makes it possible to discover in advance the industries that may become chokepoints, and identifying important links makes it possible to optimize those links and give early warning of weak ones. Combining the view from individual nodes with the view of the whole network, optimization and upgrading strategies can then be proposed to keep industrial chains and supply chains independent, controllable, secure and efficient.

Figure 8 Optimizing the industrial chain supply chain network from a complex network perspective

4. New frontiers in network science: from low-order to higher-order

Graph theory, one of the cornerstones of complex networks, can be traced back to Euler's Seven Bridges of Königsberg problem. But it was the breakthroughs on small-world networks in 1998 and scale-free networks in 1999 that set off the wave of network science research of the past two decades. We now have a relatively mature understanding of the structure, dynamics, prediction and control of networks at the level of nodes and edges. As research has deepened, however, researchers have found that many real systems involve not only pairwise relationships between two nodes but also higher-order interactions among groups of nodes [12]. For example, an academic paper may be written by several scholars together; life processes such as biological signal transduction and the regulation of gene expression require several proteins to take part; and in the brain's neural network, many cognitive functions, including memory, rely on the encoding and synchronized signaling of groups of neurons. Such higher-order interactions are hard to describe well with a network built on pairwise interactions. Tracing network science back to its origins suggests some new ideas (as shown in Figure 9). Another line of work going back to Euler, the Euler characteristic and the Euler-Poincaré formula, offers network science a new way to study the higher-order structure and dynamics of multi-node interactions, carrying the field into the era of higher-order network analysis. Higher-order network analysis gives deeper insight into the structure and function of networks, and is expected to break through bottlenecks and lead to new discoveries on some existing problems.

Figure 9 The development history of network science and future frontier challenges

Higher-order topological analysis has shown great potential in many complex systems, from social processes to neuroscience. The most basic higher-order structures of a network are cycle structures [Cycle: a closed path whose starting point and end point are the same.], including cliques [Clique: a subset of the vertices of an undirected graph in which every two distinct vertices are adjacent; in other words, its induced subgraph is a complete graph.] and holes [Hole, also called a cavity: the shortest cycle in a linearly independent equivalence class of cycles in a network.] (as shown in Figure 10). In the human brain, both are crucial for parallel processing and high-level cognitive activity: cliques serve as units of information processing and memory, and holes as the functional basis for integrating and distributing information across brain regions [13]. The first task in higher-order topological analysis of a network is to find its higher-order structures. So far, however, there has been no systematic theoretical method for studying them; mapping a complete higher-order structural map of the brain, for example, remains a huge challenge.

Figure 10 Schematic diagram of clique and hole structures

The key to finding the higher-order structure of a network lies in how to compute with the network's structure. Borrowing Poincaré's idea of decomposing a geometric body, we regard the network as a geometric body and decompose it in a similar way into totally homogeneous subnetworks [14]. We then use vector spaces and boundary operators over the binary field to describe the network and carry out calculations on it. On this basis we can compute the clique and hole structures of the network, as well as its topological invariants. Finally, we check the results against the Euler-Poincaré formula to verify the accuracy of the calculation (as shown in Figure 11). We applied this method to the neural network of the nematode C. elegans, counted all of its cliques and holes, and drew a complete map of its higher-order structure [15]. The biological significance of these clique and hole structures remains to be interpreted.
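The calculation described here can be illustrated on a toy network. The sketch below makes a simplifying assumption: every clique is treated as a filled building block, so the network becomes what topologists call a clique complex. It builds the boundary operators over the binary field as bit masks, computes their ranks, derives the number of independent holes in each dimension (the Betti numbers), and checks the result against the Euler-Poincaré formula. All function names are chosen for this sketch; it only counts holes and does not search for the shortest cycle around each one, as is done in [15].

```python
from itertools import combinations

def build_adj(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def cliques_by_size(adj):
    """Return {k: list of k-node cliques}, each clique a sorted tuple."""
    nodes = sorted(adj)
    levels = {1: [(v,) for v in nodes]}
    k = 1
    while levels[k]:
        bigger = []
        for clique in levels[k]:
            # Extend only with later nodes so each clique is listed once.
            for v in nodes:
                if v > clique[-1] and all(v in adj[u] for u in clique):
                    bigger.append(clique + (v,))
        k += 1
        levels[k] = bigger
    del levels[k]                       # drop the final, empty level
    return levels

def rank_gf2(rows):
    """Rank over the binary field of a matrix whose rows are int bit masks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot        # lowest set bit acts as pivot column
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(adj):
    """Return (levels, betti) where betti[d] counts d-dimensional holes."""
    levels = cliques_by_size(adj)
    boundary_rank = {1: 0}
    for k in sorted(levels):
        if k == 1:
            continue
        index = {face: i for i, face in enumerate(levels[k - 1])}
        rows = []
        for clique in levels[k]:
            mask = 0
            for face in combinations(clique, k - 1):   # boundary faces
                mask |= 1 << index[face]
            rows.append(mask)
        boundary_rank[k] = rank_gf2(rows)
    betti = [len(levels[k]) - boundary_rank[k] - boundary_rank.get(k + 1, 0)
             for k in sorted(levels)]
    return levels, betti

# A four-node ring a-b-c-d (a hole) sharing the edge c-d with a triangle c-d-e.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("c", "e"), ("d", "e")]
levels, betti = betti_numbers(build_adj(edges))
counts = [len(levels[k]) for k in sorted(levels)]
euler_from_counts = sum((-1) ** d * n for d, n in enumerate(counts))
euler_from_betti = sum((-1) ** d * b for d, b in enumerate(betti))
print(counts, betti, euler_from_counts == euler_from_betti)
```

In the example, the ring of four nodes is a hole because no chord fills it, while the triangle is a clique and is therefore "filled". The alternating sum of the clique counts equals the alternating sum of the Betti numbers, which is the consistency check provided by the Euler-Poincaré formula.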

Figure 11 Theoretical framework of high-order network analysis

Applying higher-order network analysis to the brain offers a completely new perspective. Higher-order structures such as cliques and holes are critical in the brain; studying them will advance our understanding of the neural circuits underlying brain function and offer new ideas for clinical applications and for the development of brain-inspired computing frameworks. For example, our analysis of the brain networks of autistic patients showed that, compared with healthy people, they have "fewer cliques and more holes". To some extent, cliques reflect the ability to process information locally and in parallel, while holes reflect the brain's ability to integrate information from different brain regions. This suggests that in autistic patients local parallel processing is weakened while the integration of information across brain regions is strengthened. But how exactly do these clique and hole structures come to be organized in the way they are, and how do they relate to cognition and disease? These are important questions for future research.

Looking ahead, the combination of network science and artificial intelligence has great potential. It is expected not only to address current challenges, such as security and governance in the modern digital society, but also to give rise to new scientific problems and applied technologies that will play an important role in society, the economy and many other fields (as shown in Figure 12). From the founding of the Santa Fe Institute in 1984 and the birth of complexity science to the award of the 2021 Nobel Prize in Physics for research on complex systems, complexity science has grown rapidly in just a few decades. Yet it is still like an adolescent: not yet mature, but representing the future. Complexity science is on the rise, and Chinese scholars have a promising future in it!

Figure 12 Application scenarios of the combination of network science and artificial intelligence

References

[1] Parisi, G. Flying with the Starlings. Translated by Wen Zheng. 2022.

[2] Newman, M. E. J. The structure and function of complex networks. SIAM Review 45(2), 167–256 (2003).

[3] Lü, L., Chen, D., Ren, X.-L., Zhang, Q.-M., Zhang, Y.-C. & Zhou, T. Vital nodes identification in complex networks. Physics Reports 650, 1–63 (2016).

[4] Lü, L. & Zhou, T. Link Prediction. Higher Education Press (2013).

[5] Alvarez-Hamelin, J. I., Dall'Asta, L., Barrat, A. & Vespignani, A. k-core decomposition: a tool for the visualization of large scale networks. Preprint at https://doi.org/10.48550/arXiv.cs/0504107 (2005).

[6] Lü, L., Zhou, T., Zhang, Q.-M. & Stanley, H. E. The H-index of a network node and its relation to degree and coreness. Nature Communications 7, 10168 (2016).

[7] Ye, Y., Xu, S., Mariani, M. S. & Lü, L. Forecasting countries' gross domestic product from patent data. Chaos, Solitons & Fractals 160, 112234 (2022).

[8] Wang, H., Wu, H.-J., Liu, Y.-Y. & Lü, L. Higher-order interaction of brain microstructural and functional connectome. Preprint at https://www.biorxiv.org/content/10.1101/2021.11.11.467196v1.abstract (2021).

[9] Fan, T., Zhu, Y., Wu, L., et al. Generalization and application of DHC theorem on directed weighted networks. Journal of University of Electronic Science and Technology of China 46(5), 766–776 (2017).

[10] Lü, L., Pan, L., Zhou, T., Zhang, Y.-C. & Stanley, H. E. Toward link predictability of complex networks. Proceedings of the National Academy of Sciences 112, 2325–2330 (2015).

[11] Zeng, X., Liu, L., Lü, L. & Zou, Q. Prediction of potential disease-associated microRNAs using structural perturbation method. Bioinformatics 34, 2425–2432 (2018).

[12] Boccaletti, S., De Lellis, P., del Genio, C. I., Alfaro-Bittner, K., Criado, R., Jalan, S. & Romance, M. The structure and dynamics of networks with higher order interactions. Physics Reports 1018, 1–64 (2023).

[13] Sizemore, A. E., Giusti, C., Kahn, A., Vettel, J. M., Betzel, R. F. & Bassett, D. S. Cliques and cavities in the human connectome. Journal of Computational Neuroscience 44, 115–145 (2018).

[14] Shi, D., Lü, L. & Chen, G. Totally homogeneous networks. National Science Review 6, 962–969 (2019).

[15] Liu, B., Yang, R., Wang, H. & Lü, L. Complete cavity map of the C. elegans connectome. Preprint at http://arxiv.org/abs/2212.03660 (2022).
