Less is more, more is different and sparse is better: but why?
The answer can come from an unexpected connection with the statistical physics of quantum systems
What if the same physics that governs quantum particles could also explain the peculiar patterns observed in protein-protein interactions, in complex brains, in social relationships, in the Internet infrastructure or the intricate web of air traffic routes?
This is not science fiction: it is a mathematical framework, based on thermodynamics and information theory, that has been used for decades to describe entanglement in quantum systems. In a new study that has just appeared in Nature Physics (disclaimer: it comes from my lab), this framework has been used to describe the emergence of specific topological features – such as sparsity, modularity and heterogeneity – that are ubiquitous in empirical complex systems characterized by interconnections.
Background
The following quote is usually attributed to Albert Einstein:
“A theory is the more impressive the greater the simplicity of its premises is, the more different kinds of things it relates, and the more extended is its area of applicability. Therefore the deep impression which classical thermodynamics made upon me. It is the only physical theory of universal content concerning which I am convinced that, within the framework of the applicability of its basic concepts, it will never be overthrown.”
In fact, thermodynamics is likely the only theory in physics that can be applied at any scale of interest, regardless of the specific nature of the objects under investigation, be they particles or black holes. There is a reason for the success of thermodynamics: it is the physical theory describing the laws that govern heat, work and energy, key pillars of any physically relevant description of reality. Central to the thermodynamic description of the physical world is the concept of entropy: a measure of the “disorder” or “randomness” of a system, directly related to the probability of observing it in a given configuration within a space of possibilities. Entropy is at the core of the Second Law of thermodynamics, which states that the entropy of an isolated system tends to increase over time: this establishes a formal ground for the spontaneous appearance of the “arrow of time” and for other theoretical extrapolations, such as the heat death of the universe.
Thermodynamics also plays a crucial role in some of Erwin Schrödinger's work on the origin of life. In his famous book “What is Life?”, which appeared in 1944, he attacks fundamental questions: what is the source of order in organisms, how do organisms exhibit “highly ordered” (i.e., well organized) states (i.e., structures and processes) despite the Second Law of thermodynamics, and are new laws of physics required at all to explain the emergence of life? He suggests that living matter avoids (at least for a limited amount of time) its inevitable path towards thermodynamic equilibrium by lowering its own entropy at the expense of the environment. While the universe, as a closed system, will maximize its entropy according to the Second Law, organisms can temporarily and locally counteract entropy maximization by means of processes, requiring work, that lower their entropy while increasing the entropy of the universe as a whole. This idea of “order-from-disorder” is crucial to understand how living organisms maintain their highly ordered state and function despite the natural tendency towards increasing disorder in the universe.
In those same years, Claude Shannon published his pioneering work on communication, founding the field of information theory which, again, is grounded in a quantity known as information entropy, a name suggested by John von Neumann: “You should call it entropy, for two reasons. In the first place, your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one really knows what entropy really is, so in a debate you will always have the advantage”. In the subsequent decades, this concept has found applications in nearly every discipline and underlies many machine learning approaches to artificial intelligence.
In quantum mechanics, the entropy of a quantum system is quantified by the von Neumann entropy, which is based on the concept of the density matrix, a powerful tool used to describe the statistical state of a quantum system.
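For concreteness, here is a minimal sketch in Python (my own illustration using NumPy/SciPy; the function names and the two-state example are not from the paper) of how the Shannon entropy of a probability distribution and the von Neumann entropy of a density matrix are computed in practice.

```python
import numpy as np
from scipy.linalg import eigvalsh

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * log(0) = 0
    return -np.sum(p * np.log(p))

def von_neumann_entropy(rho):
    """Von Neumann entropy S(rho) = -Tr(rho log rho),
    computed from the eigenvalues of the (Hermitian) density matrix."""
    lam = eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return -np.sum(lam * np.log(lam))

# Maximally mixed two-state system: both entropies equal log(2)
print(shannon_entropy([0.5, 0.5]))
print(von_neumann_entropy(np.eye(2) / 2))
```

Since the von Neumann entropy depends only on the eigenvalues of the density matrix, it reduces to the Shannon entropy of that spectrum, which is why the maximally mixed state above gives log 2 in both cases.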
The power of variational principles
It is not surprising that entropy plays such an important role in the description of so many theories: in fact, as shown years later by Edwin T. Jaynes, statistical mechanics and information theory are strictly related, and it is possible to devise an elegant yet powerful formalism – based on the maximum entropy principle – to describe a variety of classical and quantum phenomena within a unifying framework. The formalism builds on a variational principle: the maximization or minimization of a physical quantity (here, the entropy) that can be written as a functional depending on pre-defined constraints. In this specific case, the functional is a Lagrangian depending on the entropy and a variety of physically motivated constraints: e.g., it is easy to impose that, on average, the system has a given energy.
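As a toy illustration of the maximum entropy recipe (a sketch with arbitrary energy levels of my own choosing, not an example from the paper): maximizing the Shannon entropy subject to a fixed average energy yields the familiar Gibbs distribution p_i ∝ exp(-βE_i), where the Lagrange multiplier β plays the role of an inverse temperature.

```python
import numpy as np

def gibbs_distribution(energies, beta):
    """Maximum-entropy distribution under a mean-energy constraint:
    p_i = exp(-beta * E_i) / Z, where Z is the partition function and
    beta is the Lagrange multiplier enforcing the constraint."""
    E = np.asarray(energies, dtype=float)
    w = np.exp(-beta * (E - E.min()))   # shift energies for numerical stability
    return w / w.sum()

# Illustrative three-level system with arbitrary energy levels
E = [0.0, 1.0, 2.0]
for beta in (0.1, 1.0, 10.0):
    print(beta, gibbs_distribution(E, beta))
```

As β grows, the distribution concentrates on the lowest-energy state; as β approaches zero, it approaches the uniform, unconstrained maximum-entropy distribution.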
Variational principles are rare and powerful: they provide a simple and elegant framework to attack a variety of problems. The d'Alembert principle, encoded in variational terms by Lagrange, allows one to obtain the classical laws of motion. The functional known as the action is routinely used – from analytical mechanics to general relativity, field theory and quantum field theory – to derive equations of motion subject to given constraints and to gain insight into the underlying conservation laws. It is not an overstatement to claim that most physics theories, nowadays, can (and should) be deduced from a variational principle.
Variational principles are also used in machine learning: from maximum-likelihood to Bayesian variational inference and stochastic gradient descent techniques, modern paths to artificial intelligence are widely based on searching for an optimal function, within some function space, that solves a given optimization problem.
Complex systems
However, there are a variety of natural and artificial systems – not necessarily physical ones – that exhibit peculiar features. Their behavior is characterized by a strong sensitivity to initial conditions and long-term unpredictability (the famous “butterfly effect”), and they exhibit a modular and hierarchical mesoscale organization. Their macroscopic properties – such as robustness to external disturbances or internal failures, or adaptation to changing environments – cannot be deduced from knowledge of their constituents alone: they are emergent. A class of these systems is characterized by an interconnected structure: units are linked with non-trivial random connectivity patterns, where a few units (hubs) are extraordinarily well linked to others while the vast majority are peripheral, with very few connections. This non-trivial structure identifies a network topology that, together with nonlinear dynamical processes, is responsible for fascinating phenomena – such as the synchronization of fireflies – and catastrophic events, such as pandemics, regional power outages and global economic crises.
Such networks are often characterized by heterogeneous connectivity, the “small-world phenomenon” and the presence of modules or groups. But there is another interesting feature that characterizes real-world complex networks: they are sparse. Why should they be sparse at all? After all, one could intuitively think that the larger the number of connections between units the faster the exchange of information: from electrochemical signals among neurons to goods transported among distinct countries, more links are expected to lower the “effective distance” between units in a network.
In practice, making connections comes with costs. Think of the rail connections between distant geographic areas: their presence favors the exchange of human flows between those areas, but they require capital and labor. Similarly, it could be desirable for the nearly 86 billion neurons in our brain to be densely connected: the result could be faster processing of sensory inputs and faster reactions, at the price of an increased metabolic cost.
This trade-off between the notion of function, broadly speaking, and the notion of cost is likely to be responsible for the apparent optimality of sparse connectivity and its ubiquitous observation in empirical systems. For physicists, the challenge is to move beyond the specific details of each system: is it possible to capture this trade-off without relying on those details?
The thermodynamic origin of sparsity
We asked this question nearly six years ago, when we started our academic collaboration on the statistical physics of complex networks and their dynamics. At that time, we were missing the mathematical framework and the physical intuition needed to provide an answer. In fact, there was only a somewhat obscure, but theoretically sound, work about an unexpected link between the statistical physics of quanta and the state of complex networks subject to some kind of dynamical process, such as diffusion. The rationale of that study was simple: network science is the “science of entangled units”, and it was plausible to ask whether the better-known “science of entangled quanta” could offer some opportunities to build an adequate theoretical framework for complex, non-quantum, systems. A previous attempt in this direction was proposed by Braunstein, Ghosh and Severini in 2006, but it lacked important theoretical features, such as sub-additivity, an issue we later addressed in 2016 and 2020.
The bridge between the two apparently distinct areas is information. In fact, according to Murray Gell-Mann, complex systems might resemble one another in the way they handle information, and quantum statistical physics provides a wonderful framework for working with information at the level of quantum systems.
Figure: different complex systems characterized by some kind of units exchanging some type of information through some kind of connections. The interactions lead to a variety of emergent phenomena. Figure from personal slides.
However, finding an answer has been tougher than expected: it required us to understand every single detail of the framework and how to generalize it in a physically plausible way that does not rely on assumptions about the quantum nature of the system. We discovered that the framework was hiding a statistical field theory, which could be better understood and generalized in terms of signaling between network units.
Figure: building and interpreting a network density matrix. Figure from here.
In fact, a density matrix is a generalization of the concept of a probability distribution, and we have shown that it is not necessarily limited to the quantum realm. The same idea has recently been used to extend the concept of the renormalization group to heterogeneous networks, to identify the emergent mesoscale organization of fungal networks and to study the functional connectivity of the human brain.
Figure: building and interpreting a network density matrix where information is considered as diffusion. Figure reusable under CC BY 4.0 Attribution.
Figure: building and interpreting a network density matrix where information is considered as propagation of a perturbation. Figure reusable under CC BY 4.0 Attribution.
Figure: after defining the dynamics on top of a structure, one builds the propagation operator and derives the partition function, the density matrix and the von Neumann entropy. Left panel readapted from here. Figure reusable under CC BY 4.0 Attribution.
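To make this pipeline concrete, here is a minimal sketch (my own code, assuming diffusion dynamics generated by the graph Laplacian and using NetworkX/SciPy; the example graph and the choice τ = 1 are purely illustrative): one builds the propagator e^{-τL}, normalizes it by the partition function Z = Tr e^{-τL} to obtain the density matrix ρ, and then computes its von Neumann entropy.

```python
import numpy as np
import networkx as nx
from scipy.linalg import expm, eigvalsh

def network_density_matrix(G, tau=1.0):
    """Density matrix of a network under diffusion dynamics:
    rho = exp(-tau * L) / Z, where L is the graph Laplacian and
    Z = Tr exp(-tau * L) is the partition function."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    propagator = expm(-tau * L)      # propagation operator of the diffusion process
    Z = np.trace(propagator)         # partition function
    return propagator / Z, Z

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log rho), from the eigenvalues of rho."""
    lam = eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return -np.sum(lam * np.log(lam))

G = nx.erdos_renyi_graph(50, 0.1, seed=1)
rho, Z = network_density_matrix(G, tau=1.0)
print(Z, von_neumann_entropy(rho))
```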
Once the technical details of each quantity were understood, we considered using density matrices to quantify how a network responds to stochastic perturbations. We started by introducing a new quantity η, acting in spirit like a thermodynamic efficiency: it balances the trade-off between favoring the flow of information within a network and the diversity of its responses to disturbances. Once again, the idea was simple: networks that are forming need information (e.g., electrochemical signals, goods, etc.) flowing among their units to guarantee their function, while at the same time they need to respond to unexpected perturbations in a diverse way to guarantee a certain level of robustness. Our η serves as a universal (i.e., detail-agnostic) measure for this purpose.
One of the most fascinating aspects of this framework is its grounding in thermodynamics. The flow of information is related to a generalized concept of free energy, while the diversity of responses to disturbances is related to the von Neumann entropy, ported to the realm of complex non-quantum systems. The need to minimize free energy while maximizing entropy introduces two competing mechanisms, which find their optimum somewhere between two extreme configurations: one where all units are disconnected and one where every unit is connected to every other.
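Staying within the same diffusion-based construction, a free-energy-like quantity can be read off the partition function, for instance as F = -ln Z / τ, in direct analogy with the statistical-mechanical F = -k_B T ln Z with τ playing the role of an inverse temperature (this is one common convention; the paper's exact definitions of the free energy and of η may differ). The sketch below only shows how the two competing quantities – free energy and von Neumann entropy – can be tracked as a function of the propagation time τ; it is not the paper's optimization.

```python
import numpy as np
import networkx as nx
from scipy.linalg import eigvalsh

def spectral_quantities(G, tau):
    """Partition function, free-energy-like quantity and von Neumann entropy
    of a network under diffusion dynamics, for propagation time tau."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    lam = eigvalsh(L)                       # Laplacian eigenvalues
    Z = np.sum(np.exp(-tau * lam))          # partition function Tr exp(-tau L)
    F = -np.log(Z) / tau                    # one possible free-energy convention
    p = np.exp(-tau * lam) / Z              # eigenvalues of the density matrix
    mask = p > 1e-12
    S = -np.sum(p[mask] * np.log(p[mask]))  # von Neumann entropy
    return Z, F, S

G = nx.barabasi_albert_graph(100, 2, seed=1)
for tau in (0.1, 1.0, 10.0):
    Z, F, S = spectral_quantities(G, tau)
    print(f"tau={tau:5.1f}  F={F:8.3f}  S={S:6.3f}")
```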
Figure: Forces competing during network formation, and different phases. Figure reusable under CC BY 4.0 Attribution.
We have performed extensive numerical experiments and found that η is nearly maximal for heterogeneous networks – the most abundant in nature – and for systems that balance integration and segregation – one of the key principles thought to be behind the functioning of complex brains and cities – as well as for networks characterized by the small-world phenomenon. These results are interesting but not yet fully understood: they are rather difficult to derive analytically, i.e., from first principles, and this is a matter of current research.
Figure: η for a variety of network structures characterized by distinct topological features. Figure from here.
The above results are in agreement with previous findings that relate the von Neumann entropy to the presence of mesoscale organization.
However, one feature that could be attacked from a theoretical perspective was network sparsity. By assuming that the total number of connections in a network scales with the number of nodes according to a general power law characterized by a scaling exponent, we analytically maximized η with respect to that exponent and, surprisingly, found that it must be equal to 1. This theoretical expectation, obtained from first principles, was confirmed within a good margin by the scaling exponent estimated from more than 500 empirical networks from biological, social and engineering systems: 1.07 +/- 0.02.
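To give a flavor of how such a scaling exponent can be estimated from data (a hypothetical sketch on synthetic numbers of nodes and edges, not the authors' actual corpus or fitting procedure), one can regress log E against log N across a collection of networks:

```python
import numpy as np

def scaling_exponent(num_nodes, num_edges):
    """Least-squares estimate of gamma in E ~ N**gamma via log-log regression."""
    logN = np.log(np.asarray(num_nodes, dtype=float))
    logE = np.log(np.asarray(num_edges, dtype=float))
    gamma, _intercept = np.polyfit(logN, logE, 1)
    return gamma

# Synthetic corpus: edges grow roughly linearly with nodes (sparse regime), plus noise
rng = np.random.default_rng(0)
N = rng.integers(100, 100_000, size=500)
E = (3 * N * rng.lognormal(0.0, 0.2, size=500)).astype(int)
print(scaling_exponent(N, E))   # close to 1 by construction
```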
The meaning of this result is interesting: while it is not possible to claim that each microscopic unit of a complex network behaves in such a way that η is maximized, we have found a variational principle that offers a macroscopic description of a system, even if its individual units do not follow these overarching rules. Our results provide a new perspective on how complex networks, from biological to social ones, might emerge: they are not just random assemblages following a disparate variety of context-specific mechanisms, but systems that evolve towards specific states, guided by thermodynamic and information principles similar to those governing physical systems.
“What we observe is not nature in itself but nature exposed to our method of questioning.” — W. Heisenberg