In a nutshell…
In this issue we have foundational and applied papers.
In thermodynamics, graph theory provides valuable insights into discrete systems, but chemical reactions pose challenges due to their many-to-many interactions. A novel study fills the gap, enabling the identification of nonequilibrium observables and uncovering hidden symmetries. In the same context, a new approach inspired by electrical circuit theory proposes modularizing networks into easily analyzed components, allowing for the prediction of reaction currents and energy flows.
Another paper deals with an analysis of transport on dynamic graphs, and reveals noise-induced resonances that optimize transport efficiency, suggesting a new understanding of the role of noise in optimization algorithms.
Transitioning to network inference, a paper evaluates traditional and modern techniques using synthetic data and experimental neuronal activity. Performance is assessed under various limitations, such as noise, partial measurement, and low sampling rate, with a surrogate data approach providing statistical confidence levels for the inferred links.
Shifting the focus to neuroimaging, the replicability of brain-phenotype associations is explored, with limited sample sizes and statistical power being significant challenges. Leveraging large-scale imaging datasets, such as the UK Biobank, the paper investigates associations between brain and physical/mental health variables, underscoring the need for larger sample sizes to establish highly replicable associations.
Expanding the concept of habitable zones, computation is seen as a measure of living systems, leading to the notion of computational zones defined by capacity, energy, and instantiation. Cool!
Observational evidence highlights the destabilization of Earth's ecosystems, driven by factors like population growth and climate change. Computer simulation models reveal compounding risks and increased probabilities of system failure, with tipping points that can be triggered within the warming range outlined in the Paris Agreement. The paper focuses on threshold-dependent changes caused by multiple stressors and their interactions, exploring how the combination of fast drivers and noisy system dynamics brings collapse forward.
Finally, the phenomenon of proxy failure is observed across diverse domains, where measures used as proxies for desired outcomes diverge from the actual goals. This unifying mechanism is explored in neuroscience, economics, and ecology, shedding light on the implications for understanding goal-oriented systems.
Network foundations
Geometry of Nonequilibrium Reaction Networks
The modern thermodynamics of discrete systems is based on graph theory, which provides both algebraic methods to define observables and a geometric intuition of their meaning and role. However, because chemical reactions usually have many-to-many interactions, chemical networks are described by hypergraphs, which lack a systematized algebraic treatment and a clear geometric intuition. Here, we fill this gap by building fundamental bases of chemical cycles (encoding stationary behavior) and cocycles (encoding finite-time relaxation). We interpret them in terms of circulations and gradients on the hypergraph and use them to properly identify nonequilibrium observables. As an application, we unveil hidden symmetries in linear response and, within this regime, propose a reconstruction algorithm for large metabolic networks consistent with Kirchhoff’s voltage and current laws.
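The cycle/cocycle split can be illustrated numerically on a toy network. Below is a minimal sketch (my own example, not the authors' construction, and the orthonormal bases it returns are not the fundamental bases built in the paper): for a small stoichiometric matrix S that includes one many-to-many reaction, cycles span the right null space of S, cocycles span its orthogonal complement in reaction space (the row space of S), and conservation laws span the left null space.

```python
import numpy as np

# Toy chemical network with species A, B, C and four reactions,
# including one many-to-many ("hypergraph") reaction A + B -> 2C.
# Rows of S are species, columns are reactions.
S = np.array([
    [-1,  0,  1, -1],   # A
    [ 1, -1,  0, -1],   # B
    [ 0,  1, -1,  2],   # C
], dtype=float)

def orthonormal_basis(A, of="null"):
    """Orthonormal basis of the null space or row space of A via SVD."""
    U, s, Vt = np.linalg.svd(A)
    rank = np.sum(s > 1e-10)
    return Vt[rank:].T if of == "null" else Vt[:rank].T

cycles = orthonormal_basis(S, of="null")          # steady-state flux modes: S @ c = 0
cocycles = orthonormal_basis(S, of="row")         # orthogonal complement: finite-time relaxation
conservation = orthonormal_basis(S.T, of="null")  # left null space: conserved moieties

print("cycle basis (reaction space):\n", np.round(cycles, 3))
print("cocycle basis (reaction space):\n", np.round(cocycles, 3))
print("conservation laws (species space):\n", np.round(conservation, 3))
```

For this toy S (rank 2) there are two independent cycles, two cocycles, and a single conservation law proportional to (1, 1, 1), i.e. the total amount of A + B + C.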
Circuit Theory for Chemical Reaction Networks
We lay the foundation of a circuit theory for chemical reaction networks. Chemical reactions are grouped into chemical modules solely characterized by their current-concentration characteristic, just as electrical devices are characterized by their current-voltage (I-V) curve in electronic circuit theory. Combined with the chemical analog of Kirchhoff's current and voltage laws, this provides a powerful tool to predict reaction currents and dissipation across complex chemical networks. The theory can serve to build accurate reduced models of complex networks as well as to design networks that perform desired tasks.
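To make the circuit analogy concrete, here is a minimal sketch (my own toy example, not the paper's formalism) of two hypothetical mass-action modules placed in series between two chemostats; the rate constants k1f, k1b, k2f, k2b and the concentrations are invented. Kirchhoff's current law at the internal species fixes the steady state, exactly as for two nonlinear devices in series.

```python
from scipy.optimize import brentq

# Two hypothetical modules in series: chemostat A --(module 1)--> Y --(module 2)--> chemostat B.
# Each module is summarized only by its current-concentration characteristic, here taken as a
# simple mass-action expression for illustration.
a, b = 2.0, 0.1          # chemostatted (fixed) concentrations
k1f, k1b = 1.0, 0.5      # module 1 rate constants (assumed values)
k2f, k2b = 0.8, 0.2      # module 2 rate constants (assumed values)

J1 = lambda y: k1f * a - k1b * y   # current through module 1 as a function of [Y]
J2 = lambda y: k2f * y - k2b * b   # current through module 2 as a function of [Y]

# Kirchhoff's current law at the internal species Y: at steady state the same current
# flows through both modules, so J1(y) = J2(y).
y_ss = brentq(lambda y: J1(y) - J2(y), 1e-9, 1e3)
print(f"steady-state [Y] = {y_ss:.4f}, series current J = {J1(y_ss):.4f}")
```

Nonlinear characteristics (e.g., enzymatic modules) would be handled the same way numerically, which is why only the current-concentration curve of each module is needed.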
Noise-Induced Network Topologies
We analyze transport on a graph with multiple constraints in which the weight of the edges connecting the nodes is a dynamical variable. The network dynamics results from the interplay between a nonlinear function of the flow, dissipation, and Gaussian, additive noise. For a given set of parameters and finite noise amplitudes, the network self-organizes into one of several metastable configurations, according to a probability distribution that depends on the noise amplitude α. At a finite value of α, we find a resonant-like behavior for which one network topology is the most probable stationary state. This specific topology maximizes robustness and transport efficiency, is reached with the maximal convergence rate, and is not found by the noiseless dynamics. We argue that this behavior is a manifestation of noise-induced resonances in network self-organization. Our findings show that stochastic dynamics can boost transport on a nonlinear network and, further, suggest a change of paradigm about the role of noise in optimization algorithms.
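A schematic simulation in the spirit of this setup (not the paper's exact equations; the graph, feedback exponent gamma, and noise amplitude alpha below are illustrative) can be sketched with an Euler-Maruyama loop: edge conductances adapt to a nonlinear function of the flow, decay dissipatively, and receive additive Gaussian noise, while the flows themselves obey Kirchhoff's laws at every step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small graph: 4 nodes, 5 edges; node 0 is the source, node 3 the sink.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
n_nodes, n_edges = 4, len(edges)
injection = np.array([1.0, 0.0, 0.0, -1.0])   # unit current from node 0 to node 3

def edge_flows(C):
    """Solve Kirchhoff's equations for the node potentials and return the edge flows."""
    L = np.zeros((n_nodes, n_nodes))
    for (i, j), c in zip(edges, C):
        L[i, i] += c; L[j, j] += c
        L[i, j] -= c; L[j, i] -= c
    u = np.zeros(n_nodes)
    u[:-1] = np.linalg.solve(L[:-1, :-1], injection[:-1])  # ground node 3 at u = 0
    return np.array([c * (u[i] - u[j]) for (i, j), c in zip(edges, C)])

gamma, alpha, dt = 0.5, 0.05, 0.01   # feedback exponent, noise amplitude, time step
C = np.ones(n_edges)                  # initial edge conductances
for _ in range(20000):
    Q = edge_flows(C)
    dC = (np.abs(Q) ** gamma - C) * dt + alpha * np.sqrt(dt) * rng.standard_normal(n_edges)
    C = np.clip(C + dC, 1e-6, None)   # keep conductances positive

print("final conductances per edge:", np.round(C, 3))
```

Repeating such runs over a range of noise amplitudes is how one would look for the resonance-like effect, i.e., a value of alpha at which a single topology (pattern of surviving edges) dominates the stationary statistics.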
Network link inference from measured time series data of the behavior of dynamically interacting network nodes is an important problem with wide-ranging applications, e.g., estimating synaptic connectivity among neurons from measurements of their calcium fluorescence. Network inference methods typically begin by using the measured time series to assign to any given ordered pair of nodes a numerical score reflecting the likelihood of a directed link between those two nodes. In typical cases, the measured time series data may be subject to limitations, including limited duration, low sampling rate, observational noise, and partial nodal state measurement. However, it is unknown how the performance of link inference techniques on such datasets depends on these experimental limitations of data acquisition. Here, we utilize both synthetic data generated from coupled chaotic systems and experimental data obtained from Caenorhabditis elegans neural activity to systematically assess the influence of data limitations on the character of scores reflecting the likelihood of a directed link between a given node pair. We do this for three network inference techniques: Granger causality, transfer entropy, and a machine learning-based method. Furthermore, we assess the ability of appropriate surrogate data to determine statistical confidence levels associated with the results of link-inference techniques.
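As a concrete illustration of the scoring-plus-surrogates pipeline (a minimal sketch with invented parameters, not the paper's implementation), the snippet below computes a lag-1 Granger causality score for a pair of coupled logistic maps with observational noise and calibrates it against circular-shift surrogates of the source signal.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: two coupled logistic maps where x drives y (one true directed link x -> y).
T, eps = 2000, 0.2
x, y = np.empty(T), np.empty(T)
x[0], y[0] = 0.4, 0.3
for t in range(T - 1):
    x[t + 1] = 3.9 * x[t] * (1 - x[t])
    y[t + 1] = 3.7 * y[t] * (1 - y[t]) * (1 - eps) + eps * x[t]
x += 0.01 * rng.standard_normal(T)   # observational noise
y += 0.01 * rng.standard_normal(T)

def granger_score(source, target, lag=1):
    """Lag-1 Granger score: log ratio of residual variances without/with the source."""
    n = len(target) - lag
    Xr = np.column_stack([target[:-lag], np.ones(n)])                # restricted model
    Xf = np.column_stack([target[:-lag], source[:-lag], np.ones(n)])  # full model
    yt = target[lag:]
    res_r = yt - Xr @ np.linalg.lstsq(Xr, yt, rcond=None)[0]
    res_f = yt - Xf @ np.linalg.lstsq(Xf, yt, rcond=None)[0]
    return np.log(np.var(res_r) / np.var(res_f))

score = granger_score(x, y)
# Surrogate data: circular shifts of the source destroy causal alignment but keep its statistics.
surrogates = [granger_score(np.roll(x, s), y) for s in rng.integers(100, T - 100, size=200)]
p_value = np.mean(np.array(surrogates) >= score)
print(f"Granger score x->y: {score:.4f}, surrogate p-value: {p_value:.3f}")
```

Shortening T, lowering the effective sampling rate, or increasing the noise amplitude in this sketch mimics the data limitations whose impact the paper assesses systematically.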
Network neuroscience
Replicable brain–phenotype associations require large-scale neuroimaging data
Numerous neuroimaging studies have investigated the neural basis of interindividual differences but the replicability of brain–phenotype associations remains largely unknown. We used the UK Biobank neuroimaging dataset (N = 37,447) to examine associations with six variables related to physical and mental health: age, body mass index, intelligence, memory, neuroticism and alcohol consumption, and assessed the improvement of replicability for brain–phenotype associations with increasing sample sizes. Age may require only 300 individuals to provide highly replicable associations, but other phenotypes required 1,500 to 3,900 individuals. The required sample size showed a negative power law relation with the estimated effect size. When only comparing the upper and lower quarters, the minimally required sample sizes for imaging decreased by 15–75%. Our findings demonstrate that large-scale neuroimaging data are required for replicable brain–phenotype associations, that this can be mitigated by preselection of individuals and that small-scale studies may have reported false positive findings.
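The sample-size question can be illustrated with synthetic data (a hedged sketch, not the paper's UK Biobank pipeline or its replicability criteria): for an assumed true effect size, estimate how often a discovery and a replication sample of size n both recover a significant association with a consistent sign.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def replication_rate(true_r, n, n_trials=500):
    """Fraction of trials in which a discovery and a replication sample of size n
    both yield a significant (p < 0.05) correlation with the same sign."""
    hits = 0
    cov = [[1.0, true_r], [true_r, 1.0]]
    for _ in range(n_trials):
        disc = rng.multivariate_normal([0, 0], cov, size=n)
        repl = rng.multivariate_normal([0, 0], cov, size=n)
        r1, p1 = stats.pearsonr(disc[:, 0], disc[:, 1])
        r2, p2 = stats.pearsonr(repl[:, 0], repl[:, 1])
        hits += (p1 < 0.05) and (p2 < 0.05) and (np.sign(r1) == np.sign(r2))
    return hits / n_trials

# Illustrative effect sizes: a strong association (like age) versus a weak one.
for true_r in (0.5, 0.1):
    for n in (100, 300, 1000, 3000):
        print(f"r={true_r:.1f}, n={n:5d}: replication rate = {replication_rate(true_r, n):.2f}")
```

The output reproduces the qualitative message: strong effects replicate at a few hundred participants, while weak effects need thousands, consistent with the negative power-law relation between required sample size and effect size.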
Human behavior
When a measure becomes a target, it ceases to be a good measure. For example, when standardized test scores in education become targets, teachers may start 'teaching to the test', leading to a breakdown of the relationship between the measure (test performance) and the underlying goal (quality education). Similar phenomena have been named and described across a broad range of contexts, such as economics, academia, machine learning, and ecology. Yet it remains unclear whether these phenomena bear only superficial similarities, or if they derive from some fundamental unifying mechanism. Here, we propose such a unifying mechanism, which we label proxy failure. We first review illustrative examples and their labels, such as the 'Cobra effect', 'Goodhart's law', and 'Campbell's law'. Second, we identify central prerequisites and constraints of proxy failure, noting that it is often only a partial failure or divergence. We argue that whenever incentivization or selection is based on an imperfect proxy measure of the underlying goal, a pressure arises which tends to make the proxy a worse approximation of the goal. Third, we develop this perspective for three concrete contexts, namely neuroscience, economics and ecology, highlighting similarities and differences. Fourth, we outline consequences of proxy failure, suggesting it is key to understanding the structure and evolution of goal-oriented systems. Our account draws on a broad range of disciplines, but we can only scratch the surface within each. We thus hope the present account elicits a collaborative enterprise, entailing both critical discussion as well as extensions in contexts we have missed.
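A toy evolutionary simulation (my own illustration, not a model from the paper) captures the proposed pressure: when selection acts on a proxy that rewards both a hard-to-improve goal trait and an easy-to-improve gaming trait, the proxy keeps rising while its goal content shrinks.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy Goodhart-style model: each agent has a costly-to-improve "goal" trait and an
# easy-to-improve "gaming" trait. The proxy rewards both; only the goal trait matters
# for the true objective.
n_agents, n_generations = 200, 80
mut_goal, mut_gaming = 0.01, 0.10     # improving the goal is hard, gaming it is easy
goal = rng.normal(1.0, 0.1, n_agents)
gaming = np.zeros(n_agents)

for gen in range(n_generations + 1):
    proxy = goal + gaming + rng.normal(0, 0.1, n_agents)   # proxy = goal + gaming + noise
    if gen % 20 == 0:
        share = goal.mean() / proxy.mean()
        print(f"gen {gen:3d}: mean proxy={proxy.mean():.2f}, mean goal={goal.mean():.2f}, "
              f"goal share of proxy={share:.2f}")
    # Selection acts on the proxy: the top half reproduces, with small mutations.
    top = np.argsort(proxy)[n_agents // 2:]
    goal = np.repeat(goal[top], 2) + rng.normal(0, mut_goal, n_agents)
    gaming = np.repeat(gaming[top], 2) + rng.normal(0, mut_gaming, n_agents)
```

Over generations the proxy climbs mostly through the gaming trait, so it becomes an increasingly poor approximation of the goal, which is the divergence the authors generalize across domains.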
Global systems and resilience
Earlier collapse of Anthropocene ecosystems driven by multiple faster and noisier drivers
A major concern for the world’s ecosystems is the possibility of collapse, where landscapes and the societies they support change abruptly. Accelerating stress levels, increasing frequencies of extreme events and strengthening intersystem connections suggest that conventional modelling approaches based on incremental changes in a single stress may provide poor estimates of the impact of climate and human activities on ecosystems. We conduct experiments on four models that simulate abrupt changes in the Chilika lagoon fishery, the Easter Island community, forest dieback and lake water quality—representing ecosystems with a range of anthropogenic interactions. Collapses occur sooner under increasing levels of primary stress but additional stresses and/or the inclusion of noise in all four models bring the collapses substantially closer to today by ~38–81%. We discuss the implications for further research and the need for humanity to be vigilant for signs that ecosystems are degrading even more rapidly than previously thought.
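The qualitative effect can be reproduced in a schematic overexploitation model with a fold bifurcation (a sketch in the spirit of the experiments, not one of the paper's four models; the ramp rate, extra stress, and noise amplitude are illustrative): adding a second stress and noise brings the collapse forward in time.

```python
import numpy as np

rng = np.random.default_rng(4)

def collapse_time(extra_stress=0.0, sigma=0.0, dt=0.01, t_max=2000.0):
    """Time at which the ecosystem state x drops below a collapse threshold while the
    primary stress c(t) is slowly ramped up (Euler-Maruyama integration)."""
    x, t = 8.0, 0.0                     # start on the healthy upper branch
    while t < t_max:
        c = 1.0 + 0.001 * t             # slowly ramped primary stress
        drift = x * (1 - x / 10.0) - c * x**2 / (1 + x**2) - extra_stress
        x += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        x = max(x, 0.0)
        t += dt
        if x < 1.0:                     # collapse threshold
            return t
    return np.inf

print("primary stress only:           collapse at t =", round(collapse_time(), 1))
print("+ additional constant stress:  collapse at t =", round(collapse_time(extra_stress=0.2), 1))
print("+ additional stress and noise: collapse at t =",
      round(collapse_time(extra_stress=0.2, sigma=0.15), 1))
```

The second stress lowers the level of primary stress at which the fold is reached, and noise allows escape from the healthy state before the bifurcation itself, so both mechanisms move the collapse closer to the present, mirroring the paper's headline result.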
Origin of life
To Find Life in the Universe, Find the Computation
Life is inherently informational, with a complex code that constantly rewrites itself, resembling a Gaian machine that reshapes the planet. Viewing biology as information reveals connections between matter, energy, and computation, with life being a process of information controlling matter. The Landauer limit, which describes the energy cost of irreversibly changing information, is remarkably adhered to by biology, particularly in processes like RNA translation. By identifying "computational zones" in the universe, we can redefine our search for life and expand the notion of habitable zones. Understanding the complex hierarchies and functions of information in biology is crucial for deciphering life's implementation across the universe. Blurring the boundaries between biology and technology, blended systems like humans may be the most common form of life and potentially the ones best suited to discover other living systems.
Rebuilding the Habitable Zone from the Bottom Up with Computational Zones
Computation, if treated as a set of physical processes that act on information represented by states of matter, encompasses biological systems, digital systems, and other constructs, and may be a fundamental measure of living systems. The opportunity for biological computation, represented in the propagation and selection-driven evolution of information-carrying organic molecular structures, has been partially characterized in terms of planetary habitable zones based on primary conditions such as temperature and the presence of liquid water. A generalization of this concept to computational zones is proposed, with constraints set by three principal characteristics: capacity, energy, and instantiation (or substrate). Computational zones naturally combine traditional habitability factors, including those associated with biological function that incorporate the chemical milieu, constraints on nutrients and free energy, as well as element availability. Two example applications are presented by examining the fundamental thermodynamic work efficiency and Landauer limit of photon-driven biological computation on planetary surfaces and of generalized computation in stellar energy capture structures (a.k.a. Dyson structures). It is shown that computational zones involving nested structures or substellar objects could manifest unique observational signatures as cool far-infrared emitters. While this is an entirely hypothetical example, its simplicity offers a useful, complementary introduction to computational zones.
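A back-of-envelope calculation in the spirit of these two examples (illustrative numbers, not the paper's figures): the Landauer bound k_B T ln 2 converts a power budget into a ceiling on irreversible bit operations per second, and Wien's displacement law gives the waste-heat emission peak of a cool computational structure.

```python
import numpy as np

k_B = 1.380649e-23        # Boltzmann constant, J/K
b_wien = 2.897771955e-3   # Wien's displacement constant, m*K
L_sun = 3.828e26          # solar luminosity, W

def landauer_ops_per_second(power_w, temperature_k):
    """Upper bound on irreversible bit operations per second at the Landauer limit."""
    return power_w / (k_B * temperature_k * np.log(2))

# Photon-driven biological computation on a ~300 K planetary surface, assuming ~0.1% of the
# sunlight intercepted by an Earth-like planet (~1.7e17 W in total) is harnessed.
bio_power = 1e-3 * 1.7e17
print(f"Landauer-limited rate at 300 K: {landauer_ops_per_second(bio_power, 300):.2e} bit ops/s")

# A Dyson-like structure capturing the full solar output and radiating waste heat at an
# assumed 100 K would peak in the far infrared.
T_structure = 100.0
print(f"Landauer-limited rate at {T_structure:.0f} K: "
      f"{landauer_ops_per_second(L_sun, T_structure):.2e} bit ops/s")
print(f"waste-heat emission peak: {b_wien / T_structure * 1e6:.1f} micrometres")
```

The contrast between the two rates, and the tens-of-micrometre emission peak, illustrates why cool, large-area computational structures would stand out as infrared emitters.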