In a previous post, we discussed a recent work analyzing emergence from the perspective of how software works.
In this post, we continue the discussion of this topic from a broader perspective. To this end, I asked some questions of Prof. Hector Zenil, a complexity scientist with long-standing experience studying emergence from the perspective of algorithmic information theory, who developed algorithmic information dynamics with his collaborators.
One of the challenges of defining emergence is that one observer’s prior knowledge may cause a phenomenon to present itself as emergent that to another observer appears reducible. By formalizing the act of observing as mutual perturbations between dynamical systems, we demonstrate that the emergence of algorithmic information does depend on the observer’s formal knowledge, while being robust vis-a-vis other subjective factors. — Abrahão and Zenil
Could you expand on how your software-algorithmic approach, described in this paper published in the Philosophical Transactions of the Royal Society, contrasts with the epsilon-machine approach that F. Rosas et al. posted on arXiv? More specifically, what are the key differences in how these approaches model the concept of emergence?
The main difference between Rosas et al.’s work and ours is that our work is strongly rooted in some of the most fundamental and basic principles of causality methods. There is not a single intrinsically stochastic aspect to our model, and that is probably the main innovation in its treatment of emergence: it rejects a form of strong emergence, and with it the halo of mysticism that has surrounded emergence as an argument against ontological reductionism. Instead, it transfers the burden of the appearance of strong emergence to the limitations of the observer, hence in favour of epistemological reductionism.
In recent follow-up papers (see below), we investigate this further in the context of what happens when lossless statistical compression is used to approximate algorithmic (Kolmogorov) complexity. For example, some coding algorithms may be better at finding causal patterns in data than others. This deficiency is introduced by the subjective choices of whoever designed the heuristics behind a particular implementation. Such subjectivity is only possible because of the uncomputability and undecidability of ultimate compression. When one is forced to go for a fully computable method, one may retrieve ‘exact values’ when, for example, approximating randomness with indexes based upon Shannon entropy, but one loses the subjectivity aspect unless it is injected artificially into the model, just as is done with Shannon entropy by introducing ‘uncertainty’ in the form of not having access to the underlying probability distribution. Our approach is one in which no probability or stochasticity is embedded in the foundational model, so neither plays any fundamental role in defining the explicandum, emergence.
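As a rough illustration of this observer-dependence, here is a minimal Python sketch (an illustrative toy, not taken from the papers mentioned) comparing a Shannon-entropy index with a lossless-compression estimate on two strings with identical symbol statistics; only the compressor notices that one of them is highly structured, and a different compressor would notice slightly different structure.

```python
# Toy comparison: symbol-frequency (Shannon) entropy vs. compressed length.
# zlib is used only as a crude, computable stand-in for algorithmic complexity.
import math
import random
import zlib
from collections import Counter

def entropy_bits_per_symbol(s: str) -> float:
    """Empirical Shannon entropy of the single-symbol frequency distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def compressed_bytes(s: str) -> int:
    """Length of the zlib-compressed string (an imperfect complexity proxy)."""
    return len(zlib.compress(s.encode(), 9))

regular = "01" * 2000                        # highly structured
random.seed(0)
chars = list(regular)
random.shuffle(chars)
shuffled = "".join(chars)                    # same symbol frequencies, no structure

for name, s in [("regular", regular), ("shuffled", shuffled)]:
    print(f"{name:8s} entropy = {entropy_bits_per_symbol(s):.3f} bits/symbol, "
          f"compressed = {compressed_bytes(s)} bytes")
# Both strings have 1 bit/symbol of Shannon entropy, but only the regular one
# compresses well: the compressor detects structure that a symbol-frequency
# index cannot.
```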
In some way, we killed the concept of strong emergence that has obfuscated the field for decades and that keeps coming back in cyclic waves. We think we did this in a formal and fundamental way, utilising what we think is the only theory of causality, algorithmic (Kolmogorov-Chaitin) complexity, which prescribes how to find, and favours, short mechanistic answers that can be written as a computer program. In other words, models that are testable and can be carried out step by step by machine or human, with no other assumption required: the basis of science and the scientific method. This was the basis of the late Prof. Barry Cooper’s idea of explaining emergence in the context of computability.
We interpret this large-scale simplicity as a pattern formation mechanism in which large-scale patterns are forced upon the system by the simplicity of the rules that govern the large-scale dynamics. — Israeli and Goldenfeld
You cited the work of Israeli and Goldenfeld, published nearly 20 years ago. Could you tell me more about how this earlier research connects to, and possibly enhances our understanding of, recent studies on emergence? Surely a commonality is coarse-graining, but what specific aspects or findings from that work do you feel are critical yet currently underrepresented in the work by Rosas et al.?
There are multiple commonalities between the approach of Israeli and Goldenfeld and that of Rosas et al. Israeli and Goldenfeld argued that multiple-scale dynamics can appear independent of each other. They demonstrated this on cellular automata (CA) as a case study, just as Rosas et al. did, but unlike Rosas et al., Israeli and Goldenfeld involved actual finite automata when referring to the Kolmogorov complexity of the larger-scale updating CA rules, with emergence exhibiting a scaling law as a function of ‘coarse-grainedness’. They showed that large-scale coarse-graining afforded predictive capabilities in otherwise fine-grained, random-looking systems (such as ECA rule 30), and that the independence between scales was only apparent. Their causal content was light but fully mechanistic and computational, whereas the causal and computational content in Rosas et al.’s paper appears to me lighter still, despite its title and its coverage as a computational model; it looks to me rather like a statistical approach, with ‘epsilon-machine’ being a misnomer.
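As a rough sketch of the coarse-graining move (an illustrative toy: the majority-vote projection and the compression proxy below are illustrative choices, not the specific constructions used by Israeli and Goldenfeld), one can evolve ECA rule 30, coarse-grain its space-time diagram, and compare how compressible the two descriptions are.

```python
# Toy coarse-graining of an elementary cellular automaton (ECA).
import random
import zlib

def eca_step(row, rule=30):
    """One synchronous update of an ECA with periodic boundary conditions."""
    n = len(row)
    return [(rule >> ((row[(i - 1) % n] << 2) | (row[i] << 1) | row[(i + 1) % n])) & 1
            for i in range(n)]

def evolve(width=99, steps=99, rule=30, seed=0):
    random.seed(seed)
    row = [random.randint(0, 1) for _ in range(width)]
    history = [row]
    for _ in range(steps - 1):
        row = eca_step(row, rule)
        history.append(row)
    return history

def coarse_grain(history, block=3):
    """Majority vote over block x block supercells (an illustrative projection)."""
    coarse = []
    for r in range(0, len(history) - block + 1, block):
        coarse.append([
            1 if sum(history[r + dr][c + dc]
                     for dr in range(block) for dc in range(block)) * 2 > block * block
            else 0
            for c in range(0, len(history[0]) - block + 1, block)])
    return coarse

def compressed_len(history):
    """Compressed size of the flattened space-time diagram (complexity proxy)."""
    return len(zlib.compress(bytes(cell for row in history for cell in row), 9))

fine = evolve()
coarse = coarse_grain(fine)
print("fine-grained   :", compressed_len(fine), "bytes")
print("coarse-grained :", compressed_len(coarse), "bytes")
# The coarse description is much shorter; whether it also remains predictive
# under its own update rule depends on the rule and the projection, which is
# what Israeli and Goldenfeld study systematically.
```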
This is not to say at all that Rosas et al.’s contribution has no merit. Had I not found out about it through the biased lens of scientific reporting, I would have found it more interesting, except for what I think is the unfounded connection to ‘software in the natural world’ or ‘a computational approach’ to hierarchical emergence, which may completely distract the reader. The idea that the approach is computational because it incorporates epsilon machines comes from an unfortunate choice of names, perhaps inspired by a use of actual automata that never was. The original authors of the epsilon machine, whom I hold in great esteem, called their ‘machine’ variables ‘causal states’, their field ‘computational mechanics’, and so on, justifying these choices in part because the variables, according to the authors, had ‘screening-off’ properties essential to the statistical methods of causal inference introduced by Pearl, Spirtes, and others in the 80s and 90s. But these have nothing to do with physical causality, or with causality arising from the mechanics of a computer program, procedure, or software.
You noted that epsilon machines are largely stochastic and thus limited to exploring correlations rather than causation. Could you discuss the potential benefits and limitations of using stochastic models in studying emergent phenomena? How do (or could) these models influence the interpretation of results in the field of complex systems?
Introduced by James Crutchfield and Karl Young in the 80s, epsilon machines may have been motivated by actual machines such as finite automata, but they are rather based on the approaches to causation introduced in the 80s and spearheaded by researchers such as Judea Pearl and Peter Spirtes. Epsilon machines are black-box functions intended to capture causality by way of correlation functions. This is the first interpretation Rosas et al. give when describing the method in their paper. A second interpretation of epsilon machines exists, but it cannot really co-exist with the first: the two are mutually exclusive. That second interpretation comes from the original authors’ intention and has unfortunately been carried forward to this day, reflected across Rosas et al.’s work in descriptions of epsilon machines as some sort of ‘automata’. Had the second interpretation held, the contribution of the paper would be greater; but since the first interpretation is the correct one, the paper is being credited for the wrong reasons.
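To make the statistical character of the construction concrete, here is a minimal sketch of the idea behind causal states (an illustrative toy, not Crutchfield and Young’s actual reconstruction algorithm nor Rosas et al.’s code): histories are grouped purely by their empirical next-symbol distributions, so everything the construction ‘sees’ is correlational.

```python
# Toy causal-state grouping: histories with (approximately) the same empirical
# next-symbol distribution are merged into one predictive state. No mechanism
# is inferred anywhere, only conditional frequencies.
from collections import defaultdict

def causal_state_partition(sequence, k=2, tol=0.05):
    # Count next-symbol occurrences for each length-k history.
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - k):
        counts[tuple(sequence[i:i + k])][sequence[i + k]] += 1
    # Normalise counts to conditional distributions.
    dists = {h: {s: n / sum(c.values()) for s, n in c.items()} for h, c in counts.items()}
    # Greedily merge histories whose distributions agree within tol.
    states = []  # list of (representative distribution, member histories)
    for history, d in dists.items():
        for rep, members in states:
            symbols = set(rep) | set(d)
            if all(abs(rep.get(s, 0) - d.get(s, 0)) <= tol for s in symbols):
                members.append(history)
                break
        else:
            states.append((d, [history]))
    return states

# A period-4 sequence: four distinct histories collapse into two predictive states.
seq = [0, 0, 1, 1] * 200
for dist, members in causal_state_partition(seq):
    print(sorted(members), "->", dist)
```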
In my opinion, there is little to no computational or causal content in an approach to emergence based upon epsilon machines. It is an interesting, systematic approach to investigating levels of abstraction and emergence within the realm of statistical correlation. The most interesting part of the paper for me is its experimental approach to multi-scale dynamics, combining traditional information theory and perturbation analysis.
However, when one takes automata seriously as an explanation for emergent phenomena, we have formally shown, with the principles of algorithmic information theory and based upon one of the most conventional assumptions in science (mechanistic explanation), that the concept of 'strong emergence' is only an artifact. This is also related to the concept of supervenience in epistemology. In the context of the mind, this means that there cannot be two events alike in all physical respects but differing in some ‘mental’ respect, or that an object cannot alter in some mental respect without altering in some physical respect. In other words, there are no hidden, unexplainable causes behind an emergent phenomenon that cannot be explained from other underlying causes. There is nothing that cannot be accounted for, only things that we, as limited observers, cannot account for. There can only be weak emergence, the type of emergence that exists only in the ignorant eyes of the beholder. The problem is, of course, twofold: (1) we cannot ever witness all the multiple scales and causal trajectories causing an apparently emergent phenomenon, and (2) we have historically been terrible at dealing with causation.
The history of science and its practice can be seen as two opposing forces pitted against each other: the discovery of causal explanations against non-causal ones, moving away from randomness and chance, from magic and divine explanations, then from astrology, and, in recent decades, from correlation. The history of science is basically the history of how limited we have been in dealing with causality at any given time, from the development of logic to the discovery of natural physical laws. Traditionally, we have not known how to handle or measure causality, but in the 80s another powerful tool was introduced: perturbation and intervention analysis, together with the concept of counterfactuals. Counterfactuals are artificial assumptions that can be explored either through correlation again (coming back full circle with a marginal gain) or by computer simulation, hence mechanistically. Rosas et al.’s work falls into the former camp but is a step in the right direction.
In this sense, the paper of Rosas et al. has successfully incorporated some of these causal tools, combined with traditional information theory, to investigate the concept of emergence. Our group combined these tools with algorithmic information theory years ago, by way of actual finite automata (approximations to universal Turing machines and computer programs), in several papers spanning the last 10 years published in journals such as Nature Machine Intelligence, Physical Review E, the Philosophical Transactions of the Royal Society, and others, with a book recently published by Springer Nature and another by Cambridge University Press last year. We called the theory and framework Algorithmic Information Dynamics (AID).
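As a rough sketch of the perturbation idea that underlies AID (a toy only: zlib stands in below for the complexity estimators actually used in that line of work, and the object is an arbitrary bit string), one can flip each element of an object, re-estimate its complexity, and rank elements by how much the estimate moves.

```python
# Toy perturbation analysis: single-bit interventions ranked by their effect
# on an (imperfect) complexity estimate.
import zlib

def complexity(bits):
    """Compressed length as a computable, imperfect complexity proxy."""
    return len(zlib.compress(bytes(bits), 9))

def perturbation_profile(bits):
    """Change in estimated complexity caused by flipping each single element."""
    base = complexity(bits)
    profile = []
    for i in range(len(bits)):
        flipped = list(bits)
        flipped[i] ^= 1                      # the intervention / counterfactual
        profile.append(complexity(flipped) - base)
    return profile

# A structured prefix followed by a less regular tail.
obj = [0, 1] * 32 + [1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0] * 4
profile = perturbation_profile(obj)
print("positions whose flip increases the estimate the most:",
      sorted(range(len(profile)), key=lambda i: -profile[i])[:5])
# Flips that raise the estimate point to elements the (approximate) shortest
# description relies on; flips that lower it suggest the element looked like
# noise to the compressor.
```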
To explain how we pushed the problem of emergence fully onto the observer, let me refer to two other papers that we released last year on the preprint server arXiv (https://arxiv.org/abs/2303.16045 and https://arxiv.org/abs/2405.07803). We found that we were able to reconstruct, agnostically, the intended meaning of a message even when the message contained noise or was partially scrambled or deconstructed. This is possible because of compression-algorithm defects and uncomputability. Had we been able to measure algorithmic complexity perfectly with an ultimate compression algorithm, all configurations of the message would have carried the same (algorithmic) information content and we would have been unable to reconstruct it in the first place. So it is the observers’ limitations that disclose the original message intended by another, equally subjective, sender. On both ends we have subjective, faulty systems that can understand each other only because of their own faulty technology and ultimate mathematical limitations. So uncomputability led to the discovery of meaning, and we argue that this is the basis of communication, that it can capture the meaning of meaning, and that it may lead to open-ended evolution.
Further details:
→ Undecidability and Irreducibility Conditions for Open-Ended Evolution and Emergence
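As a rough, much-simplified illustration of the idea described above (an illustrative toy, not the method used in the papers), an imperfect compressor can rank candidate arrangements of a message and single out the intended one as the most regular; as argued above, a perfect, uncomputable estimator would assign essentially the same complexity to all of them and thus offer no such preference.

```python
# Toy illustration: an imperfect compressor ranks arrangements of a message,
# and the original (intended) arrangement stands out as the most compressible
# among the candidates considered.
import random
import zlib

def clen(data: bytes) -> int:
    """Compressed length: the imperfect, observer-dependent complexity proxy."""
    return len(zlib.compress(data, 9))

message = b"the quick brown fox jumps over the lazy dog. " * 4
block = 12
blocks = [message[i:i + block] for i in range(0, len(message), block)]

random.seed(0)
candidates = [message]                       # the intended arrangement...
for _ in range(20):                          # ...plus 20 scrambled versions
    shuffled = blocks[:]
    random.shuffle(shuffled)
    candidates.append(b"".join(shuffled))

ranked = sorted(candidates, key=clen)
print("most compressible candidate is the original:", ranked[0] == message)
print("compressed sizes:", sorted(clen(c) for c in candidates))
# The compressor, imperfect as it is, singles out the intended arrangement as
# the most regular one among the candidates, whereas to an ultimate estimator
# these rearrangements would carry essentially the same information content.
```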
What are some other key papers or researchers that you believe should be included in a discussion on the algorithmic and software-based approaches to emergence?
Rosas et al.’s paper does a decent job of citing previous work on classical information theory and emergence, but when it comes to combining algorithmic, computational, and software approaches to causality and emergence, I have mentioned a few works above, including Cooper’s, Israeli and Goldenfeld’s, and our own, which I think have helped move the field forward.
What I often find unfortunate is that science writing has become a sort of scientific tabloid journalism, trying to find the best commercial angle for a piece of scientific work in order to grab people’s attention. Some scientific outlets and writers have become too powerful in deciding what and how the public should consume science, with science media outlets simply replicating articles and amplifying a signal that may send the wrong message, and now even science influencers producing videos that recycle already-hyped original sources. This should not be scientific journalism’s answer to the post-truth, socially divisive political era we live in today; instead, journalism should balance and counteract it, working towards greater objectivity, balance, and accuracy. The journalistic job on a topic should be to provide nuance, context, and genuinely opposing views.
This deformation of reality through the lens of scientific media has happened before in other fields, such as ‘assembly theory’ and ‘integrated information theory’. The latter, for example, has merits, but they were buried under a pile of hype that came in waves, with media coverage speaking of a proven explanation of consciousness for more than a decade, until a breaking point forced a group of academics to formally oppose, and probably overcompensate against, Integrated Information Theory.
I thank Hector for taking the time to answer my questions and for providing a broader perspective on the topic, which is a genuine way to advance science. I will be glad to host other views from our readers and to follow up with new posts if needed.
Appendix
Meanwhile, I think that Hector has also highlighted a crucial issue in the field of scientific journalism, which plays a significant role in the advancement of science in general, not just complexity science.
Recently, there has been a noticeable trend toward eye-catching headlines and simplified narratives that prioritize reader engagement over comprehensive coverage. This shift is often driven by a limited group of writers who determine what is newsworthy and how it is presented, potentially hindering scientific progress.
This trend could partly be attributed to the increasing commercialization of scientific journalism, where economic pressures on editors and writers may compromise the in-depth, iterative nature of scientific discovery. The essence of scientific advancement—characterized by meticulous (and sometimes never-ending) discussion, hypothesis testing, and validation—is at odds with the rapid pace and sometimes superficial coverage driven by business interests, which may align more closely with the priorities of funders or stakeholders than with those of the scientific community.
Furthermore, the competitive pressure to secure prestige, funding, and attention can eat into the valuable time scientists need to engage with their peers, potentially affecting the quality of published research. This concern is further illustrated by the significant increase in paper retractions in 2023, which highlights a need for reflection (source).
I am sensitive to this topic: stay tuned for a dedicated post about it in the near future.