The New Math of How Large-Scale Order Emerges
By Philip Ball. From Quanta Magazine, 10/06/2024
The puzzle of emergence asks how regularities emerge on macro scales
out of uncountable constituent parts. A new framework has researchers
hopeful that a solution is near.
A few centuries ago, the swirling polychromatic chaos of Jupiter’s
atmosphere spawned the immense vortex that we call the Great Red Spot.
From the frantic firing of billions of neurons in your brain comes your
unique and coherent experience of reading these words.
As pedestrians each try to weave their path on a crowded sidewalk, they
begin to follow one another, forming streams that no one ordained or
consciously chose.
The world is full of such emergent phenomena: large-scale patterns and
organization arising from innumerable interactions between component parts.
And yet there is no agreed scientific theory to explain emergence. Loosely,
the behavior of a complex system might be considered emergent if it can’t be
predicted from the properties of the parts alone. But when will such
large-scale structures and patterns arise, and what’s the criterion for when
a phenomenon is emergent and when it isn’t? Confusion has reigned. “It’s
just a muddle,” said Jim Crutchfield, a physicist at the University of
California, Davis.
“Philosophers have long been arguing about emergence, and going round in
circles,” said Anil Seth, a neuroscientist at the University of Sussex in
England. The problem, according to Seth, is that we haven’t had the right
tools — “not only the tools for analysis, but the tools for thinking. Having
measures and theories of emergence would not only be something we can throw
at data but would also be tools that can help us think about these systems
in a richer way.”
Though the problem remains unsolved, over the past few years, a community
of physicists, computer scientists and neuroscientists has been working
toward a better understanding. These researchers have developed theoretical
tools for identifying when emergence has occurred. And in February, Fernando
Rosas, a complex systems scientist at Sussex, together with Seth and five
coauthors, went further, with a framework for understanding how emergence
arises.
A complex system exhibits emergence, according to the new framework, by
organizing itself into a hierarchy of levels that each operate independently
of the details of the lower levels. The researchers suggest we think about
emergence as a kind of “software in the natural world.” Just as the software
of your laptop runs without having to keep track of all the microscale
information about the electrons in the computer circuitry, so emergent
phenomena are governed by macroscale rules that seem self-contained, without
heed to what the component parts are doing.
Using a mathematical formalism called computational mechanics, the
researchers identified criteria for determining which systems have this kind
of hierarchical structure. They tested these criteria on several model
systems known to display emergenttype phenomena, including neural networks
and Game of Life-style cellular automata. Indeed, the degrees of freedom, or
independent variables, that capture the behavior of these systems at
microscopic and macroscopic scales have precisely the relationship that the
theory predicts.
No new matter or energy appears at the macroscopic level in emergent
systems that isn’t there microscopically, of course. Rather, emergent
phenomena, from Great Red Spots to conscious thoughts, demand a new language
for describing the system. “What these authors have done is to try to
formalize that,” said Chris Adami, a complex-systems researcher at Michigan
State University. “I fully applaud this idea of making things
mathematical.”
A Need for Closure
Rosas came at the topic of emergence from multiple directions. His father
was a famous conductor in Chile, where Rosas first studied and played music.
“I grew up in concert halls,” he said. Then he switched to philosophy,
followed by a degree in pure mathematics, giving him “an overdose of
abstractions” that he “cured” with a Ph.D. in electrical engineering.
A few years ago, Rosas started thinking about the vexed question of whether
the brain is a computer. Consider what goes on in your laptop. The software
generates predictable and repeatable outputs for a given set of inputs. But
if you look at the actual physics of the system, the electrons won’t all
follow identical trajectories each time. “It’s a mess,” said Rosas. “It’ll
never be exactly the same.”
The software seems to be “closed,” in the sense that it doesn’t depend on
the detailed physics of the microelectronic hardware. The brain behaves
somewhat like this too: There’s a consistency to our behaviors even though
the neural activity is never identical in any circumstance.
Rosas and colleagues figured that in fact there are three different types
of closure involved in emergent systems. Would the output of your laptop be
any more predictable if you invested lots of time and energy in collecting
information about all the microstates — electron energies and so forth — in
the system? Generally, no. This corresponds to the case of informational
closure: As Rosas put it, “All the details below the macro are not helpful
for predicting the macro.”
What if you want not just to predict but to control the system — does the
lower-level information help there? Again, typically no: Interventions we
make at the macro level, such as changing the software code by typing on the
keyboard, are not made more reliable by trying to alter individual electron
trajectories. If the lower-level information adds no further control of
macro outcomes, the macro level is causally closed: It alone is causing its
own future.
This situation is rather common. Consider, for instance, that we can use
macroscopic variables like pressure and viscosity to talk about (and
control) fluid flow, and knowing the positions and trajectories of
individual molecules doesn’t add useful information for those purposes. And
we can describe the market economy by considering companies as single
entities, ignoring any details about the individuals that constitute
them.
The existence of a useful coarse-grained description doesn’t, however, by
itself define an emergent phenomenon, said Seth. “You want to say something
else in terms of the relationship between levels.” Enter the third level of
closure that Rosas and colleagues think is needed to complete the conceptual
apparatus: computational closure. For this they have turned to computational
mechanics, a discipline pioneered by Crutchfield.
Crutchfield introduced a conceptual device called the ε (epsilon) machine.
This device can exist in some finite set of states and can predict its own
future state on the basis of its current one. It’s a bit like an elevator,
said Rosas; an input to the machine, like pressing a button, will cause the
machine to transition to a different state (floor) in a deterministic way
that depends on its past history — namely, its current floor, whether it’s
going up or down and which other buttons were pressed already. Of course an
elevator has myriad component parts, but you don’t need to think about them.
Likewise, an ε-machine is an optimal way to represent how unspecified
interactions between component parts “compute” — or, one might say, cause —
the machine’s future state.
Computational mechanics allows the web of interactions between a complex
system’s components to be reduced to the simplest description, called its
causal state. The state of the complex system at any moment, which includes
information about its past states, produces a distribution of possible
future states. Whenever two or more such present states have the same
distribution of possible futures, they are said to be in the same causal
state. Our brains will never twice have exactly the same firing pattern of
neurons, but there are plenty of circumstances where nevertheless we’ll end
up doing the same thing.
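The grouping of histories into causal states can be sketched in a few lines of Python. The toy binary process below and its hand-specified next-symbol probabilities are invented for illustration, not drawn from the paper:

```python
# Minimal sketch: two histories belong to the same causal state when
# they predict the same distribution over the next symbol.
from collections import defaultdict

# Hand-specified conditional distributions P(next symbol | history)
# for a hypothetical binary process (the numbers are assumptions).
future_dist = {
    "00": (0.5, 0.5),
    "01": (1.0, 0.0),
    "10": (0.5, 0.5),
    "11": (1.0, 0.0),
}

def causal_states(dists):
    """Partition histories: same future distribution -> same causal state."""
    states = defaultdict(list)
    for history, dist in dists.items():
        states[dist].append(history)
    return list(states.values())

print(causal_states(future_dist))
# Histories "00" and "10" merge into one causal state, "01" and "11" into another.
```

The partition is the analogue of the never-identical neural firing patterns that nevertheless lead to the same behavior: distinct micro histories collapse into one predictive state.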
Rosas and colleagues considered a generic complex system as a set of
ε-machines working at different scales. One of these might, say, represent
all the molecularscale ions, ion channels and so forth that produce
currents in our neurons; another represents the firing patterns of the
neurons themselves; another, the activity seen in compartments of the brain
such as the hippocampus and frontal cortex. The system (here the brain)
evolves at all those levels, and in general the relationship between these
ε-machines is complicated. But for an emergent system that is
computationally closed, the machines at each level can be constructed by
coarse-graining the components on just the level below: They are, in the
researchers’ terminology, “strongly lumpable.” We might, for example,
imagine lumping all the dynamics of the ions and neurotransmitters moving in
and out of a neuron into a representation of whether the neuron fires or
not. In principle, one could imagine all kinds of different “lumpings” of
this sort, but the system is only computationally closed if the ε-machines
that represent them are coarse-grained versions of each other in this way.
“There is a nestedness” to the structure, Rosas said.
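“Lumpability” has a classical counterpart for Markov chains: a partition of micro states yields Markov macro dynamics when every state in a block has the same total probability of jumping into each block. A minimal check of that condition, using an invented 4-state chain (not an example from the paper):

```python
# Hedged sketch: checking strong lumpability of a Markov chain with
# respect to a coarse-graining. The chain and partition are made up.
import numpy as np

# Transition matrix of a 4-state micro chain (rows sum to 1).
P = np.array([
    [0.1, 0.4, 0.3, 0.2],
    [0.4, 0.1, 0.2, 0.3],
    [0.2, 0.3, 0.1, 0.4],
    [0.3, 0.2, 0.4, 0.1],
])
partition = [[0, 1], [2, 3]]  # macro states A = {0, 1}, B = {2, 3}

def strongly_lumpable(P, partition):
    """True if every state in a block has the same probability of
    jumping into each block -- then the macro dynamics are Markov."""
    for block in partition:
        for target in partition:
            probs = [P[i, target].sum() for i in block]
            if not np.allclose(probs, probs[0]):
                return False
    return True

print(strongly_lumpable(P, partition))  # True for this chain
```

If the check fails, the coarse-grained process still exists, but its future depends on which micro state inside a block the system occupies, so the macro level is not closed.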
A highly compressed description of the system then emerges at the macro
level that captures those dynamics of the micro level that matter to the
macroscale behavior — filtered, as it were, through the nested web of
intermediate ε-machines. In that case, the behavior of the macro level can
be predicted as fully as possible using only macroscale information — there
is no need to refer to finer-scale information. It is, in other words, fully
emergent. The key characteristic of this emergence, the researchers say, is
this hierarchical structure of “strongly lumpable causal states.”
Leaky Emergence
The researchers tested their ideas by seeing what they reveal about a range
of emergent behaviors in some model systems. One is a version of a random
walk, where some agent wanders around haphazardly in a network that could
represent, for example, the streets of a city. A city often exhibits a
hierarchy of scales, with densely connected streets within neighborhoods and
much more sparsely connected streets between neighborhoods. The researchers
find that the outcome of a random walk through such a network is highly
lumpable. That is, the probability of the wanderer starting in neighborhood
A and ending up in neighborhood B — the macroscale behavior — remains the
same regardless of which streets within A or B the walker randomly
traverses.
The researchers also considered artificial neural networks like those used
in machine-learning and artificial-intelligence algorithms. Some of these
networks organize themselves into states that can reliably identify
macroscopic patterns in data regardless of microscopic differences between
the states of individual neurons in the network. The decision of which
pattern will be output by the network “works at a higher level,” said
Rosas.
Would Rosas’ scheme help to understand the emergence of robust, large-scale
structure in a case like Jupiter’s Great Red Spot? The huge vortex “might
satisfy computational closure,” Rosas said, “but we’d need to do a proper
analysis before being able to claim anything.”
As for living organisms, they seem sometimes to be emergent but sometimes
more “vertically integrated,” where microscopic changes do influence
large-scale behavior. Consider, for example, a heart. Despite considerable
variations in the details of which genes are being expressed, and how much,
or what the concentrations of proteins are from place to place, all of our
heart muscle cells seem to work in essentially the same way, enabling them
to function en masse as a pump driven by coherent, macroscopic electrical
pulses passing through the tissue. But it’s not always this way. While many
of our genes carry mutations that make no difference to our health,
sometimes a mutation — just one genetic “letter” in a DNA sequence that is
“wrong” — can be catastrophic. So the independence of the macro from the
micro is not complete: There is some leakage between levels. Rosas wonders
if living organisms are in fact optimized by allowing for such “leaky”
partial emergence — because in life, sometimes it is essential for the macro
to heed the details of the micro.
Emergent Causes
Rosas’ framework could help complex systems researchers see when they can
and can’t hope to develop predictive coarse-grained models. When a system
meets the key requirement of being computationally closed, “you don’t lose
any faithfulness by simulating the upper levels and neglecting the lower
levels,” he said. But ultimately Rosas hopes an approach like his might
answer some deep questions about the structure of the universe — why, for
example, life seems to exist only at scales intermediate between the atomic
and the galactic.
The framework also has implications for understanding the tricky question
of cause and effect in complex and emergent systems. Traditionally,
causation has been assumed to flow from the bottom up: Our choices and
actions, for example, are ultimately attributed to those firing patterns of
our neurons, which in turn are caused by flows of ions across cell
membranes.
But in an emergent system, this is not necessarily so; causation can
operate at a higher level independently of lower-level details. Rosas’ new
computational framework seems to capture this aspect of emergence, which was
also explored in earlier work. In 2013, neuroscientist Giulio Tononi of the
University of Wisconsin, Madison, working with Erik Hoel and Larissa
Albantakis (also at Wisconsin), claimed that, according to a particular
measure of causal influence called effective information, the overall
behavior of some complex systems is caused more at the higher than the lower
levels. This is called causal emergence.
The 2013 work using effective information could have been just a quirk of
measuring causal influence this way. But recently, Hoel and neuroscientist
Renzo Comolatti have shown that it is not. They took 12 different measures
of causal power proposed in the literature and found that with all of them,
some complex systems show causal emergence. “It doesn’t matter what measure
of causation you pick,” Hoel said. “We just went out into the literature and
picked other people’s definitions of causation, and all of them showed
causal emergence.” It would be bizarre if this were some chance quirk of all
those different measures.
For Hoel, emergent systems are ones whose macroscale behavior has some
immunity to randomness or noise at the microscale. For many complex systems,
there’s a good chance you can find coarse-grained, macroscopic descriptions
that minimize that noise. “It’s that minimization that lies at the heart of
a good notion of emergence,” he said.
Tononi says that, while his approach and that of Rosas and colleagues
address the same kinds of systems, they have somewhat different criteria for
causal emergence. “They define emergence as being when the macro system can
predict itself as much as it can be predicted from the micro level,” he
said. “But we require more causal information at the macro level than at the
micro level.”
The new ideas touch on the issue of free will. While hardened reductionists
have argued that there can be no free will because all causation ultimately
arises from interactions of atoms and molecules, free will may be rescued by
the formalism of higherlevel causation. If the main cause of our actions is
not our molecules but the emergent mental states that encode memories,
intentions, beliefs and so forth, isn’t that enough for a meaningful notion
of free will? The new work shows that “there are sensible ways to think
about macro-level causation that explain how agents can have a worthwhile
form of causal efficacy,” Seth said.
Still, there remains disagreement among researchers about whether
macroscopic, agent-level causation can emerge in complex systems. “I’m
uncomfortable with this idea that the macroscale can drive the microscale,”
said Adami. “The macroscale is just degrees of freedom that you’ve
invented.” This is the sort of issue that the scheme proposed by Rosas and
colleagues might help to resolve, by burrowing into the mechanics of how
different levels of the system speak to one another, and how this
conversation must be structured to achieve independence of the macro from
the details of the levels below.
At this point, some of the arguments are pretty fuzzy. But Crutchfield is
optimistic. “We’ll have this figured out in five or 10 years,” he said. “I
really think the pieces are there.”
Resources
Software in the natural world: A computational approach to hierarchical emergence
by Fernando E. Rosas, Bernhard C. Geiger, Andrea I. Luppi, Anil K. Seth, Daniel Polani, Michael Gastpar, and Pedro A.M. Mediano
Summary
Understanding the functional architecture of complex systems is crucial
to illuminate their inner workings and enable effective methods for their
prediction and control. Recent advances have introduced tools to
characterise emergent macroscopic levels; however, while these
approaches are successful in identifying when emergence takes
place, they are limited in the extent to which they can determine how it
does. Here we address this important limitation by developing a
computational approach to emergence, which characterises macroscopic processes in terms
of their computational capabilities. Concretely, we articulate a view on
emergence based on how software works, which is rooted in a mathematical
formalisation of how macroscopic processes can express self-contained
informational, interventional, and computational properties. This
framework reveals a hierarchy of nested self-contained processes that
determines what computations take place at what level, which in turn
delineates the functional architecture of a complex system. This approach
is illustrated on paradigmatic models from the statistical physics and
computational neuroscience literature, which are shown to exhibit
macroscopic processes that are akin to software in human-engineered
systems. Overall, this framework enables a deeper understanding of the multilevel
structure of complex systems, revealing specific ways in which they
can be efficiently simulated, predicted, and controlled.
FIG. 1. Illustration of causal states. Causal states are sets of
trajectories which bear equal predictions for the future evolution of the
system, as defined by the equivalence relationship in Eq.
FIG. 2. The two faces of ϵ-machines. Illustration of the dual interpretation
of ϵ-machines that establishes a bridge between causality and computation. a)
Causal face: View of ϵ-machines as the effective mechanism driving the
system, acting ‘behind the scenes’ to generate observable data (a1).
Technically, this corresponds to interpreting it as a hidden Markov process
— i.e., dynamics that take place on variables Et in a latent state-space,
while generating the observable data Xt (a2). b) Computational face:
Alternative view of ϵ-machines as discrete automata, where the data
corresponds to inputs given by a user driving the system between different
states (b1). Technically, this corresponds to seeing it as a discrete
automaton with states ek, whose deterministic transitions are governed by the
input data xi (b2). Note that (a1) focuses on variables (e.g. Xt, Et), while
(b2) portrays the states that those variables can take (e.g. x0, e0). Fig.
(a1) is adapted from Ref. [39].
FIG. 3. The various machines associated with a macroscopic process. Diagram
of the relationship between the different machines associated with a
macroscopic process Z and its corresponding microscopic process X. The
ϵ-machines with causal states Et and E′t correspond to the optimal
prediction of the future of X and Z, respectively, using data from the same
level. In contrast, the υ-machine with causal states Ut provides optimal
prediction of the future of Z using data from X, hence using the minimal
amount of micro information for optimally predicting the future of the
macro.
FIG. 4. Example of computational closure. Illustration where micro causal
states are shown as small golden nodes and macro causal states are
represented as big pale-yellow nodes. Transitions of micro causal states are
represented as simple arrows responding to three possible inputs: two inputs
denoted by a and b (not shown) trigger transitions within the same macro
state, and one input denoted by c (not shown) triggers a transition to a new
macro state. The coarse-graining f(a) = f(b) = 0 and f(c) = 1 generates
deterministic dynamics for the macro states, represented by double arrows,
where 0 makes the state remain and 1 triggers a transition to the next
state.
FIG. 5. Multilevel analysis via ϵ-machines. a) Optimal automata can be built
at different levels of coarse-graining of observed data. Each automaton
accounts for the resulting patterns taking place at that scale. b) If the
considered levels of description are computationally closed, then the
automata of higher levels are coarse-grainings of those of the levels below.
This process of coarse-graining of machines reveals the computations taking
place at each of those levels.
FIG. 6. The multiple hierarchies describing multilevel computations in a complex
system. Left: Lattice of all possible coarse-grainings, here illustrated for the
case of a process that can take five possible values. Center: Sublattice of
only those coarse-grainings that are causally/informationally closed. Right:
Lattice of strongly-lumpable coarse-grainings of the ϵ-machine of the
microscopic level. Only the last lattice provides a minimal blueprint that
highlights the distinct computational processes and distinguishes which
computations take place at what level.
FIG. 7. Possible computational architectures of an emergent macroscopic level.
Our theory shows that the computations carried out by a causally closed process
Z with respect to a microscopic process X and the trivial coarse-graining 1 can
be categorised within four groups, illustrated here. The computations are the
same as the ones at the microscale if the ϵ-machines of X and Z are equivalent
(as in b and d), and are trivial if the ϵ-machines of Z and 1 are equivalent (as
in c and d). At the left of each subplot is the lattice of coarse-grainings in
real space, which is the same for the four cases; at the right is the lattice of
corresponding ϵ-machines in theory space, which better illustrates the effective
computational structure of the system.
FIG. 8. Conserved quantities in elementary cellular automata. Illustration of
the computations associated with different types of conserved quantities. a)
Rule 60 forces configurations to have even parity. Hence, the parity is a
conserved quantity which is computationally trivial, akin to case (c) in
Figure 7. b) In contrast, rule 150 keeps the parity of the initial condition.
Hence, while the parity is also a conserved quantity for these dynamics, the
computations associated with it are nontrivial, akin to case (a) in Figure 7.
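The parity conservation of rule 150 is easy to verify numerically. This sketch assumes periodic boundaries and a random initial condition (choices of this demo, not stated in the caption); since rule 150 sets each cell to the XOR of its three-cell neighborhood, the new total parity is 3 times the old one mod 2, i.e. unchanged.

```python
# Verify that elementary cellular automaton rule 150
# (next cell = left XOR self XOR right) conserves parity.
import random

random.seed(1)

def step_rule150(cells):
    """One synchronous update of rule 150 with periodic boundaries."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[i] ^ cells[(i + 1) % n]
            for i in range(n)]

cells = [random.randint(0, 1) for _ in range(16)]
initial_parity = sum(cells) % 2
for _ in range(50):
    cells = step_rule150(cells)
    assert sum(cells) % 2 == initial_parity  # parity never changes

print("parity conserved:", initial_parity)
```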
FIG. 9. Ehrenfest diffusion model. a) The model considers particles contained in
two connected chambers. The microscopic description of the system (Xt) is a
binary vector that specifies which container each particle is in, while the
macroscopic description (Zt) is the number of particles in the left chamber. b)
Illustration of the finite-state-machine description of the ϵ-machine
corresponding to the macroscopic variable. c) One realisation of the dynamics of
the macroscopic process for a system of n = 40 particles, which naturally
oscillates around n/2.
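A few lines suffice to simulate this model and confirm that the macroscopic count hovers around n/2. The update rule (one uniformly chosen particle switches chamber per step) is the standard Ehrenfest dynamics; the simulation length is an arbitrary choice of this sketch.

```python
# Ehrenfest model: n particles in two chambers; each step, one
# randomly chosen particle moves to the other chamber. The macro
# variable Zt (count in the left chamber) relaxes toward n/2.
import random

random.seed(2)
n, steps = 40, 20_000
left = [True] * n          # micro state: which chamber holds each particle
z_trace = []

for _ in range(steps):
    i = random.randrange(n)  # pick a particle at random...
    left[i] = not left[i]    # ...and move it to the other chamber
    z_trace.append(sum(left))

# Long-run average of Z is close to n/2 = 20.
avg = sum(z_trace) / steps
print(round(avg, 1))
```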
FIG. 10. Energy dynamics of an Ising model are causally closed. When
considering the Ising model under Glauber dynamics, it can be shown that its
energy is a macroscopic variable whose dynamics are causally — and hence also
computationally — closed.
FIG. 11. Causally closed coarse-grainings of a random walk over a
network. A random walk on a modular network can be coarse-grained such that
the dynamics over the modules’ labels are causally closed. Furthermore,
considering equivalence classes of modules given by their size provides a
further causally closed macroscopic process.
FIG. 12. Hopfield networks compute memory retrieval on a causally closed
macroscopic level. The state of a Hopfield network is determined by the activity
of each of the involved neurons, here represented as a square grid. Nonetheless,
the similarity between the present configuration and each of the patterns the
network stores (denoted by Z^µ_t, with µ ∈ {1, 2, 3, 4, 5} in the figure)
determines which stored pattern the current configuration is closest to. Our
results show that Zt = (Z^1_t, . . . , Z^5_t) is a causally closed
coarse-graining of the neural system, which critically determines the
memory retrieval process.
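A minimal Hopfield sketch of this idea, with invented sizes (64 neurons and 3 stored patterns rather than the figure's 5): the per-pattern overlaps play the role of the macroscopic variables, and retrieval is determined by them rather than by which individual neurons were perturbed.

```python
# Hopfield network with Hebbian weights; the overlaps between the
# current configuration and each stored pattern are the macro
# variables that govern retrieval. Sizes and patterns are invented.
import numpy as np

rng = np.random.default_rng(3)
N, P = 64, 3
patterns = rng.choice([-1, 1], size=(P, N))      # stored memories
W = (patterns.T @ patterns) / N                  # Hebbian weight matrix
np.fill_diagonal(W, 0)

def retrieve(state, sweeps=10):
    """Asynchronous updates until the network settles near a memory."""
    state = state.copy()
    for _ in range(sweeps):
        for i in rng.permutation(N):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

# Perturb pattern 0 by flipping 10 neurons (a different micro state).
noisy = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)
noisy[flip] *= -1

final = retrieve(noisy)
overlaps = patterns @ final / N   # macro variables, one per pattern
print(overlaps)                   # at this low load, pattern 0 typically dominates
```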
FIG. 13. Diagram illustrating the relationships between closure and
lumpability of Markov chains. Informational/causal closure imply computational
closure (Theorem 2). Within the space of Markov chains X, strong lumpability of X
implies informational closure (Proposition 4). The same does not hold for weak
lumpability and computational closure: If X is weakly lumpable, the same
does not need to hold for E due to the minimality property of ε-machines. The
diagram refers to (counter)examples in the text. Indeed, Example 1 is strongly
lumpable, while Counterexample 4 is weakly lumpable.
Mike Notes

I can use this in the Machine Learning part of the Pipi 9 core.