Emergence and Evolution - Constraints on Form

by Chris Lucas

"The view of evolution as chronic bloody competition among individuals and species, a popular distortion of Darwin's notion of 'survival of the fittest,' dissolves before a new view of continual cooperation, strong interaction, and mutual dependence among life forms. Life did not take over the globe by combat, but by networking."
Lynn Margulis and Dorion Sagan, Slanted Truths, 1997

"The emergent qualities that are expressed in biological form are directly linked to the nature of organisms as integrated wholes; these can be studied experimentally and simulated by the use of complex non-linear models."
Brian Goodwin, How the Leopard Changed its Spots, 1994, Ch 7

Introduction

Organization is a common feature of the world around us, so it is natural to consider how these structures arise. Since Darwin it has been assumed that 'Natural Selection' is an adequate mechanism to explain all the non-physics features of life. Yet this approach neglects both the self-organizing aspects of physical systems and the goal driven behaviour (teleology) of higher organisms.

Recent work on cellular structures shows that chemical systems often organize themselves into complex forms, examples include viruses, protein folding and microtubules. Form can thus be an integral aspect of a system and not 'selected for' by external forces. Additionally, organisms make choices, they do not behave passively under environmental selection but act to change those forces. Taking account of these influences leads us to take a rather more coevolutionary approach to natural selection, incorporating insights from complex systems science.

Selective Forces

It is well known in physics that gravity and electromagnetic forces both can act from outside a system (fields) and within (interconnections). To some extent this is just a difference in our viewpoint, whether we include the 'source' in our system or not - if the source is much larger than the other components then it makes sense to treat it as an external invariant, simplifying the equations (the effect of the 'system' on the source is then considered negligible).

This thinking pervades evolution also, with the view that 'selection' is an external force, a fixed 'fitness function' shaping the organism. In some cases this seems to be a reasonable viewpoint, for example if a predator (the external force) can more easily catch slow prey then there will be survival advantages in evolving general speed, defence or disguise adaptations. Yet even here it is clear that the actual 'adaptation' isn't a function of the selective force, that only has the effect of choosing between the available options - in this case between slow, fast, dangerous or disguised 'prey systems', which must each arise via alternative methods. It should be noted that selection does not select 'for' a characteristic, only 'against' a disadvantageous one, all 'successful' adaptations avoid selection; as do any good, bad or indifferent changes that don't have 'selective' relevance to any particular 'culling' process. What is passed on to the next generation is the overall package, warts and all.
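The point that selection only culls can be made concrete with a minimal sketch. The `Prey` class, the `stripes` trait and all the numbers are hypothetical, chosen purely for illustration: the predator removes slow prey, and every other property of the survivors passes through untouched.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Prey:
    speed: float    # the only property the predator "sees"
    stripes: bool   # hypothetical trait with no relevance to this cull

def cull(population, catch_speed):
    """Remove prey slower than catch_speed; nothing is positively chosen."""
    return [p for p in population if p.speed >= catch_speed]

population = [
    Prey(2.0, True), Prey(3.5, False), Prey(5.0, True),
    Prey(6.5, False), Prey(8.0, True),
]
survivors = cull(population, catch_speed=4.0)

for p in survivors:
    print(p)
```

Both striped and unstriped animals survive, because the cull acts against slowness only and is blind to everything else in the 'package'.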

Model Simplifications

We need to be clear before discussing these issues just what generalisations are being made. In many treatments, particularly in Artificial Life, it is assumed that there are populations of organisms with the same genotype, that the fitness measure applies equally to all organisms regardless of environmental location, that it is constant with time, that it is independent of other traits, that it is independent of population size and so on. None of these simplifications will be true in 'real' systems, other than in restricted circumstances, so we must be wary of excessive extrapolations of theoretical results.

Phenotypic properties of an organism are not directly visible at conception, they emerge over the course of development, in a way not yet understood but said to relate mainly to DNA driven processes. These are shaped to some extent by environmental interactions, but the extent of this on the morphology seems limited in practice. Yet we must recognise that DNA itself does not 'drive' anything. It is a passive molecule, an archive, containing (replicated) information on how to create proteins (genes - ingredients) or sets of proteins (chromosomes - shopping lists). There are no structural specifications present, no organizational information - that has to arise by self-organization along with the other cell components (about 500 metabolic processes, involving 10,000 proteins, occur in each cell).

Explaining Form

"For my purposes a genetic replicator is defined by reference to its alleles, but this is not a weakness of the concept. Or, if it is deemed to be a weakness, it is a weakness that afflicts the whole science of population genetics, not just the particular idea of genetic units of selection. It is a fundamental truth, though not always realized, that whenever a geneticist studies a gene 'for' any phenotypic character, he is always referring to a difference between two alleles... The genes that exist today reflect the set of environments that they have experienced in the past. This includes the internal environments provided by the bodies the genes have inhabited, and also external environments, desert, forest, seashore, predators, parasites, social companions, etc.
... when we are talking about development it is appropriate to emphasize non-genetic as well as genetic factors."
Richard Dawkins, The Extended Phenotype, 1999, Ch. 5 & 6

We thus need to explain the origin of the various forms that can occur, these 'emergent' properties. The neo-Darwinian viewpoint is that random variations in the genes can explain this. Unfortunately this explanation seems inadequate since it assumes that for any feature of the body (phenotype) a matching variation in the DNA (genotype) can be found, in other words there is a linear or near-linear correspondence between gene and morphology - a continuum of possible expression, so that all intermediate variations will be possible (the environment also affects the parameters of course). But genes themselves are not 'optimized' by selection, only the phenotype is affected directly. So any functional equivalence (isomorphism) at genetic level will be just as valuable to the system at the phenotypic level (as will any other equivalent way of achieving the same end). In other words two different sets of genes (or alleles) which give rise to the same selected 'trait', will have no selective forces acting relative to each other. Is this likely ?

The genetic makeup of our DNA relies on codons, units of 3 base pairs coding for amino acids. There are 61 codons possible (plus 3 'punctuation' codons), yet only 20 amino acids are produced. Thus given a gene, coding for a small protein comprising 100 amino acids, we could have (on average) 3 different codons for the same amino acid, and 5 x 10⁴⁷ different genetic arrangements coding for the same protein ! Given that many proteins will have similar selective or functional effects (due to similar shape - their folding characteristics), we can see that genetic variation, in itself, does not guarantee any phenotypic variation on which selection may then act, there may be long periods of stasis when nothing happens (exploration of what are called neutral networks), before a mutation occurs that actually causes a phenotypic effect.
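The arithmetic behind this estimate can be checked with a short sketch. It assumes, as the text does, a flat average of 3 synonymous codons per amino acid (real codon usage is uneven); the function name is illustrative only.

```python
CODING_CODONS = 61   # codons that specify an amino acid
AMINO_ACIDS = 20

def synonymous_sequences(protein_length, codons_per_amino_acid=3):
    """Count distinct codon sequences that encode the same protein,
    assuming every position has the same number of synonymous codons."""
    return codons_per_amino_acid ** protein_length

average_redundancy = CODING_CODONS / AMINO_ACIDS
count = synonymous_sequences(100)

print(f"{average_redundancy:.2f} codons per amino acid on average")
print(f"{count:.2e} synonymous sequences for a 100 amino acid protein")
```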

Trait Constraints

When we look at actual organisms we see that features rarely occur in continuous variants, there are discontinuities between species - the intermediate forms are never found. Traditional views dismiss this problem as the result of 'selection', yet it seems clear that many features are difficult to relate to any type of 'selective advantage' without highly implausible 'just so' stories. Ranges of features (e.g. number of legs) are highly constrained, some patterns (e.g. odd numbers of legs) seem never to occur, or to do so only in restricted phyla (e.g. starfish). This implies restrictions on the search space possible, in other words evolution is not entirely 'random', it has biases. Previous history has 'locked-in' features which are now difficult to escape, i.e. developmental constraints seem to prevent certain genetically driven variations becoming viable.

Epistasis

There is however no single gene for any phenotypic 'trait', any more than one single part runs a car. An expressed gene produces only a single protein, which must then interact with many more cell components before any effect is seen. A single mutation, of course, can destroy a trait (just like a single part failure can stop a car), but cannot itself create anything complex in isolation. The features of any system are thus a complex inter-meshing of processes requiring multiple lower level components. Traits are the result of combinations of genes (polygeny) and a single gene can also affect many traits (pleiotropy), i.e. genes perform multiple jobs and a job requires multiple genes.
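The many-to-many relation between genes and traits can be sketched as a simple mapping. The gene and trait names below are invented for illustration and do not refer to real loci.

```python
from collections import defaultdict

# Hypothetical genes and traits, for illustration only.
gene_to_traits = {
    "gene_a": {"limb_length", "coat_colour"},
    "gene_b": {"limb_length"},
    "gene_c": {"coat_colour", "metabolic_rate"},
}

def invert(mapping):
    """Turn a gene -> traits mapping into a trait -> genes mapping."""
    inverted = defaultdict(set)
    for gene, traits in mapping.items():
        for trait in traits:
            inverted[trait].add(gene)
    return dict(inverted)

trait_to_genes = invert(gene_to_traits)

# Pleiotropy: one gene contributes to several traits.
pleiotropic = sorted(g for g, t in gene_to_traits.items() if len(t) > 1)
# Polygeny: one trait depends on several genes.
polygenic = sorted(t for t, g in trait_to_genes.items() if len(g) > 1)

print("pleiotropic genes:", pleiotropic)
print("polygenic traits:", polygenic)
```

Reading the mapping in one direction shows pleiotropy and in the other shows polygeny, which is why no single arrow from 'a gene' to 'a trait' can be drawn.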

"The dynamics of allele frequency change at a locus are greatly affected by linkage and interaction with other loci. Selection for favourable combinations of genes can create strong associations (linkage disequilibrium) among alleles at different loci if they are tightly enough linked. Different gene combinations may confer high fitness, so that a population can evolve towards any of several or many stable genetic compositions... but pleiotropy and linkage disequilibrium, giving rise to negative genetic correlations between the selected character and other traits including fitness, reduce the response to selection, and cause the character to return towards its original state if selection is relaxed."
Douglas J. Futuyma, Evolutionary Biology, 1986, Ch. 7, Summary

In fact, some (perhaps many) genes comprise many alternative exons (active sequences) which can be spliced (assembled) into a number of variant proteins, and then used for different purposes in different contexts or creatures. This form of structure is true for all complex systems, the interactions are a 'many to many' (N:M) process, not an hierarchical (1:N) control structure as traditionally envisaged. It is essentially a highly nonlinear configuration, where feedback processes (both positive and negative) interact. So how can these overall features or traits arise, if neither selection nor genes are specific enough ?

Regulatory Networks

Luckily an alternative process can cast light on this dilemma. Genes come in two main forms, expression and regulatory (some perform both functions however). The expressive (or structural) genes are those that actually create the cell proteins (structure or metabolism) by the familiar process involving mRNA copies from the DNA and synthesis by tRNA and Ribosome - these we can regard as the 'low level' genes. But before a gene can be expressed it needs to be 'switched-on' and this is a function of the genetic regulatory system, the 'high level' control process for the cell. There is an analogy here with computer programming where the 'low level' or machine-code operations are specific and fragile, whereas 'high level' languages allow a modular approach, each statement controls a functional set of associated 'low level' operations. Low level genes are like the ingredients of a recipe, the high level ones choose the recipe (but maybe don't specify the quantities of each ingredient...). Gene networks therefore can be regarded as 'sub-routines', called as and when required to implement either low (ingredient) or high (recipe) level functions during the essential developmental (cooking) stage - the 'time' dimension.

Thus three stages are necessary for our model, the reductionist cataloging of protein parts (gene structures); the holistic specifying of their interconnectivity (switching functionality); and the dynamics through time of the resultant 'system' under environmental influences. Regulation in genes is actually poorly understood as yet, but it is known that a combination of activation and suppression switching operations is involved (using what are called 'promoters'). A typical gene has, on average, ten or so associated switches, combinations of activation and suppression by other proteins that lock onto the switch sites, so a complex network of interacting proteins is almost always required to start and stop a particular gene activation. In fact, the combinatorial logic possibilities are almost infinite here (a single gene has about 1000 control combinations, a mere 5 have 10¹⁵), so even though all animals have basically the same set of genes (we share 99% with the chimp) a few changes in the 'wiring network' can lead to massive changes in function and form.
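The figures quoted for control combinations follow from treating each switch as a simple on/off input. The sketch below makes that assumption explicit; the constant and function names are illustrative, and real regulatory sites are not strictly binary.

```python
SWITCHES_PER_GENE = 10  # "ten or so" switches per gene, as stated above

def control_combinations(genes, switches_per_gene=SWITCHES_PER_GENE):
    """Number of on/off settings across all switches of the given genes,
    assuming every switch is binary and independent."""
    return 2 ** (switches_per_gene * genes)

one_gene = control_combinations(1)
five_genes = control_combinations(5)

print(f"1 gene:  {one_gene} combinations")
print(f"5 genes: {five_genes:.2e} combinations")
```

Ten binary switches give 2^10 = 1024 settings, the 'about 1000' of the text, and five such genes give 2^50, which is slightly above 10¹⁵.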

The regulatory genes (or 'transcription factors') will act in many different ways at different locations during the developmental process (depending on specific switch activations), and the most powerful of these (called 'tool box genes', including the homeotic or Hox genes) can activate very complex top level building blocks, e.g. trigger the complete development of an entire limb at a certain location (and can perform very different construction jobs within the same creature or in different creatures, depending on local context and development stage, e.g. limbs, eyes, heart or wings !).

If we simplify this process a little then we can represent it by a system of logic gates. The epistatic (nonlinear) interactions between genes can then be seen as interconnections between their regulatory mechanisms, so in this way we can envisage a large scale network of logical controls determining the DNA expression process.
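A minimal sketch of such a logic-gate representation follows. The three genes and their wiring are hypothetical, chosen only to show the idea: each gene's next state is a Boolean function of the current states of the genes that regulate it, and all genes update together.

```python
def step(state):
    """Advance a hypothetical three-gene network by one synchronous update."""
    a, b, c = state
    return (
        b and not c,  # gene A: activated by B, suppressed by C
        a or c,       # gene B: activated by either A or C
        a and b,      # gene C: needs both A and B to be active
    )

def trajectory(state, steps):
    """Return the sequence of network states visited from a starting state."""
    path = [state]
    for _ in range(steps):
        state = step(state)
        path.append(state)
    return path

path = trajectory((True, True, False), 5)
for s in path:
    print("".join("1" if gene_on else "0" for gene_on in s))
```

In this toy wiring the network, started with A and B active, passes through a few transient states and then alternates between two expression patterns indefinitely.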

Punctuated Equilibria

One feature of such a 'Boolean Network' is that expression doesn't take place in a linear fashion, with gradual changes, but operates in jumps. Cascades of interactions switch the expression sequence between alternative stable modes (called attractors). Each mode flows through a series of cyclic changes as each node (gene) in the sequence activates or de-activates in turn (several genes will generally express at the same time here, since each network consists of a modular number of associated 'low level' units).

The net result of this is that the operation of the cell changes in discrete steps, not incrementally. If we imagine a mutation affecting not an expression gene (protein) but the switches of a regulatory pathway then we have a mechanism based on complex system theory to explain discontinuous variation (without disruption to the functional integrity of any expression gene). A step change in cellular operation could lead to a new type of cell (tissue) and the resultant development to a new species. We can view mutation here as a two path process, regulatory genes give step changes in function whilst expression changes 'fine tune' and optimize the current function. The 'gradualism' of Darwin is no longer valid in all cases, 'saltationism' (step change) is demonstrably involved also - both methods co-exist.
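The effect of a regulatory mutation can be illustrated with a toy Boolean network. Everything here is an assumption made for the sketch: the three genes, their wiring, and the 'mutation' (gene A losing its suppression by gene C). The code enumerates every starting state and collects the attractors each wiring settles into.

```python
from itertools import product

def wild_type(state):
    a, b, c = state
    return (b and not c, a or c, a and b)

def mutant(state):
    # Hypothetical regulatory mutation: gene A is no longer suppressed by C.
    # No protein-coding (expression) gene is altered, only the switching.
    a, b, c = state
    return (b, a or c, a and b)

def attractors(step, n_genes=3):
    """Find all attractors (fixed points and cycles) by exhaustive search."""
    found = set()
    for start in product((False, True), repeat=n_genes):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = step(state)
        cycle = seen[seen.index(state):]
        # Rotate the cycle to a canonical start so duplicates collapse.
        i = cycle.index(min(cycle))
        found.add(tuple(cycle[i:] + cycle[:i]))
    return found

print("wild type:", len(attractors(wild_type)), "attractors")
print("mutant:   ", len(attractors(mutant)), "attractors")
```

In this toy case the single change of wiring adds a new stable state in which all three genes are active. That state appears as a whole rather than by small increments, which is the kind of step change the argument above describes.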

Phenotypic Development

Mechanisms for creating phenotypes during development however are still unclear, since in this case we have the added complication of multi-cellular organisation, together with the detailed way in which all the cells grow and differentiate. Yet we do know that even single cells communicate extensively (exchanging signals - physically, electrically and chemically) so we perhaps have here a further interconnection and regulatory network that may prove amenable to similar reasoning - organs as attractors ? It has been found, from simulations, that the attractor types that are seen can change, if several cells are able to communicate. These cases correspond to richer interconnection regimes and allow new types of attractor to exist that cannot, it seems, occur in isolated cells. In some cases the entire system can move spontaneously to a single type of new cell, or to a stable mixture of cell types. These systems can be self-regulating, the balance between types being a probabilistic feature of the interactions, cells will change apparently at random until the correct, stable, balance is seen.

This cellular example shows that the functionality of a group of cells can proceed very well without external influence (we will allow the need for non-specific flows of resources here, e.g. energy), the 'emergent' form is a self-organized one, not dependent upon environmental selection. Changes to the components of the cell (due say to mutation), will change the balance of the system and the parts will then coevolve (we can call this an internal selection) until a new balance is reached, a new stable attractor - a cellular ecosystem. Note that the attractor is a combined feature of the parts - it is not generated by parts in isolation, thus molecular reductionism is an inadequate analysis technique with which to explain the whole in such complex systems.

Whole Systems

Adaptation studies, in general, tend to focus upon individual 'traits', culling specific failures (or selecting 'improvements') e.g. giraffe neck length, in turn irrespective of other co-existing 'superiorities' (i.e. each 'mutation' is assumed to occur sequentially, and they are selected or rejected in isolation over multiple generations). The more complex the organism becomes however the more traits exist that can interact to affect the overall performance (so survival may be a very nonlinear optimization process), e.g. giraffe leg length is important too, the traits aid or oppose each other in fitness terms. Each of these observable traits has emerged in some way from the overall interactions of their parts, in ways not currently known in detail. Genes interact extensively as we saw (epistasis - polygeny and pleiotropy), so it seems generally inaccurate to assign particular properties (traits) to an isolated proportion of the lower level components of a system (especially to single genes). Likewise, there can be multiple paths through the same genetic sub-routines, dependent upon environmental influences, so the genotype needs to be considered instead from a whole-systems perspective, as an emergent dynamic whole with multiple possible stable attractor states.

Emergence

So what is this emergence exactly ? Generally it is defined by saying 'the whole is greater than the sum of the parts'. In other words we cannot predict the outcome from studying only the fine details. Examples include cellular metabolism, ant colonies, organism development, snowflakes. Of course 'knowing' the outcome we can develop reductionist explanations (to some degree of probability) of the small scale interactions involved - this is a one to many process, we break down the trait into multiple isolated parts or sets of parts. The reverse process, many to one - in other words explaining from first principles what actual forms will appear - seems beyond us. The essence of the phenomenon however is that 'new' descriptive categories are necessary, in other words the features or attractors cannot be described within our existing vocabulary, we require new terms, new concepts to categorise them. This is a feature of 'open-ended' evolution - 'novelty' appears outside our current experience or that of the system. In these cases we cannot easily apply a 'fitness function' since the 'function' is initially unknown (and may not even exist) and is highly context dependent.

Discarding Dualism

It is useful here to consider the relevance of dualism to emergence. It may be thought that, by saying that emergent properties are not 'explicable' by consideration of the lower level details, we are claiming that there is an inherent duality between the levels of description (rather like that claimed by Descartes between body functions and those of mind). That is not the case. What we are saying is that there are a number of nested levels of detail, each of which has properties different from those levels that comprise it, and so needs a new holistic type of description or label to be applied (one that cannot be stated in terms of the individual part properties, neither separately nor in other partial combinations).

There are three aspects involved here. First is the idea of 'supervenience', this means that the emergent properties will no longer exist if the lower level is removed (i.e. no 'mystically' disjoint properties are involved). Secondly the new properties are not aggregates, i.e. they are not just the predictable results of summing part properties (for example when the mass of a whole is simply the mass of all the parts added together). Thirdly there should be causality - thus emergent properties are not epiphenomenal (either illusions or descriptive simplifications only). This means that the higher level properties should have causal effects on the lower level ones - called 'downward causation', e.g. an amoeba can move, causing all its constituent molecules to change their environmental positions (none of which however are themselves capable of such autonomous trajectories). This implies also that the emergent properties 'canalize' (restrict) the freedom of the parts (by changing the 'fitness landscape', i.e. by imposing boundary conditions or constraints).

Despite the different labels we give to emergent properties at the different levels (e.g. cell, organism, society), the general features found at each level are considered equivalent in complexity terms, each being due to the same form of connectivity applied within the specific space and time framework appropriate to that form of structure. The semantic labels are level (function) dependent, the emergence features are considered universal under the 'general systems theory' viewpoint underlying complexity science (i.e. the 'laws' apply to all equivalent forms of systems, regardless of material or immaterial form). By studying the lower level connectivity of each system we can, in principle (if not always in practice), determine the expected emergent features, and conversely by relating the emergent features to those of systems whose connectivity is not known we may be able to infer their internal connectivity also. This is where the science comes in, by allowing us to model, in computer simulations, various connectivity options and determine which transition rules/interactions are critical to the emergence of certain features and which are not.

Hierarchical Levels

For our purposes here we need not be concerned about the technical details, but we do need to be aware that self-organizing emergence is an hierarchical process. Sub-atomic particles give rise to Atoms (which have emergent properties - e.g. density), these in turn combine to form Molecules which have different emergent properties (e.g. shape), which in turn form Metabolisms with yet more properties (e.g. cycles), which constitute Cells with further properties (e.g. movement), and then to Organisms at a yet higher level (goals) and on to Humans (abstractions). These levels cannot easily arise by means of standard mutation and crossover operations but seem to require a new form of evolution, sometimes called 'compositional evolution', 'cooperative coevolution', 'synergistic selection' or 'holistic darwinism', a symbiotic form that allows separately evolved functional building blocks to combine in modular fashion, improving overall combinatorial fitness.

In a similar way we need to ask to what extent can even higher level interactions self-organize and what is the influence of selection in, say, an ecosystem. Here we return to our initial distinction between external and internal processes. For our predator/prey interaction the predator was regarded as an external selective force on the prey behaviour (itself now regarded as a self-organizing phenotype). Yet we can also regard this from a higher level, as an ecosystem, where the interactions between the constituents are then internal to the system. We now have a self-contained coevolutionary system, similar to our cell, and the system will evolve over time to form a balance, an Evolutionarily Stable Strategy (ESS).

Gaia and Self-Regulation

Taking a further step upwards, the whole planet is interconnected by weather, birds, insects etc., so we could regard that in turn as a coevolving organism - the Gaia theory. Again on a Solar System or Galactic level interactions of a different form (gravitational and electromagnetic) lead to emergent structure, suggesting that the concept of 'external selection' is simply a convenient simplification in what is always essentially a two way process. Self-regulation by feedback mechanisms seems to be a feature of every level of evolution, a hierarchy of order, emergent from initial disorder.

Teleology and Teleonomy

We now come to the matter of 'teleology', the following of goals in evolution and behaviour and the relationship of this to genes. Crucial to such a concept is the idea that humans, animals and perhaps even plants are causally effective, i.e. that they are active agents and not just passive ones. As we have seen, emergent properties allow for such a downward causation scientifically, and it proves to be philosophically incoherent to deny our own causality as humans (e.g. 'who' is causing the self-referential denial?). Science has however traditionally refused to allow a role for non-random directions in evolution (the appearance of such, i.e. 'teleonomy', does not prove such goals exist), yet there are clear parallels between human self-organizing behaviours (cultural norms) and those of 'lesser' animals, both affect their evolutionary dynamics by their behaviours, in ways that do not seem random. The problem as usually stated is that the inheritance of any such 'learned' behaviours (as proposed by Lamarck) seems impossible, there is no mechanism to incorporate such changes into the genes and only the reproduced genes (not the body) in fact survive (but note this doesn't necessarily apply in some asexual phyla). This limitation can be shown to be false to some extent, an indirect mechanism is known to occur called the Baldwin effect - in which a change in behaviour provides a selection bias making such (originally learnable) behaviours more genetically determined in later generations. But that is not our concern here, we can instead allow that genotype structures do not change in themselves with learning, but claim that our behaviour can and does change their expression sequences, i.e. those regulatory pathways.

What we need to consider also is that since one species (Homo sapiens) can clearly change its place in its coevolutionary environment, in ways not directly related to its genes but by deliberate cultural activities, then might this not be true for other species also ? Any form of available choice allows an animal (or plant) to alter the selective forces it experiences (and to a large extent those experienced by its offspring also). Grazing by sheep moves them from an area of scarce food to one of plenty, clearly reducing the selective pressures on them - their cultural 'choice' (flocking), conscious or not, increases their collective fitness as a group relative to other possible choices. Survival and selective reproduction then are closely associated with the coevolution of organisms and environment, and dependent not only upon genes but on experiences (trial and error learned behaviours) also. This is important because the fitness landscape is not homogeneous (the same everywhere) but heterogeneous (it has spatial and temporal structure). This allows different behaviours to search out different niches, alternative lifestyles or formulae for success, and much faster than is possible by genetic variation alone. In this way culture transforms 'group selection' from being competition between spatially separated groups, to being competition between temporal choices, possible bifurcations in group trajectories, the poorer choice being selected against by immediate environmental feedback - including the emergent downward causation of group behaviour. General Selection Theory recognises that the basic principles of natural selection (variation, selection, retention) apply at all levels of reality, not just to genes or to biological individuals, thus small or large group 'learned' behaviours have adaptive effects also.

Lamarckian Learning

Learning isn't however just a human or higher animal prerogative, it is known to occur in simple nematode worms (which are brainless, with exactly 302 deterministically wired neurons), and even in single-celled bacteria. Passing messages between members of the species is also not a function of supposed 'intelligence', slime molds are quite good at it. Since we assume that our 'learning' is (variously) stored chemically, by physical cellular or neuronal changes, or by accessible parts of our environment then all these possibilities for (external or internal) environmental plasticity affecting gene expression seem open to any lifeform, however simple, i.e. nurture drives nature as well as the other way about, they must co-evolve (in fact hundreds of such enabling or disabling signals of various types are constantly passing between cells and/or organisms, with more being discovered each year). What this means in practice is that our changing environmental context is found to actively select gene activity, it determines which genes are expressed and which behaviours are generated - which then in turn determine our next environmental context and the subsequent genetic processes. This mutual triggering effect is the 'structural coupling' of autopoietic thought and is an active and increasingly recognised area of modern biological research.

Could a simple learned emergent property however be passed on directly in any other way, generation to generation (non-culturally) ? For simple cells, using asexual reproduction, passing chemical knowledge in this way may well be possible - bacteria pass on antibiotic resistance in a similar way. For multi-cellular organisms this is more problematical, learning presumably affects the somatic (body) cells not the gametes (specialised reproductive cells), yet in these cases also the chemical environment is common to all cells, leaving some possibilities open e.g. the child's immune system 'learning' to recognise foreign bodies from the mother's and the mother's chemistry affecting embryo development (as seen in thalidomide cases).

Conclusion

To some extent or another evolution is affected by the decisions made by the lifeforms comprising it, we change our destiny (say) by moving to a warmer climate or by deciding to scavenge. Such active decisions (conscious, unconscious or random as the case may be) change the shape of our fitness landscape, and this in turn alters that of all the associated species. Even by digging a hole we alter the environment, changing the landscape for other creatures (literally in this case !), so we cannot study evolution on the basis that the fitness landscape is fixed and not altered by the organisms present, the full two-way coevolutionary perspective is always necessary. The concept of genetic 'selection', useful though it is in an isolated sense, can now be seen as just a passive simplification for what is always a complex, coevolutionary and adaptive emergent system. This system makes use of dynamic self-organizing processes and selection at many levels (chemical, regulatory, learning, coevolutionary) and needs to be understood in a rather deeper sense than the shallow linear reductionism often employed by neo-Darwinists. The realisation that this is so has led to the new field of Evolutionary Developmental Biology (Evo Devo for short), to which we would add the power of self-organization and attractors. Let's leave the last word however to paleontologist Stephen Jay Gould:

"In any environment, hundreds of possible anatomies might work - and the forms and colours of this particular population in that specific valley are fortuitous consequences of the largely non-adaptive mutations that happened to arise and spread in an isolated population.
The resulting pattern of differences among valleys is largely non-adaptive. Each local race must avoid elimination by natural selection (and is fit in this negative sense), but its particular features represent only one in a myriad of workable possibilities, and any particular solution arises by the happenstance of mutation in an isolated population, not by natural selection."
Eight Little Piggies, 1993, 1.1 Unenchanted Evening