Research Bibliographies, Page 4

This bibliography is like a glimpse into the past, an artefact of my time with BT’s Complex Systems Laboratory. Originally intended for internal use within a research group and later made available to the public, the bibliography has not been actively maintained since 2000.

Information, Complexity and Entropy

And Evolution

Also see section on speed of evolution, as well as Adami’s (1998) book Introduction to Artificial Life.

Bookstein, F. (1983) ‘Comment on a “Nonequilibrium” Approach to Evolution’, Systematic Zoology 32: 291-300.

Mostly on-target critique of Wiley and Brooks (1982), but not without its own problems; see Wicken (1983).

Brooks, D.R. and E.O. Wiley (1986) Evolution as Entropy: Toward a Unified Theory of Biology. Chicago: University of Chicago Press.

Having read this book very quickly in hopes of finding new perspectives on how to quantify the accumulation of information by a population over time (not the authors’ principal aim, I must emphasise), I was disappointed. Amidst a profusion of notions of entropy deriving from statistical mechanics, thermodynamics, and communication theory (embellished with apparently random switches between them) are embedded some fundamental positions which I believe to be patently false. For instance, the authors believe (as does Wiley 1988 in a shorter and more accessible article) that, due to the second law of thermodynamics, entropy always increases in evolutionary systems; yet this clearly need not be the case for such open systems. Yes, the overall entropy of the entire system of which an evolutionary system is a part must certainly increase, but I can find no argument as to why the subset of physical entities which make up the evolutionary system itself should display increasing entropy. Similarly, the authors insist (p. 137, in what was actually for me the single most interesting chapter) that phylogenetic change can never be negentropic; yet the authors’ own account would appear to suggest that exactly that may occur when, for instance, an organism’s genome is significantly shortened (as is known to occur with some bacterial strains). Nonetheless, alert and critical readers will find interesting insights on the relationship between various notions of entropy and biological evolution and the pseudo-paradox that evolutionary systems appear to accumulate information even while the overall system of the universe is perpetually destroying it due to the second law.

Collier, J. (1986) ‘Entropy in Evolution’, Biology and Philosophy 1: 5-24.

This is an interesting attempt to sort out some of the myriad terminological confusions of Brooks and Wiley (1986), which I have read only quickly. While Collier does seem to have gone some way toward de-obfuscating Brooks and Wiley, I haven’t yet become sufficiently convinced that Brooks and Wiley’s approach is useful that I am motivated to study Collier in detail.

Collier, J. (1998) ‘Information Increase in Biological Systems: How Does Adaptation Fit?’, in Van de Vijver et al. (1998), pp. 129-40.

The author introduces ‘information of adaptation’, which turns out half-way through the paper to be simply the classical mutual information relation, in this case between traits and the environment. This is interesting but uninspiring.
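As a rough illustration of the classical mutual information relation mentioned above (a sketch only, not taken from Collier’s paper), the following estimates mutual information in bits from paired observations of a trait and an environment. The function name and the toy trait/environment data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Estimate I(T;E) in bits from a list of (trait, environment) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    traits = Counter(t for t, _ in pairs)
    envs = Counter(e for _, e in pairs)
    mi = 0.0
    for (t, e), count in joint.items():
        p_te = count / n
        # Compare the joint probability with the product of the marginals.
        mi += p_te * math.log2(p_te / ((traits[t] / n) * (envs[e] / n)))
    return mi

# Hypothetical toy data: traits perfectly matched to environments...
matched = [("thick_fur", "cold")] * 50 + [("thin_fur", "warm")] * 50
# ...versus traits distributed independently of the environment.
unrelated = [("thick_fur", "cold"), ("thick_fur", "warm"),
             ("thin_fur", "cold"), ("thin_fur", "warm")] * 25

print(mutual_information(matched))    # one full bit shared
print(mutual_information(unrelated))  # nothing shared
```

In the matched case the trait tells us everything about a binary environment (one bit); in the unrelated case it tells us nothing.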

Griffiths, P.E. (1999) ‘Genetic Information: A Metaphor in Search of a Theory’, draft of 7.9.99 — NOT FOR CITATION. (Included here for research reference and not for citation.)

Offers a critical look at ‘information talk’ in molecular and developmental biology. It would be unfair to critique the article, as it is clearly a working draft. I would note, however, a general impression that theoretical and philosophical discussion of the role of ‘information talk’ with respect to genes and such would seem to be more useful in a context informed by quantitative information theoretic analysis.

Layzer, D. (1988) ‘Growth of Order in the Universe’, in Weber et al. (1988), pp. 23-39.

This is a very interesting article which particularly emphasises a distinction between potential entropy or potential information on the one hand and actual entropy on the other, with the author arguing that information should be understood as the difference between potential and actual entropy, rather than simply equating it with negentropy. According to this view, systems may increase in information and in entropy at the same time, provided the generation of potential entropy exceeds the realisation of actual entropy. This view permits, for instance, the universe to have begun to expand from a state of zero entropy and zero information, and it allows a view of increasing entropy during the process of evolution which is grounded in the expansion of the dimensionality of the genetic state space. As Layzer suggests, genetic variation may thus generate both entropy and potential entropy/information. Later in the article, unfortunately, Layzer refrains from pinning down precisely quantitative expressions for the relevant quantities which undergo change during the course of evolution. Also, it should be noted that Layzer suffers from a common misconception about Maxwell’s Daemon, noted in the comments on Morowitz (1968a).
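A toy numerical reading of this distinction, offered as a sketch rather than as Layzer’s own formalism: it assumes a discrete distribution over states, takes actual entropy to be Shannon entropy, and takes potential entropy to be log2 of the number of available states. The distributions and function names are hypothetical.

```python
import math

def actual_entropy(probs):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def potential_entropy(probs):
    """Maximum entropy attainable over the same state space, in bits."""
    return math.log2(len(probs))

def information(probs):
    """Information as potential entropy minus actual entropy."""
    return potential_entropy(probs) - actual_entropy(probs)

# Before: 2 available states, occupied uniformly.
before = [0.5, 0.5]
# After: the state space has grown to 8 states, only some of them occupied.
after = [0.5, 0.25, 0.125, 0.125, 0.0, 0.0, 0.0, 0.0]

for label, dist in (("before", before), ("after", after)):
    print(label, actual_entropy(dist), information(dist))
```

Under these assumptions, entropy rises from 1 to 1.75 bits while information rises from 0 to 1.25 bits, because the potential entropy (1 to 3 bits) grows faster than the actual entropy.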

Løvtrup, S. (1983) ‘Victims of Ambition: Comments on the Wiley and Brooks Approach to Evolution’, Systematic Zoology 32: 90-96.

Mostly on-target critique of Wiley and Brooks (1982); see Wicken (1983).

Morowitz, H.J. (1968a) ‘Order Information and Entropy’, chapter 6 of Morowitz (1968), pp. 123-147.

This is an interesting, if somewhat meandering, treatment of biological topics from the standpoint of thermodynamic entropy theory and information theory. (Note that these two topics are often conflated to poor effect, but in this case little harm seems to have been done.) While useful ideas can be gleaned from this chapter, doing so is very challenging on account of a common error running through the whole of the treatment: namely, Szilard’s 1929 idea that the acquisition of information entails the expenditure of sufficient energy to save the second law of thermodynamics. This view was later shown to be in error, and a correct account was given by Bennett in 1987. (See Bennett 1987b in the bibliography for Mind Out of Matter.) Of course, Morowitz is not to be blamed for this appeal to Szilard, and to Brillouin after him, since he predated Bennett by nearly two decades. There is also a somewhat confusing discussion of a system’s entropy from its own ‘point of view’ which appears (pp. 133-134) to entail treating nonequilibrium biological systems as conservative. Unless I have missed something very badly (which is altogether possible!), citing the Liouville theorem and the conservation of phase space volumes over time requires treating the systems as conservative.

Ray, T. S. (1994) ‘Evolution, complexity, entropy, and artificial reality’, Physica D 75:239-63. (available online)

Statistical analysis of four variations of the original Tierra, with interesting notes on changes in population entropy over time. Nothing too Earth-shattering.

Standish, R.K. (1999) ‘Some Techniques for the Measurement of Complexity in Tierra’, in Floreano et al. (1999), pp. 104-8.

Offers a start on empirically estimating the classical information content of Tierran organisms, with a view to analysing the accumulation (or otherwise) of information over time. Unfortunately, after an interesting start, the paper finishes with an apology about the lack of results: “Due to the time constraints of finishing this paper, the analysis…has not been completed” (p. 107). Rats!

Saunders, P.T. and M.W. Ho (1976) ‘On the Increase in Complexity in Evolution’, Journal of Theoretical Biology 63: 375-84.

While this article gets off to a promising start with an abstract including comments like “complexity…gives a direction to evolution” and the “increase in complexity is…a consequence of the process by which a self-organizing system optimizes its organization” (p. 375), it shies from drawing any substantive conclusions whatsoever or appealing to any quantitative definitions of information or redundancy, noting that “such definitions ignore too many aspects of our intuitive understanding of the word to be suitable for our purposes” (p. 377). Later, in a discussion which appears to ignore utterly the difficulty and subtlety of the intellectual terrain (p. 377), the authors suggest that complexity admits of easy definition and empirical quantification in the form of DNA content. Very disappointing.

Teal, T.; D. Albro, E. Stabler, C.E. Taylor (1999) ‘Compression and Adaptation’, in Floreano et al. (1999), pp. 709-19.

Explores the idea that adaptive systems, and particularly evolutionary ones, compress information about their environments. While initially promising, the paper’s examination of MDL (minimum description length) for formal grammars is marred by the omission of references to early work in the area of compression and predictability. For example, the authors’ first conjecture that compression aids in generalisation is a weaker form of a result actually proved, rather than conjectured, in the late 1970s. Grounding in Kolmogorov complexity, Solomonoff’s results on prediction, and general PAC (probably approximately correct) stuff would really have helped. All these things would have been impossible to miss with a reference to Li & Vitanyi (1997), yet only a short IEE presentation by Li & Vitanyi is actually cited.

Weber, B.H.; D.J. Depew, J.D. Smith, eds. (1988) Entropy, Information, and Evolution: New Perspectives on Physical and Biological Evolution. Cambridge, Massachusetts: MIT Press.

Having recently re-read this interesting collection of contributions from top authors on an eclectic mix of thermodynamics, information theory, and evolution, my impression of the book’s single most glaringly plain characteristic has been reinforced: it is the astonishing extent to which the authors disagree blatantly over even the very basics. Only slightly less remarkable is the convincingness and authority with which each writes about their own particular position and those of others. Readers looking for a clear and uncontroversial picture of the relationship between evolution, information theory and thermodynamics will not find it here. Those looking to find confusion mixed with insight will be rewarded, but it’s hard work separating the two.

Wicken, J.S. (1979) ‘The Generation of Complexity in Evolution: A Thermodynamic and Information-Theoretical Discussion’, Journal of Theoretical Biology 77: 349-65.

Author attempts to show how the tendency of evolutionary processes to give rise to increasingly complex life forms is a consequence of the Second Law of Thermodynamics. While this paper is one of Wicken’s better written pieces, I nonetheless find Wicken (1987) more useful. This paper also provides the clearest indication yet that Wicken’s repeated allusions to Chaitin and algorithmic information theory are based on fundamental confusions about the field.

Wicken, J.S. (1980) ‘A Thermodynamic Theory of Evolution’, Journal of Theoretical Biology 87: 9-23.

Similar in flavour to Wicken (1979), builds on idea that biosphere evolves toward maximum structuring and minimum dissipation. Again, Wicken (1987) is probably more useful.

Wicken, J.S. (1983) ‘Entropy, Information, and Nonequilibrium Evolution’, Systematic Zoology 32(4): 438-43.

Primarily a comment on Wiley and Brooks (1982), which successfully clarifies some of the semantic confusion surrounding both Wiley and Brooks and their commentators (including Bookstein 1983 and Løvtrup 1983), while unfortunately introducing one or two of his own confusions.

Wicken, J.S. (1987) Evolution, Thermodynamics, and Information. Oxford: Oxford University Press.

This attempt to forge greater connections between neo-Darwinism and thermodynamics is interesting and generally well conceived, but it suffers from a very qualitative as opposed to quantitative feel throughout. It’s difficult to extract concrete quantitative conclusions, which will be a disappointment to those who, like me, read the book in hopes of finding insights into the job of using thermodynamics or information theory to quantify measurable aspects of evolution. Of particular note is a useful first chapter (pp. 17-28) on ‘Entropy and Information’ which persuasively explodes the ever so common conflation of Shannon and thermodynamic concepts of entropy based on the symbolic isomorphism between the respective equations. This is a good salve to those like Brooks and Wiley (1986), who run roughshod over the distinction.

Wiley, E.O. (1988) ‘Entropy and Evolution’, in Weber et al. (1988), pp. 173-88.

Concise discussion of the Brooks & Wiley approach to entropy and evolution, as discussed more fully in Brooks & Wiley (1986). The basic aim of the approach is to explain evolution by natural selection as a consequence of the second law of thermodynamics. In the space of two short pages, however (pp. 176-177), I become deeply confused attempting to follow the thinking. With a nod to authors such as Morowitz (1968) and Schrödinger (1945 book called What is Life?), who took the apparently decreasing entropy of evolutionary systems as a merely local effect paid for by global increases in entropy (makes perfect sense to me…), Wiley goes on to suggest that biological organisms really shouldn’t be viewed as dissipative structures at all and should be analysed via nonequilibrium thermodynamics. From there, he suggests that information resides within biological systems and has a physical basis and that the entropy of biological information must therefore increase in irreversible processes due to the second law of thermodynamics. My confusion: for a start, biological systems obviously are dissipative structures at the chemical level, and I don’t quite understand why a nonequilibrium thermodynamical approach to biological systems should require that this fact be superseded by some higher level notion according to which they are not dissipative. I completely agree (following Layzer 1988, for instance) that a genetic phase space of increasing dimensionality is a very important factor in understanding how entropy and information might simultaneously increase in an evolutionary process, but I have difficulty following Wiley’s reasoning which appears to go along with this.

Wiley, E.O. and D.R. Brooks (1982) ‘Victims of History — A Nonequilibrium Approach to Evolution’, Systematic Zoology 31: 1-24.

Precursor to Brooks and Wiley (1986).

Wiley, E.O. and D.R. Brooks (1983) ‘Nonequilibrium Thermodynamics and Evolution: A Response to Løvtrup’, Systematic Zoology 32: 209-19.

As the title suggests…

Benford’s Law

I’m not sure whether everyone will agree that this is most properly classified as information theory, but that’s the context in which I tend to think about it, so here it is! NOTE: Benford’s law describes the appearance of significant digits in a wide range of data with a logarithmic distribution: probability (first significant digit = d) = log10(1 + 1/d), where d = 1, 2, … 9. Successive digits are much more uniform but still follow logarithmic distributions.
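A minimal sketch that tabulates the first-digit formula given in the note above; the function name is hypothetical.

```python
import math

def benford_first_digit(d):
    """Probability that the first significant digit equals d (d = 1..9)."""
    return math.log10(1 + 1 / d)

probs = {d: benford_first_digit(d) for d in range(1, 10)}

for d, p in probs.items():
    print(d, round(p, 4))

# The nine probabilities telescope to log10(10) = 1.
print(sum(probs.values()))
```

The digit 1 leads roughly 30% of the time and the digit 9 less than 5% of the time.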

Benford, F. (1938) ‘The law of anomalous numbers’, Proceedings of the American Philosophical Society 78:551-72.

This was Benford’s re-discovery of Newcomb’s results and included evidence in the form of 20,229 observations from data sets ranging from areas of rivers and baseball statistics to atomic weights and the street addresses of the first 342 men listed in American Men of Science. The results follow the logarithmic distribution remarkably well; some commentators believe too well, suggesting Benford may have massaged the numbers for a better match.

Feldstein, A. and P. Turner (1986) ‘Overflow, Underflow, and Severe Loss of Significance in Floating-Point Addition and Subtraction’, IMA Journal of Numerical Analysis 6:241-51.

This paper shows that a Benford distribution of data implies that floating point additions and subtractions can result in overflow or underflow with surprisingly high frequency. Authors discuss this result in light of computer design and suggest a 128-bit long word format for reducing the risks of serious errors in scientific computations to more acceptable levels.

Hamming, R.W. (1970) ‘On the Distribution of Numbers’, Bell System Technical Journal 49(8): 1609-25.

Describes the distribution of mantissas of floating point numbers in terms of what Hamming calls the ‘reciprocal distribution’. (Benford is not mentioned in the article, but of course the form of the distribution is the same.) Argues that sequences of multiplications and divisions effectively ‘drive’ numbers toward the familiar distribution (in essence, because the presence of just one member of the distribution in a sequence of operations ensures the outcome will also be, due to the invariance of the distribution under scaling operations), with similar but more limited results for the case of addition & subtraction. Discusses important applications to computer hardware and software design, the generation of random numbers, etc.
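The ‘driving’ effect of repeated multiplication can be seen in a small simulation. This is only an illustrative sketch, not Hamming’s argument: the choice of factors uniform on [1, 10), the sample sizes, the fixed seed and the function names are all assumptions made for the example.

```python
import math
import random

def first_digit(x):
    """First significant decimal digit of a positive float."""
    return int(f"{x:.15e}"[0])

def digit_frequencies(values):
    """Relative frequency of each first digit; index 0 is unused."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts]

rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples, n_factors = 20000, 10

# Single uniform draws: every first digit is about equally likely.
singles = [rng.uniform(1, 10) for _ in range(n_samples)]
# Products of ten such draws: first digits drift toward the logarithmic law.
products = [math.prod(rng.uniform(1, 10) for _ in range(n_factors))
            for _ in range(n_samples)]

freq_single = digit_frequencies(singles)
freq_product = digit_frequencies(products)
for d in range(1, 10):
    print(d, round(freq_single[d], 3), round(freq_product[d], 3),
          round(math.log10(1 + 1 / d), 3))
```

The middle column should sit near 1/9 for every digit, while the products column should track the logarithmic values in the last column.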

Hill, T. (1995a) ‘Base invariance implies Benford’s Law’, Proceedings of the American Mathematical Society 123: 887-95.

This and Hill (1995c) are the more mathematically detailed of Hill’s articles on Benford’s law, this one demonstrating that scale invariance implies base invariance, and that base invariance in turn implies Benford’s law. If we assume that any intuitively ‘universal’ law governing the distribution of digits must be scale invariant — i.e., that changing measurement units does not change the distribution of digits — then it turns out that Benford’s law is the only law which can govern the distribution of digits.
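The following sketch illustrates only the easy direction of this relationship, namely that a Benford-distributed sample keeps its first-digit distribution under a change of units while a uniformly distributed sample does not. It is not a rendering of Hill’s proof. The samples are built on deterministic grids, and the scale factors and function names are hypothetical choices for the example.

```python
import math

def first_digit(x):
    """First significant decimal digit of a positive float."""
    return int(f"{x:.15e}"[0])

def max_deviation_from_benford(values):
    """Largest gap between observed first-digit frequencies and Benford's law."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return max(abs(counts[d] / len(values) - math.log10(1 + 1 / d))
               for d in range(1, 10))

n = 100_000
# Logarithms spread evenly over one decade: Benford by construction.
benford_sample = [10 ** ((i + 0.5) / n) for i in range(n)]
# Values spread evenly over [1, 10): not Benford.
uniform_sample = [1 + 9 * (i + 0.5) / n for i in range(n)]

scales = (1.0, 2.54, 0.3048, 1000 / 7)  # arbitrary 'changes of units'
benford_devs = [max_deviation_from_benford([s * x for x in benford_sample])
                for s in scales]
uniform_devs = [max_deviation_from_benford([s * x for x in uniform_sample])
                for s in scales]

for s, b, u in zip(scales, benford_devs, uniform_devs):
    print(s, round(b, 5), round(u, 5))
```

The Benford sample’s deviation stays negligible at every scale, whereas the uniform sample’s deviation is large and changes with the units chosen.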

Hill, T. (1995b) ‘The significant-digit phenomenon’, The American Mathematical Monthly 102: 322-27.

This is a brief but somewhat detailed explanation of the fact that scale invariance implies base invariance, which in turn implies Benford’s law.

Hill, T. (1995c) ‘A statistical derivation of the significant-digit law’, Statistical Science 10(4): 354-63.

This and Hill (1995a) are the more mathematically detailed of Hill’s articles on Benford’s law, this one demonstrating that if distributions are selected at random, and in turn random samples are taken from these distributions, the significant digits of the combined samples converge to the Benford distribution. Moreover, distributions of successive digits also conform to a logarithm law, and successive digits are not independent. I.e., the probability of a given digit appearing in a given place depends on the presence of other digits in other places. (NOTE: This article is repeatedly referred to, even by Hill, as having a 1996 publication date, yet the first page of the actual article definitely says 1995. I have seen only a photocopy and not the original journal itself, so I cannot verify other dates printed on the journal, but I would guess it unlikely that Statistical Science printed its date incorrectly on the first page of the article…)

Hill, T. (1996) ‘The first-digit phenomenon’, American Scientist 86: 358-63.

This is a very accessible account of Benford’s law, its history and applications, including several examples and a description of the character of the ‘random samples from random distributions’ theorem. An interesting example illustrates the dependent nature of the probability of particular digits occurring in particular places: while the unconditional probability that the second digit of a base 10 number is a 2 is approximately 0.109, the conditional probability that the second digit is 2 given that the first digit is 1 is 0.115. And the probability that the first three significant digits are 3, 1, 4 is roughly 0.0014, forty percent higher than one would expect if digits were distributed uniformly!
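The figures quoted in this entry can be checked with the general form of the significant-digit law, under which the probability that a number’s leading digits spell out the integer n is log10(1 + 1/n). The sketch below is a hedged illustration with hypothetical function and variable names.

```python
import math

def prob_leading_digits(digits):
    """Probability that the first significant digits are exactly `digits`."""
    n = int("".join(str(d) for d in digits))
    return math.log10(1 + 1 / n)

# Unconditional probability that the second digit is 2: sum over first digits.
p_second_is_2 = sum(prob_leading_digits((d1, 2)) for d1 in range(1, 10))

# Conditional probability that the second digit is 2 given a first digit of 1.
p_second_2_given_first_1 = prob_leading_digits((1, 2)) / prob_leading_digits((1,))

# Probability that the first three significant digits are 3, 1, 4.
p_314 = prob_leading_digits((3, 1, 4))

print(round(p_second_is_2, 3))
print(round(p_second_2_given_first_1, 3))
print(round(p_314, 4))
```

The gap between the first two values is what shows that successive digits are not independent.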

Matthews, R. (1999) ‘The power of one’, New Scientist 10 July 1999, pp. 26-30.

This ‘popular science’ article covers Benford’s Law, which describes the frequency of appearance of digits in all manner of humanly used data, ranging from listings of drainage areas of rivers to numbers appearing in magazine articles, from census data to stock market prices. The law is the basis of the new field of ‘digital analysis’, which applies statistical tests to uncover deviations from Benford’s law and thus the more likely appearance of fraud in data sets.
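One simple form such a statistical test might take is a chi-square goodness-of-fit comparison of observed first digits against Benford’s law. The article is not the source of this particular test; the choice of statistic, the 5% critical value for 8 degrees of freedom, the function name and the two toy data sets are assumptions made purely for illustration.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]
CHI2_CRITICAL_8DF_5PCT = 15.507  # chi-square critical value, 8 d.f., 5% level

def chi_square_vs_benford(numbers):
    """Chi-square statistic of first-digit counts against Benford's law.

    `numbers` must be positive integers.
    """
    counts = [0] * 9
    for n in numbers:
        counts[int(str(n)[0]) - 1] += 1
    total = len(numbers)
    return sum((counts[i] - total * BENFORD[i]) ** 2 / (total * BENFORD[i])
               for i in range(9))

# Toy 'natural-looking' data: powers of two, whose first digits follow the law closely.
powers_of_two = [2 ** k for k in range(1, 1001)]
# Toy 'suspicious' data: every three-digit number once, so first digits are uniform.
three_digit = list(range(100, 1000))

chi2_powers = chi_square_vs_benford(powers_of_two)
chi2_uniform = chi_square_vs_benford(three_digit)
print(chi2_powers, chi2_powers > CHI2_CRITICAL_8DF_5PCT)
print(chi2_uniform, chi2_uniform > CHI2_CRITICAL_8DF_5PCT)
```

A statistic above the critical value flags a data set whose first digits deviate from the law; real digital analysis would of course need more care than this.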

Newcomb, S. (1881) ‘Note on the frequency of use of the different digits in natural numbers’, American Journal of Mathematics 4:39-40.

This was the first publication of the logarithmic distribution which would be re-discovered by Benford more than half a century later. Includes observation about uneven wear in books of logarithm tables but no statistical evidence or proof.

Schatte, P. (1988) ‘On Mantissa Distributions in Computing and Benford’s Law’, Journal of Information Processing and Cybernetics EIK 24(9):443-55.

Suggests but doesn’t prove that after a sufficiently long computation in floating point arithmetic, mantissa distributions follow Benford’s Law; treats Benford’s Law itself as a corollary of this more general logarithmic mantissa distribution in ‘extensive computing’. Reiterates conclusion that base 8 is optimal for information storage, first floated in 1973 article in German by same author (given as ‘Zur Verteilung der Mantisse in der Gleitkommadarstellung einer Zufallsgrösse’, Z. Angew. Math. Mech. 53:553-65). Although the author does a much better job than Hill of situating work within a broader context of other people’s results, I found the article almost impossible to follow.


Also see section on information and evolution, which has some entries which are quite broad.

Chaitin, G.J. (1987) Information Randomness & Incompleteness. Singapore: World Scientific.

This is the broadest collection available of Greg Chaitin’s foundational work on algorithmic information theory, sometimes also known as Kolmogorov complexity theory. Includes early papers from before the self-delimiting version of algorithmic information theory, as well as more recent work. Also see Li and Vitanyi (1997) and introductory material available on this site.

Collier, J. (1990) ‘Intrinsic Information’, in Hanson (1990), pp. 390-409.

Here the author attempts to offer an account of an object’s intrinsic physical information, in terms of ‘intropy’ (what is often called ‘negentropy’, the difference between maximum possible entropy and displayed entropy) and ‘enformation’; the latter term I never could find an explicit definition for. By the end, I have to admit a degree of befuddlement as to why something like the algorithmic complexity conditional on the laws of physics couldn’t do the same job. (Algorithmic complexity does make an appearance, but in a significantly under-explained way.)

Hanson, P.P., ed. (1990) Information, Language and Cognition: Vancouver Studies in Cognitive Science, vol. 1. Oxford: Oxford University Press.

Lachmann, M.; M.E.J. Newman and C. Moore (submitted) ‘The Physical Limits of Communication, or Why Any Sufficiently Advanced Technology is Indistinguishable from Noise’, submitted to Nature. (available online)

Shows that optimally coded communications are indistinguishable from black body radiation, for anyone who does not know the encoding scheme. Perhaps I missed some subtleties in the physics, but the basic argument of the paper seems singularly unremarkable from the standpoint of algorithmic information theory or even classical information theory: it is, after all, well known (as the authors acknowledge) that an optimally coded message is incompressible and appears completely random.

Li, M. and P. Vitanyi (1997) An Introduction to Kolmogorov Complexity and its Applications. 2nd edition. New York: Springer-Verlag.

This is probably the single most comprehensive textbook of Kolmogorov complexity, or algorithmic information theory, presently available. Highly technical and much broader than, say, Chaitin’s collected volume on the same topic (Chaitin 1987).

Shannon, C.E. and W. Weaver (1949) The Mathematical Theory of Communication. Urbana: University of Illinois Press.

This is what started it all in modern information theory — it’s still one of the best treatments available. The only significant criticism is Weaver’s over-emphasis of the connections with thermodynamics; while many authors succumb to the attraction of the analogy between thermodynamic entropy and Shannon’s entropy of a distribution, readers should be careful of inferring from the superficial symbolic similarity between the two definitions some genuine isomorphism, particularly when it comes to applications to the second law. Shannon himself mentions Boltzmann only briefly and makes no such strong claims.

Stentiford, F.W.M. and D.W. Lewin (1973) ‘An evolutionary approach to the concept of randomness’, British Computer Journal 16(1): pages? February.

Briefly reviews von Mises and Kolmogorov/Chaitin approaches to randomness and introduces an ‘evolutionary’ approach based on a variety of probabilistic finite automata. Relative complexity comparisons may be made (and are made, via a short collection of empirical results) by comparing the extent to which individual sequences may be predicted by the probabilistic automata. There are one or two minor distractions in the paper, such as a slight superfluity in the definition of conditional complexity (which allows for the possibility that no program will exist which can compute one string given another string, yet one always exists which simply computes the required string directly, ignoring the given string) and a statement which seems to suggest that effective definitions of randomness for arbitrarily long strings are impossible. (The latter is a standard part of modern algorithmic information theory.) The later statement that “it would not be possible to design a computable sequence which would be guaranteed to baffle an estimate of complexity based on an evolutionary procedure” is parasitic on the randomness incorporated into the evolutionary procedure; consider “it would not be possible…based on a procedure of randomly guessing bits”. (Clearly it is never possible to construct a sequence, computable or otherwise, which we can guarantee cannot be produced by randomly choosing bits.)

Management & Finance

Christensen, C.M. (1997) The Innovator’s Dilemma: When New Technologies Cause Great Firms to Fail. Boston: Harvard Business School Press.

This is an excellent book, which I seem to be the last in the whole business community to have read. Based on real research and data rather than the metaphor which abounds in so many management books, this book describes the process whereby a new disruptive technology, inferior by the standards of existing value networks, takes root in a new value network and then proceeds to improve its performance at a sufficiently high rate that it shifts the entire value framework and ultimately drives the previous technology into an ‘upmarket retreat’ in the direction of higher margins but shrinking volumes. The phenomenon has occurred repeatedly in industries as diverse as disk drives, steel mills, excavators, motorbikes, and many others. This is a ‘must read’ which, like many classic books, will make the reader feel clever by clearly and explicitly articulating that which otherwise might have been appreciated mainly intuitively and incompletely. It should certainly be read by managers of research departments who need to grasp the importance of pursuing work with no identifiable markets and with no clear potential for generating significant revenues within the existing value network. It should also be read by researchers, because it will let them say to themselves, “I told you so”.

Davis, S. and Meyer, C. (1998) Blur: The Speed of Change in the Connected Economy. Oxford: Capstone.

While occasionally heavy on consultant-style metaphorical fluff, this book is an otherwise excellent treatment of the dynamics of the modern, communication-enabled economy and gives short shrift to many of the clear-cut 19th-century concepts and distinctions which are still de rigueur in many business and economics curricula (think products and services, buyers and sellers, the importance of tangible capital, etc.). The book is primarily an exercise in challenging the way people think about market dynamics. The final chapter’s list of practical suggestions for ‘blurring’ the reader’s own business and approach might seem at first like flaky “business self-help”, but in fact they have some substance to them and are worth considering!

Evans, P. and T.S. Wurster (2000) Blown to Bits: How the New Economics of Information Transforms Strategy. Boston: Harvard Business School Press.

(I’m not sure how this book came to have a copyright year of 2000 when I read a retail copy in December of 1999, but that’s what it says…) Written by two vice presidents of the Boston Consulting Group, this book offers some very strong central tenets and metaphors wrapped up in many layers of seemingly unnecessary explanation and exposition, resulting in a remarkably high ratio of words to substance. The main message is that the exploding growth of connectivity between participants in the economy is fundamentally changing the traditional trade-off between a product or service’s richness and its reach (i.e., the number of participants to whom it can be delivered).

This article was last reviewed or updated by Dr Greg Mulhauser.

Copyright © 1999-2023. All Rights Reserved.