Isn’t This (Just) AI?

This article reviews a few of the most glaring factors which differentiate my own aims for systems with self models from the bulk of existing work in Artificial Intelligence.


Perhaps thousands of researchers have, over the last fifty years or so, claimed to be trying to create artificial systems which display the kinds of behaviours which, when witnessed in biological systems, we are inclined to call ‘intelligent’. The vast quantity of money which has been poured into such efforts by academic, government, and commercial funding bodies has made possible the development of a host of extremely useful applications, but it has done little to promote the emergence of ‘intelligent’ behaviour.

Today’s computer systems are, in a word, dumb.

As a result, whenever someone proposes a new initiative aimed once again at the ‘old fashioned’ goal of biological-style intelligence, the most common response from those outside AI seems to be “Isn’t this just AI?”, while the most common response from those inside AI seems to be that they’ve already been there, tried that, and failed — so take it from them, there’s no use trying any more.

As one prepared to stand up and propose such a new initiative, it’s important for me to differentiate both my aims and methods from those of current AI. Fruitful and substantive debate will grow not from recycling the standard responses but from diving into the technical side of the initiative itself. I hope this short paper will encourage thought in that direction.

The Software Ball and Chain

One of the tasks of the group of which I’m a part is to attack the grossly low year-on-year rate of growth in software productivity. Internal estimates suggest that software productivity increases by something like 5% per annum. The analogous figure for hardware, by contrast, is a weighty 50%. One of many methods for escaping the ball and chain effect of software is to link the performance of our systems more directly to that of the underlying hardware.
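To get a feel for how quickly these two rates pull apart, the figures above can be compounded over several years. The following is only a rough sketch: the two rates come from the estimates quoted above, while the ten-year horizon and the function names are assumptions chosen purely for illustration.

```python
import math


def growth_factor(annual_rate: float, years: int) -> float:
    """Cumulative improvement after compounding `annual_rate` for `years` years."""
    return (1.0 + annual_rate) ** years


def years_to_double(annual_rate: float) -> float:
    """Years of compounding needed for productivity to double."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


SOFTWARE_RATE = 0.05  # "something like 5% per annum" (from the text)
HARDWARE_RATE = 0.50  # "a weighty 50%" (from the text)
HORIZON_YEARS = 10    # hypothetical horizon, chosen only for illustration

if __name__ == "__main__":
    software = growth_factor(SOFTWARE_RATE, HORIZON_YEARS)
    hardware = growth_factor(HARDWARE_RATE, HORIZON_YEARS)
    print(f"software x{software:.2f}, hardware x{hardware:.2f}")
    print(f"gap after {HORIZON_YEARS} years: x{hardware / software:.1f}")
    print(f"doubling time: software {years_to_double(SOFTWARE_RATE):.1f} years, "
          f"hardware {years_to_double(HARDWARE_RATE):.1f} years")
```

If the estimates hold, software improves by roughly 1.6 times over ten years while hardware improves by roughly 58 times, which is the ball and chain in numerical form.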

The first differentiating factor underscores the relevance of my robot suggestions in this area:

Factor 1: The systems I suggest will be controlled dynamically by self modifying hardware systems, not by traditional programming.

In other words, I am aiming specifically for hardware-embedded cognitive architectures which obviate the need for traditionally programmed software control systems. Of course such hardware can be emulated in software, but that coding is tough work. By analogy, as I suggest in a paper contrasting digital and analogue computation, building a yo-yo and playing with it is easier than programming a system to calculate the trajectory of a yo-yo being played with.

The Cognitive ‘Balance of Power’

The bulk of existing systems apply as much computational horsepower as possible to chew up input data and come up with output data as quickly as possible. The aim is to solve some problem or effect some behaviour as efficiently as possible, without wasting computational capacity on seemingly superfluous tasks. As a result, one fact which characterises most real computer systems is that they spend an inordinate amount of time doing almost nothing, their CPUs racking up idle time like a fidgety manic trapped in a featureless manager’s office.

The second differentiating factor turns this arrangement on its head, just as Nature has done:

Factor 2: The systems I suggest will, by design, deliberately not throw all their computational resources at ‘solving a problem’, but will instead ‘waste’ a great deal of them continuously seeking information about their environment, themselves, and the interactions between and within each.

To pick one biological example, human brains devote an enormous cortical area to afferents from visual sensors which continuously scan their environment. Even though human beings may think they know all they need to know about a visual scene, they keep looking at it, moving their heads and eyes, and they never just spontaneously turn off their visual processing for the sake of some idle clock cycles or to devote those brain cells to a new task like calculating their next stock market transaction.

Likewise, large parts of the variety of system I am suggesting will maintain a feverish pace of activity even when the system is apparently not ‘doing’ anything. Such systems will gather information largely for their own use, even if doing so doesn’t relate in an obvious way to performance of the current task at hand. The idea is that such systems will be better equipped for rapid changes in either their environment or their task than systems which concentrate all their power on performing one task in one environment and which must be reconfigured to handle something new.

Specific ‘Killer Apps’?

Closely related to the above line of thought is the difference between systems which are designed to solve a specific, well-defined problem and those which are designed to be good at a general class of problems. Oversimplifying greatly, traditional software engineering proceeds on the basis of a carefully prepared list of design requirements setting out what the software should do in response to particular varieties of inputs. Applications are for something defined precisely by this list of design requirements. Applications which meet their design requirements particularly well, or which meet a useful set of design requirements which have not been met before, are sometimes called ‘killer apps’. They are the ‘must have’ applications which legitimise the underlying technology, making people ask “what do I need to buy to run this particular application?”.

But supposing that every good and worthwhile system has a killer app is commercially naive. As a colleague recently put it to me, what is the killer app of a VCR? Of course, answering “why, a VCR is for watching films!” is no more convincing than saying “why, a computational system is for computing!”. The reply to those answers is “what specific film (computing task) is the killer app?”. Very few people are convinced to buy a VCR by the desire to watch a specific film, yet millions of people own them.

Likewise, the third differentiating factor highlights the absence of specific computational problems:

Factor 3: The aim of building ‘cognitive robots’ is to discover general principles of cognitive control which can then be engineered into systems performing more specific tasks, rather than to build application-specific systems from the start.

In other words, by building cognitively sophisticated robots, I aim to learn how to engineer cognitive sophistication, not how to solve specific computational problems. I want to answer questions like “how does one create, maintain, and exploit for purposes of behaviour a real-time model of the environment, the robot, and their interrelationships?” or “how does one create a system for reading information from a videotape?”. These contrast sharply with questions like “how does one set the flap angle while landing a Boeing 747?” or “how does one replay the ear scene from Reservoir Dogs?”.

Information and Grounding

Finally, the approach to building cognitively sophisticated robots which I have suggested is based on a particular description of a cognitive ‘self’ phrased in terms of algorithmic information theory. One central feature of this approach is the notion of mutual information content between physical bodies. When the phrase ‘algorithmic information theory’ comes up in conversation with researchers in computing or telecommunications, a great many of them indicate that they are completely familiar with the area, and often that it is really quite easy and standard fare for their field. Perhaps this is so. But certain basic facts tucked subtly into the compact formulation of algorithmic information theory seem nonetheless to escape many of the same people. For instance, if someone cannot recount, in three minutes or less, an information theoretic version of Gödel’s incompleteness results, then I would be sceptical that they have spent sufficient time familiarising themselves with algorithmic information theory to be prepared to comment on the technical aspects of what I have proposed. This is not because Gödel is indispensable for understanding cognition (far from it) but because some of the same subtleties which are responsible for incompleteness also underpin implicit properties of the ‘self model’ as I have described it.

Among these implicit properties — easily missed in the absence of familiarity with algorithmic information theory — is the fact that representation in the self model framework is inherently grounded in properties of the physical world. So is the notion of a representation which is specifically used as a representation by a cognizer. (In Mind Out of Matter, I distinguish between the notion of being a representation and being used as a representation; both receive information theoretic accounts.)
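For readers who want the two information theoretic notions invoked above in symbols, the following gives their standard textbook statements. It is offered only as a reference sketch: the notation (a fixed universal machine U, programs p, a formal system F, and a constant c_F) is conventional and not taken from this article, and the formulation in Mind Out of Matter may differ in detail.

```latex
% Algorithmic information content: the length of the shortest program p
% which makes a fixed universal machine U output the string x.
\[ K(x) = \min \{\, |p| : U(p) = x \,\} \]

% Mutual information content of x and y (one common form; variants
% agree up to logarithmic terms), where K(x, y) is the content of the pair.
\[ I(x : y) = K(x) + K(y) - K(x, y) \]

% Information theoretic incompleteness (Chaitin): for any consistent,
% effectively axiomatised formal system F able to express claims about K,
% there is a constant c_F such that F proves no statement of the form
\[ K(x) > c_F \]
% for any specific string x, even though all but finitely many strings satisfy it.
```

Both the mutual information measure and the incompleteness result rest on the same fact: K is well defined but not computable, so claims about information content cannot be settled by inspection or by labelling.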

On this approach, someone who claims they have programmed a system with a data structure to represent the external world and then placed it in a robot to help it model its environment is relying on a different meaning of the word ‘represent’, one which does not share the mathematical rigour of the one I have proposed. It is possible that such a structure really does represent, but extremely unlikely. It takes a great deal more than just claiming that it represents to show that it satisfies the information theoretic definition. (On my account, it is more likely that the programmer’s brain, together with the physical structure of the robot, represents the external world, but the robot’s program — or the physical structure of that which implements it — very likely does not.) My challenge to those who prefer different meanings of the word is to tell me exactly how they intend to adjudicate on whether a particular item in the world is a representation of some other item. The literature is chock full of failed attempts. The problem is related to that of defining what it means for a given physical system to implement a particular abstract functional or computational system — and the literature is similarly full of mathematically trivial failed attempts. (See Chapter 5 of Mind Out of Matter.)

All this underlies the final differentiating factor which I’ll describe in this paper:

Factor 4: The cognitive architecture I have suggested is based on a specific definition of representation which avoids the symbol grounding problem and closes off the approach of building a system around structures which we can trivially label ‘representational’.

To put it differently, the information theoretic framework underlying my suggestions greatly narrows the class of systems which we can get away with saying implement the suggested architecture. The framework precludes from the very beginning the kinds of programming exercises which we might, in our fuzzy moments, like to call representational structures.


The upshot of the above is that what I have proposed is clearly different both in aim and method from the strand of research called Artificial Intelligence. This is not to say they cannot both be painted in broad strokes as the project of building intelligent systems. But under scrutiny, such a broad strokes picture turns out to obscure far more than it reveals.

This article was last reviewed or updated by Dr Greg Mulhauser.
