On Eugene Goostman, The Turing Test and The Chinese Room


A computer program called Eugene Goostman has recently been said to have passed the Turing test for the first time. But what does this mean? Are we about to be replaced by robots? Maybe not (yet).

As the story goes, a machine can be said to have passed the Turing test if it can fool at least 30% of its judges into believing it is human. The test consisted of a judge chatting with Eugene for five minutes and then rendering a verdict on the humanness of their chat partner.

If a computer is mistaken for a human more than 30% of the time during a series of five minute keyboard conversations it passes the test. No computer has ever achieved this, until now. Eugene managed to convince 33% of the human judges that it was human.

– University of Reading

The test was first envisioned by Alan Turing in 1950, in a paper called Computing Machinery and Intelligence. What has since been commonly called the Turing test, Turing himself called the imitation game. He presented it with an analogy:

It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either ‘X is A and Y is B’ or ‘X is B and Y is A’. The interrogator is allowed to put questions to A and B thus: C: Will X please tell me the length of his or her hair? Now suppose X is actually A, then A must answer. It is A’s object in the game to try and cause C to make the wrong identification. His answer might therefore be ‘My hair is shingled, and the longest strands are about nine inches long.’

– Turing, Computing Machinery and Intelligence

One player tries to fool the interrogator, while the other tries to help the interrogator come to the right conclusion. Next, Turing asks: ‘What will happen when a machine takes the part of A in this game?’

To make the test fair for the machine – to rule out “beauty competitions” in Turing’s words – the test is administered through a text-based interface. There is an interrogator and two competitors, one human and one machine. The interrogator questions both competitors, who both try to convince the interrogator that they are human.

Now, what does it mean if a computer passes the imitation game?

The test is Turing’s attempt to replace a very hard question, ‘Can machines think?’, with an easier one: ‘Are there imaginable digital computers which would do well in the imitation game?’

Turing places special importance on digital computers as the most promising candidates for passing the test, but he was also keen to “rid ourselves of a superstition” that there is a qualitative difference between digital computers and other kinds of computing. Digital computers tend to be fast, but any operation carried out by digital means could be carried out by other means as well, just more slowly.

In theory, the operations computers perform could be done by hand with pen and paper, only extremely slowly. For the imitation game, however, speed of answering is crucial for the imitation to work, so digital computers seem like the only practical candidates.

Turing formulates the question as follows:

Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?

In light of Eugene’s performance we can answer this positively. Computers can satisfactorily play the part of a human, and we have no reason to believe they will not do continually better at this task as computing power increases and programs improve. The 30% success rate used in the University of Reading test also comes from a prediction Turing made:

I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10⁹, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent. chance of making the right identification after five minutes of questioning.

So the 30% figure hailed as an important milestone in all the coverage of Eugene is simply Turing’s prediction of how computing would advance. He was overly optimistic: we are 14 years late in fulfilling his guess. Going forward, we can fairly safely predict that the success rate of programs like Eugene will keep improving, perhaps soon reaching the 50% mark, at which point the interrogator does no better than picking the human at random.
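Turing’s 70% phrasing and the press coverage’s 30% phrasing are two sides of the same criterion. A minimal sketch of that arithmetic, with the verdict counts (10 of 30 judges fooled) chosen only to reproduce the reported 33% figure, not taken from the actual Reading protocol:

```python
# Turing's criterion: an average interrogator has no more than a 70%
# chance of making the right identification after five minutes.
# Equivalently, the machine fools the judges at least 30% of the time.

def passes_turing_threshold(fooled: int, total: int, threshold: float = 0.30) -> bool:
    """Return True if the machine's deception rate meets the threshold."""
    deception_rate = fooled / total
    return deception_rate >= threshold

# Eugene Goostman reportedly fooled 33% of the judges, e.g. 10 out of 30.
print(passes_turing_threshold(10, 30))  # 10/30 ≈ 33% ≥ 30% → True
print(passes_turing_threshold(8, 30))   # 8/30 ≈ 27% < 30% → False
```

Note that at a 50% deception rate the two formulations coincide with pure chance: the interrogator’s identification is no better than a coin flip.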

What practical consequences will this have? We can expect smarter and smarter chat bots, eventually capable of holding a conversation in a manner similar to humans. They will be able to hold their own in a conversation substantially longer than the five minutes mentioned by Turing.

Like any new technology, smart chat bots will be put to two uses: good and bad. A specialized bot can answer customer complaints, handle reservations, take orders, and do all kinds of similar tasks where a human connection is required or preferred but the task is relatively simple. Over time, this reservation will slowly relax as bots become more intelligent and able to handle more complex tasks.

The same principle applies to bots put to bad use. We are bombarded daily with emails that try to fool us into believing they come from a legitimate sender, when the intention is actually malicious. As soon as chat bots are able, someone will put them to the task of fooling us in real-time conversations.

The first princes from exotic countries promising untold riches in exchange for a small advance fee will not fool many, but as the software advances, so does the risk. A significant share of security breaches is still due to social engineering, and once bots learn to impersonate people, the havoc they can wreak multiplies immensely.

Now, all of this still assumes benevolent or malicious humans using software as a tool for their own purposes. But what about intelligence itself: bots that mimic consciousness? There is an important distinction to be made between consciousness and the appearance of consciousness. The Turing test tests only the latter: any system able to imitate consciousness and fool a human is deemed conscious. But is that enough? For Turing it seemed to be, but he acknowledged that others might not agree:

We cannot altogether abandon the original form of the problem, for opinions will differ as to the appropriateness of the substitution and we must at least listen to what has to be said in this connexion.

Turing was right to expect that his substitution of ‘Can machines think?’ with ‘Can machines imitate thinking?’ might elicit objections. He lists several possible objections (interestingly, Turing sees extra-sensory perception, like telepathy, as a possible argument against machines possessing consciousness), but one is of special interest here. Turing calls it the argument from consciousness and quotes Professor Jefferson as presenting it:

Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain – that is, not only write it but know that it had written it. No mechanism could feel (and not merely artificially signal, an easy contrivance) pleasure at its successes, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or depressed when it cannot get what it wants.

This argument takes several forms, one of which was popularised by the philosopher John Searle as the Chinese room argument. I won’t go over it in detail, but the central point of the argument is that

Computational models of consciousness are not sufficient by themselves for consciousness.

In other words, Searle would say that Turing is only describing what consciousness does, not how it works. “Programs are neither constitutive of nor sufficient for minds,” as Searle puts it. A model explaining how a program might imitate a human in the imitation game is not sufficient to show that the program is actually conscious.
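Searle’s point can be caricatured in a few lines of code. The sketch below is my own deliberately crude illustration, not Searle’s formulation: the “rule book” is a plain lookup table, and the “person in the room” follows it mechanically. The system can emit sensible-looking Chinese replies even though no part of it understands Chinese, or anything at all.

```python
# A caricature of the Chinese room: replies are produced by pure
# symbol lookup. Nothing in the system understands the symbols.

RULE_BOOK = {
    "你好吗?": "我很好, 谢谢。",    # "How are you?" -> "I am fine, thanks."
    "你叫什么名字?": "我叫尤金。",  # "What is your name?" -> "My name is Eugene."
}

def chinese_room(symbols: str) -> str:
    # Follow the rule book mechanically; fall back to a stock reply,
    # much like a chat bot deflecting questions it cannot parse.
    return RULE_BOOK.get(symbols, "请再说一遍。")  # "Please say that again."

print(chinese_room("你好吗?"))  # 我很好, 谢谢。
```

A real chat bot replaces the lookup table with a far larger program, but on Searle’s view the situation is unchanged: more sophisticated symbol manipulation is still only symbol manipulation.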

Searle’s argument echoes other philosophers of mind, like David Chalmers, who argues that in order to understand consciousness we need to solve two types of problems: the easy problems and the hard problem. Easy problems are specific questions about how consciousness works: how it categorises impressions, integrates data, and focuses attention. These are easy because they can be answered by looking at the mechanisms that handle those specific tasks. Understanding those mechanisms lets you understand that part of consciousness.

In comparison, the hard problem can’t be answered in this manner. Chalmers explains:

It is undeniable that some organisms are subjects of experience. But the question of how it is that these systems are subjects of experience is perplexing. Why is it that when our cognitive systems engage in visual and auditory information-processing, we have visual or auditory experience: the quality of deep blue, the sensation of middle C? How can we explain why there is something it is like to entertain a mental image, or to experience an emotion? It is widely agreed that experience arises from a physical basis, but we have no good explanation of why and how it so arises. Why should physical processing give rise to a rich inner life at all? It seems objectively unreasonable that it should, and yet it does.

Applied to Turing’s example, Chalmers argues that even if we created machines that passed the Turing test, something would be lacking. We might manage to make machines that behave like humans, but they would never know what it is like to feel human, or like anything at all. They would lack a subjective experience of what it feels like to be me, also called phenomenal consciousness in philosophy of mind.

Some philosophers of mind don’t agree, and argue instead that the hard problem of consciousness is simply a collection of easy problems that will eventually be solved. Regardless of whether you side with Chalmers or his critics, it is probably clear that while Eugene Goostman is an impressive step forward in AI, the problem of computer consciousness is far from solved.

Turing gave a suggestion for going forward in his 1950 paper. Eugene Goostman follows one part of Turing’s suggestion; maybe following the other would lead to further breakthroughs:

It can also be maintained that it is best to provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English.