Turing test: what is it and why is it so difficult to pass?

Text
Artyom Luchko

The University of Reading in Britain announced with great fanfare that "an important milestone in the history of computing" had been reached: a computer had passed a genuine Turing test for the first time, fooling judges into believing they were talking to a 13-year-old Ukrainian boy. Look At Me looked into what was really behind this event.

What was the experiment


University of Reading, which performed the first successful Turing test

The chatbot test was organized by the School of Systems Engineering at the University of Reading to mark the 60th anniversary of Alan Turing's death. Each judge communicated simultaneously with a living person and with a program, which were in different rooms. At the end of the test, each judge had to declare which of the two interlocutors was a person and which was a program. For the purity of the experiment, five computers and 30 judges were involved, and each judge conducted a series of 10 written dialogues lasting 5 minutes each. By comparison, the annual Loebner Prize competition for artificial intelligence programs (in which programs compete to pass the Turing test for a prize of $2000) usually involves only 4 chatbots and 4 people. As a result of the experiment, the program Eugene Goostman managed to convince 33% of the judges of its "humanity", which the organizers said had happened for the first time in history. Robert Llewellyn, one of the judges, a British actor and technology enthusiast, said:

The Turing test was amazing. There were 10 sessions of 5 minutes each, 2 screens, 1 human and 1 machine. I guessed correctly only 4 times. This robot turned out to be a smart guy ...

The chatbot Eugene Goostman was developed by Vladimir Veselov, a native of Russia who now lives in the USA, and Evgeniy Demchenko, a Ukrainian living in Russia. The first version appeared back in 2001. The teenager's age was not chosen by chance: at 13, a child already knows a lot, but not everything, which complicates the judges' task. In 2012 the chatbot had already come close to success: 29% of the judges believed in the "humanity" of the Ukrainian schoolboy. In the latest round of improvements, the programmers tried to prepare the virtual interlocutor for as many questions as possible and even trained it to pick up examples of responses from Twitter.

What is the Turing test, and what are its disadvantages


Alan Turing at the age of 16

The Turing test was first proposed by the British mathematician Alan Turing in his 1950 article "Computing Machinery and Intelligence" in the journal Mind. In it, the scientist asked a simple question: "Can machines think?" In its simplest form, the test is as follows: a person interacts with one computer and one person. Based on the answers to the questions, he must determine who he is talking to: a person or a computer program. The task of the computer program is to mislead the person into making the wrong choice. In the version used at Reading, the test involves a five-minute text dialogue, and at least 30% of the judges must believe that they are dealing with a person, not a machine. Of course, none of the test participants can see each other.
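
The pass criterion quoted above comes down to simple arithmetic. Here is a minimal sketch, assuming each verdict is recorded as a boolean (True if the judge took the program for a human). The function names and the sample tally are illustrative and are not taken from the Reading experiment.

```python
def fooled_fraction(verdicts):
    """Share of verdicts in which the program was taken for a human."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    return sum(verdicts) / len(verdicts)


def passes_threshold(verdicts, threshold=0.30):
    """True if the fooled share reaches the threshold (30% by default)."""
    return fooled_fraction(verdicts) >= threshold


# Hypothetical tally: 10 of 30 verdicts mistook the program for a human.
verdicts = [True] * 10 + [False] * 20
print(f"{fooled_fraction(verdicts):.0%}", passes_threshold(verdicts))
```

With this made-up tally the share is one third, which clears the 30% bar; a tally of 2 out of 10 would not.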


John Searle, American philosopher

There are many versions of this test (in some variations the judge knows that one of the interlocutors is a computer, in others he does not), and many scientists and philosophers criticize it to this day. The American philosopher John Searle challenged the test with his thought experiment known as the "Chinese Room". He argued that a computer's ability to hold a conversation and answer questions convincingly is far from the same as having a mind and thinking like a person. “Suppose I am locked in a room and [...] that I don’t know a single Chinese word, in writing or spoken,” Searle wrote in 1980. He imagined receiving questions written in Chinese through a slot in the wall. He could not read the characters, but had a set of instructions in English that allowed him to respond to "one set of formal characters with another set of formal characters." Thus Searle could, in theory, answer the questions simply by following the English rules and choosing the correct Chinese characters, and his interlocutors would be convinced that he could speak Chinese.

Most critics of the Turing test as a measure of artificial intelligence share this opinion. They argue that computers merely apply sets of rules and huge pre-programmed databases of answers in order to seem intelligent.

How the program deceived the jury


University of Reading Professor Kevin Warwick

Two factors helped Eugene Goostman pass the test. First, the grammatical and stylistic mistakes the machine makes in imitating a teenager's writing; second, its lack of knowledge of specific cultural and historical facts, which can also be attributed to the boy's age.

"The success of the program is likely to raise some concerns about the future of information technology," said University of Reading professor Kevin Warwick. "In the development of artificial intelligence, there is no more iconic and controversial milestone than passing the Turing test, when a computer convinces a sufficient number of judges that they are communicating with a person, not a machine. The very existence of a computer that can trick a person into thinking it is a person is a wake-up call about cybercrime." The Turing test remains an important tool in the fight against this threat. Experts now have to understand more fully how the emergence of such advanced chatbots can affect online communication.

Judging by the logs that can be found on the web (I still have not been able to try the bot myself, probably because the site could not withstand the surge of traffic and went down), the chatbot is quite primitive and, at first glance, does not differ much from similar programs available on the Internet. One curious dialogue with "Eugene" was published by the journalist Leonid Bershidsky, who asked it uncomfortable questions about a high-profile event that a young Odessa resident could hardly have missed.

Even taking into account the well-developed character and biography, and the mistakes and typos that a real teenager might make, the bot's credibility is questionable. In practice it simply responds to keywords, and when confused it falls back on pre-prepared and not very original stock answers. If the program could use search engines to keep up with current events, we might see a much more impressive result. That will probably take time. The renowned futurist Ray Kurzweil, now a director of engineering at Google, said earlier that computers would be able to pass the Turing test by 2029. By his estimate, by that time they will have mastered human language and surpassed human intelligence.



The Turing test is an empirical experiment in which a person communicates with a computer program that models human-like responses.

The test is considered passed if the person communicating with the machine believes that they are communicating with a person, not a machine.

In 1950 the British mathematician Alan Turing devised this experiment by analogy with the imitation game, in which two people go to different rooms and a third person, communicating with them in writing, must work out who is where.

Turing suggested playing this game with a machine: if the machine can mislead an expert, it would mean that the machine can think. The classic test therefore follows this scenario:

The human expert communicates via chat with the chatbot and other people. At the end of the conversation, the expert must understand which of the interlocutors was a human and who was a bot.

Today the Turing test has many modifications. Consider some of them:

Reverse Turing test

The test consists of performing some action to confirm that you are human. For example, we often have to type the numbers and letters shown in a distorted image into a special field. This protects the site from bots. A machine that passed this test would demonstrate the ability to read complex distorted images, but so far no such machine exists.

Immortality test

The test consists of reproducing a person's personality traits as closely as possible. If a person's character is copied so faithfully that it cannot be distinguished from the original, the immortality test is considered passed.

Minimum Intelligent Signal Test

The test allows only a simplified form of answers to questions: yes or no.

Turing meta test

The test assumes that a machine "can think" if it can create something that it itself wants to test for intelligence.

The first passing of the classic Turing test was recorded on June 7, 2014 by the Eugene Goostman chatbot, developed in St. Petersburg. The bot convinced the experts that they were communicating with a 13-year-old teenager from Odessa.

In general, machines are already capable of a great deal. Many specialists are now working in this direction, and more interesting variations and attempts at this test await us.

"Eugene Goostman" was able to pass the Turing test and convince 33% of the judges that they were not communicating with a machine. The program posed as a thirteen-year-old boy named Eugene Goostman from Odessa and convinced the people who spoke to it that its answers belonged to a person.

The test took place at the Royal Society of London and was organized by the University of Reading, UK. The authors of the program are Russian engineer Vladimir Veselov, who currently lives in the United States, and Ukrainian Evgeny Demchenko, who now lives in Russia.

How did the Eugene Goostman program pass the Turing test?

On Saturday, June 7, 2014, a program named Eugene tried to recreate the intelligence of a thirteen-year-old teenager, Eugene Goostman.

The testing, organized by the School of Systems Engineering at the University of Reading (UK), involved five programs. The test was a series of five-minute written dialogues.

The developers tried to prepare the bot for as many questions as possible and even trained it to collect examples of dialogues via Twitter. In addition, the engineers gave the character a vivid personality. Posing as a 13-year-old boy, the virtual "Eugene Goostman" did not arouse the experts' suspicion. They accepted that a boy might not know the answers to many questions, because an average child knows significantly less than an adult. Moreover, his correct and accurate answers were attributed to unusual erudition.

The test involved 25 "hidden" people and 5 chatbots. Each of the 30 judges conducted five chat sessions, trying to determine the real nature of the interlocutor. For comparison, the traditional annual Loebner Prize competition for artificial intelligence programs involves only 4 programs and 4 hidden people.

The program with the "young Odessa resident" first appeared back in 2001. However, only in 2012 did it show a really serious result, convincing 29% of the judges.

This progress suggests that programs able to pass the Turing test without difficulty may appear in the near future.

Today there is probably no one who has not heard, at least once, of the Alan Turing test. Yet most people are probably far from understanding what this testing system is. So let us dwell on it in more detail.

What is the Turing test: basic concept

Back in the late 1940s, many scientists were working on the problems of the first computers. It was then that a member of the Ratio Club, an informal group engaged in research in cybernetics, asked a completely logical question: is it possible to create a machine that would think like a person, or at least imitate human behavior?

Is it necessary to say who invented the Turing test? Apparently not. The whole concept rests on the following principle, which is still relevant: can a person, after communicating for some time with an unseen interlocutor on completely arbitrary topics, determine who is in front of him, a real person or a machine? In other words, the question is not only whether a machine can simulate the behavior of a real person, but also whether it can think for itself. This issue remains controversial to this day.

History of creation

In general, if we consider the Turing test as a kind of empirical system for determining the "human" capabilities of a computer, it should be said that statements by the philosopher Alfred Ayer, formulated back in 1936, served as an indirect basis for its creation.

Ayer compared, so to speak, the life experiences of different people, and on this basis expressed the opinion that a soulless machine cannot pass any test, since it cannot think. At best, it will be pure imitation.

In principle, this is so. Imitation alone is not enough to create a thinking machine. Many scientists cite the example of the Wright brothers, who built the first airplane by abandoning the tendency to imitate birds, a tendency that, incidentally, was characteristic even of a genius like Leonardo da Vinci.

History is silent on whether Turing himself (1912-1954) knew about these postulates. Nevertheless, in 1950 he devised a whole system of questions that could determine the degree of "humanity" of a machine. This work is still one of the fundamental ones, although now it is applied to testing, for example, computer bots. In practice, only a few programs have managed to pass the Turing test. And even "pass" is a stretch, since the test result has never reached 100 percent; in the best case it has been just over 50.

At the very beginning of his research, the scientist relied on his own invention, now known as the Turing machine. Since all conversations were supposed to be conducted exclusively in printed form, the scientist set several basic directives for writing answers, such as moving the tape to the left or right, printing a certain character, etc.
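
The directives mentioned above (move the tape left or right, print a character) are the basic operations of what is usually called a Turing machine. Here is a minimal sketch of such a machine, assuming a hypothetical rule table that flips every bit on the tape. The names `run` and `FLIP_RULES` are illustrative and do not come from Turing's work.

```python
def run(rules, tape, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape machine until it reaches the 'halt' state."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write                 # print a symbol
        head += 1 if move == "R" else -1    # move along the tape
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical rule table: (state, symbol read) -> (write, move, next state).
FLIP_RULES = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run(FLIP_RULES, "1011"))  # prints 0100
```

The machine reads one cell at a time, so everything it "knows" sits in the current state and the rule table.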

ELIZA and PARRY programs

Over time, the programs became more complex, and two of them, in situations where the Turing test was used, showed stunning results at that time. These are ELIZA and PARRY.

ELIZA, created in 1966, had to find a key word in the question and compose a response based on it. This is what made it possible to deceive real people. If there was no such word, the machine returned a generalized answer or repeated one of the previous ones. However, whether ELIZA really passed the test is still in doubt, since the people who communicated with the program had been primed psychologically to assume in advance that they were talking to a person, not a machine.

PARRY is somewhat similar to ELIZA, but was created to simulate the conversation of a paranoid patient. Most interestingly, real clinic patients were used to test it. Transcripts of the teletyped conversations were then evaluated by professional psychiatrists. Only in 48 percent of cases were they able to tell correctly which was the person and which was the machine.

In addition, almost all programs of that time worked with fixed response intervals, since in those days a person thought much faster than a machine. Now it is the other way around.
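
The keyword mechanism described above can be sketched in a few lines. This is a minimal illustration, not Weizenbaum's actual script: the keyword table, the fallback phrase and the function name `reply` are all assumptions made for the example.

```python
import re

# Hypothetical keyword table; Weizenbaum's real script was far richer.
KEYWORD_REPLIES = {
    "mother": "Tell me more about your family.",
    "dream": "What does that dream suggest to you?",
    "sad": "Why do you think you feel sad?",
}
FALLBACK = "Please go on."


def reply(message):
    """Return the canned reply for the first known keyword, else a fallback."""
    for word in re.findall(r"[a-z']+", message.lower()):
        if word in KEYWORD_REPLIES:
            return KEYWORD_REPLIES[word]
    return FALLBACK


print(reply("I had a strange dream last night"))
print(reply("The weather is nice"))
```

The second call contains no known keyword, so the sketch returns the generalized answer, which is the behavior the text attributes to the program.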

Supercomputers Deep Blue and Watson

The developments of the IBM corporation were also quite interesting: they did not exactly think, but had incredible computing power.

Many probably remember how in 1997 the Deep Blue supercomputer won a six-game chess match against the then reigning world champion Garry Kasparov. Strictly speaking, the Turing test applies to this machine only conditionally. From the very beginning it was loaded with many game templates covering an enormous number of possible continuations. The machine could evaluate about 200 million positions on the board per second!

The Watson computer, which consisted of 360 processors and 90 servers, won the American game show Jeopardy!, beating the other two contestants in every respect and earning a $1 million prize. Again, the case is debatable: huge volumes of encyclopedic data were loaded into the machine, which simply analyzed each question for keywords, synonyms or generalized matches and then gave the correct answer.

Eugene Goostman emulator

One of the most interesting events in this area was the program created by the Ukrainian Evgeny Demchenko and the Russian engineer Vladimir Veselov, now living in the United States, which imitated the personality of a 13-year-old boy from Odessa, Eugene Goostman.

On June 7, 2014, the Eugene program showed its full potential. Five bots and 30 real people took part in the testing. In 33% of cases the jury was unable to determine that it was talking to a computer. The judges' task was complicated by the fact that a child has less knowledge than an adult.

The Turing test questions were mostly general; however, Eugene was also asked some specific questions about events in Odessa that no resident could have failed to notice. Still, the answers suggested that the jury was facing a child. For example, the program answered the question about its place of residence immediately. When asked whether it had been in the city on a particular date, the program said that it did not want to talk about it. When the interlocutor pressed for details of what exactly had happened that day, Eugene brushed the question off, saying, in effect, that the interlocutor should know for himself, so why ask? In general, the child emulator turned out to be extremely successful.

Nevertheless, it is still an emulator, not a thinking creature. So the machine uprising won't take place for a very long time.

On the other hand

Finally, it remains to add that so far there are no prerequisites for the creation of thinking machines in the near future. Nevertheless, whereas recognition used to be a problem posed to machines, now almost every one of us has to prove that we are not a machine. Just look at the captchas you must enter on the Internet to gain access to some action. So far, it is believed that no electronic device can recognize mangled text or a set of characters as a person can. But who knows, anything is possible ...

When World War II begins, the scientist hurries to Bletchley Park, to the Government Code and Cypher School. There he joins the specialists working on deciphering messages created with the legendary German Enigma machine, whose ciphers the Nazis used for their radio messages. Within the walls of the school, Turing invents a unique device, the Turing Bombe.

The machine, three meters long and weighing two and a half tons, cracked codes in a matter of minutes, and the British authorities received accurate information about enemy movements. Although the film was well received by critics, it does not reveal all of Alan Turing's scientific achievements. It's a pity ... This talented scientist worked on morphogenesis for a long time and even described mathematically the process of self-organization of matter. In addition, Turing is the author of an abstract computing machine, the great-grandfather of modern computers. And the scientist was one of the first to think seriously about the interaction of synthetic and living minds.

In 1950, when laboratories in many countries were trying to develop the first computer programs, he attracted the attention of the world community with his article "Computing Machinery and Intelligence", which appeared in the pages of the journal Mind. The essence of the material was as follows. The Briton suggested replacing the question "Do machines think?" with the equivalent "Can machines do what we do?". In this case, as Turing argued, there would be a clear boundary between intellectual and physical capabilities. Alan gave a simple test as an example. The subject must communicate in parallel with a person and with a computer. The conversation is conducted not orally, face to face, but in writing, blind, using a keyboard. In the mathematician's day, computers were not yet so fast and powerful, so the exchange took place at fixed time intervals. The pauses evened out the reaction speed, and it became extremely difficult to tell who was who. The test was passed if the machine was mistaken for a living person.

Many believed that Turing, in carrying out his research, was terribly pessimistic and that the prospect of machines coming to power did not please him in the least. There is, however, evidence to the contrary. For example, the scientist's friend Robin Gandy often recalled that when Turing went through passages of his work for the hundredth time, he kept smiling and even giggling. Be that as it may, his work became an important milestone on the path of convergence between computer and person, and in effect a first analysis of this field. Later, experts would continue experimenting and devise various ways for an electronic thinker to fool a person.

So, in 1966, the American scientist Joseph Weizenbaum announced the creation of a virtual interlocutor, the computer program "Eliza". It was supposed to imitate a psychotherapist. Why did Weizenbaum focus on this particular medical area? Because here it is easy to answer a question with a question. In addition, the semantic load of such exchanges is relatively small, there are no lengthy sentences, and thoughts are easily structured into a single system. When giving advice, “Eliza” did not philosophize, but simply paraphrased the interlocutor's speech. It looked something like this:

Subject: I have a headache.

Eliza: Why do you say you have a headache?
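
The paraphrasing in the exchange above can be imitated by swapping first-person words for second-person ones and wrapping the result in a question. This is a minimal sketch under that assumption; the `REFLECTIONS` table and the `paraphrase` function are hypothetical and far simpler than the original program.

```python
import re

# Hypothetical first-to-second-person swaps; a real script needs many more.
REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}


def paraphrase(statement):
    """Turn a statement into a question by swapping pronouns."""
    words = re.findall(r"[a-z']+", statement.lower())
    reflected = [REFLECTIONS.get(word, word) for word in words]
    return "Why do you say " + " ".join(reflected) + "?"


print(paraphrase("I have a headache."))  # Why do you say you have a headache?
```

No understanding of headaches is involved: the sketch only rearranges the words it was given.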

At times the test subjects fell into the trap and sincerely believed that they were talking to a real doctor. But there were also funny moments. From time to time during the experiment, people realized that the electronic doctor did not understand the essence of their questions. When it could not find a suitable answer, "Eliza" usually concluded "I see ..." and steered the dialogue to another topic. Joseph Weizenbaum wrote about his program in his book "Computer Power and Human Reason: From Judgment to Calculation":

In a sense, Eliza was an actress with a certain technique, but she had nothing to say herself. The script, in turn, was a set of rules that allowed the actor to improvise on whatever material he had.

Six years later, in 1972, another American, Kenneth Colby, released a similar program called PARRY, designed to copy the behavior of a paranoid schizophrenic. To test the new invention, Colby conducted an amusing experiment. He invited professional psychiatrists to examine two groups of patients: real ones and virtual ones generated by the PARRY program. Communication was carried out by teletype. The transcripts of the conversations were later shown to another team of psychiatrists. Then the two medical teams tried to determine which of the subjects was a human and which was a machine. As a result, the correct decision was made in only 48% of cases, which meant that the machine had managed to deceive the doctors. It is noteworthy that Eliza and PARRY were destined to meet each other. The rendezvous was organized over the ARPANET. The dialogue between the electronic doctor and the patient lasted for several minutes.

Now let's move to another sphere, say, music. This is not mathematics, geometry or physics, where everything obeys dry numbers. Here you need flights of imagination, talent and, most importantly, inspiration. Without these three components, the birth of a good work, something that penetrates deep into the soul, is impossible. More precisely, it was impossible until meticulous researchers at the University of Malaga, in southern Spain, invented the Iamus music computer. Named after Iamus, a son of Apollo, it writes scores said to be comparable in complexity to Gershwin or Orff. First, the computer generates simple, short rhythmic phrases, or "genomes". Then they begin to evolve and gradually take the form of a full-scale classical composition. The developers, drawing on the Turing test, tried out their system on professional musicians. The authors of Iamus let the musicians listen to several pieces: some created by the computer and some by real composers. The listeners then had to establish which was which. The most interesting thing is that the test led the respondents to a dead end. A work composed by a synthetic mind was practically indistinguishable from a man-made one.

The catch was that Iamus's compositions evoked the same emotions: sadness, joy, laughter, tears. Therefore, most of the subjects were never able to decide and give an exact answer. They usually said they didn't know.

Specialists at the University of Cambridge expected a similar response from their test subjects. Within the walls of their alma mater, British linguists and programmers tried to teach computers to compose Japanese haiku. Now tell me: how can you believe this is a machine if it is capable of creating such a thing?

Everything was fine yesterday

And now everything is covered -

This is the essence of Windows.

In developing his test, Alan Turing argued that if scientists invented artificial skin and gave it to machines, it would hardly make them more human. A computer is a computer that thinks in folders and files. Nevertheless, specialists have long been working in this direction. For example, mechanical engineer John-John Cabibihan of Qatar University developed a soft silicone polymer that, when heated to 36.6 degrees Celsius, resembles real skin, then 3D-printed an artificial hand and wrapped it in the new material. Then he ran a simple test. Cabibihan seated the participants with their backs to him and touched their shoulders either with his own hand or with the artificial model. The respondents were unable to tell the difference.

However, despite numerous studies and attempts to bring computers closer to people, the Turing test was officially passed only in 2014. This became possible thanks to the program Eugene Goostman, created by Vladimir Veselov, a native of Russia, and Evgeniy Demchenko of Ukraine. The experiment consisted of a series of short dialogues with five computers. In the course of them, the jury had to guess whether they were talking to machines or to real people. The test counted as passed if the computer fooled the judges for a third of the allotted time. The brainchild of Veselov and Demchenko managed just that, with a figure of 33%. The artificial intelligence conversed on behalf of a fictional thirteen-year-old from Odessa, Eugene Goostman, who "claims to know everything in the world, but because of his age does not know anything." It was he who was taken for a living person. Skeptics, however, immediately called the passing of the Turing test questionable. After all, Eugene Goostman was only a chatbot. Therefore, in their opinion, the question of whether machines can do what we do remains open. However, you can look for an answer yourself by talking to bots on the Internet. Today the World Wide Web is full of them, from online games to social networks. If Alan Turing were alive in the 21st century, he would certainly have organized such a demonstration.

The Turing test is an empirical test, the idea of which was proposed by Alan Turing in the article "Computing Machinery and Intelligence", published in 1950 in the philosophical journal "Mind". Turing set out to determine whether a machine can think.

The standard interpretation of this test is as follows:

“A person interacts with one computer and one person. Based on the answers to the questions, he must determine who he is talking to: a person or a computer program. The task of a computer program is to mislead a person into making the wrong choice.”

None of the test participants can see each other. If the judge cannot say for sure which of the interlocutors is a human, the machine is considered to have passed the test. In order to test the intelligence of the machine, and not its ability to recognize spoken speech, the conversation is conducted in a "text only" mode, for example, using a keyboard and a screen (an intermediary computer). Correspondence should be conducted at controlled intervals so that the judge cannot draw conclusions based on the speed of responses. In Turing's time, computers responded more slowly than humans. Now this rule is necessary because they react much faster than humans.
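
One way to picture the "controlled intervals" rule is to hold every reply until the next fixed time boundary, so that a fast program and a slow human look the same to the judge. Here is a minimal sketch of that idea; the 10-second interval and the function name `padding_delay` are assumptions for illustration and are not part of any official test protocol.

```python
def padding_delay(time_taken, interval=10.0):
    """Seconds to wait so a reply is delivered on the next interval boundary."""
    if time_taken < 0 or interval <= 0:
        raise ValueError("time_taken must be >= 0 and interval > 0")
    remainder = time_taken % interval
    return 0.0 if remainder == 0 else interval - remainder


# A program that answers in 0.2 s and a human who needs 7.5 s
# both have their replies released at the 10-second mark.
for taken in (0.2, 7.5):
    print(taken + padding_delay(taken))
```

In a real setup the intermediary computer would sleep for the returned number of seconds before showing the reply to the judge.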

As of 2009, none of the existing computer systems came close to passing the test.

Computer program simulating a psychotherapist

From Roger Penrose's The Emperor's New Mind

... Most in keeping with the Turing test is a computer program developed by K. M. Colby in the mid-1960s, which mimics a psychotherapist so successfully that some patients have been found to prefer it to a human therapist and to tell it more about themselves than they would a living doctor. As an example of such communication between a computer "psychotherapist" and a patient, we present the following excerpt from a dialogue (which was conducted via a teletypewriter):

Therapist: How can I help you?

Patient: I don't know ...

Therapist: Try to tell me something about it.

Patient: No.

Therapist: Can you find out the reason?

Patient: I am tired today and do not feel well.

Therapist: Why? Do you have any suggestions on this?

Patient: Yes ... I'm old.

Therapist: Please continue.

Patient: It seems to me that I have lost interest in everything that is happening.

Therapist: Tell me more about your feelings.

Patient: I have some thoughts on this ...

Therapist: You have considerations.

Patient: I think I lack a family.

Therapist: What makes you worry about the lack of family?

Patient: There is a challenge in this ...

Therapist: Why is it so important to you?

While this dialogue may give the strange feeling that the computer understands the patient's problem, there is really nothing of the kind: the computer simply follows fairly simple mechanical rules. (There is also a "reverse" situation, implemented in a system where a computer simulating a person suffering from schizophrenia gives textbook answers and simulates the corresponding symptoms so well that it can fool some medical students into believing that they are talking to a real living patient!)