Has the Turing test now been passed?

by Tyler Cowen on June 8, 2014 at 1:08 pm in Current Affairs, Science, Web/Tech

A programme that convinced humans that it was a 13-year-old boy has become the first computer ever to pass the Turing Test. The test — which requires that computers are indistinguishable from humans — is considered a landmark in the development of artificial intelligence, but academics have warned that the technology could be used for cybercrime.

…Eugene Goostman, a computer programme made by a team based in Russia, succeeded in a test conducted at the Royal Society in London. It convinced 33 per cent of the judges that it was human, said academics at the University of Reading, which organised the test.

It is thought to be the first computer to pass the iconic test. Though there have been claims that other programmes have succeeded, those involved topics or questions set in advance.

A version of the computer programme, which was created in 2001, is hosted online for anyone to talk to. (“I feel about beating the turing test in quite convenient way. Nothing original,” said Goostman, when asked how he felt after his success.)

The computer programme claims to be a 13-year-old boy from Odessa in Ukraine.

So far I am withholding judgment.  There is more here, lots of Twitter commentary here.  By the way, here is my 2009 paper with Michelle Dawson on what the Turing test really means (pdf).

Alexey June 8, 2014 at 1:19 pm

I am surprised to see this coming out of Russia. There is some formidable engineering talent, but practically no computer science.

derek June 8, 2014 at 1:34 pm

Maybe Russians have long experience in the difficulty of convincing the apparat that you are human.

Chris Hallquist June 8, 2014 at 1:21 pm

I’m having trouble seeing how this is different from past Loebner Prize “wins.” Would really like more background on this particular iteration. In fact, when I first heard this, I wondered if this *was* about the Loebner Prize, and the silver for that contest had been awarded or something.

andrew' June 8, 2014 at 1:26 pm

“Nothing original”

Definitely a 13 yo boy. Creation touted in groundbreaking research barely looks up from minecraft to utter “meh.”

In Soviet Russia, Turing test you!

Nedkom June 8, 2014 at 1:55 pm

Odessa is not in Ukraine

Jamie_NYC June 8, 2014 at 3:59 pm

Right on. Are you also pretending to be 13 years old?

DK June 8, 2014 at 9:23 pm

Maybe he can see future?

Bryan Willman June 8, 2014 at 1:59 pm

The real problem with widely known current understandings of a “Turing test” is that while they are interesting and point out one area for research with important implications, they completely sidestep some other questions, some of which may be more important. (And while a machine that can “pass” the Turing test with, say, 75% of the population over any field of discourse would be a great technical achievement with serious economic consequences, it would be one of those chasms which, once crossed, proves to be a necessary but unexciting and insufficient step to the next goal.)

In particular, Turing’s early work points out that you cannot write a program that reliably decides whether an arbitrary other program will halt or not. (This is not controversial.) But humans do things which, it can be argued, violate this rule (sometimes by changing the question on the fly – for example, there’s a large class of programs for which the question “does it halt?” must be answered “it depends…”).
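
A minimal sketch of that standard argument, in Python purely for illustration (the halts function below is the hypothetical universal decider the theorem rules out, and contrary is the self-referencing program that defeats it):

def halts(program, program_input):
    """Hypothetical oracle: True iff program(program_input) eventually halts."""
    raise NotImplementedError("no total, correct decider like this can exist")

def contrary(program):
    """Do the opposite of whatever halts() predicts program does when fed itself."""
    if halts(program, program):
        while True:      # predicted to halt, so loop forever
            pass
    return "halted"      # predicted to loop, so halt immediately

# Feeding contrary to itself forces the contradiction: if halts(contrary, contrary)
# returns True, then contrary(contrary) loops forever; if it returns False, then
# contrary(contrary) halts. Either way halts() is wrong about at least one program.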

So, by this (perhaps wrong) argument, *IF* humans can do tasks that computers cannot do (special, exotic tasks), then human brains and computers (at least in the general sense) are different in a fundamental way. But this would never be revealed or refuted by something like the general-population case of the imitation game.

One could imagine a special sort of imitation game test in which, rather than a “wise observer,” it is “the spirit of Gödel” or “Church” or even “Turing” trying to decide whether a web-only collaborator on a computability proof is really a human or a computer.

Or suppose humans have a communication link to some entity outside the solar system – how do we decide whether it is a “living species” or rather an artificial avatar that may have outlived the original species?

A Berman June 8, 2014 at 3:19 pm

The great thing is that you can never prove that a human can do something that a computer can’t, because such a proof would immediately turn into a recipe for writing a computer program to do that same thing.
This is also why arguments about free will are doomed to failure – free will is indistinguishable from randomness from the point of view of any observer, since in both cases the observer cannot predict the output.

suntzuanime June 9, 2014 at 12:57 am

You are obviously not very well-versed in computer science and have only a layperson’s understanding of the Halting problem. The idea that any human could decide whether *every* program halts is absurd – to take an obvious counterexample, there are some programs so long that the human would not finish reading them before dying of old age. Humans can decide whether many programs halt, but it’s not hard to write a program that can decide whether many programs halt as well. Indeed, for any arbitrary length X, it’s possible to write a program that can reliably decide whether any program of up to length X will halt. So there is in fact no specific program whose halting a human can decide and a program cannot.

Looking for a magic distinction between man and machine in the Halting Problem is barking up the wrong tree.
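
For what it’s worth, the length-X claim is the standard non-constructive one: there are only finitely many programs of length at most X, so a correct answer table for them exists even though no general procedure can fill it in. A rough sketch of that idea, with HALT_TABLE as a hypothetical, assumed-given table rather than anything this code computes:

# Hypothetical finite answer table: one entry per program of length <= X,
# mapping its source text to whether it halts (on, say, empty input).
HALT_TABLE = {
    # "program source": True (halts) / False (runs forever)
    # ...finitely many entries, assumed given...
}

def bounded_halting_decider(program_source, x):
    """Decide halting for any program of length <= x by table lookup."""
    if len(program_source) > x:
        raise ValueError("program exceeds the bound this decider covers")
    return HALT_TABLE[program_source]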

ummm June 8, 2014 at 2:39 pm

A human and a computer with advanced AI can both carry out simple conversation and solve simple math problems, so I fail to see the big deal here. The technology for emulating human behavior has been around for a while.

Colin June 8, 2014 at 2:45 pm

1 in 3 judges is “passing the Turing test”? Consider me thoroughly unimpressed.

Willitts June 9, 2014 at 1:22 am

Same here.

How many Turing Tests of random judges would it take for 33% to rate it a pass?

BTW, I had trouble structuring that question, but you know what I mean. I’m too drunk to spend further time getting it right.

Dan Weber June 9, 2014 at 8:07 am

The judges aren’t random. If you submit something that gives random answers you will probably get a 0% acceptance rate.

The question isn’t how good the judges are, it’s how good the programs are at fooling the judges. Any judge can get 50% accuracy just by flipping a coin, but the Turing Test is working in the opposite direction.
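
To put a rough number on Willitts’ question above: if, purely for illustration, 30 independent judges each flipped a fair coin to decide “human or machine,” the chance that at least a third of them would label the bot human is about 98%. A quick check (the judge count is an assumption, not the actual event’s setup):

from math import comb

def prob_at_least(k, n, p):
    """P(at least k 'human' verdicts out of n judges, each voting 'human' with probability p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_judges = 30                   # illustrative judge count, not the actual event's
threshold = -(-n_judges // 3)   # smallest count that is at least a third: 10 of 30
print(prob_at_least(threshold, n_judges, 0.5))   # ~0.98

So a 33% bar only means something if the judges do much better than coin-flippers, which is the point above.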

Jan June 8, 2014 at 3:08 pm

From the one “Goostman” quote provided in the article it seems like the fact that it was supposed to be a boy from Ukraine could have made it much easier to dupe the judges. That is, a response in somewhat flawed/strangely constructed English could be expected from a non-native speaker. Doesn’t that kind of throw off the ability to evaluate this program? I’m less worried about it being designed as a 13-year-old.

Steve Sailer June 8, 2014 at 5:19 pm

“I feel about beating the turing test in quite convenient way. Nothing original.”

That sounds like something you’d program a computer to say to try to pass the Turing Test. Are we sure that “Goostman” is a real human being?

Careless June 8, 2014 at 11:39 pm

Yeah, my first thought as well. You can hide a lot of stupidity behind broken English

Larry June 8, 2014 at 5:23 pm

Chimps yesterday, computers today. What else will turn out to be smarter than a fifth grader this week?

Andreas Moser June 8, 2014 at 7:12 pm

Does it speak Russian or Ukrainian?

Jan June 8, 2014 at 9:31 pm

You’re joking, but it is a question I actually considered. Odessa is mostly ethnic Ukrainian with a large Russian minority, but Russian is the dominant language of the city. That, plus the fact that Eugene Goostman has the characteristics of a typical post-Soviet Jewish name, a group that primarily speaks Russian, leads me to believe he is in fact a Russian speaker. Also, the guys who programmed him are Russian.

Marie June 8, 2014 at 8:06 pm

“So something passes the Turing Test if it shows at least one conversation between two sentient A.I.s where they aren’t talking about a man?”

This made me almost able to stand Twitter.

I don’t understand this whole idea. When people say AI, do they mean artificial *intelligence* or do they mean faked intelligence?

I could make a random word generator, or set up chimps with typewriters, and every once in a while a sentence would come out that could have come from a human. Does that mean that my generator or my chimp is, just for that sentence, intelligent? Why is 30% anything? Wouldn’t it only be potential actual intelligence if it could pretty much pass the test all the time, with everybody? Maybe I should go look this up…

TMC June 8, 2014 at 8:17 pm

And this is about a 13 yr old boy. You could write random phrase 1, random phrase 2 and “boobies”, and get your 33%.

Marie June 8, 2014 at 9:13 pm

Oh, now, that’ll have me laughing for several days!

Willitts June 9, 2014 at 1:25 am

From his tweets, I’m beginning to think my 13-year-old nephew is either a computer program or Ukrainian.

Nikki June 8, 2014 at 8:31 pm

Is the coverage part of the test? Judging by Prof. Kevin Warwick’s clarity of expression, he may well be a computer, too.

Also, from the comments there: “I’m not sure it’s passing the Turing Test if you have to invent a back-story to explain why it can’t pass the Turing Test.”

Elijah June 8, 2014 at 9:02 pm

By the standards of these clowns Quake bots passed the Turing Test.

Joël June 8, 2014 at 9:15 pm

Ridiculous. The program doesn’t pass a one minute Turing test. Try it. I wonder who were the judges…

Robert Merkel June 8, 2014 at 9:23 pm

Reading Turing’s original paper, this is clearly subverting the original test. Turing’s paper shows imagined transcripts about the subtler points of poetry, not a teenage non-native speaker feigning confusion at the earliest opportunity.

As at least one wag has already pointed out, if I write a program that prints random garbled text at intervals and claim it’s imitating a two-year-old, does that program win too?

JoeDog June 8, 2014 at 10:05 pm

The Turing test was passed years ago by one of my colleagues. He wrote an IRC bot that was hyper-critical of George W. Bush. A Bush supporter argued with it while we laughed ourselves silly.

Vernunft June 8, 2014 at 11:51 pm

You wrote Michael Moore? TAKE IT BACK

Willitts June 9, 2014 at 1:28 am

What’s remarkable is that a liberal with BDS is indistinguishable from programmed rants.

msgkings June 9, 2014 at 3:14 pm

Not as remarkable as the fact that a conservative with ODS is indistinguishable from a liberal with BDS.

Oh wait, they are opposite in sign only. They are indistinguishably boring and useless though.

Ian June 8, 2014 at 10:21 pm

Logs or it didn’t happen ;-)

Urstoff June 9, 2014 at 12:05 am

So it was a marginal performance (33% of what kind of judge? People off the street are easy to fool with computer programs; seasoned Turing test judges are nearly impossible to fool) of an easy version of the easy version of the Turing test. Not only did it not have to play the man/woman imitation game (the version of the test proposed by Turing), it didn’t even have to try to seem like an adult.

Of course, any version of the Turing test is mostly useless. It’s just a very shallow operationalization of intelligence; Turing thought it wasn’t, because presumably you need a lot of knowledge and some high-level reasoning skills to fool people, but of course that’s not the case. You need linguistic misdirection and vague answers. It would be nice to see the transcripts of these tests.

Justin June 9, 2014 at 1:23 am

Are these judges randomly called in from time to time to test an actual human? And what are the results?

Daryl June 9, 2014 at 1:57 am

The link to the paper appears to be broken.

Urso June 9, 2014 at 10:42 am

Clicking the twitter link was enlightening. Turns out everybody makes the same jokes.

Peter M June 9, 2014 at 4:31 pm

Here is a reaction (unfavorable) to the initial press from a writer at the scientific organ BuzzFeed. http://www.buzzfeed.com/kellyoakes/no-a-computer-did-not-just-pass-the-turing-test

CMOT June 9, 2014 at 4:45 pm

This doesn’t seem a whole lot better than ELIZA was doing back in the ’60s. Wanna know more about ELIZA? Skip Wiki and ask Siri. I’m serious about that.

Any article about chatterbots vs. Turing tests that doesn’t mention this almost-50-year-old experiment is incomplete.

Here’s a sample interaction with ELIZA posing as a psychotherapist:

young woman: Men are all alike.
eliza: In what way?
young woman: They’re always bugging us about something or other.
eliza: Can you think of a specific example?
young woman: Well, my boyfriend made me come here.
eliza: Your boyfriend made you come here?
young woman: He says I’m depressed much of the time.
eliza: I am sorry to hear you are depressed.
young woman: It’s true. I’m unhappy.
eliza: Do you think coming here will help you not to be unhappy?
young woman: I need some help, that much seems certain.
eliza: What would it mean to you if you got some help?
young woman: Perhaps I could learn to get along with my mother.
eliza: Tell me more about your family.
young woman: My mother takes care of me.
eliza: Who else in your family takes care of you?
young woman: My father.
eliza: Your father.
young woman: You are like my father in some ways.
eliza: What resemblance do you see?
… there’s more …
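
ELIZA’s trick was essentially keyword spotting plus pronoun “reflection.” A toy Python sketch of that style of pattern matching (just the flavor of the technique, not Weizenbaum’s actual DOCTOR script):

import re

# Tiny ELIZA-style responder: match a keyword pattern, reflect pronouns,
# and slot the reflected fragment into a canned reply.

REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I", "your": "my"}

RULES = [
    (r"i need (.*)", "What would it mean to you if you got {0}?"),
    (r"my (mother|father|family)(.*)", "Tell me more about your family."),
    (r"i am (.*)", "I am sorry to hear you are {0}."),
    (r"(.*) all alike", "In what way?"),
]

def reflect(text):
    """Swap first and second person so the reply echoes the speaker ('my' -> 'your', etc.)."""
    return " ".join(REFLECTIONS.get(word, word) for word in text.split())

def respond(utterance):
    cleaned = utterance.lower().strip(" .!?")
    for pattern, template in RULES:
        match = re.match(pattern, cleaned)
        if match:
            return template.format(*(reflect(group) for group in match.groups()))
    return "Please go on."       # default when nothing matches

print(respond("Men are all alike."))   # -> In what way?
print(respond("I am unhappy."))        # -> I am sorry to hear you are unhappy.
print(respond("I need some help."))    # -> What would it mean to you if you got some help?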
