I’m going to talk about translation and bots, and how thinking about translation can help you think about bots. And before I start, I’d like to tell you a little bit about myself, so you know where I’m coming from. But I don’t have that much time, so I’m going to condense this into three hashtags, as I’ve learned to do. The first hashtag is #ComputationalLinguistics. That’s what I do. That’s my job, and I’m studying it. My second hashtag is #GermanLanguage. That’s one thing that especially interests me, and it’s also my excuse if I screw something up during my talk. And my third hashtag is #academia‽, which I’m having very conflicted feelings about, so I added a little interrobang there. You now know pretty much everything there is to know about me. So I think I’ll start with the talk.

My talk has three sections. The first one is about bots and languages. The second one, I thought I’d talk about humans and languages. And the third section is about the Turing Test and about diversity.

Bots and Languages

I’ll start by telling you what inspired me to do this talk. It was an article in a German paper that I read a few weeks ago about two bots talking to each other. The authors made the bots talk to each other by copying and pasting the output of the bots into each other’s input fields, which is a pretty good idea. But the commenters on the article were very unhappy with it. Something was off. They weren’t content with the phrases that the bots used, and they thought the bots were just saying random stuff.

I was really confused by this, because the bots that were used for the article are very famous. They were Rose and Mitsuku, and they’ve both won prizes, the Loebner Prize. Probably some of you are familiar with that. So I thought it was a bit odd that people were so unhappy with those bots. And I thought about why this was. And I think I have an idea what caused this negative reaction.

As I see it, the bot was designed like this. Some humans were training the bot with some data, with a corpus maybe, and language rules, and they always had a specific type of human in mind, a reader who reads the output of the bot. And they were always assuming that the person who reads the output is the same type of person as the developers. I think that’s a very easy thing to assume. Everyone’s nodding yes, thank you.
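
To make that picture a bit more concrete: bots of this kind are typically built from hand-written pattern and response rules, in systems such as AIML or ChatScript. Here is a minimal Python sketch of that idea; the patterns and canned answers are invented for illustration rather than taken from Rose or Mitsuku, but they show how the developers’ own frame of reference ends up written directly into the responses.

```python
# Minimal sketch of a pattern/response chatbot, in the spirit of
# rule-based systems such as AIML or ChatScript. The rules below are
# invented for illustration; they are not taken from Rose or Mitsuku.
import re

RULES = [
    # The developer writes both the trigger and the canned answer, so the
    # answer inevitably reflects the developer's own cultural frame.
    (re.compile(r"who won the election", re.I),
     "Some people say the president stole the election."),
    (re.compile(r"\bhello\b", re.I),
     "Hey! How's it going?"),
]

FALLBACK = "Tell me more."

def respond(user_input: str) -> str:
    """Return the first matching canned response, or a generic fallback."""
    for pattern, response in RULES:
        if pattern.search(user_input):
            return response
    return FALLBACK

if __name__ == "__main__":
    print(respond("Who won the election?"))  # prints the canned, idiom-laden answer
```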

And the reason this didn’t work in the article that I read is that I’m a German speaker, and the bots weren’t programmed to speak German. They were programmed to speak English. So when I read the article, it looked a bit different. It looked like this:

There was something in between. There were not just humans training the bot and humans reading the output; there was a translator in between. And in case you can’t tell, this is a different type of person than this one. And it’s someone from a different culture and from a different language community. And so everything felt a little bit different.

And one reason it felt different was that the people who trained the bot, with this picture in mind, had added some idiomatic phrases to the database. So for example, one of the bots said something about the president having shoplifted the election. That’s what it sounded like in German. Of course, the original phrase was “stealing the election,” and that doesn’t exist in German. So everyone who read the German article was just confused by this. Why would the bot talk about shoplifting and elections? That doesn’t make any sense.
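
To see how an idiom gets lost, here is a toy sketch of purely literal, word-by-word translation. It is much cruder than what the article’s human translator did, and the German glosses are my own guesses rather than the article’s wording, but it shows how a phrase that only works as an idiom in English stops making sense once each word is carried over on its own.

```python
# Toy word-by-word "translation" of a bot response. The glossary is a made-up
# lookup table, not a real translation system, and the German glosses are
# guesses for illustration only.
GLOSSARY = {
    "the": "die",
    "president": "Präsident",
    "stole": "klaute",      # colloquial gloss that reads like petty theft or shoplifting
    "election": "Wahl",
}

def literal_translation(sentence: str) -> str:
    """Translate word by word, ignoring idioms, grammar, and context."""
    words = sentence.lower().rstrip(".").split()
    return " ".join(GLOSSARY.get(word, word) for word in words)

# "Stole the election" is idiomatic in English; carried over word by word,
# the result is grammatically off and, more importantly, lands on a verb
# that suggests shoplifting, which is roughly the confusion described above.
print(literal_translation("The president stole the election."))
```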

So that’s a problem. I’m not really sure how to avoid this. Maybe we can just do without all these cultural references and idiomatic phrases, and just use normal words for our bots, so the bot will be understood by everyone. Bad idea, right? It’s not going to be that easy. Sorry.

Navigating Languages

So let’s talk a bit about how humans navigate languages. You probably know this quote by Wittgenstein: “The limits of my language mean the limits of my world.” I’m not a big fan of this idea in particular, and I know many linguists aren’t fans of this. But I’m interested in the other way around. So let’s switch this and talk about how the limits of my world mean the limits of my language.

I’m a human being, as most of you are. And I have some personal experiences, and I have a cultural background that always influences what I’m saying. And they always inform my linguistic choices. I’m always going to use language that harmonizes with my version of reality as I experience it. And that’s not a big problem when we’re talking just about mistranslations of idioms, because it doesn’t really harm anyone. It’s not hurtful, or anything. But of course there are different situations where that might be a bit more critical, and I brought you some examples of how that might be dangerous.

The first one is a set of German terms for the word “refugee.” There are many other terms for it; I just brought you four of them. They have very different connotations. A very neutral one is “Flüchtling,” but if you want to, you can use it with a slight negative connotation. The next one, “Asylant,” is just an insulting term; it came to be that way historically. Then we have some more neutral ones like “Asylsuchende” or “Refugee.” “Refugee” is actually the most positive word when you’re talking about refugees in German. It’s mostly collocated with the word “welcome,” so “refugees welcome” is the set phrase. And I’m always going to talk about this topic informed by my own feelings and opinions about it. So I can’t really talk about refugees in German without transporting those feelings and opinions, because I have to choose one of those words.
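
One way to picture this is as a lexicon in which every entry comes with its connotation attached, as in the sketch below. The connotation labels follow the descriptions above; the selection rule is only a placeholder, because that is exactly the point: whatever rule a bot uses to pick a term, the chosen word itself communicates a stance.

```python
# Sketch of the point that there is no neutral surface form: every German term
# a bot could pick for "refugee" carries a stance. Connotation labels follow
# the talk; the selection rule is just a placeholder for illustration.
LEXICON = {
    "Flüchtling":   "common; neutral, but can carry a slight negative tinge",
    "Asylant":      "historically loaded, insulting",
    "Asylsuchende": "fairly neutral, administrative register",
    "Refugee":      "the most positive in German usage, as in 'refugees welcome'",
}

def realize_refugee_term(lexicon: dict[str, str]) -> str:
    """A generator has to commit to exactly one surface form."""
    # Whatever policy goes here (first entry, corpus frequency, a style guide),
    # the chosen word itself communicates an attitude to the reader.
    return next(iter(lexicon))

print(realize_refugee_term(LEXICON))  # -> "Flüchtling", plus everything it connotes
```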

Another example that I brought is also political. It’s about territorial naming disputes. I think Martin was talking about this a few weeks ago. Which of those terms would a bot use to talk about these geographic areas? They always have the same referent, the words that are on the slide, but they have different connotations, and for territorial naming disputes, that’s informed by a political reality: what you believe, whose land this really is.

So, I’m a human and I have opinions, and I use language that fits my opinions and my feelings. A bot isn’t a human, so a bot can’t choose. But my language will shape my bot. So, I have a feeling about a particular political topic, I’m going to use language that fits this topic, and I’m going to teach my bot to use that same type of language. Might be a problem, right? Because if a bot talks about refugees, what type of opinion or stance do I want to communicate? It’s not that great. And you can’t really use “normal” words, like I suggested, because there are no normal words, because humans are different and have different opinions.

The Turing Test

Now I’m going to talk about what all of this has to do with the Turing Test. You all know the Turing Test. It’s always in the media; whenever anything happens with bots, you’re always going to read this question: “Does the bot sound human enough?” It’s the essential question of the Turing Test. And I have a problem with this, because it implies that being a human or sounding human is objective in some way. There is a default human being, and a bot just has to imitate this human being. And I think that’s not a good way of looking at it.

So, the commonly asked question is, “Does this bot sound human?” And the question that I think is a little bit more interesting is: why do so many bots that win the Loebner Prize sound pretty much exactly the same? They’re really similar to each other. Maybe the people who design these bots all have a particular type of default human being in mind. But if so, who is this particular mysterious default human being? Has anyone met them, maybe?

Let’s have a look at the profile of one of the bots from the article. This is Rose, and the profile is taken from chatbots.org. It says, “Rose is a twenty-something computer hacker, living in San Francisco.” Is this a default human being? It probably is, for very many people who design chatbots, because the tech community isn’t really that diverse. It’s getting there slowly, but there’s still this idea of a default human being, and people transport it when they build bots. And I’m unhappy with this.

I had a look at the recent winners of the Loebner Prize, and I noticed that there were three types of personalities that the developers gave their bots, because personalities always score more points. The people who pick the winner of the prize actually ask the bot about its personality: where it comes from, who its parents are, and so on. And the personalities that occurred in the last few years were Americans; aliens or robots, so not human at all; and then I found one single instance of a bot that wasn’t American or alien, and that one was presented as a thirteen-year-old Ukrainian boy. And I think that’s a bit shocking. If you have a look at this list, people are either American, not human, or children. That’s not really representative, right? So I oppose this idea of a default human being, and I oppose this idea of making bots sound like Americans, aliens, or children.

So I’m going to ask every one of you who makes bots to think about what different types of personalities you might give your bots. Maybe you know some slang words. Maybe you know Yiddish or Arabic or whatever. And I think it’s worth it to make the bot landscape more diverse. So I’m going to leave you with this message: make more diverse bots.

Thank you.

Further Reference

Esther has posted a summary of this presentation at her web site, as well as her slides.

Darius Kazemi’s home page for Bot Summit 2016.