Gideon Lichfield: Hello everybody. Welcome to the session on Compassion Through Computation: Fighting Algorithmic Bias. I'm Gideon Lichfield, I'm the editor of MIT Technology Review. And I'm going to be moderating the discussion with two very interesting speakers. Joy Buolamwini, who is currently doing she says her fourth degree at MIT and is founder of the Algorithmic Justice League, and uses a variety of different forms of expression to examine the way in which algorithmic bias affects the AI technology that we use, and describes herself also as a poet of code. And then, Justine Cassell is the Associate Dean of Technology, Strategy, and Impact at the School of Computer Science at Carnegie Mellon.

I'm gonna ask Joy to speak first and then Justine, then we will have a Q&A. And you can ask questions by raising your hand, but if you feel like using technology, there is this app we are using called Slido. You can access it by this website, You can go in there, you can put in questions that you want to ask the speakers. Those will come up and then we will be able to select from them and lead the discussion. So, I'll ask Joy to step up first. Thank you Joy.

Joy Buolamwini: Hello, my name is Joy Buolamwini, founder of the Algorithmic Justice League, where we focus on cre­at­ing a world with more inclu­sive and eth­i­cal sys­tems. And the way we do this is by run­ning algo­rith­mic audits to hold com­pa­nies account­able.

I’m also a poet of code, telling stories that make daughters of diasporas dream, and sons of privilege pause. So today it’s my pleasure to share with you a spoken word poem that’s also an algorithmic audit called “AI, Ain’t I A Woman?” And it’s a play on Sojourner Truth’s 19th-century speech where she was advocating for women’s rights, asking for her humanity to be recognized. So we’re gonna ask AI if it recognizes the humanity of some of the most iconic women of color. You’re ready? [The Davos recording omits most of the corresponding visuals for Joy’s piece; the following is her published version.]


So there you are. [applause] And so what you see in the poem I just shared, which is also an algo­rith­mic audit, is a reflec­tion of some­thing I call the cod­ed gaze. Now, you might have heard of the white gaze, the male gaze, the post‐colonial gaze. Well, to this lex­i­con we add the cod­ed gaze, and it is a reflec­tion of the pri­or­i­ties, the pref­er­ences, and also some­times the prej­u­dices of those who have the pow­er to shape tech­nol­o­gy. So this is my term for algo­rith­mic bias that can lead to exclu­sion­ary expe­ri­ences or dis­crim­i­na­to­ry prac­tices.

So let me show you how I first encoun­tered the cod­ed gaze. I was work­ing on a project that used com­put­er vision. Didn’t work on my face until I did some­thing: I pulled out a white mask. And then I was detect­ed. So I want­ed to know what was going on, and I shared this sto­ry with a large audi­ence using the TED plat­form; over a mil­lion views. And I thought some­body might check my claims, so let me check myself.

And I took my TED pro­file image and ran it on the com­put­er vision sys­tems from many lead­ing com­pa­nies. And I found some com­pa­nies didn’t detect my face at all. But the com­pa­nies that did detect my face? labeled me male. I’m not male; I’m a woman, phe­nom­e­nal­ly. And so I want­ed to know what was going on.

Then I read a report coming from Georgetown Law showing that one in two adults, over 130 million people, have their face in the face recognition network that can be searched unwarranted, using algorithms that haven’t even been audited for accuracy. And across the pond in the UK where they actually have been checking how these systems work, the numbers don’t look so good. You have false match rates over 90%; more than 2,400 innocent people being mismatched. And you even had a case where two innocent women were falsely matched with men. So some of the examples that I show in “AI, Ain’t I a Woman,” or the TED profile image, they have real-world consequences.

And because of the real‐world con­se­quences, this is why I focused my MIT research on ana­lyz­ing how accu­rate sys­tems work when it came to detect­ing the gen­der of a par­tic­u­lar face. And so with the research we’re doing, it’s been actu­al­ly cov­ered in more than thir­ty coun­tries, more than 240 arti­cles, talk­ing about some of the issues with facial analy­sis tech­nol­o­gy.

So in order to assess how well these systems actually work, I ran into a problem. A problem that I call “the pale, male data issue.” And in machine learning, which are the techniques being used for computer vision (hence finding the pattern of a face), data is destiny. And right now if we look at many of the training sets or even the benchmarks by which we judge progress, we find that there’s an overrepresentation of men—75% male for this national benchmark from the US government. 80% lighter-skinned individuals. So pale, male data sets are destined to fail the rest of the world, which is why we have to be intentional about being inclusive.
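[Editorial sketch, not part of the talk: one way to check a data set for the kind of skew described above is to count the share of each label before training or benchmarking on it. The `composition` helper and the toy label list are hypothetical; the 75/25 split simply mirrors the figure quoted in the talk and is not the benchmark's actual data.]

```python
from collections import Counter


def composition(labels):
    """Return the fraction of the data set carried by each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy labels standing in for a benchmark's gender annotations.
genders = ["male"] * 75 + ["female"] * 25
shares = composition(genders)

for label, share in sorted(shares.items()):
    print(f"{label}: {share:.0%}")
```

[The same helper can be run over skin-type annotations, or over pairs of annotations, to see whether any subgroup is barely present.]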

So, the first step was mak­ing a more inclu­sive data set, which we did, called the Pilot Parliaments Benchmark, which was bet­ter bal­anced by gen­der and skin type. The way we achieved bet­ter bal­ance was by going to the UN Women’s web site, and we got a list of the top ten nations in the world by their rep­re­sen­ta­tion of women; Rwanda lead­ing the way, pro­gres­sive Nordic coun­tries in there, and a few oth­er African coun­tries as well. We decid­ed to focus on European coun­tries and African coun­tries to have a spread of skin types.

So final­ly with this more bal­anced data set, we could actu­al­ly ask the ques­tion, how accu­rate are sys­tems from com­pa­nies like IBM, Microsoft, Face++—a lead­ing billion‐dollar tech com­pa­ny in China used by the gov­ern­ment, when it comes to guess­ing the gen­der of a face?

So what do we see? The num­bers seem okay. 88, maybe you get a B with IBM. 94%, Microsoft is the best case over­all. And Face++ is in the mid­dle. Where it gets inter­est­ing is when we start to split it down.

So when we eval­u­ate the accu­ra­cy by gen­der we see that all sys­tems work bet­ter on male faces than female faces, across the board. And then when we split it by skin type, again we’re see­ing these sys­tems work bet­ter on lighter faces than dark­er faces.

Then we did some­thing that hadn’t been done in the field before, which was doing an inter­sec­tion­al analy­sis bor­row­ing from some of Kimberlé Crenshaw’s work on antidis­crim­i­na­tion law, which showed that if you only did single‐axis analy­sis, right, so if we only look at skin type, if we only look at gen­der, we’re going to miss impor­tant trends.

So, tak­ing inspi­ra­tion from that work, we did this inter­sec­tion­al analy­sis. And this is what we found. For Microsoft you might notice that for one group there is flaw­less per­for­mance. Which group is this? The pale males for the win! And then you have not-so‐flawless per­for­mance for oth­er groups. So in this case you’re see­ing that the darker‐skinned females are around 80%. These were the good results. Let’s look at the oth­er com­pa­nies.

So now let’s look at Face++. China has the data advan­tage, right, but the type of data mat­ters. And so in this case we’re actu­al­ly see­ing that the bet­ter per­for­mance is on dark­er males mar­gin­al­ly. Again, you have dark­er females with the worst per­for­mance.

And now let’s look at IBM. For IBM lighter males take the lead, again. Here you see that for lighter females there’s a dis­par­i­ty, right, between lighter males and lighter females, but lighter females actu­al­ly have a bet­ter per­for­mance than dark­er males. And cat­e­gor­i­cal­ly across all of these sys­tems, dark­er females had the worst per­for­mance. So this is why the inter­sec­tion­al analy­sis is impor­tant, because you’re not going to get the full spec­trum of what’s going on if you only do single‐axis analy­sis.
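[Editorial sketch, not part of the talk: the difference between single-axis and intersectional analysis can be illustrated in a few lines of Python. The field names (`skin`, `actual`, `predicted`), the helpers, and the counts are hypothetical toy values chosen for illustration; they are not the study's results.]

```python
from collections import defaultdict


def accuracy_by_group(records, keys):
    """Accuracy per subgroup, where a subgroup is the tuple of values for `keys`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in keys)
        total[group] += 1
        correct[group] += int(r["predicted"] == r["actual"])
    return {g: correct[g] / total[g] for g in total}


def make(skin, gender, n_right, n_wrong):
    """Build toy records: n_right correct and n_wrong incorrect predictions."""
    other = "male" if gender == "female" else "female"
    right = [{"skin": skin, "actual": gender, "predicted": gender}] * n_right
    wrong = [{"skin": skin, "actual": gender, "predicted": other}] * n_wrong
    return right + wrong


# Hypothetical toy data, NOT the audit's numbers.
records = (
    make("lighter", "male", 10, 0)
    + make("lighter", "female", 9, 1)
    + make("darker", "male", 9, 1)
    + make("darker", "female", 6, 4)
)

overall = accuracy_by_group(records, [])                      # one aggregate number
by_skin = accuracy_by_group(records, ["skin"])                # single axis
by_gender = accuracy_by_group(records, ["actual"])            # single axis
intersectional = accuracy_by_group(records, ["skin", "actual"])  # both axes together

print(overall, by_skin, by_gender, intersectional, sep="\n")
```

[In this toy data each single-axis split bottoms out at 75%, while the intersectional split shows one subgroup at 60%, a gap that the single-axis views average away.]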

Now we took it even further and we disaggregated the results of the darker females since that was the worst-performing group. And this is what we got. We got error rates as high as 47% on a task that has been reduced to a binary. Gender’s more complex than this, but the systems we tested used male and female labels, which means they would have a 50/50 shot of getting it right by just guessing. So for these systems we’re paying for, an audit actually shows performance that is marginally better than chance.
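[Editorial sketch, not part of the talk: the "marginally better than chance" comparison is simple arithmetic. Guessing uniformly between two labels is right half the time, so a 47% error rate leaves only a few points of margin. The function name is hypothetical.]

```python
def margin_over_chance(error_rate, n_classes=2):
    """Accuracy minus the expected accuracy of uniform random guessing."""
    accuracy = 1.0 - error_rate
    chance = 1.0 / n_classes
    return accuracy - chance


# 47% is the worst-case error rate quoted in the talk.
margin = margin_over_chance(0.47)
print(f"margin over chance: {margin:.2f}")
```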

So I thought the com­pa­nies might want to know what was going on with their sys­tems, and I shared the research. IBM was by far the most respon­sive com­pa­ny; got back to us the day we shared the research and in fact released a new sys­tem when we shared the research pub­licly. So first we gave the research pri­vate­ly to all of the com­pa­nies and gave them some time to respond.

So here you can see that there’s a marked improvement from 2017 to 2018. So for everybody who watched my TED talk and said, “Isn’t the reason you weren’t detected because of, you know, physics? Your skin reflectance, contrast, etc.” The laws of physics did not change between December 2017 when I did the study and 2018 when they launched the new results. What did change was they made it a priority. And we have to ask why.

So, this past summer you actually had an investigative piece that showed that IBM reportedly secretly supplied the New York Police Department with surveillance tools that could analyze video footage by skin type—skin color in this case, and also the kind of facial hair somebody had, or the clothing that they were wearing. So enabling tools for racial profiling. And then for The New York Times I wrote an op-ed talking about other dangers [of the] use of facial analysis technologies. We have a company called HireVue, for example, that says we can use verbal and nonverbal cues, according to their marketing materials, and infer somebody’s going to be a [good] employee for you. And how we do this is we train on the current top performers. Now, if the current top performers are largely homogeneous, we could have some problems.

So it’s not just a ques­tion of hav­ing accu­rate sys­tems, right. How these sys­tems are used is also impor­tant. And this is why we’ve launched some­thing called the Safe Face Pledge. And the Safe Face Pledge is meant to pre­vent the lethal use of facial analy­sis tech­nol­o­gy. (Don’t kill peo­ple with face recog­ni­tion, very basic.) And then also think­ing through things like secret mass sur­veil­lance, or also the use by law enforce­ment.

So far we have three companies that have come on board to say we’re committed to the ethical and responsible development of facial analysis technology. And we also have others who are saying we’ll only purchase from these companies. So if this is something that you’re interested in supporting, please consider going to the Safe Face Pledge site. And if you want to learn more about the Algorithmic Justice League, visit us at ajlunited.org. Thank you.

Justine Cassell: You’ve heard beautifully about one aspect of bias in companies today, and that’s algorithmic bias. In the same way we want to be taught by educators who represent us and we want politicians who represent us, we also want technology to represent us, in every sense of “represent.” We want it to look like us; we want it to mirror who we are; and we want it to stand up for us—share our values and preserve them.

But that real­ly hasn’t been the case, yet. And that’s what I’m gonna talk to you about. I’m gonna talk about the kinds of bias that are lead­ing to tech­nol­o­gy let­ting us down. I’m gonna talk about why it hap­pens, and I’m going to start to talk about what to do about it.

But first let me take you back to France in 1984. So in France in 1984 the government set up a terminology commission. It’s a very French thing. They’d had a whole bunch of terminology commissions before. And the goal of this one was la féminisation des noms de profession. That is, the invention of names for jobs traditionally done by men that may now or one day be done by women. Like doctor, professor, researcher, postwoman, and so forth.

And they released a report about the words they had invented—the neol­o­gisms that they had invent­ed for these new jobs. Like a woman doc­tor. Or a woman mail deliv­er­er, and so forth.

As the report was released, there was a counter-report. (And that’s also very French.) And this counter-report was released by the Académie Française, which sees itself as in charge of the French language, standing up for and ensuring that the French language remains pure. And what they said was, “This very impulse of yours is honorable, but it’s going to lead to barbarisms and segregation, and worse sexism than we had before, because it’s unreflective and unmotivated by research.”

Well to any researcher, that’s a dream. So I decid­ed to do an exper­i­ment to look to see who was right. Was the fem­i­niza­tion of the terms for jobs going to lead to more women mov­ing into those jobs, as the ter­mi­nol­o­gy com­mis­sion sug­gest­ed? And so I went to the US and I went to France, and I did an exper­i­ment with chil­dren of 8 or 9 years old. They’re start­ing to think about gen­der.

And I asked them what they would call this picture. This is a female truck driver. The word’s a little unfortunate. Luckily 8- and 9-year-olds don’t know this. Those of you who are French speakers know that camionneuse unfortunately already has a meaning that we might not want to use. But in this instance we asked them simply what they would call this person.

What they would call this person—a woman researcher: chercheuse. What they would call this person—une doctoresse, a woman doctor. And what they would call this person—une postière, a female mail deliverer.

And in fact what I found was that it was children who scored most highly on a test of stereotype bias, those children who had the narrowest beliefs about who could do what, that used the feminized terms. And that makes no sense, you might think. Why? And I asked them: why? Why would “doctoresse” not mean a woman doctor?

They explained “doctoresse” is kind of like a doctor but not a real doctor. That’s why we call her a doctoresse.

And that’s because lan­guage reflects life, and not the oth­er way round. And so when you want to change some­thing, you can’t sim­ply change the words. You can’t sim­ply change the pic­tures. You have to change soci­ety. And that’s what we’re gonna turn to now.

So I cofounded, with a number of other very smart people, a nonprofit called EqualAI.org. And I invite you to join us online and also to get to speak to Miriam Vogel, the new executive director who’s here in Davos. We realized that very well-intentioned people can do very nasty things. And we started this foundation, this nonprofit, with that in mind.

We looked at the stats. We saw that the num­bers of women going into com­put­er sci­ence are going down and not up. That par­ents say that they’d like their chil­dren to be com­put­er sci­en­tists so that they can earn more, but don’t want their girl chil­dren to take com­put­er sci­ence class­es. We know that in less‐resourced schools com­put­er sci­ence isn’t even taught.

And why is that? And what do we do about it? Why does it matter? A parent said to me, “I don’t understand you. Why would you want my girl to become a computer scientist? That’s a sucky profession. It’s a bunch of badly-dressed, non-washed, greasy-haired men eating Cheetos, and drinking Red Bull, and staying up all night alternating between writing code and playing video games.”

You see the prob­lem here? So, what I sug­gest­ed was I don’t want to make any girl become a com­put­er sci­en­tist. But I want every pro­fes­sion to be avail­able to girls, to peo­ple of col­or, to oth­er under­rep­re­sent­ed groups, those of dif­fer­ent abil­i­ties. Because if not, we’re going to ampli­fy the worst of our­selves in tech­nol­o­gy. We’re going to ampli­fy our abil­i­ty to kill. Our abil­i­ty to destroy. Our abil­i­ty to hate. And not the best of us. And it takes inten­tion­al­i­ty, and it takes work to cre­ate tech­nol­o­gy that ampli­fies the best of us. Human‐centered tech­nol­o­gy.

Now Joy spoke beautifully about one aspect of that human-centered technology that has not been intentionally created. That’s relied on what’s called a “convenience sample.” My students define “convenience sample” as your two office mates and the two office mates across the hall. And that’s not really what you want when you build a piece of technology. And the people who built those algorithms grabbed the first data set available, and it was the pale males.

But there are oth­er kinds of bias. As well as algo­rith­mic bias, there’s also bias in what the work­force looks like, and bias in what the tech­nol­o­gy looks like. And in all three are­nas, we are not rep­re­sent­ed. Neither what we look like, what we sound like, what our val­ues are. And what we want them to be.

And this all happens for a reason that is not intentional for the most part. And that is because of the psychological notion of stereotype. Now when we talk about stereotypes usually what we mean is negative beliefs about a person. But that’s not what a stereotype is in technical language. A stereotype is the ability to take in information, and rather than needing to take in the huge stream of information that comes at us every second, we grab a piece of it and extrapolate. We look at an eye, for example, and we say, “Oh. Totally. Olive-shaped eye…pseudo-ethnicity: Asian. Really good at math. Mmm…not so good at interpersonal relationships. Won’t argue in public.”

Now the first part of that, the Asian, that’s an extrap­o­la­tion from one lit­tle detail. What it allowed me to do was not look at the rest of that body, or hear that per­son talk, or have a con­ver­sa­tion with that per­son, but sim­ply to pattern‐match what’s in the world with what’s in my head. And pattern‐matching is a lot faster than tak­ing in infor­ma­tion.

But, as you saw, it has dangers. And here are some others. Because when we match a kind of person to a set of traits like “good at math” or “not willing to argue,” sometimes that’s fine. There’s a great old movie called Pillow Talk where Doris Day is talking to Rock Hudson, and he’s talking in a Southern accent. He turns out to be a con man. But she hears him speaking in his Southern American accent, she says, “He’s so cute. He’s so sweet. So naïve, so innocent.” And he’s not at all. But she extrapolated from that one datum to that.

Miley Cyrus thought it was okay to rep­re­sent her­self with slan­ty eyes. But by doing that she extrap­o­lat­ed to all of those oth­er traits. And that leads peo­ple in those negatively‐stereotyped groups to try and become like the norm. Like Joy wear­ing a white mask. This is an actu­al prod­uct for sale to keep your eye­lids look­ing Caucasian, non‐Asian. And that’s a very sad thing.

So, it leads to even worse things than that. For exam­ple, it turns out that when the hand hold­ing a cell phone is black, peo­ple are way, way more like­ly to see that phone as a gun. And this is unfor­tu­nate­ly even more true of police­men than it is of every­day peo­ple. So you can imag­ine a young per­son grab­bing a phone and being shot dead. And it unfor­tu­nate­ly has hap­pened way too often, and con­tin­ues to hap­pen.

And things like that lead to saying, “I’m looking for people to work on my team. I need people who are gonna succeed. People who are gonna succeed like I succeeded. I’m from a group that fits here. We need people who fit in.” And only a few weeks ago a friend of mine went for an interview in Silicon Valley, and when she didn’t get the job she said, “Can you tell me why?” And the hiring manager said, “You just don’t fit.”

Yeah, you’re right, I don’t. I don’t wear the same size. I’m not the same height. My skin happens to be a different color. And you need that. Why do you need that? Because a bro culture, a culture where everyone looks alike, which is what Silicon Valley looks like now, creates bro products. Our students from Carnegie Mellon come back and tell me—and this has changed over the last couple of years—that their bosses tell them to create for themselves. “Design technology that you would love.” Even narrowing that field of technology.

And yet we know that diver­si­ty in teams cre­ates inno­va­tion. That is, it has been shown, as defin­i­tive­ly as we can show any­thing, that the more per­spec­tives, the more dif­fer­ent points of view, dif­fer­ent kinds of peo­ple we have on a team, the more objec­tive tech­nol­o­gy inno­va­tion will be cre­at­ed.

This is not an exam­ple of that. This is an exam­ple of cre­at­ing a tech­nol­o­gy that fits a stereo­type. That is, Alexa is a ser­vant. In the same way that girls (those are young women), were asked to be phone oper­a­tors because they had soft voic­es and gen­tle tem­pera­ments, and could serve those peo­ple who use the phone, Alexa does the same thing. And that’s why in the UK, Siri had a male voice. Because there the male valet, or but­ler, was enough of a stereo­type to allow men to serve. But that dis­ap­peared in the face of US Alexa, and it’s now a female voice. This stereo­type gets more and more nox­ious. Taxi dri­vers in Germany refuse to have a GPS with a female voice. They refuse to take instruc­tions from her on how to dri­ve.

And an even uglier example, unintentionally for sure, comes from a paper on virtual tutors teaching children math. These are four “virtual tutors,” four representations. Children were allowed to choose—this is not my work. Children were allowed to choose whichever one they wanted. And they chose the one to use first that looked like them—same gender, same ethnicity. But they learned more math from the white male.

And that’s not sur­pris­ing. If you look at the rep­re­sen­ta­tion in these pic­tures, these are stereo­types of what a sci­en­tist sit­ting in his arm­chair believes black men and women and white women and white men look like.

Children who collaborated with the bi-dialectal virtual peer, speaking both African-American Vernacular English and Mainstream School English, did better at science than children who collaborated with a Mainstream School English-only virtual peer.

So we did an experiment. We built two versions of a piece of technology. They looked identical but one spoke the same dialect as the children we were working with. Took us two years to build a grammar of that dialect. And a dialect is just a language without an army and a navy. Two versions of that piece of technology. And children worked with the technology to do science. And it turned out that the children who worked with the technology that sounded like them learned more science. So this has real-world consequences that we have to pay attention to.

So what do we do? We’re at an inflec­tion point, and it’s both a risk and an oppor­tu­ni­ty. And this is a fourteen‐minute talk and not four­teen hours, and I’m hap­py to talk to any­one who wants to know more. But the inflec­tion point is that the future of work is not gonna be like the past of work. We have to grab hold of that. We have to have inclu­sion and diver­si­ty offi­cers on the team that dig­i­tizes the com­pa­ny. That builds the plat­forms, the per­for­mance algo­rithms, the met­rics, and the poli­cies that gov­ern change. It’s an oppor­tu­ni­ty. Because all of our com­pa­nies and our uni­ver­si­ties are in the mid­dle of change. And women, peo­ple of col­or, peo­ple of dif­fer­ent abil­i­ties, need to be at every stage of that change. Not just women engi­neers but women prod­uct design­ers. Women mar­keters. Unless we do that, we won’t have a diverse group like this.

And I want to say some­thing that doesn’t get said often enough. We need to invest in the pipeline. We need to make it okay for girls to be engi­neers with­out hav­ing to have greasy hair, like Cheetos, or drink Red Bull. It has to be okay. But, the pipeline is leak­ing just as bad­ly at the top. Senior women when they get to senior posi­tions are told that they’re dif­fi­cult. They’re hard to man­age. They’re just not right. And they don’t stay in those posi­tions because they’re kicked out.

So you can’t just hire women. You need to keep them. And to do that you need what’s called the cohort effect. Not one but a min­i­mum of three. Not one per­son of col­or but a min­i­mum of three. For any­one who’s seen The Intern, not one old man but a min­i­mum of three. And if you do that, if you have old­er women and younger women, then you’ll have role mod­els for peo­ple to look up to. You’ll have cohorts to talk to one anoth­er.

So we’ve talked about two kinds of bias. And here’s the third. One is the representation of us in the workforce. The second is the representation of us in the technology that we use, such as Siri and Alexa. And the third is algorithmic bias. If we pay attention to this, we can have a workforce that looks like the people that we’re building for. And all of us win if that happens. Thank you.

Gideon Lichfield: Okay. So thank you very much Justine and Joy. So, as a reminder, you can— I'm going to do some audience Q&A in a moment. And if you want to jot down some questions and put them in so they'll show up on my screen you can go to the But I will also take questions in the analog method.

But first I'm going to try to formulate a question… And I'm not sure if I'm going to formulate it very well, but I'll try to formulate a question that is to both of you that kind of encompasses what you were both talking about.

We are now…we're in an era and we're moving into an era of ever more personalization and optimization. And this is true in the algorithms that figure out what we might want to buy and how to sell it to us. And what clothing will fit us best and will match the purchases we've made in the past and so on. It's also going to be true of the software that is going to help employers make decisions about who they should look for and where they should recruit. It's in the software that is going to be increasingly used to monitor how people perform at work, and how they could perform better and more optimized. And inherent in all of that optimization and personalization, inevitably there are going to be correlations between certain kinds of behavior. And whether gender, or race, or socioeconomic class, or other things. So I think the question I'm trying to formulate is, how in this world of increasing optimization where the algorithms will be accurate… They'll increasingly be accurate. But their application could lead to discrimination. How do we stop that?

Justine Cassell: Do you want to go first?

Joy Buolamwini: Sure. So, accurate algorithms can be abused but we always have to remember that accuracy is always relative. And as we learn more the systems that we thought would be more precise might not actually be doing what we think. So let's take the example of precision medicine. The right medicine for the right person, at the right time. And when you look at what some startups are doing, they're saying okay we have all of this clinical data. Let's train on that so we can make better predictions.

In the US case it wasn't even until 1993 that women and people of color were mandated to be part of clinical trials. If you look at cardiovascular disease, one in three women die of this, but less than a quarter of research participants are women, and the way in which heart disease manifests in women and manifests in men is not necessarily the same.

So as a result you might think you're getting more and more accurate but it could be just for a small sliver of society. So I'm always thinking about what does full-spectrum inclusion look like? And if we're talking about precision, precisely who are we benefiting and precisely who are we harming?

Lichfield: Mmm, Justine.

Cassell: That example's a great example, because one of the things we know about research participants is that low socioeconomic status citizens often don't want to be part of research. Because they don't want their data to be stolen. They don't want people to make money off of their personal data. And that's a tension, because if their data isn't used; they own it, they've kept it; but they're also not part of that data set. And it's a complicated question.

But I want to talk about another aspect of personalization. What I thought you were going to say was aren't we going to build more and more instances of technology? A white pale male, a white woman who's old, a white woman who's young. And there I was going to say that in my own work I've been working very hard to build gender ambiguous, ethnicity ambiguous, representations of technology. And what we find is—and we've done this mostly for children—is that children attribute their own ethnicity and their own gender-binary gender to those pieces of technology. In fact the only gendered piece of technology I've ever built, that looks like a woman, is SARA (for any of you who interacted with her two years ago), the socially-aware robot assistant. Other than that, all of our work has gender-ambiguous names, ethnicity-ambiguous names, and looks, so as to not fall into stereotypes that I know I have as a scientist.

And I was going to stand up here and make you all take the implicit association test to have you all see the way I've seen the stereotypes, the noxious stereotypes, you carry with you. The only thing that we can do is realize them, and then decide what to do about them. And until we do that, we're capable—I'm capable—of hopefully nothing as really ugly as those math tutors, really negative as those math tutors. But stereotypes nonetheless and so I try and stay away from personalization in the realm of what technology looks like.

Lichfield: Do you feel like the companies that're building these algorithmic tools have gotten better about issues of bias since you and other people started raising it? And also, other than IBM how did the other companies react when you approached?

Buolamwini: Sure, so we got a range of reactions. One reaction was no response. Another was a very cautious corporate response. We have a new paper that's coming out where we look to see if our process actually made a change. And so after we did the Gender Shades research which showed racial bias, that showed a gender bias and it also showed intersectional bias, all the companies that we audited within seven months made significant improvements.

So then we decided to look at companies that we didn't audit in the first place. So we didn't include Amazon, for example, but Amazon is selling their technology to law enforcement. And we found out that technology has racial bias, has gender bias, and it was close to the level of the companies a year before. So here we're seeing the companies that were checked, right, are trying to make some kind of improvement. And then the companies that weren't weren't held to the fire. So it definitely makes a difference. Some people are paying attention, not enough people are paying attention, but we'll keep watching.

Lichfield: Right, so that then rai— Oh, sorry. Justine.

Cassell: I was going to say since I build personal assistants I've looked a lot at the personal assistants on the market, and over the last six months I've done a lot of press interviews. And I'll leave the company nameless but I can say that one company, you would tell the personal assistant to go away and the personal assistant would reply, "Why? Can't we just stay friends?" And I quoted this in the interview and it disappeared a week later. So, we can do good but we shouldn't have to follow companies around letting them know they're being watched.

Lichfield: Right. So who in the end should be doing this? I mean, what is the role of law and regulation in setting how algorithms should work, especially—and this is like a perpetual problem now with technology—especially when the tech itself is moving way faster than lawmakers and legislators can.

Buolamwini: Absolutely. We have to definitely think about the maturity of a specific technology. So going back to facial analysis. With what we know about the flaws it's completely irresponsible to use it in a law enforcement context, yet we see companies selling it in that space. So I believe one of the first legislative steps that we can take, as others have called for, are moratoriums until we have a better understanding of the real social impact of some of these technologies.

Also making sure that there's something called affirmative consent, where we know if I'm going into an interview and they're going to analyze my face, I have some way of pushing back. There's some kind of due process and I have to say yes. Right now you might have opt-out, at best, but oftentimes you're channeled into these systems. So that's a place where you could have regulation come in to increase transparency but also have affirmative consent.

Cassell: I'm very American in my perspective. Regulation's okay but I'd like to diminish it as much as possible and rely on education. And I think this is an area where we don't educate our young people. We don't educate our old people very well about it either. And we need to introduce into the discussion about AI, into the education of AI researchers, a lot more information about society. We talk about human-centered computing. We're starting to talk about human-centered AI. That includes a lot more than building a tool that can recognize when you're having trouble walking and sending a wheelchair. It means knowing who people are; how they are; how they behave, with the finesse and the depth of precision that's needed in order to design for them in a truly justice-oriented way.

Lichfield: But that requires a whole lot more data on them.

Cassell: It does, and we have those data.

Lichfield: Right.

Cassell: We're not missing the data, we're not using them. We're using my two officemates and the two guys across the hall: convenience sample.

Lichfield: Right. But that's what I mean, is because we need that much more data then it raises that many more privacy questions.

Buolamwini: And we are missing a lot of data. So let's go and look at the Human Genome Project, right. So, let's say people who are African or of African diaspora, about 20% of the global population. Less than 5% of what we have for the genome represents those populations. So you have severe underrepresentation of many different kinds of Asians as well. So we do have intense data gaps, but we also have to think about agency when people get to decide if they participate or not.

So earlier you were talking about various people in lower socioeconomic status not trusting. But there's also a reason for not trusting. If you have a case where you have a population with syphilis and you don't tell them, as happened with the Tuskegee experiments, there are very valid reasons not to trust—

Cassell: To go ahead believing that. Yeah, for sure.

Buolamwini: Absolutely.

Lichfield: There's a question here from someone in the audience, Abinav? Is Abinav in the room? No? Okay. Because otherwise I would have them explain their question, but as it is it doesn't really quite make sense to me. Are there any other questions in the room here?

Audience 1: Yes. A comment. I'm very much in Joy's camp in terms of thinking about how the data's actually used. Because I think in terms of a computer, having the nuance to make some of the distinctions based upon the data that it has, I don't know that computers are there yet. So even for like, my son who has more privilege than he knows what to do with, we tell him when he goes out, "Don't wear a hoodie," you know. "Don't do this, don't do that," because a computer isn't going to say you're a black boy of privilege. It's going to say you're a black boy who's six two who looks like this. Right?

Cassell: Stereotype.

Audience 1: So there's a lot of nuance that needs to go behind the data. And I tell him you know, "You're going into a world where the data might be used in a police situation, right. And the computer's going to make a very quick decision because the computer didn't go to school with a guy like you. The computer's not making that nuanced decision." And that's something that I think that we need to work on, and that needs to be part of the discussion.

The second thing I would just say is you know, when we look and say we need to have more people of color, more women, more diversity in the programmers and the people who are doing these algorithms so that they're thinking about these other people and the distinctions, that's a long-term play. And I'd be interested to know what are the things that we need to do now to start to make those corrections?

Cassell: Love that question. So, CMU this year attained 50% of the entering class in computer science as women. [Buolamwini snaps fingers] Yeah. And there's this beautiful mural… And it's only in the women's room and I've been talking to them about this. But it says "Computer science: an old boys club? Not at Carnegie Mellon." And I love that. It took us…years. Not a very very long period of time, but a few years to accomplish that. But we accomplished it by investing in the pipeline young. In going into grade schools, and middle schools, and high schools, and working with children, ensuring— Jane Margolis has written a beautiful book called Stuck at the Shallow End about race and computing. And she's looked at the LA school systems. There is no AP computer science in inner city schools in Los Angeles. Or there wasn't till she arrived. She had done the same thing with women before she wrote Stuck at the Shallow End. She wrote Unlocking the Clubhouse.

So there are things we can do. Yes, we're starting now with kids who are 7 or 8 and so it's not going to be right away that we're going to hire. But, I want to point out that we make a mistake, we make a terrible mistake that backfires, when we think about hiring diverse people.

So, I once had a job as a faculty member at a high-status university. And the head of my department came up to me and the one black faculty member on the faculty and said, "I give the two of you six months to diversify the faculty." Yeah. Okay. Now, it happened to be a topic that I had been looking at for twenty-three years. Not my colleague; it wasn't his topic of interest, and certainly we couldn't do anything in six months.

I was offered a position as diversity officer at another university. And I said no, despite the fact that this is something I care very deeply about. Because when I said, "Okay. We need to think about a pronoun policy. We need to have nonbinary bathrooms. Half our buildings are not accessible, we need ramps up right away." And the provost said to me, "We're just going to start with women."

But intersectionality, Kim Crenshaw's work, tells us that you can't just work along one axis. You have to consider for example first-generation college attendees. You need to consider the full range of peoples, not only the diverse people (which usually means black or women), but diverse teams because that's where creativity comes. And the only way this is going to get done, you all know it, is if we show value. And we can show dollars raised. 85% of consumers are women. And so few women are product designers or marketers or engineers. And we could sell more—our creativity will go up, our innovation will go up, if we hire diverse teams. And if we work in collaborative teams.

Buolamwini: On the point of representation. I'm a computer scientist, right. But I don't fit the many stereotypes that you listed off in your original talk. And I think the representation of what it looks like to be a computer scientist— I'm also an athlete, pole vaulter, all of these other things. But inviting people in who don't necessarily look like the stereotype and saying, "You too matter. What you're doing is valuable. And also your perspectives matter." The paper I wrote, Gender Shades, I did it in collaboration with Dr. Timnit Gebru who was at Stanford. And she is from Ethiopia. It's no surprise that as dark-skinned women we found the problems that we did. So it's also being able to elevate the work that is there as well.

Lichfield: Unfortunately that is all the time that we have. So, Joy and Justine thank you so much for being here.
