[Speaker slides were not presented in the video recording]

Julia Angwin: Hi there. I am going to talk about something that doesn’t sound like it should be at a tech conference, but is. So I want to tell you about what I’ve learned about forgiveness in my studies of algorithmic accountability. But I’m going to start with a little bit of my own history and experience that I think will be relevant.

So, this is me. I grew up in Silicon Valley, in Palo Alto, at a time when the personal computer was really exciting. This is my first computer. And I thought I would be a programmer when I grew up. I actually didn’t know there were any choices other than hardware or software. I thought, you pick one and then you’re cool. And software seemed more interesting, so I was studying to do that. And then I took a wrong turn somehow: I fell in love with journalism at my college newspaper, and thought well, I’ll just do this for a little while, it seems fun.

So I ended up at The Wall Street Journal in 2000. They hired me to cover “the Internet.” And I was like, “Anything in particular about the Internet?”

They were like, “Nah. Just everything. You seem like you know computers.”

So I was like okay, that sounds like a great assignment. So I spent thirteen years there covering technology. And I was in the New York office, so I had the weird angle of mostly covering the AOL-Time Warner merger.

So I want to talk about what I learned about forgiveness in my time as a reporter. So, for fourteen years, I was covering technology for The Wall Street Journal. I wrote a lot of stories, right. Most of them weren’t that interesting, but some of them were. One of the biggest stories I worked on was the AOL-Time Warner merger, because AOL was a “tech company” of its time. And that merger consumed ten years of my life.

And really, as you probably all remember, it was based on accounting fraud, right. They were actually doing this crazy thing. AOL had, like, a cafeteria. They would go to the guys who provided the cafeteria services at AOL, and instead of paying them, they would say, “We’re going to pay you double and then you buy ads.” And that was the game. That’s how they got all this ad revenue. Because Wall Street was only valuing them on ad revenue, not on profit.

So that was fun times. And in the end I think they paid a $300 million fine. It was a big story. I got to be part of a Pulitzer Prize-winning team. Super good.

But what’s weird is that in all of my reporting, there are only two times when people I wrote about went to jail. One was a spammer—he was called the Buffalo Spammer. It was the early days of spam, so it was exciting. I went to his house and knocked on his door, talked to his mother, you know, etc. And as part of my reporting and then additional… you know, the New York Attorney General charged him, and he actually went to jail. He got the maximum sentence of three and a half years.

The other guy I wrote about who went to jail was an AOL executive, actually, who ran an embezzlement scheme. I don’t have his photo—he seems to have removed it from the Internet—but he was also a black man. And he also got a prison sentence for doing some small-time embezzlement; he was the head of HR.

And then I think about where the former AOL executives I wrote about, the ones who did all those round-trip deals, are now—they’re doing great. Steve Case, funding a lot of things. Bob Pittman, running a giant radio network. Dave Colburn, the guy who actually did all the deals that were round-tripped, settled with the FTC for four million, and now he’s investing in tech companies and in Israel.

So you know… look, this isn’t a particularly unique story. But let’s just say this is a very American story about forgiveness. Who is forgiven in this case? What was the unique factor about these people? I don’t know, they were white men. I don’t know, they were really powerful, right? And I find it really depressing and sad that the two people I wrote about as a tech reporter who went to prison were both black men. Because, you know, they were probably the only two black men I ever encountered in my whole time covering technology, right?

So, we already know that our society hands out forgiveness and punishment unequally. Right? I’m not telling you anything you don’t know, I’m just telling it to you in the form of a personal story.

So, flash forward: I decide to write a series about algorithmic accountability at my new employer, ProPublica. As we all know, algorithms are very important in our lives, right. If you haven’t seen it, The Wall Street Journal’s Blue Feed, Red Feed is a delightful app that you can visit every day; this is last night. It shows you the top stories that would be trending in a conservative news feed and in a liberal news feed. So of course yesterday was spectacular. The conservative news feed: Trump, economy growing 3%. And the liberal news feed: Mueller’s charges, right. So very different stories are being presented.

So I decided I wanted to do some accountability studies about algorithms in our lives. It’s hard to study the newsfeed in a quantitative way, and I also wanted something with higher stakes. So I started with an algorithm that is used in the criminal justice system to predict whether a person is likely to commit a future crime. This is basically Minority Report software, and it is used throughout the United States for sentencing, parole, pretrial release, a lot of different stages of the criminal justice system.

So, how many people were aware that that’s even happening? Okay, good. Yay. When I started looking into it two years ago, it wasn’t as well known that this was even being done. And so I thought well, I’m going to look into this and see if I can actually figure out whether the software is biased.

So, I went and did a freedom of information request in Florida. It took five months and some legal wrangling, but we did get all of the scores that they had assigned to everyone who was arrested during a two-year period. So the first thing we did was put those scores up in a histogram. What you see is black defendants on the left and white defendants on the right. And the distribution of scores 1 through 10—which is basically 1, least risky; 10, most risky—is very even for the black defendants. But for the white defendants it’s strangely skewed towards low risk, right?

So my first thought was, “Huh. That’s weird.” But you can’t really say it’s 100% biased until you test whether it’s accurate, right? What if every one of the white defendants was Mother Teresa, right, and they never did anything wrong… it was just some weird jaywalking ticket or something.

So, we spent six months scraping the criminal records of every one of those defendants—that’s 18,000 people; it was a complete nightmare—and joining those data sets, to make sure we matched each person’s score with their true recidivism: did they actually go on to commit a crime in the next two years? And also, what was their prior record like?
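[Editor’s note: the join Angwin describes can be sketched roughly as below, in Python with pandas. The data, column names, and clean shared key are all invented for illustration; the real records had no such key, which is why the matching took months and much of it had to be done by hand.]

```python
import pandas as pd

# Hypothetical illustration only: real defendants' records do not come
# with a tidy shared identifier like this.
scores = pd.DataFrame({
    "person_id":  [1, 2, 3],
    "risk_score": [8, 2, 5],   # 1 = least risky, 10 = most risky
})
records = pd.DataFrame({
    "person_id":              [1, 2, 3],
    "reoffended_within_2yrs": [False, True, False],
    "prior_felonies":         [0, 3, 1],
})

# Inner join on the shared identifier, pairing each defendant's
# predicted risk with what actually happened over the next two years.
joined = scores.merge(records, on="person_id", how="inner")
print(joined)
```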

And what we found was that there was a disparity, right. We did a logistic regression, which is just a statistical technique that allows you to control for all other factors. When you control for every factor other than race, you see that black defendants were 45% more likely to be given a high-risk score. And that’s controlling for the outcome too, right—whether or not they committed a crime in the future, in the next two years.
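[Editor’s note: a minimal sketch of what “controlling for all other factors” means, using a toy logistic regression fit by gradient descent on synthetic data. The features, coefficients, and effect size are all invented; the point is only that exponentiating a fitted coefficient gives the effect of one factor while holding the others constant.]

```python
import math
import random

random.seed(0)

# Synthetic data: the chance of a high risk score depends on the number
# of priors, plus an extra effect for group membership. The regression
# should recover that group effect while holding priors constant.
def make_row():
    priors = random.randint(0, 5)
    group = random.randint(0, 1)
    z = -1.0 + 0.6 * priors + 0.8 * group   # true log-odds of a high score
    p = 1 / (1 + math.exp(-z))
    return [1.0, priors, group], 1 if random.random() < p else 0

data = [make_row() for _ in range(2000)]

w = [0.0, 0.0, 0.0]                          # intercept, priors, group
for _ in range(400):                         # batch gradient descent
    grad = [0.0, 0.0, 0.0]
    for x, y in data:
        p = 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
        for j in range(3):
            grad[j] += (p - y) * x[j]
    for j in range(3):
        w[j] -= 0.5 * grad[j] / len(data)

# exp(coefficient) is the odds ratio: how much more likely a high score
# is for the group, holding the number of priors constant.
odds_ratio = math.exp(w[2])
print(f"odds ratio, controlling for priors: {odds_ratio:.2f}")
```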

So that meant there was some disparity here. And when you looked at it in a chart—when you look at false positives and false negatives—you see that the difference is really stark. The false positive rate for African American defendants was twice as high, right. They were twice as likely to be given a high-risk score but not actually go on to commit future crimes. So, to falsely be given a higher-risk score than a white defendant. And similarly, a white defendant was twice as likely to get an unjustified low-risk score despite the fact that they turned out to have been far more risky.
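[Editor’s note: the two error rates Angwin describes can be computed as below. The counts are invented to mirror the roughly two-to-one pattern she describes; they are not the actual data.]

```python
# Each record: (group, predicted_high_risk, actually_reoffended).
def make(group, predicted_high, reoffended, n):
    return [(group, predicted_high, reoffended)] * n

rows = (
    make("black", True,  False, 4) + make("black", False, False, 6) +
    make("white", True,  False, 2) + make("white", False, False, 8) +
    make("black", False, True,  2) + make("black", True,  True,  6) +
    make("white", False, True,  4) + make("white", True,  True,  4)
)

def false_positive_rate(rows, group):
    # Of those who did NOT reoffend, how many were labeled high risk?
    neg = [r for r in rows if r[0] == group and not r[2]]
    return sum(1 for r in neg if r[1]) / len(neg)

def false_negative_rate(rows, group):
    # Of those who DID reoffend, how many were labeled low risk?
    pos = [r for r in rows if r[0] == group and r[2]]
    return sum(1 for r in pos if not r[1]) / len(pos)

for g in ("black", "white"):
    print(g, false_positive_rate(rows, g), false_negative_rate(rows, g))
```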

And so, that disparity is really a question of forgiveness, right? We have decided that some people are just more forgivable up front, right, despite the fact that the facts on the ground were exactly the same. That was a surprising outcome for me, because I think we usually think of bias as bias against, right—the bias against black defendants. But really what this was was a bias for white defendants. And it’s sort of a distinction without a difference, but it’s interesting to think about, and that’s why I like to frame these conversations around forgiveness.

But you know, you could say this is a one-off. So anyway, we did another analysis— Oh, I forgot to show you—sorry—what it looks like in practice. So here’s a black defendant and a white defendant; different races, same crime: petty theft. Brisha, high risk; Vernon, low risk. Brisha, 18 years old, walking down the street, grabbed a kid’s bicycle from a yard. Tried to get on it and ride it. Got a few yards down. The mother came out and said, “That’s my kid’s bike.” She gave it back. But in the meantime the neighbor had called the police, so she was arrested. And Vernon stole about $80 worth of stuff from the CVS.

So they get their risk scores when they’re arrested. And basically, when you look at it, it was completely the opposite, right. Vernon got a low-risk score despite the fact that he had already committed two armed robberies and one attempted armed robbery, and had already served a five-year prison sentence. And he went on to commit grand theft; he stole thousands of dollars of electronics from a warehouse, and he’s now serving a ten-year prison term.

Brisha had some prior arrests, but they were juvenile misdemeanors, so the records are sealed. But I can tell you that misdemeanors are not usually armed robberies. So let’s just say it’s a smaller crime. And she doesn’t go on to commit any crimes in the next two years. So this is what a false positive and a false negative look like in real life. That’s what forgiveness… unfair forgiveness, really, in that one case, looks like. Now, you could argue we should forgive everybody, but that’s a separate issue. So anyway, this is what we found for this one thing.

So then I was like okay, I want to try another one. This was fun. So we did another analysis. I was like, what’s another algorithm that predicts an outcome? Well, weirdly: car insurance. The car insurance premium that you pay is actually meant to predict your likelihood of getting in an accident, right. So I was like, I want to compare that to true risk. That’s my new game. Predicted risk, true risk. That’s what I do.

So, once again it was an enormous amount of work to get all the data. Consumer Reports actually bought a proprietary data set that I analyzed with them. And we found a similar issue, which was that there was a difference in the way risk was allocated. An example is this guy, Otis Nash. He lives in East Garfield Park in Chicago. There’s really no way to describe it; it’s pretty much a bombed-out, bad neighborhood on Chicago’s West Side, and it’s dangerous and almost entirely minority. And he pays $190 a month for car insurance. He’s never had any accidents, he’s a great driver, and he has Geico. But he’s struggling. $190 a month for somebody who works as a security guard is no joke. He works six days a week and he can barely afford it.

So then there’s this guy Ryan across town. He pays $55 a month for the same plan from Geico, right. And he has actually just recently gotten in an accident, and has the same coverage. And you know, the real difference between these two is their ZIP code. Insurance companies actually have one factor that they use to price your insurance that is separate from your driving record, and it’s called the “ZIP code factor.” They basically assign a risk score to each ZIP code that is independent of how you drive. And when you look at it— Now, Ryan and Otis are never going to be exactly the same. They’re not the same age; they don’t have exactly the same risk factors. But when you control for all the risk factors, every single one of our charts looks like this.

So, the chart is basically predicted risk against true risk, where you can think of predicted risk as essentially your premium. The straight red line is for minority neighborhoods. For minority neighborhoods the prices track risk; they keep going straight up in a nice linear relationship. And the blue line that goes down? That’s the white neighborhoods. They go up, and then all of a sudden, as they get riskier, the price goes down. Inexplicably, right?
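[Editor’s note: a toy version of the comparison behind the chart, with invented premiums and average claim losses. Dividing premium by expected losses gives a price-per-dollar-of-risk, which in this sketch stays flat for the minority ZIP codes but drops for the riskiest white one.]

```python
# Invented numbers, chosen only to illustrate the shape of the chart.
zips = [
    # (neighborhood_type, avg_monthly_premium, avg_monthly_losses)
    ("minority", 100, 50),
    ("minority", 190, 95),   # riskier, and priced proportionally higher
    ("white",    100, 50),
    ("white",    120, 95),   # just as risky, but barely priced higher
]

for kind, premium, losses in zips:
    ratio = premium / losses  # dollars of premium per dollar of expected loss
    print(f"{kind:9s} risk={losses:3d} price/risk={ratio:.2f}")
```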

And so once again we have this strange discount applying to white neighborhoods that is not explainable by risk. And the insurance industry—we published this earlier this year—has yet to respond. They said they would come out with a big “paper” explaining why this was true, and as of yet they have not. I’m speaking at their convention next week, and I’m anxiously awaiting the rebuttal. Maybe they want to present it to me on stage.

But you know, this is again this weird, unexplained forgiveness for one set of people, baked into an algorithm, right. And so I guess what I want to say is, all of you might be in a position to build algorithms. Maybe that’s what you’re going to do next, or maybe you’re going to be auditing them. We’re all in a world of automated decision-making; more and more decisions are going to be automated. So I would just like to leave you with this thought: we talk about bias, and bias is important to think about, but think about forgiveness, too. Because in some ways, at least in the things I’ve studied, what we have done is mete out forgiveness unequally—some people get impunity. They’re not held to the same standard—that theoretical standard that we apply to everyone else. So take that with you. And I’d be happy to take any questions.

Sarah Marshall: Questions for Julia. While people are getting their confidence up, I’m really keen to know the makeup of your team—who are you working with to help you scrape eighteen thousand records, etc.?

Julia Angwin: Oh yeah, right. Yeah, I almost did a talk on the future of journalism, which was about… I couldn’t decide, because I do feel like I’m building a new kind of journalism here. I have two programmers working for me, and a researcher. So we have a real team, and each one of these projects takes a year. And I think that as we go towards— You know, I thought the talk on the earthquake was so important, because journalists are going to need to do much more validation, verification, forensic analysis, right. And so we do need to build more of, basically, quantified teams. And I’m trying to pioneer that a little bit in my way.

Audience 1: Thank you very much. Actually, my question was delving into that a little bit deeper. How do you process all the data, and how do you actually merge the different databases? If you could just explain in a little bit more detail—I’m sure it’s a very complicated process—but a little bit more detail on how the ABC of it goes.

Angwin: Right. So, one reason that you don’t see much work like this, including from academics, is that it’s a nightmare. For instance, in both cases the special sauce that we brought was to match the predicted risk to the true risk. And what that really means is a giant database join, right. And those are super messy. In both cases… you know, one took six months, one took nine months. And there’s really no getting around the fact that you have to do a lot of it by hand. We tried to automate it, and we tried to do probabilistic matching and all that stuff. But truthfully, the standard that we’re held to as journalists is that it can’t just be a probabilistic match. It can’t be, like, 80% right. It has to be right.

And so in the end we ended up doing a lot of hand matching of records. Which was… horrible. And one thing I’ve been thinking about a lot is how to build more capacity for that. Because I don’t think… Most newsrooms can’t do this, right. ProPublica is, you know, this utopian universe of journalism—nonprofit funded, really doing well, and invested in this type of work. But that’s not true of most newsrooms. And so I’ve been thinking about the fact that this is something where maybe Mechanical Turk could be brought to bear. I’m actually trying to work with this amazing coding group at San Quentin prison in California. They actually have a coding academy and they need— So I’m trying to work with them to teach the inmates, maybe, to help with this type of matching. I think there are a lot of untapped opportunities for this type of work that I’ve been trying to explore, because I do think this is the gating factor for this type of work.
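[Editor’s note: a bare-bones sketch of the probabilistic matching Angwin mentions, using simple string similarity from Python’s standard library. Real matching also compares birth dates, addresses, and so on, and even then a similarity cutoff only gets you most of the way; anything near the threshold still goes to hand review. The third name here is invented.]

```python
from difflib import SequenceMatcher

def similarity(a, b):
    # 0.0 to 1.0 similarity between two strings, case-insensitive
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

court_records = ["Brisha Borden", "Vernon Prater", "John Q. Public"]

def best_match(name, candidates, threshold=0.85):
    score, match = max((similarity(name, c), c) for c in candidates)
    return match if score >= threshold else None  # None = send to hand review

print(best_match("Brisha B. Borden", court_records))
```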

Audience 3: Hi. I work for the New York City Department of Education. And we’ve actually been using your articles to teach students about algorithmic bias—

Angwin: Oh yay!

Audience 3: —so thank you for writing such important journalism that our students can use. But I was curious: how might we think about empowering the next generation of students to make ethical decisions? Some of this can feel a little bit hopeless when you see it, so how do we make them feel empowered?

Angwin: Oh, I love that question, because I am a strangely hopeful person despite my weird job of doing only unhopeful things. And so I do believe, you know… The criminal risk score algorithm is a good example. After our story came out, a bunch of mathematicians and computer scientists came out with all these papers studying our data set and coming up with some theoretical conclusions. And essentially, they all said, you know, you could fix this algorithm if you were to balance the error rates—if you were to choose to optimize your algorithm to balance the error rates. They’ve chosen to optimize it another way, which is for predictive accuracy, meaning it’s “correct” in its predictions 60% of the time for both black and white defendants.

But when it’s wrong, it’s wrong in this completely disparate way, right. So you could actually fix it that way, and all that would happen—the only “bad” outcome—is the algorithm would be more accurate for black defendants than white. Which makes sense, because there are actually more black defendants in our criminal justice system.
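[Editor’s note: a sketch of the fix the researchers described. If one shared score cutoff produces unequal false positive rates across two groups, you can instead pick a cutoff per group that equalizes them. All scores here are invented; each list holds the scores of people who did not go on to reoffend.]

```python
# Scores of non-reoffenders in two groups, with group A scored
# systematically higher than group B.
nonreoffender_scores = {
    "A": [3, 4, 5, 5, 6, 6, 7, 8, 9, 10],
    "B": [1, 1, 2, 2, 3, 3, 4, 4, 5, 6],
}

def fpr(scores, cutoff):
    # share of non-reoffenders wrongly labeled high risk
    return sum(s >= cutoff for s in scores) / len(scores)

# One shared cutoff of 7 treats the groups very differently...
shared = {g: fpr(s, 7) for g, s in nonreoffender_scores.items()}

# ...but picking the lowest cutoff that keeps the false positive rate
# at or under a target equalizes the error rates across groups.
def cutoff_for(scores, target=0.2):
    return min(c for c in range(1, 11) if fpr(scores, c) <= target)

balanced = {g: fpr(s, cutoff_for(s)) for g, s in nonreoffender_scores.items()}
print("shared cutoff:", shared)
print("per-group cutoffs:", balanced)
```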

So what’s weird is there is a hopeful outcome, right. There’s something you could do. Now, I would also like to step back and say I’m not entirely sure we should be predicting anyone’s criminality in the future. I can’t even predict my husband, and I’ve been married to him for a very long time. We’re talking about predicting human behavior, and we can’t even get our maps to get us to the right place most of the time. So is this really where we want to bring computers to bear, predicting human behavior? I feel like this is maybe a future thing that we’re not going to be so good at, yet.

But I do think that algorithms are going to be better than people, right—in a lot of ways. But we have to learn how to hold them accountable, and we have to build systems around that. I’m perfectly sure that a car is going to drive better than me; I’m a pretty bad driver, right? So I feel like there is— I don’t want to be a Luddite about it. I just want to say we need systems of oversight and accountability before we can move forward, because otherwise it really will be the Wild West.