danah boyd: I commend the EU Parliament for taking up the topic of algorithmic accountability and transparency in the digital economy. In the next ten years we will see data-driven technologies reconfigure systems in many different sectors, from autonomous vehicles to personalized learning, from predictive policing to precision medicine. While the advances that we will see will create phenomenal new opportunities, they will also create new challenges and new worries. And it behooves us to start grappling with these issues now so that we can build healthy sociotechnical systems.

I want to focus my remarks today on a provocative statement: I believe that algorithmic transparency creates false hope. Not only is it technically untenable, but it obfuscates the politics that are at stake.

Algorithms are nothing more than a set of instructions for a computer to follow. The more complex the technical system, the more difficult it is to discern why algorithms interact the way they do. Putting complex computer code into the public domain for everyone to inspect does little to achieve accountability. Consider, for example, the Heartbleed vulnerability that was introduced into OpenSSL code in 2011 but wasn't identified until 2014. Hundreds of thousands of web servers relied on this code for security. Thousands of top-notch computer scientists worked with that code on a regular basis. And none of them saw the problem. Everyone agreed about what the right outcome should be, and tons of businesses were incentivized to make sure there were no problems. And still, it took two and a half years for an algorithmic vulnerability to be found in plain sight, with the entire source code publicly available.
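
To make concrete how easy such a flaw is to overlook, here is a deliberately simplified sketch of the kind of mistake at the heart of Heartbleed. It is illustrative Python, not the actual OpenSSL C code, and the buffer contents and function name are hypothetical; the real bug was a single missing length check in the TLS heartbeat handler.

```python
# A deliberately simplified simulation of a Heartbleed-style flaw, written in
# Python for readability. This is not the actual OpenSSL C code; the real bug
# lived in the TLS heartbeat handler, but the shape of the mistake is the same.

# Stand-in for the server process's memory: the heartbeat buffer sits right
# next to data that was never meant to leave the machine.
PROCESS_MEMORY = bytearray(b"....heartbeat....SECRET_PRIVATE_KEY....session tokens....")

def handle_heartbeat(payload: bytes, claimed_length: int) -> bytes:
    """Echo back `claimed_length` bytes of the payload, as the client requests."""
    PROCESS_MEMORY[0:len(payload)] = payload
    # BUG: claimed_length is trusted instead of being checked against
    # len(payload), so a client that lies reads the adjacent "memory" back out.
    return bytes(PROCESS_MEMORY[0:claimed_length])

print(handle_heartbeat(b"ping", 4))    # honest client gets back: b'ping'
print(handle_heartbeat(b"ping", 60))   # lying client gets b'ping' plus the adjacent secrets
```

The vulnerable version differs from a correct one by a single missing comparison, which is part of why so many capable readers of the public source walked right past it.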

Transparency does not inherently enable accountability, even when the stars are aligned. To complicate matters more, algorithmic transparency gets you little without data. Take for example Facebook's News Feed. Such systems are designed to adapt to any type of content and evolve based on user feedback, for example clicks and likes. When you hear that something is personalized, this means that the data that you put into the system is compared to other data already in the system shared by other people, and that the results you get are relative to the results that others get. People mistakenly assume that personalized means that decisions are based on your data alone. But quite to the contrary, the whole point is to put your data in relationship to others'. Even if you required Facebook to turn over the News Feed algorithm, you'd know nothing without the data. And asking Facebook for the data would be a violation of user privacy.
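
A toy sketch makes the dependence on data concrete. The ranker below is not Facebook's News Feed, just a minimal, hypothetical example: the scoring logic is fully visible, yet what any individual user sees depends entirely on the likes contributed by everyone else.

```python
# A toy illustration (not Facebook's actual News Feed code) of why a
# "personalized" ranker is meaningless without everyone else's data.
from collections import Counter

def rank_stories(my_likes: set[str], population_likes: list[set[str]], stories: list[str]) -> list[str]:
    """Score each story by how often people with tastes like mine liked it."""
    # Find users whose likes overlap with mine; "similar" is deliberately crude here.
    similar_users = [likes for likes in population_likes if my_likes & likes]
    votes = Counter(story for likes in similar_users for story in likes)
    # My ranking is defined *relative to* what similar users engaged with.
    return sorted(stories, key=lambda s: votes[s], reverse=True)

stories = ["local_news", "cat_video", "election_story"]
my_likes = {"cat_video"}
population = [{"cat_video", "local_news"}, {"cat_video", "local_news"}, {"election_story"}]

# With the population data, the ranking is determined; without it, the same
# code tells you nothing about what any individual user will actually see.
print(rank_stories(my_likes, population, stories))  # ['local_news', 'cat_video', 'election_story']
```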

Your goal isn't to have transparency for transparency's sake. You want to get to accountability. Most folks think that you need transparency to achieve accountability in algorithms. I'm not sure that's true. I do know that we can't get closer to accountability if we don't know what the values are that we're aiming for. We think that if the process is transparent, we could see how unfair decisions were being made. But we don't actually even know how to define our terms.

Is it more fair to give everyone equal opportunity, or to combat inequity? Is it better for everyone to have access to content shared by their friends, or should hate speech be censored? Who gets to decide? We have a lot of hard work to do to define our terms, work that in many ways separates the hard work of understanding algorithmic processes from the hard work of dealing with our social issues. If we can't define our terms, we're not going to be able to succeed in algorithmic accountability.

Personally I'm excited by the technical work that is happening in an area known as fairness, accountability, and transparency in machine learning. An example remedy in this space was proposed by a group of computer scientists who were bothered by how hiring algorithms learned the biases of those in the training data. They renormalized the training data so that protected categories like race and gender couldn't be discerned through proxies. To do so they relied heavily on legal frames in the United States that define equal opportunity in employment, making it very clear what the terms of fairness are, and allowing them to protect those terms computationally in the same way the law does. This kind of remedy shows the elegant marriage of technology and policy to achieve agreed-upon ends.
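
To give a flavor of what such a renormalization can look like, here is a loose sketch in the spirit of that line of work, not the published method itself. It repairs a single proxy feature so that its distribution no longer differs by group; the feature, group labels, and numbers are all hypothetical.

```python
# A loose sketch of "repairing" a proxy feature so that a protected attribute
# can no longer be read off of it. This is an illustration in the spirit of
# the work described above, not the published algorithm itself.
import numpy as np

def repair_feature(values: np.ndarray, group: np.ndarray) -> np.ndarray:
    """Map each group's values onto the pooled distribution by quantile."""
    repaired = np.empty_like(values, dtype=float)
    pooled = np.sort(values)
    for g in np.unique(group):
        idx = np.where(group == g)[0]
        # Rank each member within their own group...
        ranks = values[idx].argsort().argsort()
        quantiles = (ranks + 0.5) / len(idx)
        # ...then replace their value with the pooled value at that quantile,
        # so the feature's distribution looks the same in every group.
        repaired[idx] = np.quantile(pooled, quantiles)
    return repaired

# Hypothetical "years at previous employer" feature that differs by group
# (and therefore acts as a proxy for the group label).
years = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 12.0])
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
print(repair_feature(years, group))  # both groups now map onto the same values
```

A real remedy has to handle many features at once and weigh repair against predictive accuracy; the sketch only shows the basic move of equalizing a proxy's distribution across groups.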

No one, least of all a typical programmer, believes that computer scientists should be making the final decisions about the tradeoffs that decide societal values. But at the end of the day, it's computer scientists who are programming those values into the system. And if they don't have clear direction, they're going to build something that upsets somebody somewhere in the world. Take, for example, scheduling software. Programmers have been told to maximize retailer efficiency by spreading labor out as much as possible. This is the goal they are told to optimize for. But it means that workers' schedules are all over the place, that children suffer, that workers do double shifts without sleep, etc. The problem isn't the algorithm. It's how it's deployed. What maximization goals it uses. Who got to define them. And who has the power to adjust the system. If we're going to deploy these systems, we need to clearly articulate the values we believe are important. And then we need to hold those systems accountable for building to those standards.
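
To see how the values live in the deployment rather than in the algorithm, consider a toy sketch with hypothetical demand numbers, not any vendor's actual scheduling product. The optimizer is identical in both runs; only the objective changes, and with it who bears the cost.

```python
# A toy sketch (hypothetical demand numbers, not any vendor's scheduling
# software) of how the values live in the objective, not in the algorithm.
from itertools import product

DAYS = ["Mon", "Tue", "Wed"]
# Demand covered if the worker is assigned that day's opening or closing shift.
DEMAND = {("Mon", "open"): 2, ("Mon", "close"): 5,
          ("Tue", "open"): 4, ("Tue", "close"): 3,
          ("Wed", "open"): 5, ("Wed", "close"): 1}

def demand_covered(schedule):
    return sum(DEMAND[(day, shift)] for day, shift in zip(DAYS, schedule))

def clopens(schedule):
    """Count 'clopen' pairs: a closing shift followed by the next morning's opening."""
    return sum(1 for a, b in zip(schedule, schedule[1:]) if a == "close" and b == "open")

def best_schedule(objective):
    """The optimizer is the same brute-force search either way;
    only the objective encodes whose interests count."""
    return max(product(["open", "close"], repeat=len(DAYS)), key=objective)

# Objective 1: retailer efficiency only -> ('close', 'open', 'open'), one clopen.
print(best_schedule(demand_covered))
# Objective 2: the same efficiency minus a cost on worker-hostile clopens -> all openings.
print(best_schedule(lambda s: demand_covered(s) - 4 * clopens(s)))
```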

The increasingly widespread use of algorithms makes one thing crystal clear: our social doctrine is not well articulated. We believe in fairness, but we can't even define it. We believe in equity, but…not if certain people suffer. We believe in justice, but we accept processes that suggest otherwise. We believe in democracy, but our implementation is flawed. Computer scientists depend on clarity when they design and deploy algorithms. If we aren't clear about what we want, accountability doesn't stand a chance.

