Maya Indira Ganesh: So, in the next twenty minutes I'm just going to tell you about a single-issue publication that I've been working on at Tactical Tech. It's called See(n) Through Machines: Data Discrimination and Design. This publication will come out in a few weeks, I hope, and has a number of essays in it. And for the purpose of this talk I'm going to talk about a couple of these essays—some contributions in the publication.

Please do me a favor; I am having some problems with my mother's bank account, where you usually deposit your rent. From this month on, please transfer money to the following account: …
My landlord

I want to start with something that happened to me recently. I got into a little bit of miscommunication with my landlord. He sent me an email that said this. And it ended up in my spam folder. And as a result he was a little bit annoyed that I was sending the rent to the wrong account for many months.

Now, spam filters are the most common and familiar kind of machine learning that we come across, but they still get it wrong. This email looked a lot like a phishing attempt, so it got sent to spam. Which is great, in a way; it made me feel very secure. But it was also annoying, because I missed this email.

Machine learning, at its most basic level and in a very simplified way, is a process of computer software reading a lot of data and identifying patterns and associations in that data, like how different pieces of information in that data set are related to each other.
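To make that very simplified description concrete, here is a minimal sketch of the kind of pattern-learning a spam filter does, written in Python with scikit-learn. The emails, labels, and the choice of a naive Bayes classifier are all illustrative assumptions, not how any real mail provider's filter works; the point is only the basic move of learning word patterns from labelled examples and applying them to new messages.

```python
# Toy spam filter: learn word patterns from a few labelled emails,
# then classify a message the model has never seen.
# All data here is invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = [
    "urgent: verify your bank account details now",    # spam-like
    "please transfer money to the following account",  # spam-like
    "lunch tomorrow at noon?",                          # ordinary
    "minutes from yesterday's meeting attached",        # ordinary
]
labels = ["spam", "spam", "not spam", "not spam"]

# Turn each email into word counts, then learn which words tend to go with "spam".
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)
classifier = MultinomialNB().fit(features, labels)

# A legitimate email from a landlord can share patterns with phishing,
# which is roughly how messages like the one above end up in the spam folder.
new_email = ["problems with my mother's bank account, please transfer rent to this account"]
print(classifier.predict(vectorizer.transform(new_email)))
```

With so few examples the prediction itself is not meaningful, but the structure (count features, fit, predict) is the pattern-and-association process being described.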

And we can move on to something a little bit more complex that relates to the topic of my talk. [Ganesh has some trouble getting the above clip to play at ~18:40; comments omitted.] Okay, maybe we won't do the video, which is a shame. But anyway, this is a video by… It's on YouTube, as you can see. So, Dr. Michal Kosinski is a computational psychologist at Stanford University who develops deep neural networks, a kind of machine learning process that is said to closely mimic how animal brains are thought to work. They are used to identify patterns in faces based on images posted online.

So, what Dr. Kosinski does is use deep neural nets to identify extroverts and introverts. And in the clip he was talking about how you can look at people's faces and identify their gender, how you can look at people's faces and identify their political views. And he was saying that we get outraged when we think that machines can do these things, but if you think about it for a moment, we do these things all the time. We're constantly making these decisions about people based on seeing their faces, so why can't we train machines to do the same thing?

So in that clip he was showing extroverted people and introverted people—their faces—and asking us in the audience, or his audience in that talk, to identify which one was the extrovert and which one was the introvert. People got it right. Machines also started getting it right, because they could identify patterns in the way that extroverts and introverts were taking photographs of themselves. These were all selfies.

So that's what the talk was about. And how machine learning works is that software, called algorithms, is exposed to a training data set, in this case faces of people. And the data can really be anything, but the software looks for patterns within those sets. So what Kosinski was looking at with the extrovert/introvert example was tilt of the head, separation between the eyes, shape of the nose, breadth of the forehead. And then the algorithm was picking up those patterns in the faces and being able to identify them in new faces, so in a new data set that it had never seen.
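The training-then-generalizing step described here can be illustrated with a hedged sketch: a simple classifier fit on a handful of hypothetical face measurements (head tilt, eye separation, and so on) and then asked to label a face it has never seen. The features, numbers, labels, and the use of logistic regression are invented for illustration; Kosinski's actual work used deep neural networks on images rather than hand-picked measurements.

```python
# Illustrative only: fit a simple classifier on hypothetical face measurements,
# then ask it to label a face it has never seen. The features, numbers, and
# labels are invented; this is not Kosinski's actual method or data.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [head_tilt_degrees, eye_separation, nose_width, forehead_breadth]
training_faces = np.array([
    [12.0, 0.42, 0.31, 0.55],
    [ 9.5, 0.44, 0.33, 0.58],
    [ 2.0, 0.38, 0.29, 0.49],
    [ 1.5, 0.37, 0.28, 0.47],
])
training_labels = ["extrovert", "extrovert", "introvert", "introvert"]

model = LogisticRegression().fit(training_faces, training_labels)

# A face from a "new data set": the model assigns a label either way,
# whether or not the pattern it learned actually means anything.
unseen_face = np.array([[11.0, 0.43, 0.32, 0.56]])
print(model.predict(unseen_face))
```

The point is not that the labels are right, but that once a model has latched onto some pattern, it will confidently label every new face it is shown.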

This process of kind of looking at faces and identifying things about people is nothing new. I mean, we know that this is over 150 years old. Trying to identify criminals, for example, was a big thing in Europe about a hundred years ago.

Now, what's interesting about Kosinski is that a few months ago he was in Berlin and he gave a talk where he described this research. He also talked about new research where he was applying facial recognition technology to gay and straight faces. He was working with a data set, he said, of self-identified gay and straight people who had posted their pictures online, as we all do, and using algorithms to read these faces and to identify patterns in them—whatever patterns they saw. And then he said that, based on what the algorithms had learned about the patterns in those faces (and we're talking about something like 35,000 images), the same algorithm could look at a new data set and identify, with 90% accuracy, what was a “gay face” and what was a “straight face.”

And so he talked about this research. It's not published yet; he was just sort of talking about it, so I can't actually give you a citation. And then, the next slide he showed us was a map of all the places in the world where homosexuality is criminalized, and in some places punished by death. And he said, wouldn't it be really unfortunate if this kind of technology were available to governments in these places. And I think he was concerned, but that did not seem to make him reconsider the technology he was developing.

Anyway, it's also useful to note that Dr. Kosinski's research on psychographics has been in the news for the last few months because allegedly Cambridge Analytica used his research to do microtargeting for election campaigns. So there are some interesting connections there.

So let's move on to another example. There's an anecdote about border crossings between Canada and the US. This is from some time ago. The porosity of the border has made it possible for citizens of one country to travel to, live in, and work in the other. So it was common for border control police to randomly ask people crossing to say the last four letters of the English alphabet, because it isn't always easy in that part of the world to identify people based on just how they look or how they speak. Canadians, following the British English pronunciation, would say “double-you, ex, why, zed,” whereas Americans would say “double-you, ex, why, zee.” This quaint technique was of course very quickly learned and hacked. And of course after 9/11, passport checks became mandatory.

A version of this continues over in Europe. In a recent news story, Deutsche Welle reported that the German government was considering bringing in voice recognition technology to distinguish Syrian Arabic speakers from Arabic speakers from other countries, to ensure that only citizens fleeing Syria were receiving asylum in Germany. Deutsche Welle reports that such software is based on voice authentication technology used by banks and insurance companies.

However, the news report quotes linguistics experts who say that such analyses are fraught. Identifying the region of origin for anyone based on their speech is an extremely complex task, one that requires a linguist rather than automated software.

In 2012, the artist Lawrence Abu Hamdan held a meeting in Utrecht in the Netherlands to talk about the use of speech recognition technologies in the asylum cases of Somali refugees. Having ascertained that they were coming from relatively safe pockets of Somalia, the Dutch authorities wanted to deny them asylum. Working with a group of cultural practitioners, artists, activists, and Somali asylum-seekers, Abu Hamdan was able to show that accent is not a passport, and is in fact a non-geographic map. Abu Hamdan writes that

The maps explore the hybrid nature of accent, complicating its relation to one's place of birth by also considering the social conditions and cultural exchange of those living such itinerant lives. It reads the way people speak about the volatile history and geography of Somalia over the last forty years as a product of continual migration and crisis.
Lawrence Abu Hamdan, Conflicted Phonemes artist statement

Voice, ultimately, is an inappropriate way to fix people in space.

So, as Nicole Shephard writes, we must look at the practices of quantification and what they mean, but as a continuation of history: “Big data are the latest trend in a long tradition of quantification with roots in modernity's fetishization of taxonomy in the service of institutional order.” What's interesting for me in looking at big data technologies is looking in a way that challenges the assumption that algorithms and machine learning and big data just plug very seamlessly into human systems, and that we can simply speed up and automate things. And as Shephard says, we need to look at the historical continuities, especially where gender, race, and sexuality are concerned.

So I'm interested in data and discrimination, in the things that have come to make us uniquely who we are: how we look, where we are from, our personal and demographic identities, what languages we speak. These things are effectively incomprehensible to machines. What is generally celebrated as human diversity and experience is transformed by machine reading into something absurd, something that marks us as different.

Big data technologies are not only being used to classify but also to misclassify us by what we say and do online. There's a very interesting recent Pulitzer Prize-winning investigation by ProPublica—maybe many of you have seen this—that showed that racial bias was being perpetuated by the use of algorithms that predicted recidivism rates, or the likelihood of offending again, in that they predicted that people of color were more likely to commit crimes in the future than white people.
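One way to make that kind of finding concrete: given a table of risk predictions and actual outcomes, you can compare how often people who did not reoffend were nonetheless flagged as high risk, broken down by group. The sketch below uses a few rows of invented data; it is not ProPublica's dataset or code, only the shape of the comparison.

```python
# Compare false positive rates across groups: how often people who did NOT
# reoffend were still flagged as high risk. Data is invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "group":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "high_risk":  [1,   1,   1,   0,   1,   0,   0,   0],  # the score's prediction
    "reoffended": [0,   0,   1,   0,   1,   0,   0,   1],  # what actually happened
})

# Among those who did not reoffend, what share were labelled high risk?
did_not_reoffend = df[df["reoffended"] == 0]
false_positive_rate = did_not_reoffend.groupby("group")["high_risk"].mean()
print(false_positive_rate)  # a gap between groups is the kind of disparity at issue
```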

What I found interesting about this story was that it exposed the ways in which state and private institutions, social prejudices, and multiple databases collude in the creation of algorithmic mechanisms with bias. Such network effects have become the stuff of everyday news now, each story presenting a fresh outrage to established notions of human rights and dignity. For example, image recognition software that runs on machine learning identifies black people as gorillas or Asian people as blinking. Latanya Sweeney found that people with black-sounding names were more likely to be served ads for services related to arrest and law enforcement. Interestingly, I was in Canada recently, just at the beginning of August, for a few weeks for a summer school, and I found that as soon as I went to Canada I was getting these ads for arrest and “take care of your criminal record” and bail. And it was kind of shocking to me, and I thought, okay, maybe in Canada people with Indian-sounding names seemed criminal.

So the line of questioning that interests me in looking at data and discrimination is a continuation of Tactical Tech's work from an exhibition we did last year called Nervous Systems: Quantified Life and the Social Question, which we did at the Haus der Kulturen der Welt between March and May last year. What Nervous Systems did was to try and tell a more layered and nuanced story, going beyond just how algorithms work to look at the complex infrastructures of quantification and software, but also cultural symbols, values, and practices.

One of the things that Nervous Systems did was to unpack the historical precedent of big data, lest we think that these are things that are new. And I like this quote by Shannon Mattern, that this quantifying spirit is something that's quite old in Europe, but also in many other parts of the world:

Explorers were returning from distant lands with new bytes of information—logs, maps, specimens—while back home Europeans turned natural history into a leisure pursuit. Hobbyists combed the fields for flowers to press and butterflies to pin. Scientists and philosophers sought rational modes of description, classification, analysis—in other words, systematicity.
Shannon Mattern, Cloud and Field, Places

So I want to go from this ordering and systematicity to the idea of discrimination itself. And I start by unpacking what the word “discrimination” means. And actually, you can move away from discrimination in a legal sense. The examples I've been talking about show that discrimination is about being very clearly identified, being very clearly distinguished and seen. It's not necessarily just about disadvantage per se, but about visibility. And what does visibility through a machine mean for different kinds of people?

In work that I did last year at Tactical Tech with Jeff Deutch and Jennifer Schulte, we wrote about what we call “the tension between anonymity and visibility.” We examined the technology practices and environments of LGBT activists in Kenya, and of housing and land rights activists in South Africa. We found that they wanted the kinds of visibility that technology brought in order to do their activism and make their claims, but that visibility brought risks because of their marginal position in society. Being gay and from a working-class background or a very Christian environment, or from a small town, or some combination of these, was much riskier, perhaps, than for someone who is English-speaking, upper class, and urban. Both ran the risk of exposure through technology and had to maintain their presence online very carefully. But the latter's visibility was less of a liability.

So for the past four months I've been working on this zine, or magazine, about data and discrimination, which I call “See(n) Through Machines,” to examine what it means to be seen through machines, but also whether we can look through the machine in the process of doing so. These are some of the contributions that are there, and I'm going to take you quickly through them.

The first one is actually really interesting, because Luiza Prado is a designer who's looking at this artifact called the Humboldt Cup. It's actually on display at the me Collectors Room in Auguststrasse in Mitte. The Humboldt Cup is a 17th-century Dutch artifact. And what she does is she traces this cup, which has engravings of Brazilian natives, and through that explores how racial classification and the census in Brazil developed.

And since I'm running out of time I'm going to leave that and talk about the last D, actually, which is design. So there's data, discrimination, and design. I've been really interested in struggling with the idea of design and what we expect of design in this context. If we design something better, we think maybe the problem will go away. Or that if we had more visibility and transparency in the design process, systems would be improved and held to account.

One contribution in the zine that speaks directly to this is Kate Sim's contribution on anti-rape technologies. She says that anti-rape technologies are about good user experience—UX—as well as about gathering more data about the contexts in which rapes happen, in order to design better systems. She finds that the design process is dislocated: anti-rape technology is actually developed in many different cities, and because the process is dislocated, accountability becomes difficult.

Ame Elliott and Cade Diehm directly take on UX design and the weaponization of design and malware, saying that just making something easier to use, even if it's malware, isn't necessarily a good thing. There's a lot of emphasis on marketing good design to make systems more usable.

Caroline Sinders writes to designers and technical experts to consider that perhaps machines and machine learning are as messy as the people who make them. In showing how complex it is for machines to learn how language works, she discusses developments in automating the identification of online harassment and abuse on Wikipedia.

At some level, talking about machines and bodies and data and discrimination is really also a study in failure and error, in machine systems but also in our own human imagination and empathy for and about each other. So I will end on that note and ask that you look out for our new publication. It should be out by the end of September. Stay in touch, and I'm happy to take more questions as we go forward. Thanks.
