Trevor Paglen: So…
Kate Crawford: Hi. It’s great to see you here.
Paglen: I think we should get right into it because we have half an hour, and there’s a lot of material to cover, a lot of ideas. So just… I guess like, I can kind of start where I came to this topic from, which is that I started thinking a lot about the social and political implications of— You know, I don’t even think we can call it the Internet. We need something else. I know in Berlin here at the Haus der Kulturen der Welt there’s this concept the technosphere that they’ve been interrogating. So we could just use that for now, despite any kind of nitpicky problems one might have with that.
But I think for me you know, being somewhat attached to the Snowden project was really when I started thinking about what these kinds of planetary infrastructures of communication and surveillance were, and what their implications might be. And of course there’s been a lot of concern about what the implications of mass surveillance at a global scale are in terms of democracies, in terms of state power, in terms of culture and the like. But I think that when we look at the NSA’s kind of mass surveillance infrastructures, the questions they pose are: what’s our relation to politics? There’s a question about privacy, etc.
But at the same time that those infrastructures have been built, you know, NSA might be tapping the cables between the Google data centers. But Google has the data centers, and has the data. And I think we’re kind of arriving at a moment where it’s starting to become clear what that’s going to mean. And a big part of what that is has to do with AI. So I think you should just talk about like, what are we talking about when we’re talking about AI? Are we talking about the Singularity that’s going to take over the world, or are we going to talk about [inaudible].
Crawford: Well it’s funny. I actually think, despite that excellent introduction, that the big concerns that I have about artificial intelligence are really not about the Singularity, which frankly computer scientists say is…if it’s possible at all it’s hundreds of years away. I’m actually much more interested in the effects that we are seeing of AI now.
So to be clear, I’m talking about this constellation of technologies, from machine learning, to natural language processing, to image recognition, to deep neural nets. “AI” is a very loose term, but it’s really this sort of cluster of technologies that we’re able to work with today. But know that that’s the sort of underlying technological layer that we’re interested in here.
And I think what’s sort of fascinating about this moment is that people don’t realize how much AI is part of everyday life. It’s already sort of part of your device. Often it has a personality and a name, like Siri or Cortana or Alexa. But most of the time it doesn’t. Most of the time it is a nameless, faceless, backend system that is working at the seams of multiple data sets and making a set of inferences, and often a set of predictions, that will have real outcomes.
And in the US, which is where I’ve been doing a lot of my research, I’m looking at how these kinds of large AI predictive systems are being deployed in core social institutions. So here I’m talking about things like the health system, education, criminal justice, and policing. And what is, I guess, to me as a researcher so interesting and I think concerning about this moment is that we’re deploying these things with no agreed-on method for how to study them; what effects they might be having; how there might be disparate impact on people from different communities, from different races, low-income populations. The downsides of these systems seem to be very much clustered around populations who differ from the norm. So that’s the project that I think is really open and really needs a lot of us to be thinking about and working on.
Paglen: It seems like the kind of policy framework and even kind of intellectual framework that we have to think about this is very much the words privacy and surveillance. And for a long time I’ve kind of felt like these are really inadequate concepts to hang our hats on when we’re talking about how data is being used in society and how it will be used in society.
Crawford: And I think you know, it doesn’t mean that they’re not important. They are absolutely crucially important ideas. I think the limitation that I’ve found with privacy is that it comes from a very legalistic, individualist perspective. You know, you have individual privacy rights—
Paglen: It’s a bourgeois concept in the first place.
Crawford: It is inherently a bourgeois concept that sort of emerges, you know, basically around the late 1880s. And you see with the emergence of new technologies—at the time it was sort of popular newspapers—that people are like, “Oh, this newspaper is writing a scandalous story about me. There should be some sort of right of privacy.” And so we start to see the emergence of a juridical framework of privacy. But it was always designed to protect the elites, to protect individuals.
What I think is going to be needed to address this new set of challenges with machine learning and artificial intelligence is a much more collective-based set of practices, both in terms of how we represent political action together as groups, but also concepts around ethics and power, which are very important here. Because particularly when we talk about AI, we’re really talking about seven companies in the world who are deploying this at scale. Who have the data infrastructure. Who have the capacities to be really doing this. That is extraordinarily concentrated. That is the thing that I think we have to really think about, in terms of the fact that they’re going to be the companies that decide what education looks like, what health looks like. So that’s why I think we need to be thinking about power and ethics and move on, perhaps, from the individualistic framing of privacy.
Paglen: Yeah. So I think we should kind of go into that. We can ask some questions about what these implications are. You know, I think for me—and we’ve talked about this a lot—a few months ago, you know, earlier this year, the company DeepMind kinda famously won this Go game and it was considered this huge advancement, this really spectacular thing. Because nobody thought that you would be able to beat a grand master at Go. It’s a much, much more complicated game than chess. DeepMind did it, and there was lots of media attention about it.
But then after that they did something that didn’t get as much media attention, which was that they applied that same AI framework to look at power consumption at Google data centers. And what they were able to do was to reduce the power consumption for cooling by 40%. And that sort of efficiency is really, really remarkable. I mean, you don’t see that kind of thing happen every day. And to me that’s kind of a microcosm of a phenomenon that might become more widespread. And that kind of optimization is going to have massive effects for things like labor. For things like logistics. For things like healthcare. For insurance, credit. These kinds of things that are very much a part of our everyday lives, right. So maybe we can talk through some of that.
Crawford: Yeah no, it’s interesting. I mean, both Trevor and I were sort of particularly fascinated with this story. I mean, what happened with Go was that part of the reason there was such media attention was because it was predicted that we wouldn’t be able to defeat a human Go master for another ten years. So there was a sense that we’d taken a leap forward in time.
But what was interesting about the energy consumption project is that they used exactly the same technique. It was a game engine. So they thought about data centers as a game that you play, where you could open windows, increase the temperature or decrease the temperature according to the energy load. And they got very precise about times, and then how you would play with all of these levers. And the result I still think is extraordinary. And if we’re going to talk about where I think there’s real positive upside to how we can start thinking about using AI, imagine if we could do that across a whole range of different energy consumption technologies. I mean, that’s astounding.
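To give a rough sense of what “treating the data center as a game” can mean in practice, here is a minimal editor’s sketch in Python. It is not DeepMind’s system: the thermal model, thresholds, and numbers are all invented for illustration, and the real work reportedly used large volumes of sensor data and deep networks rather than a lookup table. The sketch shows the framing itself: a “player” that tries cooling moves and learns which ones keep energy use low without letting the hall overheat.

```python
import random

# Toy version of the "data center as a game" framing (editor's illustration):
# each tick the player picks a cooling level and is scored on energy spent,
# with a large penalty if the crudely modeled server hall overheats.
ACTIONS = [0.0, 0.5, 1.0]   # cooling intensity: off, half, full
SAFE_TEMP = 27.0            # invented safety threshold, in degrees C

def step(temp, it_load, cooling):
    """One tick of a made-up thermal model: IT load heats, cooling cools."""
    new_temp = temp + 0.8 * it_load - 2.0 * cooling + random.gauss(0, 0.1)
    cost = cooling + (10.0 if new_temp > SAFE_TEMP else 0.0)
    return new_temp, -cost   # reward is negative cost

# Tabular Q-learning over coarse temperature buckets: the "player" learns
# which cooling moves keep the long-run score highest.
q = {}
for episode in range(2000):
    temp = 24.0
    for _ in range(48):                     # a simulated day in half-hour ticks
        it_load = 1.0 + 0.5 * random.random()
        s = int(temp)
        if random.random() < 0.1:           # explore occasionally
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q.get((s, i), 0.0))
        temp, reward = step(temp, it_load, ACTIONS[a])
        best_next = max(q.get((int(temp), i), 0.0) for i in range(len(ACTIONS)))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + 0.1 * (reward + 0.9 * best_next - old)
```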
However. If you apply that same logic to alright, you’re a worker who is in a basic supermarket job. And we’ve basically got you on the clock, coming in at the peak optimal time, when there’s going to be maximum crowds. And you’re really only going to have like a two-hour shift and then we’re going to get rid of you. That’s great for me as the person who runs the supermarket. I’m maximizing my value for buying your labor. But for you it’s terrible, because you’re waiting to see if you’ve been summoned… It’s this kind of idea of the sharing economy (terribly misnamed) spread to everything. So that shift—
Paglen: Like flexible labor.
Crawford: Flexible labor, absolutely. Absolutely. Or so-called “flexible,” but flexible for whom? Certainly not flexible for the person who’s doing the labor.
So I think these are the sorts of shifts that we have to make. There has been this way of talking about AI as a singular thing that if applied to everything will be brilliant, everything will be more efficient. But I think we need a much more granular analysis that says where are we going to get maximum benefits with minimum human costs? And that is not an easy question to ask right now, because we have so little data.
Paglen: No, absolutely. I mean, we’re talking about systems that can radically transform everyday life. And that has political implications, it has cultural implications, it has sociological implications.
But… How do you… There’s a couple of questions here in relation to power. And one of the things that you mentioned before was that this is really five companies, right. So we’re talking about like, that right there—
Crawford: Five to seven, depending.
Paglen: Five to seven, depending how you slice it. Whether IBM is in there…
But what… First of all, how did that concentration happen? Why can’t I just go and create my own AI and figure out how to run my studio more efficiently or something like that? I mean maybe I can.
Crawford: Well this is actually something that I think is fascinating. Because if we compare it to the last sort of extraordinary technological shift that I think most of us in this room witnessed, which was around the Internet and the Web, right. So we had this sense of like well, you could more or less teach yourself some forms of code. And you could, you know, pretty much create a web site. You could do a whole lot of things with just a bit of self-teaching. It was pretty straightforward.
The difference with AI is that the cost of, first of all, having large-scale training data is huge. Just to get that training data, it’s extremely valuable. Companies don’t share it because it’s proprietary. There are open data equivalents, but then the issue becomes processing. So then you’re running big GPUs. It’s very expensive. And actually another artist, Darius Kazemi, just did a really interesting short paper looking at, if he was trying to start again now as a kid saying, “I wanna do DIY AI, how would I do it?” And he’s like, “I could not afford this.”
So I think that’s part of the issue. Also these are all the companies who have been collecting data for some time. They have different types of data, so they’ll be producing different types of AI interventions. But what is interesting is what happens when they start deploying those models: we’re starting to see this really interesting pattern. Which is that we’re really good at machine learning for some things. But keep in mind, machine learning systems are really looking for patterns, and they’re very very bad at unpacking why those patterns are there, or thinking about the context.
So let me give you an example to make this concrete. There was a really interesting study done at the University of Pittsburgh Medical Hospital, where they were studying pneumonia patients. So they thought, look, let’s basically train this on a deep neural net, where we don’t really know what it’s doing but we’ll see what the outputs are. And we’ll train it on a more open system where we can see what patterns it finds, though we don’t understand the rules.
And what they found with the DNN version, the deep neural net, was that it was extremely good at figuring out who was actually likely to have complications from pneumonia. Apart from in one case: it wanted to send home all of the people who had chronic asthma. Of course, they’re the people who are the most vulnerable and most likely to die. So it was a very bad decision.
But the reason why it came to that conclusion is actually quite logical, which was that the data indicates that the doctors had been so efficient— If you came to me and you said you had pneumonia and you had chronic asthma, I’m like, “Straight to intensive care. Off you go.” So you are actually now unlikely to get complications because I’ve moved you straight into an intensive care system.
But of course if I’m a data model I just see that oh, people with chronic asthma don’t have complications, send them home. So it’s a really interesting study to show the difference between interpretability and data patterns. There’s a pattern there, but how are you interpreting it? So I think we have a set of issues there that also relate to how we think about the deployment of AI into social systems.
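To make the pattern-versus-interpretation point concrete, here is a small editor’s simulation in Python. It is not the Pittsburgh study’s data, just invented numbers arranged the way Crawford describes: asthma patients are genuinely higher risk, but because clinicians route them straight to intensive care, their recorded complication rate comes out lower, which is exactly the pattern a model trained only on recorded outcomes would learn.

```python
# Editor's toy simulation (invented numbers, not the study's data) of how
# treatment can hide risk in observational records.
import random
random.seed(0)

records = []
for _ in range(100_000):
    asthma = random.random() < 0.1
    # True underlying risk: asthma roughly triples the chance of complications.
    base_risk = 0.30 if asthma else 0.10
    # Clinical practice: asthma patients almost always go straight to the ICU,
    # which sharply reduces the chance that complications actually develop.
    icu = asthma or random.random() < 0.05
    risk = base_risk * (0.2 if icu else 1.0)
    records.append((asthma, random.random() < risk))

def complication_rate(flag):
    group = [complication for a, complication in records if a == flag]
    return sum(group) / len(group)

print(f"complication rate, asthma:    {complication_rate(True):.3f}")
print(f"complication rate, no asthma: {complication_rate(False):.3f}")
# The asthma group shows *fewer* recorded complications, so a model trained
# only on these outcomes would score them as low risk and "send them home".
```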
Paglen: So we can think about its deployment in healthcare, labor. What about other… I’m thinking of like, what are the classic kinds of sectors in the post-Fordist economies, like insurance, real estate, credit, right, which very much affect our everyday lives. And it’s almost like credit is a kind of…right, in a way. What I mean by that is that these are de facto things that you can do as a human in the world, right. And if your credit score’s modulating, you effectively have different rights than somebody with a different kind of credit score.
And so when we think through the integration of AI into those sorts of industries, what will be the effects of that, do you think? I was thinking in terms of our everyday privileges, basically.
Crawford: Well, it’s going to be really interesting. One of the things I’d suggest is that these systems are going to get really good at hyperpersonalizing to you. To the point where if you’re an 18-year-old who’s having a few beers at a party and there are your Facebook photos, an insurance company’s like, “Huh, interesting. And you’re driving a car. We might be increasing your insurance premiums,” on this very granular like—oh, this week, this month.
But actually, and again I’m going to speak to the context that I know best, which is the US legal system. We do have some protections that we can use around credit. Because let’s face it, credit and insurance agencies have been using data to really pinpoint people for some time. So there’s some pushback there.
But I’m actually more worried about when this gets deployed into areas like the criminal justice system. So I’m sure some of you read the ProPublica story “Machine Bias”. That was based on Julia Angwin’s work over fourteen months, with five journalists basically FOIA’ing the hell out of this company called Northpointe. Northpointe’s software platform is used throughout courtrooms in the US. What it does is it gives a criminal defendant a number between one and ten, to indicate the risk of them being a violent offender in the future—so it’s basically like a recidivism risk.
And what she found in this big investigation was that basically black defendants were getting a false positive rate twice that of white defendants. So the race disparity was extraordinary, and the failure rates were really very clear. But what was fascinating about this huge story—it blew up, everyone was concerned—is that we still don’t know why. Northpointe hasn’t released the data. They won’t reveal how these calculations are being made because it’s “proprietary.”
So this system that was being deployed to judges that they were using to make these really key decisions, is still a complete black box to us. So that’s where we’re actually really bad at thinking about you know, what are the due process structures? How do we make these kinds of predictive systems accountable?
Now, that of course is not an AI system. To be super clear, it’s a predictive data system. I wouldn’t call it autonomous in the way that I would call AI. But I think it’s a precursor system.
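For readers who want to see what “a false positive rate twice that of white defendants” means mechanically, here is a minimal sketch in Python. The records, group labels, and the high-risk cutoff below are invented; only the calculation mirrors the kind of metric the ProPublica team reported: of the people who did not reoffend, how many were nonetheless labeled high risk.

```python
# Editor's sketch of a per-group false positive rate on made-up records.
from dataclasses import dataclass

@dataclass
class Defendant:
    group: str        # demographic group recorded in the data
    score: int        # risk score from 1 (low) to 10 (high)
    reoffended: bool  # observed outcome within the follow-up window

HIGH_RISK = 5  # assumed cutoff for "high risk"; the real threshold is a policy choice

def false_positive_rate(records, group):
    did_not_reoffend = [r for r in records if r.group == group and not r.reoffended]
    flagged = [r for r in did_not_reoffend if r.score >= HIGH_RISK]
    return len(flagged) / len(did_not_reoffend)

records = [
    Defendant("A", 7, False), Defendant("A", 6, False), Defendant("A", 3, False),
    Defendant("A", 8, True),  Defendant("B", 4, False), Defendant("B", 2, False),
    Defendant("B", 6, False), Defendant("B", 9, True),
]
for g in ("A", "B"):
    # Group A's non-reoffenders are flagged twice as often as group B's.
    print(g, round(false_positive_rate(records, g), 2))
```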
Paglen: Yeah, I mean there’s another company called Vigilant Solutions. And what they do is—
Crawford: Doesn’t sound ominous at all. That’s clearly great.
Paglen: No, exactly. Not at all. It’s a company that mostly caters to law enforcement. And what they do is they deploy ALPR cameras, automated license plate readers. So they have cameras all over cities, and all over their own fleet of cars, that just go and take a picture of everybody’s license plate. And they sell this to law enforcement, insurance, collections agencies and that sort of thing.
They had a program in Texas where they were partnering with local law enforcement agencies. And they were installing ALPR cameras on cop cars, so anywhere the cop car drove it would record the license plate. Vigilant would ingest that data, merge it with their own, and then make it available back to the cops.
The other move that Texas made was giving police the ability to swipe people’s credit cards as a way to pay fines, traffic tickets, take care of arrest warrants and that sort of thing. So then what the police had was: here’s a record of where everybody is, here’s everybody who we have something on. We can drive to their house, take out your credit card. And so this is like Ferguson, the very predatory kind of municipalities kind of gone—you know, is this a vision of the future?
Crawford: That’s extraordinary. I mean, it’s interesting and so that we don’t depress you too much and leave a little time for questions, the thing that I’m also really interested in and we should talk about this too, Trevor, is like, what do we do about it? What are the things that we could do about this? And we’ve talked a little bit about sort of existing legislative frameworks. I think most of the time, they’re not actually up to this challenge. I think we have a lot of work to do to think about where we get accountability and due process in these kinds of quite opaque systems.
The thing that I’ve been working on recently was we did a White House event on the social and economic implications of AI with Meredith Whittaker, who’s here tonight, looking specifically at what we could do. And I know this was something that’s interesting to you, Trevor, because one of these questions is how do we give access to people? How do we make sure that people get access to these tools? But then secondly, if you’re being judged by a system, how do we start thinking about due process mechanisms?
So I think that’s one of the areas where I think we have the most work to do. But I also think that collectively, we could actually really start pressuring for these kinds of issues.
Paglen: So in the due process case, you can’t have a black box that’s sending people to prison or not. I mean that’s a real simple thing, right?
Crawford: Yeah. And of course predictive policing is another big thing here, too. Are you having much predictive policing in Germany? Is this a thing that’s happening here, not that anybody knows about— Yes, a little bit? A little bit? Okay, alright.
Well, I would be keeping a close eye on that. This is tragically one of the areas where the US has really been leading the way. There are predictive policing systems in New York, in Miami, in Chicago, in LA. And there’s been a really interesting set of studies looking at how these systems are working. They’re often built by Palantir. Palantir is one of the major— I’m sure many of you are familiar with Palantir as a company that provides a lot of technologies to various military organizations around the world.
But this interesting thing has just happened. We just got the first study that looked at the effectiveness of predictive policing in Chicago. This was by RAND. So it’s not a radical organization. And they found that it was completely ineffective at predicting who would be involved in a crime. But it was effective at one thing, which was increasing police harassment of the people on the list. So you know, if you’re on a heat list you’re going to get a lot of attention, but it’s not necessarily going to help predict who’s going to be involved in a violent crime. So what we’re already starting to see, just from empirical testing of these systems, is that they’re not even meeting the baseline criteria of what they say they’re going to do.
So I think this is where we have a lot of potential to move, and potential to work collectively around political issues, is to say, “Show us the evidence that this predictive policing system will actually work, and work without producing disparate impact.”
Paglen: Yeah, I mean I think there’s two layers of concerns here. I tend to take the bigger concer— Like you know, more of a meta concern, which is that the problem with these AIs being used in policing is not that they’re racist. It’s that the idea of quantifying human activity in the first place, I find very violent, you know. Like for example labor or something like that. If you’re going to have a capitalist society, well, capitalism is all about optimization, right? Creating efficiencies. That’s one of the ways in which you make money. So how do we start to reconceive of… I guess my concern is that we actually don’t even have a political or economic framework within which to address something like a 40% increase in efficiency across a logistics sector or something like that, you know.
Crawford: No, I think that’s right. And I think this is part of the issue around what’s happening now. And I want to really avoid the kind of technological inevitability arguments which come up a lot where people say this is the new thing, so it’s going to happen, and it’s going to touch every part of life.
Not necessarily. And what’s interesting, what I’ve been doing is going back to… a lot of the early works that were written about AI in its sort of first decades of development, basically back to the 1970s. There’s an extraordinary AI professor called Joseph Weizenbaum, who wrote the program ELIZA—you might have seen this program. It’s a natural language processing…very early program designed to simulate conversation. Very basic. But he was amazed by how people were taken in by it. And it was…you know, a very simple kind of Turing test. Like, we have a conversation and oh, it sounds like a real person.
He very quickly started to ask critical social questions about AI. And he had this total conversion moment where he was like, if we start deploying AI into all of our social systems, it will be a slow-acting poison. So it’s a pretty harsh critique. But what it did was it started to make people think about where can this work, and where might it not work. I don’t think we’re going to win, Trevor, between you and me, in sort of trying to say, “Well, not all of life should be ‘metricized.’ ” I think that’s been happening for well over a century.
But I think we have the chance to push back when it comes to this issue of where should this be deployed? Are there areas where we simply don’t have sophisticated enough systems to produce fair outcomes?
Paglen: You know, one of the things that I know you’ve done a huge amount of work on, too, is just in terms of the ethics of research that goes into this. You know, like what are the human subjects implications for people in universities, doing the kind of groundwork that you know… Doing the kinds of studies, writing the kinds of algorithms that will eventually become a DeepFace or a DeepMind, or whatever Google, or what have you. But, could you talk about that a little bit? Just, what are the research ethics?
Crawford: This is a really interesting space. And I’m going to basically give away a forthcoming research paper that we have that’s about to be publicized. But basically we’ve been looking into what I think is a really, really interesting shift. We already had a culture where a lot of scientists and academic researchers—particularly computer scientists—felt as though, “This is data that we’ve just collected from mobile phones. It’s not human subjects data. We can do what we want with it. We don’t have to ask about consent. We don’t have to think about lifetime of the data. We don’t have to think about risk.” Because computer science has never really thought of itself as a human subjects discipline. So it has been outside of all of that sort of human subjects work that happened in the critical social sciences and humanities in the late 20th century.
But here’s where it gets really weird. There’s this thing that has just started to happen, and by just I mean probably in the last twenty-four months, where we’re moving to forms of autonomous experimentation. What that means is that these are systems where there isn’t a person “designing” the experiment and looking at the result. This is basically a machine learning algorithm that is looking at what you’re doing, poking you to see if you will click on our ads if we show you these images in quick succession. If that gets a good response it will continue to optimize and optimize, and reexperiment and reexperiment.
And this could happen to you thousands of times a day and you won’t be aware of it. There certainly isn’t any kind of ethics framework around autonomous experimentation. But there’s a new set of platforms, things called multiworld testing, where this is being deployed into everything from basically how you read news—so experimenting and seeing what kinds of news will make you buy more ads—to traffic directions, right. So if you’re in an autonomous experiment, someone will be allocated to the optimal route so they’ll get to work faster. But somebody has to be allocated to the suboptimal route, otherwise we’d be putting everyone on the same road, and that won’t work.
Now that might be okay if all it means is that you’re going to be five minutes late to work. No big deal. But what if you’re rushing to hospital? What if you’ve got a sick kid? What if you have no way to say, “Do not assign me to the experimental condition which is suboptimal, please.” Like, there’s no consent mechanism, there’s no feedback mechanism. So once you start deploying something at that scale… We’re kind of used to the traffic optimization thing, because we can see it. But what happens when that’s in a whole range of sort of backend data sets where you’re being optimized and experimented on multiple times a day?
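As a sketch of the loop Crawford is describing, here is an editor’s bare-bones epsilon-greedy bandit in Python. Real “multiworld testing” platforms are far more sophisticated, and the names and numbers here are invented. The structural point is what matters: every request is an experiment, some fraction of users is always assigned the exploratory (possibly suboptimal) condition, and nothing in the loop lets a user opt out.

```python
import random

class BanditExperimenter:
    """Toy optimize-and-reexperiment loop (editor's illustration only)."""

    def __init__(self, options, explore_rate=0.1):
        self.options = options                        # e.g. routes, headlines, layouts
        self.explore_rate = explore_rate
        self.stats = {o: [0, 0] for o in options}     # option -> [successes, trials]

    def assign(self, user_id):
        """Every call is an experiment on whoever shows up; who they are and
        why they're traveling plays no role in the assignment."""
        if random.random() < self.explore_rate:
            return random.choice(self.options)        # exploratory condition
        return max(self.options, key=self._estimate)  # current "optimal" condition

    def record(self, option, success):
        s, n = self.stats[option]
        self.stats[option] = [s + int(success), n + 1]

    def _estimate(self, option):
        s, n = self.stats[option]
        return s / n if n else 0.0

# Note what is missing: nothing lets a user say "do not put me in the
# exploratory condition right now" -- the consent mechanism Crawford asks for.
router = BanditExperimenter(["route_a", "route_b", "route_c"])
condition = router.assign(user_id="driver-123")
router.record(condition, success=random.random() < 0.5)  # e.g. arrived on time
```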
So for me, I’ve been collaborating with people specifically in machine learning and information retrieval. And we’ve been testing these systems, and looking at them, and going okay, what are the possible downsides here? How might you create mechanisms of feedback so people would be able to say, “Look, to me it’s worth it really not to be experimented on when I’m sick and racing to hospital.”
But these are mechanisms that haven’t been designed yet. So what I’m most interested in doing right now, and where I think we have a big job to do, is to create a field around what are the social implications of AI? Get people working on these systems and trying to test them. Sometimes that will mean reverse engineering from afar. There are legal restrictions there, like the CFAA in the US, that really worry me. But I think that process of really trying to hack and test these systems is going to be critical.
Paglen: One of the recommendations that you made in the [2016] AI report that I think is actually quite important is that you’re calling for more diversity in the research. And when you’re playing with some AI systems in the studio, sure, there are autonomous experiments where there’s nobody in control. But sometimes you see the very specific subjectivities of the people who are creating these systems.
So for instance, running object recognition: we ran it on a painting, and it says, “oh, this looks like a burrito.” A burrito is only something you would think of as a class of things worth identifying if you were a young white person living in San Francisco, in particular. So there are these moments where you really do see the specificities of the experience of the people developing the software.
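For the curious, this is roughly what that kind of studio experiment looks like in code: a hedged sketch assuming torchvision 0.13+ and a placeholder image path, not the exact tooling used here. The point it illustrates is that the classifier can only answer from a fixed label vocabulary chosen by the dataset’s builders, and “burrito” happens to be one of the standard ImageNet-1k categories.

```python
# Editor's sketch: run an off-the-shelf ImageNet classifier over an arbitrary
# image and look at what it insists on seeing. Assumes torchvision >= 0.13
# and PIL; "painting.jpg" is a placeholder path.
import torch
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()
labels = weights.meta["categories"]          # the fixed 1,000-class vocabulary

image = Image.open("painting.jpg").convert("RGB")
with torch.no_grad():
    logits = model(preprocess(image).unsqueeze(0))
probs = logits.softmax(dim=1)[0]

for p, idx in zip(*probs.topk(5)):
    print(f"{labels[idx.item()]:25s} {p.item():.3f}")

# Whatever is in the frame, the answer must come from this label list,
# a taxonomy chosen by the people who assembled the training data.
print("burrito" in labels)   # True for the standard ImageNet-1k categories
```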
And I think that that translates into many other kinds of spheres. So for example if you are a big data corporation and you decide oh, we’re not going to encrypt this data because it doesn’t really hurt anybody, I don’t really have anything to hide. That is you coming from a class position and a race position where yeah, maybe you don’t have anything to hide. You are not being preyed upon by police, by other kinds of agencies. And so to me that was really interesting, one of your recommendations; like, actually you need more diverse people working—
Crawford: Yeah…I mean, this is actually where we are not doing very well at all. So, right now if you look at the stats on what the engineering departments are like at the big seven kind of technology companies, the ratio is basically around 80 to 90% men, depending on which company. So just getting women into those rooms has been extremely difficult, for a whole lot of reasons.
And then if you look at people of color, the numbers are even more dismal. And underrepresented minorities. I mean, this is an extraordinarily homogeneous workforce. The people in these rooms designing these systems look like each other, think like each other, and come from generally speaking very upwardly-mobile, very wealthy kind of sectors of society.
So they’re mapping the world to match their interests and their way of seeing. And that might not sound like a big deal. But it is a huge deal when it comes to the fact that certain ways of life simply don’t exist in these systems. I mean, it’s interesting that, strangely, race and gender, which have always been an issue in computer science, are actually even more important in AI. Because with AI, it’s not just an “economic” argument about “getting people jobs and getting people skills.” It’s about the fact that these are the people mapping the world, and they are only seeing this narrow, dominant slice. If we don’t get more diversity in those spaces, or at least different ways of thinking about the world, we are going to create some serious problems.
Paglen: Absolutely. And so for me, one of the takeaway things to think about when we’re thinking about AI is that this is not neutral. There are specific kinds of power that these systems are optimizing for. Some of them are maybe unconscious, you know, kind of racial positions or that sort of thing. But some of them are quite conscious. You know, the kinds of systems that are going to become more profitable and reproduce themselves are ones that are going to make money. Are ones that are going to enhance military effectiveness. Ones that law enforcement would want to capitalize on. These are the kinds of vectors of power that are flowing through these. And so I think for me it’s always important to make that point: this is not happening in a vacuum. It’s not a level playing field. And that’s probably part of the civic project, is to think about what kinds of power we want flowing through these optimizations.
Crawford: And I think showing people how power works is really key here. And this is where I think of your work now on machine vision: you’re really showing people, like, these are the different ways that bodies are tracked and understood. It’s very different to human seeing. It has a whole range of capacities that people are not used to looking at. And I know you’re making a series of works that will really start to show people what this quite alien way of seeing looks like. And I think that is quite a radical and important act right now, simply because a lot of people are not aware of how much these systems are around us all the time.
So part of what I think we can do now, and where I think artists and activists and academics can all really start to work together, is first of all: how do we show people the materiality of these systems? And how do we start to think politically, not just about hiding from them, or assuming that encryption is going to be the answer. Because I fear that we’re in an arms race now. It’s actually going to take a lot more political pressure. It’s going to take a lot more research. And it’s going to take a lot more public interest in this question. Because I mean, one of the things that I know we agree on is that this feels like a very big storm cloud that’s on the horizon. Like a lot of changes are about to happen, and a lot of people are just not aware of it yet. So making this public awareness a bigger issue I think is really important at this point.
Paglen: Absolutely. So I think that with that, maybe we have time for a question or two. I’m not quite sure, but—
Crawford: Are we allowed to do questions? No, we’re not. Sorry about that. You can come and talk to us later, or tonight at the party. But thank you so much. Thank you, Trevor.
Paglen: We’ll see you at the party. Thank you guys.
Further Reference
Invisible Images (Your Pictures Are Looking at You), Trevor Paglen at The New Inquiry