The Relevance of Algorithms

https://vimeo.com/69641358

Let me start with an assumption that comes out of the paper, that’s available on the web site if you care to look at it, that one of the things that brings us here is that we’re watching algorithms move outside of the theoretical realm. So outside of the computer science questions about how they’re built and how they work, and being deployed inside important moments in society.

What I like to think about is this question of how they are being installed as functioning parts of our public knowledge system. The ways that they’re being presented as efficient, reliable, authoritative mechanisms for producing and delivering knowledge. And I think this is right in line with the point that Joan gave us yesterday, that we are interested in part because Google has pointed to algorithms. We saw examples of that. This is what’s going to assure that the information you’re getting is reliable. This is what’s going to assure that the information is relevant. I hope it’s not just fear of algorithms that’s driving us, although maybe that’s a part of it.

But it’s an interesting question. Is that Google making a sort of empty gesture? Is that a deflection of responsibility? Is that the deception that is in fact part of the algorithm work that Robert was asking about? Or is that something more? Is something being installed and offered? If not true yet, something that is being positioned as true, as a reliable form?

So the aim of the paper is that we might see algorithms not just as codes with consequences, but as the latest socially-constructed and institutionally-managed mechanism for assuring public acumen, a new knowledge logic. And this, I would say, draws our attention to the process by which that happens, which is not exactly the same as how do algorithms work, although it’s not unrelated, either. So what I’m suggesting is we’re not just looking at the production of algorithms but the production of the algorithmic as a kind of social justification, as a kind of legitimation mechanism.

And this requires asking how these tools are called into being by, enlisted as part of, and negotiated around collective efforts to know and be known. So let me see if I can draw some attention to that. When this is working, this depends on the kind of authority that an algorithm produces, a kind of lent authority of technical and calculational reassurance. And what I like to do is look at some of the frayed edges where the social role of algorithms as tools of knowledge is still unsettled.

So let me start with this example. In 2011 there was a bit of an uproar because Siri had just been introduced as this sort of voice-activated search mechanism for the iPhone, and people noticed that depending on where you asked questions, it seemed to be strangely unresponsive or downright coy about questions of abortion. And this is just one example. People did this in different cities. There was one story about a woman standing outside of Planned Parenthood asking, “Where can I find an abortion?” and it said [shrugs], “Hm. I dunno.”

This is a really interesting question. Apple had to sort of field this early on in its construction of Siri as sort of a reliable information asset. And if we thought about this, if the idea of treating this as an algorithmic exercise, there’s a pretty reasonable explanation for why these answers were unsatisfactory to people who were concerned about it.

It might be easy enough to say well, Siri is querying search engines. It’s looking at Yelp and it’s looking at Bing Answers and it’s looking at other search queries. And so when you say something like, “Where can I get an abortion?” it’s parsing that and saying okay, it’s looking for location based on where it’s looking for abortion; that’s my subject topic. I’ll put some version of that into a search mechanism and I’ll see what comes back. And if we think about the kind of material that’s on the web and how it’s organized, we might say, “Well, a site like Planned Parenthood might not have ‘abortion’ as its key information term. People who link to it might not be using the anchor ‘abortion.’ ”

But pro-life activist groups may very well do that. In fact, the head of NARAL Pro-Choice America said that the kinds of crisis pregnancy centers that are offering services but not abortion (quite deliberately not abortion) outnumber services that provide abortion. They try to game the system in making sure that the yellowpages.com search engines will point to their sites rather than something that would provide abortion services. This is a deliberate mechanism and search engines are yet not savvy to this, so it’s very hard to parse that.

So Apple could’ve said, “Look, this is based on an algorithmic assessment of Web information. The query you made called up certain kinds of resources. When you asked for Viagra, we were able to find drug stores. We could put that together. But abortion played into this strange mixture of what is and is not searchable.”

So maybe this is just a question of naïveté on the part of the people who were asking the question. Maybe we could call for algorithmic literacy. We could say, “People should understand that when they say, ‘Where’s an abortion?’ to Siri they’re going to get certain kinds of answers. If they find those politically troubling, that’s not Apple being pro-life, it’s an artifact of the way search works.”

And this is not unlike the example that was brought up yesterday by Claudia [Perlich]. The famous Target example. Target predicting whether their customers are pregnant, trying to send them coupons, and the father who got the coupons got all upset. We might think about it and say, “What’s weird is not that Target is trying to make a bet about how probable it is that because you bought certain kinds of things you match a pattern of other kinds of people who might’ve been pregnant and we send you a couple coupons. The weirdness is that the dad freaked out.” The dad got coupons from Target for baby carriages, and took that not as “Target has a probabilistic bet that someone in the household might or might not be pregnant and it’s worth it to them to send some coupons,” but he took it as an assertion. He took it as a claim. “Someone in your household is pregnant.” And to the best of his knowledge that wasn’t true. It turns out he was wrong. And so what might be strange is that the people involved are making misapprehensions of what algorithms offer.
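To make the difference between a bet and an assertion concrete, here is a minimal sketch of the kind of calculation described above. It is not Target's actual model: the function names, the dollar values, and the probabilities are all hypothetical, chosen only to show that a mailing can be worth sending even when the estimated probability is low.

```python
def expected_gain(p_pregnant, value_if_right, cost_per_mailing):
    """Expected profit of mailing one coupon, given a probability estimate.

    All three inputs are hypothetical illustration values.
    """
    return p_pregnant * value_if_right - cost_per_mailing


def should_send(p_pregnant, value_if_right=50.0, cost_per_mailing=1.0):
    """Send the coupon whenever the bet has positive expected value."""
    return expected_gain(p_pregnant, value_if_right, cost_per_mailing) > 0


# A household scored at 25% still gets the coupon: the retailer is betting,
# not claiming that anyone is pregnant.
print(should_send(0.25))   # worth mailing under these assumed numbers
print(should_send(0.01))   # not worth mailing under these assumed numbers
```

Under these assumed numbers the coupon goes out to a household that is three times more likely not to match the pattern than to match it, which is the gap between what the system calculates and what the recipient reads into it.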

Now, should that explanation be sufficient? Is that enough? I would say even for me, it feels insufficient for this. We could look at this and say, “This is a kind of naïveté about how algorithms work.” We could also look at it and say, “This is an articulation that we want more from our algorithms than can be provided algorithmically.” That when it comes to abortion, when it comes to these politically divisive issues, a purely algorithmic solution is not going to be enough.

And it’s a call for that. Rather than a naïveté, it’s saying, “This was insufficient, and we call upon Apple to be better about this.” Now, it gets conflicted. The people who believe one side of this are calling for one change. People who believe the other side are calling for a different change. This doesn’t solve the problem, but it recognizes that there are complexities in what we expect from a knowledge regimen. And the ability to gesture at algorithms and say, “This was a provided piece of knowledge that was algorithmically based,” is fine up to a point. But we find these edges where that becomes insufficient. And there is an outburst, a reaction. Maybe not fully-articulated, maybe unclear, but a reaction that says, “You’ve reached a point that is insufficient.” And I think that’s what was going on here.

So how would we begin to look at the production of the algorithmic? Not the production of algorithms, but the production of the algorithmic as a justifiable, legitimate mechanism for knowledge production. Where is that being established and how do we examine it?

We could look at algorithms in practice and ask about the implications of the results they offer, the conclusions they draw; that’s one way. We could look empirically at what people think of them when they rely on them. Do they treat them as perfectly unproblematic information sources? Do they question them, are they skeptical? We could look at controversies and think about when the claim to have been providing information algorithmically turned into a problem.

I want to suggest that looking at how sites regulate inappropriate content; when they run into questions about censorship; when they run into the kinds of information that people don’t want to see, provides a really interesting lens. This being sort of one of them. Here was “This is information that you’re not showing me that I would like to see.” But those edges where we begin to hold platforms responsible for the information they provide, especially around the kinds of traditionally hot-button issues around sex and violence, pornography, politics, suicide. All the kinds of things where we find ourselves troubled by the information regimen that could be offered.

Looking at the question of how information is curated and the role algorithms play in this I think poses a really interesting lens for this. Fundamentally, it’s about making value judgements, so it reminds us that the algorithms are making value judgements all the time, but those value judgements may run into each other in these cases.

Judgements about what not to show are contentious because they are both in and not in the service of the user. Sometimes this is for the sake of the user community not seeing something. Sometimes it’s, “Someone might want to see this but I’m not going to show it to them.” And a place where otherwise organizing principles have to be curtailed and set aside.

And also, when we worry about offensive material, inappropriate material, it urges us to want to decide who’s talking and who’s responsible. And that question of responsibility and accountability is one of the lenses where we can bring algorithmically-produced information into view.

Finally, it also works against the kind of broad, probabilistic perspective that I think is more native to contemporary algorithmic use. It was not surprising to me that Claudia’s information yesterday was about advertising. The idea that you can make probabilistic guesses based on what people have been searching and what they’re purchasing. And if you get two clicks in ten thousand, that’s a success. In that environment, the one moment someone is offended by content can be washed away.

But when we talk about offensive content, that one moment is highly troubling. So it’s that place where one instance of providing the wrong information becomes politically problematic, despite the approach that says most of the time we get it mostly right and that’s sufficient.

So let me pick on Google for a little while, since we’ve been doing that, and think about the way Google talks about whether or not or in what cases it wants to censor algorithmic results.

We start with a canonical description that Google has often brought out when people have criticized it for information it’s providing and said that it should change the index, which is an instance early on when searching for the word “Jew,” the first result that would come up on the search page was a highly anti-Semitic page called Jew Watch. And when people realized that this site was coming up as the first result, there was a great deal of criticism calling upon Google to say, “This needs to be removed. This needs to be altered. This is problematic.”

And Google made a decision not to alter that index at the time. And they made quite a bit of hay about it, saying they were internally torn, they thought this was a reprehensible site. But in the end, it was important for them not to alter the index. The same kind of answers that they gave to the Bettina Wulff case: It’s the Web telling us this. It’s the algorithm judging this. If you don’t like the results, your critique is with the Web and with this site, not with the index. And if we get into the game of messing with the index and starting to alter things, then we’ve given up the ghost. It’s a problematic move. No provider’s been more adamant about the neutrality of its algorithm than Google, and it regularly responds that it shouldn’t alter the search results.

So when Google in its “Ten things we know to be true” document or manifesto says, “Our users trust our objectivity and no short-term gain could ever justify breaching that trust.” I would say this is neither spin nor corporate Kool-Aid, it’s a deeply-ingrained understanding of the public character of Google’s information service, internal to Google, and it’s one that both influences and legitimizes many of their technical and commercial undertakings.

It doesn’t mean they don’t alter the index. But it’s something that they offer as an explanation for how to think about the index and how to think about their role. Part of this is that the algorithm offers a kind of assurance, a kind of technical and mathematical promise. Frank [Pasquale] in his paper [p.1] calls it the patina of mathematical rigor. And that lends them a kind of safe position from which to respond to criticism.

Google Suggest seems to be a different story. Google Suggest is the function where if you begin to type a search query it will try to fill in what it’s guessing you’re searching for. And we can see that this is meant as a pretty productive thing. We’ve done it as a workshop. We’ve typed in “governing a,” we get “governing algorithms” as the top hit. Nice predictive effort to fill in a space that I might very well have been typing in.

People have made light of the fact that it comes up with some pretty bizarre answers sometimes. A curious kind of hieroglyph about what it is that people are looking for in the world. And then sometimes you can begin to type and it will fill in some information. So, “how to ki” gives us something, but as soon as you put two more ls in, it stops… And it doesn’t give us any more results.

I’ll say first, there are a number of queries in which this will happen, where it simply will refuse to give you auto-suggest things. It’s not as if there are no search queries ever made that began with “how to kill.” I think Google’s worry, for a number of things, could be that the next word is “yourself,” and that’s really troubling. They’ve had a lot of concerns about whether they’re providing information in a suicidal environment. Maybe they’re worried about techniques, teaching people how to do things.
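The behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not Google's implementation: the query log, the counts, the `SUPPRESSED_PREFIXES` list, and the `suggest` function are all hypothetical. The point is that the popularity ranking and the suppression rule are two separate steps, and the second one is a curatorial decision layered on top of the first.

```python
from collections import Counter

# Hypothetical query log: how often each full query was searched.
QUERY_LOG = Counter({
    "how to kiss": 120,
    "how to kill time": 80,
    "how to kill weeds": 60,
    "how to knit": 40,
    "governing algorithms": 10,
})

# Hypothetical curated list: completions beginning this way are never shown.
SUPPRESSED_PREFIXES = ("how to kill",)


def suggest(typed, log=QUERY_LOG, suppressed=SUPPRESSED_PREFIXES, limit=3):
    """Return the most frequent logged queries that extend what was typed,
    leaving out any completion that starts with a suppressed prefix."""
    typed = typed.lower()
    matches = [
        (query, count)
        for query, count in log.items()
        if query.startswith(typed)
        and not any(query.startswith(p) for p in suppressed)
    ]
    # Most frequent first; ties broken alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


print(suggest("how to ki"))    # ['how to kiss']
print(suggest("how to kill"))  # []
print(suggest("governing a"))  # ['governing algorithms']
```

In this sketch the queries beginning with “how to kill” are still sitting in the log with their counts; they are simply never offered back to the person typing.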

How is it that this instance compares to the Jew Watch instance? In both cases, an algorithm result, based on mathematically rigorous assessment of user search queries and activity, produces a result that’s problematic for Google and troubling to people, and yet in the first case they are proud to say, “We don’t alter the index no matter how reprehensible the result that is returned,” and in this case they say, “No problem, we’ll take things out?” Why are they so willing to censor the auto-complete function when they’re usually so adamant about not censoring search results?

Let me give you a hybrid case. A couple of years ago there was an instance where if you typed in “Michelle Obama” into an image search, the very first image that cropped up was a highly racist, hideous Photoshopping of her face with the face of a baboon. Quite awful, quite stirring up some very old and troubling racist tropes in American society. And similar to the Jew Watch incident, people began to complain, said, “Google should do something about this. This is reprehensible.” And their first answer was exactly as before. They said, “We don’t change the index. We find it reprehensible, but we don’t change the index. This is the Web telling us for whatever reason people are linking to this. That’s what we’re calculating, and sorry.”

But criticism did not subside, and Google made a second decision. The second decision was that they would alter the index. They would take the image out of their image search. They replaced their ad banner with a little message “this index has been altered, click here to find out why.” So a moment where the attempt to say algorithm prevails, this information has to stand because the algorithm measures something and we should let the algorithm do what it does, it’s better to let it do what it does than to start mucking about, fell in response to this criticism.

So maybe race trumps religion? Maybe this was more horrific than Jew Watch. Maybe because it’s a sitting First Lady, right? Those explanations don’t quite stand. What I would suggest is that there is a different sense of proximity to the results. Maybe in legal terms that would be “liability,” but I would say it’s beyond that.

When Google serves up the link to Jew Watch, it is a result that must be clicked on. So the user’s still making a gesture that says “I will go visit this.” Google has offered it up at the top, but it hasn’t actually delivered it unto you. The Michelle Obama image is actually recreated in the image search, in thumbnail form. So Google’s a little closer to providing the image, it actually made it visible to you. Auto-suggest actually makes suggestions. It actually pops those things in. In fact, it’s not only something that seems to be coming out of Google’s mouth, it’s putting words in your mouth. “Isn’t this what you meant? Didn’t you mean, ‘how to kill yourself?’ ”

And that proximity is a really interesting, troubling question, because it raises the question of whose voice do we think the algorithm is? And the kind of murkiness, the kind of fraught relationship we have to this idea that at an arm’s distance the tool produced that information. You don’t like Jew Watch? The tool produced that information. That’s the Web, and it’s carefully calculated, and we’re just over here doing our job. That distance gets narrower and narrower as we think about where the results are being provided from.

So we have, I would argue, a fraught relationship to the idea of algorithms and what they produce. Sometimes they are reassuringly offered as neutral tools, a reflection of what is. Sometimes they’re a measure of user activity, reflecting of us. And sometimes they’re the voice of the platform, what they say. What does Siri say? What does Apple say? What does Google say?

And this is more than a question of what do naïve users think? It’s not like, “Oh, somebody thinks Siri’s telling me the answer.” It’s how have we positioned these things as being the voice of the provider, or the voice of the tool, or the voice of our activity reflected back to us. And those things are not simple, and they have not been sorted out.

Let me do one more example, because it’s sort of fascinating to me and because there’s a different set of algorithms that I think we have another sort of fraught relationship with. I’m going to pick on Google a little bit again, sort of. But curation of algorithmic results for a different reason.

There are a number of tools that I would call internal popularity mechanisms. So, how platforms like to tell us what we’re all doing on that site. Things like what’s the most popular video? Things like what’s the best-selling book? Things like what’s been most-emailed or most-viewed or most-often read? And then something which I’ve spent too much time thinking about, the Trends on Twitter: What are people talking about right now?

These are really fascinating to me, these kind of popularity mechanisms, presenting back in real-time, which I think is important. Apart from the information resource itself, these measures of interest, measures of activity, these are powerful ways to keep someone on the site. Maybe they’ll click on that article and maybe it’s more likely than random to be an interesting one. And there’s lots of measures of activity and popularity that can be summoned up.

So here the knowledge is both from us and about us, and the question of whose voice it’s speaking in is once again tricky. This is not new, that we’re told back to us what’s popular. And it’s not an attempt to be naïve and say that we’ve always expected those things to be an unhandled, uncurated measure. Telling us what’s popular is always a mechanism that encourages us to buy something or just read something, encouraging us to think about something.

But it’s important to ask what’s the gain for providers to make such characterizations? How do they shape what they’re measuring? And how do these algorithmic glimpses help constitute and codify the very publics that they claim to measure? The publics that would not otherwise exist except that the algorithm called them into existence. That makes it, I think, even trickier when we begin to adjust the results.

YouTube made an announcement in 2008 that it was going to begin to algorithmically demote certain videos, videos that they didn’t find so problematic that they were going to remove them according to their guidelines, but were suggestive enough and adult enough that they wanted them out of their most-viewed, most-favorited lists.

And I thought this was a really peculiar thing to do, right? It’s kind of like, “You just said that they were kind of okay. They don’t break the rules. But we’re going to obscure them a little bit.” This is a very clumsy way to keep bad stuff away from the wrong people. It’s still there, it’s still working.

So the question was what else does this do? What else does that measure of popularity do besides being an actuarial measure of what’s popular? Well, it turns out that YouTube uses those algorithmic measures to pre-populate the front page. When a new user or an unregistered user shows up, they fill that page with videos you might like, and they base that on popularity. What they don’t want to do is have a new user show up on YouTube and find a bunch of bikini videos [and] get the wrong impression. Even though those bikini videos are in YouTube and they said they’re okay.
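A small sketch may help show how little it takes for a popularity list to stop being a plain count. Everything here is hypothetical and is not YouTube's actual system: the `Video` class, the `demoted` flag, the titles, and the view counts are invented for illustration. The same catalog produces two different "most viewed" lists depending on whether the demotion flag is applied.

```python
from dataclasses import dataclass


@dataclass
class Video:
    title: str
    views: int
    # Hypothetical flag: the video is allowed on the site,
    # but is kept off the popularity lists.
    demoted: bool = False


def most_viewed(videos, limit=3, apply_demotion=True):
    """Rank videos by view count, optionally skipping demoted ones."""
    eligible = [v for v in videos if not (apply_demotion and v.demoted)]
    ranked = sorted(eligible, key=lambda v: v.views, reverse=True)
    return [v.title for v in ranked[:limit]]


CATALOG = [
    Video("bikini compilation", 900_000, demoted=True),
    Video("music video", 700_000),
    Video("cat on a skateboard", 500_000),
    Video("cooking tutorial", 300_000),
]

# The raw measure of what people watched.
print(most_viewed(CATALOG, apply_demotion=False))
# The curated measure that would fill a front page.
print(most_viewed(CATALOG))
```

Both lists are computed from the same view counts; only the second one is what a new visitor would be shown as "what's popular."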

So rather than curating the list of popularity because something is so offensive that it shouldn’t be there, a kind of classic censorial removal, they’re curating their own self-presentation by altering the algorithm. What do we not measure, so that the product can do work for us, can populate that front page well?

And I’m just going to take one minute to give this idea, because it will connect to Kate’s talk. I want to think about this idea of what I’ve been calling calculated publics, and it’s an unformed idea. Maybe Kate will get it even smarter than I’ve been able to. Is it such that these algorithms that measure up “here’s what’s going on right now, here’s what people care about, here’s what’s highly-ranked,” which are very easy to add as features, very easy to offer. The sites have that data and what a convenient way to maybe get someone to stay on the site a little longer, read one more article, watch one more video.

But when they are offered up as “this is an insight into what people care about,” what’s trending, what’s most watched, what’s most important, do we read off of those an idea of that public that it represents? And if that’s not only algorithmically measured, which means certain people are being counted, certain actions are being counted, certain things are being weighted, but we’re also secondarily using that as a way to constrain it… Not because we want to show what’s popular but because we want to show a carefully curated version of what’s popular (because it serves the front page, because it makes recommendations). Then what kind of assumptions are we making about what this might seem to offer as a true glimpse of the public, versus a kind of curated version of the public?

There’s a fundamental paradox in the articulation of algorithms. Algorithmic objectivity is an important claim for a provider, particularly for algorithms that serve up vital and volatile information for public consumption. Articulating that algorithm as a distinctly technical intervention, as Google often does, helps an information provider answer charges of bias, error, and manipulation. Yet at the same time, there are moments when a platform must be in the service of community and its perceived values. And algorithms get enlisted to curate or are curated.

And there’s commercial value in claiming the algorithm provides better results than its competitors, provides customer satisfaction. In examining the articulation of an algorithm, we should pay particular attention to how these tensions between technically-assured neutrality and the social flavor of the assessment being made are managed and sometimes where they break down.

Thanks for your patience.

Further Reference

Two other presentations followed this, in response:

Response to Tarleton Gillespie’s “The Relevance of Algorithms” by Martha Poon
Can an Algorithm Be Agonistic? Ten Scenes about Living in Calculated Publics by Kate Crawford

The Governing Algorithms conference site with full schedule and downloadable discussion papers.

A special issue of the journal Science, Technology, & Human Values on Governing Algorithms was published January 2016.

Open Transcripts

presented by Tarleton Gillespie
in Governing Algorithms » The Relevance of Algorithms
on 05/17/2013
