Tal Zarsky: What I’ll try and do now is address the analytical argument presented in Frank’s paper, which to a great degree touches upon a variety of problems arising from the use of predictive algorithms when what they’re trying to predict is human-related decisions and actions.

Now, another subset of Frank’s paper is not only that this dynamic has problems, but some of these problems arise from two traits. One is that this process is automated. And the other that this process is opaque. And when you look at these two elements, then the next argument [that] follows is if we generate a greater extent of transparency and perhaps limit the level of automation, some of these problems will be mitigated, a question I would like to touch upon.

Now my main contribution to the discussion would be trying to look at Frank’s overall discussion that he had—here are several elements in the slide—and try and break down four main topics which are forms of four different attacks or problems with these processes, and run this analytical process of looking to what extent are these problems exacerbated by automation and lack of transparency. And if we take away and have less automation and more transparency how are these problems possibly mitigated?

Now also what I intend to focus on is less the obvious insights and more the somewhat more provocative points. And the rest of these issues I touch upon in various articles that I wrote and you could easily find them online.

Now, before I go further, I already made one mistake. Because I noted automation and transparency as two different issues, and to a great extent they’re interwoven and connected at various contexts. And I’ll talk about one context now.

When you make a decision to opt for an automated process, to some extent you’re already by doing so compromising transparency. Or you could say it the other way around. It’s possible to argue that if you opt for extremely strict transparency regulation, you’re making a compromise in terms of automation. And let’s try to demonstrate this with a point from Frank’s paper. He talks about credit ratings, the FCRA, and in a recent amendment he notes that you must inform the individual of the four dominant factors which led to the rating decision about him.

Now, if the process is fully automated, which means that a data-driven process with no analyst involvement takes all the data, crunches it, comes up with various patterns and categorizations, it might be impossible to indicate these four leading factors. There might be thousands of such factors, each one interacting differently with the other. So, once you make this law that you must have the ability to indicate these four dominant factors (and Frank correctly notes that this rule is also watered down), there will be a compromise here in the level of automation, and some would say in the level of accuracy the process requires.

Now this is a fair point. It’s fair for the regulator to come in and say, “We find transparency more important than having full automation.” But it’s a point that we have to remember. There are other interactions between these terms of automation and transparency but we’ll set that aside for the moment.

Now, I want to go back to the opening point here. We’re having these discussions about using algorithms, and this is closely related to a term I don’t think I’ve heard this morning (maybe I missed it): the term “big data.” Many people are saying that to overcome this big data issue you need to run these algorithms, and you have big data in various contexts. But we’re talking here, especially Frank in the first part of his section, about information which relates to human behavior. And when we make predictions we’re not making predictions about physics, Frank points out, we’re making predictions about how people will behave.

To make such predictions, you need to have certain assumptions about the consistency of human conduct. That people will carry through their behavior. That there is such a thing as a human trait. And it’s a very uncomfortable feeling because this builds into our understanding of free will, right? If you could predict how someone will behave, should we actually judge him negatively if he does something wrong? So lawyers and philosophers from various schools might feel very intimidated by this notion, and I think that this leads to the clash that many people have with this notion of predicting and governing algorithms. And it’s very…in my opinion it’s very close to this notion of privacy, also. From a different perspective, when it comes to privacy, this notion that not everyone could always see what we’re doing and track us at all times is something that’s part of liberal society. And when it’s coming under attack we have a very visceral feeling that something’s wrong. We can’t put a finger on it. And it might come from this internal clash. What we’ll try to do now is understand where it’s coming from.

So, we could have four general perspectives on problems of predictions. I’m not going to go through the list now, let’s start out with the first. Now the first argument, which is a running theme in Frank’s paper and also presented now, is that when you use these predictive algorithms, you’re decreasing overall welfare. What does that mean? It means that we have a process and this process doesn’t really work. It’s ridden with errors of various forms: errors in information, errors in the process. And at the end of the day, no one is going to gain out of this system at all. Okay.

So this is one argument that you see in the literature. But when you think of this argument you need to think of two other elements. One is you have to think of the alternatives, and you have to remember the previous practices of people writing specific reports about individuals. Everything was done by hand. They were looking at their personal traits. Do we really want to move into that alternative world? And in addition, we have to talk about what this process facilitates. It facilitates overall a low level of credit in the United States which uses this system. So this has benefits to all of society, and it’s especially a benefit to those with limited access to capital. So this is a compromise we make within society.

Now, Frank explains that transparency, or the lack thereof, plays a powerful role: a level of opacity leads to the problems of this process. Now, that could be argued from various directions. One argument is we don’t really know what’s happening and therefore we can’t make corrections. Or there’s a lack of an incentive for these firms to manage their practices correctly because we don’t know what they’re doing. And there are arguments to be made as to how much merit this argument has because even if we know what is happening it’s not sure that we have reporters going after that. And government will be involved because this is a very complex and technical topic.

But even setting that aside, and our next panelist will talk about this to a greater extent. Once you have transparency, so…there is this fear of gaming, right? And what does gaming actually mean? It means that if people understand how these predictive processes work, people would understand what the proxy is, and they’ll work around the proxy but still at the end of the day they reach the problematic outcome. That means that if there are various indicators for problematic behavior, they will engage in different behavior, but still the outcome will be the same. And we’ll hear later about how problematic that is. And the problem might be that this undermines the entire system of lowering the level of credit.

Now that is true. But I want to point out an even greater problem. Which is once you have transparency with regard to these proxies, people are going to work around and try and game the system. And the system might crash and we might not care because we might have a higher level of credit. But this might lead to massive negative externalities because people will engage in actions that are deemed problematic to credit. So one example from the paper is that certain discount stores are an indication of a lower level of credit. That means that people will stop shopping in these lower discount stores. Which might lead overall to problems in the economy and decrease overall welfare because we do want people to purchase at discount stores. And this is only one example. We don’t want them constantly acting thinking about how they’ll work through their credit score because that will have overall negative impact. So this is one point. We have more on that but I’ll skip ahead.

Now the next point…I think this will be my final point here, is that we talk about…in the paper, that regardless of the fact that this enhances or decreases overall welfare, there are various unfair outcomes. Because you have transfers of wealth among groups. And we’re pointing to two different groups here. There’s a transfer of wealth from consumers to firms. And there’s a transfer of wealth from more powerful groups to the disenfranchised, or the other way around. And how will a higher level of transparency affect these issues?

So, this is something that’s worth thinking about, and let me give you two quick intuitions. One intuition is that because the firms have so much information about us, they could entice us into these unfair deals. And Oren Bar-Gill writes about this extensively and he also talks about the effect of disclosure on this process, so this is another interesting realm of literature in the law and econ world that actually looks at the same problem and offers the same solution. And it’s compelling to figure out if really transparency will lower this risk of the transfer of wealth from weaker consumers to the larger firms because we have so much information and could structure our predictions based upon that.

But the other point is what will be the effect of having more information about the process, about the fact that we’ll have transfer of wealth from various groups to another. So the basic intuition is that what’s happening now is that we’re able to put our finger on people for whom there’s a higher chance that they’re not going to pay. And we’re lowering the risk from them and transferring the wealth to these strong groups. And if I’m sophisticated (I have access to knowledge, I have access to education), I could make sure I’m not going to be in that weaker group. And if there’s transparency…so government and firms will say, “No no, we can’t do that. You can’t allow them to do that,” so that’s how transparency will solve this.

But a possible theory would be if we have transparency this problem will only get worse. Why would that be? Now let’s think about this world. We have full information, and we see the process and we see that various actions, various groups, are indicated as higher levels of risk. So think of Walmart, for instance. Data mining is indicating Walmart, people shopping at Walmart, higher level of risk. What is Walmart going to do? It’s going to put pressure through Washington or directly on these groups. Think about it that there’s no indication about lawyers or doctors and their level of credit. What will groups of lawyers and doctors do? Put pressure through Washington or directly, “Indicate us in the list as lower level of risk.” And therefore, the fact that you have transparency as opposed to opacity will bring in all these discussions of political economy and private interest into this realm of automated prediction, and make them even more biased and therefore transparency will be a problem and not a solution. So I encourage you to look at this paper, which I hope to open up later. And other papers I wrote about this issue. And for more discussions on this point.

Do I have thirty seconds for a last point? Okay.

So I focused mostly on the credit point. I want to talk briefly about a very interesting segment in the paper which Frank refers to as “the need for speed,” right. That you have great incentives currently in Wall Street that the banks are using technology and telecommunication infrastructure to allow them to engage in faster transactions, even a split second before the other, and in that way they get the deal first and therefore they have advantages. And this is something that securities regulators all around the world are thinking about. And recently I with a colleague wrote a paper called Queues in Law and I found this is a very interesting example. That’s the entire premise: the reason you have this need for speed is this notion of first in time, first in right. And we really have to think of this very basic notion in society and why are we saying that in this specific instance, first in time should be first in right? Perhaps this is a point that there’s no reason to accept this notion, which is very basic to our understanding of fairness and efficiency, and set that aside.

And I just want to say that at this point here actually, this fact that you have always first in time, first in right, it’s allowing wealth transfer to sophisticated players. So there’s a problem with using this model. But on the other hand this is something we need to think about. There are again massive positive externalities. Because you have firms like Wall Street putting in a lot of money into ICT, telecommunication infrastructure, computer systems that are later used in other realms which assist society. So in fact, all of us investing in stock markets and allowing some of our funding to pass on to Goldman Sachs, that put the money into inventions in telecommunications, which come back to us in an infrastructure advantage, maybe that’s a deal that we really need in society when government doesn’t have incentives to invest in these forms of technologies. Thank you very much, and questions I’ll be happy to receive at this address.

Further Reference

The Governing Algorithms conference site with full schedule and downloadable discussion papers.