J. Nathan Matias: In the first half of today’s summit, we’ve heard Tarleton describe some of the challenges we face around moderation. And we’ve heard Ethan and Karrie talk about ways that people working from outside of platforms, outside of companies, can use data to understand complex issues and create change. But how can we be confident that the change we want to create is actually happening? Now, that’s a driving goal behind this idea of citizen behavioral science, and something that we at CivilServant want to invite you to join us in.

Now, in a series of short talks we’re going to share examples of some of our past and upcoming work, alongside examples from our parent organization Global Voices. But I want to start by saying something about how we go about our work.

I want to start with this example that has been appearing in the news. People have been publishing claims by ex-Google employee Tristan Harris that turning your phone gray can reduce the addiction that he says we all have with our smartphones. Nellie Bowles wrote in The New York Times, “I’ve been gray for a couple of days and it’s remarkable how well it has eased my twitchy phone checking.”

Now, Bowles did a smart thing. She decided to do a test and collect evidence. And whenever we have a question, whenever we have a concern about the way that technology might be impacting our lives—it could be discrimination, it could be the policies that we co-create—collecting evidence is a smart move. Now how could we do that systematically?

So here is a chart of the number of seconds that I typically look at my phone every time I pick it up. Because when you want to do a test, one of the first things to do is figure out what to measure. So, I wanted to know if addicting colors were influencing my attention. If they were, I would expect to be looking at my screen for longer. And maybe if I turned my phone gray, perhaps I’d be spending less time with the phone in front of my face.

Now, measurement doesn’t necessarily have to be a number. I could keep a notebook, or I could have conversations with people. But in this case I chose to report it like this.

Now, I want to ask you to imagine that I decided to turn my phone gray on Monday and then observed it over the next two days. I would have looked at this and seen that, oh, actually this is impacting my phone usage: I’m looking at the phone as little as half as much every time I pick it up, compared to how much I was using it on Sunday.

But now let’s imagine that I actually decided to start on Friday and did a gray phone challenge then. Well, I might look at the data and think, “Actually, turning my phone gray makes me look at my phone even more.” It’s the luck of the draw of which day I decided to look at my phone and which day I decided to start observing. It could lead me to a completely different conclusion based on chance.

So, when we do experiments we try to control for that factor. We flip a coin. We say well, I’m going to flip a coin on Sunday, I’m going to flip a coin on Monday, Tuesday, Wednesday. And based on pure random chance, I’m going to then see how much I end up using my phone on that day. That’s a way that allows us to make sure that there’s no outside influence. Because it’s possible that when deciding when to start my own gray phone challenge I might have unconsciously been thinking, “Well, I’ll wait to start until I have the time to do something different.” And that might in fact have been a day when I was using my phone less.
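That coin-flip design can be sketched in a few lines of code. This is just a minimal illustration of daily random assignment, not the CivilServant software; the week of days and the names `gray`/`color` are assumptions for the example.

```python
import random

# Each day, pure chance decides whether the phone is gray or in color,
# so no unconscious choice of a start day can bias the comparison.
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

random.seed(42)  # fixed seed so the example run is reproducible
assignments = {day: ("gray" if random.random() < 0.5 else "color")
               for day in days}

for day, condition in assignments.items():
    print(day, condition)
```

Because each day is assigned independently, over enough days the gray and color conditions end up spread evenly across busy days and quiet days alike.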

And so randomized trials and other kinds of experiments allow us to do a comparison that’s not based on that kind of luck of the draw, but gives us a truer, more confident picture of the effect of the thing that we’re looking at.

Now of course, if I just looked at it for two days I might be less confident than if I’d observed my behavior over a longer period of time. With each new day, my confidence in the result can increase. So let’s say for example that I am addicted to my phone, and that switching my phone to gray reduces the time I spend every time I pick it up by four seconds.

Now, someone with my phone usage would actually have to flip a coin and monitor my phone usage for about 315 days, over a year, to have a pretty good picture of the effect on my behavior on average. That’s a pretty inconvenient sample size. But if other people join me it’s possible to answer that question more quickly. Like if fifteen people joined in, each of us might only have to look at our data and do this experiment for three weeks.
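The arithmetic behind that is simple division: the same total number of observation days can be split across participants. The 315-day figure comes from the talk; the helper function is just for illustration.

```python
# Total observation days one person would need alone (figure from the talk).
total_observation_days = 315

def days_per_person(participants):
    """Split the total required observation days evenly across participants."""
    return total_observation_days / participants

print(days_per_person(1))   # 315.0 — over a year for one person
print(days_per_person(15))  # 21.0 — about three weeks each for fifteen people
```

This is why recruiting collaborators shrinks the burden on each individual: fifteen people observing for three weeks contribute the same 315 person-days of data.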

So when we think about how each of us individually might try to make sense of this online question or an online risk, there are benefits that we gain by working together, benefits we gain by asking questions with large numbers of other people. And in fact, this is something that I’m doing. I’m really curious about this claim by Tristan. Are our phones addicting us? Is the color in those phones influencing us and keeping us glued to those phones? And I’ll be launching something soon called the Gray Phone Challenge that invites anyone to find out if your phone is influencing your behavior in this way, and if so, how. So look out for that.

The reason I bring that up is that what I just described is something called a randomized trial, an A/B test, where, say, A is your phone in color and B is your phone in gray. And that’s a method that is especially useful for finding out the effect of an attempt to create change. It might be a question about a moderation policy. It might be a question about whether we’re seeing discrimination from an algorithm. But this basic method is something that we can use in a wide range of contexts where we know at least that there’s something valuable that we can measure.
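At its simplest, analyzing an A/B test like this means comparing the average outcome under each condition. A minimal sketch follows; the numbers are invented for illustration, not data from the talk.

```python
# Seconds per pickup on days randomly assigned to each condition
# (invented example values).
color_days = [34, 28, 31, 40, 29]  # condition A: phone in color
gray_days  = [26, 24, 30, 22, 27]  # condition B: phone in gray

def mean(xs):
    return sum(xs) / len(xs)

# The estimated effect is the difference between the two condition means.
effect = mean(gray_days) - mean(color_days)
print(f"estimated effect of going gray: {effect:+.1f} seconds per pickup")
# prints "estimated effect of going gray: -6.6 seconds per pickup"
```

A real analysis would also quantify uncertainty, for example with a confidence interval or significance test, before concluding the difference isn’t just noise.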

And that’s one of the methods that’s at the heart of the CivilServant nonprofit, which organizes people online to collaborate on discoveries about the effects of different ideas for creating change online. This is a project that started in my PhD here at the MIT Media Lab and the Center for Civic Media. And now, thanks to the Media Lab, Princeton University, and the funders I’ve mentioned—the Ethics in AI Governance Fund, the Mozilla Foundation, and generous donors—we’re able to take CivilServant and turn it into a new nonprofit that can support the public to do two things. The first is to ask the questions we have about how to create a fairer, safer, more understanding Internet. Whether we’re moderators on Reddit, YouTube channel operators, people on Wikipedia, or bystanders on Twitter, we can all find ways to collect evidence and find out what works for the society we want to see.

And the second part of what CivilServant does is to support people to do this auditing work: to evaluate powerful entities and hold them accountable for the role of their technologies in society.

In the next set of talks we’re going to hear from a set of communities that we’ve worked with over the last two years. And I want to say something about how that process has worked on the Reddit platform.

CivilServant, in addition to being an organization, is a software platform. You can think of it as a social bot that communities invite into their context to collect data and coordinate research on effective moderation practices.

So the research process starts when we have a conversation with the community. Usually our research agenda is shaped by the priorities and questions that moderators and communities on platforms like Reddit have. And then we have a conversation about what they want to test. At that point a community typically invites our bot into their subreddit, into their group, to collect data with their permission, to start observing and measuring the things that we’re interested in. We then discuss and debate what the right measures are and arrive at a description and a process for asking the question that they have.

I should note that CivilServant works in such a way that our ability to operate in the community is completely subject to that community’s discretion. If they want to end the study, they can just kick out our bot and we end the work. So we try to do work that is accountable to and shaped by communities.

We then run that research by university ethics boards and make sure that it’s approved in that way. We also often prefer that there be some kind of wider community consultation, though that’s not always possible. Once all of that has happened, we start the study, and it might run for a few weeks, in some cases a few months.

But ultimately, we complete the period. You know, it’s that case where I’ve observed my behavior for 315 days, or I found fifteen of my friends to make it much shorter. We get to the end of the study, and we analyze the data. We apply various statistical methods and arrive at some answer. Does changing your phone to gray reduce your screen time? Does banning accounts in a certain way or responding to harassment in a certain way change people’s behavior in the ways that we expect? There are a wide range of questions that you’ll hear about in just a moment.

And then we also have a public conversation, where we let people know here’s what we did, here’s what we found. Do you have questions? Do you have critiques? Is there further research that we need to do, other debates and questions we have to have about the ethics of this work? How can we be more transparent about our own research? And that often leads into conversations within those communities about how they should make decisions and change what they do based upon what we discovered.

In the case of my own phone behavior, at the end of this study I’ll have a decision: do I want to change my phone to only show me gray, or do I want to see colors? And that’s a conversation about values that we as researchers aren’t necessarily in the best position to make a decision on, but we want to put evidence into your hands to make the best decision that you can.

And the cycle doesn’t end there. After we’ve completed a study with one community, we’re in a good position to support other people to do similar research. If we’ve already built the software for a study, it’s really easy to set up something similar in a new community or a new context and gather more evidence. Because what works for me might not work for you—I might be more distractible than someone else—and rather than just take my word for it, or data from my experience, you might want to learn for yourself. That’s what researchers call a replication.

And one of the ideas behind the CivilServant nonprofit is that as we support more and more communities—more citizen behavioral scientists—to ask these questions, all of the results of what we discover will be shared to an open repository where anyone can read the results, and where anyone can do their own experiment and build up a larger picture of where, under what conditions, and exactly how the different ideas in moderation and other areas of online behavior work.

So CivilServant, as I said, has received funding from a number of sources. We’ve hired our first staff. We’re going to be announcing a number of new positions in the next months. And over the next year our hope is that with your help, with the help of people across the Internet, we’ll be able to not just create a trickle of these studies but really scale the pace of knowledge around a wide range of questions for a fairer, safer, more understanding Internet. Our goal for now is to see how close to 100 new studies we can get in the 2018 calendar year. And we’re hoping that you will be part of that conversation with us.

So next up, to give you a sense of the kind of research that we’ve already done and will be doing, I’m going to ask up a round of researchers and communities who are going to tell us about what we’ve done together, and also what our peer organizations have done. So if you’re one of our lightning talk speakers, now is a good time to queue up and be ready to come on stage. I’ll just ask you to come up, and when you finish, if you can step down and the next talk can commence.