J. Nathan Matias: In the first half of today’s summit, we’ve heard Tarleton describe some of the challenges we face around moderation. And we’ve heard Ethan and Karrie talk about ways that people working from outside of platforms, outside of companies, can use data to understand complex issues and create change. But how can we be confident that the change we want to create is actually happening? Now, that’s a driving goal behind this idea of citizen behavioral science, and something that we at CivilServant want to invite you to join us in.

Now, in a series of short talks we’re going to share examples of some of our past and upcoming work, alongside examples from our parent organization Global Voices. But I want to start by saying something about how we go about our work.

I want to start with an example that has been appearing in the news. People have been publishing claims by ex-Google employee Tristan Harris that turning your phone gray can reduce the addiction that he says we all have with our smartphones. Nellie Bowles wrote in The New York Times, “I’ve been gray for a couple of days and it’s remarkable how well it has eased my twitchy phone checking.”

Now, Bowles did a smart thing. She decided to do a test and collect evidence. And whenever we have a question, whenever we have a concern about the way that technology might be impacting our lives—it could be discrimination, it could be the policies that we co-create—collecting evidence is a smart move. Now how could we do that systematically?

So here is a chart of the number of seconds that I typically look at my phone every time I pick it up. Because when you want to do a test, one of the first things to do is figure out what to measure. So, I wanted to know if addicting colors were influencing my attention. If they were, I would expect to be looking at my screen for longer, and maybe if I turned my phone gray, I’d be spending less time with the phone in front of my face.

Now, measurement doesn’t necessarily have to be a number. I could keep a notebook, or I could have conversations with people. But in this case I chose to report it like this.

Now, I want to ask you to imagine that I decided to turn my phone gray on Monday and then observed it over the next two days. I would have looked at this data and seen: oh, this is actually impacting my phone usage—I’m looking at the phone as little as half as much every time I pick it up, compared to how much I was using it on Sunday.

But now let’s imagine that I actually decided to start the gray phone challenge on Friday instead. Well, I might look at the data and think: actually, turning my phone gray makes me look at my phone even more. It’s the luck of the draw—which day I happened to start observing could lead me to a completely different conclusion based on chance.

So, when we do experiments we try to control for that factor. We flip a coin. We say: I’m going to flip a coin on Sunday, on Monday, Tuesday, Wednesday, and based on pure random chance decide whether my phone is gray that day, then see how much I end up using my phone. That’s a way that allows us to make sure that there’s no outside influence. Because it’s possible that when deciding when to start my own gray phone challenge I might have unconsciously been thinking, “Well, I’ll wait to start until I have the time to do something different.” And that might in fact have been a day when I was using my phone less.
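The coin-flip idea can be sketched in a few lines of code. This is just an illustration of daily random assignment, not CivilServant’s actual software; the day labels, the seed, and the 50/50 split are assumptions for the example.

```python
import random

def assign_days(days, seed=None):
    """Flip a fair coin for each day: True means "gray phone" that day,
    False means the phone keeps its normal colors."""
    rng = random.Random(seed)
    return {day: rng.random() < 0.5 for day in days}

# Randomize a week of the hypothetical gray phone challenge
week = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
schedule = assign_days(week, seed=42)
```

Because the schedule comes from the coin and not from my mood or calendar, any unconscious preference about when to “start” the challenge can’t leak into which days end up gray.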

And so randomized trials and other kinds of experiments allow us to make a comparison that’s not based on the luck of the draw, but gives us a truer, more confident picture of the effect of the thing that we’re looking at.

Now of course, if I just looked at it for two days I might be less confident than if I’d observed my behavior over a longer period of time. With each new day, my confidence in the result can increase. So let’s say for example that I am addicted to my phone, and that switching my phone to gray reduces the time I spend every time I pick it up by four seconds.

Now, someone with my phone usage would actually have to flip a coin and monitor their phone usage for about 315 days—most of a year—to have a pretty good picture of the average effect on my behavior. That’s a pretty inconvenient sample size. But if other people join me it’s possible to answer that question more quickly. If fifteen people joined in, each of us might only have to do this experiment and look at our data for about three weeks.
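To see how a figure like “315 days, or fifteen people for about three weeks” could arise, here is a rough sample-size calculation using the standard normal-approximation formula for a two-sample comparison. The 4-second effect comes from the talk; the day-to-day variability (about 12.7 seconds) is my assumption, chosen so the result lands near the talk’s numbers.

```python
import math
from statistics import NormalDist

def total_days_needed(effect, sd, alpha=0.05, power=0.8):
    """Coin-flip days needed to detect `effect` given day-to-day noise `sd`,
    using the normal-approximation sample-size formula."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_b = NormalDist().inv_cdf(power)          # desired statistical power
    per_arm = 2 * (z_a + z_b) ** 2 * (sd / effect) ** 2
    return math.ceil(2 * per_arm)              # coin flips cover both arms

total = total_days_needed(effect=4, sd=12.7)   # about 317 days for one person
each = math.ceil(total / 15)                   # about 22 days if 15 people split it
```

The exact numbers depend heavily on the assumed variability, but the shape of the result is the point: small effects against noisy behavior demand long studies, and pooling participants divides the burden.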

So when we think about how each of us individually might try to make sense of this online question or an online risk, there are benefits that we gain by working together, benefits we gain by asking questions with large numbers of other people. And in fact, this is something that I’m doing. I’m really curious about this claim by Tristan. Are our phones addicting us? Is the color in those phones influencing us and keeping us glued to those phones? And I’ll be launching something soon called the Gray Phone Challenge that invites anyone to find out if their phone is influencing their behavior in this way, and if so, how. So look out for that.

The reason I bring that up is that what I just described is something called a randomized trial, an A/B test, where A is your phone in color and B is your phone in gray. And that’s a method that is especially useful for finding out the effect of an attempt to create change. It might be a question about a moderation policy. It might be a question about whether we’re seeing discrimination from an algorithm. But this basic method is something that we can use in a wide range of contexts where we know at least that there’s something valuable that we can measure.

And that’s one of the methods at the heart of the CivilServant nonprofit, which organizes people online to collaborate on discoveries about the effects of different ideas for creating change online. This is a project that started during my PhD here at the MIT Media Lab and the Center for Civic Media. And now, thanks to the Media Lab, Princeton University, and the funders I’ve mentioned—the Ethics in AI Governance Fund, the Mozilla Foundation, and generous donors—we’re able to take CivilServant and turn it into a new nonprofit that can support the public to do two things. The first is to ask questions that we have about how to create a fairer, safer, more understanding Internet—whether we’re moderators on Reddit, YouTube channel operators, people on Wikipedia, or bystanders on Twitter, we can all find ways to collect evidence and find out what works for the society we want to see.

And the second part of what CivilServant does is to support people to do this auditing work: to evaluate powerful entities and hold them accountable for the role of their technologies in society.

In the next set of talks we’re going to hear from a set of communities that we’ve worked with over the last two years. And I want to say something about how that process has worked on the Reddit platform.

CivilServant, in addition to being an organization, is a software platform. You can think of it as a social bot that communities invite into their context to collect data and coordinate research on effective moderation practices.

So the research process starts when we have a conversation with the community. Usually our research agenda is shaped by the priorities and questions that moderators and communities on platforms like Reddit have. And then we have a conversation about what they want to test. At that point a community typically invites our bot into their subreddit, into their group, to collect data with their permission, to start observing and measuring the things that we’re interested in. We then discuss and debate what the right measures are and arrive at a description and a process for asking the question that they have.

I should note that CivilServant works in such a way that our ability to operate in the community is completely subject to that community’s discretion. If they want to end the study, they can just kick out our bot and we end the work. So we try to do work that is accountable to and shaped by communities.

We then run that research by university ethics boards and make sure it’s approved. We also often prefer that there be some kind of wider community consultation, though that’s not always possible. And once all of that has happened, we start the study, which might run for a few weeks or, in some cases, a few months.

But ultimately, we complete the period. It’s that point where I’ve observed my behavior for 315 days, or found fifteen friends to make it much shorter. We get to the end of the study and we analyze the data. We apply various statistical methods and arrive at some answer. Does changing your phone to gray reduce your screen time? Does banning accounts in a certain way, or responding to harassment in a certain way, change people’s behavior in the ways that we expect? There’s a wide range of questions that you’ll hear about in just a moment.
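As a sketch of what “analyze the data” can mean at the end of a study, here is a simple permutation test on made-up seconds-per-pickup numbers. The data are entirely hypothetical and real analyses are more involved; this just shows the basic logic of asking how often a difference this large would appear by chance.

```python
import random
from statistics import mean

def permutation_test(gray, color, n_perm=10_000, seed=0):
    """Return the observed difference in means and a one-sided p-value:
    how often randomly shuffled labels produce a gap at least as big."""
    observed = mean(color) - mean(gray)
    pooled = list(gray) + list(color)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[len(gray):]) - mean(pooled[:len(gray)])
        if diff >= observed:
            hits += 1
    return observed, hits / n_perm

# Hypothetical seconds-per-pickup on gray days vs. color days
gray_days = [22, 25, 21, 24, 23, 20, 26]
color_days = [27, 30, 26, 29, 28, 31, 25]
effect, p_value = permutation_test(gray_days, color_days)
```

A small p-value means chance alone is an unlikely explanation for the gap; whether a roughly five-second difference is worth acting on remains a separate values question.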

And then we also have a public conversation, where we let people know: here’s what we did, here’s what we found. Do you have questions? Do you have critiques? Is there further research that we need to do, or other debates and questions we have to have about the ethics of this work? How can we be more transparent about our own research? And that often leads into conversations within those communities about how they should make decisions and change what they do based upon what we discovered.

In the case of my own phone behavior, at the end of this study I’ll have a decision: do I want to change my phone to only show me gray, or do I want to see colors? And that’s a conversation about values that we as researchers aren’t necessarily in the best position to decide, but we want to put evidence into your hands so you can make the best decision that you can.

And the cycle doesn’t end there. After we’ve completed a study with one community, we’re in a good position to support other people to do similar research. If we’ve already built the software for a study, it’s really easy to set up something similar in a new community or a new context and gather more evidence. Because what works for me might not work for you—I might be more distractible than someone else—and rather than just take my word for it, or data from my experience, you might also want to learn for yourself. That’s what researchers call a replication.

And one of the ideas behind the CivilServant nonprofit is that as we support more and more communities—more citizen behavioral scientists—to ask these questions, all of the results we discover will be shared in an open repository where anyone can read them, and where anyone can run their own experiment and build up a larger picture of where, under what conditions, and exactly how the different ideas in moderation and other areas of online behavior work.

So CivilServant, as I said, has received funding from a number of sources. We’ve hired our first staff. We’re going to be announcing a number of new positions in the coming months. And over the next year our hope is that with your help, with the help of people across the Internet, we’ll be able to not just create a trickle of these studies but really scale the pace of knowledge around a wide range of questions for a fairer, safer, more understanding Internet. Our goal for now is to see how close we can get to 100 new studies in the 2018 calendar year. And we’re hoping that you will be part of that conversation with us.

So next up, to give you a sense of the kind of research that we’ve already done and will be doing, I’m going to invite up a round of researchers and communities who are going to tell us about what we’ve done together, and also what our peer organizations have done. So if you’re one of our lightning talk speakers, now is a good time to queue up and be ready to come on stage. I’ll just ask you to come up, and when you finish, if you can step down so the next talk can commence.

