I’ve learned so much already these couple of days at PopTech, and already some of the things that I’m learning are offering me a little bit of clarity. For example, Moran Cerf yesterday told us that there are actually two voices in our brains that are helping us to make decisions. In this case, I have me on the stage standing here right now, and then I have the me that decided that this slide would be a good way to start off the presentation.

Big data? I think all of us have prob­a­bly seen pre­sen­ta­tions already about big data. Big data in busi­ness, big data in sci­ence, big data in piz­za deliv­ery. Whatever those things are. But I think one of the things that we haven’t heard very much about is what is the sub­jec­tive expe­ri­ence like of liv­ing in this world of big data? What is it like to be us liv­ing in this ever more com­pli­cat­ed world?

I do a lot of traveling, and so maybe this is an example with some bias, but I think if we imagine the experience of being in an airport, we might start to understand this data experience. First of all because there are a lot of systems that are transparent to us that are happening around us. Our baggage is moving on carousels. There are security agents who are herding us through line-ups, and so on and so on and so on.

Second of all, there’s a loss of con­trol there. I think it’s one of the only places that we vol­un­tar­i­ly give up con­trol in our lives, in an air­port. We’re put into line-ups. We’re kind of direct­ed. We’re put on to these planes in these kind of data pack­ets that are then sort­ed into anoth­er air­port and offloaded, and so on and so on and so on.

Maybe more importantly, though, is this idea that we’re part of a system that we can’t possibly imagine the magnitude of. Right now, as we speak, there are more than a million people in the air. The graphic that you see behind us is this respiring system of airplanes landing and taking off at fifteen-minute intervals. There are thousands of airplanes in the air right now. So I really think that this idea of being caught in a system which is very complicated, and too complicated for us to understand, really mirrors this experience of big data.

The American novelist David Foster Wallace was very prescient about this. He was asked by his editor when he was writing Infinite Jest about why he put so many footnotes in the book. The footnotes in that book are really incredible. Sometimes the footnotes have footnotes, and occasionally those footnotes have footnotes, as well. And he said that one of the things he wanted to do was to mimic the information flood and data triage that he expected to be an even bigger part of life fifteen years hence. Infinite Jest was written sixteen years ago.

I’ve been really excited about this idea of big data since I had a conversation about three years ago with this man, László Barabási. We were working on a project for Wired magazine in the UK, where I was the editor of that somewhat unfortunately-named section called Infoporn. I worked with Dr. Barabási on a project to help show some of the results from a project that he was working on about human mobility. And we had the really good luck of working with this data set which is one of the largest data sets of cell phone usage from an unnamed European country. The country has a name, I’m just not allowed to tell you what it is. And so the first thing we see is this graph, which is not a very exciting graph.

What this is, is this is a seg­ment of about fifty thou­sand of those peo­ple, and as the graph gets taller on the left, those peo­ple talk a lot on their phones. And as it gets short­er on the right, they don’t talk very much at all. And the peo­ple on the left, they talk eighty-four hours a week on their phone, and the peo­ple on the right are not real­ly talk­ing at all. 

So, on the left-hand side of this graph­ic, I took a whole bunch of these sec­tions and stacked them up so that we could just see the rich­ness of this data. There’s a lot of data, and for each per­son in that data set, we were able to see their call­ing his­to­ry over time, piece it togeth­er, see exact­ly how they were call­ing and who they were calling. 

But maybe more inter­est­ing­ly is the thing that hap­pened on the oth­er side of the page, for which I built these lit­tle cubes which I call mobil­i­ty maps.

And so what we’re see­ing here is we’re see­ing a cube that shows a sin­gle per­son­’s trav­el over about four days. This per­son is clear­ly a com­muter. They trav­el back and forth from one loca­tion to anoth­er. But in that data set, we were able to pro­duce these mobil­i­ty maps for every­body in that data set. Tens of thou­sands, hun­dreds of thou­sands of peo­ple, and see how their lives can be rep­re­sent­ed in this real­ly sim­ple form. And one of the things that sur­prised me and that sur­prised the Barabási group was that there was a lot of pre­dic­tive­ness in this data. And so I start­ed get­ting this idea that we could look at the data trails we were leav­ing behind, and we could start to con­struct things from them. So the next few projects that I worked on car­ried through that idea. 

This is a project called Just Landed. How many of you are on Twitter? Probably most people are on Twitter. So, you’ve read those tweets, that somebody says, “I just landed in Hawaii. We’re stuck on the runway for twenty minutes. This is really irritating.” These rich white people tragedy quotes, right. So I thought it would be really interesting to take those kind of really self-serving things, these thinly-veiled show-offs and put them together into a map. Maybe we could recreate a human mobility system by seeing how people are showing off about their travel around the world.

Maybe a little less meanly, I also put together this project, called GoodMorning!, which looks at everybody saying “good morning” to each other on Twitter. So, here’s everybody in the world in 2009, so more than three years ago, all saying “good morning.” The green people are getting up early and saying good morning, the orange people a little bit later, and the red people really late. When we look at the United States, you really see the red on the West Coast and the green on the East Coast.

And I’ve been carrying these ideas throughout my work ever since then. At the time we built this project, working together with a statistician named Mark Hansen and with the rest of the really talented team at the R&D group at the Times, we built this project called Cascade. And what Cascade does is it looks at conversations about Times content on Twitter. And we’re able to recreate every single conversation that happened about every piece of Times content, in real-time.

So, we’re look­ing at a sto­ry which is a cou­ple of years old here, but it’s an inter­est­ing one for a cou­ple of rea­sons. Just to explain what we’re see­ing, on the left-hand side is the very birth of this con­ver­sa­tion. And then now we’re about twenty-four hours into the con­ver­sa­tion. As degrees of sep­a­ra­tion get above, we’re going from one per­son to the oth­er, who tweets to the oth­er per­son, who tweets to the oth­er person. 

But real­ly what we see is we see the archi­tec­ture of dis­cus­sion. We see some­thing that we’ve nev­er seen before. So excit­ing. This was one of my favorite projects to work on. I felt like an archae­ol­o­gist, expos­ing things for the first time that we’ve nev­er seen before. 

But, something always sat really uneasily with me about this project and with the other work that I showed you, and that is that largely this work depends on opportunism. We’re dependent on taking people’s data without telling them about it. And even though Twitter is overtly public, that I don’t think makes me feel a lot better.

I was speak­ing at a con­fer­ence a cou­ple of years ago, and some­body said this for the first time, I think the first time that I heard it: data is the new oil. And peo­ple…they clapped. They were excit­ed about data being the new oil. I think they were think­ing about this, right?

[Image: JR Ewing from the TV show Dallas, laughing and holding a handful of money]

Whereas I was think­ing about this, but in the con­text of this:

We did­n’t do very well with oil. And to sug­gest the data can be the new oil I find frankly ter­ri­fy­ing. But maybe there is a piece of this anal­o­gy that works for us, because oil is com­posed of all these tiny microor­gan­isms, these pre­his­toric microor­gan­isms that have been com­pressed into this sort of valu­able resource.

Data consists of fragments of our lives. The valuable data that we’re talking about consists of fragments of our lives that are being compressed into this valuable resource. Now, maybe it’s the Canadian in me, but I’m not sure that I trust corporations to take charge of this type of resource. And I’m really interested in how we can do a better job with data than we did with oil.

Do you want to stop dif­fer­ent trans­mit­ted diseases?

Do you want to design bet­ter cities? Do you want to stop traf­fic jams?

The data to do so is there in pri­vate hands, and we need to iden­ti­fy some social con­sen­sus by which the data can be shared with the dif­fer­ent stake­hold­ers who can take advan­tage of that.
edge.org, “Thinking in Network Terms, A Conversation with Albert-László Barabási”

So, László Barabási, who we men­tioned before, in an inter­view last month had some real­ly inter­est­ing things to say about data and its val­ue to us. He says if we want to do all these great things with data, we have to come to a social con­sen­sus, because this data is valu­able and it’s owned by all of us col­lec­tive­ly. So how do we come to a social con­sen­sus to make sure that that data can be used for good and not nec­es­sar­i­ly only for prof­it. Well, three things, I think. 

Data ownership. We have to get people used to the idea of owning their own data. It is your data, you should own that data, and that’s not the way it works right now. So, at the Times, we put together this project called Open Paths [blprnt.com blog post] which allows you to store your location data securely and share it, if you wish, with researchers. It’s the “if you wish” that’s important there. So you can share this if you’d like to. So please download the app, start recording your location data. It’s really fun to explore. And then share that data, if you wish.

The second thing that we need to really be talking about is data and ethics, because I think ethics have been very, almost all, lacking from this conversation. And it’s really important that as consumers of data services we start to make decisions based on ethics.

And then final­ly let’s get back to the first thing that I talked about, which is this sub­jec­tive expe­ri­ence of liv­ing in a data world. I’m real­ly real­ly real­ly con­vinced that the only way we can reach this con­sen­sus that we’re talk­ing about is by shar­ing with peo­ple and expos­ing to peo­ple what is hap­pen­ing in this data world. And that’s I think where the role of data art comes in. 

I come here today because I’m excit­ed about data but also because I’m ter­ri­fied. I’m ter­ri­fied that we are hav­ing progress with­out cul­ture in the world of data. And as we’ve seen with these failed indus­tries before, progress with­out cul­ture does not work. And there’s a lot of pow­er­ful peo­ple in this room, and if I can leave you with one thing, let’s try to bring cul­ture into our dis­cus­sion with data. And let’s try to not make the same mis­takes with this new resource that we have with the last ones.

Thank you.

Further Reference

This pre­sen­ta­tion at the PopTech site.
