Carl Malamud: Internet Talk Radio, flame of the Internet.

This is Geek of the Week. We’re talk­ing to Tim Berners-Lee, who’s the orig­i­na­tor of the World Wide Web, one of the most excit­ing resource dis­cov­ery sys­tems out there. It’s a hypertext-based sys­tem, a way of nav­i­gat­ing the net­work. Welcome to Geek of the Week, Tim.

Tim Berners-Lee: Thanks. It’s so great to be a geek.

Malamud: We all wish we could be, right. Maybe you can start by telling us a lit­tle bit about the World Wide Web. What is it? What’s it do?

Berners-Lee: Okay. I’ll tell you two things. I’ll tell you what it is at the moment, and what it orig­i­nal­ly was sup­posed to be when it start­ed off about two or three years ago. 

The World Wide Web initially was designed to be a collaborative system. It was supposed to be a collaborative hypertext system to allow a group of people to work together without having to be in the same office. In fact I started originally getting interested in hypertext when I arrived at CERN and I found that the place, full of creative people, was quite a mess, quite a web of people, programs, people who had written programs, who used programs, programmers who used programs. All sorts of relationships there. And when I was first there for six months thirteen years ago, I just had six months to find out about all this, all these relationships. So I wrote a little program to do that. And later I realized that if other people could access the same information, this would save me telling everybody about it.

So the idea was that it should be what has now been termed computer-supported cooperative work, CSCW. And that was the original idea. And then when we produced it, the first thing which we produced for general consumption was a browser which would allow you to look at this information, and as a result it has become an information system which has a lot of people browsing and very few people disseminating information. So now it is, as you described it, an information system, a resource discovery system, which allows a lot of people to get the information but only a few people to produce it. So a bit more like a radio station. So, we still plan for it to become a collaborative system.

Malamud: And what does the brows­er look like? Do you have to be run­ning on a Sun, or does this work on many dif­fer­ent platforms?

Berners-Lee: Well orig­i­nal­ly when we start­ed, the prob­lem was we had a pro­to­type on the NeXTSTEP, which was fun to build and very quick to build. But now, on pret­ty much any platform—you have it on Sun for exam­ple on X you have maybe five browsers now. XMosaic is the most pop­u­lar one which a lot of peo­ple have heard of. 

Malamud: That comes from University of Illinois?

Berners-Lee: That’s from NCSA, yeah. The same people that produced NCSA Telnet. They have a very strong team producing not only Mosaic for X, but also they’re coming up with the same thing for PC with Windows, and the same thing for the Mac. Meanwhile on the Mac, there’s a fairly straightforward, simple browser available from CERN. And there’s something called Cello which is available from the Legal Information Institute at Cornell, who have produced that for Windows.

So, Windows, Mac and X have got graphic user interfaces. There are also a couple of browsers which are very much…much used in fact, although they’re not so exciting. There’s a very plain Line Mode Browser which you can use from a hard-copy teletype, which we distribute from CERN. And there is a browser called Lynx which comes from the University of Kansas. Lou Montulli produced that, and that uses a full-screen VT100 emulation. These are in fact pretty useful because we’re interested in getting to everybody. We’re interested in getting to everybody who’s got just a VT100, or whatever terminal they have.

Malamud: So what kind of infor­ma­tion do you get out of the Web? What might I be brows­ing when I’m sit­ting here in one of these inter­faces? You’re a physics lab­o­ra­to­ry, CERN. Is this a bunch of nuclear physics infor­ma­tion, for example?

Berners-Lee: We’re a high-energy physics lab, yeah. So, my salary is paid to make high-energy physics available to people who are working at CERN, or who are collaborating with CERN from a distance. But in fact the system is usable for all kinds of data, and absolutely all kinds of data. One of the things we discovered very early on is that the hypertext model we have, that just by clicking on highlighted words you can jump to something else, allows you in fact to present a hypertext view of existing databases, existing information systems. So whereas we started off inventing it as a hypertext system, we realized that we could incorporate all sorts of other information.

Malamud: In what sense? When I click on a word I’m actu­al­ly FTPing a file in? Is that what you mean? Or…

Berners-Lee: Well, first of all there’s a problem when you start a new system: if you say to everybody, “Hey, why don’t you put your data in here,” they’ll say, “Well, who’s reading it?” And you say, well, nobody yet because there’s no data in it. So nobody wants to read it because there’s no data in it, and conversely nobody wants to put data in it because there’s nobody reading it.

So we real­ized we’d have to boot­strap our­selves off the ground. And the way that we did this was to allow the W3 clients as well as talk­ing the W3 pro­to­col to W3 servers, to talk to FTP servers, as you say. They can read news arti­cles using the reg­u­lar NNTP pro­to­col. They can talk to Gopher servers using the Gopher pro­to­col. And when they do this, the sort of thing that they put on the screen, when you put up a news arti­cle, or a news­group, is in fact just hyper­text. So if you think about it, when you’re using the clients for all these indi­vid­ual pro­to­cols, what you’re see­ing on the screen is just sen­si­tive things that you can click on; titles of news articles. 

So in fact, with hypertext capability we could add an interface to all these things, and we could produce just one interface to all these things. And then we could go out and say well, everybody’s using this because they’re using it to read their news, they’re using it to read FTP sites, they’re using it to read Gopher. And so there is already a public out there, an audience out there. So you can put up on the W3 servers. And people now more and more are putting up… We have…I think it went up last week from when I counted, from something like a hundred to a hundred and fifty servers out there with all kinds of information, all kinds of topics. We have nice multimedia stuff. XMosaic handles embedded graphics very nicely and that’s been used for some beautiful work by for example the people who put the Vatican exhibition of Renaissance culture online. Beautiful pictures of Medieval manuscripts, with a text associated with them, a complete introduction to Roman life in the 13th century. At one end. At the other end there are stellar spectra for astrophysicists. There’s no end to what you can put on there, it’s multimedia.

Malamud: What’s it take to make a W3 serv­er? What’s it take to take an archive of infor­ma­tion we want to put online and make it acces­si­ble to the W3 world?

Berners-Lee: At the base level, if you have a directory and you have some files in it, you run the W3 server, httpd. “HTTP” is Hypertext Transfer Protocol in the same way that FTP is File Transfer Protocol. And the httpd is just like the ftpd. It’s just a program. You run it under the inet daemon, for example, and you point it at a directory. You say my /pub, which I’m serving at the moment, I would like to be available using HTTP. This gives you the immediate advantage that people can pick things up more quickly. Because HTTP is faster. It’s a stateless protocol which doesn’t involve all the logon that you get with FTP. So people can browse through your directories, they can follow a link from somebody else’s directory into your directory quickly.

And what the httpd dae­mon does in this case is it builds a lit­tle hyper­text web, on the fly. So that when peo­ple look at your direc­to­ry what they see is a list of your files with, at the top if you have a readme file it’ll be insert­ed at the top or the bot­tom depend­ing on how you select the option flags. And they’ll see the title of the direc­to­ry will be used for the title of the doc­u­ment. And it’ll do its best to make— or just a very straight direc­to­ry sys­tem full of plain­text files, or image, or graph­ic files.

Malamud: So if one of the files is a sub­di­rec­to­ry you click on it and then the next page is a list of the files in that subdirectory. 

Berners-Lee: That’s right. In fact what you’ve got is a web— it’s a hyper­text web—but in fact it’s been built just out of your direc­to­ry tree. That is the sim­plest way to do it. So, using that, you can put up any­thing which you’ve already got in an FTP directory. 
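The on-the-fly directory web described above can be sketched roughly as follows. This is a minimal illustration in Python, not the CERN httpd code: the function name `directory_to_html` and the exact markup it emits are assumptions, and the readme insertion mentioned in the interview is left out for brevity.

```python
import html
import os
from urllib.parse import quote


def directory_to_html(path, title=None):
    """Return a small hypertext page listing the entries of `path` as links.

    Hypothetical helper for illustration; a real server would also handle
    readme files, access control, and so on.
    """
    title = title or os.path.basename(os.path.abspath(path))
    items = []
    for name in sorted(os.listdir(path)):
        # A trailing slash marks a subdirectory, so following the link
        # leads to the listing of that subdirectory.
        is_dir = os.path.isdir(os.path.join(path, name))
        label = name + "/" if is_dir else name
        items.append('<li><a href="%s">%s</a></li>' % (quote(label), html.escape(label)))
    return "<title>%s</title>\n<h1>%s</h1>\n<ul>\n%s\n</ul>\n" % (
        html.escape(title),
        html.escape(title),
        "\n".join(items),
    )
```

Calling `directory_to_html("/pub")` on an existing directory would give one page per directory, with each subdirectory link leading to the next generated page, which is the tree-shaped web the conversation describes.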

Malamud: So those are automatic links. Can I decide that I want something more sophisticated? Can I look inside of a file and say I want this word, let’s say the word “First Amendment,” linked over to Cornell’s version of the law library that has more information on the First Amendment? Can I begin tailoring my system?

Berners-Lee: You bet. [crosstalk] You bet.

Malamud: What’s it take to do that?

Berners-Lee: To do that is fair­ly straight­for­ward, start­ing at the base lev­el. You can add pieces very incre­men­tal­ly. So for exam­ple, let’s sup­pose that you do have a large direc­to­ry space, which is tree-structured, and you’re point­ing a W3 serv­er at it, and you haven’t done any­thing else. Then of course the thing that peo­ple read most often is the root-level direc­to­ry. And if you look at the aver­age FTP server’s root-level direc­to­ry, it’s pret­ty dry. So as that’s the intro­duc­tion page, that’s what peo­ple see of your insti­tute, that’s the first thing you replace with a hyper­text doc­u­ment. So what you do typ­i­cal­ly is you pick up the Line Mode Browser, for exam­ple, and you use it to read that direc­to­ry, and you out­put the result in hyper­text markup for­mat. This is a markup. You out­put it onto a file, and then you play with that file. 

Now, the Hypertext Markup Language, HTML, is at Internet Draft stage, or it will be at Internet Draft stage very soon. And the documentation of course for all of this is available on the Web. So if I miss things out, just go onto the Web: get XMosaic, go onto the Web, get all the information. What you do typically, though, is you go and find something on the Web which you like, and using XMosaic you can just look at the source. You pull down the file menu, say “look at the source,” and you’ll see what it looks like marked up. And in there you’ll see the little angle brackets and the format for writing out a link to another document.

Malamud: What is that language? Is that SGML that you’re using or…

Berners-Lee: HTML is the markup language. SGML is a meta-markup language. SGML describes how you define a markup language; HTML is defined using SGML. So for SGML buffs, HTML is an SGML application. It has a DTD. And the DTD, and the spec, is all available on the Web, of course. So for example, if you are an SGML person, then you can take an HTML file and you can parse it using the public domain SGML parsers, SGMLS for example, by prefixing it with the DTD.

Malamud: Does that mean for example when the IEEE has advertised that they’re taking all their documents and they’re marking ’em up in SGML format internally, if they decide they wanna join the Web, is that gonna be a no-brainer for them? They just point to this directory of SGML-marked-up documents and say there they are? Or is it gonna be… Is your HGML [sic] gonna be compatible with the other versions of SGML that publishers are beginning to use?

Berners-Lee: Well, to put it simply (and any SGML buffs will…flay me for this), SGML basically says that you should put the control information in angle brackets in a concrete syntax, and you should use…you should do things in certain ways. But it doesn’t actually say what your control words are.

Malamud: Mm hm.

Berners-Lee: So, HTML for example has an A tag meaning “this is an anchor which is one end of a link.” And you use the “href” attribute of that tag to say where it’s going. If other people use SGML typically they’ll use completely different tags, completely different element names. So, it won’t be a no-brainer. You won’t just be able to point an HTTP server at the data and have everybody read it.
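To make the A tag and its href attribute concrete, here is a small sketch that pulls the links out of a piece of HTML using Python's standard `html.parser` module. The class and function names (`AnchorCollector`, `extract_links`) and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, link text) pairs from A elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # destination of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, so <A HREF=...> matches.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def extract_links(markup):
    parser = AnchorCollector()
    parser.feed(markup)
    parser.close()
    return parser.links
```

For the assumed input `See <A HREF="http://www.example.org/law.html">First Amendment</A>.` this yields one pair: the destination from href and the highlighted words a reader would click on.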

We’ll talk about for­mat nego­ti­a­tion lat­er. I think. But, basi­cal­ly if it is marked up in SGML or any­thing which is basi­cal­ly struc­tured, like LaTeX[?], then because you’ve got the infor­ma­tion there, it’s that you need a very small [indis­tinct] to actu­al­ly con­vert it into HTML. So typ­i­cal­ly for exam­ple, you’re going to want to pre­serve the infor­ma­tion about head­ings, and about dif­fer­ent para­graph styles. And all that infor­ma­tion can be prob­a­bly very eas­i­ly con­vert­ed into HTML.

Malamud: It sounds like HGML has a lot more empha­sis on the net­work, on links and things like that than a typ­i­cal SGML DDT [sic] would. 

Berners-Lee: Well, in fact it only has two ele­ments. It has a link ele­ment which is a link from the doc­u­ment to anoth­er doc­u­ment, and it has an anchor which defines a part of the doc­u­ment, which is the begin­ning or end­point of a link. And those are the only two. Then there are attrib­ut­es of those which describe where they go to and what rela­tion­ship is involved, where there is a seman­tic rela­tion­ship between the two things which are linked.

Malamud: Let’s think about that link a second. You said that W3 is able to link out to Gopher servers and link out to FTP servers. And you’re able to do all that with one basic “here’s what I’m pointing to” command?

Berners-Lee: Right. And you’re lead­ing to one of the fun­da­men­tal things which W3 hangs on. In fact W3 is often described as being a sys­tem which is based on hyper­text, and which is hyper­text hyper­text hyper­text. In fact hyper­text is not the most impor­tant thing behind W3, it’s not the most impor­tant concept. 

Perhaps the most important concept is that any object out there on the network should be addressable. This implies that there should be some universal addressing scheme. Now, we called these things initially Universal Document Identifiers. And then when we brought it to the IETF, the “universal” became “uniform,” the “document” became “resource,” and the “identifier” became, in the case of the actual identifiers we’re using at the moment, “locator.” So we now have URLs. URLs are things which are discussed at IETF and there’s a spec about them. And the URL is basically the address of a network object.

Malamud: Mm hm.

Berners-Lee: The nice thing about a URL is it starts off with a pre­fix which defines what sort of a URL it is, and we can add lat­er extra pre­fix­es to define all sorts of oth­er URLs. So typ­i­cal URLs are FTP URLs which con­tain all the infor­ma­tion you need for extract­ing some­thing by FTP.

Malamud: Which is…“ftp”…the domain name…the pathname…and the filename.

Berners-Lee: Right. 

Malamud: That’s it.

Berners-Lee: In fact, rather than separating it with some chat such as, “What you need to do is FTP to…ftp.whatever.edu, and then ‘cd’ to…and then ‘get file dah dah dah’,” we then use a little bit of punctuation and we say it’s “ftp colon slash slash, domain name, slash pathname.” And similarly for Gopher, it looks very similar: “gopher colon, slash slash domain name, slash selector string.” And for HTTP we have “http colon, slash slash domain name, slash” and then some opaque string which could be anything in fact which the server understands, as defined by the server.

So, those are three important types of URL. And we can extend that. We have a few other things. We can put pointers to telnet sessions, for example, so that if there is a site which is only accessible through telnet, it’s very nice to be able to include it in the Web so that within a document you can say, “Hey,” for example, “for more information see our library system.” And you would link the words “our library system” to a telnet session to the library system because all you’ve got from your automated library system is a telnet session. You haven’t yet got a World Wide Web server.
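The prefix, domain name, and path structure spelled out above can be seen by splitting a few URLs with Python's standard `urllib.parse.urlsplit`. The host names and paths below are invented examples, and `describe` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into (prefix, domain name, rest), as spelled out in the interview."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


# Assumed example URLs, one per access method mentioned in the conversation.
EXAMPLES = [
    "ftp://ftp.example.edu/pub/readme.txt",
    "gopher://gopher.example.edu/11/library",
    "http://www.example.org/some/opaque/string",
    "telnet://library.example.edu/",
]

if __name__ == "__main__":
    for url in EXAMPLES:
        print(describe(url))
```

The same three-part shape holds for every prefix, which is what lets one kind of link point at FTP, Gopher, HTTP, or a telnet session.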

Malamud: So what hap­pens is your user goes to the edge of the Web and then escapes into tel­net land and uses what­ev­er syn­tax they have for that library sys­tem, and when they’re done they’re back in the Web. 

Berners-Lee: Right. And that’s very suboptimal, obviously, because the user interface changes. Suddenly you’re in telnet land and you have to learn what sort of library system is this, how do I get out, what’s the “quit” command, how do I find my way around.

Malamud: Couldn’t you do that for the user? Why do they have to go out into tel­net land? Why does­n’t the Web do that nego­ti­a­tion for you and bring the infor­ma­tion back? 

Berners-Lee: Why does­n’t the Web bring the infor­ma­tion back. Why can’t we make some­thing to auto­mat­i­cal­ly run a tel­net serv­er. The prob­lem is that—

Malamud: And make it look like hypertext.

Berners-Lee: The answer is for a gen­er­al inter­ac­tive ses­sion, you can’t. It changes. In prin­ci­ple you could. In prin­ci­ple you can write a script which will attack a human inter­face as a machine and extract infor­ma­tion, but in prac­tice of course this is very very hairy and hor­ri­ble. It would also have to be done indi­vid­u­al­ly for every system. 

What is very very much easier is if you actually have the program there. If you have the sources of the program, then you can write… Or even if you have the binary of the program, say you have some library access program, for example, which has a shell command for accessing data. You can write a W3 server which, when it gets a request for “show me about the library,” returns your little help file with a few pointers to some things, and then one of those pointers runs the program and gets the list of sections of the library, for example, out of it. And it reformats that as HTML.

So the thing to do in this case is for the guy who runs the library system to write what is very often just a ten-line Perl script, typically, which runs the library software whenever somebody comes in with a W3 request. Now, what he’s producing is a gateway between W3 world and his own database. So he’s adding a whole new world which was previously inaccessible to the Web. And people can be incredibly creative about that. There are some beautiful examples of virtual worlds which have been created from relational databases, from bunches of files…
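A gateway of the kind described above might look roughly like this. The interview mentions a short Perl script; this sketch uses Python instead, and the names `lines_to_html` and `gateway`, as well as the stand-in command that plays the role of the library program, are assumptions for illustration only.

```python
import html
import subprocess
import sys


def lines_to_html(title, lines):
    """Reformat plain output lines as a small HTML list."""
    items = "\n".join(
        "<li>%s</li>" % html.escape(line) for line in lines if line.strip()
    )
    return "<title>%s</title>\n<h1>%s</h1>\n<ul>\n%s\n</ul>\n" % (
        html.escape(title),
        html.escape(title),
        items,
    )


def gateway(title, command):
    """Run `command` (a list of arguments) and wrap its output in HTML."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return lines_to_html(title, result.stdout.splitlines())


if __name__ == "__main__":
    # Stand-in for the existing library program: prints two section names.
    demo = [sys.executable, "-c", "print('Physics'); print('Computing & Networks')"]
    print(gateway("Library sections", demo))
```

The existing program is left untouched; the gateway only runs it and reformats what comes out, which is why such scripts can stay very short.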

Malamud: What kind of worlds? Give me an example.

Berners-Lee: Well there are a number of nice ones. For example at CERN, Mike Sendall has put together a database about software technology in general. And you can throw any World Wide Web index… Some documents are indexed and some aren’t. [indistinct phrase] You can throw it at some words to do a text search. So, in this case you’re doing something very much like WAIS, you know, WAIS functionality. So when you’re browsing around the Web and you come to the Software Technology Interest Group (STING) page, you notice that this index function is enabled. You can throw some words at it. And it will do a search in its glossary, and it will do a search in some news articles, and it’ll do a search in some documents. And it’ll produce you a little executive report saying “Well, we found some information in the glossary. Would you like to see it or would you like to see some news articles?” And these things are linked, so you click on “I’ll see the stuff in the glossary.”

And all this is being generated completely by a little program. I’d like to see the things in the glossary, so you click on a link which leads to a virtual document whose name is sting-slash-glossary-slash…whatever it was you asked for. I asked for “objective” and I got back “Objective C” and “Objective Pascal.”

And the glossary of course is hypertext. So it said “objective languages.” Objective C is an object-oriented language, “object-oriented” is a link. You click on it. Now, I may be talking rubbish with the particular links, but you click on “object-oriented,” you get a definition of object-oriented. You’re browsing around the glossary recursively, and sometimes you find links which take you into documents. And the whole thing, all the links have been added automatically by a program.

Malamud: Now, if I’m doing that from the United States and your W3 server is in Geneva, for example, we’re bringing an awful lot of documents back to the US. Each screen is a document, that document internally has been marked up. What kind of bandwidth do I need to play in this world effectively? What do I need on my end of the network pipe to be able to do W3 work?

Berners-Lee: When it comes to band­width, then 14.4 kilo­baud is fine. In fact we’ve done a demon­stra­tion very nice­ly at a con­fer­ence on the end of an ISDN line. And with an ISDN 64-kilobit con­nec­tion you real­ly did­n’t notice too much the delay when pick­ing up hyper­text. For hyper­text, exclud­ing videos, then you’re not in fact trans­fer­ring very much data. Not only that but you’re not keep­ing con­nec­tions up so the load in gen­er­al all round is very small. At CERN we hap­pen to have a T1 to the States. So, we’re lucky. But in gen­er­al in Europe, if you’re fair­ly close to a major cen­ter, you’re not on the end of a piece of wet string, then peo­ple are gen­er­al­ly amazed by the speed. 

What we’re talk­ing about, if you’re look­ing at an XMosaic doc­u­ment and you click on it, if it’s local it should come back with­in a few hun­dred mil­lisec­onds. And oth­er­wise with­in a sec­ond or two. Internationally. And we real­ly want to keep these response times down below the sec­ond if we can. Because that’s the way peo­ple work most effi­cient­ly. When they can fol­low a link, have a look at it, hit the back but­ton because it’s not what they want­ed, and go some oth­er way.

Malamud: Is there a pro­vi­sion in the World Wide Web for data repli­ca­tion or data caching so that infor­ma­tion does­n’t have to go over long, thin pipes many dif­fer­ent times? I’m think­ing for exam­ple the Internet Talk Radio.

Berners-Lee: I bet you are, Carl. I think this is a very inter­est­ing area because it’s not just radio. It’s gonna be all kinds of files. I ha—

Malamud: IETF archives, for exam­ple, same thing.

Berners-Lee: Yes. It’s large files and it’s also files which are just in use all over. For example we have a file we call the virtual library. It’s a subject catalog. We keep pointers in it, in a subject tree, a little bit like the Dewey system. We keep the pointers to everything that we found which has a specific subject matter interest, a particular subject like astronomy, astrophysics, high-energy physics, biology.

Now, the top of this tree, obviously, is one single file. We keep a copy at CERN. If everybody in the world wants to read that, and that’s a very good place to start looking for things, then it’d be very nice to have replicated copies, so replication is certainly an issue. We haven’t put anything in place. We’ve looked at a few interesting things. Obviously the two things that we need to have here is one, we need to have good mirroring software so that we can make sure that updates get propagated rapidly. And when we have mirror sites, then we need to be able to rapidly from a client find them. And we’ve played with the idea of having for example a dummy domain, .web, so that if you look at “library.web,” this is a virtual machine which has a number of IP addresses which are in fact on different continents. This can be regarded as an abuse of the Domain Name System, or it could be regarded as a neat use of the [crosstalk] Domain Name System.

Malamud: Oh I’m a big fan of abus­ing the Domain Name System.

Berners-Lee: [indistinct; “So I’ve heard.”?] And that sort of thing combined initially with something very simple such as a ping…if you get back five IP addresses, ping them all and see what you get back, may mean that we’ll have something which is fairly scalable which we can use for the large central (I hate to use the word central) but large, heavily-used collections of data. Though I think it’s got to be very flexible as well, because things become heavily-used very suddenly when people find out about them. I don’t know whether you’ve found this, that you have a particular program, you put it out, and then for some reason a pointer to it gets put on some newsgroup, everybody’s very excited about it. Suddenly they dive in there.
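The "get back five IP addresses, ping them all" idea could be sketched along these lines. This is only an illustration: it times a TCP connection instead of sending an ICMP ping (which usually needs elevated privileges), and the function names `mirrors_for`, `tcp_connect_time`, and `pick_nearest` are hypothetical.

```python
import socket
import time


def mirrors_for(hostname, port=80):
    """All distinct IP addresses the DNS returns for `hostname`."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def tcp_connect_time(address, port=80, timeout=2.0):
    """Seconds taken to open a TCP connection, or None if it fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def pick_nearest(addresses, probe=tcp_connect_time):
    """Return the address with the smallest probe time, or None if all fail."""
    timed = [(probe(address), address) for address in addresses]
    reachable = [(t, address) for t, address in timed if t is not None]
    return min(reachable)[1] if reachable else None
```

A client would call something like `pick_nearest(mirrors_for(name))` for a name that resolves to several mirrors; passing a different `probe` makes it easy to swap in another measure of network distance.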

Malamud: Oh absolutely, and in fact they’ll all dive to the same place, which is why I’m curious about that. Even though we’ll have thirty copies of a file around the world, if one newsgroup says go to UUNET and grab the file…

Berners-Lee: Right.

Malamud: Everybody goes to UUNET and grabs that file.

Berners-Lee: Yeah. If they’ve got one pointer they don’t want to go out and use Archie to find out where the nearest copy is, and of course when they’ve used Archie they might not even know which of those copies is the nearest one in network terms. So it would be very nice to have something which is flexible, automatic, maybe a little bit controllable by network managers, and will allow people to optimize it. I’d like to see it be very very automatic, in fact, myself.

Malamud: What oth­er things would you like to see the Web evolve to? Are there oth­er things you’d like to see brought into that system?

Berners-Lee: Well as I said when we first started, at the moment in fact it is practically used for dissemination. And one of the reasons is that the people who have actually put in the development work very often are systems managers. They’re very often people who have got information to disseminate, and they’ve realized that if they disseminate they’ll save the phone ringing. This also gives them a high profile.

But, we really want it to become a collaborative system. The XMosaic team have been playing with this with a very interesting group annotation server. There’s a trial running at the moment whereby you can set Mosaic to point to a particular server which just stores a list of comments on individual documents. So, when you’re reading a document anywhere in the world, you’re hooked up to a group annotation server. And if you happen to be reading a document somebody else in the same group has commented on, at the bottom appears a little link to his contribution. Now this I think is a stage in making something which blurs all the boundaries between data retrieval, data retrieval with front end update, news, whereby information is spread around in a sort of flooding algorithm, and mail, where it’s sent to specific people. We need to use all these different protocols for different situations but they’ve all got to merge together. So that from the user’s point of view when he’s reading a piece of information, whether he’s retrieved it from a W3 server, or he’s reading a news article, or he’s reading something which went to a mailing list, when he hits the respond button, he gets a little window and he can type stuff in, and it’ll be processed accordingly.

Malamud: If we want to find out more about W3, is there an email address peo­ple can send to? Is there some­place they should be FTPing into? How do peo­ple learn more?

Berners-Lee: Basic way to start: please don’t send email until you’ve done this. The simplest thing to do is you telnet to info.cern.ch. And then you get the very very simplest browser. And it’ll give you some information about CERN, and it’ll give you some information about the World Wide Web.

Now this browser is very very crude. Do not be put off by this. To follow hypertext links you type in the number of the link, and the number’s there in little square brackets after the terms. But you can in fact access everything, everything which is available through Gopher, and news, and the World Wide Web, from that. And in particular you can find information about the World Wide Web, information about the client, information on the client that you choose, about how to FTP it.

Now, I say use this because you will then get the most recent information. I can tell you some FTP sites (I will tell you some FTP sites) but obviously, if you go use the Web in some way you’ll get the up-to-date information.
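The numbered-link convention of the simple browser described above, where each link is followed by its number in square brackets, can be imitated in a few lines. This is a sketch of the idea only, not the Line Mode Browser's code; `NumberedLinkRenderer` and `render` are hypothetical names, and the sample markup is invented.

```python
from html.parser import HTMLParser


class NumberedLinkRenderer(HTMLParser):
    """Render HTML as plain text with [n] after each link (illustrative)."""

    def __init__(self):
        super().__init__()
        self.out = []        # pieces of rendered text
        self.targets = []    # targets[n - 1] is the destination of link n
        self._open = False   # True while inside an A element that has an href

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href is not None:
            self.targets.append(href)
            self._open = True

    def handle_data(self, data):
        self.out.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._open:
            self.out.append("[%d]" % len(self.targets))
            self._open = False


def render(markup):
    renderer = NumberedLinkRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.out), renderer.targets
```

Typing a number at the prompt then amounts to looking up `targets[n - 1]` and fetching that document.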

Malamud: What’s the main FTP site for get­ting the pro­grams and the doc­u­men­ta­tion and things like that?

Berners-Lee: Two important things. I guess the bulk of listeners are gonna have workstations where they can run Mosaic. If you can run Mosaic you ought to have XMosaic (that is, they call it NCSA’s Mosaic for X) on your workstation. You can get it by FTP to ftp.ncsa.uiuc.edu. And…

Malamud: And that’s mirrored all over the world, too. You could go to UUNET or many other sites [crosstalk] and get the software.

Berners-Lee: I guess… All kinds of places. That’s the central site where you get the most recent version. We get new versions out perhaps every couple of weeks. It went from beta to 1.0 a few months ago. That is currently the preferred version for X for a lot of people. There’s another very exciting client for X, which is tkWWW. It runs using the Tk/Tcl toolkit which you also have to get. The interesting thing about that is it’s a hypertext editor. So you can make your own hypertext files.

If you’re running a NeXT, then there’s a hypertext editor you can pick up from CERN. There’s the Line Mode Browser you can get from CERN. The daemon stuff you get from CERN. So the other important FTP site is info.cern.ch.

Malamud: Okay, great.

Berners-Lee: Same thing, if you tel­net to it, you can ftp into it.

Malamud: Okay, thank you very much. This is Carl Malamud. We’ve been talk­ing to Tim Berners-Lee, the orig­i­na­tor of the World Wide Web, and this has been Geek of the Week. 

This is Internet Talk Radio, flame of the Internet. You’ve been listening to Geek of the Week. You may copy this program to any medium and change the encoding, but may not alter the data or sell the contents. To purchase an audio cassette of this program, send mail to radio@ora.com.

Support for Geek of the Week comes from Sun Microsystems. Sun, The Network is the Computer. Support for Geek of the Week also comes from O’Reilly & Associates, publishers of the Global Network Navigator, your online hypertext magazine. For more information, send email to info@gnn.com. Network connectivity for the Internet Multicasting Service is provided by MFS DataNet and by UUNET Technologies.

Executive pro­duc­er for Geek of the Week is Martin Lucas. Production Manager is James Roland. Rick Dunbar and Curtis Generous are the sysad­mins. This is Carl Malamud for the Internet Multicasting Service, town crier to the glob­al village.