Carl Malamud: Internet Talk Radio, flame of the Internet.

This is Geek of the Week, and we’re talk­ing to Brewster Kahle, who is founder and President of WAIS Inc. Before that, he was one of the chief devel­op­ers of WAIS, the Wide Area Information either System or Server, depend­ing on which doc­u­ments you read. Welcome to Geek of the Week, Brewster.

Brewster Kahle: Thanks, Carl.

Malamud: What is WAIS, actu­al­ly? What’s the prop­er reverse engi­neer­ing of that acronym?

Kahle: It’s Wide Area Information Servers. It’s an acronym. I can’t stand acronyms, but we could­n’t come up with any­thing bet­ter. If you can come up with a bet­ter name we’d love to change the name. Acronyms aren’t usu­al­ly the right way to go.

Malamud: Maybe we’ll make that a con­test for our listeners.

Kahle: Please.

Malamud: What is WAIS? Can you give me an overview of what that ser­vice is?

Kahle: Yeah. It’s an elec­tron­ic pub­lish­ing sys­tem. Basically it’s try­ing to help peo­ple find and retrieve infor­ma­tion over wires. But I think actu­al­ly where the excite­ment of WAIS is is more help­ing peo­ple cre­ate con­tent and share it. Anybody with a per­son­al com­put­er and a tele­phone should be a pub­lish­er. That’s the goal of WAIS.

Malamud: Now, how can you be a pub­lish­er with WAIS? What is it that it lets you do?

Kahle: All you need is basi­cal­ly some sort of com­put­er, some com­mu­ni­ca­tion line, some­thing to say, and a lit­tle bit of soft­ware. And you can then make it avail­able for every­one to see it. 

Malamud: What is WAIS, though? Is it a pro­to­col, is it soft­ware, is it some index­ing tech­niques? Is it the fact that some­body actu­al­ly has infor­ma­tion out there?

Kahle: Unfortunately, WAIS is all those. It’s…many peo­ple think of WAIS as a pro­to­col. It’s a mech­a­nism for peo­ple to go out and try to find what it is they’re look­ing for. But you want to make sure that there’s stuff to go out there and find. And that’s— We’re spend­ing most of our time try­ing to up the grade and mul­ti­ply the num­ber of sources of infor­ma­tion that are available. 

But the key piece about WAIS is the pro­to­col. It’s mak­ing it so that the clients… Anybody can have a PC, a lap­top, a lit­tle hand­held device, all of those things can go out and probe infor­ma­tion resources no mat­ter where they are.

Malamud: But what’s the pro­to­col? Is it Z39.50 or is it a modification…?

Kahle: There’s a bunch of confusion about exactly what is the WAIS protocol. In fact, the WAIS protocol is about five or six different standards, all packaged into one environment. Z39.50 is important for the information retrieval aspect. It came out of the librarian world, but it’s really built to try to help you find card catalogs. Well, most people don’t really care about card catalog entries. Yes, they’re important. But a lot of people want to get at images, video, radio, all sorts of different types of information. So those standards for document formats come from different groups.

We also need a doc­u­ment iden­ti­fi­er so you can go and refer a hyper­text link, if you would, to doc­u­ments that might be in Japan, or in some­body’s lap­top when they’re trav­el­ing. You need a mech­a­nism for point­ing to doc­u­ments. That’s anoth­er stan­dard. The query for­mats, those are oth­er stan­dards. So the WAIS pro­to­col is actu­al­ly prob­a­bly four or five dif­fer­ent pro­to­cols and stan­dards all being used together.

Malamud: When you say stan­dards, do you mean actu­al real stan­dards. Are we look­ing at ISO stan­dards or things that you folks devel­oped to suit your needs?

Kahle: Some things are ISO standard, some things are ANSI standards, some things are just starting their way through the standards process. Some are actually proprietary formats. For instance, Microsoft Word. There are a lot of Microsoft Word documents being shared with WAIS. But Microsoft Word format is not an ISO standard. It’s a proprietary standard. We just tried to make sure whenever there were examples of proprietary standards for different parts of the WAIS protocol, you had choices. So the only thing, there are no single pieces of the WAIS protocol that are terribly locked into either a particular committee, in terms of the standards, or a particular vendor. That we see as the flexibility and why WAIS is going to win, is it’s really riding on top of a set of standards to help people make sure that the consumers, the people that are trying to find information are getting the right stuff from the zillions of sources. And as that evolves, WAIS tracks with it.

Malamud: You devel­oped WAIS when you were first at Thinking Machines. And I’ve always won­dered why the man­u­fac­tur­er of a mas­sive par­al­lel proces­sor would devel­op an infor­ma­tion retrieval protocol.

Kahle: Yes, that’s a good question. Thinking Machines, which is best known for Connection Machines, which are massively parallel machines that have hundreds and thousands of processors in them, why would they do WAIS? Well, roll it back a little bit and the name of Thinking Machines is “Thinking Machines” for a reason. The idea is to try to make a machine that thinks. Well, that’s pretty hard. But that’s the real goal of the company. And at least a machine that’s going to think has got to know a lot. It doesn’t argue that you are able to think if you do know a lot, but at least it’s a precursor. So that’s a lot of the interest within Thinking Machines in trying to do this sort of thing.

Basically, what WAIS was was a mechanism for using massively parallel machines from lots and lots of people. Thinking Machines makes a high-end machine. Utilities. The big-boy computers. And the only way that Thinking Machines is going to make lots of money in selling those machines is to have millions and millions of people use them day in, day out. That was the reason why Thinking Machines did WAIS.

Malamud: So in a fully-deployed WAIS world, because of all this mas­sive index­ing and search­ing and retrieval, you were hop­ing at least that you could sell a lot more Thinking Machines.

Kahle: Yes. And in fact Thinking Machines sold a Connection Machine to Columbia Law Library. That’s an inter­est­ing exam­ple there where Columbia Law Library had a prob­lem: they’re in Manhattan. They can’t afford more space. So they tried to eval­u­ate whether it’s cheap­er to buy a com­put­er or buy anoth­er build­ing to store more books. And they basi­cal­ly found a com­put­er was the way to go. So they’re scan­ning huge amounts of infor­ma­tion. The Rosenberg tri­al, the Nuremberg tri­als, lots of United Nations data. Storing that on com­put­ers, run­ning it through opti­cal char­ac­ter recog­ni­tion, search­ing based on the opti­cal char­ac­ter recog­ni­tion, which has got lots of faults in it. So you find the right doc­u­ment, but you retrieve the pic­tures of the pages. This mech­a­nism allows us to basi­cal­ly do ret­ro­spec­tive con­ver­sion of paper at a very inex­pen­sive rate. And what WAIS is allow­ing peo­ple to do is once you’ve done that, share that resource worldwide.

Malamud: What kind of stor­age are we look­ing at? If you’re scan­ning in an image at what, 300 dots an inch, 600 dots an inch? 

Kahle: Yeah.

Malamud: And you’re also run­ning it through OCR. You’ve got your text. And then you index that text, and that takes some order of mag­ni­tude increase in space over the text itself. Can you give me any idea of how much disk space we’re look­ing at to put a library online like that?

Kahle: It turns out that by today’s stan­dards, not very much. But to give it some hard num­bers. What the Adobe peo­ple say is if you scan a page and use their new Acrobat prod­uct, it’s down to 30 to 40k—bytes—per page to be able to store enough to recon­struct that page so that it looks exact­ly like what you had before. What we see in oth­er com­pres­sion tech­nol­o­gy is more of the sort of 80k, 100k, per page. 

That’s still very small. We’re at a thou­sand dol­lars per giga­byte, which is what cur­rent disk dri­ves cost. The twen­ty ter­abytes that peo­ple esti­mate in ASCII that’s in the Library of Congress is just twen­ty mil­lion dol­lars. So that’s not very much mon­ey in terms of being able to store and retrieve [crosstalk] the Library of Congress.
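Editor's note: the arithmetic behind those figures can be checked with a short sketch. The dollar, terabyte, and per-page numbers are the ones quoted above; the decimal unit conversions (1 TB = 1,000 GB, 1 GB = 1,000,000 KB) and the function names are assumptions made only for illustration.

```python
# Figures quoted in the interview.
DOLLARS_PER_GB = 1_000          # "a thousand dollars per gigabyte"
LIBRARY_OF_CONGRESS_TB = 20     # "twenty terabytes ... in ASCII"

# Assumed decimal unit conversions (not stated in the interview).
GB_PER_TB = 1_000
KB_PER_GB = 1_000_000


def storage_cost_dollars(terabytes, dollars_per_gb=DOLLARS_PER_GB):
    """Disk cost in dollars for the given number of terabytes."""
    return terabytes * GB_PER_TB * dollars_per_gb


def pages_per_gb(kb_per_page):
    """How many scanned pages fit in one gigabyte at a given page size."""
    return KB_PER_GB // kb_per_page


print(storage_cost_dollars(LIBRARY_OF_CONGRESS_TB))  # 20000000
print(pages_per_gb(40))    # 25000 pages at the 40k-per-page figure
print(pages_per_gb(100))   # 10000 pages at the 100k-per-page figure
```

Under these assumptions the twenty-million-dollar estimate follows directly, and a single gigabyte holds somewhere between ten thousand and twenty-five thousand scanned pages.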

Malamud: It’s not a lot in terms of disk space, but one of the things I’ve noticed with WAIS, it’s very easy to ask a question like “is there any information out there?” And WAIS is a great searching technique, and it comes back and says, “Yes, I have a million documents that have information in them.” Are we going to be flooding our networks? Do we have the network infrastructure that allows us to be truly a wide area information service?

Kahle: Well, you asked two good questions there. There’s the how do you do the right filtering? And how do you make sure that you’re… You can only read so much per day. So what’s the right mechanism to help you filter? Your machine, your client program that’s going to be spending twenty-four hours a day trying to find you information and filtering it. And if it’s not doing the best job, you’re going to go and get somebody else’s client, that’s going to go and filter and find the best information for you.

Malamud: Is the fil­ter­ing done at the client or at the server?

Kahle: Both. And now increasingly at intermediary sites. So the crude things that we’re doing today in terms of doing content-based retrieval, finding documents with certain words in them, those are starting to be augmented by human editors that are saying, “These are the important documents.” So when you have a flood of information, you really want to have sometimes human help to be able to find the right document. That human help can be embodied in servers by people saying, “Those are the good documents. You might want to think about those documents.” And that is another technique that the WAIS system is supporting to help people find what they want out of the gigabytes that are out there.

The oth­er ques­tion you asked is is the net­work infra­struc­ture good enough? And the answer is absolute­ly. What we’re find­ing is 56 kilo­bits is plen­ty for doing text and busi­ness graph­ics. The biggest prob­lem we have is get­ting reli­able net­works to peo­ple’s work­sta­tions. We run into all sorts of prob­lems. Novell net­works that aren’t com­pat­i­ble, peo­ple that have anti­quat­ed routers…56 kilo­bits is plen­ty. And in fact, I on my lap­top have a 9,600 baud modem. I use AppleTalk Remote Access from it. And I use WAIS all the time, as a pack­e­tized pro­file, not a dialup-type inter­face but a real graph­i­cal user inter­face. It’s great.

Announcer: You’re lis­ten­ing to Geek of the Week. Support for this pro­gram is pro­vid­ed by O’Reilly & Associates, rec­og­nized world­wide for defin­i­tive books on the Internet, Unix, the X‑Windows sys­tem, and oth­er tech­ni­cal top­ics. Additional sup­port for Geek of the Week comes from Sun Microsystems. Sun, the net­work is the computer.

Don’t touch that mouse. Internet Talk Radio will be right back.


Carl Malamud: This is The Incidental Tourist, non-technical reports from out-of-the-way places. So you’re work­ing in Asia and you’ve got the night off. You’re sick to death of the hotel cof­fee shop and don’t know where to eat. Here’s an easy hack.

Most depart­ment stores in big Asian cities have either a base­ment or a pent­house devot­ed to food. Occasionally, the food area—and we’re talk­ing mas­sive square footage here—is full of restau­rants which are usu­al­ly not too bad. Certainly not as good as that charm­ing lit­tle place back behind the tem­ple but hey, you have no idea how to find that place, let alone read the menu.

The restaurant courts are alright. But if you’re in luck you get a real market. The kind of place local yuppies come to do their shopping after a hectic day of answering calls on those cellular phones. These places are truly incredible food bazaars, citified versions of the traditional outdoor markets. There are a few butchers, fishmongers, and the like. But the bulk of the space is usually devoted to prepared renditions of every delicacy the country knows.

In Japan for exam­ple, you can go to Shinjuku Railroad Station, thread your way through the oth­er three mil­lion peo­ple a day using that sta­tion, and go to the head of the Odakyu Line. There you’ll find the Odakyu Department Store, an estab­lish­ment the size of a good-sized sub­urb. Wander around aim­less­ly until you stum­ble across the esca­la­tor head­ing to the sec­ond sub-basement.

Now, you may think you’re used to crowds but hold on. The Odakyu base­ment is the gas­tro­nom­ic equiv­a­lent of Times Square on New Year’s Eve. Grit your teeth and dive in. Walk up to a stand, hold out your five hun­dred yen, and you’ll get a box of pot­stick­ers or a scrump­tious piece of eel wrapped around sug­ar­cane and grilled, or an assort­ment of pick­led fish.

For a real treat, look for the tofu counter. They hand you a bas­ket and you can pick out all man­ner of lit­tle baked and fried tofu mir­a­cles. One has a piece of shrimp in the cen­ter, anoth­er has a lit­tle flat round burst of black sesame seeds, some are rolled around a fat lit­tle squid.

This same strat­e­gy works all over Asia. In Bangkok for exam­ple, try the Central Department Store on Phloen Chit Road, an adven­ture in din­ing and most things cost about 80 cents for a healthy por­tion. For the price of a bowl of onion soup from room ser­vice, I got enough food to feed a small army. You can get all the clas­sic street food here like satay sticks. You can get popi­ah sot, the fresh local spring rolls, right next to a stand of Chinese-style turnip and fishcakes.

Particularly good was the sakoo, steamed tapioca dumplings stuffed with pork, peanuts, sugar, and garlic. If you want an authentic local meal, skip the tourist joint with the charming buffet and the semi-authentic local dancers, and head to a department store food bazaar. This is Carl Malamud, the accidental epicure, for Internet Talk Radio.


Announcer: Internet Talk Radio. Asynchronous times demand asyn­chro­nous radio.

Malamud: If a few peo­ple need to get infor­ma­tion, obvi­ous­ly a 56k link into their desk­top is fine. What hap­pens when there’s mil­lions of WAIS users? Does our over­all net­work infra­struc­ture sup­port it? Can we be run­ning mas­sive WAIS servers in Finland and have mas­sive class­rooms in California retriev­ing those documents?

Kahle: So far, so good. When people start to search and retrieve things like Internet Talk Radio, that’s going to put some real limitations on what our network can do. But text, business graphics, even weather maps, those sorts of things can be supported pretty easily currently. A lot of smart people are working on getting bigger and bigger pipes going around, and they seem to be running ahead of what we need in terms of finding and retrieving information. Video is putting a great deal of strain, when people are starting to do video WAIS. So you might go and ask, “What news programs are talking about what’s going on in Bosnia?”

Malamud: Does some­one have to cat­a­log that?

Kahle: We don’t have any­thing that real­ly under­stands video in an auto­mat­ic way. What peo­ple are using is the audio tracks that are often tran­scribed for the hand­i­capped. So index that. Often it’s around. There’s tran­scripts of all news pro­grams around. Use that as the guide to help you find the right video that you want to be look­ing for. If peo­ple start to under­stand how to draw some­thing on a piece of paper and find oth­er doc­u­ments that are like it, if that were use­ful, we can sup­port that type of thing. It’s just not there yet.
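Editor's note: the idea of using transcripts as a guide to video can be sketched as a small word index. This is a toy illustration only, not the WAIS indexer; the video identifiers, transcript text, and function names are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(transcripts):
    """Map each word to the set of video IDs whose transcript contains it."""
    index = defaultdict(set)
    for video_id, text in transcripts.items():
        for word in tokenize(text):
            index[word].add(video_id)
    return index


def search(index, query):
    """Return the video IDs whose transcripts contain every query word."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical transcripts keyed by hypothetical video IDs.
transcripts = {
    "news-001": "Tonight: fighting continues in Bosnia as talks stall.",
    "news-002": "Weather: heavy rain expected across California.",
    "news-003": "Relief convoys reach towns in Bosnia.",
}
index = build_index(transcripts)
print(sorted(search(index, "Bosnia")))  # ['news-001', 'news-003']
```

The search never looks at the video itself; it only matches words in the transcript and hands back the identifier of the video to retrieve.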

Some of the inter­est­ing search and retrieval things are actu­al­ly not based on text, that are going on with WAIS. The peo­ple at US Geological Survey are mak­ing map data­bas­es where you can search based on lat­i­tude and lon­gi­tude and retrieve maps. It’s not using text at all. It under­stands only a few types of queries. But it’s get­ting real­ly good stuff for you.
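Editor's note: a non-text query of this kind can be pictured as a point-in-bounding-box lookup. The sketch below is an assumption about how such a search might work in principle, not a description of the US Geological Survey server; the sheet names, coordinates, and class are hypothetical, and it ignores map sheets that cross the 180-degree meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapSheet:
    """A map covering a latitude/longitude rectangle, in degrees."""
    name: str
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        """True if the point lies inside (or on the edge of) this sheet."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def find_maps(sheets, lat, lon):
    """Return the names of all sheets that cover the given point."""
    return [sheet.name for sheet in sheets if sheet.contains(lat, lon)]


# Hypothetical catalog of two map sheets.
sheets = [
    MapSheet("sheet-a", south=37.0, west=-123.0, north=38.0, east=-122.0),
    MapSheet("sheet-b", south=38.0, west=-78.0, north=39.0, east=-77.0),
]
print(find_maps(sheets, 37.5, -122.5))  # ['sheet-a']
```

The query carries two numbers instead of words, and the server answers with whichever maps cover that point.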

The weather map server? It only knows how to answer a couple of questions. But basically what it gives you is the current up-to-date, up to the hour, weather map that’s available. The DNA sequence people are using it, with people submitting DNA sequences. And they’re matching against these huge volumes of DNA sequences to find relevant documents. It’s not using text at all.

Malamud: So that’s a new kind of a serv­er. Would my very old WAIS client soft­ware be able to inter­act with this new serv­er, or do I have to upgrade my client every time there’s a new kind of service?

Kahle: Oh. Basically your old WAIS clients can get at all these new ser­vices. The key piece is the pro­to­col. Making it so that new infor­ma­tion ser­vices can come up, and the tens of thou­sands and soon hun­dreds of thou­sands of peo­ple that are using WAIS can get at that new ser­vice. That’s what the infor­ma­tion providers want. And what the infor­ma­tion con­sumers want is, all they want to do is learn one damn inter­face rather than one new inter­face for every­thing that comes along. They want to have lots and lots of val­ue from hav­ing to just learn one inter­face, or just a cou­ple of interfaces.

Malamud: It seems like users are going to have to learn mul­ti­ple inter­faces, because if you look at the area of resource dis­cov­ery in which WAIS is one exam­ple, there’s oth­er things out there. There’s the World Wide Web, there’s Archie. How do these things all fit together?

Kahle: Ah. They’re beau­ti­ful pieces of work. Gopher, World Wide Web, are two of my favorite inter­faces to WAIS. And that may sound a lit­tle bit strange. But Gopher has a real­ly nice brows­ing mech­a­nism to help you get going, to tell you a lit­tle bit of what’s out there, to help direct where you might want to go. And WAIS is just one of the things you can get to. We think of appli­ca­tions as becom­ing more and more WAIS-enabled. So instead of hav­ing a ded­i­cat­ed WAIS inter­face, you’re going to have your own inter­faces that are doing what­ev­er else you want to do. Your email pack­age should be WAIS-enabled. Your soft­ware pack­ages, when you help, it should go out to bul­letin boards that are indexed with WAIS.

Malamud: So are we back to emacs, then, where emacs is the interface to the world?

Kahle: Ah, emacs of course has a WAIS inter­face to it. But what we think is there are going to be hun­dreds of inter­faces to WAIS. And they’re going to be built into all sorts of things. Your CD-ROM play­ers should have WAIS things so that if you want to get up-to-date infor­ma­tion it can call out, get that new infor­ma­tion, and bring it back to your CD-ROM-based interface. 

So WAIS is more of a piece of plumb­ing. It’s more the sign­post, it’s the lines on the road. And most peo­ple don’t think of that as ter­ri­bly inter­est­ing. That’s fine. All we want to be is useful.

Malamud: You talk a lot about indi­vid­ual use of WAIS, and you were talk­ing about a 56k line is enough to get a user on, and you know, occa­sion­al­ly 9.6 will do the job. What’s it going to take to get 56k to indi­vid­ual users? Do you have any ideas? Because your ser­vice depends on that under­ly­ing infrastructure.

Kahle: Basically, peo­ple mak­ing mon­ey. The major dri­ving fac­tor of a lot of this, there’s ISDN, which is about… You know, been about to hap­pen for decades, and they just… The phone folks just don’t think there’s mon­ey in it. If they can start mak­ing mon­ey at it, this stuff can hap­pen extreme­ly fast. All the wires are already laid. It’s just a mat­ter of using it in this new way. That’s to the home, say.

Businesses, often they have even more than that running around their places. And what we’re seeing is some of the proprietary protocols washing away. We’re seeing the DECnets and the SNAs being replaced by things like TCP/IP. And there’s the proprietary protocol period that seems to set we computer scientists back for ten years or so. And what we’re trying with WAIS is to go out there in front with a good standard, and an open standard, and say, “It’s time to bypass the proprietary protocol period and get WAIS in place.”

We are completely dependent on network infrastructure. And by Al Gore’s being our new Vice President and having his clarion call to make sure that we have a national digital infrastructure, that’s helping a great deal. I’ve been in Washington now for a week or so. There is more a buzz around here about how to get with it, how to get our databases up. The United States government databases, and how to offer those. For free access, often, and sometimes for fee.

Malamud: Are we going to be in a world where good cit­i­zens go out and put data­bas­es togeth­er and let the rest of the world access them? Is that what you’re try­ing to find, kind of a world­wide free pub­lic library?

Kahle: Some things won’t be free. Some things will cost mon­ey. Other things will have oth­er types of restrict­ed access, because it’s your own pri­vate infor­ma­tion and there’s pri­va­cy con­cerns that go all the way through that sort of thing. But yes, most of us would be just per­fect­ly hap­py to have any­body lis­ten to us, right? That’s why I’m sit­ting here on this radio pro­gram not charg­ing any­thing. I’d love to have peo­ple know about what it is we’re doing. So, lots and lots of peo­ple will pub­lish and make their infor­ma­tion avail­able for free. 


Announcer: This is Geek of the Week, fea­tur­ing inter­views with promi­nent mem­bers of the tech­ni­cal com­mu­ni­ty. Geek of the Week is brought to you by O’Reilly & Associates, and by Sun Microsystems. 

This is Internet Talk Radio. You may copy these files and change the encoding format, but may not alter the content or resell the programs. You can send us mail to mail@radio.com.

Internet Talk Radio. Same-day ser­vice in a nanosec­ond world. 

Malamud: We had anoth­er exper­i­ment in infor­ma­tion for free, and it’s called the Usenet. And if you look at Usenet, there’s all these news­groups out there. And increas­ing­ly, at least my per­son­al feel­ing is it’s very hard to find infor­ma­tion. Are we going to end up with the same sit­u­a­tion in the WAIS world, where there’s a lot of data­bas­es but maybe the qual­i­ty of the infor­ma­tion isn’t there?

Kahle: Most of the information even available on WAIS right now is not very good. So yes, we’re going to have just a glut. The Internet is opening up lots of sluices to just get at lots of information. And trying to think that you’re going to be able to browse it all or get an idea of what’s out there is about as… That’s not gonna happen. Just try to read books in print sometime. You can’t use it in that way. You need mechanisms to help you find the right thing. And the key piece of WAIS is to not have the producer necessarily say who should be reading it, which is how Usenet is built. WAIS is trying to help the reader go out and find the things that he wants, or she wants, out of all of those sources. So some more sophisticated filtering mechanisms than the user-supplied filtering mechanism that’s in Usenet or email lists, for that matter.

Malamud: So rather than increase the qual­i­ty of the data­bas­es, you think we should increase the pow­er of the tools that search through those databases. 

Kahle: Both.

Malamud: Well, how do we increase the qual­i­ty of databases?

Kahle: At WAIS Incorporated, a lot of what we’re trying to do is help encourage and work with publishers to make their information available. There’s advertising-supported information that’ll be out there and people will pay to make it look good. Like, Sun is making a lot of this information available through WAIS.

But a lot of oth­er pub­lish­ers, tra­di­tion­al pub­lish­ers, are going to need pay­ment mod­els. And so we’re work­ing with them to try to come up with those pay­ment mod­els that are reflec­tive of their costs, which are often great­ly dimin­ished if they can dis­trib­ute over the Internet or oth­er net­works like it. So I think that we’re going to see more and more pub­lish­ers jump­ing in. We’ll see more pro­fes­sion­al data­bas­es. Like the US gov­ern­ment. A lot of what they do is pub­lish. And they’d love to have a mech­a­nism for pub­lish­ing cheap­ly, and they’ll pay for it out of oth­er bud­gets and then give away access. Those are the right sorts of data­bas­es, and we’re work­ing with those folks now—the EPA, Library of Congress, all sorts of peo­ple, to help them get their infor­ma­tion out in for­mats that lots of peo­ple can use.

Malamud: WAIS start­ed as a won­der­ful gift to the world from Thinking Machines, and now you’ve found­ed a com­pa­ny, WAIS Inc. Is WAIS no longer in the pub­lic domain? Have you tak­en it away and you charge for it now?

Kahle: Ah. No, the public domain environment, and the freeware world, is one of the most amazing worlds I’ve ever been involved in. I remember back at MIT, where there was lots and lots of freeware and sharing of code. It was a vibrant environment. The Internet has helped multiply that by thousands, to help people create and deploy information tools.

From the begin­ning, WAIS was a mixed com­mer­cial and free­ware envi­ron­ment. The orig­i­nal par­tic­i­pants were Thinking Machines, Apple Computers, Peat Marwick, and Dow Jones. The free­ware world was part, a very impor­tant part, but only a part. The com­mer­cial world was a part, only a part, but an impor­tant part.

We’re try­ing to strad­dle three worlds. The .edu world, the .gov world, and the .com world. And it makes for great con­ver­sa­tions when you get those three groups togeth­er, because often they don’t trust each oth­er. They don’t real­ly know what moti­vates each oth­er. So try­ing to keep those worlds togeth­er is impor­tant. What WAIS is about is a pro­to­col to help peo­ple find and retrieve infor­ma­tion and make their infor­ma­tion avail­able. And we’re try­ing to make one pro­to­col, an open pro­to­col, good enough, that all three of those worlds will want to participate.

So what happened was we used the Internet as a mechanism for proving out the technology, doing a lot of R&D, and get lots and lots of good people working in the system. And now that we’re… When I started to go more commercial with the services, and I started planning out WAIS Incorporated, I really needed the freeware to be done well. And I worked with NSF to make a set of money available to start a WAIS center. And there is now one in North Carolina, and there’s starting to be other WAIS centers that are really servicing other domains than what we’re doing.

It’s all got to go hand in hand, and we all have to work togeth­er. And it’s the excit­ing part. And if we lose, what I fear most is either we’re going to make some­thing so bad that peo­ple won’t want to use it, or we’ll self-destruct, or we’ll get cocky. And what will hap­pen then is we’ll see pro­pri­etary stan­dards come in.

Malamud: Are you com­pet­ing with North Carolina? Are they your com­pe­ti­tion for the WAIS data­base provider try­ing to fig­ure out how to do things?

Kahle: No, they’re our broth­ers. They’re extreme­ly impor­tant to the suc­cess of WAIS. This world is grow­ing by leaps and bounds. Millions of dol­lars a year are going into WAIS from all sorts of areas. And there are lots of nich­es that need dif­fer­ent types of ser­vices. A lot of the uni­ver­si­ties real­ly need things for free. So they can play with it, and under­stand, and build it into sys­tems and do research based on it. 

But a lot of peo­ple, when they’ve got mis­sion crit­i­cal data­bas­es, they can’t depend, frankly, on free­ware. They need to have a phone num­ber, some­body they can call. They have to know that when the new ver­sion of the oper­at­ing sys­tem comes out, the new ver­sion of the soft­ware is going to come out. And those are the envi­ron­ments that peo­ple are try­ing to work with. 

US Geological Survey is doing prob­a­bly the best work in the gov­ern­ment domain in help­ing the geo­graph­ic infor­ma­tion peo­ple use WAIS up a storm. So I think the North Carolina peo­ple are real­ly help­ing move the uni­ver­si­ty envi­ron­ment in the free world. Though we’re work­ing with Rice University, where they’ve gone and put up cur­rent con­tent. This is a for-pay pro­pri­etary source that they bought rights to run at the Rice cam­pus. So theirs is an exam­ple of a com­mer­cial WAIS serv­er. But it’s restrict­ed access. And that is an exam­ple where Rice need­ed bet­ter than what was avail­able in the freeware. 

Malamud: But what if North Carolina does a won­der­ful job putting togeth­er free­ware? How are you going to make mon­ey? Why would they pay you mon­ey when they can get it for free from North Carolina?

Kahle: Oh, we’d love more and more pieces to come out of the free­ware domain, whether it’s North Carolina, whether it’s out of the Gopher peo­ple, what­ev­er. Our goal at WAIS Inc. is to try to keep the world togeth­er. To try to make it so that con­sumers know where to find the right infor­ma­tion, and to help peo­ple that think of them­selves as just con­sumers to start publishing.

Malamud: But how do you make mon­ey at that?

Kahle: We’re doing it by sell­ing soft­ware tools. So servers, enhanced clients. But we find what most peo­ple need is help in under­stand­ing what this stuff is. So we do con­sult­ing and con­tract work to help mod­i­fy the exist­ing set of servers out there, and do things for them. We also help pub­lish­ers put things up and help them run those ser­vices. Sometimes we get a per­cent­age of their rev­enues off of those systems. 

So, we’re flexible. And as much as people start to step forward and say, “We’re going to do this piece well,” then we’re often perfectly up for stepping back. The only way we’re going to win is by leveraging lots and lots of institutions to do what they want to do. So, systems integrators are starting to step forward and say, “We want to do the system integration.” We have people that are trying to bundle the WAIS server code into lots of products. People are starting to bundle the clients into products, where they’re making their things WAIS-enabled. And we’re helping those that need to be dealing with a commercial entity, or they can’t take it seriously.

Malamud: WAIS is mak­ing the Internet look like a sin­gle library, a sin­gle data­base. Is this the begin­ning of a new kind of library, a dis­trib­uted library, a glob­al library?

Kahle: I think a library’s maybe not the right analogy. I would think of it as a huge bookstore, or a set of information services that are available. You know, is a weather map updated every hour? A book? Is a library? No, it’s kind of like your television is going and downloading Geek of the Week, and finding that that’s what you want to be listening to. Is that a bookstore or a library? No, that’s more like the radio.

The Internet is a new communication structure. And it’s a new way for humans to communicate. And what WAIS is trying to do is help you navigate that. What is really exciting with a new communications structure, say printing or the telephone, is all sorts of things happen. Industries come and go. There’s realignment of how companies work, how whole institutions work. And what’s fun about this is it’s an open environment for all of us to take part in shaping it. And then those people that shaped the technology have an inordinate control over what it looks like. Why are these telephones looking the way they are? And why weren’t they used for all the things they’re used for now, fifty years ago? Because the people that were involved early had one vision. So the invitation is, this is an open world, let’s shape it into what we want it to be.

Malamud: There you have it. We’ve been talk­ing to Brewster Kahle, and this has been Geek of the Week. 

Announcer: This has been a Geek of the Week, brought to you by Sun Microsystems. And by O’Reilly & Associates. To purchase an audio cassette or audio CD of this program, send electronic mail to radio@ora.com.

Internet Talk Radio. The medi­um is the message.
