Carl Malamud: Internet Talk Radio, flame of the Internet. We’re talking to Cliff Neuman, who’s a member of the research faculty at the University of Southern California. He’s also one of the principal designers of Kerberos, and the designer of Prospero, which is one of the new breed of resource discovery protocols. Welcome to Geek of the Week, Cliff.
Clifford Neuman: Hi. Thank you, Carl.
Malamud: Why don’t we start with what Prospero is. What is this nifty service?
Neuman: Well, one of the typical answers that I give to what is Prospero is, what do you want it to be? And one of the reasons for that is that I see Prospero as filling a number of roles. Prospero is primarily a directory service. But in fact I see it as a directory service that can be used to tie together the various components of a distributed system.
Malamud: But what do I see then, as a user? Do I see— Do I type “ls” and get a bunch of file names back? Is this a replacement for the ls command?
Neuman: Well in fact there is a command called vls which you can type which will show you a bunch of files, in particular those files in a virtual directory that is part of a virtual file system that is part of a virtual system. Let me say a little bit about what sort of…the overall idea which is behind Prospero. And that is a concept I call the virtual system model.
Basically, we’re starting to see lots and lots of information, and lots and lots of services that are available over the Internet. And one of the things that over the past year, or over the past few years, people have started to try to do is to try to come up with a system that can—so that users can think of everything that is out there as a single system. Unfortunately there are some problems with this. In particular when all of this is part of a single system, the system is just too big for users to think about.
Malamud: It’s like having a directory with a whole bunch of files in it. [crosstalk] Is that the problem?
Neuman: Yes. That’s correct. In fact the anecdote I like to give about this is suppose everything were a single system and you sit down and you wanted to see who is logged in and you type “finger.” Well, you’d sit there for three days as the name of every user in the world that’s currently logged in typed out on your screen. So to address this problem, I believe that the correct approach is to allow users to select those resources and those parts of the system that are of interest, and then to treat those resources that they’ve selected as if it was a single system. So now, to an individual user, the user sees a single system which is much smaller than everything out there but is very much customized to what their particular needs are. Whereas in the system as a whole, you’ve got different views of this. So different users see different things. And there are a lot of problems that come up when you do this. And some of the mechanisms that Prospero provides are designed to help you resolve the problems that come up from different users seeing different things.
Malamud: So do I have to go out there and say “Well, I like this piece of information. I like that piece of information.” Do I have to scour the world and build this worldview or is it somehow done dynamically and automatically for me?
Neuman: Well that is in fact one of the problems which Prospero addresses. We recognize that certainly it’s impractical to require users to go out without the benefit of Prospero finding what they’re interested in just so that they can pull that back into their view of the world. Instead, the way that things work is there are certain people that have organized information already. And we believe that users should be able to construct their own virtual system by starting from existing virtual systems that others have created and customizing them. And by taking the best parts of different virtual systems and combining them into their own view.
Now how do they find the virtual systems that they start from? Well, typically a user will start out with a virtual system that is set up for them by their site administrator, for example.
Malamud: So there’s a default Prospero that we use? Is that how it works?
Neuman: Yeah. So your site might have a default Prospero, or a default virtual system that has been set up by your administrator, knowing that well you know, this is the chemistry department so the people in this department are really more interested in things related to chemistry instead of things related to computer science, for example. Starting from this, you will have those things that it was expected that you might be interested in nearby. You can still get at all the stuff that’s there for computer science, for example, but you’ve got to go through a few additional hops. But as you find these things by exploring deeper and deeper in your own virtual system, you can forge new links and bring those pieces that you have decided are of interest closer to the center of your virtual system.
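The idea of starting from a site default and forging new links can be illustrated with a minimal Python sketch. The `Directory` class, the `hops` helper, and the department names are assumptions made up for illustration; they are not Prospero's actual data structures or API.

```python
from collections import deque


class Directory:
    """A node in a virtual system; links map names to other directories."""

    def __init__(self):
        self.links = {}

    def link(self, name, target):
        self.links[name] = target
        return target


def hops(root, target):
    """Breadth-first search: fewest links to follow from root to target."""
    queue = deque([(root, 0)])
    seen = {id(root)}
    while queue:
        node, depth = queue.popleft()
        if node is target:
            return depth
        for child in node.links.values():
            if id(child) not in seen:
                seen.add(id(child))
                queue.append((child, depth + 1))
    return None  # target is not reachable from this view


# Default virtual system set up by a (hypothetical) chemistry department.
site_default = Directory()
site_default.link("chemistry", Directory())
elsewhere = site_default.link("other-departments", Directory())
cs = elsewhere.link("computer-science", Directory())
papers = cs.link("papers", Directory())

# The user starts from the site default and customizes a private view.
my_view = Directory()
my_view.links.update(site_default.links)
before = hops(my_view, papers)
my_view.link("cs-papers", papers)  # forge a new link near the center
after = hops(my_view, papers)
```

In this sketch the user's view shares the underlying directories with the site default, so the new link changes only what that one user sees; the site default still reaches the same directory through the longer path.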
Malamud: And you actually— Let’s say you find some anonymous FTP archive out there and it’s got a file of chemical abstracts. Do you make a copy of it and bring it back, or does Prospero know that it’s FTP-able and just get it when you point to it?
Neuman: Prospero knows that it’s FTP-able. You simply make a link to it. Well first of all, as you’re exploring you might just say “I want it,” and you can say “cat filename” if you’re in the file system interface. Or if you’re in the menu browser interface, you can simply select the particular item.
One of the important things with Prospero—this is sort of an aside here—is that we’re not trying to provide the user interface. We’re trying to provide infrastructure upon which different applications can build. So for example, you brought up the ls interface. So you’re doing cd’s and ls’s and looking around. But that’s just one way of getting at this data. The same data you can access through a menu browser very similar to the Gopher system. We have plans on adding a hypertext browser. Already, Prospero is the way that most users out there get at information that’s available through the Archie database, although because Prospero provides infrastructure, most users don’t realize that because they just run the application that they’re used to.
Malamud: So they’re running xarchie and they’re actually talking to Prospero?
Neuman: That is correct. In fact—
Malamud: Well now how does that work?
Neuman: Xarchie is simply an application that has been built that makes calls to query the Archie database over the network using the Prospero protocol. So, the Archie database is exported by a Prospero server on each of the primary Archie sites, in a form that looks like basically a mesh of information or like a file system in some sense. Really it’s a directed graph. You have individual nodes in the graph that correspond to directories. You have nodes in the graph that correspond to files. You have links that bind certain files and other directories in there. And you can have attributes that are associated with individual files.
So in fact, what happens when xarchie makes a query, it goes off, determines what it is that you want based on what particular buttons you clicked in the xarchie interface and the name of the file you specified. It formulates that in the form of…well, here is a node in this directed graph that corresponds to an Archie server. And I know that by specifying a certain file name under this, I am going to get back the contents of a directory that corresponds to the results of the query. And then it gets back that directory which contains all the links that are matches to your query and presents them to the user. Information about file modes, last modification times, are returned as attributes that come back at the same time.
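A rough Python sketch of that model follows: a query is answered with a directory node whose links are the matches, and file modes and modification times ride along as link attributes. The class names, the `archie_query` function, the attribute keys, and the index entries are all hypothetical; the real Prospero protocol and the Archie database format are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                   # "directory" or "file"
    links: list = field(default_factory=list)   # outgoing Link objects


@dataclass
class Link:
    name: str
    target: Node
    attributes: dict = field(default_factory=dict)


# Stand-in for an Archie site's local database (made-up entries).
INDEX = {
    "ftp.example.org:/pub/tools/prospero.tar.Z":
        {"mode": "-rw-r--r--", "last_modified": "1993-08-01"},
    "ftp.example.org:/pub/docs/README":
        {"mode": "-rw-r--r--", "last_modified": "1993-07-15"},
    "ftp.example.net:/archive/prospero-notes.txt":
        {"mode": "-rw-r--r--", "last_modified": "1993-06-30"},
}


def archie_query(index, substring):
    """Return a directory node whose links are the files matching the query."""
    result = Node(kind="directory")
    for path in sorted(index):
        filename = path.rsplit("/", 1)[-1]
        if substring in filename:
            # Attributes come back together with the link itself.
            result.links.append(Link(path, Node(kind="file"), dict(index[path])))
    return result


matches = archie_query(INDEX, "prospero")
names = [link.name for link in matches.links]
```

A client such as a menu browser or an `ls`-style command could then present `names` however it likes, since the result is just a directory of links.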
Malamud: So xarchie talks to Prospero, Prospero then talks to other Archies and gets that information back?
Neuman: Prospero then talks to Prospero servers that are running on Archie sites. And the Prospero server there makes a query to the local Archie database. And then returns the results as links and as directories and as attributes using the Prospero protocol. In fact there is not a separate protocol to get at Archie. The only way to get at Archie over the network is through the Prospero protocol or by telneting to Archie—
Malamud: Or emailing—
Neuman: Or by emailing.
Malamud: So you’re just a…you’re providing a network interface to the Archie world. Do you do that to other worlds, like WAIS servers?
Neuman: Yes. In fact Alan Emtage has recently—or at least one of the other people working at Bunyip has in fact created a Prospero server that will export a WAIS database. We just this summer released a Prospero server that provides a gateway to Gopher space. So in fact, the Gopher graph in some sense, or at least the Gopher hierarchy that you go through if you’re going through Gopher, is in fact available as a directed graph using Prospero, with the particular information about links such as how you display them, where they appear, as attributes of those particular links and of those particular nodes in that directed graph.
Malamud: Now why would I want to do that? Why would I want to add another… I’ve got a Gopher server, I’ve got a Gopher client, why would I want to put a Prospero server on top of my Gopher server and come in that way? Isn’t that an extra level of indirection?
Neuman: Yes. It is an extra level of indirection, and as you may have heard some people say well, any problem in computer science you can always solve with another level of indirection. In fact, that is what we need to do here because if you look at many of the services that are out there—you look at WAIS, you look at Gopher, you look at World Wide Web—these services are vertically integrated. You have a Gopher server, and you have a Gopher client. The Gopher client can access data on the Gopher server. You have a World Wide Web server, or a HTTP server that’s out there, and you have a hypertext client, and that hypertext client can access the stuff that’s in the World Wide Web server.
Malamud: What about Xmosaic? Isn’t that a multiple interface client that happens to talk World Wide Web but it also talks other things?
Neuman: Um, in fact what you find people starting to do in many of these situations is they are— Well there are several approaches that are being taken to address a problem. One is gateways. And the idea behind a gateway is that now all the information from one service becomes available to another. It gets translated in this intermediate machine that’s a gateway.
Also many of these applications that are out there will in fact understand multiple data access protocols in some cases. So Xmosaic for example, and I’m not sure of the actual details with Xmosaic, but I believe it can go and retrieve a file by FTP, it can retrieve the file by HTTP. I don’t know if it can directly retrieve a file by Gopher. It may be able to.
Prospero provides those functions at the data access level as well. But I believe it’s important to have a more uniform meta information level that allows you to export in a common format, directory information that is the relationships between objects, attributes about objects, and there are certainly some other approaches or…given that approach there are certainly some other protocols you might consider for this. So for example—
Malamud: Sounds like X.500.
Neuman: Well, certainly it sounds like X.500. It sounds like…it sounds like…well why not use the Gopher protocol? Or why not use the HTTP protocol to do all this?
Well, one of the problems with some of the existing services is that the protocols themselves are too closely tied to the presentation. So, for the Gopher protocol for example, the Gopher protocol exports this meta information with the assumption that you’re going to display it using a menu browser. World Wide Web or HTTP exports directory and meta information with the assumption that you’re going to display it as a hypertext document.
Well, in the case of the menu browser, that’s a little bit too restrictive in the sense that there’s lots of stuff that you can’t really represent within a simple menu browser. For the hypertext document it’s perhaps a bit…it’s not restrictive enough. In the sense that it makes assumptions that links to documents, or links to objects are going to come out of the middle of a document somewhere. And this is difficult for a menu browser to parse.
So instead what you’d like to do is have a protocol that exported the information, the meta information—that is those links, the relationships between the links, the attributes describing the links and describing the objects—in a form that each application can pick and choose those pieces that it needs. So the hypertext browser, if you have a document that is a hypertext document, the directory should be able to specify where those links are supposed to come out of in a document. But if you’re looking at the same document through a menu browser, it should then be able to just look at what those links are and not where specifically in the document they are coming from. And furthermore, there should be some meta information associated with each link so it knows what to display for the particular menu item.
Malamud: In the World Wide Web, I can take a document and format it in HTML and add that to my database and my server goes out and provides that information to the World Wide Web environment. If there’s a Prospero server on top it provides it in that world. Do I ever format a document for Prospero, or does Prospero sit on top of another system?
Neuman: Prospero actually sits between two parts of the system. It sits above the server, or the data storage mechanisms that are actually storing the data that information providers want to make available. And it sits below the application that is going to access that data. So, one of the advantages of this is that if you have a bunch of applications, or a bunch of interfaces to get this information, they are then able to access all the information exported by all the servers that export information using Prospero. So, instead of having a single vertical stack, the applications along the top should be able to get at all the data along the bottom.
Now, the formatting of the data that you’re going to export should be formatted for specific applications, perhaps. Ideally you would like to pull out as much of the formatting information as possible and represent it as attributes so that that is actually retrievable directly through Prospero. But, you may have different representations of the document for different interfaces. For example the idea with the hypertext documents and how you might represent them using Prospero is that you can have an object in Prospero that is both a file and a directory. The directory contains those links to the other documents that are referenced from the hypertext document. There can be attributes associated with those links that say where the source of the link is within a document, if you’re using a hypertext browser that is going to represent the document that way.
You then have the data associated with the object also, which is what you’re going to display to the user. And you might use various formats to represent this data, whether it’s a PostScript document, or an ASCII text file, or an nroff document or something else.
Now, the type of the document or the format of the document is also exported by Prospero as an attribute, what we call object interpretation, so that the application can decide how it is going to interpret that particular document. But, for many types of documents that are simply text it’s still quite appropriate for a simple browser that doesn’t understand HTML, for example, to display it simply as text. And then the attributes that are on the links still allow it to describe where those links are using Prospero directly, even though it does not understand how to interpret the embedded text that is within an HTML document.
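The object that is both a file and a directory can be pictured with a small Python sketch. Here one object carries data, an interpretation attribute, and links whose attributes record where each link originates in the text, so a menu browser and a hypertext browser can each pick the pieces they need. The dictionary layout, the key names such as `source_offset`, and the two view functions are assumptions for illustration, not Prospero's actual attribute names or formats.

```python
TEXT = "See the Kerberos paper and the Archie notes."


def make_link(name, target, anchor):
    """Build a link whose attributes say where it comes out of the text."""
    return {
        "name": name,
        "target": target,
        "attributes": {
            "source_offset": TEXT.index(anchor),
            "source_length": len(anchor),
        },
    }


# One object that is both a file (data) and a directory (links).
document = {
    "data": TEXT,
    "attributes": {"object_interpretation": "ascii-text"},
    "links": [
        make_link("Kerberos paper", "/papers/kerberos", "Kerberos paper"),
        make_link("Archie notes", "/notes/archie", "Archie notes"),
    ],
}


def menu_view(obj):
    """A menu browser only needs the link names, not their positions."""
    return [link["name"] for link in obj["links"]]


def hypertext_view(obj):
    """A hypertext browser marks each link at its position in the text."""
    text = obj["data"]
    spans = sorted(
        (link["attributes"]["source_offset"], link["attributes"]["source_length"])
        for link in obj["links"]
    )
    # Insert markers from the end so earlier offsets stay valid.
    for offset, length in reversed(spans):
        end = offset + length
        text = text[:offset] + "[" + text[offset:end] + "]" + text[end:]
    return text
```

Both views read the same object; only the attributes each one chooses to consult differ.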
Malamud: Are there multiple people that have put together interfaces to Prospero? I mean, can I run this on multiple platforms? What… How do I run it?
Neuman: Right. So, we have several releases of Prospero that are available, some of which we’ve made available, some of which other people have done extensions to. Our basic release is available from the machine prospero.isi.edu in the directory /pub/prospero.
Malamud: Okay.
Neuman: You can read the readme file there and find out information about which release you wanna get. The releases that we distribute run on most variants of the Unix operating system. Version 4 of Prospero—we just recently released version 5. Version 4 of Prospero, a number of people, in particular Brendan Kehoe who’s at Cygnus, took and stripped out a lot of the pieces that were not necessary for, for example, the standalone Archie client. And this version which he had then made available was then in fact portable to even more machines. I believe it’s been ported to VMS and it’s been ported to MS-DOS and a few other machines. We just recently released version 5 and I expect we are going to start seeing the same thing happen with version 5. But right now the release that we’re providing is only written for Unix variants.
Malamud: But you distribute source code, [crosstalk] it’s publicly available and…
Neuman: Yes. It’s publicly available source code written in ANSI C. It does require network support for both the User Datagram Protocol and of course the select system call. Those are the two things that—
Malamud: But Prospero’s written in ANSI C, so Prospero is ANSI-compliant.
Neuman: Yes.
Malamud: Why do you call it Prospero?
Neuman: Well, Prospero was the principal character in The Tempest, by Shakespeare. And in The Tempest, when the enemies of Prospero were shipwrecked on an island, through various magic he caused each of the members of the shipwrecked party to think that they were the only person on the island. And they did not have a shared view of everything until they slowly learned about the other survivors.
Malamud: Kerberos has been highly successful in fairly large networks, but one can say that it seems to have found its place in the organization. In the MIT campus network, in the corporate network. And it doesn’t seem to have scaled to the Internet as a whole. We’re looking at new technologies now like public key cryptography. Is that an alternative to Kerberos, or do the systems from RSA and the systems that Privacy-Enhanced Mail are based on, do those somehow fit in with Kerberos to provide a security solution?
Neuman: I think that the two are complementary. In fact, you are quite correct that most of the use of Kerberos to date has been within a particular organization. Now version 5 of Kerberos is scalable. In fact you can organize Kerberos realms—these are collections of Kerberos users that are registered in a common database. You can organize these realms so that users in one realm can in fact communicate with and authenticate themselves to services in another. And you can organize realms hierarchically along the lines of the domain name system, or in fact along similar lines to the certification hierarchies that you have in Privacy-Enhanced Mail.
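One way to picture realms organized along the lines of the domain name system is the Python sketch below, which lists the realms a client might pass through to reach a service in another realm: up to the nearest shared ancestor, then down. The realm names and the `realm_path` function are hypothetical, and the sketch assumes every intermediate realm exists and shares keys with its parent; it is not the Kerberos library's own path computation.

```python
def ancestors(realm):
    """Return the realm followed by each enclosing realm, nearest first."""
    parts = realm.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))]


def realm_path(client_realm, service_realm):
    """List the realms traversed from the client's realm to the service's."""
    up = ancestors(client_realm)
    down = ancestors(service_realm)
    common = next((realm for realm in up if realm in down), None)
    if common is None:
        raise ValueError("no shared ancestor realm in this hierarchy")
    climb = up[: up.index(common) + 1]           # client realm up to the ancestor
    descend = down[: down.index(common)][::-1]   # then down to the service realm
    return climb + descend
```

For example, `realm_path("CS.EXAMPLE.EDU", "CHEM.EXAMPLE.EDU")` goes through `EXAMPLE.EDU`, while two realms with no shared suffix raise an error in this simplified model.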
Malamud: Should we do that instead of the certification hierarchies?
Neuman: No. I believe that— well, I believe that we will start to see hierarchical organizations of Kerberos realms. But there are also benefits to public key cryptography. Kerberos is based on conventional cryptography, which has some limitations, but it also has some benefits. In particular conventional cryptography tends to perform—has better performance than public key cryptography.
There are advantages to public key cryptography related primarily to the fact that you do not have to store secret information on a central server. Whereas in Kerberos, although the central server is something that is presumably much more easily secured than your normal workstation, there’s still the fact that the users’ keys are stored on it.
So there are definite benefits to using public key cryptography. But there are also definite costs to using it. And what I see as the ideal mix is you need to have both of these available. And it is likely that over the course of the next couple of years, we will in fact see authentication protocols, in particular perhaps some follow-ons to Kerberos, that will provide both Kerberos authentication, based on conventional cryptography, and authentication based on public key cryptography. And such a mechanism will have the advantage that those users that cannot afford the cost in terms of performance of the public key encryptions can choose to use the conventional cryptography and still interoperate with those users that are doing authentication based on these public key certification hierarchies.
Additionally, for those sites that are not willing to install public key-based authentication for fear of either infringing on patents or for the desire not to have to pay the licensing fees for that, they too would then be able to make use of the conventional cryptography in one particular variant of Kerberos while still interoperating with those that are making use of the public key cryptography as well.
So I see these things actually coming closer together so that it is just simply a choice of which form of cryptography you choose to use, and which certification hierarchy you tend to use, whether that be the one that will be evolving for Privacy-Enhanced Mail, or the conventional cryptography realm hierarchies that are used for Kerberos. But these should all work together within a common mechanism in the long run, and I think we’re going to start to see that.
Malamud: I think the long run is maybe a key phrase right there. Security has been a long time in coming. We’ve had solutions like Kerberos, we’ve had the RSA-based public key cryptography. Yet for most of the global Internet security consists of typing a password in the clear for a telnet session. Are you disappointed in how long it’s taken security to deploy itself in the Internet?
Neuman: I’m somewhat disappointed, but I’m not at all surprised. And the reason that— Well, there is one thing that we have not provided to the application developers that I think would definitely improve their ability to integrate security with their applications. And that is that right now, integrating security mechanisms within applications is pretty difficult. You have to go and modify the application to make calls to the security services, then send across credentials to the other side, pass them into a function there. You have to actually go in and modify these protocols and these mechanisms at a pretty low level.
Now, there’s work such as the GSSAPI—the Generic Security Service Application Program Interface, which allows you to do this once for an application, and then that presumably works for a variety of authentication mechanisms, but there’s still that initial hurdle of going in and changing the existing applications. If we had a service such that you just relinked your application with this library and all of a sudden everything was secure, that would be the ideal world because it would be easy to integrate it with new applications.
One of the things that I would really like to see, and it’s something that I’ve been doing, is as people design new applications, they should think about security up front. And they should include at least the hooks in the mechanism so that they can be easily integrated with security protocols. For example, Prospero which is a directory service, currently supports four kinds of authentication. It supports version 5 of Kerberos, but recognizing of course that most sites do not yet have Kerberos set up. It also supports password authentication. It also supports authentication based on the Internet address from which you’re coming and the user name.
Malamud: So you think it’s up to the application designers to begin working much more intensively in this area rather than waiting for some security guru to come up with a magic bullet.
Neuman: I think that new application designers need to consider security from day one. Because well, even if you had this magic bullet there are certain issues which cannot be handled just by relinking with a library. In particular, Prospero has support—and the Andrew File System is another example—has support for access control lists, fairly flexible access control lists. That is something that you need to think about in your service model that cannot just be obtained by providing a library; in particular you need the ability to store access control lists and associate them with particular objects. The protocol messages that’re involved for exchanging security information in Prospero, that is something that I designed into Prospero but something that might in—one of these days if one of these magic bullets came along, one of these libraries you can link with—might in fact supplant some of that. But for the time being I would say don’t wait on that. If you are designing a new application or a new protocol, think security up front. At least in the IETF there’s now a requirement for new RFCs to have a section with security considerations. And one of the ideas behind that was to put the people designing these new protocols at least in touch with people in the Security Area Advisory Group so that they can be brought up to speed on what is available out there and what issues they should be thinking of up front.
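The point that access control lists have to be stored with particular objects, rather than supplied by a library, can be sketched in a few lines of Python. The table layout, the method labels, the principals, and the `is_allowed` function are assumptions for illustration only; they do not reproduce Prospero's or the Andrew File System's actual ACL formats.

```python
# Each object carries its own list of (authentication method, principal, rights).
ACLS = {
    "/chemistry/abstracts": [
        ("kerberos", "alice@EXAMPLE.EDU", {"read", "modify"}),
        ("host-address", "bob@192.0.2.7", {"read"}),
    ],
}


def is_allowed(acls, obj, method, principal, right):
    """Grant a right only if an entry matches both the method and the principal."""
    for entry_method, entry_principal, rights in acls.get(obj, []):
        if (entry_method, entry_principal) == (method, principal) and right in rights:
            return True
    return False  # no matching entry: deny by default
```

Recording the authentication method in each entry lets a service give weaker methods, such as address-based authentication, fewer rights than stronger ones. That is a design choice of this sketch rather than something stated in the interview.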
Malamud: Is that working?
Neuman: That seems to be working. Although we do see a lot of security considerations which simply state “There are none.” Or, “This document is entirely about security.” But, at least it has given us the opportunity when people start looking at new protocols to speak to them and say well here’s what’s available; here are some hooks that you should stick in even if the things are not available yet.
Malamud: Well thank you very much. We’ve been talking to Cliff Neuman.
This is Internet Talk Radio, flame of the Internet. You’ve been listening to Geek of the Week. You may copy this program to any medium, and change the encoding, but may not alter the data or sell the contents. To purchase an audio cassette of this program, send mail to radio@ora.com.
Support for Geek of the Week comes from Sun Microsystems. Sun, the network is the computer. Support for Geek of the Week also comes from O’Reilly & Associates, publishers of the Global Network Navigator, your online hypertext magazine. For more information, send mail to info@gnn.com. Network connectivity for the Internet Multicasting Service is provided by MFS DataNet and by UUNET Technologies.
Executive Producer for Geek of the Week is Martin Lucas. Production Manager is James Roland. Rick Dunbar and Curtis Generous are the sysadmins. This is Carl Malamud for the Internet Multicasting Service, town crier to the global village.