Carl Malamud: Internet Talk Radio, flame of the Internet.
Malamud: This is Geek of the Week, and we’re talking to Dr. Jeff Case. He’s President of SNMP Research and a professor at the University of Tennessee. Welcome to Geek of the Week, Jeff.
Jeff Case: Thank you.
Malamud: You were one of the original four authors of SNMP version 2. I was wondering if you could tell us what is SNMP version 2, and why do we need it?
Case: SNMP version 2 is a second generation of the highly successful SNMP version 1 network management framework. When we talk about SNMP version 2, we’re talking about two things. We’re talking about the protocol, SNMP, but more importantly coming with that protocol is a network management framework.
The SNMP version 2 protocol suite consists of a total of twelve documents—about 400 pages of very very interesting reading, especially if you have insomnia—that include other things like the structure of management information which defines how we name information, often called the SMI, and management information bases, the management information base often called the MIB. So these documents work together to define an entire framework for network management.
Now, in terms of your question how is this different from SNMP version 1, it gives us increased capabilities in several areas. These increased capabilities were selected as a result of feedback and comment from people who had used the SNMP version 1 framework over time. Things like high-speed retrieval of bulk data was a criticism of SNMP version 1 and has been addressed quite nicely in SNMP version 2 and it’s one of the things that I’m personally most proud of.
Malamud: How did SNMP version 1 handle the problem of getting lots of data back from an agent to a manager?
Case: Normally if you had a large table you would retrieve that table one row at a time, and if that table had five columns you might pull back those five information items in a single row, one row at a time, five columns at a time. As a result, if there were 10,000 rows in the table, it might take 20,002 packets (approximately 20,000) and 10,001 round trips to retrieve that information. So you were bringing back lots and lots of packets that were not full.
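The arithmetic behind those figures can be sketched in a few lines. This is a minimal illustration, assuming one row per GETNEXT exchange and one final exchange whose answer falls past the end of the table; the function name `getnext_walk_cost` is made up for this example.

```python
def getnext_walk_cost(rows):
    """Return (round_trips, packets) for walking a table one row per GETNEXT.

    Assumption: one extra round trip is needed at the end, because the
    manager only learns the table is finished when the agent answers with
    an object that lies beyond it.
    """
    round_trips = rows + 1
    packets = 2 * round_trips  # one request plus one response per round trip
    return round_trips, packets


round_trips, packets = getnext_walk_cost(10_000)
print(round_trips, packets)  # 10001 20002
```

For the 10,000-row table in the example, this gives the 10,001 round trips and 20,002 packets mentioned above.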
Malamud: So an example of that might be “Show me the routing table on that router because we’re worried that something’s going wrong.” Would that be a—
Case: Two of the classical examples are the routing database, and on a MAC Layer bridge the bridge forwarding database. The bridge forwarding database typically has a few columns, and a typical Media Access Control MAC Layer bridge such as an Ethernet-to-Ethernet bridge might have 8,000 rows in it.
Malamud: And you’d want to look at it if you suspect you have a loop or something like that.
Case: Some people do that. Some people, for example the MAC forwarding database, they use that to detect when a new node has been added to their network because the bridges here, every packet from every host gets through at least one time and therefore it shows up in the entry. And if you scan those entries from time to time you can detect the appearance of new systems that’ve been added to the network. Perhaps with and perhaps without the knowledge of the network administrator.
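The scanning technique described here amounts to comparing two snapshots of the forwarding database. A minimal sketch, assuming each scan has already been reduced to a set of MAC address strings; the function name and the addresses are hypothetical, and retrieving the table from a bridge is not shown.

```python
def new_stations(previous_scan, current_scan):
    """Return MAC addresses present in the current scan but not the previous one."""
    return set(current_scan) - set(previous_scan)


# Hypothetical snapshots taken at two different times.
monday = {"00:00:0c:11:22:33", "08:00:20:aa:bb:cc"}
tuesday = {"00:00:0c:11:22:33", "08:00:20:aa:bb:cc", "08:00:2b:de:ad:01"}

print(new_stations(monday, tuesday))  # {'08:00:2b:de:ad:01'}
```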
Malamud: So how do we handle this problem now in SNMP version 2?
Case: Basically the paradigm is that one adds more information to the packet until the packet is full. Or until a manager station-specified limit is reached, whichever comes first. And the power of transferring full packets is fairly obvious. Because if you as a result are now able to put as many as forty or fifty rows in a single packet, there are savings in terms of work on both the manager station and agent, savings in terms of bandwidth on the network, and a reduction in terms of the compute cycles on both ends of the system. And for the user, they see it as a tremendous improvement, a tremendous reduction in the latency between the time they ask for a particular table and the time that table is fully retrieved. In some tables we can retrieve that entire table in one packet, for shorter tables.
The speed-up is better if you have fewer columns and those columns tend to be shorter. For example a single integer or a six-byte MAC address, you will have quite a bit more speed-up compared to where each column is a 255-character printable string. The smaller the information item, the better the speed-up.
Malamud: Because the more rows you’re able to pack into a single packet.
Case: The more you can get into a single— Exactly. You understand perfectly.
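The packing effect discussed above can be estimated with a rough calculation. The numbers here are assumptions chosen only for illustration: a 1,400-byte budget for the response payload and 20 bytes of naming and encoding overhead per value. Neither figure comes from the interview or from the specification.

```python
def rows_per_packet(payload_budget, columns, value_bytes, overhead_per_value=20):
    """Estimate how many whole rows fit in one response.

    payload_budget and overhead_per_value are illustrative assumptions,
    not values taken from the SNMP version 2 documents.
    """
    bytes_per_row = columns * (value_bytes + overhead_per_value)
    return payload_budget // bytes_per_row


print(rows_per_packet(1400, 1, 6))    # one six-byte MAC address per row: 53
print(rows_per_packet(1400, 1, 255))  # one 255-character string per row: 5
```

Under these assumptions the short column packs about ten times as many rows per packet as the long string, which is the trend described: the smaller the information item, the better the speed-up.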
Malamud: Well we try to here on Geek of the Week. So you used to have an operator (you still do) called the “powerful GETNEXT operator.” So it sounds like you’ve added a bunch of new operators to the protocol?
Case: Basically there are a few new operators. One new operator is this GETBULK operator, which is a descendant from the powerful GETNEXT operator. We call it the “awesome GETBULK operator.”
Malamud: Now, I thought the powerful GETNEXT operator could do everything. If you read Marshall Rose’s book he was very clear that there’s no need for other new operators. We want to keep the protocol simple. Was there some kind of a tradeoff you had to make here between the simplicity of the protocol and these special-purpose operations like GETBULK?
Case: Oh yes, very much so. In SNMP version 1, I personally argued to have a feature like GETBULK and was unable to get consensus amongst the three coauthors of SNMP version 1. The joke, and it is a joke, is that I was able to get it to be a part of SNMP version 2, but I had to get three different coauthors in order to rewrite it as SNMP version 2 to get this GETBULK in. But some of the people who worked with SNMP version 1 in the design phases have said that boy, it worked out far better than they ever imagined that it could have. Management stations for SNMP version 1 should be written around GETNEXT. Management stations on SNMP version 2 should be written around GETBULK.
There are several other improvements to GETBULK as well. One is that GETBULK will never return a Too Big error, for example, whereas the GETNEXT operator will occasionally do so. If you ask a question where the answer is too large to fit in the response, then with SNMP version 1 GETNEXT (and SNMP version 2 GETNEXT) you get back a Too Big error. A GETBULK will never return a Too Big error. And so it’s just one less exception that a manager station has to program around.
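The difference in behaviour can be shown with a toy model. This sketch treats a response as a list of byte strings and a single size limit, ignoring all real encoding details; `TooBig`, `getnext_response`, and `getbulk_response` are names invented for the illustration.

```python
class TooBig(Exception):
    """Stands in for the Too Big error a GETNEXT response can produce."""


def getnext_response(values, max_size):
    """All-or-nothing: every requested value must fit, or the request fails."""
    if sum(len(v) for v in values) > max_size:
        raise TooBig()
    return list(values)


def getbulk_response(values, max_size):
    """Fill the response until the next value would not fit, then stop."""
    response, used = [], 0
    for value in values:
        if used + len(value) > max_size:
            break
        response.append(value)
        used += len(value)
    return response


values = [b"x" * 10] * 5
print(len(getbulk_response(values, 25)))  # 2: a shorter answer, not an error
```

In this model the manager handling a bulk response only has to continue from the last value it received, rather than catch an error and retry with a smaller request.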
Malamud: You, in SNMP version 1 had a clean slate to work off. There was a great lack of network management in the Internet. With SNMP version 2, you’ve got a very large installed base to contend with. How did you handle the problem of transition of that installed base? Is there a flag day in which everyone’s going to have to run SNMP v2? Are you gonna be able to keep SNMP v1 out there for a long time?
Case: One of the twelve SNMP version 2 documents is the coexistence and transition document that addresses these kinds of issues. Which is a way of concretely saying that these sorts of problems were thought about a great deal, enough that we wrote some things down about it.
At the time that SNMP version 2 was first announced (in those days we were calling it SMP, with SMP being the input to the standards committee and SNMP version 2 being the output of the standards committee), we had implementations of the two basic strategies to let this peaceful coexistence occur. These are both discussed in the document. Marshall Rose implemented the proxy approach and I implemented the bilingual system approach.
Malamud: Could you describe those two approaches?
Case: In one case, in the proxy approach, you have an SNMP version 2 manager, an SNMP version 1 agent, and a third device (we’ll call it a proxy device) which receives an SNMP version 2 query or command and translates that into an SNMP version 1 query or command, forwarding it on to the version 1 agent. When the response comes back it does a similar translation. So, in some ways it’s like a translating gateway, or a protocol converter. There are several names you could give this sort of a thing.
Malamud: Now, if there’s more functionality in SNMP version 2 isn’t it possible you’re asking to do something that can’t be done in a version 1 agent? I mean, how does the proxy handle the mismatch in functionality between the two versions?
Case: Well, a GETBULK can always be downgraded to a GETNEXT of a single row. So in GETBULK we go retrieve rows until the packet is full (and I’m oversimplifying here a bit). And basically when we downgrade that to a GETNEXT, the packet that comes back is perfectly what one would expect; it’s just not quite as full as you might hope for to get the maximum performance. But interoperability does not suffer through this translating gateway, although the performance win does not happen. The performance is not worse, it’s just not better.
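A toy version of that downgrade might look like the following. It is a sketch only: object identifiers are modelled as tuples of integers, the version 1 agent is an in-memory table, and `V1Agent`, `proxy_getbulk`, and `walk` are hypothetical names rather than anything from the coexistence document.

```python
from bisect import bisect_right


class V1Agent:
    """In-memory stand-in for a version 1 agent that only answers GETNEXT."""

    def __init__(self, mib):
        self._mib = dict(mib)
        self._oids = sorted(self._mib)  # tuples sort in OID order

    def getnext(self, oid):
        index = bisect_right(self._oids, oid)
        if index == len(self._oids):
            return None  # nothing after this OID
        next_oid = self._oids[index]
        return next_oid, self._mib[next_oid]


def proxy_getbulk(agent, start_oid, max_repetitions):
    """Answer a bulk request with a single GETNEXT: correct, but less full.

    max_repetitions is accepted and ignored, which is the downgrade.
    """
    result = agent.getnext(start_oid)
    return [] if result is None else [result]


def walk(agent, start_oid, max_repetitions=10):
    """Manager-side loop: keep asking from the last OID received."""
    rows, cursor = [], start_oid
    while True:
        batch = proxy_getbulk(agent, cursor, max_repetitions)
        if not batch:
            return rows
        rows.extend(batch)
        cursor = batch[-1][0]


agent = V1Agent({(1, 3, 6, 1, 1): "a", (1, 3, 6, 1, 2): "b", (1, 3, 6, 2): "c"})
print(walk(agent, (1, 3)))
```

The manager's loop is the same whether a batch holds one row or fifty, so it still retrieves everything through the proxy; it simply needs more round trips.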
The second approach is this “bilingual” approach, and that is where you have a system which for example, a bilingual agent would be one which supports both protocols simultaneously. And if it receives an old packet it processes it according to the old rules and produces an old response. And if it gets a new packet it processes according to the new rules and produces a new response.
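The bilingual idea reduces to dispatching on which kind of packet arrived. In this sketch a message is a plain dictionary with a hypothetical `version` key; how a real agent tells the two message formats apart on the wire is not modelled here.

```python
def handle_old(message):
    """Process by the old rules and produce an old-style response."""
    return {"version": 1, "request_id": message["request_id"], "rules": "old"}


def handle_new(message):
    """Process by the new rules and produce a new-style response."""
    return {"version": 2, "request_id": message["request_id"], "rules": "new"}


HANDLERS = {1: handle_old, 2: handle_new}


def bilingual_agent(message):
    """Answer each request in the same protocol version it was asked in."""
    handler = HANDLERS.get(message["version"])
    if handler is None:
        raise ValueError("unsupported version: %r" % (message["version"],))
    return handler(message)


print(bilingual_agent({"version": 1, "request_id": 7}))
print(bilingual_agent({"version": 2, "request_id": 8}))
```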
I personally have implemented both approaches in the time since we first announced this. I personally prefer the bilingual. It’s one less thing to go wrong. It’s one less thing you have to configure. But both approaches are viable, and either one will work, and a network administrator can select them as they see fit.
Malamud: Are these temporary transition mechanisms, or do you anticipate that SNMP version 1 agents and nodes will be out there for a long time?
Case: Yes, it is temporary and we expect that temporary is on the order of ten years.
Malamud: Now, SNMP version 2 has a lot more functionality and therefore one would expect that the code is bigger? It takes more CPU and memory and resources to be an SNMP version 2‑compliant agent or manager. Is that correct?
Case: Well there are several features that we haven’t mentioned in addition to GETBULK that do add some size. The GETBULK frankly adds only about thirty to forty lines of code in a typical agent implementation. In any agent implementation, we’re focusing on keeping the agent small, because that’s where the pressure is, rather than on the management station.
One of the things that we did is that we tried to make it so that the changes for transitioning from version 1 to version 2, or from version 1 to bilingual, were concentrated where you would have minimum pain. That is, on a typical implementation you have a part that is protocol-dependent and MIB-independent, and another part that is MIB-dependent but protocol-independent. And much of the existing investment that’s out there in the world today is in the MIB-dependent/protocol-independent part. Well that is, we worked very hard to design it so that existing MIB modules and existing code which implements MIBs could continue to be protocol-independent to the maximum extent possible, not require changes in application programming interfaces. And I believe that we’ve been really successful at doing that. And so therefore the changes are in the core part of which there are a relatively smaller number of implementations, and that it concentrates the pain in one place instead of distributing it throughout the entire implementation. So that in order to add SNMP version 2 to an agent implementation, for example, you might need to make changes to the core agent engine, but all of the MIB instrumentation and the software to support those additional MIBs, that investment continues to be preserved.
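The split described here can be pictured as two layers with a narrow interface between them. The sketch below is an assumption about how such a separation might look in code, not a description of any real agent; `SystemMib`, `V1Engine`, and `V2Engine` are invented names, and the values are made up.

```python
class SystemMib:
    """MIB-dependent, protocol-independent: knows the data, not the wire format."""

    def __init__(self):
        self._values = {"sysName": "router-1", "sysUpTime": 12345}

    def get(self, name):
        return self._values[name]


class V1Engine:
    """Protocol-dependent, MIB-independent: version 1 request handling."""

    def __init__(self, mib):
        self.mib = mib

    def handle_get(self, name):
        return {"protocol": "v1", "name": name, "value": self.mib.get(name)}


class V2Engine:
    """A second engine added later; the MIB code above is left untouched."""

    def __init__(self, mib):
        self.mib = mib

    def handle_get(self, name):
        return {"protocol": "v2", "name": name, "value": self.mib.get(name)}


mib = SystemMib()
print(V1Engine(mib).handle_get("sysName"))
print(V2Engine(mib).handle_get("sysName"))
```

Adding `V2Engine` required no change to `SystemMib`, which is the sense in which the change is concentrated in the core engine while the existing MIB instrumentation is preserved.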
In terms of sizes, there are two parts that then are a part of SNMP version 2. That is the changes in the core part and the changes in those MIB modules. Changes in the MIB modules come as a result of just additional MIBs that are supported as a part of SNMP version 2. In version 1 all of the configuration a network administrator needed to do to set up what manager stations are valid and where do I send traps was very implementation-dependent and buried all over the map. In SNMP version 2, that’s all very well-codified, very well-documented, and can be done through SNMP transactions themselves.
However, the work to make that configurable throughout SNMP version 2 has associated with it a fairly large MIB. That’s where most of the size gain comes as a part of SNMP version 2. The configuration of administrative relationships, the configuration of security parameters. The additional size as a result of the security code itself is fairly small. The size as a result of the configuration of the security is where all of the bloat is.
Malamud: When you mention security, what are some of the security features in SNMP version 2, and how does that differ from what was in the original specification?
Case: In SNMP version 1, we had attempted to define some mechanisms for security, but every time we attempted to do so, we couldn’t get agreement. So in SNMP version 1 we proposed that we have a thing called a community string, and a wrapper where we separated the authentication wrapper from the payload, called the protocol data unit. So that at some point in the future, one could make changes to the authentication without changing the payload, the protocol data unit. And that was insightful in 1988 and was used recently to make these changes, to keep those changes fairly well modular and divorced from one another.
The SNMP version 1 stuff was fairly weak in terms of its authentication, and could easily be masqueraded and subject to various kinds of attacks.
Malamud: It was based on the concept of a community string?
Case: That’s correct. The community string is a plain-text piece of the message header that would identify the sender. And if the manager station sent a query or command to the agent, the agent would scan a table and see if there was a match. And if there was then it would process that query or command according to the configuration associated with that community string.
However, anyone who was snooping the network could steal that community string and could eavesdrop on it, and then masquerade as if they were the node that was configured to do that. So it’d be like stealing someone’s username and password by watching the network traffic go by and then using that to make access to that account.
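The table lookup and its weakness can be sketched briefly. The community names and access labels below are hypothetical; the point is only that the check depends on nothing except a string that travels in plain text.

```python
# Hypothetical agent configuration: community string -> permitted access.
COMMUNITIES = {"public": "read-only", "netops": "read-write"}


def access_for(community):
    """Return the access configured for a community string, or None if no match."""
    return COMMUNITIES.get(community)


legitimate = {"community": "netops", "command": "set"}
# An eavesdropper who saw the message above can reuse the same string.
forged = {"community": legitimate["community"], "command": "set"}

print(access_for(legitimate["community"]))  # read-write
print(access_for(forged["community"]))      # read-write: indistinguishable
print(access_for("guess"))                  # None
```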
Malamud: So it was more of a synchronization mechanism, just to make sure you didn’t by accident step on the wrong router or something like that. [crosstalk] I mean you could—
Case: Well it kept the honest people honest but it certainly did nothing for the dishonest people.
In SNMP version 2, there are three levels of security, and these can be configured by the network administrator and can switch from one level to another level on a transaction-by-transaction basis. So those three levels are: No authentication and no privacy; authentication based on the message digest algorithm MD5 with no privacy; and the third level—the top level—is MD5 authentication with privacy based on DES, the Data Encryption Standard.
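As a rough illustration of digest-based authentication, the sketch below hashes a shared secret together with the message using MD5 from Python's standard library. This is a simplification and an assumption: it does not reproduce the message layout or digest computation in the SNMP version 2 documents, and the DES privacy level is only named, not implemented.

```python
import hashlib
import hmac
from enum import Enum


class Level(Enum):
    NO_AUTH_NO_PRIV = 1  # no authentication, no privacy
    AUTH_NO_PRIV = 2     # MD5 authentication, no privacy
    AUTH_PRIV = 3        # MD5 authentication plus DES privacy (not shown)


def digest(secret, payload):
    """Simplified keyed digest: MD5 over the shared secret followed by the payload."""
    return hashlib.md5(secret + payload).digest()


def verify(secret, payload, received_digest):
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(digest(secret, payload), received_digest)


secret = b"shared-secret"          # hypothetical value
payload = b"set ifAdminStatus.3 = down"
tag = digest(secret, payload)

print(verify(secret, payload, tag))                        # True
print(verify(secret, b"set ifAdminStatus.3 = up", tag))    # False
```

Unlike a community string, the secret itself never appears in the message, so an eavesdropper who captures the digest cannot use it to authenticate a different payload.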
These address the various threats that were thought to be important to network administrators. We believe that most network administrators will use no authentication and no privacy for routine monitoring. And that will allow anyone to be able to monitor the network and would not lock anyone out of being able to monitor.
Authentication without privacy we believe will be used by network administrators for control operations: when they want to change a configuration, when they want to perform an SNMP set request, when they want to alter the configuration rather than inspect it.
We believe that in most networks, authentication with privacy will be rarely if ever used, but is there for those network administrators who have applications which require it. For example if you wanted to use SNMP in a system administration environment and you wanted to create a new username and password in a host administration application, you might want to use privacy so that the new password remained secret, that someone eavesdropping on the network would not be able to learn the password of this newly-created account.
But, different sites can have different policies. The protocol supports these three levels, and different sites can enforce different policies. And in fact if one wishes you can use authentication and privacy on all transactions if you wish, or you can use no authentication on any transaction if you wish. That’s totally in the hands of the network administrator. And down on the farm of course we have a saying, “With freedom comes responsibility.” That is, as they make these choices they’ll be responsible for the consequences of making those choices.
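The per-site policy choice described above could be expressed as a table mapping each kind of operation to the minimum security level the site demands. The operation names and both policies below are hypothetical examples, not part of the protocol.

```python
from enum import IntEnum


class Level(IntEnum):
    NONE = 0       # no authentication, no privacy
    AUTH = 1       # authentication, no privacy
    AUTH_PRIV = 2  # authentication with privacy


# One site's choice: open monitoring, authenticated control, private passwords.
RELAXED_POLICY = {"get": Level.NONE, "set": Level.AUTH, "create-account": Level.AUTH_PRIV}

# Another site's choice: authentication and privacy on every transaction.
STRICT_POLICY = {"get": Level.AUTH_PRIV, "set": Level.AUTH_PRIV, "create-account": Level.AUTH_PRIV}


def allowed(operation, offered, policy):
    """Accept a transaction only if it meets the site's minimum level."""
    return offered >= policy[operation]


print(allowed("get", Level.NONE, RELAXED_POLICY))  # True
print(allowed("set", Level.NONE, RELAXED_POLICY))  # False
print(allowed("get", Level.NONE, STRICT_POLICY))   # False
```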
Malamud: You’re using the MD5 message digest algorithm. You’re using DES. Those are two of the basic building blocks of another security environment, Privacy-Enhanced Mail. And there’s a third building block, which is public key cryptography. Do you view the SNMP security mechanisms somehow integrating with PEM and the certification hierarchies that are being built? Or is this just a separate security regime?
Case: I think that it’s a separate security regime. The fact that they use some of the same technology was intentional. I might also add that while SNMP version 2 does not presently call for the use of public key cryptography, the hooks are there. If one wants to add that one is free to do so. That is, the SNMP version 2 framework was made very extensible so that additional authentication types, additional key management types, and additional privacy types could in fact be added to the protocol in a very straightforward fashion, in a very modular fashion. So that if for example in a government application in a super secret network, if they felt that DES was not the appropriate protocol to use for cryptography and they had a different protocol of choice, that can easily be added on a specification level, and we’ve made it in our implementation so that it’s very easy to craft that into the code as well.
Malamud: Does that make any sense? Can you think of a situation where you’d recommend that?
Case: Yes, but I can’t tell you about it because if I did then I’d have to shoot you.
Malamud: Jeff Case, SNMP version 2 was developed by you and three of your colleagues, and was presented to the IETF as the Simple Management Protocol, as “Here is a proposal that we’re making.” And some people looked at that and said gee, you really should have included the community earlier in the specification process. Do you think that there’s a balance here between the efficiency of a design team like the group that you put together for SNMP version 2 and the openness of a standards body like the IETF? Is there a tradeoff there?
Case: There certainly is a tradeoff. The sooner you get the input, the longer it’s going to take to achieve the result. We were in an environment where proposals were invited. We put together the strongest team that we knew how to put together. We in fact invited some other people who for various reasons chose not to participate that could have made the team even stronger. But, we put together the strongest team we could possibly muster, and we prepared a proposal.
We were very concerned that we had to come in with a strong proposal, because we did not want to freeze the market. If there’s a lot of fear, doubt, and uncertainty in the market for an extended period of time, this is not healthy for vendors, this is not healthy for consumers or the technology. So we put together a proposal and presented it. It turned out that that proposal was found to be worthwhile and was used as the basis for the input to the standards community.
The standards community did make some changes. I think they made some good changes. I think they made some bad changes. They deleted some things that I think should’ve been in there that were in their early drafts that didn’t make it through the end of the pipeline. And they added some things that I don’t fully agree with, but the process worked. And the general people that wanted to have input, who wanted to invest the time to participate, had their day. And they could make changes. And what we see now is SNMP version 2. Still not perfect, but much much better as a result of having gone through that open process.
Malamud: Are you beginning work on SNMP version 3 at this point?
Case: [laughs] Uh, no.
Malamud: No. No “but,” just no. [laughs]
Case: Would you like me to expand on that?
Malamud: Yeah, that’s one of those words that has several meanings sometimes. Are we gonna need a new network management framework in ten years? Do we need to add object-oriented this or client-server that? Or do you think we have something that’s gonna last for a long time?
Case: Well, ten years is a long time and I can’t possibly see that far into the future. Will we need something different in ten years, my guess is yes. The question is will the infrastructure that we have be enough to carry that and the new technology will be simply layered on top of that, or will it be a replacement? I have no clue. I have absolutely no clue. But you did poke me with something here that does get me started, and that’s this object-oriented thing.
With SNMP version 1, some of the anti-SNMP bigots would say gosh, SNMP simply can’t be used for network management because it’s not object-oriented. But then the pro-SNMP bigots would say of course this is nonsense. And they’d go back and forth, back and forth. Let me use an example that will maybe help you understand where I stand on that.
If you think about a typical Unix workstation based upon a RISC processor, it has a particular way of accessing memory, a fairly primitive addressing scheme, a way it names where data is stored. It has an address, typically 32 bits wide. And sequential addresses are numbered sequentially. And it has a relatively limited number of data types. You might have characters, you might have short integers, and you might have long integers, and perhaps one or two different floating point formats. No object orientation here. Just a very simple flat addressing space, not nearly as complex as that object identifier tree from SNMP, and a relatively limited number of data types, again not nearly as rich as the data types found in SNMP.
Now. Does this mean that we can’t implement an object-oriented database on that Unix workstation with that RISC processor? Of course not. People do it every day. Does this mean that you can’t implement an object-oriented infrastructure on top of the SNMP management framework? No, it doesn’t mean that. Quite the contrary, of course you can do that. The infrastructure is there. Just get busy. It’s just code. Take as long as you need. Take all weekend.
Malamud: Well there you have it. This has been Geek of the Week and we’ve been talking to Dr. Jeff Case. Down on the farm, he’s got a saying. Thanks a lot Jeff.
Case: Sure. Thank you for having me.
This is Internet Talk Radio, flame of the Internet. You’ve been listening to Geek of the Week. You may copy this program to any medium and change the encoding, but may not alter the data or sell the content. To purchase an audio cassette of this program, send mail to radio@ora.com.
Support for Geek of the Week comes from Sun Microsystems. Sun, The Network is the Computer. Support for Geek of the Week also comes from O’Reilly & Associates, publishers of the Global Network Navigator, your online hypertext magazine. For more information send email to info@gnn.com. Network connectivity for the Internet Multicasting Service is provided by MFS DataNet and by UUNET Technologies.
Executive producer for Geek of the Week is Martin Lucas. Production Manager is James Roland. Rick Dunbar and Curtis Generous are the sysadmins. This is Carl Malamud for the Internet Multicasting Service, town crier to the global village.