Carl Malamud: Internet Talk Radio, flame of the Internet.


Malamud: This is Geek of the Week. We’re talk­ing to Craig Partridge. He’s a senior sci­en­tist at BBN and he’s the author of a new book by Addison-Wesley called Gigabit Networking. Welcome to Geek of the Week, Craig.

Craig Partridge: Thank you, Carl. It’s fun to be here.

Malamud: So this book, Gigabit Networking… Is it any good?

Partridge: Boy. Huh. How does one answer that ques­tion? Uh…it sold out its first print­ing in five weeks. So…either people—

Malamud: But why giga­bit net­work­ing? Why not megabit net­work­ing or ter­abit net­work­ing. I mean why is this a sig­nif­i­cant thing to write a book about?

Partridge: Well, I think the real reason is that that’s the speed that’s comin’ round the corner at us. I mean megabit networking we’ve been doing for a long long time. We know how to do—we’ve been doing it since ’73. I mean we can sort of do it in our sleep. And terabit—true terabit networking I mean, in which a single host gets you know, a terabit per second, is still…I would guess three to four or five years away…

Malamud: So when we say “gigabit” we mean…a host swallowing a gigabit per second.

Partridge: You mean a sin­gle host can swal­low or you know, send a giga­bit per sec­ond. There are peo­ple who have a dif­fer­ent def­i­n­i­tion in which they sort of say well you know, I’ve got a hun­dred hosts, each send­ing at 10 megabits over the same wire, this equals a giga­bit net­work. But since no one host can get more than 10 meg I sort of say no, it’s not a giga­bit. I mean we’ve always talked about band­width in terms of what can a sin­gle node attached get. And I don’t believe we should change those rules now.

Malamud: So what’s the chal­lenge of going from a 10 megabit con­nec­tion, which seems to be pret­ty stan­dard, up to a giga­bit? How does that change the way we do net­work­ing sup­port on a computer?

Partridge: Well…it changes things in weird ways. The first thing it does is it makes the bit short­er. Everyone thinks a giga­bit means you go faster, you know. It’s like an air­plane goes faster or a car goes faster. It’s not like that. What it real­ly means is that we’ve man­aged to pack more bits per unit dis­tance of fiber. So it’s sort of like we make the high­ways wider, or the plane big­ger, and it does­n’t cost any more to fly from here to there but we can car­ry you know, a thou­sand times as much stuff. 

And so it changes all the dynam­ics. I mean it used to be, even on a 10 megabit net­work, a sin­gle bit is a very long thing in a cable. It takes a very long time to get from here to there. Ethernet works on that. Ethernet, the CSMA col­li­sion is based on the notion that you know, when you’re send­ing a bit, the oth­er guys hear it before you fin­ish send­ing it because there’s so many elec­trons you have to put into the cable. It’s not quite like that, they actu­al­ly allow some­thing like 100 bits to be in flight. But the point is there’s a limit.

Malamud: So the min­i­mum pack­et length means that the data stays on the net­work long enough that oth­er peo­ple can hear it on the net and there­fore not send—avoid their collision.

Partridge: Absolutely. And if you try to scale it up to gigabit speeds, you suddenly start finding that the packets become these monstergrams. The minimum packet size is just huge. It’s measured in thousands of bytes, which is just crazy for a minimum packet size. Imagine every click at your window suddenly translating into this Moby packet over your sort of gigabit Ethernet because that’s the smallest packet size you can send.

Malamud: So collision detect, multiple access, the way we do Ethernet’s not going to work in a gigabit world?

Partridge: It’s not going to work in a giga­bit world. If you see what they’ve done even to get to a hun­dred megabits, what they did was you’ve got two choic­es. You can either increase the min­i­mum pack­et size or you can short­en the cable length. And for hundred-megabit Ethernet what they did was short­en the cable length to sort of what 10Base‑T required. If you short­en it again you’re start­ing to talk about cable runs of about you know, ten, fif­teen feet. I mean it’s just not inter­est­ing at giga­bit speed.
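[Editor's note: the two choices described above can be put in rough numbers. What follows is a minimal Python sketch, not anything from the interview. It assumes the classic 10-megabit Ethernet figures (a 512-bit minimum frame and a nominal 2,500-meter maximum span) and an idealized linear scaling with bit rate; the function names are hypothetical.]

```python
# Idealized CSMA/CD scaling. Assumptions (not from the interview):
# 10 Mb/s Ethernet has a 512-bit (64-byte) minimum frame and a nominal
# 2,500 m maximum span; everything else scales linearly with bit rate.

BASE_RATE_BPS = 10_000_000
BASE_MIN_FRAME_BITS = 512
BASE_SPAN_M = 2_500


def min_frame_bytes(rate_bps: int) -> int:
    """Choice 1: keep the cable span and grow the minimum frame."""
    return BASE_MIN_FRAME_BITS * rate_bps // BASE_RATE_BPS // 8


def max_span_m(rate_bps: int) -> float:
    """Choice 2: keep the 64-byte minimum frame and shrink the cable."""
    return BASE_SPAN_M * BASE_RATE_BPS / rate_bps


if __name__ == "__main__":
    for rate in (10_000_000, 100_000_000, 1_000_000_000):
        print(rate, min_frame_bytes(rate), max_span_m(rate))
```

Under these assumptions, a gigabit Ethernet that kept the old span would need a 6,400-byte minimum frame, and one that kept the 64-byte frame would be limited to roughly 25 meters. Real designs carry extra delay budgets, so practical spans would likely come out shorter still.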

Malamud: That sounds like a peripheral bus, not a network.

Partridge: Well, yeah… Of course you know, a lot of people believe that gigabit networking should be an extended peripheral bus. I mean, there are lots of people out there who are running the two types of technologies together. HIPPI’s one, SCI’s another, fiber channel’s a third. It’s a little weird.

Malamud: Why don’t you explain those three technologies? HIPPI, SCI, and fiber channel. Those are three of the main contenders for [crosstalk] the gigabit network of—

Partridge: Yeah. They’re three of the main con­tenders par­tic­u­lar­ly for sort of your local area giga­bit net­work. HIPPI is the high-performance par­al­lel inter­face. And the main thing that makes HIPPI inter­est­ing is it’s here, it’s now, and it’s real­ly real­ly real­ly sim­ple. It’s 800 megabits per sec­ond in its stan­dard form, and there’s a double-wide HIPPI that peo­ple are now mov­ing to, which is also in the stan­dard but no one was using ini­tial­ly, which is sixteen-hun­dred megabits, so 1.6 gigabits. 

Malamud: Now, HIPPI is a circuit-switched— As I understand it, many HIPPI switches are actually physical connections that close when you say “I wanna talk to that host.”

Partridge: Absolutely. What you do is you say to the HIPPI switch, “I want to talk to that outbound port,” and it makes a physical connection, says “I’m ready to talk.” Then you blast a packet through, then you tear down the connection. This is what makes it a LAN technology, because you don’t want to wait, you know, miles upon miles of dead fiber, basically. You’re not doing anything while you’re waiting for this request to go through to link up the switches. But within a— Well, HIPPI was originally designed to work within a computer room. And it’s been extended with something called serial HIPPI so that it can run a few kilometers around a switch. Within that distance you can do very well. Cray shows 700+ megabits per second over an 800 megabit HIPPI channel between two Crays using…you know, full connection setup each packet, tearing it down for each packet. And it you know, uses 802 framing and it’s all pretty straightforward.

Now, you know, the weird thing about it is that at the same time it also supports IPI-3 disk instructions in parallel with your 802 stuff. So you’ve got this network is both an 802 network, and it’s also this sort of…disk access controlling…pipe.

Malamud: So IPI, Intelligent Peripheral Interface.

Partridge: Yeah.

Malamud: And so I can basi­cal­ly put a disk dri­ve far away from a computer?

Partridge: Yeah. Through a HIPPI switch. And it will talk IPI to the disk farm. And it will talk 802 to oth­er net­work­ing devices. And pro­vid­ed you know what you’re doing, that all works.

Malamud: Now this is real. There are [indistinct; crosstalk] switches or are computers using ’em…?

Partridge: This is— Absolutely real. You can go out, you can buy most of the pieces of tech­nol­o­gy for— You can buy HIPPI inter­faces for on the order of a cou­ple thou­sand bucks per for most major hosts. Almost every major ven­dor has a HIPPI inter­face, you can buy HIPPI switch­es from lots of peo­ple that’re you know, sec­ond and third par­ties pro­vid­ing stuff for HIPPI. It’s been around for a cou­ple years, it’s mature, it’s here, it’s now.

Malamud: Okay, what about SCI and fiber chan­nel, then.

Partridge: Well, let’s see. SCI is a new pro­pos­al that I think is still try­ing to fig­ure out where its head­’s at. But the notion is sort of…um, you take a bus pro­to­col and you try to extend it so that it does everything. 

Malamud: Is this long-distance SCSI? Is that what that is? I mean the ini­tials are the same.

Partridge: Yeah, well it has some of that feel but I’ve never been able to get a strai— I mean, I’ve talked to the guys who’re writing the standards and we talk about SCI and they sort of go “Well you know, it can slice and dice and it’s a ginsu knife protocol combining networking and buses.”

And you sort of go, “How does it work?”

And they say, “Well it’s a ginsu knife protocol that does—”

And you say, “Well how does it work?”

And they say “Well, it’s a ginsu knife protocol—” and you start to get a little bored.

Malamud: Reminds me of the Generic Ultimate Protocol—GUP—pro­pos­al.

Partridge: Right.

Malamud: Okay, what about fiber chan­nel, then?

Partridge: Fiber channel… The simplest way to explain fiber channel is fiber channel is “HIPPI spelled IBM.” It’s basically a more complicated version of HIPPI, same data rates, um…you know, supports a few more remote access protocols—I think it does SCSI as well as IPI-3 and 802. It is very close to being real. You can go buy fiber channel switches now, but if you buy fiber channel switches from two vendors they won’t necessarily interoperate with each other, which—

Malamud: Do HIPPI switch­es interoperate?

Partridge: HIPPI switch­es inter­op­er­ate just fine. But you buy two fiber chan­nel switch­es and they won’t. And there are debates about what is the prop­er way to read dif­fer­ent pieces of the spec. And the ink isn’t dry on all the specs yet. They’re sup­posed to dry…you know, some­time in late 1994.

Malamud: Okay. And how fast does fiber chan­nel run?

Partridge: Same as HIPPI, 800 megabits per sec­ond. So that’s why peo­ple say it’s HIPPI spelled IBM. It’s more sophis­ti­cat­ed, it’s got some more things, IBM’s the big pro­po­nent. Um, but in terms of tech­nol­o­gy for net­work­ing that you get when you buy fiber chan­nel you’re not buy­ing a faster data rate or any­thing that you’re not get­ting basi­cal­ly with HIPPI.


Malamud: Craig Partridge, we’ve talked about HIPPI, we’ve talked about SCI and fiber chan­nel. You haven’t men­tioned ATM. Is ATM a can­di­date for the giga­bit LAN?

Partridge: Well certainly people think it is. And I think that’s probably fair. One of— I mean, ATM’s basic technology is it’s a switch technology. It’s a switch technology like HIPPI’s a switch technology. So if you think HIPPI’s a candidate, ATM’s obviously a candidate, too. The major merit of ATM is that it scales beautifully over a range of speeds. So if you want to plug in one host at say 55 megabits per second and another one at 155 which is SONET OC-3, if you want another one at 2.4 gigabits, you can plug ’em all in, they can all talk ATM. You just pull one— You know, it all looks— You can even make the interface look the same to the host. You just plug in one at a higher speed and a little better [indistinct] and off you go. And the theory is that the ATM switch in the middle will be able to connect up at all these different speeds and will work just beautifully.

And…that’s a the­o­ry. In real­i­ty right now you can go out and you can buy ATM switch­es. Much like fiber chan­nel they won’t inter­op­er­ate quite yet, though they are clos­er to inter­op­er­at­ing I think than are fiber chan­nel switch­es, since it’s at this point a small mat­ter of soft­ware not hard­ware issues that are hold­ing them up.

And right now the fastest you can go on any one of them is 155 megabits per sec­ond. So it’s not a giga­bit yet. So if you want­ed to buy a giga­bit today you’re going to have to look at HIPPI or fiber chan­nel. But, anoth­er three, four years you’ll find your­self prob­a­bly look­ing at high-speed ATM inter­faces at a gigabit.

Malamud: Now what about wide-area net­works? Are we even think­ing about wide-area giga­bit net­works now?

Partridge: Oh sure. Thinking about ’em a lot. Um…limited choices. The basic answer is SONET. And that’s it. SONET is the Synchronous Optical Network. It is a telephony standard for how you multiplex bits over a wire, okay. And essentially that’s what everyone’s using because everyone wants to basically be providing you know, SONET connectivity for the phone company.

Now while it’s called “Synchronous Optical Network” it’s actually part of a broader suite of protocols called the Synchronous Digital Hierarchy. And…if you really want to talk about details, there are a few slight differences between SDH, broad umbrella, and SONET as the particular instantiation of the SDH. But for practical purposes you can basically think of them as the same.

And so you can think in terms of microwave SONET. There’s a 2.4 gigabit microwave link in northern New Jersey that AT&T’s been testing over about a 40 kilometer distance.

Malamud: So SONET is the inter­face to that very fast fiber, just like DS3 might be the inter­face to a T3 line—

Partridge: Absolutely. Same deal, same purpose. SONET contains all the stuff to keep everything plesiochronous and all that messy stuff that we have to worry about— You know. I mean there’s this basic problem which is that fiber, or any media in the phone network, if exposed to warm temperatures gets longer. And you’re clocking bits in at one end, you’re clocking bits out at the other end, but the fiber got longer and so the clocking can get out of sync. And so you need some protocol that restores clocking and deals with the skew. That’s what SONET basically does and it allows you to add and drop lines at various speeds.

Malamud: Would a com­put­er talk SONET, or would a com­put­er talk some oth­er pro­to­col which would talk SONET?

Partridge: Computer would talk SONET as the bit-level framing…bit-level protocol—you know, the thing that just marks the bits, okay. And actu­al­ly SONET pro­vides a lit­tle fram­ing around the bits, too. SONET is a framed-bit protocol. 

And then what you have to do, though, to do almost anything useful is you have to put something on top of SONET. Now, right now there are only like three serious proposals for things put on top of SONET. Two of which I sus— Well, at least one of which probably will go away over time. One is HIPPI over SONET. There are rules for putting HIPPI frames in SONET frames and shipping them over the wire. And this is largely because if you’ve got a bunch of HIPPI networks and you want to connect them up over long distances the only thing you can do is lease a SONET line, so you’ve gotta put HIPPI over SONET.

Malamud: And is that being done? [crosstalk] People are doing that.

Partridge: It’s being done. Yeah, you can go buy HIPPI over SONET adapters now. I think there are like two or three dif­fer­ent pro­pos­als for it. So you may get a dif­fer­ent one from dif­fer­ent ven­dors. But you can do it.

Another thing is a pro­pos­al for PPP over SONET, which is real­ly sort of a wild idea. But basi­cal­ly, you know, you do the equiv­a­lent of dial­ing up the SONET link…

Malamud: What about SLIP?

Partridge: Uh, no one has pro­posed SLIP over SONET. The only thing we’ve seen so far is PPP over SONET. There is a pro­pos­al, there’s appar­ent­ly some group fab­bing a chip to see how actu­al­ly it would turn out. 

And then the third thing, which almost everybody does, is ATM over SONET. And you put your ATM cells in your SONET frame. And by the way, SONET’s what makes ATM scalable. Everyone says “ATM, gorgeously scalable.” Basically what happened was…the ATM folks said well here’s a way to put ATM into SONET. And they’d already solved all the scaling issues for SONET. So once you say ATM, SONET, and a speed for the SONET line, you know how to do it. It’s easy, it’s trivial.

Malamud: Why does SONET scale?

Partridge: SONET scales because what they did was they framed everything. They said that you know, basically what SONET does is it sends out groups of bits in chunks called “frames.” And basically one frame comes every 125 microseconds. Surprise. Okay. And…at OC-1, the data rate equals 55 megabits per second. So you can sort of figure out how big the frame is.

Now, what you can do is you can define a rule that says there are these things called…they’re actually called “frames” again which is sort of confusing, let’s call them superframes—in which basically what you do is you say okay, we’re still sending frames at 125 microseconds but we’re gonna send you a multiple of the number of frames we send you at say, 55 megabits, so at OC-1.

OC-3, which is what everyone does for 155 megabits is OC-1 (55 megabits), an OC-1 frame multiplied three times and sent in 125 microseconds. So you send three frames at 125 microseconds. And that gives you the higher speed. You go to OC-192, I’m sending you 192 SONET frames in a 125-microsecond shot.

And there are two ways to handle those frames. One is to handle them as a bunch of separate 55-megabit frames, okay. The other one is to concatenate them. So they’re treated as sort of one superframe of the three frames or 192 frames smashed together. And that’s called Concatenated SONET. And that’s why you hear people talking about OC-3, which is simply three 55-megabit channels, or OC-3c, okay, which is concatenated in 155.
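[Editor's note: the frame arithmetic above can be checked with a short Python sketch. It is an illustration only, not part of the interview. It assumes the standard SONET frame layout (9 rows by 90 columns, 810 bytes per OC-1 frame), under which OC-1 works out to 51.84 megabits per second; the "55 megabits" in the conversation appears to be a spoken approximation. The function name is hypothetical.]

```python
# SONET line rates from the frame structure. Assumption (standard figures,
# not quoted in the interview): an OC-1 frame is 9 rows x 90 columns
# = 810 bytes, and one frame is sent every 125 microseconds.

FRAME_BYTES_OC1 = 9 * 90
FRAMES_PER_SECOND = 8_000  # one frame every 125 microseconds


def oc_rate_bps(n: int) -> int:
    """Line rate of OC-n: n OC-1 frames in every 125-microsecond slot."""
    return n * FRAME_BYTES_OC1 * 8 * FRAMES_PER_SECOND


if __name__ == "__main__":
    for n in (1, 3, 48, 192):
        print(f"OC-{n}: {oc_rate_bps(n) / 1e6:.2f} Mb/s")
```

With these figures OC-3 comes to 155.52 megabits per second and OC-48 to about 2.49 gigabits per second, which matches the "155" and "2.4 gigabit" rates mentioned elsewhere in the conversation. The same calculation applies whether the frames are carried separately or concatenated; concatenation changes how the payload is treated, not the line rate.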

Malamud: So that scales between any one point and any oth­er point. What is it about SONET that would make it scale as a glob­al high-speed com­mu­ni­ca­tion system?

Partridge: What makes it scale as a global high-speed communication system is again this framing which allows you to multiplex lower-speed lines into larger-speed lines and add/drop particular lines cleanly. And you look inside the phone network technology today, it’s extremely tricky. Those multiplexers they have are a very messy, tricky beast to actually multiplex lines and remultiplex them again. And SONET said “We’re not going to play those fancy bit-stuffing games and all this wild stuff. We’re going to think of everything as frames and we can demultiplex and multiplex in frames at any SONET muxing system you want.” So if you want an OC-3c line, and I want OC-24, okay, there’s a way to take my OC-24 line and your OC-3 line and multiplex them into a larger SONET line trivially through a multiplexer. And even…you know, if you decide next week you want more speed, turn up the speed of em link and we’re sharing a link, and it’ll all multiplex it together fine, okay. But it’s not a packet protocol, it’s a multiplexing protocol. So you can do anything you could do with a multiplexer.


Malamud: Craig Partridge, um…I had a rule of thumb when I was doing con­sult­ing. If I had a 1-MIP machine, I need­ed at least a megabyte of RAM for a one-megabit wide-area con­nec­tion. That was the old VAX, if you would think about it. And then work­sta­tions came in and we had 10-megabit links and you need­ed at least an order of mag­ni­tude more RAM, 16 megabytes. And it was a multi-MIP machine, let’s say 10 MIPS. Now we’re look­ing at a gigabit-per-second net­work. Am I gonna need a giga­byte of RAM? Am I gonna need a megaflop machine in order to be able to keep up with all that data com­ing in?

Partridge: Well yes and no. Let’s step back for a moment. The basic rule you’re describ­ing is Amdahl’s old rule of thumb, which said that for every instruc­tion you need a bit of I/O and a byte of mem­o­ry. Okay, well…that rule’s held up pret­ty well. And sure, in the near future you’re going to have a machine with…one BIP, a 1 bil­lion instruc­tion per sec­ond proces­sor, and you’re gonna need one giga­byte of main mem­o­ry, sure. And then we’re gonna plug into a 1 giga­bit per sec­ond net­work link. And I don’t think that that’s a terribly…scary thing to think about. I mean you know, we’re already see­ing peo­ple with 300 mega­hertz proces­sors com­ing out, and the speeds are goin’ up. And 300 mega­hertz, if you take the usu­al proces­sor scal­ing rate…we’re gonna be 1 BIP very short­ly. 1997 is not a crazy tar­get date to think of.

And by that time a giga­byte of mem­o­ry won’t look so crazy. I mean, mem­o­ry prices also scaled up rea­son­ably nice­ly. Memory speeds haven’t, and that’s a pain. Trying to get stuff in and out of your mem­o­ry keeps get­ting harder. 

And a gigabit network interface, I don’t find that particularly scary, either. A gigabit network interface says that you’re moving data around at the interface, in silicon. On 32 bit wide paths, that’s a little over 30 megahertz. Well, I mean 30 megahertz boards are not exactly rare in this world already. In fact, if you take a look at some fairly standard processors today and you look at the amount of data that’s going through them per second, it’s well over a gigabit, in your workstation and probably in you know, your portable laptop. It’s actually probably moving at the core right around the CPU, a gigabit per second through the data paths already. And so I don’t find it at all very scary to think about those numbers.
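[Editor's note: the "little over 30 megahertz" figure follows from dividing the bit rate by the width of the data path. A minimal Python sketch of that arithmetic, assuming one full-width transfer per clock cycle (an idealization; the function name is hypothetical):]

```python
def path_clock_hz(rate_bps: int, path_width_bits: int) -> float:
    """Clock rate needed to sustain rate_bps on a parallel data path,
    assuming one full-width transfer per clock cycle."""
    return rate_bps / path_width_bits


if __name__ == "__main__":
    # One gigabit per second over a 32-bit path.
    print(path_clock_hz(1_000_000_000, 32) / 1e6, "MHz")
```

For a gigabit per second on a 32-bit path this gives 31.25 megahertz; a 64-bit path would halve that.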

Now, people do. I mean, when I started teaching courses on gigabit networking in 1990 I had people send in reviews that said “You’re a crackpot, every week talking about a 1 BIP workstation.” Now they sorta look at me and say “When?” And you know, another few years they’re going to say “How much?”

Malamud: Or at least say let’s put you under non-disclosure.

Partridge: That’s right, yeah. It’s just not…

Malamud: So what’re we gonna do on these machines? Is this just going to mean more…is it just peo­ple like me that’re caus­ing the need for this, peo­ple that’re spit­ting out larg­er and larg­er amounts of data? Or will we do some­thing fun­da­men­tal­ly dif­fer­ent on our machines?

Partridge: Well you’re a prime time generator. And sure I mean, people like you will be part of it. But of course…I mean, that’s important. I mean I’m hardly gonna argue that we don’t want to be able to put more multimedia-type stuff on the network. And multimedia costs. I mean a single HDTV connection is a 20-megabit connection. Okay, well you know, you’re doing a video conference with three or four sites and you want all the data on your screen, and…well you know, you just blew 100 meg right there on the video. And you know, the audio you can forget. I mean the audio’s not such a big deal.

But easily you can chew it up in video. If you want to do virtual reality it gets pretty scary. Those guys are talking millions and millions or hundreds of millions of polygons updated per second. And you know, every polygon requires so many bytes of description being sent over the network. Well I mean, I’ve heard estimates of up to nearly 1 billion polygons per second to give you a real high-fidelity virtual reality environment, particularly some of the complicated ones. And you know, I don’t see any way that 1 billion polygons is going to be any less than several billion bits per second. So several gigabits per second, just to do virtual reality.

Now, you may say well you know, is virtual reality the thing that’s going to drive it? Lots of things are going to drive it. I mean, other things to think about is your poor workstation. It’s sitting there. It’s you know, got this tremendously powerful processor doing one BIPS per second. And it page faults. Okay. Well, we’ve known for a long time that if you’ve got a really fast box and you want to make it really fast you better make sure that the data path all the way to the file server is fast enough to feed you the data back for your page fault or whatever. Or you’re never gonna see it. I mean it’s why people a few years ago used to all log into the file servers. Remember that era when your system manager would come down and say, “Stop logging into the file server!” because everyone was logging in. Why? Because the network path wasn’t optimized, and so as a result the data coming off the disk was available to you much much faster on the file server than it was on the remote client. And you said well I don’t want to sit here going slow, and so you’d log into the file server. And that would drive everyone else’s performance down because they’re competing with you the local pig on the file server against their clients, and so they’d all log in and everyone would log in, and you’d get terrible performance on the file server. Well you know, it’s because the data paths weren’t optimized. And if you do the data path optimization, it’s pretty clear you’re gonna need a gigabit link between your file servers and your clients.


Malamud: We’ve talked data links and things like SONET. We’ve talked applications, things like file servers and virtual reality. Is there something in the middle that’s gonna have to change? Is TCP/IP gonna work or are we gonna have to all move towards an OSI platform to do gigabit networking? Is there something in the guts of the network stack that is gonna have to adjust at those speeds?

Partridge: A whole lot and very lit­tle at the same time. So let me see if I can sort of answer the two parts. 

Things that don’t have to change. IP does­n’t have to change very much. If you just want to move data at a giga­bit per sec­ond IP works just fine today. 

Malamud: Do we need a big­ger pack­et size or something?

Partridge: No, not particularly. And you don’t need a fixed packet size, either. Lots of people say well you know, we’re gonna have to make the IP packet size really narrowly-bounded to do isochronous traffic, isochronous stuff being you know, data with strict timing requirements. Not true. And in fact, one of the things that’s true is as you get to higher speeds all these timing problems get so easy because the network’s moving the data so fast. You know, you want to synchronize to a millisecond, sure that’s a snap. Synchronizing to the millisecond’s very easy when you’re moving data this fast; a millisecond’s a megabit. A megabit’s a whole lotta data. We’re not going to send megabit packets around. So you know, you don’t have to worry about interference as much because there are lots of chances for each packet in a time to meet your millisecond requirement.
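[Editor's note: "a millisecond's a megabit" is just the bit rate multiplied by the interval. The Python sketch below illustrates it and counts how many packets fit in one millisecond. The 1,500-byte packet size is an assumption chosen for illustration, not a figure from the interview, and the function names are hypothetical.]

```python
def bits_in_interval(rate_bps: int, interval_us: int) -> int:
    """Bits a link at rate_bps carries in interval_us microseconds."""
    return rate_bps * interval_us // 1_000_000


def packets_in_interval(rate_bps: int, interval_us: int, packet_bytes: int) -> int:
    """Whole packets of packet_bytes that fit in the interval."""
    return bits_in_interval(rate_bps, interval_us) // (packet_bytes * 8)


if __name__ == "__main__":
    PACKET_BYTES = 1_500  # assumed packet size, for illustration only
    for rate in (10_000_000, 1_000_000_000):
        print(rate,
              bits_in_interval(rate, 1_000),
              packets_in_interval(rate, 1_000, PACKET_BYTES))
```

Under that assumption a gigabit link moves 83 full packets in a millisecond, while a 10-megabit link cannot finish even one, which is the sense in which timing gets easier as the link gets faster.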

We will have to do a lit­tle bit to man­age our queues a bit. Because in giga­bit net­works nec­es­sar­i­ly the queues scale up like the mem­o­ry scales up and every­thing else. So we’ll have to do a lit­tle more fan­cy queu­ing if we want to be allowed to for exam­ple deliv­er Internet Talk Radio in real-time over large parts of the giga­bit Internet. But that all ough­ta come along pret­ty eas­i­ly, and in fact there are peo­ple build­ing exper­i­men­tal giga­bit routers today. Bell Labs has one they wrote up about a year ago. 

TCP’s a trickier problem. We know TCP can go at a gigabit. If you go buy Cray’s TCP right now, plug it in, you will go at a gigabit per second. This is not hard. The trickiness is…scaling problems. And let me see if I can explain that very very briefly.

In the Internet today what we assume when we start up a TCP connection is that we have no clue about how much bandwidth is available to us and that for example a connection between you and me, you may have you know, 100 gigabit per second link but I may be at the tail end of a 9.6 kilobit link. And so your TCP when it starts sending is very conservative and sends one packet, sees how long it takes for that to come back, then says well gee, it got through, I’ll send two packets. And so it scales up until it starts to feel like it’s overdriving the link, and then you fall back and you try again. And this is called “slow start.”

Well the prob­lem is that in a giga­bit Internet, there are still going to be peo­ple at the end of 9.6 kilo­bit links. And so there’s this dual prob­lem which is that start­ing up with send­ing one pack­et over a giga­bit link, wait­ing for the ACK to come back then send­ing two pack­ets, and then say send­ing four and so on takes a huge amount of time to scale up—several sec­onds or more, right. On the oth­er hand if you assumed it was a giga­bit link to start with and launched all the data, fwoom, okay, then I’m going to be wait­ing sev­er­al min­utes or hours at the end of my 9.6 kilo­bit link to have it clear out because you just dumped far more data into the path than I can cope with. 
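[Editor's note: both halves of this dilemma can be estimated with a small Python sketch. It is a simplified model, not a description of any real TCP implementation: the window doubles once per round trip with no loss, and the 60-millisecond round-trip time and 1,500-byte packets are assumptions chosen for illustration. The function names are hypothetical.]

```python
def slow_start_rounds(rate_bps: int, rtt_ms: int, packet_bytes: int) -> int:
    """Round trips for a window that starts at one packet and doubles
    each round trip to cover the path's bandwidth-delay product."""
    bdp_bits = rate_bps * rtt_ms // 1000
    target_packets = -(-bdp_bits // (packet_bytes * 8))  # ceiling division
    window, rounds = 1, 0
    while window < target_packets:
        window *= 2
        rounds += 1
    return rounds


def drain_seconds(burst_bits: int, bottleneck_bps: int) -> float:
    """Time for a slow link to clear a burst dumped into the path."""
    return burst_bits / bottleneck_bps


if __name__ == "__main__":
    RATE, RTT_MS, PKT = 1_000_000_000, 60, 1_500  # assumed values
    rounds = slow_start_rounds(RATE, RTT_MS, PKT)
    print("ramp-up:", rounds, "round trips,", rounds * RTT_MS, "ms")
    # Dumping one round trip's worth of gigabit data onto a 9.6 kb/s tail:
    print("drain:", drain_seconds(RATE * RTT_MS // 1000, 9_600), "s")
```

With these assumed numbers the ramp-up takes 13 round trips (780 milliseconds) before the sender fills the pipe, and longer round trips stretch that proportionally. Going the other way, a single round trip's worth of gigabit data (60 million bits) would take 6,250 seconds, well over an hour and a half, to drain through a 9.6-kilobit link.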

And the best guess is the way we’re going to solve this problem is we’re going to find ways to give you more information in the routing protocol. So you as a host, when you’re starting up will say to your nearby router, “Hey, psst. Could you tell me how much bandwidth there is between me and Craig?”

And it will say, “Well, we know the slowest link is a 9.6 kilobit link.”

And you go, “Okay, right. We’ll do the nice simple feed.” But you say, “Psst. What’s the data rate between say me and Barry Shine[sp?]?”

And it comes back and says, “Oh. Yeah. No problem. 10 gigabits all the way straight through.” Then you’ll fire up your TCP and your TCP window will start big and launch lots of data in to start. And you’ll sort of accommodate yourself around a 10 gigabit path.


Malamud: Craig Partridge, we spend a lot of time wor­ry­ing about mak­ing a user be able to use a bil­lion bits per sec­ond. Should we be wor­ry­ing more about get­ting a bil­lion users on the net­work? Should we be wor­ry­ing less about feed­ing a few super­com­put­ers and more about things like uni­ver­sal access? Or are the two not incompatible?

Partridge: I don’t think they’re incom­pat­i­ble. Uh… Let’s put it this way. The cost of fiber, okay, is drop­ping dra­mat­i­cal­ly. It is now get­ting to the point at which if you were think­ing of cabling up an office build­ing, the cost of fiber and the cost of say grade 5 twist­ed pair is about the same, as is the instal­la­tion cost. Okay. What that says is that if you wire up a build­ing, you’re actu­al­ly gonna fiber it up. And that means that every­body’s desk­top gets a gigabit-capable link basi­cal­ly for free. You know, here is this link that’s com­plete­ly pre­pared to send at one giga­bit per sec­ond, all you got­ta do is put the elec­tron­ics at both ends. 

Well, the elec­tron­ics is almost here. I mean, HIPPI inter­faces don’t cost very much, and oth­er higher-speed inter­faces will come along short­ly. So every office will short­ly have a giga­bit, and it’s just gonna come for free. So…yes we want to wire up more peo­ple. I think that’s impor­tant in terms of the…sort of glob­al vil­lage that every­one is sort of now dream­ing of and which…god help me, when the Internet was [chuck­les] thir­ty nets was not quite what was envisioned.

Malamud: It’s grown a lit­tle bigger.

Partridge: Yeah. Well, it’s very strange to some­one who basi­cal­ly got on at the moment of the ARPANET/MILNET split and then to watch it all boom. But, at any rate, yes we want it to boom. But as we give it to more peo­ple we have to upgrade the back­bone. So back­bone speeds have to soar. I mean you know, to take the tele­phone net­work where every­one gets no bet­ter than a 64 kilo­bit link, the back­bones of most major phone net­works these days are giga­bit links. The Internet’s going to need the same. So in the back­bone we clear­ly need it. 

And the problem is at the edges we’re also delivering gigabits to people now, or will be very shortly. It’s just the costs are coming down so sharply everyone’s gonna have gigabits in their office. They might not have it in their home yet. I would guess you won’t get it in your home until after the year 2000. But you’ll have it in your office by late 1997, 1998 at the latest.

Malamud: With computers a funny thing happened. As they got more powerful, they didn’t get harder to use, they got easier to use. Will networks get easier to use as we get more bandwidth?

Partridge: Um… That’s an interesting question. I think in fact that the problem…and maybe this just shows that I’m myopic. I think the real answer is that we’re gonna get to throw a little more software at making it easier to use a network from the user’s point of view. But the network itself doesn’t care, it just moves these packets around and you know, the user never really sees that. And I mean, your question really just gave me this vision of sort of a nice friendly packet flying by and I’m not quite sure [laughs] how you’d make a packet friendlier. But it’s clear that we can spend more time making our software packages more network-friendly. And there’s no inconsistency there. I mean people say well but you know, how do we make ’em friendly and fast. Well, most of our systems software right now for networking is very poorly-tuned and very slow, and we can speed it up dramatically just by going in and tuning it a bit, and that gives us some extra compute cycles back to make it more friendly and more easy to use.


Malamud: You’ve been lis­ten­ing to Geek of the Week, a pro­duc­tion of the Internet Multicasting Service. To pur­chase an audio cas­sette of this pro­gram, send mail to audio@​ora.​com. You may copy this file and change the encod­ing for­mat, but may not resell the con­tent or make a deriv­a­tive work. 

Support for Geek of the Week comes from Sun Microsystems. Sun, mak­ers of open sys­tem solu­tions for open minds. Support for Geek of the Week also comes from O’Reilly & Associates. O’Reilly & Associates, pub­lish­ers of the Global Network Navigator. Send mail to info@​gnn.​com for more infor­ma­tion. Additional sup­port is pro­vid­ed by HarperCollins and Pearsall. Network con­nec­tiv­i­ty for the Internet Multicasting Service is pro­vid­ed by UUNET Technologies, and MFS DataNet.

Geek of the Week is pro­duced by Martin Lucas, and fea­tures Tungsten Macaque, our house band. This is Carl Malamud for the Internet Multicasting Service, flame of the Internet.