NetStalgia

Ivan’s recent, very interesting post on LAN Data Link addressing takes me back.  Specifically, footnote #1, referring to “ThickNet” Ethernet:  “The coaxial cable had to be bright yellow”.  In the US, at least, we also used to call the stuff “Frozen Yellow Garden Hose”, for obvious reasons.

The original Ethernet physical medium was rather interesting.  I’ve often talked about my love of layer 1, but I got into the business in the twisted-pair era.  I did do a bit of ThinNet networking, but only a little.  And I had one encounter with ThickNet.  Twisted pair is fun to work with and easy to manage.  Coax is not.

The original “ThickNet” Ethernet used, as the name implied, a fairly thick coaxial cable.  The cable had to be physically routed in a path which put it in proximity to the stations that would use it.  Connections to the cable were originally made with a “tap”, requiring physical piercing of the cable to reach the center conductor.  A drop cable then extended down from the tap to the station using it.  Messy, and inconvenient.

“ThinNet” used a substantially smaller cable, and it was far more flexible.  Think about something close to your TV cable coax, but even more flexible, and using BNC connectors.  As with ThickNet, the routing of the cable was tricky, but instead of physical taps we used T-style connectors to add stations into the network.  My very first Ethernet network consisted of a ThinNet segment tied into a 10Base-T twisted pair segment with an adapter.

I’ve written a few times about my first full-time networking job at the San Francisco Chronicle.  Back when I worked there, people actually read the paper, and its printing and distribution was a big deal.  You may remember that after the 9/11 attacks, several news organizations and government offices in the United States were mailed powdered anthrax.  These events prompted our management to set up a “disaster recovery” site, where we would have enough equipment to produce the newspaper in the event our main building was incapacitated.

For a reason I never understood, they chose a building on the same block.  Behind the Chronicle building were a number of alleys, and the company owned a building back there.  It made no sense at all.  If an earthquake hit, and the main building was destroyed, presumably the building across the street would be too.  If anthrax were mailed to the main building, I assume the entire block would be shut down.  But anyways…

We set up this empty building with computers and phones, and added a frame relay connection so it could connect to the WAN if the main building were out.  But we still wanted a high speed connection for file transfers and backups, so I was tasked with somehow getting a faster connection to the HQ.  I called our cabling contractor, who also resold several wireless building-to-building alternatives, but they were all expensive.  Then I made a discovery.

In the basement of this historic building was our MDF, the main phone equipment and wiring room.  There was a large distribution frame for phone wiring to the entire building, three or four racks of phone equipment, a desk that looked like it was made by prisoners, and a grumpy phone lady who sat in there.  One day I also noticed a pipe with a tag hanging off of it.  The tag had the address of the disaster recovery building on it, and a disconnected piece of frozen yellow garden hose hung down from the pipe.  Could it be?

I went outside and saw a conduit running from the top of the HQ building to the disaster recovery site.  By then nobody was using ThickNet anymore, but 10 Mbps would be more than sufficient for this DR site, which would probably never be fully staffed.

One of my fellow engineers (who had worked at the paper forever) had a lab packed full of all kinds of esoteric electronic equipment.  I went spelunking in the dingy room and found two ThickNet transceivers.  Thankfully the ThickNet was terminated on both ends with connectors, so I didn’t need to drill and tap.  I hooked up the transceivers, connected the RJ45 ports to a nearby switch on each end, and then tested the connection.  It worked!  I had no idea how many years that frozen yellow hose had been in place, but it took one problem off my plate pretty easily.

The “DR” building sat unused for several years, but with keys to the building it made a nice place to take a nap occasionally.  The site was decommissioned eventually.  I wouldn’t be surprised if the frozen yellow garden hose is still in the conduit to this day.

A lot more detail on ThickNet can be found at Matt’s Tech Pages here.

The first company where I worked as a “systems administrator” had no Internet connectivity at all when I started.  By the time I left, I had installed an analog phone line which was shared amongst several users with modems for dial-up service.  The connectivity options in 1995 were limited, and very expensive.  Our company operated on a shoestring budget and could not afford the costly dedicated service offerings from our ISP.

When I moved to a “consulting” company, I finally had the opportunity to work with real dedicated Internet service.  For the customers I worked with, we had two main options:  ISDN and T1 lines.

ISDN stood for Integrated Services Digital Network.  It came in two major flavors, the Basic Rate Interface (BRI) and the larger Primary Rate Interface (PRI), but we exclusively used the lower-end BRI.  ISDN was a digital phone line, and like its analog counterpart it required dialing a phone number.  The BRI had two data (B) channels of 64 Kbps each, but we usually bonded them together for a combined 128 Kbps.  At the time, this was quite speedy, more than double the speed of the modems we had, and our smaller customers loved ISDN.  Because it was a dial-up technology, however, per-minute rates applied.  This meant the line would time out and disconnect periodically to save costs.  When the line was down and an outbound packet arrived at the router with the ISDN interface, the router would dial the ISP again.  Call setup was much faster than with analog modems, but it still added latency, which was annoying.
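
If it helps to picture the dial-on-demand behavior, here’s a rough sketch in Python.  It’s a hypothetical illustration of the idea, not any router’s actual implementation; the class, the names, and the 120-second idle timer are all mine.

```python
import time

# Sketch of dial-on-demand behavior on an ISDN BRI: dial when traffic arrives
# and the line is down, hang up after an idle timeout to keep per-minute charges down.
IDLE_TIMEOUT = 120  # seconds; illustrative value only

class IsdnLink:
    def __init__(self):
        self.up = False
        self.last_traffic = 0.0

    def send(self, packet):
        if not self.up:
            self.dial()  # adds a second or two of latency compared to an always-on line
        self.last_traffic = time.time()
        # ... forward the packet over the two bonded 64 Kbps B channels ...

    def dial(self):
        print("outbound traffic arrived, dialing the ISP...")
        self.up = True

    def idle_check(self):
        # Called periodically; tears the call down once the line has been quiet long enough.
        if self.up and time.time() - self.last_traffic > IDLE_TIMEOUT:
            print("idle timeout reached, disconnecting")
            self.up = False

link = IsdnLink()
link.send(b"first packet of the morning")  # triggers a dial, since the line starts down
```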

I don’t know if other regions did it, but we had a local hack to get around this.  Understanding the hack requires a little background on the phone systems of the time.  Two kinds of business phone systems were typically in use:  PBXs and key systems.  With a key system, every phone had an extension number, but no direct dial into it.  If you wanted to speak to the person at extension 302, you dialed the main phone number for the business, and either asked the receptionist to connect you, or else an automated system did it for you.  For outbound dialing, the user would either lift the handset and select an unused line from the pool of lines available, or perhaps dial 9 for the key system to connect them to the next available line.  PBXs, on the other hand, were used by large companies, gave each user their own phone line and allowed inter-office extension-to-extension calling, as well as direct dial from the outside world.  If my extension was 3202, I would have a direct dial phone number of, say, 415-555-3202.

Some companies instead opted for the phone company to do their internal switching.  This was known as a Centrex service.  The phone company provided hard wired analog phone lines to the customer, but enabled extension-to-extension direct dial.  Thus, if I was at extension 3202 and I needed to dial extension 3203, I could pick up the phone and just dial the four digits.  The phone company took care of routing it.

What does this have to do with ISDN?  We used to order Centrex service for our customers in the same Centrex group as their ISP.  Thus, the customer’s ISDN line became an “extension” of the ISP’s Centrex group.  Not only could the customer then dial the ISP with four digits (not a big deal when the router is doing the dialing), but there were no toll charges on Centrex lines.  We used to nail the line up so it would never disconnect, and if it dropped for any reason, it would auto-redial.  And then we had dedicated Internet service on a dial-up line!

T1s (E1s elsewhere, at 2.048 Mbps) ran at 1.544 Mbps, blazing fast at the time.  Unlike the single-pair ISDN line, T1s were delivered on four wires, two for TX and two for RX.  I won’t get into the details of line coding on T1s, which we all studied as junior network engineers.  T1 lines were truly dedicated, and provided a point-to-point connection from customer to ISP.  They were distance-priced, but I worked in San Francisco, which is a small city, so it wasn’t usually a factor.  Because 1.544 Mbps was expensive for some customers, we had the option of ordering fractional T1s: fewer channels at a slower speed, but still faster than ISDN.  In the early days we had to terminate the T1 on an external CSU/DSU device and then run a serial cable to the router, but eventually the CSU/DSU came integrated on the router interface card.
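
For anyone who has forgotten the arithmetic, here’s where those numbers come from.  The channel counts are the standard T1/E1 framing figures; the fractional example is just illustrative.

```python
# A T1 is 24 DS0 channels of 64 Kbps each, plus 8 Kbps of framing overhead.
DS0_KBPS = 64
T1_CHANNELS = 24
T1_FRAMING_KBPS = 8

t1_kbps = T1_CHANNELS * DS0_KBPS + T1_FRAMING_KBPS
print(f"Full T1: {t1_kbps} Kbps = {t1_kbps / 1000:.3f} Mbps")      # 1.544 Mbps

# An E1 carries 32 timeslots of 64 Kbps (two of them used for framing and signaling).
print(f"E1: {32 * DS0_KBPS / 1000:.3f} Mbps")                      # 2.048 Mbps

# A fractional T1 is simply fewer DS0s, e.g. eight channels:
print(f"Fractional T1 (8 channels): {8 * DS0_KBPS} Kbps")          # 512 Kbps

# For comparison, an ISDN BRI with both B channels bonded:
print(f"ISDN BRI: {2 * DS0_KBPS} Kbps")                            # 128 Kbps
```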

When I worked at the San Francisco Chronicle, we were providing Internet service via a T1 line terminating on a 2500-series router.  (The same one disabled by paint roller in this story.)  A thousand users on a single T1 was painfully slow, and we made the decision to upgrade to a DS3 (T3), which ran at 45 Mbps.  The interface for DS3 used two BNC coax connections.  I remember being amazed that the phone company could deliver service over coax, but it turns out that service into the building used fiber optics.  Inside the building we ran coax.  The run of coax from the basement to our 2nd-floor data center was expensive, but the result was phenomenal.  The DS3, which we terminated on a brand new 7200VXR, was vastly superior to the crawling T1, and the effort paid off with our users.

DSL was groundbreaking.  There was no consumer-grade high-speed service before it, and small companies could not even afford Internet connectivity.  I was one of the first home adopters of DSL.  The freshly-trained phone guy showed up at my apartment and installed a splitter box in the basement.  This was needed because residential service was ADSL, which multiplexed digital service onto an analog line.  Unlike ISDN, which converted the analog phone signal to digital, ADSL left the analog intact, adding the digital part of the signal onto the higher frequencies.  The splitter box took the incoming phone line from the street and peeled off the high frequencies, providing an analog signal for telephones.  It then passed the analog/digital mix intact to the modem, which just ignored the analog frequencies.  The phone guy then sat down with his toolbelt and tried to configure TCP/IP on my computer.  He gave up because he had no idea what he was doing.  I told him to leave me the IP addresses and I’d do it myself.  Eventually the telco would just send you a small filter to plug into each analog phone jack yourself, and they could turn on the service without sending a phone guy to rewire things.  Once DSL in its various forms came out, the Internet was available to the masses.  Of course, cable modems came shortly after.

We take for granted instant connectivity from every location on portable devices.  Once upon a time,  connectivity was only available at certain locations, often requiring dialing a service provider.  There was a real excitement as new technologies emerged for making connectivity faster and easier.  Now, of course, we just expect things to work and get angry when they don’t.

My readership is limited, so consider a post to be “viral” if I get more than 2 thumbs up at the bottom of a page.  (Incidentally, I’ve only ever gotten one thumbs down, for this post, but I don’t know why.)  My 2021 post For the Love of Wiring got 3 thumbs up (!) but actually did get a lot more hits than usual after Tom Hollingsworth linked to it from his own blog.  How about a little more layer 1?

After my initial foray into stringing Cat 3 cable around in various unwise ways, Category 5 quickly became the standard.  I hated Cat 5 cable.  Cat 3 had a small number of twists per foot (or meter, non-Americans, or metre, Brits!), so upon removing the jacketing of the cable it was quite easy to untwist it before punching it down.  Cat 5 is very twisted.  Not only are the pairs hard to untwist, but they remain kinked after untwisting, and they take a lot of work to smooth out.  (If you correctly terminate Cat 5, you shouldn’t have to untwist and smooth the wires, but I didn’t know that at first.)  I remember once, on my 10 Mbps Ethernet, running a speed test on Cat 3 cable and then being very disappointed when I saw no improvement running the same test over Cat 5.  (Doesn’t quite work that way, and for 10 Mbps, Cat 3 was more than adequate.)

I did a lot of research to learn how to run cable the correct way.  Mainly this means preserving the tight twists.  Cat 5 cable cannot be kinked or bent sharply, and the twists must be maintained up to the point of termination.  Not only did I use this information to run my own cable, but once I took a job at a computer consulting company, I oversaw many cabling projects and needed to inspect the work done by our vendors.  Voice cable did not have the stringent requirements of data, so often phone cabling experts would run the Cat 5 with tight bends and would untwist the wires several inches before punching down.

The consulting company used one such phone installer to do many of their jobs, often as a sub-contractor.  This was in the days before wireless, when every computer on the network, even laptops, had to be plugged in.  I remember one client, a small architectural firm in Berkeley, where our installer ran a brand-new Cat 5 Ethernet network.  We showed up, installed a hub, Ethernet cards, etc., and got everyone online.

A week or so later we got called back.  Stations were dropping on and off the network.  I fought my way through Bay Area traffic back to their office to figure out what was going on.

With any layer 1 issue, replacing cables is a good first step.  As I unplugged one station from the wall jack, the entire jack and face plate fell off the wall.  Whoops.

Normally when a network jack is installed in an office building with sheetrock (drywall) walls, the installer cuts a fairly large opening in the sheetrock and then installs a “low voltage ring”.  This ring secures to the drywall from behind, and provides a place for the faceplate to screw into.  Then the Cat 5 cable is punched down on a small “keystone” jack, over which a cover is placed, and which then snaps into the faceplate.

Low voltage ring

Our clueless installer had not done this.  Instead he cut a hole in the drywall just small enough for the jack to fit through.  He never installed the low voltage ring, instead screwing the faceplate directly into the drywall.  He also never installed the cover on the contacts on the jack, so the contacts were covered with drywall powder.  Because screws don’t hold well in drywall, when I pulled the cable from the jack, the whole thing fell out.  I also found out that when he had installed the small office patch panel in their supply closet, he put the screws straight into the drywall as well.  Normally you would use a backboard, screw it into a stud, or at least use drywall anchors.  The patch panel fell off the wall too.

Keystone jack with cover

Needless to say, I wasn’t too happy and neither was the customer.  I hate taking the fall for something that’s not my fault, but the customer considered it our mistake.  I made the cabling vendor come out and redo the entire installation.  After that, I told the owner of our firm to never use that vendor again.

A major concern, even with good cabling vendors, was having people in the office around the cables before they were fully installed.  I remember one client where we had a reputable vendor install the cabling before everyone moved in.  They ran one really large bundle of Cat 5 on the floor, because the client was going to install a raised floor afterwards.  Unfortunately, it took them months to get the raised floor in, and the bundle of cable ran right outside of a row of offices.  People stepped on them going in and out of their offices.  One time I remember a guy in cowboy boots standing right on top of the bundle.  I asked him to move.  By the time the floor covered the cables, they had gone from a clean, round bundle, to totally flattened.  Oddly enough, I never had any problems with the wiring in the time I worked there.

When I worked at the San Francisco Chronicle, our cabling vendor was installing some new fiber optic cabling to some data center racks.  The data center also housed our operations team (NOC, more or less.)  There was one lady who worked there who was very nice, rather large, and a tad immature.  The vendor had laid the fiber out on the floor before routing it under the floor tiles.  We looked up and there was the woman, jumping up and down on the fiber and laughing hysterically.  “Is this good for the cables, is this good for the cables?!” she was saying.  When we explained the interior was made out of glass, she looked horrified and stopped, but it was too late.  It cost us a bit, but fortunately for the NOC lady, she was in a union and well protected.

Working on software now, I don’t have to worry about cabling very much anymore.  I touch racks so infrequently I still call SFPs “GBICs”.  I do think it’s good for network engineers to stay informed on layer 1.  As much as you may know about protocols, software defined networking, or automation systems, none of it will work if the wires aren’t right.


There’s a lot of talk about networking simplicity these days.  There’s been a lot of talk about networking simplicity, in fact, for as long as I can remember.  The drive to simplify networking has certainly been the catalyst for many new products, most (but not all) unsuccessful.  Sometimes we forget that networking has some inherent complexities (a large distributed system with multiple OSes, protocols, and media types), but that much of the complexity can be attributed to humans and their choices.  IPv4 is a good example of this.

When I got into network engineering, I had assumed that network protocols were handed down from God and were immaculate in their perfection.  Reading Radia Perlman’s classic book Interconnections changed my understanding.  Aside from her ability to explain complex topics with utter clarity, Perlman also exposed the human side of protocol development.  Protocols are the result of committees, power politics, and the limitations of human personality.  Some protocols are obviously flawed.  Some flaws get fixed, but widely deployed protocols, like IPv4, are hard to fix.  Of course, v6 does remedy many of the problems of v4, but it’s still IP.

My vote for simplest protocol goes to AppleTalk.  When I was a young network guy, I mostly worked on Mac networks.  This was in the beige-box era before Jobs made Apple “cool” again.  The computers may have been lame, but Apple really had the best networking available in the 1990’s.  I’ve written about my love for LocalTalk, and its eminently flexible alternative PhoneNet in the past.  But the AppleTalk protocol suite was phenomenal as well.

N.B.  My description of AppleTalk protocol mechanics is largely from memory.  Even the Wikipedia article is a bit sparse on details.  So please don’t shoot me if I misremember something.

In the first place, you didn’t need to do anything to set up an AppleTalk network.  You just connected the computers together and switched either the printer or modem port into a network port.  Auto-configuration was flawless.  Without any DHCP server, AppleTalk devices figured out what network they were on and acquired an address.  This was done by first probing for a router on the network, and then randomly grabbing an address.  The host then broadcast its address, and if another host was already using it, it would back off and try another one.  AppleTalk addresses consisted of a two-byte network number, equivalent to the “network” portion of an IP subnet, and a one-byte host address, equivalent to the “host” portion.  If the host portion of the address is only one byte, aren’t you limited to 255 (or so) addresses?  No!  AppleTalk (Phase 2) allowed aggregation of contiguous networks into “cable ranges”.  So I could have a cable range of 60001-60011, multiple networks on the same media, and now I could have 2,783 end stations (11 network numbers times 253 usable node IDs), at least in theory.
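
To make the auto-configuration dance a little more concrete, here’s a rough sketch in Python.  It’s hypothetical code written from memory, not real AppleTalk internals; the probing is heavily simplified, and the function names are mine.

```python
import random

# Sketch of AppleTalk-style address acquisition on a Phase 2 extended network.
# Node IDs 1-253 are usable; 0, 254, and 255 are reserved.
USABLE_NODE_IDS = range(1, 254)

def address_in_use(network, node):
    """Probe the wire for (network, node) and return True if another host answers.
    Stubbed out here; a real stack broadcasts probe packets and waits for replies."""
    return False

def acquire_address(cable_range):
    """Pick a network number from the cable range and a random node ID, retrying on collisions."""
    start, end = cable_range
    while True:
        network = random.randint(start, end)
        node = random.choice(USABLE_NODE_IDS)
        if not address_in_use(network, node):
            return network, node  # claim it by broadcasting the chosen address

# The cable range from the example above: 11 network numbers times 253 node IDs each.
start, end = 60001, 60011
print("theoretical hosts:", (end - start + 1) * len(USABLE_NODE_IDS))  # 2783
print("acquired address:", acquire_address((start, end)))
```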

Routers did need some minimal configuration, and support for dynamic routing protocols was a bit light.  Once the router was up and running, the “zones” it advertised would show up on end users’ computers in an application called the Chooser.  They might see “1st floor”, “2nd floor”, “3rd floor”, for example, or “finance”, “HR”, “accounting”, however you chose to divide things.  If they clicked on a zone, they would see all of the AppleTalk file shares and printers in it.  You didn’t need to point end stations at their “default gateway”.  They simply discovered their router by broadcasting for it on startup.

AppleTalk networks were a breeze to set up and simple to administer.  Were there downsides?  The biggest one was the chattiness of the protocols.  Auto-configuration was accomplished by using a lot of broadcast traffic, and in those days bandwidth was at a premium.  (I believe PhoneNet was around 200 Kbps or so.)  Still, I administered several large AppleTalk networks and was never able to quantify any performance hit from the broadcasts.  Like any network, it required at least some thinking to contain network (cable range) sizes.

AppleTalk was done away with as the Internet arose and IP became the dominant protocol.  For hosts on LocalTalk/PhoneNet networks, which did not support IP natively, we initially tunneled IP over AppleTalk.  Ethernet-connected Macs had a native IP stack.  The worst thing about AppleTalk was the flaky protocol stack (called Open Transport) in System 7.5, but this was a flaw in implementation, not protocol design.

I’ll end with my favorite Radia Perlman quote:  “We need more people in this industry who hate computers.”  If we did, more protocols might look like AppleTalk, and industry MBAs would need something else to talk about.

I’ve mentioned my first job as a network engineer several times on this blog.  I worked at the San Francisco Chronicle, the biggest newspaper in Northern California.   I was brought in to manage the network as a Cisco-certified engineer, having just passed a four-day CCNA bootcamp.  Right before the dot-bomb economic crash, network engineers were in short supply.

The Chronicle’s network had recently been completely re-engineered, and the vendor selected was Foundry Networks.  Foundry was an up-and-coming vendor famous for selling high-speed switches to internet service providers.  They weren’t known for selling into enterprises, but they had convinced the previous network manager to install their hardware in nearly all of the Chronicle’s wiring closets.

It didn’t go very well.  The network had become incredibly unstable.  No company wants an unstable network, but newspapers are a particularly high-pressure environment since they have tight deadlines in order to get the paper out every single day, without fail.  Management of the data network was taken away from the previous manager and assigned to the head of the telecom department.  The plan was to rip out the Foundry and replace it with Cisco.

Foundry, of course, had other ideas.  Their account manager, whom I’ll call Bill, was quite aggressive in trying to restore the good name of Foundry.  I’ll give him credit for his doomed mission.

We had several problems.  The first was that we had only a single core router.  The router had two management modules, but failover between them was not fast, and our reporters and advertising people used Tandem systems which were sensitive to even slight network outages.  Foundry was well known for their fast IP switches, but we used AppleTalk and IPX as well, and their protocol stacks were not well implemented.  The BigIron 8000 was prone to crashing and taking out a lot of our users.  We had only one because the previous manager had been trying to save money.

The second problem was not entirely Foundry’s fault, although I do blame the SE in part.  Nobody ever set the spanning tree bridge priority on the core box.  STP selects the bridge with the lowest bridge identifier (BID) as root.  Since the BID is composed of a user-configured priority followed by the MAC address, if every switch is left at the default priority the election comes down to the lowest MAC address, and the oldest switch in the network tends to become the root bridge, since MAC addresses were assigned more or less sequentially.
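
The election itself is nothing more than a numeric comparison.  Here’s a tiny sketch of the idea; the priority is the standard 802.1D default of 32768, and the MAC addresses are made up for illustration.

```python
# 802.1D root election: the bridge with the lowest bridge ID wins, where the bridge ID
# is the 2-byte priority concatenated with the 6-byte MAC address.
def bridge_id(priority: int, mac: str) -> int:
    return (priority << 48) | int(mac.replace(":", ""), 16)

DEFAULT_PRIORITY = 32768  # nobody had lowered this on the core box

bridges = {
    "BigIron core":    bridge_id(DEFAULT_PRIORITY, "00:e0:52:00:00:01"),
    "closet FastIron": bridge_id(DEFAULT_PRIORITY, "00:e0:52:00:00:02"),
    "old Cabletron":   bridge_id(DEFAULT_PRIORITY, "00:00:1d:00:00:01"),
}

root = min(bridges, key=bridges.get)
print("root bridge:", root)  # the oldest box wins on its low MAC the moment it's plugged in
```

Setting the core’s priority to anything below the default would have pinned the root where it belonged.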

It turned out our Windows guys had been hauling around an ancient Cabletron switch to get extra ports when working on end users’ computers.  (This was before wireless.)  They would plug in, the Cabletron would dutifully assume the STP root role, and the entire network would reconverge for 50 seconds, spanning tree roots not being sticky.  I remember once paying a bill at a nearby restaurant before we were finished and running with the other engineers back to the office, hoping to catch an outage in progress after our pagers went off.  Foundry’s logs were not very good, and we didn’t know why the network kept going down.  Eventually I figured it out; I don’t remember how.

The third problem was that the Foundry FastIron switches we used in the wiring closets had bad optics.  The Molex optics Foundry had selected for its management modules were flaky, and so we had to replace every single one with modules using Finisar optics.  I remember Bill, our account manager, coming in for our middle-of-the-night maintenance window several weekends in a row, blades in tow, and helping us to swap out the cards.

All of these problems created a bad reputation for Foundry within the Chronicle.  I remember Bill walking out of the front door carrying a Foundry box with an RMA’d management module.  A non-technical employee, perhaps a reporter or advertising salesman, saw the box and shouted, “Hey, they’re getting rid of Foundry!”  People in the lobby started cheering.  Bill looked at me and said, “Soon they’ll be cheering when I come into the building with a Foundry box.”

It never happened.  We ripped out Foundry and replaced everything with Cisco Catalyst 4k and 6k switches.

The fact of the matter is, had we added a second BigIron in the core, fixed the root bridge problem, and replaced all the faulty modules, we probably would have had a solid network.  But there often comes a point when a vendor has destroyed their reputation with a customer.  It takes a multitude of factors to reach this point, but there is definitely a point of no return.  Once that line is crossed, the customer will often allow cordial meetings, listen with sympathy to the account team and execs, and then go their separate way.

A few years later I was laid off from my job at a Gold Partner, and was interviewing with another Gold Partner.  The technical interviewer looked at my resume and said, “I see you worked at the San Francisco Chronicle.”

“Yes,” I said, “I was brought in to replace the Foundry network they had with Cisco.  The whole thing was a disaster, poorly designed and bad products.”

“I designed that network,” he replied, “when I worked for another partner.  I also installed it.”

I didn’t get the job.

I’m thinking of doing some video blogging and kicking it off with a series with my thoughts on technical certifications.  Are they valuable or just a vendor racket?  Should you bother to invest time in them?  Why do the questions sometimes seem plain wrong?

Meanwhile, a little Netstalgia about the first technical certification I (almost) attempted:  The Apple Certified Server Engineer.

Back in the 1990’s, I worked for a small company doing desktop and network support.  When I say small, I mean small.  We had 60 employees and 30 of them had computers.  Still, it was where I first got into the computer industry, and I learned a surprising amount as networking was just starting to take off.

I administered a single AppleShare file server for the company, and I even set up my very first router, a Dayna Pathfinder.  I was looking for more, however, and I didn’t have much of a resume.  A year and a half of desktop support for 30 users was not impressing recruiters.  I felt I needed some sort of credential to prove my worth.

At the time Microsoft certifications, in particular the MCSE, were a hot commodity.  Apple decided to introduce its own program, the ACSE.  Bear in mind, this was back before Steve Jobs returned to Apple.  In the “beige-box” era of Apple, their products were not particularly popular, especially with corporations.  Nonetheless, I saw the ACSE as my ticket out of my pathetic little job.  I set to work on preparing for it.  If memory serves (and I can find little in the Wayback machine), the certification consisted of four exams covering AppleTalk networking, AppleShare file servers, and Backup.

Apple outsourced the certification development to a company called Network Frontiers, and its colorful leader, Dorian Cougias.  I had seen Dorian present at Macworld Expo once, and he clearly was very knowledgeable.  (He asked the room “what’s the difference between a switch and a bridge?” and then answered his own question.  “Marketing.”  Good answer.)  Dorian wrote all of the textbooks required for the program.  He may have known his stuff, but I found his writing style insufferable.  The books were written in an overly conversational tone, and laced with constant bad jokes.  (“To remove the jacketing of the cable you need a special tool…  I’d call it a ‘stripper’ but my mother is reading this.”  Ugh…)  A little levity in technical documentation is nice, but this got annoying fast.

This was in the era before Google, and despite my annoyance I did scour the books for scarce information on how networking actually worked.  I didn’t really study them, however, which you need to do if you want to pass a test.  I downloaded the practice exam on my PowerBook 140 laptop and fired it up.  I assumed that, between my day-to-day work and having read the books, I’d pass the sample exam and be ready for the real deal.

Instead, I scored 40%.  I used to be a bit dramatic back in my twenties, and went into a severe depression.  “40%???  I know this stuff!  I do it every day!  I read the book!  I’ll never get out of this stupid job!!!”  I had my first ocular migraine the next day.

In reality, it doesn’t matter how good or bad, easy or hard an exam is.  You’re not going to pass it on the first go without even studying.  And this was a practice exam.  I should have taken it four or five times, like I learned to do eventually studying Boson exams for my CCNP.

Instead, I gave up.  And, very shortly after, Apple cancelled the program due to a lack of interest.  Good thing I didn’t waste a lot of time on it.  Of course, I managed to get another job, and pass a few tests along the way.

I learned a few things about technical certifications from that.  In the first place, they can often involve learning a lot of knowledge that may not be practical.  Next, you can’t pass them without studying for them.  Also, that the value and long-term viability of the exams are largely up to the whims of the vendors.  And finally, don’t trust a certification when the author of the study materials thinks he’s Jerry Seinfeld.

 

Well, the blog has been languishing for a while, as I’ve been extraordinarily busy with a new EVP, a round of layoffs, and many personal distractions.  Here’s a little Netstalgia piece, not really technical, for your enjoyment.

I’ve told a few stories about my years at the Cisco Gold Partner, where I did both pre- and post-sales roles.  The Cisco practice in the San Francisco office was new, so being the only Cisco guy required wearing a lot of hats.  That said, one day I wore a hat I didn’t expect or want.

At the end of every week I’d look at our calendar to figure out my schedule for the next week.  It was maintained by a project coordinator.  Some appointments I had put on the calendar myself, others were requested by account managers or customers directly.  One day as I looked at my calendar, I saw the following week booked.  “City of San Mateo,” it said.  I had no experience with this customer, so I called our project coordinator to figure out what the mystery job was.

“You’ll be placing phones,” she said.

“Placing them?” I asked, confused.  She told me we had sold a VOIP deal with San Mateo to replace all of their PBX phones with Cisco IP phones.  The entire San Francisco office had been roped into physically placing the phones on desks across the city.  Even our Citrix guy was going to be there.

I called my VP of services and complained.  “I have two CCIEs and you want me to run around for a week plugging in phones?”

“Just be glad you have billable hours,” he said.  Were we really that desperate?

It turns out, yes.  The office Citrix guy, one or two other folks, and I met at San Mateo city hall and divided up box after box of IP phones.  We had to do city hall and the library, which were the easiest.  Then I ended up doing the police headquarters.  I remember putting phones on all the desks in the detective room, with concerned police officers looking on as I rooted around on my hands and knees for data jacks under their desks.  I had to move weapons (non-lethal), ballistic vests, and other police gear to find the ports.

I also had to do the fire department.  For a small city, San Mateo has a lot of fire stations.  It wasn’t always easy to park.  The first one I pulled up to in my BMW, loaded with phones, had no parking anywhere.  I found a notepad and a pencil in my car, scrawled out “OFFICIAL BUSINESS” on a sheet of lined paper, stuck it in my window, and parked on the sidewalk.  I used my pass at several fire stations, earning quizzical looks from firemen when I parked myself on the sidewalk in front of their station.

I learned an important lesson in leadership from this event.  If the VP had called a meeting the week before, he could have said the following:  “Look team, I know you’re all highly skilled and don’t want to do manual labor.  But we have a big deal here, it’s important to the company, and it’s all hands on deck.  I’ll be there myself with you placing phones.  Let’s get this done and I’ll buy you all a nice dinner at the end of the week.”  Had he said something like this, I think we would have rallied around him.  Instead, he just surreptitiously had it added to the calendar and copped an attitude when challenged.  He actually wasn’t a bad guy, but he missed on this one.

Anyways, plugging in phones is the closest I ever got to being a VOIP guy.

Two articles (here and here) in my Netstalgia series covered the old bulletin board system (BBS) I used to operate back in the late 1980’s.  It wasn’t much by today’s standards, but I thoroughly enjoyed my time as a Sysop (systems operator).  How the BBS died is a lesson in product management.

My BBS ran on an Apple IIGS with a 2400 baud modem and two external 30MB hard drives (the Apple II series did not support internal hard drives.)  Hard drives were ridiculously expensive back then, and I had acquired the cheapest hard drives I could buy, manufactured by a company called Chinook.  I never knew anybody else who had Chinook hard drives, probably for good reason.  I had some of the files backed up on floppy disks, but there really wasn’t a good way to back up 60 megs of data without another hard drive.

One day I had the BBS shut down for some reason or other, and I went to turn it back on.  When I flipped the switch on Chinook #1, the disk didn’t spin up.  It simply clicked.  Not knowing what to do, I decided to call tech support.  I had lost the manual, however, so I had to do what we did before the Internet:  I called information.  By dialing 411 on my phone, I was connected with an operator who helped me to hunt down the number.

A 30MB Chinook HD

I dialed the number for Chinook.  A nice, older-sounding midwestern man answered the phone.  He patiently listened while I explained my conundrum, and then said to me: “This is the Chinook fencing company.  You’re looking for Chinook, a computer company, it sounds like.”  I went back to information and got the right number.

Explaining my situation yet again, this time I got an answer.  “I want you to pick up the front of the hard drive and drop it on the table,” said the tech support guy.  I did it, and voila!  The hard drive spun up.  Despite my tender age of 16, I somehow suspected this was, as we say in the corporate world, “an unsustainable operating model.”

Luckily I rarely shut the hard drive down, but when I did I needed to drop it on the table to get it going again.  Chinook #2 started to have the same problem.  One day I flipped the switch on Chinook #1 and heard a metal-on-metal grinding noise.  And thus, my career as a Sysop ended.  All for the better I suppose, as the Internet was just around the corner.

I still have the Chinook hard drives, in the vain hope that I could crack them open and recover some data someday.  I once called DriveSavers to see if they could do it, but the request to recover data from crashed 1980’s Apple II hard drives was just too weird for them.  Their proposal was expensive and not likely to succeed.

Three years ago, when I moved into my new neighborhood, we had a block party, and I ended up sitting next to an older fellow who had been a long-time product manager for Apple.  He provided a wealth of interesting stories about the Apple II line, and the history of many of the computers I got my start on so many decades ago.  I mentioned to him the Chinook problem, and to my surprise he knew Chinook.  Chinook actually repackaged a particular model of Seagate hard drive, which was notorious for locking up and needing physical force to unstick the head.  My neighbor told me that this hard drive was included in the original prototypes of the Mac SE, over the objections of the technical product managers.  The business-types who were running things wanted the drive, either because it was cheap or because they had an agreement with Seagate (I don’t really recall).

Finally one of the technical PMs built a version of the SE which had a pinball plunger attached to the front of the built-in HD.  Great idea!  When the hard drive got stuck, just pull back the plunger and let it rip!  He showed it to management and they decided to pick a different hard drive.  Good for them: the SE went on to be a very popular Mac, and the pinball plunger might have prevented that.  Anyways, as I had learned, the plunger wouldn’t work for very long.

Sun Ultra 10

It was four o’clock in the early hours of one Sunday morning in 2001.  I had been up all night sitting in our data center at the San Francisco Chronicle with our Unix guy.  He was handing off responsibility for managing the firewalls to the network team, and he was walking me through the setup.  He’d been trying all night to get failover to work between the two firewalls, and so far nothing was going right.

We were using Checkpoint which was running on Solaris.  Despite my desire to be Cisco-only, I was interested in security and happy to be managing the firewalls.  Still, looking at the setup our Unix guy had conceived, my enthusiasm was waning.

He drew a complex diagram on a piece of paper, showing the two Solaris servers.  There was no automatic failover, so any failure required manual intervention.  He had two levels of failover.  First, he was using RAID to duplicate the main hard disk over to a secondary hard disk.  If the main disk failed, we’d need to edit some text files with vi to somehow bring the Sparc Ultra 10 up on the second drive.  If the Ultra 10 failed entirely, we would have to edit some text files on the second Ultra 10 to bring it up with the configuration of the first.  With Unix guys, it’s always about editing text files in vi.

Aside from being cumbersome, it didn’t work.  We’d been at it for hours, and whatever disk targets he changed in whatever files, failover wasn’t happening.  At the newspaper, we had until 5am Sunday to do our work, after which everything had to be back online.  And we were getting concerned it wouldn’t come back at all.

Finally the Unix guy did manage to get the firewall booted up and running again.  On Monday I called Checkpoint and asked how we could get off Solaris.  They made a product called SecurePlatform, which installed a hardened Linux and Checkpoint all with one installer.  I ordered it at once, along with two IBM servers.

The software worked as promised, and I brought up a new system, imported our rules, and did interface and box failover with no problem.  I told the Unix guy to decommission his Ultra 10s.  He was furious that there was a *nix system on the network his team wasn’t managing.  I told him it was an appliance and there was no customization allowed.  The new system worked flawlessly and I didn’t even have to touch vi.

Network engineers are used to relatively simple devices that just work.  Routers and switches can be upgraded with a single image, and device and OS-level management is mostly under the hood.  While a lot of network engineers like Linux or Unix and have to work with these operating systems, at the end of the day when we want to do our job, we want systems that install and upgrade quickly, and fail over seamlessly.  As networking vendors move more into “software”, we need to keep that in mind.


When I worked at the San Francisco Chronicle, I started a project to bring Internet connectivity to a number of sites that had only limited mainframe circuits.  To do this I decided to get DSL lines and run IPSec over them, a relatively new way of doing things for the time.  It was a lot cheaper than the Frame Relay we used at larger sites.

After setting up connectivity at one of our sites, the local office manager called me.  Web pages, he said, were only loading partially.  Some of the text and none of the images would show up.

Everyone blamed the network for everything, so I punted him to desktop support.  I could ping across the tunnel, I could send traffic just fine, the latency was minimal, and nothing was obviously wrong.  The network is usually up or down, but web pages don’t partially load when everything else is working.  Degraded service might cause the pages to load slowly, but not partially.

The desktop guys told me it was my problem.  We had a constant battle, as nine times out of ten they blamed the network, and nine times out of ten it was not the network.  The office manager was getting angry, so I decided I would do some investigation on site and prove to the desktop guys that they were wrong.

I went to the office and fired up my laptop.  Pages were partially loading for me too.  Hmmm.  I did what every network engineer does and fired up a packet sniffer.

I could see the TCP handshake succeeding, and the browser requests and data exchange.  It looked normal, but why wasn’t the browser displaying the images?  I tried another browser and saw the same thing.

As I examined the sniffs, something hit me.  All the packets were being sent with the Do Not Fragment (DF) bit set in the IP header.  Could it be that the IPSec/GRE headers were making the packets large enough to require fragmentation?  And why was Windows setting the DF bit anyways?
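
The arithmetic behind that hunch looks roughly like this.  The overhead figures below are typical ballpark values for GRE plus ESP in tunnel mode, not measurements from that network.

```python
# GRE and IPsec overhead eat into the 1500-byte Ethernet MTU, and the DF bit
# forbids the router from fragmenting anything that no longer fits.
ETHERNET_MTU = 1500
GRE_OVERHEAD = 24        # new outer IP header (20 bytes) + GRE header (4 bytes)
IPSEC_OVERHEAD = 56      # ESP tunnel mode; varies with cipher and padding

largest_inner = ETHERNET_MTU - GRE_OVERHEAD - IPSEC_OVERHEAD
print(f"largest inner packet that fits in one frame: {largest_inner} bytes")

packet_size = 1500       # a full-size segment from a host, sent with DF set
if packet_size > largest_inner:
    # The router can't fragment a DF packet; it should send back ICMP "fragmentation
    # needed", but if that message is lost or ignored, the big packets simply vanish.
    # Small packets (text, handshakes) still fit, which is why pages half-loaded.
    print("full-size packets are dropped inside the tunnel")
```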

As I wasn’t a desktop guy, I left the latter question alone.  I jumped on the router and built a routing policy which cleared the DF bit on incoming packets.  The pages started loading fine.  I left the policy in place and hoped that there would not be any unanticipated consequences.  I never saw any.

Sometimes, it is, indeed, the network.