spanning tree

All posts tagged spanning tree

I’ve mentioned my first job as a network engineer several times on this blog.  I worked at the San Francisco Chronicle, the biggest newspaper in Northern California.   I was brought in to manage the network as a Cisco-certified engineer, having just passed a four-day CCNA bootcamp.  Right before the dot-bomb economic crash, network engineers were in short supply.

The Chronicle’s network had recently been completely re-engineered, and the vendor selected was Foundry Networks.  Foundry was an up-and-coming vendor famous for selling high-speed switches to internet service providers.  They weren’t known for selling into enterprises, but they had convinced the previous network manager to install their hardware in nearly all of the Chronicle’s wiring closets.

It didn’t go very well.  The network had become incredibly unstable.  No company wants an unstable network, but newspapers are a particularly high-pressure environment since they have tight deadlines in order to get the paper out every single day, without fail.  Management of the data network was taken away from the previous manager and assigned to the head of the telecom department.  The plan was to rip out the Foundry and replace it with Cisco.

Foundry, of course, had other ideas.  Their account manager, whom I’ll call Bill, was quite aggressive in trying to restore the good name of Foundry.  I’ll give him credit for his doomed mission.

We had several problems.  The first was that we had only a single core router.  The router had two management modules, but failover between them was not fast, and our reporters and advertising people used Tandem systems which were sensitive to even slight network outages.  Foundry was well known for their fast IP switches, but we used AppleTalk and IPX as well, and their protocol stacks were not well implemented.  The BigIron 8000 was prone to crashing and taking out a lot of our users.  We had only one because the previous manager had been trying to save money.

The second problem was not Foundry’s fault entirely, although I do blame the SE in part.  Nobody ever set the spanning tree bridge priority on the core box.  By default, STP selects the bridge with the lowest bridge identifier as root.  Since the BID is comprised of a user-configured priority and the MAC address, if no priority is configured, the oldest switch in the network becomes the root bridge, since MAC addresses and OUI’s are sequential.

It turned out our Windows guys had been hauling around an ancient Cabletron switch to multiplex switch ports when working on end users’ computers.  (This was before wireless).  They would plug in, the Cabletron would dutifully assume STP root, and the entire network would reconverge for 50 seconds, spanning tree roots not being sticky.  I remember once paying a bill at a nearby restaurant before we were finished and running with the other engineers back to the office, hoping to catch an outage in progress after our pagers went off.  Foundry’s logs were not very good and we didn’t know why the network kept going down.  Eventually I figured it out, I don’t remember how.

The third problem was that the Foundry FastIron switches we used in the wiring closets had bad optics.  The Molex optics Foundry had selected for its management modules were flaky, and so we had to replace every single one with modules using Finisar optics.  I remember Bill, our account manager, coming in for our middle-of-the-night maintenance window several weekends in a row, blades in tow, and helping us to swap out the cards.

All of these problems created a bad reputation for Foundry within the Chronicle.  I remember Bill walking out of the front door carrying a Foundry box with an RMA’d management module.  A non-technical employee, perhaps a reporter or advertising salesman, saw the box and shouted, “Hey, they’re getting rid of Foundry!”  People in the lobby started cheering.  Bill looked at me and said, “soon they’ll be cheering when I come into the building with a Foundry box.”

It never happened.  We ripped out Foundry and replaced everything with Cisco Catalyst 4k and 6k switches.

The fact of the matter is, had we added a second BigIron in the core, fixed the root bridge problem, and replaced all the faulty modules, we probably would have had a solid network.  But there often comes a point when a vendor has destroyed their reputation with a customer.  It takes a multitude of factors to reach this point, but there is definitely a point of no return.  Once that line is crossed, the customer will often allow cordial meetings, listen with sympathy to the account team and execs, and then go their separate way.

A few years later I was laid off from my job at a Gold Partner, and was interviewing with another Gold Partner.  The technical interviewer looked at my resume and said, “I see you worked at the San Francisco Chronicle.”

“Yes,” I said, “I was brought in to replace the Foundry network they had with Cisco.  The whole thing was a disaster, poorly designed and bad products.”

“I designed that network,” he replied, “when I worked for another partner.  I also installed it.”

I didn’t get the job.