Tag Archives: san francisco chronicle

I’ve mentioned my first job as a network engineer several times on this blog. I worked at the San Francisco Chronicle, the biggest newspaper in Northern California. I was brought in to manage the network as a Cisco-certified engineer, having just earned my CCNA in a four-day bootcamp. Right before the dot-bomb economic crash, network engineers were in short supply.

The Chronicle’s network had recently been completely re-engineered, and the vendor selected was Foundry Networks.  Foundry was an up-and-coming vendor famous for selling high-speed switches to internet service providers.  They weren’t known for selling into enterprises, but they had convinced the previous network manager to install their hardware in nearly all of the Chronicle’s wiring closets.

It didn’t go very well.  The network had become incredibly unstable.  No company wants an unstable network, but newspapers are a particularly high-pressure environment since they have tight deadlines in order to get the paper out every single day, without fail.  Management of the data network was taken away from the previous manager and assigned to the head of the telecom department.  The plan was to rip out the Foundry and replace it with Cisco.

Foundry, of course, had other ideas.  Their account manager, whom I’ll call Bill, was quite aggressive in trying to restore the good name of Foundry.  I’ll give him credit for his doomed mission.

We had several problems. The first was that we had only a single core router; the previous manager had been trying to save money. The router had two management modules, but failover between them was slow, and our reporters and advertising people used Tandem systems that were sensitive to even brief network outages. Foundry was well known for fast IP switching, but we ran AppleTalk and IPX as well, and those protocol stacks were not well implemented. The BigIron 8000 was prone to crashing and taking out a lot of our users.

The second problem was not entirely Foundry’s fault, although I do blame their SE in part. Nobody had ever set the spanning tree bridge priority on the core box. By default, STP elects the bridge with the lowest bridge identifier as root. Since the BID is composed of a user-configured priority followed by the switch’s MAC address, leaving every switch at the default priority means the tie-break falls to the lowest MAC address, and because OUIs were assigned roughly in sequence, the oldest switch in the network becomes the root bridge.
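
To see why the oldest box wins, here is a minimal sketch of the election logic in Python. The switch names, MAC addresses, and priorities are invented for illustration; only the comparison rule is the real behavior.

    # Toy model of 802.1D root bridge election: the numerically lowest
    # bridge ID wins, comparing the priority field first and using the
    # MAC address only as the tie-breaker.
    def bridge_id(priority, mac):
        return (priority, int(mac.replace(":", ""), 16))

    # Every switch left at the default priority of 32768 (hypothetical MACs).
    bridges = {
        "BigIron-core": bridge_id(32768, "00:e0:52:aa:bb:cc"),
        "FastIron-closet": bridge_id(32768, "00:e0:52:dd:ee:ff"),
        "ancient-Cabletron": bridge_id(32768, "00:00:1d:01:02:03"),
    }
    print(min(bridges, key=bridges.get))  # ancient-Cabletron: oldest OUI, lowest MAC

    # The fix nobody applied: drop the core's priority so it always wins.
    bridges["BigIron-core"] = bridge_id(4096, "00:e0:52:aa:bb:cc")
    print(min(bridges, key=bridges.get))  # BigIron-core

One line of configuration lowering the core’s priority would have pinned the root in place and made any wandering old switch harmless.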

It turned out our Windows guys had been hauling around an ancient Cabletron switch to multiplex switch ports when working on end users’ computers. (This was before wireless.) They would plug in, the Cabletron would dutifully assume the STP root, and the entire network would reconverge for 50 seconds: the classic 802.1D penalty of 20 seconds of max age plus two 15-second forward-delay stages, since spanning tree roots are not sticky. I remember our pagers going off once while we were at a nearby restaurant; we paid the bill before we had finished eating and ran with the other engineers back to the office, hoping to catch the outage in progress. Foundry’s logs were not very good, and for a long time we didn’t know why the network kept going down. Eventually I figured it out, though I don’t remember how.

The third problem was that the Foundry FastIron switches we used in the wiring closets had bad optics.  The Molex optics Foundry had selected for its management modules were flaky, and so we had to replace every single one with modules using Finisar optics.  I remember Bill, our account manager, coming in for our middle-of-the-night maintenance window several weekends in a row, blades in tow, and helping us to swap out the cards.

All of these problems created a bad reputation for Foundry within the Chronicle. I remember Bill walking out the front door carrying a Foundry box with an RMA’d management module. A non-technical employee, perhaps a reporter or advertising salesman, saw the box and shouted, “Hey, they’re getting rid of Foundry!” People in the lobby started cheering. Bill looked at me and said, “Soon they’ll be cheering when I come into the building with a Foundry box.”

It never happened.  We ripped out Foundry and replaced everything with Cisco Catalyst 4k and 6k switches.

The fact of the matter is, had we added a second BigIron to the core, fixed the root bridge problem, and replaced all the faulty modules, we probably would have had a solid network. But there often comes a point when a vendor has destroyed its reputation with a customer. It takes an accumulation of failures to get there, but there is definitely a point of no return. Once that line is crossed, the customer will often allow cordial meetings, listen with sympathy to the account team and execs, and then go their separate way.

A few years later I was laid off from my job at a Gold Partner, and was interviewing with another Gold Partner.  The technical interviewer looked at my resume and said, “I see you worked at the San Francisco Chronicle.”

“Yes,” I said. “I was brought in to replace the Foundry network they had with Cisco. The whole thing was a disaster: poorly designed, with bad products.”

“I designed that network,” he replied, “when I worked for another partner.  I also installed it.”

I didn’t get the job.

For a while now I’ve wanted to kick off a series on technical interviewing. Let me begin with a story.

My first job interview for a full network engineering role was at the San Francisco Chronicle in 2000. I had been working for five years in IT, mostly doing desktop and end-user support. I then decided to get a master’s degree in telecommunications management, which didn’t help at all, followed by a CCNA certification, which got me the interview.

My first interview was with the man who would be my boss. Henry was a manager who had almost no technical knowledge about networking, but I didn’t know that at the time. “Do you know Foundry switches at all?” Henry asked.

“No.” I was already worried.

“I doubted you would. That’s ok because we want to replace them all with Cisco and you know Cisco.” He pulled out a network diagram and handed it to me. “If you look at this, do you see a problem?” he asked.

I had never worked on a network larger than a couple switches, and now I was staring at a convoluted diagram depicting the network of the largest newspaper in Northern California. I was looking at subnet masks, link speeds, and hostnames, trying to find something wrong.

“I’m not sure,” I had to reply meekly.

He pointed at the main core switch for the network. There was only one, with no redundancy. “There’s a huge single point of failure,” he said. I felt stupid for missing the forest for the trees.

Henry brought me upstairs to interview with Tom, who was an on-site project management contractor from Lucent. I was extremely nervous: Lucent (later Avaya) was a big name in the industry, and this guy worked for them! Henry left me with Tom, who pulled out a copy of the same diagram Henry had shown me earlier.

“Do you notice anything wrong with this?” he asked.

“Wow, that’s a huge single point of failure,” I replied.

He nodded in approval. “That’s right. Very good.” He then asked me a technical question about supernetting. I answered nervously, although it quickly became clear I knew more than he did.
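
I no longer remember his exact question, but supernetting itself is easy to demonstrate: contiguous prefixes collapse into a single shorter one. Here is a quick sketch using Python’s ipaddress module, with arbitrary example prefixes:

    import ipaddress

    # Two contiguous /24s summarize into a single /23 supernet.
    nets = [ipaddress.ip_network("192.168.0.0/24"),
            ipaddress.ip_network("192.168.1.0/24")]
    summary = list(ipaddress.collapse_addresses(nets))
    print(summary)  # [IPv4Network('192.168.0.0/23')]

    # The supernet covers both of the original networks.
    print(all(n.subnet_of(summary[0]) for n in nets))  # True

Advertising the one /23 instead of both /24s is the whole trick: fewer routes, same reachability.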

The door flew open and another guy named Vincent walked in. He was the desktop support contractor, but again I didn’t know that. “Ask Jeff a few technical questions,” Tom said.

“Question number one,” said Vincent. “If you were running a network this size, would you subnet it?”

Now the answer seemed obviously to be “yes,” but I was trying to figure out whether it was a trick. “Yes,” I answered, deciding to play it safe.

“Good! Next question: can you route NetBIOS?” My desktop years had been almost exclusively dedicated to Macs, and I didn’t even know what NetBIOS was. I figured it was a 50/50 shot, and the way he asked seemed to suggest the answer.

“No,” I said, trying to sound confident.

“He’s good,” said Vincent.

Next, the door flew open again, and in walked Bing. Bing was carrying some sort of network device with her. She handed it to me. “Is this a switch or a hub?” she asked. There was no obvious labeling on it, and as I turned the device over and over again in my hands, I had a sinking feeling.

“I don’t know,” I replied.

“Look at this,” Bing said. She pointed to a collision light. “Since there is one on each port, you can tell this is a switch.”

We don’t have collision lights on switches anymore, but at the time we did, and she had a valid point. Realizing this, I explained to her that since a hub has a single collision domain, it would only have one collision light. I went on to explain the concept of a collision domain and how a switch differs from a hub. It turned out she was a project manager for desktop support and didn’t know any of that. Someone had just shown her the collision-light trick and she thought it would make a good question.

“He’s good,” said Bing.

Tom had told me my next interview would be with a CCIE from Lucent. Now that was definitely intimidating. I knew the reputation CCIEs had, and I didn’t expect to do well. The CCIE guy never showed up. As Tom was walking me to the elevator, however, we ran into him in the hallway. It turned out that Mike, who is still a friend of mine and who later earned three CCIEs, had not actually passed the exam yet. We ended up talking about his home lab for a few minutes.

“He’s good,” said Mike. And a good thing too, as I’ve been in a couple of interviews with Mike and I’ve seen him grill people mercilessly.

I got a call with a job offer a few days later, and ended up working there five years.

For a while now I’ve had several posts in my drafts folder on the subject of technical interviewing. As you can see from the above story, interviews are often chaotic, disorganized, and conducted by unqualified people who have no plan. In the case of the San Francisco Chronicle, they made the right decision on me, and I don’t think anybody there would dispute that. I was thankful to begin my career in network engineering.

That said, I’ve had other interviews that didn’t go so well. Over the next few articles, I’d like to cover technical interviewing. Why do we interview people? How can we tell good candidates from bad ones? How worthwhile are the typical technical questions? Are gotchas worth throwing out “just to see how the candidate reacts”? Are interviews purely subjective, or can we make them data-driven and objective?

I’ll throw out a few more anecdotes from my own experience to illustrate my points–feel free to comment with some of your own!

My first full-time networking job was at the San Francisco Chronicle. There isn’t much to the Chronicle anymore, but in the early 2000s the newspaper was still going strong. It was the beginning of the decline, but most people still relied on their local newspaper as their primary source of news. Being a network engineer at a major metropolitan newspaper was fascinating. It is a massive operation to print and distribute a newspaper every single day, and you can never, ever miss. There is no slippage of production deadlines. The paper has to be out every day, and every day you start all over, with a blank page.

As the lead network engineer, I touched everything from editorial (the news and photography content of the paper) to advertising, pre-press, production systems, and circulation.  Every one of these was critical.  If editorial content didn’t make it through, there was nothing to go into the paper.  If advertising didn’t make it in, we didn’t earn revenue.  If pre-press or production had problems, the paper wasn’t printed.  If circulation wasn’t working, nobody could get their paper.

The Chronicle owned and operated three printing plants in the Bay Area.  One was on Army Street in San Francisco, while the other two were in Union City and Richmond in the East Bay.  The main office was on Fifth and Mission in downtown SF, so the paper was prepared in San Francisco and then sent to the plants via microwave.  That’s where I came in.

Our microwave system used a dish on the clock tower of our building.  From 5th and Mission we sent a signal up to Roundtop Mountain in the East Bay hills. At Roundtop we leased space in a little concrete bunker that was used for various kinds of radio communication including cellular.  From Roundtop we bounced the signal back to the three printing plants.

[Photo: the Chronicle building, with the microwave dish visible on the clock tower]

The microwave presented itself to us as T1 lines. I had the T1s connected to dual routers at the main site and at each of the plants. In addition to the microwave, we had two backup T1s to each plant, landlines from different carriers with diverse paths into the buildings. We kept the microwave and the first backup T1 plugged into the routers, with the second backup on manual standby in case we needed it. You don’t take chances with production at a newspaper, and we had triple redundancy on everything. I used OSPF to fail over between the microwave and the first backup circuit, and HSRP for gateway redundancy. With only four sites it was a simple enough topology, and it never gave me much trouble.
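
To make the failover behavior concrete, here is a toy model of the idea in Python. The circuit names and OSPF costs are my own inventions, not the Chronicle’s real configuration; the point is only that the lowest-cost circuit that is still up carries the traffic.

    # Toy model of cost-based failover: OSPF prefers the route over the
    # lowest-cost circuit that still has a live adjacency.
    circuits = [
        {"name": "microwave", "cost": 10, "up": True},
        {"name": "backup-T1", "cost": 20, "up": True},
        # The second backup T1 sat on manual standby, not cabled in.
    ]

    def active_circuit(circuits):
        live = [c for c in circuits if c["up"]]
        return min(live, key=lambda c: c["cost"]) if live else None

    print(active_circuit(circuits)["name"])  # microwave
    circuits[0]["up"] = False                # the beam gets blocked...
    print(active_circuit(circuits)["name"])  # backup-T1 takes over

HSRP played the same game one layer down, keeping a single virtual gateway address alive across the pair of routers at each site.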

That calm ended the day I got a call from our operations center: the primary circuits were all down and we were running on backups. I immediately called the production systems engineer who managed the microwave and told him his circuits were down. “Impossible!” he said. “That microwave is five-nines reliable. Check your router!” I tried the usual fixes: a shut/no shut on the interfaces, changing the line encoding, and so on. No go. He wanted me to start swapping hardware, which was a big deal in a live newspaper environment, and it seemed pointless anyway: if it was hardware, why would all of the circuits be down?

We bickered a bit before I moved to have the tertiary backup circuits swapped in, so we had automatic failover again while we worked on the microwave. I got out our old T-berd tester to see if I could find any indication of the problem. Then the systems engineer called: “We need to meet at the clock tower. I’ve found the problem.” It’s always a relief to hear that when the finger-pointing starts.

[Photo: a T-berd T1 tester]

I showed up at the entrance to the tower and followed the systems guy up a rusty ladder mounted to the wall. Up in the tower there were bird droppings, and as I climbed higher I fought the urge to look down. I’ve never much liked heights, and being out of shape, with only my own strength keeping me from falling several stories onto concrete, did not help. At the top there was a large gap between the ladder and the floor, and I fought down panic as I flung my leg over to climb onto the concrete flooring. From there we went outside, and I saw the problem right away.

If you’ve ever been to a convention in San Francisco, chances are it took place in the Moscone Center. In the early 2000s, the city decided to expand Moscone by building a new Moscone Center West at 4th and Howard streets. And from up on the clock tower it was plain as day: they had built a cooling tower on the roof, right in the path of our microwave beam. I looked at the systems guy and said, “Well, I guess you could make popcorn in that cooling tower. Anyways, there goes your five nines.”

We hastily called meetings to decide what to do. Sue the city? Call the FCC? Find another building to bounce the microwave off of? Those were long-term options, but we had an immediate problem. Two circuits might seem like enough, but they were telco circuits, not as reliable as the microwave had been when its path wasn’t blocked.

Getting the city to cut the cooling tower off Moscone West was a non-starter, especially with a newspaper doing the asking, a newspaper that made its money being critical of city officials. So we decided to lease roof space on another building and add a repeater. That was a long process, however: we needed to negotiate with the landlord, replan the radio deployment, get it licensed and permitted, add the new repeater, and re-point the old dish at the new building. That last item was not as simple as it sounds, since this wasn’t a DirecTV dish. It was welded to the tower, so we had to hire ironworkers to cut it off and reposition it.

In the meantime, we ordered T1s from downtown SF up to Roundtop to bypass the blocked segment. We’d go over hard wire to Roundtop, then microwave the rest of the way. This was by no means an ideal solution, nor an overnight one, but it restored some redundancy faster than adding the repeater would. I’m glad we did it, because shortly after the microwave went down we started having terrible problems with the landlines and needed the triple redundancy.

If you drive by Fifth and Mission now, the microwave dish is gone from the clock tower.  The Chronicle, a shadow of its former self, no longer operates its own printing plants, and has a circulation far smaller than it did in 2004, when I left.  As I said in my last post, it’s great to have a sense of purpose when you work in IT.  It wasn’t about fixing a microwave but about getting that paper in the hands of our readers.  I’m thankful I got to be a part of that for a few years, even if it cost me some vertigo and sleepless nights.

In my first article in the “Ten Years a CCIE” series, I discuss the mystique of the CCIE certification that made me want to attempt the test in the first place.

Learning about the CCIE

My first vague awareness of the CCIE certification came in 1999, while I was a master’s student in Telecommunications Management at Golden Gate University in San Francisco. A family friend, an instructor in the PhD program in telecommunications at the University of Idaho, was staying at my father’s house. I was excited to meet him because I was thinking of pursuing further graduate studies, but I was a bit surprised when I told him of my ambitions to be a network engineer and of my coursework at GGU. He told me not to waste my time in a graduate program if I wanted to be a network engineer. “You should get a Cisco certification instead,” he said. “They’re gold.” A disappointing comment, seeing that I was in my second year of the master’s program, but I kept it in mind and completed my degree.