Before the Internet: The Bulletin Board System II

In my last post, I discussed the BBS and how it worked.  (It would be helpful to review that post to understand the terminology.)  In this post, I have resurrected, in part, the BBS I used to run from 1988 to 1990.  It was called “The Tower”, for no particularly good reason except that it sounded cool to my teenage mind.

Now, bringing this back to life was no simple task, but it was aided by some foresight I had 20 years ago.  I had a Mac with a disk drive, and realizing the floppy era was coming to a close, I decided to produce disk images of all the 3.5 inch floppies I had saved from my Apple II days.  Fortunately, my last Apple II, the IIGS, used 3.5″ drives instead of the 5.25″ drives that were more common on the Apple IIs.  The Macs that had floppy drives all had 3.5″ drives.  Additionally, Apple had included software on the pre-OS X Mac OS to read ProDOS (Apple II) disks.  Thus, in the year 2000, I could mount an Apple II floppy from a dozen years prior and make an image out of it.

I did not have a full working version of my GBBS, however, so I had to download a copy.  I also had to do a lot of work to bring it up to Macos (not MacOS, but Macos, Modified ACOS), which was a modified form of the GBBS compiler I used at the time.  All of my source files required Macos and not the stock GBBS software.  Believe me, even though I ran the BBS for a couple years and wrote a lot of the code, remembering how to do any of this after 30 years was non-trivial.

Rather than hook up my old IIGS, which I still have, it made a lot more sense to use an emulator.  (It also enabled me to take screen shots.)  I used an emulator called Sweet16, which is a bit bare bones but does the trick.  In case you’re not familiar with the Apple II series, the early models were primarily text-driven.  They had graphics, of course, but they were not GUI machines.  After the Mac came out, there was a push to incorporate a GUI into the Apple II and the result was the Apple IIGS (Graphics and Sound).  While it had a GUI-driven OS (ProDOS 16 at first, replaced by GS/OS), it was backwards compatible with the old Apple II software.  The GBBS software I ran was classic Apple II, and thus it was a bit of a waste to run it on an Apple IIGS, but, well, that’s what I did.

In this screen shot (Figure 1), you can see the Apple IIGS finder from which I’m launching the BBS software, the only GUI shot you’ll see in the article:

Figure 1: The Apple IIGS ProDOS Finder

The next shot (Figure 2) shows the screen only visible to the sysop, while waiting for a call.  As sysop, I had the option to hit a key and log in myself, but if a user dialed in the system would beep and the user would begin the log in process.  I’m not sure why we’re awaiting call 2 which will be call 1 today, but it looks like a bug I need to hunt down.  The screen helpfully tells me if new users have signed up for the BBS, and whether I have mail.

Figure 2: The landing page while waiting for a call

(If you want to know why I used the silly handle “Mad MAn”, please see the previous article.)

The next screen shows the BBS right after logon.  The inverse text block at the top was a local sysop-only view, showing user information including the user name and phone number, as well as the user’s flags.  These are interesting.  Some BBS software provided access levels for controlling what a user could and could not do.  Instead of sequential access levels, GBBS provided a series of binary flags the sysop could set.  Thus, I could give access to one area but not another, whereas the sequential access levels mean that each access level inherits the privileges of the previous level.  Very clever.  A few other stats are displayed that I won’t go into.  I’ll turn off the sysop bar for the remaining screen shots.
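To make the difference concrete, here is a minimal Python sketch of the two schemes. The flag names and the helper functions are hypothetical and are not taken from GBBS or ACOS; the only point is that flags can be granted independently, while a level implies everything below it.

```python
from enum import IntFlag


class Access(IntFlag):
    """Hypothetical per-area flags; the real GBBS flag layout is not shown here."""
    MAIL = 1
    FILES = 2
    BOARDS = 4
    SYSOP_CHAT = 8


def flag_allows(user_flags: Access, needed: Access) -> bool:
    # With flags, each area is checked against its own bit.
    return (user_flags & needed) == needed


def level_allows(user_level: int, needed_level: int) -> bool:
    # With sequential levels, a higher level inherits every lower privilege.
    return user_level >= needed_level


# A user who may read the boards and transfer files, but not use mail.
user = Access.BOARDS | Access.FILES
```

With levels, there is no way to express "files but not mail" unless the areas happen to be ordered that way; with flags, any combination is possible.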

Figure 3: The main level prompt with sysop bar. Be sure to report error #20!

Note the prompt provided to the user in figure 3.  It tells you:

  • That the sysop is available
  • That the user has not paged the sysop
  • The double colons (::) normally would display time left on the system.  Since this was a dial-up system, I needed to limit the time users could spend on the BBS.  But as sysop, I of course had unlimited time.
  • The BBS had different areas and the prompt (like an IOS prompt) tells you where you are (“Main level”)

Next, in figure 4 you can see the main menu options for a user logged into the BBS.  This is the default/stock GBBS menu, as my original is lost.  Despite the limited options, this was like entering a new world in the days of 64K RAM.  You can see that a user could send/read mail, go to a file transfer section, chat (or attempt to chat) with the system operator, or read the public message boards.

Figure 4: The BBS main menu. This is the GBBS default, not the custom menu I built

Next, the user list.  I had 150 users on my BBS, not all of them active.  I blacked out the last names and phone numbers, but you can get a sense of the handles that were used at the time.  In addition to these names, there were a lot of Frodos and Gandalfs floating around.  Also note that most BBSing was local (to avoid long-distance charges).  Sadly, none of these users has logged on since 1989.  I wish they’d come back.  Oggman, whom I mentioned in my last post, was a user on my board.

Figure 5: My user list

Conclusions

I recently interviewed a new college grad who asked me how she could be successful at a company like Cisco.  My answer was that you have to understand where we came from in order to understand where we are.  You cannot understand, say, SD-WAN without understanding how we used to build WANs.  Go back to the beginning.  Learn what SneakerNet was.  Understand why we are where we are.  Even before SneakerNet, some of us were figuring out how to get computers to talk to each other over an already existing network–the analog telephone network.  As a side note, I love vintage computing.  It’s a lot of fun using emulators to resurrect the past, and I hope to do some physical restorations some day.  Trying to figure out how to boot up a long-defunct system like this BBS provides a great reminder of how easy we have it now.

Vintage DDoS

With Coronavirus spreading, events shut down, the Dow crashing, and all the other bad news, how about a little distraction?  Time for some NetStalgia.

Back in the mid-1990’s, I worked at a computer consulting firm called Mann Consulting.  Mann’s clientele consisted primarily of small ad agencies, ranging from a dozen people to a couple hundred.  Most of my clients were on the small side, and I handled everything from desktop support to managing the small networks that these customers had.  This was the time when the Internet took the world by storm–venture capitalists poured money into the early dotcoms, who in turn poured it into advertising.  San Francisco ad agencies were at the heart of this, and as they expanded they pulled on companies like Mann to build out their IT infrastructure.

I didn’t particularly like doing desktop support.  For office workers, a computer is the primary tool they use to do their job.  Any time you touch their primary tool, you have the potential to mess something up, and then you are dealing with angry end users.  I loved working on networks, however small they were.  For some of these customers, their network consisted of a single hub (a real hub, not a switch!), but for some it was more complicated, with switches and a router connecting them to the Internet.

Two of my customers went through DDoS episodes.  To understand them, it helps to look at the networks of the time.

Both customers had roughly the same topology.  A stack of switches was connected together via back-stacking.  The entire company, because of its size, was in a single layer 2/layer 3 domain.  No VLANs, no subnetting.  To be honest, at the time I had heard of VLANs but didn’t really understand what they were.  Today we all use private, RFC 1918 addressing for end hosts, except for DMZs.  Back then, our ISP assigned us a block of addresses and we simply applied the public addresses directly on the end-stations themselves.  That’s right, your laptop had a public IP address on it.  We didn’t know a thing about security;  both companies had routers connected directly to the Internet, without even a simple ACL.  I think most companies were figuring out the benefits of firewalls at the time, but we also had a false sense of security because we were Mac-based, and Macs were rarely hacked back then.

One day, I came into work at a now-defunct ad agency called Leagas Delaney.  Users were complaining that nothing was working–they couldn’t access the Internet and even local resources like printing were failing.  Macs didn’t even have ping available, so I tried hitting a few web sites and got the familiar hung browser.  Not good.

I went into Leagas’ server room.  The overhead lights were off, so the first thing I noticed were the lights on the switches.  Each port had a traffic light, and each light was solid, not blinking like they usually did.  When they did occasionally blink, they all did so in unison.  Not good either.  Something was amiss, but what?

Wireshark didn’t exist at the time.  There was a packet sniffer called Etherpeek available on the Mac, but it was pricey–very pricey.  Luckily, you could download it with a demo license.  It’s been over 20 years, so I don’t quite recall how I managed to acquire it with the Internet down and no cell phone tethering, but I did.  Plugging the laptop into one of the switches, I began a packet capture and immediately saw a problem.

The network was being aggressively inundated with packets destined to the subnet broadcast address.  For illustration, I’ll use one of Cisco’s reserved banks of public IP addresses.  If the subnet was 209.165.200.224/27, then the broadcast address would be 209.165.200.255.  Sending a packet to this address means it would be received by every host in the subnet, just like the broadcast address of 255.255.255.255.  Furthermore, because this address was not generic, but had the subnet prefix, a packet sent to that broadcast address could be sent through the Internet to our site.  This is known as directed broadcast.  Now, imagine you spoof the source address to be somebody else’s.  You send a single packet to a network with, say, 100 hosts, and those 100 hosts reply back to the source address, which is actually not yours but belongs to your attack target.  This was known as a smurf attack, and they were quite common at the time.  There is really no good reason to allow these directed broadcasts, so after I called my ISP, I learned how to shut them down with the “no ip directed-broadcast” command.  Nowadays, this sort of traffic isn’t allowed, most companies have firewalls, and they don’t use public IP addresses, so it wouldn’t work anyhow.
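The arithmetic behind the example subnet can be checked with Python’s standard `ipaddress` module. This is only a sketch of the numbers involved, not attack code; the `amplification` helper is a hypothetical name, and it assumes every host on the subnet answers each broadcast exactly once.

```python
import ipaddress

# Example subnet from the text above.
subnet = ipaddress.ip_network("209.165.200.224/27")

# The directed-broadcast target and the number of addresses hosts can use.
broadcast = subnet.broadcast_address
usable_hosts = subnet.num_addresses - 2  # minus network and broadcast addresses


def amplification(responding_hosts: int, packets_sent: int = 1) -> int:
    """Replies that reach the spoofed victim, assuming each host answers once."""
    return responding_hosts * packets_sent


print(broadcast)           # 209.165.200.255
print(usable_hosts)        # 30
print(amplification(100))  # 100 replies for one spoofed packet
```

A /27 only holds 30 hosts, so the 100-host figure in the story would need a larger subnet, but the multiplication is the same: one spoofed packet in, one reply per live host out.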

My second story is similar.  While still working for Mann, I was asked to fill in for one of our consultants who was permanently stationed at an ad agency as their in-house support guy.  He was going on vacation, and my job was to sit in the server room/IT office and hopefully not do anything at all.  Unfortunately, the day after he left a panicked executive came into the server room complaining that the network was down.  So much for a quiet week.

As I walked around trying to assess the problem, of course I overheard people saying “see, Jon leaves, they send a substitute, and look what happens!”  People started asking me whether I had “done” anything.

A similar emergency download of a packet sniffer immediately led me to the source of the problem.  The network was flooded with broadcast traffic from a single host, a large-format printer.  I tracked it down, unplugged it, and everything started working again.  And yet several employees still seemed suspicious I had “done” something.

Problems such as these led to the invention of new technologies to stop directed broadcasts and contain broadcast storms.  It’s good to remember that there was a time before these things existed, and before we even had free packet sniffers.  We had to improvise a lot back then, but we got the job done.

Before the Internet: The Bulletin Board System

It’s inevitable as we get older that we look back on the past with a certain nostalgia.  Nostalgia or not, I do think that computing in the 1980’s was more fun and interesting than it is now.  Personal computers were starting to become common, but were not omnipresent as they are now.  They were quite mysterious boxes.  An error might throw you into a screen that displayed hexadecimal with no apparent meaning.  Each piece of software had its own unique interface, since there were no set standards.  For some, there might be a menu-driven interface.  For others you might use control keys to navigate.  Some programs required text commands.  Even working with devices that had only 64 Kilobytes of memory, there was always a sense of adventure.

I got my start in network engineering in high school.  Computer networks as we understand them today didn’t really exist back then.  (There was a rudimentary Internet in some universities and the Defense Department.)  Still, we found ways to connect computers together and get them to communicate, the most common of which was the Bulletin Board System, or BBS.

The BBS was an individual computer equipped with a modem, into which other computer users could dial.  For those who aren’t familiar with the concept of a modem, this was a device that enabled computer data to be sent over analog telephone lines.    Virtually all BBS’s had a single phone line and modem connecting to a single computer.  (A few could handle multiple modems and callers, but these were rare.)  The host computer ran special BBS software which received connections from anyone who might dial into it.  Once the user dialed in, then he or she could send email, post messages on public message boards, play text-based video games, and do file transfers/downloads.  (Keep in mind, the BBS was text-only, with no graphics, so you were limited in terms of what you could do.)  An individual operator of a BBS was called a System Operator or Sysop (“sis-op”).  The sysop was the master of his or her domain, and occasionally a petty tyrant.  The sysop could decide who was allowed to log into the board, what messages and files could be posted, and whether to boot a rude user.

Because a BBS had a single modem, dialing in was a pain.  That was especially true for popular BBS’s.  You would set your terminal software to dial the BBS phone number, and you would often get a busy signal because someone else was using the service.  Then you might set your software to auto re-dial the BBS until you heard the musical sound of a ring tone followed by modems chirping to each other.

How did you find the phone numbers for BBS’s in the era before Google?  You might get them from friends, but often you would find them posted as lists on other BBS’s.  When we first bought our modem for my Apple II+, we also bought a subscription to Compuserve, a public multi-user dial-in service.  On one of their message boards, I managed to find a list of BBS’s in the 415 area code where I resided.  I dialed into each of them.  Some BBS’s on the list had shut down and I could hear someone saying “Hello??” through the modem speaker.  When others connected, I would set up an account and, after perusing the board, download a list of more BBS numbers and go on to try them.

Each sysop configured the board however seemed best, so the BBS’s tended to have a lot of variation.  The software I used–the most common among Apple II users–was called GBBS.  GBBS had its own proprietary programming language and compiler called ACOS, allowing heavy customization.  I re-wrote almost the entire stock bulletin board system in the years I ran mine.  It also allowed for easy exchange of modules.  I delegated a lot of the running of my board to volunteer co-sysops, and one of them wanted to run a fantasy football league.  He bought the software, I installed it, and we were good to go.  I had friends who ran BBS’s on other platforms that did not have GBBS, and their boards were far less customizable.

A funny story about that fantasy football sysop.  Back then the software came on floppy disks, and while I insisted on him mailing it to me, he insisted on meeting me in person and handing it over.  I was terrified of meeting this adult and revealing that I was only 14 years old.  I wanted everyone on the board to think I was an adult, not a teenager.  It helped project authority.  He wouldn’t budge, so we agreed to meet at a local sandwich shop.  Imagine my surprise when a 12-year-old walked in carrying the disks!  We had a nice lunch and I at least knew I could be an authority figure for him.  I suspect most of my users were no older than seventeen.

Each user on a BBS had a handle, which was just a screen name.  I’m somewhat embarrassed to admit that mine was “Mad MAn”.  I don’t really recall how I thought of the name, but you always wanted to sound cool, and to a 15-year-old “madman” sounded cool.  This was in the era before school violence, so it wasn’t particularly threatening.  I spelled it with two words because I didn’t know how to spell “madman”, and this was before every spelling mistake was underlined in red.  The second A was capitalized because I was a bad typist and couldn’t get my finger off the shift key fast enough.  Eventually I just adopted that as a quirk.  Because the BBS population consisted largely of nerdy teenage boys, a lot of the handles came from Lord of the Rings and other fantasy and sci-fi works.  I can’t tell you how many Gandalfs were floating around, but there were a lot.  I had a Strider for a co-sysop.  Other handles, like mine, attempted to sound tough.  I had another co-sysop whose handle was Nemesis.

Since each BBS was an island, if someone sent you an email on BBS1, you couldn’t see it on BBS2.  So, if you were active on five BBS’s, you had to log in to all five and check email separately.  At one point a sysop who went by the handle “Oggman” launched a system called OGG-Net.  (His BBS also had a cool name, “Infinity’s Edge”.)  Oggy’s BBS became a central repository for email, and subscribing boards would dial in at night to exchange emails they had queued up.  This of course meant that it could take an entire day for email to propagate from one BBS to another, but it was better than before.
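The nightly exchange is a classic store-and-forward pattern. Here is a minimal Python sketch of the idea; the `Hub` class and its method names are hypothetical, and the real OGG-Net protocol is not described in this post, so treat it purely as an illustration of why delivery could take up to a day.

```python
from collections import defaultdict


class Hub:
    """Hypothetical central mailbox: holds mail until the destination board calls in."""

    def __init__(self):
        # destination board name -> messages waiting for that board
        self.pending = defaultdict(list)

    def nightly_call(self, board_name, outgoing):
        """One dial-in: drop off queued outgoing mail, pick up whatever is waiting."""
        for dest_board, message in outgoing:
            self.pending[dest_board].append(message)
        return self.pending.pop(board_name, [])


hub = Hub()
# BBS1 calls first and leaves mail addressed to a user on BBS2.
got_bbs1 = hub.nightly_call("BBS1", [("BBS2", "Hello from BBS1")])
# BBS2 calls later and collects it; a reply would wait for the next round of calls.
got_bbs2 = hub.nightly_call("BBS2", [])
```

If BBS2 had happened to call before BBS1 that night, the message would have sat on the hub until BBS2’s next call, which is where the day-long propagation delay comes from.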

I’m writing this post in my “NetStalgia” series for a couple reasons.  First, it’s always important to look back in order to know where you are going.  Second, I’ve resurrected my old BBS using an Apple II emulator, and in my next post I’m going to share a few screen shots of what this thing actually looked like.  I hope you’ll enjoy them.

Interviewing #1: How I got my first networking job

I’ve wanted to kick off a series for a while now on technical interviewing. Let me begin with a story.

My first job interview for a full network engineering role was at the San Francisco Chronicle in 2000. I had been working for five years in IT, mostly doing desktop and end-user support. I then decided to get a master’s degree in telecommunications management, which didn’t help at all, followed by a CCNA certification, which got me the interview.

My first interview was with the man who would be my boss. Henry was a manager who had almost no technical knowledge about networking, but I didn’t know that at the time. “Do you know Foundry switches at all?” Henry asked.

“No.” I was already worried.

“I doubted you would. That’s ok because we want to replace them all with Cisco and you know Cisco.” He pulled out a network diagram and handed it to me. “If you look at this, do you see a problem?” he asked.

I had never worked on a network larger than a couple switches, and now I was staring at a convoluted diagram depicting the network of the largest newspaper in Northern California. I was looking at subnet masks, link speeds, and hostnames, trying to find something wrong.

“I’m not sure,” I had to reply meekly.

He pointed at the main core switch for the network. There was only one, with no redundancy.  “There’s a huge single point of failure,” he said. I felt stupid missing the forest for the trees.

Henry brought me upstairs to interview with Tom, who was an on-site project management contractor from Lucent. I was extremely nervous–Lucent (later Avaya) was a big name in the industry and this guy worked for them! Henry left me with Tom. Tom pulled out a copy of the same diagram Henry had shown me earlier.

“Do you notice anything wrong with this?” he asked.

“Wow, that’s a huge single point of failure,” I replied.

He nodded his head in approval. “That’s right–very good.” He asked me a technical question about supernetting. I answered nervously, although it quickly became clear I knew more than he did.

The door flew open and another guy named Vincent walked in. He was the desktop support contractor, but again I didn’t know that. “Ask Jeff a few technical questions,” Tom said.

“Question number one,” said Vincent. “If you were running a network this size, would you subnet it?”

Now the answer seemed obviously to be “yes”, but I was trying to figure out if this was a trick. “Yes,” I answered, deciding to play it safe.

“Good! Next question: Can you route NetBIOS?” My desktop years were almost exclusively dedicated to Macs and I didn’t even know what NetBIOS was. I figured it was a 50/50 shot, and the way he asked it seemed to suggest the answer.

“No,” I said, trying to sound confident.

“He’s good,” said Vincent.

Next, the door flew open again, and in walked Bing. Bing was carrying some sort of network device with her. She handed it to me. “Is this a switch or a hub?” she asked. There was no obvious labeling on it, and as I turned the device over and over again in my hands, I had a sinking feeling.

“I don’t know,” I replied.

“Look at this,” Bing said. She pointed to a collision light. “Since there is one on each port, you can tell this is a switch.”

We don’t have collision lights on switches any more, but at the time we did and she had a valid point. Realizing this, I explained to her that since a hub has a single collision domain, it would only have one collision light. I explained to her the concept of a collision domain, and how a switch worked versus a hub. It turns out she was a project manager for desktop support and she didn’t know any of that. Someone had just shown her the collision light thing and she thought it would be a good question.

“He’s good,” said Bing.

Tom had told me my next interview would be with a CCIE from Lucent. Now that was definitely intimidating. I knew of the reputation of CCIEs, and I didn’t expect to do well. The CCIE guy never showed up. As Tom was walking me to the elevator, however, we ran into him in the hallway. It turns out that Mike, who is still a friend of mine, and who later got three CCIE’s, had not passed the exam yet. We ended up talking about his home lab for a few minutes.

“He’s good,” said Mike. And a good thing too, as I’ve been in a couple of interviews with Mike and I’ve seen him grill people mercilessly.

I got a call with a job offer a few days later, and ended up working there five years.

For a while now I’ve had several posts in my drafts folder on the subject of technical interviewing. As you can see from the above story, interviews are often chaotic, disorganized, and conducted by unqualified people who have no plan. In the case of the San Francisco Chronicle, they made the right decision on me, and I don’t think anybody there would dispute that. I was thankful to begin my career in network engineering.

That said, I’ve had other interviews that didn’t go so well. Over the next few articles, I’d like to cover technical interviewing. Why do we interview people? How can we select good people from bad people? How worthwhile are the typical technical questions? Are gotchas worth throwing out “just to see how the candidate reacts”? Are interviews purely subjective or can we make them data-driven and objective?

I’ll throw out a few more anecdotes from my own experience to illustrate my points–feel free to comment with some of your own!

Moving carpets for $2000

I worked for two years at a Cisco Gold Partner.  The first year was great.  We were trying to start up a Cisco practice in San Francisco (they were primarily a Citrix partner before), so my buddy and I wined and dined Cisco channel account managers trying to impress them with our CCIE’s and get them to steer business our way.  Eventually, the 2009 financial crisis hit and business started to dry up.  The jobs became fewer and less interesting.  I had two CCIE’s and at one point, I drove out to Mare Island near San Francisco to install a single switch for a customer whose entire network consisted of–a single switch.  I always recommend that people not stay in jobs like this too long, as it hurts your prospects for future employment.

Potential Employer:  “So what kind of jobs have you done lately?”

You:  “Uh, I installed one switch at a customer.”

Anyhow, we had one other customer that managed to keep me surprisingly busy, considering their network was quite small as well.  They were a local builder with three small offices connected together with ASAs and VPN tunnels.  The owner was filthy rich and also paranoid about security, which meant I was out there a lot changing passwords, tightening up ACLs, and cleaning up the mess the last network engineer had left.

The owner had a ranch near Willits, CA which was reputed to be the size of the city of Concord, CA.  He also had two jets to take him to his private landing strip at his ranch.  Being a pilot myself, the prospect of a trip in a small jet to his ranch made me wish for some sort of network problems up there.  However, there wasn’t much up there for me to work on.  He had a single ASA 5505 connected to a satellite uplink which he primarily used to connect to the cameras (which he had everywhere) at the ranch.

One day, my contact at the builder told me the cameras weren’t reachable.  Yes!  Finally a trip in the jet.  We set a date and I spent my time wondering whether I’d get the Lear or the Citation.

Unfortunately, when the day rolled around, the weather was hideous.  A Lear jet can handle most any weather, but the little airstrip had no instrument approaches.  Instead, my contact gave me an alternative:  I was to drive up there with her in-house cabling contractor (I’ll call him “Tim”) to do the job.  (I never understood why a business this small had an in-house cabling contractor.  As far as I knew he didn’t work on the actual construction projects associated with the company.)  Now from San Francisco, the drive to Willits is about 2.5 hours.  However, the ranch was near Willits.  After driving 2.5 hours to Willits, we had another hour drive over dirt roads to the middle of nowhere.

The cabling contractor was exactly the sort of person with whom I have nothing in common, and spending 3.5 hours in a car with him, in the era before smartphones were a handy distraction, was painful.  Tim loved fishtailing his truck as we drove on dirt roads on the side of a mountain.  I think he also liked just scaring the white collar guy.  It worked.

We arrived at the ranch and Tim opened up the back of his pickup.  “Can you give me a hand here?” he asked.  In the bed of his truck were several large carpet rolls and piles of dry cleaning.  I grabbed one end of a carpet roll and began the backbreaking work.  My company was billing me out at $250/hour to haul some lady’s dry-cleaning into her ranch.

The ASA itself was mounted on a pole in the middle of the property, which had a satellite dish on top.  I was amazed the ASA 5505 even functioned out there, given that the external temperature could reach over 100 degrees Fahrenheit.  The metal box housing the ASA was like an oven.  I consoled into it and immediately saw a problem.  Latency on the link was over one second round-trip.  There was no way he was going to get real-time video streaming with this slow satellite uplink.  I reported my findings to Tim and, after eating lunch with the ranch hands, we hopped back in the truck.  Tim put on a song called “You piss me off, f*cking jerk” while we drove.  I guess he didn’t like me.
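For a rough sense of why satellite links are slow, here is a back-of-the-envelope Python calculation. It assumes the uplink used a geostationary satellite sitting directly overhead, which the story does not confirm; the function name is hypothetical, and the result is only a physical lower bound, before any modem, queuing, or ground-station delay.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786  # assumption: geostationary orbit, satellite directly overhead


def min_round_trip_seconds(altitude_km: float = GEO_ALTITUDE_KM) -> float:
    # A request travels up and down once, and the reply does the same: four legs.
    return 4 * altitude_km / SPEED_OF_LIGHT_KM_S


print(f"{min_round_trip_seconds():.2f} s minimum round trip")
```

Under that assumption, nearly half a second of the round trip is pure physics, so a measured latency of over one second is not surprising once equipment and processing delays are added.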

When I mentor people, I often tell them you have to know the right time to quit a job.  There were several signs in this story that it was time for a change.  With two CCIEs, installing a single switch or working on a single ASA 5505 was not really a good use of my skills.  Neither was moving in carpet rolls and dresses for $250/hour.  Luckily I had enough big jobs at the partner that I managed to get through my interviews at Juniper without trouble.

Meanwhile, a few years later I read about the FBI raiding the builder who was my customer.  I guess he had good reasons for cameras.

 

Moscone Microwave

My first full-time networking job was at the San Francisco Chronicle.  Now there isn’t much to the Chronicle anymore, but in the early 2000’s the newspaper was still going strong.  It was the beginning of the decline, but most people still took their local newspaper as their primary source of news.  Being a network engineer at a major metropolitan newspaper was fascinating.  It is a massive operation to print and distribute a newspaper every single day, and you can never, ever, miss.  There is no slippage of production deadlines.  It has to be out every day, and every day you start all over, with a blank page.

As the lead network engineer, I touched everything from editorial (the news and photography content of the paper) to advertising, pre-press, production systems, and circulation.  Every one of these was critical.  If editorial content didn’t make it through, there was nothing to go into the paper.  If advertising didn’t make it in, we didn’t earn revenue.  If pre-press or production had problems, the paper wasn’t printed.  If circulation wasn’t working, nobody could get their paper.

The Chronicle owned and operated three printing plants in the Bay Area.  One was on Army Street in San Francisco, while the other two were in Union City and Richmond in the East Bay.  The main office was on Fifth and Mission in downtown SF, so the paper was prepared in San Francisco and then sent to the plants via microwave.  That’s where I came in.

Our microwave system used a dish on the clock tower of our building.  From 5th and Mission we sent a signal up to Roundtop Mountain in the East Bay hills. At Roundtop we leased space in a little concrete bunker that was used for various kinds of radio communication including cellular.  From Roundtop we bounced the signal back to the three printing plants.

Chronicle building with the microwave visible on the clock tower

The microwave presented itself to us as T1 lines.  I had the T1 lines connected to dual routers at the main site and each of the plants.  In addition to the microwave, we had two additional backup T1’s to each plant which were landlines from different carriers with diverse paths into the buildings.  We kept the microwave and the first T1 plugged into the routers, with the third one on manual standby in case we needed it.  You don’t take chances with production in a newspaper, and we had triple redundancy on everything.  I used OSPF for redundancy between the microwave and #1 backup circuit on the routers, and HSRP for gateway redundancy.  With only four sites it was a simple enough topology and it never gave me much trouble.

Until, that is, the day when I got a call from our operations center that the primary circuits were all down.  We were running on backups.  I immediately called up the production systems engineer who managed the microwave and told him his circuits were down.  “Impossible!” he said, “that microwave is five-nines reliable.  Check your router!”  I tried a few of the usual fixes:  shut/no shut the interface, changing the line encoding, etc.  No go.  He wanted me to start swapping hardware, which was a big deal in a live newspaper environment, and seemed pointless.  If it was hardware, why would all of the circuits be down?
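For context on what “five nines” promises, the allowed downtime is simple arithmetic. The sketch below is a generic calculation, not anything from the microwave vendor’s specification, and the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes per year a link may be down and still meet the given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


five_nines = allowed_downtime_minutes(0.99999)
print(f"{five_nines:.2f} minutes per year")
```

Five nines works out to a little over five minutes of downtime a year, so an outage lasting even part of an afternoon uses up the budget for decades.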

We bickered a bit before I moved to have the tertiary backup circuits swapped in so we had automatic failover while we worked on the microwave.  I got out our old T-BERD tester to see if I could find any indication of the problem.  Then the systems engineer called:  “We need to meet at the clock tower, I’ve found the problem,” he said.  It’s always a relief to hear that when finger-pointing is going around.

T-BERD T1 Tester

I showed up at the entrance to the tower and followed the systems guy up a rusty ladder mounted to the wall.  Up in the tower there were bird droppings, and as I climbed higher I fought the urge to look down.  I’ve never much liked heights, and being out of shape while relying on my own strength to keep from falling several stories onto concrete was not promising.  Once I got to the top there was a large separation between the ladder and the floor, and I fought the urge to panic as I flung my leg way over to climb onto the concrete flooring.  From there we went outside and I saw the problem right away.

If you’ve ever been to a convention in San Francisco, chances are it took place in the Moscone Center.  In the early 2000s, the city decided to expand Moscone by building a new Moscone Center West on 4th and Howard streets.  And from up on the clock tower it was plain as day:  they had built a cooling tower on the roof right in the path of our microwave beam.  I looked at the systems guy and said, “Well, I guess you could make popcorn in that cooling tower.  Anyways, there goes your five nines.”

We hastily called meetings to decide what to do.  Sue the city?  Call the FCC?  Find another building to bounce the microwave off of?  Those were long-term solutions but we had an immediate problem.  Two circuits might seem like enough, but they were telco circuits and not as reliable as the microwave was, at least when its path wasn’t blocked.
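To put some rough numbers on why two circuits did not feel like enough, here is a minimal Python sketch of the availability arithmetic.  The function names are made up for illustration, the telco availability figure is hypothetical (I never measured ours), and the combined figure assumes the circuits fail independently of each other, which is optimistic for landlines.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year for a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(availabilities) -> float:
    """Availability of redundant circuits in parallel.

    Assumes failures are independent: the path is down only when
    every circuit is down at the same moment.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


if __name__ == "__main__":
    five_nines = 0.99999   # the figure claimed for the microwave
    telco = 0.999          # hypothetical figure for a single landline T1
    print(downtime_minutes_per_year(five_nines))
    print(downtime_minutes_per_year(telco))
    print(combined_availability([telco, telco]))
```

Five nines works out to a little over five minutes of downtime a year, while a single three-nines circuit allows several hours.  Two such circuits in parallel look fine on paper, but only as long as the independence assumption holds.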

Getting the city to cut the cooling tower off Moscone West was a non-starter, especially when it was the newspaper asking, a newspaper that made its money being critical of city officials.  So, we decided to lease roof space from another building and add an additional repeater.  However, this was a long process.  We needed to negotiate with the landlord, replan the radio deployment, license it and obtain permits, add the new repeater, and re-point the old dish to the new building.  That last item was not as simple as it sounded, since this wasn’t a DirecTV dish.  It was welded to the tower, so we needed to hire ironworkers to cut it off and re-position it.

In the meantime, we ordered T1s from downtown SF up to Roundtop to bypass the segment that wasn’t working.  We’d go hard wire to Roundtop, then microwave the rest of the way.  This was not, by any means, an ideal solution, nor was it an overnight solution, but we could at least get some redundancy faster than it would take to add the repeater.  I’m glad we did because shortly after the microwave went down we started having terrible problems with the landlines and needed the triple redundancy.

If you drive by Fifth and Mission now, the microwave dish is gone from the clock tower.  The Chronicle, a shadow of its former self, no longer operates its own printing plants, and has a circulation far smaller than it did in 2004, when I left.  As I said in my last post, it’s great to have a sense of purpose when you work in IT.  It wasn’t about fixing a microwave but about getting that paper in the hands of our readers.  I’m thankful I got to be a part of that for a few years, even if it cost me some vertigo and sleepless nights.

A Passive Star

I was hoping to do a few technical posts but my lab is currently being moved, so I decided to kick off another series of posts I call “NetStalgia”.  The TAC tales continue to be popular, but I only spent two years in TAC and most cases are pretty mundane and not worthy of a blog post.  What about all those other stories I have from various times and places working on networks?  I think there is some value in those stories, not the least because they show where we’ve come from, but also I think there are some universal themes.  So, allow me to take you back to 1995, to a now-defunct company where I first ventured to work on a computer network.

I graduated college with a liberal arts degree, and like most liberal arts majors, I ended up working as an administrative assistant.  I was hired on at a company that both designed and built museum exhibits.  It was a small company, with around 60 people, half of whom worked as fabricators, building the exhibits, while the other half worked as designers and office personnel.  The fabricators consisted of carpenters, muralists, large and small model builders, and a number of support staff.  The designers were architects, graphic designers, and museum design specialists.  Only the office workers/designers had their own computers, so it was quite a small network of 30 machines, all Macs.

When the lead designer was spending too much time on maintaining the computer network, the VP of ops called me in and asked me to take over, since I seemed to be pretty good with computers and technical stuff, like fixing the fax machine.

Back then, believe it or not, PCs did not come with networking capabilities built in.  You had to install a NIC if you wanted to connect to a network.  Macs actually did come with an Apple-proprietary interface called LocalTalk.  The LocalTalk interface consisted of a round serial port, and with the appropriate connectors and cables, you could connect your Macs in a daisy-chain topology.  Using thick serial cables with short lengths to network office computers was a big limitation, so an enterprising company named Farallon came up with a better solution, called PhoneNet.  PhoneNet plugged into the rear LocalTalk port, but instead of using serial cables it converted the LocalTalk signal so that it ran on a single twisted pair of wires.  The brilliance of this was that most offices had phone jacks at every desk, and PhoneNet could use the spare wires in the jacks to carry its signal.  In our case, we had a digital phone system that consumed two pairs of our four-pair Cat 3 cables, so we could dedicate one to PhoneNet/LocalTalk and call it good.

PhoneNet connector with resistor

We used an internal email system called SnapMail from Casady & Greene.  SnapMail was great for small companies because it could run in a peer-to-peer mode, without the need for an expensive server.  In this mode, an email you sent to a colleague went directly to their machine.  The obvious problem with this is that if I work the day shift, and you work the night shift, our computers will never be on at the same time and you won’t get my email.  Thankfully, C&G also offered a server option for store-and-forward messaging, but even with the server enabled it would still attempt a peer-to-peer delivery if both sender and receiver were online.

One day I started getting complaints about the reliability of the email system.  Messages were being sent but not getting delivered.  Looking at some of the trouble devices, I could see that they were only partially communicating with each other and the failed messages were not being queued in the server.  This was because the peers seemed to think each other was online, when in fact there was some communication breakdown.
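The failure mode is easier to see as a decision tree.  What follows is a minimal Python sketch of the behavior as I understood it from the outside, not SnapMail’s actual code.  The `Mailbox` class, the `deliver` function, and the split between “appears online” and “reachable” are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    name: str
    appears_online: bool  # what the sender's machine believes
    reachable: bool       # whether traffic actually gets through
    inbox: list = field(default_factory=list)


def deliver(message: str, recipient: Mailbox, server_queue: list) -> str:
    """Try peer-to-peer first; fall back to the server only if the peer looks offline."""
    if recipient.appears_online:
        if recipient.reachable:
            recipient.inbox.append(message)
            return "delivered"
        # The peer looked online, so the server fallback was never tried.
        return "lost"
    # Peer looks offline: store the message on the server for later delivery.
    server_queue.append(message)
    return "queued"


if __name__ == "__main__":
    queue = []
    flaky = Mailbox("designer", appears_online=True, reachable=False)
    print(deliver("Lunch?", flaky, queue), queue)
```

The interesting branch is the middle one: a half-working network made peers look online, so messages took the peer-to-peer path and never landed in the server queue.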

Determining a cause for the problem was tough.  Our network used the AppleTalk protocol suite and not IP.  There was no ping to test connectivity.  I had little idea what to do.

As I mentioned, PhoneNet used a single pair of phone wiring, and as we expanded, the way I added new users was as follows:  when a new hire came on board, I would connect a new phone jack for him, and then go to the 66 punch-down block in a closet in the cafeteria and tie the wires into another operative jack. Then I would plug a little RJ11 with a resistor on it in the empty port of the LocalTalk dongle, because the dongle had a second port for daisy-chaining and this is what we were supposed to do if it was not in use.  This was a supported configuration known in PhoneNet terminology as a “passive star”.  Passive, because there was nothing in between the stations.  This being before Google, I didn’t know that Farallon only supported 4 branches on a passive star.  I had 30.  Not only did we have too many stations and too much cable length, but the combined resistance on this giant circuit was huge because of all the resistors.

I had a walkthrough with our incredulous “systems integrator”, who refused to believe we had connected so many devices without a hub, which was called a “Star Controller” in Farallon terminology.  When he figured out what I had done, we came up with a plan to remove some of the resistors and migrate the designers off of the LocalTalk network.

Some differences between now and then:

  • Networking capability wasn’t built in on PCs, but it was on Macs.
  • I was directly wiring together computers on a punch-down block.
  • There was no Google to figure out why things weren’t working.
  • We used peer-to-peer email systems.

Some lessons that stay the same:

  • Understand thoroughly the limitations of your system.
  • Call an expert when you need help.
  • And of course:  don’t put resistors on your network unless you really need to!