Archives

All posts by ccie14023

Getting a session at Cisco Live is not a given, even for a Principal TME.  I started at Cisco in October 2015, and I certainly didn’t expect to present at, or even go to, Cisco Live Berlin in January 2016.  Normally, there are three ways to secure a session at CL:

  1. Submit an idea during the “Call for papers” process about six months before a given event.  The Session Group Managers (SGMs) who manage individual tracks (e.g., security, data center) then must approve it.  It can be hard to get that approval.  SGMs need to have faith in your ability as a presenter, as well as believe your topic is relevant and unique.
  2. Carry over a session from the previous CL.  If you’ve presented a session, you can check a box in the Cisco Live management tool to ask for it to be carried over to the next CL.  Again, this is dependent on the SGMs approving it.
  3. Be handed a session by somebody who had it approved, but is not going to present it.  Perhaps they are leaving, taking a new job, or just don’t want to do it any more.  Usually the SGMs need to approve any re-assignments as well.  (You can see the SGMs are rather powerful for CL.  It helps to know them well, and it hurts when they turn over!)

When I joined Cisco, I was assigned to a wonderful and humble team working on programmability.  We met and discussed the various assignments and approvals we had for Cisco Live, and they kindly offered my two:  booth duty for an Ansible with Nexus demo, and a session at DevNet called “Automation with NXOS:  Let’s get started!”

A Principal TME is a director-level position, and normally a PTME would not be expected to spend all day at a booth.  However, since I was new to the company, position, and team, I decided it would be a good idea to do some of the grunt work a normal TME does, and experience booth duty.

My 2016 Berlin CL Booth before opening

As for the DevNet session:  DevNet, Cisco’s developer enablement team, runs a large section of the Cisco Live show floor.  DevNet has theaters which are open-seating and divided from the rest of the show floor.  A typical CL breakout takes place in a room, whereas DevNet sessions are out in the open.  In 2016, it was pretty easy to get DevNet sessions, and nobody cared when the team re-assigned it to me.  I had a free trip to Europe and plenty to do when I got there.

What you see at Cisco Live is the fruit of months of preparation.  I had to develop the entire booth demo from scratch–I was supposed to have help from another TME from a different team, but he was totally worthless.  I set up the lab and wrote the demo script myself.  For the DevNet session, I pulled together slides from my colleagues and did my best to master them.  Keep in mind, I came to Cisco after six years at Juniper.  I didn’t know a thing about Nexus, and programmability was new to me.

Every new speaker at CL is required to undergo speaker training, so I signed up for mine.  In 1 hour the non-technical trainer gave me a few pointers.  I’ve been through enough speaker training in the past that it wasn’t terribly helpful, but the box was checked.

Arriving in Berlin, I registered at CL and as a speaker and staffer was given an “all-access” pass.  I could go anywhere at the show.  Personally, I’ve always loved having backstage access to anything, and so I headed to the World of Solutions (WoS, the show floor) and spent a long time trying to find my booth.  WoS before it opens is a genuine mess–people running cherry-pickers and forklifts, laying down carpet, and well-dressed booth people all contending for space.  There are usually challenges getting the demo computers up and running, connected to demos back home, etc.

WoS is a mess when we arrive

Working a booth can be frenetic or boring.  The positioning of my booth and the content of the demo (Ansible automation of NXOS) did not generate a lot of traffic.  I spent hours standing at the booth with nothing to do.  For the occasional customer who would show an interest, I’d run the demo, and possibly do a little white boarding.  Then, reset the demo and wait for the next guy. It wasn’t a lot of fun.

Eventually, the time came for the DevNet session.  I was really nervous for my first time in front of a CL audience. Would I mess up?  Would I choke up due to nerves?  Would my audience ask questions I couldn’t answer?

Seeing your own name on the board is exciting and nerve-wracking

As I said, the DevNet sessions are presented out on the show floor, and it’s a terrible speaking environment.  It’s noisy, you cannot hear yourself, and the participants were given headphones so they could hear.  It was like speaking into a void.  I remember one gentleman bouncing between sleep and wakefulness, his head nodding down and then coming alive again.  The presentation was not one of my best, but I got the job done acceptably and the participants filled out their paper score sheets at the end.  I mostly had 5’s, and a few 4’s.  At that point DevNet sessions did not receive an official score, so my numbers didn’t “count”, but I could show them to my boss and get some credit, at least.

My wife had traveled with me and we took a few sightseeing trips.  We saw the amazing museums on Berlin’s “museum island” and also hired a driver to give us a tour.  We had several team events around the city–Cisco Live is famous for parties–and ate some very good German food.  One of my colleagues was well known for arranging parties that went until four or five AM, and many TMEs would show up to their 8am session with only a couple hours of sleep.  In fact, one of the other Hall of Fame Distinguished Speakers claims this was his secret to success!  I myself, avoid parties like that and spend hours in my room practicing my presentation before giving it.  To each his own, I suppose.

Ah, the perks of Cisco Live!

Network engineers are a breed unto ourselves, and I think we have a distinct feeling of community.  Our field is highly specialized, and because we often have to defend our domain from those who do not understand it (“it’s not the network, ok?!”), we have a camaraderie that’s hard to match.  I left Berlin on a real high, feeling more a part of that community than ever having been there in a Cisco uniform, and having gotten up in front of an audience.  I didn’t know what my future held at Cisco, but it was the first of many such experiences to come.

The last Cisco Live I attended was in Barcelona in January 2020.  As I was in the airport heading home, I was reading news of a new virus emerging from China.  I looked with bemusement at a troop of high-school-age girls who all had surgical masks on.  Various authorities told us not to wear masks, saying they don’t do much to prevent viral spread at a large scale.  The girls kept pulling the masks on and off.  I thought back on my performance at Cisco Live, and looked forward to Cisco Live in Las Vegas in the summer.  Who knew that, a few months hence, everyone would be wearing masks and Cisco Live,  physically, would be indefinitely postponed.

For Technical Marketing Engineers (TMEs), Cisco Live (technically Cisco Live!) measures the seasons of our year like the crop cycle measures the seasons of a farmer’s year.  Four times annually a large portion of our team would hop on an airplane and depart for Europe, Cancun, Melbourne, or a US city.  Cancun and Melbourne were constant, but the European and US cities would change every couple of years.  In my time with Cisco, I have traveled to Cancun and Melbourne, Berlin, Barcelona, Las Vegas, Orlando, and San Diego to present and staff Cisco Live.

A trade show may just be a corporate event, but for those of us who devoted our career to that corporation’s products, it’s far more than a chance for a company to hawk its products.  The breakout sessions and labs are critical for staying up-to-date on a fast-moving industry, the keynotes are always too high-level but with entertaining productions, and the parties are a great chance to connect with other network engineers.  CL was fun for participants, exhausting for those of us staffing it, but still my favorite part of the job.

Cisco Live was originally called Networkers, and started in 1990.  For many years I badly wanted to go to this temporary Mecca of networking technology, but I worked for companies that would not pay the cost of a badge and the travel fees, a total of thousands of dollars.  Even when I first worked at Cisco, from 2005-2007, as a lowly TAC engineer I never had the opportunity to attend.  My first trip to CL came in 2007, when I was working for a Gold partner.  They sent several of us to the Anaheim show, and I remember well the thrill of walking into a CL for the first time.  I walked the show floor, talked to the booth staffers, and attended a lot of breakout sessions of varying quality.  I was quite excited to go to the CCIE party, but I’m not sure why I thought a party full of CCIEs would really be all that exciting.  I remember hanging out by myself for an hour or so before I gave up because I didn’t know anyone there.

The same partner sent me to Orlando in 2008 as well, just barely.  The recession was starting and we were short on cash.  My boss wanted me to share a badge with a colleague, and I didn’t like the idea of having to juggle time slots nor or trying to explain to security how my name could be “Nguyen”.  Thankfully, they ponied up the cash for a second badge.  I’m not a fan of loud music, so I generally don’t go to the CL party, but for Orlando they opened up Universal Studios for us and the aforementioned Nguyen and I, along with a couple others, had a great time on the rides and attending the Blue Man Group.  (OK, some loud music there, but it is an entertaining show.)

I attended CL once more before I came back to Cisco–in 2014, ironically, as an employee of Juniper.  Somehow I convinced my boss to give me a pass on the grounds of researching what Cisco IT was doing.  (They do present at Cisco Live.)  I remember sitting in just a few rows back at the keynote as John Chambers presented, amused I’d be bringing a report back to Juniper about what I’d heard.

 

My view of Chambers at CL 2014

It was actually at Cisco Live when I first got the idea to be a technical marketing engineer.  It’s a bit embarrassing, but I sat in a presentation given by a TME and thought, “I could do better than this guy.”  It took a few years, but I finally managed to get into tech marketing.

I became a Principal TME at Cisco in late 2015 and was told I’d be presenting at Cisco Live in Berlin in January, 2016!  Needless to say, I was thrilled to be given the opportunity, humbled, and more than a little nervous about standing up in front of an audience at the fabled event.

It’s been a sad year in so many ways.  After I came home from Barcelona in January 2020, I received another Distinguished Speaker award and knew I would be inducted into the Hall of Fame.  This was a dream of mine for years, but instead of standing up in front of my peers at Cisco Live Vegas to receive the award, it was mailed to me.  There would be no show floor, no breakouts, no CCIE parties.  The event would go virtual.  I must say, I am impressed with the CL team’s ability to pivot to a virtual format in so short a time.  Still, it was a sad year for those of us who organize the event, and those who were hoping to attend.

In the next couple posts, I thought I would offer a little behind-the-scenes look at how we put on CL, and look at a few events from the past.

My first IT job was at a small company in Novato, California, that designed and built museum exhibits.  At the time most companies either designed the exhibits or built them, but ours was the only one that did both.  You could separate the services, and just do one or the other, but our end-to-end model was the best offering because the fabricators and designers were in the same building and could collaborate easily.  The odd thing about separating the functions was that we could lose a bid to design a project, but win the bid to build it, and hence end up having to work closely with a competitor to deliver their vision.

A museum exhibit we designed and built

The company was small–only 60 employees.  Half of them were fabricators who did not have computers, whereas the other half were designers and office staff who did.  My original job was to be a “gopher” (or go-fer), who goes for stuff.  If someone needed paint, screws, a nail gun, fumigation of a stuffed tiger, whatever, I’d get in the truck and take care of it.  However, they quickly realized I was skilled with computers and they asked me to take over as their IT guy.  (Note to newbies:  When this happens, especially at a small company, people often don’t forget you had the old job.  One day I might be fixing a computer, then the next day I’d be hauling the stuffed tiger.)

This was in the mid-1990’s, so let me give you an idea of how Internet connectivity worked:  it didn’t.  We had none when I started.  We had a company-internal network using LocalTalk (which I described in a previous post), so users could share files, but they had no way to access the Internet at all.  We had an internal-only email system called SnapMail, but it had no ability to do SNMP or connect beyond our little company.

The users started complaining about this, and I had to brainstorm what to do when we had virtually no operating budget at all.  I pulled out the yellow pages and looked under “I”, and found a local ISP.  I called them, and the told me I could use Frame Relay, a T1, or ISDN.  I had no idea what they were talking about.  The sales person faxed me a technical description of these technologies, and I still had no idea what they were talking about.  At this point I didn’t know the phone company could deliver anything other than, well, a phone line.  I wasn’t at the point where I needed to hear about framing formats and B8ZS line encoding.

We decided we could afford neither the ongoing expense, nor the hardware, so we came up with a really bad solution.  We ordered modems for three of the computers in the office:  the receptionist, the CEO, and the science researcher.  For those of you too young to remember, modems allow you to interface computers using an ordinary phone line.  We ordered a single phone line (all we could afford).  When one of them wanted to use the Internet, they would run around the office to check with the other two if the line was free.

A circa-1990’s Global Village modem

The reason we gave the receptionist a modem is amusing.  Our dial-up ISP allowed us to create public email addresses for all of our employees.  However, they all dumped into one mailbox.  The receptionist would dial in in the morning, download all the emails, and copy and paste them into the internal email system.  If somebody wanted to reply, the would send it to the receptionist via SnapMail and she would dial up, paste it into the administrator account, and send it.  Brilliant.

Needless to say, customer satisfaction was not high, even in those days.  Sick of trying to run IT with no money, I bailed for a computer consulting company in San Francisco and started installing the aforementioned T1s and ISDN lines for customers, with actual routers.

If ever you’re annoyed with slow Wi-Fi, be glad you aren’t living in the 1990’s.

After I left TAC I worked for two years at a Gold Partner in San Francisco.  One of my first customers there was one of my most difficult, and it all came down to timing.

I was dispatched to perform a network assessment of a small real-estate SaaS company in the SF East Bay.  Having just spent two years in TAC, I had no idea how to perform a network assessment, and unfortunately nobody at the partner was helping me.  I had been told they had a dedicated laptop loaded with tools just for this purpose, but nobody could locate it.  I started downloading tools on my own, but I couldn’t find a good, free network analysis tool.  Another engineer recommended a product called “The Dude” from MikroTik, and since it was easy to install I decided to use it.  I needed to leave it collecting data for a few days, and since nobody had provided me an assessment laptop I had to leave my own computer there.  I distinctly remember the client asking me what tool I was using to collect data, and sheepishly answering “Uh, it’s called The Dude.”  He looked at me skeptically.  (Despite the name, the tool was actually quite decent.)

Without any format or examples for an assessment, I looked at bandwidth utilization, device security, IOS versions, and a host of other configuration items.  The network was very simple.  It was a small company with a handful of switches in their main office, and a T1 line connecting them to a satellite office in LA.  They used a Cisco VoIP system for phones, and the satellite phones connected over the T1 back to the main campus.  I wrote up a document with recommendations and presented it to the customer.  Almost everything was minor, and they agreed to have me come back in and make a few upgrades.

One item I noted in the assessment was that the clocks on the routers and switches were set incorrectly.  The clock has absolutely nothing to do with the operation of the device, but having just come from TAC I knew how important device clocks are.  If there is a network-wide incident, one of the first things we look at is the logging messages across the network, and without an accurate device clock we cannot properly compare log messages across multiple network devices.  We need to know if the log message on this router happened at the same time as that other log message on that switch, but if the clocks are set to some random time, it is  difficult or impossible.

I proceeded to make my changes, including synchronizing the clocks to NTP and doing a few IOS upgrades.  Then I closed out the work order and moved on to other clients.  I thought I was done, but I sure wasn’t.

We started getting calls from the customer complaining that the T1 between the two offices wasn’t working right.  I came over and found that it had come back up, but soon they were calling again.  And again.  I had used up all the hours on our contract, and my VP was not keen on me providing services for free.  But our client insisted the problem began as a result of my work, and I had to fix it.  Nothing I had done was major, but I went back and reverted to the old IOS (in case of a bug) and reverted to saved configs (which I had kept.)

With the changes rolled back, the problem kept happening.  The LA office was not only losing its Internet connectivity, but was dealing with repeated voice outages.  The temperature of the client was hot.  I opened a TAC case and had an RMA sent out for the WIC, the card that the T1 connected to.  I replaced it and the problem persisted.  At this point I was insisting it could not have been my fault, since I rolled back the changes, but the customer didn’t see it that way and I don’t blame them.

The customer called up their SBC (now ATT) rep, basically a sales person, to complain as well.  He told her a consultant had been working on his network, and she asked what had been changed.  He said “the clock on the router” and she immediately flagged that as the problem.  Sadly, the rep mistook the router clock, which has no effect on operations, with the T1 clocking, which does.  I never touched the T1 clocking.  I knew the sales rep as she had been my sales rep years before at the San Francisco Chronicle, and I knew she was a non-technical sales person who had no idea what she was talking about.  Alas, she had planted the seeds in the customer’s mind that I had messed everything up by touching the clock.  I pleaded my two CCIE’s and two years of TAC experience to try to persuade this customer that the router clock has zero, nada, zilch to do with the T1, to no avail.

The customer then, being sick of us, hired another consultant who got on the phone with SBC.  It turns out there was an issue with the line encoding on the T1, which SBC fixed and the problem went away.  The new consultant looked like a hero, and the next we heard from the client was a letter from a lawyer.  They were demanding their money back.

It’s funny, I’ve never really had another charge of technical incompetence leveled at me.  In this case I hadn’t done anything wrong at all, but the telco messed up the T1 line around the same time as I made my changes.  So I guess in more than one way, you could say it was a matter of bad timing.

I have written more than once (here and here, for example) about my belief that technological progression cannot always be considered a good thing.  We are surrounded in the media by a form of technological optimism which I find disconcerting.  “Tech” will solve everything from world hunger to cancer, and the Peter Thiels of the world would have us believe that we can even solve the problem of death.  I don’t see a lot of movies these days, but there used to be a healthy skepticism of technological progress, which was seen as a potential threat to the human race.  Some movies that come to mind:

  • Demon Seed (1977), a movie I found profoundly disturbing when I was taken to see it as a child. I have no idea why anyone took a child to this movie, by the way.  In it, a scientist invents a powerful supercomputer and uses it to automate his house.  Eventually, the computer forms a prosthesis and uses it to inseminate the doctor’s wife to produce a hybrid human-computer being.
  • The Terminator (1984) presented a world in which humans were at war with computers and robots.
  • The Matrix (1999), a move I actually don’t like, nevertheless presents us with a world in which, again, computers rule humans, this time to the point where we have become fuel cells.

Most readers have certainly heard of the latter two, but I’m guessing almost none have heard of the first.  I could go on.  From West World (1973) to RoboCop (1987), there has been movie after movie presenting “tech” not as the key to human progress, but as a potential destroyer of the human race.  I suspect the advent of nuclear weapons had much to do with this view, and with the receding of the Cold War and the ever-present nuclear threat, maybe we are lest concerned about the destruction our inventions are capable of.

The other day I was thinking about my own resistance to Apple AirPods.  It then occurred to me that they are only one step away from the “Cerebral Communicator” in another movie you probably haven’t heard of, The President’s Analyst.

Produced in 1967, (spoilers follow!) the film features James Coburn as a psychoanalyst who is recruited to provide therapy to the President of the United States.  At first excited by the prospect, Coburn himself quickly becomes overwhelmed by the stress of his assignment, and decides to flee.  He is pursued by agents of the CIA and FBI (referred to as CEA and FBR in the movie), the KGB, and the mysterious organization called “TPC”.  Filmed in the psychedelic style of the time, with numerous double-crossings and double-double-crossings, the movie is hard to follow.  But, in the end, we learn that this mysterious agency called “TPC” is actually The Phone Company.

1967 was long before de-regulation, and there was a single phone company in the United States controlling all telephone communications.  It was quasi-governmental in size and scope, and thus a suitable villain for a movie like The President’s Analyst.  The ultimate goal of TPC is to implant mini-telephones into people’s brains.  From Wikipedia:

TPC has developed a “modern electronic miracle”, the Cerebrum Communicator (CC), a microelectronic device that can communicate wirelessly with any other CC in the world. Once implanted in the brain, the user need only think of the phone number of the person they wish to reach, and they are instantly connected, thus eliminating the need for The Phone Company’s massive and expensive-to-maintain wired infrastructure.

I’ve only seen the movie a couple times, but I wonder if the AirPods implanted in people’s ears remind me too much of the CC.  Already we have seen the remnants of the phone company pushing us to ever-more connectivity, to the point where our phones are with us constantly and we stick ear buds in our heads.  Tech companies love to tell us that being constantly connected to one another is the great path forward for humanity.  Meanwhile, we live in a time as divided as any in history.  If connecting humanity were the solution to the world’s problems, why do we seem to be in a state of bitter conflict?  I wonder if we’ve forgotten the lesson of the Babel Fish in Douglas Adams’ science fiction book, Hitchhiker’s Guide to the Galaxy.  The Babel fish is a convenient device for Adams to explain a dilemma in all science fiction:  how do people from different planets somehow understand each other?  In most science fiction, the aliens just speak English (or whatever) and we never come to know how they could have learned it.  But Adams uses his fictional device to make an amusing point:

The Babel fish is small, yellow, leech-like, and probably the oddest thing in the Universe…if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish…[T]he poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.

 

In 1998 I left my job as a computer “consultant” to pursue a master’s degree in Telecommunications Management.  I was stuck in my job, tired of troubleshooting people’s email clients and installing Word on their desktops, and was looking for a way to make a leap into bigger and better things.  That did happen–although not how I expected–but meanwhile for two years I needed to support myself while achieving my degree.  I took the easy path and stole a client from my previous employer.  This was at the height of the dotcom boom, and he was frankly too busy to even notice.  For a couple years I worked part-time at my advertising agency client, setting up computers, managing the servers, running the fairly simple network they had implemented.  It was a good deal for both of us, as the office manager had responsibility for IT and little inclination to work on technology.

Anyone who was around back then will remember the “Y2K” scare.  As we approached the year 2000, someone realized that many computers had been storing the year part of dates with only the last two digits.  For example, a program would store “98” instead of “1998.”  This meant that, as we moved into the new millennium, the year 2001 would be interpreted by these systems to be “1901”.  A legitimate problem, to be sure.  Some software, such as banking programs running on mainframes, could act unexpectedly and potentially lead to serious consequences.

However, the panic that followed was completely out of proportion to the threat.  Instead of focusing on the limited systems that would likely experience problems, a paranoia built up fueled by newspeople who had no idea what they were talking about.  People became concerned that, on January 1, 2000, our entire society would melt down and we would descend into chaos as the financial system melted down, stoplights stopped working, water and power systems crashed, and millions prepared to regress to living like cavemen.  A brilliant ad from Nike from just before the millennium captured the overall belief of what would happen come New Year’s Day.  (Sadly, the ad looks more realistic for 2020 than it did for 2000).

The ad agency I was working was told by their parent company that every single piece of electronic equipment they had needed to be Y2K-certified.  The downside of capitalism is that armies of opportunists arise (sound familiar?) to take advantage of social panics like these.  In fact, for such opportunists, exacerbating the panic is in their best interests.  Thus were spawned legions of “Y2K consultants”, non-technical business-types telling us that even our smoke alarms needed to be Y2K-certified and receive a literal sticker of approval.

When my office manager boss told me I needed to certify every piece of equipment I controlled my response was as follows:  “It doesn’t matter whether a network switch has the Y2K problem or not.  Most of the network gear was built recently and doesn’t even have this issue.  But if the vendor did make the mistake of storing the date with two digits instead of four, the worst that will happen is the time stamp on the log will be off.  Everything will still work.  So, rather than bill out hours and hours of me doing the certification, I’m going to do nothing.  And if I’m wrong, you can personally sue me for negligence.”

My boss didn’t care much about corporate HQ and took my advice.  New Year’s Eve, 2000, came and went.  A lot of people, expecting the worst, stayed home.  I went out and partied, well…like it was 1999.  And nothing happened.  Of course, the Y2k consultants patted themselves on the back for a job well done.  If the army of MBA’s hadn’t saved the day with their stickers, life would have ground to a halt.  I, on the other hand, knew the reality.  A panic was whipped up by the media, a bunch of opportunists swooped in to make a buck of the crisis, and helped to whip it up further, and life would have gone on either way.

I’ve been revising my Cisco Live session on IOS XE programmability, and it’s made me think about programming in general, and a particular idea I’ve been embarrassed to admit I loathe: Object Oriented Programming.

Some context:  I started programming on the Apple II+ in BASIC, which shows my age.  Back then programs were input with line numbers and program control was quite simple, consisting of GOTO and GOSUB statements that jumped around the lines of code.  So, you might have something that looked like this:

10 INPUT "Would you like to [C]opy a File or [D]elete a file?"; A$
20 IF A$ = "C" THEN GOTO 100
30 IF A$ = "D" THEN GOTO 200

This was not really an elegant way to build programs, but it was fairly clear.  Given that code was entered directly into the DOS CLI with only line-by-line editing functionality, it could certainly get a bit confusing what happened and where when you had a lot of branches in your code.

In college I took one programming course in Pascal.  Pascal was similar in structure to C, just far more verbose.  Using it required a shift to procedural-style thinking, and while I was able to get a lot of code to work in Pascal, my professor was always dinging me for style mistakes.  I tended to revert to AppleSoft BASIC style coding, using global variables and not breaking things down into procedures/functions enough.  (In Pascal, a procedure is simply a function that returns no value.)  Over time I got used to the new way of thinking, and grew to appreciate it.  BASIC just couldn’t scale, but Pascal could.  I picked up C after college for fun, and then attempted C++, the object-oriented version of C.  I found I had an intense dislike for shifting my programming paradigm yet again, and I felt that the code produced by Object Oriented Programming (OOP) was confusing and hard to read.  At that point I was a full-time network engineer and left programming behind.

When I returned to Cisco in 2015, I was assigned to programmability and had to learn the basics of coding again.  What I teach is really basic coding, scripting really, and I would have no idea how to build or contribute to a large project like many of those being done here at Cisco.  I picked up Python, and I generally like it for scripting.  However, Python has OO functionality, and I was once again annoyed by confusing OO code.

In case you’re not familiar with OOP, here’s how it works.  Instead of writing down (quite intuitively) your program according to the operations it needs to perform, you create objects that have operations associated with them according to the type of object.  An example in C-like pseudocode can help clarify:

Procedural:

rectangle_type: my_rect(20, 10)
print get_area(my_rect)

OOP:

rectangle_type: my_rect(20, 10)
print my_rect.get_area()

Note that in OOP, we create an object of type “rectangle_type” and then is has certain attributes associated with it, including functions.  In the procedural example, we just create a variable and pass it to a function which is not in any way associated to the variable we created.

The problem is, this is counter-intuitive.  Procedural programming follows a logical and clear flow.  It’s easy to read.  It doesn’t have complex class inheritance issues.  It’s just far easier to work with.

I was always ashamed to say this, as I’m more of a scripter than a coder, and a dabbler in the world of programming.  But recently I came across a collection of quotes from people who really do know what they are talking about, and I see many people agree with me.

How should a network engineer, programming neophyte approach this?  My advice is this:  Learn functional/procedural style programming first.  Avoid any course that moves quickly into OOP.

That said, you’ll unfortunately need to learn the fundamentals of OOP.  That’s because many, if not most Python libraries you’ll be using are object oriented.  Even built in Python data-types like strings have a number of OO functions associated with them.  (Want to make a string called “name” lower-case?  Call “name.lower()”) You’ll at least need to understand how to invoke a function associate with a particular object class.

Meanwhile I’ve been programming in AppleSoft quite a bit in my Apple II emulator, and the GOTO’s are so refreshing!

I’ve mentioned before that, despite being on the Routing Protocols team, I spent a lot of time handling crash cases in TAC.  At the time, my queue was just a dumping ground for cases that didn’t fit into any other bucket in the High Touch structure.  Backbone TAC had a much more granular division of teams, including a team entirely dedicated to crash.  But in HTTS, we did it all.

Some crashes are minor, like a (back then) 2600-series router reloading due to a bus error.  Some were catastrophic, particularly crashes on large chassis-type routing systems in service provider networks.  These could have hundreds of interfaces, and with sub-interfaces, potentially thousands of customers affected by a single outage.  Chassis platforms vary in their architecture, but many of the platforms we ran at the time used a distributed architecture in which the individual line cards ran a subset of IOS.  Thus, unlike a 2600 which had “dumb” WIC cards for interface connections, on chassis systems line cards themselves could crash in addition to the route processors.  Oftentimes, when a line card crashed, the effect would cascade through the box, with multiple line cards crashing, which would result in a massive meltdown.

The 7500 was particularly prone to these.  A workhorse of Cisco’s early product line, the 7500 line cards ran IOS but forwarded packets between each other by placing them into special queues on the route processor.  This was quite unlike later products, such as the Gigabit Switch Router (GSR), which had a fabric architecture enabling line cards to communicate directly.  On the 7500, oftentimes a line card having a problem would write bad data into the shared queues, which the subsequent line cards would read and then crash, causing a cascading failure.

One of our big customers, a Latin American telecommunications company I’ll call LatCom, was a heavy user of 7500’s.  They were a constant source of painful cases, and for some reason had a habit of opening P1 cases on Fridays at 5:55pm.  Back then HTTS day-shift engineers’ shifts ended at 6pm, at which point the night shift took over, but once we accepted a P1 or P2 case, unlike backbone TAC, we had to work it until resolution.  LatCom drove us nuts.  Five minutes was the difference between going home for the weekend and potentially being stuck on the phone until 10pm on a Friday night.  The fact that LatCom’s engineers barely spoke English also proved a challenge and drew out the cases–occasionally we had to work through non-technical translators, and getting them to render “there was a CEF bug causing bad data to be placed into the queue on the RP” into Spanish was problematic.

After years of nightmare 7500 crashes, LatCom finally did what we asked:  they dropped a lot of money to upgrade their routers to GSRs with PRPs, at that time our most modern box.  All the HTTS RP engineers breathed a sigh of relief knowing that the days of nightmare cascading line card failures on 7500’s were coming to an end.  We never had a seen a single case of such a failure on a GSR.

That said, we knew that if anything bad was going to happen, it would happen to these guys.  And sure enough, one day I got a case with…you guessed it, a massive cascading line card failure on a GSR!  The first one I had seen.  In the case notes I described the failure as follows:

  1. Six POS (Packet over Sonet) interfaces went down at once
  2. Fifteen seconds later, slots 1 and 15 started showing CPUHOG messages followed by tracebacks
  3. Everything stabilized until a few hours later, when the POS interfaces go down again
  4. Then, line cards in slots 0, 9, 10, 11, and 13 crashed
  5. Fifteen seconds later, line cards in slots 6 and 2 crash
  6. And so forth

My notes said: “basically we had a meltdown of the box.”  To make matters worse, 4 days later they had an identical crash on another GSR!

When faced with a this sort of mess, TAC agents usually would send the details to an internal mailer, which is exactly what I did.  The usual attempt by some on the mailer to throw hardware at the problem didn’t go far as we saw the exact same crash on another router.  This seemed to be a CEF bug.

Re-reading the rather extensive case notes bring up a lot of pain.  Because the customer had just spent millions of dollars to replace their routers with a new platform that, we assured them, would not be susceptible to the same problem, this went all the way to their top execs and ours.  We were under tremendous pressure to find a solution, and frankly, we all felt bad because we were sure the new platform would be an end to their problems.

There are several ways for a TAC engineer to get rid of a case:  resolve the problem, tell the customer it is not reproducible, wait for it to get re-queued to another engineer.  But after two long years at TAC, two years of constant pressure, a relentless stream of cases, angry customers, and problem after problem, my “dream job” at Cisco was taking a toll.  When my old friend Mike, who had hired me at the San Francisco Chronicle, my first network engineering job, called me and asked me to join him at a gold partner, the call wasn’t hard to make.  And so I took the easiest route to getting rid of cases, a lot of them all at once, and quit.  LatCom would be someone else’s problem.  My newest boss, the fifth in two years, looked at me with disappointment when I gave him my two weeks notice.

I can see the case notes now that I work at Cisco again, and they solved the case, as TAC does.  A bug was filed and the problem fixed.  Still, I can tell you how much of a relief it was to turn in my badge and walk out of Cisco for what I wrongly thought would be the last time.  I felt, in many ways, like a failure in TAC, but at my going away party, our top routing protocols engineer scoffed at my choice to leave.  “Cisco needs good engineers,” he said.  “I could have gotten you any job you wanted here!”  True or not, it was a nice comment to hear.

I started writing these TAC tales back in 2013, when I still worked at Juniper.  I didn’t expect they’d attract much interest, but they’ve been one of the most consistently popular features of this blog. I’ve cranked out 20 of these covering a number of subjects, but I’m afraid my reservoir of stories is running dry.  I’ve decided that number 20 will be the last TAC Tale on my blog.  There are plenty of other stories to tell, of course, but I’m finished with TAC, as I was back in 2007.  My two years in TAC were some of the hardest in my career, but also incredibly rewarding.  I have so much respect for my fellow TAC engineers, past, present, and future, who take on these complex problems without fear, and find answers for our customers.

 

“Progress might have been alright once, but it has gone on too long.”
–  Ogden Nash

The book The Innovator’s Dilemma appears on the desk of a lot of Silicon Valley executives.  Its author, Clayton Christiensen, is famous for having coined the term “disruptive innovation.”  The term has always bothered me, and I keep waiting for the word “disruption” to die a quiet death.  I have the disadvantage of having studied Latin quite a bit.  The word “disrupt” comes from the Latin verb rumperewhich means to “break up”, “tear”, “rend”, “break into pieces.”  The word, as does our English derivative, connotes something quite bad.  If you think “disruption” is good, what would you think if I disrupted a presentation you were giving?  What if I disrupted the electrical system of your heart?

Side note:  I’m fascinated with the tendency of modern English to use “bad” words to connote something good.  In the 1980’s the word “bad” actually came to mean its opposite.  “Wow, that dude is really bad!” meant he was good.  Cool people use the word “sick” in this way.  “That’s a sick chopper” does not mean the motorcycle is broken.

The point, then, of disruption is to break up something that already exists, and this is what lies beneath the b-school usage of it.  If you innovate, in a disruptive way, then you are destroying something that came before you–an industry, a way of working, a technology.  We instantly assume this is a good thing, but what if it’s not?  Beneath any industry, way of working, or technology are people, and disruption is disruption of them, personally.

The word “innovate” also has a Latin root.  It comes from the word novus, which means “new”.  In industry in general, but particularly the tech industry, we positively worship the “new”.  We are constantly told we have to always be innovating.  The second one technology is invented and gets established, we need to replace it.  Frame Relay gave way to MPLS, MPLS is giving way to SD-WAN, and now we’re told SD-WAN has to give way…  The life of a technology professional, trying to understand all of this, is like a man trying to walk on quicksand.  How do you progress when you cannot get a firm footing?

We seem to have forgotten that a journey is worthless unless you set out on it with an end in mind.  One cannot simply worship the “new” because it is new–this is self-referential pointlessness.  There has to be a goal, or an end–a purpose, beyond simply just cooking up new things every couple years.

Most tech people and b-school people have little philosophical education outside of, perhaps (and unfortunately) Atlas Shrugged.  Thus, some of them, realizing the pointlessness of endless innovation cycles, have cooked up ludicrous ideas about the purpose of it all.  Now we have transhumanists telling us we’ll merge our brains with computers and evolve into some sort of new God-species, without apparently realizing how ridiculous they sound.  COVID-19 should disabuse us of any notion that we’re not actually human beings, constrained by human limitations.

On a practical level, the furious pace of innovation, or at least what is passed off as such, has made the careers of technology people challenging.  Lawyers and accountants can master their profession and then worry only about incremental changes.  New laws are passed every year, but fundamentally the practice of their profession remains the same.  For us, however, we seem to face radical disruption every couple of years.  Suddenly, our knowledge is out-of-date.  Technologies and techniques we understood well are yesterday’s news, and we have to re-invent ourselves yet again.

The innovation imperative is driven by several factors:  Wall Street constantly pushes public companies to “grow”, thus disparaging companies that simply figure out how to do something and do it well.  Companies are pressured into expanding to new industries, or into expanding their share of existing industries, and hence need to come up with ways to differentiate themselves.  On an individual level, many technologists are enamored of innovation, and constantly seek to invent things for personal satisfaction or for professional gain.  Wall Street seems to have forgotten the natural law of growth.  Name one thing in nature that can grow forever.  Trees, animals, stars…nothing can keep growing indefinitely.  Why should a company be any different?  Will Amazon simply take over every industry and then take over governing the planet?  Then what?

This may seem a strange article coming from a leader of a team in a tech company that is handling bleeding edge technologies.  And indeed it would seem to be a heresy for someone like me to say these things.  But I’m not calling for an end to inventing new products or technologies.  Having banged out CLI for thousands of hours, I can tell you that automating our networks is a good thing.  Overlays do make sense in that they can abstract complexity out of networks.  TrustSec/Scalable Group Tags are quite helpful, and something like this should have been in IP from the beginning.

What I am saying is that innovation needs a purpose other than just…innovation.  Executives need to stop waxing eloquent about “disrupting” this or that, or our future of fusing our brains with an AI Borg.  Wall Street needs to stop promoting growth at all costs.  And engineers need time to absorb and learn new things, so that they can be true professionals and not spend their time chasing ephemera.

Am I optimistic?  Well, it’s not in my nature, I’m afraid.  As I write this we are in the midst of the Coronavirus crisis.  I don’t know what the world will look like a year from now.  Business as usual, with COVID a forgotten memory?  Perhaps.  Great Depression due to economic shutdown?  Perhaps.  Total societal, governmental, and economic collapse, with rioting in the streets?  I hope not, but perhaps.  Whatever happens, I do hope we remember that word “novel”, as in “novel Coronavirus”, comes from the same Latin root as the word “innovation”.  New isn’t always the best.

In my last post, I discussed the BBS and how it worked.  (It would be helpful to review, to understand the terminology.)  In this post, I have resurrected, in part, the BBS I used to run from 1988-1990.  It was called “The Tower”, for no particularly good reason except that it sounded cool to my teenage mind.

Now, bringing this back to life was no simple task, but was aided by some foresight I had 20 years ago.  I had a Mac with a disk drive, and realizing the floppy era was coming to a close, I decided to produce disk images of all the 3.5 inch floppies I had saved from my Apple II days.  Fortunately, my last Apple II, the IIGS, used 3.5″ drives instead of the 5.25″ that were more common on the Apple IIs.  The Macs that had floppy drives all had 3.5″ drives.  Additionally, Apple had included software to on the pre OSX MacOS to read ProDOS (Apple II) disks.  Thus, in the year 2000, I could mount an Apple II floppy from a dozen years prior and make an image out of it.

I did not have a full working version of my GBBS, however, so I had to download a copy.  I also had to do a lot of work to bring it up to Macos (not MacOS, but Macos, Modified ACOS), which was a modified form of the GBBS compiler I used at the time.  All of my source files required Macos and not the stock GBBS software.  Believe me, even though I ran the BBS for a couple years and wrote a lot of the code, remembering how to do any of this after 30 years was non-trivial.

Rather than hook up my old IIGS, which I still have, it made a lot more sense to use an emulator.  (It also enabled me to take screen shots.)  I used an emulator called Sweet16, which is a bit bare bones but does the trick.  In case you’re not familiar with the Apple II series, the early models were primarily text-driven.  They had graphics, of course, but they were not GUI machines.  After the Mac came out, there was a push to incorporate a GUI into the Apple II and the result was the Apple IIGS (Graphics and Sound).  While it had a GUI-driven OS (ProDOS 16 at first, replaced by GS/OS), it was backwards compatible with the old Apple II software.  The GBBS software I ran was classic Apple II, and thus it was a bit of a waste to run it on an Apple IIGS, but, well, that’s what I did.

In this screen shot (Figure 1), you can see the Apple IIGS finder from which I’m launching the BBS software, the only GUI shot you’ll see in the article:

Figure 1: The Apple IIGS ProDOS Finder

The next shot (Figure 2) shows the screen only visible to the sysop, while waiting for a call.  As sysop, I had the option to hit a key and log in myself, but if a user dialed in the system would beep and the user would begin the log in process.  I’m not sure why we’re awaiting call 2 which will be call 1 today, but it looks like a bug I need to hunt down.  The screen helpfully tells me if new users have signed up for the BBS, and whether I have mail.

Figure 2: The landing page while waiting for a call

(If you want to know why I used the silly handle “Mad MAn”, please see the previous article.)

The next screen shows the BBS right after logon.  The inverse text block at the top was a local sysop-only view, showing user information including the user name and phone number, as well as the user’s flags.  These are interesting.  Some BBS software provided access levels for controlling what a user could and could not do.  Instead of sequential access levels, GBBS provided a series of binary flags the sysop could set.  Thus, I could give access to one area but not another, whereas the sequential access levels mean that each access level inherits the privileges of the previous level.  Very clever.  A few other stats are displayed that I won’t go into.  I’ll turn off the sysop bar for the remaining screen shots.

Figure 3: The main level prompt with sysop bar. Be sure to report error #20!

Note the prompt provided to the user in figure 3.  It tells you:

  • That the sysop is available
  • That the user has not paged the sysop
  • The double colons (::) normally would display time left on the system.  Since this was a dial-up system, I needed to limit the time users could spend on the BBS.  But as sysop, I of course had unlimited time.
  • The BBS had different areas and the prompt (like an IOS prompt) tells you where you are (“Main level”)

Next, in figure 4 you can see the main menu options for a user logged into the BBS.  This is the default/stock GBBS menu, as my original is lost.  Despite the limited options, this was like entering a new world in the days of 64K RAM.  You can see that a user could send/read mail, go to a file transfer section, chat (or attempt to chat) with the system operator, or read the public message boards.

Figure 4: The BBS main menu. This is the GBBS default, not the custom menu I built

Next, the user list.  I had 150 users on my BBS, not all of them active.  I blacked out the last names and phone numbers, but you can get a sense of the handles that were used at the time.  In addition to these names, there were a lot of Frodo’s and Gandalf’s floating around.  Also note that most BBSing was local (to avoid long-distance charges.)  Sadly, none of these users has logged on since 1989.  I wish they’d come back.  Oggman, whom I mentioned in my last post, was a user on my board.

Figure 5: My user list

Conclusions

I recently interviewed a recent college grad who asked me how she could be successful at a company like Cisco.  My answer was that you have to understand where we came from in order to understand where we are.  You cannot understand, say, SD-WAN without understanding how we used to build WANs.  Go back to the beginning.  Learn what SneakerNet was.  Understand why we are where we are.  Even before SneakerNet, some of us were figuring out how to get computers to talk to each other over an already existing network–the analog telephone network.  As a side note, I love vintage computing.  It’s a lot of fun using emulators to resurrect the past, and I hope to do some physical restorations some day.  Trying to figure out how to boot up a long-defunct system like this BBS provides a great reminder of how easy we have it now.