The death of CLI: A lesson from Swift

Update:  From Fred, the fellow referenced in the first paragraph:

Actually it was a white button with a router icon on it and “make cli great again”, I know this because it was me. It was June 2016. Needless to say in my view that did not age well.

When I attended Cisco Live sometime around the election of Donald Trump, there was a fellow walking around with a red hat with white lettering on it:  MAKE CLI GREAT AGAIN.  Ha!  I love Cisco Live.  These are my people.

I remember back when I worked at Juniper, one exec looked at me working on CLI and said, “you know that’s going to be gone soon.  It’ll all be GUI.”  That was 8 years ago…how’s that going?  When I work on CLI (and I still do!), or programming, my wife always says, “how can you stare at that cryptic black screen for hours?”  Hey, I’ve been doing it since I was a kid.

The black screen won’t go away, I’m afraid.  I’ve recently been learning iOS app development for fun (not profit).  It’s surprisingly hard given the number of successful app developers out there.  I may be too used to Python to program in Swift, and my hatred of object-oriented programming doesn’t help me when there is no way to avoid it in Swift.  Anyways, it took me about a week to sort out the different UI frameworks used in iOS.  There are basically three:

  • Storyboards.  Storyboards are a graphical design framework for UI layout.  Using storyboards, you drag and drop UI elements like buttons and text fields onto a miniature iPhone screen.
  • UIKit.  (Technically storyboards use UIKit, but I don’t know what else to call this.)  Most high-end app developers will delete the storyboard in their project and write the UI as code.  They actually type in code to tell iOS what UI elements they want, how to position them, and what to do in the event they are selected.  Positioning is fairly manual and is done relative to other UI elements.
  • SwiftUI.  Apple is pushing towards this model and will eventually deprecate the other two.  SwiftUI is also a UI-as-code model, but it’s declarative instead of imperative.  You tell SwiftUI what you want and roughly how you want to position things, and SwiftUI does it for you.

Did you catch my point?  The GUI-based layout tool is going away in favor of UI-as-code!  The black screen always comes back!

The difference between computer people and non-computer-computer-people (many industry MBAs, analysts, etc.) is that computer people understand that text-based interaction is far more efficient, even if the learning curve is steeper.

Andrew Tanenbaum, author of the classic Computer Networks, typeset his massive work in troff.  Troff is a text-based typesetting tool where you enter input like this:

.ll 3i
.mk a
.ce
Preamble
.sp
We, the people of the United States, in order
to form a more perfect Union...

Why doesn’t he just use Word?  I’ll let Dr. Tanenbaum speak for himself:

All my typesetting is done using troff. I don’t have any need to see what the output will look like. I am quite convinced that troff will follow my instructions dutifully. If I give it the macro to insert a second-level heading, it will do that in the correct font and size, with the correct spacing, adding extra space to align facing pages down to the pixel if need be. Why should I worry about that? WYSIWYG is a step backwards. Human labor is used to do that which the computer can do better.  (Emphasis added.)

I myself am not quite enough of a cyborg to use troff (though I use vi), but I have used LaTeX with far better results than Word.  (Dr. Tanenbaum says “real authors use troff,” however.)

One of my more obscure interests (I have many) is Gregorian Chant.  Chant uses a musical notation which is markedly different from modern music notation, and occasionally I need to typeset it.  I use a tool called Gregorio, where I enter the chant like this:

(cb3) Ad(d)ór(f’)o(h) te(h’) de(h)vó(hi)te,(h.) (,) la(g)tens(f) Dé(e’)i(d)tas,(d.)

The letters in parentheses represent the different musical notes.  I once tried typesetting the chant graphically, and it was far more tedious than the above.  Why not enter what I want and let the typesetting system do the work?

Aside from the mere efficiency, text files can be easily version controlled and diff’d.  Try that with your GUI tool!
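
To make the point concrete, here’s a toy sketch in Python using the standard difflib module.  The two config fragments are invented for illustration:

import difflib

old = """hostname rtr1
interface GigabitEthernet0/0
 ip address 10.0.0.1 255.255.255.0""".splitlines()

new = """hostname rtr1
interface GigabitEthernet0/0
 ip address 10.0.0.2 255.255.255.0""".splitlines()

# Print a unified diff showing exactly what changed between versions
for line in difflib.unified_diff(old, new, fromfile="before", tofile="after", lineterm=""):
    print(line)

A few lines of output tell you exactly what changed.  Good luck getting that out of a screenshot.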

It’s very ironic that many of my customers who use controllers like DNAC or vManage are actually accessing those tools through APIs.  They bought a GUI tool, but they prefer the black screen.  The controller in this case becomes a point of aggregation for them, a system which at least does discovery and allows some level of abstraction.

The non-computer-computer-people look at SwiftUI, network device CLI, troff, Gregorio, APIs, and rend their garments, crying out to heaven, “why, oh why?!”  Some may even remember the days of text-based editing systems on their DOS machines, which they could never learn, and the great joy that WYSIWYG brought them.  It reminds me of a highly incompetent sales guy I worked with at the Gold partner back in the day.  He once saw me configuring a router and said:  “Wow, you still use DOS to configure routers!”

“It’s actually IOS CLI, not DOS.”

“That’s DOS!” he densely replied.  “I remember DOS.  I can’t believe you still use DOS!”

It’s funny that no matter how hard we try to get away from code, we always come back to it.  We’re hearing a lot about “low code” environments these days.  It tells you something when the first three Google hits on “low code” just come back to Gartner reports.  Gee, have we been down this path before?  Visual Basic was invented in 1991.  If low code is so great, why is Apple moving from storyboards to SwiftUI?

In my last post I wrote about the war on expertise.  This is one of the fronts in the war.  The non-computer-computer-people cannot understand the black screen, and are convinced they can eliminate it.  They learned about “innovation” in business school, and read case studies about Windows 95 and the end of DOS.  They read about how companies like Sun Microsystems went belly-up because they are not “disruptive.”  They did not, however, read about all the failed attempts to eliminate the black screen, spanning decades.  I believe it was George Santayana who said, “If you don’t remember computer history, you’re doomed to repeat it.”

Product Management

It’s impossible to count how many people at my college wanted to be “writers”.  So many early-twenty-somethings here in the US think they are going to spend their lives as screenwriters or novelists.  My colleagues from India tell me most people there want to be doctors or engineers, which tells you something about the decline of the United States.

Back in the mid-2000’s, a popular buddy-comedy came out about a novelist and an actor and their adventures in the “California wine country”.  The author of the film is an LA novelist.  The only people he knew, and the only characters he could create, were writers and actors.  I read that his first novel was about a screenwriter.  The movie was popular, but I found the characters utterly boring.  Who cares about a novelist and his romantic adventures?  Herman Melville spent years at sea, giving him the material to write Moby Dick.  Fyodor Dostoevsky wanted to be a writer from an early age, but he spent years in a prison camp followed by years of forced military service, which gave him a view into nihilism and its effect on the human soul.  The point is, these great writers earned the right to talk about something;  they didn’t just go to college and come out a genius with brilliant things to say.

I’ve been hearing a lot about “product management” lately.  I work in product management, in fact, and I’ve worked with product managers for many years.  However, I didn’t realize until recently that product management is the hot new field.  Everyone wants to major in PM in business school.  As one VP I know told me, “people want to be PMs because that’s where CEOs come from.”  Well, like 19-year-olds feeling entitled to be great novelists, b-school students are apparently expecting to become CEOs.  Somewhere missing in this sense of entitlement is that achievement has to be earned, and that it has to be earned by developing specific expertise.  A college student who wants to be a novelist thinks he or she simply deserves to be a novelist by virtue of his or her brilliance;  a b-school PM student apparently thinks the same way about being a CEO.

Back when I worked in TAC, one of my mentors was a TAC engineer who had previously been a product manager for GSR (12000-series) line cards.  He went back to TAC because he wanted to get into the new CRS-1 router and felt it was the best place to learn the new product quickly.  It made sense at the time, but it is inconceivable now that a PM would go to TAC.  The product manager career path is directed towards managing business, not technology, and it would be a step down for product managers to become technical again.

If you don’t work for a tech company, you may not know a lot about product management, but PMs are very important to the development of the products you use.  They decide what products are brought to market and what features those products will have, and they prioritize product roadmaps.  They are held accountable for the revenue (or lack thereof) of a product.

Imagine, now, that somebody with that responsibility for, say, a router has no direct experience as a network engineer, but instead has an MBA from Kellogg or Haas or Wharton.  They’ve studied product management as a discipline, but know nothing about the technology that they own.  Suppose this person has no particular interest in or passion for their field–they just want to succeed in business and be a CEO some day.  What do you think the roadmap will look like?  Do you think the product will take into account the needs of the customer?  When various technologists come to such a PM, will he be able to rationally sort through their competing proposals and select the correct technology?

To be clear, I am not criticizing any individual or my current employer here.  This problem extends industry-wide and explains why so many badly conceived products exist.  The problem of corporatism, which I’ve written about often, extends beyond product management too.  How often are decisions in IT departments made by business people who have little to no experience in the field they are responsible for?  I got into network engineering because I was fascinated by it and loved it.  I’m not the best engineer out there–I’ve worked with some brilliant people–but I do care about the industry and the products we make.  And most importantly, I care about network engineers because I’ve been one.

Corporatists believe generic management principles can be learned which apply to any business, and that they don’t really need domain-specific expertise.  They know business, so why would they?  True, there are some business-specific tasks, like finance, where generic business knowledge is really all that’s needed.  But the notion that generic business knowledge qualifies one to be authoritative on technical topics is simply mistaken.  This is how tech CEOs end up running coffee companies–it’s just business, right?

I don’t mean to denigrate product management as a discipline.  PMs have an important role to play, and product management is the art of making decisions between alternatives under constrained resources.  I am saying this:  if you want to become a product manager, spend the time to learn not just the business, but the actual thing you are product managing.  You’d be better off spending a couple of years in TAC out of business school than going straight into PM.  Not that many CEO-aspiring PMs would ever do that these days.

Now off to write my first novel.

Computers with Brains

A couple of years back I purchased an AI-powered energy monitoring system for my home.  It clips on to the power mains and monitors amperage/wattage.  I can view a graph showing energy usage over time, which is really quite helpful for keeping tabs on my electricity consumption at a time when electricity is expensive.

The AI part identifies what devices are drawing power in my house.  Based simply on wattage patterns, so they claim, the app will tell me this device is a light, that device is an air conditioner, and so on.  An electric oven, for example, consumes so much power and switches itself on and off in such a pattern that AI can identify it.  The company has a large database of all of the sorts of products that can be plugged into an outlet, and it uses its database to figure out what you have connected.
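
I have no idea how their algorithm actually works, but a toy sketch of the general idea–match an observed wattage pattern against a database of known signatures–might look like this in Python.  The device names and numbers here are entirely invented:

# Hypothetical signature database:  wattage samples over time
signatures = {
    "electric oven": [2400, 0, 2400, 0, 2400, 0],    # cycles on and off
    "space heater":  [1500, 1500, 1500, 1500, 1500, 1500],
    "refrigerator":  [150, 150, 0, 0, 150, 150],
}

def identify(observed):
    # Return the known device whose signature has the smallest squared error
    def error(sig):
        return sum((a - b) ** 2 for a, b in zip(observed, sig))
    return min(signatures, key=lambda name: error(signatures[name]))

print(identify([1450, 1510, 1480, 1500, 1490, 1500]))   # "space heater"

Note that nothing in this matching logic knows or cares what the weather is outside.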

So far my AI energy monitor has identified ten different heaters in my house.  That’s really cool, except for the fact that I have exactly one heater.  When the message popped up saying “We’ve identified a new device!  Heater #10!”, I must admit I wasn’t surprised.  It did raise an eyebrow, however, given that it was summer and over 100 degrees (38 C) outside.  At the very least, you’d think the algorithm could correlate location and weather data with its guesses.

Many “futurists” who lurk around Silicon Valley believe in a few years we’ll live forever when we merge our brains with AI.  I’ve noticed that most of these “futurists” have no technological expertise at all.  Usually they’re journalists or marketing experts.  I, on the other hand, deal with technology every day, and it leaves me more than a little skeptical of the “AI” wave that’s been sweeping over the Valley for a few years.

Of course, once the “analysts” identify a trend, all of us vendors need to move on it.  (“SASE was hot last fall, but this season SSE is in!”)  A part of that involves labeling things with the latest buzzword even when they have nothing to do with it.  (Don’t get me started on “controllers”…)  One vendor has a tool that opens a TAC case after detecting a problem.  They call this something like “AI-driven issue resolution.”  Never mind that a human being gets the TAC case and has to troubleshoot it–this is the exact opposite of AI.  Apparently we can broaden the term to mean a computer doing anything on its own, in this case calling a human.  Hey, is there a better indicator of intelligence than asking for help?

Dynamic baselines are neat.  I remember finding the threshold-alerting capabilities in NMS tools useless back in the 90’s.  Do I set the threshold at 50% of bandwidth?  60%?  80%?  Dynamic baselining determines the normal traffic (or whatever) level at a given time, and sets a variable threshold based on historical data.  It’s AI, I suppose, but it’s basically just pattern analysis.
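
A minimal sketch of the concept, with invented sample data:  instead of one fixed threshold, derive a per-hour threshold from history.

from statistics import mean, stdev

# history[hour] = utilization samples (percent) previously seen at that hour
history = {9: [42, 45, 40, 47], 3: [5, 7, 6, 4]}

def threshold(hour, k=3):
    # Alert only when utilization strays k standard deviations above normal
    samples = history[hour]
    return mean(samples) + k * stdev(samples)

print(round(threshold(9), 1))   # busy hour:  higher threshold
print(round(threshold(3), 1))   # quiet hour:  lower threshold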

True issue resolution is a remarkably harder problem.  I once sat in a product meeting where we had been asked to determine all of the different scenarios the tool we were developing would be able to troubleshoot.  Then we were to determine the steps the “AI” would take (i.e., what CLI commands to execute).  We built slide after slide, racking our brains for all the ways networks fail and how we’d troubleshoot them.

The problem with this approach is that if you think of 100 ways networks fail, when a customer deploys the product it will fail in the 101st way.  Networks are large distributed systems, running multiple protocols, connecting multiple operating systems over different media types, and they have ways of failing, sometimes spectacularly, that nobody ever thinks about.  A human being can think adaptively and dynamically in a way that a computer cannot.  Troubleshooting an outage involves collecting data from multiple sources, and then thinking through the problem until a resolution is found.  How many times, when I was in TAC, did I grab two or three other engineers to sit around a whiteboard and debate what the problem could be?  Using our collective knowledge and experience, bouncing ideas off of one another, we would often come up with creative approaches to the problem at hand and solve it.  I just don’t see AI doing that.  So, maybe it’s a good thing it phones home for help.

I do see a role for AI and its analysis capabilities in providing troubleshooting information on common problems.  Also, the sheer volume of data can be a problem for humans to process.  We’re inundated with numbers and often cannot easily find patterns in what we’re presented.  AI-type tools can help to aggregate and analyze data from numerous sources in a single place.  So, I’m by no means saying we should be stuck in 1995 for our NMS tools.  But I don’t see AI tools replacing network operations teams any time soon, despite what may be sold to you.

And I certainly have no plans to live forever by fusing my brain with a computer.  We can leave that to science fiction writers, and their more respectable colleagues, the futurists.

Back to Cisco Live

I haven’t posted in a while, for the simple reason that writing a blog is a challenge.  What the heck am I going to write about?  Sometimes ideas come easily, sometimes not.  Of course, I have a day job, and part of that day job involves Cisco Live, which is next week, in person, for the first time in two years.  Getting myself ready, as well as coordinating a team of almost fifty technical marketing engineers, does not leave a lot of free time.

For the last several in-person Cisco Lives, I did a two-hour breakout on programmability and scripting.  The meat of the presentation was NETCONF/RESTCONF/YANG, and how to use Python to configure/operate devices using those protocols.  I don’t really work on this anymore, and I have a very competent colleague who has taken over.  I kept delivering the session because I loved doing it.  But good things have to come to an end.  At the last in-person Cisco Live (Barcelona 2020), I had just wrapped up delivering the session for what I assumed would be the last time.  A couple of attendees approached me afterwards.  “We love your session, we come to it every year!” they told me.

I was surprised.  “But I deliver almost the same content every year,” I replied.  “I even use the same jokes.”

“Well, it’s our favorite session,” they said.

At that point I resolved to keep doing it, even if my expertise was diminishing.  Then, COVID.

I had one other session which was also a lot of fun, called “The CCIE in an SDN world.”  Because it was in the certification track, I wasn’t taking a session away from my team by doing it.  There is a bit about the CCIE certification, its history, and its current form, but the thrust of it is this:  network engineers are still relevant, even today with SDN and APIs supposedly taking over everything.  There is so much marketing fluff around SDN and its offshoots, and while there may be good ideas in there (and a lot of bad ones), nevertheless we still need engineers who study how to manage and operate data networks, just like we did in the past.

I will be delivering that session.  I have 50 registered attendees, which is a far cry from the 500 I used to pack in at the height of the programmability gig.  Being a Senior Director, you end up in limbo between keynotes (too junior) and breakouts (too senior).  But the cert guys were gracious enough to let me speak to my audience of 50.

Cisco Live is really the highlight of the TME role, and I’m happy to finally be back.  Let’s just hope I’m still over my stage fright;  I haven’t had an audience in years!

Are coffee and networks the same?

I shall avoid naming names, but when I worked for Juniper we had a certain CEO who pumped us up as the next $10 billion company.  It never happened, and he left and became the CEO of Starbucks.  Starbucks has nothing to do with computer networking at all.  Why was he hired by Starbucks?  How did his (supposed) knowledge of technology translate into coffee?

Apparently it didn’t.  Howard Schultz, Starbucks’ former CEO, is back at the helm.  “I wasn’t here the last four years, but I’m here now,” he said, according to an article in the Wall Street Journal (paywall).  “I am not in business, as a shareholder of Starbucks, to make every single decision based on the stock price for the quarter…Those days, ladies and gentlemen, are over.”  Which, of course, implies that that was exactly what the previous CEO was doing.

What happened under the old CEO?  “Workers noticed an increasing focus on speed metrics, including the average time to prepare an order, by store.”  Ah, metrics, my old enemy.  There’s a reason one of my favorite books is The Tyranny of Metrics, and a reason I wrote a TAC Tales piece specifically about the use of metrics in TAC.  More on that in a bit.

As I look at what I refer to as “corporatism” and its effect on our industry, it often becomes apparent that the damage of this ethos extends beyond tech.  The central tenet of corporatism, as I define it, is that organizations are best run by people who have no particular expertise other than management itself.  That is, these individuals are trained and experienced in generic management principles, and this is what qualifies them to run businesses.  The generic management skills are transferable, meaning that if you become an expert in managing a company that makes paper clips, you can successfully use your management skills to run a company that makes, say, medical-device software.  Or pharmaceuticals.  Or airplanes.  Or whatever.  You are, after all, a manager, maybe even a leader, and you just know what to do without any deep expertise or hard-acquired industry-specific knowledge.

Those of us who spend years, even decades, acquiring deep technical knowledge of our fields are, according to this ethos, the least qualified to manage and lead.  That’s because we are stuck in our old ways of doing things, and therefore we don’t innovate, and we probably make things complex, using funny acronyms like EIGRP, OSPF, BGP, STP, MPLS, L2VNI, etc., to confuse the real leaders.

Corporatists simply love metrics.  They may not understand, say, L2VNIs, but they look at graphs all day long.  Everything has to be measured in their world, because once it’s measured it can be graphed, and once it’s graphed it’s simply a matter of making the line go the right direction.  Anyone can do that!

Sadly, as Starbucks seems to be discovering, life is messier than a few graphs.  Management by metric usually leads to unintended consequences, and frequently those who operate in such systems resort to metric-gaming.  As I mentioned in the TAC Tale, measuring TAC agents on create-to-close numbers led to many engineers avoiding complex cases and sticking with RMAs to get their numbers looking good.  Tony Hsieh at Zappos, whatever problems he may have had, was totally right when he had his customer service reps stay on the phone as long as needed with customers, hours if necessary, to resolve an issue with a $20 pair of shoes.  That would never fly with the corporatists.  But he understood that customer satisfaction would make or break his business, and it’s often hard to put a number on that.

Corporatism of various sorts has been present in every company I’ve worked for.  The best, and most successful, leadership teams I’ve worked for have avoided it by employing leaders who grew up within the industry.  This doesn’t make them immune from mistakes, of course, but it allows them to understand their customers, something corporatists have a hard time with.

Unfortunately, we work in an industry (like many) in which the stock value of companies is determined by an army of non-technical “analysts” who couldn’t configure a static route, let alone explain what one is.  And yet somehow, their opinions on (e.g.) the router business move the industry.  They of course adhere to the ethos of corporatism.  And I’m sure they get paid better than I do.

Starbucks seems to be correcting a mistake by hiring back someone who actually knows their business.  Would that all corporations learned from Starbucks’ mistake and ensured their leaders know at least something about what they are leading.

Blogging vs. Video

I must admit, I’m a huge fan of Ivan Pepelnjak.  This despite the fact that he is a major thorn in the side of product management at Cisco.  It is, of course, his job to be a thorn in our side, and Ivan is too smart to ignore.  He has a long history with Cisco and with networking, and his opinions are well thought out and highly technical.  He is a true network engineer.  The fact that I like Ivan does not mean he gave me an easy time a few years back when I did a podcast with him on NETCONF at Cisco Live Berlin.

Ivan had an interesting post recently entitled “Keep Blogging, Some of Us Still Read”.  It reminds me of my own tongue-in-cheek FAQ for this blog, in which I said I wouldn’t use a lot of graphics because I intended my blog for “people who can read”.  As a blogger, I think I quite literally have about 3 regular readers, which occasionally makes me wonder why I do it at all.  I could probably build a bigger readership if I worked at it, but I really don’t.  I think part of the reason I do it is simply that I find it therapeutic.

Anyhow, the main claim Ivan is responding to is that video seems to be dominant these days and blogging is becoming less rewarding.  There is no question video creation has risen dramatically, and in many ways it’s easier to get noticed on YouTube than on some random blog like mine.  Then again, with the popularity of Substack, I think people are actually still reading.

Ivan says “Smart people READ technical content.”  Well, perhaps.  I remember learning MPLS back in 2006 when I worked at TAC.  I took a week off to run through a video series someone had produced and it was one of the best courses I’ve taken.  Sometimes a technical person doesn’t want to learn by reading content.  Sometimes listening to new concepts explained well at a conversational pace and in a conversational style is more conducive to actually understanding the material.  This is why people go to trade shows like Cisco Live.  They want to hear it.

I’ve spent a lot of time on video lately, developing a series on technical public speaking as well as technical videos for Cisco.  In the process I’ve had to learn Final Cut Pro and DaVinci Resolve.  Both have, frankly, horrendous user interfaces that are hard to master.  Nine times out of ten I turn to a YouTube video when I’m stuck trying to do something.  Especially with GUI-based tools, video is a much faster way for me to learn than screenshots.

On the other hand, it’s much harder to produce video.  I can make a blog post in 15 minutes.  YouTube videos take hours and hours to produce, even simple ones like my Coffee with TMEs series.

The bottom line is that I land somewhere in the middle here.  Ivan’s right:  technical documentation in video format is much harder to search and to use for reference.  That said, I think video is often much better for learning, that is, for being guided through an unfamiliar concept or technology.

If you are one of my 3 regular readers and you would prefer to have my blogs delivered to your inbox, please subscribe at https://subnetzero.substack.com/ where I am cross-posting content!

The Ghost of Jobs

A post recently showed up in my LinkedIn feed.  It was a video showing a talk by Steve Jobs and claiming to be the “best marketing video ever”.  I disagree.  I think it is the worst ever.  I hate it.  I wish it would go away.  I have deep respect for Jobs, but on this one, he ruined everything and we’re still dealing with the damage.

A little context:  In the 1990’s, Apple was in its “beige box” era.  I was actively involved in desktop support for Macs at the time.  Most of my clients were advertising agencies, and one of them was TBWA Chiat Day, which had recently been hired by Apple.  Macs, once a brilliant product line, had languished, and had an out-of-date operating system.  The GUI was no longer unique to them, as Microsoft had unleashed Windows 95.  Apple was dying, and there were even rumors that Microsoft would acquire it.

In came Steve Jobs.  Jobs was what every technology company needs–a visionary.  Apple was afflicted with corporatism, and Jobs was going to have none of it.

One of his most famous moves was working with Chiat Day to create the “Think Different” ad campaign.  When it came out, I hated it immediately.  First, there was the cheap grammatical trick to get attention.  “Think” is a verb, so it should be modified by an adverb (“differently”).  By using poor grammar, Apple got press beyond their purchased ad runs.  Newspapers devoted whole articles to whether Apple was teaching children bad grammar.

The ads featured various geniuses like Albert Einstein and Gandhi and proclaimed various trite sentiments about “misfits” and “round pegs in square holes”.  But the ads said nothing about technology at all.

If you watch the video you can see Jobs’ logic here.  He said that ad campaigns should not be about product but about “values”.  The ads need to say something about “who we are”.

I certainly knew who Chiat Day was since I worked there.  I can tell you that the advertising copywriters who think up pabulum like “Think Different” couldn’t write technical ads because they could barely turn on their computers without me.  They had zero technological knowledge or capability.  They were creating “vision” and “values” about something they didn’t understand, so they did it cheaply with recycled images of dead celebrities.

Unfortunately, the tech industry seems to have forgotten something.  Jobs didn’t just create this “brilliant” ad campaign with Chiat Day.  He dramatically improved the product.  He got the Mac off the dated OS it was running and introduced OS X.  He simplified the product line.  He killed the Apple clone market.  He moved the machines to new chips like the G3.  He made the computers look cool.  He turned Macs from a dying product into a really good computing platform.

Many tech companies think they can just do the vision thing without the product.  And so they release stupid ad campaigns with hired actors talking about “connecting all of humanity” or whatever their ad agency can come up with.  They push their inane “values” and “mission” down the throats of employees.  But they never fix their products.  They ship the same crappy products they always shipped but with fancy advertising on top.

The thing about Steve Jobs is that everybody admires his worst characteristics and forgets his best.  Some leaders and execs act like complete jerks because Steve Jobs was reputed to be a complete jerk.  They focus on “values” and slick ad campaigns, thinking Jobs succeeded because of these things.  Instead, he succeeded in spite of them.  At the end of the day, Apple was all about the product and they made brilliant products.

The problem with modern corporatism is the army of non-specialized business types who rule over everything.  They don’t understand the products, they don’t understand those who use them, they don’t understand technology, but…Steve Jobs!  So they create strategy, mission, values, and ad campaigns that sound inspiring but are meaningless and insipid, and they don’t build good products.  And then they send old Jobs videos around on LinkedIn to make the problem worse.

Consulting Gigs

In my years in the corporate world, I’ve attended many corporate self-help type sessions on how to increase leadership, creativity, and innovation.  There are many young consultants who are starting their careers by helping us develop new skills in these areas, so I thought I would provide some helpful tips for getting started.  Enjoy!

  1. If you are going to do consulting or presentations on innovation and leadership, it’s very important that you have never led anyone or invented anything.  Rather, you simply need to interview people who have done those things.  A lot of them.  Two or three thousand.  Actually, even if it’s only been 10 or 11, just say you’ve interviewed two or three thousand innovators or leaders.  This is called “research.”
  2. It’s especially important, if you are teaching career technology people how to innovate, that you loathe technology and cannot even upgrade your iPhone without help.  Remember, they may understand technology, but you understand how to innovate!  Two different things.
  3. You’re going to be making claims that are either wrong, or so obvious they don’t bear repeating.  Remember that you need to do several things to make those statements credible:
    • Begin by citing unverifiable claims from evolutionary biology as the basis for your statements.  Be sure to mention that we used to live out on open plains where we were at risk of being eaten.  Also be sure to mention “fight or flight.”  Bad:  “Strong leaders need to cultivate loyalty.”  Good:  “Evolutionary biology has shown us that, back when we lived on the plain and were vulnerable to getting eaten by lions, our brains developed a need to be loyal to a leader.”
    • Next, cite the latest neuroscience to substantiate your claims.  In fact, it doesn’t have to be real neuroscience.  Remember, nobody will ever check!  Just say “the latest research on the brain has shown…” and leave it at that.  Bad:  “To be innovative we need time to think.”  Good:  “The latest neuroscience has shown that our brains can’t innovate when they are overwhelmed and don’t have time to reason properly.”
    • Remember, if you’re going to be hired by corporations and paid thousands of dollars in speaking fees, you need to state obvious truths in a technical way that makes you seem smart.  Invent new terminology so when you regurgitate to people what they already know, you sound authoritative.  For example, instead of saying, “criticism hurts people’s feelings and can cause them to leave,” invent a “criticism-despair cycle.”  Make a diagram with arrows showing “criticism->rejection->despair->attrition”.  See how much more impressive you sound already?
  4. It really helps if you are a “Doctor”.  There are many unaccredited diploma mills that will send you a Ph.D. based on your “life experience.”  Or better yet, just start calling yourself “doctor”.  Do you really think anyone will call and verify your doctorate?

Remember, the most lucrative careers don’t involve building skills through years of hard work, but telling people who know better than you how to do their jobs.  I hope you have a rewarding career as a consultant!

The AWS Outage

I have to give AWS credit for posting a fairly detailed technical description of the cause of their recent outage.  Many companies rely on crisis PR people to phrase vague announcements that do little to inform customers or put their minds at ease.  I must admit, having read the AWS post-mortem a couple of times, I don’t fully understand what happened, but it seems my previous article on automation running wild was not far off.  Of course, the point of that article was not to criticize automation.  An operation the size of AWS would be simply impossible without it.  The point was to illustrate the unintended consequences of automation systems.  As a pilot and aviation buff, I can think of several examples of airplanes crashing due to out-of-control automation as well.

AWS tells us that “an automated activity to scale capacity of one of the AWS services hosted in the main AWS network triggered an unexpected behavior from a large number of clients inside the internal network.”  What’s interesting here is that the automation event was not itself a provisioning of network devices.  Rather, the capacity increase caused “a large surge of connection activity that overwhelmed the networking devices between the internal network and the main AWS network…”  This is just the old problem of overwhelming link capacity.  I remember one time at Juniper when a lab device started sending a flood of traffic to the Internet, crushing the Internet-facing firewalls.  It’s nice to know that an operation like Amazon faces the same challenges.  At the end of the day, bandwidth is finite, and enough traffic will ruin any network engineer’s day.

“This congestion immediately impacted the availability of real-time monitoring data for our internal operations teams, which impaired their ability to find the source of congestion and resolve it.”  This is the age-old problem, isn’t it?  Monitoring our networks requires network connectivity.  How else do we get logs, telemetry, traps, and other information from our devices?  And yet, when our network is down, we can’t get this data.  Most large-scale customers do maintain a separate out-of-band network just for monitoring.  I would assume Amazon does the same, but perhaps somehow this got crushed too?  Or perhaps what they refer to as their “internal network” was the OOB network?  I can’t tell from the post.

“Operators continued working on a set of remediation actions to reduce congestion on the internal network including identifying the top sources of traffic to isolate to dedicated network devices, disabling some heavy network traffic services, and bringing additional networking capacity online. This progressed slowly…”  I don’t want to take pleasure in others’ pain, but this makes me smile.  I’ve spent years telling networking engineers that no matter how good their tooling, they are still needed, and they need to keep their skills sharp.  Here is Amazon, with presumably the best automation and monitoring capabilities of any network operator, and they were trying to figure out top talkers and shut them down.  This reminds me of the first broadcast storm I faced, in the mid-1990’s.  I had to walk around the office unplugging things until I found the source.  Hopefully it wasn’t that bad for AWS!
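
“Identifying the top sources of traffic” is conceptually simple, which is rather the point.  Here’s a miniature version in Python, with invented flow records:

from collections import Counter

# (source address, bytes) records, as you might pull from flow telemetry
flows = [("10.1.1.5", 900_000_000), ("10.2.2.8", 1_200_000),
         ("10.1.1.5", 850_000_000), ("10.3.3.9", 40_000_000)]

bytes_by_source = Counter()
for source, nbytes in flows:
    bytes_by_source[source] += nbytes

# Rank the top talkers so a human can decide what to isolate or disable
for source, total in bytes_by_source.most_common(3):
    print(source, total)

The hard part, as AWS discovered, is getting the telemetry when the network carrying it is the thing that’s down.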

Outages happen, and Amazon has maintained a high level of service with AWS since the beginning.  The resiliency of such a complex environment should be astounding to anyone who has built and managed complex systems.  Still, at the end of the day, no matter how much you automate (and you should), no matter how much you assure (and you should), sometimes you have to dust off the packet sniffer and figure out what’s actually going down the wire.  For network engineers, that should be a reminder that you’re still relevant in a software-defined world.

Automation gone wild

As I write this, a number of sites out on the Internet are down because of an outage at Amazon Web Services.  Delta Airlines is suffering a major outage.  On a personal note, my wife’s favorite radio app and my Lutron lighting system are not operating correctly.  Of course, this outage is a reminder of the simple principle of not putting all one’s eggs in a single basket.  AWS became the dominant cloud provider early on, but there are multiple viable alternatives now.  Long before the modern cloud emerged, I regularly ran disaster recovery exercises to ensure business continuity when a data center or service provider failed.  Everyone who uses a cloud provider had better have a backup, and a way to periodically test that backup.  A few startups have emerged to make this easier.

While the cause of the outage is yet unknown, there was an interesting comment in a Newsweek article on the outage.  Doug Madory, director of internet analysis at Kentik Inc, said:  “More and more these outages end up being the product of automation and centralization of administration…”  I’ve been involved in automation in some form or another for my entire six years at Cisco, and one aspect of automation is not talked about enough:  automation gone wild.  Let me give a non-computer example.

Back when I worked at the San Francisco Chronicle, the production department installed a new machine in our Union City printing plant.  The Sunday paper, back then, had a large number of inserts with advertisements and circulars that needed to be stuffed into the paper.  They were doing this manually, if you can believe it.

The new machine had several components.  One part of the process involved grabbing the inserts and carrying them in a conveyor system high above the plant floor, before dropping them down into the inserter.  It’s hard to visualize, so I’ve included a picture of a similar machine.

You can see the inserts coming in via the conveyor, hanging vertically.  This conveyor extended quite far.  One day I was in the plant, working on some networking thing or other, and the insert machine was running.  I looked back and saw the conveyor glitch somehow, and then a giant ball of paper started to form in the corner of the room, before finally exploding and raining paper down on the floor of the plant.  There was a commotion and one of the workers had to shut the machine down.

The point is, automation is great until it doesn’t work.  When it fails, it fails big.  You don’t just get a single problem, but a compounding problem.  It wasn’t just a single insert that got hit by the glitch, but dozens of them, if not more.  When you use manual processes, failures are contained.

Let’s tie this back to networking.  Say you need to push a new configuration to hundreds of devices, perhaps adding a new routing protocol.  If you do it by hand in one device, and suddenly routes start dropping out of the routing table, chances are you won’t proceed with the other devices.  You’ll check your config to see what happened and why.  But if you set up, say, a Python script to run around and do this via NETCONF to 100 devices, suddenly you might have a massive outage on your hands.  The same could happen using a tool like Ansible, or even a vendor network management platform.
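
A minimal sketch of such a script, using the ncclient library.  The addresses and credentials are placeholders, and I’m assuming devices that support the standard ietf-system YANG model;  the details will differ in real life:

from ncclient import manager

DEVICES = ["10.0.0.%d" % i for i in range(1, 101)]   # 100 routers

# Placeholder payload;  imagine a new routing protocol config instead
CONFIG = """<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <system xmlns="urn:ietf:params:xml:ns:yang:ietf-system">
    <contact>netops</contact>
  </system>
</config>"""

for host in DEVICES:
    # Connect over NETCONF (port 830) and push the same change everywhere
    with manager.connect(host=host, port=830, username="admin",
                         password="admin", hostkey_verify=False) as m:
        m.edit_config(target="running", config=CONFIG)
    print("pushed to", host)

Nothing in that loop notices if routes start dropping after device #1;  it will happily replicate a bad change to all 100.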

There are ways to combat this problem, of course.  Automated checks and validation after changes are an important one, but the problem with this approach is that you cannot predict every failure.  If you program 10 checks, the network will fail in way #11, and you’re out of luck.
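
One hedged sketch of such a check:  snapshot a health signal before the change, re-check it after, and halt the rollout on regression.  The route counts below are faked;  in real life you’d pull them from the device:

route_counts = {"rtr1": 800_000, "rtr2": 795_000}   # pretend pre-change state

def get_route_count(host):
    return route_counts[host]   # placeholder for a real device query

def safe_change(host, apply_change, tolerance=0.95):
    # Apply the change, then stop everything if the device lost routes
    before = get_route_count(host)
    apply_change(host)
    after = get_route_count(host)
    if after < before * tolerance:
        raise RuntimeError("%s lost routes (%d -> %d);  halting rollout"
                           % (host, before, after))

safe_change("rtr1", lambda host: None)   # a harmless change passes the check

But this guard only catches the failure modes you thought to check for–route counts, not the 11th way.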

As I said, I’ve spent years promoting automation.  You simply couldn’t build a network like Amazon’s without it.  And it’s critical for network engineers to continue developing skills in this area.  We, as vendors and promoters of automation tools, need to be careful how we build and sell these tools to limit customer risk.

Eventually they got the inserter running again.  Whatever the cause of Amazon’s outage, let’s hope it’s not automation gone wild.