
As I mentioned in my last post, I like modeling networks using tools like Cisco Modeling Labs or GNS3.  I recalled how, back in TAC, I had access to a Cisco-internal (at the time) tool called IOS on Unix, or IOU.  This enabled me to recreate customer environments in minutes, with no need to hunt down hardware.  Obviously, IOU didn’t work for every case.  Oftentimes the issue the customer raised was very hardware-specific, even when it was a “routing protocol” issue.  However, if I could avoid hardware, I would do the recreate virtually.

When I worked at Juniper (in IT), we did a huge project to refresh the WAN.  This was just before SD-WAN came about.  We sourced VPLS from two different service providers, and then ran our own layer 3 MPLS on top of it.  The VPLS just gave us layer 2 connectivity, like a giant switch.  We had two POPs in each region which acted as aggregation points for smaller sites.  For these sites we had CE routers deployed on prem, connecting to PE routers in the POPs.  This was a basic service provider configuration, with us acting as the service provider.  Larger sites had PE routers on site, with the campus core routers acting as CEs.

We got all the advantages of layer 3 MPLS (traffic engineering, segmentation via VRF) without the headaches (peering at layer 3 with your SP, yuck!).

As the “network architect” for IT, I needed a way to model and test changes to the network.  I used a tool called VMM, which was similar to IOU.  Using a text file, I could define a topology of routers and their interconnections.  Then I used a Python script to start it up.  I then had a fully functional network model running under a hypervisor, and I could test stuff out.

I never recreated the entire network–it wasn’t necessary.  I created a virtual version with two simulated POPs, a tier 1 site (PE on prem), and a tier 2 site (PE in POP).  I don’t fully remember the details; there may have been one or two other sites in my model.

For testing pure routing issues, assuming normally functioning devices, my VMM-based model was a dream.  Before we rolled out changes, we could test them in my virtual lab.  We could apply the configuration exactly as entered into the real device, to see what effect it would have on the network.  I just didn’t have the cool marketing term “digital twin,” as it didn’t exist yet.

I remember working on a project to roll out multicast on the WAN using Next Generation Multicast VPN (NGMVPN).  NGMVPN was (is?) a complex beast, and as I designed the network and sorted out things like RP placement, I used my virtual lab.  I even filed bugs against Juniper’s NGMVPN code, bugs I found while using my virtual devices.  I remember the night we did a pilot rollout to two sites.  Our Boston office dropped off the network entirely.  Luckily we had out-of-band access and rolled back the config.  I SSH’d into my virtual lab, applied the config, and spent a short amount of time diagnosing the problem (a duplicate loopback address had been applied), and did so without the stress of troubleshooting a live network.

I’ve always been a bit skeptical of the network simulation/modeling approach.  This is where you have some software intelligence layer that tries to “think through” the consequences of applied changes.  The problem is the variability of networks.  So many things can happen in so many ways.  Actual devices running actual NOS code in a virtual environment will behave exactly the way real devices will, within their constraints (such as not emulating the hardware precisely, not emulating all the different interface types, etc.).  I may be entirely wrong on this one; I’ve spent virtually no time with these products.

The problems I was modeling were protocol issues amongst a friendly group of routers.  When you add in campus networking, the complexity increases quite dramatically.  Aside from wireless being in the mix, you also have hundreds or thousands of non-network devices like laptops, printers, and phones, which often cause networks to behave unpredictably.  I don’t think our AI models are yet at the point where they can predict what comes with that complexity.

Of course, the problem you have is always the one you don’t predict.  In TAC, most of the cases I took were bugs.  Hardware and software behave unexpectedly.  As in the NGMVPN case, if there is a bug in software that is strictly protocol-related, you might catch it in an emulation.  But many bugs exist only on certain hardware platforms, or in versions of software that don’t run virtually, etc.

As for digital twins, I do think learning to use CML (of course I’m Cisco-centric) or similar tools is very worthwhile.  Preparing for major changes offline in a virtual environment is a fantastic way to prep for the real thing.  Don’t forget, though, that things never go as planned, and thank goodness for that, as it gives us all job security.

When I worked for the Gold partner I generally serviced clients in the San Francisco Bay Area, but because we were a national partner I was occasionally called to other locations around the country.  Being a double CCIE who had worked in TAC, I had a unique skill set among our engineers, which was often demanded by other field offices.

One day my boss called me and told me he needed my help with a customer out of Des Plaines, Illinois.  The company was a manufacturer of fuses.  They were experiencing a network meltdown and needed troubleshooting help.  Great, I thought, I left TAC and came here precisely to get out of this sort of thing.  I liked doing sales calls and new installations, not fixing buggy messes.

I was assigned to the customer on a Monday and was immediately pulled into what we often call a “shit show”.  (Pardon the language.)  The customer had a large international MPLS network with VPN backup.  Several of the sites were experiencing performance issues.  Sites were unable to perform manufacturing, and the previous CIO had been fired.  The interim CIO was an ex-military person who seemed to think he was George S. Patton.  He was scheduling calls from early in the morning until late at night: status updates, live troubleshooting sessions, and powwows with TAC.

Meanwhile, I was starting to feel ill.  Not because of the case, just sick to my stomach.  I didn’t think much of it at first, but it started to go downhill fast.  Luckily I was working from home due to the crazy hours.

But not for long.  The CIO had set up a troubleshooting session in the middle of the night Saturday, into Sunday morning.  He got my boss and me on the phone and insisted I come to Des Plaines that weekend, in person.  We argued every way we could that I could be just as productive remotely, but Patton was having none of it.  “If this is your best guy,” he said to my boss, “you need to have him on a plane and out here in person.  Otherwise we can take our business somewhere else.”  Not only was it the weekend, and not only did I feel ill, it was also Memorial Day weekend.  My brother, who actually was (and is) in the Army, was paying a rare visit to the Bay Area.  There was no sympathy from the customer, and soon I was booking my ticket to Illinois.  The local account executive booked me a car with GPS and promised to meet me at the airport.  Keep in mind, this was before smartphones, and so you needed to rent a car with a built-in GPS unit if you wanted to get around without maps.

I had a miserable flight and was starting to feel even more sick.  There’s nothing worse than being sick to your stomach on a plane.  Being forced to stay in your seat, and long lines for tiny bathrooms, make for torture.  I ate nothing, and arrived in Chicago late.  The account executive was nowhere to be found, and the car he had arranged did not have GPS.  The rental car company didn’t have any GPS-equipped cars, so they provided me with a map and directions.

I drove through Chicago to Des Plaines.  Realizing I needed to eat something, I found a McDonald’s, the only thing open at that time, and managed to choke down half of a Quarter Pounder.  My stomach felt like burning acid.  I continued my drive, through a bad part of the city.  I was on the right road, but I needed to keep pulling over to check the address.  A couple of times when I pulled over, swarms of what I assumed were drug dealers would approach.  I’d pull out just as they got to the car.

Eventually I made it to the customer site, and met the general.  I was shown into a conference room with a raised floor right next to the data center, and we began troubleshooting.

It was nothing I couldn’t have helped with remotely.  Basically, the customer had scoped circuits that were too small for the volume of traffic they were carrying.  They were also having degradation problems on the MPLS network.  Some of the sites were performing better on VPN backup circuits, so we were switching them to backups.  We performed tests with the telco.  We also looked into an issue with their core Catalyst 6k switch.  When they had done a circuit switch earlier in the week, all traffic on the network had stopped, according to the customer.  The customer had reloaded the core device and traffic came back.  Because there was no crash or crashdump file, and nothing in the logs, I could not explain this event.  It was a smorgasbord of issues, mostly due to bad design and a little due to bad luck.

The troubleshooting window was supposed to end at 2am, but we worked until 6am.  I had a flight to catch and hadn’t slept all night.  The customer wanted me to stick around but I told him to stuff it and left.  I checked in to the hotel room I had booked, slept one hour, checked out, and got on my plane.

On the flight back, I was seated in the middle seat between two very large people.  I figured out that they were married, but they only spoke Spanish.  I used my rudimentary Spanish to extract myself.  “Yo, a la ventana.  Su esposa, aquí,” I suggested.  They liked the idea.  I moved to the window seat.  When the plane took off, they spread out a massive feast on the tray tables.  I ate a single muffin from Starbucks, one bite at a time, until I landed and went home.

I never determined if it was a norovirus or food poisoning, but I lost twenty pounds in a week.  The customer realized they needed to invest in new circuits, which had a 12-16 week turnaround time.  I think the new CIO got fired as well.  And the account executive never invited me back.  I only wish he had shown up at the airport as I might have thrown up on him.

“Progress might have been alright once, but it has gone on too long.”
– Ogden Nash

The book The Innovator’s Dilemma appears on the desk of a lot of Silicon Valley executives.  Its author, Clayton Christensen, is famous for having coined the term “disruptive innovation.”  The term has always bothered me, and I keep waiting for the word “disruption” to die a quiet death.  I have the disadvantage of having studied Latin quite a bit.  The word “disrupt” comes from the Latin verb rumpere, which means to “break up”, “tear”, “rend”, “break into pieces.”  The Latin word, like our English derivative, connotes something quite bad.  If you think “disruption” is good, what would you think if I disrupted a presentation you were giving?  What if I disrupted the electrical system of your heart?

Side note:  I’m fascinated with the tendency of modern English to use “bad” words to connote something good.  In the 1980s the word “bad” actually came to mean its opposite.  “Wow, that dude is really bad!” meant he was good.  Cool people use the word “sick” in this way.  “That’s a sick chopper” does not mean the motorcycle is broken.

The point, then, of disruption is to break up something that already exists, and this is what lies beneath the b-school usage of it.  If you innovate, in a disruptive way, then you are destroying something that came before you–an industry, a way of working, a technology.  We instantly assume this is a good thing, but what if it’s not?  Beneath any industry, way of working, or technology are people, and disruption is disruption of them, personally.

The word “innovate” also has a Latin root.  It comes from the word novus, which means “new”.  In industry in general, but particularly the tech industry, we positively worship the “new”.  We are constantly told we have to always be innovating.  The second a technology is invented and gets established, we need to replace it.  Frame Relay gave way to MPLS, MPLS is giving way to SD-WAN, and now we’re told SD-WAN has to give way…  The life of a technology professional, trying to understand all of this, is like a man trying to walk on quicksand.  How do you progress when you cannot get a firm footing?

We seem to have forgotten that a journey is worthless unless you set out on it with an end in mind.  One cannot simply worship the “new” because it is new–this is self-referential pointlessness.  There has to be a goal, or an end–a purpose, beyond simply cooking up new things every couple of years.

Most tech people and b-school people have little philosophical education outside of, perhaps (and unfortunately), Atlas Shrugged.  Thus, some of them, realizing the pointlessness of endless innovation cycles, have cooked up ludicrous ideas about the purpose of it all.  Now we have transhumanists telling us we’ll merge our brains with computers and evolve into some sort of new God-species, without apparently realizing how ridiculous they sound.  COVID-19 should disabuse us of any notion that we’re not actually human beings, constrained by human limitations.

On a practical level, the furious pace of innovation, or at least what is passed off as such, has made the careers of technology people challenging.  Lawyers and accountants can master their professions and then worry only about incremental changes.  New laws are passed every year, but fundamentally the practice of their profession remains the same.  We, however, seem to face radical disruption every couple of years.  Suddenly, our knowledge is out-of-date.  Technologies and techniques we understood well are yesterday’s news, and we have to re-invent ourselves yet again.

The innovation imperative is driven by several factors:  Wall Street constantly pushes public companies to “grow”, thus disparaging companies that simply figure out how to do something and do it well.  Companies are pressured into expanding to new industries, or into expanding their share of existing industries, and hence need to come up with ways to differentiate themselves.  On an individual level, many technologists are enamored of innovation, and constantly seek to invent things for personal satisfaction or for professional gain.  Wall Street seems to have forgotten the natural law of growth.  Name one thing in nature that can grow forever.  Trees, animals, stars…nothing can keep growing indefinitely.  Why should a company be any different?  Will Amazon simply take over every industry and then take over governing the planet?  Then what?

This may seem a strange article coming from a leader of a team in a tech company that is handling bleeding edge technologies.  And indeed it would seem to be a heresy for someone like me to say these things.  But I’m not calling for an end to inventing new products or technologies.  Having banged out CLI for thousands of hours, I can tell you that automating our networks is a good thing.  Overlays do make sense in that they can abstract complexity out of networks.  TrustSec/Scalable Group Tags are quite helpful, and something like this should have been in IP from the beginning.

What I am saying is that innovation needs a purpose other than just…innovation.  Executives need to stop waxing eloquent about “disrupting” this or that, or our future of fusing our brains with an AI Borg.  Wall Street needs to stop promoting growth at all costs.  And engineers need time to absorb and learn new things, so that they can be true professionals and not spend their time chasing ephemera.

Am I optimistic?  Well, it’s not in my nature, I’m afraid.  As I write this we are in the midst of the Coronavirus crisis.  I don’t know what the world will look like a year from now.  Business as usual, with COVID a forgotten memory?  Perhaps.  A Great Depression due to economic shutdown?  Perhaps.  Total societal, governmental, and economic collapse, with rioting in the streets?  I hope not, but perhaps.  Whatever happens, I do hope we remember that the word “novel”, as in “novel Coronavirus”, comes from the same Latin root as the word “innovation”.  New isn’t always the best.

The case came into the routing protocols queue, even though it was simply a line card crash.  The RP queue in HTTS was the dumping ground for anything that did not fit into one of the few other specialized queues we had.  A large US service provider had a Packet over SONET (PoS) line card on a GSR 12000-series router crashing over and over again.

Problem Details: 8 Port ISE Packet Over SONET card continually crashing due to:

SLOT 2:Aug  3 03:58:31: %EE48-3-ALPHAERR: TX ALPHA: error: cpu int 1 mask 277FFFFF
SLOT 2:Aug  3 03:58:31: %EE48-4-GULF_TX_SRAM_ERROR: ASIC GULF: TX bad packet header detected. Details=0x4000

A previous engineer had the case, and he did what a lot of TAC engineers do when faced with an inexplicable problem:  he RMA’d the line card.  As I have said before, RMA is the default option for many TAC engineers, and it’s not a bad one.  Hardware errors are frequent, and replacing hardware is often a quick route to solving the problem.  Unfortunately, the RMA did not fix the problem; the case got requeued to another engineer, and he…RMA’d the line card.  Again.  When that didn’t work, he had them try the card in a different slot, but it continued to generate errors and crash.

The case bounced through two other engineers before getting to me.  Too bad the RMA option was out.  But the simple line card crash and error got even weirder.  The customer had two GSR routers in two different cities that were crashing with the same error.  Even stranger:  the crash was happening at precisely the same time in both cities, down to the second.  It couldn’t be a coincidence, because each crash on the first router was mirrored by a crash at exactly the same time on the second.

The conversation with my fellow engineers ranged from plausible to ludicrous.  There was a legend in TAC, true or not, that solar flares cause parity errors in memory and hence crashes.  Could a solar flare be triggering the same error on both line cards at the same time?  Some of my colleagues thought it was likely, but I thought it was silly.

Meanwhile, internal emails were going back and forth with the business unit to figure out what the errors meant.  Even for experienced network engineers, Cisco internal emails can read like a foreign language.  “The ALPHA errors are side-effects of the GULF errors,” one development engineer commented, not so helpfully.  “Engine is feeding invalid packets to GULF and that causes the bad header error being detected on GULF,” another replied, only slightly more helpfully.

The customer, meanwhile, had identified a faulty fabric card on a Juniper router in their core.  Apparently the router was sending malformed packets to multiple provider edge (PE) routers all at once, which explained the simultaneous crashing.  Because all the PEs were in the US, forwarding was a matter of milliseconds, and thus there was very little variation in the timing.  How did the packets manage to traverse the several hops of the provider network without crashing any GSRs in between?  Well, the customer was using MPLS, and the corruption was in the IP header of the packets.  The intermediate hops forwarded the packets, without ever looking at the IP header, to the edge of the network, where the MPLS labels get stripped, and IP forwarding kicks in.  It was at that point that the line card crashed due to the faulty IP headers.  That said, when a line card receives a bad packet, it should drop it, not crash.  We had a bug.

The development engineers could not determine why the line card was crashing based on log info.  By this time, the customer had already replaced the faulty Juniper module and the network was stable.  The DEs wanted us to re-introduce the faulty line card into the core, and load up an engineering special debug image on the GSRs to capture the faulty packet.  This is often where we have a gulf, pun intended, between engineering and TAC.  No major service provider or customer wants to let Cisco engineering experiment on their network.  The customer decided to let it go.  If it came back, at least we could try to blame the issue on sunspots.

In my previous post, we saw the theory behind hub-and-spoke VPN. We saw how H/S involves multiple VRFs with cross-importation between them, and we traced the basic flow of a route advertised from one spoke to another.
Next, we are going to look at two options for configuring H/S VPNs. In this post, I will cover using BGP as the PE-CE routing protocol without independent route reflectors. In my next post, I will cover OSPF. Finally, I will return to BGP and examine the issues that come up when we use independent route reflectors with hub and spoke VPN.

One of the JNCIE-SP exam objectives I found difficult was hub and spoke VPN. Conceptually it’s not easy, and as is often the case, the documentation is only somewhat helpful. This series of posts is designed to walk you through the concepts of hub and spoke VPN, as well as its basic configuration using BGP, and then OSPF as the PE-CE protocol. Finally I will talk about route reflector issues when using H/S VPN. It is an important topic to master for your JNCIE exam, and if properly explained you will find it’s not as difficult as it seems. This article assumes you are familiar with basic configuration of Layer 3 MPLS VPN, including vrf import and export policies.

Note: I recommend that you only use vrf import/export policies for the JNCIE lab and avoid the vrf-target command for layer 3 VPN.
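To make that concrete, here is a minimal sketch of a VRF defined with explicit vrf-import and vrf-export policies instead of vrf-target. Everything here (instance name, community name, target value, interface, and route distinguisher) is hypothetical, and the PE-CE protocol configuration is omitted:

policy-options {
    /* hypothetical route-target community for this VPN */
    community VPN-A-target members target:65000:100;
    policy-statement VPN-A-import {
        term accept-vpn-routes {
            from community VPN-A-target;
            then accept;
        }
        term reject-everything-else {
            then reject;
        }
    }
    policy-statement VPN-A-export {
        term tag-and-advertise {
            then {
                community add VPN-A-target;
                accept;
            }
        }
    }
}
routing-instances {
    VPN-A {
        instance-type vrf;
        interface ge-0/0/1.0;
        route-distinguisher 65000:1;
        vrf-import VPN-A-import;
        vrf-export VPN-A-export;
    }
}

With vrf-target, import and export are forced to be symmetric. Explicit policies like these can be made asymmetric, or even null in one direction, which is exactly what hub and spoke VPN requires.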

We’ll take a simple example topology with three sites.  Normally, if these sites are configured in a Layer 3 MPLS VPN, site 1 and site 2 are able to talk directly over the MPLS network. That is, the traffic between site 1 and site 2 is never seen by the hub site.
[Figure hs1: spoke-to-spoke traffic flowing directly between site 1 and site 2]
This is, of course, desirable. Routing the site-to-site traffic through a central site could produce significant delays, especially if the geographic distance between the sites is large. It also would consume bandwidth unnecessarily on the hub interfaces. By default, MPLS VPNs avoid this behavior.

Hub and spoke VPN changes this model and actually routes the spoke-to-spoke traffic through the hub site.
[Figure hs2: spoke-to-spoke traffic routed through the hub site]
Given the above paragraph, why would anybody want to do this? Well, I would give you the caveat that for expert-level certification exams, you should not spend too much time asking why. Nevertheless, one reason you might use this configuration is to apply some sort of monitoring to the spoke-to-spoke traffic.

You will notice that the hub site has two interfaces between its CE and PE. Whether physical or logical, two interfaces are required.

We’re going to blow the diagram up a bit and put some actual routers in to flesh out the picture, and then we are going to look at how the control plane works.

[Figure hs-route: a route advertised by the site 1 CE, passing through the hub PE and hub CE on its way to the site 2 CE]

Site 1’s CE advertises a route (172.255.255.9) to its PE. The PE router then advertises the route via MBGP to the hub PE. The hub PE then advertises it over the top link to the hub CE. The hub CE learns the route and then re-advertises it back to the same PE it learned it from. If you haven’t seen h/s VPN before, this may confuse you, but here comes the trick: the two interfaces that connect the hub CE to the hub PE are in different VRFs. Therefore, the hub PE will have a copy of the 172.255.255.9 route in both VRFs. This is the part that makes it a bit confusing.

The hub PE, having re-learned the route from the hub CE, then advertises it out to the site 2 PE, which sends it on its way to the site 2 CE. Since site 2’s route came from the hub PE via the hub CE, the route actually points along the path through the hub site. As a result of this hocus-pocus, the data plane will forward through the hub site. To achieve this, we will need to define specific route export/import policies on the PEs and do some cross-VRF importing, as illustrated below.

[Figure hs-communities: the export/import policies and route-target communities used on the spoke and hub PEs]

The spoke PEs do not learn routes directly from each other, but instead only learn routes that have been re-advertised by the hub site. Therefore, they will only import routes into their VRF that have the hub VRF target. They will not, repeat, will not import routes with the spoke VRF target. This may be counterintuitive, because they are not importing routes with their “own” VRF target. However, this is how we achieve the cross-VRF importation. Even though they do not import routes with their own target, they do export with it. All routes exported from the spoke sites have the spoke target.
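As a rough sketch of what this means on a spoke PE (the community names and target values are hypothetical), the policies might look like this:

policy-options {
    community hub-comm members target:65000:1;
    community spoke-comm members target:65000:2;
    policy-statement spoke-import {
        /* accept only routes the hub has re-advertised */
        term from-hub {
            from community hub-comm;
            then accept;
        }
        term reject-everything-else {
            then reject;
        }
    }
    policy-statement spoke-export {
        /* tag everything we advertise with the spoke target */
        term to-hub {
            then {
                community add spoke-comm;
                accept;
            }
        }
    }
}

Note the asymmetry: the spoke imports on the hub target but exports with the spoke target, which could not be expressed with a single vrf-target statement.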

The hub PE is even more complex. It has two VRFs. The spoke VRF imports routes that have the spoke target, but it doesn’t export anything. That’s because we want to force the spoke sites to pull from the hub VRF. No routes are exported from the spoke VRF. Meanwhile, the hub instance does not import anything at all. It pulls no routes from MBGP or the other PEs. Instead, any routes in its table (aside from interface routes) it learns from the hub CE. Unlike the spoke instance, however, it does export routes, which are tagged with the hub VRF target. It is this target that the spokes accept.

In my next article you will see how we configure a null policy to achieve these goals. In the meantime, a little memory trick helped me to configure this. On the hub router, the spoke instance imports but does not export. I named my import policy “spoke-in”, which sounds a bit like Spokane, the city. It may not be the greatest memory trick, but you can remember all the rest of the hub PE policies if you remember this. Remember, each VRF has a “null” policy, and they are the opposites of each other. So, if the spoke has an import policy (“spoke-in”), its export policy must be null; therefore, the import policy for the hub must be null, and it must have an export policy (“hub-out”). This will become clearer in future articles when I explain the full configuration, but a rough preview follows below.
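Here is how the two VRFs on the hub PE might be sketched, reusing the hypothetical community names from the spoke sketch above (the interfaces and route distinguishers are also made up):

policy-options {
    policy-statement spoke-in {
        /* accept routes tagged with the spoke target */
        term from-spokes {
            from community spoke-comm;
            then accept;
        }
        term reject-everything-else {
            then reject;
        }
    }
    policy-statement hub-out {
        /* re-export routes learned from the hub CE with the hub target */
        term to-spokes {
            then {
                community add hub-comm;
                accept;
            }
        }
    }
    /* the "null" policy: reject everything */
    policy-statement null-policy {
        then reject;
    }
}
routing-instances {
    spoke-vrf {
        instance-type vrf;
        /* first link to the hub CE: spoke routes are advertised down this one */
        interface ge-0/0/1.0;
        route-distinguisher 65000:101;
        vrf-import spoke-in;
        vrf-export null-policy;
    }
    hub-vrf {
        instance-type vrf;
        /* second link: the hub CE re-advertises the routes back up this one */
        interface ge-0/0/2.0;
        route-distinguisher 65000:102;
        vrf-import null-policy;
        vrf-export hub-out;
    }
}

Each VRF has one real policy and one null policy, and they are mirror images of each other, which is the whole memory trick.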

To summarize this article:

  • Hub and Spoke MPLS VPN routes traffic through a hub site instead of directly between spokes.
  • To achieve this, the control plane (i.e. routing) also follows a hub and spoke model.
  • The spoke routers import from and export to different VRFs on the hub PE.
  • The hub PE has two VRFs, one for sending routes to the hub CE and one for receiving them.

Look forward to the next article, which will explain how to configure this with BGP as the PE-CE routing protocol.

This article continues to be the most popular one on this blog.  However, I published it back in 2014 while I was working on my JNCIE-SP, and that was a long time ago.  I now work at Cisco and do not have access to Junos, and my memory of Junos is getting spotty.  I am happy if the article helps you, and feel free to leave a comment, but unfortunately I will not be able to help you with specific questions on this or other Juniper topics.


Continuing on the subject of confusing Junos features, I’d like to talk about RIB groups. When I started here at Juniper, I remember being utterly baffled by this feature and its use. RIB groups are confusing both because the official documentation is confusing, and because many people, trying to be helpful, say things that are entirely wrong. I do think there would have been an easier way to design this feature, but RIB groups are what we have, so that’s what I’ll talk about.

When I first started configuring MPLS on Juniper routers, I came across the strange and mysterious inet.3 table.  What could it possibly be?  When I worked in Cisco TAC I handled hundreds of MPLS VPN cases, but I had never encountered anything quite like inet.3 in IOS land.  As I researched inet.3 I found the documentation sparse and confusing, so when I finally came to understand its purpose I decided to create a clear explanation for those who are searching in vain.  I will focus on the basics of how inet.3 works, leaving the details of its use for later posts.