Question about MPLS, EVPN/VXLAN, overlays

Methanar · March 9, 2025, 3:37pm

I’m doing a bunch of bare metal datacenter kubernetes stuff, which is what prompted this question.

Am I correct in understanding that MPLS is effectively just encapsulation-as-a-service sold by your carrier from some IX to another? With layer 2 and layer 3 flavors.

For example imagine I was doing some dumb virtual ARP, VIP thing. ARP depends on broadcasting and if I wanted a broadcast domain stretched between two or more disparate DCs I’d obviously need some form of encapsulation to maintain my VLAN tags across the wan to have a vlan span the DCs. There are ways of doing this, I could do a site-site ipsec vpn with L2TP, I could do dumb things with pseudowires, or VXLANs.

Am I correct in understanding that MPLS, despite a different underlying implementation, is effectively solving the same problems, just as a provider managed thing they sell me where I don’t need to own an IPsec VPN deal myself?

If MPLS is just another encapsulation deal, how does its implementation and path selection differ in a meaningful way from others. It’s often depicted as a cloud, but of course that’s just an abstraction. What does an MPLS network really mean in terms of l3 connectivity and fiber that’s different from the normal IP on normal fiber that I love. Does MPLS traffic get multiplexed onto the very same longhaul fiber as the normal IP stuff?

---

Sort of unrelated, but what exactly is EVPN. Huawei of all companies seems to actually have the most readable docs on this, where it’s described as a control plane for VXLANs. https://support.huawei.com/enterprise/en/doc/EDOC1100168670

Is that a fair description? Why would I ever want to use BGP EVPN, layer 2 extension stuff in either a DC, or a carrier setting over some alternative. What real world problem does this solve.

error404 · March 9, 2025, 3:37pm

This is a bit of a bugbear of mine…

MPLS is simply a protocol for tagging and label switching packets. If someone is selling you MPLS that is anything other than a router or switch of some sort, especially if it’s a service, they’re selling you something built using MPLS, not ‘MPLS’. When carriers talk about MPLS they usually but not always mean they are going to build you an L3VPN that runs on their MPLS core. Pretty much every service delivered by a modern carrier will be built on top of MPLS, so literally anything you buy from them could be called ‘MPLS’. It’s a marketing term that causes more confusion than clarity.

Remainder assumes you are talking about a carrier-provided L3VPN solution.

Am I correct in understanding that MPLS, despite a different underlying implementation, is effectively solving the same problems, just as a provider managed thing they sell me where I don’t need to own an IPsec VPN deal myself?

L3VPN won’t help you span L2, but you should really avoid doing that anyway. It will give you a private routing domain that you can use to route traffic between any number of sites over the provider’s WAN. It doesn’t replace IPsec, since it is usually not encrypted, but from a high level ‘how does traffic travel’ point of view, it is similar to a mesh VPN.

What does an MPLS network really mean in terms of l3 connectivity and fiber that’s different from the normal IP on normal fiber that I love. Does MPLS traffic get multiplexed onto the very same longhaul fiber as the normal IP stuff?

These questions you will have to pose to the particular carrier you are buying from. For most modern carriers, yes, all their IP and non-IP (EVPL etc.) traffic will be carried on the same MPLS core. They may (or may not) provide higher QoS for their carrier services traffic than their generic IP transit, but usually you at least have the option of paying (a lot, IME) extra for this. In the naive case it’s fairly likely that between two given POPs, the MPLS and transit traffic will use the same path. There is a lot of flexibility though, so they can do many things here.

Is that a fair description? Why would I ever want to use BGP EVPN, layer 2 extension stuff in either a DC, or a carrier setting over some alternative. What real world problem does this solve.

You no longer have any L2 traffic anywhere, which is a huge boon. No more VLAN pruning, no more broadcast issues, you can use ECMP and route-based fast failover mechanisms, and it makes dynamic provisioning much simpler. It’s pretty awesome if you are a service provider that does a lot of L2 in the DC. For Enterprise it is probably much less useful.

staticv0id · March 9, 2025, 3:37pm

MPLS is effectively just encapsulation-as-a-service sold by your carrier from some IX to another?

MPLS service is never offered at an Internet Exchange. It is offered on a single carrier network. You can buy an MPLS network from Lumen or AT&T, but you would not access that through DE-CIX or AMS-IX or CoreSite Any2, for example.

Now, that carrier may leverage other carriers to reach your other locations, so Lumen might buy a circuit from AT&T on your behalf to your location in Illinois, for example. But that is a different circuit from an Internet transit, peering, or IX link.

With layer 2 and layer 3 flavors.

To an enterprise, “MPLS” typically means layer 3 service. The carrier provisions a port that has an IP on it, and that IP is contained in a virtual routing table that is separate from the Internet.
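As a rough illustration of what "a virtual routing table that is separate from the Internet" means, here is a minimal Python sketch. The VRF names, prefixes, and next hops are invented for the example; a real PE router keeps these tables in hardware and populates them via BGP.

```python
import ipaddress

# Hypothetical per-VRF routing tables on one provider edge router.
# The same prefix can exist in several VRFs without conflict.
VRFS = {
    "customer-a": {"10.0.0.0/8": "pe2", "10.1.0.0/16": "pe3"},
    "customer-b": {"10.0.0.0/8": "pe7"},
    "internet": {"0.0.0.0/0": "transit"},
}

def lookup(vrf, dst):
    """Longest-prefix match, restricted to the routing table of one VRF."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in VRFS[vrf].items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

print(lookup("customer-a", "10.1.2.3"))  # pe3
print(lookup("customer-b", "10.1.2.3"))  # pe7
```

The same destination address resolves differently depending on which customer port the packet arrived on, and a customer VRF with no default route simply has no path to Internet destinations.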

In the layer 2 realm, there are also:

Ethernet Private Lines, or EPLs. An EPL takes an Ethernet frame from the customer and sends it to another port on the service provider network.
Ethernet Virtual Private Lines (EVPL): Same as an EPL, except this can connect to many locations. 802.1Q VLAN tags are used to identify what traffic goes where.
Virtual Private LAN Service (VPLS): This is a way to provide the same broadcast domain to many locations at once. Every port at every location resides on the same layer 2 segment, and so typically in the same IP subnet.
Ethernet VPN (EVPN): Similar to VPLS, except the mechanism for learning hardware MAC addresses on each router is different (covered below).

Am I correct in understanding that MPLS, despite a different underlying implementation, is effectively solving the same problems,

Yes, with the added value of quality of service, and a L3VPN or L2VPN is isolated from the Internet. Internet traffic cannot touch your router that is connected to an MPLS network, unless you opt for a network-based firewall from the carrier.

There are different, more effective tools to manage flows and congestion. Also, the VPN capability of MPLS is secondary to its primary goal: reducing the number of IP table lookups for Internet traffic within a service provider network.

What does an MPLS network really mean in terms of l3 connectivity and fiber that’s different from the normal IP on normal fiber that I love.

Isolation from the Internet and QOS. Same fiber, different service.

Does MPLS traffic get multiplexed onto the very same longhaul fiber as the normal IP stuff?

Yes. There is an extra MPLS label in the label stack at the front of the packet (the inner, bottom-of-stack label) that tells the destination router which customer routing table and which interface to use when routing the packet.
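For anyone who wants to see what that label stack looks like on the wire, here is a small Python sketch of the 32-bit label stack entry layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The label values and the payload are made up; only the bit layout is meant to be taken literally.

```python
import struct

def encode_label(label, tc=0, bottom=False, ttl=64):
    """Pack one 32-bit MPLS label stack entry: 20-bit label, 3-bit TC, 1-bit S, 8-bit TTL."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)

def decode_stack(data):
    """Read label entries from the front of the payload until the bottom-of-stack bit."""
    labels, offset = [], 0
    while True:
        (word,) = struct.unpack_from("!I", data, offset)
        offset += 4
        labels.append(word >> 12)
        if (word >> 8) & 1:
            return labels, data[offset:]

# Hypothetical labels: 16001 = transport label toward the egress PE,
# 30005 = inner service label selecting the customer VRF.
packet = encode_label(16001) + encode_label(30005, bottom=True) + b"ip-packet"
print(decode_stack(packet))  # ([16001, 30005], b'ip-packet')
```

The core routers only ever look at the first (outer) label; the inner label is left untouched until the packet reaches the egress router.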

MPLS also has a traffic engineering capability, so an SP that has configured that capability can route your packets over a different set of backbone circuits, if they administratively choose to do so.

Sort of unrelated, but what exactly is EVPN.

EVPN is a way to extend a broadcast domain across a service provider network. It is similar to VPLS in that Ethernet frames are tagged and routed through the SP network, but EVPN learns MAC addresses via BGP to provide loop-free routing. By contrast, in VPLS, each router learns MACs independently, similar to a switch.

What real world problem does this solve.

VPLS can be difficult to administer. Each router with the L2 VFI must have a direct or indirect link to another, which becomes unwieldy when many routers must participate in the same VFI. I have also seen issues in a VPLS full mesh configuration, because routers start learning MACs through multiple paths. EVPN moves the MAC learning function into BGP.

HTH!

edit: editor ate a line, formatting

jiannone · March 9, 2025, 3:37pm

If MPLS is just another encapsulation deal, how does its implementation and path selection differ in a meaningful way from others. It’s often depicted as a cloud, but of course that’s just an abstraction.

Traffic engineering is a significant component of mpls networking though the primary selling point of mpls from the sp perspective is backbone convergence. Multiple services and tenants enter the network at a label ingress router and one backbone interface supports all services. No ATM PVC, no frame relay DLCI, no VLANs in the core.

Traffic engineering started with RSVP signaling at ingress PEs with CSPF. CSPF selects a route with criteria beyond interface metrics. Now TE is supported by controllers via PCEP in RSVP and segment routing. PCEP in controllers provides more flexible and more dynamic path selection using temporal criteria, like latency changes and link utilization. I don’t know how widely deployed that level of TE is in production networks but that’s the promise.
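To give a feel for what CSPF adds over plain SPF, here is a minimal Python sketch: prune the links that fail a constraint (available bandwidth here), then run ordinary Dijkstra on what is left. The topology, metrics, and bandwidth figures are invented, and real implementations consider more constraints (admin groups, SRLGs, hop limits).

```python
import heapq

def cspf(links, src, dst, min_bw):
    """Dijkstra over only those links with at least min_bw of available bandwidth."""
    graph = {}
    for a, b, metric, avail_bw in links:
        if avail_bw >= min_bw:  # constraint: prune links that cannot carry the LSP
            graph.setdefault(a, []).append((b, metric))
            graph.setdefault(b, []).append((a, metric))
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, metric in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + metric, nxt, path + [nxt]))
    return None  # no path satisfies the constraint

# Hypothetical topology: (router, router, IGP metric, available Gbps)
LINKS = [("A", "B", 10, 2), ("B", "D", 10, 2), ("A", "C", 15, 40), ("C", "D", 15, 40)]
print(cspf(LINKS, "A", "D", min_bw=1))   # (20, ['A', 'B', 'D'])
print(cspf(LINKS, "A", "D", min_bw=10))  # (30, ['A', 'C', 'D'])
```

A small LSP takes the IGP-shortest path, while a 10 Gbps LSP is steered onto the longer path that actually has the capacity.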

rankinrez · March 9, 2025, 3:37pm

There is a ton of information on this online, including the RFC specifications.

You’re kind of getting the wrong end on MPLS. It’s a way to get data from one place to another, in which devices switch traffic based on a “label” in the header. Various protocols exist to build label tables / paths across the network. It also allows for virtualised “overlay” networks by using a second, inner-label which represents a particular virtual network (L3 VRF or L2 Vlan etc), which the routers at the edge of the network can use.

EVPN is a BGP SAFI, basically it’s a variant of BGP which can send both IP and MAC addresses (to distribute information about L2 and L3 network destinations.) It also includes attributes in all updates which identify what network they belong to, allowing it to distribute information for multiple separate networks in the same protocol. Updates also include label information for MPLS or VXLAN encapsulation. You can consider it as an evolution of the VPN-IPv4/6 MP-BGP address families.
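A loose Python sketch of that idea: every update carries a tag (a route target) saying which virtual network it belongs to, and each edge device imports only the tags it cares about into the matching local table. The field names, route targets, and addresses below are illustrative, not the actual BGP encoding.

```python
from collections import defaultdict

# Hypothetical EVPN-style updates: each carries a route target saying which
# virtual network it belongs to, plus a MAC, an optional IP, and a VNI or label.
UPDATES = [
    {"rt": "65000:100", "mac": "aa:aa:aa:00:00:01", "ip": "10.0.0.1", "vni": 10100, "next_hop": "192.0.2.1"},
    {"rt": "65000:200", "mac": "aa:aa:aa:00:00:01", "ip": "10.0.0.1", "vni": 10200, "next_hop": "192.0.2.2"},
    {"rt": "65000:300", "mac": "bb:bb:bb:00:00:09", "ip": None, "vni": 10300, "next_hop": "192.0.2.3"},
]

# Which route targets this edge device imports, and into which local table.
IMPORTS = {"65000:100": "tenant-red", "65000:200": "tenant-blue"}

def import_routes(updates, imports):
    """Sort received updates into per-tenant MAC tables by route target."""
    tables = defaultdict(dict)
    for u in updates:
        table = imports.get(u["rt"])
        if table is None:
            continue  # no local network imports this route target
        tables[table][u["mac"]] = (u["next_hop"], u["vni"])
    return dict(tables)

print(import_routes(UPDATES, IMPORTS))
```

Note that the same MAC and IP appear in two tenants without clashing, which is the same trick the VPN-IPv4/6 address families use for overlapping customer prefixes.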

I strongly prefer routed networks to “stretching” layer-2, but there are cases you encounter where the application needs L2. Some of these include app/VM “mobility”, moving from one physical server to another, where it may not work unless both servers are on the same L2 segment. Some load-balancing techniques require L2 adjacency too. Probably way more. Generally better to engineer applications and server infra so you don’t need L2 adjacency, but if you haven’t then creating that adjacency with an overlay is a lot safer and more scalable than trunking vlans everywhere.

OhMyInternetPolitics · March 9, 2025, 3:37pm

I see others have posted quite a bit about MPLS, but I did want to add my two cents about EVPN.

Ethernet VPN (EVPN) is a signaling protocol implemented via BGP to support a stretched L2 bridge over L3. It allows learned MACs to be advertised as routes between routers.

This becomes useful when you have to span L2 for some reason (the more common use case is vMotion between two locations). In a vMotion scenario:

VM-1 is in Datacenter A; Router A learns the MAC address of VM-1 and sends a prefix update to Router B in Datacenter B with the MAC Address and a sequence number (let’s say 1 for now)
VM-1 is transferred to Datacenter B via vMotion
Router B learns the MAC address of VM-1 on its own interfaces (versus the route from BGP) and sends a prefix update to Router A in Datacenter A with the MAC Address and a sequence number of 2
Router A sees the route from Router B with a higher sequence number - so its own route is out of date - and withdraws its announcement and prefers the BGP route from Router B
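The four steps above can be sketched in a few lines of Python. This is a toy model, not how any vendor implements it: the class and method names are invented, and the sequence numbers follow the numbering used in the steps (starting at 1).

```python
class EvpnRouter:
    def __init__(self, name):
        self.name = name
        self.table = {}  # mac -> (owner router name, sequence number)

    def learn_local(self, mac):
        """MAC seen on a local interface: claim it with the next sequence number."""
        _, seq = self.table.get(mac, (None, 0))
        self.table[mac] = (self.name, seq + 1)
        return {"mac": mac, "owner": self.name, "seq": seq + 1}

    def receive(self, adv):
        """Accept a remote advertisement only if its sequence number is higher."""
        _, seq = self.table.get(adv["mac"], (None, 0))
        if adv["seq"] > seq:
            self.table[adv["mac"]] = (adv["owner"], adv["seq"])

router_a, router_b = EvpnRouter("A"), EvpnRouter("B")
mac = "00:50:56:aa:bb:cc"  # hypothetical MAC for VM-1

router_b.receive(router_a.learn_local(mac))  # VM-1 starts in Datacenter A, seq 1
router_a.receive(router_b.learn_local(mac))  # after vMotion, B advertises seq 2
print(router_a.table[mac], router_b.table[mac])  # ('B', 2) ('B', 2)
```

After the move both routers agree that VM-1 lives behind Router B, because the higher sequence number wins on both sides.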

How does the rest of the network see VM-1? If VM-1 is in Datacenter A:

All hosts in Datacenter A will learn VM-1’s MAC normally
All hosts in Datacenter B will receive a spoofed MAC address coming from Router B.

In the event that there’s a serious problem (a loop, or maybe a bad link) that causes the MAC address to flip-flop between datacenters, EVPN has a dampening process that freezes out learning/advertising a MAC address to other neighbors for a short period of time.
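A rough Python sketch of that dampening logic: count moves per MAC inside a sliding window and stop advertising once a threshold is hit. The thresholds (5 moves in 180 seconds, which I believe are the defaults suggested in RFC 7432) and the hold time are assumptions here, and the class name is made up; check your vendor's documentation for the real knobs.

```python
from collections import defaultdict, deque

class MoveDamper:
    def __init__(self, max_moves=5, window=180.0, hold=300.0):
        self.max_moves, self.window, self.hold = max_moves, window, hold
        self.moves = defaultdict(deque)  # mac -> timestamps of recent moves
        self.frozen_until = {}           # mac -> time when learning resumes

    def record_move(self, mac, now):
        """Return True if the move may be advertised, False if the MAC is frozen."""
        if self.frozen_until.get(mac, float("-inf")) > now:
            return False
        history = self.moves[mac]
        history.append(now)
        while now - history[0] > self.window:
            history.popleft()  # forget moves older than the window
        if len(history) >= self.max_moves:
            self.frozen_until[mac] = now + self.hold  # too many moves: freeze
            history.clear()
            return False
        return True

damper = MoveDamper(max_moves=3, window=10, hold=30)
print([damper.record_move("vm-1", t) for t in (0, 1, 2, 3, 40)])
# [True, True, False, False, True]
```

The third move inside the window trips the freeze, further moves are ignored while frozen, and learning resumes once the hold time has passed.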

So EVPN is just a signaling protocol - essentially a control plane.

EVPN integrates with multiple protocols, depending on what your network uses - the most common being either VXLAN or MPLS (there’s also PBB and E-Tree, but they’re less common). VXLAN/MPLS will actually be forwarding the network communication between hosts.

HTH!

agould246 · March 9, 2025, 3:37pm

SRv6 interestingly parts ways with MPLS … (I didn’t say SR-MPLS)

Layer 2 in the SP is a big deal… just look at that long list of MEF-speak dropped by staticv0id … and a lot of it ain’t EVPN… cellular backhaul is huge… also many wholesale/enterprise svcs are requested to be layer 2 Ethernet end to end

Good discussion folks

Anonymous · March 9, 2025, 3:37pm

I don’t have anything to add, but the comment section was a very interesting (and informative) read.

Ardeck_1 · March 9, 2025, 3:37pm

MPLS is really good at scaling. With 5 sites it is not that interesting; with 500 it is a must-have.

It solves two issues. The first is mutualisation of links across multiple sites and customers (so very interesting for the provider, meaning cheaper for the customer). The other advantage is the simplicity: BGP, LDP and MPLS are very closely tied together and everything is automated (I mean for the label/VPN exchange), so again good for the provider.

EVPN is very similar to MPLS, and it adds a feature missing in MPLS: L2 VPN. In MPLS the VPNs are exchanged as L3 prefixes. In VXLAN you have both L2 and L3 prefixes.
So evpn brings you scalability, mutualisation AND layer 2 between multiple sites.

It is difficult to directly answer your question.
MPLS encapsulates L3; pseudowires are just a hack. VXLAN natively supports L2.

VXLAN and MPLS are both L3 based, so you can transport VXLAN over MPLS and vice versa, or VXLAN over VXLAN.
So your “standard IP stuff” is often carried by an MPLS backbone. You speak about ix but for internet it is usually a different backbone (security, bandwidth consumption, features) or a vrf on the MPLS backbone.

hjuringen · March 9, 2025, 3:37pm

Calling the service you buy from a service provider MPLS is as useful as calling postal service wheels. All postal services use bikes, cars, trucks, trains or planes. All of these have wheels, so postal service should be called wheels (not!).

My primary service provider provides at least PSTN, cellular phone service, internet, L2 VPN and L3 VPN using MPLS.

Methanar · March 9, 2025, 3:37pm

For most modern carriers, yes, all their IP and non-IP (EVPL etc.) traffic will be carried on the same MPLS core.

This may yet be another carrier-specific question. But are you saying that some carriers with MPLS networks don’t run ethernet at all between their routers? But rather, their “l2” is MPLS. Everything, even IP, is built on top of an MPLS mesh?

That’s interesting. Is path selection handled at the MPLS level or the IP level then? I recognize it’s possible to do all of the route advertisements with BGP for both options. It’s almost strange to me that IP wouldn’t be the routed protocol in the WAN space.

I also recognize that the path selection process is different for MPLS vs IP. MPLS uses shortest path prefix matching with Dijkstra’s algorithm on the labels, rather than IP using longest prefix match on a cidr block.

In the IP space, you’re only ever really determining the next hop of a given packet given a routing table, or I guess PBR source routing shenanigans if you’re doing weird things. Whereas in MPLS you’re doing a one-time look up at the ingress to the MPLS where you determine the full end-to-end path of the packet with all intermediaries up front. Once the path selection process is complete for this packet, it gets encapsulated where the header information describes this full ‘virtual circuit’ path to take and no further lookups are made as the packet is sent on its way and goes to each hop in the virtual circuit. Is this correct?

JasonDJ · March 9, 2025, 3:37pm

Great explanation. I would add on to this one other point – the carriers sell an L2VPN or L3VPN solution when they are selling MPLS.

Many non-network folks and even some early-career folks will see the “VPN” part of that statement and take for granted that the “P” means private and assume that means encrypted.

It is not.

VPNs that you get in the consumer space like Nord or PIA or whatever, or VPNs that you use to work from home, are encrypted, and that’s likely the source of the confusion…but “private” doesn’t mean “invisible”. You may call your genitals your “private parts” but if you walk around naked, people can still see them.

When a carrier sells you a VPN, the “P” is often inaccurately interpreted to mean it’s fully private. It’s private in the sense that it’s your own “private” network. It’s not private in the sense that nobody can see it. The “VPN” is really no more than a VRF, your own private routing instance, and nothing else. It’s on the customer to provide their own encryption if they deem it necessary (which many regulatory audits likely would). Keep this in mind when speccing out a solution, as you’d likely want to make sure that the router/firewall making the connection is capable of performing IPsec at a minimum, and likely a solution that allows for shortcut tunnel creation (like DMVPN or ADVPN).

Don’t just assume that everything is end-to-end encrypted and that’s good enough. Remember SSL/TLS before version 1.3 still passes the certificates in the clear, most internal DNS solutions are clear, and there are plenty of companies out there using unencrypted LDAP and SMTP/IMAP. I could be wrong, but I think CIFS, NFS, and SMB are still all unencrypted protocols by default as well.

Newdeagle · March 9, 2025, 3:37pm

Great explanation, I learned a lot from this. Is there any reading you can recommend to learn more about these topics? You seem to be really knowledgeable!

my-qos-fu-is-bad · March 9, 2025, 3:37pm

MPLS is like Layer 2.5, sits between L2 and Layer 3.

Layer 2 could be anything, Ethernet, PPP, HDLC, you name it. Google Tag Switching so you can get how MPLS started.

Path selection is handled by the IGP and the local FIB. The info from the FIB, combined with the label bindings exchanged via LDP, creates an LFIB, which maps locally assigned labels for prefixes to outgoing labels and next hops. In the end packets are “label switched” by the device based on that table.
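Here is a simplified Python sketch of that relationship between the FIB, LDP, and the LFIB. The router names, prefixes, and label values are made up, and real LDP has more moving parts (label retention modes, implicit null and so on), but it shows that the next hop comes from the IGP while MPLS only attaches labels to it.

```python
# Hypothetical state on router R2.
FIB = {"10.0.1.0/24": "R3", "10.0.2.0/24": "R3", "10.0.3.0/24": "R1"}  # prefix -> IGP next hop

# Labels each LDP neighbor advertised for each prefix (their local labels).
REMOTE_BINDINGS = {
    "R1": {"10.0.1.0/24": 101, "10.0.2.0/24": 102, "10.0.3.0/24": 103},
    "R3": {"10.0.1.0/24": 301, "10.0.2.0/24": 302, "10.0.3.0/24": 303},
}

def build_lfib(fib, remote_bindings, first_label=16):
    """Allocate a local label per prefix and pair it with the next hop's label."""
    lfib = {}
    for local_label, prefix in enumerate(sorted(fib), start=first_label):
        next_hop = fib[prefix]  # chosen by the IGP, not by MPLS
        out_label = remote_bindings[next_hop][prefix]
        lfib[local_label] = (out_label, next_hop)
    return lfib

print(build_lfib(FIB, REMOTE_BINDINGS))
# {16: (301, 'R3'), 17: (302, 'R3'), 18: (103, 'R1')}
```

Local labels start at 16 here because values 0 through 15 are reserved. If the IGP changes a next hop, only the right-hand side of the affected LFIB entries changes.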

So in the end, there’s no MPLS doing SPF calculations or anything like that.

This answer probably doesn’t fully clarify your questions, but I hope it can guide you to Googling the right documentation so you can read and learn about the technology.

My 2 cents.

error404 · March 9, 2025, 3:37pm

Most of your questions have been answered, but a few additional points maybe I can add here.

This may yet be another carrier-specific question. But are you saying that some carriers with MPLS networks don’t run ethernet at all between their routers? But rather, their “l2” is MPLS. Everything, even IP, is built on top of an MPLS mesh?

In many networks, yes, they have a ‘BGP free core’. This means they only operate an IGP and MPLS on their core network, and have BGP only at the network edge. This enables the use of cheaper / higher speeds/feeds MPLS switches rather than needing big iron routers in the core. All production traffic then travels over LSPs on the MPLS network, rather than being routed. As others have pointed out already, there are lots of ways to build those LSPs and map next-hops to them, so the internal policy is going to be very specific to each carrier. But in general, the edge router at ingress will resolve the next-hop of the egress router, and put the packet into an LSP that will take it there. So routing decisions only occur on the ingress and egress edge routers, the rest is label-switched.

Some networks do still maintain a non-MPLS BGP network for their DFZ, often this is the same network they run MPLS on top of, using the ‘internet’ as the underlay. So your transit traffic isn’t necessarily being MPLS encapsulated, but any ‘carrier services’ almost certainly are.

One exception is within the metro, some networks (ILECs in particular) are still using 802.1ad QinQ for short-haul Ethernet circuits, but if it’s leaving the metro it’s going to be put into an l2circuit somewhere.

Once the path selection process is complete for this packet, it gets encapsulated where the header information describes this full ‘virtual circuit’ path to take and no further lookups are made as the packet is sent on its way and goes to each hop in the virtual circuit. Is this correct?

Each P hop still of course needs to do a lookup of the label to determine where to forward it and the label swap/pop that needs to occur, but the lookup is fixed-length, and a much smaller table, so is much simpler than a full route lookup and requires much less FIB/TCAM. It also scales with the size of the network, not the number of routes it handles, so you should ‘never’ need to upgrade due to FIB capacity. But notionally this is correct, the path is fixed by the label the ingress router puts on and no routing decisions are made until it hits the egress PE.
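A toy Python model of that forwarding behaviour, for illustration only. The router names and label values are invented, and it ignores penultimate hop popping; the point is that each hop does a single exact-match lookup on the incoming label rather than a longest-prefix match on the destination.

```python
# Hypothetical LSP from ingress PE1 to egress PE2 through P1 and P2.
# Each LFIB maps incoming label -> (action, outgoing label, next router).
LFIB = {
    "P1": {100: ("swap", 200, "P2")},
    "P2": {200: ("swap", 300, "PE2")},
    "PE2": {300: ("pop", None, None)},
}

def forward(ingress_label, first_hop):
    """Follow a packet hop by hop; each hop does one exact-match label lookup."""
    label, router, trace = ingress_label, first_hop, []
    while True:
        action, out_label, next_router = LFIB[router][label]  # fixed-length lookup
        trace.append((router, action, label, out_label))
        if action == "pop":
            return trace  # egress PE now routes the inner packet
        label, router = out_label, next_router

# The ingress PE made the only routing decision: destination -> label 100 via P1.
for hop in forward(100, "P1"):
    print(hop)
```

Each core router holds one small entry per LSP it carries, so the table size tracks the number of paths through the network rather than the number of routes carried over them.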

Methanar · March 9, 2025, 3:37pm

Fascinating stuff, I appreciate your effort in the write up. This isn’t the kind of stuff that shows up in google searches.

One more question, if the path is ultimately being governed by MPLS-level path selection, how does that work with BGP community strings, or other standard attributes, that I set on my customer side for influencing the path my traffic takes.

For example NTT has an extensive set of community strings they support where I can influence how my traffic is announced on their side. Is it that this is actually entirely irrelevant for the routing decisions made within AS2914, and the internal segment routed iBGP/ISIS doesn’t care at all? That at no point is there any sort of redistribution down to the segment routing level, and only inter-ASN routing decisions take such BGP attributes into account?

https://www.gin.ntt.net/support-center/policies-procedures/routing/

moratnz · March 9, 2025, 3:37pm

You’re spot on; your BGP community strings only impact the inter-AS routing decisions made (well, mostly; if they change where the traffic is exiting my AS, that’ll obviously change how it’s routed inside my AS, since it’s going somewhere different).

All of the MPLS traffic engineering is about how to get it from the AS ingress point to the AS egress point as efficiently as possible, with a side order of fast failover (because if I have a bunch of prefixes routed down an LSP, I can deal with a link failure by rerouting the LSP, without the routing protocol being aware anything has happened).
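A minimal Python sketch of why that failover is cheap. The prefixes, LSP name, and paths are hypothetical, and the backup path is assumed to be precomputed; the thing to notice is that the prefix-to-LSP mapping is never touched when the link fails.

```python
# Hypothetical ingress state: many prefixes resolve to one LSP by name.
PREFIX_TO_LSP = {f"198.51.{i}.0/24": "to-egress-1" for i in range(100)}
LSP_PATH = {"to-egress-1": ["PE1", "P1", "P2", "PE2"]}
BACKUP_PATH = {"to-egress-1": ["PE1", "P3", "P4", "PE2"]}  # assumed precomputed

def link_failed(a, b):
    """Move every LSP that crosses the failed link onto its backup path."""
    rerouted = []
    for name, path in LSP_PATH.items():
        if (a, b) in zip(path, path[1:]) or (b, a) in zip(path, path[1:]):
            LSP_PATH[name] = BACKUP_PATH[name]
            rerouted.append(name)
    return rerouted

before = dict(PREFIX_TO_LSP)
print(link_failed("P1", "P2"))   # ['to-egress-1']
print(LSP_PATH["to-egress-1"])   # ['PE1', 'P3', 'P4', 'PE2']
print(PREFIX_TO_LSP == before)   # True: no per-prefix routing change
```

One LSP update moves all 100 prefixes at once, and nothing about the prefixes themselves has to be reconverged.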