EVPN Multi-Homing — ESI-LAG Active/Active

A production AI server has two NICs into two leaves — both active, both forwarding, no active/standby waste. EVPN multi-homing (RFC 7432, ESI-LAG) is the standards-based way to get there: no proprietary MLAG cabling, no inter-switch sync link. Just BGP, an Ethernet Segment Identifier, and the protocol does the rest.

Active/Active Server Attachment

A GPU server with two bonded NICs attaches to two leaves. Both leaves are configured with the same Ethernet Segment ID (ESI), and each leaf that learns the server's MAC advertises it into EVPN tagged with that ESI. Remote leaves install both leaves as ECMP next-hops for the MAC, which is aliasing across the ESI peers. On link failure, mass-withdraw collapses convergence to the BGP propagation time.

Figure: EVPN multi-homing with ESI-LAG Active/Active topology. A GPU server with a 2 × NIC bond attaches to Leaf-1 (DF, VTEP 10.0.0.1) and Leaf-2 (non-DF, VTEP 10.0.0.2), which share ESI 00:11:22:33:44:55:00:01. Both leaves connect upward to Spine-1 and Spine-2 (EVPN RR / ECMP). A remote leaf installs Leaf-1 and Leaf-2 as ECMP next-hops via EVPN aliasing.

Why ESI-LAG over MLAG

Traditional Multi-Chassis LAG (MLAG) gives you Active/Active server attachment, but at the cost of a proprietary Inter-Chassis Link (ICL), per-vendor synchronization protocols, and compatibility constraints that tie both leaves in a pair to matching models. EVPN multi-homing replaces all of that with BGP and a 10-byte Ethernet Segment Identifier.

With EVPN multi-homing, the two leaves don't need to know about each other directly. They both advertise the same ESI on the relevant Ethernet Segment, and the EVPN control plane handles designated forwarder election, aliasing, and mass-withdraw. The leaves can be different vendors, different generations, even different platforms — as long as they speak EVPN and ESI-LAG correctly, multi-homing works.

The EVPN multi-homing primitives

Type-1 Route

Auto-Discovery per ESI / per EVI

Each leaf advertises Type-1 (Auto-Discovery) routes for the ESI. Receivers learn which leaves participate in the segment and use this for aliasing and mass-withdraw on failure.

Type-4 Route

Ethernet Segment route

Type-4 routes drive Designated Forwarder election among the leaves attached to the same ESI. The DF is responsible for forwarding BUM (broadcast, unknown unicast, multicast) traffic toward the segment.
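
As a rough illustration of how DF election can work, the sketch below follows the default "service carving" procedure described in RFC 7432: the leaves discovered through Type-4 routes are ordered by IP address, and the leaf at position (VLAN mod N) becomes DF for that VLAN. The function name `elect_df` and the VLAN numbers are hypothetical, the VTEP addresses are taken from the figure, and a given platform may be configured with a different election algorithm.

```python
import ipaddress

def elect_df(pe_addresses, vlan_id):
    """Pick the DF for one VLAN using modulo-based service carving.

    pe_addresses: originator IPs learned from Type-4 routes for one ESI.
    vlan_id: the VLAN (Ethernet Tag) being carved.
    """
    if not pe_addresses:
        raise ValueError("no leaves discovered for this Ethernet Segment")
    # Order candidates by numeric IP value, lowest first (ordinal 0).
    ordered = sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    # The leaf whose ordinal equals (VLAN mod N) is the DF for that VLAN.
    return ordered[vlan_id % len(ordered)]

if __name__ == "__main__":
    leaves = ["10.0.0.2", "10.0.0.1"]  # VTEPs from the figure
    for vlan in (100, 101):            # hypothetical VLAN IDs
        print(f"VLAN {vlan}: DF is {elect_df(leaves, vlan)}")
```

With two leaves, even-numbered VLANs land on the lower address and odd-numbered VLANs on the higher one, so the BUM forwarding role is shared between the pair rather than pinned to one leaf.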

Aliasing

ECMP across the ESI peers

Remote VTEPs install both leaf VTEPs as next-hops for the segment's MACs, even for a MAC that only one of the leaves has advertised. Unicast flows are hashed across the two paths, giving Active/Active utilisation of both links.
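
To make the ECMP spreading concrete, here is a minimal sketch of how a remote VTEP might map flows onto the two aliased next-hops. It is an assumption-laden stand-in: real switches use a hardware hash over header fields, not SHA-256, and the function name `pick_next_hop` and the sample addresses and ports are hypothetical.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Map a flow tuple onto one of the ECMP next-hops.

    flow: (src_ip, dst_ip, protocol, src_port, dst_port)
    next_hops: VTEP addresses installed for the Ethernet Segment.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to an index into the next-hop list.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

if __name__ == "__main__":
    vteps = ["10.0.0.1", "10.0.0.2"]  # Leaf-1 and Leaf-2 from the figure
    counts = {v: 0 for v in vteps}
    for src_port in range(49152, 49352):  # 200 hypothetical flows
        flow = ("192.0.2.10", "192.0.2.20", "udp", src_port, 4791)
        counts[pick_next_hop(flow, vteps)] += 1
    print(counts)
```

Packets of one flow always hash to the same leaf, which preserves ordering, while a population of flows spreads over both leaves.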

Mass Withdraw

Sub-second convergence on failure

When a leaf loses its link to the server, it withdraws its Type-1 Ethernet A-D per-ES route. Remote VTEPs remove that leaf from the ESI's next-hop set in a single update, without waiting for per-MAC withdrawals. No per-MAC withdrawal storm.
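
The reason one withdrawal is enough can be sketched with a toy model of a remote VTEP's forwarding state. It assumes MACs resolve to next-hops indirectly through their ESI; the class `RemoteVtep`, its method names, the label `ES-1`, and the MAC addresses are all hypothetical and do not reflect any particular implementation.

```python
class RemoteVtep:
    """Toy model: MAC -> ESI -> set of leaf VTEPs."""

    def __init__(self):
        self.mac_to_esi = {}
        self.esi_to_vteps = {}

    def learn_ad_route(self, esi, vtep):
        # Type-1 route received: this leaf is attached to the segment.
        self.esi_to_vteps.setdefault(esi, set()).add(vtep)

    def learn_mac(self, mac, esi):
        # Type-2 route received: the MAC lives behind this segment.
        self.mac_to_esi[mac] = esi

    def withdraw_ad_route(self, esi, vtep):
        # Mass withdraw: one update changes the path set for every MAC on the ESI.
        self.esi_to_vteps.get(esi, set()).discard(vtep)

    def next_hops(self, mac):
        esi = self.mac_to_esi[mac]
        return sorted(self.esi_to_vteps.get(esi, set()))

if __name__ == "__main__":
    remote = RemoteVtep()
    remote.learn_ad_route("ES-1", "10.0.0.1")
    remote.learn_ad_route("ES-1", "10.0.0.2")
    macs = [f"02:00:00:00:00:{i:02x}" for i in range(1, 4)]
    for mac in macs:
        remote.learn_mac(mac, "ES-1")
    remote.withdraw_ad_route("ES-1", "10.0.0.1")  # Leaf-1 loses its server link
    for mac in macs:
        print(mac, remote.next_hops(mac))
```

Because the MAC entries point at the segment rather than at individual leaves, the cost of the failure event does not grow with the number of MACs behind the server.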

Split Horizon

BUM loop prevention

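Split horizon prevents a BUM frame that entered the fabric from the server from being delivered back to that server by the peer leaf. With MPLS encapsulation this is done by ESI-label filtering (RFC 7432); in VXLAN fabrics it is done by local bias (RFC 8365), where the receiving leaf filters on the source VTEP address. Either way, the check is stateless on the data plane.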
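
A minimal sketch of the per-frame decision in a VXLAN fabric, assuming the local-bias procedure from RFC 8365 combined with DF filtering. The function `forward_bum_to_segment` and its parameters are hypothetical, and it only models flooding onto a multi-homed segment other than the port the frame arrived on.

```python
def forward_bum_to_segment(local_is_df, from_overlay, source_vtep, segment_peers):
    """Decide whether this leaf floods a BUM frame onto a multi-homed segment.

    local_is_df: True if this leaf is DF for the segment and VLAN.
    from_overlay: True if the frame arrived over a VXLAN tunnel.
    source_vtep: source VTEP IP of the tunnel (None for local frames).
    segment_peers: VTEP IPs of the other leaves attached to this segment.
    """
    if not from_overlay:
        # Local bias: the ingress leaf floods to its own segments regardless of DF state.
        return True
    if source_vtep in segment_peers:
        # The sending leaf shares this segment and already delivered locally.
        return False
    # Frame from an unrelated leaf: only the DF forwards, which avoids duplicates.
    return local_is_df

if __name__ == "__main__":
    peers = {"10.0.0.1"}  # seen from Leaf-2: Leaf-1 shares the segment
    print(forward_bum_to_segment(True, True, "10.0.0.1", peers))    # from peer leaf
    print(forward_bum_to_segment(True, True, "10.0.0.9", peers))    # from remote leaf, DF
    print(forward_bum_to_segment(False, True, "10.0.0.9", peers))   # from remote leaf, non-DF
```

The decision depends only on fields carried in the frame and on static segment membership, which is why no per-flow state is needed.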

VLAN-Aware

Service interface flexibility

OcNOS supports both VLAN-Based and VLAN-Aware service interfaces, with per-EVI ESI configuration. Mix tenants and physical-segment topologies as the deployment requires.

What this gives you in production

  • Standards-based redundancy. RFC 7432 and RFC 8365 — same protocol every modern DC vendor implements. No proprietary tax, no leaf-vendor lock-in.
  • 2× bandwidth utilisation. Both NICs forward live traffic; no Active/Standby waste. Critical for AI servers where 2× 200G or 2× 400G into the leaf is the cabling baseline.
  • Sub-second link-failure convergence. Mass-withdraw collapses the convergence event to BGP propagation time — typically inside one second on a tuned fabric.
  • No ICL cable. The MLAG inter-chassis link goes away. Cabling, port consumption, and the failure-mode complexity of ICL split-brain all disappear.
  • Multi-vendor leaf pairs. The two leaves on the same ESI don't need to be the same model or vendor. EVPN handles the protocol; the data plane just forwards.
  • Validated in OcNOS-DC. ESI-LAG Active/Active is part of the DC-IPBASE feature set — production-grade on every supported Tomahawk and Trident platform.

Designing leaf redundancy for an AI fabric? Let's spec the ESIs together.