Coming in OcNOS 7.1

Global Load Balancing — Fabric-Wide Adaptive Routing

DLB makes the right decision at one hop. GLB makes the right decision across the whole fabric. Arriving in OcNOS 7.1, Global Load Balancing extends adaptive routing from a per-port view to end-to-end path quality — closing the multi-hop hot-spot gap on 3-stage Clos AI fabrics at 1k–16k GPU scale.

End-to-End Path Telemetry

A 3-stage Clos slice — leaf, spine, super-spine — carrying GPU AllReduce. Every tier streams queue-occupancy and link-utilisation telemetry back toward the ingress leaves. GLB picks the path with the best end-to-end score, not the best local-egress score.

[Figure: Global Load Balancing across a 3-stage Clos AI fabric. Two super-spines (Super-Spine-1 and Super-Spine-2, TH5, 51.2T) on top, four spines in the middle, two leaves at the bottom. Telemetry flows upward and back down so the ingress leaf sees end-to-end path quality. Spine-1, Spine-2, and Spine-4 report healthy end-to-end paths; the Spine-3 uplink is hot, so that spine-to-super-spine link is bypassed in favour of an alternative end-to-end path. The ingress leaf runs GLB and ranks paths toward the egress leaf (target rack).]

The multi-hop hot-spot problem

DLB scores each ECMP next-hop using local egress queue-depth, that is, what is happening on this switch's outbound port. That's optimal on a 2-tier leaf-spine. But scale to a 3-stage Clos and you can pick a spine with a clean uplink, only to land on a super-spine whose downlink toward the egress leaf is congested. The local decision is correct for what the switch can see; the end-to-end outcome is wrong.
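A toy illustration of that hot-spot, as a minimal sketch: it assumes congestion is a normalised value per link (0.0 idle, 1.0 saturated), and every name and number below is invented for illustration rather than taken from OcNOS.

```python
# Hypothetical congestion values (0.0 = idle, 1.0 = saturated); illustrative only.
# Each candidate spine maps to the links a packet crosses after the ingress leaf.
candidates = {
    "spine-1": {"leaf_uplink": 0.10, "spine_uplink": 0.20, "super_spine_downlink": 0.90},
    "spine-2": {"leaf_uplink": 0.30, "spine_uplink": 0.25, "super_spine_downlink": 0.20},
}

def local_pick(paths):
    """DLB-style choice: look only at the ingress leaf's own egress port."""
    return min(paths, key=lambda spine: paths[spine]["leaf_uplink"])

def end_to_end_pick(paths):
    """GLB-style choice: judge each path by its most congested link."""
    return min(paths, key=lambda spine: max(paths[spine].values()))

print(local_pick(candidates))       # spine-1
print(end_to_end_pick(candidates))  # spine-2
```

The local pick lands on spine-1 because its leaf uplink is the cleanest, even though the super-spine downlink behind it is nearly saturated.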

On fabrics of 1,024 GPUs and larger, the size at which a 3-stage Clos with super-spines becomes the standard topology, this is the dominant remaining source of tail-latency outliers. OcNOS 7.1 introduces Global Load Balancing to solve it: every tier publishes path-quality telemetry back toward the ingress leaves, so the ingress decision is based on the full end-to-end score.

DLB vs GLB — scope of the path decision

Local — DLB

Per-hop adaptive routing

Each switch ranks its own ECMP next-hops using local egress queue-depth and link-utilisation. Excellent for 2-stage fabrics and the leaf→spine hop in 3-stage. Available today on TH4 / TH5.

Global — GLB · 7.1

End-to-end path scoring

Every tier publishes congestion telemetry back to ingress leaves. The ingress ranks complete paths — leaf→spine→super-spine→spine→leaf — and selects on a full-fabric quality score, not just the local hop.
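To make "complete paths" concrete, here is a small sketch that enumerates the leaf→spine→super-spine→spine→leaf candidates between two leaves. The switch names, the split into an ingress-side and an egress-side spine group, and the full-mesh wiring are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical topology: names and wiring are assumptions, not a reference design.
INGRESS_SPINES = ["spine-1", "spine-2"]
EGRESS_SPINES = ["spine-3", "spine-4"]
SUPER_SPINES = ["super-spine-1", "super-spine-2"]

def candidate_paths(ingress_leaf, egress_leaf):
    """Enumerate leaf -> spine -> super-spine -> spine -> leaf paths.

    Assumes each leaf connects to every spine in its group and every
    spine connects to every super-spine.
    """
    return [
        (ingress_leaf, up, top, down, egress_leaf)
        for up, top, down in product(INGRESS_SPINES, SUPER_SPINES, EGRESS_SPINES)
    ]

paths = candidate_paths("leaf-ingress", "leaf-egress")
print(len(paths))  # 8
print(paths[0])    # ('leaf-ingress', 'spine-1', 'super-spine-1', 'spine-3', 'leaf-egress')
```

Each tuple is one candidate the ingress leaf would need a score for; the candidate count grows with the product of the tier widths.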

The OcNOS 7.1 GLB implementation

Telemetry Plane

Path-quality publish

Every spine and super-spine publishes per-port queue-occupancy and utilisation deltas to a fabric-wide adjacency. Updates are sub-millisecond and use existing in-band signalling — no extra control-plane chatter.
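One way to picture delta-style publishing is a per-port filter that emits an update only when a value has moved by a meaningful amount. This is a hypothetical sketch: the class, the record fields, and the 0.05 threshold are assumptions, not the OcNOS telemetry format or its signalling mechanism.

```python
class DeltaPublisher:
    """Emit a port's telemetry only when it moved by at least `threshold`.

    Hypothetical sketch; names and threshold are illustrative assumptions.
    """

    def __init__(self, threshold=0.05):
        self.threshold = threshold
        self._last_sent = {}  # port -> (queue_occupancy, utilisation)

    def update(self, port, queue_occupancy, utilisation):
        """Return a record to publish, or None if the change is too small."""
        current = (queue_occupancy, utilisation)
        previous = self._last_sent.get(port)
        if previous is not None and all(
            abs(new - old) < self.threshold for new, old in zip(current, previous)
        ):
            return None  # suppress: both values are within the threshold
        self._last_sent[port] = current
        return {"port": port, "queue_occupancy": queue_occupancy, "utilisation": utilisation}

publisher = DeltaPublisher()
first = publisher.update("port-1", 0.20, 0.50)   # first sample: always published
second = publisher.update("port-1", 0.22, 0.51)  # small change: suppressed
third = publisher.update("port-1", 0.60, 0.52)   # queue jumped: published
print(first is not None, second is None, third is not None)  # True True True
```

Suppressing small changes is what keeps a steady fabric quiet while still reacting quickly when a queue builds.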

Path Scoring

End-to-end aggregation

Ingress leaves combine local egress quality with downstream telemetry to compute an aggregate score per candidate path. The worst hop dominates the score — the same intuition operators use when troubleshooting.
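The "worst hop dominates" rule can be sketched as a maximum over per-hop congestion, with the offending hop kept for attribution. The `Hop` type, the 0-to-1 congestion scale, and the sample values are assumptions for illustration; the source does not give the actual scoring formula.

```python
from typing import NamedTuple

class Hop(NamedTuple):
    name: str
    congestion: float  # assumed scale: 0.0 = idle, 1.0 = saturated

def score_path(hops):
    """Return (score, worst_hop_name); the most congested hop sets the score."""
    worst = max(hops, key=lambda hop: hop.congestion)
    return worst.congestion, worst.name

def rank_paths(paths):
    """Sort path names from best (lowest worst-hop congestion) to worst."""
    return sorted(paths, key=lambda name: score_path(paths[name])[0])

paths = {
    "via-spine-1": [Hop("leaf-up", 0.2), Hop("spine1-up", 0.3), Hop("ss1-down", 0.4)],
    "via-spine-3": [Hop("leaf-up", 0.1), Hop("spine3-up", 0.95), Hop("ss2-down", 0.2)],
}
print(rank_paths(paths))                 # ['via-spine-1', 'via-spine-3']
print(score_path(paths["via-spine-3"]))  # (0.95, 'spine3-up')
```

Taking the maximum rather than the sum means a single saturated hop cannot be hidden by several idle ones.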

Selection

Flowlet-aligned

Like DLB, GLB rebinds at flowlet boundaries — preserving in-order delivery for RoCEv2 and TCP. The difference is what feeds the decision: full-fabric quality, not local-port quality.
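Flowlet-aligned selection can be sketched as a small table that lets a flow change path only after an idle gap. The class name, the 50-microsecond gap, and the string path labels are assumptions for illustration; the actual hardware timers are not described in the source.

```python
class FlowletTable:
    """Rebind a flow to a new path only after an idle gap (a flowlet boundary).

    Hypothetical sketch. The gap should exceed the latency difference between
    paths so packets on the new path cannot overtake those still in flight.
    """

    def __init__(self, gap_us=50):
        self.gap_us = gap_us
        self._flows = {}  # flow_id -> (path, last_seen_us)

    def select(self, flow_id, now_us, best_path):
        entry = self._flows.get(flow_id)
        if entry is not None and now_us - entry[1] < self.gap_us:
            path = entry[0]   # mid-flowlet: stay on the current path
        else:
            path = best_path  # new flow or idle gap elapsed: rebind
        self._flows[flow_id] = (path, now_us)
        return path

table = FlowletTable()
print(table.select("flow-1", 0, "path-A"))    # path-A (new flow)
print(table.select("flow-1", 10, "path-B"))   # path-A (only 10 us idle)
print(table.select("flow-1", 100, "path-B"))  # path-B (90 us idle)
```

The path ranking can change at any time, but a flow only acts on it at the next gap, which is what preserves packet order.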

Backwards-Compatible

Layered on DLB

GLB extends the DLB decision; it does not replace it. Mixed fabrics with GLB-capable and DLB-only switches behave correctly — non-GLB switches simply contribute local-only quality.
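One possible reading of the mixed-fabric behaviour, sketched under assumptions: a DLB-only switch publishes no downstream report, so the score for a path through it falls back to whatever is known. The function name, the use of `None` for a missing report, and the max-based combination are illustrative choices, not the documented OcNOS algorithm.

```python
def path_score(local_quality, downstream_reports):
    """Combine local egress congestion with whatever downstream hops reported.

    `downstream_reports` has one entry per downstream hop: a congestion value
    from a GLB-capable switch, or None from a DLB-only switch (assumption).
    Missing reports are skipped, so an all-DLB path scores on local quality.
    """
    known = [value for value in downstream_reports if value is not None]
    return max([local_quality, *known])

print(path_score(0.2, [0.7, 0.1]))    # 0.7  every hop reports
print(path_score(0.2, [None, 0.1]))   # 0.2  the DLB-only hop is skipped
print(path_score(0.2, [None, None]))  # 0.2  same result as plain DLB
```

With no downstream reports at all, the score reduces to the local value, which matches the DLB-only behaviour described for mixed fabrics.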

Scale

1k–16k GPU validated

Reference designs with 256-port spine tiers and 64-port super-spine tiers, sized for 1,024 / 4,096 / 16,384 GPU clusters using 64×800G TH5 chassis as the building block.

Telemetry Out

gNMI for the ops team

Per-path scores, rebind events, and worst-hop attribution stream over gNMI/OpenConfig, so SREs can correlate fabric decisions with NCCL job behaviour without treating the fabric as a black box.

Roadmap and availability

  • OcNOS 7.1 is the first release. GLB ships as part of the 7.1 OcNOS-DC train, on the same TH4 / TH5 hardware running DLB today. Schedule and feature scope are listed on the OcNOS releases page.
  • Same SKU. Included in OcNOS-DC PLUS — no per-feature paywall, no new license keys at upgrade time.
  • In-place upgrade. Brownfield upgrade from 7.0 to 7.1 is supported; mixed-version fabrics keep working with DLB-only behaviour during the upgrade window.
  • UEC-aligned. The path-quality plane is being designed to interoperate with Ultra Ethernet Consortium signalling once UEC NIC ecosystems mature, so 7.1 GLB is forward-compatible with where the industry is heading. See Ultra Ethernet (UEC).
  • Architecture review available. If you're sizing a 1k+ GPU fabric, we will run a sizing exercise that includes the GLB telemetry plane.

Sizing a multi-thousand-GPU fabric? Let's run the numbers together.

Book an Architecture Review →