An open AI fabric — built for what your training job actually feels.
At thousands of accelerators you don't measure switches in Tbps — you measure job completion time, GPU utilization, and tail latency under microbursts. OcNOS-DC moves those numbers on open merchant silicon with a 24/7 carrier-grade SLA: the same technical floor as the closed AI stacks, none of the lock-in.
Take it offline. Read it on a plane.
Two short downloads that go deeper than this page: the lossless AI fabric architecture and the EVPN-VXLAN data center reference.
OcNOS 800G Ethernet-Based Lossless AI Fabric
Non-blocking RoCEv2 fabric on Tomahawk 4/5 spines — SKU tiers, validated platforms, and deployment architecture.
Get the brief
EVPN-VXLAN Data Center Fabric
Carrier-grade leaf-spine data center fabric: symmetric IRB, Type-2/Type-5 routes, distributed anycast gateway.
Get the brief
"Will my training job actually finish faster?"
At scale, traditional network metrics lose their meaning. What matters is Job Completion Time, GPU utilization, and tail latency under microbursts — because every minute a multi-billion-dollar cluster waits on a synchronization step is capital burned.
The lossless, low-latency performance AI needs no longer requires a closed, proprietary stack. On open merchant silicon with a carrier-grade SLA, OcNOS-DC matches the technical floor of closed architectures with no vendor lock-in — congestion management, sub-millisecond dynamic routing, and Ultra Ethernet alignment, tuned for the bursty patterns of collective traffic. GPUs spend their time processing data, not waiting on the network.
Every threshold is exposed, so your team can tune it against real xCCL (NCCL / RCCL / oneCCL) traffic. Below: each workload pattern, the mechanism that handles it, and what the operator gets back.
→ DLB rebinds flowlets sub-ms on live queue depth.
→ GLB (OcNOS 7.1) scores leaf · spine · super-spine.
→ DCQCN (xCCL-tuned ECN + CNP) caps rate before the drop.
→ PFC Watchdog auto-drains stuck queues per-port.
→ UEC 1.0: packet spray + multi-path RDMA + out-of-order delivery.
→ The switch you buy today stays when UEC NICs land.
Reference benchmark. DLB lifts fabric utilization from ~55% on static ECMP to 90%+ on the same hardware — no extra uplinks. Local at each hop; system-wide across the AllReduce. (Industry-published Broadcom flowlet-rebalancing figure, replicable on TH4/TH5.)
DLB deep-dive →
800G spine-leaf, lossless from rack to rack.
A 3-stage Clos: eBGP unnumbered underlay, ECMP at every tier, PFC/ECN per priority group, isolated out-of-band bus for ZTP and telemetry.
Full HCL: 40+ validated platforms at ipinfusion.com/hcl
Four layers of losslessness — correct on Day 1.
Most AI fabric failures trace to one misconfigured PFC priority group or an ECN threshold tuned for cloud, not RDMA. OcNOS-DC ships RoCEv2 buffer profiles validated per Broadcom ASIC — so your first AllReduce runs lossless without a tuning sprint.
PFC + ECN — priority-group lossless control
PFC pauses per-priority traffic before buffers overflow; ECN marks packets early for sender-side slowdown. No drops, no port-wide stall. PFC over L3 for routed multi-row fabrics.
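As a rough illustration of how the two controls stack, the sketch below models a WRED-style ECN marking curve and a PFC pause check for one priority group. The function names and every threshold value are assumptions made up for this example; they are not OcNOS-DC defaults or configuration parameters. The only point is the ordering: ECN begins marking well below the occupancy at which PFC would pause the priority.

```python
def ecn_mark_probability(queue_kb, k_min_kb, k_max_kb, p_max):
    """WRED-style curve: no marking up to k_min, a linear ramp that
    reaches p_max at k_max, and every packet marked above k_max."""
    if k_max_kb <= k_min_kb:
        raise ValueError("k_max_kb must be greater than k_min_kb")
    if queue_kb <= k_min_kb:
        return 0.0
    if queue_kb > k_max_kb:
        return 1.0
    return p_max * (queue_kb - k_min_kb) / (k_max_kb - k_min_kb)


def pfc_should_pause(pg_occupancy_kb, xoff_kb):
    """PFC sends a pause for this priority once the priority-group
    buffer occupancy reaches the XOFF threshold."""
    return pg_occupancy_kb >= xoff_kb


# Hypothetical thresholds, chosen only so that ECN reacts before PFC.
K_MIN_KB, K_MAX_KB, P_MAX, XOFF_KB = 100, 400, 0.2, 600

for depth_kb in (50, 250, 500, 650):
    p = ecn_mark_probability(depth_kb, K_MIN_KB, K_MAX_KB, P_MAX)
    print(depth_kb, round(p, 3), pfc_should_pause(depth_kb, XOFF_KB))
```

With thresholds arranged this way, senders see ECN marks and slow down while the queue is still between k_min and XOFF, and PFC acts only as the last line of defense against a drop.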
DLB — flowlet-level adaptive routing
Static-hash ECMP collides when flows from 8 NICs hash to the same spine. DLB watches live queue depth and rebinds flowlets to less-loaded paths in under a millisecond, so the AllReduce stops dragging on the slowest link.
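The flowlet idea can be sketched in a few lines. This is a toy software model, not the ASIC or OcNOS-DC implementation: the class name, the `flowlet_gap_us` parameter, and the queue-depth list are all hypothetical. A flow keeps its current path while packets arrive back to back, which preserves ordering, and is free to move to the least-loaded path once it has been idle longer than the flowlet gap.

```python
class FlowletBalancer:
    """Toy flowlet-based path selector (illustrative only)."""

    def __init__(self, num_paths, flowlet_gap_us):
        self.queue_depth = [0] * num_paths    # live per-path queue depth
        self.flowlet_gap_us = flowlet_gap_us  # idle time that ends a flowlet
        self._flows = {}                      # flow_id -> (path, last_seen_us)

    def pick_path(self, flow_id, now_us):
        previous = self._flows.get(flow_id)
        if previous is not None and now_us - previous[1] < self.flowlet_gap_us:
            # Same flowlet: stay on the current path to avoid reordering.
            path = previous[0]
        else:
            # New flowlet: rebind to the path with the shallowest queue.
            path = min(range(len(self.queue_depth)),
                       key=lambda p: self.queue_depth[p])
        self._flows[flow_id] = (path, now_us)
        return path


lb = FlowletBalancer(num_paths=2, flowlet_gap_us=50)
lb.queue_depth = [10, 0]
print(lb.pick_path("gpu0->gpu8", now_us=0))    # picks the emptier path
lb.queue_depth = [0, 90]
print(lb.pick_path("gpu0->gpu8", now_us=10))   # still inside the flowlet
print(lb.pick_path("gpu0->gpu8", now_us=100))  # idle gap passed, may move
```

The gap is the design lever: too short and packets of one burst can overtake each other across paths, too long and the flow behaves like a static hash.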
DCBX — server config auto-pushed over LLDP
The leaf pushes correct PFC and ETS config to the GPU server automatically — no silent loss of losslessness when a node gets re-imaged, the most common production failure mode.
gNMI on-change telemetry — sub-second visibility
PFC pauses, ECN marks, DCQCN thresholds, and buffer depths are exposed as gNMI on-change sensor paths and stream straight into Prometheus / Grafana / OpenTelemetry. Catch congestion before it stalls a job.
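To make "on-change" concrete: unlike periodic sampling, an on-change subscription delivers an update only when a value differs from the last one reported. The sketch below imitates that behavior over an in-memory list of samples. It does not use a gNMI client, and the path strings are invented for the example; they are not real OcNOS-DC or OpenConfig YANG paths.

```python
from typing import Dict, Iterable, Iterator, Tuple

Sample = Tuple[str, int]  # (sensor path, counter value)


def on_change(samples: Iterable[Sample]) -> Iterator[Sample]:
    """Yield a sample only when its value differs from the last value
    seen for the same path (the first sample per path always passes)."""
    last: Dict[str, int] = {}
    for path, value in samples:
        if path not in last or last[path] != value:
            last[path] = value
            yield path, value


# Hypothetical sensor readings for one port.
readings = [
    ("port1/pfc-rx-pause", 0),
    ("port1/pfc-rx-pause", 0),   # unchanged, suppressed
    ("port1/pfc-rx-pause", 3),   # changed, reported
    ("port1/ecn-marked", 5),     # first value for this path, reported
    ("port1/pfc-rx-pause", 3),   # unchanged, suppressed
]

for path, value in on_change(readings):
    print(path, value)
```

A collector fed this way sees far fewer messages on a quiet fabric, and a pause counter that starts moving shows up as soon as it changes rather than at the next polling interval.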
40+ validated platforms — view full HCL →
The fabric profile is ready before the NICs are. That's the point.
RoCEv2 is the production transport in 2026; UEC is what comes next. The UEC 1.0 fabric profile adds packet spray, multi-path RDMA, and out-of-order-friendly forwarding — closing the single-hash limit that kept earlier RoCE a step behind InfiniBand on multi-rail collectives. OcNOS-DC tracks the UEC 1.0 fabric profile today, while UEC NICs roll out. The point isn't leading the standard — everyone is aligning to it. It's that the switch you buy this quarter won't need replacing when your UEC NIC arrives.
Packet spray
A single flow uses every parallel path simultaneously instead of being pinned to one ECMP hash. Multi-rail bandwidth is no longer left on the table.
Multi-path RDMA
Reorder buffers handle out-of-order delivery in hardware. Modern congestion control replaces NACK-based loss recovery, which cuts tail latency.
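For readers who want a mental model of what a reorder buffer does, here is a minimal software sketch. It is an assumption-laden toy: real UEC endpoints do this in NIC hardware, and nothing below reflects the UEC specification's actual mechanisms or any vendor's implementation. Packets may arrive in any order across sprayed paths; the buffer holds early arrivals and releases payloads only in sequence.

```python
class ReorderBuffer:
    """Toy in-order delivery buffer keyed by packet sequence number."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number to deliver
        self.pending = {}          # seq -> payload held for later

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that are now
        deliverable in order (possibly an empty list)."""
        if seq < self.next_seq or seq in self.pending:
            return []              # duplicate, already handled
        self.pending[seq] = payload
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released


rb = ReorderBuffer()
print(rb.receive(1, b"second"))  # held: packet 0 has not arrived yet
print(rb.receive(0, b"first"))   # releases packets 0 and 1 together
print(rb.receive(2, b"third"))   # already in order, released at once
```

The cost of spraying is visible in `pending`: the more the paths differ in latency, the more packets sit waiting, which is why the fabric side still tries to keep path delays close.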
Same hardware, forward path
The TH4 and TH5 platforms validated for OcNOS-DC today extend into UEC. No fork. No second SKU line. One fabric, two transport generations.
Where OcNOS-DC sits — honestly, by name.
The race has converged on a shared floor: lossless RoCEv2, DCQCN, adaptive routing, UEC alignment. Everyone ships these. The real differentiator is solution shape — vertical lock-in vs. open NOS, locked vs. open hardware, closed-loop IB vs. standards Ethernet. Pick the trade-off you can live with for five years.
Every row ships a real product — including OcNOS-DC. The question is rarely a missing feature; it's the trade-off you'll live with.
What it actually is — and where it stops.
An AI cluster is three layers. The fabric moves bytes between switches; the NIC terminates RDMA; the scheduler decides what runs where. "AI-aware fabric" usually means one vendor bundled all three under one SKU. OcNOS-DC owns the fabric, exposes every threshold, and stays out of the layers above. Here's the boundary, named.
What OcNOS-DC owns.
- Lossless RoCEv2 transport — PFC + ECN + ETS + DCBX
- DCQCN with xCCL-validated default thresholds, every knob YANG-modeled
- DLB sub-ms flowlet rebinding on live ASIC queue depth
- GLB fabric-wide path scoring (OcNOS 7.1)
- PFC deadlock watchdog — per-port, per-priority
- UEC 1.0 fabric-profile alignment — packet-spray-friendly forwarding
- gNMI on-change telemetry, OpenConfig YANG, sub-second cadence
Your NIC vendor's job.
- xCCL collective implementation and tuning
- RDMA verbs, queue pairs, retransmit logic
- UEC packet spray endpoint + reorder buffer (UEC NICs)
- GPU-direct memory access, NVLink coordination
- Per-flow rate limiting and end-host congestion response
Your orchestration platform's job.
- Training-job placement, gang scheduling, gradient-sync windows
- Epoch / training-phase awareness
- Tenant isolation, queue priority, resource quotas
- xCCL ring topology assignment, rail-group affinity
- Cross-job interference detection
Every mechanism on this page has its own deep-dive.
The page above is for picking a fabric. These are for tuning one — packet captures, ASIC behavior, YANG paths, and where each feature ships in the release train.
RoCEv2 + PFC + ECN + DCQCN
The lossless RDMA transport layer for GPU collectives. Buffer profiles pre-tuned per Broadcom ASIC, xCCL-class DCQCN defaults, sub-µs jitter under load.
Read deep-dive →
AI Fabric · Local
Adaptive Dynamic Load Balancing (DLB)
Sub-millisecond flowlet rebinding using live ASIC queue-depth telemetry. Closes the ECMP hash-collision gap on AllReduce elephant flows.
Read deep-dive →
AI Fabric · Fabric-wide OcNOS 7.1
Global Load Balancing (GLB)
End-to-end path scoring across leaf · spine · super-spine for clusters up to 16k GPUs. It adds the multi-hop view that DLB, acting at each hop, cannot see on its own.
Read deep-dive →
AI Fabric · Frontier UEC 1.0
Ultra Ethernet (UEC)
Packet spray, multi-path RDMA, out-of-order delivery, modern congestion control. The standards-based open answer to InfiniBand.
Read deep-dive →
AI Fabric · Reference Designs
Topologies: single-pod to 16k GPUs
Rail-only and rail-optimized designs map the fabric shape directly onto the xCCL 8-rail multi-NIC pattern. 3-stage Clos for multi-pod scale-out to the 16k-GPU ceiling. Port counts on TH4 / TH5.
Read deep-dive →
AI Fabric · Congestion Control
DCQCN: RDMA Congestion Control
WRED ECN marking, CNP feedback, quantized rate control. xCCL-class defaults out of the box; every threshold YANG-modeled for tuning.
Read deep-dive →
AI Fabric · Survival
Watchdog: PFC Deadlock Detection
Per-port, per-priority watchdog detects paused-queue cycles and auto-drains the affected queue before training jobs hang.
Read deep-dive →
AI Fabric · Decision Guide
InfiniBand vs Ethernet for AI
Workload-specific decision guide. Where modern Ethernet (RoCEv2 + DLB + UEC) closes the gap, where IB still wins, and how to pick.
Read deep-dive →
Observability
gNMI Streaming Telemetry
gNMI Subscribe over gRPC, OpenConfig YANG, dial-out collectors. Integrations with Telegraf, Prometheus, and Grafana.
Read deep-dive →
Three cluster shapes. Three fabric stories.
Framed by what the job feels, not by switch features. Pick the shape closest to yours; the deep-dives have the configs.
The multi-week LLM pre-training run.
AllReduce dominates the network. Every GPU must hold high in-collective utilization, and the fabric must absorb microbursts without forcing a restart of a multi-week run.
Mechanisms: DCQCN + DLB + PFC Watchdog. Rail-optimized for single-pod; 3-stage Clos with GLB for multi-pod scale-out.
Outcome: AllReduce at line rate, zero collective restarts, JCT inside schedule.
The high-throughput inference fleet behind a public API.
Real-time inference where p99 tail latency drives the SLO. Inference must never queue behind batch retraining, and ops needs per-flow visibility the moment latency drifts.
Mechanisms: ETS strict-priority + gNMI on-change telemetry into Prometheus / OpenTelemetry.
Outcome: p99 held inside SLO; regressions caught by sub-second telemetry, not the support queue.
The neocloud renting H100 / H200 / Blackwell to tenants.
A multi-tenant GPU cloud. Each tenant needs isolated lossless RoCEv2 paths — without a separate fabric segment per customer or a second NOS image.
Mechanisms: EVPN-VXLAN isolation + lossless RoCEv2 on one OcNOS-DC instance.
Outcome: per-tenant isolation, one ops model, one SLA, one image to upgrade.
Bring your topology. We'll show you the path.
Every IPI architecture review is led by a network engineer running production OcNOS — no slides, no sales theatre. Bring your GPU count, NIC choice, and target JCT; we'll map it to topology, SKUs, and configs that ship today.
Connect it to everything else.
AI is one segment of the data center. DC Fabric and DCI extend the same OcNOS image into compute, storage, and remote sites — same NOS, same CLI, same SLA.
The honest FAQ.
AI fabric & DC deployments
Production AI clusters and data-center fabrics running OcNOS-DC on Broadcom Tomahawk 4/5.

NTT DATA partnered with IP Infusion to take disaggregated open networking solutions to market, offering OcNOS-powered Cell Site Routers, Routed…

Scott Data deployed OcNOS with open networking hardware from UfiSpace, Edgecore Networks, and Celestica, replacing legacy vendor stacks across its…

Madeo Consultant, a France-based data center systems integrator, replaced Cisco Nexus and Catalyst switches with IP Infusion OcNOS on Edgecore…

Prosoluce, a French ISP and managed services provider, upgraded its core to a 100G EVPN-VXLAN backbone running IP Infusion OcNOS…
Related from the blog
The AI Network Decision Framework: GPU Fabric for Speed, ROI, Strategic Freedom
Buyer-side framework for GPU fabric decisions with RoCEv2 and open hardware
Read the post →
OcNOS 7.0 for Data Centers: AI Fabric, 800G Platforms, EVPN-VXLAN at Scale
800G AI fabric features and Tomahawk 5 platforms shipping in OcNOS 7.0
Read the post →
OcNOS 7.0: What’s New for AI, Transport, and Cloud
Highlights RoCEv2 lossless transport with PFC for GPU clusters
Read the post →
OcNOS 6.6 for Data Centers: AI Fabric PFC/ETS, EVPN Policy, 400G
PFC/ETS lossless DCB features that underpin OcNOS AI fabric deployments
Read the post →