OcNOS 7.0 for Data Centers is now generally available. This release is built around a single reality: AI infrastructure has entered a new era, and the network is the critical determinant of GPU cluster efficiency, job completion time, and infrastructure return on investment.
Latency, jitter, and packet loss inside an AI training fabric directly translate into lost GPU productivity at scale. OcNOS 7.0 addresses this with purpose-built AI fabric capabilities on open, ONIE-enabled Broadcom Tomahawk 5 platforms — delivering the performance of proprietary hyperscale solutions without vendor lock-in.
1. AI/ML Fabric: Lossless Transport on Broadcom Tomahawk 5
GPU-to-GPU communication in AI training clusters relies on RDMA over Converged Ethernet version 2 (RoCEv2). RoCEv2 is highly sensitive to packet loss: even brief congestion events trigger retransmission cycles that can stall the entire training job. OcNOS 7.0 delivers a complete lossless fabric solution for RoCEv2 workloads.
Priority-Based Flow Control (PFC) and Enhanced Transmission Selection (ETS)
PFC implements per-priority pause frames, ensuring that a congested queue on a receiving interface signals the sender to stop transmitting for that priority class — preventing packet drops without affecting other traffic classes. ETS allocates bandwidth among traffic classes using weighted scheduling.
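To make the per-priority pause mechanism concrete, here is a minimal Python sketch that builds the PFC frame a receiver would send to pause one priority. The frame layout (MAC control EtherType 0x8808, opcode 0x0101, an enable vector, and eight per-priority timers) follows IEEE 802.1Qbb. The function name `build_pfc_frame` and the example source MAC are hypothetical, and the sketch is not tied to how OcNOS or the Tomahawk 5 generates these frames.

```python
import struct

PFC_DEST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101


def build_pfc_frame(src_mac: bytes, pause_quanta: dict) -> bytes:
    """Build a PFC frame (without FCS) that pauses the given priorities.

    pause_quanta maps a priority (0-7) to a pause time in quanta
    (one quantum is 512 bit times); a time of 0 means "resume".
    """
    if len(src_mac) != 6:
        raise ValueError("src_mac must be exactly 6 bytes")
    enable_vector = 0
    timers = [0] * 8
    for priority, quanta in pause_quanta.items():
        if not 0 <= priority <= 7:
            raise ValueError("priority must be in 0..7")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError("pause time must fit in 16 bits")
        enable_vector |= 1 << priority   # mark this priority's timer as valid
        timers[priority] = quanta
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *timers)
    header = PFC_DEST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    return (header + payload).ljust(60, b"\x00")  # pad to minimum frame size


# Pause only priority 3 (the RoCEv2 class in this article) for the maximum time.
frame = build_pfc_frame(bytes.fromhex("020000000001"), {3: 0xFFFF})
```

Only bit 3 of the enable vector is set, so traffic in the other seven priorities keeps flowing while the RoCEv2 class is paused.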
! OcNOS 7.0 -- AI fabric: PFC, ETS, and DCBX configuration
!
! Step 1: Define QoS map for RoCEv2 traffic (priority 3)
qos map dscp-cos ROCE-MAP
dscp 24 cos 3 ! Assumes RoCEv2 traffic is marked DSCP 24 (CS3)
!
! Step 2: Enable PFC for priority 3 (RoCEv2)
interface Ethernet1/1
qos map dscp-cos ROCE-MAP
priority-flow-control mode on
priority-flow-control priority 3 no-drop
!
! Step 3: ETS bandwidth allocation
qos scheduler-group FABRIC-ETS
strict-priority 7 ! Highest: control plane
wrr cos 3 weight 70 ! RoCEv2: 70% of bandwidth left after strict priority
wrr cos 0 weight 30 ! Best-effort: 30% of bandwidth left after strict priority
!
! Step 4: DCBX for auto-negotiation with servers
interface Ethernet1/1
dcbx enable
dcbx version ieee
!
! Verification:
show priority-flow-control
show dcbx interface Ethernet1/1
show qos scheduler-group FABRIC-ETS
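The ETS weights in the configuration above can be turned into rough bandwidth figures. The Python sketch below assumes the common scheduler model in which the strict-priority queue is served first and the weighted classes share whatever is left in proportion to their weights. The function name `ets_shares` and the 800G link speed are illustrative assumptions, not OcNOS output.

```python
def ets_shares(link_gbps, strict_load_gbps, wrr_weights):
    """Return the minimum bandwidth (Gbps) each weighted class can expect.

    link_gbps:        port speed
    strict_load_gbps: traffic currently using the strict-priority queue
    wrr_weights:      mapping of CoS value -> scheduler weight
    """
    total_weight = sum(wrr_weights.values())
    if total_weight <= 0:
        raise ValueError("at least one class needs a positive weight")
    remaining = max(link_gbps - strict_load_gbps, 0.0)
    return {cos: remaining * w / total_weight for cos, w in wrr_weights.items()}


# Weights from the configuration: CoS 3 (RoCEv2) = 70, CoS 0 (best-effort) = 30.
idle_control_plane = ets_shares(800, 0, {3: 70, 0: 30})    # {3: 560.0, 0: 240.0}
busy_control_plane = ets_shares(800, 10, {3: 70, 0: 30})   # {3: 553.0, 0: 237.0}
```

These are guaranteed minimums under congestion. When a class is idle, weighted schedulers typically let the other classes use its unused share.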
2. EVPN-VXLAN Multi-Site Overlay Extension
Modern data centers are no longer confined to a single location. AI clusters, hybrid cloud architectures, and distributed applications require seamless connectivity across multiple sites without sacrificing tenant isolation or operational control. OcNOS 7.0 delivers EVPN Layer 3 overlay extension with VXLAN stitching between sites, eliminating external gateway appliances and reducing infrastructure cost.
! OcNOS 7.0 -- EVPN-VXLAN multi-site L3 overlay extension
! Border Leaf: Site A side
!
vrf TENANT-A
vni 10001
!
router bgp 65001
!
address-family l2vpn evpn
neighbor 10.0.0.2 activate ! Site A spine
neighbor 10.200.0.1 activate ! Site B border leaf
neighbor 10.200.0.1 route-map EXPORT-SITE-B out
!
vrf TENANT-A
rd 65001:10001
route-target import 65002:10001 ! Import Site B prefixes
route-target export 65001:10001 ! Export Site A prefixes
redistribute connected
!
! Route-map: control which routes are advertised to Site B
route-map EXPORT-SITE-B permit 10
match evpn route-type 5 ! Only IP prefix routes (type 5)
set extcommunity rt 65001:10001 additive
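The export and import logic in this configuration can be modeled in a few lines of Python. This is a simplified sketch of route-target matching, not a BGP implementation: the `EvpnRoute` class and both function names are hypothetical, and the route-map is reduced to "permit type 5, add the export RT".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EvpnRoute:
    prefix: str
    route_type: int           # 2 = MAC/IP advertisement, 5 = IP prefix
    route_targets: frozenset  # e.g. frozenset({"65001:10001"})


SITE_A_EXPORT_RT = "65001:10001"
SITE_A_IMPORT_RTS = frozenset({"65002:10001"})


def export_to_site_b(routes):
    """Mimic the route-map: permit only type-5 routes and add the export RT."""
    return [
        replace(r, route_targets=r.route_targets | {SITE_A_EXPORT_RT})
        for r in routes
        if r.route_type == 5
    ]


def import_into_vrf(routes, import_rts=SITE_A_IMPORT_RTS):
    """Keep routes that carry at least one route target the VRF imports."""
    return [r for r in routes if r.route_targets & import_rts]


# Routes received from Site B: only the one tagged 65002:10001 is imported.
from_site_b = [
    EvpnRoute("10.2.0.0/24", 5, frozenset({"65002:10001"})),
    EvpnRoute("10.9.0.0/24", 5, frozenset({"65002:20002"})),
]
accepted = import_into_vrf(from_site_b)  # contains only 10.2.0.0/24
```

Because the import decision depends only on the route target, tenant isolation across sites comes down to keeping each tenant's RT values distinct.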
3. Enhanced Visibility: Selective Mirroring and Route Target Filtering
At scale, operational visibility is not optional. OcNOS 7.0 introduces selective packet mirroring to CPU with filtering, enabling operators to capture and analyze specific traffic flows directly within the fabric — without deploying dedicated capture infrastructure or impacting forwarding performance.
! OcNOS 7.0 -- Selective mirroring for fabric troubleshooting
!
monitor session 1
source interface Ethernet1/1 rx
filter access-group MIRROR-FILTER
destination cpu
!
! Filter: capture VXLAN-encapsulated traffic (matches UDP port 4789 only, not a specific VNI)
ip access-list MIRROR-FILTER
permit udp any any dst-port 4789 ! VXLAN UDP port
!
! Verification:
show monitor session 1
show capture buffer
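The access list above matches on the VXLAN UDP port. Narrowing a capture to a single VNI such as 10001 also requires reading the VXLAN header. The Python sketch below shows that step for offline analysis, following the RFC 7348 header layout. It assumes the UDP payload bytes of each captured packet are already available. The function name `vxlan_vni` is hypothetical, and the format OcNOS uses to export the capture buffer is not covered here.

```python
VXLAN_UDP_PORT = 4789
VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag in the first header byte


def vxlan_vni(udp_payload: bytes):
    """Return the VNI from a VXLAN header, or None if the header is not valid.

    RFC 7348 layout: 1 byte flags, 3 bytes reserved, 3 bytes VNI, 1 byte reserved.
    """
    if len(udp_payload) < 8:
        return None
    if not udp_payload[0] & VXLAN_FLAG_VNI_VALID:
        return None
    return int.from_bytes(udp_payload[4:7], "big")


def keep_vni(udp_payloads, wanted_vni):
    """Filter captured VXLAN payloads down to one VNI."""
    return [p for p in udp_payloads if vxlan_vni(p) == wanted_vni]


# Build a sample header for VNI 10001 and confirm it round-trips.
sample = bytes([0x08, 0, 0, 0]) + (10001).to_bytes(3, "big") + b"\x00"
assert vxlan_vni(sample) == 10001
```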
4. New Platform: UfiSpace S9321-64EO (Broadcom Tomahawk 5)
OcNOS 7.0 introduces support for the UfiSpace S9321-64EO — a purpose-built AI/ML fabric switch delivering 51.2 Tbps of switching capacity with 64 high-density OSFP ports at 800G. This platform is engineered for next-generation GPU interconnects and large-scale AI training clusters requiring ultra-low latency and deterministic forwarding.
| Platform | Silicon | Switching Capacity | Ports | Use Case |
|---|---|---|---|---|
| UfiSpace S9321-64EO | Broadcom Tomahawk 5 | 51.2 Tbps | 64×800G OSFP | AI/ML spine, GPU fabric |
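The switching capacity in the table follows directly from the port count and port speed. A short Python check, with a hypothetical helper name:

```python
def aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Aggregate one-direction port bandwidth in Tbps."""
    return ports * gbps_per_port / 1000


capacity = aggregate_tbps(64, 800)  # 64 ports x 800 Gbps = 51.2 Tbps
```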
Key Benefits of OcNOS for Data Centers
- Lossless AI fabric — PFC, ETS, and DCBX deliver RoCEv2 lossless transport without proprietary NICs or switches
- Open hardware choice — Broadcom Tomahawk 5 silicon on multiple ODM platforms from UfiSpace, Edgecore, and others
- EVPN-VXLAN multi-tenancy — scalable overlay with Route Target filtering and multi-site L3 extension
- Real-time visibility — on-change gNMI telemetry and selective mirroring for operational insight at scale
- All-inclusive licensing — single SKU covers full OcNOS feature set; no per-feature upsell
Alan Huang is Senior Product Manager, Data Center at IP Infusion.