BCM78900 · TSMC 5 nm · Shipping since March 2023

Broadcom Tomahawk 5 Switches · Four 800G open platforms, validated on OcNOS-DC.

Network engineers picking a Tomahawk 5 switch — start here. Edgecore AIS800-64D and AS9817-64D, UfiSpace S9321-64E and S9321-64EO. Same silicon, same OcNOS-DC image, four procurement paths. Specs, decision rules, and the OcNOS-DC feature surface — without the marketing fluff.

51.2 Tbps
Switch Capacity
64 × 800G
Native Port Radix
4 SKUs
OcNOS-Validated
2 ODMs
Edgecore · UfiSpace
5 nm
TSMC N5 Process
01
The Switches
Open hardware running Tomahawk 5

Four 800G platforms. Two ODMs. One OcNOS-DC image.

Two hardware designs, four SKUs. All four ship ONIE pre-loaded and run the same OcNOS-DC image — the differences are form factor (QSFP-DD vs OSFP), branding (AI-fabric SKU vs general-DC SKU), and which optics ecosystem the deployment is built around. Each card links to the full vendor datasheet (PDF, hosted locally).

Edgecore · DCS560 platform family
AI fabric spine

AIS800-64D

Validated on OcNOS-DC · ONIE pre-loaded
Ports
64 × QSFP-DD800 · Breakout: 2×400 / 4×200 / 8×100 (320 logical ports)
Form
2RU
Power
2× 3000 W AC/DC redundant · 30 W per QSFP-DD cage
CPU
Intel Xeon D-1713NTE
▌ Pick this when

GPU-cluster AI fabric. AI-branded SKU of the DCS560 — same hardware as AS9817-64D under different framing.

Edgecore · DCS560 platform family
DC fabric · 800G aggregation

AS9817-64D

Validated on OcNOS-DC · ONIE pre-loaded
Ports
64 × QSFP-DD800 · Breakout: 2×400 / 4×200 / 8×100 (320 logical ports)
Form
2RU
Power
Hot-swap redundant AC/DC · 30 W per QSFP-DD cage
CPU
Intel Xeon D-class
▌ Pick this when

General data-center fabric or DCI duty. Same DCS560 chassis as AIS800-64D, branded for non-AI workloads.

UfiSpace · S9321 platform family
AI/ML fabric spine

S9321-64E

Validated on OcNOS-DC · ONIE pre-loaded
Ports
64 × QSFP-DD (200/400/800G) · Breakout: 2×400 / 4×200 / 8×100
Form
2RU · 23.72 kg
Power
913 W typical (no transceivers) · 30 W per QSFP-DD cage
CPU
Intel Ice Lake-D 4-core · 32 GB DDR4
▌ Pick this when

Large, low-entropy AI/ML flows. UfiSpace markets the 64E for AllReduce-dominant traffic where TH5 adaptive routing is the design centre.

UfiSpace · S9321 platform family
800G DCI · coherent optics

S9321-64EO

Validated on OcNOS-DC · ONIE pre-loaded
Ports
64 × OSFP (200/400/800G) · Breakout: 2×400 / 4×200 / 8×100
Form
2RU · 23.74 kg
Power
925 W typical · 200–240 V AC · OSFP cages for higher-power optics
CPU
Intel Ice Lake-D · 32 GB DDR4
▌ Pick this when

800G ZR/ZR+ coherent or other higher-power module classes. OSFP form factor of the 64E — pick when the optics drive the cage choice.

· How to choose between the four

AIS800 vs AS9817: Same Edgecore DCS560 hardware. AIS for AI-cluster framing; AS9817 for general DC fabric or DCI.
QSFP-DD vs OSFP: QSFP-DD (S9321-64E + both Edgecore SKUs) for the high-volume optics ecosystem. OSFP (S9321-64EO) for higher-power module classes including 800G ZR/ZR+ coherent.
Edgecore vs UfiSpace: Both are open-hardware ODMs with strong IP Infusion co-design. Pick by your ODM relationship, RMA logistics, or BoM economics.
Single-vendor risk: Two vendors with TH5 platforms means dual-source BoM is realistic — important for hyperscale and NeoCloud procurement.
02
Inside the Silicon
What 51.2 Tbps in one die buys you

Tomahawk 5 — Broadcom's flagship merchant switch ASIC.

The BCM78900 is a single 5 nm monolithic die delivering 51.2 Tbps of switching capacity — feeding 64 ports of 800GbE, 128 of 400G, or 256 of 200G natively. It was Broadcom's first 5 nm merchant switch IC and the first product anywhere to support 800GbE at the cage. 512 SerDes lanes running 100G PAM4 — the same lane count as Tomahawk 4, twice the per-lane speed.

Beyond raw capacity, three architectural choices made TH5 the silicon under most production AI fabrics: a shared-buffer architecture that absorbs NCCL micro-bursts, hardware Cognitive Routing (DLB) that rebinds elephant flows in the ASIC, and 5 nm thermal headroom that lets 30 W QSFP-DD800 cages run without per-port active cooling.

Specs verifiable against Broadcom's public BCM78900 product page.

Process: TSMC N5
Series: StrataXGS
Buffer: Shared, RDMA-tuned
Routing: Cognitive · DLB
Shipping: Since Mar 2023

· What 64 × 800G looks like

BCM78900 die · 51.2 Tbps
512 lanes × 100G PAM4 = 51.2 Tbps. Eight lanes per cage → 800G. The arithmetic is the architecture.
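
The same arithmetic as a short Python sketch (illustrative only; the inputs are the public BCM78900 figures quoted above, nothing vendor-specific is assumed):

```python
# Illustrative only: the inputs are the public BCM78900 figures quoted above.
SERDES_LANES = 512     # same lane count as Tomahawk 4
LANE_RATE_G  = 100     # 100G PAM4 per lane (TH4 ran these at 50G)

capacity_tbps = SERDES_LANES * LANE_RATE_G / 1000
print(f"Aggregate capacity: {capacity_tbps:.1f} Tbps")        # 51.2 Tbps

# Native port radix at each speed = lanes / lanes-per-port
for speed_g, lanes_per_port in [(800, 8), (400, 4), (200, 2)]:
    print(f"  {SERDES_LANES // lanes_per_port} x {speed_g}G ports natively")
```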
Four design choices that matter

Why TH5 ended up in almost every open AI fabric built since 2024.

The headline number gets the press. These four engineering choices are what AI fabric architects actually care about.

PRINCIPLE 01

Same lane count, twice the speed.

TH5 carries the same 512 SerDes lanes as TH4 — running them at 100G PAM4 instead of 50G. The doubling in throughput came from speeding up existing infrastructure, not adding to it.

100G PAM4 · 106 Gbps
PRINCIPLE 02

Shared-buffer, not partitioned.

Packet memory pools across all 64 ports — not split per-port. NCCL AllReduce micro-bursts on one port absorb into the fabric-wide pool instead of triggering tail-drop. The single-line reason TH5 wins on RoCEv2.

Shared-buffer · RDMA-tuned
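
A toy comparison of the two buffer disciplines under an incast burst. The numbers are hypothetical (Broadcom does not publish a per-port partitioning for TH5); the point is the shape of the behaviour, not the figures.

```python
# Toy model: one egress port receives a synchronised many-to-one burst.
# TOTAL_BUFFER_MB and BURST_MB are hypothetical, chosen only to show the contrast.
TOTAL_BUFFER_MB = 128     # assumed on-chip packet memory
PORTS           = 64
BURST_MB        = 6       # incast burst landing on a single egress port

per_port_slice = TOTAL_BUFFER_MB / PORTS   # static partitioning: 2 MB per port
shared_pool    = TOTAL_BUFFER_MB           # shared buffer: whole pool is available

for scheme, available in [("partitioned", per_port_slice), ("shared", shared_pool)]:
    dropped = max(0.0, BURST_MB - available)
    print(f"{scheme:>12}: {dropped:.1f} MB tail-dropped from a {BURST_MB} MB burst")
```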
PRINCIPLE 03

Hardware adaptive routing.

Broadcom Cognitive Routing detects congested paths and rebinds elephant flows in the ASIC — no controller round-trip, no ECMP rehashing. OcNOS-DC turns it on as DLB Reactive-Path Rebalance.

DLB · 64 µs flowlet
PRINCIPLE 04

5 nm thermal headroom.

The first 5 nm merchant switch IC. The process shrink is what made 30 W per QSFP-DD800 cage feasible without active per-port cooling — including high-power 800G optics and 8×100G breakout.

TSMC N5 · 30 W/port
03
Generation Jump
Tomahawk 4 → Tomahawk 5

Every dimension doubled. Same rack footprint.

Honest framing: TH4 (25.6 Tbps · 64×400G · 7 nm) is still excellent for clusters built around 400G NICs. TH5 earns its rack space when 800G radix and AI-fabric primitives both matter.

Switching capacity
25.6 Tbps → 51.2 Tbps

Doubled at the same rack footprint. Same 2RU, same power envelope class.

Native port radix
64 × 400G → 64 × 800G

Broken out to 400G, the effective radix doubles: three-tier Clos at 16k GPU instead of four (rough sizing sketch at the end of this section).

Process node
7 nm → 5 nm

First 5 nm merchant switch IC. Thermal headroom for 30 W/port without active cooling.

SerDes per lane
50G PAM4 → 100G PAM4

Same 512 lanes, twice the speed. The doubling came from speeding up existing infrastructure.

Brownfield refresh stays clean. The same OcNOS-DC image runs on TH3, TH4, and TH5 platforms — configurations, automation, and gNMI pipelines carry over. Pick TH5 for the next cluster; keep TH4 where it already works.
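
How the radix-to-tier claim pencils out: a rough Python sizing sketch using the standard non-blocking fat-tree ceilings (one NIC port per GPU, no oversubscription, no rail-optimisation). Production AI fabrics land well below these idealised maxima, which is where the extra tier appears in practice.

```python
# Idealised non-blocking fat-tree ceilings for a switch of effective radix R.
# Real designs (oversubscribed, rail-optimised, with reserved uplinks) land lower.

def two_tier_hosts(radix: int) -> int:
    # Leaf: R/2 downlinks, R/2 uplinks; at most R leaves behind a full spine layer.
    return (radix // 2) * radix

def three_tier_hosts(radix: int) -> int:
    # Classic k-ary fat-tree supports k^3 / 4 hosts for switch radix k.
    return radix ** 3 // 4

# TH5 broken out toward 400G NICs: 64 x 800G cages -> 128 x 400G effective radix.
for label, radix in [("TH4 as 64 x 400G", 64), ("TH5 as 128 x 400G", 128)]:
    print(f"{label}: 2-tier <= {two_tier_hosts(radix):,} GPUs, "
          f"3-tier <= {three_tier_hosts(radix):,} GPUs")
```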
04
What OcNOS-DC Ships
OcNOS-DC on this silicon

Carrier-grade NOS. AI-tuned defaults.

Tomahawk 5 has the hardware. The job of the NOS is to expose it — to operators, to telemetry pipelines, to the cluster scheduler — without forcing them into CLI gymnastics to reach it. OcNOS-DC ships these primitives as first-class configurable objects with YANG-modelled state.

Lossless RoCEv2

Shared-buffer architecture, zero-drop east-west.

OcNOS-DC ships PFC + ETS + Dynamic ECN pre-tuned to NCCL collective patterns. Tail latency stays bounded under AllReduce micro-bursts that take community NOS fabrics down. The TH5 shared-buffer pool absorbs synchronised many-to-one traffic that would tail-drop on partitioned-buffer chips.
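
To make the lossless tuning concrete, here is a back-of-the-envelope PFC headroom calculation in Python: the per-port, per-priority buffer that must stay reserved to absorb traffic still in flight after a PAUSE frame goes out. The cable length, response allowance, and MTU below are assumed illustration values, not OcNOS-DC defaults.

```python
# Back-of-the-envelope PFC headroom per (port, priority). All inputs are assumed
# illustration values, not OcNOS-DC defaults.
LINK_GBPS     = 800
CABLE_M       = 50        # assumed intra-row optical run
PROP_NS_PER_M = 5         # ~5 ns/m signal propagation
RESPONSE_NS   = 1_000     # assumed allowance for the peer to react to the PAUSE
MTU_BYTES     = 9_216     # a jumbo frame may already be mid-transmission at each end

rtt_ns    = 2 * CABLE_M * PROP_NS_PER_M              # PAUSE out + data still arriving
in_flight = LINK_GBPS * (rtt_ns + RESPONSE_NS) / 8   # Gbps x ns = bits; /8 -> bytes
headroom  = in_flight + 2 * MTU_BYTES

print(f"~{headroom / 1024:.0f} KiB headroom per port/priority "
      f"at {LINK_GBPS}G over {CABLE_M} m")
```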

Adaptive Routing

DLB rebinds flowlets in 64 µs.

ECMP hash-collision under elephant flows is the AI fabric killer. OcNOS-DC turns on TH5 Cognitive Routing's flowlet rebinding so AllReduce traffic spreads across every spine path automatically.
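
Flowlet switching itself is simple to state, so a conceptual Python sketch follows. It is not the Broadcom or OcNOS-DC implementation (that lives in the ASIC); it only shows why a flow can be rebound safely once it has idled longer than the flowlet gap, the 64 µs figure quoted above.

```python
from collections import defaultdict

FLOWLET_GAP_US = 64   # idle gap after which a flow may safely take a new path

class FlowletBalancer:
    """Conceptual flowlet switching; not the Broadcom/OcNOS-DC implementation."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.load = defaultdict(int)   # bytes recently sent down each path
        self.state = {}                # flow_id -> (path, last_seen_us)

    def pick_path(self, flow_id, pkt_bytes, now_us):
        path, last_seen = self.state.get(flow_id, (None, None))
        # Re-path only when the flow has idled longer than the flowlet gap, so
        # packets already in flight on the old path cannot be overtaken.
        if path is None or now_us - last_seen > FLOWLET_GAP_US:
            path = min(self.paths, key=lambda p: self.load[p])   # least-loaded path
        self.state[flow_id] = (path, now_us)
        self.load[path] += pkt_bytes
        return path

# An elephant flow pauses for longer than 64 µs and is rebound off the hot path.
lb = FlowletBalancer(["spine-1", "spine-2"])
print(lb.pick_path("flow-A", 9000, now_us=0))        # lands on spine-1
lb.load["spine-1"] += 1_000_000                      # spine-1 congests
print(lb.pick_path("flow-A", 9000, now_us=200))      # gap > 64 µs -> spine-2
```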

PFC Deadlock Watchdog

Per-port, per-priority. Auto-drain.

Detects paused-queue cycles before they hang training jobs. Auto-recovers without operator intervention.

Streaming Telemetry

gNMI on-change, OpenConfig YANG.

Buffer depth, ECN marks, PFC pause counts — every threshold a knob, every counter a sensor path. Plugs into Prometheus, Grafana, OTel.
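
A minimal collection sketch, assuming the open-source pygnmi client and its subscribe2() interface; the target host, port, credentials, and the OpenConfig-style paths below are illustrative placeholders, not confirmed OcNOS-DC sensor paths.

```python
# Sketch only: pygnmi and the paths/host/port below are assumptions, not
# confirmed OcNOS-DC specifics.
from pygnmi.client import gNMIclient

SUBSCRIPTION = {
    "subscription": [
        {"path": "openconfig-interfaces:interfaces/interface/state/counters",
         "mode": "on_change"},
        {"path": "openconfig-qos:qos/interfaces/interface/output/queues",
         "mode": "on_change"},
    ],
    "mode": "stream",
    "encoding": "json",
}

with gNMIclient(target=("spine-01.example.net", 6030),   # hypothetical target
                username="admin", password="admin", insecure=True) as gc:
    for message in gc.subscribe2(subscribe=SUBSCRIPTION):
        # Each on-change message carries only the leaves that moved; forward
        # them to Prometheus, Grafana, or an OTel collector from here.
        print(message)
```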

Real Network

BGP · OSPF · IS-IS · EVPN-VXLAN.

The TH5 spine is also a real router. Full carrier-grade Layer 3 stack on the same silicon — operate the AI fabric like the rest of your network, not like a black box.

Validated feature surface

215 features across 8 categories — pulled from the live OcNOS Feature Matrix.

Layer 3 routing · L1/L2 · AI/ML fabric primitives · Multicast · QoS · Security · Hardware · Management. Every entry verifiable per-platform on the public matrix.

RoCEv2 / PFC DCQCN DLB EVPN-VXLAN SR-MPLS BGP / OSPF / IS-IS gNMI / NETCONF ZTP UEC 1.0 ready
Day-0 to Day-2

ZTP. gNMI on-change. NETCONF + YANG. DCBX.

Bring up a TH5 spine in the rack with zero-touch provisioning. Stream every counter to your observability stack. Tune every threshold via YANG-modelled config. No glue scripts.

ZTP IPv4/IPv6 gNMI NETCONF OpenConfig YANG DCBX LLDP Ansible Terraform provider
Who builds this stack

Three operator profiles. One silicon + NOS combo.

Same TH5 die, same OcNOS-DC image, three different framings of the same architectural question: how do you scale lossless east-west without locking the whole stack to one vendor?

AI Cluster Operator

1k–16k GPU training fabric on open silicon.

"We need 800G to the leaf, lossless RoCEv2, and tail latency that doesn't blow up under AllReduce. Single-vendor lock-in is not on the table."

TH5 64×800G spines, RoCEv2 with NCCL-tuned DCQCN, sub-millisecond DLB rebinding, PFC deadlock watchdog. Three-tier Clos at 16k GPU instead of four — the radix collapses a layer.

DC · AI Fabric SKU
NeoCloud · GPU-as-a-Service

Multi-tenant fabric, BoM under control.

"Our customers pick the GPU. We can't tie our fabric BoM to their NIC choice. We need a switch we can buy from two vendors at minimum."

Four OcNOS-validated TH5 SKUs across two vendors (Edgecore, UfiSpace). VRF-Lite tenant isolation, gNMI per-tenant telemetry, EVPN-VXLAN segmentation. One NOS image, multi-vendor hardware.

DC · Multi-Tenant
Hyperscaler · Brownfield Refresh

TH3/TH4 fabric refresh without forklift.

"We have a TH4 fabric in production. The next training cluster needs 800G NICs. We don't want to redesign the whole NOS layer to upgrade the silicon."

Same OcNOS-DC image runs on TH3, TH4, and TH5 platforms. Brownfield refresh keeps configs, automation, and gNMI pipelines intact. UEC 1.0 fabric profile already aligned for the next NIC generation.

DC · UEC-Ready
Frequently Asked

The questions architects actually ask.

Which Tomahawk 5 switches are validated on OcNOS-DC?
Four open-hardware platforms across two ODMs: Edgecore AIS800-64D and AS9817-64D (sibling SKUs on the DCS560 chassis), and UfiSpace S9321-64E (QSFP-DD) and S9321-64EO (OSFP). All four ship ONIE pre-loaded and run the same OcNOS-DC image — same configuration, same feature surface, same automation hooks. Two vendors means dual-source BoM is realistic for hyperscale and NeoCloud procurement.
What is the difference between the Edgecore AIS800-64D and AS9817-64D?
Same Edgecore DCS560 hardware, different SKU framing. AIS800-64D is the AI-fabric branding (sold for GPU clusters); AS9817-64D is the general data-center branding (sold for DC fabric or DCI). Mechanically and electrically identical — the choice is procurement framing, not engineering. Pick AIS if AI-cluster framing matters to your deployment; pick AS9817 for general DC fabric or DCI.
Should I pick QSFP-DD or OSFP?
QSFP-DD (S9321-64E and both Edgecore SKUs) is the high-volume optics ecosystem — the right default for short-reach 800G inside the data center. OSFP (S9321-64EO) provides higher-power cages for module classes QSFP-DD cannot host: 800G ZR/ZR+ coherent for DCI, longer-reach DR4/DR8, and pluggable amplifiers. Pick OSFP when the optics drive the cage choice; otherwise QSFP-DD wins on cost and ecosystem breadth.
When should I pick Tomahawk 5 over Tomahawk 4?
TH4 is 25.6 Tbps · 64×400G · 7 nm · 50G PAM4. TH5 doubles every dimension at the same rack footprint. Decision rule: if the cluster needs 800G ports natively, or GPU count puts pressure on tier count (TH5 collapses one Clos tier at 16k GPU), pick TH5. If the design is built around 400G NICs and the 256–1k GPU envelope, TH4 is still excellent and cheaper per port. OcNOS-DC supports both with the same feature set — brownfield refresh stays clean.
Is Tomahawk 5 ready for Ultra Ethernet (UEC)?
TH5 has the hardware mechanisms UEC 1.0 fabric profiles need — per-packet ECMP, packet-spray-friendly forwarding, shared-buffer scheduling that tolerates out-of-order delivery. UEC itself lives mostly in the NIC; TH5 fabrics running OcNOS-DC will carry UEC traffic correctly when UEC NICs ship in volume. RoCEv2 and UEC coexist on the same switch — migrate clusters NIC-by-NIC, no fabric replacement.
What does OcNOS-DC actually ship on Tomahawk 5?
On TH5, OcNOS-DC ships pre-tuned for AI fabrics: PFC over L3, ETS, Dynamic ECN, DLB Reactive-Path Rebalance, DLB Random-Flow, PFC Deadlock Detection & Recovery, NCCL-aligned buffer profiles, DCBX LLDP. On the same silicon it also runs a full carrier-grade Layer 3 stack — BGP, OSPF, IS-IS, SR-MPLS, EVPN-VXLAN — that AI-only stacks typically don't cover. 215 features validated across 8 categories, every entry verifiable on the public OcNOS Feature Matrix.
When is Tomahawk 5 not the right choice?
SP edge, cell-site gateway, sub-1 Tbps aggregation. The 64×800G radix doesn't earn its rack space in those roles. For SP routing OcNOS validates Broadcom Qumran (Q2C, Q2C+) and Jericho (J2C+); for 100G/400G DC leaf where the cluster is below 1k GPUs, Trident (TD3-X7, TD4) is the better economics. Honest framing: TH5 wins when 800G radix and AI-fabric primitives both matter — not when only one does.

Designing a Tomahawk 5 fabric? Let's size it together.

30-minute architecture session with an OcNOS network architect. Bring your GPU count, NIC speed, and tier preference — leave with a sized BoM across all four TH5 SKUs.