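InfiniBand vs. Ethernet for AI Fabrics

For any non-trivial GPU cluster, "just buy InfiniBand" used to be the safe choice, and that answer is now changing. Modern Ethernet (RoCEv2 with PFC, ECN/DCQCN, DLB, and the upcoming GLB and UEC) has closed most of the performance gap while opening the door to the multi-vendor, open hardware that hyperscalers are moving to.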

InfiniBand or Ethernet for an AI cluster? InfiniBand still leads on raw latency for tightly coupled HPC, but modern Ethernet with RoCEv2 (PFC, ECN, DLB) now runs production AI fabrics at 400G and 800G, keeps your NIC, GPU, and switch choices open, and shares one operating model with the rest of the data center. OcNOS-DC delivers that Ethernet fabric today, and IP Infusion is a contributing member of the Ultra Ethernet Consortium shaping where the standard goes next.

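Two networks, two operating models

Left: a single-vendor InfiniBand fabric, with one IB silicon vendor, one set of switches, and one NIC ecosystem. Right: a multi-vendor open Ethernet fabric, with RoCEv2 / UEC NICs from any vendor, switch silicon from Broadcom, OcNOS-DC as the NOS, and the same protocols as the rest of the data center.

How they compare

InfiniBand was purpose-built for low-latency, lossless RDMA. For two decades that gave it a real performance advantage on tightly coupled HPC workloads. Modern Ethernet, built on the DCB stack, RoCEv2, and the maturing DLB and UEC, has been narrowing that gap over the past few years. The remaining gap matters a great deal for some workloads and not at all for others. The right answer depends on the workload, not on dogma.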

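Axis	InfiniBand	Ethernet (RoCEv2 / UEC)
Latency floor	Very low end-to-end NIC-to-NIC latency; a switch hop is typically a few hundred nanoseconds.	Latency floor is a few hundred nanoseconds higher than IB, but still well below the threshold that affects collectives in large-scale distributed training.
Loss tolerance	Lossless by architecture (credit-based flow control).	Lossless through PFC + ECN + DCQCN. Production-grade today; UEC further reduces reliance on PFC pauses.
Multipath / load balancing	Adaptive routing is built into the specification.	Static ECMP, plus DLB for adaptive per-hop decisions, GLB for end-to-end decisions (OcNOS 7.1), and UEC packet spray for the next generation.
Vendor ecosystem	Effectively single-vendor for both NIC and switch silicon.	Multi-vendor at every layer: ASIC, switch, NIC, NOS, optics. UEC is explicitly designed for vendor-neutral interoperability.
Operating model	Subnet manager (UFM-class). Different from the rest of the DC; requires separate skills and a separate toolchain.	The BGP, EVPN, and gNMI you already run. Same automation tooling as the rest of the data center (Ansible, NETCONF, OpenConfig).
Multi-tenancy	Limited; partitioning exists but is not a first-class concept.	Native through EVPN-VXLAN. GPU-as-a-Service, multi-team clusters, and shared infrastructure follow naturally.
Long-haul DCI	Not designed for it; requires IB-over-WAN gateways.	Native across data centers through 400G ZR/ZR+ coherent pluggables and EVPN.
Storage convergence	Storage runs in parallel with compute; requires IB-attached storage.	NVMe-oF, NFS, and S3 all run on the same Ethernet network.
Cost per port (typical 400G+)	Premium; single-vendor pricing.	Open-hardware spine + OcNOS-DC NOS, with a material cost advantage over vendor-locked alternatives.
Roadmap velocity	Bound to a single vendor's release cadence.	The UEC consortium (AMD, Arista, Broadcom, Cisco, HPE, Intel, Meta, Microsoft, Oracle, and others) drives a publicly released specification.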

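Where each wins

Choose InfiniBand when

The latency floor is contractual

In HPC simulation workloads, every collective matters and the absolute latency floor matters more than total cost of ownership. This suits tight, dedicated single-tenant clusters where vendor lock-in is acceptable.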

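Choose Ethernet when

The operating model matters

Multi-tenant GPU-as-a-Service. AI clusters that share infrastructure with the rest of the data center. Any case where the team wants one operating model, one tool stack, and a multi-vendor supply chain.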

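Choose Ethernet when

Cost per GPU FLOP is the gating factor

An open-hardware spine + OcNOS-DC removes the proprietary network tax. On clusters of several thousand GPUs, the CapEx saved is often enough to buy additional GPU capacity.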

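Choose Ethernet when

The fabric spans multiple DCs

If a training job may one day need to span two halls or two regions, Ethernet becomes the default: coherent DCI, EVPN data center interconnect, and standard multi-vendor optics make this a one-day job rather than a quarter-long line-system project.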

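Where modern Ethernet has closed the gap

Lossless behavior. RoCEv2 with PFC and DCQCN, together with the OcNOS-DC PFC deadlock watchdog, is production-grade today. Once these mechanisms are configured correctly, the objection that "Ethernet drops packets" no longer holds.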
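To make the ECN and DCQCN part of that stack concrete, here is a minimal sketch of the control loop as described in the published DCQCN algorithm: the switch marks packets with a probability that rises with queue depth, and the sender cuts its rate when a congestion notification packet (CNP) arrives. The class and function names, the thresholds, and the 400 Gbps starting rate are illustrative assumptions, not OcNOS-DC configuration or defaults, and rate recovery is left out for brevity.

```python
from dataclasses import dataclass


def should_mark_ecn(queue_kb, kmin_kb, kmax_kb, pmax, draw):
    """Switch side: RED-style ECN marking decision for one packet.

    Below kmin nothing is marked, above kmax everything is marked, and in
    between the marking probability rises linearly from 0 to pmax.
    `draw` is a uniform random number in [0, 1) supplied by the caller.
    """
    if queue_kb <= kmin_kb:
        return False
    if queue_kb >= kmax_kb:
        return True
    probability = pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)
    return draw < probability


@dataclass
class DcqcnSender:
    """Sender side: rate reaction to congestion notifications (CNPs)."""
    rate_gbps: float            # current sending rate
    alpha: float = 1.0          # running congestion estimate, between 0 and 1
    g: float = 1 / 256          # averaging gain (illustrative value)
    target_gbps: float = 0.0    # rate to recover toward (recovery not modeled)

    def on_cnp(self):
        # Remember the pre-cut rate, cut by alpha/2, then raise alpha.
        self.target_gbps = self.rate_gbps
        self.rate_gbps *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_quiet_interval(self):
        # No CNP seen during the interval: let the congestion estimate decay.
        self.alpha = (1 - self.g) * self.alpha
```

With the default `alpha` of 1.0, the first CNP halves the rate; as quiet intervals pass and `alpha` decays, later cuts become gentler. PFC remains the backstop that pauses a priority when this loop does not react fast enough.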

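Adaptive routing. Static ECMP collisions on AI workloads are real, but DLB reassigns flowlets based on local congestion within sub-millisecond windows, and GLB in OcNOS 7.1 extends this to full end-to-end path scoring.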
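The difference between static ECMP and flowlet-based balancing can be sketched in a few lines. This is a simplified model of the general flowlet technique, not the switch ASIC's implementation: `FlowletBalancer`, the 50-microsecond gap, and the byte-counter load metric are all assumptions chosen for illustration, and the load counters never drain.

```python
import zlib
from dataclasses import dataclass, field


def static_ecmp(flow_key: bytes, n_links: int) -> int:
    """Static ECMP: a hash of the flow key picks the link, whatever the load."""
    return zlib.crc32(flow_key) % n_links


@dataclass
class FlowletBalancer:
    """Flowlet switching: re-pick the link only after an idle gap in the flow."""
    n_links: int
    gap_us: int = 50                                 # assumed idle gap, in microseconds
    link_load: list = field(default_factory=list)    # bytes sent per link (local view)
    _state: dict = field(default_factory=dict)       # flow_key -> (link, last_seen_us)

    def __post_init__(self):
        if not self.link_load:
            self.link_load = [0] * self.n_links

    def pick(self, flow_key: bytes, now_us: int, size: int) -> int:
        entry = self._state.get(flow_key)
        if entry is not None and now_us - entry[1] < self.gap_us:
            # Same flowlet: stay on the current link to preserve packet order.
            link = entry[0]
        else:
            # New flowlet: move to the least-loaded link seen locally.
            link = min(range(self.n_links), key=lambda i: self.link_load[i])
        self._state[flow_key] = (link, now_us)
        self.link_load[link] += size
        return link
```

Two long-lived flows that hash to the same link under `static_ecmp` stay there for their lifetime. With `FlowletBalancer`, a flow can move at its next idle gap, and because the gap exceeds the path delay difference in this model, reordering is avoided. GLB, as described above, would replace the purely local `link_load` view with an end-to-end path score.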

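Spray-style transport. Ultra Ethernet (UEC) brings packet spray, multi-path RDMA, out-of-order delivery, and selective retransmission to standard Ethernet. The architectural advantages that once defined InfiniBand are landing on a multi-vendor open stack.

The TCO conversation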
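A toy model helps show how packet spray, out-of-order delivery, and selective retransmission fit together. The sketch below is not the UEC wire protocol; `spray` and `Reassembler` are hypothetical names, and round-robin spraying is an assumption standing in for whatever path selection a real NIC uses.

```python
def spray(packets: list, n_paths: int) -> list:
    """Spread one message's packets across all paths, round-robin by sequence."""
    paths = [[] for _ in range(n_paths)]
    for seq, payload in enumerate(packets):
        paths[seq % n_paths].append((seq, payload))
    return paths


class Reassembler:
    """Receiver that places packets by sequence number, in any arrival order."""

    def __init__(self, total_packets: int):
        self.total = total_packets
        self.slots = {}  # seq -> payload

    def receive(self, seq: int, payload: bytes) -> None:
        # Out-of-order arrival is fine: each packet lands in its own slot.
        self.slots[seq] = payload

    def missing(self) -> list:
        # Selective retransmission: only these sequence numbers are re-requested.
        return [seq for seq in range(self.total) if seq not in self.slots]

    def message(self) -> bytes:
        if self.missing():
            raise ValueError("message incomplete")
        return b"".join(self.slots[seq] for seq in range(self.total))
```

If six packets are sprayed over three paths and one packet is lost on one path, `missing()` reports just that sequence number, so the sender resends a single packet instead of everything after the loss, as a go-back-N transport would.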

For most production AI fabric decisions in 2026, the network is 5-8% of cluster TCO over five years. The InfiniBand premium typically lands in the +30% to +60% range over open-hardware Ethernet for equivalent capacity. On a $100M cluster, that's a meaningful number, but the more important number is what you can do with the saved capex (more GPUs, a larger storage tier, a second site for HA). And for clusters where the network is multi-tenant or shared with the rest of the DC, the operational simplification of one network model is worth more than the line-item cost difference.
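The arithmetic behind those figures is short enough to write down. The sketch below assumes the 5-8% network share describes the open-hardware Ethernet baseline and that the premium applies to the network line only; both are interpretive assumptions, and `infiniband_premium` is a hypothetical helper, not part of any sizing tool.

```python
def infiniband_premium(cluster_tco: float, network_share: float, premium: float) -> float:
    """Extra network spend for InfiniBand over an Ethernet baseline.

    cluster_tco   -- five-year cluster TCO in dollars
    network_share -- Ethernet network cost as a fraction of cluster TCO (assumed baseline)
    premium       -- InfiniBand price premium as a fraction of the Ethernet network cost
    """
    ethernet_network_cost = cluster_tco * network_share
    return ethernet_network_cost * premium


CLUSTER_TCO = 100_000_000  # the $100M cluster from the text

low = infiniband_premium(CLUSTER_TCO, network_share=0.05, premium=0.30)
high = infiniband_premium(CLUSTER_TCO, network_share=0.08, premium=0.60)
print(f"InfiniBand premium: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the premium on a $100M cluster works out to roughly $1.5M at the low end and $4.8M at the high end, which is the capital the paragraph above suggests redirecting to GPUs, storage, or a second site.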

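IP Infusion's view

Both have their place. We won't pretend Ethernet wins every workload. Tightly coupled HPC clusters with a hard latency floor will keep buying InfiniBand for some time.
Most AI networks should be built on Ethernet. Hyperscale production AI training and inference is moving to Ethernet because, once the technical gap narrows, the operational and economic advantages are overwhelming, and that gap is narrowing fast.
OcNOS-DC is the open path. RoCEv2 today, DLB today, GLB next, and UEC as NICs ship. One NOS and one feature roadmap, running on validated open hardware from Edgecore, UfiSpace, Wedge, and others.
Architecture reviews are free. If you are doing fabric capacity planning and want a workload-specific analysis rather than a vendor pitch, our network architects will work through the math with you; or you can start from a first-pass leaf-spine layout in the AI Fabric sizing tool.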

IB or Ethernet for your next cluster? Let's run the numbers for your specific workload.

Book an architecture review →

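Frequently asked questions

Is Ethernet fast enough for AI training compared with InfiniBand?

For most distributed-training collectives at scale, yes. Ethernet's latency floor is a few hundred nanoseconds higher than InfiniBand's, but once RoCEv2 with PFC, DCQCN, and DLB is configured correctly, that difference falls below the threshold that affects most workloads.

When should you still choose InfiniBand?

Choose InfiniBand for tightly coupled HPC simulation where the absolute latency floor matters more than total cost of ownership, and for closed single-tenant clusters where single-vendor lock-in is acceptable.

When does Ethernet have the advantage in an AI fabric?

Ethernet wins for multi-tenant GPU-as-a-Service, clusters that share one operating model and tooling with the rest of the data center, cost-sensitive builds at the scale of thousands of GPUs, and fabrics that extend across data centers through coherent DCI and EVPN.

How does OcNOS close the gap with InfiniBand?

OcNOS-DC provides RoCEv2 lossless transport, DLB adaptive routing, and a PFC deadlock watchdog today, adds GLB for end-to-end path scoring in 7.1, and will support UEC as NICs ship, all running on validated open hardware from Edgecore, UfiSpace, and others.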

Are hyperscalers replacing InfiniBand with Ethernet for AI?

Yes. Ethernet passed InfiniBand in AI back-end networking during 2025, and industry reporting in early 2026 put roughly 70 percent of new AI fabric deployments on Ethernet. Meta has described training its largest models over a RoCE Ethernet fabric on a 24,000-GPU cluster. The Ultra Ethernet Consortium, whose founding members include AMD, Arista, Broadcom, Cisco, Eviden, HPE, Intel, Meta and Microsoft, was formed to standardize this direction. The reasons are operational and economic: one network model shared with the rest of the data center, a multi-vendor supply chain, and lower cost per port at 400G and 800G.

What is the best alternative to InfiniBand for AI cluster networking?

Ethernet with RoCEv2 is the mainstream alternative. It carries RDMA losslessly using PFC, ECN and DCQCN, adds adaptive routing through DLB, and gains packet spray and multi-path RDMA through Ultra Ethernet as UEC NICs ship. On open hardware running OcNOS-DC, it delivers this on a multi-vendor stack rather than a single-vendor fabric.

What is the difference between Ultra Ethernet and InfiniBand?

Ultra Ethernet (UEC) brings the transport techniques that defined InfiniBand to standard Ethernet: packet spray across all paths, multi-path RDMA, out-of-order delivery, and selective retransmission. The UEC 1.0 specification, published in 2025, defines this transport. The difference is the ecosystem: InfiniBand is effectively single-vendor for NIC and switch silicon, while Ultra Ethernet is an open, multi-vendor specification any vendor can build to.

How much does an Ethernet AI fabric cost compared to InfiniBand?

The network is roughly 5 to 8 percent of AI cluster total cost of ownership over five years, so it is rarely the largest line item. For equivalent capacity, single-vendor InfiniBand generally carries a price premium over open-hardware Ethernet running OcNOS-DC. The more important number is what the saved capital funds, whether that is more GPUs, a larger storage tier, or a second site for resilience.

Datasheets and solution briefs

Go deeper, and take it with you.

Product datasheets and concise technical downloads that go deeper than this page.

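Datasheet

OcNOS-DC datasheet

The full OcNOS-DC specification: EVPN-VXLAN and Ethernet for AI feature sets, software SKUs, supported hardware platforms, and a solution ordering guide.

Get the datasheet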

Solution brief

OcNOS 800G Lossless AI Fabric

A non-blocking RoCEv2 fabric on Broadcom Tomahawk 4/5 spines: SKU tiers, validated platforms, and deployment architecture.

Get the brief

Solution brief

EVPN-VXLAN Data Center Networking

A carrier-grade leaf-spine data center fabric: symmetric IRB, Type-2/Type-5 routing, and distributed anycast gateways.

Get the brief

