What is Scaleup Networking?
Scaleup networking provides direct GPU-to-GPU connections with cache coherency. It allows multiple GPUs to act as one giant GPU and is the highest-bandwidth, lowest-latency network for AI. Several scaleup networks exist, including NVIDIA's NVLink, the AMD-backed UALink, and the PCIe-based solutions many hyperscalers are exploring. PCIe has become an essential technology for the industry because it allows customers to remain GPU agnostic.
Why is Scaleup Networking Important Now?
Until NVIDIA's Blackwell GPUs, scaleup networks were limited to a single enclosure. Blackwell enabled the industry's first pervasive rack-level scaleup network with NVL72. Multi-rack solutions that support other XPUs are now needed to serve agentic workloads, and 2025 is the year PCIe enters the mainstream as one of those alternatives to NVLink. PCIe switching expands PCIe's domain beyond the server to the rack enclosure, retimers allow copper links to extend further, and CXL unlocks additional value within the server, such as memory pooling. Within the next five years, scaleup networking will be a $10B+ market.
PCIe Switching
PCIe switching allows PCIe to leave the confines of the server motherboard, connecting more devices, including those in other server enclosures. Within larger GPU servers, PCIe switches provide connectivity inside the enclosure. As deployments move toward multi-server GPU clusters, a PCIe switch can sit in the rack, much like an Ethernet/InfiniBand switch, and connect elements across different servers.
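For a rough picture of what that hierarchy looks like in software, the sketch below walks the standard Linux sysfs PCIe tree and flags which functions are bridge/switch ports versus endpoints. The sysfs paths and the 0x0604 bridge class code are standard Linux/PCI conventions; the script itself is an illustrative example, not a vendor tool.

```python
# Illustrative sketch: enumerate the PCIe hierarchy on a Linux host to see
# which functions are switch/bridge ports and which are endpoints (GPUs,
# NICs, etc.). Paths under /sys/bus/pci are standard Linux sysfs.
import os

PCI_DEVICES = "/sys/bus/pci/devices"

def read_attr(dev, attr):
    try:
        with open(os.path.join(PCI_DEVICES, dev, attr)) as f:
            return f.read().strip()
    except OSError:
        return "?"

for dev in sorted(os.listdir(PCI_DEVICES)):
    cls = read_attr(dev, "class")      # e.g. 0x060400 = PCI-to-PCI bridge
    vendor = read_attr(dev, "vendor")
    device = read_attr(dev, "device")
    tag = "bridge/switch port" if cls.startswith("0x0604") else "endpoint"
    print(f"{dev}  vendor={vendor} device={device}  [{tag}]")
```

On a system with a PCIe switch, each downstream port of the switch shows up as one of these bridge functions, with the attached GPUs or other devices enumerated beneath it.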
Retimers
A PCIe retimer takes in a signal, recovers the clock and the data, and retransmits a fresh, clean signal to extend the reach of the link. Depending on the type of copper used, retimers can take signals that would otherwise travel only a few inches and extend them across a full rack. In time, PCIe retimers and switches used in combination will enable deployments across multiple racks. Advanced retimers can also supply telemetry data, giving customers better insight into link health.
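To see why a retimer extends reach, consider a simple insertion-loss budget. The numbers below are illustrative assumptions, not figures from the PCIe specification or any vendor datasheet, but the arithmetic shows the principle: because a retimer fully recovers and retransmits the signal, each one resets the channel budget, so reach scales with the number of fresh segments.

```python
# Back-of-the-envelope sketch of why retimers extend copper reach.
# LOSS_BUDGET_DB and CABLE_LOSS_DB_PER_M are assumed values for
# illustration only; real budgets come from the PCIe electrical spec
# and the specific cable or board materials in use.
LOSS_BUDGET_DB = 36.0       # assumed end-to-end channel loss budget
CABLE_LOSS_DB_PER_M = 20.0  # assumed copper loss at the signaling rate

def max_reach_m(segments):
    """Each retimer resets the loss budget, so total reach scales with
    the number of fresh segments (host->retimer, retimer->device, ...)."""
    per_segment = LOSS_BUDGET_DB / CABLE_LOSS_DB_PER_M
    return per_segment * segments

print(f"passive cable only:   ~{max_reach_m(1):.1f} m")
print(f"one retimer in path:  ~{max_reach_m(2):.1f} m")
print(f"two retimers in path: ~{max_reach_m(3):.1f} m")
```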
Compute Express Link (CXL)
CXL is an exciting enhancement to PCIe that enables memory sharing and cache coherency. It enhances the server by allowing multiple elements to act as one sizeable virtual entity and by allowing numerous hosts to manage the same device. In early deployments, this will eliminate wasted memory in servers and bring coherent memory to the system, enabling larger AI models and better performance.
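A toy model helps make memory pooling concrete. The sketch below is purely conceptual: the MemoryPool class and its methods are invented for illustration, not a CXL API, and real pooling is coordinated by a CXL fabric manager and the host OS. It simply shows how a shared pool lets multiple hosts borrow capacity instead of stranding unused DRAM in each server.

```python
# Conceptual sketch of CXL-style memory pooling: multiple hosts draw
# from one shared pool of memory rather than each server over-provisioning
# its own DRAM. This is an illustrative model, not a real CXL interface.
class MemoryPool:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocations = {}  # host -> GB currently borrowed

    def allocate(self, host, gb):
        used = sum(self.allocations.values())
        if used + gb > self.capacity_gb:
            raise MemoryError(f"pool exhausted ({used}/{self.capacity_gb} GB in use)")
        self.allocations[host] = self.allocations.get(host, 0) + gb

    def release(self, host, gb):
        self.allocations[host] = max(0, self.allocations.get(host, 0) - gb)

pool = MemoryPool(capacity_gb=1024)  # one pooled CXL memory device
pool.allocate("host-a", 256)         # host A borrows memory for a large model
pool.allocate("host-b", 512)         # host B borrows independently
print(pool.allocations)
```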
Credo’s Role in PCIe and AI Scaleup Networks
Credo invented the category of AECs (Active Electrical Cables) several years ago to address the growing need to extend the range, and reduce the bulk, of passive copper cables in high-speed data center deployments. Now widely deployed, Credo's Ethernet AECs have evolved to enable AI scale-out architectures. While Ethernet works well for scale-out, its relatively high latency makes it unsuitable for scale-up, especially for LLM training. Enabled by its PCIe retimers, Credo's PCIe AECs offer a low-latency, high-throughput open solution well suited to scale-up networks. Phil Kumin, AVP of PCIe products, says, "We have already seen hyperscalers adopt this open architecture for Gen5 scale-up and we think momentum will accelerate with Gen6 and Gen7 deployment."
Key Takeaways
PCIe will play an increasingly important role in AI scaleup networking in 2025 as customers look to the technology for GPUs and XPUs. Many large hyperscalers see PCIe as a way to remain GPU/XPU agnostic and to support the larger logical system sizes needed as the industry moves from foundational training to agentic AI. The 650 Group predicts that the agentic AI wave of data center infrastructure spending will exceed $1T over the next 4-5 years.