Traditional enterprise data centers have an Achilles heel: the single-TOR-per-rack architecture. If that TOR fails, it takes the entire rack offline, and according to data Microsoft recently published, this happens at least 2% of the time in the first 40 days of operation.
In the enterprise, this has traditionally been solved with a dual-TOR, dual-port NIC MLAG architecture. Each server connects to two TORs, and the complexity of that connection is managed by a combination of TOR MLAG software and the hypervisor running on the server. The problem with this approach is speed. NIC speeds are growing from 10G to 100G and beyond, and as speed increases, more x86 cores must be dedicated to the hypervisor’s management of this configuration. Today, that cost is about one core for every 2.5Gbps, which is not a sustainable path from a power or cost standpoint, especially in hyperscale infrastructure.
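To make the scaling concern concrete, the short sketch below works through the arithmetic implied by the "one core for every 2.5Gbps" figure. It assumes the cost scales linearly with NIC speed, which the text does not state explicitly; the function name and the list of NIC speeds are illustrative choices, not values from Microsoft or Credo.

```python
import math

# Figure quoted in the text: roughly one x86 core per 2.5 Gbps of NIC bandwidth.
GBPS_PER_CORE = 2.5


def cores_for_nic(nic_gbps, gbps_per_core=GBPS_PER_CORE):
    """Estimate the cores consumed by hypervisor-managed dual-TOR networking.

    Assumption (not from the source): the cost scales linearly with speed.
    """
    return math.ceil(nic_gbps / gbps_per_core)


# Illustrative NIC speeds between the 10G and 100G endpoints named in the text.
for speed in (10, 25, 50, 100):
    print(f"{speed}G NIC -> about {cores_for_nic(speed)} cores")
```

Under that linear assumption, a 10G NIC costs about 4 cores while a 100G NIC costs about 40, which is the growth the paragraph above describes as unsustainable.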
Hyperscalers have countered this problem by deploying redundancy at the application layer and running redundant server racks so that services survive a TOR failure. This redundancy adds cost and complexity to a hyperscaler’s operations by taking expensive servers off the revenue path and leaving them idle, waiting for a TOR failure.
Enter the Active Electrical Cable (AEC). AECs have already enabled Distributed Disaggregated Chassis (DDC) deployments, and in their latest configuration they are ready to enable a new kind of NIC-to-TOR connection.
Credo’s HiWire SWITCH AEC eliminates the server’s management of a dual-TOR configuration and instead presents the server with only a single port. The Network Operating System (NOS), such as SONiC, can manage a MUX located inside the SWITCH AEC. This MUX can switch traffic from one TOR to the other in an Active/Standby mode in less than a millisecond. The result is dual-TOR reliability with dramatically better failover performance than MLAG implementations, without requiring the server to do anything, or frankly even be aware of the dual-TOR architecture.
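The Active/Standby behavior described above can be illustrated with a small simulation. This is only a conceptual sketch: the class and method names (Tor, MuxCable, failover_if_needed) are hypothetical and do not correspond to Credo’s firmware or to any SONiC API, and it models the decision logic only, not the sub-millisecond timing.

```python
from dataclasses import dataclass, field


@dataclass
class Tor:
    """Hypothetical stand-in for a top-of-rack switch."""
    name: str
    healthy: bool = True
    received: list = field(default_factory=list)


class MuxCable:
    """Hypothetical model of a cable with one server port and two TOR ports."""

    def __init__(self, tor_a, tor_b):
        self.tors = [tor_a, tor_b]
        self.active = 0  # index of the TOR currently carrying traffic

    def send_from_server(self, frame):
        # The server sees a single port; the MUX decides which TOR gets the frame.
        self.tors[self.active].received.append(frame)

    def failover_if_needed(self):
        """Stand-in for the NOS control path: switch to standby if active is down."""
        if self.tors[self.active].healthy:
            return False
        standby = 1 - self.active
        if self.tors[standby].healthy:
            self.active = standby
            return True
        return False  # both TORs down: nothing to switch to


tor_a, tor_b = Tor("tor-a"), Tor("tor-b")
cable = MuxCable(tor_a, tor_b)
cable.send_from_server("frame-1")  # delivered to tor-a
tor_a.healthy = False              # simulate a TOR failure
cable.failover_if_needed()         # control path flips the MUX to tor-b
cable.send_from_server("frame-2")  # delivered to tor-b; server code is unchanged
```

Note that the server-side call, send_from_server, is identical before and after the failure, which mirrors the point that the server does not need to be aware of the dual-TOR architecture.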
Before 2021, essentially 100% of NIC-to-TOR connections used passive DAC cables. We expect the TOR-to-server connection to evolve to address redundancy and the other changing demands placed on the network. The market is upgrading to higher speeds, and server applications are growing in complexity and scale, as shown below.
This topic will be discussed in a pair of livestreaming events hosted by 650 Group with Credo and Microsoft.
Learn more about the events and register for the English or Mandarin session at:
English broadcast: Wednesday, July 21, 2021 @ 10:00AM PT
Mandarin broadcast: Thursday, July 22, 2021 @ 9:00AM Shanghai