Solution for Supercomputing Center: InfiniBand

Computing clusters play a key role in supercomputing centers thanks to their excellent productivity and flexibility. These platforms typically adopt high-speed interconnect technologies as solutions for HPC (High Performance Computing) simulation, such as InfiniBand, high-performance Ethernet (with a simplified frame structure), BlueGene, Cray, and other interconnects. Although many systems in the Top500 list of supercomputing centers use Ethernet or Cray interconnects (shown in the picture), InfiniBand is the most prevalent among them by virtue of its unique merits: reportedly, 41.2% of supercomputing centers use it. For that reason, Gigalight offers a detailed introduction to InfiniBand below.

Gigalight will introduce InfiniBand from three perspectives in turn: its definition, the reasons for its adoption, and its prospects.

What Is InfiniBand?

InfiniBand (abbreviated IB) is a powerful architecture designed to support I/O connectivity for the Internet infrastructure. It is also a computer-networking communications standard used in high-performance computing, featuring very high throughput and very low latency. With its switched I/O fabric, InfiniBand improves communication among servers and network subsystems, providing high-performance, highly scalable bandwidth for future computer systems. InfiniBand is not intended for general-purpose network connectivity but for server-side connection problems. It is therefore used for communication between servers (such as replication and distributed workloads), between servers and storage devices (such as SANs and direct-attached storage), and between servers and networks (such as LANs, WANs, and the Internet).

Why Choose InfiniBand?

InfiniBand stands out among HPC interconnect solutions because of the following unique advantages:

  • High bandwidth: InfiniBand defines three link widths at the physical layer: 1X (2.5 Gb/s, 4 wires), 4X (10 Gb/s, 16 wires), and 12X (30 Gb/s, 48 wires). Each individual link is a four-wire serial differential connection (two wires in each direction) that provides a full-duplex connection at 2.5 Gb/s. The wider links (4X, 12X) defined by InfiniBand provide backbone capacity for IPC clusters without the need for a secondary I/O interconnect.

  • Low latency: RDMA zero-copy networking reduces OS overhead. Via RDMA transfers from system memory to system memory, InfiniBand implements a reliable, in-order transport connection in hardware, so data is delivered extremely efficiently, with low latency and without host-CPU assistance. InfiniBand's ultra-low latencies, with measured end-to-end delays of about 1 µs, greatly accelerate many data center and high performance computing (HPC) applications.

  • Enhanced scalability: InfiniBand can, in theory, accommodate flat networks of unlimited size built from the same switch components, simply by adding switches. This scalability is critical as applications demand more processor bandwidth. Scalability is supported by fully hot-swappable connections managed by a single unit (the Subnet Manager), and with multicast support, a single transaction can be delivered to multiple destinations.

  • Cost effectiveness: InfiniBand Host Channel Adapters (HCAs) and switches are competitively priced, creating a compelling price/performance advantage over alternative technologies. In addition, the much lower PHY (physical layer device) power brings both integration and RAS cost advantages: with InfiniBand's reduced PHY power requirements, high-port-count devices become entirely practical, and reducing a multi-chip system to a single-chip solution yields substantial cost and area savings. Whether or not the PHYs are integrated, InfiniBand's reduced power consumption lowers costs for highly available applications.

  • Low power requirements: An InfiniBand copper PHY requires only about 0.25 watts per port, whereas a Gigabit Ethernet PHY requires roughly two watts per port. This order-of-magnitude difference arises because Gigabit Ethernet PHYs are designed to support local area networks (LANs) with connections spanning at least 100 meters, while InfiniBand addresses only server and storage connections within the Internet data center. Since it does not need to span such distances, it can operate at a reduced power level.

  • Switched fabric architecture: A switched fabric is a point-to-point, switch-based interconnect designed for fault tolerance and scalability. Point-to-point means that every link has exactly one device connected at each end, so loading and termination characteristics are well controlled (unlike a bus architecture) and I/O performance can be much greater. The fabric scales by adding switches and connecting more end nodes through them; unlike a shared bus architecture, the aggregate bandwidth of the system increases as additional switches are added to the network. Multiple paths between devices keep the aggregate bandwidth high and provide fail-safe, redundant connections.
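The link-width arithmetic above (lanes, wires, and signaling rates) can be sketched in a few lines of Python. This is an illustrative calculation of the SDR-era figures quoted in this article, not vendor code:

```python
# Illustrative sketch of InfiniBand SDR link arithmetic (assumptions noted below).
# Each lane is a full-duplex serial connection signaling at 2.5 Gb/s per
# direction, carried on two differential pairs (4 wires total per lane).

LANE_RATE_GBPS = 2.5  # SDR per-lane signaling rate, one direction
WIRES_PER_LANE = 4    # two wires per direction (differential pair)

def link_rate_gbps(lanes: int) -> float:
    """Aggregate one-direction signaling rate for a link with `lanes` lanes."""
    return lanes * LANE_RATE_GBPS

def link_wires(lanes: int) -> int:
    """Total wire count for a full-duplex link with `lanes` lanes."""
    return lanes * WIRES_PER_LANE

for width, lanes in (("1X", 1), ("4X", 4), ("12X", 12)):
    print(f"{width}: {link_rate_gbps(lanes):g} Gb/s over {link_wires(lanes)} wires")
```

Running this reproduces the 1X/4X/12X figures from the bullet above: 2.5, 10, and 30 Gb/s over 4, 16, and 48 wires respectively.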

In addition, InfiniBand deployments usually adopt a non-blocking fat-tree topology for interconnection, moving from the traditional three-tier data center architecture to a two-tier architecture with Top-of-Rack (ToR) and aggregation (convergence) layers.
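To see why the two-tier fat-tree scales, the maximum host count of a non-blocking leaf/spine fabric can be estimated from the switch radix alone. The sketch below is a back-of-the-envelope calculation under the stated assumptions; the 36-port radix is an assumption, chosen only because it is a common radix in InfiniBand switch silicon:

```python
# Back-of-the-envelope sizing of a non-blocking two-tier fat-tree.
# Assumptions: every switch has the same radix, and each leaf (ToR) switch
# splits its ports evenly between hosts (down) and spine uplinks (up).

def max_hosts_two_tier(radix: int) -> int:
    """Maximum hosts in a non-blocking leaf/spine fabric of `radix`-port switches."""
    uplinks = radix // 2           # half the leaf ports go up to the spines
    host_ports = radix - uplinks   # the other half face hosts
    leaves = radix                 # each spine has `radix` ports, so up to
                                   # `radix` leaves can attach to it
    return leaves * host_ports

# With 36-port switches (a common InfiniBand switch radix):
print(max_hosts_two_tier(36))  # prints 648
```

Doubling the radix quadruples the host count, which is why a flat two-tier fabric built from identical switches grows so quickly compared with a shared bus.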

History of InfiniBand

InfiniBand originated in 1999 from the merger of two competing designs: Future I/O and Next Generation I/O. This led to the formation of the InfiniBand Trade Association (IBTA), which included Compaq, Dell, Hewlett-Packard, IBM, Intel, Microsoft, and Sun. At the time, some of the more powerful computers were considered to be approaching the interconnect bottleneck of the PCI bus, in spite of upgrades like PCI-X. Version 1.0 of the InfiniBand Architecture Specification was released in 2000. Initially, the IBTA's vision for IB was to simultaneously replace PCI for I/O, Ethernet in the machine room, cluster interconnects, and Fibre Channel; the IBTA also envisaged decomposing server hardware across an IB fabric. Following the burst of the dot-com bubble, however, the industry hesitated to invest in such a far-reaching technology jump.

Beyond the above, one more episode in InfiniBand's history is worth mentioning. On February 23, 2016, the Flemish Supercomputer Center (VSC) in Belgium adopted Mellanox's end-to-end 100 Gb/s EDR interconnect solution and integrated it with NEC's new LX supercomputers. The system would be Belgium's fastest (Tier-1) supercomputer and the country's first complete end-to-end EDR 100 Gb/s InfiniBand system, and it is another example of the growing deployment of EDR InfiniBand technology worldwide.

The Prospect of InfiniBand

With the advent of dual-core processors, the growth of the PCI Express bus, and the growing size of supercomputers, the requirements for high bandwidth and low latency have become more demanding. Together with the development of simulation technology in database clusters, manufacturing, petroleum, meteorology, biology, and other fields, high-performance, low-cost network interconnect solutions are gradually taking hold. Everything points to InfiniBand becoming mainstream in fast-moving markets such as scientific computing, high-speed storage, and embedded applications, in which many well-known suppliers of InfiniBand devices operate worldwide, such as IBM, HP, and Intel (shown in the picture). The prospects for InfiniBand are therefore well worth watching.

As a professional supplier specializing in optical interconnects for over ten years, Gigalight provides a complete product line (shown in the table) with operating rates covering QDR, FDR, and EDR according to the actual application scenario, as well as different product combinations (high-performance copper cables, Active Optical Cables, and optical modules are available), which gives clients great flexibility in solution selection. Furthermore, Gigalight products are compatible with InfiniBand switching equipment from major suppliers such as Mellanox, Intel, and IBM.

Gigalight product model | Operating rate | Form factor | Type | Working distance
GQS-PC400-0xC | 40GE, QDR | QSFP+ DAC | Copper cable | 5 m
GQS-AC400-0xC | 40GE, QDR | QSFP+ Active Copper Cable | Copper cable | 10 m
GQS-MDO560-XXXC | 56GE, FDR | QSFP+ AOC | Optical fiber, 850 nm | 150 m (OM4)
GQS-MDO400-XXXC | 40GE, QDR | QSFP+ AOC | Optical fiber, 850 nm | 400 m (OM4)
GM-SDO400-XXXC | 40GE, QDR | QSFP+ AOC | Optical fiber, 1310 nm | 2 km
GQS-MPO400-SR4C | 40GE, QDR | QSFP+ Optical Module | Multimode, 850 nm, 8-core | 400 m (OM4)
GQS-SPO400-LR4C | 40GE, QDR | QSFP+ Optical Module | Single-mode LR4, 2-core | 10 km
GQM-SPO400-IR4C | 40GE, QDR | QSFP+ Optical Module | PSM single-mode, 1310 nm, 8-core | 2 km
GQM-SPO400-LR4C | 40GE, QDR | QSFP+ Optical Module | PSM single-mode, 1310 nm, 8-core | 10 km
GQS-PC101-0XXC | 100GE, EDR | QSFP28 DAC | Copper cable | 3 m
GQS-4P28+PC-XXC | 100GE, EDR | QSFP28 to 4x SFP28 DAC | Copper cable | 3 m
GQS-MDO101-XXXC | 100GE, EDR | QSFP28 AOC | Multimode, 850 nm | 100 m (OM4)
GQP-MDO101-XXXC | 100GE, EDR | QSFP28 to 4x SFP28 AOC | Multimode, 850 nm | 100 m (OM4)
GQS-MPO101-SR4C | 100GE, EDR | QSFP28 Optical Module | Multimode, 850 nm, 8-core | 100 m (OM4)
GQM-SPO101-IR4C | 100GE, EDR | QSFP28 Optical Module | PSM single-mode, 1310 nm, 8-core | 2 km
GQS-MPO101-IR4C | 100GE, EDR | QSFP28 Optical Module | Single-mode CLR4, 2-core | 2 km
GQS-MPO101-LR4C | 100GE, EDR | QSFP28 Optical Module | Single-mode LR4, 2-core | 10 km