A superfast network is going online in New Orleans, the birthplace of jazz. With the capacity to transfer over 260 gigabits per second, this network could, in theory, let you download the entire content of the Library of Congress in under 30 seconds. Called SCinet, the superhighway is a temporary setup, put together to celebrate the technophiles’ Mardi Gras, otherwise known as Supercomputing 2010 (SC10).
SCinet will be available for seven days only, just enough to get you drunk on the horsepower boost. But similar networks may pop up sooner than you think. The show’s leading sponsors are technology powerhouses with a vested interest in the proliferation of supercomputers, hosted online or installed on premises.
Clouds Bursting at the Seams
Just as SC10 got underway, Platform Computing, which specializes in cluster, grid, and cloud management, issued a white paper titled “High Performance Computing in the Cloud” (November 2010). Fusing premonition and admonition, the paper’s tagline warns we must “[Prepare] for the inevitable.”
Cloud-hosted HPC is inevitable for one simple fact: cost effectiveness. “When IT departments buy, build, and maintain clusters to handle peak loads it can be expensive, time consuming, and wasteful. Compute environments designed for peak loads often see utilization rates drop with idle compute resources when the project that created the spike is complete,” the paper points out.
“High-Performance Computing (HPC) has a long tradition of using dedicated, homogeneous, and fast resources connected via an extremely high speed network. Therefore, many HPC users don’t believe that cloud computing can be used as an HPC resource,” the paper observes. “Why? Because implementations of cloud computing are generally a loose set of commodity servers in an infrastructure that is not designed for speed … A typical cloud infrastructure is designed for transactional processing, and does not provide the high bandwidth, low latency connections or extremely low hop counts needed for high performance computing.”
The authors go on to explain that “the low application performance in virtualized environments created a huge barrier for cloud adoption in HPC.” But the barriers are about to come down. “Virtualization technology has advanced in recent years and performance is becoming less of an issue. Processor support for virtualization as well as para-virtualized operating system device drivers have improved …”
The Battleground at a Glance
Big names generally associated with desktop workstations and home computing are carving their own corners in the cloud. Microsoft told SC10 attendees that Windows HPC Server users will soon be able to run HPC workloads on Windows Azure. HP alerted HPC seekers that it was responsible for the TSUBAME 2.0, the first peta-scale system designed to support applications in climate and weather forecasting, tsunami simulations, and computational fluid dynamics. It comprises 1,357 HP ProLiant SL390s G7 servers, each with three NVIDIA Tesla M2050 GPUs, touting a sustained performance of 1.192 petaflops. The computer maker also supplied two other systems inducted into this year’s TOP500 (that’s the HPC equivalent of the Fortune 500): one at Georgia Tech, another at MD Anderson Cancer Center.
HP’s workstation rival Dell declared that its HPC program for CERN’s ATLAS experiment is expanding to all Large Hadron Collider research experiments, powered by Dell PowerEdge HPC technologies. GPU maker NVIDIA continues to encroach on the HPC and technical computing markets, as is evident in the recent Amazon announcement that its Amazon Web Services division now offers Amazon Cluster GPU Instances, designed to deliver GPU-driven processing in the cloud.
Broader Bandwidth and Interconnects Critical to Cloud-Hosted HPC
Whatever the turbocharged horsepower hosted in the cloud may be, you’re still limited by the byte-per-second data transfer rate at which you communicate with the remote cluster. Platform Computing’s paper notes, “How the application interfaces with the storage resource, whether streaming or high IOPS, can affect the performance of the system. The processes that handle data can slow down a system and cause an application to take more time to finish.”
SC10’s seven-day HPC bonanza, SCinet, is delivered through an InfiniBand network, consisting of Quad Data Rate (QDR) 40-, 80-, and 120-gigabit per second (Gbps) circuits. That’s far superior to the kind of wired or wireless connection you might get in an office or at home.
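To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes ideal conditions (no protocol overhead, no storage bottleneck) and decimal units (1 gigabyte = 8 gigabits). The 1,000 GB dataset and the 1 Gbps office link are hypothetical values chosen for illustration, not figures from SC10 or the Platform Computing paper.

```python
def transfer_seconds(size_gigabytes: float, link_gbps: float) -> float:
    """Ideal time, in seconds, to move a dataset over a link.

    Assumes the link runs at its full rated speed with no overhead.
    """
    size_gigabits = size_gigabytes * 8  # 1 byte = 8 bits
    return size_gigabits / link_gbps


def gigabytes_moved(link_gbps: float, seconds: float) -> float:
    """Upper bound on the data (in gigabytes) a link can carry in a given time."""
    return link_gbps * seconds / 8


if __name__ == "__main__":
    dataset_gb = 1000.0  # hypothetical simulation dataset (assumption)

    # 1 Gbps stands in for a typical wired office link (assumption);
    # 40, 80, and 120 Gbps are the SCinet circuit speeds cited above.
    for gbps in (1, 40, 80, 120):
        seconds = transfer_seconds(dataset_gb, gbps)
        print(f"{gbps:>4} Gbps: {seconds:8.1f} s to move {dataset_gb:.0f} GB")

    # The 260 Gbps aggregate capacity sustained for 30 seconds:
    print(f"260 Gbps for 30 s moves at most {gigabytes_moved(260, 30):.0f} GB")
```

Under these idealized assumptions, the same dataset that ties up a 1 Gbps link for more than two hours crosses a 40 Gbps circuit in a little over three minutes. Real-world throughput would be lower once protocol overhead and storage speed come into play.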
Whereas Fat Tuesday marks the peak of Mardi Gras festivities, fat data pipelines may determine the future of on-demand HPC via cloud.