
Supercomputing and the Cloud, 2 in a Series

By Peter Varhol


NVIDIA’s new Tesla Personal Supercomputer is a 960-core GPU-based system rated at 4 TeraFLOPS.

Hardware Paces the Move
High-performance hardware has a completely different meaning today than it did just a couple of years ago. Back then, there was a significant performance gap between clustered industry-standard hardware and the top-end, limited-edition, and often proprietary systems. But multicore chips have changed that equation, and those chips, along with the systems built around them, are poised to make a difference in the cloud.

But first up is a dramatic multicore configuration that today is only on the fringes of what might be considered industry standard. At some point in the recent past, software engineers determined that for compute-intensive applications, the high-end graphics processing units (GPUs) normally used for video games had become far more powerful than Intel-architecture chips. The thinking went that there must be a use for this type of processor. The question was how to leverage it in a standard computer architecture.

NVIDIA recently delivered a 960-core GPU system using its high-end computing processors based on the NVIDIA CUDA parallel computing architecture. This system, called the Tesla Personal Supercomputer, is priced at just under $10,000 and is rated at 4 TeraFLOPS, making it theoretically possible to solve all but the most computationally intensive problems.
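To put those ratings in perspective, the quoted figures can be reduced to per-core throughput and cost per unit of peak performance. The short calculation below is only a sketch: it uses the numbers stated above, rounds the price up to an even $10,000, and treats the 4 TeraFLOPS rating as a theoretical peak rather than sustained performance.

```python
# Figures quoted in the article for the Tesla Personal Supercomputer.
PEAK_TFLOPS = 4.0      # rated peak, in TeraFLOPS
CORES = 960            # GPU cores in the system
PRICE_USD = 10_000     # assumption: "just under $10,000" rounded up

peak_gflops = PEAK_TFLOPS * 1000          # 1 TeraFLOPS = 1,000 GigaFLOPS
gflops_per_core = peak_gflops / CORES     # average peak throughput per core
usd_per_gflops = PRICE_USD / peak_gflops  # cost per GigaFLOPS of peak rating

print(f"{gflops_per_core:.2f} GFLOPS per core")   # 4.17 GFLOPS per core
print(f"${usd_per_gflops:.2f} per peak GFLOPS")   # $2.50 per peak GFLOPS
```

Each core is modest on its own; the rating comes from running hundreds of them at once, which is why the application must be written to expose that much parallelism.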

Granted, standard applications can’t run on systems like this, because most commercial and custom applications are compiled to run on industry-standard Intel processors. But NVIDIA makes compilers available, so custom code can be compiled to run on this platform and take advantage of the parallelism offered by the multiple cores. And having this much computing power available in a cloud can lead application developers to find ways to use it.

Of course, Intel processors are also multicore and can be clustered for greater computing power. At Supercomputing 08, Microsoft announced that its Windows Server System had powered one of the ten top-rated supercomputers as measured by industry-standard floating-point benchmarks. This type of performance makes even Windows-based computers viable for high-performance work in the cloud.

The implications for enterprise computing in the cloud are significant. HPC on Windows using standard hardware brings to the engineering community the ability to obtain performance on demand for its applications. Applications that see a wide range of capacity needs, yet must execute fast under all types of load, will benefit from this trend.

Managing this level of computing power has always been one of the barriers to its widespread use. But management tools are also emerging that provide the ability to effectively partition this power among multiple applications and jobs.

Special-purpose middleware is also making it possible to run applications that have been tuned to work with that software. Middleware from companies such as Acceleware and ScaleMP sits between the operating system and the application, and uses both general-purpose and industry-specific algorithms to break parts of the application’s execution into parallel components. Those components can then execute on the underlying Windows operating system in threads that run in parallel on separate cores.
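The general idea of breaking one computation into parallel components can be sketched in a few lines of Python. This is not how Acceleware or ScaleMP products work internally; it is a generic illustration with hypothetical helper names (split, sum_of_squares, parallel_sum_of_squares). It uses a thread pool to show the structure; for CPU-bound pure-Python work, a process pool would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def split(data, parts):
    """Cut data into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(data[start:end])
        start = end
    return chunks

def sum_of_squares(chunk):
    """The independent component: needs nothing outside its own chunk."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=None):
    """Decompose, run the components concurrently, then combine."""
    workers = workers or os.cpu_count() or 1
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)

data = list(range(1_000))
total = parallel_sum_of_squares(data, workers=4)
print(total)  # 332833500
```

The decomposition only pays off when the components are independent, as they are here; any data shared between chunks would need coordination that eats into the gain.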

Don’t Forget About Programming Tools
In HPC, programming, or at least building an application, is just as important as running it. That’s because the application is often heavily compute-intensive, and uses special-purpose tools in execution.

Fortunately, development tools are emerging that can help application developers build software that takes advantage of these systems, both locally and in the cloud. These tools aren’t your standard development products, but they can be used by enterprise developers under certain circumstances. They include Mathematica, the math programming environment from Wolfram Research, and MATLAB, the engineering modeling and simulation environment from The MathWorks. Both assist in developing for and deploying to the cloud. For example, the MATLAB language provides a parallel loop keyword, parfor, that enables parallel execution of defined parts of the code. It works much like the fork-join model familiar from Unix programming, except that it specifically tells the platform to run on multiple cores or systems if available.
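The fork-join pattern behind a parallel loop can be illustrated outside MATLAB as well. The Python sketch below is an analogue, not MATLAB syntax; the names parallel_for and simulate are hypothetical, and the loop body is a stand-in for real work. The key assumption is that each iteration is independent of the others.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(step):
    """Hypothetical loop body; each iteration is independent of the others."""
    return step ** 0.5

def parallel_for(body, iterations, workers=4):
    """Run body(i) for every i concurrently and return results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # "Fork": hand every iteration to the pool.
        futures = [pool.submit(body, i) for i in iterations]
        # "Join": wait for each result, preserving iteration order.
        return [future.result() for future in futures]

results = parallel_for(simulate, range(8))
print(results[4])  # 2.0
```

Because the results are collected in submission order, the caller sees the same output as a sequential loop would produce, regardless of which iteration finishes first.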

MATLAB also enables the programmer to define the execution environment, even specifying a particular cloud such as Amazon EC2. By configuring that location for execution, the programmer can ensure that the application is optimized for that particular environment.

Overall, the cloud is fast becoming a legitimate platform for both developing and executing HPC applications. While a lot of development can still occur on the desktop, there is a need both to take advantage of the higher level of parallelism afforded by cloud systems and to scale capacity quickly in response to business needs.

One Last Warning
Perhaps the most difficult aspect of developing applications for the cloud is being cognizant of the latencies and timeout errors involved in calling cloud services. As reliable as the Internet is in general, the Internet Protocol itself offers only best-effort delivery, with no guarantee that a given packet arrives. There may be a need to build in fail-safe features that account for the lack of real-time responses in simulations, for example.
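One common fail-safe of this kind is to retry a timed-out call a limited number of times and then fall back to a default value. The sketch below assumes a hypothetical service that signals a timeout by raising Python’s built-in TimeoutError; the names call_with_retry and flaky_service, the retry count, and the delays are all illustrative choices, not a prescription.

```python
import time

def call_with_retry(call, retries=3, base_delay=0.1, fallback=None,
                    sleep=time.sleep):
    """Call `call()`; on TimeoutError retry with exponential backoff.

    Returns `fallback` if every attempt times out.
    """
    for attempt in range(retries):
        try:
            return call()
        except TimeoutError:
            if attempt < retries - 1:
                # Wait 0.1 s, 0.2 s, 0.4 s, ... before the next attempt.
                sleep(base_delay * 2 ** attempt)
    return fallback

# Hypothetical cloud service that times out twice, then answers.
attempts = {"n": 0}

def flaky_service():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("no reply in time")
    return 42.0

# The sleep function is replaced here so the demonstration runs instantly.
value = call_with_retry(flaky_service, sleep=lambda seconds: None)
print(value, attempts["n"])  # 42.0 3
```

Whether a fallback value is acceptable depends on the application; a simulation might instead reuse the last good result or pause until the service responds.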

Cloud computing differs from the open source software revolution of the past several years in that there is a clear need to charge for cloud services early on. This means that there will be more planning and approvals before clouds become a common part of the enterprise computing environment.

But cloud computing will find many adherents who like the model of paying for subscription services, especially if those services are of high value. The job of engineers is to identify those high-value services, and to ensure that project stakeholders understand and approve of that value, before engaging in significant cloud-based engineering projects.

More Info:
Acceleware Corp.
Munich, Germany

Seattle, WA

Microsoft Corp.
Redmond, WA

Santa Clara, CA

Cupertino, CA

The MathWorks
Natick, MA

Palo Alto, CA

Wolfram Research
Champaign, IL

Peter Varhol has been involved with software development and systems management for many years. Send comments about this column to DE-Editors@deskeng.com.


About Peter Varhol

Contributing Editor Peter Varhol covers the HPC and IT beat for Digital Engineering. His expertise is software development, math systems, and systems management. You can reach him at DE-Editors@digitaleng.news.