By Peter Varhol
As GPUs with multiple processing cores increasingly find their way into standalone workstations, engineers are looking for ways to take advantage of the computational power provided by these special-purpose processors. Today, few people doubt the value of GPU computing, especially for computationally intensive uses such as engineering analysis and simulation.
But GPU computing presents a difficult dilemma for engineers, hardware vendors and engineering software providers alike. The problem is that most code, whether commercial or in-house, has been written for industry-standard CPUs. If you own the code, you have to determine whether it’s worth the time and effort to convert it to a GPU architecture. While a lot of in-house analysis code should, in theory, be portable, porting it can be a very large undertaking–requiring specialized skills and months of effort–for a return that may not be worth the cost.
The Microway BioStack combines Intel Xeon CPUs and Tesla Fermi GPUs, totaling 84 CPU cores and 6,272 GPU cores.
A commercial analysis and simulation software vendor has to determine whether the potential revenue from a GPU implementation justifies the effort to create a new version of a product. Several companies, including ANSYS, have already made that leap, but it’s a tougher decision for some of the smaller vendors.
Hardware vendors have to look at GPU architectures and decide whether to offer them in system configurations. While such systems are becoming more popular, adding GPUs means a more complex integration effort for system vendors, which makes the systems more expensive to design and manufacture.
The situation gets still more complex. There are two primary GPU architectures, one from NVIDIA and one from Advanced Micro Devices (AMD). An individual engineering group with its own code can make a decision on which architecture to use, and port its code accordingly. But commercial vendors are almost obligated to choose one or the other, because they usually don’t have the engineering or financial resources to do both.
Today, NVIDIA has a clear lead over AMD, thanks to a concerted effort to build and promote GPU computational architectures, but it’s still early in the game. Some engineering software vendors have made the decision to port software to both platforms.
“We support both the NVIDIA and AMD platforms,” confirms Acceleware Ltd. Chief Technology Officer Ryan Schneider.
But few others can afford to make that call, so you may be limited in your choice of engineering applications if you want GPU performance–and even more limited in your GPU selection.
What About Performance?
From a performance standpoint, the GPU is a clear winner for many engineering applications, especially those that involve floating-point computation. Depending on the GPU, the type of computation being performed and specific code details, computations can run as much as 10 times faster on the GPU than on a corresponding CPU. Of course, those results are on benchmarks, and real-life application performance gains tend to be lower. But there is usually enough of a performance advantage for engineers to look carefully at the GPU hardware and software.
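One common way to reason about why whole-application gains trail benchmark gains is Amdahl's law: only the portion of the run time that actually moves to the GPU gets faster, and the rest still runs at CPU speed. The sketch below is illustrative only. The function name and the offload fractions are assumptions chosen for the example, not measurements from any product mentioned here; the 10x figure is simply the upper bound quoted above.

```python
def overall_speedup(offloaded_fraction, gpu_speedup):
    """Amdahl's law: whole-application speedup when only part of the run time
    is accelerated.

    offloaded_fraction -- share of the original CPU run time moved to the GPU (0..1)
    gpu_speedup        -- how much faster that share runs on the GPU (> 0)
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    remaining_cpu_time = 1.0 - offloaded_fraction
    accelerated_time = offloaded_fraction / gpu_speedup
    return 1.0 / (remaining_cpu_time + accelerated_time)


if __name__ == "__main__":
    # Hypothetical offload fractions, each combined with a 10x GPU speedup.
    for fraction in (0.5, 0.8, 0.95, 1.0):
        speedup = overall_speedup(fraction, 10.0)
        print(f"{fraction:.0%} of run time offloaded: {speedup:.2f}x overall")
```

Under these assumptions, an application that spends 80 percent of its time in code that runs 10 times faster on the GPU finishes only about 3.6 times faster overall, which is why the share of code that can be ported matters as much as raw GPU speed.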
Industry-standard CPUs haven’t ceded a major role in engineering application performance, despite an architecture geared more toward general-purpose computing. But CPU leader Intel hasn’t given up on GPU computing, despite its failure two years ago to release a multicore GPU processor, code-named Larrabee. The company is likely to build on the Larrabee technology, preferring to incorporate its GPU features into its popular CPU architecture.
Still, Intel doesn’t see a role for GPU-only processors, at least for high-performance parallel computing: “We’re satisfied with our Xeon performance and tools for engineering applications,” says James Reinders, director of marketing and business for the Intel Software Development Products group.
Despite the GPU’s computational performance advantage, you’re not going to see a mainstream GPU-only machine anytime soon. While Linux is likely to be ported to one or both GPU architectures as soon as the languages and development tools mature, it is highly unlikely that the GPU will support mainstream applications on its own operating system.
So, the likely configuration for the foreseeable future is a system with a Windows operating system running on one or more CPUs, combined with either an expansion card or a chassis with multiple GPU processors and up to 10,000 cores. These systems will support engineering software that displays on the CPU and renders on a graphics chip, but does computations on a GPU card or array.
Best of Both Worlds?
NVIDIA just might have an answer to the dilemma of when to use CPUs and GPUs. In conjunction with OEM hardware partners such as Dell, HP, Lenovo and Fujitsu, NVIDIA has recently announced a class of workstations, code-named Maximus. Maximus is powered by a CPU, a Quadro GPU and a Tesla GPU. These systems will use the CPU to run the operating system, the Quadro GPU to handle graphics processing, and the Tesla for parallel GPU computations. According to Jeff Brown, general manager of the Professional Solutions Group at the company, NVIDIA refers to the vision as “unifying graphics and parallel computing.”
Traditionally, simulation and analysis jobs requiring parallel computing are outsourced to server clusters–a workflow that hampers productivity among engineers, designers and digital content creators. With a Maximus-class workstation, an engineer may do CAD work, render graphics and run simulations–all at the same time–without seeing a slowdown in system performance on his or her workstation.
The Maximus platform looks a lot like what Intel is doing with some of its hardware and software partners, in promoting virtual machine computing using its VT-d direct I/O technology. But there are important differences. Intel and its partners are focusing on lower to mid-range cluster computing, utilizing unused memory and processor cores. It’s a smart strategy, but it doesn’t make complex computations complete more quickly. That’s where GPUs can fill the gap.
NVIDIA promises dynamic resource allocation with its systems. In other words, the engineer doesn’t need to know which part of his or her job is best suited for the CPU, Quadro GPU or Tesla GPU. Maximus-certified machines will be able to balance the load among their CPU, Quadro GPU and Tesla GPU on their own.
The Bottom Line
While there are a lot of question marks surrounding GPU computing today, it’s clear that it is fast becoming a significant force in engineering work. Engineers seeking a higher level of performance on individual workstations can adopt GPU computing via a plug-in card. Dedicated multi-processor and multi-core GPU systems are also available, with an industry-standard CPU running the operating system and many applications.
But software remains the key ingredient, and good engineering applications will continue to be slow in coming. Until GPU computing becomes ubiquitous, engineers may have to search for the right application mix. The alternative is to pass on building a GPU computing environment, and stay with slower, but more universal CPUs.
As software catches up, more engineers are likely to adopt GPU solutions for computational purposes, while keeping CPUs to run the operating system and general business applications.
Mobility and Mixed Processor Architectures
Most of us don’t give a lot of thought to the processors in our smartphones or other handheld devices. Most are ARM processor designs, which offer good performance with low power consumption. However, phone interfaces and applications are becoming increasingly graphical, and are starting to necessitate performance well beyond what is available in a mainstream ARM processor. High-performance portable devices are needed by engineers as extensions of their workstations–and even their clusters–as their work continues to become more mobile.
Sumit Gupta, director of the Tesla GPU Computing Business Unit at NVIDIA, notes that some design engineers are already working in this manner.
“I have a relative who’s an architect, and he uses his iPad to change designs right up until the time he arrives in the client’s office,” Gupta says.
It’s clearly within the reach of design engineers to work using portable and handheld devices. For example, NVIDIA offers a system-on-a-chip configuration that includes both an ARM CPU and an NVIDIA GPU, specifically for smartphones, tablets and other handheld devices. This chip, the Tegra 2, consists of a dual-core ARM Cortex-A9 CPU coupled with an ultra-low-power NVIDIA GeForce GPU. It is intended to power smartphones that are just as much computer as phone.
The Tegra 2 provides a strong combination of phone performance, including web browsing and app execution, as well as graphics display for the UI, graphical applications and video streaming. It is possible to get PC-like response from video or from detailed graphics, albeit on a much smaller display.
Today, Tegra 2-based phones are available from the likes of Motorola Mobility, Samsung and LG. Phone design is often a careful balance between performance and power consumption, but performance is climbing fast. The Tegra 2 provides a platform that looks like a phone, but performs more like a computer.
You may not be using your smartphone as a design tool today, but the day when it will seem indispensable isn’t that far away. Whether you modify design components, run analyses using different parameters on your cluster or in the cloud, or use it as a demonstration platform, your smartphone is fast becoming a way to connect more closely with your designs.
Contributing Editor Peter Varhol covers the HPC and IT beat for DE. His expertise is in software development, math systems, and systems management. You can reach him at firstname.lastname@example.org.