Scaling New Heights in a Perfect HPC Storm, 2 in a Series

High-performance computing, CAE technology, and data management are converging to a point where scaling is poised to make all the difference.

By Tom Kevan

Figure 4: Engineers in the oil and gas industry perform transient simulations on tricone drill bits to determine wear and tear. Water is pumped through the center to flush away debris and keep the cutting edge clean. (Image courtesy of Blue Ridge Numerics Inc.)

The Importance of Communications
Of all the factors contributing to software’s ability or inability to scale — that is, take advantage of parallel processing in a cluster-based HPC system — the most important is communications. A software program’s ability to scale is determined by the level of interprocessor communication required, which in turn is dictated by how independently each processor can solve its piece of the puzzle.

“In the basic paradigm of parallel processing, you take a big problem and chop it into a bunch of small pieces,” says ANSYS’ Hutchings. “You have the individual processors work on the small pieces. For that to be an effective and efficient process, the small pieces have to be solvable as individual pieces. If they have an extreme dependency on the neighboring pieces of the puzzle, then solving them independently doesn’t get much speedup in terms of the problem turnaround. So it really comes down to the ability to work independently on the small pieces of the problem without too much dependence on the neighboring processors or neighboring parts of the problem.”
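
To make the paradigm concrete, here is a minimal sketch of the idea in Python, using a 1-D smoothing stencil as a stand-in for a real solver. It is our illustration, not any vendor’s code; the function names and the four-piece split are assumptions for the example.

```python
import numpy as np

def smooth_chunk(chunk, left_halo, right_halo):
    """One smoothing pass over a chunk, using 'halo' values
    borrowed from the neighboring chunks at the boundaries."""
    padded = np.concatenate(([left_halo], chunk, [right_halo]))
    # Each value depends only on its immediate neighbors, so the
    # chunk can be processed independently once the halos arrive.
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

field = np.random.rand(1_000_000)   # the "big problem"
chunks = np.array_split(field, 4)   # chopped into small pieces

# The communication step: each piece trades a single boundary value
# with each neighbor. The smaller that exchange is relative to the
# work done inside each piece, the better the code scales.
results = []
for i, chunk in enumerate(chunks):
    left = chunks[i - 1][-1] if i > 0 else chunk[0]
    right = chunks[i + 1][0] if i < len(chunks) - 1 else chunk[-1]
    results.append(smooth_chunk(chunk, left, right))

smoothed = np.concatenate(results)
```

The ratio is the point: each piece trades two numbers per pass while doing work proportional to its entire chunk, which is what lets the pieces proceed independently.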

This rule of thumb means different things for different CAE disciplines (e.g., CFD and FEA). “As you change the physics and the way you are solving the math, sometimes the independent pieces need to talk to each other more,” says Ray Browell, senior product manager for ANSYS.

Consider FEA software. By its very nature, the finite element method requires frequent communication. To solve its portion of the problem, the software working on an individual element must access data from neighboring elements, some of which may reside on other cores in the cluster. Past a certain point, adding processors begins to slow the overall calculation because interprocessor communication consumes an ever-larger share of the runtime.

“Finite element software solves problems by comparing information and passing information among cores that are a little bit farther away,” says Dennis Nagy, vice president of business development for CD-adapco. “So more often, it has to communicate with different cores or nodes in the cluster, and the percentage of time spent communicating goes up the more cores you have. It begins to erode the otherwise ideal effectiveness.”

CFD software is different. It still requires communication between cores, but far less of it. Imagine a fluid flow divided by an object such as a valve: the effect is very isolated, or local. Compared with FEA, the physics of CFD is often simpler.

“With CFD, communications between the processors is so little compared with the number crunching going on at each node or at each core that we almost get linear scaling, way up to a hundred or more cores,” says Nagy.
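
A back-of-the-envelope model captures both behaviors. In the sketch below, the per-core communication costs are invented, illustrative numbers, not measurements of any actual FEA or CFD code; the model simply charges each added core a fixed communication toll.

```python
def modeled_speedup(cores, comm_cost_per_core):
    """Ideal speedup eroded by communication overhead that grows
    with the core count: time = compute/cores + toll * cores."""
    compute_time = 1.0
    total_time = compute_time / cores + comm_cost_per_core * cores
    return compute_time / total_time

for cores in (1, 8, 32, 128):
    chatty = modeled_speedup(cores, 1e-3)  # FEA-like: frequent exchanges
    quiet = modeled_speedup(cores, 1e-5)   # CFD-like: minimal exchanges
    print(f"{cores:4d} cores: chatty {chatty:6.1f}x, quiet {quiet:6.1f}x")
```

With the chatty setting, the modeled speedup peaks and then falls as cores are added, echoing Nagy’s point that communication “begins to erode the otherwise ideal effectiveness”; with the quiet setting, scaling stays near-linear past a hundred cores.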

In addition, there is a link between the effect of communications and the mathematical formulations used in CAE software. “The issue that I referred to about how dependent the calculation happening on one processor might be on a neighboring processor is a fundamental aspect of the mathematics,” says Hutchings.

In the case of FEA software, the end result is a diminished capacity to scale. “FEA software, which primarily relies on the implicit approach, takes advantage of parallel processing as well as it can, given the underlying numerical methods and equations that it is trying to solve,” says Nagy. “Those equations require more interprocessor communications than the finite volume analysis that CFD is based on. FEA software vendors’ hands are somewhat tied by the methods they are using.”

The minimal communication required by CFD software fits naturally with the mathematical approach it’s based on. “In an explicit solution scheme, you have small-element, localized matrix problems that you have to solve as you go along,” says Siemens’ Komzsik. Each of those localized problems can typically be confined to one particular processor. Therein lies the big difference: significantly less interprocessor communication.
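
The distinction can be seen in a toy 1-D diffusion step, sketched below in Python. This is our illustration of the general implicit-versus-explicit contrast, not code from any of the vendors quoted here.

```python
import numpy as np

n, r = 200, 0.4        # grid points; diffusion number dt*alpha/dx**2
u = np.random.rand(n)  # current field values

# Explicit step: each new value reads only its immediate neighbors,
# so a partition of the grid needs to trade just boundary values.
u_explicit = u.copy()
u_explicit[1:-1] = u[1:-1] + r * (u[2:] - 2 * u[1:-1] + u[:-2])

# Implicit (backward Euler) step: (I - r*L) @ u_new = u, one global
# linear system that couples all unknowns at once. Distributing a
# solve like this demands far more interprocessor communication.
A = np.eye(n)
for i in range(1, n - 1):
    A[i, i - 1] -= r
    A[i, i] += 2 * r
    A[i, i + 1] -= r
u_implicit = np.linalg.solve(A, u)
```

Production codes use far more sophisticated distributed solvers, but the structural difference (local updates versus a global solve) is the source of the communication gap.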

In the end, the biggest challenge to scaling is communications — moving data among the cores. “Regarding multicore machines, the biggest challenge with utilizing the processing power is the ability to feed data into and out of the processors or cores fast enough to utilize the processing power of the cores,” says Jim Spann, vice president of marketing for Blue Ridge Numerics.
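
A rough, machine-dependent experiment illustrates Spann’s point: perform the same number of additions on data that fits in cache versus data that must stream from main memory. The array sizes below are assumptions, and absolute timings will vary by system.

```python
import time
import numpy as np

big = np.random.rand(50_000_000)  # ~400 MB: streams from main memory
small = np.random.rand(50_000)    # small enough to stay cache-resident

t0 = time.perf_counter()
big.sum()                          # 50 million additions, memory-bound
t_streaming = time.perf_counter() - t0

t0 = time.perf_counter()
for _ in range(1000):              # also 50 million additions in total
    small.sum()
t_cached = time.perf_counter() - t0

print(f"streaming: {t_streaming:.3f} s, cache-resident: {t_cached:.3f} s")
```

On most systems the cache-resident passes finish faster even though the arithmetic is identical, a small-scale echo of the data-feeding bottleneck Spann describes.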

Future Improvements
Today’s CAE software scaling capabilities are not carved in stone. Under pressure to redeem the promise of parallel processing, software vendors will wring out every scaling improvement possible.

Until now, the core analysis functions have been the focus of most efforts to improve scaling. Vendors are now casting their nets wider, looking to optimize other parts of the simulation process. “More development work will be concentrated on improving the parallel performance of all the phases of the analysis, which will increase the overall scalability as core counts increase,” says Dennis Sieminski, director of marketing and international sales for NEi Software.

For example, the meshing and surface wrapping processes are prime candidates for scaling improvements. “You can always get a robust mesh that’s pretty good quality, or you can wrap the surface automatically,” says CD-adapco’s Nagy. “Everybody says that is a great improvement, but those approaches are still basically happening on one processor. We can parallelize these processes. There’s still a lot of work to be done there that will speed up the compute steps.”

More than one provider is taking this approach. “For ALGOR, a new approach will be to expand the tasks that are run on high-performance computing systems,” says Bob Williams, product manager for ALGOR. “Currently, we support the calculation of the analysis solution. Future development will add preprocessing tasks, including meshing, and postprocessing calculations, such as fatigue studies. Preprocessing and postprocessing tasks will also benefit from increased computing power in terms of faster, more efficient performance.”

Most application vendors feel there is room for improvement in the current generation of software, and this is where they are applying their energy and resources. Others believe a more radical solution will ultimately be called for. “A revision and rewriting of commercial software is needed to really take advantage of the multicore and many-core systems that are becoming available,” says Siemens’ Komzsik. Only time will tell.

More Info:
ALGOR, Inc.
Pittsburgh, PA

ANSYS
Canonsburg, PA

Blue Ridge Numerics
Charlottesville, VA

CD-adapco
Melville, NY

COMSOL
Burlington, MA

MSC.Software
Santa Ana, CA

NEi Software
Westminster, CA

Siemens PLM Software
Plano, TX


Tom Kevan is a New Hampshire-based freelance writer specializing in technology. Send your comments about this article to [email protected].
