By Peter Varhol
I cut my teeth as a software developer, and I know many engineers who still use their own code for specialized analysis of unique data, or to do simulations that require custom processing. Much of this code has been around for a long time, and while it may be highly efficient, it was never built to run on multiple processors or multiple processor cores.
Yet if engineers want that code to remain efficient, it must take advantage of the rapidly growing number of CPU cores per processor. As core counts expand to four, eight, and beyond, even on desktop systems, engineering applications need to harness this power.
But it’s a difficult task, even for custom analysis code. Many matrix operations can be done in parallel, but identifying where to begin and end parallel execution can be difficult, and putting the safeguards into place to ensure that the individual operations are protected is a highly detailed activity.
The potential errors are both significant and difficult to find and analyze. One of the most insidious is the race condition. In a race condition, multiple parallel streams execute more or less simultaneously; because they share data during their execution, the end result can be incorrect, depending on the order in which the threads read and write that shared data.
It is a difficult error to catch, in large part because it occurs only some of the time, and often seemingly at random. Engineers can become frustrated by seeing the code execute successfully run after run, yet produce an incorrect or inconsistent result once in a while.
Other errors are also possible, but most result in a system crash as multiple parts of the code attempt to access or change the same data at the same time. Whatever the type of error, even engineers with solid software development experience are rarely trained to identify and fix them.
Intel, the processor vendor that brought multiple cores to our industry-standard hardware, has a big responsibility in alleviating the problem of writing code for those processors. And there is a lot of self-interest involved for Intel, too. If software can’t take advantage of the multicore processors Intel is manufacturing, then there is little reason for those requiring high-performance systems to continue buying Intel-based systems.
Intel has known this for a long time, and in response it is currently offering Intel Parallel Studio for beta testing at http://software.intel.com/en-us/intel-parallel-studio-home/. Intel Parallel Studio consists of three components—Intel Parallel Composer, Intel Parallel Inspector, and Intel Parallel Amplifier. Together, these tools enable developers building multithreaded applications for parallel execution to quickly construct and debug applications that are able to make more efficient use of today’s processors.
These tools plug into Microsoft Visual Studio, where you can write C or C++ code (or port existing code) and use the libraries, debuggers, and tuners to help you make that code parallel. Intel Parallel Composer includes a parallel debugger plug-in that simplifies debugging of parallel code and helps to ensure thread accuracy. Parallel Inspector detects threading and memory errors and provides guidance to help ensure application reliability. Parallel Amplifier is a performance profiler that makes it straightforward to quickly find multicore performance bottlenecks without needing to know the processor architecture or assembly code.
However we might write parallel code in the future, Intel is now providing essential tools for anyone looking to write code for today’s multicore processors. It is the only way that we can start using the computing power the last few years have afforded us.
Peter Varhol has been involved with software development and systems management for many years. Send comments about this column to DE-Editors@deskeng.com.