Tune up your career in high-performance supercomputing

Supercomputing is an appealing career path for anyone in IT who wants to work on exciting, demanding projects that make a difference in the world. Here are the skills you need to find a job in this growing field.

IT professionals who are interested in working on the leading edge of computing should learn about supercomputers and high-performance computing (HPC). This field of activity encompasses a variety of job roles that range from the people who specify and build the systems, to the admins and engineers who put them together and run them, to the programmers who create the tools and applications that put the monsters to work.

One thing they all have in common, though: HPC work is demanding, exciting, and often has an immediate effect on the world. Predicting the weather? Sequencing genomes to help researchers solve Alzheimer’s? Designing industry-leading special effects for a movie? Few of these uses for supercomputers are boring.

Here's a short technical overview of HPC, followed by career options in this super-fast segment of the computer market.

The HPC vocabulary

Everybody knows that supercomputers are the behemoths of the computing world. These installations invariably invite superlatives and usually involve thousands upon thousands of processors, plus ungodly amounts of RAM and storage. We’re talking teraFLOPS to petaFLOPS of computing speed, terabytes of RAM, and petabytes or more of disk storage. One also finds lots of high-speed interconnects to tie together the computing clusters from which such systems are made. That permits all those computing resources to work in parallel, crunching away at a variety of interesting problems and specialized applications.

HPC installations may be built using special-purpose, high-speed components, but they may also be put together using industry-standard CPUs and GPUs from well-known vendors, including Intel, AMD, Qualcomm, and Nvidia. Blade racks are most often cooled using water or forced air, because keeping systems cool not only saves on power costs but also permits them to run faster.

The power of parallelism

The secret to realizing a return on the kinds of massive investment that a supercomputer requires is to put its many pieces to work simultaneously, each one beavering away at a portion of the workload that the system handles at any given moment. The benefits of performance optimization grow with the size of a supercomputer, so that modest gains at the unit level can translate into enormous boosts in throughput or, in more typical supercomputer terms, shorter completion times for scheduled jobs.

Most modern supercomputers run on highly customized versions of Linux, tweaked to squeeze as much performance out of the systems as high-speed technology, human ingenuity, and computerized automation will allow. Supercomputers are such expensive and valuable resources that they’re usually scheduled from months to years in advance, with access limited to specific time windows during which individual jobs may run. This makes working with such systems something like air traffic control at a busy airport: A lot is going on, with many people involved in orchestrating activity and a lot of stress and pressure to keep things moving, safe, and secure.

This also explains why automation is a key ingredient in supercomputing. In an environment where tens to hundreds of thousands of processors must be kept as fully occupied as possible, the value of powerful and reliable configuration, orchestration, management, and monitoring tools is hard to overstate. It’s in users’ best economic interests to make setup and configuration as quick as possible, and to capture state and data so that jobs may be interrupted at any time, only to resume as soon as computing resources once again become available.

In career terms, you should be thinking, “Which of those roles am I most interested in pursuing?”
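The idea of capturing state so that an interrupted job can resume later can be illustrated with a minimal checkpoint-and-restart sketch. This is only a toy under stated assumptions: the function name `run_job`, the JSON checkpoint file, and the `stop_after` parameter (which simulates a job being preempted) are all hypothetical, and production HPC sites rely on their scheduler's and application's own checkpointing facilities rather than anything this simple.

```python
import json
import os


def run_job(total_steps, checkpoint_path, stop_after=None):
    """Accumulate a running total, saving state after every step.

    stop_after (hypothetical) simulates preemption: the job stops after
    that many steps in this run. Returns the current state dict.
    """
    # Resume from the saved state if a checkpoint exists; otherwise start fresh.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "total": 0}

    steps_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_run >= stop_after:
            break  # simulated interruption by the scheduler

        # One unit of "work": add the step index to the running total.
        state["total"] += state["step"]
        state["step"] += 1
        steps_this_run += 1

        # Write to a temporary file, then rename it into place, so an
        # interruption never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state
```

Calling `run_job(10, "job.ckpt", stop_after=4)` and then `run_job(10, "job.ckpt")` finishes the same ten steps across two runs, with the second call picking up where the first stopped instead of starting over.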

What kinds of problems does HPC tackle best?

HPC systems usually play host to specialized applications and services. Historically, HPC has been used for weather forecasting, aerodynamics (terrestrial and aerospace), and even nuclear weapons simulations (throughout the Cold War, peaking in the 1980s). These days, HPC is used for all kinds of compute- and data-intensive applications, which include:

  • Complex financial models: This includes hedge funds or currency and commodities trading, as well as large-scale economic models.
  • Molecular, genetic, and chemical models: Think protein folding, molecular dynamics, genomic analysis, and complex chemical reactions of all kinds. Many of the world’s largest HPC installations work for big pharma.
  • High-energy physics and cosmology: Examples include analyzing data from particle accelerators, gravitational-wave interferometry, astronomy, and modeling the Big Bang and its aftermath.
  • Complex engineering: This applies to large-scale simulations of all kinds, including finite element modeling, fluid dynamics, multi-physics, and complex systems such as large buildings, bridges, and other structures. Generative design, which mixes parametric modeling with iteration and evolution, is a particularly rich area for engineering HPC.
  • Machine learning and big data/data science: Where there’s lots of data, there’s also lots of opportunity to perform rich and complex analysis. This is an area where HPC is showing great growth and equally great gains.

The essence of what makes an application suitable for HPC is that it requires many, many compute operations that can be at least somewhat isolated from each other and that are fairly consistent in character and scope. For these workloads, HPC provides the ultimate form of “divide and conquer,” where work gets divided up into a very large number of individual assignments that are parceled out to HPC nodes for processing and then aggregated when the processing is complete, producing outputs in the form of models, analyses, forecasts, and so forth.
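The divide, parcel out, and aggregate pattern described above can be sketched at toy scale in a few lines of Python. This is an illustration only: the function names are hypothetical, a thread pool on a single machine stands in for the nodes of a cluster, and summing squares stands in for a real workload. Real HPC codes typically use frameworks such as MPI for this kind of distribution.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Divide data into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """The independent unit of work performed on one piece of the data."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Parcel chunks out to workers, then aggregate the partial results."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1_000))))
```

The key property is that each chunk can be processed without knowing anything about the others, which is what lets the same structure stretch from four workers to many thousands of nodes.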

What should you learn to work in HPC?

Want to shift your career into this area? There is a lot of opportunity for IT professionals to work in and around HPC.

As mentioned, there’s work for people who build and design the components from which such systems are assembled. Let’s call these folks the HPC designers and engineers. Then there’s work for people who put together and configure those components, and then take responsibility for their management, monitoring, and maintenance. Let’s call them the HPC administrators. And finally, there’s work for people who develop or optimize software to run on HPC systems, ranging from tools and utilities to optimize or parallelize HPC code, to building applications for such systems, to building or working with orchestration and runtime tools to set up and launch HPC jobs or tasks and then shut them down and capture their state and results.

For all such IT professionals, a strong background in computer science or engineering is essential. To excel in this area, IT pros who work in HPC must understand operating systems, automation tools, data storage and retrieval, and networking and communications, both broadly and deeply. In fact, these are the kinds of specializations where a master’s degree actually means something, and where a PhD holds both prestige and value. Furthermore, focused training on HPC topics and technologies pays off handsomely, as does research and hands-on experience in one’s chosen areas of HPC expertise.

Most real and useful HPC learning takes place in the academy or on the job. As Greg Lindahl, founder of PathScale, a developer of highly optimizing compilers for HPC code, put it baldly, “The HPC community has never been big into certifications.” And indeed, five years after Lindahl’s comment first appeared, I can’t find any HPC certifications available.

If you’re past college age, however, not all is lost. The search engine for massive open online courses (MOOCSE.com) offers a variety of HPC-relevant courses, including coverage of linear models and matrix algebra, statistical inference and modeling, annotation and analysis of genomes, and high-performance finite element modeling. You can never have too strong a math background if you want to work in HPC, especially in areas such as applied mathematics, linear algebra, probability and statistics, multivariate calculus, algorithms and complexity, and set theory.

According to their individual areas of specialty (build/design engineers, admins, and programmers), aspiring HPC professionals should seek out information and opportunities to permit them to scratch their particular itch. That emphatically includes the industry in which a would-be HPC pro wants to work. If you want to be a programmer working on HPC applications for DNA analysis, for instance, you need to learn something about DNA, amino acid structures, and genomics.

Would-be HPC admins will benefit from learning one or more of the relevant toolkits for their target platforms and runtime environments. HPCToolkit.org is a great place to start; aspiring and practicing HPC professionals will find a plethora of options there. Then, too, vendors offer a variety of frameworks that target their specific offerings, both for admins and programmers. In addition, aspiring HPC admins would do well to learn one or more scripting languages, such as Bash, Tcl, or Python, to help with task automation and orchestration.

Would-be HPC programmers likewise have lots of options from which to choose. They’ll want to acquire an understanding of algorithms and architectures for parallel processing, to understand the foundations upon which code optimization and parallelization rest. According to Quora, Bash, Perl, and Python scripting are meat and potatoes for HPC programmers. In addition, languages and environments such as R, Java, Scala, MATLAB, and C/C++ come up frequently in the toolboxes of already-practicing HPC professionals.

That said, you shouldn’t have to look too far or too hard to find HPC work, because this area is growing rapidly. Hyperion reports that system sales grew 4.4 percent in 2016, with growth of more than 30 percent in the final quarter of that year. Opportunities for all HPC job roles are excellent and getting better as the economy continues to improve. An IDC study calculated that each dollar invested in HPC returns an average of $514.70 in revenue. With those kinds of returns, companies, organizations, and even governments have every reason to invest in HPC, which also means hiring people to put the systems to work.

The vendor connection

According to Hyperion, the leading HPC vendors in today’s market include Hewlett Packard Enterprise (34.6 percent market share), Dell (17.9 percent), and Lenovo (8.1 percent), with just under 40 percent spread across a large number of second-tier companies. Interested IT pros could do a lot worse than investigating what these leaders have to offer, by way of platforms and technologies and also training and information, to help them investigate and understand their employment options in this field.

Opportunities should be sweet for admins and programmers who learn how to work with one or more of the leading offerings, and the tools and APIs that support them. Don’t forget, in this age of automation and cloud computing, admins need to understand scripting and basic programming, too, to make the most of their capabilities. Programming isn’t just for programmers anymore. These days, it’s something that everybody needs to know and do, if only modestly.

Your HPC career: Lessons for leaders

  • The more computer science training you have, the better. This is one area in which a PhD is meaningful—and certifications are not.
  • Don't overlook the value of domain expertise beyond computer science. If you want to work on applying HPC to weather predictions, you should know something about weather.
  • HPC is more than a programming endeavor. Opportunities abound in system administration and hardware engineering.

Related link:

Overcoming the Barriers to Supercomputing

This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.