Working in high performance computing? These are the stats you should know
Most business needs are adequately served by ordinary computer systems, whether a solo user’s desktop system, a departmental server, or an enterprise data center. However, some computing use cases demand more than “ordinary” or “typical.”
At an individual level, these push-the-envelope applications traditionally are creative or multimedia tasks or gaming. However, as an enterprise category, high-performance computing (HPC) refers to the practice of aggregating computing power to solve large problems in science, engineering, and business.
What those large problems are, and which of them deserve HPC attention (and the cost of solving them), is always debatable because computing power keeps increasing. Specialized HPC hardware always serves the computing tasks that require the most horsepower: the top few percent of what the systems of the day are capable of delivering. In car terms, we’d label these “the fastest on the road,” whatever the current state of auto technology. “Going like 60” once represented the notion of traveling at almost-unimaginable speed, while now, it’s no big deal.
So if you want to come up to speed on the current state of HPC, here are the essentials.
HPC can be compartmentalized by the hardware that defined each era
There are all sorts of HPC dividing lines, but start with this history:
- 1940s–1960s: the first supercomputers
- 1975–1990: the Cray era
- 1990–2010: the cluster era
- 2000–present: the GPU and hybrid era
Alternatively, you can categorize the epochs by “a fundamental shift in design, new technologies, and the economics of the day,” per an InsideHPC summary.
- Before 1980: Supercomputers were rated by how many floating-point operations per second (FLOPS) they could deliver.
- Until 2005 or thereabouts: The multi- and many-core explosion happened in this period: “By November 2005, the TOP500 computer list was dominated (70 percent) by clusters of x86 servers.”
- Today: We're entering the exascale epoch. Various predictions expect exascale computing to arrive in the early 2020s (unless quantum computing gets there first).
How fast is fast? It depends how you measure
It’s really tempting to compare the processing power of an iPhone with the computer that took us to the moon, but ultimately, those are false statistics. Sure, you can find numbers: “The Apollo guidance computer? It operated at just over 1 MHz, which means each of the two processing cores of the iPhone runs 1,270 times faster than the guidance computer’s single processor.”
But historical “fastest computing” comparisons are misleading because HPC is about system performance, not CPUs. HPC isn’t measured by FLOPS alone anymore, or by the number of registers a programmer can access using assembly language. CPU speed and efficiency are only one element in system speed.
Storage is a factor in HPC systems, in part because computing power depends on how much data can be ingested, the number of transactions that can be analyzed, and the plausible iterations required to accurately model a scenario. A superfast computational engine with a teeny data store cannot be effective at managing powerful scientific sensor networks, identifying fraudulent credit card activity, or giving patients their lab results before they leave the doctor’s office.
Storage also has to be judged by its distance from the processor, not just by the size of the bucket that holds the data. “The distance, measured in processor clocks, to storage devices increases as their capacity increases,” says Kevin R. Wadleigh, author of “Software Optimization for High Performance Computing: Creating Faster Applications.” The processor’s registers are closest, followed by caches, the main memory system, and data storage. “As storage devices become larger, they typically are farther away from the processor, and the path to them becomes narrower and sometimes more complicated,” he says.
For the fastest of the fast, look at the TOP500 list
The primary performance reference is the TOP500 list, founded upon the Linpack Benchmark report, which measures commercial systems using Linpack Fortran, Linpack n, and Linpack Highly Parallel Computing benchmarks. They’re all supremely fast; all 500 systems deliver a petaflop or more on the High Performance Linpack benchmark. As of June 2019, the fastest computer is the Summit IBM Power System AC922 (IBM POWER9 22C 3.07 GHz, NVIDIA Volta GV100, dual-rail Mellanox EDR InfiniBand) with 2,414,592 cores and a theoretical peak (Rpeak) of 200,794.9 teraflops.
Ten years ago, in June 2009, the fastest was IBM’s Roadrunner (BladeCenter QS22/LS21 cluster, PowerXCell 8i 3.2 GHz, Opteron DC 1.8 GHz, Voltaire InfiniBand) with 129,600 cores and an Rpeak of 1,456.7 teraflops. And 20 years ago, back in June 1999, the fastest commercially available system was ASCI Red from Intel (used at Sandia National Laboratories), with a then-astonishing 9,472 cores and an Rpeak rating of 3,154.0 gigaflops, or about 3.15 teraflops.
That, at least, makes a pretty graph to show off how much faster supercomputers have become. Theoretically, that is, today’s fastest computer is roughly 63,000 times faster than the top HPC system of 20 years ago.
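To put numbers on that comparison, here is a minimal Python sketch that normalizes the three Rpeak figures to teraflops and computes the ratios between them. It assumes the units the TOP500 lists use for each entry (gigaflops on the 1999 list, teraflops on the 2009 and 2019 lists); the dictionary and function names are illustrative and do not come from any TOP500 tooling.

```python
# Theoretical peak performance (Rpeak) per system, normalized to teraflops.
# Assumption: the June 1999 list reports ASCI Red in gigaflops, so it is
# divided by 1,000 here; the 2009 and 2019 figures are already in teraflops.
RPEAK_TFLOPS = {
    "ASCI Red (June 1999)": 3_154.0 / 1_000,
    "Roadrunner (June 2009)": 1_456.7,
    "Summit (June 2019)": 200_794.9,
}


def speedup(newer: str, older: str) -> float:
    """Return how many times higher the newer system's Rpeak is."""
    return RPEAK_TFLOPS[newer] / RPEAK_TFLOPS[older]


pairs = [
    ("Roadrunner (June 2009)", "ASCI Red (June 1999)"),
    ("Summit (June 2019)", "Roadrunner (June 2009)"),
    ("Summit (June 2019)", "ASCI Red (June 1999)"),
]

for newer, older in pairs:
    # Print each ratio rounded to the nearest whole multiple.
    print(f"{newer} vs. {older}: about {speedup(newer, older):,.0f}x")
```

Rpeak is a theoretical ceiling, so these ratios give a rough sense of scale rather than a measure of delivered application performance.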
Note, however, that those are all the fastest computers available, which does not represent what you’d find in a typical data center, even one devoted to general HPC needs. It’s no more relevant than the relationship of today’s fastest race car to the 0-to-60 mph acceleration rating for an average family car.
For what’s coming up, it may be useful to compare "12 Early-Stage Quantum Computing Startups to Watch."
Top HPC uses today: Data modeling, AI, and analytics
Realistically, supercomputers are exceptionally expensive, at least in comparison to other business systems. So HPC systems are used primarily for tasks that require massive amounts of efficient computing power, such as AI and heavy-duty data analytics, and where faster and better results have a clear return on investment.
The use case is a moving target depending on what “hard problems” are at the time. In 2006, for instance, the technical and enterprise customers adopting HPC primarily were scientists and engineers who needed to do number crunching. “Examples include climate prediction, protein folding simulations, oil and gas discovery, defense and aerospace work, automotive design, financial forecasting, etc. The latter category encompasses the corporate data center that stores customer records, inventory management, and employee details.”
Number and data crunching are no longer a niche requirement. There’s plenty of data to analyze. “Ninety percent of the data in the world was generated in the past two years,” says Becky Trevino, senior director of product marketing at Rackspace.
And that’s without taking into account the huge amount of IoT data that’s being collected and needs to be analyzed.
As a result, HPC use is on the upswing. In 2017, a Hyperion Research forecast suggested that the worldwide HPC server-based AI market would expand at a 29.5 percent compound annual growth rate to reach more than $1.26 billion in 2021, up more than threefold from $346 million in 2016. “We define the HPC AI market as a subset of the high-performance data analysis market that includes machine learning, deep learning, and other AI workloads running on HPC servers,” the researchers explained. By 2019, Hyperion Research had increased its five-year HPC market forecast to $44 billion in 2023.
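As a quick sanity check on the AI-market forecast quoted above, the sketch below applies the standard compound-growth formula to the 2016 and 2021 figures. The function names are illustrative, and the calculation is generic arithmetic, not Hyperion Research’s own methodology.

```python
def compound_growth(start: float, rate: float, years: int) -> float:
    """Project a value forward at a fixed annual growth rate."""
    return start * (1 + rate) ** years


def cagr(start: float, end: float, years: int) -> float:
    """Return the compound annual growth rate implied by two values."""
    return (end / start) ** (1 / years) - 1


start_2016 = 346.0   # HPC server-based AI market in 2016, in $ millions
rate = 0.295         # forecast compound annual growth rate
years = 5            # 2016 through 2021

projected_2021 = compound_growth(start_2016, rate, years)
print(f"Projected 2021 market: ${projected_2021:,.0f} million")
print(f"Growth multiple: {projected_2021 / start_2016:.2f}x")

# Working backward from the quoted endpoints gives the implied rate.
print(f"Implied CAGR: {cagr(start_2016, 1_260.0, years):.1%}")
```

The projection lands close to the $1.26 billion cited, which is a little more than three and a half times the 2016 figure, so the quoted growth rate and endpoints agree with each other.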
Per TOP500, in 1993, HPC was used primarily for research, aerospace, and geophysics, representing about a third of application demand. By 2018, top use cases for HPC were research, aerospace, and energy—though, alas, the data classifications were far less granular.
It’s more useful to look at the Hyperion Research market results from 2016, in which the top uses (or at least customer categories) were in the business areas of government ($2,059 million), university/academia ($1,934 million), computer-aided engineering ($1,251 million), and defense ($1,125 million).
It’s used primarily where ROI can easily be measured
Anyone who has worked to optimize software knows how much effort it takes to speed up an application. In most business environments, it’s “nice” to take an analysis that needs 24 hours and get the results in a single hour, but if you can afford to wait a day, there’s no particular need to bother with HPC, particularly since making effective use of HPC takes quite a bit of effort, both in learning how to use it and in developing software. That all costs money. Lots of money.
HPC is worth pursuing when the technology enables an organization to tackle bigger problems in the same amount of time or solve the same-sized problems in less time.
When that is the case, the payoff can be large. As IDC reported in 2015, organizations and governments investing in HPC have seen high returns. Among them:
- From the latest data, $551 on average in revenue per dollar of HPC invested
- $52 on average of profits (or cost savings) per dollar of HPC invested
Looking for a remunerative career? Look at HPC
All those corporate HPC investments mean that experts are required to manage the hardware, write the data and analytics software, and otherwise know what they are doing in regard to superfast applications. Machine learning and data science engineers are a small specialty—and, according to LinkedIn’s data, among the most in demand.
Some HPC-related positions have an algorithmic focus, such as AI and IoT, or serve a vertical market. Sometimes that’s defined by “who has the money to spend.” For example, per an IDC 2017 presentation at the HPC User Forum, governments are looking to address critical issues with HPC, including global warming, alternative energy, financial disaster modeling, healthcare, and homeland security.
Ultimately, HPC advancements in any era are the forerunners to tomorrow’s ordinary technology. The HPC of today will be on your desktop—or in your pocket—in the next 10 to 15 years.
This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.