HPE Cray Supercomputers QuickSpecs


HPE Rear Door Heat Exchanger


The HPE Rear Door Heat Exchanger (RDHX) is a liquid-cooling accessory that removes heat generated by servers and IT equipment mounted in an HPE G2 Enterprise Series rack. The RDHX is mounted at the rear of the 42U or 48U rack. With an RDHX, no heat load is added to the datacenter because the heat exhausted from servers and IT equipment is transferred to the liquid circulating inside the RDHX.


RDHX models are available for different heat loads and rack sizes: 42U and 48U heights, 600mm and 800mm widths, with a maximum cooling capacity of 60kW.
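As a rough illustration of the heat transfer described above, the water flow needed to carry a given heat load follows from the heat balance Q = m_dot * c_p * delta_T. The sketch below is not an HPE sizing tool: the water properties are approximate textbook values, the function name is hypothetical, and the 10 K temperature rise is an assumed figure that does not come from this document.

```python
# Rough heat-balance sketch: Q = m_dot * c_p * delta_T.
# Water properties are approximate textbook values, not HPE specifications.
WATER_CP_J_PER_KG_K = 4186.0   # approximate specific heat of water
WATER_KG_PER_LITRE = 1.0       # approximate density of water


def required_flow_lpm(heat_kw: float, delta_t_k: float) -> float:
    """Water flow (litres per minute) needed to absorb heat_kw at a given temperature rise."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_s = heat_kw * 1000.0 / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_KG_PER_LITRE * 60.0


if __name__ == "__main__":
    # 60 kW is the maximum capacity quoted above; the 10 K rise is hypothetical.
    print(f"{required_flow_lpm(60.0, 10.0):.1f} L/min")
```

Halving the allowed temperature rise doubles the required flow, so actual sizing depends on facility water conditions that should be confirmed with an HPE representative.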

  • Overview

    One standard configuration rack supports the following:

    Compute

    • – One HPE G2 Advanced Series 42U 600mmx1200mm rack
    • – 16 Apollo n2600 Gen10 Plus 2U chassis
    • – 64 HPE ProLiant XL225n Gen10 Plus servers that support 2nd generation AMD EPYC 7002 and 3rd generation AMD EPYC 7003 Series processors
    • – 4 64-port Slingshot TOR switches
    • – 4 HPE Aruba Networking Ethernet switches
    • – 2 vertical PDUs in the rear (1 per side) - 2 3-phase whips

    GPU Compute

    • – One HPE G2 Advanced Series 42U 600mmx1200mm rack
    • – 6 Apollo 6500 6U chassis
    • – 12 HPE ProLiant XL645d Gen10 Plus servers
      • – Support for 2nd generation AMD EPYC 7002 and 3rd generation AMD EPYC 7003 Series processors
      • – Support for 4 x NVIDIA A100 GPUs per node
    • – 1 64-port Slingshot TOR switch
    • – 1 HPE Aruba Networking Ethernet switch
    • – 2 vertical PDUs in the rear (1 per side) - 2 3-phase whips
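The two standard configurations above can be tallied with a small sketch. It uses only the chassis counts, chassis heights, and per-chassis node counts listed in this document; the `RackConfig` class and its field names are hypothetical, and the rack units consumed by switches and PDUs are not stated in the source, so they are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    """Hypothetical container for the per-rack figures listed in the Overview."""
    name: str
    chassis: int
    chassis_height_u: int
    nodes_per_chassis: int
    gpus_per_node: int = 0

    @property
    def nodes(self) -> int:
        return self.chassis * self.nodes_per_chassis

    @property
    def gpus(self) -> int:
        return self.nodes * self.gpus_per_node

    @property
    def chassis_rack_units(self) -> int:
        # Rack units used by chassis only; switches are not counted.
        return self.chassis * self.chassis_height_u


COMPUTE = RackConfig("Compute", chassis=16, chassis_height_u=2, nodes_per_chassis=4)
GPU_COMPUTE = RackConfig("GPU Compute", chassis=6, chassis_height_u=6,
                         nodes_per_chassis=2, gpus_per_node=4)

if __name__ == "__main__":
    for cfg in (COMPUTE, GPU_COMPUTE):
        print(f"{cfg.name}: {cfg.nodes} nodes, {cfg.gpus} GPUs, "
              f"{cfg.chassis_rack_units}U of chassis in a 42U rack")
```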

    Other cabinet options

    • – 48U 600mmx1200mm rack, for additional space and increased system density
    • – 800mmx1200mm wider racks in both 42U and 48U. The wide racks provide additional room for cabling, and support a maximum of 4 vertical PDUs, as well as the redundant power rail configuration
    • – Rear door heat exchanger (RDHX) is an option that provides liquid cooling to the rack and transfers heat directly into facility water, without the need for hot aisle cooling

    HPE Cray Supercomputer Compute Node Summary

    The compute platforms in the HPE Cray Supercomputer are the HPE Apollo 2000 Gen10 Plus and HPE Apollo 6500 Gen10 Plus, powered by AMD 2nd generation EPYC 7002 and 3rd generation EPYC 7003 series processors. With dual sockets and eight DDR4 memory channels at 3200 MT/s, these platforms offer a density-optimized architecture to support a variety of HPC/AI workloads. Each compute node is configured with Slingshot. A variety of processor and memory configurations are supported. Consult the Apollo 2000 and Apollo 6500 QuickSpecs for details.


    Features of the compute node are as follows:

    • – 4 dual-socket nodes per 2U chassis
    • – 8 DIMMs per socket
    • – 1 or 2 Slingshot injection ports per node. 200 Gb/s via dual-injection
    • – Diskless operation is standard. Contact your local sales representative for additional options

    Features of the GPU compute node are as follows:

    • – 2 dual-socket nodes per 6U Apollo 6500 chassis
    • – 8 DIMMs per socket
    • – 2 or 4 Slingshot injection ports per node. 200 Gb/s via
    • – 4 x NVIDIA A100 per node
    • – Diskless operation is standard. Contact your local sales representative for additional options
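The memory figures in the node feature lists can be cross-checked with a short calculation. This is a sketch under stated assumptions: the eight memory channels are taken to be per socket (consistent with 8 DIMMs per socket), each DDR4 channel is assumed to move 8 bytes of data per transfer, and the result is a theoretical peak rather than a measured bandwidth. All constant and function names are hypothetical.

```python
SOCKETS_PER_NODE = 2
DIMMS_PER_SOCKET = 8
CHANNELS_PER_SOCKET = 8      # assumption: the eight channels are per socket
TRANSFER_RATE_MT_S = 3200    # DDR4-3200
BYTES_PER_TRANSFER = 8       # 64-bit data path per DDR4 channel, ECC bits excluded


def dimm_slots_per_node() -> int:
    return SOCKETS_PER_NODE * DIMMS_PER_SOCKET


def max_memory_gb(dimm_gb: int = 64) -> int:
    """Node capacity when every slot holds a DIMM of dimm_gb gigabytes."""
    return dimm_slots_per_node() * dimm_gb


def peak_bandwidth_gb_s_per_socket() -> float:
    """Theoretical peak in GB/s: channels * MT/s * bytes per transfer."""
    return CHANNELS_PER_SOCKET * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000.0


def peak_bandwidth_gb_s_per_node() -> float:
    return SOCKETS_PER_NODE * peak_bandwidth_gb_s_per_socket()


if __name__ == "__main__":
    print(dimm_slots_per_node(), "DIMM slots per node")
    print(max_memory_gb(64), "GB with 64 GB DIMMs")
    print(peak_bandwidth_gb_s_per_node(), "GB/s theoretical peak per node")
```

With 64 GB DIMMs in all 16 slots, the capacity works out to 1024 GB per node.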
  • Standard Features

    Additional Node Options

    HPE Cray Supercomputer is designed to be flexible, and additional server options are supported. HPE DL325 (1U with a single-socket AMD 2nd generation EPYC 7002 or 3rd generation EPYC 7003 series processor) and HPE DL385 (2U with dual-socket AMD 2nd generation EPYC 7002 or 3rd generation EPYC 7003 series processors) are supported in non-compute roles, such as management, login, and gateway nodes. Consult the DL325 and DL385 QuickSpecs and your local sales representative for additional details and configuration options.


    Non-compute platforms

    • – HPE DL325
      • – Management servers hosting the HPE Cray software stack
      • – Various user access nodes and admin nodes
    • – HPE DL385
      • – User access and admin nodes
      • – Interactive visualization nodes with GPU support

    Cabinet PDU and Cooling Options

    In addition to the standard 42U cabinet based on HPE G2 Advanced Series racks, wider 800mm versions of the racks are also supported. 48U cabinets are supported in both 600mm and 800mm widths. When configured as support cabinets for the Cray EX Supercomputer, optional trellis kits are available to connect all cabinets together to form a single system.


    PDU options are all 3-phase and include 208V 60 amps, 415V 60 amps, and 480V 60 amps, configured in two vertically mounted PDUs per cabinet. Only switched PDUs are supported.
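To compare the PDU options listed above, the apparent power of a balanced 3-phase feed can be estimated as sqrt(3) * V * I. The sketch below assumes the listed voltages are line-to-line and the 60 A figure is the circuit rating. The optional `derate` factor (for example 0.8 for continuous loads under some electrical codes) is an assumption, not something this document specifies, and the function name is hypothetical.

```python
import math

# (line-to-line volts, amps) pairs taken from the PDU options listed above.
PDU_OPTIONS = [(208, 60), (415, 60), (480, 60)]


def three_phase_kva(volts_ll: float, amps: float, derate: float = 1.0) -> float:
    """Apparent power in kVA of a balanced 3-phase circuit, optionally derated."""
    if not 0 < derate <= 1:
        raise ValueError("derate must be in (0, 1]")
    return math.sqrt(3) * volts_ll * amps * derate / 1000.0


if __name__ == "__main__":
    for volts, amps in PDU_OPTIONS:
        print(f"{volts} V / {amps} A: {three_phase_kva(volts, amps):.1f} kVA per PDU")
```

Usable power per cabinet also depends on whether the two PDUs are configured redundantly, which should be confirmed for the specific configuration.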


    A liquid cooling option for the HPE Cray Supercomputer is available using the Rear Door Heat Exchanger and a Cooling Distribution Unit (CDU). This option brings water cooling to each cabinet: heat from the compute servers is captured by water in the door, achieving a room-neutral design point in the datacenter. Consult your local sales representative for configuration details.


    HPE Slingshot

    The HPE Slingshot Top-of-rack (TOR) Switch is a 64-port, 200 Gb/s switch that provides the high-speed network interface for the compute nodes. Key Slingshot features include the dragonfly topology for all-to-all connectivity with 3 hops maximum, advanced congestion management, and traffic class separation, all implemented on Ethernet. Dragonfly provides a lower-cost and highly scalable alternative to traditional Fat Tree topologies. In a dragonfly topology, every switch is connected to every other switch. Dual-rail support provides node connectivity at 200 Gb/s.
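The 3-hop maximum can be illustrated with a toy model. The sketch below builds a textbook dragonfly (a full mesh of switches inside each group, plus exactly one global link between every pair of groups) and measures the largest switch-to-switch hop count with a breadth-first search. It is an assumption-laden illustration only: the actual Slingshot wiring, group sizes, and routing are not described in this document, and the function names and the parameters `a` and `h` are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_dragonfly(a: int, h: int) -> dict:
    """Adjacency sets for a toy dragonfly: a switches per group, h global links per switch."""
    g = a * h + 1  # group count that gives every pair of groups exactly one global link
    adj = {(grp, sw): set() for grp in range(g) for sw in range(a)}
    for grp in range(g):
        # Local links: full mesh inside the group.
        for s, t in combinations(range(a), 2):
            adj[(grp, s)].add((grp, t))
            adj[(grp, t)].add((grp, s))
        # Global links: port p of this group reaches group (grp + p + 1) mod g.
        for port in range(a * h):
            peer_grp = (grp + port + 1) % g
            peer_port = g - 2 - port  # the peer's port that points back at this group
            u = (grp, port // h)
            v = (peer_grp, peer_port // h)
            adj[u].add(v)
            adj[v].add(u)
    return adj


def diameter(adj: dict) -> int:
    """Largest shortest-path length (in switch-to-switch hops) over all switch pairs."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    topo = build_dragonfly(a=4, h=2)
    print(len(topo), "switches, diameter", diameter(topo), "hops")
```

In this model the worst-case route is one local hop, one global hop, and one more local hop, which is where the three-hop bound comes from.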


    Each TOR switch is supported by dual redundant power supplies and redundant fans for cooling. Depending on the network class, there are 2 or 4 TOR switches per cabinet. Cables within the cabinet and to adjacent cabinets are copper, and optical cables are used for global connections between switches. The dragonfly topology lowers cost by leveraging copper cables, reducing the use of more expensive optical connections by up to 50%.


    The network interface card (NIC) is a single-port Mellanox ConnectX-5 PCIe card, which offers 100 Gb/s connectivity into the Slingshot network. For 200 Gb/s dual-injection, two NICs are installed in each node.
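The injection bandwidth and switch port usage follow directly from the figures above: 100 Gb/s per single-port NIC and 64 ports per TOR switch. The sketch below is illustrative only; the function names are hypothetical, and how the ports left over after node connections are used (for example for switch-to-switch links) is not stated in this document.

```python
NIC_GBPS = 100      # single-port NIC bandwidth stated above
SWITCH_PORTS = 64   # ports per Slingshot TOR switch


def injection_gbps(nics_per_node: int) -> int:
    """Aggregate injection bandwidth per node in Gb/s."""
    return nics_per_node * NIC_GBPS


def switch_port_budget(nodes: int, nics_per_node: int, switches: int) -> tuple:
    """Return (node-facing ports, remaining ports) for one cabinet."""
    total = switches * SWITCH_PORTS
    edge = nodes * nics_per_node
    if edge > total:
        raise ValueError("more NIC ports than switch ports in the cabinet")
    return edge, total - edge


if __name__ == "__main__":
    print(injection_gbps(2), "Gb/s with dual injection")
    # Standard compute rack: 64 nodes, dual injection, 4 TOR switches.
    print(switch_port_budget(nodes=64, nics_per_node=2, switches=4))
    # Standard GPU rack: 12 nodes, quad injection, 1 TOR switch.
    print(switch_port_budget(nodes=12, nics_per_node=4, switches=1))
```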


    Management Network

    HPE Cray Supercomputer uses an Ethernet-based management network for telemetry monitoring, logging, and out-of-band control. Each compute node connects to the management network via 1 Gbps Ethernet to an HPE Aruba Networking CX 6300M TOR switch. The 6300M TOR switches are aggregated by HPE Aruba Networking CX 8325 and 6410 switches in very large configurations, or by HPE Aruba Networking CX 8325 switches in smaller configurations. Consult your local sales representative for details on management network topologies.

Software Stack

HPE Cray supercomputers are complete solutions in which software and hardware are tightly integrated and performance-tuned to offer the best system performance while bringing a new standard in flexibility, manageability, and resiliency to supercomputing.


The Cray supercomputer software stack addresses the needs of system administrators, developers, and end users.


Administrative Software

HPE Cray System Management is a built-for-scale system management solution that offers administrators the functionality they need to keep the system healthy, fully utilized, and able to accommodate a wide range of workload requirements through an as-a-service (aaS) experience. The software is built to manage systems that can scale to Exascale deployments, featuring:

  • Comprehensive monitoring and management of all aspects of the system: CPU/GPU, network (integrated Cray Slingshot Fabric Manager), storage as well as power management and monitoring combined with provisioning for operational efficiency.
  • Multi-tenancy and partitioning, with batch or container orchestration, enable customers to run a variety of HPC/AI/HPDA workloads in the way that makes the best use of their system, without logistical constraints.
  • REST APIs & standard protocols enable full interoperability with existing monitoring, management, and automation toolsets.

Developer Software

HPE Cray Programming Environment is a fully integrated software development suite that offers programmers a comprehensive set of tools for developing, porting, debugging, and tuning their applications, so they can shorten development time and improve application performance.


The programming environment is designed to make porting existing applications easier, with minimal recoding and minimal changes to existing programming models, to simplify the transition to new hardware architectures and configurations.


End User Software

HPE Cray OS is based on SUSE Linux Enterprise Server (SLES) with enhancements. The enhancements provide customers with capabilities specific to supercomputing and high-performance computing fully supported by HPE Services. These modifications don't alter the ability to run standard Linux applications, but rather enhance it for performance, scale, and reliability. We integrate and test these materials together and package releases.


While HPE Cray System Management and HPE Cray Operating System are designed to support HPE Cray Supercomputer systems with HPE Slingshot, HPE Cray Programming Environment is a self-standing product which also supports HPE Apollo servers, and the HPE Cray EX systems.


The whole software stack is supported by HPE Services.

Features

HPE Cray Supercomputer

Operating system

  • HPE Cray Operating System
  • RHEL (with HPCM)

System Management and Fabric software

  • HPE Cray System Management
  • HPE Cray Slingshot Network Manager
  • HPCM

Workload Management and Orchestration

  • Altair® PBS Professional
  • Slurm Workload Manager
  • Containers: Singularity & Docker

Software and Application Development Tools:

HPE Cray Programming Environment

  • Development
  • – Communication Libraries: HPE Cray MPI, SHMEM
  • – Scientific Libraries: LAPACK, ScaLAPACK, BLAS, libsci, IRT, FFTW 3.0
  • – I/O Libraries: NETCDF, HDF5
    • o Compiling environment
    • o 3rd party programming environments:
      • • AMD Compilers
      • • PGI Compilers
      • • GNU Compiler Collection

  • Performance analysis tools
  • – Tools for performance analysis and optimization – versions for both experienced and novice users
  • – Code parallelization assistant for application optimization via code restructuring
  • – Visualization tool for quick assessment of severity of issues
  • – Debuggers: GDB for HPC, Valgrind for HPC, tools for stack trace analysis & abnormal termination processing
  • – 3rd party debugger support: Arm® Forge, TotalView™ by Perforce

DL/AI Tools:

  • Deep learning plugin

HPE Server Hardware Installation

Pre-installation Activities and Solution Implementation

Hewlett Packard Enterprise and the Customer determine all installation activities that must be completed prior to System installation. The Customer agrees to complete all of the pre-installation activities required. This includes HPE Site Engineering work onsite as may be described in the HPE Cray Site Preparation Guide.


Solution Implementation: upon completion of the pre-installation activities, Hewlett Packard Enterprise will provide the software components as set forth in the applicable system purchase agreement or bill of materials. Any additional software installation or configuration will need to be documented separately and will incur additional charges. This configuration service does not include any customer specific configuration, customization or testing unless otherwise specified.


System Testing and Performance Validation

The Hewlett Packard Enterprise installation personnel will conduct tests to verify the health and performance of the System. The tests are not intended to demonstrate application performance; the tests verify that the system infrastructure is working properly and delivering the intended performance level. Hewlett Packard Enterprise manufacturing tests and diagnostics will be used by installation personnel while onsite to validate that all hardware is functional, meeting the same performance and functional specifications as tested at the factory. Any additional testing that is required should be specified in a separate mutually agreed writing.

Hardware Maintenance Service Features

The HPE Cray EX system benefits from the highest level of Hewlett Packard Enterprise support for high-performance computing, ‘HPE Cray Advanced Support’, which may include an HPE presence onsite.

  • – This service level offers customers access to the HPE Cray customer portal. Case logging is available 24x7 by telephone or via this customer portal.
  • – There is a choice of two maintenance coverage windows: 9x5 or 24x7. Onsite response time options are NBD, 4 hours, 2 hours, or 1 hour. When an issue is reported, a Hewlett Packard Enterprise technical representative will arrive onsite within the response time window to identify and begin resolving the issue.
  • – Hewlett Packard Enterprise provides critical spare parts to reduce any downtime associated with failures or maintenance. Critical spare parts may be located either onsite or at professionally managed regional spare part depots that provide rapid transportation of spare parts to customer sites. Customers may elect to supplement the HPE-owned spare parts inventory by purchasing additional spare parts.
  • – HPE Cray EX customers have access to the support snapshot analyzer that collects, analyzes, and reports support information for HPE air cooled and liquid cooled HPC, and Cray ClusterStor systems.
  • – HPE Remote Support via MyRoom, providing remote access and support to customers while still allowing customers full session control. Capabilities range from a customer having full control of a remote screen-sharing session to the HPE Services team having the ability to log in securely as needed to resolve issues and perform administrative functions.

HPE Cray EX customers are assigned a customer care manager who is familiar with their environment and able to assist with issues and escalations. This person provides quarterly reports and reviews of open issues and upcoming activities, such as larger maintenance upgrades, updates, system expansions, etc.

Software Support Features

Support for HPE developed software includes the following features:

  • – Access to self-help resources on HPE CrayPort
    • Ability to open and submit a support case
    • Access to HPE knowledge articles
    • Ability to download:
    • o Software releases and updates, including BIOS and FW
    • o Software Patches
  • – Notification of key operational items through the field notice (FN) process
  • – Assistance from HPE Services to resolve issues within the service level coverage window for the hardware contract; assistance includes:
    • Triage to investigate/analyze issues
    • Confirmation whether the issue is hardware or software

    • Confirmation whether the issue is related to an HPE-supported product or a third-party-supported product. If the issue is with an HPE-supported product, HPE Services may provide configuration recommendations, possible workarounds, and directions to install a later version or patch, and/or submit a bug to get the issue fixed. For HPE products, Hewlett Packard Enterprise reserves the right to determine whether and how an issue will be resolved.


Customized Software

Support is provided for products sold by Hewlett Packard Enterprise with a valid HPE Services support agreement. Support for third-party products without a related HPE Services support agreement requires the user to contact the third-party vendor for assistance. If customers modify HPE-delivered software without authorization from Hewlett Packard Enterprise, any issues resulting from the unapproved modifications fall outside of the standard support service agreement, and Hewlett Packard Enterprise is not responsible for any resulting defects, damage, failure, performance degradation, or issues of any kind, or correction or remedy of same. HPE may require the user to remove custom modifications to confirm that a modification is not the source of the issue. Customers may request that HPE Services assist in making modifications to a product. HPE Services will do its best to implement the request via a billable statement of work (SOW).

API and CLI Support

Support is available for HPE published APIs. Unpublished APIs are not eligible for support. Documentation outlining published API best practices and limitations is available at pubs.cray.com, accessible either directly or through the HPE CrayPort portal. Hewlett Packard Enterprise will assist in determining if the API is working correctly, if the documentation is incorrect, or if the issue is an enhancement request.


HPE Application Programming Interface (API) and Command Line Interface (CLI) features allow the flexibility to configure and customize your system to optimize operations in your environment. These tools have the ability to significantly alter your system operations. If not properly tested and implemented in a controlled manner, they can introduce significant problems in your environment. When using these features or otherwise modifying or altering APIs, customers take on the responsibility to resolve or mitigate any issues they have introduced into the system.


HPE Services is not available to provide support to resolve issues that arise from the use of CLIs or APIs in a form not identical to those published by Hewlett Packard Enterprise.


Customer Training

Training courses are taught by Hewlett Packard Enterprise system experts and combine lectures with hands-on labs to enhance understanding and retention. The courses cover all aspects of using and maintaining an HPE system, from system administration to application development, porting, and optimization. A full listing of the standard HPE Cray training courses, along with their descriptions, can be found at https://education.hpe.com/ww/en/training/portfolio/hpc.html.


Subject to separate ordering arrangement, classes are scheduled on regular cycles at the Hewlett Packard Enterprise training facilities and can be scheduled for onsite delivery. Hewlett Packard Enterprise also offers customized training courses and can provide quotes for these courses based on the customer’s needs.


HPE Motivair Liquid Cooled Doors

HPE Motivair Liquid Cooled Doors can meet the critical cooling requirements of HPE Cray Supercomputer servers in HPE racks in the modern datacenter.


HPE Motivair Liquid Cooled Doors are a liquid-to-air heat exchanger cooling system that is mounted directly to the rear panel of HPE racks.


For additional information, see the HPE Motivair Liquid Cooled Doors QuickSpecs.

  • Service and Support

    HPE Services - Service and Support

    Get the most from your HPE Products. Get the expertise you need at every step of your IT journey with HPE Services. We help you lower your risks and overall costs using automation and methodologies that have been tested and refined by HPE experts through thousands of deployments globally. HPE Services Advisory Services focus on your business outcomes and goals, partnering with you to design your transformation and build a roadmap tuned to your unique challenges. Our Professional and Operational Services can be leveraged to speed up time-to-production, boost performance, and accelerate your business. HPE Services specializes in flawless and on-time implementation, on-budget execution, and creative configurations that get the most out of software and hardware alike.


    Consume IT on your terms

    HPE GreenLake brings the cloud experience directly to your apps and data wherever they are—the edge, colocations, or your data center. It delivers cloud services for on-premises IT infrastructure specifically tailored to your most demanding workloads. With a pay-per-use, scalable, point-and-click self-service experience that is managed for you, HPE GreenLake accelerates digital transformation in a distributed, edge-to-cloud world.

    • – Get faster time to market
    • – Save on TCO, align costs to business
    • – Scale quickly, meet unpredictable demand
    • – Simplify IT operations across your data centers and clouds

    Managed services to run your IT operations

    HPE GreenLake Management Services provides services that monitor, operate, and optimize your infrastructure and applications, delivered consistently and globally to give you unified control and let you focus on innovation.


    Recommended Services

    HPE Tech Care Service

    HPE Tech Care Service is the new operational service experience for HPE products. Tech Care goes beyond traditional support by providing access to product-specific experts, an AI-driven digital experience, and general technical guidance to not only reduce risk but constantly search for ways to do things better. HPE Tech Care Service has been reimagined from the ground up to support a customer-centric, AI-driven, and digitally enabled customer experience to move your business forward. HPE Tech Care Service is available in three response levels: Basic, which provides 9x5 business hour availability and a 2-hour response time; Essential, which provides a 15-minute response time 24x7 for most enterprise-level customers; and Critical, which includes a 6-hour repair commitment where available and outage management response for severity 1 incidents.

    https://www.hpe.com/services/techcare


    HPE Complete Care Service

    HPE Complete Care Service is a modular, edge-to-cloud IT environment service that provides a holistic approach to optimizing your entire IT environment and achieving agreed-upon IT outcomes and business goals through a personalized and customer-centric experience, all delivered by an assigned team of HPE Services experts. HPE Complete Care Service provides:

    • – A complete coverage approach -- edge to cloud
    • – An assigned HPE team
    • – Modular and fully personalized engagement
    • – Enhanced Incident Management experience with priority access
    • – Digitally enabled and AI driven customer experience

    https://www.hpe.com/services/completecare

  • Technical Specifications

HPE Cray Supercomputer (cabinet)

Dimensions

  • 78.99 x 23.50 x 50.65 in
  • 2006.39 x 597.00 x 1286.43 mm

Weight (per cabinet)

  • 1,770 lb (810 kg)
  • 1 HPE G2 Advanced Series 42U Rack
  • 16 Apollo 2000 chassis each with 4 nodes, 64 nodes total
  • 4 Cray Slingshot 64-port TOR switches
  • 4 HPE Aruba Networking CX 6300M Ethernet switches

Compute nodes

  • HPE ProLiant XL225n Gen10 Plus nodes with HPE Apollo n2600 Gen10 Plus Chassis

Cooling

  • Front-to-back air cooling at up to 35°C data center inlet temperature
  • Liquid cooling via Rear Door Heat Exchanger optional

Compute Node (HPE ProLiant XL225n Gen10 Plus)

Form factor

  • Half-width 1U blade. 4 XL225n for each Apollo n2600 Gen10 Plus chassis

Processors

  • AMD 2nd Gen EPYC™ 7002 and 3rd Gen EPYC™ 7003 series

Compute blade

  • Four 2-socket CPU blades per n2600 chassis

Memory/blade

  • Up to 1024 GB per node, 16 DIMM slots per node

Memory technology

  • 16, 32, and 64 GB DDR4 3200 MT/s ECC Registered DIMMs

Local storage

  • None

Fabric options

  • Slingshot (single or dual-injection)

GPU Compute Node (HPE ProLiant XL645d Gen10 Plus)

Form factor

  • Half-width blade. 2 XL645d for each Apollo 6500 Gen10 Plus chassis

Processors

  • AMD 2nd Gen EPYC™ 7002 and 3rd Gen EPYC™ 7003 series

GPU

  • 4 x NVIDIA A100 GPU per node

Compute blade

  • Two 2-socket CPU blades per Apollo 6500 chassis

Memory/blade

  • Up to 1024 GB per node, 16 DIMM slots per node

Memory technology

  • 16, 32, and 64 GB DDR4 3200 MT/s ECC Registered DIMMs

Local storage

  • See the Apollo 6500 QuickSpecs

Fabric options

  • Slingshot (dual or quad-injection)

System Software

Operating systems

  • HPE Cray Linux Environment
  • RHEL (with HPCM)

Fabric software

  • HPE Cray Slingshot Fabric Manager

HPE management

  • HPE Cray System Management
  • HPCM

Workload Management and Orchestration

  • Altair® PBS Professional
  • Slurm Workload Manager
  • Containers: Singularity & Docker
  • Kubernetes

Software Development Tools (Programming languages, debuggers, libraries)

  • HPE Cray Programming environment
  • Intel® Parallel Studio XE (w/Intel® MPI)
  • PGI® Professional Edition
  • GNU Compiler Collection
  • Rogue Wave Software® TotalView®
  • AMD AOCC
  • Vampir

  • Summary of Changes

Version History

Date         Action   Description of Change
03-Nov-2025  Changed  Rebranding applied to document
19-Feb-2024  Changed  Networking product names were updated.
15-Nov-2021  Changed  Service and Support section was updated.
02-Aug-2021  Changed  Overview section was updated
07-Jun-2021  Changed  Overview section was updated
04-May-2021  Changed  Overview, Standard Features and Technical Specifications sections were updated
03-Aug-2020  New      New QuickSpecs
