New infrastructure platform

NeSI is commissioning a new HPC system, with the main computation and storage infrastructure housed at NIWA’s High Performance Computing Facility at Greta Point in Wellington and a secondary copy of all critical data held at the University of Auckland’s Tamaki Data Centre. Stay in touch with our progress.

Photo: NeSI's Fabrice Cantos and Greg Hall at the Cray factory checking in on our XC50.

The new systems provide a step change in power over NeSI’s existing services. They comprise a Cray XC50 Supercomputer and a Cray CS400 cluster High Performance Computer, both sharing the same high performance and offline storage systems.

Beyond these core components, the new systems will deliver new capabilities in data analytics and artificial intelligence (AI), virtual laboratory services that provide interactive access to data on the HPC filesystems, remote visualisation, and support for high performance end-to-end data transfers between institutions and NeSI. These features will be rolled out over time, alongside training to ensure ease of use.

We are excited about providing New Zealand researchers with access to our new state-of-the-art infrastructure platform, supported by our team of experts.

The features we’ll launch include:

  • Single point of access to both the XC50 supercomputer and CS400 cluster HPC.
  • Faster processors so current work is done faster.
  • More processors so that more work gets done.
  • GPGPU nodes to support science codes and visualisation.
  • A huge memory node to support memory-hungry applications.
  • Interconnect performance that will allow jobs on the XC50 to scale to thousands of processors, and the CS400 to run very large numbers of small jobs (see the sketch after this list).
  • A user environment that will make it easier to manage work, develop and run research workloads/jobs, and apply data analytics tools.
  • Increased storage capacity and hierarchical storage management to minimise the need to move data between the HPC storage and a user’s home institution, and underpin the new interactive data analysis services.
  • Vastly increased file system performance, reducing the time spent reading and writing data to the filesystems.
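The interconnect and job-management points above are easiest to picture with a small example. The following is a minimal sketch, not part of NeSI's published material: a Python MPI job (using mpi4py, our choice purely for illustration) in which every process computes part of a numerical integral and the pieces are combined, so the same script runs unchanged on a handful of cores or on thousands under the Slurm scheduler listed in the specifications below.

```python
# Hypothetical illustration only: a tiny MPI-parallel job of the kind that
# scales across many nodes. mpi4py and the pi-integration workload are our
# choices for this sketch and are not named in NeSI's specifications.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's ID, 0 .. size-1
size = comm.Get_size()   # total number of MPI processes in the job

# Midpoint-rule integration of 4 / (1 + x^2) over [0, 1], which equals pi.
# Each rank handles every size-th sample point, so the work divides evenly
# over however many cores the job was given.
n = 10_000_000
x = (np.arange(rank, n, size) + 0.5) / n
local_sum = np.sum(4.0 / (1.0 + x * x)) / n

# Combine the partial sums from all ranks into one result.
pi_estimate = comm.allreduce(local_sum, op=MPI.SUM)

if rank == 0:
    print(f"pi ~= {pi_estimate:.10f} using {size} processes")
```

Under Slurm, a script like this would typically be launched with one process per requested core; the code itself does not change as the job grows.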

Questions on these features and technical specifications? Please contact us at support@nesi.org.nz.

Future features and service development

Our focus is on putting the new infrastructure in place and building confidence that it is performing well and meeting researcher needs. Once we know this important work is done, we’ll be looking to the future.

We’re excited about the additional features our new platform will be able to offer, including:

  • Interactive analyses and exploratory visualisation, supported by high performance access to data, increase clarity and insight.
  • Pre- and post-processing using specialised large memory nodes, GPUs, and a rich catalogue of software enables more efficient workflows.
  • Storing and archiving big data offline supports research teams working together across projects and enables the most data intensive research workflows.
  • Performing advanced data analytics and opening up the world of artificial intelligence to discover new insights and resolve complex problems.
  • End-to-end integration supporting high performance data transfers across institutional boundaries allows you to quickly and efficiently transfer big data to and from NeSI.
  • Research communities working together within virtual laboratories, which provide a customised, integrated and easy to use one stop shop of domain specific tools and data.

To develop our services to offer these features, we intend to work closely with research groups. If your group has important research projects and priority programmes where working with us can help, please contact us via support@nesi.org.nz.

Features and technical specifications of NeSI’s new infrastructure platform

The technical details are outlined below:

Capability Supercomputer

Feature: Capability Supercomputer
Hardware:
  • Cray XC50 Massively Parallel Capability Computer
  • 464 nodes (of which NeSI has access to 265)*: Intel Xeon “Skylake” 6148 processors, 2.4 GHz, 40 cores/node (18,560 cores total)
  • Aries Dragonfly interconnect
  • Memory: 50% of nodes with 96 GB/node, 50% with 192 GB/node (66.8 TB total)
Operating Environment:
  • SUSE Linux Enterprise Server
  • Spectrum Scale filesystem
  • Slurm scheduler
  • Cray Programming Environment
  • Cray compilers and tools
  • Intel compilers and tools
  • Allinea Forge (DDT & MAP) software development and debugging tools
  • Data Analytics, including Spark Analytics, DASK for big data analysis, Artificial Intelligence / Machine Learning / Deep Learning, and the Cray Graph Engine (illustrated in the sketch after this table)

Feature: Pre- and Post-Processing and Virtual Laboratories
Hardware:
  • 28 nodes (of which NeSI has access to 16)*: Intel Xeon “Skylake” 6148 processors, 2.4 GHz, 40 cores/node (1,200 cores total)
  • Memory: 768 GB/node (23 TB total)
  • GPUs: 8 NVIDIA Pascal GPGPUs
Operating Environment:
  • CentOS 7
  • Spectrum Scale filesystem
  • Intel Parallel Studio Cluster
  • NICE DCV visualisation

*This HPC system has been procured in collaboration with NIWA. The XC50 Supercomputer is shared between the two organisations.
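To give a feel for the data analytics entry in the operating environment above, here is a minimal Dask sketch. The array sizes and workload are invented purely for illustration and say nothing about how the tools will be configured on the new platform.

```python
# Hypothetical illustration only: a Dask computation on an array far larger
# than a single machine's memory. Dask splits it into chunks and runs them
# in parallel on the available cores (or across a cluster).
import dask.array as da

# A ~400 GB random array, held lazily as ~1 GB chunks instead of in memory.
x = da.random.random((5_000_000, 10_000), chunks=(12_500, 10_000))

# Build the task graph: standardise each column, then take the overall mean.
z = (x - x.mean(axis=0)) / x.std(axis=0)

# Nothing has executed yet; .compute() triggers the parallel run.
result = z.mean().compute()
print(result)
```

Reading real data from the high performance filesystem instead of generating a random array follows the same pattern.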

 

Capacity High Performance Computer

Feature: Capacity HPC
Hardware:
  • Cray CS400 Capacity High Performance Computing cluster
  • 234 nodes: Intel Xeon® E5-2695 v4 Broadwell processors, 2.1 GHz, 36 cores/node (8,424 cores total)
  • Memory: 128 GB/node (30 TB total)
  • Interconnect: FDR Infiniband (from the node) to an EDR Infiniband (100 Gb/s) backbone network
Operating Environment:
  • CentOS 7
  • Spectrum Scale filesystem
  • Slurm scheduler
  • Cray Programming Environment
  • Cray compilers and tools
  • Intel Parallel Studio Cluster
  • Allinea Forge (DDT & MAP) software development and debugging tools

Feature: Pre- and Post-Processing and Virtual Laboratories
Hardware:
  • 16 Large Memory and Virtual Laboratory nodes: Intel Xeon® E5-2695 v4 Broadwell processors, 36 cores/node (576 cores total)
  • Memory: 512 GB/node (8.2 TB total)
  • GPUs: 8 NVIDIA Pascal GPGPUs
  • 1 Huge Memory node: 64 cores, 4 TB memory
Operating Environment:
  • CentOS 7
  • Spectrum Scale filesystem
  • NICE DCV visualisation

High Performance and Offline Storage

Feature: High Performance Storage
Hardware:
  • IBM Elastic Storage Server GS4s and GL6s: 6.9 PB (shared across the Capability Supercomputer and Capacity HPC)
  • Mellanox EDR 100 Gb/s Infiniband network
  • Total bandwidth ~130 GB/s
Operating Environment:
  • Spectrum Scale (previously called GPFS)

Feature: Offline Storage
Hardware:
  • IBM TS3500 Library, 12 × LTO7 drives, 5.8 PB uncompressed (expandable to 30 PB uncompressed), replicated across two sites