
Cluster description

Hardware

  • 16 compute nodes (128 cores, 2 TB RAM)
    HPE Apollo 2000 XL225n Gen10+, 2x AMD EPYC 7662 (3.1GHz, 64 cores), 2 TB RAM
  • 68 compute nodes (28 cores, 256 GB RAM)
    DELL C6320, 2x Intel Xeon E5-2695v3 (2.3GHz, 14 cores), 256 GB RAM
  • 1 "fat memory" compute node (64 cores, 3 TB RAM)
    DELL R930, 4x Intel Xeon E7-8860v3 (2.2GHz, 16 cores), 3 TB RAM
  • 3 GPU compute nodes (32 cores, 512 GB RAM, 1 NVMe scratch volume each)
    DELL R7525, 2x AMD EPYC 7343 (3.2GHz, 16 cores), 2x NVIDIA A100 40GB GPUs (Ampere, datasheet: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf), 512 GB RAM
  • 2 login nodes and 1 admin node
    DELL R630, 2x Intel Xeon E5-2620v4 (2.1GHz, 8 cores), 128 GB RAM

Total of 103 nodes - 8434 cores, 56 TB of RAM

  • Storage based on DDN Lustre EXAScaler (2 PB)
    SFA400NVXE, ES400NVX and 2x SS9012 expansion enclosures, 11x 3.84 TB SSD, 180x 16 TB HDD
  • Local network: 10 Gbit/s Ethernet
    DELL S6000 10/40 Gb Ethernet switch

  • Internet access: 1 Gbit/s

Rack layout diagram: admin node, storage (R730xd), compute nodes cpu-node-01-48 and cpu-node-49-68, fat node cpu-node-69, 2x 10/40 Gbit/s Ethernet switches.

Software layer

Job scheduling and resource management are handled by Slurm (version 20.11.8).
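Jobs are submitted with the standard Slurm commands (sbatch, srun, squeue, scancel). The sketch below is a minimal batch script; the partition name, GPU request and program name are placeholders, not necessarily those configured on this cluster (check with sinfo).

```bash
#!/bin/bash
#SBATCH --job-name=example        # name shown in squeue
#SBATCH --cpus-per-task=4         # CPU cores reserved for the task
#SBATCH --mem=8G                  # memory reserved for the job
#SBATCH --time=01:00:00           # walltime limit (HH:MM:SS)
## To target the GPU nodes, uncomment the next two lines
## (partition and gres names are assumptions):
##SBATCH --partition=gpu
##SBATCH --gres=gpu:1

srun ./my_program                 # placeholder executable
```

Submit the script with sbatch job.sh and follow its state with squeue -u $USER.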

Scientific software and tools are available through Environment Modules and are mainly based on Conda packages or Singularity images.
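A typical session looks like the sketch below; the module name, version and container image path are hypothetical examples, real names can be listed with module avail.

```bash
# Discover the software made available through Environment Modules
module avail

# Load a tool and check it is on the PATH (name and version are hypothetical)
module load samtools/1.15
samtools --version

# Run a tool distributed as a Singularity image (image path is hypothetical)
singularity exec /path/to/tool.sif tool --help
```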

Operating system: CentOS on the cluster, with Ubuntu on some machines.

Cluster management relies on additional tools: Nagios Core (monitoring), NetBox (infrastructure inventory), Proxmox VE and VMware ESX (virtualization).

Deployment and configuration are powered by Ansible and GitLab, as summarized in the orchestration diagram.
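As an illustration of this workflow (the repository URL, inventory and playbook names below are hypothetical), a configuration run could look like:

```bash
# Fetch the Ansible code from its GitLab repository (URL is hypothetical)
git clone https://gitlab.example.org/infra/cluster-config.git
cd cluster-config

# Dry-run the playbook against the inventory, then apply it
ansible-playbook -i inventory/production site.yml --check
ansible-playbook -i inventory/production site.yml
```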