Cluster description

Hardware

  • 16 compute nodes (128 cores, 2 TB RAM) HPE Apollo 2000 XL225n Gen10+, 2x AMD EPYC 7662 (3.1 GHz, 64 cores), 2 TB RAM
  • 68 compute nodes (28 cores, 256 GB RAM) DELL C6320, 2x Intel Xeon E5-2695v3 (2.3 GHz, 14 cores), 256 GB RAM
  • 1 "fat memory" compute node (64 cores, 3 TB RAM) DELL R930, 4x Intel Xeon E7-8860v3 (2.2 GHz, 16 cores), 3 TB RAM
  • 3 GPU compute nodes (32 cores, 512 GB RAM, 1 NVMe scratch volume each) DELL R7525, 2x AMD EPYC 7343 (3.2 GHz, 16 cores), 2x NVIDIA Ampere A100 40 GB GPUs ([NVIDIA A100](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf)), 512 GB RAM
  • 2 login nodes and 1 admin node
    DELL R630, 2x Intel Xeon E5-2620v4 (2.1 GHz, 8 cores), 128 GB RAM

Total: 103 nodes, 8434 cores, 56 TB of RAM

  • Storage based on DDN Lustre EXAScaler (2 PB)
    SFA400NVXE, ES400NVX and 2x SS9012 expansion enclosures, 11x 3.84 TB SSD, 180x 16 TB HDD
  • Local network: 10 Gbit/s Ethernet, DELL S6000 10/40 Gb Ethernet switch

  • Internet access: 1 Gbit/s

Rack layout: admin node, storage nodes (R730xd), compute nodes (cpu-node-01-48, cpu-node-49-68), fat node (cpu-node-69), and 2x 10/40 Gbit/s Ethernet switches.

Software layer

The cluster is managed by Slurm (version 20.11.8).
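
As a quick illustration, a job can check the resources Slurm granted it by reading the environment variables exported inside an allocation. The Python sketch below prints a few of them when submitted with `sbatch`; the resource options shown in the comment are placeholders, not the cluster's actual defaults.

```python
# slurm_env_check.py -- minimal sketch: print a few of the environment
# variables Slurm exports inside a job allocation. Submit it for example with
#   sbatch --ntasks=1 --cpus-per-task=4 --wrap "python slurm_env_check.py"
# (partition, account and resource options depend on the site setup).
import os

for var in ("SLURM_JOB_ID", "SLURM_JOB_NODELIST",
            "SLURM_NTASKS", "SLURM_CPUS_PER_TASK"):
    # Fall back to "n/a" when the script is run outside of a Slurm job.
    print(f"{var} = {os.environ.get(var, 'n/a')}")
```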

Scientific software and tools are available through Environment Modules and are mainly based on Conda packages or Singularity images.
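
For instance, a tool distributed as a Singularity image can be launched from a script. In this minimal Python sketch the image path and the wrapped command are hypothetical examples, not actual entries from the software catalogue.

```python
# run_in_container.py -- minimal sketch: run a command inside a Singularity
# image from Python. The image path below is hypothetical; real images are
# provided through the Environment Modules catalogue.
import subprocess

image = "/shared/containers/samtools.sif"  # hypothetical path
# `singularity exec <image> <command>` executes the command in the container.
subprocess.run(["singularity", "exec", image, "samtools", "--version"], check=True)
```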

Operating system: CentOS on the cluster, with Ubuntu on some machines.

Supporting tools for cluster management: Nagios Core, Netbox, Proxmox VE, VMware ESX.

Deployment and configuration are powered by Ansible and GitLab.