Cluster description
Hardware
- 16 compute nodes (128 cores, 2 TB RAM)
HPE Apollo 2000 XL225n Gen10+, 2x AMD EPYC 7662 (3.1GHz, 64 cores), 3 TB RAM - 68 compute nodes (28 cores, 256 GB RAM)
DELL C6320, 2x Intel Xeon E5-2695v3 (2.3GHz, 14 cores), 256GB RAM - 1 "fat memory" compute node (64 cores, 3 TB RAM)
DELL R930, 4x Intel Xeon E7-8860v3 (2.2GHz, 16 cores), 3 To RAM - 3 gpu compute nodes (32 cores, 512 GB RAM, 1 NVMe scratch volume each)
DELL R7525, 2x AMD EPYC 7343 AMD-EPYC-7343 (3,2GHz, 16 cores), 2x graphic processors NVIDIA Ampere A100 40GB [NVIDIA A100] (https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf), 512 GB RAM - 2 login and 1 admin nodes
DELL R630, 2x Intel Xeon E5-2620v4 (2.1GHz, 8 cores), 128 Go RAM
Total of 103 nodes - 8434 cores, 56TB of RAM
- Storage based on DDN Lustre EXAScaler (2 PB)
SFA400NVXE, ES400NVX and 2xSS9012 expansion enclosures, 11 x 3.84TB SSD, 180 x 16TB HDD -
Local network Ethernet 10Gbits/s
DELL S6000 switch 10/40Gb Ethernet -
Internet access 1Gbits/s
Software layer
The cluster is managed by Slurm (version 20.11.8).
Scientific software and tools are available through Environment Modules and are mainly based on Conda packages or Singularity images.
Operating System: CentOS (cluster) and sometimes Ubuntu
Around the cluster management: Nagios Core, Netbox, Proxmox VE, VMware ESX.
Deployment and configuration are powered by Ansible and GitLab.