Cluster description
Hardware
68 compute nodes (28 cores, 256 GB RAM each)
DELL C6320 (17 chassis of 4 nodes), 2x Intel Xeon E5-2695v3 (2.3 GHz, 14 cores), 256 GB RAM per node
1 "fat memory" compute node (64 cores, 3 TB RAM)
DELL R930, 4x Intel Xeon E7-8860v3 (2.2 GHz, 16 cores), 3 TB RAM
3 GPU compute nodes (32 cores, 512 GB RAM, 1 NVMe scratch volume each)
DELL R7525, 2x AMD EPYC 7343 (3.2 GHz, 16 cores), 2x NVIDIA Ampere A100 40 GB GPUs ([NVIDIA A100](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf)), 512 GB RAM
1 login and 1 admin node
DELL R630, 2x Intel Xeon E5-2620v4 (2.1 GHz, 8 cores), 128 GB RAM
Storage based on DDN Lustre EXAScaler (2 PB)
SFA400NVXE, ES400NVX and 2x SS9012 expansion enclosures, 11x 3.84 TB SSD, 180x 16 TB HDD
Local network: 10 Gbit/s Ethernet
DELL S6000 10/40 Gb Ethernet switch
Internet access: 1 Gbit/s
Total of 1,968 cores and 20 TB of RAM
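The quoted totals can be reproduced from the node counts above, assuming they cover only the 68 standard compute nodes and the fat-memory node (GPU, login and admin nodes excluded); the short sketch below simply redoes that arithmetic.

```python
# Sanity check of the quoted totals, assuming they count only the 68 standard
# compute nodes and the single fat-memory node (GPU/login/admin nodes excluded).
standard_nodes, standard_cores, standard_ram_gb = 68, 28, 256
fat_nodes, fat_cores, fat_ram_gb = 1, 64, 3 * 1024

total_cores = standard_nodes * standard_cores + fat_nodes * fat_cores
total_ram_tb = (standard_nodes * standard_ram_gb + fat_nodes * fat_ram_gb) / 1024

print(total_cores)   # 1968
print(total_ram_tb)  # 20.0
```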
Software layer
The cluster is managed by Slurm (version 20.11.8).
Scientific software and tools are available through Environment Modules and are mainly provided as Conda packages or Singularity images (see the sketch at the end of this section).
Operating system: CentOS on the cluster nodes, Ubuntu on some other machines.
Supporting services around the cluster: Nagios Core (monitoring), NetBox (infrastructure inventory), Proxmox VE and VMware ESX (virtualization).
Deployment and configuration are powered by Ansible and GitLab.
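As an illustration of how Slurm and Environment Modules typically fit together on such a setup, below is a minimal sketch of submitting a batch job from Python. The module name, tool command and Singularity image are hypothetical placeholders rather than actual entries on this cluster, and the resource requests are arbitrary examples.

```python
import subprocess
import tempfile

# Minimal sketch of a Slurm submission: write a batch script, then hand it to sbatch.
# "my_tool" and "my_image.sif" are hypothetical names; replace them with real
# module entries or Singularity images available on the cluster.
batch_script = """#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00

module load my_tool                      # load a tool via Environment Modules
my_tool --input data.txt                 # run the tool provided by the module
# singularity exec my_image.sif my_tool  # alternative: run it from a container image
"""

with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
    handle.write(batch_script)
    script_path = handle.name

# sbatch prints the job ID on success, e.g. "Submitted batch job 12345".
result = subprocess.run(["sbatch", script_path], capture_output=True, text=True, check=True)
print(result.stdout.strip())
```

Submitted jobs can then be followed with the standard Slurm commands `squeue` and `sacct`.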