Cluster description
Hardware
- 16 compute nodes (128 cores, 2 TB RAM) HPE Apollo 2000 XL225n Gen10+, 2x AMD EPYC 7662 (3.1GHz, 64 cores), 2 TB RAM
- 68 compute nodes (28 cores, 256 GB RAM) DELL C6320, 2x Intel Xeon E5-2695v3 (2.3GHz, 14 cores), 256 GB RAM
- 1 "fat memory" compute node (64 cores, 3 TB RAM) DELL R930, 4x Intel Xeon E7-8860v3 (2.2GHz, 16 cores), 3 To RAM
- 3 GPU compute nodes (32 cores, 512 GB RAM, 1 NVMe scratch volume each) DELL R7525, 2x AMD EPYC 7343 (3.2GHz, 16 cores), 2x NVIDIA Ampere A100 40GB GPUs ([NVIDIA A100 datasheet](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-us-nvidia-1758950-r4-web.pdf)), 512 GB RAM (see the example requests after this list)
- 2 login nodes and 1 admin node: DELL R630, 2x Intel Xeon E5-2620v4 (2.1GHz, 8 cores), 128 GB RAM
Total: 103 nodes, 8434 cores, 56 TB of RAM
- Storage based on DDN Lustre EXAScaler (2 PB): SFA400NVXE, ES400NVX and 2x SS9012 expansion enclosures, 11x 3.84 TB SSD, 180x 16 TB HDD
- Local network: 10 Gbit/s Ethernet, DELL S6000 10/40Gb Ethernet switch
- Internet access: 1 Gbit/s
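
To make the node classes above concrete, the hypothetical Slurm requests below show how a user might target them. The GRES name (`gpu`), the memory values and the job script names are assumptions, not values taken from the cluster configuration; check `sinfo` and `scontrol show node <name>` on the real system (Slurm itself is described in the next section).

```bash
# Hypothetical examples -- resource names and limits are assumptions; verify with `sinfo`.

# Request one of the two NVIDIA A100 40GB GPUs on a DELL R7525 node:
sbatch --gres=gpu:1 --cpus-per-task=8 --mem=64G my_gpu_job.sh

# Request more memory than the standard nodes offer, so the job can only be
# placed on the 3 TB "fat memory" DELL R930 node:
sbatch --mem=2500G my_big_memory_job.sh
```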
Software layer
The cluster is managed by Slurm (version 20.11.8).
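
A minimal, generic batch script is sketched below; it is not taken from the cluster documentation, and account or partition options are deliberately omitted.

```bash
#!/bin/bash
# Minimal generic Slurm batch script -- adjust resources to the actual job.
#SBATCH --job-name=hello
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:10:00
#SBATCH --output=hello_%j.out   # %j expands to the Slurm job ID

hostname   # report which compute node ran the job
```

Submit it with `sbatch hello.sh`, monitor it with `squeue -u $USER`, and inspect partitions and node states with `sinfo`.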
Scientific software and tools are available through Environment Modules and are mainly based on Conda packages or Singularity images.
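
In a job script this typically means loading the relevant module before calling the tool, or running it through a Singularity image. In the sketch below, the module name/version and the image path are placeholders; the real list is given by `module avail`.

```bash
#!/bin/bash
#SBATCH --job-name=software-env
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
#SBATCH --time=00:30:00

# Show the software exposed through Environment Modules.
module avail

# Load a tool by module name (placeholder name/version -- pick one from `module avail`).
module load samtools/1.15
samtools --version

# Alternatively, run a tool shipped as a Singularity image (placeholder image path).
singularity exec my_tool.sif my_tool --help
```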
Operating system: CentOS on the cluster nodes, with Ubuntu on some other machines.
Cluster management is supported by Nagios Core, Netbox, Proxmox VE and VMware ESX.
Deployment and configuration are powered by Ansible and GitLab.