NVIDIA DGX H100
Guarantor:
Prof. Pavel Václavek, Ph.D.
Technology / Methodology:
AI
Instrument status:
Operational
Equipment placement:
CEITEC server room
Research group:
RICAIP Testbed Brno
Description:
One of the compute nodes in our HPC cluster is an NVIDIA DGX H100. It can be used either on its own or in combination with another DGX node. Thanks to its ample internal memory and the placement of this high-performance computer on our own premises, the security of your data is enhanced.
High computational performance across multiple levels of numerical precision enables the acceleration of a wide range of applications. AI workloads typically run well at the lower FP16 precision on the high-throughput Tensor Cores, while certain simulations require the full FP64 precision of the CUDA cores.
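The FP16-versus-FP64 trade-off mentioned above can be illustrated without any GPU at all: half-precision numbers have only 10 mantissa bits, so small increments are simply rounded away. The sketch below emulates FP16 rounding on the CPU via Python's `struct` half-precision format; it is an illustration of the precision gap, not code for the DGX itself.

```python
import struct

def to_fp16(x: float) -> float:
    # Round-trip through IEEE 754 half precision (struct format "e").
    return struct.unpack("e", struct.pack("e", x))[0]

one = to_fp16(1.0)
small = 1e-4  # well below FP16's machine epsilon (~9.77e-4)

# In FP16, 1.0 + 1e-4 rounds straight back to 1.0 -- the increment is lost.
print(to_fp16(one + small) == 1.0)   # True: FP16 drops the increment
print((1.0 + small) == 1.0)          # False: FP64 keeps it
```

This is why deep-learning training tolerates (and profits from) Tensor Core FP16 arithmetic, while simulations that accumulate many small contributions are usually run at FP64 on the CUDA cores.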
NVIDIA offers a comprehensive software environment that is intuitive, seamlessly integrated, and applicable to a wide variety of tasks, helping users get the most out of the system. For accelerated applications (frameworks, libraries), a wide selection of Docker container images is available on the NVIDIA GPU Cloud (NGC). Examples include TensorFlow, PyTorch, and JAX for neural networks, or NVIDIA RAPIDS for data analytics. Specific tasks can be addressed through custom development at lower software levels using the specialized compilers in the NVIDIA HPC SDK.
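As a minimal sketch of how an NGC container is typically launched with Docker, the snippet below assembles a `docker run` command line for the NGC PyTorch image. The image tag (`24.05-py3`) and the mount path are illustrative assumptions; check the NGC catalog for current tags, and note that this cluster also supports Enroot and Apptainer as container runtimes.

```python
# Sketch: a typical `docker run` invocation for an NGC container.
image = "nvcr.io/nvidia/pytorch:24.05-py3"  # tag is illustrative, see NGC catalog
cmd = [
    "docker", "run",
    "--gpus", "all",                 # expose all eight H100 GPUs
    "--ipc=host",                    # shared memory for PyTorch data loaders
    "-v", "/data:/workspace/data",   # example host data mount (assumed path)
    "-it", image,
]
print(" ".join(cmd))
```

The same image name can be used with the cluster's Enroot or Apptainer tooling instead of Docker.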
Specification:
| Manufacturer | NVIDIA |
| Type | DGX H100 |
| Memory | 2 TB (CPU), 640 GB (GPU) |
| GPU | 8x H100 80 GB, 4,224 Tensor Cores, 135,168 CUDA cores |
| CPU | 2x Intel Xeon Platinum 8480, 112 cores, 2.0 GHz base / 3.8 GHz boost |
| Performance | 16 petaFLOPS (FP16 Tensor), 32 petaOPS (INT8 Tensor), 250 teraFLOPS (FP64 CUDA) |
| Storage | OS: 2x 1.92 TB NVMe; data: 30 TB (8x 3.84 TB) NVMe |
| Software | DGX OS (Ubuntu Linux); ready-to-use Enterprise AI applications from NVIDIA NGC (Enroot and Apptainer containers) |
| Network | Connected to the DGX H100 via InfiniBand, 4x 200 Gb/s |