Leverage the most powerful supercomputing platform on CoreWeave Cloud
CoreWeave's infrastructure is purpose-built for large-scale, GPU-accelerated workloads. We specialize in serving the most demanding AI and machine learning applications. To this end, CoreWeave is proud to be one of the few cloud platforms in the world offering NVIDIA's most powerful end-to-end AI supercomputing platform.
[Image: The HGX H100, NVIDIA's most powerful supercomputing platform]
The NVIDIA HGX H100 enables up to seven times more efficient high-performance computing (HPC) applications, up to nine times faster AI training on large models, and up to thirty times faster AI inference than the NVIDIA HGX A100.


Hyperfast compute plus the lowest available network latency for extremely fast training times

The intense speeds of the HGX H100, combined with the lowest NVIDIA GPUDirect network latency on the market via the NVIDIA Quantum-2 InfiniBand platform, reduce the training time of AI models to "days or hours, instead of months."
With AI permeating nearly every industry today, this speed and efficiency have never been more vital for HPC applications.

FP8 support with Transformer Engine for quicker onboarding to H100

Transformer Engine, an open source Python library from NVIDIA, enables the use of the FP8 (8-bit floating point) format on Hopper GPUs, the GPU architecture used by the HGX H100.
Although all major deep learning frameworks support the FP16 format, native FP8 support is not yet widely available, a gap that Transformer Engine addresses. From the NVIDIA documentation:
[...] With Hopper GPU architecture, FP8 precision was introduced, which offers improved performance over FP16 with no degradation in accuracy. Although all major deep learning frameworks support FP16, FP8 support is not available natively in frameworks today.
TE addresses the problem of FP8 support by providing APIs that integrate with popular Large Language Model (LLM) libraries. It provides a Python API consisting of modules to easily build a Transformer layer as well as a framework agnostic library in C++ including structs and kernels needed for FP8 support. Modules provided by TE internally maintain scaling factors and other values needed for FP8 training, greatly simplifying mixed precision training for users.
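To make the scaling-factor idea concrete, here is a minimal pure-Python sketch, not NVIDIA's implementation, illustrating how FP8 E4M3 quantization works with a per-tensor scaling factor: values are scaled so the tensor's largest magnitude fits the E4M3 range, rounded to the nearest representable value, then scaled back. All function names here are illustrative.

```python
import math

E4M3_MAX = 448.0          # largest finite E4M3 magnitude
E4M3_MIN_EXP = -6         # smallest normal exponent (bias 7)
MANTISSA_BITS = 3         # E4M3 carries 3 explicit mantissa bits

def quantize_e4m3(x: float) -> float:
    """Round a (pre-scaled) float to the nearest representable E4M3 value."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)          # saturate rather than overflow
    exp = max(math.floor(math.log2(mag)), E4M3_MIN_EXP)
    step = 2.0 ** (exp - MANTISSA_BITS)  # spacing of representable values here
    return sign * round(mag / step) * step

def fp8_roundtrip(tensor):
    """Scale a tensor into FP8 range, quantize, and scale back."""
    amax = max(abs(v) for v in tensor)   # TE tracks this "amax" across steps
    scale = E4M3_MAX / amax if amax > 0 else 1.0
    return [quantize_e4m3(v * scale) / scale for v in tensor]

values = [0.001, -0.5, 3.14159, 120.0]
for v, r in zip(values, fp8_roundtrip(values)):
    print(f"{v:>10.5f} -> {r:>10.5f}")
```

The scaling factor is what makes FP8 usable in practice: without it, small gradients would underflow and large activations would saturate. Transformer Engine maintains these factors per tensor automatically, which is the "greatly simplifying mixed precision training" the documentation refers to.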
Tutorials on using Transformer Engine on CoreWeave are forthcoming.
Additional Resources
Learn more from NVIDIA about FP8 and why it matters in ML and AI applications.
Read more on MosaicML about how HGX H100s on CoreWeave accelerate training operations while preserving model quality.

Make a reservation for HGX H100

Due to high demand, A100 NVLink (HGX) and H100 NVLink (HGX) nodes are currently fully committed on client contracts, and are therefore not available for on-demand use. We recommend speaking with the CoreWeave team to build a strategic plan tailored to your needs, making use of available infrastructure while planning for your future capacity requirements. Contact CoreWeave Sales to get started.
If your needs demand the highest performance in supercomputing coupled with the lowest-latency networking available, make a reservation for HGX H100 compute on our website.