September 2022
New on CoreWeave Cloud this month:
Self-serve signup for CoreWeave Cloud
Signing up for an account on CoreWeave Cloud is now easier than ever! With self-serve signup, you can create your own account without additional approval.
Some features are only available through an upgrade request. To increase your quota or to access Kubernetes, log in to your CoreWeave account and navigate to Upgrade Quotas.
NVIDIA A100 80GB NVLINK with InfiniBand and SHARP
A100 80GB NVLINK SXM4 GPUs are now available in the LAS1 region. These GPUs are provisioned in large clusters, intended for distributed training and inference of LLMs such as BLOOM 176B.
Connectivity between compute hardware and storage plays a major role in overall system performance for applications such as neural net training, rendering, and simulation. Certain workloads, such as training massive language models with over 100 billion parameters across hundreds or thousands of GPUs, require the fastest, lowest-latency interconnect.
CoreWeave provides highly optimized IP-over-Ethernet connectivity across all GPUs, and an industry-leading, non-blocking InfiniBand fabric for our top-of-the-line A100 NVLINK GPU fleet. CoreWeave has partnered with NVIDIA in its design of interconnect for A100 HGX training clusters. All CoreWeave A100 NVLINK GPUs offer GPUDirect RDMA over InfiniBand, in addition to standard IP/Ethernet networking.
CoreWeave's InfiniBand topology is fully SHARP compliant, and all components needed to leverage SHARP, such as Adaptive Routing and Aggregation Managers, are implemented in the network control plane. This in-network computing effectively doubles the performance of a compliant InfiniBand network compared to a network of similar specifications without it, such as RDMA over Converged Ethernet (RoCE).
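As a rough illustration of the workloads this fabric accelerates, here is a minimal PyTorch sketch of the NCCL all-reduce collective at the heart of distributed training; NCCL transparently uses GPUDirect RDMA (and SHARP aggregation) when run on InfiniBand-connected GPUs. The torchrun launcher and tensor size are illustrative assumptions, not a CoreWeave-specific setup:

```python
import os

import torch
import torch.distributed as dist

def main():
    # NCCL picks the fastest available transport, including GPUDirect RDMA
    # over InfiniBand when the fabric supports it.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor; all-reduce sums them across every GPU.
    x = torch.full((1024,), float(dist.get_rank()), device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        print(f"world_size={dist.get_world_size()} sum={x[0].item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```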
A100 NVLINK 80GB GPUs with InfiniBand are now available in the LAS1 (Las Vegas) data center region. A100 NVLINK 40GB GPUs with InfiniBand are available in the ORD1 (Chicago) data center region!
Read more about HPC Interconnect and SHARP on CoreWeave Cloud!
CoreWeave's Private Docker Registry
Customers can now deploy their own private Docker registry from the application Catalog!
Hosting images inside CoreWeave means there's no need for subscriptions to external services such as Docker Hub, GitHub, or GitLab. Additionally, credentials to pull images are automatically provisioned to a customer's namespace, alleviating the need to fiddle with the "image pull secrets" that trip up many first-timers.
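As an illustrative sketch (the registry hostname and credentials below are placeholders, not values CoreWeave provisions), pushing a locally built image with the Docker SDK for Python looks roughly like this:

```python
import docker

REGISTRY = "registry.example.com"  # placeholder: use your registry's hostname
client = docker.from_env()

# Placeholder credentials; pull credentials are provisioned automatically in
# your namespace, but pushing from outside the cluster typically needs a login.
client.login(username="user", password="secret", registry=REGISTRY)

image = client.images.get("myapp:latest")     # an image built locally
image.tag(f"{REGISTRY}/myapp", tag="latest")  # retag it for the registry
for line in client.images.push(f"{REGISTRY}/myapp", tag="latest",
                               stream=True, decode=True):
    print(line)                               # stream push progress
```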
As usual with CoreWeave services, there is no charge except for the storage used for images and the minimal compute resources needed to run the registry server.
Head over to the Cloud applications Catalog to deploy a private Docker registry to your namespace!
Rocky Linux is now supported on CoreWeave Cloud
Rocky Linux is a premier, open-source enterprise operating system, designed to be completely compatible with Red Hat Enterprise Linux®. Tipped by the Visual Effects Society survey to replace CentOS 7 as the leading VFX workstation OS of choice, Rocky Linux provides a stable platform with a 10-year upstream support lifecycle.
Determined AI is now available in the Applications Catalog
Determined AI is an open-source deep learning training platform that makes building models fast and easy. It can now be deployed directly onto CoreWeave Cloud from the application Catalog. With Determined AI, you can launch Jupyter notebooks, interactive shells with VSCode support, and distributed training experiments right from the Web UI and CLI tools. Deploying from the CoreWeave applications Catalog makes spinning up an instance fast and easy, and once running, the platform consumes minimal resources and incurs minimal cost.
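As a hedged sketch of what submitting work looks like, assuming the Determined Python SDK (pip install determined) and placeholder master URL, credentials, and config file for your own deployment:

```python
from determined.experimental import client

# Placeholder master URL and credentials for your own deployment.
client.login(master="https://determined.example.com", user="admin", password="")

# const.yaml is an experiment config; model_dir holds the training code.
exp = client.create_experiment(config="const.yaml", model_dir=".")
print(f"started experiment {exp.id}")
```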
Find Determined AI in the apps Catalog to learn more about it or deploy an instance to your namespace!
vCluster is now available in the Applications Catalog
For those of you who require or desire more custom control over your Kubernetes control plane, the vCluster application is a great solution. With vCluster, you can install your own cluster-wide controllers and manage your own custom resource definitions, all without sacrificing the benefits of running on CoreWeave Cloud's bare-metal environment.
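For a rough idea of how a vCluster behaves like any other cluster once you have its kubeconfig (for example, one exported via the vcluster CLI), here is a minimal sketch with the Kubernetes Python client; the kubeconfig path is an assumption:

```python
from kubernetes import client, config

# Load the kubeconfig exported for the vCluster (path is a placeholder).
config.load_kube_config(config_file="./kubeconfig.yaml")

v1 = client.CoreV1Api()
for pod in v1.list_pod_for_all_namespaces().items:
    print(pod.metadata.namespace, pod.metadata.name)
```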
Find vCluster in the apps Catalog to learn more about it or deploy an instance to your namespace!
New machine learning walkthroughs on CoreWeave Cloud
It's never been easier to deploy, train, and fine-tune machine learning models on the Cloud for some incredible results. Our new walkthroughs and examples demonstrate just some of the ways CoreWeave's state-of-the-art compute power can be leveraged for model training, so you can start today:
- PyTorch Hugging Face Diffusers - Stable Diffusion Text to Image: Generating high-quality, photorealistic images from nothing but a text prompt used to be the stuff of science fiction. Now, using the open-source model built by our friends at Stability.AI, you can leverage CoreWeave Cloud's compute power to do precisely that with just a few clicks and commands in our latest walkthrough of this cutting-edge AI technology (a condensed sketch follows this list).
- PyTorch Hugging Face Transformers BigScience BLOOM: In the PyTorch Hugging Face Transformers BigScience BLOOM walkthrough, you'll learn how to use the autoregressive Large Language Model (LLM), trained on vast amounts of text data using industrial-scale computational resources, to continue text from a prompt. BLOOM can output coherent text in 46 languages - and 13 programming languages - that is hardly distinguishable from text written by humans. BLOOM can even be instructed to perform text tasks it hasn't been explicitly trained for by casting them as text generation tasks.
- Triton Inference Server for GPT-J with FasterTransformer: GPT-J is one of the most popular open-source NLP models. Its size and performance make it a perfect fit for cost-sensitive NLP use cases. In our Triton Inference Server for GPT-J FasterTransformer walkthrough, you'll learn how to leverage FasterTransformer for up to 40% faster GPT-J inference over a vanilla Hugging Face Transformers-based implementation.
- Triton Inference Server for GPT-NeoX 20B with FasterTransformer: Together with EleutherAI, CoreWeave trained and released the open-source GPT-NeoX 20B model in January. We are now taking self-hosted inference of this Large Language Model to the next level by offering an NVIDIA FasterTransformer-based inference option. In our Triton Inference Server for GPT-NeoX 20B walkthrough, you'll learn how to leverage FasterTransformer for up to 40% faster GPT-NeoX inference over a vanilla Hugging Face Transformers-based implementation.
- GPT-NeoX fine-tuning: In our new GPT-NeoX fine-tuning walkthrough, using the Determined AI MLOps platform to run distributed fine-tuning jobs, you'll learn how to fine-tune a 20B parameter autoregressive model trained on the Pile dataset to generate text based on context or unconditionally for use cases such as story generation, chat bots, summarization, and more.
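To give a flavor of the Stable Diffusion walkthrough referenced above, the core of text-to-image generation with Hugging Face diffusers boils down to a few lines. This condensed sketch omits the walkthrough's deployment and autoscaling pieces, and the model ID assumes you have accepted the model license on Hugging Face:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # requires accepting the license on Hugging Face
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # run inference on the GPU

# One text prompt in, one generated image out.
image = pipe("a photograph of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```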
Introducing Layer 2 VPC
CoreWeave Cloud Native Networking (CCNN) is built to handle workloads requiring up to 100Gbps of network connectivity at scale, and it also handles firewalls and load balancing via Network Policies. Certain use cases, however, require a deeper level of network control than a traditional Cloud network stack offers. For these users, we are now introducing the CoreWeave Cloud Layer 2 VPC (L2VPC).
L2VPC provides fine-grained customization by handing full control over DHCP servers and VPN gateways to the user. Virtual firewalls are also supported and user-configured: most KVM-compatible firewall images will work, allowing you to build your own firewall from the ground up. Installation guides for some of the most popular third-party choices, such as Fortinet's FortiGate, are also provided.
L2VPC is built on top of SR-IOV hardware virtualization technology, retaining the high performance and low latency customers have come to expect from CoreWeave Cloud.
CoreWeave Object Storage is now in beta
Object Storage is coming to CoreWeave! CoreWeave's S3-compatible Object Storage gives you an easy place to store and reference things like Docker images, machine learning models, and any other kind of object right within CoreWeave Cloud, streamlining your project workflows. Object Storage is priced at just $0.03/GB/mo with no access or egress fees!
Accelerated Object Storage provides local caching for frequently accessed objects across all CoreWeave data centers. It is especially useful for large-scale, multi-region rendering or inference autoscaling, where the same data needs to be loaded by hundreds or thousands of compute nodes.
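Because the service is S3-compatible, any standard S3 client should work against it. Here is a minimal boto3 sketch, with a placeholder endpoint URL and credentials (the real values come with beta access):

```python
import boto3

# Placeholder endpoint and credentials; the real values come with beta access.
s3 = boto3.client(
    "s3",
    endpoint_url="https://object.example.coreweave.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

s3.create_bucket(Bucket="models")                            # make a bucket
s3.upload_file("model.pt", "models", "checkpoints/model.pt") # store an object
for obj in s3.list_objects_v2(Bucket="models").get("Contents", []):
    print(obj["Key"])                                        # list what's stored
```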
This feature is currently in beta, but you can learn more now, and contact your CoreWeave Support Specialist to try it out!
Introducing the Workload Activity Tracker dashboard
It's an all-too-common experience to leave research shells or finished experiments idle in your namespace after you're done working with them, only to come back later and realize they've been eating resources unnecessarily. Now, with the Workload Activity Tracker dashboard for Grafana, "is everything deployed in my namespace doing something?" is never a question you have to worry about.
The Workload Activity Tracker displays which of your Workloads have had activity in the past 24 hours, which are inactive, how many resources they are consuming, and how much cost they're incurring, all in a convenient and concise overview format.
Check out the Workload Activity Tracker dashboard now!