Feature Updates and Release Notes for CoreWeave Cloud
New on CoreWeave Cloud:
NVIDIA Mellanox Quantum leaf switches in the CoreWeave LAS1 datacenter
Connectivity between compute hardware and storage plays a major role in overall system performance for applications such as neural net training, rendering, and simulation. Certain workloads, such as training massive language models of over 100 billion parameters across hundreds or thousands of GPUs, require the fastest, lowest-latency interconnect.
CoreWeave provides highly optimized IP-over-Ethernet connectivity across all GPUs, and an industry-leading, non-blocking InfiniBand fabric for our top-of-the-line A100 NVLINK GPU fleet. CoreWeave has partnered with NVIDIA on the interconnect design for A100 HGX training clusters. All CoreWeave A100 NVLINK GPUs offer GPUDirect RDMA over InfiniBand, in addition to standard IP/Ethernet networking.
CoreWeave's InfiniBand topology is fully SHARP compliant, and all components needed to leverage SHARP, such as Adaptive Routing and Aggregation Managers, are implemented in the network control plane. In-network computing effectively doubles the performance of a compliant InfiniBand network compared to a similarly specified network without it, such as RDMA over Converged Ethernet (RoCE).
Hosting images inside CoreWeave means no subscriptions to external services such as Docker Hub, GitHub, or GitLab are required. Additionally, credentials to pull images are automatically provisioned to a customer's namespace, alleviating the need to fiddle with the “image pull secrets” that trip up many first-timers.
As usual with CoreWeave services, there is no charge except for the storage used for images and the minimal compute resources needed to run the registry server.
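As a hypothetical sketch (the registry hostname and image name below are placeholders, not real endpoints), a Pod can reference an internally hosted image directly, with no `imagePullSecrets` stanza, because the pull credentials are already provisioned to the namespace:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
    - name: app
      # Placeholder registry hostname and image tag
      image: registry.example.tenant.coreweave.cloud/my-app:v1
      # No imagePullSecrets needed: pull credentials are provisioned
      # to the namespace automatically.
```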
Rocky Linux is a premier, open-source enterprise Operating System, designed to be completely compatible with Red Hat Enterprise Linux®. Tipped by the Visual Effects Society survey to replace CentOS 7 as the leading VFX workstation OS of choice, Rocky Linux provides a stable platform with a 10-year upstream support lifecycle.
Determined AI is an open-source deep learning training platform that makes building models fast and easy. Determined AI can now be deployed directly onto CoreWeave Cloud by deploying the application from the application Catalog. With Determined AI, you can launch Jupyter notebooks, interactive shells with VSCode support, and distributed training experiments right from the Web UI and CLI tools. Deploying Determined AI from the CoreWeave applications Catalog makes spinning up an instance fast and easy, and when running, the platform consumes minimal resources and incurs minimal cost.
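Once Determined AI is running, experiments are described by a small configuration file. The sketch below follows Determined's experiment config schema; the entrypoint, hyperparameter, and searcher values are placeholder assumptions for illustration:

```yaml
# Placeholder Determined AI experiment config
name: mnist_demo
entrypoint: model_def:MNISTTrial   # your Trial class
hyperparameters:
  global_batch_size: 64
resources:
  slots_per_trial: 2               # GPUs per trial
searcher:
  name: single
  metric: validation_loss
  max_length:
    batches: 1000
```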
For those of you who require or desire more custom control over your Kubernetes Control Plane, the vCluster application is a great solution. With vCluster, you can install your own custom cluster-wide controllers, manage your own custom resource definitions, all without sacrificing the benefits of running on CoreWeave Cloud's bare metal environment.
It's never been easier to deploy, train, and finetune machine learning models on the Cloud for some incredible results, and with our new walkthroughs and examples demonstrating just some of the ways CoreWeave's state-of-the-art compute power can be leveraged for model training, you can start today:
- PyTorch Hugging Face Diffusers - Stable Diffusion Text to Image: Generating high-quality, photorealistic images from nothing but a text prompt used to be the stuff of science fiction. Now, using the open-source model built by our friends at Stability.AI, you can leverage CoreWeave Cloud's compute power to do precisely that with just a few clicks and commands in our latest walkthrough of this cutting-edge AI technology.
- PyTorch Hugging Face Transformers BigScience BLOOM: In the PyTorch Hugging Face Transformers BigScience BLOOM walkthrough, you'll learn how to use this autoregressive Large Language Model (LLM), trained on vast amounts of text data using industrial-scale computational resources, to continue text from a prompt. BLOOM can output coherent text in 46 languages - and 13 programming languages - that is hardly distinguishable from text written by humans. BLOOM can even be instructed to perform text tasks it hasn't been explicitly trained for by casting them as text generation tasks.
- Triton Inference Server for GPT-J with FasterTransformer: GPT-J is one of the most popular open-source NLP models. Its size and performance make it a perfect fit for cost-sensitive NLP use cases. In our Triton Inference Server for GPT-J FasterTransformer walkthrough, you'll learn how to leverage FasterTransformer for up to 40% faster GPT-J inference over a vanilla Hugging Face Transformers-based implementation.
- Triton Inference Server for GPT-NeoX 20B with FasterTransformer: Together with EleutherAI, CoreWeave trained and released the open-source GPT-NeoX 20B model in January. We are now taking self-hosted inference of this Large Language Model to the next level by offering an NVIDIA FasterTransformer-based inference option. In our Triton Inference Server for GPT-NeoX 20B walkthrough, you'll learn how to leverage FasterTransformer for up to 40% faster GPT-NeoX inference over a vanilla Hugging Face Transformers-based implementation.
- GPT-NeoX finetuning: In our new GPT-NeoX finetuning walkthrough, using the DeterminedAI MLOps platform to run distributed finetuning jobs, you'll learn how to finetune a 20B-parameter autoregressive model trained on the Pile dataset to generate text based on context or unconditionally, for use cases such as story generation, chatbots, summarization, and more.
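Once one of these models is served behind Triton, clients talk to it over the KServe v2 inference protocol. The sketch below only builds the JSON request body; the tensor names (`input_text`, `max_tokens`) are placeholder assumptions and must match your deployed model's actual configuration:

```python
import json

def build_infer_request(prompt: str, max_tokens: int) -> str:
    """Build a KServe v2 / Triton HTTP inference request body.

    Tensor names and shapes are placeholders; match them to your
    model's config.pbtxt.
    """
    body = {
        "inputs": [
            {"name": "input_text", "shape": [1, 1],
             "datatype": "BYTES", "data": [prompt]},
            {"name": "max_tokens", "shape": [1, 1],
             "datatype": "INT32", "data": [max_tokens]},
        ]
    }
    return json.dumps(body)
```

The resulting JSON would be POSTed to `https://<inference-endpoint>/v2/models/<model-name>/infer`.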
CoreWeave Cloud Native Networking (CCNN) is built to handle workloads requiring up to 100Gbps of network connectivity at scale, and it also handles firewalls and load balancing via Network Policies. Certain use cases, however, require a deeper level of network control than a traditional Cloud network stack offers. For these users, we are now introducing the CoreWeave Cloud Layer 2 VPC (L2VPC).
L2VPC provides fine-grained customization by handing full control over DHCP servers and VPN gateways to the user. Virtual firewalls are also supported and user-configured - most KVM-compatible firewall images will work, allowing you to build out your own firewall from the ground up. Installation guides for some of the most popular third-party choices, such as Fortinet's FortiGate, are also provided.
L2VPC is built on top of SR-IOV hardware virtualization technology, retaining the high performance and low latency customers have come to expect from CoreWeave Cloud.
Object Storage is coming to CoreWeave! CoreWeave's S3-compatible Object Storage provides an easy place to store and reference objects such as Docker images, machine learning models, and any other kind of object right within CoreWeave Cloud, streamlining your project workflows. Object Storage is priced at only $0.03/GB/mo, with no access or egress fees!
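At a flat per-gigabyte rate with no access or egress fees, estimating a monthly bill is simple multiplication. A quick sketch (the price constant reflects the $0.03/GB/mo figure above):

```python
# Object Storage pricing from the announcement: $0.03 per GB per month,
# with no access or egress fees added on top.
PRICE_PER_GB_MONTH = 0.03

def monthly_storage_cost(total_gb: float) -> float:
    """Estimated monthly Object Storage cost in USD for total_gb stored."""
    return round(total_gb * PRICE_PER_GB_MONTH, 2)

# e.g. 2 TB of model checkpoints: monthly_storage_cost(2000) -> 60.0
```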
Accelerated Object Storage provides local caching for frequently accessed objects across all CoreWeave data centers. It is especially useful for large-scale multi-region rendering or inference autoscaling, where the same data needs to be loaded by hundreds or thousands of compute nodes.
The Workload Activity Tracker in action
It's an all-too-common experience to leave research shells or experiments idling in your namespace after you're done working with them, only to come back later and realize you've been consuming resources unnecessarily. Now, with the Workload Activity Tracker dashboard for Grafana, "is everything deployed in my namespace doing something?" is never a question you have to worry about.
The Workload Activity Tracker displays which of your Workloads have had activity in the past 24 hours, which are inactive, how many resources they are consuming, and how much cost they're incurring, all in a convenient and concise overview format.
The Release Notes for May 2022 include many new features launched since January 2022.
We are pleased to announce the general availability of the CoreWeave LGA1 data center, providing extremely low latency, high performance cloud compute resources to the broader New York City market. Richly connected into the global Tier 1 internet backbone, LGA1 is built for low latency compute intensive use cases that require ultimate reliability and security.
Like all CoreWeave data centers, LGA1 is packed with a broad range of state-of-the-art NVIDIA GPU-accelerated cloud compute instances, including the Quadro RTX series and the newest RTX Ampere workstation and A40 data center GPUs. In addition to GPU compute, LGA1 offers CPU-only instances and high-performance Block and Shared File System storage.
LGA1 is housed in an ISO 27001 certified, SSAE 18 SOC 2 compliant, Energy Star Certified campus, providing the utmost in security and efficiency for your critical workloads.
CoreWeave now offers the NVIDIA A100 80GB PCIe, which delivers unprecedented acceleration to power the world’s highest-performing AI, data analytics, and HPC applications. The NVIDIA A100 80GB PCIe accelerator is now available for Kubernetes deployments in ORD1.
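Kubernetes workloads are typically steered onto a specific GPU type with node affinity. The fragment below is a sketch: the `gpu.nvidia.com/class` label key and the `A100_PCIE_80GB` value are assumptions based on CoreWeave's GPU class labeling, so confirm them against the instance documentation:

```yaml
# Fragment of a Pod spec; label key and value are assumptions.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gpu.nvidia.com/class
                operator: In
                values:
                  - A100_PCIE_80GB
```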
By popular demand, we’ve added support for multiple users per organization and an Organization Management UI to invite and manage these users. Keep an eye on this page - we are regularly updating it with additional improvements and functionality.
Since the start of the year, we've added:
👫 Multi-User Support: Invite users to your Organization and manage their access.
🔢 Resource Quotas: See the number of pods, GPUs, and storage capacity allocated at any time.
Features coming soon:
RBAC: Permissions and granular control over user access
Multiple Namespaces: Provision multiple namespaces per Organization
🕹️ Scalable Pixel Streaming: Stream your Unreal Engine projects to the masses quickly and easily.
🌐 Traefik: Custom ingresses, for use with your own domains.
🚚 ArgoCD: Access to a declarative, GitOps continuous delivery tool for Kubernetes.
🔥 Backblaze: Automate your volume backups to safeguard your data.
Looking to finetune your own ML model on CoreWeave? Check out our new reference tools and examples for models such as GPT-Neo, GPT-J-6B, and Fairseq. Learn how to collect your dataset, tokenize it, finetune with the parameters you give, and even set up an endpoint to test your work.
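The dataset-preparation step usually concatenates tokenized documents and packs them into fixed-length training blocks. A toy sketch of that packing logic follows; real finetuning uses the model's BPE tokenizer, but whitespace splitting keeps the example self-contained:

```python
def chunk_dataset(texts, block_size):
    """Concatenate tokenized texts and split into fixed-size blocks.

    Whitespace splitting stands in for a real (e.g. BPE) tokenizer.
    """
    tokens = []
    for text in texts:
        tokens.extend(text.split())  # stand-in for real tokenization
    # Drop the trailing remainder so every block is full length.
    n_blocks = len(tokens) // block_size
    return [tokens[i * block_size:(i + 1) * block_size]
            for i in range(n_blocks)]
```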
Need to organize your access tokens by user or track what they are being used for? You can now label them at creation from the CoreWeave Cloud UI.
With CloudInit, you can choose your preferred settings in advance and they'll be set up during your instance launch. Plus, we now offer Static MAC Addresses and Serial Number support.
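For example, a minimal cloud-config (the user name, SSH key, and package list below are placeholders) that a Virtual Server could apply at first boot:

```yaml
#cloud-config
# Placeholder values; substitute your own user and key.
users:
  - name: demo
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... demo@example.com
packages:
  - htop
runcmd:
  - echo "first boot complete" > /var/log/first-boot.log
```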
We’ve invested heavily in networking to start 2022, with upgrades to 200Gbps+ Tier 1 transit in each region.
Direct connects up to 100Gbps are now available at all of our data centers, and we’ve installed a CoreWeave Cloud On Ramp in downtown Los Angeles at CoreSite LA2 to accept cross connects back to LAS1.
We’ve also joined the Megaport network at LAS1 and LGA1 for direct, quick software defined connectivity to CoreWeave Cloud.