October 2024
Important updates for October
Prometheus upgrade
We have upgraded our Prometheus service to significantly improve reliability and stability. You should now expect to see faster, more reliable results when querying for metrics over larger spans of time. This will benefit customers querying the Prometheus service directly via its API, or indirectly via Managed Grafana or Self-Hosted Grafana.
To achieve this, we have deployed a new backend for the Prometheus service. The new backend includes complete metrics data from October 21, 2024 onward, and may additionally contain sparse data collected after July 8, 2024. Customers may still access historical metrics data from the Legacy Prometheus endpoint, although that endpoint will stop collecting new data at a future date. For your convenience, we will ensure a 4-week overlap period for metrics data in both the new and legacy backends.
In detail:
- As of October 31, the Prometheus data source in Managed Grafana is backed by the new backend and only reliably contains metrics data from October 21, 2024 onward.
- As of October 31, metrics queries served from prometheus.ord1.coreweave.com only reliably contain metrics data from October 21, 2024 onward.
- As of October 31, Managed Grafana includes a Legacy Prometheus data source, and dashboards include a dropdown picker so you can toggle between the Prometheus and Legacy Prometheus data sources; the Legacy Prometheus data source contains historical metrics data from the old backend.
- As of October 31, you are able to query metrics from prometheus-legacy.ord1.coreweave.com, and add a data source pointed at this endpoint in your Self-Hosted Grafana instances; this endpoint will serve historical metrics data.
- CoreWeave will ensure that at least one of the data sources can be used for any metrics query spanning 4 weeks of data.
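As a rough illustration of the points above, the following Python sketch picks an endpoint based on where a query's time range starts and builds a range-query URL for it. It is a sketch under assumptions, not a supported client: the `https://` scheme, the `/api/v1/query_range` path (taken from the standard Prometheus HTTP API), the use of UTC for the October 21 cutoff, and the helper names `pick_endpoint` and `build_range_query_url` are not stated in these notes, and authentication is left out entirely.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Hostnames come from the release notes; the https:// scheme is an assumption.
NEW_ENDPOINT = "https://prometheus.ord1.coreweave.com"
LEGACY_ENDPOINT = "https://prometheus-legacy.ord1.coreweave.com"

# The new backend holds complete data from October 21, 2024 onward.
# Treating the cutoff as midnight UTC is an assumption.
NEW_BACKEND_COMPLETE_FROM = datetime(2024, 10, 21, tzinfo=timezone.utc)


def pick_endpoint(start: datetime) -> str:
    """Return the endpoint expected to hold complete data from `start` onward."""
    if start >= NEW_BACKEND_COMPLETE_FROM:
        return NEW_ENDPOINT
    return LEGACY_ENDPOINT


def build_range_query_url(query: str, start: datetime, end: datetime,
                          step: str = "60s") -> str:
    """Build a range-query URL; the /api/v1/query_range path is assumed."""
    params = urlencode({
        "query": query,
        "start": start.timestamp(),
        "end": end.timestamp(),
        "step": step,
    })
    return f"{pick_endpoint(start)}/api/v1/query_range?{params}"


if __name__ == "__main__":
    start = datetime(2024, 11, 1, tzinfo=timezone.utc)
    end = datetime(2024, 11, 2, tzinfo=timezone.utc)
    print(build_range_query_url("up", start, end))
```

The choice is made on the start of the range only. Because the legacy endpoint will stop collecting new data at a future date, a query that starts before October 21, 2024 and extends well past it may need to be split across both endpoints.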
An important change with the new backend is an improvement to certain PromQL functions, namely rate, increase, delta, idelta, and changes. While these improvements should make working with counter metrics more intuitive around counter resets, we understand some customers may be relying on the current behavior of these functions. To that end, customers may update their queries, alerts, and dashboards with the corresponding functions rate_prometheus, increase_prometheus, delta_prometheus, idelta_prometheus, and changes_prometheus if retaining the legacy behavior is desirable.
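For customers with many queries to update, the renaming can be sketched as a simple text rewrite. The Python helper below is a hypothetical convenience (the name `to_legacy_functions` is not part of any CoreWeave tooling) that appends the `_prometheus` suffix to calls of the five affected functions. It works on plain text rather than a parsed query, so it would also rewrite a matching string inside a label value; review the output before applying it to alerts or dashboards.

```python
import re

# Matches a call to one of the five affected functions. The leading \b keeps
# irate( and other names that merely end in "rate" from matching, and
# already-suffixed names such as rate_prometheus( are left alone because the
# function name must be followed directly by optional whitespace and "(".
_AFFECTED_CALL = re.compile(r"\b(rate|increase|delta|idelta|changes)(\s*)\(")


def to_legacy_functions(promql: str) -> str:
    """Rewrite affected function calls to their *_prometheus counterparts."""
    return _AFFECTED_CALL.sub(r"\1_prometheus\2(", promql)


if __name__ == "__main__":
    # The metric name here is only an example.
    print(to_legacy_functions("sum(rate(http_requests_total[5m]))"))
```

Running the helper on an already-converted query leaves it unchanged, so it is safe to apply more than once.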
These changes will have no impact on the CoreWeave Cloud Loki service, which serves log data. All log data (historical and future) will continue to be available via the Loki data source, and served for querying at prometheus.ord1.coreweave.com. For convenience, it will also be served at prometheus-legacy.ord1.coreweave.com.
Additional documentation is available as follows:
- Directly querying the prometheus.ord1.coreweave.com and prometheus-legacy.ord1.coreweave.com endpoints
- Toggling between the Legacy Prometheus and Prometheus data sources in Managed Grafana
- Configuring your own Legacy Prometheus data source in Self-Hosted Grafana