Apply Node Pool updates
Apply Node Pool updates by queuing a reconfigure reboot
Some Node Pool modifications, such as a new system OS image or a GPU driver update, require you to both reconfigure and reboot the Nodes before the update takes effect. To apply these updates, you can queue a reconfigure reboot for the Node Pool. If you need to reboot Nodes manually to apply a system update or other change, without reconfiguring the Node, see Reboot Nodes.
Prerequisites
Before you begin, ensure you have:
- An active CoreWeave account
- An API Access Token
- The CoreWeave Intelligent CLI installed locally
Queue a reconfigure reboot
To queue a reconfigure reboot for a Node Pool, run the following command, replacing [list-of-space-separated-nodes] with the list of Nodes you want to reboot:
$ cwic node reboot --reconfigure [list-of-space-separated-nodes]
When you submit the command, it queues a reconfigure reboot to begin as soon as the Nodes are idle. Meanwhile, Nodes will be cordoned to prevent scheduling new workloads.
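As an illustration of how the placeholder is filled in, the sketch below prints the command for a concrete list of Nodes so you can review it before running it. The helper name reconfigure_reboot_cmd and the Node names node-a, node-b, and node-c are hypothetical; only the cwic node reboot --reconfigure command comes from this page.

```shell
# Print (do not run) the reconfigure reboot command for the given Nodes.
# The function name and the Node names below are illustrative assumptions.
reconfigure_reboot_cmd() {
  echo "cwic node reboot --reconfigure $*"
}

# Replace these hypothetical names with the Nodes you want to reboot.
reconfigure_reboot_cmd node-a node-b node-c
```

Once the printed command looks right, copy it into your terminal to queue the reboot.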
The reconfiguration and reboot process can take up to an hour:
- During the reconfiguration and reboot, the Node PhaseState condition moves to production-reconfigure-powercycle-test, and the Node becomes unavailable.
- When the reconfiguration is complete, the Node PhaseState condition returns to production and the Node is uncordoned.
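One way to see which Nodes are still cordoned is to filter the output of kubectl get nodes, which marks cordoned Nodes with SchedulingDisabled in the STATUS column. This is a minimal sketch that assumes you have kubectl access to the cluster, which the prerequisites above do not list; the helper name cordoned_nodes is hypothetical.

```shell
# Read `kubectl get nodes` output on stdin and print the names of cordoned Nodes.
# Assumes the default column layout: NAME in column 1, STATUS in column 2.
cordoned_nodes() {
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}

# Usage (requires kubectl access, which is an assumption here):
#   kubectl get nodes | cordoned_nodes
```

An empty result means that no Nodes are currently cordoned.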
When rebooting Nodes, limit the number of Nodes that are rebooted at one time to avoid service interruptions. If you need to reboot 50 or more Nodes at a time, please contact support for assistance.
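To keep the number of Nodes rebooted at one time small, you can split a longer list into batches and queue one batch at a time. The sketch below only prints one command per batch; the helper name print_reboot_batches, the batch size of 2, and the Node names are illustrative assumptions, not recommended values.

```shell
# Print one reconfigure reboot command per batch of Nodes.
# Usage: print_reboot_batches BATCH_SIZE NODE...
print_reboot_batches() {
  batch_size=$1
  shift
  batch=""
  count=0
  for node in "$@"; do
    batch="$batch $node"
    count=$((count + 1))
    # Emit a command as soon as the current batch is full.
    if [ "$count" -eq "$batch_size" ]; then
      echo "cwic node reboot --reconfigure$batch"
      batch=""
      count=0
    fi
  done
  # Emit the final, partially filled batch, if any.
  if [ -n "$batch" ]; then
    echo "cwic node reboot --reconfigure$batch"
  fi
}

# Hypothetical Node names, in batches of 2.
print_reboot_batches 2 node-a node-b node-c node-d node-e
```

Run each printed command separately, and wait for the previous batch to return to service before you queue the next one.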
Optional flags
There are two optional cwic flags you can use to shorten the total reboot time: --no-test, which skips post-reboot validation, and --force, which queues the reboot to begin immediately.
Skip post-reboot validation
CoreWeave performs a post-reboot validation of a Node after any reboot to ensure it is operating within acceptable tolerances. This test usually accounts for about 30 minutes of the overall reboot time. To expedite reboots, you can skip the validation by using the --no-test flag. When you use this flag, the Node PhaseState is set to production-reconfigure-powerreset.
Queue reboot to start immediately
You can queue a reconfigure reboot to begin immediately by adding the --force flag, but this flag does not take the active state of the Nodes into account. Before using this flag, please ensure the Nodes are idle or that any running workloads can be safely interrupted.
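The sketch below prints what the command might look like with each optional flag. The position of the flag (after --reconfigure and before the Node list) is an assumption, as are the helper name reboot_cmd_with_flag and the Node names; check the cwic help output for the exact syntax before running either command.

```shell
# Print (do not run) a reconfigure reboot command that includes one optional flag.
# Usage: reboot_cmd_with_flag FLAG NODE...
# Flag placement is an assumption; the Node names are hypothetical.
reboot_cmd_with_flag() {
  flag=$1
  shift
  echo "cwic node reboot --reconfigure $flag $*"
}

# Skip post-reboot validation.
reboot_cmd_with_flag --no-test node-a node-b

# Start immediately; only use this on Nodes whose workloads can be interrupted.
reboot_cmd_with_flag --force node-a node-b
```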