Design Nomad Jobs for Resiliency

Learn how Nomad attempts to keep jobs running through unexpected failures by restarting them locally and rescheduling them to other nodes.

4 tutorials

1min
Failure recovery strategies
Customize Nomad's application failure handling strategies (local restarts, check restarts, and rescheduling of failed workloads) for your jobs.
- Nomad
3min
Define restart behavior in your jobs
Configure how Nomad restarts failed tasks locally in your Nomad job specification.
- Nomad
2min
Restart a workload based on health checks
Configure your job so that Nomad restarts tasks with a failing health check, using the "check_restart" stanza.
- Nomad
2min
Define reschedule behaviors for a job
Learn how to control Nomad's rescheduling behaviors so that jobs that fail to start and exhaust their local restarts can be rescheduled onto another node.
- Nomad