»Autopilot Operator HTTP API

The /operator/autopilot endpoints allow for automatic operator-friendly management of Nomad servers including cleanup of dead servers, monitoring the state of the Raft cluster, and stable server introduction.

»Read Autopilot Configuration

This endpoint retrieves its latest Autopilot configuration.

MethodPathProduces
GET/v1/operator/autopilot/configurationapplication/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking QueriesACL Required
NOoperator:read

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/autopilot/configuration
$ curl \    https://localhost:4646/v1/operator/autopilot/configuration

»Sample Response

{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "EnableRedundancyZones": false,
  "DisableUpgradeMigration": false,
  "EnableCustomUpgrades": false,
  "CreateIndex": 4,
  "ModifyIndex": 4
}
{  "CleanupDeadServers": true,  "LastContactThreshold": "200ms",  "MaxTrailingLogs": 250,  "ServerStabilizationTime": "10s",  "EnableRedundancyZones": false,  "DisableUpgradeMigration": false,  "EnableCustomUpgrades": false,  "CreateIndex": 4,  "ModifyIndex": 4}

For more information about the Autopilot configuration options, see the agent configuration section.

»Update Autopilot Configuration

This endpoint updates the Autopilot configuration of the cluster.

MethodPathProduces
PUT/v1/operator/autopilot/configurationapplication/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking QueriesACL Required
NOoperator:write

»Parameters

  • cas (int: 0) - Specifies to use a Check-And-Set operation. The update will only happen if the given index matches the ModifyIndex of the configuration at the time of writing.

»Sample Payload

{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "EnableRedundancyZones": false,
  "DisableUpgradeMigration": false,
  "EnableCustomUpgrades": false,
  "CreateIndex": 4,
  "ModifyIndex": 4
}
{  "CleanupDeadServers": true,  "LastContactThreshold": "200ms",  "MaxTrailingLogs": 250,  "ServerStabilizationTime": "10s",  "EnableRedundancyZones": false,  "DisableUpgradeMigration": false,  "EnableCustomUpgrades": false,  "CreateIndex": 4,  "ModifyIndex": 4}
  • CleanupDeadServers (bool: true) - Specifies automatic removal of dead server nodes periodically and whenever a new server is added to the cluster.

  • LastContactThreshold (string: "200ms") - Specifies the maximum amount of time a server can go without contact from the leader before being considered unhealthy. Must be a duration value such as 10s.

  • MaxTrailingLogs (int: 250) specifies the maximum number of log entries that a server can trail the leader by before being considered unhealthy.

  • ServerStabilizationTime (string: "10s") - Specifies the minimum amount of time a server must be stable in the 'healthy' state before being added to the cluster. Only takes effect if all servers are running Raft protocol version 3 or higher. Must be a duration value such as 30s.

  • EnableRedundancyZones (bool: false) - (Enterprise-only) Specifies whether to enable redundancy zones.

  • DisableUpgradeMigration (bool: false) - (Enterprise-only) Disables Autopilot's upgrade migration strategy in Nomad Enterprise of waiting until enough newer-versioned servers have been added to the cluster before promoting any of them to voters.

  • EnableCustomUpgrades (bool: false) - (Enterprise-only) Specifies whether to enable using custom upgrade versions when performing migrations.

»Read Health

This endpoint queries the health of the autopilot status.

MethodPathProduces
GET/v1/operator/autopilot/healthapplication/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking QueriesACL Required
NOoperator:read

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/autopilot/health
$ curl \    https://localhost:4646/v1/operator/autopilot/health

»Sample response

{
  "Healthy": true,
  "FailureTolerance": 0,
  "Servers": [
    {
      "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
      "Name": "node1",
      "Address": "127.0.0.1:8300",
      "SerfStatus": "alive",
      "Version": "0.8.0",
      "Leader": true,
      "LastContact": "0s",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": true,
      "StableSince": "2017-03-06T22:07:51Z"
    },
    {
      "ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
      "Name": "node2",
      "Address": "127.0.0.1:8205",
      "SerfStatus": "alive",
      "Version": "0.8.0",
      "Leader": false,
      "LastContact": "27.291304ms",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": false,
      "StableSince": "2017-03-06T22:18:26Z"
    }
  ]
}
{  "Healthy": true,  "FailureTolerance": 0,  "Servers": [    {      "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",      "Name": "node1",      "Address": "127.0.0.1:8300",      "SerfStatus": "alive",      "Version": "0.8.0",      "Leader": true,      "LastContact": "0s",      "LastTerm": 2,      "LastIndex": 46,      "Healthy": true,      "Voter": true,      "StableSince": "2017-03-06T22:07:51Z"    },    {      "ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",      "Name": "node2",      "Address": "127.0.0.1:8205",      "SerfStatus": "alive",      "Version": "0.8.0",      "Leader": false,      "LastContact": "27.291304ms",      "LastTerm": 2,      "LastIndex": 46,      "Healthy": true,      "Voter": false,      "StableSince": "2017-03-06T22:18:26Z"    }  ]}
  • Healthy is whether all the servers are currently healthy.

  • FailureTolerance is the number of redundant healthy servers that could be fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).

  • Servers holds detailed health information on each server:

    • ID is the Raft ID of the server.

    • Name is the node name of the server.

    • Address is the address of the server.

    • SerfStatus is the SerfHealth check status for the server.

    • Version is the Nomad version of the server.

    • Leader is whether this server is currently the leader.

    • LastContact is the time elapsed since this server's last contact with the leader.

    • LastTerm is the server's last known Raft leader term.

    • LastIndex is the index of the server's last committed Raft log entry.

    • Healthy is whether the server is healthy according to the current Autopilot configuration.

    • Voter is whether the server is a voting member of the Raft cluster.

    • StableSince is the time this server has been in its current Healthy state.

    The HTTP status code will indicate the health of the cluster. If Healthy is true, then a status of 200 will be returned. If Healthy is false, then a status of 429 will be returned.