»/v1/operator

The /operator endpoint provides cluster-level tools for Nomad operators, such as interacting with the Raft subsystem.

See the Outage Recovery guide for some examples of how these capabilities are used. For a CLI to perform these operations manually, please see the documentation for the nomad operator command.

»Read Raft Configuration

This endpoint queries the current Raft peer configuration of the cluster.

Method | Path | Produces
GET | /v1/operator/raft/configuration | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | management

»Parameters

  • stale - Specifies if the cluster should respond without an active leader. This is specified as a query string parameter.

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/raft/configuration

»Sample Response

{
  "Index": 1,
  "Servers": [
    {
      "Address": "127.0.0.1:4647",
      "ID": "127.0.0.1:4647",
      "Leader": true,
      "Node": "bacon-mac.global",
      "RaftProtocol": 2,
      "Voter": true
    }
  ]
}

»Field Reference

  • Index (int) - The Index value is the Raft commit index corresponding to this configuration. The latest configuration may not yet be committed if changes are in flight.

  • Servers (array: Server) - The returned Servers array has information about the servers in the Raft peer configuration.

    • ID (string) - The ID of the server. This is the same as the Address but may be upgraded to a GUID in a future version of Nomad.

    • Node (string) - The node name of the server, as known to Nomad, or "(unknown)" if the node is stale and not known.

    • Address (string) - The ip:port for the server.

    • Leader (bool) - true if the server is the current leader in the Raft configuration, false otherwise.

    • Voter (bool) - true if the server has a vote in the Raft configuration, false otherwise. Future versions of Nomad may add support for non-voting servers.
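
As an illustration of the fields above, the following Python sketch picks the leader and the voting servers out of a decoded response. The helper name `summarize_raft_configuration` is hypothetical and not part of any Nomad client library; it relies only on the JSON shape shown in the sample response.

```python
import json


def summarize_raft_configuration(config):
    """Return (leader_node, voter_nodes) from a decoded Raft configuration.

    `config` is the dict decoded from the JSON body shown above.
    leader_node is None if no server is marked as leader.
    """
    servers = config.get("Servers", [])
    # The first server flagged as Leader, if any.
    leader = next((s["Node"] for s in servers if s.get("Leader")), None)
    # Every server that currently holds a vote.
    voters = [s["Node"] for s in servers if s.get("Voter")]
    return leader, voters


# The sample response from this section, abbreviated onto fewer lines.
sample = json.loads("""
{
  "Index": 1,
  "Servers": [
    {"Address": "127.0.0.1:4647", "ID": "127.0.0.1:4647", "Leader": true,
     "Node": "bacon-mac.global", "RaftProtocol": 2, "Voter": true}
  ]
}
""")

leader, voters = summarize_raft_configuration(sample)
print(leader, voters)
```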

»Remove Raft Peer

This endpoint removes the Nomad server with the given address or ID from the Raft configuration. The return code signifies success or failure.

Method | Path | Produces
DELETE | /v1/operator/raft/peer | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | management

»Parameters

  • address (string: <optional>) - Specifies the server to remove as ip:port. This cannot be provided along with the id parameter.

  • id (string: <optional>) - Specifies the server to remove as id. This cannot be provided along with the address parameter.

»Sample Request

$ curl \
    --request DELETE \
    "https://localhost:4646/v1/operator/raft/peer?address=1.2.3.4:4647"
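
Because address and id cannot be combined, a client may want to enforce that rule before issuing the request. The Python sketch below builds the request URL and rejects invalid combinations. The function name `raft_peer_url` is hypothetical, and the sketch assumes that exactly one of the two parameters must be supplied.

```python
from urllib.parse import urlencode


def raft_peer_url(base_url, address=None, server_id=None):
    """Build the URL for DELETE /v1/operator/raft/peer.

    Exactly one of `address` (ip:port) or `server_id` must be given,
    mirroring the rule that the two parameters cannot be combined.
    Requiring one of them is an assumption of this sketch.
    """
    if (address is None) == (server_id is None):
        raise ValueError("provide exactly one of address or server_id")
    params = {"address": address} if address is not None else {"id": server_id}
    # urlencode percent-encodes reserved characters such as ":".
    return f"{base_url.rstrip('/')}/v1/operator/raft/peer?{urlencode(params)}"


print(raft_peer_url("https://localhost:4646", address="1.2.3.4:4647"))
```

The resulting URL can be passed to any HTTP client together with the DELETE method.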

»Read Autopilot Configuration

This endpoint retrieves the latest Autopilot configuration.

Method | Path | Produces
GET | /v1/operator/autopilot/configuration | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:read

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/autopilot/configuration

»Sample Response

{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "EnableRedundancyZones": false,
  "DisableUpgradeMigration": false,
  "EnableCustomUpgrades": false,
  "CreateIndex": 4,
  "ModifyIndex": 4
}

For more information about the Autopilot configuration options, see the agent configuration section.

»Update Autopilot Configuration

This endpoint updates the Autopilot configuration of the cluster.

Method | Path | Produces
PUT | /v1/operator/autopilot/configuration | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:write

»Parameters

  • cas (int: 0) - Specifies to use a Check-And-Set operation. The update will only happen if the given index matches the ModifyIndex of the configuration at the time of writing.

»Sample Payload

{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "EnableRedundancyZones": false,
  "DisableUpgradeMigration": false,
  "EnableCustomUpgrades": false,
  "CreateIndex": 4,
  "ModifyIndex": 4
}

  • CleanupDeadServers (bool: true) - Specifies automatic removal of dead server nodes periodically and whenever a new server is added to the cluster.

  • LastContactThreshold (string: "200ms") - Specifies the maximum amount of time a server can go without contact from the leader before being considered unhealthy. Must be a duration value such as 10s.

  • MaxTrailingLogs (int: 250) - Specifies the maximum number of log entries that a server can trail the leader by before being considered unhealthy.

  • ServerStabilizationTime (string: "10s") - Specifies the minimum amount of time a server must be stable in the 'healthy' state before being added to the cluster. Only takes effect if all servers are running Raft protocol version 3 or higher. Must be a duration value such as 30s.

  • EnableRedundancyZones (bool: false) - (Enterprise-only) Specifies whether to enable redundancy zones.

  • DisableUpgradeMigration (bool: false) - (Enterprise-only) Disables Autopilot's upgrade migration strategy in Nomad Enterprise, which waits until enough newer-versioned servers have been added to the cluster before promoting any of them to voters.

  • EnableCustomUpgrades (bool: false) - (Enterprise-only) Specifies whether to enable using custom upgrade versions when performing migrations.
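
A Check-And-Set update can be sketched in Python as follows. The example only prepares the request and does not send it. The function name `build_autopilot_update` is hypothetical, and the base URL is the one used in the sample requests of this page. The `cas_index` argument is expected to be the ModifyIndex obtained from a prior read of the configuration.

```python
import json
import urllib.request


def build_autopilot_update(base_url, config, cas_index=None):
    """Prepare (but do not send) a PUT to the Autopilot configuration endpoint.

    When `cas_index` is given it is passed as the `cas` query parameter, so
    the update only happens if it matches the current ModifyIndex.
    """
    url = f"{base_url.rstrip('/')}/v1/operator/autopilot/configuration"
    if cas_index is not None:
        url += f"?cas={cas_index}"
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_autopilot_update(
    "https://localhost:4646",
    {
        "CleanupDeadServers": True,
        "LastContactThreshold": "200ms",
        "MaxTrailingLogs": 250,
        "ServerStabilizationTime": "10s",
    },
    cas_index=4,
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. A cluster with ACLs enabled would additionally need a token with the operator:write capability attached to the request.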

»Read Health

This endpoint queries the health of the servers in the cluster as evaluated by Autopilot.

Method | Path | Produces
GET | /v1/operator/autopilot/health | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:read

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/autopilot/health

»Sample Response

{
  "Healthy": true,
  "FailureTolerance": 0,
  "Servers": [
    {
      "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
      "Name": "node1",
      "Address": "127.0.0.1:8300",
      "SerfStatus": "alive",
      "Version": "0.8.0",
      "Leader": true,
      "LastContact": "0s",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": true,
      "StableSince": "2017-03-06T22:07:51Z"
    },
    {
      "ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
      "Name": "node2",
      "Address": "127.0.0.1:8205",
      "SerfStatus": "alive",
      "Version": "0.8.0",
      "Leader": false,
      "LastContact": "27.291304ms",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": false,
      "StableSince": "2017-03-06T22:18:26Z"
    }
  ]
}

  • Healthy is whether all the servers are currently healthy.

  • FailureTolerance is the number of redundant healthy servers that could fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).

  • Servers holds detailed health information on each server:

    • ID is the Raft ID of the server.

    • Name is the node name of the server.

    • Address is the address of the server.

    • SerfStatus is the SerfHealth check status for the server.

    • Version is the Nomad version of the server.

    • Leader is whether this server is currently the leader.

    • LastContact is the time elapsed since this server's last contact with the leader.

    • LastTerm is the server's last known Raft leader term.

    • LastIndex is the index of the server's last committed Raft log entry.

    • Healthy is whether the server is healthy according to the current Autopilot configuration.

    • Voter is whether the server is a voting member of the Raft cluster.

    • StableSince is the time at which this server entered its current Healthy state.

    The HTTP status code will indicate the health of the cluster. If Healthy is true, then a status of 200 will be returned. If Healthy is false, then a status of 429 will be returned.
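
The status code convention and the FailureTolerance field can be interpreted with a small Python sketch like the one below. Both function names are hypothetical. The formula in `estimate_failure_tolerance` (healthy voters minus the Raft quorum size) is an assumption that reproduces the examples on this page; it is not taken from the Nomad source.

```python
def interpret_autopilot_health(status_code, body):
    """Summarize an Autopilot health response.

    A status of 200 means Healthy is true and 429 means Healthy is false,
    as described above; any other code is treated as an error here.
    """
    if status_code not in (200, 429):
        raise ValueError(f"unexpected status code: {status_code}")
    unhealthy = [s["Name"] for s in body.get("Servers", []) if not s.get("Healthy")]
    return {
        "healthy": status_code == 200,
        "failure_tolerance": body.get("FailureTolerance", 0),
        "unhealthy_servers": unhealthy,
    }


def estimate_failure_tolerance(servers):
    """Estimate how many healthy voters can fail before quorum is lost.

    Assumption: tolerance = healthy voters - (voters // 2 + 1), floored at 0.
    """
    voters = [s for s in servers if s.get("Voter")]
    healthy_voters = [s for s in voters if s.get("Healthy")]
    quorum = len(voters) // 2 + 1
    return max(0, len(healthy_voters) - quorum)


# Five healthy voters, as in the example given for FailureTolerance.
five_servers = [{"Name": f"node{i}", "Voter": True, "Healthy": True} for i in range(5)]
print(estimate_failure_tolerance(five_servers))
```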

»Read Scheduler Configuration

This endpoint retrieves the latest Scheduler configuration. This API was introduced in Nomad 0.9 and currently supports enabling/disabling preemption. More options may be added in the future.

Method | Path | Produces
GET | /v1/operator/scheduler/configuration | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:read

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/scheduler/configuration

»Sample Response

{
  "Index": 5,
  "KnownLeader": true,
  "LastContact": 0,
  "SchedulerConfig": {
    "CreateIndex": 5,
    "ModifyIndex": 5,
    "SchedulerAlgorithm": "spread",
    "PreemptionConfig": {
      "SystemSchedulerEnabled": true,
      "BatchSchedulerEnabled": false,
      "ServiceSchedulerEnabled": false
    }
  }
}

»Field Reference

  • Index (int) - The Index value is the Raft commit index corresponding to this configuration.

  • SchedulerConfig (SchedulerConfig) - The returned SchedulerConfig object has configuration settings mentioned below.

    • SchedulerAlgorithm (string: "binpack") - Specifies whether the scheduler binpacks or spreads allocations on available nodes.

    • PreemptionConfig (PreemptionConfig) - Options to enable preemption for various schedulers.

      • SystemSchedulerEnabled (bool: true) - Specifies whether preemption for system jobs is enabled. Note that this defaults to true.

      • BatchSchedulerEnabled (bool: false) - Specifies whether preemption for batch jobs is enabled. Note that this defaults to false and must be explicitly enabled.

      • ServiceSchedulerEnabled (bool: false) - Specifies whether preemption for service jobs is enabled. Note that this defaults to false and must be explicitly enabled.

    • CreateIndex - The Raft index at which the config was created.

    • ModifyIndex - The Raft index at which the config was modified.
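
To illustrate how the nested fields fit together, the Python sketch below lists the job types for which preemption is enabled in a decoded response. The function name `preemption_enabled_for` is hypothetical; the field names are the ones documented above.

```python
def preemption_enabled_for(response):
    """Return the sorted job types whose scheduler has preemption enabled."""
    preemption = response["SchedulerConfig"]["PreemptionConfig"]
    # Maps a job type to the PreemptionConfig field that controls it.
    flags = {
        "system": "SystemSchedulerEnabled",
        "batch": "BatchSchedulerEnabled",
        "service": "ServiceSchedulerEnabled",
    }
    return sorted(job for job, key in flags.items() if preemption.get(key, False))


# Decoded form of the sample response from this section.
sample_response = {
    "Index": 5,
    "KnownLeader": True,
    "LastContact": 0,
    "SchedulerConfig": {
        "CreateIndex": 5,
        "ModifyIndex": 5,
        "SchedulerAlgorithm": "spread",
        "PreemptionConfig": {
            "SystemSchedulerEnabled": True,
            "BatchSchedulerEnabled": False,
            "ServiceSchedulerEnabled": False,
        },
    },
}

print(preemption_enabled_for(sample_response))
```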

»Update Scheduler Configuration

This endpoint updates the scheduler configuration of the cluster.

Method | Path | Produces
PUT, POST | /v1/operator/scheduler/configuration | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:write

»Bootstrap Configuration Element

The default_scheduler_config attribute of the server stanza will provide a starting value for this configuration. Once bootstrapped, the value in the server state is authoritative.

»Parameters

  • cas (int: 0) - Specifies to use a Check-And-Set operation. The update will only happen if the given index matches the ModifyIndex of the configuration at the time of writing.

»Sample Payload

{
  "SchedulerAlgorithm": "spread",
  "PreemptionConfig": {
    "SystemSchedulerEnabled": true,
    "BatchSchedulerEnabled": false,
    "ServiceSchedulerEnabled": true
  }
}

  • SchedulerAlgorithm (string: "binpack") - Specifies whether the scheduler binpacks or spreads allocations on available nodes. Possible values are "binpack" and "spread".

  • PreemptionConfig (PreemptionConfig) - Options to enable preemption for various schedulers.

    • SystemSchedulerEnabled (bool: true) - Specifies whether preemption for system jobs is enabled. Note that if this is set to true, then system jobs can preempt any other jobs.

    • BatchSchedulerEnabled (bool: false) - Specifies whether preemption for batch jobs is enabled. Note that if this is set to true, then batch jobs can preempt any other jobs.

    • ServiceSchedulerEnabled (bool: false) - Specifies whether preemption for service jobs is enabled. Note that if this is set to true, then service jobs can preempt any other jobs.

»Sample Response

{
  "Updated": false,
  "Index": 15
}

  • Updated - Indicates that the configuration was updated when a cas value is provided. For non-CAS requests, this field will be false even though the update is applied.

  • Index - Current Raft index when the request was received.
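
Because Updated is only meaningful for Check-And-Set requests, a client has to remember whether it sent a cas value when reading the response. The Python sketch below captures that rule. The function name `update_was_applied` is hypothetical, and it assumes the HTTP request itself returned a success status.

```python
def update_was_applied(response, used_cas):
    """Decide whether a scheduler configuration update took effect.

    For CAS requests the Updated field is authoritative. For non-CAS
    requests Updated is false even though the update is applied, so a
    successful response is taken to mean the update was applied.
    """
    if used_cas:
        return bool(response["Updated"])
    return True


# Decoded form of the sample response from this section.
sample_response = {"Updated": False, "Index": 15}

print(update_was_applied(sample_response, used_cas=False))
print(update_was_applied(sample_response, used_cas=True))
```

With a cas value, a false result means the given index did not match the ModifyIndex at the time of writing; a common reaction is to read the configuration again and retry with the new ModifyIndex.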

»Get Nomad Enterprise License Info

This endpoint gets information about the current license.

Method | Path | Produces
GET | /v1/operator/license | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:read

»Sample Request

$ curl \
    https://localhost:4646/v1/operator/license

»Sample Response

{
  "KnownLeader": false,
  "LastContact": 0,
  "LastIndex": 0,
  "License": {
    "CustomerID": "temporary license customer",
    "ExpirationTime": "2020-06-01T14:50:16.581304556-04:00",
    "Features": [
      "Automated Upgrades",
      "Enhanced Read Scalability",
      "Redundancy Zones",
      "Namespaces",
      "Resource Quotas",
      "Preemption",
      "Audit Logging",
      "Sentinel Policies"
    ],
    "Flags": {
      "modules": [
        "governance-policy"
      ]
    },
    "InstallationID": "*",
    "IssueTime": "2020-06-01T08:50:16.581304556-04:00",
    "LicenseID": "temporary-license",
    "Modules": [
      "governance-policy"
    ],
    "Product": "nomad",
    "StartTime": "2020-06-01T08:50:16.581304556-04:00",
    "TerminationTime": "2020-06-01T14:50:16.581304556-04:00"
  },
  "RequestTime": 0
}
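
The timestamps in the License object carry nanosecond precision, which some date parsers do not accept directly. The Python sketch below trims them to microseconds and checks whether a license has expired. The function names are hypothetical, and the sketch assumes timestamps formatted like the ones in the sample response (fractional seconds followed by a numeric UTC offset).

```python
import re
from datetime import datetime, timezone


def parse_license_time(value):
    """Parse a license timestamp, trimming nanoseconds to microseconds."""
    # Keep the first six fractional digits and drop any that follow.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", value)
    return datetime.fromisoformat(trimmed)


def license_is_expired(license_info, now):
    """True if `now` (a timezone-aware datetime) is at or past ExpirationTime."""
    return now >= parse_license_time(license_info["ExpirationTime"])


# Subset of the License object from the sample response.
license_info = {
    "IssueTime": "2020-06-01T08:50:16.581304556-04:00",
    "ExpirationTime": "2020-06-01T14:50:16.581304556-04:00",
}

lifetime = parse_license_time(license_info["ExpirationTime"]) - parse_license_time(
    license_info["IssueTime"]
)
print(lifetime)
print(license_is_expired(license_info, datetime.now(timezone.utc)))
```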

»Update Nomad Enterprise License

This endpoint updates the Nomad license.

Method | Path | Produces
PUT | /v1/operator/license | application/json

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | operator:write

»Sample Payload

The payload is the raw license blob.

»Sample Request

$ curl \
    --request PUT \
    --data @nomad.license \
    https://localhost:4646/v1/operator/license

»Sample Response

{
  "Index": 15
}

»Generate Snapshot

This endpoint generates and returns an atomic, point-in-time snapshot of the Nomad server state for disaster recovery. Snapshots include all state managed by Nomad's Raft consensus protocol.

Snapshots are exposed as gzipped tar archives which internally contain the Raft metadata required to restore, as well as a binary serialized version of the Nomad server state. The contents are covered internally by SHA-256 hashes. These hashes are verified during snapshot restore operations. The structure of the archive is internal to Nomad and not intended to be used other than for restore operations. The archives are not designed to be modified before a restore.

Method | Path | Produces
GET | /v1/operator/snapshot | 200 application/x-gzip

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | management

»Parameters

  • stale - Specifies if the cluster should respond without an active leader. This is specified as a query string parameter.

»Sample Request

$ curl \
    -o snapshot.tgz \
    http://127.0.0.1:4646/v1/operator/snapshot

The above example results in a tarball named snapshot.tgz in the current working directory.
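
Before storing a downloaded snapshot as a backup, it can be useful to confirm that the file is at least a readable gzipped tar archive. The Python sketch below does that without modifying the file. The function name `is_readable_snapshot_archive` is hypothetical. The check does not verify the internal SHA-256 hashes, which Nomad verifies during a restore, and it does not interpret the archive contents, whose structure is internal to Nomad.

```python
import tarfile


def is_readable_snapshot_archive(path):
    """Return True if `path` opens and reads as a gzipped tar archive."""
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            # Walking the member list forces the whole archive to be read.
            archive.getmembers()
        return True
    except (tarfile.TarError, OSError, EOFError):
        # Covers missing files, non-gzip data, and truncated archives.
        return False
```

For the sample request above, the call would be `is_readable_snapshot_archive("snapshot.tgz")`.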

»Restore Snapshot

This endpoint restores a point-in-time snapshot of the Nomad server state.

Restores involve a potentially dangerous low-level Raft operation that is not designed to handle server failures during a restore. This operation is primarily intended to be used when recovering from a disaster, restoring into a fresh cluster of Nomad servers.

The body of the request should be a snapshot archive returned from a previous call to the GET method.

Method | Path | Produces
PUT | /v1/operator/snapshot | 200 text/plain (empty body)

The table below shows this endpoint's support for blocking queries and required ACLs.

Blocking Queries | ACL Required
NO | management

»Sample Request

$ curl \
    --request PUT \
    --data-binary @snapshot.tgz \
    http://127.0.0.1:4646/v1/operator/snapshot