Playbooks

Automate multi-step operations with YAML playbooks for repeatable infrastructure management.

Playbooks let you define multi-step operations as YAML files and execute them across your fleet. Instead of running individual commands one at a time, a playbook chains together command execution, file writes, and file reads into a single, repeatable workflow.

Why Playbooks?

  • Repeatable — Define once, run many times. Eliminate manual copy-paste of command sequences.
  • Version-controlled — Check playbooks into Git alongside your application code.
  • Safe — Preview with --dry-run, validate syntax before execution, and use conditional steps.
  • Multi-host — Run the same playbook across your entire fleet with concurrency control.

Basic Structure

A playbook is a YAML file with a name, an optional description, optional variables, and a list of steps:

yaml
name: health-check
description: "Verify service health on target hosts"
vars:
  service_name: nginx
 
steps:
  - name: check-service
    exec: "systemctl is-active {{.Vars.service_name}}"
 
  - name: check-disk
    exec: "df -h / --output=pcent | tail -1"

Step Types

The step type is determined by which action field is present: exec, fs_write, or fs_read.

An exec step runs a shell command on the remote host. It is the most common step type.

yaml
- name: restart-app
  exec: "sudo systemctl restart myapp"
  timeout_ms: 30000
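
The other action fields follow the same pattern. As a minimal sketch, an fs_write step takes a path and a content body, matching the form used in the complete example further down this page. The path and file content here are placeholders chosen for illustration, not recommended values.

```yaml
# Sketch of an fs_write step: writes a small text file on the remote host.
# The path and content are illustrative placeholders.
- name: write-motd
  fs_write:
    path: /etc/motd
    content: |
      Managed by playbook. Manual edits may be overwritten.
```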

Step Fields

Every step supports the following fields. All are optional except name and the action field:

Field | Type | Description
name | string | Unique identifier for the step (required)
exec / fs_write / fs_read | varies | The action to perform. Exactly one must be present.
os | string | Run only on hosts matching this OS: linux, macos, or windows
when | string | Conditional expression; step is skipped if it evaluates to false
register | string | Store the step output under this name for use in later steps
continue_on_error | boolean | If true, playbook continues even if this step fails (default: false)
timeout_ms | integer | Per-step timeout in milliseconds (overrides the default)
retry | object | Retry configuration with max_attempts and interval_ms
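
To show how several of these fields combine on one step, the sketch below probes a service on Linux hosts only, retries a few times, and records the result without aborting the playbook. The service name myapp, the step names, and the timings are assumptions made up for this example.

```yaml
steps:
  - name: probe-service
    os: linux                  # skipped on macOS and Windows hosts
    exec: "systemctl is-active myapp"
    register: probe            # later steps can read {{.Steps.probe.Output}}
    continue_on_error: true    # a failed probe does not stop the playbook
    timeout_ms: 10000          # 10 seconds for this step
    retry:
      max_attempts: 3
      interval_ms: 2000        # 2 seconds between attempts

  - name: report
    exec: "echo probe result: {{.Steps.probe.Output}}"
```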

OS Filtering

Run steps only on hosts matching a specific operating system:

yaml
steps:
  - name: install-linux
    os: linux
    exec: "sudo apt install -y myapp"
 
  - name: install-macos
    os: macos
    exec: "brew install myapp"
 
  - name: install-windows
    os: windows
    exec: "choco install -y myapp"

Conditional Steps

Use when to execute a step only if a condition is met. Conditions can reference registered output from previous steps:

yaml
steps:
  - name: check-version
    exec: "myapp --version"
    register: version_check
 
  - name: upgrade
    when: "{{.Steps.version_check.Output}} != 'v2.0.0'"
    exec: "sudo apt install -y myapp=2.0.0"
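
A variation on the same idea, as a hedged sketch: start a service only when a probe reports that it is not already active. It reuses the comparison form shown above. The service name myapp is hypothetical, and whether registered output is trimmed of its trailing newline before comparison is an assumption to verify against your version.

```yaml
steps:
  - name: read-state
    # "|| true" keeps the exit code at 0 so the step succeeds even when
    # systemctl reports a non-active state on stdout.
    exec: "systemctl is-active myapp || true"
    register: service_state

  - name: start-if-not-active
    # Assumes the registered output compares equal to 'active' when the
    # service is running (that is, no trailing newline is retained).
    when: "{{.Steps.service_state.Output}} != 'active'"
    exec: "sudo systemctl start myapp"
```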

Retry Configuration

Automatically retry a failing step a set number of times, waiting a fixed interval between attempts:

yaml
- name: wait-for-healthy
  exec: "curl -sf http://localhost:8080/health"
  retry:
    max_attempts: 5
    interval_ms: 3000

This retries up to 5 times with a 3-second interval between attempts.
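
One practical refinement, offered as a sketch rather than a required pattern: bound each attempt inside the command itself so a hung connection fails quickly and the next retry can begin. The example uses curl's own --max-time option. The 2-second cap is an arbitrary value for illustration.

```yaml
- name: wait-for-healthy
  # --max-time 2 makes curl give up after 2 seconds per call, so a stalled
  # request counts as a failed attempt instead of blocking the step.
  exec: "curl -sf --max-time 2 http://localhost:8080/health"
  retry:
    max_attempts: 5
    interval_ms: 3000
```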

Template Variables

Playbooks support Go template syntax for dynamic values. The following variables are available in exec commands, fs_write content, and when conditions:

Variable | Description
{{.Host.ID}} | The target host name
{{.Host.Address}} | The host address (VPN IP)
{{.Host.OS}} | The operating system of the target host (macos/linux/windows)
{{.Host.Tags}} | Host tag key-value pairs (map[string]string). Access individual tags with {{index .Host.Tags "key"}}.
{{.Vars.<name>}} | A user-defined variable from the vars block or --var flag
{{.Steps.<name>.Output}} | The stdout output of a previously registered step
{{.Steps.<name>.OK}} | Whether a registered step succeeded (bool)
{{.Steps.<name>.ExitCode}} | The exit code of a previously registered step
{{.Steps.<name>.Skipped}} | Whether a registered step was skipped (bool)
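
As a small sketch of these variables in use, the step below writes a per-host summary file. The tag key "role" and the output path are hypothetical; substitute a tag that exists on your hosts. Note that a YAML value beginning with {{ must be quoted (or placed in a block scalar, as here), because an unquoted leading brace is parsed as a YAML flow mapping.

```yaml
steps:
  - name: write-host-info
    fs_write:
      path: /tmp/host-info.txt   # placeholder path
      # Inside a block scalar (|) the template braces are plain text to YAML.
      content: |
        id: {{.Host.ID}}
        address: {{.Host.Address}}
        os: {{.Host.OS}}
        role: {{index .Host.Tags "role"}}
```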

CLI Commands

Run a Playbook

bash
nefia playbook run deploy.yaml --target group:production

Running playbook: deploy-app (3 steps)
Targets: 4 hosts (group:production)

Step 1/3: stop-service
  [web-01] OK (0.8s)
  [web-02] OK (0.9s)
  [web-03] OK (0.7s)
  [web-04] OK (0.8s)

Step 2/3: deploy-binary
  [web-01] OK (3.2s)
  [web-02] OK (3.4s)
  [web-03] OK (3.1s)
  [web-04] OK (3.3s)

Step 3/3: start-service
  [web-01] OK (1.1s)
  [web-02] OK (1.0s)
  [web-03] OK (1.2s)
  [web-04] OK (1.1s)

Playbook completed: 4/4 hosts succeeded

Pass Variables

Override or supply variables at runtime with --var:

bash
nefia playbook run deploy.yaml --target group:production --var version=2.1.0 --var env=staging
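
To make the override behavior concrete, here is a minimal sketch of a playbook whose vars block supplies a default that --var replaces at runtime. The file name show-env.yaml and the variable name env are hypothetical.

```yaml
# Run with:
#   nefia playbook run show-env.yaml --target group:production --var env=staging
name: show-env
description: "Print the environment name the playbook was run with"
vars:
  env: production   # default value, used when no --var env=... is passed

steps:
  - name: print-env
    # With the command above this is expected to echo "staging", since
    # --var overrides the value from the vars block.
    exec: "echo deploying to {{.Vars.env}}"
```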

Preview with Dry Run

See what a playbook would do without executing anything:

bash
nefia playbook run deploy.yaml --target group:production --dry-run

=== DRY RUN: deploy-app ===
Targets: 4 host(s)

[web-01] (os=linux)
  1. stop-service → exec: systemctl stop myapp
  2. deploy-binary → fs_write: /opt/myapp/bin/myapp
  3. start-service → exec: systemctl start myapp

[web-02] (os=linux)
  1. stop-service → exec: systemctl stop myapp
  2. deploy-binary → fs_write: /opt/myapp/bin/myapp
  3. start-service → exec: systemctl start myapp

Step Timeout

Override the default step timeout for all steps in the playbook:

bash
nefia playbook run deploy.yaml --target group:production --step-timeout 5m

This sets a default timeout of 5 minutes for every step. Individual steps can still override this with their own timeout_ms field.
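
The interaction between the flag and the per-step field can be sketched as follows. The file name maintenance.yaml and the myapp reindex command are hypothetical; the timings are arbitrary.

```yaml
# Run with:
#   nefia playbook run maintenance.yaml --target group:production --step-timeout 5m
name: maintenance
steps:
  - name: rotate-logs
    # No timeout_ms here, so the 5-minute default from --step-timeout applies.
    exec: "sudo logrotate -f /etc/logrotate.conf"

  - name: rebuild-index
    # timeout_ms takes precedence over the flag: 1200000 ms = 20 minutes.
    exec: "myapp reindex"
    timeout_ms: 1200000
```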

Validate Syntax

Check a playbook for syntax errors without running it:

bash
nefia playbook validate deploy.yaml

Playbook "deploy-app" is valid (3 steps)

List Playbooks

List the playbooks found in the search paths (./playbooks/ and the user config directory):

bash
nefia playbook list

Show Playbook Details

Display the parsed structure of a playbook:

bash
nefia playbook show deploy.yaml

Multi-Host Execution

Playbooks run on multiple hosts in parallel, up to the concurrency limit. Within each host, steps run sequentially: step 1 completes before step 2 begins. Set the limit with --concurrency:

bash
nefia playbook run deploy.yaml --target all --concurrency 3

Complete Example

Here is a full playbook that deploys an application with a health check:

yaml
name: deploy-app
description: "Deploy application binary and restart the service"
vars:
  version: "2.0.0"
  health_url: "http://localhost:8080/health"
 
steps:
  - name: check-current-version
    exec: "myapp --version || echo 'not installed'"
    register: current_version
    continue_on_error: true
 
  - name: stop-service
    os: linux
    exec: "sudo systemctl stop myapp"
    continue_on_error: true
    timeout_ms: 15000
 
  - name: stop-service-windows
    os: windows
    exec: "Stop-Service -Name myapp -Force"
    continue_on_error: true
    timeout_ms: 15000
 
  - name: symlink-binary
    os: linux
    exec: "ln -sf /opt/myapp/bin/myapp-{{.Vars.version}} /opt/myapp/bin/myapp"
 
  - name: symlink-binary-macos
    os: macos
    exec: "ln -sf /opt/myapp/bin/myapp-{{.Vars.version}} /opt/myapp/bin/myapp"
 
  - name: copy-binary-windows
    os: windows
    exec: "Copy-Item -Path C:\\myapp\\bin\\myapp-{{.Vars.version}}.exe -Destination C:\\myapp\\bin\\myapp.exe -Force"
 
  - name: write-config
    fs_write:
      path: /etc/myapp/config.yaml
      content: |
        server:
          host: "{{.Host.ID}}"
          version: "{{.Vars.version}}"
 
  - name: start-service
    os: linux
    exec: "sudo systemctl start myapp"
    timeout_ms: 15000
 
  - name: start-service-windows
    os: windows
    exec: "Start-Service -Name myapp"
    timeout_ms: 15000
 
  - name: health-check
    exec: "curl -sf {{.Vars.health_url}}"
    retry:
      max_attempts: 10
      interval_ms: 2000

Run it:

bash
nefia playbook run deploy.yaml \
  --target group:production \
  --var version=2.1.0 \
  --concurrency 2

Related pages:

  • Remote Execution: Learn about target selectors, concurrency, and output formats used by playbooks.
  • File Operations: Detailed reference for file read, write, and sync operations.
  • CLI Reference: Complete reference for all playbook commands and flags.