Playbooks

Automate multi-step operations with YAML playbooks for repeatable infrastructure management.

Playbooks let you define multi-step operations as YAML files and execute them across your fleet. Instead of running individual commands one at a time, a playbook chains together command execution, file writes, and file reads into a single, repeatable workflow.

Why Playbooks?

Repeatable — Define once, run many times. Eliminate manual copy-paste of command sequences.
Version-controlled — Check playbooks into Git alongside your application code.
Safe — Preview with --dry-run, validate syntax before execution, and use conditional steps.
Multi-host — Run the same playbook across your entire fleet with concurrency control.

Basic Structure

A playbook is a YAML file with a name, optional variables, and a list of steps:

yaml

name: health-check
description: "Verify service health on target hosts"
vars:
  service_name: nginx
 
steps:
  - name: check-service
    exec: "systemctl is-active {{.Vars.service_name}}"
 
  - name: check-disk
    exec: "df -h / --output=pcent | tail -1"

Step Types

The step type is determined by which action field is present: exec, fs_write, or fs_read.

The exec action runs a shell command on the remote host. It is the most common step type.

yaml

- name: restart-app
  exec: "sudo systemctl restart myapp"
  timeout_ms: 30000

Step Fields

Every step requires a name and exactly one action field. All other fields are optional:

Field	Type	Description
`name`	string	Unique identifier for the step (required)
`exec` / `fs_write` / `fs_read`	varies	The action to perform. Exactly one must be present.
`os`	string	Run only on hosts matching this OS: `linux`, `macos`, or `windows`
`when`	string	Conditional expression; step is skipped if it evaluates to false
`register`	string	Store the step output under this name for use in later steps
`continue_on_error`	boolean	If true, playbook continues even if this step fails (default: false)
`timeout_ms`	integer	Per-step timeout in milliseconds (overrides the default)
`retry`	object	Retry configuration with `max_attempts` and `interval_ms`
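As a rough sketch of how several of these fields might combine in one playbook, the fragment below registers the output of a probe, acts on it conditionally, and marks a cleanup step as best-effort. The file paths, step names, and messages are hypothetical. The when expression copies the comparison shape shown under Conditional Steps; the exact expression grammar, including how trailing whitespace in the output is handled, is an assumption.

```yaml
steps:
  # Probe for a config file and keep the result for later steps.
  - name: probe-config
    os: linux
    exec: "test -f /etc/myapp/config.yaml && echo present || echo absent"
    register: config_probe
    timeout_ms: 5000          # fail fast if the host is unresponsive

  # Runs only when the probe did not print "present".
  - name: report-missing-config
    os: linux
    when: "{{.Steps.config_probe.Output}} != 'present'"
    exec: "echo 'config missing on {{.Host.ID}}'"

  # Best-effort cleanup: a failure here should not abort the playbook.
  - name: remove-stale-lock
    os: linux
    exec: "rm /tmp/myapp.lock"   # hypothetical lock file
    continue_on_error: true
```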

OS Filtering

Run steps only on hosts matching a specific operating system:

yaml

steps:
  - name: install-linux
    os: linux
    exec: "sudo apt install -y myapp"
 
  - name: install-macos
    os: macos
    exec: "brew install myapp"
 
  - name: install-windows
    os: windows
    exec: "choco install -y myapp"

Conditional Steps

Use when to execute a step only if a condition is met. Conditions can reference registered output from previous steps:

yaml

steps:
  - name: check-version
    exec: "myapp --version"
    register: version_check
 
  - name: upgrade
    when: "{{.Steps.version_check.Output}} != 'v2.0.0'"
    exec: "sudo apt install -y myapp=2.0.0"

Retry Configuration

Automatically retry failing steps with a configurable attempt count and interval:

yaml

- name: wait-for-healthy
  exec: "curl -sf http://localhost:8080/health"
  retry:
    max_attempts: 5
    interval_ms: 3000

This retries up to 5 times with a 3-second interval between attempts.
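To estimate how long a retried step may block a host, the sketch below works through the arithmetic for a different setting. It assumes that max_attempts counts the first attempt and that the interval applies only between attempts; neither detail is confirmed here, so treat the totals as approximate. The step name and the nc command are illustrative.

```yaml
- name: wait-for-port
  exec: "nc -z localhost 8080"
  retry:
    max_attempts: 4     # assumed: 1 initial attempt + up to 3 retries
    interval_ms: 5000   # 5 seconds between attempts
# Worst-case waiting between attempts, under the assumptions above:
#   (4 - 1) * 5000 ms = 15000 ms (15 seconds),
# plus the run time of each attempt itself.
```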

Template Variables

Playbooks support Go template syntax for dynamic values. The following variables are available in exec, content, and when fields:

Variable	Description
`{{.Host.ID}}`	The target host name
`{{.Host.Address}}`	The host address (VPN IP)
`{{.Host.OS}}`	The operating system of the target host (macos/linux/windows)
`{{.Host.Tags}}`	Host tag key-value pairs (`map[string]string`). Access individual tags with `{{index .Host.Tags "key"}}`.
`{{.Vars.<name>}}`	A user-defined variable from the `vars` block or `--var` flag
`{{.Steps.<name>.Output}}`	The stdout output of a previously registered step
`{{.Steps.<name>.OK}}`	Whether a registered step succeeded (bool)
`{{.Steps.<name>.ExitCode}}`	The exit code of a previously registered step
`{{.Steps.<name>.Skipped}}`	Whether a registered step was skipped (bool)
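A minimal sketch of a playbook that uses several of these variables is shown below. The playbook name, the greeting variable, and the "role" tag key are hypothetical. The second step uses a single-quoted YAML string so that the double quotes required by the template's index call need no escaping.

```yaml
name: host-info
description: "Print per-host template values"
vars:
  greeting: hello

steps:
  # Host fields and a user-defined variable are substituted before the command runs.
  - name: show-identity
    exec: "echo {{.Vars.greeting}} from {{.Host.ID}} at {{.Host.Address}} running {{.Host.OS}}"

  # "role" is a hypothetical tag key; substitute a tag your hosts actually carry.
  - name: show-role-tag
    exec: 'echo role={{index .Host.Tags "role"}}'
```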

CLI Commands

Run a Playbook

bash

nefia playbook run deploy.yaml --target group:production

Running playbook: deploy-app (3 steps)
Targets: 4 hosts (group:production)

Step 1/3: stop-service
  [web-01] OK (0.8s)
  [web-02] OK (0.9s)
  [web-03] OK (0.7s)
  [web-04] OK (0.8s)

Step 2/3: deploy-binary
  [web-01] OK (3.2s)
  [web-02] OK (3.4s)
  [web-03] OK (3.1s)
  [web-04] OK (3.3s)

Step 3/3: start-service
  [web-01] OK (1.1s)
  [web-02] OK (1.0s)
  [web-03] OK (1.2s)
  [web-04] OK (1.1s)

Playbook completed: 4/4 hosts succeeded

Pass Variables

Override or supply variables at runtime with --var:

bash

nefia playbook run deploy.yaml --target group:production --var version=2.1.0 --var env=staging
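To illustrate how the command above could interact with a vars block, here is a hedged sketch of a playbook fragment with defaults for both variables. The step name and the echo command are illustrative, and the rendered results in the comments follow from the override behavior described above rather than from captured output.

```yaml
name: deploy-app
vars:
  version: "2.0.0"    # default, used when --var version=... is not passed
  env: production     # default, used when --var env=... is not passed

steps:
  - name: show-release
    exec: "echo deploying {{.Vars.version}} to {{.Vars.env}}"

# With --var version=2.1.0 --var env=staging, the step should render as:
#   echo deploying 2.1.0 to staging
# With no --var flags, the defaults apply:
#   echo deploying 2.0.0 to production
```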

Preview with Dry Run

See what a playbook would do without executing anything:

bash

nefia playbook run deploy.yaml --target group:production --dry-run

=== DRY RUN: deploy-app ===
Targets: 4 host(s)

[web-01] (os=linux)

stop-service → exec: systemctl stop myapp
deploy-binary → fs_write: /opt/myapp/bin/myapp
start-service → exec: systemctl start myapp

[web-02] (os=linux)

stop-service → exec: systemctl stop myapp
deploy-binary → fs_write: /opt/myapp/bin/myapp
start-service → exec: systemctl start myapp

Step Timeout

Override the default step timeout for all steps in the playbook:

bash

nefia playbook run deploy.yaml --target group:production --step-timeout 5m

This sets a default timeout of 5 minutes for every step. Individual steps can still override this with their own timeout_ms field.
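As a sketch of that precedence, the fragment below leaves one step on the default and gives another its own limit. The migrate command and its path are hypothetical.

```yaml
steps:
  # No timeout_ms: this step uses the default,
  # which is 5 minutes when run with --step-timeout 5m.
  - name: stop-service
    exec: "sudo systemctl stop myapp"

  # timeout_ms applies to this step only and takes priority over the default.
  - name: run-migrations
    exec: "/opt/myapp/bin/migrate"   # hypothetical long-running command
    timeout_ms: 900000               # 15 minutes = 15 * 60 * 1000 ms
```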

Validate Syntax

Check a playbook for syntax errors without running it:

bash

nefia playbook validate deploy.yaml

Playbook "deploy-app" is valid (3 steps)

List Playbooks

List the playbooks found in the search paths (./playbooks/ and the user config directory):

bash

nefia playbook list

Show Playbook Details

Display the parsed structure of a playbook:

bash

nefia playbook show deploy.yaml

Multi-Host Execution

Playbooks run on multiple hosts in parallel, up to the concurrency limit. Within each host, steps run sequentially: step 1 completes before step 2 begins. Set the limit with --concurrency:

bash

nefia playbook run deploy.yaml --target all --concurrency 3

Complete Example

Here is a full playbook that deploys an application with a health check:

yaml

name: deploy-app
description: "Deploy application binary and restart the service"
vars:
  version: "2.0.0"
  health_url: "http://localhost:8080/health"
 
steps:
  - name: check-current-version
    exec: "myapp --version || echo 'not installed'"
    register: current_version
    continue_on_error: true
 
  - name: stop-service
    os: linux
    exec: "sudo systemctl stop myapp"
    continue_on_error: true
    timeout_ms: 15000
 
  - name: stop-service-windows
    os: windows
    exec: "Stop-Service -Name myapp -Force"
    continue_on_error: true
    timeout_ms: 15000
 
  - name: symlink-binary
    os: linux
    exec: "ln -sf /opt/myapp/bin/myapp-{{.Vars.version}} /opt/myapp/bin/myapp"
 
  - name: symlink-binary-macos
    os: macos
    exec: "ln -sf /opt/myapp/bin/myapp-{{.Vars.version}} /opt/myapp/bin/myapp"
 
  - name: copy-binary-windows
    os: windows
    exec: "Copy-Item -Path C:\\myapp\\bin\\myapp-{{.Vars.version}}.exe -Destination C:\\myapp\\bin\\myapp.exe -Force"
 
  - name: write-config
    fs_write:
      path: /etc/myapp/config.yaml
      content: |
        server:
          host: "{{.Host.ID}}"
          version: "{{.Vars.version}}"
 
  - name: start-service
    os: linux
    exec: "sudo systemctl start myapp"
    timeout_ms: 15000
 
  - name: start-service-windows
    os: windows
    exec: "Start-Service -Name myapp"
    timeout_ms: 15000
 
  - name: health-check
    exec: "curl -sf {{.Vars.health_url}}"
    retry:
      max_attempts: 10
      interval_ms: 2000

Run it:

bash

nefia playbook run deploy.yaml \
  --target group:production \
  --var version=2.1.0 \
  --concurrency 2

Remote Execution

Learn about target selectors, concurrency, and output formats used by playbooks.

File Operations

Detailed reference for file read, write, and sync operations.

CLI Reference

Complete reference for all playbook commands and flags.