Playbooks
Automate multi-step operations with YAML playbooks for repeatable infrastructure management.
Playbooks let you define multi-step operations as YAML files and execute them across your fleet. Instead of running individual commands one at a time, a playbook chains together command execution, file writes, and file reads into a single, repeatable workflow.
Why Playbooks?
- Repeatable: Define once, run many times. Eliminate manual copy-paste of command sequences.
- Version-controlled: Check playbooks into Git alongside your application code.
- Safe: Preview with --dry-run, validate syntax before execution, and use conditional steps.
- Multi-host: Run the same playbook across your entire fleet with concurrency control.
Basic Structure
A playbook is a YAML file with a name, optional variables, and a list of steps:

    name: health-check
    description: "Verify service health on target hosts"
    vars:
      service_name: nginx
    steps:
      - name: check-service
        exec: "systemctl is-active {{.Vars.service_name}}"
      - name: check-disk
        exec: "df -h / --output=pcent | tail -1"

Step Types
The step type is determined by which action field is present: exec, fs_write, or fs_read.
An exec step runs a shell command on the remote host and is the most common step type:

    - name: restart-app
      exec: "sudo systemctl restart myapp"
      timeout_ms: 30000

Step Fields
Every step requires a name and exactly one action field. All other fields are optional:
| Field | Type | Description |
|---|---|---|
| name | string | Unique identifier for the step (required) |
| exec / fs_write / fs_read | varies | The action to perform. Exactly one must be present. |
| os | string | Run only on hosts matching this OS: linux, macos, or windows |
| when | string | Conditional expression; step is skipped if it evaluates to false |
| register | string | Store the step output under this name for use in later steps |
| continue_on_error | boolean | If true, playbook continues even if this step fails (default: false) |
| timeout_ms | integer | Per-step timeout in milliseconds (overrides the default) |
| retry | object | Retry configuration with max_attempts and interval_ms |
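
The rules in the table above (a required, unique name and exactly one action field) can be checked mechanically. The following is a minimal Python sketch of such a check, not Nefia's own validator; the function name check_steps and the message wording are illustrative assumptions.

```python
# Illustrative sketch only: mirrors the field rules from the table above.
ACTIONS = ("exec", "fs_write", "fs_read")

def check_steps(steps):
    """Return a list of human-readable problems found in a list of step dicts."""
    problems = []
    seen = set()
    for i, step in enumerate(steps):
        name = step.get("name")
        label = name or f"step #{i + 1}"
        if not name:
            problems.append(f"{label}: missing name")
        elif name in seen:
            problems.append(f"{label}: duplicate name")
        else:
            seen.add(name)
        # Exactly one of exec / fs_write / fs_read must be present.
        present = [a for a in ACTIONS if a in step]
        if len(present) != 1:
            problems.append(
                f"{label}: expected exactly one action, found {len(present)}"
            )
    return problems

steps = [
    {"name": "restart-app", "exec": "sudo systemctl restart myapp", "timeout_ms": 30000},
    # Reuses a name and declares two actions, so it triggers both checks.
    {"name": "restart-app", "exec": "true",
     "fs_write": {"path": "/tmp/example", "content": "x"}},
]
for problem in check_steps(steps):
    print(problem)
```

For real playbooks, nefia playbook validate remains the authoritative check.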
OS Filtering
Run steps only on hosts matching a specific operating system:

    steps:
      - name: install-linux
        os: linux
        exec: "sudo apt install -y myapp"
      - name: install-macos
        os: macos
        exec: "brew install myapp"
      - name: install-windows
        os: windows
        exec: "choco install -y myapp"

Conditional Steps
Use when to execute a step only if a condition is met. Conditions can reference registered output from previous steps:

    steps:
      - name: check-version
        exec: "myapp --version"
        register: version_check
      - name: upgrade
        when: "{{.Steps.version_check.Output}} != 'v2.0.0'"
        exec: "sudo apt install -y myapp=2.0.0"

Retry Configuration
Automatically retry a failing step a set number of times, with a fixed interval between attempts:

    - name: wait-for-healthy
      exec: "curl -sf http://localhost:8080/health"
      retry:
        max_attempts: 5
        interval_ms: 3000

This retries up to 5 times with a 3-second interval between attempts.
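
To make the retry behavior concrete, here is a rough Python sketch of a fixed-interval retry loop. It assumes that max_attempts counts every attempt including the first, which this page does not state explicitly. The function run_with_retry and its injectable sleep parameter are hypothetical names used only for illustration.

```python
import time

def run_with_retry(action, max_attempts, interval_ms, sleep=time.sleep):
    """Call action() until it returns True or the attempts run out.

    Returns (ok, attempts_used). Assumption: max_attempts includes the
    first attempt; the real engine may count differently.
    """
    for attempt in range(1, max_attempts + 1):
        if action():
            return True, attempt
        if attempt < max_attempts:
            sleep(interval_ms / 1000)  # fixed wait, no backoff
    return False, max_attempts

# Stand-in for a health check that passes on the third call.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    return calls["n"] >= 3

waits = []  # record the waits instead of actually sleeping
result = run_with_retry(flaky, max_attempts=5, interval_ms=3000, sleep=waits.append)
print(result, waits)  # (True, 3) [3.0, 3.0]
```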
Template Variables
Playbooks support Go template syntax for dynamic values. The following variables are available in exec commands, fs_write content, and when conditions:
| Variable | Description |
|---|---|
| {{.Host.ID}} | The target host name |
| {{.Host.Address}} | The host address (VPN IP) |
| {{.Host.OS}} | The operating system of the target host (macos/linux/windows) |
| {{.Host.Tags}} | Host tag key-value pairs (map[string]string). Access individual tags with {{index .Host.Tags "key"}}. |
| {{.Vars.<name>}} | A user-defined variable from the vars block or --var flag |
| {{.Steps.<name>.Output}} | The stdout output of a previously registered step |
| {{.Steps.<name>.OK}} | Whether a registered step succeeded (bool) |
| {{.Steps.<name>.ExitCode}} | The exit code of a previously registered step |
| {{.Steps.<name>.Skipped}} | Whether a registered step was skipped (bool) |
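
The variables in the table form a nested context with Host, Vars, and Steps at the top level. The Python sketch below is a deliberately simplified stand-in for Go's template engine: it resolves only plain dotted references such as {{.Vars.version}} and does not support functions like index. The render helper, the sample host, and the assumption that --var values take precedence over the vars block are illustrative.

```python
import re

# Matches simple dotted references like {{.Vars.version}} only.
PLACEHOLDER = re.compile(r"\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}")

def render(text, context):
    """Replace each {{.A.B.C}} with context["A"]["B"]["C"]."""
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)
    return PLACEHOLDER.sub(lookup, text)

playbook_vars = {"service_name": "nginx", "version": "2.0.0"}  # vars block
cli_vars = {"version": "2.1.0"}  # as if passed with --var version=2.1.0

context = {
    "Host": {"ID": "web-01", "OS": "linux"},
    # Assumption: runtime --var values win over the vars block.
    "Vars": {**playbook_vars, **cli_vars},
    "Steps": {
        "version_check": {"Output": "v2.0.0", "OK": True, "ExitCode": 0, "Skipped": False},
    },
}

print(render("systemctl is-active {{.Vars.service_name}}", context))
print(render("deploy {{.Vars.version}} to {{.Host.ID}}", context))
print(render("previous: {{.Steps.version_check.Output}}", context))
```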
CLI Commands
Run a Playbook

    nefia playbook run deploy.yaml --target group:production

    Running playbook: deploy-app (3 steps)
    Targets: 4 hosts (group:production)

    Step 1/3: stop-service
      [web-01] OK (0.8s)
      [web-02] OK (0.9s)
      [web-03] OK (0.7s)
      [web-04] OK (0.8s)
    Step 2/3: deploy-binary
      [web-01] OK (3.2s)
      [web-02] OK (3.4s)
      [web-03] OK (3.1s)
      [web-04] OK (3.3s)
    Step 3/3: start-service
      [web-01] OK (1.1s)
      [web-02] OK (1.0s)
      [web-03] OK (1.2s)
      [web-04] OK (1.1s)

    Playbook completed: 4/4 hosts succeeded

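
A run like this fans out across hosts while keeping the steps in order on each host. As a rough Python sketch of that scheduling model (not Nefia's implementation): the thread pool, the function names, and the rule that a host still counts as succeeded when a failing step has continue_on_error set are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_host(host, steps, run_step):
    """Run steps in order on one host. Stop at the first failure unless
    that step sets continue_on_error."""
    for step in steps:
        ok = run_step(host, step)
        if not ok and not step.get("continue_on_error", False):
            return host, False
    return host, True

def run_playbook(hosts, steps, run_step, concurrency):
    """Process up to `concurrency` hosts at a time; return {host: succeeded}."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = pool.map(lambda h: run_host(h, steps, run_step), hosts)
        return dict(results)

steps = [
    {"name": "stop-service", "exec": "sudo systemctl stop myapp", "continue_on_error": True},
    {"name": "start-service", "exec": "sudo systemctl start myapp"},
]

def fake_run_step(host, step):
    # Stand-in for remote execution: stop fails everywhere, start fails on web-02.
    if step["name"] == "stop-service":
        return False
    return host != "web-02"

results = run_playbook(["web-01", "web-02", "web-03"], steps, fake_run_step, concurrency=2)
succeeded = sum(results.values())
print(f"Playbook completed: {succeeded}/{len(results)} hosts succeeded")
```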
Pass Variables
Override or supply variables at runtime with --var:

    nefia playbook run deploy.yaml --target group:production --var version=2.1.0 --var env=staging

Preview with Dry Run
See what a playbook would do without executing anything:

    nefia playbook run deploy.yaml --target group:production --dry-run

    === DRY RUN: deploy-app ===
    Targets: 4 host(s)

    [web-01] (os=linux)
      - stop-service → exec: systemctl stop myapp
      - deploy-binary → fs_write: /opt/myapp/bin/myapp
      - start-service → exec: systemctl start myapp
    [web-02] (os=linux)
      - stop-service → exec: systemctl stop myapp
      - deploy-binary → fs_write: /opt/myapp/bin/myapp
      - start-service → exec: systemctl start myapp

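
A dry-run listing of this shape can be thought of as a per-host plan filtered by each step's os field. The Python sketch below builds such a plan under two assumptions: steps whose os does not match the host are simply omitted, and when conditions are not evaluated. The plan function and the mac-01 host are hypothetical.

```python
ACTIONS = ("exec", "fs_write", "fs_read")

def plan(steps, hosts):
    """Build the per-host list of steps that would run, honoring the os filter."""
    lines = []
    for host in hosts:
        lines.append(f"[{host['id']}] (os={host['os']})")
        for step in steps:
            if "os" in step and step["os"] != host["os"]:
                continue  # step targets a different operating system
            # Assumes the step is valid, i.e. exactly one action is present.
            action = next(a for a in ACTIONS if a in step)
            detail = step[action]
            if isinstance(detail, dict):
                detail = detail.get("path", "")  # show the target path for file steps
            lines.append(f"  - {step['name']} → {action}: {detail}")
    return lines

steps = [
    {"name": "install-linux", "os": "linux", "exec": "sudo apt install -y myapp"},
    {"name": "install-macos", "os": "macos", "exec": "brew install myapp"},
    {"name": "write-config",
     "fs_write": {"path": "/etc/myapp/config.yaml", "content": "server: {}"}},
]
hosts = [{"id": "web-01", "os": "linux"}, {"id": "mac-01", "os": "macos"}]

print("\n".join(plan(steps, hosts)))
```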
Step Timeout
Override the default step timeout for all steps in the playbook:

    nefia playbook run deploy.yaml --target group:production --step-timeout 5m

This sets a default timeout of 5 minutes for every step. Individual steps can still override this with their own timeout_ms field.
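
The precedence described here (a step's own timeout_ms first, then --step-timeout, then a built-in default) can be sketched in Python as follows. The built-in default of 60 seconds and the set of accepted duration suffixes are assumptions for illustration; this page only shows the 5m form.

```python
import re

UNITS_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}

def parse_duration_ms(text):
    """Convert a duration such as '5m' or '30s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * UNITS_MS[match.group(2)]

def effective_timeout_ms(step, step_timeout_flag=None, builtin_default_ms=60_000):
    """Resolve the timeout for one step.

    builtin_default_ms is a hypothetical value; the real default is not
    documented on this page.
    """
    if "timeout_ms" in step:
        return step["timeout_ms"]  # per-step value wins
    if step_timeout_flag is not None:
        return parse_duration_ms(step_timeout_flag)  # --step-timeout
    return builtin_default_ms

restart = {"name": "restart-app", "exec": "sudo systemctl restart myapp", "timeout_ms": 30000}
check = {"name": "check-disk", "exec": "df -h /"}

print(effective_timeout_ms(restart, "5m"))  # 30000
print(effective_timeout_ms(check, "5m"))    # 300000
```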
Validate Syntax
Check a playbook for syntax errors without running it:

    nefia playbook validate deploy.yaml

    Playbook "deploy-app" is valid (3 steps)

List Playbooks
List available playbooks from the search paths (./playbooks/ and user config dir):

    nefia playbook list

Show Playbook Details
Display the parsed structure of a playbook:

    nefia playbook show deploy.yaml

Multi-Host Execution
Playbooks run on hosts in parallel, up to the concurrency limit. Within each host, steps run sequentially: step 1 completes before step 2 begins. Set the limit with --concurrency:

    nefia playbook run deploy.yaml --target all --concurrency 3

Complete Example
Here is a full playbook that deploys an application with a health check:

    name: deploy-app
    description: "Deploy application binary and restart the service"
    vars:
      version: "2.0.0"
      health_url: "http://localhost:8080/health"
    steps:
      - name: check-current-version
        exec: "myapp --version || echo 'not installed'"
        register: current_version
        continue_on_error: true
      - name: stop-service
        os: linux
        exec: "sudo systemctl stop myapp"
        continue_on_error: true
        timeout_ms: 15000
      - name: stop-service-windows
        os: windows
        exec: "Stop-Service -Name myapp -Force"
        continue_on_error: true
        timeout_ms: 15000
      - name: symlink-binary
        os: linux
        exec: "ln -sf /opt/myapp/bin/myapp-{{.Vars.version}} /opt/myapp/bin/myapp"
      - name: symlink-binary-macos
        os: macos
        exec: "ln -sf /opt/myapp/bin/myapp-{{.Vars.version}} /opt/myapp/bin/myapp"
      - name: copy-binary-windows
        os: windows
        exec: "Copy-Item -Path C:\\myapp\\bin\\myapp-{{.Vars.version}}.exe -Destination C:\\myapp\\bin\\myapp.exe -Force"
      - name: write-config
        fs_write:
          path: /etc/myapp/config.yaml
          content: |
            server:
              host: "{{.Host.ID}}"
              version: "{{.Vars.version}}"
      - name: start-service
        os: linux
        exec: "sudo systemctl start myapp"
        timeout_ms: 15000
      - name: start-service-windows
        os: windows
        exec: "Start-Service -Name myapp"
        timeout_ms: 15000
      - name: health-check
        exec: "curl -sf {{.Vars.health_url}}"
        retry:
          max_attempts: 10
          interval_ms: 2000

Run it:

    nefia playbook run deploy.yaml \
      --target group:production \
      --var version=2.1.0 \
      --concurrency 2

Related
Learn about target selectors, concurrency, and output formats used by playbooks.
Detailed reference for file read, write, and sync operations.
Complete reference for all playbook commands and flags.