The signature concept

The Daimon

Most security tools ship as a service running an opaque agent runtime. Okesu's daimon is a single Go binary plus a markdown file with YAML frontmatter — the markdown is the agent. Edit the file, the daimon's behavior changes. That's the novel idea. What follows is every field, every knob, and what each unlocks.

The shape

A daimon is a markdown file. The YAML frontmatter tells the runtime how to drive the agent — provider, schedule, tools, collectors. The body is the system prompt the LLM sees on every run. That's the entire surface.

# filename: edr.md
---
name: edr
version: "1"
description: Linux EDR. Collects telemetry every 2 minutes…

provider: claude
model: claude-mythos-preview
mode: daemon
interval: 2m

tools:
  - read_file
  - bash

collectors:
  - name: processes
    command: "ps aux | head -60"
---

You are an EDR agent. Investigate the telemetry below for anomalies…

You don't write Go to add a daimon. You write a markdown file, drop it in the configured directory, and the binary picks it up. Same shape whether the daimon is doing endpoint detection, file integrity monitoring, certificate rotation, cost watching, or a domain-specific job nobody else needed before.
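The two-part layout is cheap to parse. As a rough sketch (not Okesu's actual parser), splitting a daimon file into frontmatter and prompt body is a few lines of Go:

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates the YAML frontmatter from the prompt body.
// Hypothetical sketch: a real loader would also YAML-decode the header.
func splitFrontmatter(doc string) (frontmatter, body string) {
	parts := strings.SplitN(doc, "---", 3)
	if len(parts) < 3 {
		return "", doc // no frontmatter: the whole file is the prompt
	}
	return strings.TrimSpace(parts[1]), strings.TrimSpace(parts[2])
}

func main() {
	doc := "---\nname: edr\ninterval: 2m\n---\nYou are an EDR agent."
	fm, body := splitFrontmatter(doc)
	fmt.Println(fm)
	fmt.Println(body)
}
```

Everything after the second `---` stays verbatim; it becomes the template source for the prompt body.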

Note: The high-level introduction lives at /concepts/daimons. This page assumes you've read it and want depth. If a field below feels arbitrary, the concept page explains where it sits in the broader picture.

Identity

Three fields establish who the daimon is on the wire and in the dashboard.

name: edr
version: "1"
description: >
  Linux EDR. Collects process / network / file telemetry every
  2 minutes and uses an LLM to triage anomalies into findings.

  • name — globally unique identifier. Findings emitted by this daimon land in the CP tagged with this name; the dashboard groups by it.
  • version — bump when the prompt or contract changes. Lets multiple versions of the same daimon coexist during rollouts.
  • description — surfaced in the dashboard's daimon library and, importantly, in the orchestrator's prompt to chained agents (so step 2 of a playbook knows what step 1's daimon does).

Provider & model

Pick the LLM and how hard you want it to think.

provider: claude # claude | codex | custom
model: claude-mythos-preview
effort: low # low | medium | high — reasoning depth
maxTurns: 20 # tool-call rounds before the runtime aborts
timeout: 5m # wall-clock cap on a single tick

  • provider — which SDK to drive. claude uses the Anthropic API; codex uses OpenAI's; custom hooks an in-process implementation. The same daimon spec works against any.
  • model — provider-specific model identifier. EDR-style ticks (cheap, frequent) want claude-mythos-preview or similar; deep-investigation agents lean toward larger models.
  • effort — for reasoning models, how long the chain of thought runs. low for triage, medium or high for analysis playbooks.
  • maxTurns — defensive cap on the tool-use loop. A tick that takes more than ~8–10 turns is usually stuck; 20 leaves headroom but bounds runaway costs.
  • timeout — wall-clock cap on a single tick. It complements maxTurns: turns bound the loop count, timeout bounds elapsed time.
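A sketch of how a runtime might bound the tool-use loop with maxTurns. The step signature is invented for illustration; real provider SDKs differ:

```go
package main

import "fmt"

// step stands in for one provider round: it returns either a tool call
// to execute or a final answer. Invented interface, not a real SDK's.
type step func(turn int) (toolCall, final string)

// runLoop drives tool-call rounds until the model answers or maxTurns trips.
func runLoop(s step, maxTurns int) (answer string, aborted bool) {
	for turn := 0; turn < maxTurns; turn++ {
		tool, final := s(turn)
		if final != "" {
			return final, false
		}
		_ = tool // a real runtime would execute the tool and feed its output back
	}
	return "", true // maxTurns tripped: the runaway tick is bounded
}

func main() {
	// A model that issues two tool calls, then answers.
	model := func(turn int) (string, string) {
		if turn < 2 {
			return "bash", ""
		}
		return "", "no anomalies this tick"
	}
	answer, aborted := runLoop(model, 20)
	fmt.Println(answer, aborted)
}
```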

Schedule

Two execution shapes: scheduled (daemon mode) or on-demand (one-shot mode).

mode: daemon # daemon | oneshot
interval: 2m # or use cron: "*/5 * * * *"
overlap: skip # skip | queue | parallel

  • mode: daemon — the runtime ticks the agent on the configured cadence. Used for always-on monitors (EDR, FIM, SRE health).
  • mode: oneshot — the daimon runs once when invoked (via okesu auto, okesu claude, or an orchestration step that dispatches it). Used for triage / response agents that are reactive, not periodic.
  • interval — fixed cadence, e.g. 2m, 15m, 1h. Mutually exclusive with cron.
  • cron — for time-of-day or day-of-week patterns. "0 9 * * 1" = Mondays at 09:00.
  • overlap — what happens if the previous tick is still running when the next is due. skip drops the new tick (default for monitors), queue runs them serially, parallel runs concurrently (rare).
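The three overlap policies can be sketched as a small decision function. Illustrative only; the real scheduler presumably tracks more state:

```go
package main

import "fmt"

type overlap int

const (
	skip     overlap = iota // drop the new tick (monitor default)
	queue                   // run serially after the current tick
	parallel                // run concurrently (rare)
)

// onTickDue decides what happens when a tick comes due while the
// previous tick may still be running.
func onTickDue(policy overlap, running bool, queued int) (run bool, newQueued int) {
	if !running {
		return true, queued
	}
	switch policy {
	case skip:
		return false, queued // drop it
	case queue:
		return false, queued + 1 // remember it, run later
	default: // parallel
		return true, queued
	}
}

func main() {
	run, q := onTickDue(skip, true, 0)
	fmt.Println(run, q)
}
```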

State & dedup

The runtime gives every daimon a writable directory and a finding-deduplication window.

stateDir: /var/lib/okesu/edr
dedupeTtl: 1h

  • stateDir — where the daimon keeps tick metadata. The runtime maintains a last_run.txt with the previous tick's timestamp so collectors can use {{.LastRunISO}} in their commands (see Collectors below).
  • dedupeTtl — when the daimon emits a finding with a dedup_key, identical findings inside this window are suppressed. Prevents the same noisy condition from re-emitting every tick.

Tools & RBAC

The tool list is the agent's surface — what it can call.

tools:
  - read_file
  - write_file
  - list_files
  - search
  - bash

actions:
  rbac:
    allow:
      - tool: read_file
      - tool: bash
        reason: "allowed but logged; targeted investigation only"
    deny: []

  • tools — the universe of tool calls the agent could make. Each name maps to a Go function the daimon runtime exposes.
  • actions.rbac — within that universe, which calls are allowed in this daimon's runtime. Defining the surface twice (once in tools, once in actions.rbac) is intentional: tools is the schema the LLM sees; actions.rbac is the host-side gate. An entry under deny blocks the call even if the LLM tries.

The default convention for monitors is observe-only: read_file, list_files, search, no bash. bash is reserved for daimons that have to inspect runtime state (process trees, kernel logs); the reason field documents why for audit logs.
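Conceptually the host-side gate is a pair of sets, with deny winning over allow. A sketch; the real schema also carries reason strings and audit logging, omitted here:

```go
package main

import "fmt"

// rbac is the host-side gate: allow-listed tools pass, everything else
// is denied by default, and deny entries win over allow.
type rbac struct {
	allow map[string]bool
	deny  map[string]bool
}

// permits reports whether the daimon may execute the named tool call.
func (r rbac) permits(tool string) bool {
	if r.deny[tool] {
		return false // explicit deny blocks the call even if the LLM tries
	}
	return r.allow[tool]
}

func main() {
	gate := rbac{
		allow: map[string]bool{"read_file": true, "bash": true},
		deny:  map[string]bool{},
	}
	fmt.Println(gate.permits("read_file"))  // allow-listed
	fmt.Println(gate.permits("write_file")) // absent from allow: denied
}
```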

Collectors

This is the part that surprises operators coming from prompt-only agent frameworks. A daimon's collectors array is a list of shell commands that run before the LLM is invoked. Their output is injected into the system prompt via Go template variables.

collectors:
  - name: processes
    command: "ps aux --no-headers --sort=-%cpu | head -60"
    timeout: 5s

  - name: new_files
    command: "find /tmp /var/tmp -newer {{.LastRunFile}} -type f"
    timeout: 10s
    optional: true

  - name: osquery_no_disk
    command: "osqueryi --json 'SELECT pid, name FROM processes WHERE on_disk = 0'"
    timeout: 10s
    optional: true

  • name — referenced in the prompt body via {{range .CollectorsList}}.
  • command — bash command. Templated with Go template syntax — common variables include {{.LastRunFile}} (path to a file last-touched by the runtime, useful for find -newer), {{.LastRunISO}} (RFC3339 timestamp), {{.HostID}}.
  • timeout — per-collector cap. A slow collector doesn't block the whole tick.
  • optional — if true and the command fails or returns non-zero, the tick continues; that collector is marked skipped in the prompt. Useful for tools that may not be installed everywhere (osquery, jq, etc).

Why pre-collect? Because LLM tool-calling is expensive and slow compared to a shell. Running ps, ss, journalctl deterministically up front, then asking the model "is anything in here suspicious?" is dramatically cheaper than letting the model issue tool calls one at a time. Collectors give you cheap recall. Tools give you precise inspection.

The prompt body

Everything after the closing --- of the frontmatter is the system prompt. It's a Go template with the collector outputs interpolated.

---
# (frontmatter above)
---

You are an EDR agent running on host **{{.HostID}}** in region **{{.CloudRegion}}**.

Agent: {{.AgentName}} | Tick: {{.Tick}} | Time: {{.TickTime}}

## Telemetry collected this tick

{{range .CollectorsList -}}
### {{.Name}} (skipped if optional and failed)
```
{{.Output}}
```
{{end}}

## Your task
Analyze the telemetry above. […domain instructions, decision logic…]

The prompt is what makes the daimon a daimon — the structural shape (frontmatter + tools + collectors) is uniform, but the prompt is the agent's mission. Two daimons with identical frontmatter but different prompts are different agents.

Outputs & sinks

Where the agent's emitted JSONL events go.

outputs:
  # Always-on: stream to stdout (picked up by journald)
  - type: stdout

  # Local audit log — keeps every event including tick lifecycle
  - type: file
    path: /var/log/okesu/edr.jsonl
    maxBytes: 104857600 # rotate at 100 MB

  # CP webhook — high-signal events only
  - type: webhook
    url: "${OKESU_WEBHOOK_URL}"
    secret: "${OKESU_WEBHOOK_SECRET}"
    events: [finding, action_taken, action_denied, error]
    retries: 3
    bufferCap: 512

  • stdout — the everything-stream. Useful in dev; in prod systemd captures it for journald.
  • file — local rotating audit log. Survives loss of network. Keep it on hosts that produce signal but talk to the CP intermittently.
  • webhook — push to the CP's ingest endpoint. The events: filter is intentional: only the events you care about, no tick-lifecycle noise. The retry policy (retries) and bounded buffer (bufferCap) handle CP unreachability gracefully; when the buffer fills, the oldest events are dropped.

Multiple sinks are typical. stdout for local debugging, file for audit trail, webhook for the dashboard. None of them are mandatory — a daimon with no sinks is silent (sometimes you want that).
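The drop-oldest behavior of the webhook buffer fits in a few lines. A sketch, with bufferCap mapping to cap and the retry logic omitted:

```go
package main

import "fmt"

// sendBuffer is a bounded queue for webhook events: when the CP is
// unreachable and the buffer fills, the oldest events are dropped first.
type sendBuffer struct {
	cap   int
	items []string
}

func (b *sendBuffer) push(event string) {
	if len(b.items) == b.cap {
		b.items = b.items[1:] // full: drop the oldest event
	}
	b.items = append(b.items, event)
}

func main() {
	b := &sendBuffer{cap: 3}
	for _, e := range []string{"finding-1", "finding-2", "finding-3", "finding-4"} {
		b.push(e)
	}
	fmt.Println(b.items) // finding-1 was dropped to make room
}
```

Drop-oldest is the right bias for a monitor: the newest finding about a live condition is worth more than a stale one that aged out while the CP was down.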

Management plane

The optional management section wires the daimon into the CP's mTLS control channel.

management:
  url: "${OKESU_MGMT_URL}"
  certDir: /etc/okesu # client.crt, client.key, ca.crt
  heartbeatSec: 60
  pollSec: 300

Without a management section, the daimon runs autonomously and only pushes findings via the webhook sink. With it, the CP can dispatch ad-hoc orchestration steps to it, push hot config (new prompt, new tool list) without restart, and detect when the daimon goes stale (heartbeat timeout). It's optional precisely because some operators run daimons in environments where they cannot accept inbound dispatches — pure outbound is the lowest-trust shape.

What you can build

The same eight or so frontmatter sections — identity, provider, schedule, state, tools, collectors, outputs, management — express a remarkable range of agents. Some shipped with the platform:

edr

Linux EDR. 2-minute ticks, 8 collectors covering processes / network / files / kernel / auth, observe-only tools, webhook + file sink.

instance-integrity

File integrity monitoring. Hashes critical paths, compares to baseline, emits findings on drift. Cron-scheduled (hourly).

instance-threat

Active-exploitation detector. IMDS abuse, container escapes, cryptomining. 5-minute ticks.

sre-health

SRE health checks. TLS expiry, deployment frequency, recent incidents. Reads HTTP endpoints + parses certs. 10-minute ticks.

cost-watcher

FinOps. Hourly tick that pulls cloud-bill deltas, flags anomalies, identifies idle resources. Runs on the CP itself, not on hosts.

data-quality

Pipeline auditor. Row counts, null rates, schema drift, freshness against SLAs. Runs against your data warehouse, emits findings.

oci-posture

Cloud posture. Audits IAM, compartments, NSGs, object-storage visibility. Driven by the OCI CLI as a tool.

compliance-auditor

CIS / SOC2 / your-own checks. Daily cadence. Reads the same telemetry the EDR daimon collects, but applies a different framework.

Eight different missions, eight markdown files, one runtime. Three things change between them:

  • The collectors — the cost watcher pulls bill data; the EDR pulls ps aux; data-quality runs SQL queries against the warehouse.
  • The prompt — what to investigate, what counts as a finding, the decision logic.
  • The cadence — EDR ticks every 2 minutes; the auditor ticks daily.

Tools and outputs are usually shared across multiple daimons. Identity, provider/model, and RBAC are usually identical for any operator using the same LLM provider. The variance lives in three places, exactly the three places where the agent's mission changes.

Runtime model

Behind the spec, the daimon binary's runtime is a small loop:

  1. Tick fires (interval / cron / one-shot dispatch).
  2. Run all collectors in parallel, gather their outputs, mark optional ones as skipped on failure.
  3. Render the prompt body with the Go template variables filled in (collector outputs, host id, tick time, last-run time).
  4. Send the rendered prompt to the LLM provider with the declared tools available. Loop on tool calls until the model emits its final response or maxTurns trips.
  5. Parse the model's emitted JSONL events. Run them through the dedup window. Push to all configured output sinks.
  6. Update last_run.txt and the stateDir entries. Wait for the next tick.

That's it. There's no orchestrator inside the daimon — the daimon does one job per tick. Multi-step playbooks are Okesu orchestrations running at the CP level, dispatching to (possibly multiple) daimons.
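The six steps compose into one tick function. A sketch with every stage stubbed as a function parameter; all signatures are invented for illustration:

```go
package main

import "fmt"

// tick composes one pass of the runtime loop. Each stage is stubbed as
// a parameter so the shape of the pipeline is visible.
func tick(
	collect func() map[string]string, // 2. run collectors, gather outputs
	render func(map[string]string) string, // 3. fill the Go template
	llm func(string) []string, // 4. tool-use loop until final response
	dedupe func([]string) []string, // 5a. dedup window
	emit func([]string), // 5b. push to all configured sinks
) {
	telemetry := collect()
	prompt := render(telemetry)
	events := llm(prompt)
	emit(dedupe(events))
	// 6. update last_run.txt / stateDir, then wait for the next tick (omitted)
}

func main() {
	tick(
		func() map[string]string { return map[string]string{"processes": "root 1 /sbin/init"} },
		func(t map[string]string) string { return "Telemetry:\n" + t["processes"] },
		func(prompt string) []string { return []string{`{"type":"finding"}`} },
		func(events []string) []string { return events },
		func(events []string) { fmt.Println(len(events), "event(s) emitted") },
	)
}
```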

Why this shape

Three design decisions are doing most of the work:

Markdown plus YAML frontmatter, not a programming language. Adding a new agent or modifying an existing one is editing a file an operator can read and review without compiling anything. The barrier to "let me try this idea" is small. The risk of "I shipped a typo" is also small, because there's a schema and a parser.

Pre-collectors, not pure tool-use. Letting the LLM drive every observation via tool calls is slow and expensive, and the latency tax compounds at the cadence a monitor needs. Pre-collecting deterministic shell output up front and asking the model to analyze rather than investigate from scratch is the difference between a 3-second tick and a 90-second one.

The same shape for monitors and one-shots. An on-demand triage agent and an always-running EDR look identical at the spec level — just mode: oneshot vs mode: daemon. That uniformity means an orchestration step can dispatch to either kind of agent without caring which it is.

The point isn't that any single piece is novel. It's that the pieces fit. A markdown file with eight section headers can express a Linux EDR, a cost watcher, a compliance auditor, and a custom thing you needed yesterday, and the same runtime drives all of them.

Where to next

  • Agents — agents are the on-demand cousin of daimons.
  • Orchestrations — chain daimons + agents into playbooks.
  • Recipes — six worked YAML orchestrations to copy-paste.
  • Install — drop a daimon onto your first host.