62 changes: 42 additions & 20 deletions README.md
@@ -1,27 +1,28 @@
# Data Machine Code

**Data Machine Code gives a WordPress site the ability to write and ship code** — clone repos, edit files in worktrees, commit, push, open issues and PRs, comment on reviews. Built on [Data Machine](https://github.com/Extra-Chill/data-machine), powered by the Abilities API.
**Data Machine Code gives a WordPress site code-adjacent GitHub and workspace abilities** — API-first GitHub operations, managed-host-safe GitSync for site-owned files, and shell-backed workspaces for installs that can run a local coding runtime. Built on [Data Machine](https://github.com/Extra-Chill/data-machine), powered by the Abilities API.

Who pulls the trigger is up to you. Same capability surface, three driver modes:
Who pulls the trigger is up to you. DMC has three driver modes, and each mode uses only the capabilities the host actually supports:

- **An external coding-agent runtime** on your machine pointed at the site (Claude Code, OpenCode, kimaki, Studio Code) — interactive, human-in-the-loop.
- **The site itself, via a Data Machine flow** — scheduled or webhook-triggered, no human and no external runtime. The site codes on its own behalf.
- **An external coding-agent runtime** on your machine pointed at the site (Claude Code, OpenCode, kimaki, Studio Code) — interactive, human-in-the-loop; requires shell/git/workspace access.
- **The site itself, via a Data Machine flow** — scheduled or webhook-triggered; can use API-first GitHub/GitSync abilities without a shell, or workspace abilities when the host supports them.
- **An ephemeral CI job** that boots a WordPress instance with DMC loaded for a single check (Playground + GitHub Actions). The site exists for one job, codes for one PR, dies.

On managed hosts (WordPress.com, VIP, sandboxed environments) DMC is **never installed by design**. Those sites cannot write code, and that's the point — DMC's presence is the switch between "this WordPress site can ship code" and "this one can't."
Managed-host support depends on the subsystem. GitHub API abilities and GitSync use `wp_remote_request()` plus WordPress-owned storage and are designed to work on managed hosts. Workspace, worktree, shell-git, AGENTS.md projection, and co-located runtime features require a host where PHP can see and mutate the configured workspace and, for git operations, execute shell commands.

## What It Is

DMC's activation is the declarative answer to **"can this WordPress site code?"** When DMC is loaded, the install is no longer just a site running an AI — it has a worktree-native workspace, GitHub/git/workspace abilities, an `AGENTS.md` at the WP root for any runtime that asks, and a capability surface other plugins can read to gate disk-side behavior.
DMC's activation is the declarative answer to **"does this WordPress site have code-adjacent capabilities?"** When DMC is loaded, the install gains GitHub API abilities, optional GitSync bindings for site-owned files, and shell-backed workspace/git abilities when the environment supports them. The `\DataMachineCode\Environment` surface lets other plugins gate disk-side behavior without guessing which host they are running on.

That framing reshapes what every other plugin can assume:

- **AGENTS.md is composed and written to the WP root** for external runtimes that discover it on session start. DMC owns the file, contributes core sections, and lets other plugins register more via Data Machine's `SectionRegistry`. In-process and CI drivers don't use AGENTS.md — they get context through Data Machine's normal channels (system prompts, memory files, flow user messages). AGENTS.md is a feature for one of the three drivers, not the universal interface.
- **The site gains GitHub, workspace, and git abilities** through the Abilities API — every ability is automatically callable from chat, MCP, REST, WP-CLI, *and* directly from inside Data Machine flows running on the site.
- **A worktree-native workspace area** at `~/.datamachine/workspace/` lets the site clone repos, edit files in isolated branches, and push changes — all gated by per-repo policies. Same workspace whether the editor is an external runtime, an in-process AI step, or a CI job.
- **`\DataMachineCode\Environment` is a capability surface** plugins can read to ask "can this site code?", "can we shell out?", "is the filesystem writable outside `/uploads`?". Other DM plugins use it to gate disk-side hooks (e.g. Intelligence's `SKILL.md` sync).
- **AGENTS.md is composed and written to the WP root** for external runtimes that discover it on session start, when the filesystem permits it. DMC owns the file, contributes core sections, and lets other plugins register more via Data Machine's `SectionRegistry`. In-process and CI drivers don't use AGENTS.md — they get context through Data Machine's normal channels (system prompts, memory files, flow user messages). AGENTS.md is a co-located-runtime feature, not the universal interface.
- **The site gains GitHub, GitSync, workspace, and git abilities** through the Abilities API. API-first abilities are host-portable; workspace/git abilities are still gated by filesystem and shell capability.
- **GitSync binds site-owned directories under `ABSPATH` to GitHub repos** using the GitHub Contents + Git Data APIs. It needs no local git binary, no `.git` directory, and no shell, so consumers can use it on managed hosts that allow the target site-owned path to be written.
- **A worktree-native workspace area** at `~/.datamachine/workspace/` lets shell-capable installs clone repos, edit files in isolated branches, and push changes — all gated by per-repo policies. Same workspace whether the editor is an external runtime, an in-process AI step, or a CI job.
- **`\DataMachineCode\Environment` is a capability surface** plugins can read to ask "is DMC active?", "can we shell out?", "is the filesystem writable outside `/uploads`?". Other DM plugins use it to gate disk-side hooks (e.g. Intelligence's `SKILL.md` sync).

The asymmetry between self-hosted and managed is the whole product statement: self-hosted sites can install DMC and ship code; managed sites cannot. DMC is the seam.
The important seam is not self-hosted versus managed; it is API-first versus shell-backed. Managed sites can use the API-first pieces when installed and configured. Shell-backed coding-runtime features remain intentionally limited to hosts that expose shell/git/workspace access.

## Driver Modes

@@ -31,7 +32,7 @@ The asymmetry between self-hosted and managed is the whole product statement: se
| **In-process flow** | DM AI step inside a flow | Long-lived install | An Intelligence wiki maintenance flow calling workspace abilities; a webhook-triggered PR review flow |
| **Ephemeral CI** | DM flow inside Playground / GitHub Actions | One job | [`wc-site-generator`](https://github.com/chubes4/wc-site-generator) static-site validation; the Stage 5 Playground proof |

The capability surface is identical in all three. The differences are who pulls the trigger and how long the site lives.
The registered surface is broad, but callers should select abilities that match the host. For example, a managed-host flow can use GitHub API and GitSync abilities, while a co-located runtime can additionally use workspace and shell-git abilities.

## How It Differs From Other Data Machine Extensions

@@ -47,16 +48,25 @@ Sibling extensions like `data-machine-socials` and `data-machine-business` are *
- GitHub pull request webhook validation mode for review-flow triggers
- PR review flow scaffold for webhook-triggered review automation with bounded context gathering and managed comment upsert

### GitSync
- Bind site-owned directories under `ABSPATH` to GitHub repositories
- Pull files from GitHub via the Contents and Git Data APIs
- Submit local file changes through a sticky proposal branch and PR
- Push directly to the pinned branch only when both write policy keys are enabled
- Works without local git, `.git` directories, shell execution, or workspace checkouts

### Workspace Management
- Clone and manage git repositories in a secure workspace directory
- Read, write, and edit files within workspace repos
- List directory contents with file metadata
- Bundled Data Machine pipeline templates for workspace inventory, metadata repair, artifact cleanup, retention cleanup, and emergency cleanup

Requires a host where PHP can read and write the configured workspace path.

### Git Operations
- Status, log, diff (read-only)
- Pull, add, commit, push (policy-controlled)
- Per-repo write/push policies with path allowlists and branch restrictions

Requires shell execution and a local git binary.

### AI Agent Tools
- GitHub and workspace chat tools for read-only context gathering plus managed PR review comments
@@ -76,6 +86,14 @@ wp datamachine-code github repos owner
wp datamachine-code github review-flow create --repo=owner/repo --agent=code-reviewer
wp datamachine-code github status

# GitSync — API-first, no local git required
wp datamachine-code gitsync bind docs \
--local=/wp-content/uploads/intelligence-docs/ \
--remote=https://github.com/owner/repo
wp datamachine-code gitsync pull docs
wp datamachine-code gitsync status docs --format=json
wp datamachine-code gitsync submit docs --message="Update docs"

# Workspace
wp datamachine-code workspace path
wp datamachine-code workspace list
@@ -119,6 +137,8 @@ DMC discovers primary checkouts and worktrees by scanning the configured workspa
root. Worktree lifecycle metadata supports cleanup and reconciliation; if that
workspace path is not visible to PHP, DMC cannot see the checkouts.

Workspace and worktree commands are shell-backed. On managed hosts, prefer GitSync for site-owned content trees that need GitHub synchronization without a local checkout.

The primary checkout (bare `<repo>`) is **read-only by default** for mutating
operations — pass `--allow-primary-mutation` to override. The default-deny is
intentional: the primary tracks the deployed branch, and silent branch-switches
@@ -136,22 +156,24 @@ on it are how parallel agents corrupt each other's work.
- WordPress 6.9+
- PHP 8.2+
- [Data Machine](https://github.com/Extra-Chill/data-machine) plugin (core)
- A driver — at least one of:
- A driver for the abilities you plan to use — at least one of:
- An external coding-agent runtime on the same host (Claude Code, OpenCode, kimaki, Studio Code, etc.); see [`wp-coding-agents`](https://github.com/Extra-Chill/wp-coding-agents) for an opinionated setup.
- A Data Machine flow on the site that calls DMC's tools / abilities (in-process driver).
- A CI workflow that boots WordPress with DMC loaded and runs a DM flow against it; see [`wc-site-generator`](https://github.com/chubes4/wc-site-generator) for the canonical Playground-based example.
- Shell-backed workspace/git features require `exec()`, a local `git` binary, and a visible writable workspace path.
- GitSync requires GitHub credentials plus a writable site-owned path under `ABSPATH`; it does not require shell or local git.

DMC's abilities still register without any of these — but nothing exercises them, and an idle workspace is just an empty directory.
DMC's abilities still register without a co-located runtime. API-first flows can exercise GitHub/GitSync directly; an idle workspace is only relevant when using workspace/git abilities.

## Installation

1. Install and activate Data Machine core
2. Clone this repo to `wp-content/plugins/data-machine-code`
3. Run `composer install`
4. Activate the plugin
5. Wire up at least one driver:
5. Wire up at least one driver or API-first flow:
- **Co-located runtime:** point a coding-agent runtime at the install. See [`wp-coding-agents`](https://github.com/Extra-Chill/wp-coding-agents).
- **In-process flow:** create a DM flow whose AI step calls workspace / GitHub abilities directly. PR review flows (`wp datamachine-code github review-flow create`) are the bundled example.
- **In-process flow:** create a DM flow whose AI step calls GitHub, GitSync, workspace, or git abilities directly. PR review flows (`wp datamachine-code github review-flow create`) are the bundled example.
- **Ephemeral CI:** check DMC out into a CI job alongside Data Machine and run a flow inside Playground or a fresh WP install. See `wc-site-generator`'s `.github/workflows/playground-stage-5.yml` and `static-site-validation.yml` for working references.

## Configuration
@@ -276,19 +298,19 @@ Memory and guideline disk projection follows the same boundary: Data Machine own

### Capability detection

Other plugins gate disk-side or shell-using behavior on DMC's presence rather than on platform sniffing:
Other plugins should gate disk-side or shell-using behavior on explicit DMC capability checks rather than on platform sniffing:

```php
if ( class_exists( '\DataMachineCode\Environment' ) ) {
// This site can write code — register disk hooks,
// sync SKILL.md to .opencode/skills/, write MEMORY.md, etc.
// DMC is active. Register API-first integrations, then check
// narrower filesystem/shell capabilities before disk runtime hooks.
}

\DataMachineCode\Environment::has_shell(); // can we shell_exec?
\DataMachineCode\Environment::has_writable_fs(); // can we write outside /uploads?
```

This is intentionally simpler than detecting WP.com vs VIP vs self-hosted vs CI vs Studio — those distinctions don't matter. What matters is "can this WordPress site write code?" and that question has exactly one answer: "is DMC active?"
This is intentionally simpler than detecting WP.com vs VIP vs self-hosted vs CI vs Studio. What matters is which DMC subsystem the caller needs: API-first GitHub/GitSync, writable filesystem projection, or shell-backed workspace/git.
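
As a minimal sketch of the subsystem-by-subsystem gating described above, a consuming plugin could collapse the checks into one lookup. The helper name `example_dmc_capability_tiers()` and the tier keys are hypothetical and not part of DMC; only `class_exists()`, `has_shell()`, and `has_writable_fs()` come from the text above, and the sketch assumes both helpers are static and return boolean-like values.

```php
<?php
/**
 * Hypothetical helper (not part of DMC): report which DMC subsystem
 * tiers a caller may rely on, using only the explicit capability checks.
 *
 * @return array<string, bool> Tier name => availability.
 */
function example_dmc_capability_tiers(): array {
	// DMC inactive: nothing is available, and nothing below is evaluated.
	if ( ! class_exists( '\DataMachineCode\Environment' ) ) {
		return array(
			'api_first'     => false,
			'fs_projection' => false,
			'shell_git'     => false,
		);
	}

	return array(
		// GitHub API abilities and GitSync need neither shell nor local git.
		'api_first'     => true,
		// Disk projection (e.g. AGENTS.md, SKILL.md sync) needs writes outside /uploads.
		'fs_projection' => (bool) \DataMachineCode\Environment::has_writable_fs(),
		// Workspace/git abilities need shell execution.
		'shell_git'     => (bool) \DataMachineCode\Environment::has_shell(),
	);
}
```

On an install without DMC every tier is `false`, so callers can skip registration without platform sniffing. Whether the configured workspace path is visible to PHP is a separate concern that this sketch does not check.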

## Roadmap

33 changes: 19 additions & 14 deletions inc/Environment.php
@@ -2,24 +2,27 @@
/**
* Data Machine Code Environment
*
* Public signal that a co-located coding agent runtime exists on this host.
* Public signal that Data Machine Code is active on this host.
*
* Data Machine Code is the bridge between WordPress and an external
* coding-agent runtime. Its mere activation is the declarative answer to
* "is there a coding agent here?" — there is no separate marker file or
* constant to declare.
* Data Machine Code exposes code-adjacent GitHub, GitSync, workspace, and
* runtime capabilities. Its activation means those DMC surfaces are available
* to inspect and call, but callers must still check narrower capabilities
* before assuming shell execution or broad filesystem writes.
*
* Other plugins that ship disk-side artifacts for a coding agent (e.g.
* Intelligence's SKILL.md sync, MEMORY.md disk writes, MCP bridges) should
* gate their disk hooks on the presence of this class:
* gate their DMC integrations on the presence of this class, then gate
* shell/filesystem work on the explicit helpers below:
*
* if ( class_exists( '\DataMachineCode\Environment' ) ) {
* // co-located coding agent runtime exists — register disk hooks
* // DMC is active; now check has_shell() or has_writable_fs()
* // before registering shell-backed or disk-projection hooks.
* }
*
* On managed hosts (WordPress.com, VIP, sandboxed environments) Data Machine
* Code is never installed by design. The class does not exist and disk
* artifacts are correctly skipped without any platform sniffing.
* Managed hosts may support API-first DMC subsystems such as GitHub abilities
* and GitSync while still denying shell execution or writes outside approved
* site-owned paths. Capability checks are intentionally narrower than host
* detection.
*
* @package DataMachineCode
* @since 0.6.0
@@ -41,12 +44,14 @@ class Environment {
private static $shell_diagnostic_cache = null;

/**
* Is the coding agent runtime bridge available on this install?
* Is Data Machine Code available on this install?
*
* Returns true whenever this class is loaded, which is equivalent to
* "Data Machine Code is active." Provided as an explicit method so
* callers can write capability-style code rather than relying on the
* `class_exists()` idiom.
* "Data Machine Code is active." This does not imply shell execution,
* local git, or writable plugin/theme filesystems; callers that need those
* must also check has_shell() and has_writable_fs(). Provided as an explicit
* method so callers can write capability-style code rather than relying on
* the `class_exists()` idiom.
*
* @since 0.6.0
*