diff --git a/README.md b/README.md
index 9de620d..1f796bf 100644
--- a/README.md
+++ b/README.md
@@ -1,27 +1,28 @@
 # Data Machine Code
 
-**Data Machine Code gives a WordPress site the ability to write and ship code** — clone repos, edit files in worktrees, commit, push, open issues and PRs, comment on reviews. Built on [Data Machine](https://github.com/Extra-Chill/data-machine), powered by the Abilities API.
+**Data Machine Code gives a WordPress site code-adjacent GitHub and workspace abilities** — API-first GitHub operations, managed-host-safe GitSync for site-owned files, and shell-backed workspaces for installs that can run a local coding runtime. Built on [Data Machine](https://github.com/Extra-Chill/data-machine), powered by the Abilities API.
 
-Who pulls the trigger is up to you. Same capability surface, three driver modes:
+Who pulls the trigger is up to you. DMC has three driver modes, and each mode uses only the capabilities the host actually supports:
 
-- **An external coding-agent runtime** on your machine pointed at the site (Claude Code, OpenCode, kimaki, Studio Code) — interactive, human-in-the-loop.
-- **The site itself, via a Data Machine flow** — scheduled or webhook-triggered, no human and no external runtime. The site codes on its own behalf.
+- **An external coding-agent runtime** on your machine pointed at the site (Claude Code, OpenCode, kimaki, Studio Code) — interactive, human-in-the-loop; requires shell/git/workspace access.
+- **The site itself, via a Data Machine flow** — scheduled or webhook-triggered; can use API-first GitHub/GitSync abilities without a shell, or workspace abilities when the host supports them.
 - **An ephemeral CI job** that boots a WordPress instance with DMC loaded for a single check (Playground + GitHub Actions). The site exists for one job, codes for one PR, dies.
 
-On managed hosts (WordPress.com, VIP, sandboxed environments) DMC is **never installed by design**. Those sites cannot write code, and that's the point — DMC's presence is the switch between "this WordPress site can ship code" and "this one can't."
+Managed-host support depends on the subsystem. GitHub API abilities and GitSync use `wp_remote_request()` plus WordPress-owned storage and are designed to work on managed hosts. Workspace, worktree, shell-git, AGENTS.md projection, and co-located runtime features require a host where PHP can see and mutate the configured workspace and, for git operations, execute shell commands.
 
 ## What It Is
 
-DMC's activation is the declarative answer to **"can this WordPress site code?"** When DMC is loaded, the install is no longer just a site running an AI — it has a worktree-native workspace, GitHub/git/workspace abilities, an `AGENTS.md` at the WP root for any runtime that asks, and a capability surface other plugins can read to gate disk-side behavior.
+DMC's activation is the declarative answer to **"does this WordPress site have code-adjacent capabilities?"** When DMC is loaded, the install gains GitHub API abilities, optional GitSync bindings for site-owned files, and shell-backed workspace/git abilities when the environment supports them. The `\DataMachineCode\Environment` surface lets other plugins gate disk-side behavior without guessing which host they are running on.
 
 That framing reshapes what every other plugin can assume:
 
-- **AGENTS.md is composed and written to the WP root** for external runtimes that discover it on session start. DMC owns the file, contributes core sections, and lets other plugins register more via Data Machine's `SectionRegistry`. In-process and CI drivers don't use AGENTS.md — they get context through Data Machine's normal channels (system prompts, memory files, flow user messages). AGENTS.md is a feature for one of the three drivers, not the universal interface.
-- **The site gains GitHub, workspace, and git abilities** through the Abilities API — every ability is automatically callable from chat, MCP, REST, WP-CLI, *and* directly from inside Data Machine flows running on the site.
-- **A worktree-native workspace area** at `~/.datamachine/workspace/` lets the site clone repos, edit files in isolated branches, and push changes — all gated by per-repo policies. Same workspace whether the editor is an external runtime, an in-process AI step, or a CI job.
-- **`\DataMachineCode\Environment` is a capability surface** plugins can read to ask "can this site code?", "can we shell out?", "is the filesystem writable outside `/uploads`?". Other DM plugins use it to gate disk-side hooks (e.g. Intelligence's `SKILL.md` sync).
+- **AGENTS.md is composed and written to the WP root** for external runtimes that discover it on session start, when the filesystem permits it. DMC owns the file, contributes core sections, and lets other plugins register more via Data Machine's `SectionRegistry`. In-process and CI drivers don't use AGENTS.md — they get context through Data Machine's normal channels (system prompts, memory files, flow user messages). AGENTS.md is a co-located-runtime feature, not the universal interface.
+- **The site gains GitHub, GitSync, workspace, and git abilities** through the Abilities API. API-first abilities are host-portable; workspace/git abilities are still gated by filesystem and shell capability.
+- **GitSync binds site-owned directories under `ABSPATH` to GitHub repos** using the GitHub Contents + Git Data APIs. It needs no local git binary, no `.git` directory, and no shell, so consumers can use it on managed hosts that allow the target site-owned path to be written.
+- **A worktree-native workspace area** at `~/.datamachine/workspace/` lets shell-capable installs clone repos, edit files in isolated branches, and push changes — all gated by per-repo policies. Same workspace whether the editor is an external runtime, an in-process AI step, or a CI job.
+- **`\DataMachineCode\Environment` is a capability surface** plugins can read to ask "is DMC active?", "can we shell out?", "is the filesystem writable outside `/uploads`?". Other DM plugins use it to gate disk-side hooks (e.g. Intelligence's `SKILL.md` sync).
 
-The asymmetry between self-hosted and managed is the whole product statement: self-hosted sites can install DMC and ship code; managed sites cannot. DMC is the seam.
+The important seam is not self-hosted versus managed; it is API-first versus shell-backed. Managed sites can use the API-first pieces when installed and configured. Shell-backed coding-runtime features remain intentionally limited to hosts that expose shell/git/workspace access.
 
 ## Driver Modes
 
@@ -31,7 +32,7 @@ The asymmetry between self-hosted and managed is the whole product statement: se
 | **In-process flow** | DM AI step inside a flow | Long-lived install | An Intelligence wiki maintenance flow calling workspace abilities; a webhook-triggered PR review flow |
 | **Ephemeral CI** | DM flow inside Playground / GitHub Actions | One job | [`wc-site-generator`](https://github.com/chubes4/wc-site-generator) static-site validation; the Stage 5 Playground proof |
 
-The capability surface is identical in all three. The differences are who pulls the trigger and how long the site lives.
+The registered surface is broad, but callers should select abilities that match the host. For example, a managed-host flow can use GitHub API and GitSync abilities, while a co-located runtime can additionally use workspace and shell-git abilities.
 
 ## How It Differs From Other Data Machine Extensions
 
@@ -47,16 +48,25 @@ Sibling extensions like `data-machine-socials` and `data-machine-business` are *
 - GitHub pull request webhook validation mode for review-flow triggers
 - PR review flow scaffold for webhook-triggered review automation with bounded context gathering and managed comment upsert
 
+### GitSync
+- Bind site-owned directories under `ABSPATH` to GitHub repositories
+- Pull files from GitHub via the Contents and Git Data APIs
+- Submit local file changes through a sticky proposal branch and PR
+- Push directly to the pinned branch only when both write policy keys are enabled
+- Works without local git, `.git` directories, shell execution, or workspace checkouts
+
 ### Workspace Management
 - Clone and manage git repositories in a secure workspace directory
 - Read, write, and edit files within workspace repos
 - List directory contents with file metadata
 - Bundled Data Machine pipeline templates for workspace inventory, metadata repair, artifact cleanup, retention cleanup, and emergency cleanup
+Requires a host where PHP can read and write the configured workspace path.
 
 ### Git Operations
 - Status, log, diff (read-only)
 - Pull, add, commit, push (policy-controlled)
 - Per-repo write/push policies with path allowlists and branch restrictions
+Requires shell execution and a local git binary.
 
 ### AI Agent Tools
 - GitHub and workspace chat tools for read-only context gathering plus managed PR review comments
@@ -76,6 +86,14 @@ wp datamachine-code github repos owner
 wp datamachine-code github review-flow create --repo=owner/repo --agent=code-reviewer
 wp datamachine-code github status
 
+# GitSync — API-first, no local git required
+wp datamachine-code gitsync bind docs \
+  --local=/wp-content/uploads/intelligence-docs/ \
+  --remote=https://github.com/owner/repo
+wp datamachine-code gitsync pull docs
+wp datamachine-code gitsync status docs --format=json
+wp datamachine-code gitsync submit docs --message="Update docs"
+
 # Workspace
 wp datamachine-code workspace path
 wp datamachine-code workspace list
@@ -119,6 +137,8 @@ DMC discovers primary checkouts and worktrees by scanning the configured workspa
 root. Worktree lifecycle metadata supports cleanup and reconciliation; if that
 workspace path is not visible to PHP, DMC cannot see the checkouts.
 
+Workspace and worktree commands are shell-backed. On managed hosts, prefer GitSync for site-owned content trees that need GitHub synchronization without a local checkout.
+
 The primary checkout (bare ``) is **read-only by default** for mutating
 operations — pass `--allow-primary-mutation` to override. The default-deny is
 intentional: the primary tracks the deployed branch, and silent branch-switches
@@ -136,12 +156,14 @@ on it are how parallel agents corrupt each other's work.
 - WordPress 6.9+
 - PHP 8.2+
 - [Data Machine](https://github.com/Extra-Chill/data-machine) plugin (core)
-- A driver — at least one of:
+- A driver for the abilities you plan to use — at least one of:
   - An external coding-agent runtime on the same host (Claude Code, OpenCode, kimaki, Studio Code, etc.); see [`wp-coding-agents`](https://github.com/Extra-Chill/wp-coding-agents) for an opinionated setup.
   - A Data Machine flow on the site that calls DMC's tools / abilities (in-process driver).
   - A CI workflow that boots WordPress with DMC loaded and runs a DM flow against it; see [`wc-site-generator`](https://github.com/chubes4/wc-site-generator) for the canonical Playground-based example.
+- Shell-backed workspace/git features require `exec()`, a local `git` binary, and a visible writable workspace path.
+- GitSync requires GitHub credentials plus a writable site-owned path under `ABSPATH`; it does not require shell or local git.
 
-DMC's abilities still register without any of these — but nothing exercises them, and an idle workspace is just an empty directory.
+DMC's abilities still register without a co-located runtime. API-first flows can exercise GitHub/GitSync directly; an idle workspace is only relevant when using workspace/git abilities.
 
 ## Installation
 
@@ -149,9 +171,9 @@ DMC's abilities still register without any of these — but nothing exercises th
 2. Clone this repo to `wp-content/plugins/data-machine-code`
 3. Run `composer install`
 4. Activate the plugin
-5. Wire up at least one driver:
+5. Wire up at least one driver or API-first flow:
    - **Co-located runtime:** point a coding-agent runtime at the install. See [`wp-coding-agents`](https://github.com/Extra-Chill/wp-coding-agents).
-   - **In-process flow:** create a DM flow whose AI step calls workspace / GitHub abilities directly. PR review flows (`wp datamachine-code github review-flow create`) are the bundled example.
+   - **In-process flow:** create a DM flow whose AI step calls GitHub, GitSync, workspace, or git abilities directly. PR review flows (`wp datamachine-code github review-flow create`) are the bundled example.
    - **Ephemeral CI:** check DMC out into a CI job alongside Data Machine and run a flow inside Playground or a fresh WP install. See `wc-site-generator`'s `.github/workflows/playground-stage-5.yml` and `static-site-validation.yml` for working references.
 
 ## Configuration
@@ -276,19 +298,19 @@ Memory and guideline disk projection follows the same boundary: Data Machine own
 
 ### Capability detection
 
-Other plugins gate disk-side or shell-using behavior on DMC's presence rather than on platform sniffing:
+Other plugins should gate disk-side or shell-using behavior on explicit DMC capability checks rather than on platform sniffing:
 
 ```php
 if ( class_exists( '\DataMachineCode\Environment' ) ) {
-    // This site can write code — register disk hooks,
-    // sync SKILL.md to .opencode/skills/, write MEMORY.md, etc.
+    // DMC is active. Register API-first integrations, then check
+    // narrower filesystem/shell capabilities before disk runtime hooks.
 }
 
 \DataMachineCode\Environment::has_shell(); // can we shell_exec?
 \DataMachineCode\Environment::has_writable_fs(); // can we write outside /uploads?
 ```
 
-This is intentionally simpler than detecting WP.com vs VIP vs self-hosted vs CI vs Studio — those distinctions don't matter. What matters is "can this WordPress site write code?" and that question has exactly one answer: "is DMC active?"
+This is intentionally simpler than detecting WP.com vs VIP vs self-hosted vs CI vs Studio. What matters is which DMC subsystem the caller needs: API-first GitHub/GitSync, writable filesystem projection, or shell-backed workspace/git.
 
 ## Roadmap
 
diff --git a/inc/Environment.php b/inc/Environment.php
index 016754b..0936f18 100644
--- a/inc/Environment.php
+++ b/inc/Environment.php
@@ -2,24 +2,27 @@
 /**
  * Data Machine Code Environment
  *
- * Public signal that a co-located coding agent runtime exists on this host.
+ * Public signal that Data Machine Code is active on this host.
  *
- * Data Machine Code is the bridge between WordPress and an external
- * coding-agent runtime. Its mere activation is the declarative answer to
- * "is there a coding agent here?" — there is no separate marker file or
- * constant to declare.
+ * Data Machine Code exposes code-adjacent GitHub, GitSync, workspace, and
+ * runtime capabilities. Its activation means those DMC surfaces are available
+ * to inspect and call, but callers must still check narrower capabilities
+ * before assuming shell execution or broad filesystem writes.
  *
  * Other plugins that ship disk-side artifacts for a coding agent (e.g.
  * Intelligence's SKILL.md sync, MEMORY.md disk writes, MCP bridges) should
- * gate their disk hooks on the presence of this class:
+ * gate their DMC integrations on the presence of this class, then gate
+ * shell/filesystem work on the explicit helpers below:
  *
  *     if ( class_exists( '\DataMachineCode\Environment' ) ) {
- *         // co-located coding agent runtime exists — register disk hooks
+ *         // DMC is active; now check has_shell() or has_writable_fs()
+ *         // before registering shell-backed or disk-projection hooks.
  *     }
  *
- * On managed hosts (WordPress.com, VIP, sandboxed environments) Data Machine
- * Code is never installed by design. The class does not exist and disk
- * artifacts are correctly skipped without any platform sniffing.
+ * Managed hosts may support API-first DMC subsystems such as GitHub abilities
+ * and GitSync while still denying shell execution or writes outside approved
+ * site-owned paths. Capability checks are intentionally narrower than host
+ * detection.
  *
  * @package DataMachineCode
  * @since 0.6.0
@@ -41,12 +44,14 @@ class Environment {
 	private static $shell_diagnostic_cache = null;
 
 	/**
-	 * Is the coding agent runtime bridge available on this install?
+	 * Is Data Machine Code available on this install?
 	 *
 	 * Returns true whenever this class is loaded, which is equivalent to
-	 * "Data Machine Code is active." Provided as an explicit method so
-	 * callers can write capability-style code rather than relying on the
-	 * `class_exists()` idiom.
+	 * "Data Machine Code is active." This does not imply shell execution,
+	 * local git, or writable plugin/theme filesystems; callers that need those
+	 * must also check has_shell() and has_writable_fs(). Provided as an explicit
+	 * method so callers can write capability-style code rather than relying on
+	 * the `class_exists()` idiom.
 	 *
 	 * @since 0.6.0
 	 *