Skip to content

Add SSH daemon management API#369

Merged
jamiepine merged 3 commits intomainfrom
feat/ssh-daemon
Mar 9, 2026
Merged

Add SSH daemon management API#369
jamiepine merged 3 commits intomainfrom
feat/ssh-daemon

Conversation

@jamiepine
Copy link
Copy Markdown
Member

@jamiepine jamiepine commented Mar 8, 2026

Summary

  • Adds PUT /api/ssh/authorized-key endpoint for pushing public keys to the instance
  • Adds GET /api/ssh/status endpoint to check sshd state
  • Adds ssh_enabled field to PUT /api/settings to start/stop sshd
  • sshd runs on port 2222, host keys and authorized_keys persist in $SPACEBOT_DIR/ssh/

The platform already calls these endpoints when users enable SSH — this wires up the instance side.

Note

Implementation Details:

New src/api/ssh.rs module provides full SSH daemon lifecycle management. Handles host key generation (RSA/ECDSA/Ed25519), file permission setup (0600 for private keys, 0644 for public), and sshd process management on port 2222. SSH settings toggle integrated into the settings update flow with proper error handling and logging. All SSH data persists to $SPACEBOT_DIR/ssh/ for container restarts. Security enforced via PermitRootLogin=prohibit-password and PasswordAuthentication=no.

Written by Tembo for commit 024d00a. This will update automatically on new commits.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 8, 2026

Walkthrough

Introduces an internal SSH subsystem: a new ssh API module with HTTP endpoints to set an authorized key and report SSH status, lifecycle logic to enable/disable an sshd per instance, config wiring to persist ssh.enabled, and a mutex in ApiState to serialize SSH transitions.

Changes

Cohort / File(s) Summary
SSH module
src/api/ssh.rs
New module implementing sshd lifecycle (enable/disable), host key generation/installation, authorized_keys management, PID/process checks, per-instance SSH data directory helpers, permission handling, and HTTP handlers set_authorized_key and ssh_status. Uses async fs/process operations and system commands.
API surface & routing
src/api.rs, src/api/server.rs
Added pub(crate) mod ssh declaration and registered two new HTTP routes: PUT /ssh/authorized-key and GET /ssh/status.
Settings persistence & API
src/api/settings.rs
Added ssh_enabled: bool to GlobalSettingsResponse and ssh_enabled: Option<bool> to GlobalSettingsUpdate; read ssh.enabled from config in get, persist ssh.enabled on update, and toggle SSH accordingly (with explicit error responses on persistence failures).
API state
src/api/state.rs
Added ssh_mutex: tokio::sync::Mutex<()> to ApiState and initialized it to serialize enable/disable operations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change—adding SSH daemon management API—which is the primary focus of the changeset.
Description check ✅ Passed The description is directly related to the changeset, detailing the new SSH endpoints, settings integration, and implementation specifics.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/ssh-daemon

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jamiepine jamiepine marked this pull request as ready for review March 8, 2026 17:20
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/api/ssh.rs (1)

116-116: Consider using full variable name.

ak_path abbreviates "authorized_keys" to "ak". As per coding guidelines: "Don't abbreviate variable names."

♻️ Suggested rename
-    let ak_path = ssh_dir.join("authorized_keys");
-    if ak_path.exists() {
-        install_authorized_keys(&ak_path)
+    let authorized_keys_path = ssh_dir.join("authorized_keys");
+    if authorized_keys_path.exists() {
+        install_authorized_keys(&authorized_keys_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/ssh.rs` at line 116, Rename the abbreviated variable ak_path to a
clear, non-abbreviated name (e.g., authorized_keys_path or authorized_keys_file)
wherever it’s defined and used; update the binding created from
ssh_dir.join("authorized_keys") and all references to ak_path in functions or
methods in this module (e.g., the variable declared at let ak_path =
ssh_dir.join("authorized_keys")) so the code follows the "Don't abbreviate
variable names" guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/api/ssh.rs`:
- Around line 193-220: The generate_host_keys function currently logs ssh-keygen
failures and still returns Ok, causing callers like enable() and
install_host_keys to proceed with missing keys; modify generate_host_keys to
propagate errors: when Command::new("ssh-keygen") returns a non-success status,
return an Err(std::io::Error) (or a custom error) instead of only logging, or
alternatively track success state across HOST_KEY_TYPES and if none succeeded
return an Err; ensure callers (e.g., enable(), install_host_keys) handle the
Result and abort/startup appropriately when key generation fails.

---

Nitpick comments:
In `@src/api/ssh.rs`:
- Line 116: Rename the abbreviated variable ak_path to a clear, non-abbreviated
name (e.g., authorized_keys_path or authorized_keys_file) wherever it’s defined
and used; update the binding created from ssh_dir.join("authorized_keys") and
all references to ak_path in functions or methods in this module (e.g., the
variable declared at let ak_path = ssh_dir.join("authorized_keys")) so the code
follows the "Don't abbreviate variable names" guideline.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 677e7833-a59a-4401-bb0c-46f7fbcc5375

📥 Commits

Reviewing files that changed from the base of the PR and between 203c9d9 and 024d00a.

📒 Files selected for processing (4)
  • src/api.rs
  • src/api/server.rs
  • src/api/settings.rs
  • src/api/ssh.rs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/api/settings.rs`:
- Around line 343-349: The new SSH flag is persisted to TOML but not modeled in
the typed config, so runtime reloads via Config::load_from_path() will ignore
it; add a typed representation for the ssh table (e.g., a struct or field named
ssh with an enabled: bool) to the config model used by
get_global_settings()/Config, and include "ssh" in the schema-backed top-level
key list inside the load path logic that validates/round-trips keys (the code
around Config::load_from_path() and the load.rs schema list). Ensure the loader
maps toml["ssh"]["enabled"] into the new Config::ssh.enabled field so persisted
changes are visible to runtime reloads and boot-time config handling.
- Around line 391-410: The persisted ssh.enabled flag is written before calling
the lifecycle operations, so if super::ssh::enable or super::ssh::disable fails
the on-disk value is inconsistent with the running daemon; fix by either
deferring the disk write until after super::ssh::enable/disable succeeds or, if
you must write first, capture the previous value (e.g.,
current_settings.ssh.enabled) and on any error from super::ssh::enable/disable
restore that previous value to disk and return the failure response. Update the
code paths around request.ssh_enabled, super::ssh::enable, super::ssh::disable
and the settings persistence call so the TOML commit is only final on success or
is rolled back on error while returning the same GlobalSettingsUpdateResponse on
failure.

In `@src/api/ssh.rs`:
- Around line 173-221: The pidfile handling allows killing unrelated processes
and silently drops cleanup errors; update is_sshd_running() to read
/proc/<pid>/comm (or /proc/<pid>/cmdline) and confirm the process identity is
"sshd" before treating the pid as ours, and make disable() perform the same
validation before issuing kill so you never send signals to non-sshd PIDs; also
change remove_file(PID_FILE).await from being ignored to handling the Result
(log an error via tracing::error or return Err with context) and treat only "no
such process" (ESRCH) from kill/state checks as harmless while surfacing other
failures. Use the existing symbols PID_FILE, is_sshd_running, and disable to
locate where to add the /proc/<pid> validation and adjust error handling for
remove_file.
- Around line 109-197: enable() and disable() can race because they perform
multi-step FS/process work without synchronization; add a per-instance
tokio::sync::Mutex to ApiState (e.g., pub ssh_lifecycle: Mutex<()>) and acquire
it at the start of both enable() and disable() (let _guard =
state.ssh_lifecycle.lock().await) so the entire sequence in enable (host key
generation, install_authorized_keys, create /run/sshd, is_sshd_running check,
starting sshd) and the entire sequence in disable (reading PID_FILE, kill,
remove_file) are serialized on the same mutex instance, preventing concurrent
transitions and races on the pidfile/host-key/install/start/stop steps.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a36bb5f8-eef8-46ba-99fd-6ba0695b7af9

📥 Commits

Reviewing files that changed from the base of the PR and between 024d00a and bf6d0ef.

⛔ Files ignored due to path filters (1)
  • .github/workflows/release.yml is excluded by !**/*.yml
📒 Files selected for processing (2)
  • src/api/settings.rs
  • src/api/ssh.rs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/api/settings.rs (1)

10-18: ⚠️ Potential issue | 🟠 Major

Update the frontend settings contract with ssh_enabled.

GlobalSettingsResponse now serializes ssh_enabled, but interface/src/api/client.ts:1300-1308 still models the old payload. That leaves the UI typed against stale data and prevents any typed SSH settings flow from consuming this field.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/settings.rs` around lines 10 - 18, The frontend API contract must be
updated to match the backend: add the new ssh_enabled boolean field to the
TypeScript type that models GlobalSettingsResponse in the generated API client
(the type currently modeling the Rust GlobalSettingsResponse), and update any
code that consumes global settings to read this new property; specifically,
modify the generated client type for GlobalSettingsResponse to include
ssh_enabled: boolean and regenerate or adjust usages (e.g., getGlobalSettings
callers / UI settings flow) so they access the new field.
♻️ Duplicate comments (2)
src/api/ssh.rs (2)

249-268: ⚠️ Potential issue | 🟠 Major

is_pid_sshd() still fails open on /proc read errors.

Returning true at Line 267 means a raced exit or any /proc read failure is treated as “our sshd”. That can make ssh_status() report SSH enabled when it is not, and disable() can still signal the wrong PID. Please fail closed here and validate /proc/<pid>/cmdline against this daemon's pidfile/port instead of only checking comm.

Proposed direction
-    match tokio::fs::read_to_string(format!("/proc/{pid}/comm")).await {
-        Ok(comm) => comm.trim() == "sshd",
-        // If /proc isn't available (e.g. macOS dev), fall back to trusting the PID file.
-        Err(_) => true,
-    }
+    let cmdline = match tokio::fs::read(format!("/proc/{pid}/cmdline")).await {
+        Ok(bytes) => bytes,
+        Err(_) => return false,
+    };
+    let cmdline = String::from_utf8_lossy(&cmdline);
+    cmdline.contains("/usr/sbin/sshd")
+        && cmdline.contains(&format!("PidFile={PID_FILE}"))
+        && cmdline.contains(&SSHD_PORT.to_string())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/ssh.rs` around lines 249 - 268, The is_pid_sshd function currently
treats any error reading /proc/<pid>/comm as a positive match (returns true),
which lets failures open; change the error branch to return false so read errors
fail closed, and enhance the verification by reading /proc/<pid>/cmdline (or
both comm and cmdline) and comparing the cmdline contents against our sshd
pidfile/expected port or binary name to ensure the PID really belongs to our
daemon; update the match in is_pid_sshd to return false on Err and add logic to
read and validate /proc/{pid}/cmdline against the pidfile/port/binary name to
reduce races and false positives.

150-151: ⚠️ Potential issue | 🟠 Major

Handle stale PID-file cleanup errors before starting sshd.

If removing PID_FILE fails for anything other than NotFound, the next sshd start can fail against the stale file and the real cause is lost. Please handle that case explicitly instead of let _ = .... As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them."

Proposed fix
-    let _ = tokio::fs::remove_file(PID_FILE).await;
+    match tokio::fs::remove_file(PID_FILE).await {
+        Ok(()) => {}
+        Err(error) if error.kind() == std::io::ErrorKind::NotFound => {}
+        Err(error) => {
+            return Err(format!("failed to remove stale PID file: {error}"));
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/ssh.rs` around lines 150 - 151, The removal of PID_FILE currently
swallows all errors via `let _ = tokio::fs::remove_file(PID_FILE).await;`;
instead await the Result and handle it: if the error is `NotFound` ignore it,
otherwise log the failure (including the error) or propagate it so the
subsequent sshd start doesn't fail silently; use
`tokio::fs::remove_file(PID_FILE).await` and inspect the `std::io::Error` (e.g.,
check `error.kind() == ErrorKind::NotFound`) and then call your logger or return
the error from the surrounding function to surface the real cause before
starting sshd.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/api/settings.rs`:
- Around line 343-349: Before mutating doc when handling request.ssh_enabled,
capture the existing value of doc["ssh"]["enabled"] (or None if missing) into a
variable, then set doc["ssh"]["enabled"] = toml_edit::value(enabled); if the
subsequent lifecycle step fails, restore that captured original value exactly
back into doc (or remove the ssh.enabled key if original was None) and propagate
or log any error returned by the rollback write instead of using let _ =;
replace the current logic that assumes rollback should write !enabled and drops
errors with this capture/restore pattern so the config is returned to its true
prior state on failure.

---

Outside diff comments:
In `@src/api/settings.rs`:
- Around line 10-18: The frontend API contract must be updated to match the
backend: add the new ssh_enabled boolean field to the TypeScript type that
models GlobalSettingsResponse in the generated API client (the type currently
modeling the Rust GlobalSettingsResponse), and update any code that consumes
global settings to read this new property; specifically, modify the generated
client type for GlobalSettingsResponse to include ssh_enabled: boolean and
regenerate or adjust usages (e.g., getGlobalSettings callers / UI settings flow)
so they access the new field.

---

Duplicate comments:
In `@src/api/ssh.rs`:
- Around line 249-268: The is_pid_sshd function currently treats any error
reading /proc/<pid>/comm as a positive match (returns true), which lets failures
open; change the error branch to return false so read errors fail closed, and
enhance the verification by reading /proc/<pid>/cmdline (or both comm and
cmdline) and comparing the cmdline contents against our sshd pidfile/expected
port or binary name to ensure the PID really belongs to our daemon; update the
match in is_pid_sshd to return false on Err and add logic to read and validate
/proc/{pid}/cmdline against the pidfile/port/binary name to reduce races and
false positives.
- Around line 150-151: The removal of PID_FILE currently swallows all errors via
`let _ = tokio::fs::remove_file(PID_FILE).await;`; instead await the Result and
handle it: if the error is `NotFound` ignore it, otherwise log the failure
(including the error) or propagate it so the subsequent sshd start doesn't fail
silently; use `tokio::fs::remove_file(PID_FILE).await` and inspect the
`std::io::Error` (e.g., check `error.kind() == ErrorKind::NotFound`) and then
call your logger or return the error from the surrounding function to surface
the real cause before starting sshd.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2e12d5d2-5803-45f1-9e05-b11cefbfe82d

📥 Commits

Reviewing files that changed from the base of the PR and between bf6d0ef and 5e1e815.

📒 Files selected for processing (3)
  • src/api/settings.rs
  • src/api/ssh.rs
  • src/api/state.rs

Comment on lines +343 to +349
// Persist ssh_enabled into config.toml so sshd auto-starts on boot.
if let Some(enabled) = request.ssh_enabled {
if doc.get("ssh").is_none() {
doc["ssh"] = toml_edit::Item::Table(toml_edit::Table::new());
}
doc["ssh"]["enabled"] = toml_edit::value(enabled);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Rollback needs the previous persisted value, not !enabled.

Line 406 only works if every request flips the flag. An idempotent request like ssh_enabled=true when the config already has true will roll the file back to false on failure, which is the opposite of the original state. Line 407 also drops a failed rollback write, so the config can stay wrong silently. Capture the pre-update ssh.enabled value before mutating doc, and restore that exact value if the lifecycle step fails. As per coding guidelines, "Don't silently discard errors. No let _ = on Results. Handle them, log them, or propagate them."

Proposed fix
+    let previous_ssh_enabled = doc
+        .get("ssh")
+        .and_then(|ssh| ssh.get("enabled"))
+        .and_then(|value| value.as_bool())
+        .unwrap_or(false);
+
     // Persist ssh_enabled into config.toml so sshd auto-starts on boot.
     if let Some(enabled) = request.ssh_enabled {
         if doc.get("ssh").is_none() {
             doc["ssh"] = toml_edit::Item::Table(toml_edit::Table::new());
         }
@@
         if let Err(error) = ssh_result {
             tracing::warn!(%error, "failed to toggle SSH, rolling back config");
             // Roll back: re-read, restore previous value, write.
             if let Ok(content) = tokio::fs::read_to_string(&config_path).await
                 && let Ok(mut rollback_doc) = content.parse::<toml_edit::DocumentMut>()
             {
-                rollback_doc["ssh"]["enabled"] = toml_edit::value(!enabled);
-                let _ = tokio::fs::write(&config_path, rollback_doc.to_string()).await;
+                rollback_doc["ssh"]["enabled"] = toml_edit::value(previous_ssh_enabled);
+                if let Err(rollback_error) =
+                    tokio::fs::write(&config_path, rollback_doc.to_string()).await
+                {
+                    tracing::error!(%rollback_error, "failed to roll back SSH config");
+                }
             }
             return Ok(Json(GlobalSettingsUpdateResponse {

Also applies to: 391-418

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/settings.rs` around lines 343 - 349, Before mutating doc when
handling request.ssh_enabled, capture the existing value of
doc["ssh"]["enabled"] (or None if missing) into a variable, then set
doc["ssh"]["enabled"] = toml_edit::value(enabled); if the subsequent lifecycle
step fails, restore that captured original value exactly back into doc (or
remove the ssh.enabled key if original was None) and propagate or log any error
returned by the rollback write instead of using let _ =; replace the current
logic that assumes rollback should write !enabled and drops errors with this
capture/restore pattern so the config is returned to its true prior state on
failure.

@jamiepine jamiepine merged commit a5eba12 into main Mar 9, 2026
4 checks passed
rktmeister pushed a commit to rktmeister/spacebot that referenced this pull request Mar 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant