Three behavior changes to the hermes -w worktree lifecycle:
1. Git-native locks. _setup_worktree now locks its worktree
(git worktree lock --reason "hermes session pid=<pid>"), and
_prune_stale_worktrees skips locked worktrees at ANY age — a lock
from a live or crashed session means "do not touch". New helpers
_lock_worktree / _unlock_worktree / _worktree_is_locked (fail-safe:
any error reads as locked) / _worktree_is_dirty (fail-safe: any
error reads as dirty).
2. Dirty trees are preserved. _cleanup_worktree previously destroyed
worktrees with uncommitted changes if there were no unpushed
commits; it now keeps the worktree, branch, and lock when the tree
is dirty OR has unpushed commits, and prints manual cleanup hints
(git worktree unlock + remove --force). The >72h "force remove
regardless" prune tier is removed: pruning may only ever delete
clean, unlocked, fully-pushed worktrees.
3. Branch deletion is gated on removal success. Both cleanup and
prune previously deleted the branch without checking the return
code of git worktree remove, which left the commits hard to reach
even when removal failed; the branch is now deleted only after a
successful remove.
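The three rules above can be sketched roughly as follows. This is a minimal illustration, not the actual hermes helpers: the function names, the injectable run parameter, and the exact git invocations are assumptions made for the example, and the lock check assumes the path is passed in the same form git prints in its porcelain output.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def run_git(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run git and capture output; a non-zero exit does not raise."""
    return subprocess.run(["git", *args], capture_output=True, text=True)


def is_dirty(path: str, run: Runner = run_git) -> bool:
    """Fail-safe check: any error reads as dirty."""
    try:
        result = run(["-C", path, "status", "--porcelain"])
    except OSError:
        return True
    return result.returncode != 0 or bool(result.stdout.strip())


def is_locked(path: str, run: Runner = run_git) -> bool:
    """Fail-safe check: any error reads as locked."""
    try:
        result = run(["worktree", "list", "--porcelain"])
    except OSError:
        return True
    if result.returncode != 0:
        return True
    current = None
    for line in result.stdout.splitlines():
        if line.startswith("worktree "):
            current = line[len("worktree "):]
        elif current == path and (line == "locked" or line.startswith("locked ")):
            return True
    return False


def remove_then_delete_branch(path: str, branch: str, run: Runner = run_git) -> bool:
    """Delete the branch only after `git worktree remove` succeeds."""
    removed = run(["worktree", "remove", path])
    if removed.returncode != 0:
        return False  # keep the branch so its commits stay easy to find
    run(["branch", "-D", branch])
    return True
```

Passing the runner as a parameter is only a convenience for the sketch: it lets the fail-safe paths be exercised without a real repository.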
Remove unused imports (F401) and duplicate/shadowed import
redefinitions (F811) across the codebase using ruff's safe
autofixes. No behavioral changes -- imports only.
- ~1400 safe autofixes applied across 644 files (net -1072 lines)
- __init__.py re-exports preserved (excluded from F401 removal so
public re-export surfaces stay intact)
- Re-exports that are imported or monkeypatched by tests but look
unused in their defining module are kept with explicit # noqa:
F401 (gateway/run.py load_dotenv; run_agent re-exports from
agent.message_sanitization, agent.context_compressor,
agent.retry_utils, agent.prompt_builder, agent.process_bootstrap,
agent.codex_responses_adapter)
- Unsafe F841 (unused-variable) fixes deliberately skipped -- those
can change behavior when the RHS has side effects
- ruff lints remain disabled in pyproject.toml (only PLW1514 is
selected); this is a one-time cleanup, not a config change
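A hypothetical illustration of the side-effect concern behind skipping the F841 fixes. The functions below are invented for the example and are not from the codebase; they show the general risk of dropping an assignment whose right-hand side does work, not a claim about what ruff's fix emits in any particular case.

```python
from collections import deque


def drain_one(queue: deque) -> int:
    # `item` is never read, so a linter reports it as an unused variable,
    # but popleft() still mutates the queue.
    item = queue.popleft()
    return len(queue)


def drain_one_without_assignment(queue: deque) -> int:
    # What the function would do if the whole statement were deleted:
    # the queue is no longer drained.
    return len(queue)
```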
Verification:
- python -m compileall: clean
- pytest --collect-only: all 27161 tests collect (zero import errors)
- core entry points import clean (run_agent, model_tools, cli,
toolsets, hermes_state, batch_runner, gateway)
- static scan: every name any test imports directly from an edited
module still resolves
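One way to approximate the last check is sketched below. It is an assumption about how such a scan could work, not the script that was actually used: it parses a test file's source with ast, collects every absolute "from X import Y" pair, and then imports X to confirm Y still exists. Because it imports the target modules, it is not purely static.

```python
import ast
import importlib


def imported_names(test_source: str) -> list[tuple[str, str]]:
    """Collect (module, name) pairs from absolute `from X import Y` lines."""
    pairs = []
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                if alias.name != "*":
                    pairs.append((node.module, alias.name))
    return pairs


def unresolved(test_source: str) -> list[tuple[str, str]]:
    """Return the pairs whose name no longer resolves on the module."""
    missing = []
    for module_name, name in imported_names(test_source):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            missing.append((module_name, name))
            continue
        if hasattr(module, name):
            continue
        try:
            # The name may be a submodule rather than an attribute.
            importlib.import_module(f"{module_name}.{name}")
        except ImportError:
            missing.append((module_name, name))
    return missing
```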
Problem: hermes -w sessions accumulated 37+ worktrees and 1200+ orphaned
branches because:
- _cleanup_worktree bailed on any dirty working tree, but agent sessions
almost always leave untracked files/artifacts behind
- _prune_stale_worktrees had the same dirty-check, so stale worktrees
survived indefinitely
- pr-* and hermes/* branches from PR review had no cleanup mechanism at all
Changes:
- _cleanup_worktree: check for unpushed commits instead of dirty state.
Agent work lives in pushed commits/PRs, so a dirty working tree with no
unpushed commits holds only artifacts and is safe to remove.
- _prune_stale_worktrees: three-tier age system:
- Under 24h: skip (session may be active)
- 24h-72h: remove if no unpushed commits
- Over 72h: force remove regardless
- New _prune_orphaned_branches: on each -w startup, deletes local
hermes/hermes-* and pr-* branches with no corresponding worktree.
Protects main, checked-out branch, and active worktree branches.
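The age tiers and the orphan-branch selection described in this commit can be sketched as two pure functions. The names, signatures, and return values are hypothetical, and the handling of exactly 24h and exactly 72h is an assumption, since the message does not say which tier owns the boundaries.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def prune_action(age_hours: float, has_unpushed: bool) -> str:
    """Map a worktree's age to the tier described in this commit."""
    if age_hours < 24:
        return "skip"  # session may still be active
    if age_hours <= 72:  # assumption: 72h itself falls in the middle tier
        return "skip" if has_unpushed else "remove"
    return "force-remove"


def orphaned_branches(
    branches: Iterable[str],
    worktree_branches: Iterable[str],
    current: str,
    patterns: tuple = ("hermes/hermes-*", "pr-*"),
    protected: tuple = ("main",),
) -> list:
    """Pick matching local branches that no worktree uses."""
    keep = set(worktree_branches) | set(protected) | {current}
    return [
        b for b in branches
        if b not in keep and any(fnmatchcase(b, p) for p in patterns)
    ]
```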
Tests: 42 pass (6 new covering unpushed-commit logic, force-prune
tier, and orphaned branch cleanup).
* refactor: re-architect tests to mirror the codebase
* Update tests.yml
* fix: add missing tool_error imports after registry refactor
* fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist
patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers,
causing test_code_execution tests to hit the Modal remote path.
* fix(tests): fix update_check and telegram xdist failures
- test_update_check: replace patch("hermes_cli.banner.os.getenv") with
monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os
directly, it uses get_hermes_home() from hermes_constants.
- test_telegram_conflict/approval_buttons: provide real exception classes
for telegram.error mock (NetworkError, TimedOut, BadRequest) so the
except clause in connect() doesn't fail with "catching classes that do
not inherit from BaseException" when xdist pollutes sys.modules.
* fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock
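For the telegram.error fix above, the underlying Python behavior can be reproduced in isolation. This is a minimal sketch: make_fake_telegram_error and connect are hypothetical stand-ins written for the example (not the adapter's real code), and the stub classes simply derive from Exception rather than mirroring the library's hierarchy.

```python
import types


def make_fake_telegram_error() -> types.ModuleType:
    """Build a stand-in telegram.error module with real exception classes."""
    mod = types.ModuleType("telegram.error")
    for name in ("NetworkError", "TimedOut", "BadRequest"):
        setattr(mod, name, type(name, (Exception,), {}))
    return mod


def connect(error_mod, attempt) -> str:
    """Stand-in for a connect() that catches telegram.error classes."""
    try:
        attempt()
    except (error_mod.NetworkError, error_mod.TimedOut) as exc:
        return f"retry: {exc}"
    return "connected"
```

If error_mod is a bare MagicMock, its attributes are mock objects rather than exception classes, so any exception raised inside the try block makes the except clause itself raise TypeError. Supplying real classes, as the fix does, avoids that.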