maestro/docs/oss-docs-cleanup-plan.md
2026-06-11 01:52:48 +00:00


OSS Documentation Cleanup Plan (MAESTRO)

Status: PLAN (audit + execution checklist). Branch: docs/oss-cleanup. Author: docs audit pass, 2026-06-10.

This plan audits MAESTRO's documentation against GitHub/OSS publishing conventions and the repo owner's HTML-staged-docs preference, and lays out a concrete, ordered execution path. It does not change any shipping docs; the only new files on this branch are this plan and two clearly-labeled DRAFT stubs (docs/README.en.draft.md, docs/SECURITY.draft.md).

How OSS publishing actually works (verified)

scripts/oss-sync.sh builds the public tree from git archive HEAD, then:

  1. removes every path in oss/exclude.txt (CLAUDE.md, docs/superpowers, docs/plans, dated internal docs, oss/ itself, sync tooling, editor configs);
  2. scrubs internal hostnames/IPs via oss/scrub.sed;
  3. strips dead links to excluded docs via oss/scrub-deadlinks.py;
  4. overlays oss/overlay/. last (public-only files win);
  5. runs a release gate (oss/verify_release.py + oss/forbidden.txt).
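
Step 3 above (stripping dead links to excluded docs) can be illustrated with a minimal sketch. This is not the actual oss/scrub-deadlinks.py; the function name, the regex, and the assumption that exclude entries are plain paths or directory prefixes are all hypothetical.

```python
from __future__ import annotations

import re

# Inline Markdown links like [text](target). Images (![alt](src)) are skipped.
LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def strip_dead_links(markdown: str, excluded: set[str]) -> str:
    """Replace links that point at an excluded path with their plain link text."""

    def repl(match: re.Match) -> str:
        text, target = match.group(1), match.group(2)
        path = target.split("#", 1)[0]  # ignore any #fragment
        under_excluded_dir = any(
            path.startswith(entry.rstrip("/") + "/") for entry in excluded
        )
        if path in excluded or under_excluded_dir:
            return text  # keep the words, drop the dead link
        return match.group(0)

    return LINK_RE.sub(repl, markdown)
```

Keeping the link text rather than deleting the whole sentence leaves the surrounding prose readable even when the target never ships.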

So the public doc set = (tracked docs that survive exclude) + the overlay. The overlay is the source of truth for the public-facing files. The tracked root README.md is the private README and is replaced by oss/overlay/README.md (README is overlaid, not excluded). Anything authored for the public must land in oss/overlay/ or in a tracked docs/ path that is not excluded.

Implication for this cleanup: most "public-facing" doc work means editing files under oss/overlay/, not the tracked root files.
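
The composition rule (tracked files that survive exclude, then the overlay on top) can be sketched as a small pure function. It is an illustration only: the names are hypothetical, overlay paths are assumed to be relative to oss/overlay/, and exclude entries are assumed to be plain paths or directory prefixes, which may not match the real oss/exclude.txt syntax.

```python
from __future__ import annotations


def is_excluded(path: str, excludes: list[str]) -> bool:
    """True if path equals an exclude entry or sits under an excluded directory."""
    return any(
        path == entry.rstrip("/") or path.startswith(entry.rstrip("/") + "/")
        for entry in excludes
    )


def public_tree(
    tracked: list[str], excludes: list[str], overlay: list[str]
) -> dict[str, str]:
    """Map each public path to its source: 'tracked' or 'overlay'."""
    result = {p: "tracked" for p in tracked if not is_excluded(p, excludes)}
    for p in overlay:
        result[p] = "overlay"  # overlaid last, so it replaces any tracked file
    return result
```

Any path whose source comes back as 'overlay' has to be edited under oss/overlay/; editing the tracked copy would have no effect on the public tree.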


A. INVENTORY

Audience = who the doc is for. OSS-ships = overlay (public-only file), tracked (survives exclude and ships as-is), or excluded (removed by oss/exclude.txt). Language as of today.

Top-level files

| Path | Audience | Lang | OSS-ships? | Verdict |
|---|---|---|---|---|
| README.md (tracked root) | internal | JA | replaced by overlay README | keep as private; not public |
| oss/overlay/README.md | public | JA | overlay (becomes public README) | translate→EN-first + add badges/screenshot |
| oss/overlay/AGENTS.md | public (contributors) | EN | overlay | keep |
| oss/overlay/CONTRIBUTING.md | public | EN | overlay | keep (add CoC + DCO/CLA note) |
| oss/overlay/SECURITY.md | public | EN | overlay | keep (good) |
| oss/overlay/CHANGELOG.md | public | EN | overlay | keep (date stale: dated 2026-06-02) |
| oss/overlay/LICENSE | public | EN | overlay | keep (Apache-2.0) |
| oss/overlay/NOTICE | public | EN | overlay | keep |
| AGENTS.md (tracked root) | internal | EN | tracked (not excluded); collides w/ overlay | see note 1 |
| GEMINI.md (tracked root) | internal | | tracked (not excluded!) | add to exclude.txt (internal editor config) |
| CLAUDE.md | internal | JA | excluded | keep excluded |

Note 1 — AGENTS.md exists both tracked-at-root and in the overlay. The overlay copy wins (overlaid last), so the public ships the overlay version. Confirm the tracked root AGENTS.md is acceptable to ship if the overlay ever stops shipping it; today it is harmless but a maintenance trap (two AGENTS.md to keep in sync). Recommend: keep only the overlay AGENTS.md public; the tracked root one is internal — fine, but document the duplication.

docs/ overlay (public)

| Path | Audience | Lang | OSS-ships? | Verdict |
|---|---|---|---|---|
| oss/overlay/docs/getting-started.md | public | JA | overlay | translate→EN |
| oss/overlay/docs/configuration.md | public | JA | overlay | translate→EN |
| oss/overlay/docs/architecture.md | public | JA | overlay | translate→EN |

docs/ tracked (ship unless excluded)

| Path | Audience | Lang | OSS-ships? | Verdict |
|---|---|---|---|---|
| docs/architecture.md | mixed | JA | tracked | reconcile vs overlay arch (see note 2) |
| docs/tools/*.md (34) | public | JA | tracked | keep; translate top-N later |
| docs/operations/bash-sandbox-provisioning.md | public | JA | tracked | keep |
| docs/operations/index.html + initial-setup.html + guide.css | public | JA | tracked (HTML) | see RECONCILIATION |
| docs/design/** (ui_kits_reference jsx/html/css) | internal | mixed | tracked | exclude (dev design refs, not user docs) |
| docs/aao-gateway-overview.md | public | JA | tracked | keep |
| docs/mcp.md, docs/skills.md, docs/ssh.md, docs/bench.md | public | JA | tracked | keep |
| docs/context-flow.md, docs/user-folder-layout.md | public | JA | tracked | keep |
| docs/maintenance-checklist.md | internal (contributors) | JA | tracked | keep (referenced by CONTRIBUTING) |

Note 2 — duplicate architecture docs: docs/architecture.md (tracked) AND oss/overlay/docs/architecture.md (overlay). The overlay wins for the public. Decide which is canonical and have the other redirect/stub, or exclude the tracked one to avoid drift.
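
Notes 1 and 2 both describe the same hazard: a path that ships from the tracked tree and also exists in the overlay. A short sketch of how such collisions could be listed follows. The function name and the flat-path inputs are assumptions for illustration, not part of the existing sync tooling.

```python
from __future__ import annotations


def overlay_collisions(
    tracked: list[str], excluded: set[str], overlay: list[str]
) -> list[str]:
    """Return paths that exist both as shipping tracked files and in the overlay.

    Overlay paths are assumed to be relative to oss/overlay/.
    """
    shipping = {p for p in tracked if p not in excluded}
    return sorted(shipping & set(overlay))
```

README.md will always appear in the result because it is replaced on purpose; the entries worth reviewing are the unintended ones, such as AGENTS.md and docs/architecture.md.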

docs-wip branch (in-progress reorg — NOT yet on main)

docs-wip does two things:

  1. Adds a consolidated, curated doc set: docs/reference/*.md (18 files, Japanese; a feature reference covering scheduler, config, mcp, ssh, memory, gateway, pieces, skills, notifications, media-tools, etc.).
  2. Adds an HTML build system: docs/build_html.py (Markdown→staged HTML), generated docs/html/**, and manifest JSON (.investigate-status.json, .consolidate-manifest.json).

docs/build_html.py reads each doc's implementation status from .investigate-status.json and renders staged HTML (design=blue / implementation=amber / completed=green) with an index.html nav — exactly the owner's HTML-staged convention. The docs/reference/*.md consolidation is the single most valuable doc asset for OSS: a clean, deduplicated feature reference replacing scattered dated design specs.
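
The stage-to-color mapping described above can be sketched as follows. This is not the real docs/build_html.py: the JSON shape (a flat object mapping doc name to stage), the CSS class names, and the gray fallback for unknown stages are all assumptions made for the example.

```python
import html
import json

# Stage colors as described in this plan.
STAGE_COLORS = {"design": "blue", "implementation": "amber", "completed": "green"}


def render_badge(doc: str, stage: str) -> str:
    """Render one doc's stage as a colored badge."""
    color = STAGE_COLORS.get(stage, "gray")  # unknown stage: neutral (assumption)
    return (
        f'<span class="badge badge-{color}">'
        f"{html.escape(doc)}: {html.escape(stage)}</span>"
    )


def render_index(status_json: str) -> str:
    """Render a nav list from a status manifest (hypothetical flat schema)."""
    status = json.loads(status_json)
    items = "\n".join(
        f"<li>{render_badge(doc, stage)}</li>" for doc, stage in sorted(status.items())
    )
    return f"<ul>\n{items}\n</ul>"
```

Because the stage lives in the manifest and not in the Markdown body, the same source file renders unchanged on the repo host.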

Recommendation: land docs-wip's docs/reference/*.md consolidation (highest-value, low-risk) ahead of the OSS push, but treat the generated docs/html/** output and the *.json manifests as build artifacts (see RECONCILIATION — keep the generator, gitignore/exclude the output).


B. GAP ANALYSIS — GitHub OSS readiness

Verified-present: LICENSE (Apache-2.0), NOTICE, CONTRIBUTING.md, SECURITY.md, CHANGELOG.md, AGENTS.md, Dockerfile, docker-compose.yml, .env.example, .dockerignore.

Gaps found:

| Gap | Severity | Notes |
|---|---|---|
| No CODE_OF_CONDUCT.md | high | Standard for OSS; GitHub surfaces it in the community profile. Add Contributor Covenant 2.1 to oss/overlay/. |
| No .github/ community files | high | Missing issue templates (bug/feature), PULL_REQUEST_TEMPLATE.md, FUNDING.yml (optional). Note: repo is published to Gitea (swallow/maestro), not GitHub; Gitea reads .gitea/ISSUE_TEMPLATE/ (and also .github/). Add templates under a host-appropriate dir in the overlay. |
| README is JA-only | high | First-time visitor on an English-default host can't read it. EN-first is the single biggest readiness fix. |
| README has 1 badge only | medium | Only a static license badge. Add: build/CI status, release/version, Node version, "PRs welcome". Avoid badges that need a live service. |
| No screenshots/GIF in README | medium | An agent UI sells itself visually. Add 1-2 screenshots (task detail / settings) under oss/overlay/docs/assets/ and embed in README. |
| No architecture diagram image | low | The execution-flow ASCII block is fine; a simple diagram would help. Optional. |
| CHANGELOG date stale | low | The v0.1.0 entry (2026-06-02) predates current HEAD; refresh on release. |
| No top-level "Documentation" landing for the curated set | medium | If docs/reference/* lands, README/getting-started should link an index. |
| License headers in source | low | Apache-2.0 doesn't require per-file headers, but NOTICE + a short header policy in CONTRIBUTING avoids questions. Optional. |
| GEMINI.md would leak | medium | Tracked at root, NOT in oss/exclude.txt → ships publicly. It's an internal editor/assistant config like CLAUDE.md. Add to exclude. |
| docs/design/ui_kits_reference/** would ship | low-medium | Internal design references (JSX prototypes, legacy admin kit). Not user docs. Add docs/design to exclude. |

README quality for a first-time visitor (overlay README): structure is good (features → quickstart → requirements → docs → security → license). What's missing for a strong first impression: English, a screenshot, a one-line "what/why" hook in English at the very top, and CI/release badges.


C. i18n PLAN

Current state: all public docs (overlay README + getting-started + configuration + architecture) and tracked docs/** are Japanese. GitHub/Gitea OSS audiences default to English. The product also targets multilingual support (多言語対応) as a goal.

Convention (recommended): English-first with a .ja sibling. This is the lowest-friction, most-recognized GitHub pattern.

  • README.md → English (the public README, i.e. oss/overlay/README.md).
  • README.ja.md → Japanese (current content moved here).
  • Cross-link at the top of each: [English](README.md) | [日本語](README.ja.md).

For docs/, use a suffix convention to avoid a parallel directory tree:

oss/overlay/docs/
  getting-started.md        # EN (canonical)
  getting-started.ja.md     # JA
  configuration.md          # EN
  configuration.ja.md       # JA
  architecture.md           # EN
  architecture.ja.md        # JA

Rationale: a docs/en/ + docs/ja/ split doubles directory depth and breaks relative links on every move. The .ja.md suffix keeps EN as the default a visitor hits and keeps JA one click away. (If a richer i18n site is built later — e.g. Docusaurus/MkDocs — migrate then; don't over-engineer now.)
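
The suffix convention is mechanical enough to audit with a few lines of code. The sketch below derives the .ja.md sibling for a page, lists EN pages that have no JA sibling, and builds the switcher line. All function names are hypothetical.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def ja_sibling(path: str) -> str:
    """docs/getting-started.md -> docs/getting-started.ja.md"""
    p = PurePosixPath(path)
    return str(p.with_name(p.stem + ".ja" + p.suffix))


def missing_ja(paths: list[str]) -> list[str]:
    """EN pages (*.md that are not *.ja.md) with no .ja.md sibling in the list."""
    present = set(paths)
    en_pages = [p for p in paths if p.endswith(".md") and not p.endswith(".ja.md")]
    return sorted(p for p in en_pages if ja_sibling(p) not in present)


def switcher(readme: str = "README.md") -> str:
    """Build the cross-link line placed at the top of each language version."""
    return f"[English]({readme}) | [日本語]({ja_sibling(readme)})"
```

Since both files sit in the same directory, the relative links in the switcher survive a directory move, which is the main argument for the suffix over a docs/en/ + docs/ja/ split.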

Translate-first order (highest visitor impact first):

  1. README.md (overlay) — the storefront. Do first.
  2. docs/getting-started.md — clone→running path.
  3. docs/configuration.md: config.yaml reference.
  4. docs/architecture.md — for evaluators/contributors.
  5. CHANGELOG.md (already EN), SECURITY.md (already EN), CONTRIBUTING.md (already EN), AGENTS.md (already EN) — no action.
  6. Later/optional: top 5-8 docs/tools/*.md by usage (bash, websearch, browseweb, spawnsubtask, office) and the docs/reference/* set if landed.

Tooling/convention:

  • Keep EN canonical, JA as translation. Do not auto-generate JA at build time; hand-maintain the high-value pages, accept staleness on the long tail.
  • Add a short "Translations" note in CONTRIBUTING describing the .ja.md convention and that EN is canonical (PRs that change EN should flag the JA sibling as needing update — don't block on it).
  • The existing JA overlay docs are already written and accurate — they become the .ja.md siblings essentially for free. The work is the EN translation, not throwing away the JA.
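
The "flag, don't block" rule for translations could be supported by a small advisory check over a PR's changed files. This is a sketch under assumptions: the function name is made up, and how the changed-file list is obtained (for example from the forge's API or a diff) is left open.

```python
from __future__ import annotations


def ja_needing_review(changed: list[str], existing: set[str]) -> list[str]:
    """List JA siblings whose EN source changed but which were not touched.

    changed:  paths modified by the PR
    existing: all doc paths currently in the tree
    """
    changed_set = set(changed)
    flagged = []
    for path in changed:
        if not path.endswith(".md") or path.endswith(".ja.md"):
            continue  # only EN Markdown pages trigger a flag
        sibling = path[: -len(".md")] + ".ja.md"
        if sibling in existing and sibling not in changed_set:
            flagged.append(sibling)
    return sorted(flagged)
```

The result is meant for a PR comment or label, not a failing check, in line with accepting staleness on the long tail.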

D. DOCKER DOCS PLAN

From git clone to running via Docker, a new user needs (and currently has):

| Step | Covered today? | Where |
|---|---|---|
| cp .env.example .env | yes | README quickstart + getting-started §5 |
| Set LLM endpoint | yes | .env.example comments (OLLAMA_BASE_URL, OLLAMA_MODEL) |
| docker compose up -d | yes | README + getting-started |
| Where the UI is | yes | http://localhost:9876 |
| Data persistence | yes | named volumes maestro-data / maestro-workspaces |
| Security default (localhost-only) | yes | README + getting-started + SECURITY |

What's missing / unclear for a Docker-first OSS user:

  1. host.docker.internal on Linux. .env.example defaults to http://host.docker.internal:11434/v1. On Linux Docker Engine this name is not resolvable by default (it works on Docker Desktop for mac/win). New Linux users will hit a name-resolution failure (or a generic connection error) with no hint. Add a note: on Linux use the host gateway IP or --add-host=host.docker.internal:host-gateway (compose: extra_hosts), or point at the LAN IP of the Ollama host.
  2. No "verify it's running" step. Add a healthcheck/docker compose logs -f + "open the UI, create a task" smoke check to getting-started §5.
  3. Mounting config.yaml into the container is referenced ("see the comments in docker-compose.yml") but not shown inline. New users benefit from one explicit example of mounting a host config.yaml and where setup runs in the container (does the image run npm run setup, or only env-var config?).
  4. Bash sandbox in Docker. bwrap needs unprivileged user namespaces; in a container that may need extra flags or --privileged-adjacent settings, or the hardened fallback applies. Getting-started §8 covers host provisioning but not the containerized story. Add one paragraph: what bash_sandbox mode the Docker image ships with and any caveats.
  5. Image build vs prebuilt. Is there a published image, or is docker compose up building locally from the Dockerfile every time? State it. If build-from-source, note the first-run build time.
  6. GPU / external Ollama. Make explicit that MAESTRO's container does not run the LLM; users point it at an existing OpenAI-compatible endpoint. The README says it but the Docker section could restate it (common confusion).
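
Gaps 1 and 2 above lend themselves to a small preflight script that a docker.md page could point to. The sketch below is illustrative: the function names are hypothetical, the resolver and HTTP probe are injectable so the logic can be exercised without a network, and the hostname check is only meaningful when run from inside the container.

```python
from __future__ import annotations

import socket
import time
import urllib.error
import urllib.request
from typing import Callable
from urllib.parse import urlparse


def llm_host_hint(
    base_url: str, resolve: Callable[[str], str] = socket.gethostbyname
) -> str | None:
    """Return a hint if the LLM endpoint's hostname does not resolve, else None."""
    host = urlparse(base_url).hostname or ""
    try:
        resolve(host)
        return None
    except OSError:  # socket.gaierror is a subclass of OSError
        return (
            f"{host} does not resolve; on Linux add extra_hosts "
            "'host.docker.internal:host-gateway' or use the host's LAN IP"
        )


def http_ok(url: str) -> bool:
    """True if the URL answers with a 2xx/3xx status within 3 seconds."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def wait_for_ui(
    url: str = "http://localhost:9876",
    attempts: int = 10,
    delay: float = 2.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll the UI until it responds or the attempts run out."""
    for i in range(attempts):
        if probe(url):
            return True
        if i < attempts - 1:
            time.sleep(delay)
    return False
```

A failed wait_for_ui is the cue to read docker compose logs -f; a successful one still leaves the manual "open the UI, create a task" check to confirm the LLM endpoint is wired up.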

Recommend a dedicated oss/overlay/docs/docker.md (EN) consolidating the above, linked from README + getting-started §5, rather than growing getting-started.


E. RECONCILIATION — HTML-staged docs vs GitHub Markdown

The tension: the owner's convention wants docs/ as staged HTML (design/implementation/completed, color badges, index.html nav). GitHub/OSS convention wants Markdown README + docs/ that render on the repo host.

These are not actually in conflict if we treat HTML as a generated artifact, not the source:

  1. Markdown is the single source of truth. Author everything as .md (README.md, docs/**/*.md, the docs/reference/*.md set from docs-wip). Markdown renders natively on Gitea/GitHub and is what OSS contributors expect.

  2. docs/build_html.py (from docs-wip) generates the staged HTML view from that Markdown for the owner's internal browsing/handoff workflow. The generator stays; its output (docs/html/**) is a build artifact — do not hand-edit it, and keep it out of OSS:

    • add docs/html/ to .gitignore (don't commit generated output), and
    • add docs/html + the *.json manifests to oss/exclude.txt as belt-and-suspenders so even a stray commit never ships generated HTML publicly.

    This gives the owner the staged-HTML experience locally (python3 docs/build_html.py) while the OSS repo ships clean Markdown: no duplication of content, only a generation step.

  3. The few hand-written HTML docs that exist today (docs/operations/index.html, initial-setup.html, guide.css) are the exception: they're authored HTML, not generated. Two options — (a) convert them to Markdown so they fit the generated-from-MD model (preferred for OSS consistency), or (b) keep them as authored HTML but exclude them from OSS and let the public read the Markdown equivalents (docs/operations/bash-sandbox-provisioning.md already exists). Recommend (a) long-term, (b) as the immediate no-risk choice.

  4. Stage metadata (design/impl/completed) lives in the Markdown frontmatter or the manifest JSON, consumed only by build_html.py. OSS readers never see stages; they see finished Markdown. Internal readers get the staged HTML view.
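
The two guards from item 2 (.gitignore and oss/exclude.txt) can be verified with a tiny check. This is a sketch that assumes both files are plain line lists and that an entry counts only on an exact line match (trailing slash ignored); it does not evaluate gitignore glob semantics, and the function names are hypothetical.

```python
from __future__ import annotations


def _entries(text: str) -> set[str]:
    """Non-empty, non-comment lines, with trailing slashes normalized away."""
    return {
        line.strip().rstrip("/")
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


def missing_guards(required: list[str], file_text: str) -> list[str]:
    """Return the required entries that do not appear in the file."""
    present = _entries(file_text)
    return sorted(r for r in required if r.rstrip("/") not in present)
```

Running it once against .gitignore with docs/html/ and once against oss/exclude.txt with docs/html plus the manifest paths would confirm that both layers are in place.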

Net: keep docs-wip's generator + the docs/reference/*.md consolidation; gate the generated HTML behind .gitignore + oss/exclude.txt. One source (MD), two renderings (host-native MD for OSS, staged HTML for internal).


F. PRIORITIZED EXECUTION CHECKLIST

Effort: S ≤30min, M ≤2h, L ≤half-day. Tags: [public-facing] ships to OSS; [internal] private-repo / tooling only.

P0 — leak/correctness fixes (do before any OSS push)

  1. [internal] Add GEMINI.md to oss/exclude.txt (internal assistant config, currently ships). — S
  2. [internal] Add docs/design to oss/exclude.txt (JSX/HTML UI prototypes, not user docs). — S
  3. [internal] Resolve duplicate architecture doc: pick canonical (oss/overlay/docs/architecture.md), exclude or stub the tracked docs/architecture.md. — S
  4. [internal] Decide AGENTS.md duplication (overlay wins; document that the tracked root copy is internal). — S
  5. [internal] Run scripts/oss-sync.sh --dry-run --local-only and read the release-gate output + diff stat to confirm nothing internal leaks. — S
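
When reading the dry-run output, a path-level leak check makes the review less error-prone. The sketch below is not oss/verify_release.py, and it assumes a list of shell-style path patterns; the real oss/forbidden.txt may instead hold content patterns such as hostnames. The function name is hypothetical.

```python
from __future__ import annotations

from fnmatch import fnmatchcase


def find_leaks(public_paths: list[str], forbidden: list[str]) -> list[str]:
    """Return public paths matching any forbidden shell-style pattern.

    fnmatchcase is used so matching is case-sensitive on every platform;
    note that '*' in fnmatch patterns also matches '/' characters.
    """
    return sorted(
        path
        for path in public_paths
        if any(fnmatchcase(path, pattern) for pattern in forbidden)
    )
```

With the patterns GEMINI.md and docs/design/* this would surface exactly the two leaks named in items 1 and 2 until the exclude list is updated.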

P1 — English README + storefront (highest visitor impact)

  1. [public-facing] Translate oss/overlay/README.md to English; move JA to oss/overlay/README.ja.md; add the EN | 日本語 switcher line. Use docs/README.en.draft.md (this branch) as the starting skeleton. — M
  2. [public-facing] Add badges to README: CI/build, release/version, Node 22+, license (keep), PRs-welcome. Only badges that don't require a live host. — S
  3. [public-facing] Add 1-2 screenshots (oss/overlay/docs/assets/) and embed under a "Screenshots" section. Effort: S-M (needs capturing UI).

P2 — community health files

  1. [public-facing] Add CODE_OF_CONDUCT.md (Contributor Covenant 2.1) to oss/overlay/. — S
  2. [public-facing] Add issue + PR templates under the host-appropriate dir in the overlay (.gitea/ISSUE_TEMPLATE/{bug,feature}.md + a PULL_REQUEST_TEMPLATE.md; also .github/ if mirroring to GitHub). — M
  3. [public-facing] Resolve docs/SECURITY.draft.md (this branch): confirm the existing oss/overlay/SECURITY.md is sufficient (it is), then delete the draft, which is a redundant stub. Effort: S.

P3 — i18n of core docs

  1. [public-facing] Translate docs/getting-started.md → EN, move JA to getting-started.ja.md. — M
  2. [public-facing] Translate docs/configuration.md → EN (+ .ja.md). Effort: M-L.
  3. [public-facing] Translate docs/architecture.md → EN (+ .ja.md). — M
  4. [public-facing] Add a "Translations" note to CONTRIBUTING describing the .ja.md convention (EN canonical). — S

P4 — Docker docs

  1. [public-facing] Add oss/overlay/docs/docker.md (EN) covering the 6 gaps in section D (esp. Linux host.docker.internal, verify-running, config mount, sandbox-in-Docker, build-vs-prebuilt, external LLM). Link from README + getting-started §5. — M

P5 — docs reorg reconciliation (coordinate with docs-wip)

  1. [internal] Land docs-wip's docs/reference/*.md consolidation onto main (the 18 curated feature docs; drop the generated docs/html/** from the merge). — L (review of 18 docs)
  2. [internal] Add docs/html/ to .gitignore and docs/html + docs/.investigate-status.json + docs/.*-manifest.json to oss/exclude.txt (generated artifacts; section E). — S
  3. [public-facing] Add a docs/README.md (or section in main README) indexing the docs/reference/* set so the curated docs are discoverable. — S
  4. [public-facing] Convert docs/operations/*.html to Markdown (or exclude from OSS); section E option (a)/(b). — M

P6 — polish (optional, post-launch)

  1. [public-facing] Translate the top 5-8 docs/tools/*.md. Effort: L.
  2. [public-facing] Refresh CHANGELOG.md date/contents at actual release. — S
  3. [public-facing] Add an architecture diagram image to README. — M

Drafts created on this branch (NOT replacing anything)

  • docs/README.en.draft.md: English README skeleton (starting point for P1 item 1).
  • docs/SECURITY.draft.md: redundant stub pointing at the existing policy; exists only to confirm coverage (P2 item 3) and should be deleted once the existing oss/overlay/SECURITY.md is accepted.

Both are clearly labeled DRAFT and live under docs/ so they do not collide with or overwrite the shipping overlay files.