# Python packages pre-baked into the runtime's system python (read-only
# bind-mounted into every Bash sandbox). The agent CANNOT `pip install` at
# runtime (it is rejected in all modes), so anything it might `import` must
# be listed here and re-provisioned. See docs/operations/bash-sandbox-provisioning.md.
#
# NOTE: the Bash sandbox runs with --unshare-net (no network), so network
# libraries (requests/httpx/aiohttp/yt-dlp/scrapy ...) are intentionally NOT
# listed; outbound HTTP must go through the WebFetch/DownloadFile/MCP tools.

# ── Documents / file-format readers ──────────────────────────
pypdf
pymupdf>=1.24        # fitz; PDF render/extract (musllinux wheels >=1.24)
pdfplumber           # PDF text + table extraction (pdfminer.six based)
python-docx          # .docx read/write
python-pptx          # .pptx read/write
openpyxl             # .xlsx read/write
xlsxwriter           # .xlsx write with formatting/charts
xlrd                 # legacy .xls (pre-2007 Excel) read
odfpy                # OpenDocument .odt / .ods
striprtf             # .rtf -> plain text
beautifulsoup4       # HTML/XML parsing
lxml                 # fast XML/HTML backend
markdownify          # HTML -> Markdown
markdown             # Markdown -> HTML

# ── Data / analysis ──────────────────────────────────────────
numpy
pandas
tabulate             # render tables as markdown/plain text
python-dateutil      # flexible date parsing
matplotlib           # offline charts -> image files in output/ (HEAVY: on
                     # alpine/musl needs freetype/libpng build deps; fine on host)

# ── Images / text ────────────────────────────────────────────
Pillow               # image processing
charset-normalizer   # robust encoding detection for messy text
PyYAML               # YAML read/write (stdlib json covers JSON)