context-docs was previously spawned per call as a fresh stdio container,
which meant every MCP request paid full cold-start cost (embedding model
load + Chroma open) and concurrent clients raced for the same Chroma
writer. The 50+ orphan container build-up I saw during the publish audit
was the visible symptom.

This refactor runs docs-mcp as one long-lived service:

- compose: docs-mcp leaves the 'mcp' profile, gets container_name,
  restart: unless-stopped, healthcheck, and a host port (127.0.0.1:8776
  by default). Runs as the host UID/GID so bind mounts don't end up
  root-owned.
- docker image: adds mcp-proxy (0.12.0) and an entrypoint that fronts
  llms-txt-mcp's stdio as Streamable HTTP. Reads sources from a flat
  file mounted at /etc/context-kit/docs-sources.txt. Disables eager
  preindex by default; callers refresh on demand via the docs_refresh
  tool. Set CONTEXT_KIT_DOCS_PREINDEX=1 to restore eager behavior.
- bin/context-kit: 'start' brings up the docs service alongside SearXNG,
  generates the sources file from CONTEXT_KIT_DOCS_SOURCES, and waits
  for the HTTP endpoint to become ready (up to 180s for first-run model
  download). 'docs' still works for stdio-only clients but is now a
  thin mcp-proxy bridge onto the shared HTTP service. 'doctor' and
  'status' both surface the new endpoint.
- install snippets: context-docs is now 'type: remote'/'type: http'
  pointing at ${CONTEXT_KIT_DOCS_HTTP_URL}. HTTP-capable MCP clients
  bypass the bridge entirely. snippets/*.json and the install command
  output stay byte-identical.
- docs and .env.example updated for new vars (CONTEXT_KIT_DOCS_PORT,
  CONTEXT_KIT_DOCS_HTTP_URL, CONTEXT_KIT_DOCS_PREINDEX) and the new
  24h TTL default (down from 7d; the long-lived service makes shorter
  defaults cheap).
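
The readiness wait that 'start' performs can be sketched roughly as below. This is a minimal illustration, not the code in bin/context-kit: `wait_ready` is a hypothetical name, and the example probe assumes curl is installed and uses the /status path mentioned in the verification list.

```sh
# Hypothetical helper: run a probe command once per second until it
# succeeds or the timeout (in seconds) is used up.
wait_ready() {
  timeout_s=$1
  shift
  elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    # The probe is whatever command follows the timeout argument.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example probe (assumption: curl available, default port in use):
# wait_ready 180 curl -fsS http://127.0.0.1:8776/status
```

Passing the probe as arguments keeps the loop independent of how readiness is checked, so the same helper could wrap a different probe.
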
Verified end-to-end:
- compose config -q, bash -n, sh -n all clean
- HTTP /status returns 200
- stdio bridge returns initialize + tools/list with the same 3 tools
  (docs_sources, docs_refresh, docs_query)
- doctor passes all 10 checks including the new HTTP probe
- web-search and repomix MCP handshakes still work
- redaction-check clean
- install JSON valid for both targets + --absolute
# Configuration

Configuration is via environment variables or a `.env` file in the repository
root. Start from `.env.example`.

Explicit environment variables win over `.env` values. The `.env` parser accepts
simple `KEY=VALUE` lines for `CONTEXT_KIT_*` variables only; it does not execute
shell code.

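
The parsing rules above can be illustrated with a small sketch. It is not the parser shipped in `bin/context-kit`; the function name `load_env_file` is hypothetical, and restricting keys to uppercase letters, digits, and underscores is an assumption made for the example. It relies on `printenv` to detect variables already present in the environment.

```sh
# Hypothetical helper: load CONTEXT_KIT_* pairs from a .env-style file
# without evaluating any part of the file as shell code.
load_env_file() {
  env_file=$1
  [ -f "$env_file" ] || return 0
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      CONTEXT_KIT_*=*) ;;  # simple CONTEXT_KIT_ assignment: keep going
      *) continue ;;       # comments, blanks, other variables: skip
    esac
    key=${line%%=*}
    value=${line#*=}
    # Assumption: keys use only A-Z, 0-9, and underscore.
    case $key in
      *[!A-Z0-9_]*) continue ;;
    esac
    # Explicit environment variables win over .env values.
    if printenv "$key" >/dev/null 2>&1; then
      continue
    fi
    # The value is exported literally; $(...) and backticks are not expanded.
    export "$key=$value"
  done < "$env_file"
}
```

Calling `load_env_file .env` before reading any `CONTEXT_KIT_*` variable would give the precedence described above: a value exported in the calling shell is left alone, and anything else in the file is treated as plain text.
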
## Core Variables

| Variable | Default | Purpose |
|---|---|---|
| `CONTEXT_KIT_DATA_DIR` | `$HOME/.local/share/context-kit` | Persistent docs indexes and model cache |
| `CONTEXT_KIT_COMPOSE_PROJECT` | `context-kit` | Docker Compose project and network prefix |
| `CONTEXT_KIT_SEARXNG_PORT` | `8099` | Localhost SearXNG port |
| `CONTEXT_KIT_DOCS_PORT` | `8776` | Localhost port for the long-lived docs-mcp HTTP service |
| `CONTEXT_KIT_DOCS_HTTP_URL` | `http://127.0.0.1:${CONTEXT_KIT_DOCS_PORT}/mcp` | URL emitted into install snippets and used by the stdio bridge |
| `CONTEXT_KIT_DOCS_TTL` | `24h` | Docs re-fetch cadence |
| `CONTEXT_KIT_DOCS_SOURCES` | `config/sources.default.txt` | Space-separated source profile files |
| `CONTEXT_KIT_DOCS_MAX_GET_BYTES` | `75000` | Max bytes returned by docs retrieval |
| `CONTEXT_KIT_DOCS_EMBED_MODEL` | `BAAI/bge-small-en-v1.5` | SentenceTransformers embedding model |
| `CONTEXT_KIT_DOCS_PREINDEX` | `0` | Set to `1` to re-embed every source on container start |

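
The default for `CONTEXT_KIT_DOCS_HTTP_URL` is built from `CONTEXT_KIT_DOCS_PORT`, so the two interact. A minimal sketch of that resolution, assuming plain shell parameter expansion (the function name `docs_http_url` is hypothetical and not part of `bin/context-kit`):

```sh
# Hypothetical helper: resolve the docs HTTP URL from the table defaults.
docs_http_url() {
  # Fall back to the documented default port when none is set.
  port=${CONTEXT_KIT_DOCS_PORT:-8776}
  # An explicit URL wins; otherwise derive it from the port.
  printf '%s\n' "${CONTEXT_KIT_DOCS_HTTP_URL:-http://127.0.0.1:${port}/mcp}"
}
```

Under this sketch, setting only `CONTEXT_KIT_DOCS_PORT=9000` yields `http://127.0.0.1:9000/mcp`, while an explicit `CONTEXT_KIT_DOCS_HTTP_URL` overrides the derived value.
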
## TTL Guidance

`24h` is the default. Most reference docs do not need re-embedding more often,
and the shared service does not re-fetch sources until the TTL elapses.

Use shorter TTLs for fast-moving APIs:

```sh
CONTEXT_KIT_DOCS_TTL=6h bin/context-kit restart
```

Use longer TTLs for stable specs:

```sh
CONTEXT_KIT_DOCS_TTL=30d bin/context-kit restart
```

The docs-mcp container reads `CONTEXT_KIT_DOCS_TTL` at startup, so changes
require `bin/context-kit restart`. When freshness matters for one task, prefer
calling the `docs_refresh` MCP tool instead of lowering the global TTL.

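
Because a TTL change only takes effect after a restart, it can help to sanity-check the value first. The sketch below converts a TTL string to seconds. It assumes only the `h` and `d` suffixes that appear in this document; the formats the service actually accepts may differ, and `ttl_to_seconds` is a hypothetical name.

```sh
# Hypothetical helper: convert a TTL such as 6h or 30d into seconds.
# Assumption: a positive integer without leading zeros, then h or d.
ttl_to_seconds() {
  ttl=$1
  num=${ttl%[hd]}       # strip one trailing unit character, if present
  unit=${ttl#"$num"}    # whatever was stripped (empty when no unit)
  case $num in
    ''|0*|*[!0-9]*) echo "invalid TTL: $ttl" >&2; return 1 ;;
  esac
  case $unit in
    h) echo $((num * 3600)) ;;
    d) echo $((num * 86400)) ;;
    *) echo "invalid TTL: $ttl" >&2; return 1 ;;
  esac
}
```

For example, `ttl_to_seconds 6h` prints `21600` and `ttl_to_seconds 30d` prints `2592000`, while a value without a unit is rejected.
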
## Source Profiles

The docs MCP accepts one or more source files:

```sh
CONTEXT_KIT_DOCS_SOURCES="config/sources.default.txt config/sources.js.txt"
```

Each source file is plain text. Blank lines and `#` comments are ignored.

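
As a rough illustration of how several profiles could be flattened into one list, the sketch below concatenates the files named in `CONTEXT_KIT_DOCS_SOURCES` and drops blank lines and whole-line `#` comments. The function name `merge_sources` is hypothetical, the real generation step in `bin/context-kit` may differ, and whether trailing inline comments are also stripped is not covered here.

```sh
# Hypothetical helper: print the entries of every source profile,
# skipping blank lines and lines whose first non-space character is '#'.
merge_sources() {
  # Intentionally unquoted: the variable is a space-separated file list.
  for profile in $CONTEXT_KIT_DOCS_SOURCES; do
    if [ ! -f "$profile" ]; then
      echo "missing source profile: $profile" >&2
      return 1
    fi
    # grep exits non-zero when a file has no entries; that is not an error.
    grep -v -E '^[[:space:]]*(#|$)' "$profile" || true
  done
}
```

Redirecting the output, for example `merge_sources > docs-sources.txt`, would produce a single flat file from the listed profiles.
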