context-docs was previously spawned per call as a fresh stdio container,
which meant every MCP request paid full cold-start cost (embedding model
load + Chroma open) and concurrent clients raced for the same Chroma
writer. The 50+ orphan container build-up I saw during the publish audit
was the visible symptom.

This refactor runs docs-mcp as one long-lived service:

- compose: docs-mcp leaves the 'mcp' profile, gets container_name,
  restart: unless-stopped, healthcheck, and a host port (127.0.0.1:8776
  by default). Runs as the host UID/GID so bind mounts don't end up
  root-owned.
- docker image: adds mcp-proxy (0.12.0) and an entrypoint that fronts
  llms-txt-mcp's stdio as Streamable HTTP. Reads sources from a flat
  file mounted at /etc/context-kit/docs-sources.txt. Disables eager
  preindex by default; callers refresh on demand via the docs_refresh
  tool. Set CONTEXT_KIT_DOCS_PREINDEX=1 to restore eager behavior.
- bin/context-kit: 'start' brings up the docs service alongside SearXNG,
  generates the sources file from CONTEXT_KIT_DOCS_SOURCES, and waits
  for the HTTP endpoint to become ready (up to 180s for first-run model
  download). 'docs' still works for stdio-only clients but is now a
  thin mcp-proxy bridge onto the shared HTTP service. 'doctor' and
  'status' both surface the new endpoint.
- install snippets: context-docs is now 'type: remote'/'type: http'
  pointing at ${CONTEXT_KIT_DOCS_HTTP_URL}. HTTP-capable MCP clients
  bypass the bridge entirely. snippets/*.json and the install command
  output stay byte-identical.
- docs and .env.example updated for new vars (CONTEXT_KIT_DOCS_PORT,
  CONTEXT_KIT_DOCS_HTTP_URL, CONTEXT_KIT_DOCS_PREINDEX) and the new
  24h TTL default (down from 7d; the long-lived service makes shorter
  defaults cheap).
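
For illustration, the two things 'start' is described as doing (generating
the flat sources file and waiting for the HTTP endpoint) could be sketched
roughly as below. This is not the actual bin/context-kit code. The function
names are hypothetical, and the sketch assumes CONTEXT_KIT_DOCS_SOURCES
separates URLs with commas and/or whitespace, which the message above does
not specify.

```shell
#!/bin/sh
# Hypothetical helpers sketching the 'start' flow; not the actual bin/context-kit code.

# Write one source URL per line to the flat sources file.
# Assumption: CONTEXT_KIT_DOCS_SOURCES separates URLs with commas and/or whitespace.
write_sources_file() {
  # $1 = raw source list, $2 = output path
  printf '%s\n' "$1" | tr ', \t' '\n\n\n' | sed '/^$/d' > "$2"
}

# Poll a probe command until it succeeds or the timeout (in seconds) is used up.
# Usage: wait_ready TIMEOUT INTERVAL PROBE [ARGS...]
# Note: POSIX sh has no 'local', so these variables are global.
wait_ready() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  while :; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

A call such as

    wait_ready 180 2 curl -fsS http://127.0.0.1:8776/status

would match the 180s budget and the default port mentioned above, assuming
/status is the readiness probe. Taking the probe as a command argument keeps
the loop independent of curl.
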

Verified end-to-end:

- compose config -q, bash -n, sh -n all clean
- HTTP /status returns 200
- stdio bridge returns initialize + tools/list with the same 3 tools
  (docs_sources, docs_refresh, docs_query)
- doctor passes all 10 checks including the new HTTP probe
- web-search and repomix MCP handshakes still work
- redaction-check clean
- install JSON valid for both targets + --absolute
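
The stdio bridge check could be reproduced with something like the sketch
below. It is not the script used for the verification above. Both helper
names are hypothetical, and the protocolVersion value is an assumption about
what the server negotiates.

```shell
#!/bin/sh
# Hypothetical smoke-test helpers; not part of the repository.

# Emit the three JSON-RPC messages of a minimal MCP session, one per line:
# initialize, the initialized notification, then tools/list.
# Assumption: the server accepts protocolVersion 2025-03-26.
mcp_handshake_messages() {
  printf '%s\n' \
    '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.0"}}}' \
    '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
    '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
}

# Read a response stream on stdin and succeed only if all three tools are listed.
has_expected_tools() {
  response=$(cat)
  for tool in docs_sources docs_refresh docs_query; do
    printf '%s' "$response" | grep -q "\"name\": *\"$tool\"" || return 1
  done
}
```

Piping the two together through the bridge, for example

    mcp_handshake_messages | bin/context-kit docs | has_expected_tools

should succeed if the bridge answers and then exits when stdin closes. That
exit behavior is not confirmed here, so a timeout around the pipeline may be
needed.
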
40 lines
1.2 KiB
Docker
FROM python:3.12-slim

ARG LLMS_TXT_MCP_VERSION=0.2.0
ARG MCP_PROXY_VERSION=0.12.0

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install CPU-only torch first so llms-txt-mcp does not pull large CUDA wheels.
RUN pip install --no-cache-dir \
    --index-url https://download.pytorch.org/whl/cpu \
    torch

# llms-txt-mcp does the indexing/search; mcp-proxy fronts its stdio transport
# as Streamable HTTP so multiple MCP clients can share one long-lived process
# (and therefore one Chroma DB writer).
RUN if [ -n "${LLMS_TXT_MCP_VERSION}" ]; then \
      pip install --no-cache-dir "llms-txt-mcp==${LLMS_TXT_MCP_VERSION}"; \
    else \
      pip install --no-cache-dir llms-txt-mcp; \
    fi \
    && pip install --no-cache-dir "mcp-proxy==${MCP_PROXY_VERSION}"

COPY entrypoint.sh /usr/local/bin/docs-mcp-entrypoint
RUN chmod +x /usr/local/bin/docs-mcp-entrypoint

RUN mkdir -p /data /models /etc/context-kit
ENV HF_HOME=/models \
    SENTENCE_TRANSFORMERS_HOME=/models \
    DOCS_MCP_HTTP_HOST=0.0.0.0 \
    DOCS_MCP_HTTP_PORT=8000 \
    DOCS_MCP_SOURCES_FILE=/etc/context-kit/docs-sources.txt

VOLUME ["/data", "/models"]
EXPOSE 8000

ENTRYPOINT ["/usr/local/bin/docs-mcp-entrypoint"]
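
entrypoint.sh itself is not shown here. As a rough sketch only, an entrypoint
of this shape might turn the flat sources file into arguments and hand off to
mcp-proxy as below. The function names are hypothetical. The sketch assumes
llms-txt-mcp takes source URLs as positional arguments and that the installed
mcp-proxy accepts --host and --port ahead of the wrapped command; both should
be checked against each tool's --help output.

```shell
#!/bin/sh
# Hypothetical sketch of an entrypoint helper; not the actual entrypoint.sh.

# Print each usable source URL from the sources file, one per line,
# skipping blank lines and lines that start with '#'.
read_sources() {
  # $1 = path to the sources file
  [ -r "$1" ] || return 1
  sed -e 's/[[:space:]]*$//' -e '/^[[:space:]]*$/d' -e '/^[[:space:]]*#/d' "$1"
}

# Print the command line a real entrypoint might exec.
# Runs in a subshell so 'set -f' and 'set --' do not leak to the caller.
# Assumption: positional URLs for llms-txt-mcp, --host/--port for mcp-proxy.
build_command() (
  set -f  # disable globbing so '?' or '*' in a URL is not expanded
  set -- $(read_sources "$DOCS_MCP_SOURCES_FILE")
  echo mcp-proxy --host "$DOCS_MCP_HTTP_HOST" --port "$DOCS_MCP_HTTP_PORT" -- llms-txt-mcp "$@"
)
```

Printing the command instead of running it keeps the sketch safe to try
outside the container. A real entrypoint would presumably exec the same
argument list so the proxy becomes PID 1 and receives signals directly.
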