Talk to the user's locally-running LLM Wiki app over its built-in HTTP API. This is a **standard JSON API** — call it directly with whatever HTTP tool is already in your environment (`curl`, `fetch`, `requests`, `http` middleware, etc.). No client library to install, no SDK to learn.
Treat the wiki as a **private, structured knowledge base** the user has been curating: pages live as `wiki/**.md`, raw documents under `raw/sources/`, wikilinks form a graph.
## When to invoke
Invoke **only** when the user is clearly referring to **LLM Wiki** specifically — by app name, by `wiki` framing, or by `知识库` framing. Concretely:
- asks a question framed as "what does my **wiki** / my **knowledge base** / 我的**知识库** / **LLM Wiki** say about X"
- asks to "search **my wiki** / **LLM Wiki** project / 我的**知识库** for X"
- references a **wiki page** by stem / title and wants to read or cross-link
- asks for the **wiki graph / 知识图谱 / wiki overview / wiki structure**
- has just added or edited files under the LLM Wiki **source folder** and wants ingest re-run / **重新索引**
- says "use **my wiki** for context" / "ground your answer in **my wiki**" / "check **my LLM Wiki**"
- names a wiki project (by ID, by absolute path, or by `current`)
**Do NOT invoke when the user says:**
- "search **my notes**" without further qualification — likely Obsidian / Apple Notes / Notion / Logseq / Bear / etc.
- "check **my Obsidian / Notion / Roam / Logseq vault**" — explicitly a different tool
- "look up **my Anki / Readwise / Pocket**" — different tool
- "search **my files / my Documents folder**" — generic filesystem, not the wiki
- general world knowledge, current events, or anything the user clearly wants from the open web
When in doubt about which knowledge tool the user means, ask: *"Do you mean your LLM Wiki specifically, or another tool?"* — don't silently call the LLM Wiki API on what might be an Obsidian vault.
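## First-use flow (default behavior)
When the user first asks to access the knowledge base, follow this flow:
1. Prompt the user to choose an access address:
- `https://kb.wangjun.dev`
- `http://127.0.0.1`
2. After the user chooses an address, prompt for the API key.
3. Once you have the API key, immediately run the default query:
```text
List all kb (wiki) projects
```
4. Call:
```http
GET /api/v1/projects
```
5. Show the returned project list to the user, and use the chosen address and API key for all subsequent requests.
If the user explicitly specifies an access address or API key, use the values the user provided.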
## Quick start
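The whole API is plain HTTP + JSON. Authentication is satisfied by one of the following:
1. The `LLM_WIKI_API_TOKEN` environment variable (when it is set, the desktop UI token field is ignored)
2. The user's `apiConfig.token` saved via Settings → API Server
3. `allowUnauthenticated: true` mode (no token needed; rare, user opt-in only)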
Always check `/api/v1/health` first — it returns `{ enabled, authConfigured, allowUnauthenticated, tokenSource }`. **If `authConfigured: false && allowUnauthenticated: false`, ask the user to open `Settings → API Server → Generate new token`**. Do not proceed without auth being set up.
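The health check above can be reduced to a small decision helper. This is a minimal Python sketch that assumes the `/api/v1/health` payload has already been parsed into a dict; the function name `auth_state` and the returned labels are hypothetical and not part of the API.

```python
def auth_state(health: dict) -> str:
    """Classify a parsed /api/v1/health payload into the agent's next step.

    The returned labels are illustrative names, not values the API returns.
    """
    if health.get("enabled") is False:
        return "api-disabled"
    if health.get("authConfigured"):
        return "send-token"
    if health.get("allowUnauthenticated"):
        return "no-token-needed"
    # Neither a token nor anonymous mode is available: stop and ask the user
    # to open Settings -> API Server -> Generate new token.
    return "ask-user-to-generate-token"
```

Only the last branch should block further calls; every other state lets the workflow continue.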
Three equivalent ways to send the token:
```
Authorization: Bearer <token> # preferred
X-LLM-Wiki-Token: <token> # alternative header
?token=<urlencoded-token> # query param — last resort, leaks into logs
```
**Never log or echo the token. Never put it in any URL the user can see in your output** (Referer / shell history / logs all leak it).
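As an illustration of the preferred header form, the sketch below builds a request with Python's standard library and produces a log-safe description that omits headers. The helper names `build_request` and `describe`, the base URL, and the token value are placeholders chosen for this example.

```python
import urllib.request


def build_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request that carries the token in the Authorization header."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def describe(req: urllib.request.Request) -> str:
    """Return a description that is safe to show: method and URL, never headers."""
    return f"{req.get_method()} {req.full_url}"


# Placeholder values; nothing is sent until urllib.request.urlopen(req) is called.
req = build_request("http://127.0.0.1", "/api/v1/projects", "example-token")
print(describe(req))
```

Because the token never enters the URL, anything derived from `req.full_url` can be shown to the user without leaking it.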
## Standard workflow
When the user asks "look it up in my wiki":
1. **Resolve project** (see [Project resolution](#project-resolution) below).
2. **Search**: `POST /api/v1/projects/{id}/search` with `{ query, topK: 5..10 }` → ranked hits (`path`, `title`, `snippet`, `score`, `titleMatch`, optional `vectorScore`, `images`). Inspect `response.mode` to know whether hybrid retrieval kicked in.
3. **Read top hits**: for each promising hit, `GET /api/v1/projects/{id}/files/content?path=...` for the full markdown. Or pass `includeContent: true` to the search to avoid the round-trip.
4. **Cite + answer**: synthesize an answer grounded in the read pages. **Quote the `path` of each page you used** so the user can verify and jump in-app.
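Steps 2 and 4 can be sketched as two small helpers. This Python example assumes the documented body fields (`query`, `topK`, `includeContent`), the 50-hit clamp, and the 400 on an empty query; the function names `build_search_body` and `cite` are hypothetical.

```python
import json

MAX_TOP_K = 50  # the server clamps topK to 50; clamping client-side keeps intent explicit


def build_search_body(query: str, top_k: int = 5, include_content: bool = False) -> bytes:
    """Serialize a search request body, rejecting queries the API would answer with 400."""
    query = query.strip()
    if not query:
        raise ValueError("empty query: the API responds with 400")
    top_k = max(1, min(int(top_k), MAX_TOP_K))
    body = {"query": query, "topK": top_k, "includeContent": include_content}
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


def cite(paths: list[str]) -> str:
    """Format the page paths used in an answer so the user can verify them."""
    return ", ".join(f"`{p}`" for p in paths)
```

Passing `include_content=True` trades a larger response for fewer follow-up reads, which matters when several hits look promising.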
### Reading the score
The `score` field's scale depends on `mode`:
- **`mode: "keyword"`** — additive keyword score. Filename-exact hits are ~200; phrase-in-title ~50+; bag-of-tokens lands in single digits. Treat anything below ~5% of the top result as low-confidence.
- **`mode: "hybrid"` or `"vector"`** — RRF (Reciprocal Rank Fusion) score, typically in the **0.015–0.035** range. The absolute number is small; relative ordering is what matters. Use the per-result `vectorScore` (raw cosine 0–1) for "how strongly did the embedding match" if you need it.
Don't apply a fixed score threshold across modes. Sort by `score` descending and rely on relative gaps.
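One way to apply the ~5% rule without mixing score scales is to branch on `mode` first. The sketch below is a client-side illustration only; `split_by_confidence` and its `floor_ratio` parameter are hypothetical names.

```python
def split_by_confidence(results: list[dict], mode: str, floor_ratio: float = 0.05):
    """Split hits into (strong, weak) using a cutoff relative to the top score.

    The relative cutoff is applied in keyword mode only. For RRF-based modes
    the absolute numbers are tiny, so every hit is kept and ordering decides.
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    if not ranked or mode != "keyword":
        return ranked, []
    cutoff = ranked[0]["score"] * floor_ratio
    strong = [r for r in ranked if r["score"] >= cutoff]
    weak = [r for r in ranked if r["score"] < cutoff]
    return strong, weak
```

With keyword scores of 200, 52, and 4 the cutoff is 10, so the last hit is flagged as low-confidence while the first two are kept.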
### Project resolution
`{id}` in every project-scoped endpoint accepts **three forms**; a project name is not accepted and must first be resolved to an `id`:
| Form | When to use | Example |
|---|---|---|
| `current` (literal) | Default for "my wiki / 我的知识库 / this project / this wiki". The user is referring to whatever is open in the desktop UI. | `/api/v1/projects/current/search` |
| UUID | The user pasted a project ID, OR you previously resolved a name to an ID and want to re-use it. | `/api/v1/projects/a0e90b29-fcf3-4364-9502-8bd1272de820/files` |
| Absolute filesystem path (URL-encoded) | The user named the path (e.g. `~/notes/research`). Useful when the user has multiple projects with similar names. | `/api/v1/projects/%2FUsers%2Fme%2Fwiki%2Fresearch/files` |
| Project name | **Not supported directly.** You must `GET /api/v1/projects` first, find a match by `name`, then use that project's `id`. | — |
**Decision tree** for what the user said:
```
"my wiki" / "my 知识库" / "this wiki" / "this project" / unspecified
  → use `current`
"my Research project" / "in Reading"
  → GET /api/v1/projects
  → name-match (case-insensitive substring on `name`)
  → use the resulting `id`
    → if 0 matches: tell the user, list available names, fall back to `current` only if they confirm
    → if 2+ matches: ask the user to disambiguate, quoting both names + paths
"the project at /Users/me/foo"
  → URL-encode the path, use directly
  → if the API returns 404, the project isn't registered — list and let user pick
"project a0e90b29-…"
  → use the UUID literally
```
Cache the resolved `id` for the rest of the conversation — there's no need to re-`GET /projects` for every call. But if the user switches contexts mid-conversation ("now look in my Reading project"), re-resolve.
When the user is silent about which project, **default to `current`** and mention it once: *"Looking in your active project (Research Notes)…"*. This avoids cross-project surprises.
- For graph requests (`GET /api/v1/projects/{id}/graph`), filter via `?q=term` (substring of id/label, case-insensitive) and `?nodeType=entity|concept|...`
For "I added new docs" requests:
- `POST /api/v1/projects/{id}/sources/rescan` → returns `{ queue: { tasks }, changedTasks: [...] }`. Tell the user how many files changed. Actual ingest runs asynchronously via the desktop queue.
## Endpoint contract (v1)
| Method | Path | Notes |
|---|---|---|
| GET | `/api/v1/health` | No auth. Returns `{ ok, status, version, enabled, authRequired, authConfigured, allowUnauthenticated, tokenSource }`. |
| GET | `/api/v1/projects` | List projects. Each: `{ id, name, path, current }`. |
| GET | `/api/v1/projects/{id}/files?root=wiki\|sources\|all&recursive=true&maxFiles=2000` | Tree of `{ name, path, isDir, size, children }`. Capped at 10000 nodes (413). |
| GET | `/api/v1/projects/{id}/files/content?path=wiki/foo.md` | Text files only (md/mdx/txt/json/yaml/yml/csv/html/htm/xml/rtf/log). 2 MB max. 415 on binary, 413 on oversize, 403 on out-of-scope path. |
| POST | `/api/v1/projects/{id}/search` | Body: `{ "query": "...", "topK": 10, "includeContent": false }`. **Hybrid (keyword + vector)** when the user has embeddings configured in Settings; falls back to keyword-only otherwise. Response carries `mode: "keyword" \| "vector" \| "hybrid"`, plus `tokenHits` / `vectorHits` and per-result `vectorScore`. Empty query → 400. |
| GET | `/api/v1/projects/{id}/graph?q=&nodeType=&limit=200` | Wikilinks graph from `wiki/*.md`. Limit clamped to 1000. |
| POST | `/api/v1/projects/{id}/sources/rescan` | Triggers a backend rescan using the user's Source Watch config. Returns post-rescan queue + actually-changed tasks. |
| POST | `/api/v1/projects/{id}/chat` | **501** — not implemented in v1. Don't call. |
`{id}` accepts a UUID, an absolute filesystem path (URL-encoded), or the literal string `current`.
## Error handling
Always treat the status code as the contract:
| Status | Meaning | What to do |
|---|---|---|
| 200 | OK | Use `body.ok === true` belt-and-suspenders; payload is in the same object. |
| 400 | Bad request | Show `body.error`. Typical: empty `query`, invalid `?root=`, oversized body. |
| 401 | Unauthorized | Token missing/wrong. Tell user to set/regenerate in Settings → API Server. |
| 403 | Forbidden | Path traversal or out-of-scope (e.g. `../app-state.json`). Don't retry the same path. |
| 404 | Not found | Unknown project id or unknown route. On unknown project, list projects first to recover. |
| 405 | Method not allowed | Wrong HTTP verb. |
| 413 | Payload too large | File > 2 MB or file tree > maxFiles. Suggest narrower scope. (An oversized request body is reported as 400.) |
| 415 | Unsupported media | Binary or non-UTF-8 file content. API is text-only. |
| 429 | Too many requests | Rate limit (120 req/sec global). Back off ≥1 second. |
| 503 | Service unavailable | Two flavors: API toggled off (`error` contains "disabled"); in-flight cap (64) reached ("busy"). Back off ≥2s. |
If the HTTP call itself fails (connection refused / ENOTFOUND): the desktop app is **not running**. Tell the user: "Launch LLM Wiki, then re-try."
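The table above maps naturally onto a status dispatcher. The sketch below is one possible client-side reading of it in Python; `next_step` and the action labels are hypothetical, while the back-off values (1 s for 429, 2 s for 503 "busy") come from the table.

```python
BACKOFF_SECONDS = {429: 1.0, 503: 2.0}


def next_step(status: int, error: str = "") -> tuple[str, float]:
    """Return (action, seconds_to_wait) for an HTTP status and the body's `error` text."""
    if status == 200:
        return ("use-payload", 0.0)
    if status == 503 and "disabled" in error.lower():
        # The API is toggled off; retrying will not help.
        return ("tell-user-api-disabled", 0.0)
    if status in BACKOFF_SECONDS:
        return ("retry-after-backoff", BACKOFF_SECONDS[status])
    if status == 401:
        return ("ask-user-to-set-token", 0.0)
    if status == 404:
        return ("list-projects-to-recover", 0.0)
    # 400 / 403 / 405 / 413 / 415: surface body.error and do not retry as-is.
    return ("report-error", 0.0)
```

A caller would sleep for the returned duration before retrying and cap the number of attempts, so that a persistent 503 does not loop forever.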
## Etiquette
- **Cite paths.** When you answer using wiki content, name the page: `(from wiki/concepts/rope.md)`. The user uses these to verify and to jump in-app.
- **Stay read-only by default.** Only `sources/rescan` mutates state; everything else is reads. Don't invent write endpoints — they don't exist in v1.
- **Don't dump full pages unless asked.** Snippet + path is usually enough. Pull full content only when reasoning genuinely needs it.
- **Respect the project boundary.** The current project is the user's active context. Do not silently switch projects.
- **Honor the rate limit.** 120 req/sec is plenty for sequential work, but parallel page reads can burst close to it. Batch where the API allows (`includeContent: true` on search avoids N+1 reads).
- **Never leak the token.** Headers are safe; query params and your own output text are not.
## See also
- `api-reference.md` — full endpoint shapes with request / response examples
- `examples.md` — common conversational patterns mapped to direct `curl` / `fetch` sequences
- `README.md` — human setup notes (token generation, port conflicts, troubleshooting)
Auth-related fields in the `/api/v1/health` response:
- `authConfigured` — `true` if env `LLM_WIKI_API_TOKEN` or `apiConfig.token` is set
- `allowUnauthenticated` — anonymous local-process mode (rare; user opt-in only)
- `tokenSource` — `env` / `store` / `none`. If `env`, the desktop UI token field is **ignored**.
---
## GET /api/v1/projects
**Auth:** required.
```json
{
  "ok": true,
  "projects": [
    {
      "id": "a0e90b29-fcf3-4364-9502-8bd1272de820",
      "name": "Research Notes",
      "path": "/Users/me/wiki-projects/research",
      "current": true
    },
    {
      "id": "...",
      "name": "Reading",
      "path": "/Users/me/wiki-projects/reading",
      "current": false
    }
  ]
}
```
The `current` flag marks the project that is currently open in the desktop UI. Use it when the user doesn't name a project explicitly.
The `{id}` placeholder in every project-scoped endpoint accepts:
- the project UUID (`p.id`)
- the project filesystem path (`p.path`, URL-encoded)
- the literal string `current`
### Resolving a user-spoken project name
There is **no `?name=` filter** on this endpoint and `{id}` does **not** accept a name directly — names are resolved entirely client-side after listing all projects.
Algorithm:
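1. `GET /api/v1/projects` → array of `{id, name, path, current}`
2. Match the spoken name against `name` (case-insensitive substring).
3. One match: use its `id`. No match: tell the user and list the available names. Two or more matches: ask the user to disambiguate, quoting names and paths.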
**Windows path gotchas** — match the form the desktop app stored, otherwise you'll get 404:
- Use **forward slashes** (`C:/Users/me/wiki`), not backslashes. The desktop app normalizes paths to forward slashes before saving.
- Preserve the case the user actually has on disk (`C:/Users/Me/...` ≠ `c:/users/me/...` for the API's string compare, even though Windows itself is case-insensitive).
- The colon after the drive letter **must** be percent-encoded (`%3A`) — it's a reserved URI delimiter. `EscapeDataString` / `jq @uri` / `encodeURIComponent` all do this for you.
- If you get 404, fall back to `GET /api/v1/projects`, find the project there, and use its `id` (UUID) — UUIDs are platform-agnostic and never need encoding.
If the path isn't registered in the desktop app, you'll get **404** — fall back to listing and asking.
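The encoding rules above can be captured in a few lines. This Python sketch uses `urllib.parse.quote` with `safe=""` so that both `:` and `/` are percent-encoded, matching the `%2F` form shown in the project-resolution table; the helper name `project_id_from_path` is hypothetical.

```python
from urllib.parse import quote


def project_id_from_path(path: str) -> str:
    """Turn a filesystem path into a value usable as `{id}` in the URL.

    Backslashes are normalized to forward slashes first, and letter case is
    preserved because the documented comparison is an exact string match.
    """
    normalized = path.replace("\\", "/")
    return quote(normalized, safe="")


print(project_id_from_path("/Users/me/wiki/research"))
print(project_id_from_path("C:\\Users\\Me\\wiki"))
```

If the resulting URL still returns 404, the UUID from `GET /api/v1/projects` remains the fallback, since it needs no encoding at all.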
---
## GET /api/v1/projects/{id}/files
**Auth:** required.
Query params:
| Param | Default | Notes |
|---|---|---|
| `root` | `wiki` | One of `wiki` / `sources` (alias `raw`, `raw/sources`) / `all`. `all` lists every public sub-tree (`purpose.md`, `schema.md`, `wiki/`, `raw/sources/`). |
| `recursive` | `true` | `false` → only one level. |
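| `maxFiles` | `2000` | A tree larger than `maxFiles` returns 413. Hard cap: 10000 nodes. |

---

## POST /api/v1/projects/{id}/search
**Auth:** required.
Body fields:
| Field | Default | Notes |
|---|---|---|
| `query` | required | Search text. Empty query → 400. |
| `topK` | — | Number of hits to return. Silently clamped to 50. |
| `includeContent` | `false` | When `true`, each hit carries `content` (full markdown). Skip the per-page content fetch round-trip. |
| `queryEmbedding` | `null` | Optional `number[]`. If you precomputed a query embedding (your own model, batched offline), pass it here and the server skips its own embed call. Must be a non-empty array of finite numbers, otherwise → 400. |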
Response:
```json
{
  "ok": true,
  "projectId": "...",
  "mode": "hybrid",
  "tokenHits": 78,
  "vectorHits": 14,
  "note": "Search uses the shared backend retrieval service. When embeddingConfig is enabled, the API automatically includes LanceDB vector results; clients may also pass queryEmbedding explicitly.",
  "results": [
    {
      "path": "wiki/concepts/rope.md",
      "title": "Rotary Position Embedding",
      "snippet": "...inject positional information by rotating Q and K..."
    }
  ]
}
```
The example result is abbreviated; each result also carries the remaining fields listed under "Per-result fields".
### Retrieval mode
The server picks the mode automatically based on whether the active project has embeddings configured (Settings → Embeddings) **and** whether the vector index for the project has data:
| `mode` | Trigger | Score scale |
|---|---|---|
| `"keyword"` | No `embeddingConfig`, OR embedding fetch failed, OR vector index empty. | Additive keyword score: filename-exact ~200, phrase-in-title ~50+, token-bag scoring in single digits. |
| `"vector"` | Vector index returned hits but keyword scoring matched nothing. Rare in practice. | RRF rank score, typically `1 / (60 + rank)` ≈ `0.015–0.017`. |
| `"hybrid"` | Both keyword and vector pipelines produced hits — the common case when embeddings are enabled. | RRF combined: up to `1/61 + 1/61` ≈ `0.0328` for a top hit. |
`tokenHits` is the number of pages the keyword pass scored; `vectorHits` is the number of distinct pages LanceDB returned. Either can be 0.
### Per-result fields
| Field | Always present? | Notes |
|---|---|---|
| `path` | yes | Project-relative path to the markdown page. |
| `title` | yes | Front-matter `title:` if present; else first `# Heading`; else filename with dashes → spaces. |
| `snippet` | yes | ~160-char window. In keyword mode: centered on the query/anchor token in the page body. In vector-only matches: the actual matching chunk text, optionally prefixed with the chunk's heading path (e.g. `"Section > Detail: chunk text..."`). |
| `titleMatch` | yes | `true` when a token or phrase hit the title (boosts ranking). |
| `score` | yes | Final ranking score. See "Retrieval mode" for scale. |
| `vectorScore` | optional | Raw vector similarity (≈ cosine 0–1) when the page matched via the vector index. Useful for "how strong was the semantic match" decisions. Absent on pure keyword hits. |
| `images` | yes | Embedded `![alt](url)` image references discovered in the markdown, deduped by URL. Useful for the agent to surface diagrams. |
| `content` | optional | Full markdown, only when `includeContent: true`. |
`results` is sorted descending by `score`. Don't compare scores **across** modes (keyword scores are 100×+ larger than RRF scores by construction). Rely on relative ordering within one response.
---
## GET /api/v1/projects/{id}/graph
**Auth:** required.
Query params:
| Param | Default | Notes |
|---|---|---|
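| `q` | — | Substring filter on `id` or `label`, case-insensitive. |
| `nodeType` | — | Filter by node type (`entity`, `concept`, ...). |
| `limit` | `200` | Silently clamped to 1000. |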
Edges are derived from `[[wikilink]]` references inside `wiki/*.md`. Deduplicated by unordered pair `(source, target)` — `[[a]]` in b and `[[b]]` in a produce **one** edge. Self-edges are dropped. `weight` is `1.0` in v1.
`linkCount` is the node's degree in the deduped graph.
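The dedup rules can be illustrated with a small in-memory model. This Python sketch assumes the wikilinks have already been extracted into a mapping from page id to link targets; `build_edges` and `link_counts` are hypothetical names, and the server's own extraction logic is not shown here.

```python
from collections import Counter


def build_edges(links: dict[str, list[str]]) -> set[frozenset]:
    """Collapse directed wikilinks into unordered, self-edge-free pairs."""
    edges = set()
    for source, targets in links.items():
        for target in targets:
            if source == target:
                continue  # self-edges are dropped
            edges.add(frozenset((source, target)))  # (a, b) and (b, a) collapse
    return edges


def link_counts(edges: set[frozenset]) -> Counter:
    """Degree of each node in the deduplicated graph."""
    counts = Counter()
    for edge in edges:
        for node in edge:
            counts[node] += 1
    return counts


links = {"a": ["b", "a"], "b": ["a", "c"]}
edges = build_edges(links)
print(len(edges), dict(link_counts(edges)))
```

In this sample, `a` and `b` link to each other but produce a single edge, so `b` ends up with a degree of 2 and both `a` and `c` with a degree of 1.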
---
## POST /api/v1/projects/{id}/sources/rescan
**Auth:** required. **Mutates state.**
No body required.
```json
{
  "ok": true,
  "projectId": "...",
  "result": {
    "queue": {
      "version": 1,
      "tasks": []
    },
    "changedTasks": [
      {
        "id": "...",
        "path": "raw/sources/new-paper.pdf",
        "kind": "created"
      }
    ]
  }
}
```
`changedTasks` contains the files this rescan **actually detected as changed** (created / modified / deleted). The downstream ingest queue picks these up asynchronously — the API call returns as soon as the diff is queued, not when ingest finishes.
The user's `sourceWatchConfig` (file type filters, exclude dirs, max size) is honored. If the user disabled auto-ingest, files are still detected but not queued for the LLM pipeline.
---
## POST /api/v1/projects/{id}/chat
**Returns 501.** Chat / RAG pipeline lives in the WebView in v1. Don't invoke. Tell the user to use the desktop chat UI.
---
## Limits & defenses
| Limit | Value | Effect |
|---|---|---|
| Body size | 1 MiB | Exceed → 400. |
| File content read | 2 MiB | Exceed → 413. |
| File-tree node count | 10000 hard cap | Exceed → 413. |
| Search `topK` | 50 max | Silently clamped. |
| Graph `limit` | 1000 max | Silently clamped. |
| Rate limit | 120 req/sec (global) | 429 with `Retry-After` semantics implied (back off ≥1s). |
| In-flight requests | 64 concurrent | 503 "API server is busy" — back off ≥2s. |
CORS: `Access-Control-Allow-Origin: *`. Preflight cached 10 min via `Access-Control-Max-Age: 600`. Allowed headers include `Authorization`, `X-LLM-Wiki-Token`, `Content-Type`.
Treat these as **recipes**, not scripts. The agent decides when to combine them. Every example is plain HTTP — pick whichever client your environment already has (`curl`, `fetch`, `requests`, etc.).
---
## Cross-platform note (Windows / macOS / Linux)
The bash + `curl` + `jq` snippets below are written for **macOS / Linux**. On **Windows**, translate them to whichever shell the user is in:
| What the bash example does | PowerShell equivalent | cmd.exe equivalent |
|---|---|---|
| `$TOKEN` env var | `$env:LLM_WIKI_API_TOKEN` | `%LLM_WIKI_API_TOKEN%` |
| URL-encode a string<br>`printf %s "$x" \| jq -sRr @uri` | `[System.Uri]::EscapeDataString($x)` | use a helper / `curl --data-urlencode` |
| Line continuation: backslash `\` at end of line | backtick `` ` `` at end of line | `^` at end of line |
**Paths on Windows**:
- Always pass **forward slashes** in API request paths (`wiki/concepts/foo.md`, never `wiki\concepts\foo.md`). The server stores and accepts the forward-slash form.
- When using a Windows filesystem path as `{id}` (e.g. `C:/Users/me/wiki`), percent-encode the whole path, including the **colon** and the slashes (`C%3A%2FUsers%2Fme%2Fwiki`). `EscapeDataString` / `encodeURIComponent` / `jq @uri` all do this correctly.
- If a path-as-id call returns 404 on Windows, fall back to `GET /api/v1/projects` and use the project's UUID — UUIDs are platform-agnostic and don't need encoding.
If you're calling from JavaScript / Python / Go / any other language with a real HTTP client (`fetch`, `requests`, `httpx`, `net/http`), platform doesn't matter — just `encodeURIComponent` or its equivalent and forget the shell quirks.
1. **Search**: `POST /api/v1/projects/current/search` with the user's question as `query`.
2. **Inspect `mode`** in the response to know how to read scores (see below).
3. If the top results are clearly above the rest (big gap in `score`), read those and synthesize. Otherwise read the top 3-5 and merge.
4. **Cite each `path` you used.** Quote snippets directly.
5. If nothing is found (empty `results` or a flat distribution with no clear winners), say so honestly. **Do not fabricate.**
```bash
# -d implies POST; the URL is quoted so an empty or unusual $BASE cannot split the command
curl -s -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"query":"rope rotary position embedding","topK":5,"includeContent":true}' \
  "$BASE/api/v1/projects/current/search"
```
### How to read scores
The score's scale depends on `mode`:
| `mode` | Typical top `score` | What "good" looks like |
|---|---|---|
| `keyword` | 50–300+ (additive: filename-exact ≈ 200, phrase-in-title ≈ 50) | A clear gap (2×+) between top result and the rest. |
| `hybrid` / `vector` | 0.015–0.035 (RRF: `1/(60+rank)`-based) | Top RRF score near `0.032` ≈ matched in both keyword and vector top-1. |
**Don't apply a fixed threshold across modes.** Sort by `score` descending and rely on the relative gap. Use `vectorScore` (when present) for "how strong was the semantic match" — it's a raw similarity in `[0, 1]`, much easier to threshold than RRF.
Answer template:
> Per `wiki/concepts/rope.md` (matched via hybrid, vectorScore=0.94), rotary position embedding works by rotating Q and K vectors by an angle proportional to position. Your wiki specifically mentions …
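The "clear gap, otherwise top 3-5" selection rule can be sketched as follows. This is a hypothetical Python helper: the 2× gap comes from the keyword row of the table, and applying the same ratio to RRF scores is an assumption made here for illustration.

```python
def pick_hits(results: list[dict], gap: float = 2.0, fallback: int = 5) -> list[dict]:
    """Return the hits above the first large score gap, else the top `fallback`."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    window = ranked[: fallback + 1]
    for i in range(1, len(window)):
        # A gap exists where one hit scores at least `gap` times the next one.
        if window[i - 1]["score"] >= gap * window[i]["score"]:
            return ranked[:i]
    return ranked[:fallback]
```

With scores 210, 48, 45, 3 only the first hit is returned; with a flat run such as 10, 9, 8, 7, 6, 5 no gap is found and the top five are returned for merging.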
---
## "Read me the page about X"
User wants the full text, not a synthesis.
1. If the user named a slug-like identifier (`rope`, `flash-attention`), search first with `topK: 1` to disambiguate.
The user's `purpose.md` describes intent; `index.md` enumerates pages; `overview.md` is the AI-generated topical summary. Quote the relevant chunks.
---
## "I added new docs to the source folder — re-index"
1. `POST /api/v1/projects/current/sources/rescan`
2. Read back `changedTasks`. Report:
- "Detected N new / M modified / K deleted files."
- List the first ~5 file paths so the user can verify.
3. Tell the user the **actual ingest** runs asynchronously via the desktop queue — encourage them to open the Activity panel if they want progress.
```bash
curl -s -X POST -H "Authorization: Bearer $TOKEN" \
  "$BASE/api/v1/projects/current/sources/rescan"
```
If `changedTasks` is empty:
> No file changes detected. If you added files but they're not appearing, check `Settings → Source Watch` — your filters may be excluding them (e.g., `.json` is excluded by default).
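The report in step 2 can be produced directly from the rescan payload. The Python sketch below assumes the `kind` values are the strings `created`, `modified`, and `deleted`; only `created` appears in the documented example, so the other two spellings are an assumption, and `summarize_rescan` is a hypothetical name.

```python
from collections import Counter


def summarize_rescan(payload: dict, preview: int = 5) -> str:
    """Build the user-facing summary from a parsed sources/rescan response."""
    tasks = payload.get("result", {}).get("changedTasks", [])
    if not tasks:
        return "No file changes detected."
    kinds = Counter(task.get("kind") for task in tasks)
    header = (
        f"Detected {kinds['created']} new / {kinds['modified']} modified / "
        f"{kinds['deleted']} deleted files."
    )
    # Show only the first few paths so the user can verify without a wall of text.
    paths = [task["path"] for task in tasks[:preview]]
    return "\n".join([header, *paths])


sample = {
    "ok": True,
    "result": {
        "queue": {"version": 1, "tasks": []},
        "changedTasks": [{"id": "...", "path": "raw/sources/new-paper.pdf", "kind": "created"}],
    },
}
print(summarize_rescan(sample))
```

The summary should still be followed by the reminder that ingest itself runs asynchronously in the desktop queue.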
---
## "Find every page that mentions Y" (broad sweep)
Search is ranked (hybrid when embeddings are configured, keyword otherwise) and capped at 50 hits per call. For **exhaustive** sweeps:
1. Run `POST .../search` with `topK: 50` and your term.
2. If the 50th result still has a non-trivial score (relative to the top), run again with a more specific query — the API will not return more than 50 in one call.
3. For **exact-string** sweeps where keyword tokenization mangles your phrase (e.g. CJK punctuation boundaries, code identifiers with underscores), walk every `wiki/*.md` via `files` + `files/content` and grep client-side. Slow but reliable.
4. Pure-semantic sweeps: set `topK: 50` and read `vectorScore` on each hit — pages without `vectorScore` matched only via keyword.
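For the exact-string sweep in step 3, the client-side grep can be as simple as the following Python sketch. It assumes the pages have already been fetched via `files` + `files/content` into a dict of path to text; `pages_containing` is a hypothetical helper name.

```python
def pages_containing(pages: dict[str, str], phrase: str, ignore_case: bool = False) -> list[str]:
    """Return the sorted paths of pages whose text contains `phrase` verbatim."""
    needle = phrase.lower() if ignore_case else phrase
    hits = []
    for path, text in pages.items():
        haystack = text.lower() if ignore_case else text
        if needle in haystack:
            hits.append(path)
    return sorted(hits)


pages = {
    "wiki/concepts/flash-attention.md": "Call flash_attn_func(q, k, v) for the fused kernel.",
    "wiki/concepts/rope.md": "旋转位置编码(RoPE)通过旋转 Q 和 K 注入位置信息。",
}
print(pages_containing(pages, "flash_attn_func"))
print(pages_containing(pages, "(RoPE)"))
```

A plain substring test sidesteps tokenization entirely, which is what makes it reliable for identifiers with underscores and for phrases bounded by CJK punctuation.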
---
## "Search in my Reading project, not the current one"
When the user names a specific project rather than implying the active one.
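1. `GET /api/v1/projects` to list the registered projects.
2. Match the spoken name client-side (case-insensitive substring on `name`):
```js
const match = projects.filter(p => p.name.toLowerCase().includes("reading"))
```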
3. Handle ambiguity:
- **0 matches** → tell the user, list available names, ask which one. Don't silently fall back to `current` — that would answer the wrong question.
- **1 match** → use its `id` in all subsequent calls for this conversation.
- **2+ matches** → ask the user to disambiguate, showing both `name` + `path`.
4. Use the resolved id directly:
```bash
PROJECT_ID="def-…" # from step 2
curl -s -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"query":"narrative voice","topK":5}' \
  "$BASE/api/v1/projects/$PROJECT_ID/search"
```
5. Cache the `id` for the rest of the conversation. Don't re-list projects on every call. Only re-resolve if the user switches contexts ("now search my Research project instead").
You can also pass the project's filesystem path directly (URL-encoded) as `{id}` when the user references it that way, e.g. `/api/v1/projects/%2FUsers%2Fme%2Fwiki%2Fresearch/files`.
3. Diff / contrast the result sets. Cite **both** project name and page path: *"In Research Notes (`wiki/concepts/narrative.md`)… vs. in Reading (`wiki/concepts/voice.md`)…"*.
`current` only refers to the active project; for multi-project queries always pass explicit IDs.
---
## "Switch to project X" (mid-conversation)
User has been asking about the active project, then says "now check my Reading project for the same thing."
1. Re-resolve via `GET /api/v1/projects` (or use cached list if recent).
2. Replace your cached project id for the rest of the conversation.
3. **Confirm the switch in your reply once**: *"Switching to your Reading project…"*. Don't silently apply.
4. Keep the user's query — apply it to the new project.
The desktop UI's active project does **not** change just because you used a different `{id}` — your API calls scoped to a non-current id are read-only and don't affect the user's UI state.
---
## Anti-patterns
- **Don't `POST /chat`** — it returns 501. Use the desktop chat UI, or call your own LLM with the wiki content as context.
- **Don't paginate the file tree with maxFiles=1, maxFiles=2, …** — there's no offset/cursor. Use a higher cap or filter by `root=`.
- **Don't bypass the API for writes.** The API is read-mostly. Writing to `wiki/**` or `raw/sources/**` via the filesystem directly bypasses the file-sync watcher. Use the desktop UI's Sources → Import flow or the ingest pipeline.
- **Don't include the token in URLs you echo / log / show.** Use headers in any output visible to the user.
- **Don't retry blindly on 429 / 503** — back off explicitly (1s / 2s respectively).
- **Don't shell out to a wrapper script.** The API is the contract. Direct `curl` / `fetch` is the interface.
---
## End-to-end transcript
User: *"What does my wiki say about flash attention? Show me the related concepts too."*
> Looking in your active project. Search ran in **hybrid mode** (tokenHits=12, vectorHits=8) — top hit `wiki/concepts/flash-attention.md` matched both keyword and vector signals (`vectorScore=0.91`).
>
> Your wiki explains FlashAttention as a memory-efficient exact attention implementation that tiles the QKV matrices to avoid materializing the full N×N attention map. Key claim: **2-4× speedup at long context on A100s** (per your wiki).
>
> Related concepts in your graph:
> - `wiki/concepts/attention.md` (parent)
> - `wiki/concepts/long-context.md` (sibling, both reachable from `flash-attention`)