# PRD_Addendum_LightRAG_LearningLLM_Enhanced_Metrics.md

_(three enhancements aligned with PLMS/PAS/HMI)_

Trent Carter

11/7/2025

A) LightRAG for Codebase (with Vector Manager agent)

Objective: maintain a continuously updated code+graph index for precise retrieval and dependency-aware planning.

Scope (V1):

Ingest the repo on `vp new` and on every commit (post-commit hook).

Build both semantic vectors and a call/import graph (tree-sitter/ctags).

Expose queries: where_defined(symbol), who_calls(fn), impact_set(file|symbol), nearest_neighbors(snippet).

Vector Manager agent owns refresh cadence, backfills, and integrity checks; surfaces drift warnings in HMI.
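The graph-side queries listed above can be illustrated with a small in-memory sketch. This is not the LightRAG implementation: the `CodeGraph` class, its method signatures, and the sample symbols are hypothetical, and a real index would be populated by tree-sitter or ctags rather than by hand. It assumes `impact_set` means "everything that transitively calls the symbol".

```python
from collections import defaultdict, deque


class CodeGraph:
    """Toy in-memory call graph; edges run from caller to callee."""

    def __init__(self):
        self.definitions = {}            # symbol -> (file, line)
        self.callers = defaultdict(set)  # callee -> direct callers

    def add_definition(self, symbol, file, line):
        self.definitions[symbol] = (file, line)

    def add_call(self, caller, callee):
        self.callers[callee].add(caller)

    def where_defined(self, symbol):
        """Return (file, line) for a symbol, or None if unknown."""
        return self.definitions.get(symbol)

    def who_calls(self, fn):
        """Return the direct callers of fn, sorted for stable output."""
        return sorted(self.callers.get(fn, ()))

    def impact_set(self, symbol):
        """Return every symbol that transitively calls `symbol` (reverse BFS)."""
        seen = set()
        queue = deque([symbol])
        while queue:
            current = queue.popleft()
            for caller in self.callers.get(current, ()):
                if caller not in seen:
                    seen.add(caller)
                    queue.append(caller)
        seen.discard(symbol)  # a call cycle should not list the symbol itself
        return sorted(seen)


# Hypothetical sample data: main -> load -> parse
graph = CodeGraph()
graph.add_definition("parse", "src/parser.py", 10)
graph.add_call("load", "parse")
graph.add_call("main", "load")
```

With this sample data, `graph.who_calls("parse")` lists only `load`, while `graph.impact_set("parse")` also includes `main`, which is the distinction the planner needs for dependency-aware work.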

APIs:

rag.refresh(scope) → repo or subpath.

rag.query(kind, payload) → returns code locations + graph paths.

rag.snapshot() → writes rag_snapshot.json bound to git SHA for reproducibility.
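As a rough illustration of what "bound to git SHA" could mean for `rag.snapshot()`, the sketch below records the commit SHA alongside a content hash per indexed file. The snapshot schema, the `build_snapshot` and `write_snapshot` helpers, and the sample inputs are assumptions for illustration; the source only fixes the file name `rag_snapshot.json`.

```python
import hashlib
import json


def build_snapshot(git_sha, indexed_files):
    """Build a reproducible snapshot dict.

    git_sha:       commit the index was built from (supplied by the caller).
    indexed_files: mapping of repo-relative path -> file content as bytes.
    """
    files = {
        path: hashlib.sha256(content).hexdigest()
        for path, content in sorted(indexed_files.items())
    }
    return {"git_sha": git_sha, "file_count": len(files), "files": files}


def write_snapshot(snapshot, path="rag_snapshot.json"):
    """Serialize with sorted keys so identical inputs give identical bytes."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2, sort_keys=True)


# Hypothetical inputs
snapshot = build_snapshot(
    "0123abc",
    {"src/b.py": b"print('b')\n", "src/a.py": b"print('a')\n"},
)
```

Hashing file contents lets a later integrity check detect drift between the index and the working tree without re-embedding anything.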

KPIs / SLOs:

Index freshness ≤ 2 min from commit.

Query latency P95 ≤ 300 ms (local).

Coverage: ≥ 98% of files indexed; graph edges on par with static analyzer.

Risks:

Large monorepos → shard index per submodule; lazy load on demand.

B) Planner Learning LLM (project-experience model)

Objective: reduce cost/time by training a planner-facing LLM on what worked (per lane, per provider, per topology), both locally (per project) and globally (portfolio).

Data to learn from:

Task tree + assignments, lane ids, provider matrix, rehearsal outcomes, KPI passes/fails, budget runways, violations, rework counts.

Pipeline:

After completion: PLMS emits a planner_training_pack.json (sanitized).

Trainer agent fine-tunes or LoRA-adapts the Planner model (or updates a retrieval memory) with dual partitions: LOCAL(project) and GLOBAL(portfolio).

A/B validation: re-run the same project template with the updated Planner (no human in the loop) and compare units (time, tokens by type, cost, energy) against the prior run. Target: ≥15% median improvement after 10 projects.
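One way to compute the A/B target is a median of per-project relative improvements. The function name and the sample figures below are hypothetical, and the sketch assumes each baseline run is paired with a re-run of the same template and that lower values are better.

```python
from statistics import median


def median_improvement(baseline, candidate):
    """Median relative improvement across paired runs.

    baseline, candidate: per-project values of one unit (e.g. cost),
    in the same order. Positive result = candidate is cheaper.
    """
    if not baseline or len(baseline) != len(candidate):
        raise ValueError("need equal-length, non-empty paired runs")
    gains = [(b - c) / b for b, c in zip(baseline, candidate)]
    return median(gains)


# Hypothetical cost figures for three project templates
improvement = median_improvement([100.0, 200.0, 50.0], [80.0, 150.0, 45.0])
meets_target = improvement >= 0.15
```

The same function can be applied per unit (time, each token type, cost, energy) so that a gain in one unit does not hide a regression in another.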

Serving:

Planner uses GLOBAL first, overlays LOCAL deltas if the repo/team matches.

Cold-start: fallback to default priors + CI bands.
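The serving order above might look like the following sketch. It assumes priors are flat numeric dictionaries and that LOCAL deltas are additive offsets on the GLOBAL values; the source does not specify either, and `resolve_priors` plus all keys shown are hypothetical.

```python
def resolve_priors(global_priors, local_deltas, project_key, default_priors):
    """Pick planner priors: GLOBAL first, LOCAL deltas on top, defaults on cold start.

    local_deltas maps a project key (repo/team) to additive offsets.
    """
    if not global_priors:
        # Cold start: nothing learned yet, fall back to defaults.
        return dict(default_priors)
    merged = dict(global_priors)
    for name, delta in local_deltas.get(project_key, {}).items():
        merged[name] = merged.get(name, 0.0) + delta
    return merged


# Hypothetical values
defaults = {"tokens_per_task": 1200.0}
global_priors = {"tokens_per_task": 1000.0, "minutes_per_task": 8.0}
local_deltas = {"repo-a": {"tokens_per_task": -100.0}}

matched = resolve_priors(global_priors, local_deltas, "repo-a", defaults)
unmatched = resolve_priors(global_priors, local_deltas, "repo-z", defaults)
cold = resolve_priors({}, local_deltas, "repo-a", defaults)
```

A project with no matching LOCAL partition simply receives the GLOBAL priors unchanged.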

KPIs / SLOs:

Estimation MAE% drops over time (goal: ≤20% at 10 projects).

Rework rate ↓, KPI violations ↓, budget overruns ↓.
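The estimation KPI can be tracked with a short helper. The source does not define MAE%, so this sketch assumes it means the mean absolute error of each estimate expressed as a percentage of the actual value; `mae_percent` and the sample numbers are hypothetical.

```python
def mae_percent(estimates, actuals):
    """Mean absolute estimation error as a percentage of actuals."""
    if not estimates or len(estimates) != len(actuals):
        raise ValueError("need equal-length, non-empty sequences")
    errors = [abs(e - a) / a for e, a in zip(estimates, actuals)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical estimates vs. actuals for two tasks (same unit)
score = mae_percent([110.0, 90.0], [100.0, 100.0])
within_goal = score <= 20.0
```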

Risks:

Calibration poisoning → include only (baseline|hotfix) ∧ validation_pass ∧ !sandbox.

Privacy → anonymize task text; keep only structured features when required.
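The inclusion rule `(baseline|hotfix) ∧ validation_pass ∧ !sandbox` can be expressed as a single predicate over run records. The field names `run_kind`, `validation_pass`, and `sandbox` are assumed for illustration, and treating a missing `sandbox` flag as sandboxed is a conservative choice of this sketch, not something the source states.

```python
ALLOWED_RUN_KINDS = {"baseline", "hotfix"}


def is_trainable(run):
    """True only if the run satisfies all three inclusion conditions."""
    return (
        run.get("run_kind") in ALLOWED_RUN_KINDS
        and run.get("validation_pass") is True
        # Missing flag is treated as sandboxed so unknown runs are excluded.
        and not run.get("sandbox", True)
    )


# Hypothetical run records
runs = [
    {"id": 1, "run_kind": "baseline", "validation_pass": True, "sandbox": False},
    {"id": 2, "run_kind": "hotfix", "validation_pass": False, "sandbox": False},
    {"id": 3, "run_kind": "experiment", "validation_pass": True, "sandbox": False},
    {"id": 4, "run_kind": "baseline", "validation_pass": True, "sandbox": True},
]
training_ids = [run["id"] for run in runs if is_trainable(run)]
```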

C) Multi-Metric Telemetry & Visualization (incl. Energy/Carbon)

Objective: track and visualize time, tokens (input/output/tool-use/think), cost, and energy (estimated) separately; allow custom roll-ups per stakeholder.

Data model (already compatible):

Extend receipts to log token breakdown per step.

Add energy estimator: E ≈ (GPU_kW × active_time) + (CPU_kW × active_time) with model-specific coefficients; store in receipts.
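A minimal sketch of the estimator follows. It assumes power coefficients in kW and active time in seconds, allows GPU and CPU active times to differ, and converts to kWh. The carbon conversion multiplies energy by a grid intensity in kg CO2e per kWh. All coefficient values shown are placeholders, not measured figures.

```python
def estimate_energy_kwh(gpu_kw, gpu_active_s, cpu_kw, cpu_active_s):
    """E ≈ (GPU_kW × active_time) + (CPU_kW × active_time), in kWh."""
    seconds_per_hour = 3600.0
    return (gpu_kw * gpu_active_s + cpu_kw * cpu_active_s) / seconds_per_hour


def estimate_carbon_kg(energy_kwh, grid_kg_per_kwh):
    """Carbon estimate: energy times grid intensity (kg CO2e per kWh)."""
    return energy_kwh * grid_kg_per_kwh


# Placeholder coefficients: 0.3 kW GPU for 1 h, 0.1 kW CPU for 30 min
energy = estimate_energy_kwh(0.3, 3600.0, 0.1, 1800.0)
carbon = estimate_carbon_kg(energy, 0.4)  # 0.4 kg/kWh is an illustrative value
```

Storing the coefficients used next to each estimate in the receipt would let the HMI show their source, as the risk mitigation in this section asks.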

HMI:

Stacked bars per task and per lane (time / token types / cost / energy).

Budget runway (already added) + carbon overlay for “green” stakeholders.

Compare runs: show percent deltas to prior baselines for the same project template.

APIs:

GET /metrics?with_ci=1&breakdown=all → returns mean + CI for each metric and token subtype.

GET /compare?runA=…&runB=… → structured diff with significance flags.
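The "mean + CI" payload for `/metrics` could be computed roughly as below. This sketch uses a normal-approximation 95% interval (z = 1.96), which is an assumption; for small sample counts a t-based interval would be wider. The response field names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_ci(samples, z=1.96):
    """Return mean and a normal-approximation confidence interval."""
    if not samples:
        raise ValueError("need at least one sample")
    center = mean(samples)
    if len(samples) < 2:
        # A single sample has no spread to estimate; report a zero-width band.
        return {"mean": center, "ci_low": center, "ci_high": center}
    half_width = z * stdev(samples) / sqrt(len(samples))
    return {
        "mean": center,
        "ci_low": center - half_width,
        "ci_high": center + half_width,
    }


# Hypothetical per-run output-token counts for one step
summary = mean_with_ci([10.0, 12.0, 14.0])
```

Applying the same function to each metric and token subtype yields the `breakdown=all` shape, one entry per series.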

KPIs / SLOs:

Visualization latency ≤ 1s for recent projects.

Metrics completeness ≥ 99% of steps report all four classes.
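The completeness KPI can be checked directly against receipts. This sketch assumes each step receipt is a dictionary and that the four classes are stored under the keys `time`, `tokens`, `cost`, and `energy`; those key names are hypothetical.

```python
REQUIRED_CLASSES = ("time", "tokens", "cost", "energy")


def metrics_completeness(step_receipts):
    """Fraction of steps that report all four metric classes."""
    if not step_receipts:
        raise ValueError("no steps to evaluate")
    complete = sum(
        1
        for receipt in step_receipts
        if all(receipt.get(key) is not None for key in REQUIRED_CLASSES)
    )
    return complete / len(step_receipts)


# Hypothetical receipts: the last step is missing its energy estimate
receipts = [
    {"time": 1.2, "tokens": 300, "cost": 0.01, "energy": 0.002},
    {"time": 0.8, "tokens": 120, "cost": 0.004, "energy": 0.001},
    {"time": 2.0, "tokens": 900, "cost": 0.03, "energy": 0.005},
    {"time": 0.5, "tokens": 50, "cost": 0.001},
]
ratio = metrics_completeness(receipts)
meets_slo = ratio >= 0.99
```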

Risks:

Energy estimates imperfect → clearly label as estimated and show coefficient sources; allow per-cluster overrides.

Cross-cutting: how these three fit PLMS/PAS/HMI

PLMS: planning uses LightRAG to localize work; estimates borrow historical priors; rehearsal uses RAG to sample representative strata.

PAS: Vector Manager triggers re-index after artifact writes; KPI receipts attach RAG lookups used.

HMI: adds RAG search widget, dependency graph view (2D/3D), learning gains panel (units saved vs last baseline), and multi-metric bars with carbon overlay.

Open questions (call these in stand-up)

Rename “VP of Engineering”? Proposal: “Project Executive (PEX) Agent” to avoid org-role confusion.

Local model pack defaults? Decide exact SKUs and VRAM targets for Mac/PC.

Energy coefficients: Which GPU/CPU profiles to ship by default?

RAG indexer: prefer tree-sitter or ctags+custom? (I recommend tree-sitter for rich edges; ctags as fallback.)

Planner training cadence: every run vs nightly batch (I recommend nightly).

