Product Requirements Document: The Cognitive Core
Document Version: 1.0
Status: Draft
Date: 2025-08-25
Related Document: PRD: The Cloud Lexicon Architecture v1.1
Candidate names for the architecture:
TMDMamba: Direct, descriptive, and memorable. It clearly communicates the core components: Task, Modifier, Data + Mamba. It is the most practical and least ambiguous name.
Cognitive Fusion Architecture (CFA): More abstract and professional. It emphasizes the "fusion" of instructions (via the IFM) and the cognitive nature of the work the model performs.
Latent-Space Reasoning Core (LSRC): Focuses on the _function_ and _domain_ of the model. It highlights that the model operates in latent space and that its primary purpose is reasoning.
1. Executive Summary & Vision
1.1. The Vision: A True Reasoning Engine
The Cognitive Core is a new class of artificial intelligence model designed to reason directly in a high-dimensional latent space. It functions as the central processing unit for the Cloud Lexicon ecosystem. Unlike traditional Large Language Models, which manipulate tokens, the Cognitive Core ingests and generates pure concepts, represented as vectors.
Its purpose is to perform controllable, stateful, and complex reasoning tasks, moving beyond simple pattern association to achieve a form of abstract thought. This architecture will enable unprecedented efficiency, long-range coherence, and steerable, explicit control over the AI's cognitive processes.
1.2. Strategic Goals
Enable Latent Space Reasoning: Build a model that can perform abstract operations (e.g., analogy, summarization, logical deduction) directly on concept vectors.
Achieve Linear Scalability: Utilize a Mamba/Jamba-based architecture to process vast sequences of concepts without the quadratic overhead of traditional Transformers.
Provide Explicit Control: Allow users and systems to precisely control the model's reasoning task, style, and subject matter through a structured triplet input.
Seamless Lexicon Integration: Function as the native "thinker" for the Cloud Lexicon, using it for both input vocabulary and output expression.
2. System Architecture
The Cognitive Core is a sequence-to-sequence model that transforms an input stream of concept triplets into an output stream of reasoned concept triplets.
2.1. Core Technology
Foundation: A deep stack of Mamba or Jamba blocks. This choice is critical for achieving stateful processing and linear-time complexity with respect to context length.
Input Format: A sequence of V_Instruction vectors.
Output Format: A sequence of (V_Task_Response, V_Modifier_Response, V_Data_Response) triplets.
2.2. Key Architectural Modules
1. Instruction Fusion Module (IFM)
Function: To combine the (V_Task, V_Modifier, V_Data) input triplet into a single, fused V_Instruction vector that primes the model for a specific cognitive task.
Mechanism: A cross-attention network where V_Data acts as the query and V_Task and V_Modifier act as keys/values.
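The mechanism above can be sketched in a few lines of numpy. This is a minimal, single-head illustration, assuming scaled dot-product attention and a residual connection back to V_Data (both are assumptions; the PRD does not fix these details):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_instruction(v_task, v_modifier, v_data, W_q, W_k, W_v, W_o):
    """Cross-attention fusion: V_Data is the query; V_Task and V_Modifier
    supply the keys/values. Single head, for illustration only."""
    kv = np.stack([v_task, v_modifier])            # (2, d)
    q = v_data @ W_q                               # (d,)
    k = kv @ W_k                                   # (2, d)
    v = kv @ W_v                                   # (2, d)
    scores = softmax(k @ q / np.sqrt(q.shape[0]))  # attention over task/modifier
    attended = scores @ v                          # (d,)
    # assumed residual connection keeps the data concept in the fused vector
    return (v_data + attended) @ W_o

d = 8
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
v_instr = fuse_instruction(rng.standard_normal(d), rng.standard_normal(d),
                           rng.standard_normal(d), *Ws)
print(v_instr.shape)  # (8,)
```

In a trained system the projection matrices would be learned end to end with the rest of the stack; random weights here only demonstrate the data flow and shapes.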
2. Mamba/Jamba Sequence Processor
Function: The primary reasoning engine. It processes the sequence of V_Instruction vectors, maintaining a compressed state of the entire conceptual context. Its stateful nature is key to performing coherent, multi-step reasoning.
3. Response Deconstruction Module (RDM)
Function: To take the final, holistic V_Thought vector from the Mamba stack's hidden state and project it back into the structured (V_Task, V_Modifier, V_Data) triplet format.
Mechanism: Three parallel, independent Multi-Layer Perceptron (MLP) heads, one for each component of the output triplet. This allows for specialized training and granular loss calculation.
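A minimal sketch of the RDM's three parallel heads, assuming a one-hidden-layer tanh MLP per head (layer sizes and activation are illustrative, not specified above):

```python
import numpy as np

def mlp_head(x, W1, b1, W2, b2):
    """One independent projection head: tanh hidden layer, linear output."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

def deconstruct(v_thought, heads):
    """Project the holistic V_Thought into the (task, modifier, data) triplet.
    Each head is trained separately, enabling granular per-component loss."""
    return tuple(mlp_head(v_thought, *p) for p in heads)

d, h = 8, 16
rng = np.random.default_rng(1)
heads = [(rng.standard_normal((d, h)) * 0.1, np.zeros(h),
          rng.standard_normal((h, d)) * 0.1, np.zeros(d)) for _ in range(3)]
v_task_r, v_mod_r, v_data_r = deconstruct(rng.standard_normal(d), heads)
print(v_task_r.shape, v_mod_r.shape, v_data_r.shape)
```

Because the heads share no parameters, the loss on (say) the modifier component can be weighted independently of the task and data components during training.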
3. Functional Requirements
3.1. Controllable Reasoning
The model MUST be steerable via the input triplet. It must learn to differentiate between tasks, apply modifiers, and operate on the provided data.
Example Task: (V_Task: "Compare", V_Modifier: "For a beginner", V_Data: "Stoicism vs. Epicureanism") should produce a simplified comparison.
Example Task: (V_Task: "Generate", V_Modifier: "In a poetic style", V_Data: "The concept of entropy") should produce a creative, metaphorical concept.
3.2. Stateful, Chained Reasoning (Chain of Thought)
The model's architecture MUST support recursive operations where the output of one reasoning step becomes the input for the next.
Process: The (V_Task_out, V_Mod_out, V_Data_out) from step N can be directly fed into the IFM as the input for step N+1. This enables complex problem-solving entirely within the latent space.
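The recursive flow can be sketched as a plain loop, with stubs standing in for the trained IFM, Mamba stack, and RDM (all three stubs are placeholders, not the real modules):

```python
import numpy as np

def chain_of_thought(triplet, fuse, reason, deconstruct, steps=3):
    """Feed each step's output triplet back in as the next step's input."""
    trace = []
    for _ in range(steps):
        v_instr = fuse(*triplet)          # IFM: fuse (task, modifier, data)
        v_thought = reason(v_instr)       # Mamba/Jamba stack (stubbed below)
        triplet = deconstruct(v_thought)  # RDM: back to a triplet
        trace.append(triplet)
    return trace

# Placeholder callables, just to show the data flow through the loop.
d = 8
fuse = lambda t, m, x: (t + m + x) / 3.0   # stand-in for the IFM
reason = lambda v: np.tanh(v)              # stand-in for the sequence processor
deconstruct = lambda v: (v, v, v)          # stand-in for the three MLP heads

rng = np.random.default_rng(2)
start = tuple(rng.standard_normal(d) for _ in range(3))
trace = chain_of_thought(start, fuse, reason, deconstruct)
print(len(trace))  # 3
```

No tokens are ever produced inside the loop; every intermediate step remains a concept-vector triplet, which is the point of latent-space chaining.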
3.3. Abstract Operations
The model MUST be trained to perform abstract cognitive functions beyond simple next-concept prediction. This includes, but is not limited to:
Analogical Reasoning: (A is to B as C is to ?)
Summarization & Elaboration
Logical Deduction
Causal Inference
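Analogical reasoning in a vector space is often approximated by offset arithmetic plus a nearest-neighbour lookup. A toy sketch with a hand-built, interpretable concept space (the real Lexicon would use learned GTR-T5 embeddings, not these axes):

```python
import numpy as np

# Toy concept space on interpretable axes (royal, male, female) -- an
# illustrative assumption, not the Lexicon's actual geometry.
lex = {k: np.array(v, dtype=float) for k, v in {
    "man":   [0, 1, 0],
    "woman": [0, 0, 1],
    "king":  [1, 1, 0],
    "queen": [1, 0, 1],
    "child": [0, 1, 1],
}.items()}

def solve_analogy(lex, a, b, c):
    """A is to B as C is to ? -- offset arithmetic plus cosine nearest neighbour."""
    target = lex[b] - lex[a] + lex[c]
    return max((k for k in lex if k not in {a, b, c}),
               key=lambda k: (lex[k] @ target) /
                             (np.linalg.norm(lex[k]) * np.linalg.norm(target)))

print(solve_analogy(lex, "man", "king", "woman"))  # -> queen
```

The Cognitive Core is expected to perform this relation through the trained IFM/Mamba/RDM pipeline rather than raw arithmetic; the sketch only shows why such relations are expressible in concept space at all.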
4. Success Metrics
Task Accuracy: Achieve >90% accuracy on a benchmark of defined reasoning tasks (e.g., correctly solving analogies).
Modifier Adherence: Human evaluations should confirm that the model's output style correctly reflects the V_Modifier input in >85% of cases.
Throughput: Concepts processed per second per GPU (Target: > 10,000 concepts/sec).
Context Window: Demonstrate coherent reasoning across sequences of at least 100,000 concepts.
4.1. Coherence
Semantic Stability: Generated concept sequences must maintain high semantic coherence over long chains of thought, as measured by automated metrics and human evaluation.
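One plausible automated metric for semantic stability is the mean cosine similarity between consecutive concepts in a generated chain (this specific formulation is an assumption, not mandated above):

```python
import numpy as np

def semantic_stability(chain):
    """Mean cosine similarity between consecutive concept vectors in a chain.
    Values near 1.0 indicate a semantically stable line of thought."""
    v = np.asarray(chain, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    return float(np.mean(np.sum(v[:-1] * v[1:], axis=1)))

print(semantic_stability([[1, 0], [1, 0], [1, 0]]))  # 1.0 (perfectly stable)
print(semantic_stability([[1, 0], [0, 1]]))          # 0.0 (an abrupt topic jump)
```

A production evaluation would pair a metric like this with human judgments, since consecutive similarity alone cannot distinguish coherent progress from mere repetition.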
5. Phased Implementation Plan (Crawl, Walk, Run)
This section outlines a phased approach to de-risk the project, ensuring each component of the pipeline is functional and tested before scaling. The initial "Crawl" phase is designed to be fully executable on a high-end local machine (e.g., MacBook M4 Pro).
| Phase | Component | Goal | Scale / Tools | Success Criteria |
| --- | --- | --- | --- | --- |
| CRAWL | Concept Data Selection | Create a minimal, perfect dataset to test the pipeline mechanics. | 10-20 core concepts from a single, simple domain (e.g., basic physics definitions). | Dataset is created and documented. |
| | Concept Data Curation | Manually create the ground-truth input/output triplets. | A single CSV file: (Input_T, M, D) -> (Output_T, M, D). | All 10-20 concepts are curated into training pairs. |
| | Concept Data Storage | Implement basic storage and retrieval. | SQLite database on local disk. | Can successfully read/write concept text. |
| | Vector Generation | Prove vectors can be generated and stored. | GTR-T5 model running locally. Local FAISS index for the ~20-40 vectors. | All concepts are encoded and stored in FAISS; can perform similarity search. |
| | Model Training | Prove the model can learn. The goal is NOT generalization, but to see the loss decrease. | A minimal Mamba model (e.g., 2 layers). Train for 100-200 epochs on the 10-20 concepts. Overfitting is expected and desired. | Training loss converges. The pipeline runs without crashing. |
| | Model Testing | Prove the end-to-end pipeline works. | Run the 10-20 trained concepts plus 5 unseen (but related) concepts through the full system. | The system produces a non-random, plausible output vector for both seen and unseen inputs. No crashes. |
| WALK | Concept Data Selection | Expand to a single, coherent domain. | 1,000-5,000 concepts from one domain (e.g., a full textbook chapter). | Dataset is ingested and processed. |
| | Concept Data Curation | Automate the creation of training data. | Python scripts to generate triplets based on rules and heuristics. Manual review of a subset. | 90% of the dataset is curated automatically. |
| | Concept Data Storage | Move to a more robust database. | Local PostgreSQL or a small cloud SQL instance. | All data is migrated and accessible. |
| | Vector Generation | Scale up the encoding process. | Batch processing scripts. Still local FAISS or a small cloud equivalent. | All 1k-5k concepts are encoded efficiently. |
| | Model Training | Achieve generalization within one domain. | A medium-sized Mamba model. Train on a local GPU or a single cloud GPU (e.g., A10G). | Model achieves low loss on a held-out validation set from the same domain. |
| | Model Testing | Implement an automated evaluation suite. | Automated scripts to calculate the PRD's success metrics (Task Accuracy, etc.) on a test set. | The model meets initial accuracy targets for the single domain. |
| RUN | Concept Data Selection | Scale to a large, multi-domain dataset. | 1,000,000+ concepts from diverse sources (web crawl, papers, books). | Data ingestion pipeline is robust and scalable. |
| | Concept Data Curation | Build a production-grade curation pipeline. | LLM-based data generation and filtering. Human-in-the-loop validation. | Curation pipeline runs continuously with high precision. |
| | Concept Data Storage | Deploy production infrastructure. | Production-grade, replicated cloud SQL and Vector DB. | Database handles high throughput with low latency. |
| | Vector Generation | Deploy a scalable encoding service. | Distributed, serverless functions for on-demand encoding. | New concepts are encoded and added to the Lexicon in near real-time. |
| | Model Training | Train a world-class reasoning model. | Full-scale Mamba/Jamba model, potentially with MoE. Train on a multi-GPU cluster. | Model achieves state-of-the-art performance on reasoning benchmarks. |
| | Model Testing | Implement a full CI/CD/CT pipeline. | Continuous Integration, Delivery, and Training. Automated red-teaming and safety checks. | New models are continuously evaluated and deployed safely. |
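The Crawl-phase storage and retrieval steps can be exercised end to end with stand-ins: SQLite for concept text, random vectors in place of GTR-T5 embeddings, and a brute-force numpy search emulating a flat FAISS index (all stand-ins are assumptions for illustration):

```python
import sqlite3
import numpy as np

# Concept text lives in SQLite; a numpy matrix of unit vectors stands in
# for the local FAISS index of the Crawl phase.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE concepts (id INTEGER PRIMARY KEY, text TEXT)")

rng = np.random.default_rng(42)
index = []
for text in ["Force", "Mass", "Acceleration"]:
    conn.execute("INSERT INTO concepts (text) VALUES (?)", (text,))
    v = rng.standard_normal(16)        # stand-in for a GTR-T5 embedding
    index.append(v / np.linalg.norm(v))
index = np.array(index)                # (3, 16), row i matches concept id i+1

def nearest(query_vec, k=1):
    """Brute-force cosine search, emulating a flat inner-product index."""
    q = query_vec / np.linalg.norm(query_vec)
    order = np.argsort(index @ q)[::-1][:k]
    rows = conn.execute("SELECT text FROM concepts ORDER BY id").fetchall()
    return [rows[i][0] for i in order]

print(nearest(index[1]))  # -> ['Mass']
```

Swapping the stand-ins for the real GTR-T5 encoder and a FAISS index changes two lines, which is exactly the kind of substitution the phased plan is designed to allow.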
Based on the "Crawl" phase outlined above, the goal is to create a minimal, high-quality dataset from a simple, factual domain to prove the pipeline works end to end. The best concepts are distinct but clearly related, allowing us to test a basic reasoning step.
Here are three examples of high-quality concepts from a "basic physics" domain that would be well suited to the initial 10-20 concept list.
Example 1: A Foundational Definition
This establishes a baseline concept. The model's first task is to learn a simple association between a concept and its definition.
Input Triplet:
- Task: Define
- Modifier: In simple terms
- Data: Force
Expected Output Triplet:
- Task: Definition provided
- Modifier: Factual
- Data: A force is a push or pull upon an object resulting from the object's interaction with another object.
Why it's a good concept: It's a simple, unambiguous, one-to-one mapping. This is the easiest task for the model to learn and is perfect for verifying that the basic training loop and data pipeline are functional.
Example 2: A Second Foundational Definition
This adds another core concept from the same domain, building the model's "vocabulary."
Input Triplet:
- Task: Define
- Modifier: In simple terms
- Data: Mass
Expected Output Triplet:
- Task: Definition provided
- Modifier: Factual
- Data: Mass is the measure of the amount of matter in an object and its inertia.
Why it's a good concept: It's consistent with the first example and provides another known entity. Having multiple, distinct definitions is necessary before we can test if the model can reason _between_ them.
Example 3: The First Reasoning Task
This is the most critical concept in the "Crawl" phase. It tests the model's ability to synthesize the first two concepts into a third, relational concept (Newton's Second Law).
Input Triplet:
- Task: Explain relationship
- Modifier: Using an equation
- Data: Force, Mass, and Acceleration
Expected Output Triplet:
- Task: Relationship explained
- Modifier: Mathematical
- Data: Newton's Second Law of Motion states that the force on an object is equal to its mass multiplied by its acceleration (F=ma).
Why it's a good concept: This tests the core hypothesis of the project on a micro scale. The model cannot answer this by simply retrieving a definition; it must learn to associate the three input concepts with a new, structured concept that defines their relationship. If the model can learn this one simple reasoning step, it proves the entire architectural approach is viable.
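The three examples above can be written down directly in the single-CSV curation format the Crawl phase calls for (the column names are illustrative, not a fixed schema):

```python
import csv
import io

# Hypothetical column names for the (Input_T, M, D) -> (Output_T, M, D) format.
fieldnames = ["task_in", "modifier_in", "data_in",
              "task_out", "modifier_out", "data_out"]
rows = [
    ("Define", "In simple terms", "Force",
     "Definition provided", "Factual",
     "A force is a push or pull upon an object resulting from the object's "
     "interaction with another object."),
    ("Define", "In simple terms", "Mass",
     "Definition provided", "Factual",
     "Mass is the measure of the amount of matter in an object and its inertia."),
    ("Explain relationship", "Using an equation", "Force, Mass, and Acceleration",
     "Relationship explained", "Mathematical",
     "Newton's Second Law of Motion states that the force on an object is equal "
     "to its mass multiplied by its acceleration (F=ma)."),
]
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(fieldnames)
writer.writerows(rows)
print(len(buf.getvalue().splitlines()))  # 4 lines: header + 3 curated pairs
```

Each row is one training pair; during vector generation, every text field is encoded into its concept vector, yielding the triplet-to-triplet supervision the model trains on.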