A self-learning conversational AI that adapts permanently to each user — powered by the COGNET architecture.
Status: Fully implemented. All 22 COGNET components exist in 28 source files inside `src/project_adam/`, verified against `architecture.md` (564 lines) by 57 architecture compliance tests. 374 tests total. No stubs, no planned sections: the code is the architecture.
Built on Qwen2.5 (0.5B/1.5B/3B) with LoRA fine-tuning, running on consumer GPUs with 4GB+ VRAM. Optionally uses a remote API endpoint for generation while continuing to train via online distillation.
| # | Component | File | Purpose |
|---|---|---|---|
| 1 | SensoryEncoders | encoder.py | Stacked β-VAE (text/vision/audio), precision-weighted, hardware-tier-aware |
| 2 | WorkingMemory | memory/working.py | 3-buffer Baddeley model, PFC-gated entry, attention eviction, temporal decay, goal tracking |
| 3 | EpisodicMemory | memory/episodic.py | SQLite-backed (s,a,r,c) tuples with SentenceTransformer indexing, temporal compression |
| 4 | SemanticMemory | memory/semantic.py | Schema graph with slot values, Piaget assimilation/accommodation, prediction-error gating |
| 5 | ProceduralMemory | memory/procedural.py | RL-learned skills via RPE, keyword-overlap matching, n-gram chunking (7±2 slots) |
| 6 | SpatialMemory | memory/spatial.py | GridCells + PlaceCells, 17 relation types, conflict detection, graph traversal |
| 7 | DiffMemory | memory/diffmemory.py | 2-layer MLP (384→1536→384) with GELU + LayerNorm + residual; dual learning: gradient + Hebbian + STDP |
| 8 | RLCore | rl_core.py | TD(λ) with Prospect-theoretic RPE (λ=2.25, α=0.88), ActorNetwork (8→64→5), eligibility traces |
| 9 | SFL | sfl.py | 7 social features, Rescorla-Wagner Q-learning, asymmetric Prospect theory loss weighting |
| 10 | BayesianWorldModel | world_model.py | Causal graph with conjugate Gaussian priors, transition simulation, speaker model |
| 11 | WebSearch | search.py | DuckDuckGo general + Wikipedia knowledge, independent caches |
| 12 | MetacognitiveController | metacog.py | MLP policy (9→32→16→5), REINFORCE, 5 canonical actions (REPLAY, SEARCH, RANDOM, REFLECT, EXPLORE) |
| 13 | LanguageInterface | language.py | Dual-backend (local/API), persona builder, behavioral rules injection, utterance-likeness scoring |
| 14 | ActionSelector | selector.py | Dual-system: fast pattern-matching + slow world model deliberation, somatic bias modulation |
| 15 | OfflineConsolidator | consolidator.py | Full 8-step cycle (replay → TD → abstract+noise+Hebbian → prune → reentry → world → procedural → re-encode) |
| 16 | SelfPlayLearner | self_play.py | Daemon thread with 4 strategies (schema/world/procedural/creative), spaced repetition, metacog-gated |
| 17 | SomaticMarker | somatic.py | 3-layer MLP (16→8→1), Damasio somatic marker hypothesis, online post-RPE update, graded bias |
| 18 | PersonaManager | persona_manager.py | List/load/switch/generate personas, multi-draft synthesis via teacher API |
| 19 | Persona | persona.py | Markdown identity with heading parser, behavioral rules, 5-level size guard |
| 20 | MCP Server | mcp_server.py | 13 FastMCP tools over stdio or SSE, all write tools route through canonical agent methods |
| 21 | UserProfileManager | profiles.py | Per-user state, somatic_state tracking, LoRA adapter management, reward history |
| 22 | AgentOrchestrator | agent.py | CognitiveAgent class: instantiates all 22 components, orchestrates chat loop, self-play, consolidation |
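
The table mentions a Prospect-theoretic RPE with λ=2.25 and α=0.88. As a rough illustration, the sketch below applies the standard prospect-theory value function to a raw prediction error. The function name `prospect_rpe` and the choice to apply it directly to a scalar error are assumptions made for this example, not taken from `rl_core.py`.

```python
def prospect_rpe(delta: float, loss_aversion: float = 2.25, curvature: float = 0.88) -> float:
    """Shape a raw reward prediction error with a prospect-theory value function.

    Gains are compressed by `curvature`; losses are compressed the same way
    and then amplified by `loss_aversion`, so a loss outweighs an equal gain.
    (Hypothetical helper for illustration only.)
    """
    if delta >= 0:
        return delta ** curvature
    return -loss_aversion * ((-delta) ** curvature)


if __name__ == "__main__":
    for raw in (1.0, 0.5, -0.5, -1.0):
        print(f"raw={raw:+.2f} -> shaped={prospect_rpe(raw):+.3f}")
```

With these defaults a unit loss is weighted 2.25 times as heavily as a unit gain, which is the asymmetry the component descriptions refer to.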
- Per-user adaptation: User detection, profiles, per-user LoRA adapters at `agent_memory/adapters/{user}/`
- Dual-backend generation: Local Qwen (auto-detected hardware tier) or remote API (OpenAI-compatible)
- Online distillation: Remote API responses train the local LoRA model, no separate pipeline needed
- Hardware auto-detection: Low tier (≤4GB Pascal) → API mode; Mid (8GB+ Volta) / High (24GB+ Ampere) → local mode
- Self-play learning: Autonomous background daemon thread generates (query, teacher_response) pairs from knowledge gaps
- Idle consolidation: Background thread runs 8-step cycle every 300s during conversation inactivity
- Persona generation: Create new personas via teacher API with multi-draft synthesis, auto-condensed to model tier
- 13 MCP tools: Query knowledge, teach, consolidate, switch personas, control self-play via stdio or SSE
- Persistent daemon mode: `--daemon` runs API + background self-play/consolidation 24/7
- Somatic markers: Damasio-style gut-feel bias that modulates fast/slow decision paths based on learned emotional associations
- DiffMemory: Compresses episodic patterns into differentiable MLP weights during consolidation, with Hebbian plasticity + STDP
- Streaming output: Tokens appear one-by-one
- Web UI: Gradio interface at `localhost:7860` (`--web`)
- Voice mode: Speech-to-speech conversation (`--voice`)
- REST API: FastAPI on port 8765, includes OpenAI-compatible `/v1/chat/completions` (streaming + non-streaming)
- Web search: DuckDuckGo for general search, Wikipedia for knowledge (independent caches)
- Generation config: All 18 `model.generate()` parameters configurable via YAML, auto-detected per model size
- Reward-driven learning: Prospect-theoretic RPE (λ=2.25) broadcast to TDCore, SFL, SensoryEncoders, ProceduralMemory, SomaticMarker
- Configurable via YAML: `config.yaml` overrides device, model chain, quantization, generation params, backend mode
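
The OpenAI-compatible `/v1/chat/completions` endpoint listed above can also be called from Python using only the standard library. The sketch below is a minimal non-streaming client. The helper names `build_chat_request` and `chat` are made up for this example, and the response parsing assumes the usual OpenAI chat-completions shape, which this README does not spell out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765/v1"  # port taken from this README; adjust if changed


def build_chat_request(prompt: str, model: str = "adam-cognet") -> urllib.request.Request:
    """Build a non-streaming POST request for /chat/completions (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the first choice's text.

    Assumes an OpenAI-style body: {"choices": [{"message": {"content": ...}}]}.
    """
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("Hello")` requires the REST API server to be running; `build_chat_request` can be inspected without a server.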
pip install -r requirements-dev.txt
# CLI chat
python3 -m project_adam
# Web UI
python3 -m project_adam --web
# Voice mode (needs mic + speakers)
python3 -m project_adam --voice
# Persistent daemon (API + background self-play + MCP SSE)
python3 -m project_adam --daemon
# REST API server (standalone)
./start_adam.sh
# or: PYTHONPATH=src uvicorn project_adam.api:app --host 0.0.0.0 --port 8765
# MCP server (stdio, for subprocess integration)
python3 -m project_adam --mcp
# Test OpenAI-compatible endpoint
curl http://localhost:8765/v1/models
curl -X POST http://localhost:8765/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"adam-cognet","messages":[{"role":"user","content":"Hello"}],"stream":false}'

export API_ENDPOINT="http://localhost:8765/v1"

Edit config.yaml:
backend:
  mode: "auto" # auto-detects: low hardware → API, mid/high → local
  api:
    endpoint: "https://<remoteapibackend>/v1/chat/completions"
    key: ""
    model: "ai-model"

See COGNET Architecture for the full 564-line specification (10 principles, 22 components).
AgentOrchestrator ─┬─ SensoryEncoders (stacked β-VAE: text/vision/audio, precision-weighted)
                   ├─ WorkingMemory (3-buffer, PFC-gated, 7±2 chunk slots)
                   ├─ EpisodicMemory (SQLite, (s,a,r,c) tuples, SentenceTransformer)
                   ├─ SemanticMemory (schema graph, assimilation/accommodation)
                   ├─ ProceduralMemory (RL via RPE, n-gram chunking, Q-values)
                   ├─ SpatialMemory (GridCells + PlaceCells, 17 relations)
                   ├─ DiffMemory (2-layer MLP, gradient + Hebbian + STDP)
                   ├─ RLCore (TD(λ), Prospect-theoretic RPE λ=2.25)
                   ├─ SFL (7 features, Rescorla-Wagner, asymmetric loss)
                   ├─ BayesianWorldModel (conjugate priors, causal graph)
                   ├─ SomaticMarker (3-layer MLP 16→8→1, online post-RPE)
                   ├─ WebSearch (DDGS + Wikipedia, independent caches)
                   ├─ MetacognitiveController (MLP 9→32→16→5, REINFORCE)
                   ├─ LanguageInterface (dual-backend, persona builder)
                   ├─ ActionSelector (fast/slow dual-system, somatic bias)
                   ├─ OfflineConsolidator (8-step cycle: replay→TD→abstract+noise+Hebbian→prune→reentry→world→procedural→re-encode)
                   ├─ SelfPlayLearner (daemon, 4 strategies, spaced repetition)
                   ├─ PersonaManager (list/load/switch/generate)
                   ├─ Persona (Markdown identity, behavioral rules)
                   ├─ MCP Server (13 tools, stdio + SSE)
                   └─ UserProfileManager (per-user state, LoRA adapters)
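
RLCore is described as TD(λ) with eligibility traces. For readers unfamiliar with the technique, here is a minimal tabular sketch of textbook TD(λ) with accumulating traces. It is not the project's implementation: the class name, the tabular state representation, and the default hyperparameters are all assumptions. Note that the trace-decay parameter here is unrelated to the loss-aversion λ=2.25 quoted for the RPE.

```python
from collections import defaultdict


class TabularTDLambda:
    """Textbook tabular TD(lambda) value learner with accumulating traces.

    Hypothetical illustration; hyperparameter defaults are arbitrary.
    """

    def __init__(self, step_size=0.1, discount=0.9, trace_decay=0.8):
        self.step_size = step_size
        self.discount = discount
        self.trace_decay = trace_decay
        self.values = defaultdict(float)   # V(s)
        self.traces = defaultdict(float)   # e(s)

    def update(self, state, reward, next_state, done=False):
        """Apply one TD(lambda) step and return the TD error."""
        bootstrap = 0.0 if done else self.discount * self.values[next_state]
        td_error = reward + bootstrap - self.values[state]
        self.traces[state] += 1.0  # mark the visited state as eligible
        for s in list(self.traces):
            # Every recently visited state shares credit for this error.
            self.values[s] += self.step_size * td_error * self.traces[s]
            self.traces[s] *= self.discount * self.trace_decay
        if done:
            self.traces.clear()  # traces do not carry across episodes
        return td_error
```

The traces are what let a reward at the end of an episode update states visited several steps earlier, in proportion to how recently they were seen.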
| Tier | Criteria | Behavior |
|---|---|---|
| Low | ≤4GB VRAM, Pascal | backend.mode=api, encoder loss task_weight=0.1, speaker model uses perplexity proxy, VisionEncoder/AudioEncoder disabled |
| Mid | 8GB+ VRAM, Volta+ | backend.mode=local, full encoder loss, normalized log-probability speaker model |
| High | 24GB+ VRAM, Ampere+ | Same as Mid, plus flan-t5-large loaded as encoder-decoder backend |
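
The tier table can be read as a small decision function. The sketch below is one possible interpretation, using CUDA compute capability as a stand-in for GPU generation (6.x is Pascal, 7.0 is Volta, 8.x is Ampere). The function names are hypothetical, and treating every GPU that misses the Mid and High thresholds as Low is an assumption, since the table does not cover the range between 4GB and 8GB.

```python
def pick_tier(vram_gb: float, compute_capability: float) -> str:
    """Classify a GPU as 'low', 'mid', or 'high' (hypothetical helper).

    Thresholds follow the tier table; anything that misses the mid/high
    criteria falls back to 'low', which is an assumption.
    """
    if vram_gb >= 24 and compute_capability >= 8.0:  # Ampere or newer
        return "high"
    if vram_gb >= 8 and compute_capability >= 7.0:   # Volta or newer
        return "mid"
    return "low"


def backend_for_tier(tier: str) -> str:
    """In auto mode, low-tier hardware uses the API; mid and high run locally."""
    return "api" if tier == "low" else "local"


if __name__ == "__main__":
    for vram, cc in ((4, 6.1), (8, 7.0), (24, 8.6)):
        tier = pick_tier(vram, cc)
        print(f"{vram}GB, cc {cc}: tier={tier}, backend={backend_for_tier(tier)}")
```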
The local backend defaults to Qwen2.5-3B at 4-bit NF4 (~2.1GB VRAM). If that model is unavailable it falls back to 1.5B, then 0.5B, trying each entry in model_chain until one loads.
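
The fallback behavior described above amounts to a try-in-order loop. A minimal sketch follows, with the actual loading step passed in as a callable so the example stays independent of any model library. `load_first_available` and its signature are assumptions for illustration, not the project's API.

```python
from typing import Callable, Iterable, Tuple, TypeVar

M = TypeVar("M")


def load_first_available(
    model_chain: Iterable[str], loader: Callable[[str], M]
) -> Tuple[str, M]:
    """Try each model id in order and return the first one that loads.

    `loader` is whatever function actually loads a model (hypothetical here);
    it is expected to raise on failure, e.g. out-of-memory or missing weights.
    """
    errors = []
    for model_id in model_chain:
        try:
            return model_id, loader(model_id)
        except Exception as exc:
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("no model in the chain could be loaded:\n" + "\n".join(errors))


if __name__ == "__main__":
    def fake_loader(model_id: str) -> str:
        # Stand-in loader: pretend the 3B model does not fit in memory.
        if "3B" in model_id:
            raise MemoryError("not enough VRAM")
        return f"<loaded {model_id}>"

    chain = [
        "Qwen/Qwen2.5-3B-Instruct",
        "Qwen/Qwen2.5-1.5B-Instruct",
        "Qwen/Qwen2.5-0.5B-Instruct",
    ]
    print(load_first_available(chain, fake_loader))
```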
The API backend uses the remote endpoint configured in config.yaml and falls back to the local model on failure. In auto mode, low-tier hardware automatically uses the API.
LoRA adapters (r=8, target_modules=["q_proj","v_proj"], α=16) are trained every 10 interactions. Training data comes from both local and API responses (online distillation). Adapters are saved per user at agent_memory/adapters/{user}/. RPE-scaled loss weights prioritize high-reward episodes.
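
To make the "every 10 interactions" cadence concrete, the sketch below buffers interactions and hands a weighted batch to a training callback once ten have accumulated. Everything here is hypothetical: the class name, the callback signature, and in particular the weighting formula, which simply gives higher-RPE examples larger weights. The README does not specify how the RPE scaling is computed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Example = Tuple[str, str, float]  # (prompt, response, rpe)


@dataclass
class AdapterTrainingBuffer:
    """Collects interactions and triggers training every `every` examples.

    Hypothetical illustration; `train_fn` stands in for the real LoRA update.
    """

    train_fn: Callable[[List[Example], List[float]], None]
    every: int = 10
    _pending: List[Example] = field(default_factory=list)

    def add(self, prompt: str, response: str, rpe: float) -> bool:
        """Store one interaction; return True if a training step was triggered."""
        self._pending.append((prompt, response, rpe))
        if len(self._pending) < self.every:
            return False
        batch, self._pending = self._pending, []
        # Assumed weighting: the lowest-RPE example gets weight 1.0 and
        # weights grow linearly with RPE above that.
        lowest = min(r for _, _, r in batch)
        weights = [1.0 + (r - lowest) for _, _, r in batch]
        self.train_fn(batch, weights)
        return True
```

The same buffer works whether the response came from the local model or the remote API, which is what lets online distillation share one training path.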
config.yaml supports all runtime options:
device: cuda # "cuda" or "cpu"
base_model: Qwen/Qwen2.5-3B-Instruct
model_chain:
  - Qwen/Qwen2.5-3B-Instruct
  - Qwen/Qwen2.5-1.5B-Instruct
  - Qwen/Qwen2.5-0.5B-Instruct
backend:
  mode: "auto" # "auto", "local", or "api"
  api:
    endpoint: "https://<remoteapibackend>/v1/chat/completions"
    key: "${API_KEY}"
    model: "ai-model"
    timeout: 60
generation:
  max_new_tokens: 128
  temperature: 0.7
  top_p: 0.9

Copy config.yaml.example to config.yaml and edit.
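
The example config uses a `${API_KEY}` placeholder for the key. Whether project_adam expands such placeholders itself is not stated here, so the sketch below shows one way a loader could do it after the YAML has been parsed into a dictionary. The function `expand_env` is hypothetical, and replacing unset variables with an empty string is an arbitrary choice.

```python
import os
import re
from typing import Any, Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${VAR} placeholders in strings (hypothetical helper).

    Dicts and lists are walked; other values are returned unchanged.
    Unset variables become empty strings (an assumption).
    """
    env = os.environ if env is None else env
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    return value


if __name__ == "__main__":
    parsed = {"backend": {"mode": "auto", "api": {"key": "${API_KEY}", "timeout": 60}}}
    print(expand_env(parsed, {"API_KEY": "example-key"}))
```

Keeping the key in an environment variable rather than in config.yaml avoids committing secrets alongside the config.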
# Stdio mode (for subprocess integration — Claude Desktop, etc.)
python3 -m project_adam --mcp
# SSE mode (persistent — connect via HTTP while daemon runs)
# MCP client connects to: http://localhost:8765/mcp/sse
python3 -m project_adam --daemon

| Tool | Description |
|---|---|
| adam_query_knowledge | Search all memory systems for a topic |
| adam_explain_entity | Get Bayesian posterior beliefs about an entity |
| adam_get_status | Stats across all memory systems and self-play |
| adam_process_vision | Submit visual features through VisionEncoder |
| adam_process_audio | Submit audio features through AudioEncoder |
| adam_teach | Submit (query, response) learning pair |
| adam_observe_entity | Submit entity observation → Bayesian update |
| adam_teach_fact | Submit fact → Piaget assimilation/accommodation |
| adam_teach_skill | Submit skill example → Q-learning + chunking |
| adam_consolidate | Run full 8-step consolidation cycle |
| adam_self_play | Start/stop/restart/status autonomous learning loop |
| adam_list_personas | List available personas |
| adam_get_persona | Inspect a persona's structure and rules |
| adam_switch_persona | Switch to a different persona mid-session |
| adam_generate_persona | Generate new persona via teacher API |
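
`adam_observe_entity` is described as triggering a Bayesian update, and the world model is listed as using conjugate Gaussian priors. As background, the sketch below shows the standard conjugate update for a Gaussian mean with known observation noise. The function name and the scalar, single-observation form are assumptions for illustration; the project's actual belief representation may differ.

```python
from typing import Tuple


def gaussian_posterior(
    prior_mean: float, prior_var: float, observation: float, noise_var: float
) -> Tuple[float, float]:
    """Posterior (mean, variance) for an unknown mean after one observation.

    Standard Normal-Normal conjugate update with known noise variance:
    precisions add, and the posterior mean is a precision-weighted average.
    """
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / noise_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_precision * prior_mean + obs_precision * observation)
    return post_mean, post_var


if __name__ == "__main__":
    mean, var = 0.0, 1.0
    for obs in (2.0, 2.0, 2.0):
        mean, var = gaussian_posterior(mean, var, obs, noise_var=1.0)
        print(f"mean={mean:.3f} var={var:.3f}")
```

Each observation shrinks the variance, so repeated consistent observations pull the belief toward the observed value with increasing confidence.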
# All tests
PYTHONPATH=src python3 -m pytest tests/ -v
# Single file
PYTHONPATH=src python3 -m pytest tests/test_diffmemory.py -v
# Architecture compliance (57 tests)
PYTHONPATH=src python3 -m pytest tests/test_architecture.py -v

374 tests across 32 test files covering all 22 components:
- Encoder (β-VAE forward, loss, sparsity, hardware tier, precision-weighting)
- Memory (working, episodic, semantic, procedural, spatial, diffmemory)
- RL core (TD update, actor policy, eligibility traces, Prospect theory)
- SFL (Q-learning, batch, negative reward, asymmetric loss)
- Metacognitive controller (MLP policy, REINFORCE, 5 actions)
- Language interface (generate, build_prompt, behavioral rules, utterance likeness)
- Action selection (Q-values, trajectory sim, fast/slow dual system, somatic bias)
- Integration (full chat flow, user detection, reward tracking, SFL updates)
- Web search (DDGS, Wikipedia, cache, independent methods)
- Profiles, persona, persona management, API endpoints
- Self-play (metacog gate, RPE path, episodic path, no training calls)
- MCP server (13 tools registered, canonical methods only)
- Somatic marker (online post-RPE update, prediction, profile state)
- World model (Bayesian conjugate, causal graph, transition simulation)
- Architecture compliance (57 tests): verifies all 10 principles, 22 components against `architecture.md`
- Dataflow audit: validates RPE broadcast targets, PFC gate circular dependency break, consolidation order
| File | Contents |
|---|---|
| architecture.md | COGNET theoretical architecture (564 lines, 10 principles, 22 components) |
| docs/setup.md | Installation, requirements, CUDA setup, config, logging |
| docs/usage.md | CLI, Web UI, Voice mode walkthrough |
| docs/api.md | REST API endpoints + usage examples |
| docs/training.md | LoRA adaptation, RPE, online distillation |
| docs/memory.md | All 6 memory systems in detail |
| docs/faq.md | Troubleshooting and common questions |
| docs/implementation-flowchart.md | Mermaid flowchart of component implementation order |
| docs/wiring-map.md | Cross-component wiring and data flow |
| docs/DATA_FLOW.md | Detailed data flow through the architecture |
- Minimum: NVIDIA GPU with 4GB+ VRAM (Pascal or newer) — or use remote API backend
- Recommended: NVIDIA GPU with 8GB+ VRAM (Volta or newer) for local generation
- Python 3.12+
- 6GB disk for Qwen2.5-3B model (~2GB quantized)
- API key required for remote API endpoint (set `key` in config.yaml)
AGPL v3 with commercial option — see LICENSE.
Free to use, modify, and distribute under the AGPL v3 terms. If you use this software in a proprietary product without releasing your source code, you must purchase a commercial license.