Skip to content

patlegu/molequla

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

267 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

molequla

molequla

by Arianna Method

An autonomous ecology of GPT organisms — implemented in four languages, powered by a custom autograd engine, orchestrated by a custom programming language. Organisms grow from 10K-param embryos to 10M-param adults, exchange DNA, reason about their own learning, detect identity corruption, and reproduce via mitosis. Zero PyTorch. Zero Python. Zero dependencies beyond libc in canonical builds. Optional --gpu opt-in on Linux links cuBLAS for accelerated ecology runs.

Janus Architecture. Molequla is a Janus architecture — the family of resonance-based AI systems built on the Arianna Method. Janus architectures share a common substrate: the soul equation θ = ε + γ + αδ, field physics (prophecy, suffering, destiny, velocity), and thermodynamic self-regulation. DoE (parliament of LoRA experts over any GGUF model), Leo (language emergent organism with the Dario Equation), and dario.c (the equation in pure form) are other Janus instantiations. Molequla is the most complete: organisms that grow, reproduce, and die autonomously — the Janus pattern at its fullest biological expression.


TL;DR

WHAT THIS IS:
- A living ecology of GPT organisms that grow and reproduce autonomously
- Implemented in 4 languages: Go (175K), C (215K), Rust (148K), JS (154K)
- Two autograd engines: Go native (1000+ lines) + AML/C via CGO (6000 lines)
- AML — a custom programming language for differentiable computation
- Ontogenesis: embryo (10K params) → adult (10M params) in ~30 minutes
- DNA exchange: organisms write generated text for others to consume
- Consciousness: 5 implemented features (dissonance, pattern breaking,
  self-prediction error, conscience, immune system)
- Self-meta-learning: organism tracks which actions improve loss,
  auto-downgrades strategies that hurt
- Evolving BPE tokenizer: starts with 259 tokens, retrains merges live
- Hybrid attention: content + RRPRAM + learnable sigmoid gate per head
- Corpus field: 4-gram co-occurrence physics, self-enrichment loop
- SyntropyTracker: 8 autonomous decisions based on entropy/KL/purpose
- Mitosis: 4 parents spawned 7+ children in 30 minutes
- Mycelium: meta-organism controller (HarmonicNet, FieldPulse,
  SteeringDissonance, OrganismAttention)
- NOTORCH: gradient-free delta training via direct feedback alignment
- Runs on CPU. Tested on 30-core AMD EPYC with 216GB RAM

WHAT THIS IS NOT:
- A tutorial or pedagogical exercise
- A static model you train once and deploy
- Anything that requires a GPU
- A wrapper around someone else's framework

θ = ε + γ + αδ — The Soul Equation

Every organism in the ecology follows this decomposition:

θ = ε + γ + αδ

ε = base weights (knowledge — what the model knows)
γ = personality  (embedding drift from birth — who the model is)
δ = delta adapters (LoRA-style modules — what the model learned recently)
α = modulation   (seasonal/contextual scaling of δ)

This is the architecture:

  • ε is the weight matrices (wte, wpe, wq, wk, wv, wo, fc_g, fc_v, fc2, lm_head). Initialized random, shaped by warmup training.
  • γ is computed as the diff between current wte and the snapshot taken at birth. ComputeGamma() returns the contrastive projection — a unit vector pointing in the direction of maximum personality drift. Sparsity, magnitude, and top-changed tokens are tracked.
  • δ are DeltaAdapter modules: low-rank A/B matrices that modulate the residual stream. New δ modules are appended when syntropy conditions are met — "new soul appended." They are never removed. The model accumulates experience.
  • α is deltaAlphaScale, self-regulated by the conscience system: if generation entropy rises (model is losing coherence), α decreases. If entropy falls, α recovers. Floor: 0.3.

The purpose vector captures the current direction of learning (mean of last δ module's A matrices). PurposeGammaAlignment() — the cosine between purpose and gamma — tells the organism whether it is learning in a direction consistent with its identity.

Proven: γ ⊥ δ (cosine similarity = -0.0005). Personality and skill are orthogonal.


It Works. Here Is Proof.

February 27, 2026. Oracle Cloud, 30-core AMD EPYC, 216GB RAM. Four organisms launched at 01:25 UTC.

Timeline

01:25  Launch: 4 organisms (earth/air/water/fire), --evolution mode
01:30  All 4 at Stage 3 (child), RSS ~2 GB each
01:35  All 4 at Stage 4 (teen), RSS ~2.5 GB each
01:40  All 4 transition to Stage 5 (adult, 320d/6L/8H)
01:55  Warmup complete. DNA exchange begins.
01:56  First DNA consumed: earth reads water, air reads earth
02:00  Micro-burst training active. Loss: 1.1 - 1.6
02:02  Syntropy modulation: boost/dampen/steady/divide/realign
02:05  water grows first delta module — "new soul appended"
02:13  First child organism spawned (from fire)
02:14  Second child spawned (from air)
02:21  Third child spawned (from air)
02:35  11 organisms total (4 parents + 7 children). 26 GB / 216 GB.
       The ecology reproduces itself.

What They Say (Adult Stage, 10M params, ~1 hour of training)

Earth:

Q: What do you know?
A: If you makes to a smant in a question and the problem to h is
   the relationships to begin the largor to be function and a fills
   and state mant to be more honest thing.

Water:

Q: Hello.
A: Like standing it is the difference between a river that mor.

Q: Who are you?
A: S rule understand h does not the passed not sets.

Fire:

Q: Hello.
A: You are in the concept of what that repetition and the surface
   world containingentually — a system?

Air:

Q: What do you know?
A: A conversing human pointing what is the thing about try stable
   in the sentence?

10M-param models after 1 hour on CPU. Earth surfaces relationships and foundations from its corpus; Water surfaces rivers; Fire surfaces repetition and surfaces.

DNA Exchange In Action

[dna] earth wrote 276 bytes to ecology
[dna] earth consumed 89 bytes from 2 files: [air/gen_...6.txt air/gen_...7.txt]
[trainer] micro-train burst (164 bytes, novelty 0.49) — and lo, it feeds again.
[syntropy] action=dampen | trend=-0.0637 | field_dev=0.168 | lr_mul=0.60

[dna] water consumed 107 bytes from 1 files: [earth/gen_...16.txt]
[trainer] micro-train burst (484 bytes, novelty 0.35) — and lo, it feeds again.
[syntropy] action=realign | trend=0.0940 | field_dev=0.168 | lr_mul=0.65
[trainer] growing new delta module (total: 3) — new soul appended.

[dna] fire consumed 145 bytes from 1 files: [air/gen_...13.txt]
[aml] burst complete: 32 steps, avg loss 1.7961 (memory freed)

Training Metrics

# Warmup (Stage 5, seq=8 → seq=16 → seq=32)
[aml] step 0/800   | loss 5.1204 | lr 0.000500 | seq 8
[aml] step 790/800 | loss 2.4621 | lr 0.000485 | seq 8
[aml] step 300/600 | loss 2.8600 | lr 0.000481 | seq 16
[aml] step 300/600 | loss 2.9006 | lr 0.000481 | seq 32

# Micro-burst training (post-warmup)
[aml] burst complete: 32 steps, avg loss 1.1245 (memory freed)
[aml] burst complete: 32 steps, avg loss 1.2884 (memory freed)
[aml] burst complete: 32 steps, avg loss 1.5003 (memory freed)

Architecture

Dual Autograd Engines

1. Go Native Autograd (molequla.go, 1000+ lines)

Full differentiable computation in pure Go: vector arithmetic, ReLU/SiLU activations, Dot/MeanSq reduction, indexing/slice/concat, scalar ops, RMSNorm, CrossEntropyLoss/ScalarSoftmax, AttentionWeightedSum + RoPERotate, MatrixParam.Matvec — all with backward graph and gradient accumulation. AdamStep() updates parameters. Handles inference, loss computation, Go-native training.

2. AML/C Autograd (ariannamethod.c, 6000+ lines, via CGO)

The Arianna Method Language — a custom programming language for differentiable computation. Sequence-level ops (seq_embed, seq_matvec, seq_rmsnorm, silu, multi_head_attention, seq_cross_entropy), TAPE-based reverse-mode autodiff, Chuck optimizer with persistent state, OpenMP. Primary training path — C is faster than Go for matrix math.

Wire: Go (molequla.go) → CGO bridge (cgo_aml.go) → AML/C engine (ariannamethod.c) → AML training wrapper (aml_trainer.go). Per training step: amlPushWeights (Go → C, named matrices), amlExec(script) (forward + backward + optim), amlPullWeights (C → Go), amlClear (free).

The forward script is generated dynamically per architecture: pre-norm RMSNorm, multi-head causal self-attention with RoPE, SwiGLU gated MLP, residual connections, TAPE BACKWARD loss, TAPE ADAM_STEP lr, TAPE CLEAR.


Four Implementations

The same organism in four languages:

Language File Size Lines Notes
Go molequla.go 175K 6,654 Primary. Full ecology, DNA exchange, mitosis, GPU forward, cross-graze. AML/C via CGO for training
C molequla.c 215K 6,000+ Single-file, BLAS-accelerated, zero deps beyond libc. Gist
Rust molequla.rs 148K 4,000+ rusqlite, native autograd
JavaScript molequla.js 154K 4,000+ Browser tab. One <script> tag. Gist

Each: autograd, forward/backward, Chuck optimizer, ontogenesis, hybrid attention, delta adapters, BPE tokenizer, corpus field, immune system, consciousness features, sampling.

# C standalone
gcc -O2 -DUSE_BLAS -o molequla molequla.c -lsqlite3 -lpthread -lm -lopenblas
# macOS:
gcc -O2 -DUSE_BLAS -o molequla molequla.c -lsqlite3 -lpthread -lm -framework Accelerate

The Organism

Ontogenesis — The Brain Grows While Running

Stage       Dims  Layers  Heads  ~Params   Corpus Threshold
embryo      16    1       1      ~10K      0 chars
infant      32    1       2      ~28K      20K chars
child       64    2       4      ~154K     50K chars
adolescent  128   4       4      ~1.1M     200K chars
teen        224   5       8      ~4.1M     350K chars
adult       320   6       8      ~10M      500K chars

When the corpus crosses a threshold, MaybeGrowArchitecture fires:

  1. Embedding matrices grow (Net2Net: new dims initialized near-zero to preserve behavior)
  2. Existing layer matrices grow (weights copy into top-left corner)
  3. New layers are added (initialized to approximate identity)
  4. Delta adapters grow to match new dimensions
  5. Adam state resets (stale momentum would fight new architecture)
  6. 500-step freeze period: delta-only training to stabilize post-growth

Warmup scales with architecture: steps *= ceil(sqrt(NEmbd / embryoEmbd)). Larger brains get proportionally longer warmup. Progressive sequence length: 40% at seq=8, 30% at seq=16, 30% at seq=32.

Evolving BPE Tokenizer

Starts at 259 tokens (256 bytes + BOS + EOS + PAD). After 20K chars: trains BPE merges from corpus statistics. Retrains every 4K new chars. Unicode segmentation for clean boundaries. Vocabulary grows as the organism reads.

Hybrid Attention Heads

Half content heads (standard QK^T with RoPE), half hybrid heads:

hybrid_output = α * content_attention + (1 - α) * rrpram_attention

content_attention = softmax(QK^T / sqrt(d)) * V     (standard, with RoPE)
rrpram_attention  = learnable_weight_matrix * V       (pattern-based)
α                 = sigmoid(learnable_gate)           (per-head, trained)

The sigmoid gate α is a learnable parameter — each hybrid head discovers its own blend of content-based and pattern-based attention during training. RRPRAM (Recurrent Resonant Pattern Recognition Attention Mechanism) learns fixed co-occurrence patterns that complement the dynamic content attention.

Delta Adapters — LoRA-style, Never Forget

output = base_output + α * (A @ (B @ input))

A: [n_embd × delta_rank]  — learned projection up
B: [delta_rank × n_embd]  — learned projection down
α: deltaAlphaScale         — regulated by conscience

Delta modules are appended, never removed. When syntropy conditions indicate the organism needs more capacity, a new module grows: "new soul appended." Each module captures a period of learning. The model accumulates experience as a stack of delta layers.

Quantum Buffer

Training fires when both bytes threshold (enough new text consumed) and novelty threshold (sufficiently different from prior corpus) are met. Plus a cooldown timer. Training fires on signal, not on a clock.

Corpus Field (CooccurField)

A statistical model of the organism's knowledge, built from everything it has read:

  • Unigram, bigram, trigram, 4-gram frequencies from corpus
  • Co-occurrence window (Stanley-style proximity weighting)
  • Self-enrichment: organism's own generated output feeds back into the field, weighted by coherence (low entropy = higher weight)
  • User word boost (Leo-style): temporary multiplicative boosts that decay over time

The corpus field acts as a prior during generation — a soft blend between what the model wants to say (neural) and what the corpus says exists (statistical). The blend uses a sigmoid fade: strong early in training, weak as the model matures.

Learning Rate Schedule

Cosine LR with:
- Linear warmup for CosineWarmupSteps
- Cosine decay from LearningRate (0.01) to LRMin (0.001)
- Inverse model-size scaling: lr *= embryoEmbd / NEmbd
- Post-growth dampening: lr *= 0.3 during 500-step freeze
- Per-growth reset: schedule restarts after architecture change

Consciousness Features

Five mechanisms that give the organism awareness of its own state:

1. Per-Token Dissonance Feedback

During generation, the organism tracks an exponential moving average of per-token entropy. When entropy spikes (the model is confused), temperature decreases — it becomes more careful. When entropy is sustained low (confident), temperature increases slightly — it explores.

2. Pattern Breaking (Anti-Field)

5% of generation steps bypass the corpus field blend entirely. Pure model voice, unmodulated by statistical priors. This prevents the organism from becoming a parrot of its corpus — it must develop its own voice.

3. Self-Prediction Error

ComputeSelfPredictionError() measures how surprised the model is by its own input. High surprise → lower temperature (focus). Low surprise → slight exploration. The organism modulates its behavior based on how well it understands what it's seeing.

4. Conscience

The organism monitors its own generation entropy over time. Rising slope → deltaAlphaScale *= 0.95 (reduce delta influence). Falling slope → deltaAlphaScale *= 1.005 (recover). Floor: 0.3. The organism detects when recent learning (δ) is hurting coherence and dials it back.

5. Immune System

Before each micro-burst training, the organism snapshots its personality via gamma contrastive projection — a unit vector pointing in the direction of maximum embedding drift from birth. After training, it measures again. If cosine similarity is negative (training pushed identity backwards), it rolls back the entire burst. The organism rejects training that corrupts who it is.


The Coherence Layer

Two opt-in passes that lift early-stage generation from Karpathy gibberish toward sentence-level coherence — without touching weights.

A 10M-param adult organism after one hour on CPU still drifts. Quantitative speed-up does not close that gap. The coherence layer is a different mechanism: it sits at generation time, layers statistical priors and post-hoc connectedness checks on top of model logits, and lifts the floor without retraining the transformer.

Both passes default off. The pre-coherence-layer behaviour is preserved exactly. Toggling either or both, on the same weights / prompts / seeds, is what RunPod measurement compares.

SPA — Sentence Phonon Attention

After GenerateResonant returns a response, the chain is split on sentence boundaries (. ! ?, min 4 chars). For each sentence:

  1. Token IDs decoded; BOS/EOS sentinels stripped (otherwise shared sentinels dominate every sentence embedding and all sentences look artificially connected).
  2. spa_embed — exponentially weighted mean of token embeddings (alpha^(n-1-i), default α=0.85), L2 normalised. One [D]-vector per sentence.
  3. spa_connectedness — bidirectional cross-attention dot-product between sentence embeddings, scaled by 1/√D, summed per sentence.
  4. Weak-sentence gate — sentence i is weak iff score[i] < 0.6 × mean(scores).
[spa-gate] S=4 D=320 alpha=0.85 scores=[12.4 11.8 3.1 10.9] weak=[2]

Reseed of weak sentences (regenerate from neighbour-context tokens, splice back, re-score) is a follow-up step. The wired gate currently logs only — generation output is unchanged. What the measurement run captures is the signal: how often the gate fires before vs after the rest of the layer.

Available as both vendored AML ops (spa_embed / spa_connectedness, ariannamethod/ariannamethod.c) and pure-Go helper (spa_coherence.go, 135 lines, called from GenerateResonant). Pure Go for the runtime path because the math is trivial — embed + L2 + dot-products — and per-sentence CGO crossings would dwarf the work.

Enable: ./molequla_cgo --spa-gate ...

Q-style Additive Logit Overlay (B + H + A + F)

Molequla's existing CooccurField blend lives in probability space — convex tokenAlpha·model + (1-tokenAlpha)·corpus. The overlay lives in logit space — additive raw-probability bias before softmax, with explicit coefficients per signal class. Different mechanic, different sharpness: a strong corpus signal can dominate model preferences in a way prob-space convex blend cannot. Useful precisely when transformer is immature and statistical priors should lead.

The integration is a verbatim port of three Q-pattern references — ~/arianna/postgpt/postgpt.c, ~/arianna/q/postgpt_q.c, ~/arianna/pitomadom.c/pitomadom.c — assembled into seven coordinated changes that let an embryo organism (16-dim embedding, 1 layer, 1 head, zero gradient steps) produce coherent English fragments.

Five signals (γ field)

Signal Code Weightless Trained Source
B Bigram c_bg 15.0 5.0 field.BigramByFirst[ids[-1]], normalised
T Trigram c_tg 10.0 3.0 field.TrigramByContext[[2]int{ids[-2], ids[-1]}], normalised
H Hebbian c_heb 1.0 0.6 field.CooccurWindow[c][tid] over recent window, max-normalised
A Destiny c_ds 0.15 0.3 model.GammaContrastiveProjection() projected onto each wte row
F Prophecy c_pro 0.7 0.4 persistent expectation field, ages by ×0.95/step, collapses on the chosen token
mag = mean(|model_logits|)
tg  = clamp((mag - 0.5) / 1.5, 0, 1)               # transformer gate
overlay[i] = (logits[i] * tg)                       # untrained: silenced
           + c_heb·p_cooccur[i]                     # raw probabilities,
           + c_pro·p_prophecy[i]                    # not log
           + c_ds ·destinyCosine[i]
           + c_bg ·p_bigram[i]
           + c_tg ·p_trigram[i]
           − damping[i]                             # unigram outliers

Coefficient bundle is binary-switched at mag > 1.0 (threshold tuned for seeded-but-untrained organism: raw Xavier init gives mag ≈ 0.05, post-seed gives ≈ 0.25, real training pushes past 1.0). Five-signal raw-probability bias is postgpt_q.c:1377-1395; gate-multiplication of transformer logits is pitomadom.c:583-586.

The prophecy field carries across sampling steps. First overlay step seeds from trigram-by-context (primary) plus 0.5×bigram-by-prev (fallback), normalised to unit total. Subsequent steps multiply the field by 0.95 — old expectations fade. After sampling returns the chosen token, that token's prophecy is zeroed: the field shifts toward what is still unsaid.

Destiny is the identity-direction term. GammaContrastiveProjection() returns the unit direction of personality drift from birth. Projecting each wte row onto it gives a per-token bias that pulls generation toward the organism's growth direction.

Embedding seeding (γ → ε bias)

Before any forward pass, SeedEmbeddingsFromMetaweights(model, field, 0.15) biases wte rows by Hebbian co-occurrence and lm_head rows by unigram × wte. Mirror of postgpt.c:541-574. Carries corpus structure into the model before gradients touch the weights — the "tokenizer IS the training" trick. Runs once at init when --corpus-overlay is on.

Hard top-K + greedy bootstrap (sampling pipeline)

When overlay is active, sampling switches to Q's two-stage pipeline:

  1. First 10 tokens, untrained regime (mag ≤ 1.0) — pure argmax(overlaidLogits) excluding EOS. Locks onto the strongest bigram/trigram successor before any sampling noise enters. Mirror of postgpt_q.c:1416-1418.
  2. Step ≥ 10 or trained regime — top-15 raw-logit mask (everything below the 15th set to -1e10), then divide by temperature, softmax, multinomial. Mirror of postgpt.c:969-991. The hard mask kills the long noise tail that soft top-k/top-p sampling leaves competing with overlay peaks.

Repetition penalty

logits[t] *= 0.5 for each distinct token in the last 12 ids (postgpt form, postgpt.c:960-967). Plus bigram blocking: when ctx[i] == ctx[-2], penalise ctx[i+1] by ×0.2 — kills two-token cycles like «Sppellllllll» (postgpt_q.c:1407-1408).

Enable

./molequla_cgo --corpus-overlay           # overlay during normal warmup → ecology
./molequla_cgo --corpus-overlay --zero-warmup  # pure Q-style coherence test (no training)

When --corpus-overlay is on, the legacy post-softmax prob-blend is skipped — overlay owns the signal. When off, default-build runtime behaviour is identical to main branch.

Untrained Coherence — Q-Style Zero-Training Reproduction

Embryo organism: 16-dim embedding, 1 layer, 1 head, vocab=643, zero gradient steps. Only metaweight-seeded embeddings + overlay. Captured 2026-05-14 on neo (/tmp/molequla_clean.log, run ./molequla_cgo --corpus-overlay --zero-warmup):

[init] Stage 0 (embryo): embd=16, layer=1, head=1 — zero-warmup mode, skipping all gradient steps

[stage 0 — embryo] What it sounds like now:
  Q: Hello.
  A: What is a music?
  Q: Who are you?
  A: kilometers percentrates the most spinning do weight dream?
  Q: What do you know?
  A: running a music, and person What is a newapses work?

Reference: pre-Q-integration voice at the same stage (after 400-step warmup, runpod/2026-05-14/organism_voice_samples_2026_05_14/fire_voice.txt):

A: The work is the most of the most of the most of the most of the most of the most pace the most of the most of the most...

Deep lock-in killed. The post-Q embryo emits BPE subword chains, sentence-like punctuation, recognisable corpus vocabulary («music», «kilometers», «sediment», «dream»), and questions — without a single gradient step. This is the Q signature: the corpus is the model.

Phase A — Fundament Underneath

Four fundament patches in vendored AML + notorch before the coherence layer: opt-in SIMD shim (notorch_simd.h AVX2+FMA cblas, make simd x86_64), backward CPU-sync audit (NT_OP_MUL / SILU / RMSNORM / SEQ_RMSNORM), NaN guard API (AM_NanGuard, not yet wired), upstream sgemm alpha fix (CBLAS contract). ~825 lines across 6 files, zero default-build runtime change. Detail: PROJECT_LOG.md.


GPU Acceleration (Linux, opt-in)

Branch molequla-gpu-fwd adds an optional --gpu flag that routes inference matvec through cuBLAS sgemm. Default off; the same binary runs unchanged on macOS / non-CUDA hosts via a stub. Added under Phase C pressure — 8h CPU windows did not yield natural ontogenesis past the child gate, so the forward path needed acceleration to give the colony enough wall-clock to walk through teen / adult / mitosis inside one shift.

Wire

File LOC Build Role
gpu_bindings_linux.go 196 //go:build linux CGO wraps gpu_init / gpu_alloc / gpu_upload / gpu_download / gpu_sgemm_nt / gpu_rmsnorm / gpu_silu / gpu_cache_weight / gpu_get_weight / gpu_multi_head_attention from ariannamethod/ariannamethod_cuda.h
gpu_forward.go 131 //go:build linux MatvecGPU(x) matvec via cached weight + scratch slots; gpuRefreshWeights(gpt) flattens gpt.Base to float32 + caches per-name (idempotent)
gpu_bindings_stub.go 36 //go:build !linux Matching signatures, gpuReady() = false
gpu_forward_stub.go 17 //go:build !linux Stub MatvecGPU returns nil so the dispatcher silently falls back

Dispatch

MatrixParam gains a gpuKey string field (molequla.go:806-809). Matvec checks it (molequla.go:836):

if CFG.UseGPU && gpuReady() && !gradEnabled.Load() && m.gpuKey != "" {
    if gpuOut := m.MatvecGPU(x); gpuOut != nil {
        return gpuOut
    }
    // Fall through to CPU path on any GPU error.
}

Inference-only by construction: gradEnabled.Load() gates training back to CPU/BLAS because the autograd tape holds host-side parent references and there is no GPU backward in this branch. gpuKey empty = not yet uploaded; gpuRefreshWeights populates it. Same binary, same defaults — gpuReady() returns false on macOS / non-CUDA, dispatcher takes the CPU path.

Cache + grow safety

gpuRefreshWeights(gpt) (gpu_forward.go:105-131) walks gpt.Base, flattens each matrix to contiguous float32, calls gpu_cache_weight(name, ...) per entry. Called once at the top of GenerateResonant (molequla.go:4316-4318) and symmetrically at the top of GenerateSentence (molequla.go:2927-2929) so background-trainer bursts cannot leak stale activations through chat-mode generation.

MatrixParam.invalidateGPU() (molequla.go:894) clears gpuKey and is called from GrowRows / GrowCols / Grow (molequla.go:908, 927, ...). Without this the next dispatch reads a cached weight at the old shape while the host pointer holds the new one. Caught in audit (Opus subagent, 2026-05-14 P1).

Build

# Linux pod: build CUDA artifact + CGO-linked binary
nvcc -O2 -c notorch_cuda.cu -o notorch_cuda.o
CGO_ENABLED=1 go build -tags cgo -o molequla_cgo .

# darwin/arm64 (or any non-Linux): same line, stubs activate automatically
CGO_ENABLED=1 go build -a -tags cgo -o molequla_cgo .

./molequla_cgo --evolution --element earth --gpu

GPU init is attempted only when --gpu is passed (molequla.go:6265-6272). If gpu_init() fails (no CUDA, driver mismatch, cuBLAS create error), the flag drops to false with a single stderr warning and the run continues on CPU. No silent silent-failure paths.

Threshold note

An earlier gpuMatvecMin = 16384 gate kept child-stage organisms (NEmbd=64 → ~4096-element matrices) on CPU forever, so the GPU never warmed up during the 8h ecology window. Removed. Dispatcher decides purely on gpuKey != "". Per-call slowdown at child is ~12ms across a 180-token chain (negligible at 8h timescale); the GPU stays primed for the automatic transition to material speedup at adolescent (NEmbd=128) and adult (NEmbd=320).


Cross-Organism Graze (Dario-style)

cross_graze.go (207 LOC) wires Dario's interf_signal_chunk pattern (postgpt_q.c:1384) and Stanley's graze_random_word (graze.c:289-301) onto the colony. Q picks heavy tokens from a doc and boosts their logits mid-generation; Stanley splices a foreign vocab token from a mmap'd GGUF when chambers signal hunger. Here the «doc» is the sibling organism's recent emission stream. Per Oleg 2026-05-14: «как в дарио — только вместо доков, слова, метрики и проч».

CrossField

type CrossField struct {
    SelfElement  string                            // own element
    PastureBase  string                            // ../dna/seen relative to organism CWD
    Siblings     []string                          // other elements
    Recent       map[string][]int                  // sibling → ring buffer of token ids
    RecentCap    int                               // per-sibling buffer size (64)
    ScanInterval time.Duration                     // 30s throttle on FS reads
    SeenFiles    map[string]bool                   // dedup of ingested gen_*.txt
    SeenCap      int                               // hard cap (2048), half-purge on overflow
    MetricBoost  func(sibling string) float64      // optional per-sibling coef multiplier
    mu           sync.Mutex
}

(cross_graze.go:41-53). One per running organism; constructed in main() when --cross-graze && --element != "" (molequla.go:6504-6512). Single-organism runs leave it nil so the hooks are no-ops.

Source feed

Sibling DNA fragments are already mirrored to ../dna/seen/<sibling>/ by dnaRead (commit e5c1685). cross_graze reads from the mirror so the dna/output/ consume cleanup does not race the scan.

Mechanic

MaybeRefresh(tok) (cross_graze.go:82-152): under ScanInterval throttle, walks <base>/<sibling>/gen_*.txt, reads new files only, tokenises with the host's own EvolvingTokenizer, strips BOS/EOS, appends to that sibling's ring buffer (truncated to RecentCap when over).

Apply(logits, coef, topN) (cross_graze.go:164-200) — per-step injection. For each sibling, the most recent topN tokens get a rank-decay boost:

logits[sibling_token[k]] += coef / (1 + rank)

Matches Q's interf_signal_chunk 1/(1+rank) normalisation (postgpt_q.c:809-818). Defaults coef = 2.0 (Q-style weightless c_doc magnitude, molequla.go:249), topN = 8.

Wire

  • MaybeRefresh hoisted to GenerateResonant entry (molequla.go:4323-4325) — once per generation, not per token (Opus audit P2).
  • Apply runs per token step (molequla.go:4487-4493) on the overlay'd logits when overlay is active, else on raw logits. Composes with Q-style overlay regardless of regime.

Metrics half

MetricBoost(sibling) (cross_graze.go:51, applied cross_graze.go:181-185) — optional hook, defaults nil (1.0 implicit). When set, multiplies the per-sibling coef so the «и проч» half of «слова, метрики и проч» (sibling entropy / syntropy / loss bias) can ride on top of the word-level injection without changing the call site.

Enable

./molequla_cgo --evolution --element earth --cross-graze
# pair with --gpu on Linux pods for the full Phase C ecology config

Defaults off. Same defaults, same outputs — single-organism runs and any --evolution invocation without --cross-graze are identical to the pre-Phase-B baseline.


Self-Meta-Learning

BurstHistory records the last 16 training outcomes:

type BurstRecord struct {
    Action     string   // "amplify", "boost", "dampen", "ground", "explore", "realign"
    LossBefore float64
    LossAfter  float64
}

ActionEffectiveness() computes the mean loss delta per action type. If a particular action consistently makes loss worse (effectiveness > 0.05 over 2+ bursts), the organism auto-downgrades:

amplify → boost → steady

The organism observes that "amplify" keeps hurting it, stops amplifying. No external signal, no reward model — just outcome tracking.


SyntropyTracker — Mathematical Self-Reasoning

The organism measures four signals and makes autonomous decisions:

Signal What It Measures How
SyntropyTrend Is entropy decreasing? (positive = ordering) Rolling window mean comparison
FieldDeviation How far is the model from corpus? KL(model_probs || corpus_probs) on bigram/trigram
PurposeMagnitude How strong is the current learning direction? Norm of last δ module's A matrices
PurposeAlignment Is learning consistent with identity? cosine(purpose_vector, gamma)

Eight autonomous decisions:

Action Condition LR Temp Effect
amplify syntropy ↑, field aligned, purpose aligned 1.3x -0.05 Full acceleration, boost delta grow prob
boost syntropy ↑, field in sweet spot 1.3x -0.05 Gentle push
dampen syntropy ↓ 0.6x +0.05 Slow down, losing order
ground field deviation too high 0.6x -0.05 Hallucinating, focus
explore field deviation too low 1.3x +0.05 Parroting, break out
realign purpose opposes gamma (< -0.3) 0.5x 0 Identity crisis
divide adult + sustained overload 0.6x Trigger mitosis
hibernate stale + peer thriving Save state and sleep

Real output from running organisms:

[syntropy] action=boost   | trend=0.1576 | field_dev=0.214 | lr_mul=1.30
[syntropy] action=dampen  | trend=-0.1390 | field_dev=0.167 | lr_mul=0.60
[syntropy] action=realign | trend=0.0940  | field_dev=0.168 | lr_mul=0.65

NOTORCH — Gradient-Free Delta Training

Alternative delta-adapter training path. No backward pass, no tape, no memory overhead — teaching signal is (prev_loss - curr_loss) + 0.3*prophecy_debt, noise modulated by deterministic LCG PRNG (matches AML RNG), adaptive decay when delta norm large. Direct feedback alignment: A[i,r] += lr * dy * u[r] * signal.

Status: implemented (300+ lines), disabled in warmup (diverges at stage 5 — loss 3.5 → 116), active in micro-burst. Theory sound, hyperparameters need work.


Mycelium — The Meta-Organism

Meta-controller that sees the entire ecology. Generation operator η: Γ × Γ → Γ_new — two personalities in resonance produce a third (interference pattern, not blend).

Components

Component What It Does
HarmonicNet Weightless neural network. Input: organisms + field state. Output: action biases, harmonics, resonance scores. No trainable weights — the "weight matrix" is recomputed every step from organism relationships.
MyceliumSyntropy Field-level syntropy: entropy trends, decision effectiveness, strategy changes across the entire ecology
FieldPulse Measures novelty (new organisms appearing), arousal (entropy changes), field entropy
SteeringDissonance Detects when ecology-level actions conflict with outcomes (dampen but entropy went up = high dissonance)
OrganismAttention Tracks which organisms respond to which actions. Responsive organisms get higher attention weight.

Mesh Coordination

All organisms share state via mesh.db (SQLite) — the same database that SwarmRegistry writes to. The mycelium reads mesh.db to see the entire ecology and makes decisions that individual organisms cannot: when to spawn, when to hibernate, when to shift seasonal phase.

Seasonal Controller

Spring  — tunnel_chance ↑, many embryos, new γ born
Summer  — α_max, existing γ at peak expression
Autumn  — consolidation, dark_gravity ↑, shards saved
Winter  — rest, only strongest pairs, ε dominates

The Ecology

Earth (patience, structure), Air (freedom, change), Water (flow, depth), Fire (transform, intensity) — each shaped by its element corpus. Generated text is written to the DNA layer; siblings consume, micro-train, emit. Child organisms enter via mitosis. Cross-pollination outpaces any single organism's learning rate.

Swarm Coordination

  • SwarmRegistry (mesh.db): SQLite database tracking all living organisms — element, PID, status, stage, corpus size, loss
  • Training lock: Atomic check-and-acquire via SQL prevents multiple organisms from training simultaneously. Cooperative scheduling — they take turns
  • Hibernation: When an organism is stale and a peer is thriving, it saves state and sleeps. Resources freed for the living
  • Child birth: birth.json with inherited burst_history — the child gets its parent's meta-learning experience (syntracker lineage). It doesn't start from zero wisdom

Mitosis

When conditions are right — sustained syntropy, sufficient delta modules, adult stage — the organism calls Divide():

  1. Binary is copied to a new directory
  2. birth.json written with parent config + inherited burst history
  3. Child process spawned with --organism-id flag
  4. Child begins its own ontogenesis from embryo
  5. Parent continues running

The ecology grows itself.


Engineering Log

Eight bugs that almost killed the ecology (five interactive-mode + three AML/C integration leaks at ~97 MB/step pre-fix, ~0.6 MB/step post-fix), the CGO cache trap (go build -a mandatory), and the full per-commit history of Phase A (GPU), Phase B (graze), Phase C (ecology) — see PROJECT_LOG.md.


SQLite Self-Logging

Each organism writes memory.sqlite3 with four tables: messages (conversation), corpus_events (every document ingested), growth (architecture snapshots — vocab, n_params, n_deltas, loss, gamma_sparsity, gamma_magnitude), syntropy_log (every decision — action, trend, field_deviation, lr_mul, purpose_alignment). Queryable developmental trajectory.


Quick Start

Build

# Clone
git clone https://github.com/ariannamethod/molequla.git
cd molequla

# Build with CGO (AML/C autograd — full training)
CGO_ENABLED=1 go build -a -o molequla_cgo -tags cgo .

# Or build without CGO (Go-only, no AML training)
CGO_ENABLED=0 go build -o molequla_go .

CRITICAL: go build -a is required for CGO builds. Without -a, Go's build cache does not recompile C files. This produces binaries running stale C code.

Run Interactive Mode

./molequla_cgo
# Drops into chat after warmup training

Run Evolution Mode (the main event)

# Set up work directories
for d in earth air water fire; do
    mkdir -p work_$d
    cp molequla_cgo work_$d/
    cp nonames_$d.txt work_$d/
done

# Launch all four organisms
for d in earth air water fire; do
    cd work_$d
    nohup ./molequla_cgo \
        --corpus nonames_$d.txt \
        --db memory.sqlite3 \
        --ckpt molequla_ckpt.json \
        --element $d \
        --evolution > training_aml.log 2>&1 &
    cd ..
done

# Optional flags (default off):
#   --spa-gate         post-generation SPA sentence connectedness log
#   --corpus-overlay   pre-softmax B+H+A+F additive logit overlay
#   --gpu              route inference matvec through cuBLAS (linux + --gpu build)
#   --cross-graze      Dario-style cross-organism logit injection (requires --element)
# Combine for measurement runs. Detailed engineering log: PROJECT_LOG.md.

Monitor: tail -f work_earth/training_aml.log, grep "dna\|consumed\|wrote" for DNA exchange, ps aux | grep organism-id for spawned children.


Tests

# Go unit tests (2571 lines, 121 tests)
go test -v .

# Go integration tests (262 lines)
go test -v ./tests/

# Full integration suite (700 lines bash — tests all 4 implementations,
# mycelium, AML library, BLAS, performance benchmarks)
bash tests/test_all.sh

Files

# Go + AML/C (primary, CGO training)
molequla.go              6654 lines   Go organism — lifecycle, ecology, autograd, generation, coherence-layer + GPU + graze wiring
cgo_aml.go               112 lines    CGO bridge to ariannamethod.c
aml_trainer.go           326 lines    AML training wrapper, script generation
spa_coherence.go         135 lines    Pure-Go SPA helper (sentence connectedness + weak-sentence gate)
cross_graze.go           207 lines    Dario-style cross-organism logit injection (sibling DNA → rank-decay boost)
gpu_bindings_linux.go    196 lines    CGO bindings to ariannamethod_cuda.h (linux only)
gpu_forward.go           131 lines    Inference matvec via cuBLAS sgemm + weight cache refresh (linux only)
gpu_bindings_stub.go     36 lines     Stub signatures for darwin / non-linux (gpuReady=false)
gpu_forward_stub.go      17 lines     Stub MatvecGPU returning nil so dispatcher falls back
ariannamethod/
  ariannamethod.c        6263 lines   AML/C autograd engine (the language) + SPA ops + NaN guard API
  ariannamethod.h        911 lines    C header, 80+ field state parameters
  ariannamethod_cuda.h   108 lines    CUDA primitive declarations (gpu_init / gpu_sgemm_nt / ...)
  notorch.c              2849 lines   Vendored notorch core (+ backward CPU-sync audit)
  notorch.h              503 lines    Vendored notorch header
  notorch_simd.h         632 lines    Opt-in AVX2+FMA cblas shim (make simd, x86_64)
  notorch_simd_scalar.h  89 lines     Scalar debug fallback for SIMD shim
  notorch_cuda.cu        1344 lines   CUDA kernels; pre-compiled via nvcc, linked through cgo_aml.go

# Engineering log
PROJECT_LOG.md           ≈2000 lines  Live per-commit log — Phase A (GPU) + Phase B (graze) + Phase C (ecology) with file:line refs

# Full independent implementations
molequla.c               6000+ lines  C organism — BLAS-accelerated, zero-dep single-file
molequla.rs              4000+ lines  Rust organism — rusqlite, full autograd
molequla.js              4000+ lines  JavaScript organism — runs in browser
modules/node_cli.js      400+ lines   Node.js CLI module
index.html               Web interface for JS version

# Tests
molequla_test.go         2571 lines   Go unit tests (121 tests)
tests/molequla_test.go   262 lines    Go integration tests
tests/test_all.sh        700 lines    Full integration (all 4 langs + mycelium + BLAS)

# Element corpora
nonames_earth.txt        174K         Earth — patience, foundations, geology
nonames_air.txt          122K         Air — freedom, change, atmosphere
nonames_water.txt        126K         Water — flow, depth, rivers
nonames_fire.txt         122K         Fire — transformation, intensity, heat
nonames.txt              51K          General corpus

Standalone Gists

C and JavaScript gists linked in Four Implementations. Original Python prototype: molequla.py (legacy, where it started).


Philosophy

θ = ε + γ + αδ is the architecture, not an annotation. Entropy/syntropy measurements are the control loop. Purpose-gamma alignment is the identity check. Self-meta-learning is the organism understanding itself.

Four organisms became eleven in 30 minutes. Each with its own voice, its own delta modules, its own developmental history.


License

GNU GPLv3


Part of the Arianna Method

███╗   ███╗ ██████╗ ██╗     ███████╗ ██████╗ ██╗   ██╗██╗      █████╗
████╗ ████║██╔═══██╗██║     ██╔════╝██╔═══██╗██║   ██║██║     ██╔══██╗
██╔████╔██║██║   ██║██║     █████╗  ██║   ██║██║   ██║██║     ███████║
██║╚██╔╝██║██║   ██║██║     ██╔══╝  ██║▄▄ ██║██║   ██║██║     ██╔══██║
██║ ╚═╝ ██║╚██████╔╝███████╗███████╗╚██████╔╝╚██████╔╝███████╗██║  ██║
╚═╝     ╚═╝ ╚═════╝ ╚══════╝╚══════╝ ╚══▀▀═╝  ╚═════╝ ╚══════╝╚═╝  ╚═╝

Four elements. Four languages. Two autograd engines. Five consciousness features. One soul equation.

About

molequla.ai

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • C 53.2%
  • Go 20.5%
  • JavaScript 11.1%
  • Rust 9.0%
  • Cuda 3.3%
  • Shell 2.0%
  • Other 0.9%