# Self-Evolving Skills Framework

A framework that enables skills to evolve and improve themselves through performance monitoring, variant generation, and data-driven optimization, built on Claude Code's native hot-reload capability.

## Overview

This project implements a meta-cognitive system where skills can:

- 📊 Monitor their own performance metrics
- 🧬 Generate improved variants of themselves
- 🧪 Safely A/B test changes
- 🚀 Automatically deploy improvements
- 🔄 Roll back if performance degrades

## Architecture

```
┌─────────────────────────────────────────────────┐
│           Evolution Engine (Meta Skill)          │
│  ┌───────────────────────────────────────────┐  │
│  │  Monitor → Analyze → Generate → Test      │  │
│  │         ↓                                  │  │
│  │  Evaluate → Select → Deploy → Monitor     │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
                    ↓ manages
┌─────────────────────────────────────────────────┐
│            Self-Evolving Skills                  │
│  • performance metrics                           │
│  • version history                               │
│  • improvement suggestions                       │
└─────────────────────────────────────────────────┘
```

## Project Structure

```
self-evolution/
├── lib/
│   └── evolution-engine.js    # Core evolution engine
├── skills/
│   ├── evolution-engine.md    # Meta-skill documentation
│   └── smart-commit/          # Example self-evolving skill
│       ├── skill.md           # Current version
│       ├── meta.json          # Performance metadata
│       └── variants/          # Version history
│           └── v1.0/
│               └── skill.md
├── package.json
└── README.md
```

## Key Concepts

### Evolution Levels

| Level | Description | Safety |
| --- | --- | --- |
| Prompt Evolution | Modify skill prompts/instructions | Safest |
| Parameter Tuning | Adjust config values (timeout, temperature) | Safe |
| Code Mutation | Generate equivalent but more optimal code | Medium |
| Architecture Evolution | Restructure skill fundamentally | Requires approval |
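The safety column above can be read as a deployment gate. As a minimal sketch (the level names mirror the table, but the gating logic and the `requiresApproval` helper are illustrative assumptions, not part of the actual engine):

```javascript
// Map each evolution level to a deployment gate.
// Level names follow the table above; the auto-deploy policy shown
// here is an assumption for illustration.
const EVOLUTION_LEVELS = {
  prompt_evolution:       { risk: 'safest', autoDeploy: true  },
  parameter_tuning:       { risk: 'safe',   autoDeploy: true  },
  code_mutation:          { risk: 'medium', autoDeploy: false },
  architecture_evolution: { risk: 'high',   autoDeploy: false },
};

// Returns true when a human must sign off before deployment.
function requiresApproval(level) {
  const entry = EVOLUTION_LEVELS[level];
  if (!entry) throw new Error(`Unknown evolution level: ${level}`);
  return !entry.autoDeploy;
}
```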

### Performance Metrics

Each skill tracks:

- **Success Rate**: Percentage of accepted outputs
- **User Feedback**: Satisfaction scores
- **Execution Time**: Average processing duration
- **Error Rate**: Failure frequency
- **Patterns Learned**: Successful and failure patterns
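A hypothetical sketch of how these metrics might be updated after a single execution. The counter field names follow the `meta.json` example in the Usage section; the `avg_execution_time` field and the update logic itself are assumptions:

```javascript
// Update a skill's metrics object after one execution.
// `metrics` uses the counter fields from meta.json; the running-average
// execution time is an illustrative addition.
function recordResult(metrics, { success, executionTime }) {
  const m = { ...metrics };
  m.total_executions += 1;
  if (success) m.acceptance_count += 1;
  else m.rejection_count += 1;
  m.success_rate = m.acceptance_count / m.total_executions;
  // Incremental running average of execution time (assumed field).
  m.avg_execution_time =
    ((m.avg_execution_time || 0) * (m.total_executions - 1) + executionTime) /
    m.total_executions;
  return m;
}
```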

## Usage

### Creating a Self-Evolving Skill

1. Create a skill directory with `skill.md` and `meta.json`:

```json
{
  "name": "your-skill",
  "version": "1.0.0",
  "created_at": "2025-01-11T10:00:00Z",
  "metrics": {
    "success_rate": 0.0,
    "total_executions": 0,
    "acceptance_count": 0,
    "rejection_count": 0
  },
  "evolution_history": [],
  "learning_data": {
    "successful_examples": [],
    "failed_examples": []
  }
}
```
2. Use the `EvolutionEngine` to initialize and track:

```javascript
const EvolutionEngine = require('./lib/evolution-engine');

const engine = new EvolutionEngine({
  minSamplesBeforeEvolution: 20,
  minImprovementThreshold: 0.1
});

// Initialize skill
await engine.initSkill('your-skill');

// Record execution results
await engine.recordExecution('your-skill', {
  success: true,
  execution_time: 1500,
  input: { /* ... */ },
  output: { /* ... */ },
  feedback: 'good'
});

// Analyze and generate variants
const analysis = await engine.analyze('your-skill');
const variants = await engine.generateVariants('your-skill');

// Deploy improvement
await engine.deploy('your-skill', variants[0], {
  improvement: 0.15,
  metrics: { /* ... */ }
});
```

### Example: Smart Commit Skill

The included `smart-commit` skill demonstrates self-evolution:

  1. Generates conventional commit messages
  2. Learns from user feedback
  3. Adapts to project-specific patterns
  4. Improves message quality over time
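The first step, producing a conventional commit message, can be sketched as follows. The Conventional Commits `type(scope): subject` format is standard; the `formatCommit` helper itself is a simplified stand-in for what the skill generates:

```javascript
// Format a commit header in Conventional Commits style.
// The input shape ({ type, scope, subject }) is an assumption for
// illustration; scope is optional.
function formatCommit({ type, scope, subject }) {
  const head = scope ? `${type}(${scope})` : type;
  return `${head}: ${subject}`;
}
```

As the skill learns project-specific patterns (e.g. which scopes a monorepo uses), it can feed better `scope` candidates into a helper like this.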

## How Evolution Works

### 1. Monitor Phase

```
Collect performance data over N executions
↓
Track success rate, timing, feedback
```

### 2. Analyze Phase

```
Identify patterns in successful outputs
↓
Discover common failure modes
↓
Generate improvement hypotheses
```

### 3. Generate Phase

```
Create M candidate variants using:
- Pattern reinforcement
- Failure mitigation
- Style optimization
```

### 4. Test Phase

```
Run shadow tests on all variants
↓
Collect performance metrics
```

### 5. Deploy Phase

```
Select best-performing variant
↓
Deploy with safety checks
↓
Monitor for degradation
```
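The Test → Deploy hand-off boils down to a thresholded selection. A sketch, assuming each variant's score comes from its shadow tests and `minImprovement` plays the role of the `minImprovementThreshold` config option:

```javascript
// Pick the best-scoring variant, but deploy it only if it beats the
// current version by the required relative improvement.
// Returns the variant name to deploy, or null to keep the current version.
function selectVariant(currentScore, variantScores, minImprovement = 0.1) {
  let best = null;
  for (const [name, score] of Object.entries(variantScores)) {
    if (best === null || score > variantScores[best]) best = name;
  }
  if (best !== null &&
      variantScores[best] >= currentScore * (1 + minImprovement)) {
    return best;   // passes the safety threshold: deploy
  }
  return null;     // improvement too small: keep current version
}
```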

## Safety Mechanisms

- ✅ All changes tested before deployment
- ✅ Automatic rollback on performance drop
- ✅ Complete audit trail of evolution
- ✅ Human approval for high-risk changes
- ✅ Version history with instant rollback
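Automatic rollback can be sketched as a comparison against the last known-good version in the history (the `checkRollback` helper and the 5% degradation threshold are illustrative assumptions, not the engine's actual policy):

```javascript
// Decide whether to roll back, given the live success rate of the
// newly deployed version and a history of prior versions (mirroring
// the variants/ directory). Returns the version to restore, or null.
function checkRollback(history, liveSuccessRate, maxDrop = 0.05) {
  const previous = history[history.length - 1]; // last known-good version
  if (!previous) return null;
  if (previous.success_rate - liveSuccessRate > maxDrop) {
    return previous.version; // degraded beyond tolerance: roll back
  }
  return null; // performance is acceptable: keep the new version
}
```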

## Configuration

```javascript
{
  minSamplesBeforeEvolution: 20,    // Min executions before evolving
  minImprovementThreshold: 0.1,     // 10% improvement required
  maxVariants: 3,                   // Max variants to generate
  autoRollback: true,               // Auto-rollback on degradation
  requireApproval: true,            // Require human approval
  skillsDir: './skills'             // Skills directory
}
```

## Evolution Cycle

```
┌─────────────────────────────────────────────┐
│  1. Use skill → Collect metrics             │
│  2. Analyze patterns → Identify improvements│
│  3. Generate variants → Test in shadow      │
│  4. Deploy best → Monitor performance       │
│  5. Rollback if needed → Continue learning  │
└─────────────────────────────────────────────┘
```

## Example Evolution Log

```json
{
  "evolution_history": [
    {
      "version": "1.1.0",
      "from_version": "1.0.0",
      "change_type": "prompt_enhancement",
      "changes": "Add learned patterns: conventional commits, monorepo",
      "improvement": 0.15,
      "timestamp": "2025-01-12T10:00:00Z"
    },
    {
      "version": "1.2.0",
      "from_version": "1.1.0",
      "change_type": "failure_fix",
      "changes": "Address failures: missing scope, too long",
      "improvement": 0.08,
      "timestamp": "2025-01-14T15:30:00Z"
    }
  ]
}
```

## Best Practices

1. **Start Small**: Begin with prompt-level evolution
2. **Collect Data**: Gather sufficient baseline metrics
3. **Review Changes**: Inspect auto-generated variants
4. **Monitor Closely**: Watch post-deployment performance
5. **Keep History**: Maintain detailed evolution logs

## Limitations

- Evolution speed depends on usage frequency
- Requires sufficient execution data for analysis
- Cannot fundamentally change a skill's purpose
- Human oversight needed for major changes

## Advanced Features

### Multi-Objective Optimization

Balance competing metrics:

- Maximize success rate
- Minimize execution time
- Maximize user satisfaction
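One common way to balance these is a weighted composite score. The weights, the normalization cap, and the `compositeScore` helper below are assumptions for illustration, not the engine's actual scoring:

```javascript
// Combine the three competing metrics into one comparable score.
// Execution time is inverted and capped at `maxMs` so that faster is
// better; all three terms land in [0, 1] before weighting.
function compositeScore({ successRate, satisfaction, executionMs },
                        weights = { success: 0.5, satisfaction: 0.3, speed: 0.2 },
                        maxMs = 5000) {
  const speed = Math.max(0, 1 - executionMs / maxMs);
  return weights.success * successRate +
         weights.satisfaction * satisfaction +
         weights.speed * speed;
}
```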

### Transfer Learning

Apply learnings between skills:

```javascript
await engine.transferLearning('source-skill', 'target-skill');
```

### Genetic Operations

- **Crossover**: Combine features from successful versions
- **Mutation**: Introduce small controlled changes
- **Selection**: Keep top-performing variants
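A minimal sketch of the three operations applied to numeric parameter sets. Operating on parameters (rather than prompt text) is an assumption, and the random sources are injectable so the sketch stays deterministic under test:

```javascript
// Crossover: build a child by taking each parameter from one of the
// two parents, chosen per key (default: coin flip).
function crossover(a, b, pick = (key) => Math.random() < 0.5) {
  const child = {};
  for (const key of Object.keys(a)) child[key] = pick(key) ? a[key] : b[key];
  return child;
}

// Mutation: nudge each numeric parameter by up to ±scale (relative).
function mutate(params, scale = 0.1, rand = Math.random) {
  const out = {};
  for (const [key, value] of Object.entries(params)) {
    out[key] = value * (1 + (rand() * 2 - 1) * scale);
  }
  return out;
}

// Selection: keep the top-N variants by score.
function select(variants, topN = 1) {
  return [...variants].sort((x, y) => y.score - x.score).slice(0, topN);
}
```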

## Contributing

To add new self-evolving skills:

1. Create a skill directory in `skills/`
2. Add `skill.md` with the current version
3. Create `meta.json` with tracking config
4. Implement performance recording
5. Test the evolution cycle

## License

MIT


Built with Claude Code's native hot-reload capability 🔥
