DhruvAtreja/ALAS

ALAS: Autonomous Learning Agent System

An intelligent self-learning AI agent system that addresses the challenge of AI models having fixed knowledge cutoffs for rapidly evolving domains by creating autonomous agents that can:

  • Generate learning curricula using OpenAI's Deep Research API
  • Create training data through intelligent research and question generation
  • Self-improve via Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO)
  • Evaluate performance and adapt learning strategies
  • Revise curricula based on performance gaps and mastery
  • Track learning history across multiple iterations

I came across this problem when trying to use models like Sonnet 4 and GPT-4.1 to code AI agents. Building AI agents is a rapidly evolving field, so the models did not even know about newer models like o3, let alone the current best practices for building agents. Beyond overcoming the fixed knowledge cutoff of models like GPT-4.1, this approach also gives us plug-and-play APIs with highly specialized knowledge of a particular domain.

The system is particularly valuable for rapidly evolving domains where traditional models quickly become outdated.

Architecture

graph TD
    A[Domain Input] --> B[Curriculum Generation]
    B --> C[Training Data Generation]
    C --> D[Supervised Fine-Tuning]
    D --> E[SFT Evaluation]
    E --> F[DPO Training]
    F --> G[DPO Evaluation]
    G --> H[Curriculum Revision]
    H --> I{Continue Learning?}
    I -->|Yes| C
    I -->|No| J[Finalize Session]
    
    K[Deep Research API] --> B
    K --> C
    K --> H
    L[OpenAI Fine-Tuning] --> D
    L --> F
    M[Historical Learning] --> H

Core Components

  • **Deep Research Client**: Leverages OpenAI's o3-deep-research for intelligent curriculum generation
  • **Training Data Generator**: Creates diverse Q&A pairs with parallel processing (50 concurrent requests)
  • **Fine-Tuner**: Manages OpenAI fine-tuning jobs with SFT and DPO methods
  • **Evaluator**: Comprehensive model evaluation with category-based analysis
  • **DPO Engine**: Improves models using Direct Preference Optimization on incorrect answers
  • **Curriculum Reviser**: Adapts learning paths based on performance with historical context
  • **LangGraph Orchestrator**: Manages the complete workflow with state persistence

Quick Start

Prerequisites

  • Python 3.11+
  • OpenAI API key with fine-tuning access
  • OpenAI Deep Research API access
  • LangSmith account (optional, for monitoring)

Installation

  1. Clone the repository

    git clone <repository-url>
    cd ALAS
  2. Install dependencies

    pip install -r requirements.txt
  3. Set up environment variables

    # Create .env file
    OPENAI_API_KEY=your_openai_api_key_here
    LANGSMITH_API_KEY=your_langsmith_api_key_here
    LANGSMITH_PROJECT=autonomous-learning-agent
    LANGSMITH_TRACING_V2=true

Basic Usage

Running the Demo

python demo_autonomous_learning.py

This will run a complete 3-iteration learning cycle with cost estimates and progress tracking.

Using the Autonomous Agent Programmatically

import asyncio
from src.workflows.autonomous_learning_agent import create_autonomous_learning_agent

async def main():
    # Create agent with 5 iterations max
    agent = create_autonomous_learning_agent(max_iterations=5)
    
    # Run autonomous learning
    result = await agent.run(
        domain="Machine Learning Research", 
        session_id="my_session_001"
    )
    
    print(f"Completed {len(result['iterations'])} iterations")
    print(f"Final model: {result['current_dpo_model_id']}")

asyncio.run(main())

Using Individual Components

# NOTE: the awaits below require an event loop; run these snippets
# inside an async function (e.g. via asyncio.run()).

# Generate curriculum
from src.core.deep_research_client import create_deep_research_client

client = create_deep_research_client()
curriculum = await client.generate_curriculum(
    domain="Python Programming",
    learning_goals=["Master fundamentals", "Build projects"]
)

# Generate training data
from src.core.training_data_generator import create_training_data_generator

generator = create_training_data_generator()
training_data = await generator.generate_curriculum_training_data(curriculum)

# Fine-tune model
from src.core.fine_tuner import create_fine_tuner

fine_tuner = create_fine_tuner()
result = fine_tuner.fine_tune_from_file(
    training_file_path="training_data.jsonl",
    model="gpt-4.1-2025-04-14"
)
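For reference, each line of the JSONL file handed to the fine-tuner is one chat-format record. The example below is a hypothetical record in the standard OpenAI supervised fine-tuning schema; the exact records emitted by `training_data_generator` may differ:

```python
import json

# Hypothetical SFT record in the standard OpenAI chat fine-tuning schema.
record = {
    "messages": [
        {"role": "system", "content": "You are an expert in Python Programming."},
        {"role": "user", "content": "What does a list comprehension do?"},
        {
            "role": "assistant",
            "content": "It builds a new list by applying an expression "
                       "to each item of an iterable.",
        },
    ]
}

# training_data.jsonl holds one such record per line.
line = json.dumps(record)
```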

LangGraph Studio Integration

ALAS includes full LangGraph Studio support for visual workflow management.

Setup

  1. Install LangGraph CLI

    pip install -U "langgraph-cli[inmem]"
  2. Start LangGraph Studio

    langgraph dev
  3. Open in browser

    http://localhost:2024
    

The workflow will be available as the "agent" graph with full visual debugging and state inspection.

Learning Process

Iteration Workflow

Each learning iteration follows this pattern:

  1. **Curriculum Generation**: Create or revise learning topics based on domain and performance
  2. **Training Data Creation**: Generate diverse Q&A pairs using Deep Research API
  3. **Supervised Fine-Tuning**: Train model on correct examples
  4. **SFT Evaluation**: Test model performance and identify weak areas
  5. **DPO Training**: Improve model using preference optimization on incorrect answers
  6. **DPO Evaluation**: Re-evaluate improved model
  7. **Curriculum Revision**: Update learning plan based on results
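The loop's exit condition (the "Continue Learning?" decision in the architecture diagram) can be sketched as follows, assuming the `mastery_threshold` and `max_iterations` defaults from `settings.py`; the actual decision lives in the LangGraph orchestrator and may differ:

```python
def should_continue(iteration: int, accuracy: float,
                    max_iterations: int = 5,
                    mastery_threshold: float = 0.9) -> bool:
    """Keep iterating until the model masters the curriculum
    or the iteration budget is exhausted."""
    if accuracy >= mastery_threshold:
        return False  # domain mastered: finalize the session
    return iteration < max_iterations
```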

Historical Learning Tracking

ALAS maintains a learned_topics.json file that tracks:

  • All topics mastered across iterations
  • Accuracy scores and improvement over time
  • Learning progression and iteration metadata

This enables the system to:

  • Avoid repeating mastered topics
  • Build advanced curricula on solid foundations
  • Provide comprehensive learning context

Performance Analysis

The evaluation system provides detailed analysis:

{
  "overall_accuracy": 0.85,
  "category_performance": {
    "Factual Recall": {"accuracy": 0.9},
    "Conceptual Understanding": {"accuracy": 0.8},
    "Application": {"accuracy": 0.85}
  },
  "topic_results": [
    {
      "topic_name": "Python Basics",
      "accuracy": 0.95,
      "mastered": true
    }
  ]
}
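The aggregate fields in this report can be derived from per-question grades. A sketch, assuming each graded question carries a category and a correctness flag (the real Evaluator may track more fields):

```python
from collections import defaultdict

def summarize(results):
    """results: list of {"category": str, "correct": bool} dicts."""
    by_category = defaultdict(list)
    for r in results:
        by_category[r["category"]].append(r["correct"])
    return {
        "overall_accuracy": sum(r["correct"] for r in results) / len(results),
        "category_performance": {
            cat: {"accuracy": sum(flags) / len(flags)}
            for cat, flags in by_category.items()
        },
    }
```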

Cost Management

Estimated Costs (per iteration)

  • Curriculum Generation: ~$0.10
  • Training Data Generation: ~$0.50
  • Supervised Fine-Tuning: ~$20-50
  • Model Evaluation: ~$0.20
  • DPO Fine-Tuning: ~$15-30
  • Curriculum Revision: ~$0.10

Total per iteration: ~$35-80
3-iteration cycle: ~$110-240
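These totals are just the line items above summed, with fine-tuning dominating the bill; a quick sanity check:

```python
# (low, high) dollar estimates per line item; fixed costs repeat one value.
line_items = {
    "curriculum_generation": (0.10, 0.10),
    "training_data": (0.50, 0.50),
    "sft": (20.0, 50.0),
    "evaluation": (0.20, 0.20),
    "dpo": (15.0, 30.0),
    "curriculum_revision": (0.10, 0.10),
}
low = sum(lo for lo, _ in line_items.values())
high = sum(hi for _, hi in line_items.values())
print(f"per iteration: ${low:.2f}-${high:.2f}")         # $35.90-$80.90
print(f"3 iterations: ${3 * low:.2f}-${3 * high:.2f}")  # $107.70-$242.70
```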

Cost Optimization Features

  • Parallel processing for faster execution
  • Configurable iteration limits
  • Smart curriculum revision to avoid redundant learning
  • Efficient evaluation with targeted question generation

Configuration

Settings

Key configuration options in src/config/settings.py:

class LearningSettings:
    max_iterations: int = 5
    topics_per_iteration: int = 10
    questions_per_topic: int = 10
    evaluation_threshold: float = 0.7
    mastery_threshold: float = 0.9
    max_concurrent_requests: int = 50

Model Selection

Supported models for fine-tuning:

  • gpt-4.1-2025-04-14 (recommended)
  • gpt-4.1-mini-2025-04-14 (cost-effective)
  • gpt-4.1-nano-2025-04-14 (fast iterations)

πŸ“ Project Structure

ALAS/
├── src/
│   ├── core/                      # Core learning components
│   │   ├── deep_research_client.py    # OpenAI Deep Research integration
│   │   ├── training_data_generator.py # Question & answer generation
│   │   ├── fine_tuner.py              # OpenAI fine-tuning wrapper
│   │   ├── evaluator.py               # Model evaluation system
│   │   ├── dpo_improvement.py         # DPO training implementation
│   │   └── curriculum_revision.py     # Performance-based curriculum updates
│   ├── workflows/                 # LangGraph workflows
│   │   └── autonomous_learning_agent.py # Main orchestration workflow
│   └── config/                    # Configuration
│       └── settings.py
├── data/                          # Generated data storage
│   ├── curricula/                 # Learning curricula
│   ├── training_data/             # Generated Q&A pairs
│   ├── evaluations/               # Model evaluation results
│   └── sessions/                  # Session summaries
├── demo_autonomous_learning.py    # Demo script
├── langgraph.json                 # LangGraph Studio configuration
└── learned_topics.json            # Historical learning tracker

Testing

Run individual component tests:

# Test Deep Research API
python test_deep_research.py

# Test curriculum revision
python test_curriculum_revision.py

# Test DPO improvement
python test_dpo_improvement.py

# Test evaluation system
python test_evaluation.py

Contributing

More things to add:

  1. Add fine-tuning for DeepSeek R1 and allow the agent to experiment with hyperparameters

  2. Set up periodic jobs to stay up to date with the latest information

To contribute:

  1. Fork the repository

  2. Create a feature branch (git checkout -b feature/amazing-feature)

  3. Commit your changes (git commit -m 'Add amazing feature')

  4. Push to the branch (git push origin feature/amazing-feature)

  5. Open a Pull Request

Acknowledgments


Built with ❤️ for autonomous AI learning
