Task Memory Engine (TME) is a structured memory framework for LLM-based agents, enabling multi-step task planning, rollback, replacement, and graph-based reasoning.
This repository contains prototype code for two research versions of TME:
- v1: Tree + graph memory framework with slot-based task tracking
  - Paper: Task Memory Engine (TME): A Structured Memory Framework with Graph-Aware Extensions for Multi-Step LLM Agent Tasks
- v2: Spatial memory system with rollback, replacement, DAG dependencies, and memory-aware QA
  - Paper: Task Memory Engine: Spatial Memory for Robust Multi-Step LLM Agents
⚠️ Disclaimer: This is a reference implementation aligned with the above papers. v1 and v2 are conceptually related but structurally distinct. The repository is under active development, and modules may change before final release.
Install the dependencies:

    pip install openai
    pip install python-dotenv  # if using .env to manage OpenAI keys (recommended)

Set your API key as an environment variable:

    export OPENAI_API_KEY=your_key_here

Or use a .env file:

    OPENAI_API_KEY=your_key_here

Test cases are .json files in the cases/ directory, each containing a sequence of user instructions.
The test cases are run against the GPT-4o model.
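As an illustration of the setup described above, the sketch below reads the key from the environment (loading a .env file first when python-dotenv is installed) and loads a case file. The function names `load_api_key` and `load_instructions` are hypothetical, and the case schema used here, a flat JSON list of instruction strings, is an assumption; the actual files in cases/ may be structured differently.

```python
import json
import os
from pathlib import Path


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, loading .env first if possible."""
    try:
        # python-dotenv is optional. load_dotenv() looks for a .env file and
        # copies its entries into os.environ without overriding existing ones.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def load_instructions(case_path: str) -> list[str]:
    """Load a case file.

    ASSUMPTION: the file holds a JSON list of instruction strings.
    Adjust this to the real schema of the files in cases/.
    """
    data = json.loads(Path(case_path).read_text(encoding="utf-8"))
    if not isinstance(data, list) or not all(isinstance(x, str) for x in data):
        raise ValueError("expected a JSON list of instruction strings")
    return data
```

Failing early when the key is missing tends to give a clearer error than letting the first API call fail.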
| Case | File | Description | Mode |
|---|---|---|---|
| Travel Planning | cases/travel_planning_case.json | Multi-step travel booking | general |
| Cooking Planner | cases/cooking_case.json | Recipe steps, edits, substitutions | general |
| Meeting Scheduling | cases/meeting_scheduling_case.json | Rescheduling multi-user meetings | general |
| Cart Editing | cases/cart_editing_case.json | Add/remove items, undo operations | cart |
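The Mode column corresponds to the `--mode` flag of run_case.py. The following is a minimal sketch of how such a command line could be parsed and mapped to a classifier module. It is not the actual run_case.py: the `CLASSIFIER_MODULES` mapping, `parse_args`, and `classifier_module_for` are hypothetical names, and the dotted module paths are only inferred from the file names in this README.

```python
import argparse

# Mode names are taken from the table; the module paths are inferred from the
# repository layout. How run_case.py really wires them up is an ASSUMPTION.
CLASSIFIER_MODULES = {
    "general": "v2.intent_classifier_general",
    "cart": "v2.intent_classifier_specific.intent_classifier_cart",
}


def parse_args(argv=None):
    """Parse a case file path plus an optional --mode (defaults to general)."""
    parser = argparse.ArgumentParser(description="Run a TME test case (sketch).")
    parser.add_argument("case_file", help="path to a .json case file")
    parser.add_argument(
        "--mode",
        choices=sorted(CLASSIFIER_MODULES),
        default="general",
        help="which intent classifier to use",
    )
    return parser.parse_args(argv)


def classifier_module_for(mode: str) -> str:
    """Return the dotted module path of the classifier for a mode."""
    return CLASSIFIER_MODULES[mode]
```

Restricting `--mode` with `choices` makes a mistyped mode fail at parse time rather than partway through a run.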
Run Commands:

    # Run with default classifier (general)
    python run_case.py cases/trip_planning_case.json
    python run_case.py cases/cooking_case.json
    python run_case.py cases/meeting_scheduling_case.json
    # Run cart case with specialized intent_classifier
    python run_case.py cases/cart_editing_case.json --mode cart

- Task Memory Tree (TMT): Hierarchical, structured task memory
- Rollback / Replace: Update or revert previous decisions
- Graph Reasoning (DAG): Non-linear dependencies between subtasks
- Instruction Decomposer: LLM-based substep splitting
- TRIM (Task Relation Inference Module): Classify task relations (merge, depend, rollback, etc.)
- Memory-Aware QA: Answer queries like "what's currently in memory?"
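To make the tree, rollback, and replace ideas in the list above more concrete, here is a minimal sketch of a hierarchical task memory. It is not the code in TaskMemoryStructure.py: the class names are borrowed from the project layout, but every field and method shown (`history`, `status`, `add`, `replace`, `rollback`, `active_memory`) is an assumption chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TaskNode:
    """One subtask in the tree. All field names are illustrative."""
    name: str
    value: str | None = None
    status: str = "active"  # "active" or "rolled_back"
    history: list[str | None] = field(default_factory=list)
    children: list[TaskNode] = field(default_factory=list)


class TaskMemoryTree:
    def __init__(self, root_name: str = "root"):
        self.root = TaskNode(root_name)

    def add(self, parent: TaskNode, name: str, value: str | None = None) -> TaskNode:
        """Attach a new subtask under parent."""
        node = TaskNode(name, value)
        parent.children.append(node)
        return node

    def replace(self, node: TaskNode, new_value: str) -> None:
        """Update a decision while keeping the old value in history."""
        node.history.append(node.value)
        node.value = new_value

    def rollback(self, node: TaskNode) -> None:
        """Restore the previous value; with no history, retire the node."""
        if node.history:
            node.value = node.history.pop()
        else:
            node.status = "rolled_back"

    def active_memory(self, node: TaskNode | None = None) -> list[tuple[str, str | None]]:
        """List (name, value) pairs for active subtasks, depth first."""
        node = self.root if node is None else node
        items = []
        for child in node.children:
            if child.status != "active":
                continue  # skip rolled-back subtasks and their descendants
            items.append((child.name, child.value))
            items.extend(self.active_memory(child))
        return items
```

In this sketch a rolled-back node stays in the tree and is only hidden from `active_memory`, so the record of what was tried is preserved.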
TME processes user inputs into a structured graph of subtasks, preserving history, dependencies, and intent transitions. Below are the architectural diagrams for v1 and v2:
- TME v1 Architecture: Illustrates the tree + graph memory framework with slot-based task tracking.
- TME v2 Architecture: Depicts the structured memory system with DAG dependencies and memory-aware QA.
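The DAG dependencies mentioned above can be illustrated with Python's standard `graphlib` module (Python 3.9+). This is a sketch under stated assumptions, not TME's actual graph logic: the `execution_order` and `affected_by` helpers and the dictionary-of-sets representation are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return subtasks in an order that respects their dependencies.

    dependencies maps each subtask to the set of subtasks it depends on.
    Raises graphlib.CycleError if the graph is not a DAG.
    """
    return list(TopologicalSorter(dependencies).static_order())


def affected_by(dependencies: dict[str, set[str]], changed: str) -> set[str]:
    """Return every subtask that depends, directly or transitively, on changed.

    After a rollback or replacement of `changed`, these are the subtasks
    that might need to be revisited.
    """
    affected: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for task, deps in dependencies.items():
            if current in deps and task not in affected:
                affected.add(task)
                frontier.append(task)
    return affected
```

For example, with `{"book_hotel": {"book_flight"}, "plan_dinner": {"book_hotel"}, "book_flight": set()}`, changing `book_flight` marks both `book_hotel` and `plan_dinner` as affected, while changing `plan_dinner` affects nothing.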
    TME-Agent/
    ├── run_case.py                          # Main script to execute test cases
    ├── cases/                               # JSON input files for test scenarios
    ├── assets/                              # Architecture diagrams
    ├── v2/
    │   ├── TaskMemoryStructure.py           # TaskNode & TaskMemoryTree logic
    │   ├── input_splitter.py                # LLM-based instruction decomposition
    │   ├── trim.py                          # Task relation reasoning (TRIM)
    │   ├── intent_classifier_general.py     # General classifier (default)
    │   └── intent_classifier_specific/
    │       └── intent_classifier_cart.py    # Cart-specific classifier
    ├── citation.bib
    └── README.md

This project is licensed under the Polyform Noncommercial License 1.0.0.
- Free for academic and personal use.
- For commercial use, please contact the author directly for a license.

📧 Contact: biubiutomato@gmail.com

If this project inspires or assists you, please consider:
- Starring the repository
- Opening discussions or issues
- Citing the relevant paper(s)
Let's build memory-aware LLM agents together!