ATLAS

Adaptive Text and Language Analysis System

ATLAS is a document processing and analysis toolkit that provides intelligent text chunking, context preservation, and language analysis capabilities.

Features

Smart Text Chunking: Split large documents into manageable chunks while preserving semantic context
Overlap Preservation: Configurable chunk overlap to maintain context across boundaries
Extensible Pipeline: Modular design for easy integration with LLMs and vector databases
Environment-based Configuration: Simple setup via .env file

Getting Started

Prerequisites

Python 3.9+
pip or poetry

Installation

git clone https://github.com/your-username/ATLAS.git
cd ATLAS
pip install -r requirements.txt

Configuration

Copy the example environment file and fill in your values:

cp .env.example .env

Edit .env with your API keys and configuration settings.
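To illustrate what reading a .env file involves, here is a minimal sketch using only the standard library. It is not ATLAS's own loader: this README does not say how ATLAS reads the file, and the names load_env_file, apply_env, EXAMPLE_API_KEY, and EXAMPLE_CHUNK_SIZE are hypothetical. Check .env.example for the real variable names.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper for illustration; not part of the ATLAS API.
    """
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values  # no file: nothing to load
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # ignore blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Trim whitespace and one layer of surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values: dict[str, str]) -> None:
    """Export parsed values without overriding variables already set."""
    for key, value in values.items():
        os.environ.setdefault(key, value)


if __name__ == "__main__":
    settings = load_env_file(".env")
    apply_env(settings)
    # EXAMPLE_API_KEY is a placeholder name, not a documented ATLAS setting.
    print("API key configured:", "EXAMPLE_API_KEY" in settings)
```

Using setdefault means values already exported in the shell take precedence over the file, which is a common convention but an assumption here.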

Usage

from atlas.chunker import process_chunk

# Split a document into overlapping chunks.
# chunk_size=512 works well for longer technical docs; use 256 for shorter ones.
# chunk_size=300 with overlap=50 has worked well for PDF-derived text (used below).
chunks = process_chunk(
    text="Your long document text here...",
    chunk_size=300,
    overlap=50
)

for chunk in chunks:
    print(chunk)
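The interaction between chunk_size and overlap can be sketched with a plain sliding window. This is an illustration of the general technique, not the implementation behind process_chunk: it assumes sizes are counted in characters (the README does not say whether ATLAS counts characters or tokens), it ignores semantic boundaries, and the name sliding_window_chunks is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into character windows that share `overlap` characters.

    Hypothetical helper for illustration; not part of the ATLAS API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window start advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    # Each chunk repeats the last character of the previous one (overlap=1).
    print(sliding_window_chunks("abcdefghij", chunk_size=4, overlap=1))
    # → ['abcd', 'defg', 'ghij']
```

With chunk_size=300 and overlap=50, each window starts 250 characters after the previous one, so a 1,000-character text yields four chunks under these assumptions.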

Development

Running Tests

pytest tests/

Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

Changelog

See CHANGELOG.md for a list of changes between versions.

License

This project is licensed under the terms described in LICENSE.

Code of Conduct

Please read CODE_OF_CONDUCT.md before contributing.
