Tags: hemanth/piragi

v0.7.9

v0.7.9 - fixes: version sync, types, error handling, packaging, security

v0.7.5

feat: add incremental progress reporting for embedding generation

Fixes #9 - AsyncRagi add() now reports per-batch embedding progress

- embed_chunks() accepts on_progress callback and batch_size parameter
- Progress messages report: "Embedded 32/64 chunks", "Embedded 64/64 chunks"
- Batched processing improves memory efficiency
- Updated docs and tests
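
The batching and progress callback described above can be sketched roughly as follows. This is a minimal illustration, not piragi's actual code: `embed_in_batches` and `fake_embed` are hypothetical names, and only the `on_progress` and `batch_size` parameters and the message format come from the commit message.

```python
from typing import Callable, List, Optional, Sequence


def embed_in_batches(
    chunks: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 32,
    on_progress: Optional[Callable[[str], None]] = None,
) -> List[List[float]]:
    """Embed chunks one batch at a time, reporting progress after each batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    embeddings: List[List[float]] = []
    total = len(chunks)
    for start in range(0, total, batch_size):
        # Only one batch is handed to the model at a time.
        batch = chunks[start:start + batch_size]
        embeddings.extend(embed_batch(batch))
        if on_progress is not None:
            on_progress(f"Embedded {len(embeddings)}/{total} chunks")
    return embeddings


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real embedding model:
    # returns one 1-dimensional vector per input text.
    return [[float(len(text))] for text in batch]


messages: List[str] = []
vectors = embed_in_batches(
    ["chunk"] * 64, fake_embed, batch_size=32, on_progress=messages.append
)
print(messages)
```

Passing `messages.append` as the callback collects the messages; a UI would more likely pass a function that updates a progress bar.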

v0.7.4

fix: use pysbd for accurate sentence boundary detection

Fixes #10 - Text chunking no longer mangles bulleted numbers and acronyms

- Replaced naive period-based sentence breaking with pysbd library
- Correctly handles numbered lists (1. 2. 3.)
- Correctly handles abbreviations (Dr., Mr., Prof.)
- Correctly handles acronyms (U.S., Ph.D., B.A.)
- Correctly handles initials (J.K. Rowling, C.S. Lewis)
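
To see why period-based breaking fails on the cases listed above, here is a small sketch of a naive splitter. The function `naive_sentences` is hypothetical and only approximates the kind of logic that was replaced; the original implementation is not shown in the commit message.

```python
import re
from typing import List


def naive_sentences(text: str) -> List[str]:
    # Break after every period that is followed by whitespace.
    # Abbreviations and initials end in a period too, so they get split.
    return re.split(r"(?<=\.)\s+", text)


text = "Dr. Smith met J.K. Rowling. They talked."
parts = naive_sentences(text)
print(parts)
# ['Dr.', 'Smith met J.K.', 'Rowling.', 'They talked.']
```

With pysbd installed, the equivalent call would be along the lines of `pysbd.Segmenter(language="en", clean=False).segment(text)`, which is designed to keep abbreviations and initials inside their sentence.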

v0.7.3

chore: bump version to 0.7.3

- Fix text loss during chunking when sentence boundary breaking occurs
- PR #7 by @shobhit907

v0.7.2

fix: handle Ollama embedding models returning lists instead of numpy arrays

Remote embedding APIs (like Ollama) return Python lists directly, while
local sentence-transformers return numpy arrays. Added hasattr check
for 'tolist' method before calling it.
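
The `hasattr` check described above might look something like this. It is a sketch under assumptions: `as_float_list` is a hypothetical helper name, and `array.array` stands in for a numpy array here only because it also exposes `tolist()` and keeps the example free of third-party imports.

```python
from array import array
from typing import Any, List


def as_float_list(embedding: Any) -> List[float]:
    """Return the embedding as a plain Python list."""
    # Array-like objects (numpy arrays, array.array) expose tolist().
    if hasattr(embedding, "tolist"):
        return embedding.tolist()
    # Remote APIs may already return a list; copy it as-is.
    return list(embedding)


from_local_model = as_float_list(array("d", [0.1, 0.2]))  # array-like input
from_remote_api = as_float_list([0.1, 0.2])               # plain list input
print(from_local_model, from_remote_api)
```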

v0.7.1

feat: add progress tracking for AsyncRagi.add()

- progress=True returns async iterator with progress messages
- Sync Ragi.add() supports on_progress callback
- Progress reports: discovering, chunking, embedding, storing
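
Consuming progress as an async iterator could look roughly like the sketch below. The generator `add_with_progress` and its message wording are hypothetical; only the four stage names come from the commit message, and the real `AsyncRagi.add()` signature may differ.

```python
import asyncio
from typing import AsyncIterator, List, Sequence

STAGES = ("discovering", "chunking", "embedding", "storing")


async def add_with_progress(sources: Sequence[str]) -> AsyncIterator[str]:
    """Hypothetical ingestion routine that yields one message per stage."""
    for stage in STAGES:
        # Yield control to the event loop, as real work would at each stage.
        await asyncio.sleep(0)
        yield f"{stage}: {len(sources)} source(s)"


async def main() -> List[str]:
    collected = []
    async for message in add_with_progress(["notes.md"]):
        collected.append(message)
    return collected


messages = asyncio.run(main())
print(messages)
```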

v0.7.0

feat: add AsyncRagi for non-blocking async operations

- AsyncRagi wrapper class using asyncio.to_thread
- Full async support for web frameworks (FastAPI, Starlette, aiohttp)
- Async methods: add, ask, retrieve, refresh, count, clear
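
The wrapper pattern named above, delegating blocking calls through `asyncio.to_thread` (available from Python 3.9), can be sketched as follows. `SyncStore` and `AsyncStore` are hypothetical stand-ins for illustration, not the library's classes.

```python
import asyncio
from typing import List


class SyncStore:
    """Hypothetical blocking store standing in for a synchronous class."""

    def __init__(self) -> None:
        self._docs: List[str] = []

    def add(self, doc: str) -> None:
        self._docs.append(doc)

    def count(self) -> int:
        return len(self._docs)


class AsyncStore:
    """Async facade: each method runs the blocking call in a worker thread."""

    def __init__(self, inner: SyncStore) -> None:
        self._inner = inner

    async def add(self, doc: str) -> None:
        await asyncio.to_thread(self._inner.add, doc)

    async def count(self) -> int:
        return await asyncio.to_thread(self._inner.count)


async def main() -> int:
    store = AsyncStore(SyncStore())
    await store.add("first document")
    await store.add("second document")
    return await store.count()


result = asyncio.run(main())
print(result)
```

Because the blocking work runs in a worker thread, the event loop of a web framework stays free to serve other requests while ingestion or retrieval is in progress.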

v0.6.1

chore: bump version to 0.6.1

- Processing hooks for document ingestion
- Streamlit UI for interactive Q&A
- LanceDB score normalization fix

v0.6.0

feat: add knowledge graph support with graph=True flag

- Add KnowledgeGraph class for entity/relationship extraction
- LLM-based extraction during document ingestion
- Graph-augmented retrieval for relationship questions
- Direct graph access via kb.graph property
- New optional extra: piragi[graph] (networkx)

v0.5.0

feat: add recursive web crawling with /** syntax

- Add crawl4ai integration for async crawling with JS rendering
- Support /** suffix for recursive URL crawling (e.g., https://docs.example.com/**)
- Crawls same-domain links, max depth 3, max 100 pages by default
- New optional extra: pip install piragi[crawler]
- Bump version to 0.5.0
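
The `/**` suffix and the crawl limits listed above can be illustrated with a small sketch. The helpers `parse_source` and `should_follow` are hypothetical and do not reflect how piragi or crawl4ai implement crawling; only the suffix and the defaults (same domain, depth 3, 100 pages) are taken from the commit message.

```python
from typing import Tuple
from urllib.parse import urlparse

RECURSIVE_SUFFIX = "/**"


def parse_source(url: str) -> Tuple[str, bool]:
    """Return (root_url, recursive) for a source string."""
    if url.endswith(RECURSIVE_SUFFIX):
        return url[: -len(RECURSIVE_SUFFIX)], True
    return url, False


def should_follow(
    root: str,
    link: str,
    depth: int,
    pages_seen: int,
    max_depth: int = 3,
    max_pages: int = 100,
) -> bool:
    """Follow a link only if it stays on the root's domain and within limits."""
    same_domain = urlparse(root).netloc == urlparse(link).netloc
    return same_domain and depth <= max_depth and pages_seen < max_pages


root, recursive = parse_source("https://docs.example.com/**")
print(root, recursive)
print(should_follow(root, "https://docs.example.com/guide", depth=1, pages_seen=0))
print(should_follow(root, "https://other.example.org/", depth=1, pages_seen=0))
```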