Skip to content

DanielDeshmukh/ella

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

⭐ If Ella's RAG-based medical triage architecture gave you ideas — a star helps other health-AI builders find it. Takes 2 seconds.

Ella: Agentic Core for Qure

Ella is a high-performance medical orchestration engine designed to bridge the gap between patient inquiries and clinical action. Operating as a sophisticated Reasoning and Extraction (RAG) agent, Ella processes unstructured natural language to perform clinical triage, intent classification, and medical knowledge retrieval.

System Architecture and Operational Logic

The core of Ella operates on a multi-stage pipeline designed to ensure clinical safety, contextual grounding, and logical reasoning. Unlike standard conversational AI, Ella utilizes a "Hard-RAG" protocol, which constrains the generative capabilities of the model to a curated set of medical handbooks.

1. Intent Routing and State Awareness

Upon receiving patient input, the system utilizes a State-Aware Router. This component analyzes the current message alongside the conversation history using Pydantic-validated schemas. The router classifies the input into four distinct streams:

  • Emergency: High-priority detection of life-threatening symptoms.
  • Triage: Clinical investigation of symptoms using medical data.
  • Booking: Administrative coordination for appointment scheduling.
  • General Info: Routine inquiries or emotional support.

2. Hybrid Retrieval-Augmented Generation (RAG)

Ella’s "brain" is powered by a high-density vector database containing over 90,000 clinical records. The retrieval process is two-fold:

  • Semantic Search: Utilizing vector embeddings to understand the underlying meaning of a patient's concern.
  • Keyword Matching: Ensuring specific medical terminology and drug names are captured with high precision.
  • Reranking: A Cross-Encoder model (BGE-Reranker) filters the top results to provide only the most relevant clinical context to the LLM, minimizing hallucinations.

3. Synthesis and Clinical Grounding

The final response is generated through a constrained synthesis phase. Ella is instructed to integrate the retrieved clinical data into a professional, empathetic response. This phase ensures that every piece of medical advice is grounded in established medical texts, effectively acting as an "open-book" diagnostic aid.


Technical Progress Summary

Completed Development (Phases 1–8)

The fundamental intelligence and data infrastructure of Ella are now fully operational. Key milestones achieved include:

  • Core Infrastructure: Establishment of a low-latency environment capable of toggling between local (Ollama) and cloud (Groq) inference.
  • Data Engineering: Implementation of an atomic data curation pipeline that cleaned and structured thousands of pages of medical text into machine-readable formats.
  • Vector Intelligence: Deployment of a Qdrant-based vector architecture and a hybrid retrieval system with integrated reranking for sub-second precision.
  • Clinical Logic: Development of an emergency guardrail system and diagnostic reasoning logic that allows Ella to perform multi-turn triage without losing clinical context.

Future Roadmap: Upcoming Updates

Phase 9–12: Relational Orchestration and Memory

The next stage of development focuses on persistent data and advanced state management. This includes the integration of a PostgreSQL/Supabase backend to manage clinic schedules and patient records, alongside the implementation of WindowBufferMemory for enhanced long-term conversation retention.

Phase 13–16: Telephony and Real-Time Voice

To transition from a CLI to a functional receptionist, Ella will be integrated with telephony bridges (Vapi/Twilio). This phase will optimize Speech-to-Text (STT) and Text-to-Speech (TTS) pipelines (using Whisper and ElevenLabs) to achieve a "Time to First Byte" latency of under 1.5 seconds.

Phase 17–20: Production Scaling and Compliance

The final roadmap involves outbound notification systems via WhatsApp/SMS, the creation of a React-based administrative dashboard for human-in-the-loop oversight, and rigorous security hardening to ensure architectural compliance with international medical data privacy standards.

About

Retrieval & Reasoning Engine for Qure. A modular, RAG-based AI agentic core for automated medical telephony and patient triage.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages