RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that combines deep document understanding with agentic workflow capabilities. Its primary goal is to transform complex, unstructured data into high-fidelity, production-ready AI systems with strong contextual grounding and explainability. README.md76-78
RAGFlow provides a production-ready, end-to-end platform that caters to enterprises of any scale, focusing on "Quality in, quality out" by leveraging deep document layout analysis, table extraction, and rich multi-modal features. README.md112-122 pyproject.toml4
Sources: README.md76-122 pyproject.toml4
RAGFlow employs a modular, microservices-oriented three-tier architecture that supports scalability, fault tolerance, and asynchronous task processing. The architecture separates user-facing API operations from intensive document processing workloads using task queues mediated primarily by Redis streams. This design enhances responsiveness and allows parallel processing of large document ingestion workflows.
Sources: README.md141-145 docker/.env140-146 api/utils/api_utils.py45-51
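The producer side of this Redis Streams hand-off can be sketched as follows. Note that the stream key, field layout, and helper names below are illustrative assumptions, not RAGFlow's actual identifiers:

```python
import json
import uuid

TASK_STREAM = "ragflow_task_queue"  # illustrative stream key, not the real one

def build_parse_task(doc_id: str, kb_id: str) -> dict:
    """Build the message body an API-tier producer would enqueue."""
    return {
        "id": uuid.uuid4().hex,  # task id, so workers can trace/ack it
        "doc_id": doc_id,
        "kb_id": kb_id,
        "task_type": "parse",
    }

def enqueue(conn, task: dict):
    """XADD the task onto the stream; TaskExecutor workers then consume it
    with XREADGROUP so each message is delivered to exactly one worker."""
    return conn.xadd(TASK_STREAM, {"message": json.dumps(task)})
```

A worker group created with XGROUP CREATE reads pending messages via XREADGROUP and acknowledges them with XACK once processing succeeds, which is what gives the task tier its at-least-once delivery and fault tolerance.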
RAGFlow's design segments responsibilities into distinct tiers:
| Tier | Description | Key Components |
|---|---|---|
| 1. Frontend / API Tier | Handles user requests, front-end interaction, and chat orchestration. Uses the Quart asynchronous HTTP server supporting streaming, CORS, authentication, and schema validation. | API server in api/ragflow_server.py (Quart), Python SDK, React Web UI. Supports hybrid proxy mode with the Go server (docker/.env:158-159) |
| 2. Asynchronous Task Tier | Decouples compute-intensive tasks such as document parsing, embedding, and indexing from synchronous API processing. Tasks are submitted to Redis Streams queues and processed by TaskExecutor workers (rag/svr/task_executor.py). | Redis Streams for task queueing, Task executor workers, DeepDoc vision engine, GraphRAG, sandbox executor. |
| 3. Persistence Tier | Employs storage engines suitable for different kinds of data: metadata, documents, vectors, raw files. | Relational DBMS: MySQL/PostgreSQL via peewee ORM (api/db/db_models.py), Document stores: Elasticsearch, Infinity, OpenSearch, OceanBase; MinIO for object storage. |
This clear separation of concerns ensures high availability, isolates resource-intensive operations, and allows each tier to scale out independently.
Sources: pyproject.toml5-8 docker/.env13-159 README.md134-139
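The metadata side of the persistence tier follows a conventional ORM-backed pattern. As a self-contained sketch, the example below uses stdlib sqlite3 in place of the actual peewee/MySQL stack, and the `document` schema shown is invented for illustration, not RAGFlow's real one:

```python
import sqlite3

def init_metadata_db(conn: sqlite3.Connection) -> None:
    """Create a minimal document-metadata table (schema is illustrative)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS document (
               id TEXT PRIMARY KEY,
               kb_id TEXT NOT NULL,          -- owning knowledge base
               name TEXT NOT NULL,           -- original filename
               status TEXT DEFAULT 'queued'  -- queued -> parsing -> done
           )"""
    )

def register_document(conn, doc_id: str, kb_id: str, name: str) -> None:
    """Record a newly uploaded document before its parse task is queued."""
    conn.execute(
        "INSERT INTO document (id, kb_id, name) VALUES (?, ?, ?)",
        (doc_id, kb_id, name),
    )

def mark_done(conn, doc_id: str) -> None:
    """Flip the status once a TaskExecutor worker finishes parsing."""
    conn.execute("UPDATE document SET status = 'done' WHERE id = ?", (doc_id,))
```

In the real system, peewee models in api/db/db_models.py play the role of these raw SQL statements, while chunk vectors and raw files go to the document store and MinIO respectively.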
RAGFlow's key domain concepts map directly onto internal code entities and packages. This correspondence between domain concepts and code subsystems guides developers in navigating the codebase more efficiently.
Sources: api/utils/api_utils.py45-51 common/mcp_tool_call_conn.py40-80 rag/svr/task_executor.py1-80
| Component | File Path | Description |
|---|---|---|
| Python API Server | api/ragflow_server.py | Main Quart-based HTTP RESTful API server, handling client API calls and streaming chat support. |
| Go Server | cmd/server_main.go | Implements high-performance service layer and native components. Handles search and user services with lower latency. |
| Admin Service | admin/server/admin_server.py | Admin backend providing operational and monitoring interfaces. |
| MCP Server | mcp/server/server.py | Implements Model Context Protocol server for agentic workflows and tool integration. |
Sources: api/utils/api_utils.py45-50 .github/workflows/tests.yml132-139
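The Python API server's streaming chat support rests on the async-generator pattern that Quart endpoints use. The framework-free sketch below illustrates that pattern only; the token source and SSE framing are invented stand-ins, not RAGFlow's actual response format:

```python
import asyncio

async def fake_llm_tokens():
    # Stand-in for an LLM provider's streaming response.
    for tok in ["RAG", "Flow ", "streams ", "answers."]:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield tok

async def sse_stream():
    """Frame each token as a Server-Sent Events chunk, the general shape
    a streaming chat endpoint writes to the client incrementally."""
    async for tok in fake_llm_tokens():
        yield f"data: {tok}\n\n"
    yield "data: [DONE]\n\n"

async def collect():
    """Drain the stream into a list (a client would consume it chunk by chunk)."""
    return [chunk async for chunk in sse_stream()]
```

Because the generator yields chunks as they arrive rather than buffering the full completion, the client sees tokens immediately, which is what makes chat feel responsive even while the task tier is busy elsewhere.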
Sources: README.md89-140 docker/.env13-159 docs/quickstart.mdx34
| Technology | Usage | Location / Reference |
|---|---|---|
| Python >= 3.13 | Primary backend language | Codebase top-level |
| Go | High-performance server layer | cmd/server_main.go |
| C++ | Native tokenizer and components | internal/cpp (not listed fully here) |
| Quart | Asynchronous HTTP server framework | api/ragflow_server.py |
| Peewee ORM | Relational database ORM | api/db/db_models.py |
| Redis (via Valkey) | Task queues, caching | rag/utils/redis_conn.py |
| Elasticsearch / Infinity / OpenSearch / OceanBase | Document vector stores | Configured via docker/.env, integrated in DocStore related modules. |
| MinIO | Object storage for raw documents and images | Seen in Docker config and usages in task executor |
| Technology | Usage |
|---|---|
| React + Vite | Web UI framework and build |
| Internationalization (i18n) | 10+ languages supported |
Sources: pyproject.toml1-160 uv.lock1-10 README.md1-140
The typical data lifecycle runs from ingestion to query response: documents are uploaded through the API, queued via Redis Streams, parsed and chunked by TaskExecutor workers using the DeepDoc engine, embedded, and indexed into the configured document store; at query time, relevant chunks are retrieved to ground LLM-generated answers. This pipeline is orchestrated asynchronously to maximize API responsiveness and scale.
Sources: README.md94-138 docker/.env13-138 Dockerfile11-16
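The lifecycle described above can be sketched end to end as plain functions. The chunker, "embedding", and index structures here are deliberately toy stand-ins for DeepDoc, the configured embedding model, and the document store (Elasticsearch/Infinity/etc.):

```python
def chunk(text: str, size: int = 40) -> list[str]:
    """Naive fixed-size chunker standing in for DeepDoc's layout-aware parsing."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk_text: str) -> list[float]:
    """Toy 'embedding': character-frequency features instead of a real model."""
    return [chunk_text.count(c) / max(len(chunk_text), 1) for c in "etaoin"]

def index(chunks: list[str]) -> list[dict]:
    """Build the records an indexer would write to the document store."""
    return [{"chunk_id": i, "text": c, "vector": embed(c)}
            for i, c in enumerate(chunks)]

def retrieve(idx: list[dict], query: str, top_k: int = 2) -> list[str]:
    """Score chunks by dot product against the query vector, return the best."""
    q = embed(query)
    scored = sorted(idx, key=lambda r: -sum(a * b for a, b in zip(r["vector"], q)))
    return [r["text"] for r in scored[:top_k]]
```

In production, `chunk` and `embed` run inside TaskExecutor workers behind the Redis Streams queue, while `retrieve` runs synchronously in the API tier against the document store, which is why the two halves of the pipeline scale independently.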
RAGFlow is a comprehensive, production-grade RAG engine designed with a clean modular architecture that integrates deep document understanding, multi-modal parsing, embedding generation, and AI agentic workflows within a scalable three-tier microservice structure. Its architecture addresses both synchronous API interactions and asynchronous, compute-heavy document ingestion and knowledge processing tasks through task queues and background workers. The system is flexible and extensible, supporting multiple storage backends, LLM providers, and document types. The ecosystem includes powerful native components (Go server, C++ tokenizer) complementing its Python core services.
Sources: Entire provided sources with primary insights from README.md76-145 docker/.env13-159 api/utils/api_utils.py1-200 pyproject.toml1-160 Dockerfile1-130 .github/workflows/tests.yml132-139