J-RAG Enterprise is a production-ready RAG (Retrieval-Augmented Generation) scaffold tailored for Java teams. Unlike Python-based solutions, it leverages the robust Spring Boot ecosystem and PostgreSQL (pgvector), offering enterprise-grade features like RBAC, hybrid search, and full observability out of the box.
| Feature | Description |
|---|---|
| ☕ Java Native | Pure Java/Spring Boot stack. No Python dependency hell. Easy to integrate into existing enterprise systems. |
| 🔍 Hybrid Search | Combines vector search (semantic) + keyword search (BM25) + reranking (Jina/BGE) for maximum accuracy. |
| 🛡️ Enterprise RBAC | Built-in role-based access control. Supports multi-tenant data isolation and user groups. |
| 🧠 Agentic RAG | Includes a "Deep Thinking" agent capable of query decomposition and multi-step reasoning. |
| 📊 Observability | Integrated with LangFuse for full-link tracing (latency, token usage, cost). |
| ⚡ Zero Config | Docker-based deployment. Database schema and vector extensions are initialized automatically. |
| 🔌 Model Agnostic | Switch between OpenAI, DeepSeek, Ollama (local), or Aliyun with a single config change. |
- Docker & Docker Compose
```bash
git clone https://github.com/twocold0451/J-RAG.git
cd J-RAG

# Copy the environment file
cp .env.example .env
```

Edit `.env` to set your API Keys (or use the defaults for testing):

```bash
# .env
CHAT_API_KEY=sk-xxxx
EMBEDDING_API_KEY=sk-xxxx
```

```bash
docker-compose up -d
```

Access the application:

- Web UI: http://localhost:5173 (or http://localhost in production)
- API Docs (Swagger): http://localhost:8080/swagger-ui.html
```mermaid
graph TD
    %% Style definitions
    classDef client fill:#e1f5fe,stroke:#01579b,stroke-width:2px;
    classDef api fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px;
    classDef core fill:#fff3e0,stroke:#ef6c00,stroke-width:2px;
    classDef storage fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px;
    classDef ai fill:#fce4ec,stroke:#c2185b,stroke-width:2px;

    %% 1. User layer
    User([User / Web UI]) -->|HTTP/SSE| Controller
    class User client

    %% 2. API layer (Spring Boot)
    subgraph "API Layer"
        Controller[Controllers]
        Auth[Auth/RBAC]
    end
    class Controller,Auth api

    %% 3. Core services layer
    subgraph "Core Services"
        Ingestion[Ingestion Service]
        Chat[Chat Service]
        subgraph "RAG Engine"
            Rewrite["LLM Rewrite"]
            Decompose["Query Decomposition"]
            Search{Hybrid Search}
            Rerank["Cross-Encoder Rerank"]
            Agent["Deep Thinking Agent (ReAct)"]
        end
    end
    class Ingestion,Chat,Rewrite,Decompose,Search,Rerank,Agent core

    %% 4. Data layer
    subgraph "Data Layer"
        PG_Vec[("PostgreSQL<br>pgvector")]
        MinIO[("File Storage")]
    end
    class PG_Vec,MinIO storage

    %% 5. External AI services
    subgraph "Model Provider"
        LLM_API["LLM API<br>(OpenAI/DeepSeek)"]
        Embed_API["Embedding API"]
    end
    class LLM_API,Embed_API ai

    %% 6. Observability
    subgraph "Observability"
        LangFuse[LangFuse]
    end
    class LangFuse ai

    %% --- Edges ---
    %% Ingestion flow
    Controller -->|Upload| Ingestion
    Ingestion -->|Parse & Chunk| Embed_API
    Embed_API -->|Vectors| PG_Vec
    Ingestion -->|File| MinIO

    %% Query flow
    Controller -->|Query| Chat
    Chat -->|1. History| Rewrite
    Rewrite -->|2. Optimize| Decompose
    Decompose -->|3. Sub-Queries| Search
    Search -->|Vector| PG_Vec
    Search -->|Keyword| PG_Vec
    Search -->|4. Candidates| Rerank
    Rerank -->|5. TopK| Agent
    Agent <-->|Reasoning| LLM_API

    %% Monitoring flow
    Chat -.->|Trace| LangFuse
```
Pure vector search often suffers from "semantic drift" and fails on exact keyword matches (e.g., specific product IDs). J-RAG solves this with:
- Dual-Path Retrieval: Parallel execution of semantic search (pgvector) and keyword search (PostgreSQL tsvector/BM25).
- Reranking Strategy: A coarse-to-fine approach. The top 50 candidates are retrieved first, then re-scored by a high-precision cross-encoder (reranker) model, ensuring the final top 5 are contextually accurate.
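Dual-path retrieval needs a way to merge the two ranked lists before reranking. The README does not specify the fusion method, but Reciprocal Rank Fusion (RRF) is a common choice for this step; the sketch below is illustrative (class and method names are hypothetical, not J-RAG's actual API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Minimal sketch of dual-path result fusion via Reciprocal Rank Fusion (RRF).
 * Hypothetical helper for illustration; J-RAG's actual fusion strategy may differ.
 */
public class RrfFusion {
    private static final int K = 60; // standard RRF damping constant

    /** Merge two ranked lists of chunk IDs into one fused ranking of size topN. */
    public static List<String> fuse(List<String> vectorHits, List<String> keywordHits, int topN) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, vectorHits);
        accumulate(scores, keywordHits);
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(topN)
                .map(Map.Entry::getKey)
                .toList();
    }

    /** Each list contributes 1 / (K + rank) per document; duplicates sum up. */
    private static void accumulate(Map<String, Double> scores, List<String> hits) {
        for (int rank = 0; rank < hits.size(); rank++) {
            scores.merge(hits.get(rank), 1.0 / (K + rank + 1), Double::sum);
        }
    }
}
```

A chunk appearing in both lists (e.g., matching the query both semantically and by exact keyword) accumulates score from each path and rises to the top, which is the behavior hybrid search is after; the fused top-50 would then go to the cross-encoder reranker.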
For complex queries like "Compare A and B", simple retrieval often misses half the context.
- Query Decomposition: The system breaks complex queries down into independent sub-queries (e.g., "Features of A", "Features of B").
- ReAct Paradigm: Implements a "Reasoning + Acting" loop, letting the LLM autonomously decide when to search, read, or conclude.
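The decomposition step can be isolated behind a small interface so it is testable without a live model. This is an illustrative sketch only; the `Llm` interface, prompt wording, and class names are assumptions, not J-RAG's actual API:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch of LLM-driven query decomposition.
 * Hypothetical names for illustration; J-RAG's real implementation may differ.
 */
public class QueryDecomposer {
    /** Abstraction over the chat model so the logic can be tested offline. */
    public interface Llm {
        String complete(String prompt);
    }

    private final Llm llm;

    public QueryDecomposer(Llm llm) {
        this.llm = llm;
    }

    /** Ask the model to split a complex query into one sub-query per line. */
    public List<String> decompose(String query) {
        String prompt = "Split the question into independent sub-questions, one per line:\n" + query;
        List<String> subQueries = new ArrayList<>();
        for (String line : llm.complete(prompt).split("\n")) {
            if (!line.isBlank()) {
                subQueries.add(line.strip());
            }
        }
        // Fall back to the original query if the model returned nothing usable.
        return subQueries.isEmpty() ? List.of(query) : subQueries;
    }
}
```

Each sub-query would then be sent through hybrid search independently, and the ReAct agent decides from the combined evidence whether to search again or conclude.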
Modify `src/main/resources/application.properties` to customize your AI provider.
```properties
# DeepSeek
app.model.chat.base-url=https://api.deepseek.com
app.model.chat.model-name=deepseek-chat
app.model.chat.api-key=${CHAT_API_KEY}
```

```properties
# Ollama (local)
app.model.chat.base-url=http://localhost:11434/v1
app.model.chat.model-name=llama3
# Any string works as the key for Ollama
app.model.chat.api-key=ollama
```

- API Documentation (Swagger)
- Deployment Guide (coming soon)
- Developer Guide (coming soon)
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.