Nexus is a Next.js application that helps students and researchers work with Apache ResilientDB and related distributed systems literature. It provides grounded, citation-backed answers over selected documents and introduces agentic capabilities for research workflows.
The project is built on LlamaIndex for retrieval and workflow orchestration, DeepSeek for reasoning, Gemini embeddings for vectorization, and Postgres with pgvector for storage.
- Chat over one or more documents with inline citations
- Ingest local PDFs to a pgvector index via LlamaParse
- Preview PDFs alongside the conversation
- Use a ResilientDB-focused agent with retrieval and web search tools
- Persist short/long‑term memory per session in Postgres
- Core RAG: implemented
- Multi-document ingestion + retrieval: implemented
- Agentic chat + memory: in progress
- Node.js 18.18+ (Node 20+ recommended)
- Postgres 14+ with the pgvector extension
- Accounts/keys:
  - DeepSeek API key
  - Google Gemini API key
  - LlamaCloud API key (for LlamaParse JSON mode)
  - Tavily API key (optional; enables web search tool)
Ensure pgvector is available in your Postgres instance, then create a database (e.g., `nexus`) and enable the extension:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

Tables for memory and document embeddings are created automatically on first run.
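If you want to confirm the extension is active before starting the app, a quick check like the following works. This script is illustrative and not part of the repo; it assumes the `pg` package and the `DATABASE_URL` configured below:

```ts
// check-pgvector.mts — hypothetical helper, not part of the repo
import { Client } from "pg";

const client = new Client({ connectionString: process.env.DATABASE_URL });
await client.connect();

// pg_extension lists every extension installed in the current database.
const { rows } = await client.query(
  "SELECT extversion FROM pg_extension WHERE extname = 'vector'"
);
console.log(
  rows.length ? `pgvector ${rows[0].extversion} is enabled` : "pgvector is NOT enabled"
);
await client.end();
```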
Create a `.env` file in the project root:

```env
DATABASE_URL=postgres://USER:PASSWORD@localhost:5432/nexus

# Must match the embedding model dimension. If you use Gemini Embedding 001, set 768.
# Adjust if you change the embedding model.
EMBEDDING_DIM=768

DEEPSEEK_API_KEY=your_deepseek_key
DEEPSEEK_MODEL=deepseek-chat

# Required for LlamaParse JSON mode used during ingestion
LLAMA_CLOUD_API_KEY=your_llamacloud_key
LLAMA_CLOUD_PROJECT_NAME=
LLAMA_CLOUD_INDEX_NAME=
LLAMA_CLOUD_BASE_URL=

# Optional but recommended for web search tool
TAVILY_API_KEY=

# Embeddings
GEMINI_API_KEY=your_gemini_key
```

Note: `EMBEDDING_DIM` must match your embedding model. With Gemini Embedding 001 it is typically 768. If you switch models, update this value accordingly.
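As an optional sanity check (hypothetical, not part of the repo), you can embed one string and compare its length against `EMBEDDING_DIM` before ingesting anything, e.g. with the `@google/generative-ai` SDK:

```ts
import { GoogleGenerativeAI } from "@google/generative-ai";

// Hypothetical check: embed one string and compare dimensions.
const genai = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genai.getGenerativeModel({ model: "embedding-001" });
const { embedding } = await model.embedContent("dimension check");

if (embedding.values.length !== Number(process.env.EMBEDDING_DIM)) {
  throw new Error(
    `EMBEDDING_DIM=${process.env.EMBEDDING_DIM}, but the model returned ${embedding.values.length} dimensions`
  );
}
```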
```bash
npm install
npm run dev
```

Visit http://localhost:3000/research.
Place your PDFs (and optionally DOC/DOCX/PPT/PPTX) in the `documents/` folder at the repo root. The app will list them in the sidebar. Select one or more, then Nexus will parse and index them on demand.
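A sketch of how such a listing could be implemented; the app's actual code may differ, and `listDocuments` is a hypothetical name:

```ts
import { readdir } from "node:fs/promises";
import path from "node:path";

const SUPPORTED = new Set([".pdf", ".doc", ".docx", ".ppt", ".pptx"]);

// List supported files in documents/ for the sidebar.
export async function listDocuments(root = "documents"): Promise<string[]> {
  const entries = await readdir(root, { withFileTypes: true });
  return entries
    .filter((e) => e.isFile() && SUPPORTED.has(path.extname(e.name).toLowerCase()))
    .map((e) => e.name)
    .sort();
}
```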
- UI: Next.js App Router (`/research`) with a chat panel and a preview panel
- Retrieval and ingestion: LlamaIndex pipeline with SentenceSplitter, SummaryExtractor, and Gemini embeddings stored in Postgres pgvector (see the sketch after this list)
- Agent: LlamaIndex workflow with DeepSeek as the LLM and two main tools: `search_documents` (queries the local vector index, filtered by the selected documents) and `search_web` (optional Tavily-backed web search)
- Memory: session‑scoped short/long‑term memory persisted to pgvector
- Streaming: server streams model deltas; client renders responses and citations in real time
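For orientation, here is a rough sketch of what the ingestion side might look like with LlamaIndex's TypeScript API. Exact imports, option names, and the vector store constructor vary across llamaindex versions, so treat this as illustrative rather than the repo's actual code:

```ts
import {
  IngestionPipeline,
  SentenceSplitter,
  SummaryExtractor,
  Document,
} from "llamaindex";
// Vector store and embedding imports depend on your llamaindex version,
// e.g. @llamaindex/postgres and @llamaindex/google in newer releases.

// vectorStore is left untyped here because its constructor options
// differ between versions; in this project it would be a pgvector store
// pointed at DATABASE_URL with EMBEDDING_DIM dimensions.
async function ingest(texts: string[], vectorStore: any) {
  const pipeline = new IngestionPipeline({
    transformations: [
      new SentenceSplitter({ chunkSize: 1024, chunkOverlap: 128 }),
      new SummaryExtractor(), // attaches a summary to each node's metadata
      // an embedding transform (e.g. Gemini) would typically run last
      // so every node gets a vector before it is written to the store
    ],
    vectorStore,
  });
  return pipeline.run({
    documents: texts.map((t) => new Document({ text: t })),
  });
}
```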
- `GET /api/research/documents` — lists available files in `documents/`
- `GET /api/research/files/[...path]` — serves files from `documents/` for preview
- `POST /api/research/prepare-index` — parses and ingests selected files into the vector index (LlamaParse JSON mode)
- `POST /api/research/chat` — runs the ResilientDB agent with retrieval/memory and streams the response
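As a usage illustration, a client could call the chat route and consume the stream like this. The request body fields are assumptions, not the documented contract; inspect the route handler for the real shape:

```ts
const sessionId = crypto.randomUUID();

const res = await fetch(`/api/research/chat?session=${sessionId}`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  // Hypothetical body shape.
  body: JSON.stringify({
    message: "How does ResilientDB reach consensus?",
    documents: ["resilientdb.pdf"],
  }),
});

// Render streamed deltas as they arrive.
const reader = res.body!.getReader();
const decoder = new TextDecoder();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true }));
}
```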
Each chat session is identified by a `?session=<uuid>` query param. The agent attaches a memory that persists across turns for that session and stores long‑term facts in pgvector.
Model responses use `[^id]` inline markers. The UI renders them as badges with source titles and page numbers when available.
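Parsing those markers is straightforward; a minimal sketch (not the UI's actual renderer) that splits a response into text and citation segments:

```ts
type Segment =
  | { type: "text"; value: string }
  | { type: "citation"; id: string };

const CITATION = /\[\^([\w-]+)\]/g; // matches [^id] inline markers

export function splitCitations(text: string): Segment[] {
  const segments: Segment[] = [];
  let last = 0;
  for (const m of text.matchAll(CITATION)) {
    if (m.index! > last) {
      segments.push({ type: "text", value: text.slice(last, m.index) });
    }
    segments.push({ type: "citation", id: m[1] });
    last = m.index! + m[0].length;
  }
  if (last < text.length) {
    segments.push({ type: "text", value: text.slice(last) });
  }
  return segments;
}
```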
- Add PDFs to `documents/`.
- Start the app and open `/research`.
- Select documents in the sidebar. The app will prepare an index for the selection.
- Ask questions. The agent retrieves from the selected documents first, and may use web search if needed.
- Toggle Code Composer to draft code grounded in the selected papers (experimental).
Environment variables from `src/config/environment.ts`:

- `DATABASE_URL`: Postgres connection string
- `EMBEDDING_DIM`: embedding dimensionality used to initialize pgvector tables
- `DEEPSEEK_API_KEY`: required for the DeepSeek LLM
- `DEEPSEEK_MODEL`: `deepseek-chat` or `deepseek-coder`
- `LLAMA_CLOUD_API_KEY`: required for LlamaParse JSON parsing during ingestion
- `LLAMA_CLOUD_PROJECT_NAME`, `LLAMA_CLOUD_INDEX_NAME`, `LLAMA_CLOUD_BASE_URL`: optional, reserved for future cloud indexing
- `TAVILY_API_KEY`: optional, enables the web search tool
- `GEMINI_API_KEY`: required for Gemini embeddings
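A minimal sketch of the kind of validation such a module might perform; the actual `src/config/environment.ts` may differ, and `requireEnv` is a hypothetical helper:

```ts
// Illustrative only; fail fast when a required variable is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

export const env = {
  databaseUrl: requireEnv("DATABASE_URL"),
  embeddingDim: Number(process.env.EMBEDDING_DIM ?? "768"),
  deepseekApiKey: requireEnv("DEEPSEEK_API_KEY"),
  deepseekModel: process.env.DEEPSEEK_MODEL ?? "deepseek-chat",
  llamaCloudApiKey: requireEnv("LLAMA_CLOUD_API_KEY"),
  tavilyApiKey: process.env.TAVILY_API_KEY, // optional
  geminiApiKey: requireEnv("GEMINI_API_KEY"),
};
```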
If you change the embedding model, ensure `EMBEDDING_DIM` matches it and re‑create the vector tables if needed.
- `npm run dev` — start the Next.js dev server
- `npm run build` — build for production
- `npm run start` — start the production server
- `npm run lint` — lint the codebase
- Next.js 15, React 19, TypeScript
- TailwindCSS, Radix UI primitives
- LlamaIndex (agents, ingestion pipeline, retriever)
- DeepSeek (LLM), Gemini embeddings
- Postgres + pgvector
- Tavily (optional web search)
- This project is intended for educational and research use around Apache ResilientDB and related systems. Use appropriate discretion when interpreting generated code.