GraphRAG set_graph() extremely slow due to per-entity/relation embedding (10-50x slower than batch) #16205

@fangooer

Bug Description

The set_graph() function in GraphRAG embeds entities and relations one by one instead of in batches. This causes the GraphRAG pipeline to become extremely slow or stall completely on large knowledge graphs (thousands of nodes/edges). Each embedding call is a separate HTTP request to the embedding server, with no batching.

Steps to Reproduce

  1. Create a dataset with GraphRAG chunk method enabled
  2. Upload a document that generates a large knowledge graph (e.g., hundreds of pages of Chinese building codes, producing 17,000+ edges)
  3. Start parsing and wait for graph construction to reach the edge embedding phase
  4. Observe: edge embedding slows to a crawl or stalls entirely (e.g., of 17,576 edges, 17,509 completed and then progress stopped)

Root Cause Analysis

In rag/ragraph/graph_utils.py (or similar), set_graph() creates individual asyncio tasks for each node/edge embedding call:

# Current (simplified): one task, and one embedding request, per entity
for entity in entities:
    task = asyncio.create_task(encode(entity.text))  # separate HTTP request each time

This should instead:

  1. Collect all texts
  2. Batch cache lookup
  3. Batch encode call
  4. Batch construct chunks

Expected Behavior

All entity/relation texts should be batched for embedding (e.g., 32 or 64 per request), leveraging the batching capability of embedding servers (TEI, OpenAI API, etc.). According to the benchmarks in PR #15982, this would improve performance by 10-50x.
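
To put the batch sizes in context, the short calculation below counts how many embedding requests the 17,576-edge graph from the reproduction steps would need at different batch sizes. `request_count` is a hypothetical helper written for this example. It counts requests only; actual wall-clock speedup depends on the embedding server and hardware, so it should not be read as a substitute for the PR's benchmarks.

```python
import math


def request_count(num_texts: int, batch_size: int) -> int:
    """Number of embedding requests needed to cover num_texts texts."""
    return math.ceil(num_texts / batch_size)


edges = 17_576  # edge count reported in the reproduction steps
for batch_size in (1, 32, 64):
    print(f"batch_size={batch_size}: {request_count(edges, batch_size)} requests")
# batch_size=1: 17576 requests
# batch_size=32: 550 requests
# batch_size=64: 275 requests
```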

Environment

  • RAGFlow version: v0.26.0 / v0.26.1
  • Deployment: Docker (CPU)
  • Embedding model: bge-m3 (1024 dim) via TEI
  • Graph scale: ~17,000+ edges
  • Hardware: NAS, 8GB RAM, 4-core CPU

Proposed Fix

PR #15982 (perf(graphrag): batch entity/relation embeddings in set_graph) already implements batch embedding. As of 2026-06-19, this PR is still open and has not been merged into v0.26.0 or v0.26.1. Merging this PR would resolve the core performance issue.

Additional Context

The per-entity embedding design also has an unbounded task creation problem: chat_limiter only limits internal model calls and does not control the total number of asyncio tasks created in set_graph(). This can lead to memory exhaustion on resource-constrained deployments.
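
One common way to bound the number of tasks, independent of any limiter around the model call itself, is a fixed pool of workers that pull items from a queue. The sketch below illustrates that general pattern under assumed names (`run_bounded`, `worker_fn`, `num_workers` are hypothetical); it is not taken from RAGFlow or from PR #15982.

```python
import asyncio


async def run_bounded(items, worker_fn, num_workers=4):
    """Process items with a fixed number of worker tasks.

    Only num_workers tasks exist at any time, however many items there are.
    The queue holds plain (index, item) tuples, not coroutines or tasks.
    """
    items = list(items)
    queue: asyncio.Queue = asyncio.Queue()
    for index, item in enumerate(items):
        queue.put_nowait((index, item))
    results = [None] * len(items)

    async def worker():
        while True:
            try:
                index, item = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # nothing left to process
            results[index] = await worker_fn(item)

    await asyncio.gather(*(worker() for _ in range(num_workers)))
    return results


async def _demo_embed(batch):
    # Stand-in for a batched embedding call (assumption).
    await asyncio.sleep(0)
    return len(batch)


if __name__ == "__main__":
    batches = [["a", "b"], ["c"], ["d", "e", "f"]]
    print(asyncio.run(run_bounded(batches, _demo_embed, num_workers=2)))
```

Combined with batching, each queue item could be a whole batch of texts, so both the request count and the task count stay small on memory-constrained hosts.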
