@Zacharyr41 Zacharyr41 commented Dec 17, 2025

Closes #9578

Summary

High-throughput VCF to PostgreSQL loader using asyncpg for bulk variant ingestion.

  • Adds vcfpgloader/load module for loading VCF variants into PostgreSQL databases
  • Uses cyvcf2 for VCF parsing and asyncpg for high-performance database operations
  • Supports batch loading with configurable batch size and parallel workers
  • Outputs JSON report with loading statistics and detailed log
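
For reviewers unfamiliar with the batching idea in the bullets above, here is a minimal sketch of grouping parsed variant rows into fixed-size batches. It is an illustration under stated assumptions, not the loader's actual code: the `Record` row shape and the `batched` helper are hypothetical names, and VCF parsing is left out so the snippet runs with only the standard library.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row shape for illustration: (chrom, pos, ref, alt).
Record = Tuple[str, int, str, str]


def batched(records: Iterable[Record], batch_size: int) -> Iterator[List[Record]]:
    """Yield consecutive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(records)
    while True:
        # islice pulls the next batch_size items without loading the whole input.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    rows = [("chr1", pos, "A", "T") for pos in range(1, 6)]
    for batch in batched(rows, batch_size=2):
        print(len(batch), batch[0])
```

In a real loader, each batch would presumably be handed to a bulk-insert call such as asyncpg's `Connection.copy_records_to_table`, with several workers consuming batches concurrently; how the tool actually does this is not stated in this PR description.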

Test plan

  • Stub tests pass with sarscov2 and homo_sapiens test data
  • nf-core modules lint vcfpgloader/load passes (43/43 tests)
  • Pre-commit hooks pass

@Zacharyr41 Zacharyr41 marked this pull request as draft December 17, 2025 06:00
@Zacharyr41 Zacharyr41 marked this pull request as ready for review December 17, 2025 06:00
@Zacharyr41
Author

@nf-core/modules-team tagging per this instruction

This is my first PR here so I apologize in advance if I missed anything!

Development

Successfully merging this pull request may close these issues.

new module: vcf-pg-loader
