Shared memory performance #126

@kyleabeauchamp

Does anyone (e.g., @ilveroluca or @avilella) have any thoughts on the performance of bwa mem when run with a shared-memory index (bwa shm)? I've found a 24% wall-clock penalty (about 27% in CPU time) when using a pre-loaded index, which to my naive mind suggests either increased cache misses or suboptimal virtual memory paging (possibly related to mmap flags). Ideally, I would love for the pre-loaded index to improve performance through lower overall RAM usage, less time spent on I/O, and more flexibility for threading / multiplexing.

Does this number seem "reasonable" to others who have thought more carefully about memory management?

My benchmark commands and their output are below. FWIW, I've observed similar behavior on both OS X and Linux.

wget ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/131219_D00360_005_BH814YADXX/Project_RM8398/Sample_U0a/U0a_CGATGT_L001_R1_001.fastq.gz

time bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz > /dev/null
[M::bwa_idx_load_from_disk] ...
[...]
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz
[main] Real time: 146.956 sec; CPU: 1690.897 sec

bwa shm ref.bwa_mem.fa
time bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz > /dev/null
[M::main_mem] load the bwa index from shared memory
[...]
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz
[main] Real time: 182.335 sec; CPU: 2153.612 sec
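
For reference, the percentages can be derived from the two "[main] Real time" lines above. This is only a small sketch of the arithmetic; `slowdown_pct` is a hypothetical helper name of my own, not part of bwa, and it assumes a POSIX shell with awk available.

```shell
# Percent slowdown of a new timing relative to a baseline timing.
# Usage: slowdown_pct BASELINE_SECONDS NEW_SECONDS
slowdown_pct() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new / base - 1) * 100 }'
}

# Wall-clock time: index loaded from disk vs. index in shared memory
slowdown_pct 146.956 182.335     # prints 24.1

# CPU time for the same two runs
slowdown_pct 1690.897 2153.612   # prints 27.4
```

The CPU-time penalty is slightly larger than the wall-clock penalty, so the slowdown appears to come from the alignment work itself and not from the one-off index load.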
