Shared memory performance

Does anyone (e.g., @ilveroluca or @avilella) have any thoughts on the performance of `bwa mem` when run with a shared memory index (`bwa shm`)?  I've found a 24% wall-clock performance penalty when using a pre-loaded index, which to my naive mind suggests either increased cache misses or suboptimal virtual memory paging (possibly related to `mmap` flags).  Ideally, I would love for the pre-loaded index to *improve* performance due to the overall decrease in RAM usage, reduced time spent on IO, and increased flexibility for threading / multiplexing.

Does this number seem "reasonable" to others who have thought more carefully about memory management?

My benchmark code is below.  FWIW, I've observed similar behavior on both OS X and Linux.


```bash
wget ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/131219_D00360_005_BH814YADXX/Project_RM8398/Sample_U0a/U0a_CGATGT_L001_R1_001.fastq.gz

time bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz > /dev/null
[M::bwa_idx_load_from_disk] ...
[...]
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz
[main] Real time: 146.956 sec; CPU: 1690.897 sec

bwa shm ref.bwa_mem.fa
time bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz > /dev/null
[M::main_mem] load the bwa index from shared memory
[...]
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 12 ref.bwa_mem.fa U0a_CGATGT_L001_R1_001.fastq.gz
[main] Real time: 182.335 sec; CPU: 2153.612 sec
```
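For reference, here is a small sketch of how the overhead figures fall out of the timings above. `slowdown_pct` is a hypothetical helper name of my own, not part of bwa; it only assumes `awk` is available.

```bash
#!/usr/bin/env bash
# slowdown_pct is a hypothetical helper (not part of bwa): it prints the
# percentage by which the second timing exceeds the first, to one decimal.
slowdown_pct() {
    awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.1f\n", (cand / base - 1) * 100 }'
}

# Timings copied from the "[main] Real time: ...; CPU: ..." lines above:
# first argument is the disk-loaded run, second is the shared-memory run.
echo "wall-clock overhead: $(slowdown_pct 146.956 182.335)%"
echo "CPU-time overhead:   $(slowdown_pct 1690.897 2153.612)%"
```

On these numbers that works out to roughly 24% in wall-clock time and roughly 27% in CPU time, so the extra cost shows up in total CPU work and not only in elapsed time.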

Shared memory performance #126

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Shared memory performance #126

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions