Skip to content

khmer multiprocessing#638

Closed
camillescott wants to merge 22 commits into
masterfrom
feature/threading
Closed

khmer multiprocessing#638
camillescott wants to merge 22 commits into
masterfrom
feature/threading

Conversation

@camillescott

Copy link
Copy Markdown
Member

Outstanding PR for experimenting and eventually finalizing multithreading/processing in khmer.

@ged-jenkins

Copy link
Copy Markdown

Test PASSed.

Comment thread lib/hashtable.cc Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does OpenMP work without releasing the GIL?

Should the Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS be put here or at the Python extension? I guess at the Python extension, since the lib is Python-independent.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding was that the GIL doesn't come into play here because there
aren't any python threads or references to python objects, and thus no
python bytecode running through the interpreter; some preliminary
performance profiling seems to indicate that's the case. I could be wrong
though.

On Fri, Oct 24, 2014 at 10:44 AM, Luiz Irber notifications@github.com
wrote:

In lib/hashtable.cc:

     n_consumed++;
 }

  • for (unsigned int i=0; i<s.length()-_ksize; ++i) {
  • }
  • */
  • std::vector hashes;
  • while(!kmers.done()) {
  •    hashes.push_back(kmers.next());
    
  • }
  • #pragma omp parallel

Does OpenMP work without releasing the GIL?

Should the Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS be put here or at
the Python extension? I guess at the Python extension, since the lib is
Python-independent.


Reply to this email directly or view it on GitHub
https://github.com/ged-lab/khmer/pull/638/files#r19342467.

Camille Scott

Department of Computer Science and Engineering
Lab for Genomics, Evolution, and Development
Michigan State University

camille.scott.w@gmail.com

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Fri, Oct 24, 2014 at 07:48:30AM -0700, Camille Scott wrote:

     n_consumed++;
 }

  • for (unsigned int i=0; i<s.length()-_ksize; ++i) {
  • }
  • */
  • std::vector hashes;
  • while(!kmers.done()) {
  •    hashes.push_back(kmers.next());
    
  • }
  • #pragma omp parallel

My understanding was that the GIL doesn't come into play here because there
aren't any python threads or references to python objects, and thus no
python bytecode running through the interpreter; some preliminary
performance profiling seems to indicate that's the case. I could be wrong
though.

As you say.

Plus, it'd crash. CPython is pretty good about crashing in this kind of
situation.

@ged-jenkins

Copy link
Copy Markdown

Test FAILed.

Comment thread khmer/_khmermodule.cc

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: move these inside try {} block.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link says only the OpenMP master thread can deal with exceptions, so there might be another changes lurking here.

@ged-jenkins

Copy link
Copy Markdown

Test FAILed.

Comment thread lib/counting.hh

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self / general: We care very much, because unsigned ints have module sizeof behavior (they loop back around to zero).

@ged-jenkins

Copy link
Copy Markdown

Test FAILed.

1 similar comment
@ged-jenkins

Copy link
Copy Markdown

Test FAILed.

@camillescott

Copy link
Copy Markdown
Member Author

TODO:

  • Many many tests
  • Fix weird deadlocking / system error
  • Promote recent additions to AsyncDiginorm up to AsyncSequenceProcessor
  • Pair awareness

@camillescott

Copy link
Copy Markdown
Member Author

Example usage from Python can be found here: https://github.com/camillescott/khmer-metrics/blob/master/test_async_diginorm.py

@ctb @mr-c @luizirber thoughts on the Python interface are welcome. As of now, interaction with processed reads is mediated by an iterator over the output queue, which returns khmer::read_parsers::Read objects.

@camillescott camillescott mentioned this pull request Nov 12, 2014
25 tasks
@camillescott

Copy link
Copy Markdown
Member Author

Closing in favor of #655

@ctb ctb deleted the feature/threading branch January 21, 2017 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants