Open
119 commits
9ad1353
Initial naive openmp test
camillescott Oct 21, 2014
5638bb1
Add extra_link_args for -lgomp
camillescott Oct 21, 2014
1e0274c
Add spinlock to CountingHash::count, enable openmp in Hashtable::cons…
camillescott Oct 24, 2014
6cd58a7
Add _parallel methods
camillescott Oct 24, 2014
5746aa5
Remove lock for testing purposes
camillescott Oct 24, 2014
ecd657c
Thread output
camillescott Oct 27, 2014
0be7209
first pass seqan impl
mr-c Oct 29, 2014
4eb7247
add the seqan headers
mr-c Oct 29, 2014
50fd998
point to the seqan headers
mr-c Oct 29, 2014
954a8b0
Add new test, new data
camillescott Oct 27, 2014
237ad2c
remove impl specific assert
mr-c Oct 29, 2014
fa33b81
Evidently NoMoreReadsAvailable was never thrown before?
mr-c Oct 29, 2014
83436ca
update tests for early failure mode
mr-c Oct 29, 2014
1b51eeb
restore these files; remove later
mr-c Oct 29, 2014
e338ee1
remove unneeded library link
mr-c Oct 29, 2014
8220c08
Expose consume_async to python land
camillescott Oct 29, 2014
5e4f198
Add consume_async to counting
camillescott Oct 29, 2014
b6be309
Add consume_async to Hashtable
camillescott Oct 29, 2014
d138b79
Add async_hash and threadsafe queue
camillescott Oct 29, 2014
63c760a
Add async flag to normalize by median
camillescott Oct 29, 2014
ced8ccc
proper build deps for new objects
camillescott Oct 29, 2014
cb14c0e
Add test for async consume
camillescott Oct 29, 2014
bbd33c9
Merge branch 'master' into feature/threading
camillescott Oct 29, 2014
ac399a5
cleanup includes
mr-c Oct 30, 2014
d70443c
Revert "cleanup includes"
mr-c Oct 30, 2014
63428ce
remove Hasher and other unused code
mr-c Oct 30, 2014
99e41f5
clean up makefile, PEP8
mr-c Oct 30, 2014
e262b2d
omit seqan from coverage report
mr-c Oct 30, 2014
6f07fd4
bye bye thread_id_map
mr-c Oct 30, 2014
b12089c
goodbye khmer_config
mr-c Oct 30, 2014
80cf52e
pep8
mr-c Oct 30, 2014
dd483e2
salut unused files
mr-c Oct 30, 2014
e760a37
exception pruning
mr-c Oct 30, 2014
a88e3f3
restore fast cppcheck
mr-c Oct 30, 2014
13da559
two more unused members to go
mr-c Oct 30, 2014
f99a735
more nits
mr-c Oct 30, 2014
ee75ac3
document Seqan PR
mr-c Oct 30, 2014
2875b70
Add async threaded hashing, include Boost::Lockfree, update setup.py …
camillescott Oct 31, 2014
e4b43aa
Fix improper async shutdown; add consume_fasta_async
camillescott Oct 31, 2014
4d672ff
Begin defining general Async classes
camillescott Oct 31, 2014
08d4511
Add AsyncDiginorm
camillescott Oct 31, 2014
54b9292
Fix mismatched include guards, remove deprecated register specificati…
Nov 1, 2014
17c0474
Add necessary semaphore/fcntl includes to system_sema.h. Add c++98/c…
Nov 1, 2014
81bf2b0
Add proposed fix to seqan build problems under os x 10.10; untested o…
Nov 1, 2014
c32906c
Some fixes to AsyncDiginorm cpython
camillescott Nov 1, 2014
8e0c619
Merge pull request #646 from ged-lab/test/fix_seqan_warns
mr-c Nov 2, 2014
bda50d1
update known issues
mr-c Oct 31, 2014
9f3080b
switch to a pthread mutex to avoid c++11 for now
mr-c Nov 2, 2014
6d660c2
First round of interface-driven AsyncSequenceProcessor implementation
camillescott Nov 4, 2014
c8a600e
drop c++0x for now
mr-c Nov 5, 2014
dff2316
Fix std::system error by checking join status, fix Read memory leak
camillescott Nov 5, 2014
4f4c026
Promote all general functions from AsyncDiginorm up to AsyncSequenceP…
camillescott Nov 5, 2014
254bd61
drop -stdlib=c++
mr-c Nov 5, 2014
07977d3
Add a writer that accepts whole sequences
camillescott Nov 6, 2014
8f0c09d
Merge in SeqAn; fix thread stopping; expose getters in Python; fix se…
camillescott Nov 6, 2014
e1bac4f
Add tests for getters, asyncdiginorm
camillescott Nov 7, 2014
6cd3067
Rip out async_hash prototypes
camillescott Nov 11, 2014
c18d5a8
Rip out rest of async_hash; add virtual iter_stop method to AsyncSequ…
camillescott Nov 12, 2014
c581ef9
Begin moving stuff to header file from _khmermodule.cc
camillescott Nov 12, 2014
b5562cb
Complete adding async module; fix setup.py for proper child module na…
camillescott Nov 12, 2014
84ce02f
Many linking / structure changes...
camillescott Nov 12, 2014
3ebc4ca
Revert to separate modules
camillescott Nov 12, 2014
ace6e33
Proof of concept: separate cc and hh for async, no submodule. WORKING.
camillescott Nov 14, 2014
0e4a2a4
Remove all openmp stuff; remove last bits of multimodule testing
camillescott Nov 14, 2014
616c61d
Remove _khmerasyncmodule from Makefile
camillescott Nov 14, 2014
e2b3dd5
Revert some unnecessary changes cluttering up the diff (whitespace, etc)
camillescott Nov 14, 2014
25dc910
Make async submodule in python-land
camillescott Nov 14, 2014
1e40b89
Implement basic threaded exception handling machinery
camillescott Nov 16, 2014
b11ca84
add the seqan headers
mr-c Oct 29, 2014
b9b9f7b
Fix mismatched include guards, remove deprecated register specificati…
Nov 1, 2014
b4c227b
Add necessary semaphore/fcntl includes to system_sema.h. Add c++98/c…
Nov 1, 2014
adc7775
first pass seqan impl
mr-c Oct 29, 2014
ed099b1
Implement AsyncExceptionHandler, update other classes to use it, and …
camillescott Nov 18, 2014
c7eea6e
Add cscope files to gitignore
camillescott Nov 18, 2014
127c3d7
Improve structure of Async model by introducing AsyncConsumer, AsyncP…
camillescott Nov 18, 2014
96bc918
Implement batched mode properly, promote batchsize to Async, finish p…
camillescott Nov 19, 2014
526f057
Reduce memory footprint of ReadBatch using array
camillescott Nov 19, 2014
a99dad2
Add test for paired diginorm
camillescott Nov 19, 2014
aa35d32
Add test of culling reads properly in paired mode, simplify logic for…
camillescott Nov 19, 2014
2ca6b41
Add dummy AsyncSequenceProcessor class for testing purposes, tests
camillescott Nov 19, 2014
e614347
Merge in rebased seqan branch
camillescott Nov 20, 2014
82085b1
Fix import error
camillescott Nov 20, 2014
842952a
Error handling changes
camillescott Nov 20, 2014
745d990
Fix memory corruption caused by pushing to output queue before sequen…
camillescott Nov 20, 2014
4773f19
Fix bug in AsyncSequenceProcessorTester consume() function; remove ti…
camillescott Nov 20, 2014
068aae4
Add normalize-by-median-async script
camillescott Nov 20, 2014
9329324
Move some output to stderr
camillescott Nov 21, 2014
053cd4e
Typo
camillescott Nov 21, 2014
3c0f9fd
Add some macros for timing
camillescott Nov 21, 2014
e111c81
First pass doing hashing in consume() threads
camillescott Nov 24, 2014
6d10f9d
Fix memory leak
camillescott Nov 24, 2014
763f3d1
Add getters for queue loads
camillescott Nov 24, 2014
7a0afe6
Revert bundled hash writer; add threadsafe counter to hashtable
camillescott Nov 25, 2014
f802c8a
Update AsyncDiginorm to use new threadsafe hashtable writing
camillescott Nov 26, 2014
8f0ce70
Add potional writer wait timing
camillescott Nov 26, 2014
50a9321
Remove a memory leak
camillescott Nov 26, 2014
c527dbf
Async file restructuring
camillescott Dec 1, 2014
6380b0e
Change includes, structure of async files
camillescott Dec 1, 2014
cfd7232
Fix cppcheck rule in Makefile; static inline timediff function to res…
camillescott Dec 1, 2014
eccefe5
Make use of typedef'd ReadPtr
camillescott Dec 2, 2014
e725fdb
Fix normalize-by-median-async for fasta output
camillescott Dec 2, 2014
71f3a05
Implement control over locking block size
camillescott Dec 3, 2014
ca94d90
Change timing output:
camillescott Dec 3, 2014
e47409e
Add global timing counters
camillescott Dec 4, 2014
e418084
Turn off timing, begin round robin model
camillescott Dec 9, 2014
6837e0b
Additional work on RR producer
camillescott Dec 15, 2014
ed97e8a
Merge pull request #642 from ged-lab/test/simplify
mr-c Dec 15, 2014
9878215
Comment out RR model
camillescott Dec 16, 2014
21a80b0
Add XDECREF for returned read tuple in ReadParser.read_pair_iterator()
camillescott Dec 16, 2014
83d2bca
Merge in master
camillescott Dec 16, 2014
0ad7a35
Fix memory leak in read iterator; fix output for number and perc kept…
camillescott Dec 16, 2014
71c926d
Add output for acquiring stdout lock when in verbose mode
camillescott Jan 26, 2015
9a4ff5e
Add a packaged boost::lockfree to third-party.
camillescott Jan 26, 2015
e9bdd82
Add header for async parser
camillescott Feb 20, 2015
b7f449f
Add async_parser object; begin transition to state-based operation …
camillescott Feb 23, 2015
58c5acf
Finish transition to state based system; fix hangup bug
camillescott Feb 24, 2015
661dc30
Major cleanup: remove async_writers, fix some tests, fix one deadlock…
camillescott Feb 25, 2015
4697c76
Switch ordering of some exception handling
camillescott Mar 17, 2015
a16709b
Add exception propagation from async_parser
camillescott Mar 18, 2015
2 changes: 2 additions & 0 deletions .gitignore
@@ -41,3 +41,5 @@ third-party/zlib/zlib.pc
pip-log.txt
sphinx-contrib
compile_commands.json
cscope.out
tags
13 changes: 6 additions & 7 deletions Makefile
@@ -4,18 +4,18 @@
# and documentation
# make coverage-report to check coverage of the python scripts by the tests

CPPSOURCES=$(wildcard lib/*.cc lib/*.hh khmer/_khmermodule.cc)
CPPSOURCES=$(wildcard lib/*.cc lib/*.hh lib/async/*.cc lib/async/*.hh khmer/_khmermodule.cc khmer/async/_khmerasyncmodule.cc)
PYSOURCES=$(wildcard khmer/*.py scripts/*.py)
SOURCES=$(PYSOURCES) $(CPPSOURCES) setup.py
DEVPKGS=sphinxcontrib-autoprogram pep8==1.5.7 diff_cover \
autopep8 pylint coverage gcovr nose screed

GCOVRURL=git+https://github.com/nschum/gcovr.git@never-executed-branches
VERSION=$(shell git describe --tags --dirty | sed s/v//)
CPPCHECK=ls lib/*.cc khmer/_khmermodule.cc | grep -v test | cppcheck -DNDEBUG \
CPPCHECK=ls lib/*.cc khmer/_khmermodule.cc khmer/async/_khmerasyncmodule.cc | grep -v test | cppcheck -DNDEBUG \
-DVERSION=0.0.cppcheck -UNO_UNIQUE_RC --enable=all \
--file-list=- --platform=unix64 --std=c++03 --inline-suppr \
--quiet -Ilib -Ithird-party/bzip2 -Ithird-party/zlib
--quiet -Ilib -Ilib/async -Ithird-party/bzip2 -Ithird-party/zlib

all: khmer/_khmermodule.so

@@ -43,8 +43,8 @@ dist/khmer-$(VERSION).tar.gz: $(SOURCES)
clean: FORCE
cd lib && ${MAKE} clean || true
cd tests && rm -rf khmertest_* || true
rm -f khmer/_khmermodule.so || true
rm khmer/*.pyc lib/*.pyc || true
rm -f khmer/_khmermodule.so khmer/_khmer_async.so || true
rm khmer/*.pyc khmer/async/*.pyc lib/*.pyc || true
./setup.py clean --all || true
rm coverage-debug || true
rm -Rf .coverage || true
@@ -155,8 +155,7 @@ lib:
$(MAKE)

test:
./setup.py develop
./setup.py nosetests
./setup.py test -s nose.collector

Contributor

?

Member Author

That'll be gone for merge. It's necessary to make nose work with my config (either an Ubuntu 14.04 bug or an Anaconda bug; I haven't bothered to figure out which).


sloccount.sc: ${CPPSOURCES} ${PYSOURCES} $(wildcard tests/*.py) Makefile
sloccount --duplicates --wide --details lib khmer scripts tests \
3 changes: 1 addition & 2 deletions khmer/__init__.py
@@ -34,13 +34,13 @@
# tests/test_read_parsers.py,scripts/{filter-abund-single,load-graph}.py
# scripts/{abundance-dist-single,load-into-counting}.py


from struct import pack, unpack

from ._version import get_versions
__version__ = get_versions()['version']
del get_versions

import khmer.async as async

def new_hashbits(k, starting_size, n_tables=2):
"""Return a new hashbits object. Deprecated.
@@ -241,7 +241,6 @@ def get_n_primes_above_x(number, target):
Additional functionality can be added to these classes as appropriate.
'''


class LabelHash(_LabelHash):

def __new__(cls, k, starting_size, n_tables):
135 changes: 53 additions & 82 deletions khmer/_khmermodule.cc
@@ -12,11 +12,13 @@
// Must be first.
#include <Python.h>

#include "_khmermodule.hh"
#include "hashtable.hh"

#include <iostream>

#include "khmer.hh"
#include "kmer_hash.hh"
#include "hashtable.hh"
#include "hashbits.hh"
#include "counting.hh"
#include "read_aligner.hh"
@@ -116,26 +118,6 @@ _debug_class_attrs( PyTypeObject &tobj )
} // namespace khmer


class _khmer_exception
{
private:
std::string _message;
public:
_khmer_exception(std::string message) : _message(message) { };
inline const std::string get_message() const
{
return _message;
};
};

class _khmer_signal : public _khmer_exception
{
public:
_khmer_signal(std::string message) : _khmer_exception(message) { };
};

typedef pre_partition_info _pre_partition_info;

// default callback obj;
static PyObject *_callback_obj = NULL;

@@ -189,15 +171,7 @@ namespace python
{


static PyTypeObject Read_Type = { PyObject_HEAD_INIT( NULL ) };


typedef struct {
PyObject_HEAD
//! Pointer to the low-level genomic read object.
read_parsers:: Read * read;
} Read_Object;

PyTypeObject Read_Type = { PyObject_HEAD_INIT( NULL ) };

static
void
@@ -304,27 +278,10 @@ _init_Read_Type( )
//


static PyTypeObject ReadParser_Type
PyTypeObject ReadParser_Type
CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF("ReadParser_Object")
= { PyObject_HEAD_INIT( NULL ) };
static PyTypeObject ReadPairIterator_Type = { PyObject_HEAD_INIT( NULL ) };


typedef struct {
PyObject_HEAD
//! Pointer to the low-level parser object.
read_parsers:: IParser * parser;
} ReadParser_Object;


typedef struct {
PyObject_HEAD
//! Pointer to Python parser object for reference counting purposes.
PyObject * parent;
//! Persistent value of pair mode across invocations.
int pair_mode;
} ReadPairIterator_Object;

PyTypeObject ReadPairIterator_Type = { PyObject_HEAD_INIT( NULL ) };

static
void
@@ -484,7 +441,10 @@ _ReadPairIterator_iternext( PyObject * self )
((Read_Object *)read_1_OBJECT)->read = new Read( the_read_pair.first );
PyObject * read_2_OBJECT = Read_Type.tp_alloc( &Read_Type, 1 );
((Read_Object *)read_2_OBJECT)->read = new Read( the_read_pair.second );
return PyTuple_Pack( 2, read_1_OBJECT, read_2_OBJECT );
PyObject * tup = PyTuple_Pack( 2, read_1_OBJECT, read_2_OBJECT );
Py_XDECREF(read_1_OBJECT);
Py_XDECREF(read_2_OBJECT);
return tup;
}
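
Note on the hunk above: `PyTuple_Pack` stores its own reference to each item, so the references returned by `tp_alloc` still belong to the caller and have to be released once the tuple exists. The same ownership rule can be observed from Python. This is a minimal sketch assuming CPython (it relies on `sys.getrefcount`); the `Read` class and `extra_refs_held_by_tuple` are hypothetical stand-ins, not part of khmer.

```python
import sys


class Read(object):
    """Hypothetical stand-in for the Read object wrapped by the extension."""


def extra_refs_held_by_tuple():
    """Return (references added by packing, references left after unpacking).

    Assumes CPython reference counting; other interpreters may differ.
    """
    first, second = Read(), Read()
    before = sys.getrefcount(first)

    # Packing into a tuple makes the tuple hold its own reference,
    # in addition to the one held by the local variable `first`.
    pair = (first, second)
    while_packed = sys.getrefcount(first)

    # Dropping the tuple releases only the tuple's reference.
    del pair
    after_release = sys.getrefcount(first)

    return while_packed - before, after_release - before


if __name__ == "__main__":
    print(extra_refs_held_by_tuple())
```

In the C code the local variable's reference corresponds to the one returned by `tp_alloc`; without the `Py_XDECREF` calls it would never be released, and each `Read` would outlive its tuple.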


@@ -660,25 +620,10 @@ void free_subset_partition_info(void * p)
delete subset_p;
}

typedef struct {
PyObject_HEAD
CountingHash * counting;
} khmer_KCountingHashObject;

typedef struct {
PyObject_HEAD
SubsetPartition * subset;
} khmer_KSubsetPartitionObject;

typedef struct {
PyObject_HEAD
Hashbits * hashbits;
} khmer_KHashbitsObject;

static void khmer_subset_dealloc(PyObject *);
static PyObject * khmer_subset_getattr(PyObject * obj, char * name);

static PyTypeObject khmer_KSubsetPartitionType = {
PyTypeObject khmer_KSubsetPartitionType = {
PyObject_HEAD_INIT(NULL)
0,
"KSubset", sizeof(khmer_KSubsetPartitionObject),
@@ -702,11 +647,6 @@ static PyTypeObject khmer_KSubsetPartitionType = {
"subset object", /* tp_doc */
};

typedef struct {
PyObject_HEAD
ReadAligner * aligner;
} khmer_ReadAlignerObject;

static void khmer_counting_dealloc(PyObject *);

static PyObject * hash_abundance_distribution(PyObject * self,
@@ -716,6 +656,20 @@ static PyObject * hash_abundance_distribution_with_reads_parser(
PyObject * self,
PyObject * args);

static PyObject * hash_init_threadsafe(PyObject * self, PyObject * args)
{
khmer_KCountingHashObject * me = (khmer_KCountingHashObject *) self;
CountingHash * counting = me->counting;

unsigned int block_size;
if(!PyArg_ParseTuple(args, "I", &block_size)) {
return NULL;
}
counting->init_threadstuff(block_size);

Py_RETURN_NONE;
}

static PyObject * hash_set_use_bigcount(PyObject * self, PyObject * args)
{
khmer_KCountingHashObject * me = (khmer_KCountingHashObject *) self;
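
Note on `hash_init_threadsafe`: this diff does not show what `init_threadstuff(block_size)` does. One plausible reading, given the commit "Implement control over locking block size", is that bins are grouped into blocks of `block_size` that share a lock. The sketch below illustrates that idea only. `BlockLockedCounter` and `hammer` are hypothetical names, and this is not khmer's implementation.

```python
import threading


class BlockLockedCounter(object):
    """Toy counting table with one lock per block of bins (assumed design)."""

    def __init__(self, n_bins, block_size):
        if n_bins <= 0 or block_size <= 0:
            raise ValueError("n_bins and block_size must be positive")
        self.block_size = block_size
        self.counts = [0] * n_bins
        # Ceiling division: the last block may cover fewer bins.
        n_locks = (n_bins + block_size - 1) // block_size
        self.locks = [threading.Lock() for _ in range(n_locks)]

    def count(self, bin_index):
        # Only the block containing this bin is locked, so threads that
        # touch different blocks do not wait on each other.
        with self.locks[bin_index // self.block_size]:
            self.counts[bin_index] += 1


def hammer(table, n_threads, n_increments):
    """Increment bins from several threads and return the total count."""

    def worker():
        for i in range(n_increments):
            table.count(i % len(table.counts))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(table.counts)


if __name__ == "__main__":
    table = BlockLockedCounter(n_bins=64, block_size=8)
    print(hammer(table, n_threads=4, n_increments=1000))
```

Under this reading, a smaller block size means more locks and less contention at the cost of memory, while a larger one means the reverse.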
@@ -1413,6 +1367,7 @@ static PyMethodDef khmer_counting_methods[] = {
{ "hashsizes", hash_get_hashsizes, METH_VARARGS, "" },
{ "set_use_bigcount", hash_set_use_bigcount, METH_VARARGS, "" },
{ "get_use_bigcount", hash_get_use_bigcount, METH_VARARGS, "" },
{ "init_threadsafe", hash_init_threadsafe, METH_VARARGS, "" },
{ "n_unique_kmers", hash_n_unique_kmers, METH_VARARGS, "Count the number of unique kmers" },
{ "n_occupied", hash_n_occupied, METH_VARARGS, "Count the number of occupied bins" },
{ "n_entries", hash_n_entries, METH_VARARGS, "" },
@@ -1457,7 +1412,7 @@ khmer_counting_getattr(PyObject * obj, char * name)

#define is_counting_obj(v) ((v)->ob_type == &khmer_KCountingHashType)

static PyTypeObject khmer_KCountingHashType
PyTypeObject khmer_KCountingHashType
CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF("khmer_KCountingHashObject")
= {
PyObject_HEAD_INIT(NULL)
@@ -1565,7 +1520,7 @@ static int khmer_hashbits_init(khmer_KHashbitsObject * self, PyObject * args,
PyObject * kwds);
static PyObject * khmer_hashbits_getattr(PyObject * obj, char * name);

static PyTypeObject khmer_KHashbitsType
PyTypeObject khmer_KHashbitsType
CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF("khmer_KHashbitsObject")
= {
PyObject_HEAD_INIT(NULL)
@@ -1624,7 +1579,6 @@ static PyObject * hash_abundance_distribution_with_reads_parser(

read_parsers:: IParser * rparser = rparser_obj->parser;
Hashbits * hashbits = tracking_obj->hashbits;

HashIntoType * dist = NULL;

const char * exception = NULL;
@@ -3505,12 +3459,6 @@ khmer_subset_getattr(PyObject * obj, char * name)
/////////////////

// LabelHash addition
typedef struct {
//PyObject_HEAD
khmer_KHashbitsObject khashbits;
LabelHash * labelhash;
} khmer_KLabelHashObject;

static void khmer_labelhash_dealloc(PyObject *);
static int khmer_labelhash_init(khmer_KLabelHashObject * self, PyObject *args,
PyObject *kwds);
@@ -3893,7 +3841,7 @@ static PyMethodDef khmer_labelhash_methods[] = {
{NULL, NULL, 0, NULL} /* sentinel */
};

static PyTypeObject khmer_KLabelHashType = {
PyTypeObject khmer_KLabelHashType = {
PyObject_HEAD_INIT(NULL)
0, /* ob_size */
"_LabelHash", /* tp_name */
@@ -3986,7 +3934,7 @@ static void khmer_readaligner_dealloc(PyObject* self)
}


static PyTypeObject khmer_ReadAlignerType = {
PyTypeObject khmer_ReadAlignerType = {
PyObject_HEAD_INIT(NULL)
0,
"ReadAligner", sizeof(khmer_ReadAlignerObject),
@@ -4351,6 +4299,29 @@ init_khmer(void)

Py_INCREF(&khmer_KLabelHashType);
PyModule_AddObject(m, "_LabelHash", (PyObject *)&khmer_KLabelHashType);

if (PyType_Ready(&khmer_AsyncSequenceProcessorType) < 0) {
return;
}

if (PyType_Ready(&khmer_AsyncDiginormType) < 0) {
return;
}

if (PyType_Ready(&khmer_AsyncSequenceProcessorTesterType) < 0) {
return;
}

Py_INCREF(&khmer_AsyncSequenceProcessorType);
PyModule_AddObject(m, "AsyncSequenceProcessor",
(PyObject *)&khmer_AsyncSequenceProcessorType);

Py_INCREF(&khmer_AsyncSequenceProcessorTesterType);
PyModule_AddObject(m, "AsyncSequenceProcessorTester",
(PyObject *)&khmer_AsyncSequenceProcessorTesterType);

Py_INCREF(&khmer_AsyncDiginormType);
PyModule_AddObject(m, "AsyncDiginorm", (PyObject *)&khmer_AsyncDiginormType);
}

// vim: set ft=cpp sts=4 sw=4 tw=79: