This repository is no longer maintained. Please refer to npanuhin/BIOCAD for a continuation of this project.
This repository is not intended to present finished work; it exists to store and transfer data.
✅ - Works as intended
⚠ - There are problems, but a solution is possible
❌ - There are problems that make the solution wrong
- ✅ large01
- ✅ large02
- ✅ large03
- ✅ large04
- ✅ large05
- ✅ large06
- ✅ large07
- ⚠ large08
- ✅ large09
- ⚠ large10
- ✅ large11
- ❌ large12
- ✅ small (BWA⚠)
This repository also includes C++ implementations of various algorithms, such as the Burrows–Wheeler transform, the Knuth–Morris–Pratt algorithm, and k-mer compression.
- BWA indexes two `fasta` sequences
- BWA aligns these two sequences
- samtools converts `sam` file to `bam` file (currently disabled)
- samtools sorts `bam` file (currently disabled)
- sam2pairwise converts `sam` file to pairwise (`txt` file) (currently disabled)
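The steps above can be sketched as a small Python helper that only builds the command lines, so the disabled steps stay optional. This is a minimal sketch and not the repository's own script: the function names, the `include_disabled` flag and the output file names are hypothetical, and the choice of `bwa mem` is an assumption, since this README does not say which BWA algorithm is used.

```python
import subprocess
from pathlib import Path


def build_steps(genome1, genome2, out_dir, include_disabled=False):
    """Return the pipeline as a list of (argv, stdin_path, stdout_path) tuples."""
    out = Path(out_dir)
    # Hypothetical output file names, chosen only for this illustration.
    sam = out / "alignment.sam"
    bam = out / "alignment.bam"
    sorted_bam = out / "alignment.sorted.bam"
    pairwise = out / "alignment.txt"

    steps = [
        # BWA indexes the two fasta sequences.
        (["bwa", "index", str(genome1)], None, None),
        (["bwa", "index", str(genome2)], None, None),
        # BWA aligns the second sequence against the first (assumption: `bwa mem`).
        # BWA writes SAM to stdout, so stdout is redirected to the sam file.
        (["bwa", "mem", str(genome1), str(genome2)], None, sam),
    ]
    if include_disabled:
        steps += [
            # samtools converts the sam file to a bam file.
            (["samtools", "view", "-b", "-o", str(bam), str(sam)], None, None),
            # samtools sorts the bam file.
            (["samtools", "sort", "-o", str(sorted_bam), str(bam)], None, None),
            # sam2pairwise reads SAM on stdin and writes pairwise text to stdout.
            (["sam2pairwise"], sam, pairwise),
        ]
    return steps


def run_steps(steps):
    """Run each step in order, stopping at the first failing command."""
    for argv, stdin_path, stdout_path in steps:
        stdin = open(stdin_path, "rb") if stdin_path else None
        stdout = open(stdout_path, "wb") if stdout_path else None
        try:
            subprocess.run(argv, stdin=stdin, stdout=stdout, check=True)
        finally:
            for handle in (stdin, stdout):
                if handle is not None:
                    handle.close()
```

With the tools installed and the output directory already created, `run_steps(build_steps("large01/large_genome1.fasta", "large01/large_genome2.fasta", "out"))` would run only the steps that are currently enabled.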
For `SAM` and `pairwise` files, word wrap should be disabled.
- BWA: http://bio-bwa.sourceforge.net
- Samtools: https://www.htslib.org
- sam2pairwise: https://github.com/mlafave/sam2pairwise
Or run `sudo apt install bwa samtools`
large01/large_genome1.fasta: Rickettsia rickettsii str. Brazil, complete genome
large01/large_genome2.fasta: Rickettsia rickettsii str. Iowa, complete genome
large02/large_genome1.fasta: Brucella abortus 104M chromosome 1, complete sequence
large02/large_genome2.fasta: Brucella suis bv. 2 strain Bs143CITA chromosome I, complete sequence
large03/large_genome1.fasta: Brucella abortus 104M chromosome 2, complete sequence
large03/large_genome2.fasta: Brucella suis bv. 2 strain Bs143CITA chromosome II, complete sequence
large04/large_genome1.fasta: Brucella pinnipedialis B2/94 chromosome 2, complete sequence
large04/large_genome2.fasta: Brucella melitensis biovar Abortus 2308 chromosome II, complete sequence, strain 2308
large05/large_genome1.fasta: Rickettsia rickettsii str. Iowa, complete sequence
large05/large_genome2.fasta: Rickettsia prowazekii str. Madrid E, complete genome
large06/large_genome1.fasta: Methanococcus maripaludis C5, complete genome
large06/large_genome2.fasta: Methanococcus maripaludis X1, complete genome
large07/large_genome1.fasta: Mycobacterium tuberculosis variant africanum GM041182, complete genome
large07/large_genome2.fasta: Mycobacterium intracellulare ATCC 13950, complete sequence
large08/large_genome1.fasta: Desulfurococcus kamchatkensis 1221n, complete genome
large08/large_genome2.fasta: Desulfurococcus fermentans DSM 16532, complete genome
large09/large_genome1.fasta: Sulfolobus islandicus M.16.27, complete genome
large09/large_genome2.fasta: Sulfolobus islandicus REY15A, complete genome
large10/large_genome1.fasta: Rickettsia canadensis str. CA410, complete genome
large10/large_genome2.fasta: Rickettsia conorii str. Malish 7, complete sequence
large11/large_genome1.fasta: Rickettsia canadensis str. CA410, complete genome
large11/large_genome2.fasta: Rickettsia sibirica 246 chromosome, whole genome shotgun sequence