biowasm

cdn-stg.biowasm.com

A repository of genomics tools, compiled from C/C++ to WebAssembly so they can run in a web browser.

Getting started

Check out our Getting Started guide.

Supported tools

C/C++ tools that have been compiled to WebAssembly:

| Tool | Version | Description |
| --- | --- | --- |
| samtools | 1.10 | Parse and manipulate .sam / .bam read alignment files |
| bcftools | 1.10 | Parse and manipulate .vcf / .bcf variant calling files |
| bedtools | 2.29 | Parse .bed files and perform complex "genome arithmetic" |
| bowtie2 | 2.4.2 | Align sequencing reads (.fastq files) to a reference genome |
| fastp | 0.20.1 | Manipulate and evaluate QC of .fastq files |
| seqtk | 1.3 | Manipulate and evaluate QC of .fasta / .fastq files |
| ssw | 1.2.4 | A SIMD implementation of the Smith-Waterman algorithm |
| wgsim | 2011.10.17 | Simulate short reads from a reference genome |
| seq-align | 2017.10.18 | Align sequences using Smith-Waterman/Needleman-Wunsch algorithms |
| bhtsne | 2016.08.22 | Run the t-SNE dimensionality-reduction algorithm |

How it works

| Tool | Description | Link |
| --- | --- | --- |
| biowasm | Recipes for compiling C/C++ genomics tools to WebAssembly | This repo |
| biowasm CDN | Free server hosting pre-compiled tools for use in your apps | cdn.biowasm.com |
| Aioli | Tool for running these modules in a browser, inside WebWorkers | biowasm/aioli |

Tools using biowasm

| Tool | URL | Repo |
| --- | --- | --- |
| Ribbon | genomeribbon.com | MariaNattestad/Ribbon |
| Alignment Sandbox | alignment.sandbox.bio | RobertAboukhalil/alignment-sandbox |
| tSNE Sandbox | tsne.sandbox.bio | RobertAboukhalil/tsne-sandbox |
| fastq.bio | fastq.bio | RobertAboukhalil/fastq.bio |
| bam.bio | bam.bio | RobertAboukhalil/bam.bio |

Contributing

Ignore the rest of this README if you are not contributing changes to the biowasm repo.

Setup

Tools listed in biowasm were compiled to WebAssembly using Emscripten 2.0.25.

# Fetch Emscripten docker image
docker pull emscripten/emsdk:2.0.25

# Create the container and mount ~/wasm to /src in the container
docker run \
    -it -d \
    -p 80:80 \
    --name wasm \
    --volume ~/wasm:/src \
    emscripten/emsdk:2.0.25

# Go into the container
docker exec -u root -it wasm bash
# While inside the container, install dependencies
apt-get update
apt-get install -y autoconf liblzma-dev less vim
# Create small web server for testing
cat << EOF > server.py
import http.server
import socketserver

handler = http.server.SimpleHTTPRequestHandler
handler.extensions_map['.wasm'] = 'application/wasm'
httpd = socketserver.TCPServer(('', 80), handler)
httpd.serve_forever()
EOF
chmod +x server.py
# Launch the web server
python3.7 /src/server.py &
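
The test server above registers `.wasm` as `application/wasm`, the MIME type browsers expect when compiling WebAssembly from a streamed response. The following is a minimal sketch for checking that mapping locally. It is not part of biowasm: the function name `wasm_content_type`, the `dummy.wasm` file, and the use of a temporary directory with an OS-assigned port are all assumptions made for illustration.

```python
import functools
import http.server
import pathlib
import socketserver
import tempfile
import threading
import urllib.request


def wasm_content_type():
    """Serve a placeholder .wasm file and return the Content-Type reported for it."""
    with tempfile.TemporaryDirectory() as root:
        # Placeholder file: only the 8-byte WebAssembly header (magic + version 1)
        pathlib.Path(root, "dummy.wasm").write_bytes(b"\x00asm\x01\x00\x00\x00")

        class Handler(http.server.SimpleHTTPRequestHandler):
            # Same mapping as in server.py, applied to a subclass-local copy
            extensions_map = {
                **http.server.SimpleHTTPRequestHandler.extensions_map,
                ".wasm": "application/wasm",
            }

            def log_message(self, *args):
                # Silence per-request logging for this check
                pass

        handler = functools.partial(Handler, directory=root)
        # Port 0 asks the OS for any free port (hypothetical choice, instead of port 80)
        with socketserver.TCPServer(("127.0.0.1", 0), handler) as httpd:
            port = httpd.server_address[1]
            thread = threading.Thread(target=httpd.serve_forever, daemon=True)
            thread.start()
            try:
                url = f"http://127.0.0.1:{port}/dummy.wasm"
                with urllib.request.urlopen(url) as response:
                    return response.headers.get_content_type()
            finally:
                httpd.shutdown()


if __name__ == "__main__":
    print(wasm_content_type())
```

Running the script should print the content type the handler assigns to `.wasm` files; anything other than `application/wasm` suggests the mapping was not applied.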

Compile a tool

# Go into your container
docker exec -it wasm bash

# Set up biowasm (only need to do this once)
cd biowasm/
make init

# Compile seqtk
VERSION=1.2 BRANCH=v1.2 make seqtk

# This will create tools/<tool name>/build with .js/.wasm files
ls tools/seqtk/build
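
As a quick sanity check on the build output, the sketch below lists the `.js` and `.wasm` files in a build folder and confirms that each `.wasm` file starts with the WebAssembly magic bytes (`\0asm`). It is only an illustration: `check_build` is a hypothetical helper, not something the biowasm Makefile provides, and the default path simply reuses the seqtk example above.

```python
import pathlib
import sys

WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary


def check_build(build_dir):
    """Return (js_files, wasm_files) found in build_dir, validating each .wasm header."""
    build = pathlib.Path(build_dir)
    js_files = sorted(p.name for p in build.glob("*.js"))
    wasm_files = sorted(p.name for p in build.glob("*.wasm"))
    for name in wasm_files:
        with open(build / name, "rb") as fh:
            if fh.read(4) != WASM_MAGIC:
                raise ValueError(f"{name} does not start with the WebAssembly magic bytes")
    return js_files, wasm_files


if __name__ == "__main__":
    # Default path assumes the seqtk example; pass another build folder as an argument
    target = sys.argv[1] if len(sys.argv) > 1 else "tools/seqtk/build"
    js, wasm = check_build(target)
    print("js:", js)
    print("wasm:", wasm)
```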

Add a new tool

First, add the tool as a git submodule:

# Fetch codebase
mkdir -p tools/seqtk
git submodule add https://github.com/lh3/seqtk.git tools/seqtk/src

# Get specific version of the tool
cd tools/seqtk/src
git checkout v1.3
cd -

# Stage changes for git
git add tools/seqtk/src .gitmodules

You should also create the following files:

tools/<tool>/
    README.md        Details about the tool and dependencies
    compile.sh       Script that will run to compile the tool to WebAssembly (can use `$EM_FLAGS` for common flags)
    patches/    
        <tag>        Patch applied to the code to compile it to WebAssembly; branch- or tag-specific (optional)
    configs/
        <tag>.json   Configuration file with info about which WebAssembly features are needed (see ssw for an example); branch- or tag-specific (optional)
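
The layout above can be scaffolded with a short script. This is a sketch under stated assumptions, not a biowasm utility: `scaffold_tool` is a hypothetical name, the file contents are placeholders to be replaced by hand, and the optional `<tag>` patch and `<tag>.json` config files are deliberately left out because their contents depend on the tool.

```python
import pathlib


def scaffold_tool(repo_root, tool):
    """Create the tools/<tool>/ skeleton and return the entries that now exist."""
    tool_dir = pathlib.Path(repo_root) / "tools" / tool
    # Optional folders for tag-specific patches and WebAssembly feature configs
    (tool_dir / "patches").mkdir(parents=True, exist_ok=True)
    (tool_dir / "configs").mkdir(exist_ok=True)

    # Placeholder contents (assumptions); fill these in for the actual tool
    placeholders = {
        "README.md": f"# {tool}\n\nTODO: describe the tool and its dependencies.\n",
        "compile.sh": "#!/bin/bash\n# TODO: build commands; $EM_FLAGS holds the common flags\n",
    }
    for name, text in placeholders.items():
        path = tool_dir / name
        if not path.exists():  # never overwrite files that are already there
            path.write_text(text)

    return sorted(
        p.relative_to(repo_root).as_posix() + ("/" if p.is_dir() else "")
        for p in tool_dir.iterdir()
    )


if __name__ == "__main__":
    # Run from the repository root; "mytool" is a hypothetical tool name
    for entry in scaffold_tool(".", "mytool"):
        print(entry)
```

Existing files are left untouched, so the script can be re-run safely after the submodule has been added under `tools/<tool>/src`.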

Finally, you can edit:

  • config/tools.json to make sure the new tool gets deployed
  • cloudflare/cdn/public/index.html to list the new tool
  • cloudflare/web/public/index.html so the tool shows up on the home page (optional)

Deploy changes

  • Merged changes are automatically deployed via GitHub Actions to cdn-stg.biowasm.com/v2.

To do

  • Deploy one tool without re-compiling all others: download data from the CDN onto the GitHub Actions VM first?
  • Run each tool's tests: use Selenium? Node.js can't be used when a tool has .data files
  • Generate an HTML file for each tool: CLI for testing, predefined queries, etc.
  • Support for Rust bioinformatics tools such as sourmash and rust-bio
