JanusX

Project Overview

JanusX is a high-performance, ALL-in-ONE suite for quantitative genetics that unifies genome-wide association studies (GWAS) and genomic selection (GS). It incorporates well-established GWAS methods (MLM, GLM and FarmCPU) and a flexible GS toolkit including GBLUP and various machine learning models.It also combines routine genomic analyses, from data processing to publication-ready visualisation.

It provides significant performance improvements over tools like GEMMA, GCTA, and rMVP, especially in multi-threaded computation.

Development Setup

Installation

git clone https://github.com/MaizeMan-JxFU/JanusX.git
cd JanusX
sh ./install.sh

The install script uses uv for dependency management and creates a virtual environment in .venv/.

Pre-compiled Releases

For convenience, we also provide pre-compiled binaries that don't require building from source. The releases are available at Releases v1.0.0 for the following platforms:

Linux AMD64: JanusX-v1.0.0-linux-AMD64.tgz
Windows AMD64: JanusX-v1.0.0-windows-AMD64.tgz

Simply download and extract the archive, then run the executable directly.

Note: Windows installation is no longer supported. Please use Linux/macOS or Windows Subsystem for Linux (WSL). But there is a pre-build version for Windows.

Running the CLI

./jx -h
./jx <module> [options]

Note that running ./jx -h might take a while at first! This is because the Python interpreter is compiling source code into the pycache directory. Subsequent runs will use the pre-compiled code and load much faster!

Available Modules

gwas - Mixed linear model GWAS analysis
postGWAS - Visualization and annotation
transanno - Annotation genome version migration
gformat - Genotype format conversion
grm - Genetic relationship matrix calculation
pca - Principal component analysis

Example Commands

# GWAS with VCF input
jx gwas --vcf example/mouse_hs1940.vcf.gz --pheno example/mouse_hs1940.pheno --out test

# GWAS with PLINK format
jx gwas --bfile genotypes --pheno phenotypes.txt --out results --grm 1 --qcov 3 --thread 8

# Using kinship matrix and fast mode
jx gwas --vcf genotypes.vcf --pheno phenotypes.txt --out results --grm kinship_matrix.txt --qcov 10 --lm

# Visualize GWAS results
jx postGWAS --files test/*.mlm.tsv --threshold 1e-6 --thread 4

# Genomic selection with PLINK format
jx gs --bfile genotypes --pheno phenotypes.txt --out results --GBLUP --rrBLUP --RF

Test data in example is from genetics-statistics/GEMMA, published in Parker et al, Nature Genetics, 2016

Architecture

Core Libraries (src/)

pyBLUP - Core statistical engine
- gwas.py - GWAS class implementing mixed linear model with REML optimization
- QK.py - Q matrix (population structure) and K matrix (kinship) calculation with memory-optimized chunking (deprecated)
- QK2.py - Alternative QK implementation
- QC.py - Quality control functions (MAF, missing rate filters)
- mlm.py - Mixed linear model utilities
- pca.py - PCA computation (includes randomized SVD for large datasets)
- kfold.py - Cross-validation utilities
gfreader - Genotype file I/O
- base.py - Readers for VCF, PLINK binary (.bed/.bim/.fam), HapMap, and numpy formats
- Supports genotype conversion between formats
bioplotkit - Visualization
- manhanden.py - Manhattan and QQ plots
- LDBlock.py - LD block visualization
- gffplot.py - GFF/annotation plotting
- pcshow.py - PCA visualization (uses Plotly)

CLI Entry Points (module/)

Each module corresponds to a CLI command. The launcher script (jx.bat/jx) dispatches to module/<name>.py.

Key Algorithms

Mixed Linear Model: Uses eigen decomposition of the kinship matrix to simplify variance computation, with Brent's method for REML parameter optimization. Lambda (variance ratio) is the single parameter being optimized.

Kinship Methods: VanRanden (Centralization, default), Yang (Standardization).

PCA: Matrix block partitioning for computation.

File Formats

Phenotype file: Tab-delimited, first column is sample ID, subsequent columns is phenotype

samples	pheno_name
indv1	value1
indv2	value2

Supported genotype formats: VCF (.vcf, .vcf.gz), PLINK binary (.bed/.bim/.fam), numpy archives (.npz/.snp/.idv)

Python Version

Requires Python 3.8+

Test Data

Example data in example/ directory from Parker et al, Nature Genetics, 2016 (via GEMMA project)

Name		Name	Last commit message	Last commit date
Latest commit History 194 Commits
doc		doc
example		example
fig		fig
rust		rust
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

JanusX

Project Overview

Development Setup

Installation

Pre-compiled Releases

Running the CLI

Available Modules

Example Commands

Architecture

Core Libraries (src/)

CLI Entry Points (module/)

Key Algorithms

File Formats

Python Version

Test Data

About

Uh oh!

Releases 3

Packages

Languages

License

MaizeMan-JxFU/JanusX

Folders and files

Latest commit

History

Repository files navigation

JanusX

Project Overview

Development Setup

Installation

Pre-compiled Releases

Running the CLI

Available Modules

Example Commands

Architecture

Core Libraries (src/)

CLI Entry Points (module/)

Key Algorithms

File Formats

Python Version

Test Data

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Languages

Packages