Position-weighted backtesting engine for quantitative strategies, with a Rust core and Python bindings.
Many strategy teams use position weights as the canonical interface between signal generation and execution simulation. Existing backtesting tools often focus on order-level simulation or are too slow for large, multi-symbol weight datasets.
The goals of wbt are:
- Keep one consistent data contract for weight-based strategies.
- Provide fast and deterministic computation with Rust.
- Expose a Python-first API for research workflows.
- Offer built-in evaluation outputs and plotting-ready data structures.
wbt is designed for:
- Time-series and cross-sectional weight backtests.
- Multi-symbol daily performance attribution.
- Long/short decomposition and segment-level metrics.
- High-throughput computation from pandas, polars, or file inputs.
wbt is not designed for:
- Tick-level order book simulation.
- Exchange matching-engine microstructure.
- Broker-specific execution modeling.
If your strategy logic is naturally represented as target weights over time, wbt is a strong fit.
- Rust crate: repository root
- Python package: python/
wbt/
|-- Cargo.toml
|-- src/
`-- python/
    |-- pyproject.toml
    |-- README.md
    |-- tests/
    `-- wbt/
The Python package is in python/ and keeps the import path as import wbt.
cd python
uv sync --extra dev
uv run maturin develop --release
uv run pytest -v
Then in Python:
import pandas as pd
from wbt import WeightBacktest
df = pd.DataFrame(
    {
        "dt": ["2024-01-02 09:01:00", "2024-01-02 09:02:00", "2024-01-02 09:03:00"],
        "symbol": ["AAPL", "AAPL", "AAPL"],
        "weight": [0.5, 0.0, -0.3],
        "price": [185.0, 186.0, 184.5],
    }
)
wb = WeightBacktest(df, digits=2, fee_rate=0.0002, n_jobs=4, weight_type="ts")
print(wb.stats)
print(wb.long_stats)
print(wb.short_stats)
For the complete Python guide, see python/README.md.
Run tests from the repository root:
cargo test
Use as a dependency:
[dependencies]
wbt = "0.1"
wbt expects four essential columns:
- dt: bar end timestamp
- symbol: instrument identifier
- weight: target position weight at bar end
- price: trade/mark price
Accepted Python inputs:
- pandas.DataFrame
- polars.DataFrame or polars.LazyFrame
- file path (csv, parquet, feather, arrow)
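To make the four-column contract concrete, here is a minimal, standard-library-only sketch that checks rows against the required columns and then computes per-bar net returns for a single symbol. It is not wbt's implementation: check_contract and bar_returns are hypothetical helpers, and the return convention (the weight set at the end of one bar is held over the next bar, with a fee of fee_rate times the absolute weight change) is an assumption made for illustration.

```python
REQUIRED_COLUMNS = ("dt", "symbol", "weight", "price")


def check_contract(rows):
    """Raise ValueError if any row lacks one of the four required columns."""
    for i, row in enumerate(rows):
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            raise ValueError(f"row {i} is missing columns: {missing}")


def bar_returns(rows, fee_rate=0.0002):
    """Per-bar net returns for one symbol, with rows sorted by dt.

    Assumed convention (illustrative, not taken from wbt): the weight set at
    the end of bar t is held over bar t+1, and moving to that weight costs
    fee_rate * |weight change|.
    """
    check_contract(rows)
    results = []
    held_before = 0.0  # the position is flat before the first bar
    for prev, cur in zip(rows, rows[1:]):
        price_return = cur["price"] / prev["price"] - 1.0
        gross = prev["weight"] * price_return
        fee = fee_rate * abs(prev["weight"] - held_before)
        results.append((cur["dt"], gross - fee))
        held_before = prev["weight"]
    return results


# The same three bars as the pandas example earlier, as plain dicts.
rows = [
    {"dt": "2024-01-02 09:01:00", "symbol": "AAPL", "weight": 0.5, "price": 185.0},
    {"dt": "2024-01-02 09:02:00", "symbol": "AAPL", "weight": 0.0, "price": 186.0},
    {"dt": "2024-01-02 09:03:00", "symbol": "AAPL", "weight": -0.3, "price": 184.5},
]
for dt, net in bar_returns(rows):
    print(dt, round(net, 6))
```

In this sketch the last weight (-0.3) earns nothing because there is no later bar to hold it over. A real engine also has to handle multiple symbols, rounding (digits), and aggregation to daily series.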
A WeightBacktest instance (wb) exposes:
- wb.stats: full long-short evaluation summary.
- wb.long_stats and wb.short_stats: directional breakdown.
- wb.daily_return and wb.dailys: daily series for analytics.
- wb.alpha and wb.alpha_stats: strategy-vs-benchmark excess analysis.
- wb.pairs: trade-pair table for per-trade evaluation.
- wb.aggregated_pairs / wb.key_trades(top=3): open-close records deduplicated by (symbol, open time, close time), and the top-N best/worst trades per year (computed in Rust).
- wb.to_result(target_vol=0.20) → BacktestResult: the standard input object for plotting and the strategy-review page (see "Plotting" below).
- wb.segment_stats(...): metrics for arbitrary date windows.
- wb.long_alpha_stats: volatility-scaled long-side alpha metrics.
- wb.is_good_strategy(mode="history" | "recent", ...): objective verdict on whether a strategy is worth pursuing. Returns a dict with is_good (bool), reason, alpha_degenerate (bool), a per-year breakdown (history mode) or recent-window metrics (recent mode), and condition flags. Adjustable parameters: target_vol, max_dd_threshold, min_year_days, recent_days, min_history_days. In recent mode, the historical max drawdown is computed on the segment that excludes the recent window (with a configurable min_history_days floor), so the two never overlap by construction. Degenerate alpha (NaN/Inf or zero variance in long/bench) is reported via alpha_degenerate=True, with all alpha-derived fields set to None and is_good=False, so a spurious "zero drawdown" can never pass as a good result. Returned dict keys are in stable alphabetical order; history and recent modes return disjoint key sets (dispatch on mode).
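The non-overlap guarantee in recent mode can be pictured with a small sketch. The code below is an illustration under stated assumptions, not wbt's implementation: split_history_recent and max_drawdown are hypothetical helpers, the default values for recent_days and min_history_days are placeholders, the floor is enforced here by raising an error, and drawdown is measured on a simple (non-compounded) cumulative-return curve.

```python
def max_drawdown(returns):
    """Largest peak-to-trough drop of the cumulative sum of returns."""
    cumulative = 0.0
    peak = 0.0
    worst = 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        worst = max(worst, peak - cumulative)
    return worst


def split_history_recent(daily_returns, recent_days=60, min_history_days=120):
    """Split a daily return series into disjoint history and recent segments.

    Assumption: too little history is treated as an error here; the real
    library may handle that case differently.
    """
    if recent_days <= 0:
        raise ValueError("recent_days must be positive")
    history_days = len(daily_returns) - recent_days
    if history_days < min_history_days:
        raise ValueError(
            f"need at least {min_history_days} history days, got {history_days}"
        )
    # Slicing at one index guarantees the two segments share no day.
    return daily_returns[:history_days], daily_returns[history_days:]


daily_returns = [0.01, -0.02, 0.01, 0.03, -0.05, 0.02]
history, recent = split_history_recent(
    daily_returns, recent_days=2, min_history_days=3
)
history_max_dd = max_drawdown(history)  # uses only the first four days
recent_max_dd = max_drawdown(recent)    # uses only the last two days
```

Because both segments come from a single split index, a large loss inside the recent window cannot leak into the historical drawdown figure.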
Beyond the WeightBacktest class, wbt exposes several stand-alone helpers at the top level:
- daily_performance(returns, yearly_days=252): full performance metrics on a daily return series (Rust core).
- top_drawdowns(returns, top=10): top-N drawdown windows (Rust core).
- rolling_daily_performance(df, ret_col, window=252, min_periods=100, yearly_days=None): rolling-window daily performance (Rust core).
- cal_yearly_days(dts): infer yearly trading-day count from a date series (Rust core).
- weights_simple_ensemble(df, weight_cols, method="mean", only_long=False, **kwargs): ensemble multiple strategy weights (mean/vote/sum_clip). Returns a new DataFrame (input df is not mutated). sum_clip mode additionally accepts clip_min=-1, clip_max=1 via kwargs.
- cal_trade_price(df, digits=None, **kwargs): TWAP / VWAP and next-bar trade-price table grouped by symbol. Accepts windows=(5, 10, 15, 20, 30, 60) and copy=True via kwargs.
- log_strategy_info(strategy, df): pretty-print per-symbol weight summaries via loguru.
- mock_symbol_kline(...) / mock_weights(...): generators for quick experiments.
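As a rough illustration of the three ensemble methods that weights_simple_ensemble names (mean, vote, sum_clip), the sketch below combines one row of per-strategy weights in plain Python. ensemble_row is a hypothetical helper, and the exact semantics (in particular, vote taken as the sign of the summed weights) are an assumption; check the wbt source or the czsc reference before relying on them.

```python
def ensemble_row(weights, method="mean", only_long=False,
                 clip_min=-1.0, clip_max=1.0):
    """Combine one row of per-strategy weights into a single weight."""
    if not weights:
        raise ValueError("weights must not be empty")
    total = sum(weights)
    if method == "mean":
        combined = total / len(weights)
    elif method == "vote":
        # Assumed rule: sign of the summed weights (+1, 0, or -1).
        combined = float((total > 0) - (total < 0))
    elif method == "sum_clip":
        combined = max(clip_min, min(clip_max, total))
    else:
        raise ValueError(f"unknown method: {method!r}")
    # only_long drops any net short exposure.
    if only_long and combined < 0:
        combined = 0.0
    return combined


row = [0.5, -0.2, 0.3]  # weights from three hypothetical strategies
mean_weight = ensemble_row(row, method="mean")
vote_weight = ensemble_row(row, method="vote")
clipped_weight = ensemble_row([0.8, 0.7], method="sum_clip")
long_only_weight = ensemble_row([-0.5, 0.1], method="mean", only_long=True)
```

The function returns a new value and leaves its input list untouched, mirroring the documented behavior that the input df is not mutated.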
The Rust-backed helpers emit warnings (e.g. short-span fallback in cal_yearly_days) via the log crate; pyo3-log bridges them into Python's standard logging module, so any loguru InterceptHandler setup will receive them transparently.
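Because these warnings arrive through Python's standard logging module, a plain logging handler is enough to see them. The sketch below uses only the standard library. The logger name "wbt" is an assumption (pyo3-log derives logger names from the Rust module path), and the warning is emitted by a stand-in call rather than by wbt itself, so the message text is made up.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
wbt_logger = logging.getLogger("wbt")  # assumed logger name
wbt_logger.setLevel(logging.WARNING)
wbt_logger.addHandler(handler)

# Stand-in for a warning forwarded from the Rust side; a child logger
# propagates its records to the handler attached to "wbt".
logging.getLogger("wbt.core").warning("stand-in: short date span, using fallback")

captured = [(r.name, r.levelname, r.getMessage()) for r in handler.records]
```

The same attachment point works for a loguru InterceptHandler: it is an ordinary logging.Handler subclass, so it can be added to the root logger or to the library's logger in the same way.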
wbt.generate_backtest_report(df, output_path) produces a self-contained HTML report (overview, long/short comparison, key-trades tabs). Internally it runs a single wb.to_result() pre-processing pass, then delegates to wbt.plotting.
Every plotting function takes a single BacktestResult as its standard input — all data is precomputed once, so the plotting layer performs zero data transformation:
result = wb.to_result() # standard input object
from wbt.plotting import plot_cumulative_returns, plot_key_trades
fig = plot_cumulative_returns(result, keys=["多空", "多头", "空头"])
plot_key_trades(result, to_html=True)
result.to_dict(full=True)  # JSON-safe, for serving the review page over HTTP
- BacktestResult fields: dates / year_starts / curves (raw curves keyed 多空/多头/空头/基准/超额, i.e. long-short/long/short/benchmark/excess) / curves_voladj (volatility-normalized, lazy) / return_dist / monthly / symbol_returns / pairs_dist / stats / stats_by_side, plus review fields drawdowns / key_trades / verdict (all lazy cached_property).
- wbt.plotting (all single-purpose figures, no subplots): plot_cumulative_returns (voladj=True for vol-normalized), plot_drawdown, plot_daily_return_dist, plot_monthly_heatmap, plot_symbol_returns, plot_yearly_returns, plot_rolling_metrics, plot_pairs_pnl_dist, plot_pairs_hold_dist, plot_colored_table, plot_stats_comparison, plot_segment_comparison, plot_key_trades, plot_drawdowns_table, plot_verdict.
- wbt.report: generate_backtest_report, HtmlReportBuilder, get_performance_metrics_cards.
- Rust checks run from repository root.
- Python checks run from python/.
- CI validates both layers.
Typical local quality checks:
# repository root
cargo test
# python subproject
cd python
uv run pytest -v
uv run ruff format --check .
uv run ruff check . --no-fix
uv run basedpyright
- English Python guide: python/README.md
- Chinese Python guide: python/README_CN.md
- Design notes: docs/desgin.md
wbt sits in a small ecosystem of quantitative-research tools. The most closely related projects:
- czsc: A comprehensive Python framework for Chan Theory (缠论) quantitative trading: signals, strategies, traders, EDA, and plotting. Since v1.0.x its core algorithms are implemented in Rust and exposed via PyO3 (czsc._native). Relation to wbt: wbt migrated 5 evaluation/utility functions from czsc (cal_yearly_days, rolling_daily_performance, weights_simple_ensemble, cal_trade_price, log_strategy_info) and keeps numerical results aligned with the czsc reference (see python/tests/test_compare_with_czsc_script.py). czsc strategies naturally emit the weight tables that wbt consumes.
- wmr: A strategy weight management system backed by ClickHouse and DuckDB, focused on persisting, versioning, and querying per-strategy position weights at scale. Relation to wbt: wmr is the data layer for weight tables (storage / retrieval); wbt is the compute layer that turns those tables into backtest metrics, daily series, and HTML reports.
- talib-rs: A pure-Rust technical-analysis library, designed as a drop-in replacement for the classic C TA-Lib (bit-exact results, SIMD-accelerated, no C dependency). Relation to wbt: a peer project on the Rust side. wbt focuses on weight-driven backtesting and performance metrics, while talib-rs covers canonical TA indicators. The two compose well when a strategy needs both indicator computation and weight-based backtesting inside the same Rust/Python pipeline.
Together they sketch a typical research-to-evaluation pipeline: czsc (signals & strategies) → wmr (weight storage) → wbt (backtest & metrics), with talib-rs providing reusable Rust-native indicator computation along the way.