1 unstable release

Uses new Rust 2024

0.1.1	Apr 8, 2026

#850 in Algorithms

Used in ipf_survey

MIT license

98KB
2K SLoC

ipf

A Rust implementation of the Iterative Proportional Fitting (IPF) algorithm, also known as raking or biproportional fitting. IPF adjusts the cells of a contingency table so that its marginal sums match a set of known target distributions, while preserving the internal structure of the original data as closely as possible.

Generic over dimensionality (2-D and N-D) and numeric type (any num_traits::Float).

When to use this

IPF is the standard tool for problems of the form: "I have a matrix of counts (or weights), and I need its row totals and column totals to equal specific values." Common applications include:

Survey weighting -- adjusting sample cell counts to match known population margins from a census or administrative source, then deriving per-record weights via cell_weights.
Origin-destination matrix calibration -- scaling a trip table so that row and column sums match observed production and attraction totals.
Contingency table standardization -- fitting observed cross-tabulations to expected marginals for statistical analysis.

Quick start

use ipf::{ipf_2d, DenseMatrix, IpfMatrix, cell_weights};

// 2x3 seed matrix (e.g. sample counts by age x region)
let seed = DenseMatrix::from_shape_vec(
    vec![2, 3],
    vec![10.0, 20.0, 30.0,
         40.0, 50.0, 60.0],
).unwrap();

let mut matrix = seed.clone();

// Target marginals from an external source
let row_targets = [120.0, 300.0]; // axis 1: desired row sums
let col_targets = [100.0, 140.0, 180.0]; // axis 0: desired column sums

let report = ipf_2d(&mut matrix, &row_targets, &col_targets).unwrap();

println!("Converged in {} iterations", report.iterations);

// Derive per-cell scaling factors
let weights = cell_weights(&seed, &matrix).unwrap();

API overview

The library exposes two layers:

Convenience functions

ipf_2d -- fit a DenseMatrix to row and column targets in one call.
ipf_2d_flat -- same, but accepts and returns flat &[T] / Vec<T> slices.

Full solver API

For N-dimensional problems or fine-grained control over convergence:

use ipf::{
    DenseMatrix, IpfSolver, ConstraintSet, ConstraintKind,
    ConvergenceConfig, ConvergenceCriterion,
};

let mut matrix = DenseMatrix::from_fn(vec![4, 3, 2], |idx| {
    (idx[0] + idx[1] + idx[2]) as f64 + 1.0
});

// Complementary constraints (one target per cell in the complementary sub-array)
let constraints = ConstraintSet::new()
    .add(0, axis_0_targets)?
    .add(1, axis_1_targets)?
    .add(2, axis_2_targets)?
    .build()?;

For 1-D (survey-raking) constraints where each target vector has one entry per level of the axis (i.e. targets.len() == shape[axis]), use add_1d:

// 1-D constraints (classical raking: one target per level of each axis)
let constraints = ConstraintSet::new()
    .add_1d(0, age_targets)?    // shape[0] targets
    .add_1d(1, region_targets)? // shape[1] targets
    .build()?;

let config = ConvergenceConfig {
    max_iterations: 500,
    tolerance: 1e-8,
    criterion: ConvergenceCriterion::MaxAbsDiff,
};

let report = IpfSolver::with_config(config).fit(&mut matrix, &constraints)?;

Post-processing

cell_weights(original, fitted) -- element-wise ratio fitted / original, suitable for assigning a weight to each record in the original data. Handles structural zeros (both zero -> weight 0) and detects infeasible cases (original zero, fitted nonzero).

Matrix types

DenseMatrix<T> -- row-major flat Vec<T> storage with constructors from_shape_vec, zeros, and from_fn.
IpfMatrix trait -- implement this for custom storage backends (e.g. ndarray, nalgebra, sparse formats).

Axis convention

Margins follow NumPy semantics -- "sum along axis" collapses that axis:

Constraint	Axis	Collapses	Result shape (2-D)
Column sums	0	rows	`[ncols]`
Row sums	1	columns	`[nrows]`

Convergence criteria

MaxAbsDiff -- largest absolute difference between observed and target margins.
SumSquaredDiff -- sum of squared differences between observed and target margins.
MaxRelativeChange -- largest relative cell change between iterations (requires snapshotting).

Numerical details

Structural zeros: a zero cell in the seed stays zero throughout fitting. A zero observed margin paired with a nonzero target is detected and reported as IpfError::StructuralZero on the first iteration.
Denormal protection: a configurable min_margin threshold (default: epsilon * 1e3) prevents near-zero margins from producing extreme scaling factors.
Zero-allocation hot loop: all scratch buffers are pre-allocated before iteration begins.

License

MIT

Dependencies

~0–1MB
~19K SLoC