mvstat

Multivariate normal CDF, correlation matrix repair, and Gaussian copulas for Go.

Why this exists

Go's scientific computing stack (gonum) has solid univariate distributions and multivariate normal PDF/sampling, but no multivariate normal CDF. It's been an open issue since 2019.

If you're doing quant finance, risk modeling, or ML in Go, you've probably hit this and had to pick from three bad options:

Shell out to Python (150ms+ subprocess overhead per call)
Use cgo bindings to C/Fortran (build headaches, cross-compilation pain)
Just assume independence (mathematically wrong)

mvstat is a fourth option: pure Go, matches scipy's accuracy, nanosecond-scale latency, zero cgo.

Quick Start

go get github.com/tgcohce/mvstat

// Bivariate CDF: P(X1 <= 1, X2 <= 1) with correlation 0.5
p := mvstat.BvnCDF(1.0, 1.0, 0.5) // 0.7452...

// 5-dimensional CDF
mu := []float64{0, 0, 0, 0, 0}
upper := []float64{1, 1, 1, 1, 1}
sigma := mat.NewSymDense(5, covData)
result, err := mvstat.MvnCDF(mu, upper, sigma, nil)

What you can do with it

Price multi-asset derivatives

// Digital basket option: P(all 3 assets above strike at expiry)
// Under risk-neutral GBM, this is just a multivariate normal CDF
result, _ := mvstat.MvnCDFStd(thresholds, corrMatrix, nil)
digitalPrice := discount * result.Prob

Monte Carlo validation over 1M paths confirms agreement within 1 standard error. Full example

Compute portfolio tail risk

// P(all 5 assets lose more than 2% in one day)
result, _ := mvstat.MvnCDF(dailyMu, lossThresholds, covMatrix, opts)
// Result: 4.9e-3 (correlated) vs 1.7e-5 (independent), about 290x higher

Correlation increases joint tail risk by two orders of magnitude. Full example

Clean broken correlation matrices

// Matrix from market data with gaps, not PSD, Cholesky fails
fixed, iters, _ := mvstat.NearestCorr(nil, broken, nil)
// Now PSD with unit diagonal, minimal change from original

Finds the nearest valid correlation matrix in the Frobenius norm (Higham 2002). Full example

Model dependent defaults

// Gaussian copula: joint default probability of 3 bonds
// P(all default) with correlation is 20x the independence assumption
thresholds := []float64{phiInv(p1), phiInv(p2), phiInv(p3)}
jointDefault, _ := mvstat.TvnCDF(thresholds[0], thresholds[1], thresholds[2], R)

Full example

Performance

AMD Ryzen 7 7735HS, Go 1.24, Windows 11:

Operation	mvstat	Python subprocess	Speedup
BvnCDF	640 ns	~194 ms	303,000x
BvnCDF x 1000 (batch)	651 us	~175 ms	269x
MvnCDF d=5	2.3 ms	~181 ms	78x
NearestCorr 50x50	16 ms	~207 ms	13x

The single-call speedup is mostly Python startup overhead. The batch comparison is more fair: mvstat computes 1000 bivariate CDFs in 651us while Python takes 175ms for the same work.

Internal benchmarks:

Benchmark	Time/op	Allocs/op
BvnCDF	640 ns	0
TvnCDF	41 us	0
MvnCDF d=5 (default)	2.3 ms	25
MvnCDF d=10 (default)	5.1 ms	26
NearestPSD 50x50	360 us	15
NearestCorr 50x50	16 ms	534
GaussianCopulaPDF d=5	3.2 us	10

Accuracy

Validated against scipy multivariate_normal.cdf across 2,000+ test cases:

Function	Max error vs scipy	Method	Notes
BvnCDF	7.4e-16	vs scipy (1,575 cases)	Machine precision
TvnCDF	~2e-7	vs QMC reference	More accurate than scipy (details below)
MvnCDF d=4-8	5.1e-6	vs scipy (MaxEval=100k)	Improves with MaxEval

TvnCDF cross-validation. The largest disagreement with scipy (2.4e-6) was independently verified by running a 1M-evaluation QMC integrator. TvnCDF matched the QMC reference to within ~2e-7 in all cases. TvnCDF was closer in 5/5 cases, scipy in 0/5. The residual is scipy's error, not ours.

The Abserr field tracks actual error to within one order of magnitude in 99% of cases (validated across 1,000 test cases at multiple MaxEval settings).

Convergence

For d >= 4, accuracy improves with MaxEval. Typical convergence:

MaxEval=1,000   -> ~1e-3 accuracy
MaxEval=10,000  -> ~1e-4 accuracy
MaxEval=100,000 -> ~1e-6 accuracy

Set opts.Src to a rand.Source for randomized QMC shifts (tighter error estimates). Without Src, results are deterministic but Abserr is zero since there are no shifts to estimate from.

API

CDF functions

Function	What it does	Algorithm
`BvnCDF(x1, x2, rho)`	Bivariate normal CDF	Drezner-Wesolowsky, 20-pt Gauss-Legendre
`TvnCDF(x1, x2, x3, R)`	Trivariate normal CDF	Genz (2004), adaptive quadrature
`MvnCDF(mu, upper, sigma, opts)`	General d-dimensional CDF	Genz-Bretz quasi-Monte Carlo
`MvnCDFStd(upper, R, opts)`	Standardized CDF (R = correlation)	Same, skips standardization
`CDF(upper, mu, sigma, opts)`	Convenience wrapper	Dispatches by dimension

Rectangular regions

Function	What it does
`BvnCDFRect(lo1, hi1, lo2, hi2, rho)`	P(lo <= X <= hi) for d=2
`TvnCDFRect(lo, hi, R)`	P(lo <= X <= hi) for d=3
`MvnCDFRect(mu, lo, hi, sigma, opts)`	P(lo <= X <= hi) for general d
`MvnCDFRectStd(lo, hi, R, opts)`	Standardized version

Tail functions

Function	What it does
`BvnLogCDF(x1, x2, rho)`	Log-space CDF, avoids underflow at extreme tails
`BvnSurvival(x1, x2, rho)`	Upper tail P(X1 > x1, X2 > x2)

Matrix utilities

Function	What it does
`NearestPSD(dst, A, minEig)`	Nearest PSD matrix via eigenvalue clipping
`NearestCorr(dst, A, opts)`	Nearest correlation matrix (Higham 2002)
`IsPSD(A, tol)`	Check positive semi-definiteness

Copulas

Function	What it does
`GaussianCopulaCDF(u, R, opts)`	Gaussian copula CDF
`GaussianCopulaPDF(u, R)`	Gaussian copula density

gonum integration

Function / Type	What it does
`FromDistmvNormal(dist, upper, opts)`	Compute CDF from a gonum `distmv.Normal`
`Normal`	Pre-decomposed distribution for repeated CDF calls
`NewNormal(mu, sigma)`	Constructor, validates and decomposes once
`(*Normal).CDF(upper, opts)`	CDF without re-decomposing sigma
`(*Normal).CDFRect(lo, hi, opts)`	Rectangular CDF
`(*Normal).LogCDF(upper)`	Log CDF (d=2 only)

scipy equivalents

scipy	mvstat
`multivariate_normal.cdf(upper, mean=mu, cov=sigma)`	`MvnCDF(mu, upper, sigma, nil)`
`multivariate_normal.cdf(upper, mean=zeros, cov=R)`	`MvnCDFStd(upper, R, nil)`
No direct equivalent	`MvnCDFRect(mu, lo, hi, sigma, nil)`
`norm.cdf(x1) * norm.cdf(x2)` (independent only)	`BvnCDF(x1, x2, 0)`
No direct equivalent	`BvnLogCDF(x1, x2, rho)`
`gaussian_copula.cdf(u)`	`GaussianCopulaCDF(u, R, nil)`
`cov_nearest(A)` (statsmodels)	`NearestCorr(nil, A, nil)`

One thing to watch: CDF(upper, mu, ...) takes upper first then mu, which is the reverse of MvnCDF(mu, upper, ...).

Ill-conditioned inputs

MvnCDF handles covariance matrices with condition number up to ~1e8 reliably. Above 1e10, consider regularizing with NearestPSD first. The library won't panic on valid input regardless of conditioning.

References

Drezner, Z. and Wesolowsky, G.O. (1990). "On the computation of the bivariate normal integral." J. Statist. Comput. Simul. 35:101-107.
Genz, A. (2004). "Numerical computation of rectangular bivariate and trivariate normal and t probabilities." Statistics and Computing 14.3:251-260.
Genz, A. (1992). "Numerical computation of multivariate normal probabilities." J. Comput. Graph. Statist. 1:141-150.
Genz, A. and Bretz, F. (2002). "Comparison of methods for the computation of multivariate t-probabilities." J. Comput. Graph. Statist. 11:950-971.
Higham, N.J. (2002). "Computing the nearest correlation matrix - a problem from finance." IMA J. Numer. Anal. 22(3):329-343.
Cranley, R. and Patterson, T.N.L. (1976). "Randomization of number theoretic methods for multiple integration." SIAM J. Numer. Anal. 13:904-914.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
bench		bench
examples		examples
internal		internal
testdata		testdata
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
acceptance_test.go		acceptance_test.go
adversarial_test.go		adversarial_test.go
bvncdf.go		bvncdf.go
bvncdf_test.go		bvncdf_test.go
copula.go		copula.go
copula_test.go		copula_test.go
doc.go		doc.go
edge_test.go		edge_test.go
example_test.go		example_test.go
fuzz_test.go		fuzz_test.go
go.mod		go.mod
go.sum		go.sum
gonum.go		gonum.go
gonum_test.go		gonum_test.go
mvncdf.go		mvncdf.go
mvncdf_test.go		mvncdf_test.go
nearestcorr.go		nearestcorr.go
nearestcorr_test.go		nearestcorr_test.go
nearestpsd.go		nearestpsd.go
nearestpsd_test.go		nearestpsd_test.go
rect.go		rect.go
rect_test.go		rect_test.go
stress_test.go		stress_test.go
tvncdf.go		tvncdf.go
tvncdf_test.go		tvncdf_test.go
validation_test.go		validation_test.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

mvstat

Why this exists

Quick Start

What you can do with it

Price multi-asset derivatives

Compute portfolio tail risk

Clean broken correlation matrices

Model dependent defaults

Performance

Accuracy

Convergence

API

CDF functions

Rectangular regions

Tail functions

Matrix utilities

Copulas

gonum integration

scipy equivalents

Ill-conditioned inputs

References

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

mvstat

Why this exists

Quick Start

What you can do with it

Price multi-asset derivatives

Compute portfolio tail risk

Clean broken correlation matrices

Model dependent defaults

Performance

Accuracy

Convergence

API

CDF functions

Rectangular regions

Tail functions

Matrix utilities

Copulas

gonum integration

scipy equivalents

Ill-conditioned inputs

References

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages