flatironinstitute/walnuts

Adaptive WALNUTS in C++

This is a C++ implementation of the following three Hamiltonian Monte Carlo (HMC) samplers.

Licensing

The project is distributed under the following licenses.

Command Line Interface (CLI)

Building examples/stan_cli creates a command-line interface to adaptive WALNUTS. The interface uses BridgeStan to access Stan models. To run it, first compile a Stan model into a shared object (.so file) using BridgeStan (available in R, Python, Julia, Rust, and C), then supply that file as the model argument to stan_cli. The optional data argument should be in the usual Stan JSON data format.

The mass matrix adaptation follows a continuous form of the scheme used by Nutpie. The step size adaptation uses the Adam stochastic gradient descent (SGD) optimizer in the same way NUTS and Nutpie use dual averaging SGD.
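To make the role of Adam concrete, here is a minimal sketch of how a step size could be adapted toward a target acceptance rate. It is not taken from the WALNUTS source: the class name AdamStepSizeAdapter is hypothetical, and two choices are assumptions borrowed from how dual averaging is usually applied in NUTS, namely updating the log of the step size and using (target minus observed acceptance rate) as the gradient. The constructor arguments mirror the --step-size-init, --step-accept-rate-target, --step-learning-rate, --step-beta1, --step-beta2, and --step-epsilon options.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the WALNUTS API. Adapts a step size with
// Adam, treating (target - observed acceptance rate) as the gradient with
// respect to the log step size (an assumption, see the text above).
class AdamStepSizeAdapter {
 public:
  AdamStepSizeAdapter(double step_size_init, double accept_rate_target,
                      double learning_rate, double beta1, double beta2,
                      double epsilon)
      : log_step_(std::log(step_size_init)),
        target_(accept_rate_target),
        learning_rate_(learning_rate),
        beta1_(beta1),
        beta2_(beta2),
        epsilon_(epsilon) {
    assert(step_size_init > 0.0);
  }

  // Update after one iteration, given its observed acceptance rate in [0, 1].
  void update(double accept_rate) {
    assert(accept_rate >= 0.0 && accept_rate <= 1.0);
    ++t_;
    // Positive when acceptance is too low, which shrinks the step size.
    const double grad = target_ - accept_rate;
    // Exponential moving averages of the gradient and squared gradient.
    m_ = beta1_ * m_ + (1.0 - beta1_) * grad;
    v_ = beta2_ * v_ + (1.0 - beta2_) * grad * grad;
    // Bias correction for the zero-initialized moving averages.
    const double m_hat = m_ / (1.0 - std::pow(beta1_, t_));
    const double v_hat = v_ / (1.0 - std::pow(beta2_, t_));
    // Standard Adam update, applied on the log scale.
    log_step_ -= learning_rate_ * m_hat / (std::sqrt(v_hat) + epsilon_);
  }

  double step_size() const { return std::exp(log_step_); }

 private:
  double log_step_;
  double target_, learning_rate_, beta1_, beta2_, epsilon_;
  double m_ = 0.0;  // moving average of the gradient
  double v_ = 0.0;  // moving average of the squared gradient
  int t_ = 0;       // number of updates so far
};
```

With the CLI defaults (initial step size 1, target 0.8, learning rate 0.2, beta1 0.3, beta2 0.99, epsilon 0.0001), an observed acceptance rate below the target decreases the step size and a rate above the target increases it. Working on the log scale keeps the step size positive without clamping.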

The plan going forward is to provide wrappers along the lines of TinyStan and Nutpie.

The command-line options can be retrieved with the --help option.

:build$ examples/stan_cli --help
Run WALNUTs on a Stan model 


examples/stan_cli [OPTIONS] model [data]


POSITIONALS:
  model TEXT:FILE REQUIRED    Path to the Stan model library (.so from CmdStan{,Py,R}) 
  data TEXT:FILE              Path to the Stan model data (.json, optional) 

OPTIONS:
  --help              Print this help message and exit 
  --seed UINT [29294659]  
                      Random seed (default randomize with clock) 
  --warmup UINT:NONNEGATIVE [128]  
                      Number of warmup iterations 
  --samples UINT:POSITIVE [128]  
                      Number of samples to draw 
  --max-depth UINT:POSITIVE [10]  
                      Maximum depth for NUTS trajectory doublings 
  --max-step-depth UINT:POSITIVE [8]  
                      Maximum depth for the step size adaptation 
  --min-micro-steps UINT:POSITIVE [1]  
                      Minimum micro steps per macro step 
  --max-error FLOAT:POSITIVE [0.5]  
                      Maximum error allowed in joint densities 
  --init FLOAT:NONNEGATIVE [2]  
                      Range [-init,init] for uniform parameter initial values 
  --mass-init-count FLOAT:FLOAT in [1 - 1.79769e+308] [1.1]  
                      Initial count for the mass matrix adaptation 
  --mass-iteration-offset FLOAT:FLOAT in [1 - 1.79769e+308] [1.1]  
                      Offset for the mass matrix adaptation iterations 
  --mass-additive-smoothing FLOAT:POSITIVE [1e-05]  
                      Additive smoothing for the mass matrix adaptation 
  --step-size-init FLOAT:POSITIVE [1]  
                      Initial step size for the step size adaptation 
  --step-accept-rate-target FLOAT:FLOAT in [2.22507e-308 - 1] [0.8]  
                      Target acceptance rate for the step size adaptation 
  --step-learning-rate FLOAT:POSITIVE [0.2]  
                      Learning rates for step adaptation 
  --step-beta1 FLOAT:FLOAT in [2.22507e-308 - 1] [0.3]  
                      Decay rate of gradient moving average for step adaptation 
  --step-beta2 FLOAT:FLOAT in [2.22507e-308 - 1] [0.99]  
                      Decay rate of squared gradient moving average for step adaptation 
  --step-epsilon FLOAT:POSITIVE [0.0001]  
                      Update stabilization term for step size adaptation 
  --output TEXT:PATH(non-existing) 
                      Output file for the draws 

The documentation, which is generated automatically by the CLI11 library we use to parse the command line, is literal about instantiated constraints and defaults. Even though the default seed changes on each invocation, the documentation suggests the seed is constant. The bounds on double values are shown in rounded scientific notation rather than as the semantic constraint that the value must fall in the open interval (0, 1).
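The odd-looking bounds in the help text can be reproduced with a few lines of standard C++. The sketch below shows that 2.22507e-308 and 1.79769e+308 are what default stream formatting prints for the smallest positive normalized double and the largest finite double. The helper name default_format is hypothetical, and the idea that CLI11 produces its bound text this way is an assumption rather than something checked against its source.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: format a double with default stream settings,
// which print six significant digits.
std::string default_format(double x) {
  std::ostringstream out;
  out << x;
  return out.str();
}

// Smallest positive normalized double and largest finite double.
const double kSmallestPositive = std::numeric_limits<double>::min();
const double kLargestFinite = std::numeric_limits<double>::max();
```

Read this way, a range such as [2.22507e-308 - 1] says "strictly positive and at most 1", and [1 - 1.79769e+308] says "at least 1 with no practical upper limit".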

Dependencies

The dependencies may all be downloaded through CMake (see the next section).

Required build dependencies

Required test dependencies

Required documentation dependencies

Optional build dependencies

Running Stan models requires the BridgeStan interface. See the BridgeStan documentation for more information on its dependencies.

Command-line tool dependency

The command-line interface is built using the following library.

Using WALNUTS in a C++ project

This library is header only and only requires Eigen (also header only) to run (additional dependencies are required for testing and documentation). If your project uses CMake, you can depend on our walnuts library target. If not, any method of adding the include/ folder of this repository to your build system's include paths should suffice as long as you also provide Eigen yourself.

Building the examples and tests

CMake is required to build the examples and tests.

Configuring the build

The basic configuration is

cmake <options> <repo_root>

where <options> are the CMake options and <repo_root> is the root directory of the repository (where CMakeLists.txt is found).

Some common options are:

  • -B <build_dir> - Specify the build directory where the build files will be generated. If omitted, the directory you run the command from will be used.
  • -DCMAKE_BUILD_TYPE=Release - Set the build type to Release.
  • -DWALNUTS_BUILD_TESTS=ON - Enable building of the tests (currently on by default).
  • -DWALNUTS_BUILD_EXAMPLES=ON - Enable building of the examples (currently on by default).
  • -DWALNUTS_BUILD_DOC=ON - Enable building of the documentation (currently on by default).
  • -DWALNUTS_USE_MIMALLOC=ON - Link against mimalloc, an MIT-licensed custom memory allocator that can improve performance.
  • -DWALNUTS_BUILD_STAN=ON - Enable the example program which uses Stan via BridgeStan.

Other options can be found in the CMake help output or documentation.

For example, a basic configuration which creates a ./build directory in the repo root can be done with

cmake . -B ./build -DCMAKE_BUILD_TYPE=Release

The remaining instructions assume that commands are run from whatever directory you specified as the build directory (e.g., ./build in the above command).

Building

The easiest way to build the project is with the cmake --build command. This will build all available executable targets by default.

For example, to build and run the example:

cmake --build . --target examples
./examples/examples

Testing

Running the tests is easiest with the ctest command distributed with CMake.

# assuming you did _not_ specify -DWALNUTS_BUILD_TESTS=OFF earlier...
cmake --build . --parallel 4
ctest

Documentation

To build the C++ documentation using Doxygen:

cmake --build . --target doc

The root of the generated documentation will be found at ./html/index.html.

Project overview

The project directory structure is as follows.

.
├── examples
│   └── .cpp files, one per example
├── include
│   └── walnuts
│       └── .hpp files containing the library source code
├── tests
│   ├── .cpp files, one per test
│   └── CMakeLists.txt
├── CMakeLists.txt
└── README.md

About

Within-orbit Adaptive Leapfrog No-U-Turn Sampler
