diffusion-maps

A C++/Python library for diffusion maps, a non-linear dimensionality reduction technique.

This is the final project of the course Numerical Software Development. You can find my project proposal in the link below. Note that the API is different from that described in the proposal.

Project Proposal

Building

Requirements

To build the C++ library, you need:

A version of GCC or Clang with C++17 and OpenMP support.
Make

To build the Python module, you need in addition:

pybind11
NumPy (only at runtime)

If you want to run the test suite, you also need:

Criterion for the C++ tests
pytest for the Python tests

Profiles

The build script defines multiple profiles with different compilation settings.

DEBUG
- With debug info, Address Sanitizer, and Undefined Behavior Sanitizer.
- No optimisations.
- Used in development.
TEST
- With debug info, Address Sanitizer, and Undefined Behavior Sanitizer.
- Basic optimisations.
- Used in some automated tests.
TEST_PAR
- With debug info and Thread Sanitizer.
- Basic optimisations.
- Originally intended to be used in automated tests to test the correctness of the parallel algorithms. However, since libgomp does not support Thread Sanitizer, no such tests are enabled and this profile is currently unused. Binaries compiled with this profile may not work as intended.
RELEASE
- No debug info or sanitizers.
- Full optimisations.
- Used to create release binaries and in some automated tests.

Build Commands

To build everything, including the C++ library and the Python module:

$ make PROFILE=<profile>

If PROFILE=<profile> is omitted, then the profile defaults to DEBUG. The compiled C++ library is located at build/<profile>/libdiffusion_maps.a.

Or, to build just the C++ library:

$ make lib PROFILE=<profile>

Or, to build just the Python module:

$ make pymod PROFILE=<profile>

To run the test suite (optional):

$ make test

Or, to just run the C++ or Python tests, replace test in the command with cpptest or pytest.

Basic Usage

For more information about the API, please refer to the API documentation.

C++

#include <random> // std::default_random_engine

#include "diffusion_maps/diffusion_maps.hpp" // diffusion_maps::diffusion_maps
#include "diffusion_maps/kernel.hpp" // diffusion_maps::kernel::Gaussian
#include "diffusion_maps/matrix.hpp" // diffusion_maps::Matrix

// Each row of data is a data point.
const diffusion_maps::Matrix data(n_samples, n_features);
// kernel can be any std::function<double(const diffusion_maps::Vector &,
//                                        const diffusion_maps::Vector &)>.
// Gaussian kernel can also be defined with the σ parameter, like so:
//   const auto kernel = diffusion_maps::kernel::Gaussian::with_sigma(sigma);
const diffusion_maps::kernel::Gaussian kernel(gamma);
// rng can be any UniformRandomBitGenerator. It must be reasonably good,
// otherwise the algorithm may not converge or produce erroneous results.
std::default_random_engine rng;

// result's dimension is n_samples × n_components, or it might have fewer
// columns than n_components if the computation does not converge. Each row of
// result is a data point.
diffusion_maps::Matrix result = diffusion_maps::diffusion_maps(
    data, n_components, kernel, diffusion_time, rng);

Statically link with the compiled C++ library libdiffusion_maps.a.

Please refer to tests/test_diffusion_maps.cpp for a complete example.

Python

import numpy as np

import diffusion_maps

# Each row of data is a data point.
data = np.empty((n_samples, n_features))

# result's dimension is n_samples × n_components, or it might have fewer columns
# than n_components if the computation does not converge. Each row of result is
# a data point.
# It is also possible to specify the σ parameter of the Gaussian kernel by the
# sigma=sigma keyword argument.
result = diffusion_maps.diffusion_maps(data, n_components, kernel='gaussian',
                                       gamma=gamma, diffusion_time=t)

When running the program, the top-level directory of this project must be in PYTHONPATH. Or if you would like to install the library, copy the diffusion_maps directory and the compiled binary module (name starting with _diffusion_maps) to the installation destination.

Please refer to tests/test_diffusion_maps.py or examples/helix.ipynb for complete examples.

Directory Structure

include: Header files for the C++ library
src: Source code for the C++ library
pybind: Source code for the compiled Python module, which uses pybind11
diffusion_maps: The Python module, which is basically a wrapper around the compiled binary module
tests: Automated tests for both the C++ library and the Python module
examples: Some examples for using the library
diffusion_maps_ref.py: Reference implementation written purely in Python, used in verification.

Acknowledgement

Thanks to Bo-Yu Cheng and Professor Yung-Yu Chen for discussions that lead to a solution to a major problem in my eigendecomposition solver.

License

This project is licensed under The Unlicense.

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
.github/workflows		.github/workflows
.vscode		.vscode
diffusion_maps		diffusion_maps
doc/latex-stylesheets		doc/latex-stylesheets
examples		examples
include/diffusion_maps		include/diffusion_maps
pybind		pybind
src		src
tests		tests
website		website
.clang-format		.clang-format
.gitignore		.gitignore
Doxyfile		Doxyfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
diffusion_maps_ref.py		diffusion_maps_ref.py
lib.mk		lib.mk

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

diffusion-maps

Building

Requirements

Profiles

Build Commands

Basic Usage

C++

Python

Directory Structure

Acknowledgement

License

About

Releases

Packages

Languages

License

abt8601/diffusion-maps

Folders and files

Latest commit

History

Repository files navigation

diffusion-maps

Building

Requirements

Profiles

Build Commands

Basic Usage

C++

Python

Directory Structure

Acknowledgement

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages