# incerto


A Rust crate for heavyweight, multi-threaded Monte Carlo simulations.

## Installation

The crate can be installed from crates.io. Currently its only dependency is `bevy@0.16`, and there are no cargo features. Due to an issue with macro crate linking, it is necessary to explicitly add `bevy` as a dependency as well.

```toml
[dependencies]
bevy = { version = "0.16", default-features = false }
incerto = "*"
```

## Usage

This crate is powered by Bevy, a high-performance ECS framework.

This means that simulations are set up and executed using entities and systems.

In-depth knowledge of Bevy's internals is not required, however, since most interactions with Bevy have been abstracted away. Instead, the user is expected to only:

- Define components.
- Spawn entities, each a collection of one or more components.
- Implement systems that update the entities on each simulation step.

```rust
use incerto::prelude::*;

let simulation: Simulation = SimulationBuilder::new()
    // add one or more entity spawners
    .add_entity_spawner(...)
    .add_entity_spawner(...)
    // add one or more systems
    .add_systems(...)
    .add_systems(...)
    // finish
    .build();
```

It is recommended to start with the examples.

### Define components

Components will be the primary data type in the simulation. They can be anything, so long as they derive the `Component` trait.

```rust
#[derive(Component, Default)]
struct Counter
{
    count: usize,
}
```

Empty components, typically called markers, are also sometimes useful for picking out specific entities.

```rust
#[derive(Component)]
struct GroupA;

#[derive(Component)]
struct GroupB;
```

### Spawn entities

Entities are spawned at the beginning of each simulation using user-provided functions like this one.

```rust
fn spawn_coin_tosser(spawner: &mut Spawner)
{
    spawner.spawn(Counter::default());
}
```

Note that entities are in fact collections of one or more components; as such, the `spawn()` function accepts a `Bundle`. A bundle can be a single component, as above, or a tuple containing multiple components.

```rust
fn spawn_coin_tossers_in_groups(spawner: &mut Spawner)
{
    for _ in 0..100
    {
        spawner.spawn((GroupA, Counter::default()));
    }
    for _ in 0..100
    {
        spawner.spawn((GroupB, Counter::default()));
    }
}
```

### Implement systems

Systems are the processing logic of the simulation. During each step, every user-defined system is executed once. Systems use queries to interact with and update entities in the simulation.

```rust
/// Increment all counters by one in each simulation step.
fn counters_increment_system(mut query: Query<&mut Counter>)
{
    for mut counter in &mut query
    {
        counter.count += 1;
    }
}
```

Queries may use the `With` and `Without` filters to narrow their scope.

```rust
fn counters_increment_group_a(mut query: Query<&mut Counter, With<GroupA>>) { ... }

fn counters_increment_group_b(mut query: Query<&mut Counter, With<GroupB>>) { ... }
```

They may also select multiple components, mutably or immutably (note the use of `mut`, `&`, and `&mut`).

```rust
fn update_multiple_components(mut query: Query<(&mut Counter, &OtherComponent)>) { ... }

fn read_only_system(query: Query<&Counter>) { ... }
```

Systems may also work with multiple queries. This allows for entities in the simulation to interact with each other.

```rust
fn multiple_queries(
    read_from_group_a: Query<&Counter, With<GroupA>>,
    mut write_to_group_b: Query<&mut Counter, With<GroupB>>,
) { ... }
```

## Running the simulation

The simulation may be executed using the `run()` method.

```rust
// Run the simulation for 100 steps.
simulation.run(100);

// Continue the same simulation for another 200 steps.
simulation.run(200);
```
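
The stepping model above can be pictured in plain Rust, without Bevy or incerto: a simulation is just the repeated application of per-step systems to some state, and calling `run()` again continues from where the previous call left off. The `Counter` struct and `run` function below are illustrative stand-ins, not crate APIs.

```rust
// Illustrative stand-ins for the stepping model; not incerto's actual types.
pub struct Counter {
    pub count: usize,
}

pub fn run(counter: &mut Counter, steps: usize) {
    for _ in 0..steps {
        // During each step, every system executes once; the only
        // "system" here increments the counter.
        counter.count += 1;
    }
}
```

Running this for 100 steps and then another 200 leaves the counter at 300, mirroring how a second `run()` call continues the same simulation rather than restarting it.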

## Collecting results

### Counting entities

The number of entities with a given component can be sampled at any time using `count()`.

```rust
let num_still_alive = simulation.count::<With<Alive>>();
let num_still_healthy = simulation.count::<(With<Alive>, Without<Sick>)>();
```

### Sample entity

A value may be sampled from a specific entity by attaching an `Identifier` to it.

```rust
// 1. A blanket implementation of Identifier exists for any: Component + Copy + Eq + Hash
#[derive(Component, Clone, Copy, PartialEq, Eq, Hash)]
enum EntityId { Bob, Alice }

// 2. Implement this trait for the component to be sampled.
impl Sample<f64> for NetWorth {
    fn sample(component: &Self) -> f64 {
        // 3. Sample the value of the component as needed.
        component.value
    }
}

// 4. Fetch the sampled value of the component from the simulation.
let bobs_net_worth = simulation.sample::<NetWorth, _, _>(&EntityId::Bob);
```

### Sample single

Attaching an `Identifier` to an entity can be skipped if only a single entity with the `C: Sample` component is expected to exist. In that case, it can be sampled using `sample_single()`.

```rust
let net_worth = simulation.sample_single::<NetWorth, _>();
```

### Sample aggregate

A value may be sampled as the aggregate from many components in the simulation.

```rust
// 1. Implement this trait for the component to be sampled.
impl SampleAggregate<f64> for NetWorth {
    fn sample_aggregate(components: &[&Self]) -> f64 {
        // 2. Aggregate the values of all components as needed.
        let total: f64 = components.iter().map(|nw| nw.value).sum();
        total / components.len() as f64
    }
}

// 3. Sample the aggregate value from the simulation.
let average_net_worth = simulation.sample_aggregate::<NetWorth, _>();

//    ... or with a filter
let average_net_worth_blue_hair = simulation.sample_aggregate_filtered::<NetWorth, With<BlueHair>, _>();
```

### Time series

Collecting a time series from the simulation is similar to the sampling described above. The component to be sampled into the time series still needs to implement either `Sample` or `SampleAggregate`, depending on whether the sampling is done per entity or in aggregate.

The difference is that the recording of time-series values needs to be set up on the `SimulationBuilder`.

```rust
// 1. Set up the time series recording as needed.
builder.record_time_series::<NetWorth, EntityId, f64>();
builder.record_aggregate_time_series::<NetWorth, f64>();
builder.record_aggregate_time_series_filtered::<NetWorth, With<BlueHair>, f64>();

// 2. Collect the results from the simulation.
let bobs_net_worth_series: Vec<f64> = simulation.get_time_series::<NetWorth, _, _>(&EntityId::Bob).unwrap();
let average_net_worth_series: Vec<f64> = simulation.get_aggregate_time_series::<NetWorth, _>().unwrap();
let average_net_worth_series_blue_hair: Vec<f64> = simulation.get_aggregate_time_series_filtered::<NetWorth, With<BlueHair>, _>().unwrap();
```

### Built-in aggregators

Several built-in aggregators are available for numeric types. These are automatically implemented for any component that implements sampling to a numeric type, for example `Sample<f32>`.

```rust
// compute the median net worth from all NetWorth components
let median = simulation.sample_aggregate::<NetWorth, Median<_>>().unwrap();
```

Some available aggregators are:

- `Minimum<T>`
- `Maximum<T>`
- `Median<T>`
- `Mean<T>`
- `Percentile<T, P>` (computes the P-th percentile)
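
The crate does not specify here which percentile algorithm it uses; as an illustration only, a `Percentile` aggregator could reduce to something like the following nearest-rank computation in plain Rust (a hypothetical sketch, not the crate's actual implementation):

```rust
/// Nearest-rank percentile: sort the samples and pick the value at
/// rank ceil(p/100 * n), clamped to valid indices. Hypothetical sketch;
/// the crate's built-in Percentile<T, P> may use a different method.
pub fn percentile(values: &[f64], p: f64) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.saturating_sub(1).min(sorted.len() - 1)]
}
```

With this method, the 50th percentile of `[1.0, 2.0, 3.0, 4.0, 5.0]` is `3.0`, matching the `Median` aggregator on the same data.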

## Performance

For experiments like Monte Carlo simulations, performance is typically of paramount importance, since it defines their limits in terms of scope, size, length, and granularity. Hence the decision to build this crate on top of Bevy. Its ECS architecture is likely the most memory-efficient and parallelizable way to build such simulations, while still retaining the ergonomics of high-level programming.

Bevy has proven that it can handle worlds with hundreds of thousands (maybe even millions) of entities without slowing down enough to compromise 3D rendering at 60 frames per second. And given that this crate adds practically no runtime overhead, your Monte Carlo experiments will likely be limited only by your hardware and your imagination.

You get to enjoy all the performance gains of the ECS automatically. However, there are a few things you may want to keep in mind.

- Temporal granularity: This is just a fancy way of asking *how much time does each simulated step represent?* The crate itself makes no mention of time, and treats each simulation as a series of discrete, equitemporal steps. Whether each step represents one minute, one hour, or one day is up to the user and likely contextual to the kind of experiment being conducted. For example, each step might represent one hour when modelling the weather, or one day when modelling pandemic infection rates. As such, there are great performance gains to be found by moving up a level in granularity: if you can model the changes in the simulation in 5-minute steps instead of 1-minute steps, the simulation will run in one fifth of the time!
- System parallelization: Bevy's scheduler will automatically place disjoint systems on separate threads whenever possible. Two systems are disjoint when neither's queries mutate components that the other is also accessing. The rule of thumb for achieving this is to design each system such that:
  - It has a singular purpose.
  - It only queries for components that it definitely needs.
- Singular components: It may be tempting to simplify entity design by putting all of an entity's data in a single component, especially if one is used to object-oriented languages. However, doing so will hurt performance in the long term, since it renders system parallelization nigh impossible. The general recommendation is to favor composition, meaning that each distinct attribute of an entity should live in a separate component. For example, since a person's age and body temperature are largely independent, systems reading or updating these values should be allowed to run in parallel.
- Entity archetypes: Bevy groups similar-looking entities into archetypes, which lets it store them more efficiently in shared tables. If components are added to or removed from existing entities at runtime, the archetype tables have to be rebuilt, which is a drain on performance. In cases where an entity's state needs to change often during the simulation, consider using persistent enum components instead.
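
The temporal-granularity trade-off can be made concrete with a toy model in plain Rust (a sketch with made-up numbers, unrelated to the crate's API): simulating the same total span of time at coarser steps needs proportionally fewer iterations while, for a simple accumulation process, producing the same result.

```rust
/// Simulate a trivial accumulation process over `total_minutes` of
/// simulated time, advancing `step_minutes` per step. Returns the number
/// of steps executed and the accumulated value. Illustrative only.
pub fn simulate(total_minutes: usize, step_minutes: usize, rate_per_minute: f64) -> (usize, f64) {
    let steps = total_minutes / step_minutes;
    let mut accumulated = 0.0;
    for _ in 0..steps {
        // Each step advances the state by `step_minutes` worth of change.
        accumulated += rate_per_minute * step_minutes as f64;
    }
    (steps, accumulated)
}
```

Simulating one day (1440 minutes) at 1-minute steps takes 1440 iterations; at 5-minute steps it takes only 288, yet both reach the same accumulated value. Real models are rarely this linear, but the iteration-count saving is exactly what the granularity advice above is about.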

## Planned work

- Add some utilities to the crate for easy access to random values, noise, etc.
- Add some support for data plotting.

## Credits

The name as well as the initial motivation behind this project came from the brilliant Incerto book series by Nassim Nicholas Taleb.
