Using Rust macros to create declarative programming languages for simulations
I have for some time dabbled with the what makes a good programming language for simulations. After having worked with scripts for simulation environments such as NEURON, NEST and Brian for neural simulations, and LAMMPS for molecular dynamics, I believe declarative languages should be used much more in simulations instead of imperative languages. Especially where users are defining states, rules or models.
Note: As of 2019-03-01 this post is still a work-in-progress and might be updated in the near future.
Imperative languages focus on how something should be done, in contrast to declarative languages which focus on what should be done.
Compare the following imperative C++ snippet:
double x = 1.0;
double y = 3.0 + x;
with the following declarative QML snippet:
property real y: 1.0 + x
property real x: 3.0
In the imperative example, you need to make sure the order of operations makes sense.
That is, you need to define x
before y
can be calculated.
However, in the declarative example, you can set up the two expressions out of order,
and the QML runtime will evaluate them in the proper order for you.
Or, strictly speaking, the runtime listen to changes in x
and re-evaluate y
accordingly.
Declarative vs imperative languages in modelling
When users are setting up simulations, there is both a need for imperative and declarative code. Declarative languages are useful for stating rules that should be upheld throughout the simulation, such as equations for currents flowing in and out of neurons, or forces between atoms in molecular dynamics. In pseudocode, we could imagine something like this for a simple physics simulation:
Rules {
F: some force equation... # the force is something
a: F / m # the acceleration is force divided by mass
dv/dt: a # the velocity changes over time based on the acceleration
dx/dt: v # the position changes over time based on the velocity
}
Declarative code can also be useful in defining the state of a system:
State {
atom_count: 100 # there are 100 atoms
box_size: 10 um # in a very small box
}
Imperative languages are useful for stating ordered steps in a simulation, such as “initialize the system, heat it to 200 degrees, then cool it down to 10 degrees”. In addition, imperative languages can useful when making advanced optimizations to important computations in a simulation.
Among the simulation systems mentioned above, Brian does a pretty good job of combining imperative and declarative programming. In Brian, one mostly uses Python to define a simulation, but a custom, declarative language is used to set up the equations:
start_scope()
E = -40.0 * mV
tau = 5 * ms
equations = '''
I = g * (E - V) : ampere
dV/dt = I : volt
dge/dt = -g / tau : 1
'''
neurons = NeuronGroup(1, equations)
neurons.V = 20 * mV;
neurons.g = 0.5;
run(100*ms)
The above defines a single neuron where the ordinary differential equations defining its behavior are given by the current
the voltage,
and the conductance
This neuron will not work very well, but that is a different story.
A declarative definition of equations in Rust
Looking at the above code in Brian, I started to wonder if something similar could be done in Rust, but ideally with macros rather than a string block defining the equations.
The rules and equations of a simulation are often in both the category of things that would be nice to write in a declarative way and important to optimize. Calculating the forces between atoms is one of the most computationally heavy tasks in a molecular dynamics simulations, simply because it needs to be done for so many pairs of atoms (or triplets and higher order combinations). It would therefore be interesting to see if we can get the performance benefits of a language like Rust and still be able to express our intent declaratively.
After some trying and failing, a few modifications to be more Rusty, and adding some of my own preferences on top, I landed on the following Rust code:
let E = -40.0 * mV;
let tau = 5.0 * ms;
let equations = equations! {
I: ampere = g * (E - v);
d/dt v: volt = I, init = 20.0 * mV;
d/dt g: unit = -g / tau, init = 0.5;
};
let neurons = NeuronGroup::new(1, equations);
run(neurons, 100 * ms);
In the text below, I will go through how we can define the equations!
macro to make this happen.
Creating a muncher using declarative macros in Rust
To achieve the above, we need something a bit more advanced than the usual declarative macro that turns a pattern into a specific output:
macro_rules! too_simple {
($id:ident: $ty:ty = $assignment:expr) => {
// the output
};
}
The reason is that we have expressions like I = g * (E - V)
on the first line,
where V
is used,
but V
is defined on the second line.
This means that if we just spit out each equation on a new line,
the compiler will complain about symbols not being defined.
We therefore need to collect each equation first and then define each symbol first,
before we finally assign the correct value to each.
We can use procedural macros for this purpose, but I want to stick with declarative macros because they are simpler to get started with. In addition, what better fit than writing a declarative language using declarative macros?
To achieve what we want with declarative macros, we can create a so-called muncher that does something called push-down accumulation. This is explained in more detail in Daniel Keep’s Little Book of Rust Macros, which I came across in this forum post. The idea is to write a set of macro rules that are on the following form:
macro_rules! equations_impl {
(() -> (/* final output */)) => {
// do something with final output
};
((/* pattern a */ $($rest:tt)*) -> ($($output:tt)*)) => {
equations_impl!(($($rest)*) -> ($(output)*) /* new output */ );
};
((/* pattern b */ $($rest:tt)*) -> (/* current output */)) => {
equations_impl!(($($rest)*) -> ($(output)*) /* new output */ );
};
}
macro_rules! equations {
($($input:tt)*) => {
{
equations_impl!(($($input)*) -> ())
}
};
}
When calling the equations!(/* input /*)
macro with some input,
it will forward the input to equations_impl!((/* input */) -> ())
.
The input will then be matched against the different patterns ()
, a and b.
Whatever follows a pattern is stored in $($rest)*
and is passed on recursively as a new call
to equations_impl!
.
The implementation of each rule takes what is currently in $($output)*
and appends or prepends
some new data to the output.
Finally, when there are no remaining tokens, the first pattern is called with the empty input and
final output.
This allows us to collect all the equations first and iterate over them afterwards.
Note that if we only had one type of pattern, such as only non-differential equations, using a muncher is a bit of overkill. We could just have had a macro with a repeating pattern that is the same for all equations. However, the muncher is very useful in this case, because we need to support both differential and non-differential equations that are defined in any order.
In particular, for our purpose, we will collect the output as a series of elements with an id, a type, an initializer, and an expression. Each element corresponds to either a regular equation or a differential equation.
In addition, we create one pattern for regular equations and one for differential equations:
macro_rules! equations_impl {
(() -> ($(($id:ident, $ty:ty, $init:expr,
$($derivative_or_assignment:tt)*))*)) => {
{
// do something with all elements
}
};
((d/dt $id:ident: $ty:ty = $derivative:expr,
init = $init:expr; $($rest:tt)*)
-> ($(output:tt)*)
) => {
equations_impl!(($($rest)*)
-> ($(output)* ($id, $ty, $init, @derivative $derivative)))
};
(($id:ident: $ty:ty = $assignment:expr; $($rest:tt)*)
-> ($(output:tt)*)
) => {
equations_impl!(($($rest)*)
-> ($(output)* ($id, $ty, $assignment, @assignment $assignment)))
};
}
Implementing the final output
The solution I landed on here is one that sets up a struct
called State
that contains
the current value of each equation.
Then we create an iterate
function that we can call for each iteration.
This takes a State
object and transforms it into a new State
object where each member is
defined by the equation or differential equation:
macro_rules! equations_impl {
(() -> ($(($id:ident, $ty:ty, $init:expr,
$($derivative_or_assignment:tt)*))*)) => {
struct State {
$($id: $ty),*
}
let iterate = move |tmp: State, dt: f64| {
let State { $( $id ),* } = tmp;
State {
$(
$id: assign_or_derivate!($id, dt,
$($derivative_or_assignment)*),
)*
}
};
// TODO: initialization
}
// NOTE: definition of the other macros shown above
}
The assign_or_derivate!
macro looks as follows.
It either just assigns if it is a regular equation,
or it performs a simple forward-Euler integration step:
macro_rules! assign_or_derivate {
($id:ident, $dt:ident, @derivative $derivative:expr) => {
$id + $derivative * $dt
};
($id:ident, $dt:ident, @assignment $assignment:expr) => {
$assignment
};
}
The initialization needs to be done in three steps:
- Set all values to their types default.
- Run the initialization expression once.
- Run the initialization again.
The reason we need to repeat the initialization is because some equation expressions are not completely evaluated before the values that they are made from have been initialized.
macro_rules! equations_impl {
(() -> ($(($id:ident, $ty:ty, $init:expr,
$($derivative_or_assignment:tt)*))*)) => {
{
// NOTE: definition of State and iterate above
let initial_state = {
$(
#[allow(unused_variables)]
let $id: $ty = Default::default();
)*
// NOTE: need to init twice because expressions like
// `area = width * height`
// require the width and height to be initialized before
// calculating the area
$(
let $id = $init;
)*
$(
let $id = $init;
)*
State {
$(
$id,
)*
}
};
(initial_state, iterate)
}
}
// NOTE: definition of the other macros shown above
}
However, note that this does not take into account any binding loops or bad equations,
such as if x = y * z
and z = x * y
are both defined as equations.
The final implementation looks like this:
macro_rules! assign_or_derivate {
($id:ident, $dt:ident, @derivative $derivative:expr) => {
$id + $derivative * $dt
};
($id:ident, $dt:ident, @assignment $assignment:expr) => {
$assignment
};
}
macro_rules! equations_impl {
(() -> ($(($id:ident, $ty:ty, $init:expr,
$($derivative_or_assignment:tt)*))*)) => {
{
#[derive(Clone, Copy, Debug)]
struct State {
$($id: $ty),*
}
let iterate = move |tmp: State, dt: f64| {
#[allow(unused_variables)]
let State { $( $id ),* } = tmp;
State {
$(
$id: assign_or_derivate!($id, dt,
$($derivative_or_assignment)*),
)*
}
};
let initial_state = {
$(
#[allow(unused_variables)]
let $id: $ty = Default::default();
)*
// NOTE: need to init twice because expressions like
// `area = width * height`
// require the width and height to be initialized before
// calculating the area
$(
let $id = $init;
)*
$(
let $id = $init;
)*
State {
$(
$id,
)*
}
};
(initial_state, iterate)
}
};
((d/dt $id:ident: $ty:ty = $derivative:expr,
init = $init:expr; $($rest:tt)*)
-> ($(output:tt)*)) => {
equations_impl!(($($rest)*)
-> ($(output)* ($id, $ty, $init, @derivative $derivative)))
};
(($id:ident: $ty:ty = $assignment:expr; $($rest:tt)*)
-> ($(output:tt)*)) => {
equations_impl!(($($rest)*)
-> ($(output)* ($id, $ty, $assignment, @assignment $assignment)))
};
}
macro_rules! equations {
($($input:tt)*) => {
{
equations_impl!(($($input)*) -> ())
}
}
}
Defining the NeuronGroup
and run
function
What remains now is to define the NeuronGroup
, the run
function.
In addition, we need a unit library for the definitions of things like volt
and ampere
,
but I will leave that for later.
For now, I will simply define let mV = 0.001;
and similarly for the other symbols.
As for the NeuronGroup
and run
function, I will try to get back to that later.
One challenge that we easily touch upon here is storing the iterate
function as a function
pointer.
Because it is a closure that captures the outside variables, such as E
,
it needs to be stored as a Fn
rather than just an fn
.
I will look into this, but for now, I am happy with being able to run a small simulation with
the following code:
fn main() {
let mV = 1.0 / 1000.0;
let ms = 1.0 / 1000.0;
let E = -40.0 * mV;
let tau = 5.0 * ms;
let (mut state, iterate) = equations! {
I: ampere = g * (E - v);
d/dt v: volt = I, init = 20.0 * mV;
d/dt g: unit = -g / tau, init = 0.5;
};
println!("Initial state {:?}", state);
for _ in 0..20 {
state = iterate(state, 1.0 * ms);
println!("{:?}", state);
}
}