Migrating from odin 1.x.x

This vignette discusses key differences with odin version 1.

library(odin2)

New features

Some of these features were present in versions of odin.dust and many derive from underlying support in dust2.

  • Comparison to data and likelihood support (introduced in odin.dust)
  • Automatic differentiation
  • More efficient setting of the subset of parameters you are likely to use while fitting (use the constant argument to parameter())
  • Multiple parameter sets at once (introduced in odin.dust but expanded here)
  • Run multiple copies of a system at once in parallel (introduced in odin.dust)
  • Built-in support for periodic variable resetting (e.g., for computing daily incidence)
  • Better (we hope) error messages
  • Better debugging tools (see vignette("debugging"))

Planned

  • Optional array bounds checking, both during compilation and during runtime (in the latter case with a performance penalty once enabled).

Hoped

  • Support for multinomial samples and other vector-valued functions

Missing features

This list is incomplete, and we’ll expand it as we work through the tests. We’re not currently quite at MVP stage yet, so expect that most things don’t work!

Things we do plan on implementing:

  • Delay differential equations, e.g. y_lag <- delay(y, tau) (mrc-5434)
  • Compile-time parameter substitution (mrc-5575)
  • Compilation to JavaScript
  • Compilation to GPU

Things that we plan to drop in this version

  • Many details in config() and options

Note that many errors are not caught as odin errors, and invalid odin code will be accepted and generate C++ code that fails to compile.

Changes in syntax

user() becomes parameter()

This might be the largest user-visible change, and we’ll add a translation system for this.

Previously, to support parameters you might write

a <- user(4)

which says that a is a user-supplied parameter with a default value of 4. In most cases this now simply becomes

a <- parameter(4)

The integer argument accepted by user has now changed:

  • user(integer = TRUE) becomes parameter(type = "integer")
  • user(integer = FALSE) becomes parameter(type = "real")

This translation can be done automatically in most cases, and will be done (with a warning) by default if possible. You should update your code with the suggested fix, however, as this translation will be removed in a future version.

Compare keyword is now removed.

In comparisons such as

compare(d) ~ Normal(0, 1)

The compare keyword, and the ~ only occur together. This has been simplified, and is now written as:

d ~ Normal(0, 1)

which reads as: d is normally distributed with a mean of 0 and standard deviation of 1.

Vector parameters assign without array indices

Previously, if you had a vector parameter you had to write

a[] <- parameter()
dim(a) <- 10

(though with user(), as in the previous section). However, the array index here does not really add anything as we already know how many dimensions a has from the dim call. So now you should write

a <- parameter()
dim(a) <- 10

which makes it clearer that all of a is assigned by the parameter call.

Vector/matrix/array parameters whose size is determined by input require rank argument

What a mouthful. Previously you might have written

a[, ] <- user()
dim(a) <- user()

which means “a is a matrix whose dimensions are determined by the input we are given on initialisation”. Because of the previous change the first line changes to

a <- parameter()

but that means that we no longer know that a has two dimensions. That’s ok because we’ve moved the responsibility for this into the dim() assignment line anyway (internally). So for now you write

dim(a) <- parameter(rank = 2)

which conveys the same intent. We may make this slightly more friendly in future (see vignette("functions")).

Interpolate results assign without array indices

Previously, if you had an interpolate() call that returned a vector (or higher-dimension array) you had to write

v[] <- interpolate(a, b, "constant")

but now you should drop the [], as for the parameter() case above, as you are replacing all of v at once, writing:

v <- interpolate(a, b, "constant")

Discrete-time models have a more solid time basis

Previously, discrete time models used step to count steps forward as unsigned integers, usually from zero. Many models added a parameter (or constant) dt representing the timestep and then a variable time which represented the time as a real-valued number. For example you might have dt of 0.25 and then your model stops at times [0, 0.25, 0.5, 0.75, 1] for steps [0, 1, 2, 3, 4].

We formalise this approach now having discrete time systems be explicitly in terms of the same time basis as ODE models (that is, some real valued time axis). When you initialise a model you pass in dt, which must be an even divisor of 1 (so 0.5, 0.25, 0.2, etc). We then take steps of this size. The wrinkle is that (at least for now) the model will only return control back to you, or state back to you, at integer-valued times. We may relax this in future to allow returning at any time value that is a multiple of dt.

This will cause a few issues for using old code, which we cover below.

Assignments to dt

You may have models that assign to dt, either directly or as a parameter. You can no longer do this as dt will be provided by dust (see dust2::dust_system_create()).

We can automatically remove these (with a warning) in some cases.

Assignments to time

Conventionally, many models would write

time <- step * dt

which is the linear transformation of time that dust2 now does. We can remove these statements and your model should work as intended.

Use of step

All other uses of step are problematic and will need manual fixing. We will try and accumulate migration strategies here, so please let us know if you have had to do anything not listed.

Access “interpolated” values from a grid: In sircovid we used step as an array index, in order to support time-varying inputs (e.g., vaccine allocation schedules, rates of contact). This is no longer supported (at all) because dt is changed separately from the inputs. Instead you should use odin’s interpolation functions.

Periodic resetting: You may have written:

a <- if (step %% freq == 0) b else c

to have some quantity that took different values every freq steps, where freq is usually 1/dt or m/dt where m is an integer. You should rewrite this to use time:

a <- if (time * dt / m == 0) b else c

The name of the time variable in discrete time models has changed

Previously, time was t but we have moved this to time to be a little more explicit. We can automatically migrate your code in many cases, unless you have defined a variable time already.

Random number function calls have changed

Previously we used the same names as R’s random-number-drawing functions, for example rbinom for drawing from a binomial distribution. This has changed to use the distribution name instead.

The motivating reason for this change was that in odin we might write

rbinom(size, prob)

but if you were writing this in R you would write

rbinom(1, size, prob)

with the first argument being the number of draws from the distribution in question. This departure in arguments feels needlessly confusing! If you were using odin without odin.dust then this did compile to a call to one of R’s underlying random number functions so this connection was reasonable but from version 2 we use monty’s parallelisable random number distributions.

The mapping is:

  • rbeta() to Beta
  • rbinom() to Binomial
  • rcauchy() to Cauchy (unsupported for now)
  • rchisq() to ChiSquared (unsupported for now)
  • rexp() to Exponential
  • rf() to F (unsupported for now)
  • rgamma() to Gamma
  • rgeometric() to Geometric (unsupported for now)
  • rhyper() to Hypergeometric
  • rlogis() to Logistic (unsupported for now)
  • rlnorm() to Lognormal (unsupported for now)
  • rnbinom() to NegativeBinomial
  • rnorm() to Normal
  • rpois() to Poisson
  • rt() to T (unsupported for now)
  • runif() to Uniform
  • rweibull() to Weibull (unsupported for now)

(Not all of these are implemented yet).

System size cannot be changed after creation

This limitation comes from our implementation in dust2 and it is possible to relax it in some settings. However, it is fairly important for efficiently running the system within a pMCMC context where we save state periodically.

If your system has a parameter that affects the number of state variables in the system (e.g., the number of age categories that a compartment is stratified by), you may not change this after initialisation. This will be prevented by the parser once arrays are implemented.

Changes in the way arrays are handled

The two-argument form of dim() has been removed, as we did not believe it was used and it is confusingly different to R. Previously you could write dim(x, 3) to get the length of the third dimension of x; this is no longer supported. Please let us know if this is a problem.

General changes

This package replaces odin.dust and will eventually replace odin (as in, we’ll copy the entire odin2 code into odin to become version 2.0.0 of that package).

The relationship between packages has changed. Previously mcstate “knew” about dust models and so you had to use odin.dust practically to use the statistical machinery in mcstate. We’ve changed this around now, so that odin2 “knows” about monty and can create systems that will work well with monty. We now depend on monty, so if you have odin2 installed you can start working towards fitting models immediately.

Known limitations

Much slower compilation time

Because we now compile to C++ via dust2, the compilation times have massively increased. Previously, compilation of a simple model took less than a second, but now this will take 6 seconds or so. You can alleviate this to a degree during development by specifying debug = TRUE when compiling, which reduces this down to about 3 seconds. These times are from my workstation but I expect the relative differences to hold (we’re probably 10x slower than previously but can be “only” 5x slower if you turn off optimisation). If you were previously using odin.dust you should notice little change here.

Updating old code

If you compile odin code that contains any of the changes above, it will try and update the code to the new version and keep going:

gen <- odin2::odin({
  initial(x) <- 1
  deriv(x) <- x + a / t
  a <- user(2)
})
#> Warning in odin2::odin({: Found 2 compatibility issues
#> Replace calls to 'user()' with 'parameter()'
#> ✖ a <- user(2)
#> ✔ a <- parameter(2)
#> Use 'time' and not 't' to refer to time
#> ✖ deriv(x) <- x + a/t
#> ✔ deriv(x) <- x + a/time

This model contains two issues that can be easily rewritten; the solution to this rewriting is printed to screen and the model is compiled as if you had rewritten them.

Not everything can be rewritten, especially changes involving step:

gen <- odin2::odin({
  initial(x) <- 1
  update(x) <- x + a / step
  a <- user(2)
})
#> Error in `odin2::odin()`:
#> ! Use of 'step' is no longer allowed
#> ℹ Previously, discrete-time models used 'step' as a measure of time, but we
#>   have removed this in odin2
#> ℹ Please see `vignette(migrating)` for guidance
#> → Context:
#> update(x) <- x + a/step
#> ℹ For more information, run `odin2::odin_error_explain("E1050")`

In this case, odin errors and tries to indicate where you have work to do (and directs you to this document!)

For code saved into a file, you can use odin_migrate to migrate code from the old syntax to the new; this will preserve comments and formatting except for code that is rewritten so it should be fairly unintrusive.

For example, in path (a temporary file for this vignette) we have saved the code from above:

initial(x) <- 1
deriv(x) <- x + a / t
a <- user(2)

We can migrate this in-place with:

odin_migrate(path, path)
#> ℹ Migrating 2 statements
#> ✔ Wrote '/tmp/RtmpA412Kr/fileca57412b794.R'

and now the code contains:

initial(x) <- 1
deriv(x) <- x + a/time
a <- parameter(2)