| Title: | Iterate Multiple Realisations of Stochastic Models |
|---|---|
| Description: | An engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building a bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al. 2021 <doi:10.12688/wellcomeopenres.16466.2>. |
| Authors: | Rich FitzJohn [aut, cre], Alex Hill [aut], John Lees [aut], Imperial College of Science, Technology and Medicine [cph] |
| Maintainer: | Rich FitzJohn <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.15.3 |
| Built: | 2026-05-13 09:04:28 UTC |
| Source: | https://github.com/mrc-ide/dust |
Create a dust model from a C++ input file. This function will
compile the dust support around your model and return an object
that can be used to work with the model (see the Details below,
and dust_generator).
dust( filename, quiet = FALSE, workdir = NULL, gpu = FALSE, real_type = NULL, linking_to = NULL, cpp_std = NULL, compiler_options = NULL, optimisation_level = NULL, skip_cache = FALSE )
filename |
The path to a single C++ file |
quiet |
Logical, indicating if compilation messages from
|
workdir |
Optional working directory to use. If |
gpu |
Logical, indicating if we should generate GPU
code. This requires a considerable amount of additional software
installed (CUDA toolkit and drivers) as well as a
CUDA-compatible GPU. If |
real_type |
Optionally, a string indicating a substitute type to
swap in for your model's |
linking_to |
Optionally, a character vector of additional
packages to add to the |
cpp_std |
The C++ standard to use, if you need to set one
explicitly. See the section "Using C++ code" in "Writing R
extensions" for the details of this, and how it interacts with
the R version currently being used. For R 4.0.0 and above, C++11
will be used; as dust depends on at least this version of R you
will never need to specify a version this low. Sensible options
are |
compiler_options |
A character vector of additional options
to pass through to the C++ compiler. These will be passed
through without any shell quoting or validation, so check the
generated commands and outputs carefully in case of error. Note
that R will apply these before anything in your personal
|
optimisation_level |
A shorthand way of specifying common
compiler options that control optimisation level. By default
( |
skip_cache |
Logical, indicating if the cache of previously
compiled models should be skipped. If |
A dust_generator object based on your source files
Your input dust model must satisfy a few requirements.
Define some class that implements your model (below model is
assumed to be the class name)
That class must define a type internal_type (so
model::internal_type) that contains its internal data that the
model may change during execution (i.e., that is not shared
between particles). If no such data is needed, you can do
using internal_type = dust::no_internal; to indicate this.
We also need a type shared_type that contains constant internal
data that is shared between particles (e.g., dimensions, arrays that
are read but not written). If no such data is needed, you can do
using shared_type = dust::no_shared; to indicate this.
That class must also include a type alias that describes the
model's floating point type, real_type. Most models can include
using real_type = double; in their public section.
The class must also include a type alias that describes the model's
data type. If your model does not support data, then write
using data_type = dust::no_data;, which disables the
compare_data and set_data methods. Otherwise see
vignette("data") for more information.
The class must have a constructor that accepts const dust::pars_type<model>& pars for your type model. This will have
elements shared and internal which you can assign into your
model if needed.
The model must have a method size() returning size_t which
returns the size of the system. This size may depend on values
in your initialisation object but is constant within a model
run.
The model must have a method initial (which may not be
const), taking a time step number (size_t) and returning a
std::vector<real_type> of initial state for the model.
The model must have a method update taking arguments:
size_t time: the time step number
const double * state: the state at the beginning of the
time step
dust::rng_state_type<real_type>& rng_state: the dust random number
generator state - this must be a reference, as it will be modified
as random numbers are drawn
double *state_next: the end state of the model
(to be written to by your function)
Your update function is the core here and should update the
state of the system - you are expected to update all values of
state on return.
It is very important that none of the functions in the class use the R API in any way as these functions will be called in parallel.
You must also provide a data/parameter-wrangling function for
producing an object of type dust::pars_type<model> from an R list. We
use cpp11 for this. Your function will look like:
namespace dust {
template <>
dust::pars_type<model> dust_pars<model>(cpp11::list pars) {
// ...
return dust::pars_type<model>(shared, internal);
}
}
With the body interacting with pars to create an object of type
model::shared_type and model::internal_type before returning the
dust::pars_type object. This function will be called in serial
and may use anything in the cpp11 API. All elements of the
returned object must be standard C/C++ (e.g., STL) types and
not cpp11/R types. If your model uses only shared or internal,
you may use the single-argument constructor overload to
dust::pars_type which is equivalent to using dust::no_shared or
dust::no_internal for the missing argument.
Your model may provide a template specialisation
dust::dust_info<model>() returning a cpp11::sexp for
returning arbitrary information back to the R session:
namespace dust {
template <>
cpp11::sexp dust_info<model>(const dust::pars_type<model>& pars) {
return cpp11::wrap(...);
}
}
What you do with this is up to you. If not present then the
info() method on the created object will return NULL.
Potential use cases for this are to return information about
variable ordering, or any processing done while accepting the
pars object used to create the pars fed into the particles.
You can optionally use C++ pseudo-attributes to configure the generated code. Currently we support two attributes:
[[dust::class(classname)]] will tell dust the name of your
target C++ class (in this example classname). You will need to
use this if your file uses more than a single class, as
otherwise dust will try to detect this using extremely simple
heuristics.
[[dust::name(modelname)]] will tell dust the name to use for
the class in R code. For technical reasons this must be
alphanumeric characters only (sorry, no underscore) and must not
start with a number. If not included then the C++ type name will
be used (either specified with [[dust::class()]] or detected).
Your model should only throw exceptions as a last resort. One such
last resort already exists: rbinom throws if given invalid inputs,
to prevent an infinite loop. If an error is thrown, all
particles will complete their current run, and then the error
will be rethrown - this is required by our parallel processing
design. Once this happens, though, the state of the system is
"inconsistent" as it contains particles that have run for
different lengths of time. You can extract the state of the
system at the point of failure (which may help with debugging)
but you will be unable to continue running the object until
you reset it (with $update_state()). An error will be
thrown otherwise.
Things are worse on a GPU; if an error is thrown by the RNG code
(happens in rbinom when given impossible inputs such as
negative sizes, probabilities less than zero or greater than 1)
then we currently use CUDA's __trap() function which will
require a process restart to be able to use anything that uses
the GPU again, covering all methods in the class. However, this
is preferable to the infinite loop that would otherwise be
caused.
dust_generator for a description of the class
of created objects, and dust_example() for some
pre-built examples. If you want to just generate the code and
load it yourself with pkgload::load_all or some other means,
see dust_generate.
# dust includes a couple of very simple examples
filename <- system.file("examples/walk.cpp", package = "dust")

# This model implements a random walk with a parameter coming from
# R representing the standard deviation of the walk
writeLines(readLines(filename))

# The model can be compiled and loaded with dust::dust(filename)
# but it's faster in this example to use the prebuilt version in
# the package
model <- dust::dust_example("walk")

# Print the object and you can see the methods that it provides
model

# Create a model with standard deviation of 1, initial time step zero
# and 30 particles
obj <- model$new(list(sd = 1), 0, 30)
obj

# Current state is all zero
obj$state()

# Current time is also zero
obj$time()

# Run the model up to time step 100
obj$run(100)

# Reorder/resample the particles:
obj$reorder(sample(30, replace = TRUE))

# See the state again
obj$state()
Detect CUDA configuration. This function tries to compile a small
program with nvcc and confirms that this can be loaded into R,
then uses that program to query the presence and capabilities of
your NVIDIA GPUs. If this works, then you can use the GPU-enabled
dust features, and the information returned will help us. It's
quite slow to execute (several seconds) so we cache the value
within a session. Later versions of dust will cache this across
sessions too.
dust_cuda_configuration( path_cuda_lib = NULL, path_cub_include = NULL, quiet = TRUE, forget = FALSE )
path_cuda_lib |
Optional path to the CUDA libraries, if they
are not on system library paths. This will be added as
|
path_cub_include |
Optional path to the CUB headers, if using CUDA < 11.0.0. See Details |
quiet |
Logical, indicating if compilation of test programs should be quiet |
forget |
Logical, indicating if we should forget cached values and recompute the configuration |
Not all installations leave the CUDA libraries on the default
paths, and you may need to provide the path yourself. Specifically, when we link
the dynamic library, if the linker complains about not being able
to find libcudart then your CUDA libraries are not in the
default location. You can manually pass in the path_cuda_lib
argument, or set the DUST_PATH_CUDA_LIB environment variable (in
that order of precedence).
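The order of precedence described above can be sketched in plain R. The helper `resolve_cuda_lib()` is hypothetical and only mirrors the documented behaviour; it is not a dust function, and the paths are examples only.

```r
# Hypothetical helper mirroring the documented precedence:
# explicit argument first, then the DUST_PATH_CUDA_LIB environment variable.
resolve_cuda_lib <- function(path_cuda_lib = NULL) {
  if (!is.null(path_cuda_lib)) {
    return(path_cuda_lib)
  }
  env <- Sys.getenv("DUST_PATH_CUDA_LIB", unset = NA_character_)
  if (!is.na(env) && nzchar(env)) {
    return(env)
  }
  NULL  # nothing given: rely on the system library paths
}

Sys.setenv(DUST_PATH_CUDA_LIB = "/opt/cuda/lib64")  # example path only
resolve_cuda_lib()                                  # uses the environment variable
resolve_cuda_lib("/usr/local/cuda-11.1/lib64")      # the argument wins
```

Setting the variable in your .Renviron file has the same effect as the `Sys.setenv()` call here, but persists across sessions.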
If you are using older CUDA (< 11.0.0) then you need to provide CUB headers, which we use to manage state on the device (these are included in CUDA 11.0.0 and higher). You can provide this as:
a path to this function (path_cub_include)
the environment variable DUST_PATH_CUB_INCLUDE
CUB headers installed into the default location (R >= 4.0.0, see below).
These are checked in turn with the first found taking
precedence. The default location is stored with
tools::R_user_dir("dust", "data"), but this functionality is
only available on R >= 4.0.0.
To install CUB you can do:
dust:::cuda_install_cub(NULL)
which will install CUB into the default path (provide a path on older versions of R and set this path as DUST_PATH_CUB_INCLUDE).
For editing your .Renviron file to set these environment
variables, usethis::edit_r_environ() is very helpful.
A list of configuration information. This includes:
has_cuda: logical, indicating if it is possible to compile CUDA on
this machine (not necessarily use it though)
cuda_version: the version of CUDA found
devices: a data.frame of device information:
id: the device id (integer, typically in a sequence from 0)
name: the human-friendly name of the device
memory: the memory of the device, in MB
version: the compute version for this device
path_cuda_lib: path to CUDA libraries, if required
path_cub_include: path to CUB headers, if required
If compilation of the test program fails, then has_cuda will be
FALSE and all other elements will be NULL.
dust_cuda_options which controls additional CUDA compilation options (e.g., profiling, debug mode or custom flags)
# If you have your CUDA library in an unusual location, then you
# may need to add a path_cuda_lib argument:
dust::dust_cuda_configuration(
  path_cuda_lib = "/usr/local/cuda-11.1/lib64",
  forget = TRUE, quiet = FALSE)

# However, if things are installed in the default location or you
# have set the environment variables described above, then this
# may work:
dust::dust_cuda_configuration(forget = TRUE, quiet = FALSE)
Create options for compiling for CUDA. Unless you need to change paths to libraries/headers, or change the debug level you will probably not need to directly use this. However, it's potentially useful to see what is being passed to the compiler.
dust_cuda_options( ..., debug = FALSE, profile = FALSE, fast_math = FALSE, flags = NULL )
... |
Arguments passed to |
debug |
Logical, indicating if we should compile for debug
(adding |
profile |
Logical, indicating if we should enable profiling |
fast_math |
Logical, indicating if we should enable "fast maths", which lets the optimiser enable optimisations that break IEEE compliance and disables some error checking (see the CUDA docs for more details). |
flags |
Optional extra arguments to pass to nvcc. These
options will not be passed to your normal C++ compiler, nor the
linker (for that use R's user Makevars system). This can be used
to do things like tune the maximum number of registers
( |
An object of type cuda_options, which can be passed into
dust as argument gpu
dust_cuda_configuration which identifies and returns the core CUDA configuration (often used implicitly by this function).
tryCatch( dust::dust_cuda_options(), error = function(e) NULL)
Prepare data for use with the $set_data() method. This is not
required for use but tries to simplify the most common use case
where you have a data.frame with some column indicating "dust
time step" (name_time), and other columns that might be used in
your data_compare function. Each row will be turned into a named
R list, which your dust_data function can then work with to get
this time step's values. See Details for use with multi-pars
objects.
dust_data(object, name_time = "time", multi = NULL)
object |
An object, at this point must be a data.frame |
name_time |
The name of the data column within |
multi |
Control how to interpret data for multi-parameter dust object; see Details |
Note that here "dust time step" (name_time) refers to the dust
time step (which will be a non-negative integer) and not the
rescaled value of time that you probably use within the model. See
dust_generator for more information.
The data object as accepted by $set_data() must be a list and
each element must itself be a list with two elements; the dust
time at which the data applies and any R object that corresponds
to data at that point. We expect that most of the time this second
element will be a key-value list with scalar keys, but more
flexibility may be required.
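The format described above can be sketched by hand in plain R. This is an illustration of the list-of-pairs structure only, using made-up column names `a` and `b`; the exact object that dust_data() produces may differ in detail.

```r
# Hand-built sketch of the structure described above: a list with one
# element per data point, each holding the dust time and the data.
d <- data.frame(time = c(0, 10, 20), a = c(1.5, 2.5, 3.5), b = c(4, 5, 6))

data <- lapply(seq_len(nrow(d)), function(i) {
  list(
    d$time[[i]],                               # dust time for this row
    as.list(d[i, c("a", "b"), drop = FALSE])   # key-value list of scalars
  )
})

str(data[[2]])
```

In practice dust_data() does this conversion for you; building the list by hand is only needed when the second element is something other than a simple key-value list.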
For multi-data objects, the final format is a bit more awkward;
each time step we have a list with elements time, data_1,
data_2, ..., data_n for n parameters. There are two ways of
creating this that might be useful: sharing the data across all
parameters and using some column as a grouping value.
The behaviour here is driven by the multi argument;
NULL: (the default) do nothing; this creates an object that
is suitable for use with a pars_multi = FALSE dust
object.
<integer> (e.g., multi = 3); share the data across 3 sets of
parameters. This number must match the number of parameter sets
that your dust object is created with
<column_name> (e.g., multi = "country"); the name of a column
within your data to split the data at. This column must be a
factor, and that factor must have levels that map to integers 1,
2, ..., n (e.g., unique(as.integer(object[[multi]])) returns
the integers 1:n).
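Before passing a column name as multi, it may help to check the factor requirement yourself. The sketch below uses base R only, with a made-up data.frame and a `country` column; it shows the documented check and one common way the requirement can fail (unused factor levels after subsetting).

```r
# Check that a grouping column satisfies the documented requirement for
# multi = "<column_name>": a factor whose integer codes cover 1..n.
d <- data.frame(
  time = rep(c(0, 10), times = 2),
  cases = c(3, 7, 2, 9),
  country = factor(rep(c("A", "B"), each = 2))
)

codes <- unique(as.integer(d[["country"]]))
n <- nlevels(d[["country"]])
ok <- is.factor(d[["country"]]) && setequal(codes, seq_len(n))
ok

# Subsetting keeps unused levels, so the codes no longer start at 1;
# droplevels() repairs the mapping.
sub <- d[d$country == "B", ]
unique(as.integer(sub$country))
unique(as.integer(droplevels(sub)$country))
```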
A list of dust time/data pairs that will be used for the compare function in a compiled model. Each element is a list of length two or more where the first element is the time step and the subsequent elements are data for that time step.
d <- data.frame(time = seq(0, 50, by = 10), a = runif(6), b = runif(6))
dust::dust_data(d)
Access dust's built-in examples. These are compiled into the
package so that examples and tests can be run more quickly without
having to compile code directly via dust(). These examples are
all "toy" examples, being small and fast to run.
dust_example(name)
name |
The name of the example to use. There are five
examples: |
sir: a basic SIR (Susceptible, Infected, Resistant)
epidemiological model. Draws from the binomial distribution to
update the population between each time step.
sirs: an SIRS model, the SIR model with an added R->S transition.
This has a non-zero steady state, so can be run indefinitely for testing.
volatility: A volatility model that might be applied to
currency fluctuations etc.
walk: A 1D random walk, following a Gaussian distribution each
time step.
logistic: Logistic growth in continuous time
A dust_generator object that can be used to create a
model. See examples for usage.
# A SIR (Susceptible, Infected, Resistant) epidemiological model
sir <- dust::dust_example("sir")
sir

# Initialise the model at time step 0 with 50 independent trajectories
mod <- sir$new(list(), 0, 50)

# Run the model for 400 steps, collecting "infected" every 4th time step
times <- seq(0, 400, by = 4)
mod$set_index(2L)
y <- mod$simulate(times)

# A plot of our epidemic
matplot(times, t(drop(y)), type = "l", lty = 1, col = "#00000044",
        las = 1, xlab = "Time", ylab = "Number infected")
Generate a package out of a dust model. The resulting package can
be installed or loaded via pkgload::load_all() though it
contains minimal metadata and if you want to create a persistent
package you should use dust_package(). This function is
intended for cases where you either want to inspect the code or
generate it once and load multiple times (useful in some workflows
with CUDA models).
dust_generate( filename, quiet = FALSE, workdir = NULL, gpu = FALSE, real_type = NULL, linking_to = NULL, cpp_std = NULL, compiler_options = NULL, optimisation_level = NULL, mangle = FALSE )
filename |
The path to a single C++ file |
quiet |
Logical, indicating if compilation messages from
|
workdir |
Optional working directory to use. If |
gpu |
Logical, indicating if we should generate GPU
code. This requires a considerable amount of additional software
installed (CUDA toolkit and drivers) as well as a
CUDA-compatible GPU. If |
real_type |
Optionally, a string indicating a substitute type to
swap in for your model's |
linking_to |
Optionally, a character vector of additional
packages to add to the |
cpp_std |
The C++ standard to use, if you need to set one
explicitly. See the section "Using C++ code" in "Writing R
extensions" for the details of this, and how it interacts with
the R version currently being used. For R 4.0.0 and above, C++11
will be used; as dust depends on at least this version of R you
will never need to specify a version this low. Sensible options
are |
compiler_options |
A character vector of additional options
to pass through to the C++ compiler. These will be passed
through without any shell quoting or validation, so check the
generated commands and outputs carefully in case of error. Note
that R will apply these before anything in your personal
|
optimisation_level |
A shorthand way of specifying common
compiler options that control optimisation level. By default
( |
mangle |
Logical, indicating if the model name should be
mangled when creating the package. This is safer if you will
load multiple copies of the package into a single session, but
is |
The path to the generated package (will be workdir if
that was provided, otherwise a temporary directory).
filename <- system.file("examples/walk.cpp", package = "dust")
path <- dust::dust_generate(filename)

# Simple package created:
dir(path)
dir(file.path(path, "R"))
dir(file.path(path, "src"))
All dust models are R6 objects and expose a
common set of "methods". To create a dust model of your own,
see dust and to interact with some built-in ones see
dust_example()
A dust_generator object
For discrete time models, dust has an internal "time", which was
called step in version 0.11.x and below. This must always
be non-negative (i.e., zero or more) and always increases in
unit increments. Typically a model will remap this internal
time onto a more meaningful time in model space, e.g. by applying
the transform model_time = offset + time * dt; with this approach
you can start at any real valued time and scale the unit increments
to control the model dynamics.
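The remapping from dust's integer time onto model time can be sketched in plain R. The values of `offset` and `dt` below are illustrative assumptions, as is the helper `to_dust_time()`; dust itself only knows about the integer time, and a model would normally apply the transform in its own code.

```r
# Map dust's internal integer time onto model time.
# offset and dt are illustrative values, not dust parameters.
offset <- 2020   # model time at dust time 0
dt <- 0.25       # model time units per dust time step

dust_time <- 0:8
model_time <- offset + dust_time * dt
model_time

# Inverse mapping: the dust time step that corresponds to a model time
to_dust_time <- function(t, offset, dt) {
  as.integer(round((t - offset) / dt))
}
to_dust_time(2021, offset, dt)
```

The inverse mapping is useful when preparing data, since the time column passed to dust refers to the integer dust time rather than the rescaled model time.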
new()
Create a new model. Note that the behaviour of the object
created by this function will change considerably based on
whether the pars_multi argument is TRUE. If not (the
default) then we create n_particles which all share the same
parameters as specified by the pars argument. If pars_multi
is TRUE then pars must be an unnamed list, and each element
of it represents a different set of parameters. We will
create length(pars) sets of n_particles particles which
will be simulated together. These particles must have the same
dimension - that is, they must correspond to model state that
is the same size.
dust_generator$new( pars, time, n_particles, n_threads = 1L, seed = NULL, pars_multi = FALSE, deterministic = FALSE, gpu_config = NULL, ode_control = NULL )
pars: Data to initialise your model with; a list
object, but the required elements will depend on the details of
your model. If pars_multi is TRUE, then this must be an
unnamed list of pars objects (see Details).
time: Initial time - must be nonnegative
n_particles: Number of particles to create - must be at least 1
n_threads: Number of OMP threads to use, if dust and
your model were compiled with OMP support (details to come).
n_particles should be a multiple of n_threads (e.g., if you use 8
threads, then you should have 8, 16, 24, etc. particles). However, this
is not compulsory.
seed: The seed to use for the random number generator. Can
be a positive integer, NULL (initialise with R's random number
generator) or a raw vector of a length that is a multiple of
32 to directly initialise the generator (e.g., from the
dust object's $rng_state() method).
pars_multi: Logical, indicating if pars should be
interpreted as a set of different initialisations, and that we
should prepare n_particles * length(pars) particles for
simulation. This has an effect on many of the other methods of
the object.
deterministic: Run random number generation deterministically, replacing a random number from some distribution with its expectation. Deterministic models are not compatible with running on a GPU.
gpu_config: GPU configuration, typically an integer
indicating the device to use, where the model has GPU support.
If not given, then the default value of NULL will fall back on the
first found device if any are available. An error is thrown if the
device id given is larger than those reported to be available (note
that CUDA numbers devices from 0, so that '0' is the first device,
and so on). See the method $gpu_info() for available device ids;
this can be called before object creation as
dust_generator$public_methods$gpu_info().
For additional control, provide a list with elements device_id
and run_block_size. Further options (and validation) of this
list will be added in a future version!
ode_control: For ODE models, control over the integration;
must be a dust_ode_control model, produced by
dust_ode_control(). It is an error to provide a non-NULL
value for discrete time models.
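The rules for pars when pars_multi is TRUE can be sketched as a small check in plain R. The helper `check_pars_multi()` is hypothetical and not part of dust; it only restates the documented constraints (an unnamed list, with n_particles particles created per parameter set), and the `sd` parameter is borrowed from the walk example.

```r
# Hypothetical helper that mirrors the documented rules for pars when
# pars_multi = TRUE; it is not part of dust.
check_pars_multi <- function(pars, n_particles) {
  stopifnot(is.list(pars), is.null(names(pars)), n_particles >= 1)
  # Each element is one parameter set; all sets are simulated together
  list(
    n_pars = length(pars),
    n_particles_total = n_particles * length(pars)
  )
}

pars <- list(list(sd = 1), list(sd = 2), list(sd = 4))
check_pars_multi(pars, n_particles = 10)
```

A named outer list such as `list(a = list(sd = 1))` fails this check, which matches the requirement that pars be unnamed in the multi-parameter case.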
name()
Returns friendly model name
dust_generator$name()
param()
Returns parameter information, if provided by the model. This
describes the contents of pars passed to the constructor or to
$update_state() as the pars argument, and the details depend
on the model.
dust_generator$param()
run()
Run the model up to a point in time, returning the filtered state at that point.
dust_generator$run(time_end)
time_end: Time to run to (if less than or equal to the current time(), silently nothing will happen)
simulate()
Iterate all particles forward in time over a series of times,
collecting output as they go. This is a helper around $run()
where you want to run to a series of points in time and save
output. The returned object will be filtered by your active index,
so that it has shape (n_state x n_particles x length(time_end))
for single-parameter objects, and (n_state x n_particles x
n_pars x length(time_end)) for multiparameter objects. Note that
this method is very similar to $run(), except that the array
returned by $run() has rank one less (it has no time dimension).
For a scalar time_end you would ordinarily want to use $run() but
the resulting numbers would be identical.
dust_generator$simulate(time_end)
time_end: A vector of time points that the simulation should report output at. The first time must be at least the current time, and every subsequent time must be equal to or greater than those before it (ties are allowed though probably not wanted).
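The shape of the returned array for a single-parameter object can be explored with a stand-in array in plain R. The values are placeholders rather than simulation output; only the dimensions follow the description above, using the sizes from the sir example (one indexed state, 50 particles).

```r
# Stand-in for the array returned by $simulate() on a single-parameter
# object: n_state x n_particles x length(time_end). Values are placeholders.
n_state <- 1L        # e.g. after set_index() with a single index
n_particles <- 50L
time_end <- seq(0, 400, by = 4)
y <- array(0, c(n_state, n_particles, length(time_end)))

dim(y)            # n_state x n_particles x length(time_end)
dim(drop(y))      # the state dimension of extent 1 is dropped
dim(t(drop(y)))   # one row per time point, as matplot() expects
```

This is why the sir example plots `t(drop(y))`: dropping the singleton state dimension and transposing gives one column per particle trajectory.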
run_adjoint()
Run model with gradient information (if supported). The interface here will change, and documentation written once it stabilises.
dust_generator$run_adjoint()
set_index()
Set the "index" vector that is used to return a subset of the state
after using run(). If this is not used then run() returns
all elements in your state vector, which may be excessive and slower
than necessary.
dust_generator$set_index(index)
index: The index vector - must be an integer vector with elements between 1 and the length of the state (this will be validated, and an error thrown if an invalid index is given).
index()
Returns the index as set by $set_index
dust_generator$index()
ode_control()
Return the ODE control set into the object on creation.
For discrete-time models this always returns NULL.
dust_generator$ode_control()
ode_statistics()
Return statistics about the integration, for ODE models. For discrete time models this makes little sense and so errors if used.
dust_generator$ode_statistics()
n_threads()
Returns the number of threads that the model was constructed with
dust_generator$n_threads()
n_state()
Returns the length of the per-particle state
dust_generator$n_state()
n_particles()
Returns the number of particles
dust_generator$n_particles()
n_particles_each()
Returns the number of particles per parameter set
dust_generator$n_particles_each()
shape()
Returns the shape of the particles
dust_generator$shape()
update_state()
Update one or more components of the model state.
This method can be used to update any or all of pars, state and
time. If both pars and time are given and state is not,
then by default we will update the model internal state according
to your model's initial conditions - use set_initial_state = FALSE
to prevent this.
dust_generator$update_state( pars = NULL, state = NULL, time = NULL, set_initial_state = NULL, index = NULL, reset_step_size = NULL )
pars: New pars for the model (see constructor)
state: The state vector - can be either a numeric vector with the same length as the model's current state (in which case the same state is applied to all particles), or a numeric matrix with as many rows as your model's state and as many columns as you have particles (in which case you can set a number of different starting states at once).
time: New initial time for the model. If this
is a vector (with the same length as the number of particles), then
particles are started from different initial times and run up to the
largest time given (i.e., max(time))
set_initial_state: Control if the model initial state
should be set while setting parameters. It is an error for
this to be TRUE when either pars is NULL or when state
is non-NULL.
index: Used in conjunction with state, use this to set a
fraction of the model state; the index vector provided must
be the same length as the number of provided states, and
indicates the index within the model state that should be updated.
For example, if your model has states [a, b, c, d] and
you provide an index of [1, 3] then if state was [10, 20]
you would set a to 10 and c to 20.
reset_step_size: Logical, indicating if we should reset the initial step size. This only has an effect with ode models and is silently ignored in discrete time models where the step size is constant.
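The effect of the index argument on a partial state update can be mimicked with ordinary vector assignment. This is a plain-R sketch of the documented example with states [a, b, c, d]; it shows the semantics only and is not how dust stores or updates state internally.

```r
# Plain-R sketch of the documented index semantics for a partial state
# update; this mimics the effect and is not dust's implementation.
state <- c(a = 1, b = 2, c = 3, d = 4)  # current model state
index <- c(1L, 3L)                      # positions to update
new_values <- c(10, 20)                 # what would be passed as state

# The index must be the same length as the provided state
stopifnot(length(index) == length(new_values))
state[index] <- new_values
state
```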
state()
Return full model state
dust_generator$state(index = NULL)
index: Optional index to select state using
time()
Return current model time
For ODE models, sets the schedule at which stochastic events are
handled. The timing here is quite subtle - an event happens
immediately after the time (so at time + eps). If your model
runs up to time an event is not triggered, but as soon as that
time is passed, by any amount of time, the event will trigger. It
is an error to set this to a non-NULL value in a discrete time
model; later we may generalise the approach here.
dust_generator$time()
set_stochastic_schedule()
dust_generator$set_stochastic_schedule(time)
timeA vector of times to run the stochastic update at
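The timing rule for the stochastic schedule can be sketched in base R. This is only an illustration of the rule as described (an event at time t fires once the model has moved strictly past t); events_fired is a hypothetical helper and not part of dust.

```r
# Hypothetical helper (not part of dust): which scheduled events fire while
# running from `time_from` up to `time_to`? An event at time `t` fires only
# once the model has moved strictly past `t`.
events_fired <- function(schedule, time_from, time_to) {
  schedule[schedule >= time_from & schedule < time_to]
}

schedule <- c(1, 2, 3)
first <- events_fired(schedule, 0, 2)     # stops exactly at t = 2
second <- events_fired(schedule, 2, 2.5)  # moves past t = 2
print(first)
print(second)
```

Running up to exactly time 2 fires only the event at time 1; the event at time 2 fires on the next run, as soon as time 2 is passed.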
reorder()
Reorder particles.
dust_generator$reorder(index)
index: An integer vector, with values between 1 and n_particles, indicating the index of the current particles that new particles should take.
resample()
Resample particles according to some weight.
dust_generator$resample(weights)
weights: A numeric vector representing particle weights. For a "multi-parameter" dust object this should be a matrix with the number of rows being the number of particles per parameter set and the number of columns being the number of parameter sets.
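To make the link between weights and a reorder index concrete, here is a base R sketch of systematic resampling, one common scheme for this step. Whether dust uses exactly this scheme is not stated in this section, and systematic_resample is a hypothetical helper, not part of dust.

```r
# Hypothetical helper (not part of dust): systematic resampling.
# Returns, for each new particle, the index of the current particle to copy.
systematic_resample <- function(weights, u = runif(1)) {
  n <- length(weights)
  cum <- cumsum(weights / sum(weights))
  # n evenly spaced points in [0, 1), all shifted by the same offset u / n
  points <- (u + seq_len(n) - 1) / n
  # For each point, the first particle whose cumulative weight exceeds it
  pmin(findInterval(points, cum) + 1L, n)
}

weights <- c(1, 2, 3, 4)
index <- systematic_resample(weights, u = 0.5)

# Applying the index to a state matrix (states in rows, particles in columns)
state <- matrix(1:8, nrow = 2)
resampled <- state[, index, drop = FALSE]
print(index)
```

The resulting index has the form described for $reorder(): one entry per new particle, each between 1 and the number of particles.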
info()
Returns information about the pars that your model was created with.
Only returns non-NULL if the model provides a dust_info template
specialisation.
dust_generator$info()
pars()
Returns the pars object that your model was constructed with.
dust_generator$pars()
rng_state()
Returns the state of the random number generator. This returns a
raw vector of length 32 * n_particles. This can be useful for
debugging or for initialising other dust objects. The arguments
first_only and last_only are mutually exclusive. If neither is
given then all particle states are returned, being 32 bytes
per particle. The full returned state or first_only are most
suitable for reseeding a new dust object.
dust_generator$rng_state(first_only = FALSE, last_only = FALSE)
first_only: Logical, indicating if we should return only the first random number state
last_only: Logical, indicating if we should return only the last random number state, which does not belong to a particle.
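The 32-bytes-per-state layout can be explored in base R. The sketch below uses a dummy raw vector in place of a real $rng_state() result, and assumes that the states are stored contiguously and in order, which this section does not state explicitly.

```r
# Stand-in for the raw vector returned by $rng_state(); here 3 states of
# 32 bytes each, filled with dummy bytes.
n_states <- 3
rng_state <- as.raw(seq_len(32 * n_states) %% 256)

# One column per state, 32 bytes (rows) each.
# Assumption: states are stored one after another in the raw vector.
per_state <- matrix(rng_state, nrow = 32)
first_state <- per_state[, 1]
last_state <- per_state[, ncol(per_state)]
print(dim(per_state))
```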
set_rng_state()
Set the random number state for this model. This replaces the RNG state that the model is using with a state of your choosing, saved out from a different model object. This method is designed to support advanced use cases where it is easier to manipulate the state of the random number generator than the internal state of the dust object.
dust_generator$set_rng_state(rng_state)
rng_state: A random number state, as saved out by the
$rng_state() method. Note that unlike seed as passed to the
constructor, this must be a raw vector of the expected length.
has_openmp()
Returns a logical, indicating if this model was compiled with
"OpenMP" support, in which case it will react to the n_threads
argument passed to the constructor. This method can also be used
as a static method by running it directly
as dust_generator$public_methods$has_openmp()
dust_generator$has_openmp()
has_gpu_support()
Returns a logical, indicating if this model was compiled with
"CUDA" support, in which case it will react to the device
argument passed to the run method. This method can also be used
as a static method by running it directly
as dust_generator$public_methods$has_gpu_support()
dust_generator$has_gpu_support(fake_gpu = FALSE)
fake_gpu: Logical, indicating if we count as TRUE
models that run on the "fake" GPU (i.e., using the GPU
version of the model but running on the CPU)
has_compare()
Returns a logical, indicating if this model was compiled with
"compare" support, in which case the set_data and compare_data
methods are available (otherwise these methods will error). This
method can also be used as a static method by running it directly
as dust_generator$public_methods$has_compare()
dust_generator$has_compare()
real_size()
Return the size of real numbers (in bits). Typically this will be
64 for double precision and 32 for float. This method can also be
used as a static method by running it directly as
dust_generator$public_methods$real_size()
dust_generator$real_size()
time_type()
Return the type of time this model uses; will be one of discrete
(for discrete time models) or continuous (for ODE models).
This method can also be used as a static method by running it
directly as dust_generator$public_methods$time_type()
dust_generator$time_type()
rng_algorithm()
Return the random number algorithm used. Typically this will be
xoshiro256plus for models using double precision reals and
xoshiro128plus for single precision (float). This method can
also be used as a static method by running it directly as
dust_generator$public_methods$rng_algorithm()
dust_generator$rng_algorithm()
uses_gpu()
Check if the model is running on a GPU
dust_generator$uses_gpu(fake_gpu = FALSE)
fake_gpu: Logical, indicating if we count as TRUE
models that run on the "fake" GPU (i.e., using the GPU
version of the model but running on the CPU)
n_pars()
Returns the number of distinct pars elements required. This is 0
where the object was initialised with pars_multi = FALSE and
an integer otherwise. For multi-pars dust objects, where pars
is accepted, you must provide an unnamed list of length $n_pars().
dust_generator$n_pars()
set_n_threads()
Change the number of threads that the dust object will use. Your model must be compiled with "OpenMP" support for this to have an effect. Returns (invisibly) the previous value.
dust_generator$set_n_threads(n_threads)
n_threads: The new number of threads to use. You may want to
wrap this argument in dust_openmp_threads() in order to
verify that you can actually use the number of threads
requested (based on environment variables and OpenMP support).
set_data()
Set "data" into the model for use with the $compare_data() method.
This is not supported by all models, depending on if they define a
data_type type. See dust_data() for a helper function to
construct suitable data and a description of the required format. You
will probably want to use that here, and definitely if using multiple
parameter sets.
dust_generator$set_data(data, shared = FALSE)
data: A list of data to set.
shared: Logical, indicating if the data should be shared
across all parameter sets, if your model is initialised to use
more than one parameter set (pars_multi = TRUE).
compare_data()
Compare the current model state against the data as set by
set_data. If there is no data set, or no data corresponding to
the current time then NULL is returned. Otherwise a numeric vector
the same length as the number of particles is returned. If the model's
underlying compare_data function is stochastic, then each call to
this function may result in a different answer.
dust_generator$compare_data()
filter()
Run a particle filter. The interface here will change a lot over the
next few versions. You must reset the dust object using
$update_state(pars = ..., time = ...) before using this method to
get sensible values.
dust_generator$filter( time_end = NULL, save_trajectories = FALSE, time_snapshot = NULL, min_log_likelihood = NULL )
time_end: The time to run to. If NULL, run to the end
of the last data. This value must be larger than the current
model time ($time()) and must exactly appear in the data.
save_trajectories: Logical, indicating if the filtered particle
trajectories should be saved. If TRUE then the trajectories element
will be a multidimensional array (state x <shape> x time)
containing the state values, selected according to the index set
with $set_index().
time_snapshot: Optional integer vector indicating times
that we should record a snapshot of the full particle filter state.
If given it must be a strictly increasing vector whose elements
match times given in the data object. The return value will be
a multidimensional array (state x <shape> x time_snapshot)
containing full state values at the requested times.
min_log_likelihood: Optionally, a numeric value representing
the smallest likelihood we are interested in. If non-NULL, this must be
either a scalar value or a vector the same length as the number
of parameter sets. Not yet supported, and included for future
compatibility.
gpu_info()
Return information about GPU devices, if the model
has been compiled with CUDA/GPU support. This can be called as a
static method by running dust_generator$public_methods$gpu_info().
If run from a GPU enabled object, it will also have an element
config containing the computed device configuration: the device
id, shared memory and the block size for the run method on the
device.
dust_generator$gpu_info()
    # An example dust object from the package:
    walk <- dust::dust_example("walk")

    # The generator object has class "dust_generator"
    class(walk)

    # The methods below are described in the documentation
    walk
Create a control object for controlling the adaptive stepper for systems of ordinary differential equations (ODEs). The returned object can be passed into a continuous-time dust model on initialisation.
dust_ode_control( max_steps = NULL, atol = NULL, rtol = NULL, step_size_min = NULL, step_size_max = NULL, debug_record_step_times = NULL )
max_steps |
Maximum number of steps to take. If the integration attempts to take more steps than this, it will throw an error, stopping the integration. |
atol |
The per-step absolute tolerance. |
rtol |
The per-step relative tolerance. The total accuracy will be less than this. |
step_size_min |
The minimum step size. The actual minimum
used will be the largest of the absolute value of this
|
step_size_max |
The largest step size. By default there is no maximum step size (Inf) so the solver can take as large a step as it wants to. If you have short-lived fluctuations in your rhs that the solver may skip over by accident, then specify a smaller maximum step size here. |
debug_record_step_times |
Logical, indicating if we should record
the steps taken. This information will be available as part of
the |
A named list of class "dust_ode_control"
    # We include an example of logistic growth with the package
    gen <- dust::dust_example("logistic")

    # Create a control object, then pass it through as the ode_control
    # parameter to the constructor:
    ctrl <- dust::dust_ode_control(atol = 1e-3, rtol = 1e-3)
    mod <- gen$new(list(r = 0.1, K = 100), 0, 1, ode_control = ctrl)

    # When the model runs, the control parameters passed in to the
    # constructor are used in the solution
    mod$run(10)

    # The full set of control parameters can be extracted:
    mod$ode_control()
Return information about OpenMP support for this system. For
individual models look at the $has_openmp() method.
dust_openmp_support(check_compile = FALSE)
check_compile |
Logical, indicating if we should check if we can compile an openmp program - this is slow the first time. |
A list with information about the openmp support on your system.
The first few elements come from the openmp library directly:
num_proc, max_threads, thread_limit; these correspond to a
call to the function omp_get_<name>() in C and
openmp_version which is the value of the _OPENMP macro.
A logical has_openmp which is TRUE if it looks like runtime
OpenMP support is available
The next elements tell you about different sources that might
control the number of threads allowed to run: mc.cores (from
the R option with the same name), OMP_THREAD_LIMIT,
OMP_NUM_THREADS, MC_CORES (from environment variables),
limit_r (limit computed against R-related control variables),
limit_openmp (limit computed against OpenMP-related variables)
and limit the smaller of limit_r and limit_openmp
dust_openmp_threads() for setting a polite number of
threads.
    # System wide support
    dust::dust_openmp_support()

    # Support compiled into a generator
    walk <- dust::dust_example("walk")
    walk$public_methods$has_openmp()

    # Support from an instance of that model
    model <- walk$new(list(sd = 1), 0, 1)
    model$has_openmp()
Politely select a number of threads to use. See Details for the algorithm
dust_openmp_threads(n = NULL, action = "error")
n |
Either |
action |
An action to perform if |
There are two limits and we will take the smaller of the two.
The first limit comes from piggy-backing off of R's normal
parallel configuration; we will use the MC_CORES environment
variable and mc.cores option as a guide to how many cores you
are happy to use. We take mc.cores first, then MC_CORES, which
is the same behaviour as parallel::mclapply and friends.
The second limit comes from openmp. If you do not have OpenMP
support, then we use one thread (higher numbers have no effect at
all in this case). If you do have OpenMP support, we take the
smallest of the number of "processors" (reported by
omp_get_num_procs()), the "max threads" (reported by
omp_get_max_threads()) and "thread_limit" (reported by
omp_get_thread_limit()).
See dust_openmp_support() for all the values
that go into this calculation.
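The calculation described above can be sketched as a pure function in base R. This is an illustration only: polite_thread_limit and its argument names are hypothetical, and the behaviour when neither mc.cores nor MC_CORES is set (treated here as "no limit from R") is an assumption.

```r
# Hypothetical helper (not part of dust): combine the R-side and
# OpenMP-side limits and take the smaller of the two.
polite_thread_limit <- function(mc_cores_option = NULL, mc_cores_env = NULL,
                                has_openmp = TRUE, num_procs = 1L,
                                max_threads = 1L, thread_limit = 1L) {
  # R-side limit: the mc.cores option wins over the MC_CORES variable.
  # Assumption: with neither set, the R side imposes no limit.
  limit_r <- if (!is.null(mc_cores_option)) {
    mc_cores_option
  } else if (!is.null(mc_cores_env)) {
    mc_cores_env
  } else {
    Inf
  }
  # OpenMP-side limit: one thread without OpenMP, else the smallest value
  limit_openmp <- if (has_openmp) {
    min(num_procs, max_threads, thread_limit)
  } else {
    1L
  }
  min(limit_r, limit_openmp)
}

limit_option <- polite_thread_limit(mc_cores_option = 4, mc_cores_env = 2,
                                    num_procs = 8, max_threads = 8,
                                    thread_limit = 16)
limit_serial <- polite_thread_limit(mc_cores_option = 4, has_openmp = FALSE)
print(c(limit_option, limit_serial))
```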
An integer, indicating the number of threads that you can use
    # Default number of threads; tries to pick something friendly,
    # erring on the conservative side.
    dust::dust_openmp_threads(NULL)

    # Try to pick something silly and it will be reduced for you
    dust::dust_openmp_threads(1000, action = "fix")
Updates a dust model in a package. The user-provided code is
assumed to be in inst/dust as a series of C++ files; a file
inst/dust/model.cpp will be transformed into a file
src/model.cpp.
dust_package(path, quiet = FALSE)
path |
Path to the package |
quiet |
Passed to |
If your code provides a class model then dust will create C++
functions such as dust_model_alloc - if your code also includes
names such as this, compilation will fail due to duplicate
symbols.
We add "cpp11 attributes" to the created functions, and will run
cpp11::cpp_register() on them once the generated code
has been created.
Your package needs a src/Makevars file to enable openmp (if your
system supports it). If it is not present then a suitable Makevars
will be written, containing
    PKG_CXXFLAGS=$(SHLIB_OPENMP_CXXFLAGS)
    PKG_LIBS=$(SHLIB_OPENMP_CXXFLAGS)
following "Writing R Extensions" (see section "OpenMP support").
If your package does contain a src/Makevars file we do not
attempt to edit it but will error if it looks like it does not
contain these lines or similar.
You also need to make sure that your package loads the dynamic
library; if you are using roxygen, then you might create a file
(say, R/zzz.R) containing
    #' @useDynLib packagename, .registration = TRUE
    NULL
substituting packagename for your package name as
appropriate. This will create an entry in NAMESPACE.
Nothing, this function is called for its side effects
vignette("dust") which contains more discussion of this
process
    # This is explained in more detail in the package vignette
    path <- system.file("examples/sir.cpp", package = "dust", mustWork = TRUE)
    dest <- tempfile()
    dir.create(dest)
    dir.create(file.path(dest, "inst/dust"), FALSE, TRUE)
    writeLines(c("Package: example",
                 "Version: 0.0.1",
                 "LinkingTo: cpp11, dust"),
               file.path(dest, "DESCRIPTION"))
    writeLines("useDynLib('example', .registration = TRUE)",
               file.path(dest, "NAMESPACE"))
    file.copy(path, file.path(dest, "inst/dust"))

    # An absolutely minimal skeleton contains a DESCRIPTION, NAMESPACE
    # and one or more dust model files to compile:
    dir(dest, recursive = TRUE)

    # Running dust_package will fill in the rest
    dust::dust_package(dest)

    # More files here now
    dir(dest, recursive = TRUE)
Repair the environment of a dust object created by dust() and then saved and reloaded by saveRDS() and readRDS(). Because we use a fake temporary package to hold the generated code, it will not ordinarily be loaded properly without using this.
dust_repair_environment(generator, quiet = FALSE)
generator |
The dust generator |
quiet |
Logical, indicating if we should be quiet (default prints some progress information) |
Nothing, called for its side effects
Create an object that can be used to generate random numbers with the same RNG as dust uses internally. This is primarily meant for debugging and testing the underlying C++ rather than a source of random numbers from R.
A dust_rng object, which can be used to draw random
numbers from dust's distributions.
The underlying random number generators are designed to work in
parallel, and with random access to parameters (see
vignette("rng") for more details). However, this is usually
done within the context of running a model where each particle
sees its own stream of numbers. We provide some support for
running random number generators in parallel, but any speed
gains from parallelisation are likely to be somewhat eroded by
the overhead of copying around a large number of random numbers.
All the random distribution functions support an argument
n_threads which controls the number of threads used. This
argument will silently have no effect if your installation
does not support OpenMP (see dust_openmp_support).
Parallelisation will be performed at the level of the stream,
where we draw n numbers from each stream for a total of n * n_streams random numbers using n_threads threads to do this.
Setting n_threads to be higher than n_streams will therefore
have no effect. If running on somebody else's system (e.g., an
HPC, CRAN) you must respect the various environment variables
that control the maximum allowable number of threads; consider
using dust_openmp_threads to select a safe number.
With the exception of random_real, each random number
distribution accepts parameters; the interpretations of these
will depend on n, n_streams and their rank.
If a scalar then we will use the same parameter value for every draw from every stream
If a vector with length n then we will draw n random
numbers per stream, and every stream will use the same parameter
value for a given draw (but a different, shared, parameter value
for subsequent draws).
If a matrix is provided with one row and n_streams
columns then we use different parameters for each stream, but
the same parameter for each draw.
If a matrix is provided with n rows and n_streams
columns then we use a parameter value [i, j] for the ith
draw on the jth stream.
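The four rules above amount to expanding each parameter to an n by n_streams matrix. The base R sketch below shows one way to express that expansion; expand_parameter is a hypothetical helper for illustration and is not how dust implements it internally.

```r
# Hypothetical helper (not part of dust): expand a parameter to an
# n x n_streams matrix (draws in rows, streams in columns).
expand_parameter <- function(x, n, n_streams) {
  if (is.matrix(x)) {
    stopifnot(ncol(x) == n_streams, nrow(x) %in% c(1, n))
    # One row: same value for each draw, varying by stream
    if (nrow(x) == 1) x[rep(1, n), , drop = FALSE] else x
  } else {
    stopifnot(length(x) %in% c(1, n))
    # Scalar: shared everywhere; vector: varies by draw, shared by stream
    matrix(rep_len(x, n), nrow = n, ncol = n_streams)
  }
}

# A vector of length n varies over draws and is shared across streams
by_draw <- expand_parameter(c(1, 2, 3), n = 3, n_streams = 2)

# A one-row matrix varies over streams and is shared across draws
by_stream <- expand_parameter(matrix(c(10, 20), nrow = 1), n = 3, n_streams = 2)
print(by_draw)
print(by_stream)
```

In both cases element [i, j] of the result is the parameter used for the ith draw on the jth stream.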
The rules are slightly different for the prob argument to
multinomial as for that prob is a vector of values. As such
we shift all dimensions by one:
If a vector, we use the same prob for every draw from every stream
and there are length(prob) possible outcomes.
If a matrix with n columns then we vary over each draw (the
ith draw using vector prob[, i]) but share across all
streams. There are nrow(prob) possible outcomes.
If a 3d array is provided with 1 column and n_streams
"layers" (the third dimension) then we use different
parameters for each stream, but the same parameter for each
draw.
If a 3d array is provided with n columns and n_streams
"layers" then we vary over both draws and streams so that we
use vector prob[, i, j] for the ith draw on the jth
stream.
The output will not differ based on the number of threads used, only on the number of streams.
info: Information about the generator (read-only)
new()
Create a dust_rng object
dust_rng$new( seed = NULL, n_streams = 1L, real_type = "double", deterministic = FALSE )
seed: The seed, as an integer, a raw vector or NULL.
If an integer we will create a suitable seed via the "splitmix64"
algorithm, if a raw vector it must be the correct length (a multiple
of either 32 or 16 for real_type = "double" or real_type = "float"
respectively). If NULL then we create a seed using R's random
number generator.
n_streams: The number of streams to use (see Details)
real_type: The type of floating point number to use. Currently
only float and double are supported (with double being
the default). This will have no (or negligible) impact on speed,
but exists to test the low-precision generators.
deterministic: Logical, indicating if we should use "deterministic" mode where distributions return their expectations and the state is never changed.
size()
Number of streams available
dust_rng$size()
jump()
The jump function updates the random number state for each stream by advancing it to a state equivalent to 2^128 numbers drawn from each stream.
dust_rng$jump()
long_jump()
Longer than $jump, the $long_jump method is
equivalent to 2^192 numbers drawn from each stream.
dust_rng$long_jump()
random_real()
Generate n numbers from a standard uniform distribution
dust_rng$random_real(n, n_threads = 1L)
n: Number of samples to draw (per stream)
n_threads: Number of threads to use; see Details
random_normal()
Generate n numbers from a standard normal distribution
dust_rng$random_normal(n, n_threads = 1L, algorithm = "box_muller")
n: Number of samples to draw (per stream)
n_threads: Number of threads to use; see Details
algorithm: Name of the algorithm to use; currently box_muller
and ziggurat are supported, with the latter being considerably
faster.
uniform()
Generate n numbers from a uniform distribution
dust_rng$uniform(n, min, max, n_threads = 1L)
n: Number of samples to draw (per stream)
min: The minimum of the distribution (length 1 or n)
max: The maximum of the distribution (length 1 or n)
n_threads: Number of threads to use; see Details
normal()
Generate n numbers from a normal distribution
dust_rng$normal(n, mean, sd, n_threads = 1L, algorithm = "box_muller")
n: Number of samples to draw (per stream)
mean: The mean of the distribution (length 1 or n)
sd: The standard deviation of the distribution (length 1 or n)
n_threads: Number of threads to use; see Details
algorithm: Name of the algorithm to use; currently box_muller
and ziggurat are supported, with the latter being considerably
faster.
binomial()
Generate n numbers from a binomial distribution
dust_rng$binomial(n, size, prob, n_threads = 1L)
n: Number of samples to draw (per stream)
size: The number of trials (zero or more, length 1 or n)
prob: The probability of success on each trial (between 0 and 1, length 1 or n)
n_threads: Number of threads to use; see Details
nbinomial()
Generate n numbers from a negative binomial distribution
dust_rng$nbinomial(n, size, prob, n_threads = 1L)
n: Number of samples to draw (per stream)
size: The target number of successful trials (zero or more, length 1 or n)
prob: The probability of success on each trial (between 0 and 1, length 1 or n)
n_threads: Number of threads to use; see Details
hypergeometric()
Generate n numbers from a hypergeometric distribution
dust_rng$hypergeometric(n, n1, n2, k, n_threads = 1L)
gamma()
Generate n numbers from a gamma distribution
dust_rng$gamma(n, shape, scale, n_threads = 1L)
n: Number of samples to draw (per stream)
shape: Shape
scale: Scale
n_threads: Number of threads to use; see Details
poisson()
Generate n numbers from a Poisson distribution
dust_rng$poisson(n, lambda, n_threads = 1L)
n: Number of samples to draw (per stream)
lambda: The mean (zero or more, length 1 or n). Only valid for lambda <= 10^7
n_threads: Number of threads to use; see Details
exponential()
Generate n numbers from an exponential distribution
dust_rng$exponential(n, rate, n_threads = 1L)
n: Number of samples to draw (per stream)
rate: The rate of the exponential
n_threads: Number of threads to use; see Details
cauchy()
Generate n draws from a Cauchy distribution.
dust_rng$cauchy(n, location, scale, n_threads = 1L)
n: Number of samples to draw (per stream)
location: The location of the peak of the distribution (also its median)
scale: A scale parameter, which specifies the distribution's "half-width at half-maximum"
n_threads: Number of threads to use; see Details
multinomial()
Generate n draws from a multinomial distribution.
In contrast with most of the distributions here, each draw is a
vector with the same length as prob.
dust_rng$multinomial(n, size, prob, n_threads = 1L)
n: The number of samples to draw (per stream)
size: The number of trials (zero or more, length 1 or n)
prob: A vector of probabilities for the success of each
trial. This does not need to sum to 1 (though all elements
must be non-negative), in which case we interpret prob as
weights and normalise so that they sum to 1 before sampling.
n_threads: Number of threads to use; see Details
state()
Returns the state of the random number stream. This returns a raw vector of length 32 * n_streams. It is primarily intended for debugging as one cannot (yet) initialise a dust_rng object with this state.
dust_rng$state()
    rng <- dust::dust_rng$new(42)

    # Shorthand for Uniform(0, 1)
    rng$random_real(5)

    # Shorthand for Normal(0, 1)
    rng$random_normal(5)

    # Uniform random numbers between min and max
    rng$uniform(5, -2, 6)

    # Normally distributed random numbers with mean and sd
    rng$normal(5, 4, 2)

    # Binomially distributed random numbers with size and prob
    rng$binomial(5, 10, 0.3)

    # Negative binomially distributed random numbers with size and prob
    rng$nbinomial(5, 10, 0.3)

    # Hypergeometric distributed random numbers with parameters n1, n2 and k
    rng$hypergeometric(5, 6, 10, 4)

    # Gamma distributed random numbers with parameters shape and scale
    rng$gamma(5, 0.5, 2)

    # Poisson distributed random numbers with mean lambda
    rng$poisson(5, 2)

    # Exponentially distributed random numbers with rate
    rng$exponential(5, 2)

    # Multinomial distributed random numbers with size and vector of
    # probabilities prob
    rng$multinomial(5, 10, c(0.1, 0.3, 0.5, 0.1))
Create a set of initial random number seeds suitable for using within a distributed context (over multiple processes or nodes) at a level higher than a single group of synchronised threads.
dust_rng_distributed_state( seed = NULL, n_streams = 1L, n_nodes = 1L, algorithm = "xoshiro256plus" )
dust_rng_distributed_pointer( seed = NULL, n_streams = 1L, n_nodes = 1L, algorithm = "xoshiro256plus" )
seed |
Initial seed to use. As for dust_rng, this can
be |
n_streams |
The number of streams to create per node. If
passing the results of this seed to a dust object's initialiser
(see dust_generator) you can safely leave this at 1, but
if using in a standalone setting, and especially if using
|
n_nodes |
The number of separate seeds to create. Each will be separated by a "long jump" for your generator. |
algorithm |
The name of an algorithm to use. Alternatively
pass a |
See vignette("rng_distributed") for a proper introduction to
these functions.
A list of either raw vectors (for
dust_rng_distributed_state) or of dust_rng_pointer
objects (for dust_rng_distributed_pointer)
    dust::dust_rng_distributed_state(n_nodes = 2)
    dust::dust_rng_distributed_pointer(n_nodes = 2)
This function exists to support use from other
packages that wish to use dust's random number support, and
creates an opaque pointer to a set of random number streams. It
is described more fully in vignette("rng_package")
algorithm: The name of the generator algorithm used (read-only)
n_streams: The number of streams of random numbers provided (read-only)
new()
Create a new dust_rng_pointer object
dust_rng_pointer$new( seed = NULL, n_streams = 1L, long_jump = 0L, algorithm = "xoshiro256plus" )
seed: The random number seed to use (see dust_rng for details)
n_streams: The number of independent random number streams to create
long_jump: Optionally an integer indicating how many "long jumps" should be carried out immediately on creation. This can be used to create a distributed parallel random number generator (see dust_rng_distributed_state)
algorithm: The random number algorithm to use. The default is
xoshiro256plus which is a good general choice
sync()
Synchronise the R copy of the random number state. Typically this is only needed before serialisation if you have ever used the object.
dust_rng_pointer$sync()
state()
Return a raw vector of state. This can be used to create other generators with the same state.
dust_rng_pointer$state()
is_current()
Return a logical, indicating if the random number
state that would be returned by state() is "current" (i.e., the
same as the copy held in the pointer) or not. This is TRUE on
creation or immediately after calling $sync() or $state()
and FALSE after any use of the pointer.
dust_rng_pointer$is_current()
    dust::dust_rng_pointer$new()