Title: | Iterate Multiple Realisations of Stochastic Models |
---|---|
Description: | An Engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al 2021 <doi:10.12688/wellcomeopenres.16466.2>. |
Authors: | Rich FitzJohn [aut, cre], Alex Hill [aut], John Lees [aut], Imperial College of Science, Technology and Medicine [cph] |
Maintainer: | Rich FitzJohn <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.15.3 |
Built: | 2024-12-01 05:26:46 UTC |
Source: | https://github.com/mrc-ide/dust |
Create a dust model from a C++ input file. This function will
compile the dust support around your model and return an object
that can be used to work with the model (see the Details below,
and dust_generator
).
dust( filename, quiet = FALSE, workdir = NULL, gpu = FALSE, real_type = NULL, linking_to = NULL, cpp_std = NULL, compiler_options = NULL, optimisation_level = NULL, skip_cache = FALSE )
dust( filename, quiet = FALSE, workdir = NULL, gpu = FALSE, real_type = NULL, linking_to = NULL, cpp_std = NULL, compiler_options = NULL, optimisation_level = NULL, skip_cache = FALSE )
filename |
The path to a single C++ file |
quiet |
Logical, indicating if compilation messages from
|
workdir |
Optional working directory to use. If |
gpu |
Logical, indicating if we should generate GPU
code. This requires a considerable amount of additional software
installed (CUDA toolkit and drivers) as well as a
CUDA-compatible GPU. If |
real_type |
Optionally, a string indicating a substitute type to
swap in for your model's |
linking_to |
Optionally, a character vector of additional
packages to add to the |
cpp_std |
The C++ standard to use, if you need to set one
explicitly. See the section "Using C++ code" in "Writing R
extensions" for the details of this, and how it interacts with
the R version currently being used. For R 4.0.0 and above, C++11
will be used; as dust depends on at least this version of R you
will never need to specify a version this low. Sensible options
are |
compiler_options |
A character vector of additional options
to pass through to the C++ compiler. These will be passed
through without any shell quoting or validation, so check the
generated commands and outputs carefully in case of error. Note
that R will apply these before anything in your personal
|
optimisation_level |
A shorthand way of specifying common
compiler options that control optimisation level. By default
( |
skip_cache |
Logical, indicating if the cache of previously
compiled models should be skipped. If |
A dust_generator
object based on your source files
Your input dust model must satisfy a few requirements.
Define some class that implements your model (below model
is
assumed to be the class name)
That class must define a type internal_type
(so
model::internal_type
) that contains its internal data that the
model may change during execution (i.e., that is not shared
between particles). If no such data is needed, you can do
using internal_type = dust::no_internal;
to indicate this.
We also need a type shared_type
that contains constant internal
data is shared between particles (e.g., dimensions, arrays that
are read but not written). If no such data is needed, you can do
using share_type = dust::no_shared;
to indicate this.
That class must also include a type alias that describes the
model's floating point type, real_type
. Most models can include
using real_type = double;
in their public section.
The class must also include a type alias that describes the model's
data type. If your model does not support data, then write
using data_type = dust::no_data;
, which disables the
compare_data
and set_data
methods. Otherwise see
vignette("data")
for more information.
The class must have a constructor that accepts const dust::pars_type<model>& pars
for your type model
. This will have
elements shared
and internal
which you can assign into your
model if needed.
The model must have a method size()
returning size_t
which
returns the size of the system. This size may depend on values
in your initialisation object but is constant within a model
run.
The model must have a method initial
(which may not be
const
), taking a time step number (size_t
) and returning a
std::vector<real_type>
of initial state for the model.
The model must have a method update
taking arguments:
size_t time
: the time step number
const double * state
: the state at the beginning of the
time step
dust::rng_state_type<real_type>& rng_state
: the dust random number
generator state - this must be a reference, as it will be modified
as random numbers are drawn
double *state_next
: the end state of the model
(to be written to by your function)
Your update
function is the core here and should update the
state of the system - you are expected to update all values of
state
on return.
It is very important that none of the functions in the class use the R API in any way as these functions will be called in parallel.
You must also provide a data/parameter-wrangling function for
producing an object of type dust::pars_type<model>
from an R list. We
use cpp11 for this. Your function will look like:
namespace dust { template <> dust::pars_type<model> dust_pars<model>(cpp11::list pars) { // ... return dust::pars_type<model>(shared, internal); } }
With the body interacting with pars
to create an object of type
model::shared_type
and model::internal_type
before returning the
dust::pars_type
object. This function will be called in serial
and may use anything in the cpp11 API. All elements of the
returned object must be standard C/C++ (e.g., STL) types and
not cpp11/R types. If your model uses only shared or internal,
you may use the single-argument constructor overload to
dust::pars_type
which is equivalent to using dust::no_shared
or
dust::no_internal
for the missing argument.
Your model may provided a template specialisation
dust::dust_info<model>()
returning a cpp11::sexp
for
returning arbitrary information back to the R session:
namespace dust { template <> cpp11::sexp dust_info<model>(const dust::pars_type<sir>& pars) { return cpp11::wrap(...); } }
What you do with this is up to you. If not present then the
info()
method on the created object will return NULL
.
Potential use cases for this are to return information about
variable ordering, or any processing done while accepting the
pars object used to create the pars fed into the particles.
You can optionally use C++ pseudo-attributes to configure the generated code. Currently we support two attributes:
[[dust::class(classname)]]
will tell dust the name of your
target C++ class (in this example classname
). You will need to
use this if your file uses more than a single class, as
otherwise will try to detect this using extremely simple
heuristics.
[[dust::name(modelname)]]
will tell dust the name to use for
the class in R code. For technical reasons this must be
alphanumeric characters only (sorry, no underscore) and must not
start with a number. If not included then the C++ type name will
be used (either specified with [[dust::class()]]
or detected).
Your model should only throw exceptions as a last resort. One such
last resort exists already if rbinom
is given invalid inputs
to prevent an infinite loop. If an error is thrown, all
particles will complete their current run, and then the error
will be rethrown - this is required by our parallel processing
design. Once this happens though the state of the system is
"inconsistent" as it contains particles that have run for
different lengths of time. You can extract the state of the
system at the point of failure (which may help with debugging)
but you will be unable to continue running the object until
either you reset it (with $update_state()
). An error will be
thrown otherwise.
Things are worse on a GPU; if an error is thrown by the RNG code
(happens in rbinom
when given impossible inputs such as
negative sizes, probabilities less than zero or greater than 1)
then we currently use CUDA's __trap()
function which will
require a process restart to be able to use anything that uses
the GPU again, covering all methods in the class. However, this
is preferable to the infinite loop that would otherwise be
caused.
dust_generator
for a description of the class
of created objects, and dust_example()
for some
pre-built examples. If you want to just generate the code and
load it yourself with pkgload::load_all
or some other means,
see dust_generate
)
# dust includes a couple of very simple examples filename <- system.file("examples/walk.cpp", package = "dust") # This model implements a random walk with a parameter coming from # R representing the standard deviation of the walk writeLines(readLines(filename)) # The model can be compiled and loaded with dust::dust(filename) # but it's faster in this example to use the prebuilt version in # the package model <- dust::dust_example("walk") # Print the object and you can see the methods that it provides model # Create a model with standard deviation of 1, initial time step zero # and 30 particles obj <- model$new(list(sd = 1), 0, 30) obj # Curent state is all zero obj$state() # Current time is also zero obj$time() # Run the model up to time step 100 obj$run(100) # Reorder/resample the particles: obj$reorder(sample(30, replace = TRUE)) # See the state again obj$state()
# dust includes a couple of very simple examples filename <- system.file("examples/walk.cpp", package = "dust") # This model implements a random walk with a parameter coming from # R representing the standard deviation of the walk writeLines(readLines(filename)) # The model can be compiled and loaded with dust::dust(filename) # but it's faster in this example to use the prebuilt version in # the package model <- dust::dust_example("walk") # Print the object and you can see the methods that it provides model # Create a model with standard deviation of 1, initial time step zero # and 30 particles obj <- model$new(list(sd = 1), 0, 30) obj # Curent state is all zero obj$state() # Current time is also zero obj$time() # Run the model up to time step 100 obj$run(100) # Reorder/resample the particles: obj$reorder(sample(30, replace = TRUE)) # See the state again obj$state()
Detect CUDA configuration. This function tries to compile a small
program with nvcc
and confirms that this can be loaded into R,
then uses that program to query the presence and capabilities of
your NVIDIA GPUs. If this works, then you can use the GPU-enabled
dust features, and the information returned will help us. It's
quite slow to execute (several seconds) so we cache the value
within a session. Later versions of dust will cache this across
sessions too.
dust_cuda_configuration( path_cuda_lib = NULL, path_cub_include = NULL, quiet = TRUE, forget = FALSE )
dust_cuda_configuration( path_cuda_lib = NULL, path_cub_include = NULL, quiet = TRUE, forget = FALSE )
path_cuda_lib |
Optional path to the CUDA libraries, if they
are not on system library paths. This will be added as
|
path_cub_include |
Optional path to the CUB headers, if using CUDA < 11.0.0. See Details |
quiet |
Logical, indicating if compilation of test programs should be quiet |
forget |
Logical, indicating if we should forget cached values and recompute the configuration |
Not all installations leave the CUDA libraries on the default
paths, and you may need to provide it. Specifically, when we link
the dynamic library, if the linker complains about not being able
to find libcudart
then your CUDA libraries are not in the
default location. You can manually pass in the path_cuda_lib
argument, or set the DUST_PATH_CUDA_LIB
environment variable (in
that order of precedence).
If you are using older CUDA (< 11.0.0) then you need to provide CUB headers, which we use to manage state on the device (these are included in CUDA 11.0.0 and higher). You can provide this as:
a path to this function (path_cub_include
)
the environment variable DUST_PATH_CUB_INCLUDE
CUB headers installed into the default location (R >= 4.0.0, see below).
These are checked in turn with the first found taking
precedence. The default location is stored with
tools::R_user_dir("dust", "data")
, but this functionality is
only available on R >= 4.0.0.
To install CUB you can do:
dust:::cuda_install_cub(NULL)
which will install CUB into the default path (provide a path on older versions of R and set this path as DUST_PATH_CUB_INCLUDE).
For editing your .Renviron file to set these environment
variables, usethis::edit_r_environ()
is very helpful.
A list of configuration information. This includes:
has_cuda
: logical, indicating if it is possible to compile CUDA on
this machine (not necessarily use it though)
cuda_version
: the version of CUDA found
devices
: a data.frame of device information:
id: the device id (integer, typically in a sequence from 0)
name: the human-friendly name of the device
memory: the memory of the device, in MB
version: the compute version for this device
path_cuda_lib
: path to CUDA libraries, if required
path_cub_include
: path to CUB headers, if required
If compilation of the test program fails, then has_cuda
will be
FALSE
and all other elements will be NULL
.
dust_cuda_options which controls additional CUDA compilation options (e.g., profiling, debug mode or custom flags)
# If you have your CUDA library in an unusual location, then you # may need to add a path_cuda_lib argument: dust::dust_cuda_configuration( path_cuda_lib = "/usr/local/cuda-11.1/lib64", forget = TRUE, quiet = FALSE) # However, if things are installed in the default location or you # have set the environment variables described above, then this # may work: dust::dust_cuda_configuration(forget = TRUE, quiet = FALSE)
# If you have your CUDA library in an unusual location, then you # may need to add a path_cuda_lib argument: dust::dust_cuda_configuration( path_cuda_lib = "/usr/local/cuda-11.1/lib64", forget = TRUE, quiet = FALSE) # However, if things are installed in the default location or you # have set the environment variables described above, then this # may work: dust::dust_cuda_configuration(forget = TRUE, quiet = FALSE)
Create options for compiling for CUDA. Unless you need to change paths to libraries/headers, or change the debug level you will probably not need to directly use this. However, it's potentially useful to see what is being passed to the compiler.
dust_cuda_options( ..., debug = FALSE, profile = FALSE, fast_math = FALSE, flags = NULL )
dust_cuda_options( ..., debug = FALSE, profile = FALSE, fast_math = FALSE, flags = NULL )
... |
Arguments passed to |
debug |
Logical, indicating if we should compile for debug
(adding |
profile |
Logical, indicating if we should enable profiling |
fast_math |
Logical, indicating if we should enable "fast maths", which lets the optimiser enable optimisations that break IEEE compliance and disables some error checking (see the CUDA docs for more details). |
flags |
Optional extra arguments to pass to nvcc. These
options will not be passed to your normal C++ compiler, nor the
linker (for that use R's user Makevars system). This can be used
to do things like tune the maximum number of registers
( |
An object of type cuda_options
, which can be passed into
dust as argument gpu
dust_cuda_configuration which identifies and returns the core CUDA configuration (often used implicitly by this function).
tryCatch( dust::dust_cuda_options(), error = function(e) NULL)
tryCatch( dust::dust_cuda_options(), error = function(e) NULL)
Prepare data for use with the $set_data()
method. This is not
required for use but tries to simplify the most common use case
where you have a data.frame with some column indicating "dust
time step" (name_time
), and other columns that might be use in
your data_compare
function. Each row will be turned into a named
R list, which your dust_data
function can then work with to get
this time-steps values. See Details for use with multi-pars
objects.
dust_data(object, name_time = "time", multi = NULL)
dust_data(object, name_time = "time", multi = NULL)
object |
An object, at this point must be a data.frame |
name_time |
The name of the data column within |
multi |
Control how to interpret data for multi-parameter dust object; see Details |
Note that here "dust time step" (name_time
) refers to the dust
time step (which will be a non-negative integer) and not the
rescaled value of time that you probably use within the model. See
dust_generator for more information.
The data object as accepted by data_set
must be a list and
each element must itself be a list with two elements; the dust
time
at which the data applies and any R object that corresponds
to data at that point. We expect that most of the time this second
element will be a key-value list with scalar keys, but more
flexibility may be required.
For multi-data objects, the final format is a bit more awkward;
each time step we have a list with elements time
, data_1
,
data_2
, ..., data_n
for n
parameters. There are two ways of
creating this that might be useful: sharing the data across all
parameters and using some column as a grouping value.
The behaviour here is driven by the multi
argument;
NULL
: (the default) do nothing; this creates an object that
is suitable for use with a pars_multi = FALSE
dust
object.
<integer>
(e.g., multi = 3); share the data across 3 sets of
parameters. This number must match the number of parameter sets
that your dust object is created with
<column_name>
(e.g., multi = "country"); the name of a column
within your data to split the data at. This column must be a
factor, and that factor must have levels that map to integers 1,
2, ..., n (e.g., unique(as.integer(object[[multi]]))
returns
the integers 1:n
).
A list of dust time/data pairs that will be used for the compare function in a compiled model. Each element is a list of length two or more where the first element is the time step and the subsequent elements are data for that time step.
d <- data.frame(time = seq(0, 50, by = 10), a = runif(6), b = runif(6)) dust::dust_data(d)
d <- data.frame(time = seq(0, 50, by = 10), a = runif(6), b = runif(6)) dust::dust_data(d)
Access dust's built-in examples. These are compiled into the
package so that examples and tests can be run more quickly without
having to compile code directly via dust()
. These examples are
all "toy" examples, being small and fast to run.
dust_example(name)
dust_example(name)
name |
The name of the example to use. There are five
examples: |
sir
: a basic SIR (Susceptible, Infected, Resistant)
epidemiological model. Draws from the binomial distribution to
update the population between each time step.
sirs
: an SIRS model, the SIR model with an added R->S transition.
This has a non-zero steady state, so can be run indefinitely for testing.
volatility
: A volatility model that might be applied to
currency fluctuations etc.
walk
: A 1D random walk, following a Gaussian distribution each
time step.
logistic
: Logistic growth in continuous time
A dust_generator
object that can be used to create a
model. See examples for usage.
# A SIR (Susceptible, Infected, Resistant) epidemiological model sir <- dust::dust_example("sir") sir # Initialise the model at time step 0 with 50 independent trajectories mod <- sir$new(list(), 0, 50) # Run the model for 400 steps, collecting "infected" every 4th time step times <- seq(0, 400, by = 4) mod$set_index(2L) y <- mod$simulate(times) # A plot of our epidemic matplot(times, t(drop(y)), type = "l", lty = 1, col = "#00000044", las = 1, xlab = "Time", ylab = "Number infected")
# A SIR (Susceptible, Infected, Resistant) epidemiological model sir <- dust::dust_example("sir") sir # Initialise the model at time step 0 with 50 independent trajectories mod <- sir$new(list(), 0, 50) # Run the model for 400 steps, collecting "infected" every 4th time step times <- seq(0, 400, by = 4) mod$set_index(2L) y <- mod$simulate(times) # A plot of our epidemic matplot(times, t(drop(y)), type = "l", lty = 1, col = "#00000044", las = 1, xlab = "Time", ylab = "Number infected")
Generate a package out of a dust model. The resulting package can
be installed or loaded via pkgload::load_all()
though it
contains minimal metadata and if you want to create a persistent
package you should use dust_package()
. This function is
intended for cases where you either want to inspect the code or
generate it once and load multiple times (useful in some workflows
with CUDA models).
dust_generate( filename, quiet = FALSE, workdir = NULL, gpu = FALSE, real_type = NULL, linking_to = NULL, cpp_std = NULL, compiler_options = NULL, optimisation_level = NULL, mangle = FALSE )
dust_generate( filename, quiet = FALSE, workdir = NULL, gpu = FALSE, real_type = NULL, linking_to = NULL, cpp_std = NULL, compiler_options = NULL, optimisation_level = NULL, mangle = FALSE )
filename |
The path to a single C++ file |
quiet |
Logical, indicating if compilation messages from
|
workdir |
Optional working directory to use. If |
gpu |
Logical, indicating if we should generate GPU
code. This requires a considerable amount of additional software
installed (CUDA toolkit and drivers) as well as a
CUDA-compatible GPU. If |
real_type |
Optionally, a string indicating a substitute type to
swap in for your model's |
linking_to |
Optionally, a character vector of additional
packages to add to the |
cpp_std |
The C++ standard to use, if you need to set one
explicitly. See the section "Using C++ code" in "Writing R
extensions" for the details of this, and how it interacts with
the R version currently being used. For R 4.0.0 and above, C++11
will be used; as dust depends on at least this version of R you
will never need to specify a version this low. Sensible options
are |
compiler_options |
A character vector of additional options
to pass through to the C++ compiler. These will be passed
through without any shell quoting or validation, so check the
generated commands and outputs carefully in case of error. Note
that R will apply these before anything in your personal
|
optimisation_level |
A shorthand way of specifying common
compiler options that control optimisation level. By default
( |
mangle |
Logical, indicating if the model name should be
mangled when creating the package. This is safer if you will
load multiple copies of the package into a single session, but
is |
The path to the generated package (will be workdir
if
that was provided, otherwise a temporary directory).
filename <- system.file("examples/walk.cpp", package = "dust") path <- dust::dust_generate(filename) # Simple package created: dir(path) dir(file.path(path, "R")) dir(file.path(path, "src"))
filename <- system.file("examples/walk.cpp", package = "dust") path <- dust::dust_generate(filename) # Simple package created: dir(path) dir(file.path(path, "R")) dir(file.path(path, "src"))
All "dust" dust models are R6 objects and expose a
common set of "methods". To create a dust model of your own,
see dust and to interact with some built-in ones see
dust_example()
A dust_generator
object
For discrete time models, dust has an internal "time", which was
called step
in version 0.11.x
and below. This must always
be non-negative (i.e., zero or more) and always increases in
unit increments. Typically a model will remap this internal
time onto a more meaningful time in model space, e.g. by applying
the transform model_time = offset + time * dt
; with this approach
you can start at any real valued time and scale the unit increments
to control the model dynamics.
new()
Create a new model. Note that the behaviour of this object
created by this function will change considerably based on
whether the pars_multi
argument is TRUE
. If not (the
default) then we create n_particles
which all share the same
parameters as specified by the pars
argument. If pars_multi
is TRUE
then pars
must be an unnamed list, and each element
of it represents a different set of parameters. We will
create length(pars)
sets of n_particles
particles which
will be simulated together. These particles must have the same
dimension - that is, they must correspond to model state that
is the same size.
dust_generator$new( pars, time, n_particles, n_threads = 1L, seed = NULL, pars_multi = FALSE, deterministic = FALSE, gpu_config = NULL, ode_control = NULL )
pars
Data to initialise your model with; a list
object, but the required elements will depend on the details of
your model. If pars_multi
is TRUE
, then this must be an
unnamed list of pars
objects (see Details).
time
Initial time - must be nonnegative
n_particles
Number of particles to create - must be at least 1
n_threads
Number of OMP threads to use, if dust
and
your model were compiled with OMP support (details to come).
n_particles
should be a multiple of n_threads
(e.g., if you use 8
threads, then you should have 8, 16, 24, etc particles). However, this
is not compulsory.
seed
The seed to use for the random number generator. Can
be a positive integer, NULL
(initialise with R's random number
generator) or a raw
vector of a length that is a multiple of
32 to directly initialise the generator (e..g., from the
dust
object's $rng_state()
method).
pars_multi
Logical, indicating if pars
should be
interpreted as a set of different initialisations, and that we
should prepare n_particles * length(pars)
particles for
simulation. This has an effect on many of the other methods of
the object.
deterministic
Run random number generation deterministically, replacing a random number from some distribution with its expectation. Deterministic models are not compatible with running on a a GPU.
gpu_config
GPU configuration, typically an integer
indicating the device to use, where the model has GPU support.
If not given, then the default value of NULL
will fall back on the
first found device if any are available. An error is thrown if the
device id given is larger than those reported to be available (note
that CUDA numbers devices from 0, so that '0' is the first device,
and so on). See the method $gpu_info()
for available device ids;
this can be called before object creation as
dust_generator$public_methods$gpu_info()
.
For additional control, provide a list with elements device_id
and run_block_size
. Further options (and validation) of this
list will be added in a future version!
ode_control
For ODE models, control over the integration;
must be a dust_ode_control
model, produced by
dust_ode_control()
. It is an error to provide a non-NULL
value for discrete time models.
name()
Returns friendly model name
dust_generator$name()
param()
Returns parameter information, if provided by the model. This
describes the contents of pars passed to the constructor or to
$update_state()
as the pars
argument, and the details depend
on the model.
dust_generator$param()
run()
Run the model up to a point in time, returning the filtered state at that point.
dust_generator$run(time_end)
time_end
Time to run to (if less than or equal to the current time(), silently nothing will happen)
simulate()
Iterate all particles forward in time over a series of times,
collecting output as they go. This is a helper around $run()
where you want to run to a series of points in time and save
output. The returned object will be filtered by your active index,
so that it has shape (n_state
x n_particles
x length(time_end)
)
for single-parameter objects, and (n_state
x n_particles
x
n_pars
x length(time_end)
) for multiparameter objects. Note that
this method is very similar to $run()
except that the rank of
the returned array is one less. For a scalar time_end
you would
ordinarily want to use $run()
but the resulting numbers would
be identical.
dust_generator$simulate(time_end)
time_end
A vector of time points that the simulation should report output at. This the first time must be at least the same as the current time, and every subsequent time must be equal or greater than those before it (ties are allowed though probably not wanted).
run_adjoint()
Run model with gradient information (if supported). The interface here will change, and documentation written once it stabilises.
dust_generator$run_adjoint()
set_index()
Set the "index" vector that is used to return a subset of pars
after using run()
. If this is not used then run()
returns
all elements in your state vector, which may be excessive and slower
than necessary.
dust_generator$set_index(index)
index
The index vector - must be an integer vector with elements between 1 and the length of the state (this will be validated, and an error thrown if an invalid index is given).
index()
Returns the index
as set by $set_index
dust_generator$index()
ode_control()
Return the ODE control set into the object on creation.
For discrete-time models this always returns NULL
.
dust_generator$ode_control()
ode_statistics()
Return statistics about the integration, for ODE models. For discrete time models this makes little sense and so errors if used.
dust_generator$ode_statistics()
n_threads()
Returns the number of threads that the model was constructed with
dust_generator$n_threads()
n_state()
Returns the length of the per-particle state
dust_generator$n_state()
n_particles()
Returns the number of particles
dust_generator$n_particles()
n_particles_each()
Returns the number of particles per parameter set
dust_generator$n_particles_each()
shape()
Returns the shape of the particles
dust_generator$shape()
update_state()
Update one or more components of the model state.
This method can be used to update any or all of pars
, state
and
time
. If both pars
and time
are given and state
is not,
then by default we will update the model internal state according
to your model's initial conditions - use set_initial_state = FALSE
to prevent this.
dust_generator$update_state( pars = NULL, state = NULL, time = NULL, set_initial_state = NULL, index = NULL, reset_step_size = NULL )
pars
New pars for the model (see constructor)
state
The state vector - can be either a numeric vector with the same length as the model's current state (in which case the same state is applied to all particles), or a numeric matrix with as many rows as your model's state and as many columns as you have particles (in which case you can set a number of different starting states at once).
time
New initial time for the model. If this
is a vector (with the same length as the number of particles), then
particles are started from different initial times and run up to the
largest time given (i.e., max(time)
)
set_initial_state
Control if the model initial state
should be set while setting parameters. It is an error for
this to be TRUE
when either pars
is NULL
or when state
is non-NULL
.
index
Used in conjunction with state
, use this to set a
fraction of the model state; the index
vector provided must
be the same length as the number of provided states, and
indicates the index within the model state that should be updated.
For example, if your model has states [a, b, c, d]
and
you provide an index of [1, 3]
then of state
was [10, 20]
you would set a
to 10 and c
to 20.
reset_step_size
Logical, indicating if we should reset the initial step size. This only has an effect with ode models and is silently ignored in discrete time models where the step size is constant.
state()
Return full model state
dust_generator$state(index = NULL)
index
Optional index to select state using
time()
Return current model time
For ODE models, sets the schedule at which stochastic events are
handled. The timing here is quite subtle - an event happens
immediately after the time (so at time + eps
). If your model
runs up to time
an event is not triggered, but as soon as that
time is passed, by any amount of time, the event will trigger. It
is an error to set this to a non-NULL
value in a discrete time
model; later we may generalise the approach here.
dust_generator$time()
set_stochastic_schedule()
dust_generator$set_stochastic_schedule(time)
time
A vector of times to run the stochastic update at
reorder()
Reorder particles.
dust_generator$reorder(index)
index
An integer vector, with values between 1 and n_particles, indicating the index of the current particles that new particles should take.
resample()
Resample particles according to some weight.
dust_generator$resample(weights)
weights
A numeric vector representing particle weights. For a "multi-parameter" dust object this should be be a matrix with the number of rows being the number of particles per parameter set and the number of columns being the number of parameter sets. long as all particles or be a matrix.
info()
Returns information about the pars that your model was created with.
Only returns non-NULL if the model provides a dust_info
template
specialisation.
dust_generator$info()
pars()
Returns the pars
object that your model was constructed with.
dust_generator$pars()
rng_state()
Returns the state of the random number generator. This returns a
raw vector of length 32 * n_particles. This can be useful for
debugging or for initialising other dust objects. The arguments
first_only
and last_only
are mutually exclusive. If neither is
given then all all particles states are returned, being 32 bytes
per particle. The full returned state or first_only
are most
suitable for reseeding a new dust object.
dust_generator$rng_state(first_only = FALSE, last_only = FALSE)
first_only
Logical, indicating if we should return only the first random number state
last_only
Logical, indicating if we should return only the last random number state, which does not belong to a particle.
set_rng_state()
Set the random number state for this model. This replaces the RNG state that the model is using with a state of your choosing, saved out from a different model object. This method is designed to support advanced use cases where it is easier to manipulate the state of the random number generator than the internal state of the dust object.
dust_generator$set_rng_state(rng_state)
rng_state
A random number state, as saved out by the
$rng_state()
method. Note that unlike seed
as passed to the
constructor, this must be a raw vector of the expected length.
has_openmp()
Returns a logical, indicating if this model was compiled with
"OpenMP" support, in which case it will react to the n_threads
argument passed to the constructor. This method can also be used
as a static method by running it directly
as dust_generator$public_methods$has_openmp()
dust_generator$has_openmp()
has_gpu_support()
Returns a logical, indicating if this model was compiled with
"CUDA" support, in which case it will react to the device
argument passed to the run method. This method can also be used
as a static method by running it directly
as dust_generator$public_methods$has_gpu_support()
dust_generator$has_gpu_support(fake_gpu = FALSE)
fake_gpu
Logical, indicating if we count as TRUE
models that run on the "fake" GPU (i.e., using the GPU
version of the model but running on the CPU)
has_compare()
Returns a logical, indicating if this model was compiled with
"compare" support, in which case the set_data
and compare_data
methods are available (otherwise these methods will error). This
method can also be used as a static method by running it directly
as dust_generator$public_methods$has_compare()
dust_generator$has_compare()
real_size()
Return the size of real numbers (in bits). Typically this will be
64 for double precision and 32 for float
. This method can also be
used as a static method by running it directly as
dust_generator$public_methods$real_size()
dust_generator$real_size()
time_type()
Return the type of time this model uses; will be one of discrete
(for discrete time models) or continuous
(for ODE models).
This method can also be used as a static method by running it
directly as dust_generator$public_methods$time_type()
dust_generator$time_type()
rng_algorithm()
Return the random number algorithm used. Typically this will be
xoshiro256plus
for models using double precision reals and
xoshiro128plus
for single precision (float
). This method can
also be used as a static method by running it directly as
dust_generator$public_methods$rng_algorithm()
dust_generator$rng_algorithm()
uses_gpu()
Check if the model is running on a GPU
dust_generator$uses_gpu(fake_gpu = FALSE)
fake_gpu
Logical, indicating if we count as TRUE
models that run on the "fake" GPU (i.e., using the GPU
version of the model but running on the CPU)
n_pars()
Returns the number of distinct pars elements required. This is 0
where the object was initialised with pars_multi = FALSE
and
an integer otherwise. For multi-pars dust objects, Where pars
is accepted, you must provide an unnamed list of length $n_pars()
.
dust_generator$n_pars()
set_n_threads()
Change the number of threads that the dust object will use. Your model must be compiled with "OpenMP" support for this to have an effect. Returns (invisibly) the previous value.
dust_generator$set_n_threads(n_threads)
n_threads
The new number of threads to use. You may want to
wrap this argument in dust_openmp_threads()
in order to
verify that you can actually use the number of threads
requested (based on environment variables and OpenMP support).
set_data()
Set "data" into the model for use with the $compare_data()
method.
This is not supported by all models, depending on if they define a
data_type
type. See dust_data()
for a helper function to
construct suitable data and a description of the required format. You
will probably want to use that here, and definitely if using multiple
parameter sets.
dust_generator$set_data(data, shared = FALSE)
data
A list of data to set.
shared
Logical, indicating if the data should be shared
across all parameter sets, if your model is initialised to use
more than one parameter set (pars_multi = TRUE
).
compare_data()
Compare the current model state against the data as set by
set_data
. If there is no data set, or no data corresponding to
the current time then NULL
is returned. Otherwise a numeric vector
the same length as the number of particles is returned. If model's
underlying compare_data
function is stochastic, then each call to
this function may be result in a different answer.
dust_generator$compare_data()
filter()
Run a particle filter. The interface here will change a lot over the
next few versions. You must reset the dust object using
$update_state(pars = ..., time = ...)
before using this method to
get sensible values.
dust_generator$filter( time_end = NULL, save_trajectories = FALSE, time_snapshot = NULL, min_log_likelihood = NULL )
time_end
The time to run to. If NULL
, run to the end
of the last data. This value must be larger than the current
model time ($time()
) and must exactly appear in the data.
save_trajectories
Logical, indicating if the filtered particle
trajectories should be saved. If TRUE
then the trajectories
element
will be a multidimensional array (state x <shape> x time
)
containing the state values, selected according to the index set
with $set_index()
.
time_snapshot
Optional integer vector indicating times
that we should record a snapshot of the full particle filter state.
If given it must be strictly increasing vector whose elements
match times given in the data
object. The return value with be
a multidimensional array (state x <shape> x time_snapshot
)
containing full state values at the requested times.
min_log_likelihood
Optionally, a numeric value representing
the smallest likelihood we are interested in. If non-NULL
either a scalar value or vector the same length as the number
of parameter sets. Not yet supported, and included for future
compatibility.
gpu_info()
Return information about GPU devices, if the model
has been compiled with CUDA/GPU support. This can be called as a
static method by running dust_generator$public_methods$gpu_info()
.
If run from a GPU enabled object, it will also have an element
config
containing the computed device configuration: the device
id, shared memory and the block size for the run
method on the
device.
dust_generator$gpu_info()
# An example dust object from the package: walk <- dust::dust_example("walk") # The generator object has class "dust_generator" class(walk) # The methods below are are described in the documentation walk
# An example dust object from the package: walk <- dust::dust_example("walk") # The generator object has class "dust_generator" class(walk) # The methods below are are described in the documentation walk
Create a control object for controlling the adaptive stepper for systems of ordinary differential equations (ODEs). The returned object can be passed into a continuous-time dust model on initialisation.
dust_ode_control( max_steps = NULL, atol = NULL, rtol = NULL, step_size_min = NULL, step_size_max = NULL, debug_record_step_times = NULL )
dust_ode_control( max_steps = NULL, atol = NULL, rtol = NULL, step_size_min = NULL, step_size_max = NULL, debug_record_step_times = NULL )
max_steps |
Maxmimum number of steps to take. If the integration attempts to take more steps that this, it will throw an error, stopping the integration. |
atol |
The per-step absolute tolerance. |
rtol |
The per-step relative tolerance. The total accuracy will be less than this. |
step_size_min |
The minimum step size. The actual minimum
used will be the largest of the absolute value of this
|
step_size_max |
The largest step size. By default there is no maximum step size (Inf) so the solver can take as large a step as it wants to. If you have short-lived fluctuations in your rhs that the solver may skip over by accident, then specify a smaller maximum step size here. |
debug_record_step_times |
Logical, indicating if we should record
the steps taken. This information will be available as part of
the |
A named list of class "dust_ode_control"
# We include an example of logistic growth with the package gen <- dust::dust_example("logistic") # Create a control object, then pass it through as the ode_control # parameter to the constructor: ctrl <- dust::dust_ode_control(atol = 1e-3, rtol = 1e-3) mod <- gen$new(list(r = 0.1, K = 100), 0, 1, ode_control = ctrl) # When the model runs, the control parameters passed in to the # constructor are used in the solution mod$run(10) # The full set of control parameters can be extracted: mod$ode_control()
# We include an example of logistic growth with the package gen <- dust::dust_example("logistic") # Create a control object, then pass it through as the ode_control # parameter to the constructor: ctrl <- dust::dust_ode_control(atol = 1e-3, rtol = 1e-3) mod <- gen$new(list(r = 0.1, K = 100), 0, 1, ode_control = ctrl) # When the model runs, the control parameters passed in to the # constructor are used in the solution mod$run(10) # The full set of control parameters can be extracted: mod$ode_control()
Return information about OpenMP support for this system. For
individual models look at the $has_openmp()
method.
dust_openmp_support(check_compile = FALSE)
dust_openmp_support(check_compile = FALSE)
check_compile |
Logical, indicating if we should check if we can compile an openmp program - this is slow the first time. |
A list with information about the openmp support on your system.
The first few elements come from the openmp library directly:
num_proc
, max_threads
, thread_limit
; these correspond to a
call to the function omp_get_<name>()
in C and
openmp_version
which is the value of the _OPENMP
macro.
A logical has_openmp
which is TRUE
if it looks like runtime
OpenMP support is available
The next elements tell you about different sources that might
control the number of threads allowed to run: mc.cores
(from
the R option with the same name), OMP_THREAD_LIMIT
,
OMP_NUM_THREADS
, MC_CORES
(from environment variables),
limit_r
(limit computed against R-related control variables),
limit_openmp
(limit computed against OpenMP-related variables)
and limit
the smaller of limit_r
and limit_openmp
dust_openmp_threads()
for setting a polite number of
threads.
# System wide support dust::dust_openmp_support() # Support compiled into a generator walk <- dust::dust_example("walk") walk$public_methods$has_openmp() # Support from an instance of that model model <- walk$new(list(sd = 1), 0, 1) model$has_openmp()
# System wide support dust::dust_openmp_support() # Support compiled into a generator walk <- dust::dust_example("walk") walk$public_methods$has_openmp() # Support from an instance of that model model <- walk$new(list(sd = 1), 0, 1) model$has_openmp()
Politely select a number of threads to use. See Details for the algorithm
dust_openmp_threads(n = NULL, action = "error")
dust_openmp_threads(n = NULL, action = "error")
n |
Either |
action |
An action to perform if |
There are two limits and we will take the smaller of the two.
The first limit comes from piggy-backing off of R's normal
parallel configuration; we will use the MC_CORES
environment
variable and mc.cores
option as a guide to how many cores you
are happy to use. We take mc.cores
first, then MC_CORES
, which
is the same behaviour as parallel::mclapply
and friends.
The second limit comes from openmp. If you do not have OpenMP
support, then we use one thread (higher numbers have no effect at
all in this case). If you do have OpenMP support, we take the
smallest of the number of "processors" (reported by
omp_get_num_procs()
) the "max threads" (reported by
omp_get_max_threads()
and "thread_limit" (reported by
omp_get_thread_limit()
.
See dust_openmp_support()
for the values of all the values
that go into this calculation.
An integer, indicating the number of threads that you can use
# Default number of threads; tries to pick something friendly, # erring on the conservative side. dust::dust_openmp_threads(NULL) # Try to pick something silly and it will be reduced for you dust::dust_openmp_threads(1000, action = "fix")
# Default number of threads; tries to pick something friendly, # erring on the conservative side. dust::dust_openmp_threads(NULL) # Try to pick something silly and it will be reduced for you dust::dust_openmp_threads(1000, action = "fix")
Updates a dust model in a package. The user-provided code is
assumed to the in inst/dust
as a series of C++ files; a file
inst/dust/model.cpp
will be transformed into a file
src/model.cpp
.
dust_package(path, quiet = FALSE)
dust_package(path, quiet = FALSE)
path |
Path to the package |
quiet |
Passed to |
If your code provides a class model
then dust will create C++
functions such as dust_model_alloc
- if your code also includes
names such as this, compilation will fail due to duplicate
symbols.
We add "cpp11 attributes" to the created functions, and will run
cpp11::cpp_register()
on them once the generated code
has been created.
Your package needs a src/Makevars
file to enable openmp (if your
system supports it). If it is not present then a suitable Makevars
will be written, containing
PKG_CXXFLAGS=$(SHLIB_OPENMP_CXXFLAGS) PKG_LIBS=$(SHLIB_OPENMP_CXXFLAGS)
following "Writing R Extensions" (see section "OpenMP support").
If your package does contain a src/Makevars
file we do not
attempt to edit it but will error if it looks like it does not
contain these lines or similar.
You also need to make sure that your package loads the dynamic
library; if you are using roxygen, then you might create a file
(say, R/zzz.R
) containing
#' @useDynLib packagename, .registration = TRUE NULL
substituting packagename
for your package name as
appropriate. This will create an entry in NAMESPACE
.
Nothing, this function is called for its side effects
vignette("dust")
which contains more discussion of this
process
# This is explained in more detail in the package vignette path <- system.file("examples/sir.cpp", package = "dust", mustWork = TRUE) dest <- tempfile() dir.create(dest) dir.create(file.path(dest, "inst/dust"), FALSE, TRUE) writeLines(c("Package: example", "Version: 0.0.1", "LinkingTo: cpp11, dust"), file.path(dest, "DESCRIPTION")) writeLines("useDynLib('example', .registration = TRUE)", file.path(dest, "NAMESPACE")) file.copy(path, file.path(dest, "inst/dust")) # An absolutely minimal skeleton contains a DESCRIPTION, NAMESPACE # and one or more dust model files to compile: dir(dest, recursive = TRUE) # Running dust_package will fill in the rest dust::dust_package(dest) # More files here now dir(dest, recursive = TRUE)
# This is explained in more detail in the package vignette path <- system.file("examples/sir.cpp", package = "dust", mustWork = TRUE) dest <- tempfile() dir.create(dest) dir.create(file.path(dest, "inst/dust"), FALSE, TRUE) writeLines(c("Package: example", "Version: 0.0.1", "LinkingTo: cpp11, dust"), file.path(dest, "DESCRIPTION")) writeLines("useDynLib('example', .registration = TRUE)", file.path(dest, "NAMESPACE")) file.copy(path, file.path(dest, "inst/dust")) # An absolutely minimal skeleton contains a DESCRIPTION, NAMESPACE # and one or more dust model files to compile: dir(dest, recursive = TRUE) # Running dust_package will fill in the rest dust::dust_package(dest) # More files here now dir(dest, recursive = TRUE)
Repair the environment of a dust object created by [dust] and then saved and reloaded by [saveRDS] and [readRDS]. Because we use a fake temporary package to hold the generated code, it will not ordinarily be loaded properly without using this.
dust_repair_environment(generator, quiet = FALSE)
dust_repair_environment(generator, quiet = FALSE)
generator |
The dust generator |
quiet |
Logical, indicating if we should be quiet (default prints some progress information) |
Nothing, called for its side effects
Create an object that can be used to generate random numbers with the same RNG as dust uses internally. This is primarily meant for debugging and testing the underlying C++ rather than a source of random numbers from R.
A dust_rng
object, which can be used to drawn random
numbers from dust's distributions.
The underlying random number generators are designed to work in
parallel, and with random access to parameters (see
vignette("rng")
for more details). However, this is usually
done within the context of running a model where each particle
sees its own stream of numbers. We provide some support for
running random number generators in parallel, but any speed
gains from parallelisation are likely to be somewhat eroded by
the overhead of copying around a large number of random numbers.
All the random distribution functions support an argument
n_threads
which controls the number of threads used. This
argument will silently have no effect if your installation
does not support OpenMP (see dust_openmp_support).
Parallelisation will be performed at the level of the stream,
where we draw n
numbers from each stream for a total of n * n_streams
random numbers using n_threads
threads to do this.
Setting n_threads
to be higher than n_streams
will therefore
have no effect. If running on somebody else's system (e.g., an
HPC, CRAN) you must respect the various environment variables
that control the maximum allowable number of threads; consider
using dust_openmp_threads to select a safe number.
With the exception of random_real
, each random number
distribution accepts parameters; the interpretations of these
will depend on n
, n_streams
and their rank.
If a scalar then we will use the same parameter value for every draw from every stream
If a vector with length n
then we will draw n
random
numbers per stream, and every stream will use the same parameter
value for every stream for each draw (but a different,
shared, parameter value for subsequent draws).
If a matrix is provided with one row and n_streams
columns then we use different parameters for each stream, but
the same parameter for each draw.
If a matrix is provided with n
rows and n_streams
columns then we use a parameter value [i, j]
for the i
th
draw on the j
th stream.
The rules are slightly different for the prob
argument to
multinomial
as for that prob
is a vector of values. As such
we shift all dimensions by one:
If a vector we use same prob
every draw from every stream
and there are length(prob)
possible outcomes.
If a matrix with n
columns then vary over each draw (the
i
th draw using vector prob[, i]
but shared across all
streams. There are nrow(prob)
possible outcomes.
If a 3d array is provided with 1 column and n_streams
"layers" (the third dimension) then we use then we use different
parameters for each stream, but the same parameter for each
draw.
If a 3d array is provided with n
columns and n_streams
"layers" then we vary over both draws and streams so that with
use vector prob[, i, j]
for the i
th draw on the j
th
stream.
The output will not differ based on the number of threads used, only on the number of streams.
info
Information about the generator (read-only)
new()
Create a dust_rng
object
dust_rng$new( seed = NULL, n_streams = 1L, real_type = "double", deterministic = FALSE )
seed
The seed, as an integer, a raw vector or NULL
.
If an integer we will create a suitable seed via the "splitmix64"
algorithm, if a raw vector it must the correct length (a multiple
of either 32 or 16 for float = FALSE
or float = TRUE
respectively). If NULL
then we create a seed using R's random
number generator.
n_streams
The number of streams to use (see Details)
real_type
The type of floating point number to use. Currently
only float
and double
are supported (with double
being
the default). This will have no (or negligible) impact on speed,
but exists to test the low-precision generators.
deterministic
Logical, indicating if we should use "deterministic" mode where distributions return their expectations and the state is never changed.
size()
Number of streams available
dust_rng$size()
jump()
The jump function updates the random number state for each stream by advancing it to a state equivalent to 2^128 numbers drawn from each stream.
dust_rng$jump()
long_jump()
Longer than $jump
, the $long_jump
method is
equivalent to 2^192 numbers drawn from each stream.
dust_rng$long_jump()
random_real()
Generate n
numbers from a standard uniform distribution
dust_rng$random_real(n, n_threads = 1L)
n
Number of samples to draw (per stream)
n_threads
Number of threads to use; see Details
random_normal()
Generate n
numbers from a standard normal distribution
dust_rng$random_normal(n, n_threads = 1L, algorithm = "box_muller")
n
Number of samples to draw (per stream)
n_threads
Number of threads to use; see Details
algorithm
Name of the algorithm to use; currently box_muller
and ziggurat
are supported, with the latter being considerably
faster.
uniform()
Generate n
numbers from a uniform distribution
dust_rng$uniform(n, min, max, n_threads = 1L)
n
Number of samples to draw (per stream)
min
The minimum of the distribution (length 1 or n)
max
The maximum of the distribution (length 1 or n)
n_threads
Number of threads to use; see Details
normal()
Generate n
numbers from a normal distribution
dust_rng$normal(n, mean, sd, n_threads = 1L, algorithm = "box_muller")
n
Number of samples to draw (per stream)
mean
The mean of the distribution (length 1 or n)
sd
The standard deviation of the distribution (length 1 or n)
n_threads
Number of threads to use; see Details
algorithm
Name of the algorithm to use; currently box_muller
and ziggurat
are supported, with the latter being considerably
faster.
binomial()
Generate n
numbers from a binomial distribution
dust_rng$binomial(n, size, prob, n_threads = 1L)
n
Number of samples to draw (per stream)
size
The number of trials (zero or more, length 1 or n)
prob
The probability of success on each trial (between 0 and 1, length 1 or n)
n_threads
Number of threads to use; see Details
nbinomial()
Generate n
numbers from a negative binomial distribution
dust_rng$nbinomial(n, size, prob, n_threads = 1L)
n
Number of samples to draw (per stream)
size
The target number of successful trials (zero or more, length 1 or n)
prob
The probability of success on each trial (between 0 and 1, length 1 or n)
n_threads
Number of threads to use; see Details
hypergeometric()
Generate n
numbers from a hypergeometric distribution
dust_rng$hypergeometric(n, n1, n2, k, n_threads = 1L)
gamma()
Generate n
numbers from a gamma distribution
dust_rng$gamma(n, shape, scale, n_threads = 1L)
n
Number of samples to draw (per stream)
shape
Shape
scale
Scale '
n_threads
Number of threads to use; see Details
poisson()
Generate n
numbers from a Poisson distribution
dust_rng$poisson(n, lambda, n_threads = 1L)
n
Number of samples to draw (per stream)
lambda
The mean (zero or more, length 1 or n). Only valid for lambda <= 10^7
n_threads
Number of threads to use; see Details
exponential()
Generate n
numbers from a exponential distribution
dust_rng$exponential(n, rate, n_threads = 1L)
n
Number of samples to draw (per stream)
rate
The rate of the exponential
n_threads
Number of threads to use; see Details
cauchy()
Generate n
draws from a Cauchy distribution.
dust_rng$cauchy(n, location, scale, n_threads = 1L)
n
Number of samples to draw (per stream)
location
The location of the peak of the distribution (also its median)
scale
A scale parameter, which specifies the distribution's "half-width at half-maximum"
n_threads
Number of threads to use; see Details
multinomial()
Generate n
draws from a multinomial distribution.
In contrast with most of the distributions here, each draw is a
vector with the same length as prob
.
dust_rng$multinomial(n, size, prob, n_threads = 1L)
n
The number of samples to draw (per stream)
size
The number of trials (zero or more, length 1 or n)
prob
A vector of probabilities for the success of each
trial. This does not need to sum to 1 (though all elements
must be non-negative), in which case we interpret prob
as
weights and normalise so that they equal 1 before sampling.
n_threads
Number of threads to use; see Details
state()
Returns the state of the random number stream. This returns a raw vector of length 32 * n_streams. It is primarily intended for debugging as one cannot (yet) initialise a dust_rng object with this state.
dust_rng$state()
rng <- dust::dust_rng$new(42) # Shorthand for Uniform(0, 1) rng$random_real(5) # Shorthand for Normal(0, 1) rng$random_normal(5) # Uniform random numbers between min and max rng$uniform(5, -2, 6) # Normally distributed random numbers with mean and sd rng$normal(5, 4, 2) # Binomially distributed random numbers with size and prob rng$binomial(5, 10, 0.3) # Negative binomially distributed random numbers with size and prob rng$nbinomial(5, 10, 0.3) # Hypergeometric distributed random numbers with parameters n1, n2 and k rng$hypergeometric(5, 6, 10, 4) # Gamma distributed random numbers with parameters a and b rng$gamma(5, 0.5, 2) # Poisson distributed random numbers with mean lambda rng$poisson(5, 2) # Exponentially distributed random numbers with rate rng$exponential(5, 2) # Multinomial distributed random numbers with size and vector of # probabiltiies prob rng$multinomial(5, 10, c(0.1, 0.3, 0.5, 0.1))
rng <- dust::dust_rng$new(42) # Shorthand for Uniform(0, 1) rng$random_real(5) # Shorthand for Normal(0, 1) rng$random_normal(5) # Uniform random numbers between min and max rng$uniform(5, -2, 6) # Normally distributed random numbers with mean and sd rng$normal(5, 4, 2) # Binomially distributed random numbers with size and prob rng$binomial(5, 10, 0.3) # Negative binomially distributed random numbers with size and prob rng$nbinomial(5, 10, 0.3) # Hypergeometric distributed random numbers with parameters n1, n2 and k rng$hypergeometric(5, 6, 10, 4) # Gamma distributed random numbers with parameters a and b rng$gamma(5, 0.5, 2) # Poisson distributed random numbers with mean lambda rng$poisson(5, 2) # Exponentially distributed random numbers with rate rng$exponential(5, 2) # Multinomial distributed random numbers with size and vector of # probabiltiies prob rng$multinomial(5, 10, c(0.1, 0.3, 0.5, 0.1))
Create a set of initial random number seeds suitable for using within a distributed context (over multiple processes or nodes) at a level higher than a single group of synchronised threads.
dust_rng_distributed_state( seed = NULL, n_streams = 1L, n_nodes = 1L, algorithm = "xoshiro256plus" ) dust_rng_distributed_pointer( seed = NULL, n_streams = 1L, n_nodes = 1L, algorithm = "xoshiro256plus" )
dust_rng_distributed_state( seed = NULL, n_streams = 1L, n_nodes = 1L, algorithm = "xoshiro256plus" ) dust_rng_distributed_pointer( seed = NULL, n_streams = 1L, n_nodes = 1L, algorithm = "xoshiro256plus" )
seed |
Initial seed to use. As for dust_rng, this can
be |
n_streams |
The number of streams to create per node. If
passing the results of this seed to a dust object's initialiser
(see dust_generator) you can safely leave this at 1, but
if using in a standalone setting, and especially if using
|
n_nodes |
The number of separate seeds to create. Each will be separated by a "long jump" for your generator. |
algorithm |
The name of an algorithm to use. Alternatively
pass a |
See vignette("rng_distributed")
for a proper introduction to
these functions.
A list of either raw vectors (for
dust_rng_distributed_state
) or of dust_rng_pointer
objects (for dust_rng_distributed_pointer
)
dust::dust_rng_distributed_state(n_nodes = 2) dust::dust_rng_distributed_pointer(n_nodes = 2)
dust::dust_rng_distributed_state(n_nodes = 2) dust::dust_rng_distributed_pointer(n_nodes = 2)
This function exists to support use from other
packages that wish to use dust's random number support, and
creates an opaque pointer to a set of random number streams. It
is described more fully in vignette("rng_package.Rmd")
algorithm
The name of the generator algorithm used (read-only)
n_streams
The number of streams of random numbers provided (read-only)
new()
Create a new dust_rng_pointer
object
dust_rng_pointer$new( seed = NULL, n_streams = 1L, long_jump = 0L, algorithm = "xoshiro256plus" )
seed
The random number seed to use (see dust_rng for details)
n_streams
The number of independent random number streams to create
long_jump
Optionally an integer indicating how many "long jumps" should be carried out immediately on creation. This can be used to create a distributed parallel random number generator (see dust_rng_distributed_state)
algorithm
The random number algorithm to use. The default is
xoshiro256plus
which is a good general choice
sync()
Synchronise the R copy of the random number state. Typically this is only needed before serialisation if you have ever used the object.
dust_rng_pointer$sync()
state()
Return a raw vector of state. This can be used to create other generators with the same state.
dust_rng_pointer$state()
is_current()
Return a logical, indicating if the random number
state that would be returned by state()
is "current" (i.e., the
same as the copy held in the pointer) or not. This is TRUE
on
creation or immediately after calling $sync()
or $state()
and FALSE
after any use of the pointer.
dust_rng_pointer$is_current()
dust::dust_rng_pointer$new()
dust::dust_rng_pointer$new()