The idea here is that we want to describe how to build a “context” and then evaluate one or more expressions in it. This is a little related to approaches like docker and packrat in that we want contexts to be isolated from one another, but different in that portability is more important than isolation.
Imagine that you have an analysis to run on another computer with packages that come from a variety of repositories (drat, bioconductor, etc). The other computer may already have some packages installed, so you don’t want to waste time and bandwidth re-installing them. So scripts end up littered with constructs that check whether each package is present and install it only if it is missing.
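The kind of construct meant here is a check-then-install guard. A minimal base-R sketch follows; the helper name `install_if_missing` and the package names are illustrative assumptions, not part of context.

```r
# Hypothetical helper: install only the packages that are not already available.
install_if_missing <- function(pkgs) {
  # requireNamespace() returns FALSE (rather than erroring) for a missing package
  present <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  absent <- pkgs[!present]
  if (length(absent) > 0) {
    install.packages(absent)
  }
  invisible(absent)
}

# Both packages ship with R, so nothing is installed here.
to_install <- install_if_missing(c("stats", "utils"))
```

Repeating a guard like this for every package, in every script, is the clutter that a shared description of requirements is meant to avoid.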
If these packages are coming from GitHub (or worse, also have dependencies on GitHub), the bootstrap code gets out of hand very quickly and tends to be non-portable.
Creating separate libraries (rather than sharing one from your personal computer) will be important if the architecture differs (e.g., you run Windows but you want to run code on a Linux cluster).
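As a rough illustration of what a separate library means in plain R (this is a base-R sketch, not necessarily how context manages libraries), a dedicated directory can be placed first on the library search path. The temporary path used here is an assumption for the example.

```r
# Create a throwaway library directory (a real project would use a stable path).
lib <- file.path(tempfile(), "lib")
dir.create(lib, recursive = TRUE)

# Put it first on the library path; install.packages() installs into
# .libPaths()[1] by default, and library() searches it before the others.
old_paths <- .libPaths()
.libPaths(c(lib, old_paths))

first_lib <- .libPaths()[1]
```

Packages compiled for one platform then stay out of a library that another platform would read.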
The idea here is that the context package helps describe a context made from the above ingredients and then attempts to recreate it on a different computer (or in a different directory on your computer).
A minimal context looks like this:
path <- tempfile()
ctx <- context::context_save(path = path)
#> [ init:id ] d811cb115c5280e92c67af5308c8ca12
#> [ init:db ] rds
#> [ init:path ] /tmp/RtmpLKe9lK/filebbe658f325f
#> [ save:id ] 7435c5da4eeffc4a0e6fcc790d654e48
#> [ save:name ] dishonest_meadowlark
ctx
#> <context>
#> - packages: list(attached = character(0), loaded = character(0))
#> - root_id: d811cb115c5280e92c67af5308c8ca12
#> - id: 7435c5da4eeffc4a0e6fcc790d654e48
#> - name: dishonest_meadowlark
#> - root: list(id = "d811cb115c5280e92c67af5308c8ca12", path = "/tmp/RtmpLKe9lK/filebbe658f325f", db = <environment>)
#> - db: <environment>
Typically one would use the arguments packages and sources to describe the requirements of any tasks that you’ll be running.
Once a context is defined, tasks can be defined in the context. These are simply R expressions associated with the identifier of a context.
The task t above is just a key that can be used to retrieve information about the task later. Several such tasks may exist, though here only one does.
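To make the "expression plus context id, retrieved by key" idea concrete, here is a simplified base-R sketch. The functions `save_task` and `read_task` and the one-file-per-task layout are hypothetical stand-ins and do not reflect how context actually stores tasks.

```r
# Hypothetical stand-in for a task store: one .rds file per task, named by its key.
store <- tempfile()
dir.create(store)

save_task <- function(expr, context_id, store) {
  # Random 32-character hex key, similar in shape to the ids in the logs
  id <- paste(sample(c(0:9, letters[1:6]), 32, replace = TRUE), collapse = "")
  saveRDS(list(expr = expr, context_id = context_id),
          file.path(store, paste0(id, ".rds")))
  id
}

read_task <- function(id, store) {
  readRDS(file.path(store, paste0(id, ".rds")))
}

# The key is all the caller keeps; everything else lives in the store.
key <- save_task(quote(sin(1)), "7435c5da4eeffc4a0e6fcc790d654e48", store)
task <- read_task(key, store)
```

Because the caller holds only the key, any process that can read the store can look up the expression and the context it belongs to.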
To run a task we first need to “load” the context (this will actually load any required packages and source any scripts) and then pass this through to task_run:
res <- context::task_run(t, context::context_load(ctx))
#> [ context ] 7435c5da4eeffc4a0e6fcc790d654e48
#> [ library ]
#> [ namespace ]
#> [ source ]
#> [ root ] /tmp/RtmpLKe9lK/filebbe658f325f
#> [ context ] 7435c5da4eeffc4a0e6fcc790d654e48
#> [ task ] bfcf6bc9261caf6dbe56059f4e7a674d
#> [ expr ] sin(1)
#> [ start ] 2024-10-17 03:44:18.038484
#> [ ok ]
#> [ end ] 2024-10-17 03:44:18.041891
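As a rough picture of what the load step in the call above involves, the following base-R sketch attaches packages, loads namespaces, and sources scripts in that order. The function `load_requirements`, the `requirements` list, and the example script are hypothetical; this is not context's implementation.

```r
# Hypothetical description of a context's requirements.
script <- tempfile(fileext = ".R")
writeLines("double_it <- function(x) 2 * x", script)
requirements <- list(attached = "stats", loaded = "tools", sources = script)

load_requirements <- function(req, envir = new.env(parent = globalenv())) {
  # [ library ]: attach packages so their exports are on the search path
  for (pkg in req$attached) {
    library(pkg, character.only = TRUE)
  }
  # [ namespace ]: load without attaching
  for (pkg in req$loaded) {
    loadNamespace(pkg)
  }
  # [ source ]: evaluate each script into the target environment
  for (file in req$sources) {
    sys.source(file, envir = envir)
  }
  envir
}

env <- load_requirements(requirements)
answer <- env$double_it(21)
```

Sourcing into a dedicated environment keeps the functions a task needs separate from whatever else is in the session.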
The log printed by the call above records the result of restoring the context and running the task:
- context: the context id
- library: calls to library() to load packages and attach namespaces
- namespace: calls to loadNamespace(); these packages were present but not attached in the context.
- source: There was nothing to source() here so this is blank, otherwise it would be a list of filenames.
- root: the directory within which all our context/task files will be located
- context: this is repeated here because we’ve finished the load part of the above statement
- task: the task id
- expr: the expression to evaluate
- start: start time
- ok: indication of success
- end: end time
After all that, here is the result:
The result can also be retrieved using task_result():
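A minimal end-to-end sketch of the steps described so far is given below. It runs only if the context package is installed. The `task_save()` call and the argument order of `task_result()` are assumptions about the package's interface rather than something shown in this text; `context_save()`, `context_load()` and `task_run()` are used as they appear above.

```r
# Runs only if the context package is installed.
if (requireNamespace("context", quietly = TRUE)) {
  path <- tempfile()
  ctx <- context::context_save(path = path)

  # Assumed call: save a quoted expression as a task in this context
  t <- context::task_save(quote(sin(1)), ctx)

  # Load the context, then run the task (as in the text above)
  res <- context::task_run(t, context::context_load(ctx))

  # Assumed argument order: task id first, then the context
  stored <- context::task_result(t, ctx)
}
```

If the assumptions hold, `stored` and `res` should both hold the value of `sin(1)`.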
This is not immensely useful as it is; it’s just evaluation with more steps. Typically we’d do this in another process. You can do this with callr here:
res <- callr::rscript(file.path(path, "bin", "task_run"), c(path, t),
echo = TRUE, show = TRUE)
#> Running /usr/lib/R/bin/Rscript /tmp/RtmpLKe9lK/filebbe658f325f/bin/task_run \
#> /tmp/RtmpLKe9lK/filebbe658f325f bfcf6bc9261caf6dbe56059f4e7a674d
#> [ hello ] 2024-10-17 03:44:18.268431
#> [ wd ] /tmp/RtmpKfQIAm/Rbuildb08729783a/context/vignettes
#> [ init ] 2024-10-17 03:44:18.272296
#> [ hostname ] fcddbfc6481b
#> [ process ] 3100
#> [ version ] 0.5.0
#> [ open:db ] rds
#> [ context ] 7435c5da4eeffc4a0e6fcc790d654e48
#> [ library ]
#> [ namespace ]
#> [ source ]
#> [ parallel ] running as single core job
#> [ root ] /tmp/RtmpLKe9lK/filebbe658f325f
#> [ context ] 7435c5da4eeffc4a0e6fcc790d654e48
#> [ task ] bfcf6bc9261caf6dbe56059f4e7a674d
#> [ expr ] sin(1)
#> [ start ] 2024-10-17 03:44:18.292833
#> [ ok ]
#> [ end ] 2024-10-17 03:44:18.295556
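The separate-process idea can also be illustrated without context or callr. The sketch below, which assumes only that the Rscript binary sits in the running installation's bin directory, evaluates the same expression in a fresh R process and reads the value back from standard output.

```r
# Locate the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# The expression is sent as text; the child process evaluates and prints it.
expr <- "writeLines(format(sin(1), digits = 15))"
out <- system2(rscript, c("--vanilla", "-e", shQuote(expr)), stdout = TRUE)

# The last line of output holds the printed value.
value <- as.numeric(tail(out, 1))
```

Passing values through printed text is lossy and awkward for anything beyond a number, which is one reason to store tasks and results on disk and hand the child process only a path and a task id, as in the callr call above.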