The new version of orderly (codename orderly2 for now) is very
different to the previously released version on CRAN (orderly 1.4.3;
September 2021) or the last development version of the 1.x line
(orderly 1.6.x; June 2023). These changes constitute a ground-up rewrite
in order to bring out the best features we found that orderly enabled
within workflows, while removing some features we felt have outlived
their usefulness. This is a disruptive change, but we hope that it will
be worth it.
This vignette is divided into two parts: the first covers the
conceptual differences between orderly1 and orderly2, while the second
covers the mechanical process of migrating from an existing orderly
source tree and archive to take advantage of the new features.
If you have never used version 1.x of orderly, you should not read
this document unless you are curious about the history of design
decisions. Instead you should read the introductory vignette
(vignette("orderly2")).
The most obvious user-facing change is that there is (almost) no YAML,
with the inputs and outputs for a report now defined within an orderly
file, <reportname>.R. So an orderly report that previously had an
orderly.yml file that looked like
parameters:
  n_min:
    default: 10
script: script.R
sources:
  - functions.R
resources:
  - metadata.csv
depends:
  raw_data:
    id: latest
    use:
      raw_data.csv: data.csv
artefacts:
  data:
    description: Processed data
    filenames: data.rds
would end up within an orderly file that looks like:
orderly2::orderly_parameters(n_min = 10)
orderly2::orderly_dependency("raw_data", "latest",
                             files = c("raw_data.csv" = "data.csv"))
orderly2::orderly_resource("metadata.csv")
orderly2::orderly_artefact("Processed data", "data.rds")
source("functions.R")
We think this is much clearer, and it comes with documentation and
autocomplete support in most IDEs. In fact, for simple reports, no
special functions are required, though you’ll find that some will be
useful (see vignette("orderly2")).
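To illustrate the point about simple reports, here is a minimal sketch of a report script that uses no orderly2 functions at all. The report name "example" (and so the path src/example/example.R) and the output file name are assumptions made for illustration only.

```r
# src/example/example.R
# "example" is a hypothetical report name; nothing here is orderly-specific.

# Build a small data set using base R only
d <- data.frame(x = 1:10)
d$y <- 2 * d$x + 1

# Write the result into the working directory; when run through orderly2,
# files left in the working directory are kept as part of the packet
saveRDS(d, "data.rds")
```

Adding calls such as orderly2::orderly_artefact() to a script like this is optional; they declare intent rather than change what the script computes.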
Some specific changes:
- You no longer need a packages: field; instead just use library() as
  in an ordinary script. We will record the state of the session
  regardless, so you will get a record of what was used.
- You no longer use sources: to list scripts you want to source;
  instead use source() as normal.
- global_resources has become orderly2::orderly_shared_resource (these
  aren’t really global so much as shared). Note that the directory they
  are in is now always shared/ at the orderly root; you may not
  configure it.
This change has widespread implications:
- You can compute dependencies programmatically, for example in a for
  loop over a series of parameter values, or depend on other reports
  conditionally.
In version 1, we had built-in support for accessing data from SQL
databases; this has moved into the orderly.db plugin. All major features
are supported.
orderly2 no longer requires a separate orderly_commit() call after
orderly_run(); we no longer make a distinction between local draft and
archive packets. Instead, we have added finer-grained control over where
dependencies are resolved from (locally, or from some subset of your
servers), which generalises the way that draft/archive was used in
practice. See ?orderly_run for more details on how dependencies are
resolved.
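The single-step workflow described above can be sketched as follows. This assumes the orderly2 package is installed; the temporary root, the report name "example" and the file names are hypothetical, and the archive/<name>/<id> layout is what we understand the default configuration to produce.

```r
# Sketch only: a throwaway orderly root in a temporary directory
root <- tempfile("orderly-root-")
dir.create(root)
orderly2::orderly_init(root)

# A trivial report at src/example/example.R ("example" is hypothetical)
dir.create(file.path(root, "src", "example"), recursive = TRUE)
writeLines('saveRDS(data.frame(x = 1:10), "data.rds")',
           file.path(root, "src", "example", "example.R"))

# One call runs the report and stores the packet; there is no
# orderly_commit() step afterwards
id <- orderly2::orderly_run("example", root = root, echo = FALSE)

# Assumed default layout: archive/<name>/<id>
packet <- file.path(root, "archive", "example", id)
list.files(packet)
```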
This has implications for deleting things; the draft directory was
always an easy target for deletion, but now after deletion you will need
to tell orderly2 that you have deleted things. See
vignette("introduction") for details on this (section “Deleting things
from the archive”).
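As a rough sketch of what "telling orderly2" might look like, the example below deletes a packet directory by hand and then asks orderly2 to reconcile its metadata. It assumes the orderly2 package is installed and that orderly_validate_archive() accepts action = "orphan" as we recall from its help page; please check ?orderly_validate_archive and the vignette section cited above before relying on it. The root and the report name "example" are hypothetical.

```r
# Sketch only: set up a throwaway root with one packet in it
root <- tempfile("orderly-root-")
dir.create(root)
orderly2::orderly_init(root)
dir.create(file.path(root, "src", "example"), recursive = TRUE)
writeLines('saveRDS(data.frame(x = 1:10), "data.rds")',
           file.path(root, "src", "example", "example.R"))
id <- orderly2::orderly_run("example", root = root, echo = FALSE)

# Delete the packet by hand, as you might once have cleared out drafts
packet <- file.path(root, "archive", "example", id)
unlink(packet, recursive = TRUE)

# Assumption: this marks the now-missing packet as orphaned, so that the
# metadata no longer claims it is available locally
orderly2::orderly_validate_archive(action = "orphan", root = root)
```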
We have had two different, but unsatisfactory, mechanisms for developing an orderly report:
- orderly_test_start (up to version 1.1.0)
- orderly_develop_start (from 1.1.0 onwards)
These worked by doing the initial setup and copying dependencies etc.
to a location where you could work (a new draft for orderly_test_start
and the source directory for orderly_develop_start). With orderly2 you
can just work directly within the source directory, and so long as your
working directory is set to src/<report-name>, all orderly2 commands
will work as expected.
As with orderly1, you will need to be careful not to commit (to git)
results of running your analysis, and we encourage per-report .gitignore
files to help with this.
The biggest change, but perhaps the least visible, is that orderly is now built on an open spec, outpack, which can be implemented in any language. We will develop a Python implementation of this, and possibly implementations in other languages.
outpack takes control of all the metadata. As such there is a split
between “orderly_” and “outpack_” functions in this package; for more
information see the last section of vignette("introduction") and also
vignette("outpack").
“Global” resources have become “shared” resources, and always live in
shared/ at the orderly root (i.e., this is no longer configurable). The
reason for this is that we want reports to be able to be run fairly
independently of the orderly2 configuration (the only exception to this
is plugins). In practice people did not really vary this.
Migrating from orderly1
There are two parts to a migration: updating the canonical copy of
your orderly archive (ideally you only have one of these) and updating
your source tree. These steps should be done via the outpack.orderly
package.
You should migrate your archive first. Do this for every archive that
you want to retain (you might have archives stored locally, on
production servers and on staging servers). Archive migration happens
out of place; that is, we do not modify anything in the original
location. If your archive is old and has been used with very old
versions of orderly1 it is possible that this process will have a few
hiccups. Please let us know if that is the case. The result of this
process is that you will end up with a new directory that contains a new
archive conforming to the outpack spec and containing orderly2 metadata.
Next, migrate your source tree. This is done in place, so it should be
done on a fresh clone of your source git repository. For each report, we
will examine your orderly.yml files and your script files (often
script.R), delete these, and then write out a new orderly file that will
adapt your report to work for orderly2. It is possible that this will
not be perfect and might need some minor tweaking, but hopefully it will
be reasonable. One thing that is not preserved (and probably cannot be)
is the comments from the yaml, but as these often refer to yaml
formatting or orderly1 features, hopefully this is not too much of a
problem. You will probably want to manually tweak the generated code
anyway, to take advantage of some of the new orderly2 features such as
being able to compute dependencies.
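As an illustration of computed dependencies, the sketch below sets up a throwaway root with two upstream reports and a downstream report that pulls from both in a for loop. It assumes the orderly2 package is installed; the report names ("raw_a", "raw_b", "combined"), the helper write_report() and all file names are hypothetical, and this is one possible arrangement rather than the output of the migration tool.

```r
# Sketch only: a throwaway orderly root in a temporary directory
root <- tempfile("orderly-root-")
dir.create(root)
orderly2::orderly_init(root)

# Hypothetical helper: write src/<name>/<name>.R containing `code`
write_report <- function(name, code) {
  path <- file.path(root, "src", name)
  dir.create(path, recursive = TRUE)
  writeLines(code, file.path(path, paste0(name, ".R")))
}

# Two upstream reports that each produce data.csv
for (nm in c("raw_a", "raw_b")) {
  write_report(
    nm,
    'write.csv(data.frame(x = 1:3), "data.csv", row.names = FALSE)')
}

# Downstream report: dependencies are declared inside a loop, which a
# static orderly.yml could not express. Names in `files` are the
# destination, values are the file name in the upstream packet.
write_report("combined", c(
  'upstream <- c("raw_a", "raw_b")',
  'for (nm in upstream) {',
  '  orderly2::orderly_dependency(',
  '    nm, "latest",',
  '    files = setNames("data.csv", paste0(nm, ".csv")))',
  '}',
  'd <- do.call(rbind, lapply(paste0(upstream, ".csv"), read.csv))',
  'saveRDS(d, "combined.rds")'))

# Run upstream reports first, then the report that depends on them
for (nm in c("raw_a", "raw_b")) {
  orderly2::orderly_run(nm, root = root, echo = FALSE)
}
id <- orderly2::orderly_run("combined", root = root, echo = FALSE)
result <- readRDS(file.path(root, "archive", "combined", id, "combined.rds"))
```

The same pattern extends to conditional dependencies by wrapping the orderly_dependency() call in an if statement.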
If you are using OrderlyWeb, you probably need to pause before migrating, as the replacement is not yet ready.
We will merge orderly2 into the orderly package, so once we are ready
for release you can use that. However, we anticipate a period in which
legacy orderly1 systems coexist with orderly2 while we develop it. To
help with this we have a small helper package, orderly.helper, which can
smooth over these namespace differences; this may be useful if you
interact with both versions.