orderly
is a package designed to help make analysis more
reproducible. Its principal aim is to automate a series of basic steps
in the process of writing analyses, making it easy to:
With orderly
we have two main hopes:
orderly
requires a few conventions around organisation
of a project, and after that tries to keep out of your way. However,
these requirements are designed to make collaborative development with
git easier by minimising conflicts and making backup easier by using an
append-only storage system.
One often-touted goal of R over point-and-click analyses packages is that if an analysis is scripted it is more reproducible. However, essentially all analyses depend on external resources - packages, data, code, and R itself; any change in these external resources might change the results. Preventing such changes in external resources is not always possible, but tracking changes should be straightforward - all we need to know is what is being used.
For example, while reproducible research has become
synonymous with literate programming this approach often increases
the number of external resources. A typical knitr
document will depend on:
.Rmd
or .Rnw
)source
The orderly
package helps by
The core problem is that analyses have no general interface.
Consider in contrast the role that functions take in programming. All
functions have a set of arguments (inputs) and a return value (outputs).
With orderly
, we borrow this idea, and each piece of
analysis will require that the user describes what is needed and what
will be produced.
The user describes the inputs of their analysis, including:
The user also provides a list of “artefacts” (file-based results) that they will produce.
Then orderly
:
It then stores metadata alongside the analysis including hashes of all inputs and outputs, copies of data extracted from the database, a record of all R packages loaded at the end of the session, and (if using git) information about the git state (hash, branch and status).
Then if one of the dependencies of a report changes (the used data, code, etc), we have metadata that can be queried to identify the likely source of the change.
To illustrate, we will start with a minimal example (you can use
orderly::orderly_init
to create a similar structure
directly), and we will build it up to demonstrate orderly
features. In the most minimal example, we want to run a script that
creates a graph. It uses no external resources.
.
├── orderly_config.yml
└── src
└── example
├── orderly.yml
└── script.R
In this example, the orderly_config.yml
file is
completely empty, but serves to mark the root of the
orderly
project. We have one report, called
example
, and its configuration is within
orderly.yml
:
script: script.R
artefacts:
- staticgraph:
description: A graph of things
filenames: mygraph.png
- data:
description: Data that went into the plot
filenames: mydata.csv
There are two keys here:
script
the path of the script to run,
script.R
artefacts
a description of the artefacts (files) that
will be produced by running this script. In this case it is a graph with
the filename mygraph.png
The script is plain R code:
dat <- data.frame(x = 1:10, y = runif(10))
write.csv(dat, "mydata.csv", row.names = FALSE)
png("mygraph.png")
plot(dat)
dev.off()
The R code can be as long or as short as needed and can use whatever
packages it needs. orderly
does not do anything with the
script apart from run it so it can be formatted freely (there are no
magic comments, etc). There are no restrictions on what can be done
except that it must produce the artefacts listed in
orderly.yml
. If not, an error will be thrown describing
what was missing.
To run the report, use orderly::orderly_run
(typically
one would be in the orderly
root and so the
root
directory could be omitted, but within this vignette
we use a temporary directory):
## [ info ] Writing initial orderly archive version as 1.1.25
## [ name ] example
## [ id ] 20241018-053122-7baca55c
## [ start ] 2024-10-18 05:31:22.485569
##
## > dat <- data.frame(x = 1:10, y = runif(10))
##
## > write.csv(dat, "mydata.csv", row.names = FALSE)
##
## > png("mygraph.png")
##
## > plot(dat)
##
## > dev.off()
## png
## 2
## [ end ] 2024-10-18 05:31:22.501336
## [ elapsed ] Ran report in 0.01576686 secs
## [ artefact ] mygraph.png: 9004cd59ab3226abada205a967a4dc9c
## [ ... ] mydata.csv: 8354220c49f0bec2665684bc288f52ca
The return value is the id of the report (also printed on the third
line of log output) and is always in the format
YYYYMMDD-HHMMSS-abcdef01
where the last 8 characters are
hex digits (i.e., 4 random bytes). This means reports will automatically
sort nicely but we’ll have some collision resistance.
## [1] "20241018-053122-7baca55c"
Having run the report, the directory layout now looks like:
.
├── archive
├── draft
│ └── example
│ └── 20241018-053122-7baca55c
│ ├── mydata.csv
│ ├── mygraph.png
│ ├── orderly.yml
│ ├── orderly_run.rds
│ └── script.R
├── orderly_config.yml
└── src
└── example
├── orderly.yml
└── script.R
Within drafts
, the directory
example/20241018-053122-7baca55c
has been created which
contains the result of running the report. In here there are the
files:
orderly.yml
: this is an exact copy of the input
filescript.R
: this is an exact copy of the script used for
the analysismygraph.png
: the artefact created by the reportorderly_run.rds
: this is metadata about the run and
includes hashes of input files, of the data used, and of the output etc,
along with details about the packages used and the state of git. It is
stored in R’s internal data format.Every time a report is run it will create a new directory at this
level with a new id. Running the report again now might create the
directory example/20241018-053122-8d90c2c3
We store the copies of files as run by orderly
so that
even if the input files change we can still easily get back to previous
versions of the inputs, alongside the outputs, and these are safe from
any changes to the underlying source.
You can see the list of draft reports like so:
## name id
## 1 example 20241018-053122-7baca55c
Once you’re happy with a report, then “commit” it with
orderly::orderly_commit(id, root = path)
## [ commit ] example/20241018-053122-7baca55c
## [ copy ]
## [ import ] example:20241018-053122-7baca55c
## [ success ] :)
## [1] "/tmp/RtmpTaFezJ/filee0f11a16851/archive/example/20241018-053122-7baca55c"
After this step our directory structure looks like:
.
├── archive
│ └── example
│ └── 20241018-053122-7baca55c
│ ├── mydata.csv
│ ├── mygraph.png
│ ├── orderly.yml
│ ├── orderly_run.rds
│ └── script.R
├── draft
│ └── example
├── orderly.sqlite
├── orderly_config.yml
└── src
└── example
├── orderly.yml
└── script.R
This looks very like the previous, but files have been moved from
being within draft
to being within archive
.
The other difference is that the index orderly.sqlite
has
been created. This is a machine-readable index to all the
orderly
metadata that can be used to build applications
around orderly
(for example OrderlyWeb, a web portal
for orderly
- see the “remotes” vignette). The
documentation for the database format is available on the orderly
website.
First, run orderly::orderly_new
to create a directory
within src
. The name is important and should not contain
spaces (nor should it change as this will change the key report id and
you’ll lose a chain of history), then edit the file
orderly.yml
within that directory.
## Created report at '/tmp/RtmpTaFezJ/filee0f11a16851/src/new'
## Edit the file 'orderly.yml' within this directory
which results in a directory structure like:
.
├── archive
│ └── example
│ └── 20241018-053122-7baca55c
│ ├── mydata.csv
│ ├── mygraph.png
│ ├── orderly.yml
│ ├── orderly_run.rds
│ └── script.R
├── draft
│ └── example
├── orderly.sqlite
├── orderly_config.yml
└── src
├── example
│ ├── orderly.yml
│ └── script.R
└── new
└── orderly.yml
Resources to a report are expected to be read-only files that are used by the script to produce the report. Examples of the sort of files that should be used as resources are:
“Resources” cannot be modified by the report; if orderly
detects that a resource has been changed an error will be thrown.
orderly
will automatically detect any files named
README.md
in a report’s source directory and copy them to
the new directory too.
“Sources” are files containing R code that will be sourced (via the R
function source()
) before the main script is run. Often
this file contains functions or variables used by the main script. All
of the copying and sourcing will be handled by orderly
itself so there is no need to explicitly source the files in the main
script.
“Artefacts” are the output of the report. At least one artefact must be listed and files created during the running of the script must be included as artefacts (or deleted before the script finishes) or an error will be returned.
Examples of artefacts fields in orderly.yml
:
artefacts:
- report:
filenames: report.html
description: a simple report
- data:
description:
- associated data sets
filenames:
- data_one.csv
- data_two.csv
- data_three.csv
- data_four.csv
When declaring an artefact we have to specify what format the
artefact is. Currently supported formats are :data
,
report
, staticgraph
,
interactivegraph
and interactivehtml
. These
tags reflect the intent of use of the file, they have no special meaning
within orderly
itself.
It is often the case that we would like to write a report that
depends on an earlier report, e.g. one report produces a large
dataset and a later report produces a high level summary.
orderly
allows a report to directly copy an artefact file
from an existing report without having to manually copy it into the
report source directory. This is handled in the depends
block of the report’s orderly.yml
.
To use a file as a dependency it must be explicitly listed as an artefact.
An simple example might look like:
This will copy the file the huge-data-set.rds
from the
report big-data-report
with id
20190425-163691-b8451bbf
and rename it
data.rds
. This file can then be used by the report as if it
were in the source directory.
If we want a report to always use the latest version of a report
big-data-report
we can set the id
field to
latest
, e.g.:
This will find the most recent version of the report
big-data-report
and copy files from that directory.
To use multiple artefacts from a single report add the files into the
use
block e.g.:
depends:
- big-data-report:
id: latest
use:
data.rds: huge-data-set.rds
pop.csv: population_data.csv
To use artefacts from multiple reports we add multiple entries to the
depends
field e.g.:
depends:
- big-data-report:
id: latest
use:
data.rds: huge-data-set.rds
pop.csv: population_data.csv
- report_two:
id: latest
use:
data_b.rds: filename.rds
We can also use the same artefact from different versions of the same report. This might come up if we want to write a report that compares the output from different versions of another report. The yaml pattern for this is:
depends:
- big-data-report:
id: 20190425-163691-b8451bbf
use:
data_latest.rds: huge-data-set.rds
- big-data-report:
id: 20181225-172991-34c91ef1
use:
data_old: huge-data-set.rds
The important feature in this example is the dashes before the report name. When all the report names are different these dashes can be omitted, but they are necessary when the report depends on different versions of the same report. Since including the dashes will never cause a problem but omitting them might, we advise that they should always be included.
Sometimes it can be useful to control how a report runs by a
parameter. This could be the name of a country that an analysis applies
to (though we hope to develop a better interface for this soon) through
to controlling the number of iterations that an analysis runs for.
Parameters are declared in the orderly.yml
like:
This would declare that a report takes two parameters a
(with a default of 1), and b
(with no default). Running the
report would then look like:
These parameters are then present in the environment of the report,
so the code can use values a
and b
.
The parameters will also be interpolated into any SQL queries before
they are run, so if the orderly.yml
contains:
then this will be evaluated on the SQL server with a
substituted in where the query says ?a
(this is done with
DBI::sqlInterpolate
).
There might be files that are used in (almost) every report. Examples
of these sorts of files might be document templates or organisation
logos. To set up a global resource create a directory
your_global_dir
in <root>
and the
following to the orderly_config.yml
:
Then to use any file in your_global_dir
in your report
add a global_resources
field to that report’s
orderly.yml
:
global_resources:
logo.jpg: org_logo.jpg
latex_class.cls: org_latex_class.cls
styles.css: org_styles.css
Currently code i.e. R source code cannot be sourced from the global resources directory. So for example utility functions common across multiple reports must be included in each report directory separately. The functionality to include global source code may be added in future versions.
orderly
is designed to work well with git (or any other version control
system). The general principle is that src/
and any
configuration files should be added to git, while most of the generated
files (data
, draft
, archive
, and
any SQLite databases such as orderly.sqlite
) should be
excluded, which can be done by creating a .gitignore
file.
This process can be automated by running
## Changes to '/tmp/RtmpTaFezJ/filee0f11a16851/.gitignore'
## + 1 | # These are directories where orderly stores (potentially large)
## + 2 | # generated files - you definitely do not want these in git.
## + 3 | data
## + 4 | archive
## + 5 | draft
## + 6 | metadata
## + 7 |
## + 8 | # Orderly will store data in a sqlite database, by default orderly.sqlite
## + 9 | *.sqlite
## + 10 |
## + 11 | # It is suggested to use environment variables, in either
## + 12 | # orderly_envir.yml or .Renviron to store sensitive data to use with
## + 13 | # orderly - keep those out of git
## + 14 | .Renviron
## + 15 | orderly_envir.yml
## + 16 |
## + 17 | # Other good things to exclude
## + 18 | .Rhistory
## + 19 | .Rdata
## + 20 | .Rproj.user
## + 21 |
## + 22 | # Created by orderly.server
## + 23 | runner
## + 24 | backup
## Writing to '/tmp/RtmpTaFezJ/filee0f11a16851/.gitignore'
(the default, prompt = TRUE
will request user
confirmation before writing the .gitignore
file).
You should arrange to back up the entire orderly directory through some other means.
If your generated files are particularly small you might leave them in git, but in our experience this will result in a git repository that is unpleasant to use.
One of the original aims of orderly
was to provide a set
of tools for use of SQL databases within reproducible reporting. Because
the SQL database is an external global resource it is difficult to work
with any concept of “versioning” from R (there is no git history, no way
of easily rolling back to previous versions etc). If using a central SQL
server, there is configuration that should be kept out of any
analysis, particularly things like passwords. Configuration problems
multiply when using both “production” and “staging” systems as we would
like to be able to switch between different configurations.
The root orderly_config.yml
configuration specifies the
locations of databases (there can be any number), for example:
database:
source:
driver: RPostgres::Postgres
args:
host: dbhost.example.org
port: 5432
user: myusername
password: s3cret
dbname: mydb
This database will be referred to elsewhere as source
and it will be connected with the RPostgres::Postgres
driver (from the RPostgres
package). Arguments within the args
block will be passed to
the driver, in this case being the equivalent of:
DBI::dbConnect(RPostgres::Postgres, host = "dbhost.example.org", port = 5432,
user = "myusername", password = "s3cret", dbname = "mydb")
The values used in the args
blocks can be environment
values (e.g., password: $DB_PASSWORD
) in which case they
will be resolved from the environment before connecting. This will be
useful for keeping secrets out of source control.
For SQLite
databases, the args
block will typically contain only
dbname
which is the path to the database file.
A report configuration (orderly.yml
) can contain a
data
block, which contains sql queries, such as:
In this case, the query
SELECT * FROM mtcars WHERE cyl = 4
will be run against the
source
database to create an object cars
in
the report environment. The actual report code can use that object
without having ever created the database connection or evaluating the
query.
Further, the data used in the query will be captured in
orderly
’s data
directory, and hashes of the
data will be stored alongside the results. This means that even if the
data in the database is a constantly moving target we can still detect
if changes to the data are responsible for changes in the result of a
report.
If you need to perform complicated SQL queries, then you can export the database connection directly by adding a block:
which will save the connection to the source
database as
the R object con
. We have used this where a report requires
running queries in a loop that depend on the results of a previous query
or additional data loaded into a report, or where the result of the
query will be very large and we do not want to save it to disk.
Note that this reduces the amount of tracking that
orderly
can do, as we have no way of knowing what is done
with the connection once passed to the script.
The contents of orderly_config.yml
may contain things
like secrets (passwords) or hostnames that vary depending on deployment
(e.g., testing locally vs running on a remote system). To customise
this, you can use environment variables within the configuration. So
rather than writing
database:
source:
driver: RPostgres::Postgres
args:
host: localhost
port: 5432
user: myuser
dbname: databasename
password: p4ssw0rd
you might write
database:
source:
driver: RPostgres::Postgres
args:
host: $MY_DBHOST
port: $MY_DBPORT
user: $MY_DBUSER
dbname: $MY_DBNAME
password: $MY_PASSWORD
environment variables, as used this way must begin
with a dollar sign and consist only of uppercase letters, numbers and
the underscore character. You can then set the environment variables in
an .Renviron
(either within the project or in your home
directory) file or your .profile
file. Alternatively, you
can create a file orderly_envir.yml
in the same directory
as orderly_config.yml
with key-value pairs, such as
MY_DBHOST: localhost
MY_DBPORT: 5432
MY_DBUSER: myuser
MY_DBNAME: databasename
MY_PASSWORD: p4ssw0rd
This will be read every time that orderly_config.yml
is
read (in contrast with .Renviron
which is read-only at the
start of a session). This will likely be more pleasant to work with.
The advantage of using environment variables is that you can add the
orderly_envir.yml
file to your .gitignore
and
avoid committing system-dependent data to the central repository (see
orderly::orderly_use_gitignore
) to help automate this.
To avoid leaving passwords in plain text, you can use vault
(along with
the R client vaultr
)
to retrieve them.
To do this, you should include the address of your vault server in
the orderly_config.yml
as
Then, for values that you want to retrieve from the vault, set the
value of the field to VAULT:<path>:<field>
,
where <path>
is the name of a vault secret path
(probably beginning with /secret/
and field
is
the name of the field at that path. So, for example:
would look up the field password
at the path
/secret/users/database_user
. This can be stored in
orderly_config.yml
, in the contents of an environment
variable or in orderly_envir.yml
(currently this only uses
the vault version 1
key-value storage)
If you need to control how the vault server is accessed, then you can
pass additional arguments within an args
block:
vault:
addr: https://example.com:8200
login: userpass
username: alice
password: secret
mount: userpass
This is equivalent to connecting to the vault, using
vaultr
as
vaultr::vault_client(addr = "https://example.com:8200", login = "userpass",
username = "alice", password = "secret",
mount = "userpass")
Environment variables here will be respected so you could write:
vault:
addr: https://example.com:8200
login: userpass
username: $VAULT_USER
password: $VAULT_PASSWORD
and the username and password will be found from environment
variables (the actual secret resolution uses
vaultr::vault_resolve_secrets
- see the documentation in
vaultr
for further details).
In general, you can ignore this section if you only use one global database.
The above approach can be used to switch databases by using different
environmental variables, but that can become tiresome. If you have
multiple database “instances” corresponding to different realisations of
the same logical database (e.g., production and staging), then you can
configure and switch between these directly from orderly
commands. At VIMC we have
several copies of our main database: one called production
,
which is the canonical copy, and then several staging
copies that we use for experimentation.
To configure this situation, list common arguments within the
args
block as before, then add logical databases as named
entries in an instances
field:
database:
source:
driver: RPostgres::Postgres
args:
port: 5432
user: user
dbname: mydb
instances:
production:
host: production.example.org
password: $PASSWORD_PRODUCTION
staging:
host: staging.example.org
password: $PASSWORD_STAGING
default_instance: $DEFAULT_INSTANCE
Here - staging and production have different hostnames
(production.example.org
and
staging.example.org
) and different passwords (retrieved
using environment variables) and the default instance is set with
another environment variable ($DEFAULT_INSTANCE
, which must
be one of production
or staging
). To switch
between databases, you can set that variable, or pass the
instance
argument to orderly::orderly_run
and
friends, as:
or
If there is more than one database configured, then the
interpretation of instance
is a little more nuanced. For
example, suppose we have this (abridged) database configuration:
database:
source:
driver: RPostgres::Postgres
args:
port: 5432
instances:
production:
host: production.example.org
staging:
host: staging.example.org
default_instance: staging
extra:
driver: RPostgres::Postgres
args:
port: 5432
host: extra.example.org
Then we can pass in a string like production
in as the
instance, e.g.,
orderly::orderly_run(name, instance = "production")
as the source
database will select the
production
instance and as there are no instances
configured for the extra
database we will ignore the
argument when connecting to extra
.
However, if both databases had two instances, such as:
database:
source:
driver: RPostgres::Postgres
args:
port: 5432
instances:
production:
host: production.example.org
staging:
host: staging.example.org
default_instance: staging
extra:
driver: RPostgres::Postgres
args:
port: 15432
instances:
production:
host: production.example.org
staging:
host: staging.example.org
default_instance: staging
Then it is possible to select different instances for each database, such as:
orderly::orderly_run(name,
instance = c(source = "production", extra = "staging"))
Environment variables declared in orderly.yml
are made
available in the report script (since orderly 1.1.11). For example, if
your orderly.yml
contains
Then orderly will evaluate the variables and make them available in your script via the specified name. For example in your script you can use
The expected use case for this is if you have data which you want to use in a report but do not want to be in the orderly repository (either because it is sensitive or particularly large). You can have the external files somewhere else on your machine and specify the path to them via an environment variable so they can then be accessed from the report script.
Environment variables can also be used in top-level
orderly_envir.yml
(since orderly 1.1.6) which can be
accessed via Sys.getenv()
. For example, if your
orderly_envir.yml
contains
Then in your script you can use
If the values are sensitive, then this is not ideal, as you will
store your values in plain text in the orderly_envir.yml
.
Instead, if using vault, you can use a secrets:
section in
orderly.yml
, like:
and then an R variable password
will be available to all
the code in your report, containing the result of looking up
/secret/myproject/login
in your vault and getting the
value
field.
Because orderly works in a directory that is not the same as the
source directory (e.g.,
draft/myreport/YYYYMMDD-HHMMSS-abcdefgh
), because global
resources and dependencies etc are copied in just as it is used, and
because orderly
takes control of things like parameters and
sourcing files, it may not seem straightforward to develop a report as
you would ordinarily.
In order to make this easier, orderly has a set of functions to help develop a report within the source directory. These functions are:
orderly::orderly_develop_start
which copies files
into your source copyorderly::orderly_develop_status
which reports on the
status of your sourceorderly::orderly_develop_clean
which cleans up files
that orderly copiedThis section illustrates the idea, creating a new report that will depend on an artefact from a previous report. Here is the state of our orderly tree from before:
.
├── archive
│ └── example
│ └── 20241018-053122-7baca55c
│ ├── mydata.csv
│ ├── mygraph.png
│ ├── orderly.yml
│ ├── orderly_run.rds
│ └── script.R
├── draft
│ └── example
├── orderly.sqlite
├── orderly_config.yml
└── src
├── example
│ ├── orderly.yml
│ └── script.R
└── new
├── orderly.yml
└── script.R
The new report has an orderly.yml
containing
script: script.R
artefacts:
- staticgraph:
description: Mean of the values
filenames: mean.txt
depends:
example:
id: latest
use:
data.csv: mydata.csv
In order to run this report, we need the file data.csv
(which contains the output mydata.csv
from the latest
version of the example
report) to exist. If we wanted to
interactively develop the script.R
we’d have to run
orderly::orderly_run
repeatedly, which will be annoying -
and impractical if the report is slow to run. So we can run instead:
## [ name ] new
## [ depends ] example@20241018-053122-7baca55c:mydata.csv -> data.csv
which copies in the required artefact and we can then change the
directory (using something like setwd("src/new")
) and start
working on the report directly.
You can see the status of the directory by running
## filename type present derived
## 1 orderly.yml orderly TRUE FALSE
## 2 script.R script TRUE FALSE
## 3 data.csv dependency TRUE TRUE
## 4 mean.txt artefact FALSE TRUE
(the output of this object is likely to change in future versions),
which shows that data.csv
is present, and that it is
derived - by which means that orderly knows that it should not
persist in the source tree. Files marked as derived
as
TRUE
are liable for deletion by
orderly::orderly_develop_clean
. The output also shows that
mean.txt
is not present.
If you have changed directly into the path under development (via
setwd(file.path(path, "src/new"))
) the you can omit the
arguments and simply call
## filename type present derived
## 1 orderly.yml orderly TRUE FALSE
## 2 script.R script TRUE FALSE
## 3 data.csv dependency TRUE TRUE
## 4 mean.txt artefact FALSE TRUE
You can then run your script as if it was a normal R script:
##
## > data <- read.csv("data.csv")
##
## > writeLines(as.character(mean(data$y)), "mean.txt")
(the above code assuming that we are within src/new
with
the report)
After which the data.csv
is present
## filename type present derived
## 1 orderly.yml orderly TRUE FALSE
## 2 script.R script TRUE FALSE
## 3 data.csv dependency TRUE TRUE
## 4 mean.txt artefact TRUE TRUE
If a newer version of the upstream dependency has become available,
you can update the file by running
orderly::orderly_develop_start()
again
## [ name ] example
## [ id ] 20241018-053122-e994b997
## [ start ] 2024-10-18 05:31:22.916884
##
## > dat <- data.frame(x = 1:10, y = runif(10))
##
## > write.csv(dat, "mydata.csv", row.names = FALSE)
##
## > png("mygraph.png")
##
## > plot(dat)
##
## > dev.off()
## png
## 2
## [ end ] 2024-10-18 05:31:22.93257
## [ elapsed ] Ran report in 0.01568556 secs
## [ artefact ] mygraph.png: b0e560cb1bce2a550c3a04712ddfea73
## [ ... ] mydata.csv: cf2749f51ddc331368c20bd9ed659916
## [ commit ] example/20241018-053122-e994b997
## [ copy ]
## [ import ] example:20241018-053122-e994b997
## [ success ] :)
## [1] "/tmp/RtmpTaFezJ/filee0f11a16851/archive/example/20241018-053122-e994b997"
## [ name ] new
## [ depends ] example@20241018-053122-e994b997:mydata.csv -> data.csv
(note that this has updated data.csv
to use this new id
20241018-053122-e994b997).
Finally, you can delete the files that orderly has copied into the source tree:
orderly::orderly_develop_clean()
## [ remove ] data.csv
## [ ... ] mean.txt
orderly::orderly_develop_status()
## filename type present derived
## 1 orderly.yml orderly TRUE FALSE
## 2 script.R script TRUE FALSE
## 3 data.csv dependency FALSE TRUE
## 4 mean.txt artefact FALSE TRUE
(this function also accepts name
and root
as above, required if not working in the source directory)