In contrast with the descriptive approach in the main vignette
(vignette("porcelain")), this vignette contains little recipes for
exposing and testing different endpoint types. They are ordered roughly
from simplest to most complicated, and are written as standalone
examples (which makes them quite repetitive!).
Below, we do not use the package convention of wrapping each endpoint in a function. This is to make the examples a little shorter and to make the endpoints more directly callable. In a package, a wrapper function is needed to make the schema path point to the correct place, and to allow binding of state into the endpoint (see later examples).
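As a rough sketch of the wrapper convention described above (not code taken from the main vignette), an endpoint can be built inside a function so that the schema path is only resolved when the function is called. The package name mypkg and its schema directory are hypothetical; add is the same target function used in the first recipe below.

```r
# Target function (the same as in the first recipe below)
add <- function(a, b) {
  jsonlite::unbox(a + b)
}

# Wrapper: nothing is constructed until the function is called, so the
# schema path is resolved inside the installed package at run time.
# "mypkg" and its "schema" directory are hypothetical names.
endpoint_add <- function() {
  schema_root <- system.file("schema", package = "mypkg", mustWork = TRUE)
  porcelain::porcelain_endpoint$new(
    "GET", "/", add,
    porcelain::porcelain_input_query(a = "numeric", b = "numeric"),
    returning = porcelain::porcelain_returning_json("numeric", schema_root))
}
```

Calling endpoint_add() would then return the endpoint object, ready to be passed to an API object's handle() method as in the recipes below.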
The one piece of shared code is a common schema root, schema_root, which every example below passes to porcelain when it needs to locate a JSON schema.
This is the example from the main vignette, adding two numbers given
as query parameters and returning a single number. Note that we need to
use jsonlite::unbox() to indicate that the single number should be
returned as a number and not a vector of length 1 (compare
jsonlite::toJSON(1) and jsonlite::toJSON(jsonlite::unbox(1))).
add <- function(a, b) {
  jsonlite::unbox(a + b)
}
endpoint_add <- porcelain::porcelain_endpoint$new(
  "GET", "/", add,
  porcelain::porcelain_input_query(a = "numeric", b = "numeric"),
  returning = porcelain::porcelain_returning_json("numeric", schema_root))
api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_add)
Run the endpoint:
api$request("GET", "/", query = list(a = 1, b = 2))
#> $status
#> [1] 200
#>
#> $headers
#> $headers$`x-request-id`
#> [1] "fcc51780-3cbc-422e-be3a-29f94d3a32cd"
#>
#> $headers$`Content-Type`
#> [1] "application/json"
#>
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#>
#>
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":3}"
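The effect of jsonlite::unbox() mentioned above can be checked directly, without going through the API. This is just a small illustration using jsonlite; the variable names are arbitrary.

```r
library(jsonlite)

boxed <- toJSON(1)            # length-1 vector, serialised as an array
unboxed <- toJSON(unbox(1))   # scalar, serialised as a bare number

as.character(boxed)    # "[1]"
as.character(unboxed)  # "1"
```

The unboxed form is the one that appears as "data":3 in the response body above.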
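A slightly more interesting return type: this time the endpoint returns a numeric vector. The distribution is taken from the URL path (/random/<distribution>) and the number of draws, n, from the query string.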
random <- function(distribution, n) {
  switch(distribution,
         normal = rnorm(n),
         uniform = runif(n),
         exponential = rexp(n))
}
endpoint_random <- porcelain::porcelain_endpoint$new(
  "GET", "/random/<distribution>", random,
  porcelain::porcelain_input_query(n = "numeric"),
  returning = porcelain::porcelain_returning_json("numericVector", schema_root))
api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_random)
Run the endpoint:
api$request("GET", "/random/normal", query = list(n = 4))
#> $status
#> [1] 200
#>
#> $headers
#> $headers$`x-request-id`
#> [1] "6e649ae1-040b-4fc5-b6f9-a31151e24118"
#>
#> $headers$`Content-Type`
#> [1] "application/json"
#>
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#>
#>
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[-0.3396,0.3801,-0.1197,0.2298]}"
api$request("GET", "/random/uniform", query = list(n = 4))
#> $status
#> [1] 200
#>
#> $headers
#> $headers$`x-request-id`
#> [1] "7df4098f-a580-445e-9a4c-17a6e6498d96"
#>
#> $headers$`Content-Type`
#> [1] "application/json"
#>
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#>
#>
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[0.7719,0.3273,0.2595,0.8336]}"
Note that the output here is always a vector, even in the corner cases of 1 and 0 elements returned:
api$request("GET", "/random/normal", query = list(n = 1))
#> $status
#> [1] 200
#>
#> $headers
#> $headers$`x-request-id`
#> [1] "60d3409b-ba87-4ef9-8c3a-1daddc8928ca"
#>
#> $headers$`Content-Type`
#> [1] "application/json"
#>
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#>
#>
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[0.1954]}"
api$request("GET", "/random/normal", query = list(n = 0))
#> $status
#> [1] 200
#>
#> $headers
#> $headers$`x-request-id`
#> [1] "0e645713-b410-4045-a4dc-c66591122a1c"
#>
#> $headers$`Content-Type`
#> [1] "application/json"
#>
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#>
#>
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[]}"
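When testing an endpoint like this, it can be convenient to parse the response body back into R rather than compare strings. A minimal sketch, using body strings copied from the responses above and jsonlite::fromJSON with its default settings (the variable names are arbitrary):

```r
# Body strings copied from the responses shown above
body_one <- "{\"status\":\"success\",\"errors\":null,\"data\":[0.1954]}"
body_none <- "{\"status\":\"success\",\"errors\":null,\"data\":[]}"

one <- jsonlite::fromJSON(body_one)
none <- jsonlite::fromJSON(body_none)

one$status         # "success"
length(one$data)   # 1
length(none$data)  # 0
```

In a real test the strings would come from the body element of the value returned by api$request().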
Here is one way that a complex statistical procedure (here, just lm)
might be wrapped as an API endpoint. We’ll run a linear regression
against vectors of data x and y and return a table of coefficients.
x <- runif(10)
data <- data.frame(x = x, y = x * 2 + rnorm(length(x), sd = 0.3))
fit <- lm(y ~ x, data)
summary(fit)
#>
#> Call:
#> lm(formula = y ~ x, data = data)
#>
#> Residuals:
#>      Min       1Q   Median       3Q      Max
#> -0.28426 -0.23817 -0.06128  0.16159  0.46194
#>
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  -0.1261     0.1801  -0.700 0.503506
#> x             2.3412     0.4439   5.275 0.000751 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.2996 on 8 degrees of freedom
#> Multiple R-squared: 0.7767, Adjusted R-squared: 0.7488
#> F-statistic: 27.82 on 1 and 8 DF, p-value: 0.000751
We’re interested in getting the table of coefficients, which we can extract like this:
summary(fit)$coefficients
#>               Estimate Std. Error    t value     Pr(>|t|)
#> (Intercept) -0.1261252  0.1800635 -0.7004484 0.5035057129
#> x            2.3412450  0.4438667  5.2746582 0.0007510461
and then transform it a little to turn the row names into a column of their own:
lm_coef <- as.data.frame(summary(fit)$coefficients)
lm_coef <- cbind(name = rownames(lm_coef), lm_coef)
rownames(lm_coef) <- NULL
(the broom package provides a nice way of doing this sort of
manipulation of these slightly opaque objects). There are many ways of
serialising this sort of data; we will do it in the default way
supported by jsonlite, representing the data frame as an array of
objects, each of which holds the key/value pairs for one row:
jsonlite::toJSON(lm_coef, pretty = TRUE)
#> [
#>   {
#>     "name": "(Intercept)",
#>     "Estimate": -0.1261,
#>     "Std. Error": 0.1801,
#>     "t value": -0.7004,
#>     "Pr(>|t|)": 0.5035
#>   },
#>   {
#>     "name": "x",
#>     "Estimate": 2.3412,
#>     "Std. Error": 0.4439,
#>     "t value": 5.2747,
#>     "Pr(>|t|)": 0.0008
#>   }
#> ]
So we have our target function now:
fit_lm <- function(data) {
  data <- jsonlite::fromJSON(data)
  fit <- lm(y ~ x, data)
  lm_coef <- as.data.frame(summary(fit)$coefficients)
  lm_coef <- cbind(name = rownames(lm_coef), lm_coef)
  rownames(lm_coef) <- NULL
  lm_coef
}
Note that the target function must deserialise the JSON itself. This
is so that arguments can be passed to jsonlite::fromJSON easily to
control how deserialisation happens. We may support automatic
deserialisation later as an argument to
porcelain::porcelain_input_body_json.
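To illustrate why control over deserialisation matters, here is a small sketch of how one jsonlite::fromJSON argument changes what the target function would receive. The JSON string is made up for the example.

```r
json <- '[{"x":0.5,"y":1.2},{"x":0.1,"y":0.3}]'

# Default: an array of records is simplified to a data frame
as_df <- jsonlite::fromJSON(json)

# With simplifyDataFrame = FALSE the records stay as a list, one per row
as_rows <- jsonlite::fromJSON(json, simplifyDataFrame = FALSE)

is.data.frame(as_df)    # TRUE
is.data.frame(as_rows)  # FALSE
length(as_rows)         # 2
```

fit_lm relies on the default behaviour, because lm needs a data frame with columns x and y.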
The endpoint is not much more involved than before, though we now have more interesting inputs and outputs, with schemas required for both:
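endpoint_lm <- porcelain::porcelain_endpoint$new(
  "POST", "/lm", fit_lm,
  porcelain::porcelain_input_body_json("data", "lmInputs", schema_root),
  returning = porcelain::porcelain_returning_json("lmCoef", schema_root))
# Register the endpoint with a fresh API object, as in the earlier recipes,
# so that the POST request to /lm further down is routed to it
api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_lm)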
The input schema, lmInputs.json, is
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "lmInputs",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "x": {"type": "number"},
      "y": {"type": "number"}
    },
    "required": ["x", "y"],
    "additionalProperties": false
  }
}
while the response schema, lmCoef.json, is
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "lmCoef",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "Estimate": {"type": "number"},
      "Std. Error": {"type": "number"},
      "t value": {"type": "number"},
      "Pr(>|t|)": {"type": "number"}
    },
    "required": ["name", "Estimate", "Std. Error", "t value", "Pr(>|t|)"],
    "additionalProperties": false
  }
}
These are fairly strict schemas, using both required and
additionalProperties. You might want to be more permissive, but we find
that strictness here pays off later.
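To make the effect of that strictness concrete, here is a hand-written check that mimics what required and additionalProperties: false demand of each input row. This is only an illustration of the rule, not how the schema is actually validated; check_row and the sample JSON are made up for the example.

```r
# Hypothetical helper: TRUE only if a row has exactly the required keys
check_row <- function(row, required = c("x", "y")) {
  missing <- setdiff(required, names(row))  # would violate "required"
  extra <- setdiff(names(row), required)    # would violate additionalProperties
  length(missing) == 0 && length(extra) == 0
}

rows <- jsonlite::fromJSON(
  '[{"x":1,"y":2},{"x":1,"y":2,"z":3},{"x":1}]',
  simplifyDataFrame = FALSE)

ok <- vapply(rows, check_row, logical(1))
ok  # TRUE FALSE FALSE
```

The second row fails because of the unexpected key z, and the third because y is missing; a more permissive schema would let one or both through.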
To exercise the API endpoint we need to pass in our input JSON (not an R object).
json <- jsonlite::toJSON(data)
json
#> [{"x":0.5984,"y":1.6721},{"x":0.1691,"y":0.089},{"x":0.2289,"y":0.179},{"x":0.5269,"y":1.2848},{"x":0.8028,"y":1.5129},{"x":0.2064,"y":0.4154},{"x":0.2612,"y":0.5999},{"x":0.3556,"y":0.4221},{"x":0.1178,"y":0.6115},{"x":0.1826,"y":0.0285}]
api$request("POST", "/lm", body = json, content_type = "application/json")
#> $status
#> [1] 200
#>
#> $headers
#> $headers$`x-request-id`
#> [1] "ebda1f43-7a48-40ec-87b3-cda9c2e51740"
#>
#> $headers$`Content-Type`
#> [1] "application/json"
#>
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#>
#>
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[{\"name\":\"(Intercept)\",\"Estimate\":-0.1262,\"Std. Error\":0.1801,\"t value\":-0.7009,\"Pr(>|t|)\":0.5032},{\"name\":\"x\",\"Estimate\":2.3415,\"Std. Error\":0.4439,\"t value\":5.2752,\"Pr(>|t|)\":0.0008}]}"
(This example also shows off a few other features)
Handling binary inputs and outputs is supported, provided that you
can deal with them in R. In this example we’ll use R’s serialisation
format (rds; see ?serialize and ?saveRDS) as an example, but this
approach would work equally well with Excel spreadsheets, zip files or
any other non-text data that you work with.
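As a quick illustration of the round trip this relies on, the sketch below serialises an R object to a raw vector and restores it again using base R only. The data frame is made up for the example.

```r
obj <- data.frame(x = 1:3, y = c(2.5, 4.1, 6.2))

bin <- serialize(obj, NULL)   # NULL connection: return a raw vector
restored <- unserialize(bin)

is.raw(bin)                # TRUE
identical(restored, obj)   # TRUE
```

A raw vector like bin is what gets sent as the request body in the example that follows.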
In this example we’ll take some serialised R data and create a png plot as output. We’ll start by writing our target function:
binary_plot <- function(data, width = 400, height = 400) {
  data <- unserialize(data)
  tmp <- tempfile(fileext = ".png")
  on.exit(unlink(tmp))
  png(tmp, width = width, height = height)
  tryCatch(
    plot(data),
    finally = dev.off())
  readBin(tmp, raw(), n = file.size(tmp))
}
Here, we use unserialize to convert the incoming binary data into
something usable, then plot to a temporary file (which we clean up
later, using on.exit). Using tryCatch(..., finally = dev.off()) ensures
that even if the plotting fails, the device will be closed. Finally,
readBin reads that temporary file into a raw vector.
So, for example (using str to limit what is printed to screen):
bin <- serialize(data, NULL)
str(binary_plot(bin), vec.len = 10)
#> raw [1:7682] 89 50 4e 47 0d 0a 1a 0a 00 00 ...
It’s hard to tell this is a png, but the first few bytes give it away
(the magic number 89 50 4e 47 0d 0a 1a 0a is used at the start of all
png files).
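That check on the leading bytes can be turned into a small helper, which is handy when testing an endpoint that returns an image. A minimal sketch in base R; is_png is a hypothetical name, not part of porcelain.

```r
# The eight-byte PNG signature quoted above
png_magic <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))

# Hypothetical helper: does a raw vector start with the PNG signature?
is_png <- function(bytes) {
  is.raw(bytes) && length(bytes) >= 8 && identical(bytes[1:8], png_magic)
}

is_png(c(png_magic, as.raw(0)))   # TRUE
is_png(charToRaw("hello world"))  # FALSE
```

Applied to the output of binary_plot(bin), or to the body of the response, this gives a cheap sanity check without decoding the image.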
endpoint_plot <- porcelain::porcelain_endpoint$new(
  "POST", "/plot", binary_plot,
  porcelain::porcelain_input_body_binary("data"),
  returning = porcelain::porcelain_returning_binary())
api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_plot)
Making the request (again using str to prevent printing thousands of
hex characters):
str(api$request("POST", "/plot", body = bin,
                content_type = "application/octet-stream"),
    vec.len = 10)
#> List of 3
#> $ status : int 200
#> $ headers:List of 3
#> ..$ x-request-id : chr "48af63f2-e20a-4073-a6da-2346bcbda159"
#> ..$ Content-Type : chr "application/octet-stream"
#> ..$ X-Porcelain-Validated: chr "true"
#> $ body : raw [1:7682] 89 50 4e 47 0d 0a 1a 0a 00 00 ...