The traduire R package provides a wrapper around the i18next JavaScript library.
It presents an alternative interface to R’s built-in
internationalisation functions, with a focus on the ability to change
the target language within a session. Currently the package presents
only a stripped down interface to the underlying library, though this
may expand in future.
First, prepare a json file with your translations. For example, the included file examples/simple.json contains:
{
  "en": {
    "translation": {
      "hello": "hello world",
      "query": "how are you?",
      "interpolate": "{{what}} is {{how}}"
    }
  },
  "fr": {
    "translation": {
      "hello": "bonjour le monde",
      "query": "ça va ?",
      "interpolate": "{{what}} est {{how}}"
    }
  }
}
We can create a translator, setting the default language to English (en) as:
tr <- traduire::i18n(path, language = "en")
tr
## <i18n>
## Public:
## add_resource_bundle: function (language, namespace, resources, deep = FALSE, overwrite = FALSE)
## default_namespace: function ()
## exists: function (string, data = NULL, language = NULL, count = NULL,
## get_resource: function (language, namespace, key, sep = ".")
## has_resource_bundle: function (language, namespace)
## initialize: function (resources, options)
## language: function ()
## languages: function ()
## load_languages: function (languages)
## load_namespaces: function (namespaces)
## options: function ()
## replace: function (text, ...)
## set_default_namespace: function (namespace)
## set_language: function (language)
## t: function (string, data = NULL, language = NULL, count = NULL,
## Private:
## context: V8, environment
With this object we can perform translations with the t method by passing in a key from within our translations:
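A minimal sketch of such a call, assuming the bundled examples/simple.json file can be located with system.file (the lookup below is an assumption about where the file is installed):

```r
# Assumption: the example file ships at this location inside traduire.
path <- system.file("examples/simple.json", package = "traduire",
                    mustWork = TRUE)
tr <- traduire::i18n(path, language = "en")

# Look up the key "hello" in the translator's current language (English).
greeting <- tr$t("hello")
print(greeting)
```

Given the translations listed earlier, the key hello should resolve to the English string.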
Specify the language argument to change language:
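As a sketch under the same assumption about the location of examples/simple.json, the language can be overridden for a single call:

```r
# Assumption: the example file ships at this location inside traduire.
path <- system.file("examples/simple.json", package = "traduire",
                    mustWork = TRUE)
tr <- traduire::i18n(path, language = "en")

# The translator defaults to English; request French for this lookup.
greeting_fr <- tr$t("hello", language = "fr")
print(greeting_fr)
```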
String interpolation is done using a syntax very similar to glue (see the i18next documentation).
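A minimal sketch of interpolation, assuming the values for the {{what}} and {{how}} placeholders are supplied as a named list in the data argument (as in the pluralisation calls later in this document) and that examples/simple.json is installed where system.file looks for it:

```r
# Assumption: the example file ships at this location inside traduire.
path <- system.file("examples/simple.json", package = "traduire",
                    mustWork = TRUE)
tr <- traduire::i18n(path, language = "en")

# Names in the list are matched against the {{...}} placeholders.
sentence_en <- tr$t("interpolate", list(what = "i18next", how = "easy"))
sentence_fr <- tr$t("interpolate", list(what = "i18next", how = "facile"),
                    language = "fr")
print(c(sentence_en, sentence_fr))
```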
The example here is derived from a web API that we developed the package to support. We wanted the service to validate incoming data and tell the user what to fix; if the data is missing one or more columns, we report the columns that are missing. This requires different translations for the singular case (“Data missing column X”) and plural (“Data missing columns X, Y”).
The translation file looks like:
{
  "en": {
    "translation": {
      "nocols": "Data missing column {{missing}}",
      "nocols_plural": "Data missing columns {{missing}}"
    }
  },
  "fr": {
    "translation": {
      "nocols": "Les données sont manquantes colonne {{missing}}",
      "nocols_plural": "Les données sont manquantes colonnes {{missing}}"
    }
  }
}
where the _plural suffix tells i18next which string to return in the plural case, and the count argument determines whether the singular or the plural string is used.
Then we can use this as follows. Pluralisation of results is supported using keys that include the _plural suffix (see the i18next documentation) and by passing a count argument in to the translation:
tr$t("nocols", list(missing = "A"), count = 1)
## [1] "Data missing column A"
tr$t("nocols", list(missing = "A, B"), count = 2)
## [1] "Data missing columns A, B"
or, changing the language:
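A minimal sketch of the same calls in French, assuming the translations shown above are passed to traduire::i18n as json text (passing text rather than a file path is an assumption here):

```r
# The translations from the file above, in compact form.
translations <- '{
  "en": {"translation": {
    "nocols": "Data missing column {{missing}}",
    "nocols_plural": "Data missing columns {{missing}}"}},
  "fr": {"translation": {
    "nocols": "Les données sont manquantes colonne {{missing}}",
    "nocols_plural": "Les données sont manquantes colonnes {{missing}}"}}
}'
tr <- traduire::i18n(translations, language = "en")

# count selects between "nocols" and "nocols_plural"; language selects French.
one_fr <- tr$t("nocols", list(missing = "A"), count = 1, language = "fr")
many_fr <- tr$t("nocols", list(missing = "A, B"), count = 2, language = "fr")
print(c(one_fr, many_fr))
```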
To illustrate fallback languages, we use a list of translations of “Hello world!” that includes many languages.
Most simply, if we want to fall back onto a single language for all translations, we can provide a fallback language as a string:
tr <- traduire::i18n(path_hello, fallback = "it")
tr$t("hello", language = "unknown")
## [1] "Ciao Mondo!"
Alternatively, a chain of languages to try can be provided:
tr <- traduire::i18n(path_hello, fallback = c("a", "b", "de"))
tr$t("hello", language = "unknown")
## [1] "Hallo Welt!"
If you want to have different fallback languages for different target languages, provide a named list of mappings (each of which can be a scalar or vector of fallback languages as above):
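A hedged sketch of such a mapping follows. The translations are a small stand-in for the full list, passed as json text. The language codes in the mapping are illustrative, and the "default" entry follows the i18next convention for a catch-all fallback; whether traduire forwards it unchanged is an assumption.

```r
# Hypothetical, reduced set of translations (not the file used above).
translations <- '{
  "en": {"translation": {"hello": "Hello world!"}},
  "fr": {"translation": {"hello": "Salut le monde !"}}
}'

# Requests for "co" fall back to French; anything else falls back to English
# (the "default" entry is an assumption based on i18next's fallback options).
tr <- traduire::i18n(translations,
                     fallback = list(co = "fr", default = "en"))

via_fr <- tr$t("hello", language = "co")
via_default <- tr$t("hello", language = "unknown")
print(c(via_fr, via_default))
```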
The motivating use case we had was translating a json file for use in an upstream web application, so the text to translate might contain data like:
{
"id": "area_scope",
"label": "element_label",
"type": "multiselect",
"description": "element_description"
}
where the json contains a mix of elements to be internationalised (such as the values of label and description) and elements to be left as-is (such as the values of id and type). The snippet above is a simplified version of the full data where the values to translate might occur at any depth within the json.
To support this, the i18n object has a replace method, which performs string replacement of text wrapped in t_(...). So we rewrite our json:
string <- '{
"id": "area_scope",
"label": "t_(element_label)",
"type": "multiselect",
"description": "t_(element_description)"
}'
and we provide a set of translations:
translations <- '{
"en": {
"translation": {
"element_label": "Country",
"element_description": "Select your countries"
}
},
"fr": {
"translation": {
"element_label": "Payes",
"element_description": "Sélectionnez vos payes"
}
}
}'
We construct a translator object with these translations:
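A minimal sketch of this step, assuming traduire::i18n accepts the json text directly (the translations are repeated in compact form so that the block runs on its own):

```r
# Same content as the translations above, in compact form.
translations <- '{
  "en": {"translation": {
    "element_label": "Country",
    "element_description": "Select your countries"}},
  "fr": {"translation": {
    "element_label": "Payes",
    "element_description": "Sélectionnez vos payes"}}
}'

# English is set as the default language for later calls.
tr <- traduire::i18n(translations, language = "en")
label <- tr$t("element_label")
print(label)
```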
We can then use the replace method to translate all strings (wrapped here in writeLines to make it easier to read with all the json quotes):
writeLines(tr$replace(string))
## {
## "id": "area_scope",
## "label": "Country",
## "type": "multiselect",
## "description": "Select your countries"
## }
or, into French:
writeLines(tr$replace(string, language = "fr"))
## {
## "id": "area_scope",
## "label": "Payes",
## "type": "multiselect",
## "description": "Sélectionnez vos payes"
## }
Note that while the input text here is json, it could be anything at all, and will not be parsed as json.
We provide an optional workflow for using translations within a package, or some other piece of code where the translations will be fairly invasive to add, allowing you to write essentially t_("key", ...) and have all the ... arguments forwarded to the appropriate translator object. There are several details here:
To do this, we allow packages (or other similar code) to “register” a translator with a call like traduire::translator_register(resources, "en"), where resources is passed to traduire::i18n.
Here we show a complete example package that implements “hello-world-as-a-service”, i.e., a small web service that will reply with a version of “Hello world!” translated into the client’s choice of language.
The full package is included as an example within traduire at system.file("hello", package = "traduire") and is laid out as:
|-+= R
| |--= api.R
| \--= hello.R
|-+= inst
| |--= README.md
| |--= plumber.R
| \--= traduire.json
|-+= man
| |--= api.Rd
| \--= hello.Rd
|--= DESCRIPTION
|--= LICENSE
\--= NAMESPACE
Below is the code in hello.R, which can say rough translations of “hello world” in a number of languages:
hello <- function(...) {
  cowsay::say("Hello", "cow", ...)
}

world <- function(language = "en", ...) {
  cowsay::say(t_("hello", language = language), "cow", ...)
}

monde <- function(...) {
  cowsay::say(t_("hello"), ...)
}

.onLoad <- function(...) {
  path <- system.file("traduire.json", package = "hello", mustWork = TRUE)
  traduire::translator_register(path, "en")
}
Here:
hello is a simple function that does no translation;
world is a function that translates with an explicit language argument, but finds the translations automagically;
monde is a function that translates and finds both the translations and the language automagically.
The .onLoad function contains a call to traduire::translator_register, which registers a translator database for the package. All calls to t_ that come from this package will use this registered translator.
Why would we want to do this? If we were using plumber to build an API we might want to allow the requests to come in with a header indicating the language. Our plumber api might look like:
#' @get /
#' @html
function(res, req) {
  language <- as.list(req$HEADERS)[["accept-language"]]
  paste0(hello::world(language, type = "string"), "\n")
}

#' @get /hello/<animal>
#' @html
function(res, req, animal) {
  paste0(hello::monde(by = animal, type = "string"), "\n")
}
The first endpoint inspects the endpoint’s req object to get the requested language, but the second gets it automagically. This can be understood by looking at the code used to run the API:
# Build the plumber router and attach hooks that set and reset the language
api <- function(port = 8888) {
  path <- system.file("plumber.R", package = "hello", mustWork = TRUE)
  pr <- plumber::plumb(path)
  pr$registerHook("preroute", api_set_language)
  pr$registerHook("postserialize", api_reset_language)
  pr$run(port = port)
}

# Before routing: set the language from the accept-language header, if present
api_set_language <- function(data, req, res) {
  if ("accept-language" %in% names(req$HEADERS)) {
    language <- req$HEADERS[["accept-language"]]
    data$reset_language <- traduire::translator_set_language(language)
  }
}

# After serialising: restore the previous language and pass the value through
api_reset_language <- function(data, req, res, value) {
  if (!is.null(data$reset_language)) {
    data$reset_language()
  }
  value
}
So at the beginning of each api request we call traduire::translator_set_language (which affects only this package) in a “preroute” hook, and we reset the language in the “postserialize” hook.
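The set-then-reset pattern used by the hooks can be sketched in base R. This is an illustration only: state, set_language and handle_request are hypothetical names, and the code does not reproduce traduire's internals. It only mirrors the idea that the setter returns a function which restores the previous language.

```r
# Hypothetical stand-in for package-level translator state.
state <- new.env()
state$language <- "en"

# Set a new language and return a function that restores the previous one.
set_language <- function(language) {
  previous <- state$language
  state$language <- language
  function() {
    state$language <- previous
    invisible(previous)
  }
}

# Simulate one request: set the language first, reset it on the way out.
handle_request <- function(language) {
  reset <- set_language(language)  # plays the role of the "preroute" hook
  on.exit(reset())                 # plays the role of the "postserialize" hook
  state$language                   # language in effect while handling
}

seen <- handle_request("fr")
after <- state$language
print(c(seen = seen, after = after))
```

Returning the reset function from the setter means the caller does not need to remember what the previous language was.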
The full package is available at system.file("hello", package = "traduire"). If you run the API, it can be used like:
$ curl -H "Accept-Language: fr" http://localhost:8888
-----
Salut le monde !
------
\ ^__^
\ (oo)\ ________
(__)\ )\ /\
||------w|
|| ||
$ curl -H "Accept-Language: en" http://localhost:8888
-----
Hello world!
------
\ ^__^
\ (oo)\ ________
(__)\ )\ /\
||------w|
|| ||
$ curl -H "Accept-Language: ko" http://localhost:8888/hello/cat
--------------
반갑다 세상아
--------------
\
\
\
|\___/|
==) ^Y^ (==
\ ^ /
)=*=(
/ \
| |
/| | | |\
\| | |_|/\
jgs //_// ___/
\_)
This section outlines how to write the translation (json) files, alongside a discussion of using namespaces. Consider again the first example:
{
  "en": {
    "translation": {
      "hello": "hello world",
      "query": "how are you?",
      "interpolate": "{{what}} is {{how}}"
    }
  },
  "fr": {
    "translation": {
      "hello": "bonjour le monde",
      "query": "ça va ?",
      "interpolate": "{{what}} est {{how}}"
    }
  }
}
In this format, the top level keys (en, fr) represent languages and the next level key (translation), which appears redundant, represents a namespace. A translation set can have multiple namespaces, which can help with organising a large set of strings, and can be used to split the file up over smaller files that might be easier to work with (see below).
Below, we have a file with two namespaces, common and login. These might represent strings used throughout the application and in a login component, for example.
{
  "en": {
    "common": {
      "hello": "hello world"
    },
    "login": {
      "username": "Username",
      "password": "Password"
    }
  },
  "fr": {
    "common": {
      "hello": "salut le monde"
    },
    "login": {
      "username": "Nom d'utilisateur",
      "password": "Mot de passe"
    }
  }
}
When constructing the translator object we can provide a default namespace (it defaults to translation):
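A minimal sketch, assuming the constructor argument is named default_namespace and that json text can be passed directly; the translations repeat a subset of the file above:

```r
# Subset of the two-namespace file above, in compact form.
translations <- '{
  "en": {"common": {"hello": "hello world"},
         "login": {"password": "Password"}},
  "fr": {"common": {"hello": "salut le monde"},
         "login": {"password": "Mot de passe"}}
}'

# Assumption: default_namespace is the name of the constructor argument.
tr <- traduire::i18n(translations, language = "en",
                     default_namespace = "common")

# A key without a namespace prefix is looked up in "common".
hello_fr <- tr$t("hello", language = "fr")
print(hello_fr)
```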
Keys that are provided without an explicit namespace will be looked up in the default namespace. Alternatively, provide a namespace when looking up keys:
tr$t("common:hello", language = "fr")
## [1] "salut le monde"
tr$t("login:username", language = "fr")
## [1] "Nom d'utilisateur"
So far, this brings relatively little advantage: our file, while structured, is still going to end up really large as all the strings end up in it. So we might want to break it up like so:
structured
|--= en-common.json
|--= en-login.json
|--= fr-common.json
\--= fr-login.json
where each file is organised like:
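The file contents are not reproduced here. As a hedged sketch, the block below assumes each file holds only the key-to-string mapping for one language and one namespace (the language and namespace being carried by the file name), and writes such files to a temporary directory. The directory is called path, matching the name used in the loading code that follows.

```r
# Assumed layout: one flat json object per language/namespace pair.
bundles <- list(
  "en-common.json" = '{"hello": "hello world"}',
  "en-login.json"  = '{"username": "Username", "password": "Password"}',
  "fr-common.json" = '{"hello": "salut le monde"}',
  "fr-login.json"  = '{"password": "Mot de passe"}'
)

# Write the files into a fresh temporary directory.
path <- tempfile("structured")
dir.create(path)
for (name in names(bundles)) {
  writeLines(bundles[[name]], file.path(path, name))
}

# Show one of the files.
writeLines(readLines(file.path(path, "en-login.json")))
```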
To allow this, we need to load the files one by one into the translation object, rather than as a single resource bundle. To do this, we can use the add_resource_bundle method:
obj <- traduire::i18n(NULL)
obj$add_resource_bundle("en", "common", file.path(path, "en-common.json"))
obj$add_resource_bundle("en", "login", file.path(path, "en-login.json"))
obj$add_resource_bundle("fr", "common", file.path(path, "fr-common.json"))
obj$add_resource_bundle("fr", "login", file.path(path, "fr-login.json"))
obj$t("login:password", language = "fr")
## [1] "Mot de passe"
This is clearly going to be error prone to do with a large number of translation files, though a loop could help:
obj <- traduire::i18n(NULL)
for (language in c("en", "fr")) {
  for (namespace in c("common", "login")) {
    bundle <- file.path(path, sprintf("%s-%s.json", language, namespace))
    obj$add_resource_bundle(language, namespace, bundle)
  }
}
obj$t("login:password", language = "fr")
## [1] "Mot de passe"
An alternative is to pass in the pattern used to locate these files, though this approach works best if you also declare your namespaces and languages up front. The pattern uses glue’s syntax, and the pattern must include placeholders language and namespace (and no others):
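A hedged sketch of this approach follows. The argument names resource_pattern, languages and namespaces are assumptions about the traduire::i18n signature, and the per-file layout (a flat key-to-string json object) is also assumed; the files are written to a temporary directory so that the block runs on its own.

```r
# Assumed per-file layout: one flat json object per language/namespace pair.
bundles <- list(
  "en-common.json" = '{"hello": "hello world"}',
  "en-login.json"  = '{"password": "Password"}',
  "fr-common.json" = '{"hello": "salut le monde"}',
  "fr-login.json"  = '{"password": "Mot de passe"}'
)
path <- tempfile("structured")
dir.create(path)
for (name in names(bundles)) {
  writeLines(bundles[[name]], file.path(path, name))
}

# glue-style placeholders: exactly {language} and {namespace}.
pattern <- file.path(path, "{language}-{namespace}.json")

# Assumption: these are the argument names accepted by traduire::i18n.
obj <- traduire::i18n(NULL, resource_pattern = pattern,
                      languages = c("en", "fr"),
                      namespaces = c("common", "login"))

password_fr <- obj$t("login:password", language = "fr")
print(password_fr)
```

Declaring the languages and namespaces up front lets the translator know which files the pattern can resolve to.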