Title: | DBI Package for the DuckDB Database Management System |
---|---|
Description: | The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and a R Database Interface (DBI) connector. |
Authors: | Hannes Mühleisen [aut, cre] , Mark Raasveldt [aut] , Stichting DuckDB Foundation [cph], Apache Software Foundation [cph], PostgreSQL Global Development Group [cph], The Regents of the University of California [cph], Cameron Desrochers [cph], Victor Zverovich [cph], RAD Game Tools [cph], Valve Software [cph], Rich Geldreich [cph], Tenacious Software LLC [cph], The RE2 Authors [cph], Google Inc. [cph], Facebook Inc. [cph], Steven G. Johnson [cph], Jiahao Chen [cph], Tony Kelman [cph], Jonas Fonseca [cph], Lukas Fittl [cph], Salvatore Sanfilippo [cph], Art.sy, Inc. [cph], Oran Agra [cph], Redis Labs, Inc. [cph], Melissa O'Neill [cph], PCG Project contributors [cph] |
Maintainer: | Hannes Mühleisen <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.9.1 |
Built: | 2024-12-05 06:13:30 UTC |
Source: | https://github.com/duckdb/duckdb-r |
R client package for DuckDB: an embeddable SQL OLAP Database Management System.
duckdb()
for connection instructions.
https://duckdb.org/ for the project website.
This is a SQL backend for dbplyr tailored to take into account DuckDB's possibilities. This mainly follows the backend for PostgreSQL, but contains more mapped functions.
simulate_duckdb(...) translate_duckdb(...)
simulate_duckdb(...) translate_duckdb(...)
... |
Any parameters to be forwarded |
library(dplyr, warn.conflicts = FALSE) con <- DBI::dbConnect(duckdb(), path = ":memory:") dbiris <- copy_to(con, iris, overwrite = TRUE) dbiris %>% select(Petal.Length, Petal.Width) %>% filter(Petal.Length > 1.5) %>% head(5) DBI::dbDisconnect(con, shutdown = TRUE)
library(dplyr, warn.conflicts = FALSE) con <- DBI::dbConnect(duckdb(), path = ":memory:") dbiris <- copy_to(con, iris, overwrite = TRUE) dbiris %>% select(Petal.Length, Petal.Width) %>% filter(Petal.Length > 1.5) %>% head(5) DBI::dbDisconnect(con, shutdown = TRUE)
duckdb()
creates or reuses a database instance.
duckdb_shutdown()
shuts down a database instance.
Return an adbcdrivermanager::adbc_driver()
for use with Arrow Database
Connectivity via the adbcdrivermanager package.
dbConnect()
connects to a database instance.
dbDisconnect()
closes a DuckDB database connection, optionally shutting down
the associated instance.
duckdb( dbdir = DBDIR_MEMORY, read_only = FALSE, bigint = "numeric", config = list() ) duckdb_shutdown(drv) duckdb_adbc() ## S4 method for signature 'duckdb_driver' dbConnect( drv, dbdir = DBDIR_MEMORY, ..., debug = getOption("duckdb.debug", FALSE), read_only = FALSE, timezone_out = "UTC", tz_out_convert = c("with", "force"), config = list(), bigint = "numeric" ) ## S4 method for signature 'duckdb_connection' dbDisconnect(conn, ..., shutdown = FALSE)
duckdb( dbdir = DBDIR_MEMORY, read_only = FALSE, bigint = "numeric", config = list() ) duckdb_shutdown(drv) duckdb_adbc() ## S4 method for signature 'duckdb_driver' dbConnect( drv, dbdir = DBDIR_MEMORY, ..., debug = getOption("duckdb.debug", FALSE), read_only = FALSE, timezone_out = "UTC", tz_out_convert = c("with", "force"), config = list(), bigint = "numeric" ) ## S4 method for signature 'duckdb_connection' dbDisconnect(conn, ..., shutdown = FALSE)
dbdir |
Location for database files. Should be a path to an existing directory in the file system. With the default, all data is kept in RAM |
read_only |
Set to |
bigint |
How 64-bit integers should be returned, default is double/numeric. Set to integer64 for bit64 encoding. |
config |
Named list with DuckDB configuration flags |
drv |
Object returned by |
... |
Ignored |
debug |
Print additional debug information such as queries |
timezone_out |
The time zone returned to R, defaults to |
tz_out_convert |
How to convert timestamp columns to the timezone specified
in |
conn |
A |
shutdown |
Set to |
duckdb()
returns an object of class duckdb_driver.
dbDisconnect()
and duckdb_shutdown()
are called for their
side effect.
An object of class "adbc_driver"
dbConnect()
returns an object of class
duckdb_connection.
library(adbcdrivermanager) with_adbc(db <- adbc_database_init(duckdb_adbc()), { as.data.frame(read_adbc(db, "SELECT 1 as one;")) }) drv <- duckdb() con <- dbConnect(drv) dbGetQuery(con, "SELECT 'Hello, world!'") dbDisconnect(con) duckdb_shutdown(drv) # Shorter: con <- dbConnect(duckdb()) dbGetQuery(con, "SELECT 'Hello, world!'") dbDisconnect(con, shutdown = TRUE)
library(adbcdrivermanager) with_adbc(db <- adbc_database_init(duckdb_adbc()), { as.data.frame(read_adbc(db, "SELECT 1 as one;")) }) drv <- duckdb() con <- dbConnect(drv) dbGetQuery(con, "SELECT 'Hello, world!'") dbDisconnect(con) duckdb_shutdown(drv) # Shorter: con <- dbConnect(duckdb()) dbGetQuery(con, "SELECT 'Hello, world!'") dbDisconnect(con, shutdown = TRUE)
Get the Substrait plan for a SQL query Transforms a SQL query into a raw vector containing the serialized Substrait query blob
duckdb_get_substrait(conn, query, enable_optimizer = TRUE)
duckdb_get_substrait(conn, query, enable_optimizer = TRUE)
conn |
A DuckDB connection, created by |
query |
The query string in SQL |
enable_optimizer |
Optional parameter to enable/disable query-optimizer. By default optimizer is enabled. |
A raw vector containing the substrait protobuf blob
Get the Substrait plan for a SQL query in the JSON format Transforms a SQL query into a vector containing the serialized Substrait query JSON
duckdb_get_substrait_json(conn, query, enable_optimizer = TRUE)
duckdb_get_substrait_json(conn, query, enable_optimizer = TRUE)
conn |
A DuckDB connection, created by |
query |
The query string in SQL |
enable_optimizer |
Optional parameter to enable/disable query-optimizer. By default optimizer is enabled. |
A vector containing the substrait protobuf JSON
Query DuckDB using Substrait Method for interpreting a Substrait BLOB plan as a DuckDB Query Plan It interprets and executes the query.
duckdb_prepare_substrait(conn, query, arrow = FALSE)
duckdb_prepare_substrait(conn, query, arrow = FALSE)
conn |
A DuckDB connection, created by |
query |
The Protobuf-encoded Substrait Query Plan. Qack! |
arrow |
Whether the result should be in Arrow format |
A DuckDB Query Result
Query DuckDB using Substrait Method for interpreting a Substrait JSON plan as a DuckDB Query Plan It interprets and executes the query.
duckdb_prepare_substrait_json(conn, json, arrow = FALSE)
duckdb_prepare_substrait_json(conn, json, arrow = FALSE)
conn |
A DuckDB connection, created by |
json |
The Json Query Plan. Qack! |
arrow |
Whether the result should be in Arrow format |
A DuckDB Query Result
Directly reads a CSV file into DuckDB, tries to detect and create the correct schema for it. This usually is much faster than reading the data into R and writing it to DuckDB.
duckdb_read_csv( conn, name, files, header = TRUE, na.strings = "", nrow.check = 500, delim = ",", quote = "\"", col.names = NULL, lower.case.names = FALSE, sep = delim, transaction = TRUE, ... )
duckdb_read_csv( conn, name, files, header = TRUE, na.strings = "", nrow.check = 500, delim = ",", quote = "\"", col.names = NULL, lower.case.names = FALSE, sep = delim, transaction = TRUE, ... )
conn |
A DuckDB connection, created by |
name |
The name for the virtual table that is registered or unregistered |
files |
One or more CSV file names, should all have the same structure though |
header |
Whether or not the CSV files have a separate header in the first line |
na.strings |
Which strings in the CSV files should be considered to be NULL |
nrow.check |
How many rows should be read from the CSV file to figure out data types |
delim |
Which field separator should be used |
quote |
Which quote character is used for columns in the CSV file |
col.names |
Override the detected or generated column names |
lower.case.names |
Transform column names to lower case |
sep |
Alias for delim for compatibility |
transaction |
Should a transaction be used for the entire operation |
... |
Passed on to |
The number of rows in the resulted table, invisibly.
con <- dbConnect(duckdb()) data <- data.frame(a = 1:3, b = letters[1:3]) path <- tempfile(fileext = ".csv") write.csv(data, path, row.names = FALSE) duckdb_read_csv(con, "data", path) dbReadTable(con, "data") dbDisconnect(con)
con <- dbConnect(duckdb()) data <- data.frame(a = 1:3, b = letters[1:3]) path <- tempfile(fileext = ".csv") write.csv(data, path, row.names = FALSE) duckdb_read_csv(con, "data", path) dbReadTable(con, "data") dbDisconnect(con)
duckdb_register()
registers a data frame as a virtual table (view)
in a DuckDB connection.
No data is copied.
duckdb_register(conn, name, df, overwrite = FALSE, experimental = FALSE) duckdb_unregister(conn, name)
duckdb_register(conn, name, df, overwrite = FALSE, experimental = FALSE) duckdb_unregister(conn, name)
conn |
A DuckDB connection, created by |
name |
The name for the virtual table that is registered or unregistered |
df |
A |
overwrite |
Should an existing registration be overwritten? |
experimental |
Enable experimental optimizations |
duckdb_unregister()
unregisters a previously registered data frame.
These functions are called for their side effect.
con <- dbConnect(duckdb()) data <- data.frame(a = 1:3, b = letters[1:3]) duckdb_register(con, "data", data) dbReadTable(con, "data") duckdb_unregister(con, "data") dbDisconnect(con)
con <- dbConnect(duckdb()) data <- data.frame(a = 1:3, b = letters[1:3]) duckdb_register(con, "data", data) dbReadTable(con, "data") duckdb_unregister(con, "data") dbDisconnect(con)
duckdb_register_arrow()
registers an Arrow data source as a virtual table (view)
in a DuckDB connection.
No data is copied.
duckdb_register_arrow(conn, name, arrow_scannable, use_async = NULL) duckdb_unregister_arrow(conn, name) duckdb_list_arrow(conn)
duckdb_register_arrow(conn, name, arrow_scannable, use_async = NULL) duckdb_unregister_arrow(conn, name) duckdb_list_arrow(conn)
conn |
A DuckDB connection, created by |
name |
The name for the virtual table that is registered or unregistered |
arrow_scannable |
A scannable Arrow-object |
use_async |
Switched to the asynchronous scanner. (deprecated) |
duckdb_unregister_arrow()
unregisters a previously registered data frame.
These functions are called for their side effect.