nassimhaddad/dplyr-backends.md

## dplyr-backends.md

      
dplyr is a well-known R package for working with structured data, whether in memory, in a database or, more recently, on a cluster. The in-memory implementations generally have capabilities that the others lack, so the notion of a backend is used here with some poetic license. Even the different database and cluster backends differ in subtle ways. But it sure is better than writing SQL directly! Below is a list of backends, along with the packages that implement them where a separate package is needed. I've done my best to point to active projects, but I am not endorsing any of them. Do your own testing. Enjoy, and please contribute any corrections or additions in the comments.


| Backend | Package |
| --- | --- |
| data.frame | builtin |
| data.table | builtin |
| arrays | builtin |
| SQLite | builtin |
| PostgreSQL/Redshift | builtin |
| MySQL/MariaDB | builtin |
| Bigquery | bigrquery |
| MonetDB | MonetDB.R |
| Presto | RPresto |
| Spark | dplyr.spark.hive |
| Hive | dplyr.spark.hive |
| Impala | dplyrimpaladb |
| Vertica | vertica.dplyr |
| Teradata | teradata.dplyr |
| Calcite | dplyr-calcite |
| SQL Server | RSQLServer |
| Netezza | dplyrnz |
| multidplyr | multidplyr |
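
To make the idea of a backend concrete, here is a minimal sketch that runs the same dplyr pipeline against an in-memory data.frame and against an in-memory SQLite database. It assumes a recent dplyr, where database access goes through the DBI, dbplyr and RSQLite packages (older releases shipped the SQLite backend inside dplyr itself). The `flights_df` data and the `summarise_delay()` helper are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data, used only for this illustration.
flights_df <- data.frame(
  carrier = c("AA", "AA", "UA", "UA", "DL"),
  delay   = c(10, 20, 5, 15, 30),
  stringsAsFactors = FALSE
)

# One pipeline, written once, with no reference to where the data lives.
summarise_delay <- function(tbl) {
  tbl %>%
    group_by(carrier) %>%
    summarise(mean_delay = mean(delay, na.rm = TRUE)) %>%
    arrange(carrier)
}

# Backend 1: plain in-memory data.frame.
local_result <- summarise_delay(flights_df)

# Backend 2: SQLite (assumes DBI, dbplyr and RSQLite are installed).
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "flights", flights_df)
flights_db <- tbl(con, "flights")

db_query <- summarise_delay(flights_db)  # lazy: nothing has run yet
show_query(db_query)                     # print the SQL generated for this backend
db_result <- collect(db_query)           # run the query and pull the rows into R

DBI::dbDisconnect(con)
```

The pipeline is identical in both cases; only the object passed in changes. Not every R function translates to SQL, which is one of the subtle differences between backends mentioned above, so it is worth inspecting the generated query with `show_query()` when testing a new backend.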