Table of contents
Basic Shiny app without data storage
Local vs remote storage
Persistent data storage methods
Store arbitrary data in a file
Local file system (local)
Dropbox (remote)
Amazon S3 (remote)
Store structured data in a table
SQLite (local)
MySQL (local or remote)
Google Sheets (remote)
}
)
The above code is taken from a guide on how to mimic a Google form with Shiny.
The above app is very simple: there is a table that shows all responses, three input fields, and a
Submit button that takes the data in the input fields and saves it. You might notice that two
functions are used in the app but never defined: saveData(data) and loadData().
These two functions are the only code that affects how the data is stored and retrieved, and we will
redefine them for each data storage type. To make the app work for now, here's a trivial
implementation of the save and load functions that simply stores responses in the current R session.
saveData <- function(data) {
  data <- as.data.frame(t(data))
  if (exists("responses")) {
    responses <<- rbind(responses, data)
  } else {
    responses <<- data
  }
}

loadData <- function() {
  if (exists("responses")) {
    responses
  }
}
Before continuing further, make sure this basic app works for you and that you understand every
line in it; it is not difficult, but take the two minutes to go through it. The code for this app is also
available as a gist, and you can run it either by copying all the code to your RStudio IDE or by
running shiny::runGist("c4db11d81f3c46a7c4a5").
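The app code itself is not reproduced above, but based on the description (a table of all responses, three input fields, and a Submit button wired to saveData and loadData), a minimal sketch could look like the following. The field names here are made up for illustration, and the trivial in-session saveData/loadData are included so the snippet runs on its own; the real app on the gist may differ in its details.

```r
library(shiny)

# Hypothetical form fields -- the actual app's field names may differ
fields <- c("name", "email", "comment")

# Trivial in-session storage so this sketch is self-contained
saveData <- function(data) {
  data <- as.data.frame(t(data))
  if (exists("responses")) {
    responses <<- rbind(responses, data)
  } else {
    responses <<- data
  }
}
loadData <- function() {
  if (exists("responses")) responses
}

app <- shinyApp(
  ui = fluidPage(
    tableOutput("responses"), tags$hr(),
    textInput("name", "Name"),
    textInput("email", "Email"),
    textInput("comment", "Comment"),
    actionButton("submit", "Submit")
  ),
  server = function(input, output, session) {
    # Gather the current values of all form fields
    formData <- reactive({
      sapply(fields, function(x) input[[x]])
    })
    # Save the form data when Submit is clicked
    observeEvent(input$submit, {
      saveData(formData())
    })
    # Show all previous responses; re-render whenever Submit is clicked
    output$responses <- renderTable({
      input$submit
      loadData()
    })
  }
)
```

The key design point is that the server code only ever calls saveData() and loadData(), so swapping the storage backend never requires touching the UI or the reactive logic.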
| Method            | Data type            | Local storage | Remote storage | R package    |
|-------------------|----------------------|---------------|----------------|--------------|
| Local file system | Arbitrary data       | YES           |                |              |
| Dropbox           | Arbitrary data       |               | YES            | rdrop2       |
| Amazon S3         | Arbitrary data       |               | YES            | RAmazonS3    |
| SQLite            | Structured data      | YES           |                | RSQLite      |
| MySQL             | Structured data      | YES           | YES            | RMySQL       |
| Google Sheets     | Structured data      |               | YES            | googlesheets |
| MongoDB           | Semi-structured data | YES           | YES            | rmongodb     |
current server. To load the data, we simply load all the files in the output directory. In our specific
example, we also want to concatenate all of the data files together into one data.frame.
Setup: The only setup required is to create an output directory (responses in this case) and to ensure
that the Shiny app has file permissions to read/write in that directory.
Code:
outputDir <- "responses"

saveData <- function(data) {
  data <- t(data)
  # Create a unique file name
  fileName <- sprintf("%s_%s.csv", as.integer(Sys.time()), digest::digest(data))
  # Write the file to the local system
  write.csv(
    x = data,
    file = file.path(outputDir, fileName),
    row.names = FALSE, quote = TRUE
  )
}

loadData <- function() {
  # Read all the files into a list
  files <- list.files(outputDir, full.names = TRUE)
  data <- lapply(files, read.csv, stringsAsFactors = FALSE)
  # Concatenate all data together into one data.frame
  data <- do.call(rbind, data)
  data
}
2. Dropbox (remote)
If you want to store arbitrary files with a remotely hosted solution instead of the local file system, you
can store files on Dropbox. Dropbox is a file-hosting service that lets you store any type of file, up to
a storage limit. The free account provides plenty of space and should be enough
to store most data from Shiny apps.
This approach is similar to the previous one that used the local file system; the only
difference is that the files are now saved to and loaded from Dropbox. You can use the
rdrop2 package to interact with Dropbox from R. Note that rdrop2 can only move existing files
onto Dropbox, so we still need to create a local file before storing it on Dropbox.
Setup: You need to have a Dropbox account and create a folder to store the responses. You will also
need to set up authentication for rdrop2 using any of the approaches suggested in the package README.
The authentication approach I chose was to authenticate manually once and to copy the resulting
.httr-oauth file that gets created into the Shiny app's folder.
Code:
library(rdrop2)
outputDir <- "responses"

saveData <- function(data) {
  data <- t(data)
  # Create a unique file name
  fileName <- sprintf("%s_%s.csv", as.integer(Sys.time()), digest::digest(data))
  # Write the data to a temporary file locally, then upload it to Dropbox
  filePath <- file.path(tempdir(), fileName)
  write.csv(data, filePath, row.names = FALSE, quote = TRUE)
  drop_upload(filePath, dest = outputDir)
}
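The loadData counterpart for this approach is cut off above. A sketch using rdrop2's drop_dir and drop_read_csv functions (assuming the same responses folder, an already-authenticated account, and an rdrop2 version where drop_dir returns a path column) could be:

```r
library(rdrop2)

outputDir <- "responses"

loadData <- function() {
  # List all files in the Dropbox output folder
  filesInfo <- drop_dir(outputDir)
  filePaths <- filesInfo$path
  # Read each CSV from Dropbox into a list of data frames
  data <- lapply(filePaths, drop_read_csv, stringsAsFactors = FALSE)
  # Concatenate all data together into one data.frame
  do.call(rbind, data)
}
```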
3. Amazon S3 (remote)
Another popular alternative to Dropbox for hosting files online is Amazon S3, or S3 for short. Just
like with Dropbox, you can host any type of file on S3, but instead of placing files inside
directories, in S3 you place files inside buckets. You can use the RAmazonS3 package to interact
with S3 from R. Note that the package is a few years old and is not under active development, so
use it at your own risk.
Setup: You need to have an Amazon Web Services account and to create an S3 bucket to store the
responses. As the package documentation explains, you will need to set the AmazonS3 global
option to enable authentication.
Code:
library(RAmazonS3)

s3BucketName <- "my-unique-s3-bucket-name"
options(AmazonS3 = c('login' = "secret"))

saveData <- function(data) {
  # Create a unique file name
  fileName <- sprintf("%s_%s.csv", as.integer(Sys.time()), digest::digest(data))
  # Upload the data to S3
  addFile(
    I(paste0(
      paste(names(data), collapse = ","),
      "\n",
      paste(data, collapse = ",")
    )),
    s3BucketName,
    fileName,
    virtual = TRUE
  )
}

loadData <- function() {
  # Get a list of all files
  files <- listBucket(s3BucketName)$Key
  files <- as.character(files)
  # Read all files into a list
  data <- lapply(files, function(x) {
    raw <- getFile(s3BucketName, x, virtual = TRUE)
    read.csv(text = raw, stringsAsFactors = FALSE)
  })
  # Concatenate all data together into one data.frame
  data <- do.call(rbind, data)
  data
}
4. SQLite (local)
SQLite is a very simple and lightweight relational database that is very easy to set up. SQLite is
serverless, which means it stores the database locally, on the same machine that runs the Shiny
app. You can use the RSQLite package to interact with SQLite from R. To connect to a SQLite
database in R, the only information you need to provide is the location of the database file.
To store data in a SQLite database, we loop over all the values we want to add and use a SQL
INSERT statement to add the data to the database. It is essential that the database schema
exactly matches the column names of the Shiny data; otherwise the SQL statement will fail.
To load all previous data, we use a plain SQL SELECT * statement to get all the data from the
database table.
Setup: First, you must have SQLite installed on your server. Installation is fairly easy; for example,
on an Ubuntu machine you can install SQLite with sudo apt-get install sqlite3
libsqlite3-dev. If you use shinyapps.io, SQLite is already installed on the shinyapps.io
servers; this will become a handy feature in future versions of shinyapps.io that include
persistent local storage.
You also need to create a database and a table that will store all the responses. When creating the
table, you need to set up its schema to match the columns of your data. For example, if
you want to save data with columns name and email, then you can create the SQL table with
CREATE TABLE responses(name TEXT, email TEXT);. Make sure the Shiny app has
write permissions on the database file and its parent directory.
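If you prefer to do this setup step from R rather than the sqlite3 command line, a sketch using RSQLite might look like this. The database path here is an assumption; point it at the same location your app's sqlitePath will use.

```r
library(RSQLite)

# Hypothetical path -- use wherever your app expects the database file
dbPath <- file.path(tempdir(), "responses.db")

# Connecting to a nonexistent SQLite file creates it
db <- dbConnect(SQLite(), dbPath)
# The schema must match the columns of the data the app will save
dbExecute(db, "CREATE TABLE IF NOT EXISTS responses(name TEXT, email TEXT)")
dbDisconnect(db)
```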
Code:
library(RSQLite)

sqlitePath <- "/path/to/sqlite/database"
table <- "responses"

saveData <- function(data) {
  # Connect to the database
  db <- dbConnect(SQLite(), sqlitePath)
  # Construct the update query by looping over the data fields
  query <- sprintf(
    "INSERT INTO %s (%s) VALUES ('%s')",
    table,
    paste(names(data), collapse = ", "),
    paste(data, collapse = "', '")
  )
  # Submit the update query and disconnect
  dbGetQuery(db, query)
  dbDisconnect(db)
}

loadData <- function() {
  # Connect to the database
  db <- dbConnect(SQLite(), sqlitePath)
  # Construct the fetching query
  query <- sprintf("SELECT * FROM %s", table)
  # Submit the fetch query and disconnect
  data <- dbGetQuery(db, query)
  dbDisconnect(db)
  data
}
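One caveat about the code above: building the INSERT statement with sprintf leaves it open to SQL injection (and plain breakage) whenever a field contains a single quote. DBI supports parameterized queries, so a safer variant of saveData (a sketch; the demo path and the two-column schema are assumptions matching the setup example) is:

```r
library(RSQLite)

# Assumed path and table for this demo -- adapt to your app's configuration
sqlitePath <- file.path(tempdir(), "responses-demo.sqlite")
table <- "responses"

# Create the demo table if needed (same two-column schema as the setup example)
db <- dbConnect(SQLite(), sqlitePath)
dbExecute(db, "CREATE TABLE IF NOT EXISTS responses(name TEXT, email TEXT)")
dbDisconnect(db)

saveDataSafe <- function(data) {
  db <- dbConnect(SQLite(), sqlitePath)
  on.exit(dbDisconnect(db))
  # '?' placeholders are bound by the driver, so quotes in the input
  # cannot break out of the statement
  query <- sprintf(
    "INSERT INTO %s (%s) VALUES (%s)",
    table,
    paste(names(data), collapse = ", "),
    paste(rep("?", length(data)), collapse = ", ")
  )
  dbExecute(db, query, params = unname(as.list(data)))
}

# A value with a quote, which would break the sprintf-based version
saveDataSafe(c(name = "O'Brien", email = "ob@example.com"))
```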
5. MySQL (local or remote)
Code:
library(RMySQL)
# The credential values below are placeholders -- substitute your own.
# The field names match the options()$mysql$... lookups used in the code.
options(mysql = list(
  "host" = "127.0.0.1",
  "port" = 3306,
  "user" = "myuser",
  "password" = "mypassword"
))
databaseName <- "myshinydatabase"
table <- "responses"

saveData <- function(data) {
  # Connect to the database
  db <- dbConnect(MySQL(), dbname = databaseName, host = options()$mysql$host,
                  port = options()$mysql$port, user = options()$mysql$user,
                  password = options()$mysql$password)
  # Construct the update query by looping over the data fields
  query <- sprintf(
    "INSERT INTO %s (%s) VALUES ('%s')",
    table,
    paste(names(data), collapse = ", "),
    paste(data, collapse = "', '")
  )
  # Submit the update query and disconnect
  dbGetQuery(db, query)
  dbDisconnect(db)
}

loadData <- function() {
  # Connect to the database
  db <- dbConnect(MySQL(), dbname = databaseName, host = options()$mysql$host,
                  port = options()$mysql$port, user = options()$mysql$user,
                  password = options()$mysql$password)
  # Construct the fetching query
  query <- sprintf("SELECT * FROM %s", table)
  # Submit the fetch query and disconnect
  data <- dbGetQuery(db, query)
  dbDisconnect(db)
  data
}
Conclusion
Persistent storage lets you do more with your Shiny apps. You can even use persistent storage to
access and write to remote data sets that would otherwise be too big to manipulate in R.
The following table can serve as a reminder of the different storage types and when to use them.
Remember that any method that uses local storage can only be used on Shiny Server, while any
method that uses remote storage can also be used on shinyapps.io.
| Method            | Data type            | Local storage | Remote storage | R package    |
|-------------------|----------------------|---------------|----------------|--------------|
| Local file system | Arbitrary data       | YES           |                |              |
| Dropbox           | Arbitrary data       |               | YES            | rdrop2       |
| Amazon S3         | Arbitrary data       |               | YES            | RAmazonS3    |
| SQLite            | Structured data      | YES           |                | RSQLite      |
| MySQL             | Structured data      | YES           | YES            | RMySQL       |
| Google Sheets     | Structured data      |               | YES            | googlesheets |
| MongoDB           | Semi-structured data | YES           | YES            | rmongodb     |
You can view the original post of this article, and leave further comments, at
http://deanattali.com/blog/shiny-persistent-data-storage/.