---
title: "Opal Projects"
author: "Yannick Marcon"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Opal Projects}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

[Opal](https://www.obiba.org/pages/products/opal/) stores data and metadata (dictionaries) in projects that are accessible through web services. See the [Variables and Data](http://opaldoc.obiba.org/en/latest/variables-data.html) documentation page for an explanation of Opal's data model, and the [Resources](https://opaldoc.obiba.org/en/latest/resources.html) documentation for an alternative way of accessing data in a project.

The Opal R package exposes project-related functions:

* list projects,
* list tables,
* list variables,
* list taxonomies,
* and functions for getting the associated details (of a project, a table, a variable, etc.).

Note that these functions do not create an R session on the server side: they only access the content of the Opal server (permission checks apply).

## Setup

Set up the connection with Opal:

```{r eval=FALSE}
library(opalr)
o <- opal.login("administrator", "password", url = "https://opal-demo.obiba.org")
```
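
Hard-coded credentials are convenient for a demo server but are best avoided elsewhere. As a minimal sketch, the connection settings could be read from environment variables instead. The helper `opal_credentials()` and the variable names `OPAL_URL`, `OPAL_USERNAME` and `OPAL_PASSWORD` are hypothetical and not part of the package:

```{r eval=FALSE}
# Hypothetical helper: read connection settings from environment variables,
# falling back to the demo values used in this vignette when they are unset.
opal_credentials <- function() {
  list(
    url = Sys.getenv("OPAL_URL", unset = "https://opal-demo.obiba.org"),
    username = Sys.getenv("OPAL_USERNAME", unset = "administrator"),
    password = Sys.getenv("OPAL_PASSWORD", unset = "password")
  )
}

creds <- opal_credentials()
# Then log in as shown above:
# o <- opal.login(creds$username, creds$password, url = creds$url)
```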

## Project

List the projects:

```{r eval=FALSE}
opal.projects(o)
```

Create a project, linked to a database (default or first one):

```{r eval=FALSE}
if (opal.project_exists(o, "dummy"))
  opal.project_delete(o, "dummy")  
opal.project_create(o, "dummy", database = TRUE)
opal.project(o, "dummy")
```

### Backup and Restore

Back up a project and download the backup archive (encrypted):

```{r eval=FALSE}
opal.project_backup(o, 'CNSIM', '/home/administrator/backup/CNSIM')
opal.file_download(o, '/home/administrator/backup/CNSIM', '/tmp/CNSIM.zip', key = "12345abcdef")
```

Restore a project from an uploaded (and encrypted) archive:

```{r eval=FALSE}
opal.file_upload(o, '/tmp/CNSIM.zip', '/home/administrator')
opal.project_restore(o, 'dummy', '/home/administrator/CNSIM.zip', key = "12345abcdef")
# verify tables
opal.tables(o, "dummy")
```

## Tables

In Opal there are two kinds of tables:

* raw tables, whose data are stored in a database,
* views, which are logical tables, using per-variable transformation algorithms.

List the tables in a project, with their count of variables and entities:

```{r eval=FALSE}
opal.tables(o, "CNSIM", counts = TRUE)
```

The table object can be retrieved as follows:

```{r eval=FALSE}
opal.table(o, "CNSIM", "CNSIM1", counts = TRUE)
```

The existence of a table can be checked:

```{r eval=FALSE}
opal.table_exists(o, "CNSIM", "CNSIM1")
```

More specifically, verify whether a table is a view:

```{r eval=FALSE}
opal.table_exists(o, "CNSIM", "CNSIM1", view = TRUE)
```

A table can be created, either as a raw table or a view. To create a view, specify which tables it refers to:

```{r eval=FALSE}
# drop table if it exists
opal.table_delete(o, "CNSIM", "CNSIM123")
# then create a view, no variables
opal.table_create(o, "CNSIM", "CNSIM123", tables = c("CNSIM.CNSIM1", "CNSIM.CNSIM2", "CNSIM.CNSIM3"))
```
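
In the call above, the `tables` argument holds fully qualified names of the form `project.table`. As a small sketch, these references can be built with base R rather than typed by hand (the variable names here are only illustrative):

```{r eval=FALSE}
project <- "CNSIM"
table_names <- paste0("CNSIM", 1:3)
# prefix each table name with its project name
refs <- paste(project, table_names, sep = ".")
refs
# could then be passed as:
# opal.table_create(o, project, "CNSIM123", tables = refs)
```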

### Dictionaries

List the variables of a table and get the details of the variable annotations (one column per variable attribute with namespace). This is a summary dictionary, as it includes the concatenated category properties:

```{r eval=FALSE}
opal.variables(o, "CNSIM", "CNSIM1")
```

It is also possible to get the full data dictionary of a table, as separate data frames of variables and categories. This is the recommended format for working with a data dictionary:

```{r eval=FALSE}
dico <- opal.table_dictionary_get(o, "CNSIM", "CNSIM1")
dico$variables
dico$categories
```

Here we modify the data dictionary by appending a derivation script to each of the variables:

```{r eval=FALSE}
dico$variables$script <- paste0("$('", dico$variables$name, "')")
dico$variables
```
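
To see what this expression produces without a server, here is the same `paste0()` call applied to a small mock data frame. The two rows are made up for illustration and only mimic the `name` column of a real dictionary. Each generated script is a `$('...')` selector on the variable of the same name:

```{r eval=FALSE}
# mock dictionary, not retrieved from Opal
variables <- data.frame(
  name = c("GENDER", "LAB_HDL"),
  valueType = c("text", "decimal"),
  stringsAsFactors = FALSE
)
# one derivation script per variable
variables$script <- paste0("$('", variables$name, "')")
variables$script
```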

Then we apply this derived variables dictionary to the view we have previously created and verify the counts of columns (variables) and rows (entities) in this table:

```{r eval=FALSE}
opal.table_dictionary_update(o, "CNSIM", "CNSIM123", variables = dico$variables, categories = dico$categories)
opal.table(o, "CNSIM", "CNSIM123", counts = TRUE)
```

Assign this view to a symbol in the R server, and get the summary statistics:

```{r eval=FALSE}
opal.assign(o, "D", "CNSIM.CNSIM123")
opal.execute(o, "summary(D)")
```

### Values

Get the values in a table for a specific Participant entity:

```{r eval=FALSE}
opal.valueset(o, "CNSIM", "CNSIM123", identifier = "1454")
```

Get all the values of a table in our local R session as a data.frame (tibble) object:

```{r eval=FALSE}
cnsim1 <- opal.table_get(o, "CNSIM", "CNSIM1")
cnsim2 <- opal.table_get(o, "CNSIM", "CNSIM2")
cnsim3 <- opal.table_get(o, "CNSIM", "CNSIM3")
```

Then alter these data frames, combine them, and save the result as a raw table:

```{r eval=FALSE}
# make sure IDs are unique
cnsim1$id <- paste0(cnsim1$id, "-1")
cnsim2$id <- paste0(cnsim2$id, "-2")
cnsim3$id <- paste0(cnsim3$id, "-3")
# bind tables
cnsim123 <- rbind(cnsim1, cnsim2, cnsim3)
# remove some columns
cnsim123$DIS_AMI <- NULL
cnsim123$DIS_CVA <- NULL
cnsim123$DIS_DIAB <- NULL
# save as a raw table
opal.table_save(o, cnsim123, "CNSIM", "CNSIM", overwrite = TRUE, force = TRUE)
opal.table(o, "CNSIM", "CNSIM", counts = TRUE)
```
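
The identifier suffixing matters because the same identifiers can appear in several source tables. A minimal local sketch of the same steps, using two made-up data frames in place of the tables retrieved from Opal, with an explicit check that no identifier is duplicated after binding:

```{r eval=FALSE}
# mock tables sharing the same identifiers
t1 <- data.frame(id = c("1", "2"), GENDER = c("0", "1"), stringsAsFactors = FALSE)
t2 <- data.frame(id = c("1", "2"), GENDER = c("1", "1"), stringsAsFactors = FALSE)
# suffix the identifiers so that each row stays distinct
t1$id <- paste0(t1$id, "-1")
t2$id <- paste0(t2$id, "-2")
merged <- rbind(t1, t2)
# fail early if any identifier is still duplicated
stopifnot(anyDuplicated(merged$id) == 0)
merged$id
```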

Verify that this raw table, resulting from the merge of the other tables, has the same values for a given Participant:

```{r eval=FALSE}
opal.valueset(o, "CNSIM", "CNSIM", identifier = "1454-1")
```

It is possible to truncate a table, i.e. delete ALL the values of a table (which must not be a view), without modifying the dictionary:

```{r eval=FALSE}
opal.table_truncate(o, "CNSIM", "CNSIM")
opal.table(o, "CNSIM", "CNSIM", counts = TRUE)
```

### Annotations

Variables can be described by [taxonomy terms](https://opaldoc.obiba.org/en/latest/web-user-guide/administration/taxonomies.html).

List the taxonomies:

```{r eval=FALSE}
opal.taxonomies(o)
```

List the vocabularies of a taxonomy:

```{r eval=FALSE}
opal.vocabularies(o, taxonomy = "Mlstr_area")
```

List the terms of a vocabulary:

```{r eval=FALSE}
opal.terms(o, taxonomy = "Mlstr_area", vocabulary = "Lifestyle_behaviours")
```

To apply a taxonomy term to a table dictionary, use the following for batch annotation:

```{r eval=FALSE}
annotations <- tibble::tribble(
  ~variable, ~taxonomy, ~vocabulary, ~term,
  "LAB_TSC", "Mlstr_area", "Physical_measures", "Physical_characteristics",
  "LAB_TRIG", "Mlstr_area", "Physical_measures", "Physical_characteristics",
  "LAB_HDL", "Mlstr_area", "Physical_measures", "Physical_characteristics",
  "LAB_GLUC_ADJUSTED", "Mlstr_area", "Physical_measures", "Physical_characteristics"
)
opal.annotate(o, "CNSIM", "CNSIM123", annotations = annotations)
```
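
When many variables share the same term, the annotations table can also be built from a vector of variable names. This is a sketch that assumes `opal.annotate()` accepts a plain data frame with the same four columns as the tibble above; convert it with `tibble::as_tibble()` if in doubt:

```{r eval=FALSE}
variables <- c("LAB_TSC", "LAB_TRIG", "LAB_HDL", "LAB_GLUC_ADJUSTED")
# the scalar values are recycled for each variable
annotations <- data.frame(
  variable = variables,
  taxonomy = "Mlstr_area",
  vocabulary = "Physical_measures",
  term = "Physical_characteristics",
  stringsAsFactors = FALSE
)
annotations
```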

To list the variable annotations: 

```{r eval=FALSE}
opal.annotations(o, "CNSIM", "CNSIM123")
```

## Resources

Resources are an alternative way of accessing data or computation systems. A project stores references to resources, i.e. how to build a resource object in R and the permissions to use this resource.

To list the resource references:

```{r eval=FALSE}
opal.resources(o, "RSRC")
```

To create a reference to a resource (a compressed CSV file, stored in an Opal file system, authorized by a personal access token):

```{r eval=FALSE}
if (opal.resource_exists(o, "RSRC", "CNSIM4"))
  opal.resource_delete(o, "RSRC", "CNSIM4")
opal.resource_create(o, "RSRC", "CNSIM4", 
   url = "opal+https://opal-demo.obiba.org/ws/files/projects/RSRC/CNSIM3.zip", 
   format = "csv", secret = "EeTtQGIob6haio5bx6FUfVvIGkeZJfGq")
# verify the resource reference object
opal.resource(o, "RSRC", "CNSIM4")
```

From a resource reference, it is possible to build and get the resource object in the local R session:

```{r eval=FALSE}
opal.resource_get(o, "RSRC", "CNSIM4")
```

Depending on the nature of the resource, it may be possible to coerce it to a data.frame on the client side:

```{r eval=FALSE}
library(resourcer)
as.data.frame(opal.resource_get(o, "RSRC", "CNSIM4"))
```

The same operation can be done on the R server side:

```{r eval=FALSE}
# assign the resource object
opal.assign.resource(o, "rsrc", "RSRC.CNSIM4")
# coerce it to a data.frame
opal.assign.script(o, "D", quote(as.data.frame(rsrc)))
# get some summary statistics
opal.execute(o, "summary(as.factor(D$GENDER))")
```

## Permissions

Permissions can be managed (list, add, delete) at different levels:

* project: `opal.project_perm()`,
* tables: `opal.tables_perm()`,
* table: `opal.table_perm()`,
* resources: `opal.resources_perm()`,
* resource: `opal.resource_perm()`.
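
As an illustration of the table level, the sketch below wraps the add and list operations in a hypothetical helper, `grant_table_view()`. It assumes that `opal.table_perm_add()` takes `subject`, `type` and `permission` arguments and that `"view"` is a valid table permission; check `?opal.table_perm_add` for the exact signature and accepted values. The user names in the usage comment are made up:

```{r eval=FALSE}
# Hypothetical helper: grant the "view" permission on a table to some users,
# then return the permissions now set on that table.
grant_table_view <- function(o, project, table, users) {
  opalr::opal.table_perm_add(o, project, table,
                             subject = users, type = "user", permission = "view")
  opalr::opal.table_perm(o, project, table)
}
# usage, with the connection object from the Setup section:
# grant_table_view(o, "CNSIM", "CNSIM1", c("andrei", "valentina"))
# and to revoke (again assuming this signature):
# opal.table_perm_delete(o, "CNSIM", "CNSIM1", c("andrei", "valentina"), type = "user")
```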

## Teardown

Good practice is to free server resources by sending a logout request:

```{r eval=FALSE}
opal.logout(o)
```
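
In scripts, an error between login and logout would leave the session open. One possible pattern, sketched here with a hypothetical helper `with_opal()` that is not part of the package, is to register the logout with `on.exit()` so that it runs even when the work fails:

```{r eval=FALSE}
# Hypothetical helper: open a connection, run fn(o), and always log out.
with_opal <- function(username, password, url, fn) {
  o <- opalr::opal.login(username, password, url = url)
  # runs when the function exits, whether normally or on error
  on.exit(opalr::opal.logout(o), add = TRUE)
  fn(o)
}
# usage:
# with_opal("administrator", "password", "https://opal-demo.obiba.org",
#           function(o) opal.projects(o))
```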