An introduction to trackr

Sara Moore, Gabriel Becker

2020-02-28

Introduction - Towards Discoverability

Results are most impactful when they are reproducible, understandable, and discoverable. Analysts cannot incorporate a finding into their understanding of a particular dataset or question unless they know that the result exists. This can be difficult under the status quo, where results are often sent directly to collaborators on a particular project and then (hopefully) archived, often in a location and manner specific to the analyst who generated them.

Results are discoverable, on the other hand, when there is a reasonable mechanism by which anyone with appropriate access permissions can learn that they exist and locate them. The searching party might be a new collaborator getting up to speed on a project, a scientist working in the same space and trying to determine what has already been done, or even the analyst themselves looking for a result generated months or years earlier.

The trackr package seeks to improve the discoverability of results in two ways: by recording the existence of R-based results (and in some cases the objects representing them) in a customizable database, and by annotating those records with automatically inferred metadata about the results. These annotations power the ability to search for and find records of particular results, or classes thereof, whether or not the seeker knew of them beforehand.

Dependency installation check

To run all examples and vignettes, the following packages should be installed:
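A minimal sketch of such a check in base R is shown here. It covers only the packages named in this vignette's text (trackr, ggplot2, and lattice); the complete list may well be longer, and the helper `missing_packages` is a hypothetical name introduced for illustration.

```r
# Packages named in this vignette's text; the complete list may include others.
needed <- c("trackr", "ggplot2", "lattice")

# Return the subset of `pkgs` whose namespaces cannot be loaded
# in the current R installation.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!is_installed])
}

not_installed <- missing_packages(needed)
if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

The check uses `requireNamespace()` rather than `library()` so that it reports availability without attaching anything to the search path.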

Setup

Use temporary trackr backend

By default, trackr writes to a permanent JSON backend, which lives at ~/.trackr/objdb.json. For the purposes of this vignette, we point it at a temporary one so that the vignette does not create permanent files while it runs.

Users will generally not need to run the code below, though they may choose to use a non-default backend, in which case they will need to specify it in their session.

## An object of class "TrackrDB"
## Slot "opts":
## An object of class "TrackrOptions"
## Slot "insert_delay":
## [1] 0
## 
## Slot "img_dir":
## [1] "/var/folders/14/z0rjkn8j0n5dj1lkdd4ng1600000gn/T//Rtmp05Yoja/trackr_img_dir"
## 
## Slot "img_ext":
## [1] "png"
## 
## Slot "backend_opts":
## list()
## 
## 
## Slot "backend":
## Reference class object of class "JSONBackend"
## Field "docs":
## DocList (0x0)
## Field ".file":
## [1] "/var/folders/14/z0rjkn8j0n5dj1lkdd4ng1600000gn/T//Rtmp05Yoja/objdb.json"
## Field "file":
## [1] "/var/folders/14/z0rjkn8j0n5dj1lkdd4ng1600000gn/T//Rtmp05Yoja/objdb.json"
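The paths in the output above sit under the R session's temporary directory. As a rough illustration, the following base-R sketch builds paths of that shape, assuming the file and directory names seen in the output (objdb.json and trackr_img_dir). It deliberately stops short of calling trackr, since the exact constructor call is not reproduced in this text; `tmp_paths` and `default_db_file` are hypothetical names.

```r
# Build backend paths under the session's temporary directory, which R
# removes when the session ends normally.
tmp_root <- tempdir()

tmp_paths <- list(
  db_file = file.path(tmp_root, "objdb.json"),     # JSON backend file
  img_dir = file.path(tmp_root, "trackr_img_dir")  # directory for plot images
)

# Create the image directory up front; the JSON file itself is left
# uncreated in this sketch.
dir.create(tmp_paths$img_dir, showWarnings = FALSE, recursive = TRUE)

# For contrast, the permanent default location mentioned in the text.
default_db_file <- path.expand(file.path("~", ".trackr", "objdb.json"))

print(tmp_paths)
```

Keeping both locations under `tempdir()` means nothing written during the session outlives it, which is the property the vignette relies on.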

Create some example plots

Here we create a number of plots - ggplot2, lattice, and base - which we will use throughout the demonstration. The details of the plots themselves are not important; we use many different datasets and plotting methods in order to illustrate the different avenues for discoverability enabled by trackr’s automatic annotations.

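As a hedged illustration of what one plot from each system might look like, the sketch below uses the built-in `mtcars` and `iris` datasets. These are stand-ins chosen here, not necessarily the plots the vignette itself creates. Capturing the base plot with `recordPlot()` is likewise an assumption about how one might hold it as an object, and the ggplot2 plot is built only if that package is installed.

```r
library(lattice)  # a recommended package, included with standard R distributions

# lattice: fuel economy against weight, one panel per cylinder count.
lattice_plot <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
                       main = "Fuel economy by weight and cylinders")

# ggplot2: built only when the package is available.
gg_plot <- NULL
if (requireNamespace("ggplot2", quietly = TRUE)) {
  gg_plot <- ggplot2::ggplot(
    iris,
    ggplot2::aes(x = Sepal.Length, y = Sepal.Width, colour = Species)
  ) +
    ggplot2::geom_point() +
    ggplot2::labs(title = "Iris sepal dimensions")
}

# base graphics: draw on a null PDF device with the display list enabled,
# then capture the drawing as a "recordedplot" object.
pdf(NULL)
dev.control(displaylist = "enable")
hist(mtcars$mpg, main = "Distribution of fuel economy",
     xlab = "Miles per gallon")
base_plot <- recordPlot()
invisible(dev.off())
```

Each of the three results is an ordinary R object (`trellis`, `ggplot`, and `recordedplot` respectively), so each can be stored and passed around like any other value.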