jan-imbi/rocker-versioned2

Introduction

This is a fork of the Rocker project. It aims to produce a Docker image containing R and RStudio that retains enough detail to fulfill the documentation requirements of highly regulated environments such as clinical trials. A design goal of this fork is to make as few changes as possible to the upstream repository, so that it requires minimal maintenance.

Notable changes compared to the upstream version are:

  • LaTeX is installed with the scheme-full option instead of scheme-infraonly
  • make and configure logs are saved
  • the R source files are not deleted
  • a PowerShell build script is added, which sequentially builds the following version-tagged images locally:
    • r-ver
    • rstudio
    • tidyverse
    • verse

Downloading

You need to make sure git does not change line endings. The build will fail if any of the scripts in the /scripts folder are converted to CR+LF (i.e. Windows) line endings. The following command instructs git globally not to convert line endings:

git config --global core.autocrlf false

After this, you can clone this repository like this:

git clone git@github.com:jan-imbi/rocker-versioned2.git
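
After cloning, it may be worth confirming that none of the build scripts picked up CR+LF line endings. The following is a minimal sketch and not part of this repository: the function name find_crlf_files is hypothetical, and the usage line assumes you are in the repository root.

```shell
# Hypothetical helper (not part of this repository): list files below a
# directory that contain carriage returns, i.e. CR+LF line endings.
find_crlf_files() {
  # $1: directory to scan; defaults to the scripts folder of the checkout
  cr=$(printf '\r')
  # grep -l prints only the name of each file with at least one matching line
  find "${1:-scripts}" -type f -exec grep -l "$cr" {} \;
}

# Usage (from the repository root); no output means no CR+LF was found:
# find_crlf_files scripts
```

If a file is listed, re-cloning after setting core.autocrlf to false is probably the simplest fix.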

Building

First, make sure you have docker installed.

Then, navigate to the root of the git repository (e.g. cd ~/rocker-versioned2) and run the following command (in your shell):

./sequential_build_script/build_4.1.2.ps1

Note that this will overwrite local images named rocker/<image name>:<tag>

Starting a container for validation

You can start a container like this:

docker run -d -p 8787:8787 -e PASSWORD=validation -e ROOT=true rocker/verse:4.1.2

After this, you will be able to log in at http://localhost:8787/auth-sign-in with:

  • username: rstudio
  • password: validation

This user will be in the sudoers file.

You can open a shell inside a running container like this (this uses docker exec, not SSH):

docker exec -ti <container-id> bash

You can find out the container ID via:

docker ps -a
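
If many containers exist on the host, the listing can be narrowed down to containers created from a particular image. The sketch below relies on the --filter and --format options of docker ps; the function name container_ids_for_image is hypothetical.

```shell
# Hypothetical helper: print only the IDs of containers (running or stopped)
# that were created from the given image.
container_ids_for_image() {
  # ancestor=<image> keeps containers based on that image;
  # the Go template {{.ID}} prints just the container ID column.
  docker ps -a --filter "ancestor=$1" --format '{{.ID}}'
}

# Usage:
# container_ids_for_image rocker/verse:4.1.2
```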

If you want to expose an SSH port to access the container over your network, you can do so by starting the container with the following command (from your shell):

docker run -d -p 8787:8787 -p <ssh port you want to map to>:22 -e PASSWORD=validation -e ROOT=true rocker/verse:4.1.2

Afterwards, you need to install an OpenSSH server in the container and start the ssh service (from inside the container):

sudo apt update && sudo apt install openssh-server -y
sudo service ssh start

Validating

After having logged into the container as root, you can perform whatever your organization deems necessary to validate a computerized system.

You will probably want to install all packages to fulfill the requirements your organization has for this computerized system. We recommend that packages which are known to be required a priori are installed in /usr/local/lib/R/site-library.

Additionally, we recommend removing group write privileges on /usr/local/lib/R/site-library and /usr/local/lib/R/library via the following commands (in bash):

chmod g-w /usr/local/lib/R/site-library
chmod g-w /usr/local/lib/R/library

This way, packages installed by users will be installed to /home/<username>/R/<architecture>/<R version>.
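
To document that the group write permission was actually removed, a small check along the following lines could be run inside the container. This is only a sketch: the function name not_group_writable is hypothetical, and it inspects the permission bits of the directory itself, not of its contents.

```shell
# Hypothetical helper: succeed (exit status 0) if the given directory is
# NOT group-writable, fail otherwise.
not_group_writable() {
  # -prune stops find from descending into the directory;
  # -perm -020 matches only if the group write bit is set.
  [ -z "$(find "$1" -prune -perm -020)" ]
}

# Usage:
# not_group_writable /usr/local/lib/R/site-library && echo "site-library: ok"
# not_group_writable /usr/local/lib/R/library && echo "library: ok"
```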

You may also want to hash all packages installed at the site (i.e. administrator) level. You can do so by opening an R session with sudo privileges and executing the following command (in R):

sapply(.libPaths(), tools:::.installMD5sums)

Afterwards, you could use the following function to check whether user level installs differ from the site level installs (in R):

#' Compare the MD5 hashes of all files in packagedir to MD5 reference files located in referencedir
#'
#' @param packagedir path to dir with user level packages
#' @param referencedir (vector of) paths which contain the reference MD5 files
#'
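#' @return TRUE if all files match the reference checksums, FALSE otherwise; NA if referencedir is empty or a reference MD5 file is missing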
#' @export
#'
#' @examples
#' compareMD5(head(.libPaths(), 1), tail(.libPaths(), 2))
compareMD5 <-  function (packagedir, referencedir)
{
  if (missing(referencedir))
    stop("referencedir may not be missing")
  if (!length(referencedir))
    return(NA)
  inlines <- c()
  for (dir in referencedir) {
    md5file <- file.path(dir, "MD5")
    if (!file.exists(md5file))
      return(NA)
    inlines <- c(inlines, readLines(md5file))
  }
  xx <- sub("^([0-9a-fA-F]*)(.*)", "\\1", inlines)
  nmxx <- names(xx) <- sub("^[0-9a-fA-F]* [ |*](.*)", "\\1",
                           inlines)
  dot <- getwd()
  if (is.null(dot))
    stop("current working directory cannot be ascertained")
  setwd(packagedir)
  # md5sum() lives in the tools package, which is not attached by default
  x <- tools::md5sum(dir(packagedir, recursive = TRUE))
  setwd(dot)
  x <- x[names(x) != "MD5"]
  nmx <- names(x)
  res <- TRUE
  not.here <- !(nmx %in% nmxx)
  if (any(not.here)) {
    res <- FALSE
    if (sum(not.here) > 1L)
      cat(
        "files",
        paste(sQuote(nmx[not.here]), collapse = ", "),
        "are not present in the reference directory\n",
        sep = " "
      )
    else
      cat("file",
          sQuote(nmx[not.here]),
          "is not present in the reference directory\n",
          sep = " ")
  }
  nmx <- nmx[!not.here]
  diff <- xx[nmx] != x[nmx]
  if (any(diff)) {
    res <- FALSE
    files <- nmx[diff]
    if (length(files) > 1L)
      cat(
        "files",
        paste(sQuote(files), collapse = ", "),
        "have different MD5 checksums compared to the references\n",
        sep = " "
      )
    else
      cat("file",
          sQuote(files),
          "has a different MD5 checksum compared to the reference\n")
  }
  res
}

This docker image saves the configure and make log of the R compilation in the /R-${R_VERSION}/logs directory.

You may want to save an installation log for all site level packages as well. We recommend having a dedicated script for installing all necessary packages, e.g. (in bash):

echo 'namespace_packages_stats <- c("car", "coxme", "DescTools", "emmeans", "exact2x2", "lme4", "lmerTest", "mice", "rpact", "survminer")
namespace_packages_util <- c("ggpubr", "viridisLite",  "RColorBrewer", "xtable", "flextable", "kableExtra", "here")
validation_packages <- c("mitml", "optimx", "Exact", "dfoptim", "broom.mixed", "lmtest", "kinship2", "mnormt", "shiny", "filesstrings")
pkg_list <- c(namespace_packages_stats, namespace_packages_util, validation_packages)
install.packages(pkg_list)' >> /R-${R_VERSION}/logs/install_additional_packages.R

You can then run this script with the R CMD BATCH command and save the output to a log file (in bash):

R CMD BATCH /R-${R_VERSION}/logs/install_additional_packages.R /R-${R_VERSION}/logs/install_additional_packages.log
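
Once the log exists, it can be scanned for lines that usually point to a failed installation. The sketch below is a heuristic rather than a complete check: the function name scan_install_log is hypothetical, and the absence of matches does not prove that every package installed correctly.

```shell
# Hypothetical helper: print (with line numbers) log lines that usually
# indicate a failed package installation.
scan_install_log() {
  # "ERROR" is printed by R CMD INSTALL when a build fails; "non-zero exit
  # status" appears in the warning install.packages() raises for such packages.
  grep -n -E 'ERROR|non-zero exit status' "$1"
}

# Usage, assuming R_VERSION is set as in the commands above:
# scan_install_log /R-${R_VERSION}/logs/install_additional_packages.log
```

The function exits with a non-zero status when nothing matches, which is grep's usual behaviour.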

You may also want to download the source code of all installed packages for validation purposes (in R):

pkgs <- installed.packages()
base <- pkgs[pkgs[,4] %in% c("base", "recommended"),1]
pkgs_without_base_idx <- which(!(pkgs[,1] %in% base))
sapply(pkgs_without_base_idx,
    function(x) filesstrings::file.move(remotes::download_version(pkgs[x, 1], pkgs[x, 3], repos = "https://cloud.r-project.org"),
    paste0("/R-", Sys.getenv("R_VERSION"), "/pkg_src")))

You will probably want to extract all these archives and possibly remove the archives afterwards (in bash):

cd /R-${R_VERSION}/pkg_src
for f in *; do tar xf "$f"; done
find . -maxdepth 1 -type f -delete

You may also want to run all unit tests that come with the R source code and save the output into the log directory. You can do so with the following commands (in bash):

cd /R-${R_VERSION}/tests
../bin/R CMD make check-all | tee /R-${R_VERSION}/logs/test_base.log
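
Note that when output is piped through tee, the shell by default reports the exit status of tee rather than that of the test run. If the exit status matters for your validation records, a wrapper like the following sketch could be used; the function name run_logged is hypothetical.

```shell
# Hypothetical helper: run a command, copy its output to a log file, and
# return the command's own exit status instead of tee's.
run_logged() {
  log_file=$1
  shift
  status_file=$(mktemp)
  # The exit status is written to a temporary file because the pipeline
  # itself only reports the status of tee.
  { "$@" 2>&1; echo "$?" > "$status_file"; } | tee "$log_file"
  status=$(cat "$status_file")
  rm -f "$status_file"
  return "$status"
}

# Usage, assuming R_VERSION is set as in the commands above:
# cd /R-${R_VERSION}/tests
# run_logged /R-${R_VERSION}/logs/test_base.log ../bin/R CMD make check-all
```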

You might also choose to run some of the unit tests from packages you installed in the container. If you choose to do so, the following script might be useful (in R):

clear_stuff <- function() {
  if (!is.null(sessionInfo()$otherPkgs)){
    lapply(
      paste0('package:', names(sessionInfo()$otherPkgs)),
      detach,
      character.only = TRUE,
      unload = TRUE
    )
  }
  rm(list = setdiff(ls(envir = .GlobalEnv) , c("clear_stuff", "log_tests")), envir = .GlobalEnv)
}

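#' Log unit tests
#'
#' @param pkg_name name of the package whose tests are run
#' @param pkg_dir directory containing the extracted package sources
#' @param log_dir directory in which <pkg_name>.log is written
#' @param test_type either "testthat" (run tests/testthat) or "files" (source all R files under tests/)
#' @param library_or_load "library" to attach the installed package, "load" to use devtools::load_all(), or "unload_after_each_file" to detach the package after each test file
#'
#' @return Called for its side effects (the log file); the return value is not meaningful
#' @export
#'
#' @examples
#' log_tests("coxme", paste0("/R-", Sys.getenv("R_VERSION"), "/pkg_src"), paste0("/R-", Sys.getenv("R_VERSION"), "/logs"), "files", "library")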
log_tests <- function(pkg_name, pkg_dir, log_dir, test_type= "testthat", library_or_load = "library"){
pkg_dir <- normalizePath(pkg_dir)
log_dir <- normalizePath(log_dir)
setwd(pkg_dir)
sink(paste0(log_dir, "/", pkg_name, ".log" ), split = T)
setwd(pkg_name)
if(library_or_load == "library"){
   library(package=pkg_name, character.only = T)
} else if (library_or_load == "load"){
   devtools::load_all()
}
if(test_type=="testthat"){
   testthat::test_dir("./tests/testthat")
} else if (test_type=="files"){
   setwd("tests")
   files_test <-
      paste0(
      pkg_dir, "/", pkg_name, "/tests/",
      list.files(
         path = paste0(pkg_dir, "/", pkg_name, "/tests/"),
         recursive = T,
         pattern = "\\.R$"
      )
      )
   for (f in files_test){
      source(f, echo=T)
      if (library_or_load == "unload_after_each_file"){
      detach(paste0("package:",pkg_name), character.only = TRUE, unload = TRUE)
      }
   }
}
sink()
setwd(pkg_dir)
clear_stuff()
}

Saving the changes from your container to an image

Look for the hash of your validated container (in your shell):

docker ps -a

Commit the container (in your shell):

docker commit <hash of the container> <repository name>/<image name>:<tag>

If you have a dockerhub account, you can push the container online with the following command (in your shell):

docker push <repository name>/<image name>:<tag>
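
For the validation records, it may be useful to note the ID of the committed image so that it can later be checked against what is actually deployed. This is a sketch only: it uses docker image inspect with a Go template, and the function name image_id and the output file name are hypothetical.

```shell
# Hypothetical helper: print the content-addressable ID of a local image.
image_id() {
  # {{.Id}} selects the image ID field from the inspect output
  docker image inspect --format '{{.Id}}' "$1"
}

# Usage:
# image_id <repository name>/<image name>:<tag> >> validated_image_id.txt
```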

Starting a container from your validated image

You can now start your image like this (in your shell):

docker run -d -p 8787:8787 -e PASSWORD=validation -e ROOT=true <repository name>/<image name>:<tag>

You may want to check out this link if you want to mount volumes in the container.
