class: inverse, center, title-slide, middle # Everything You Wanted to Know About Contributing to an R Package ## *But Were Too Afraid to Ask* ### Daniel D. Sjoberg #### April 4, 2022 <p align="center"><img src="Images/White-Transparent.png" width=30%></p> .medium[ <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#FFFFFF;" xmlns="http://www.w3.org/2000/svg"> <path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"></path></svg> @statistishdan <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#FFFFFF;" xmlns="http://www.w3.org/2000/svg"> <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg> @ddsjoberg ] --- ## Outline .xxlarge[ 1. Anatomy of an R Package 1. Preparing to Contribute 1. Fork + Clone + Branch 1. Write/Modify a Function 1. Document function/changes 1. Style your contribution 1. Add unit tests 1. R CMD Checks 1. Submit Pull Request ] _We will cover how to contribute to an R package whose source code lives on **GitHub**_ --- ## Other Resources .pull-left[ [![](Images/r-packages-cover.png)](https://r-pkgs.org/) ] .pull-right[ [![](Images/ropensci-packages-cover.png)](https://devguide.ropensci.org/) ] --- ## Anatomy of an R Package ``` ## C:/Users/SjobergD/GitHub/contributing-to-an-R-pkg/toy.pkg ## +-- DESCRIPTION ## +-- man ## | \-- foo.Rd ## +-- NAMESPACE ## +-- R ## | \-- foo.R ## \-- tests ## \-- testthat ## \-- test-foo.R ``` --- ## DESCRIPTION .pull-left[ * Every package must have a `DESCRIPTION`. * The job of the `DESCRIPTION` file is to store important metadata about your package. - list dependencies - package title/description - package version - specify the package license - list package authors and contributors * This file is setup by the package maintainer, and you will likely _NOT_ need to modfiy this file. * Read more here https://r-pkgs.org/description.html ] .pull-right[ ``` Package: mypackage Title: What The Package Does (one line) Version: 0.1 Authors@R: person( "First", "Last", email = "first.last@example.com", role = c("aut", "cre")) Description: What the package does (one paragraph) Depends: R (>= 3.5) Imports: dplyr License: What license is it under? LazyData: true ``` ] --- ## NAMESPACE .pull-left[ * You can see that the `NAMESPACE` file looks a bit like R code. Each line contains a directive: `S3method()`, `export()`, `importFrom()`, and so on. * Each directive describes an R object, and says whether it’s exported from this package to be used by others, or it’s imported from another package to be used locally. * {roxygen2} will generate `NAMESPACE` for you! Do _not_ edit this file. ] .pull-right[ ```r # Generated by roxygen2: do not edit by hand S3method(add_n,tbl_summary) export(tbl_regression) export(tbl_summary) export(tbl_uvregression) importFrom(glue,glue) importFrom(knitr,knit_print) importFrom(magrittr,"%>%") ``` ] --- ## What are you going to contribute? .xlarge[ **Please do not make a blind pull request into a package.** 1. Before you begin working on a new function or feature, **submit an Issue on GitHub**. 1. **Work with the package maintainer** to ensure the new functionality fits with the existing package. 1. If a **GitHub Issue already exists** for the feature request, join the conversation on the Issue: let the maintainer know it's something you'd like to work on. ] --- ## Getting Ready Before you can contribute to a package, you need to **get your system setup** for development. 1. If it's been a while, take the time to **update R and RStudio**. 1. **Install and configure git** (if you haven't already). https://github.mskcc.org/pages/datadojo/mskRutils/articles/git_config.html#git-setup 1. **Configure your GitHub Personal Access Token (PAT)**. Remember that Enterprise GitHub and public GitHub require two different PATs. Details here https://github.mskcc.org/pages/datadojo/mskRutils/articles/git_config.html#pat 1. **Install** the following packages ```r install.packages(c("devtools", "roxygen2", "testthat", "knitr", "styler")) ``` 1. Using Windows? **Install RTools**. https://cran.r-project.org/bin/windows/Rtools/ 1. Using macOS? **Install Xcode**. ```shell xcode-select --install ``` --- ## Fork + Clone + Branch .xlarge[ * Most R Package source code is kept on GitHub * Navigate to the package's GitHub page - Fork the repository - Clone the forked repository from your personal GH page to your local machine - Create a _new_ branch to work on where you will add your function - Review the training materials for details on forking + cloning + branching [https://github.mskcc.org/datadojo/git_resources/tree/master/training/2022%2003%20Epi-Biostats%20Git%20Trainings](https://github.mskcc.org/datadojo/git_resources/tree/master/training/2022%2003%20Epi-Biostats%20Git%20Trainings) ] --- ## Install Package Dependencies .xlarge[ - Each package has dependencies: packages that another package relies upon. - Before you can contribute to a package, you must install that package _and_ all it's dependencies. - There are different kinds of dependencies, and the most common are `Imports` (packages required to install the package) and `Suggests` (packages required to build the package, including packages used in documentation, vignettes, and unit testing). - Before you can contribute to a package, you need to install _all_ of its dependencies. - Open the package R project in RStudio, and run `renv::install()`; this will install all the dependencies. ] --- ## Write a Function 1. If you're adding a new function to a package, the code for the function will generally live in its own `.R` file by the same name. Run `usethis::use_r("foo")` and a blank script file will be added to the R folder. ```r usethis::use_r("foo") #> √ Setting active project to 'C:/Users/SjobergD/GitHub/contributing-to-an-R-pkg' #> √ Creating 'R/' #> * Modify 'R/foo.R' #> * Call `use_test()` to create a matching test file ``` 1. Write your function! ```r foo <- function(x) { # check input is numeric stopifnot(is.numeric(x)) # return the mean mean(x) } ``` 1. When writing function that utilize {tidyverse} functions, be sure to use the `.data` and `.env` prefixes. Review these slides for details [https://mskcc-epi-bio.github.io/Writing-Function-with-the-tidyverse/](https://mskcc-epi-bio.github.io/Writing-Function-with-the-tidyverse/) --- ## Document Your Function .pull-left[ - After you've written your function, you need to document each of the arguments, include examples, and any other pertinent information. - Use the {roxygen2} comments package to document your function. - The {roxygen2} comments begin with `#'` and use tags like `@param`, `@export`, and `@examples` to generate the help file code. ![](Images/roxygen-skeleton.png) ] -- .pull-right[ ```r #' Title #' #' @param x #' #' @return #' @export #' #' @examples foo <- function(x) { # check input is numeric stopifnot(is.numeric(x)) # return the mean mean(x) } ``` ] --- ## Document Your Function .pull-left[ ```r #' Calculate the Mean #' #' @param x a numeric vector #' #' @return numeric scaler #' @export #' #' @examples #' foo(mtcars$mpg) foo <- function(x) { # check input is numeric stopifnot(is.numeric(x)) # return the mean mean(x) } ``` ] .pull-right[ * Each time you update the roxygen comments, you must re-document the package (Ctrl+Shift+D in RStudio) ![](Images/document.png) ] --- ## Building Pkg with Your Function * After you've written a function (or half of a function) and documented it, you'll want to do some ad-hoc testing. * Build + Install the package by clicking the "Build and Restart" button in the "Build" tab. ![](Images/install_restart.png) * You can also load the package (including exported and non-exported objects) with `devtools::load_all()` or Ctrl+Shift+L in RStudio --- ## Make it CUTE! * Most open-source projects follow a style guidelines. * All the Epi/Biostat R packages follow the tidyverse/google style guide. * Even if you're not familiar with the styles in the guide, you can easily conform your code using the {styler} package. ![](Images/styler.png) --- ## Add Unit Tests .xlarge[ * Testing is a vital part of package development. * It ensures that your code does what you want it to do. * Testing adds an additional step to your development workflow. * We will use the {testthat} package for all our unit testing. ] --- ## Add Unit Tests * Run `usethis::use_test("foo")` to add a unit testing file ```r usethis::use_test("foo") #> √ Creating 'tests/testthat/' #> √ Writing 'tests/testthat.R' #> √ Writing 'tests/testthat/test-foo.R' #> * Modify 'tests/testthat/test-foo.R' ``` * The {testthat} package has MANY functions for testing your function. Here are the ones I use most often: - `expect_error()` can test for the presence (or lack) of an error - `expect_message()` tests whether the function prints a message (can also test the text of the message) - `expect_equal()` test your function's return result equals a specified value - `expect_true()` checks expression evaluates to `TRUE` * Each test should have an informative name and cover a single unit of functionality. The idea is that when a test fails, you’ll know what’s wrong and where in your code to look for the problem. --- ## Code Coverage After writing your unit tests, calculate the coverage for your new function. ```r withr::with_envvar(new = c("NOT_CRAN" = "true"), covr::report()) ``` Aim for 100% coverage of any addition you've made, and add unit tests as needed until you're there! --- ## R CMD Checks ![](Images/check.png) <br> -- What is checked? Here's a very abbreviated list .pull-left[ * package structure - hidden files/folders - portable file names - executable files - package subdirectories - left-over files * DESCRIPTION/NAMESPACE file - package dependencies - files exist - NAMESPACE parses properly ] .pull-right[ * R code - non-ASCII characters - syntax errors - dependencies in R code - S3 generic/method consistency * documentation - Rd/help files - Rd file metadata - examples - undocumented function arguments ] --- ## Checks Review results and check there are zero errors, warnings, and notes. ![](Images/r-cmd-check.png) <br>A common note is about **undefined global variables**. This most often occurs when using {dplyr} verbs without the `.data` prefix. .pull-left[ ```r # bad syntax mtcars |> mutate(mpg10 = mpg * 10) ``` ] .pull-right[ ```r # good syntax mtcars |> mutate(mpg10 = .data$mpg * 10) ``` ] ![](Images/global-variable.png) --- ## Spell Check .pull-left[ One last check...the spell check! ```r usethis::use_spell_check() ``` ] .pull-right[ <img src = "Images/all-done.gif" alt = "animated" width = "5%"> ] --- ## Submit Pull Request You're almost there! Time to submit your change for acceptance into the main package repository. When you submit a Pull Request - The R CMD Checks will be run on Linux, Windows, and macOS - The R CMD Checks will be run on multiple versions of R Click the Create Pull Request button in GitHub Desktop ![](Images/windows-create-pull-request.png) .footnote[These additional checks may not be run on Enterprise GitHub repositories.] --- ## Pull Request Checklist Most package repositories for the Epi/Biostat Dept have a pull request checklist. Be sure to review each item of the checklist before asking the package maintainer to review your pull request. Example from {gtsummary} - [ ] Ensure all package dependencies are installed by running `renv::install()` - [ ] PR branch has pulled the most recent updates from master branch. Ensure the pull request branch and your local version match and both have the latest updates from the master branch. - [ ] If an update was made to `tbl_summary()`, was the same change implemented for `tbl_svysummary()`? - [ ] If a new function was added, function included in `_pkgdown.yml` - [ ] If a bug was fixed, a unit test was added for the bug check - [ ] Run `pkgdown::build_site()`. Check the R console for errors, and review the rendered website. - [ ] Code coverage is suitable for any new functions/features. Review coverage with `withr::with_envvar(new = c("NOT_CRAN" = "true"), covr::report())`. Begin in a fresh R session without any packages loaded. - [ ] R CMD Check runs without errors, warnings, and notes - [ ] `usethis::use_spell_check()` runs with no spelling errors in documentation --- ## Let's Practice .pull-left[.xxlarge[ We'll now walk through an example by submitting a pull request to the {tidycmprsk} package. [https://github.com/MSKCC-Epi-Bio/tidycmprsk](https://github.com/MSKCC-Epi-Bio/tidycmprsk) Any questions before we begin? ]] .pull-right[ ![](Images/octocat-teacher.png) ]