Subset a Binary Matrix By Alteration Frequency Threshold

Usage

subset_by_frequency(gene_binary, t = 0.1, other_vars = NULL)

Arguments

gene_binary

A data frame with one row per sample and one column per alteration. It must contain a sample_id column, and every alteration column must hold only the values 0, 1, or NA.

t

Alteration frequency threshold, between 0 and 1, used to subset the alteration columns. Default is 0.1 (10%).

other_vars

One or more column names (quoted or unquoted) in gene_binary to be retained in the resulting data frame. Default is NULL.

Value

A data frame with a sample_id column and a column for each alteration whose frequency is over the given prevalence threshold t.

Examples

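library(gnomeR)

# Build a binary alteration matrix for every sample in the example mutation data
samples <- unique(gnomeR::mutations$sampleId)
gene_binary <- create_gene_binary(
  samples = samples, mutation = mutations, cna = cna,
  mut_type = "somatic_only",
  include_silent = FALSE,
  specify_panel = "impact"
)

# Keep only alterations over the default threshold (t = 0.1)
gene_binary %>%
  subset_by_frequency()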
#> # A tibble: 200 × 7
#>    sample_id          TP53 FOXA1 AR.Amp  SPOP PTEN.Del KMT2C
#>    <chr>             <dbl> <dbl>  <dbl> <dbl>    <dbl> <dbl>
#>  1 P-0001128-T01-IM3     1     0      0     0        0     0
#>  2 P-0001859-T01-IM3     0     0      1     0        0     0
#>  3 P-0001895-T01-IM3     0     0      0     1        0     0
#>  4 P-0001845-T01-IM3     0     0      0     0        0     1
#>  5 P-0001768-T01-IM3     1     0      0     0        0     0
#>  6 P-0002984-T01-IM3     1     0      1     0        0     0
#>  7 P-0000964-T02-IM3     0     0      0     0        0     0
#>  8 P-0000964-T01-IM3     0     0      0     0        0     0
#>  9 P-0000610-T01-IM3     1     0      0     0        1     0
#> 10 P-0001247-T01-IM3     1     0      1     0        0     0
#> # ℹ 190 more rows
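
The thresholding idea can also be illustrated on a small toy data frame in base R, without the package data. This is only a sketch and not the package's implementation: the helper toy_subset_by_frequency() and the GENE_A, GENE_B and GENE_C columns are made up for illustration. The sketch assumes that NA values are excluded from the denominator and that a column is kept when its frequency is greater than or equal to t. The actual function may differ on both points.

```r
# Toy binary matrix: 4 samples, 3 alterations (all names are made up).
toy <- data.frame(
  sample_id = c("s1", "s2", "s3", "s4"),
  GENE_A    = c(1, 1, 0, NA),  # altered in 2 of 3 non-missing samples
  GENE_B    = c(0, 0, 0, 1),   # altered in 1 of 4 samples
  GENE_C    = c(0, 0, 0, 0)    # never altered
)

# Hypothetical helper, NOT part of gnomeR.
# Assumptions: NA is dropped from the denominator, and the comparison
# against the threshold is inclusive (frequency >= t).
toy_subset_by_frequency <- function(x, t = 0.1) {
  alt_cols <- setdiff(names(x), "sample_id")
  # The mean of a 0/1 column, ignoring NA, is its alteration frequency.
  freq <- vapply(x[alt_cols], function(col) mean(col, na.rm = TRUE), numeric(1))
  keep <- names(freq)[freq >= t]
  x[, c("sample_id", keep), drop = FALSE]
}

result <- toy_subset_by_frequency(toy, t = 0.5)
names(result)
```

With t = 0.5 only GENE_A (frequency 2/3) is retained alongside sample_id. Lowering t to 0.25 would also retain GENE_B under the inclusive comparison assumed here.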