write_example_cellranger_dataset() simulates an example dataset and writes it to disk in Cell Ranger format. The dataset can be unimodal or multimodal, containing any subset of the gene expression, gRNA expression, and protein expression modalities.

Usage

write_example_cellranger_dataset(
  n_features,
  n_cells,
  n_batch,
  modalities,
  directory_to_write,
  p_zero = NULL,
  p_set_col_zero = NULL,
  p_set_row_zero = NULL
)

Arguments

n_features

an integer vector specifying the number of features to simulate for each modality, with one entry per element of modalities.

n_cells

an integer specifying the number of cells to simulate.

n_batch

an integer specifying the number of batches to simulate. Cells from different batches are written to different directories.

modalities

a character vector indicating the modalities to simulate. The vector should contain some subset of the strings "gene", "grna", and "protein".

directory_to_write

the directory in which to write the Cell Ranger feature barcode files.

p_zero

(optional) the fraction of matrix entries to set to zero at random. If NULL (the default), the fraction is chosen randomly.

p_set_col_zero

(optional) the fraction of columns to set entirely to zero at random. If NULL (the default), the fraction is chosen randomly.

p_set_row_zero

(optional) the fraction of rows to set entirely to zero at random. If NULL (the default), the fraction is chosen randomly.
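
To picture what the three sparsity arguments control, here is a minimal base-R sketch that zeroes out a fraction of the entries, columns, and rows of a small count matrix. It is an illustration only, not the package's implementation: the Poisson counts, the rounding, and the order of the three steps are assumptions made for the example.

```r
set.seed(1)

# Toy count matrix: 20 features (rows) by 10 cells (columns), all entries >= 1
m <- matrix(rpois(20 * 10, lambda = 5) + 1L, nrow = 20, ncol = 10)

p_zero <- 0.3          # fraction of individual entries to zero out
p_set_col_zero <- 0.2  # fraction of columns (cells) to zero out entirely
p_set_row_zero <- 0.1  # fraction of rows (features) to zero out entirely

# Zero out randomly chosen entries
m[sample(length(m), round(p_zero * length(m)))] <- 0L
# Zero out randomly chosen columns
m[, sample(ncol(m), round(p_set_col_zero * ncol(m)))] <- 0L
# Zero out randomly chosen rows
m[sample(nrow(m), round(p_set_row_zero * nrow(m))), ] <- 0L

mean(m == 0)            # overall fraction of zeros
sum(colSums(m) == 0)    # number of all-zero columns
sum(rowSums(m) == 0)    # number of all-zero rows
```

Setting p_set_col_zero to 0, as in the example further down, corresponds to skipping the column step, so no cell is forced to have all-zero counts.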

Value

a list containing (i) a list of the simulated expression matrices, (ii) a character vector containing the names of the genes (or NULL if the gene modality was not simulated), and (iii) a character vector specifying the batch of each cell.

Examples

set.seed(4)
n_features <- c(1000, 40, 400)
modalities <- c("gene", "protein", "grna")
n_cells <- 10000
n_batch <- 2
directory_to_write <- tempdir()
p_set_col_zero <- 0
out <- write_example_cellranger_dataset(
  n_features = n_features,
  n_cells = n_cells,
  n_batch = n_batch,
  modalities = modalities,
  directory_to_write = directory_to_write,
  p_set_col_zero = p_set_col_zero
)

# directories written to directory_to_write
# note: pattern is a regular expression, not a glob
fs <- list.files(directory_to_write, pattern = "batch", full.names = TRUE)
# files contained within the directories
list.files(fs[1])
#> [1] "barcodes.tsv.gz" "features.tsv.gz" "matrix.mtx.gz"  
list.files(fs[2])
#> [1] "barcodes.tsv.gz" "features.tsv.gz" "matrix.mtx.gz"
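
Each batch directory holds the three files of a Cell Ranger feature-barcode matrix, so it can be read back with standard tools. The sketch below is one possible way to do this with the Matrix package; read_batch_dir() and write_gz() are hypothetical helpers, not functions of this package, and the tiny directory is written by hand (with made-up feature IDs and barcodes) so that the example runs on its own. It assumes the usual Cell Ranger layout: a MatrixMarket file with features as rows and cells as columns, a three-column features file, and one barcode per line.

```r
library(Matrix)  # recommended package shipped with R; provides readMM()

# Hypothetical helper: read one directory containing matrix.mtx.gz,
# features.tsv.gz, and barcodes.tsv.gz into a sparse matrix.
read_batch_dir <- function(dir) {
  mat <- readMM(file.path(dir, "matrix.mtx.gz"))
  features <- read.delim(file.path(dir, "features.tsv.gz"), header = FALSE,
                         col.names = c("id", "name", "type"),
                         stringsAsFactors = FALSE)
  barcodes <- readLines(file.path(dir, "barcodes.tsv.gz"))
  dimnames(mat) <- list(features$id, barcodes)
  list(matrix = mat, features = features)
}

# Hypothetical helper: write lines of text to a gzip-compressed file.
write_gz <- function(lines, path) {
  con <- gzfile(path, "w")
  on.exit(close(con))
  writeLines(lines, con)
}

# Hand-written toy directory: 3 features, 2 cells, 3 nonzero counts.
demo_dir <- file.path(tempdir(), "demo_batch")
dir.create(demo_dir, showWarnings = FALSE)
write_gz(c("%%MatrixMarket matrix coordinate integer general",
           "3 2 3",
           "1 1 5", "3 1 2", "2 2 7"),
         file.path(demo_dir, "matrix.mtx.gz"))
write_gz(c("feat_1\tgene_1\tGene Expression",
           "feat_2\tgene_2\tGene Expression",
           "feat_3\tgrna_1\tCRISPR Guide Capture"),
         file.path(demo_dir, "features.tsv.gz"))
write_gz(c("AAAC-1", "AAAG-1"), file.path(demo_dir, "barcodes.tsv.gz"))

batch <- read_batch_dir(demo_dir)
dim(batch$matrix)            # 3 features by 2 cells
table(batch$features$type)   # number of features per modality
```

Under the same assumptions, read_batch_dir(fs[1]) would be the analogous call on the first directory written in the example above.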