Write example Cell Ranger dataset
Source: R/data_creation_functs.R
write_example_cellranger_dataset() creates an example dataset and writes it to disk in Cell Ranger format. The dataset can be unimodal or multimodal, containing any subset of the gene expression, gRNA expression, and protein expression modalities.
Usage
write_example_cellranger_dataset(
  n_features,
  n_cells,
  n_batch,
  modalities,
  directory_to_write,
  p_zero = NULL,
  p_set_col_zero = NULL,
  p_set_row_zero = NULL
)
Arguments
- n_features
an integer vector specifying the number of features to simulate for each modality.
- n_cells
an integer specifying the number of cells to simulate.
- n_batch
an integer specifying the number of batches to simulate. Cells from different batches are written to different directories.
- modalities
a character vector indicating the modalities to simulate. The vector should contain some subset of the strings "gene", "grna", and "protein".
- directory_to_write
the directory in which to write the Cell Ranger feature barcode files.
- p_zero
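(optional; default NULL, in which case the fraction is chosen at random) the fraction of matrix entries to set to zero at random.
- p_set_col_zero
(optional; default NULL, in which case the fraction is chosen at random) the fraction of columns to set to zero at random.
- p_set_row_zero
(optional; default NULL, in which case the fraction is chosen at random) the fraction of rows to set to zero at random.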
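The three zeroing arguments act at different granularities: individual entries, whole columns, and whole rows. The base R sketch below illustrates that distinction on a small dense matrix. It is only an illustration of what the fractions mean, not the package's implementation; the matrix `m` and the derived objects are hypothetical names introduced here.

```r
set.seed(1)

# A small 20 x 10 matrix of strictly positive counts (rows = features,
# columns = cells), so that every zero below is one we introduced.
m <- matrix(rpois(20 * 10, lambda = 5) + 1L, nrow = 20, ncol = 10)

# p_zero: zero out a fraction of individual entries, chosen at random.
p_zero <- 0.3
entry_idx <- sample(length(m), size = round(p_zero * length(m)))
m_entries <- m
m_entries[entry_idx] <- 0L

# p_set_col_zero: zero out a fraction of entire columns.
p_set_col_zero <- 0.2
col_idx <- sample(ncol(m), size = round(p_set_col_zero * ncol(m)))
m_cols <- m
m_cols[, col_idx] <- 0L

# p_set_row_zero: zero out a fraction of entire rows.
p_set_row_zero <- 0.1
row_idx <- sample(nrow(m), size = round(p_set_row_zero * nrow(m)))
m_rows <- m
m_rows[row_idx, ] <- 0L

# Count what was zeroed in each case.
n_zero_entries <- sum(m_entries == 0)
n_zero_cols <- sum(colSums(m_cols) == 0)
n_zero_rows <- sum(rowSums(m_rows) == 0)
```

Setting an argument to 0, as the example on this page does with p_set_col_zero, corresponds to skipping that kind of zeroing entirely.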
Value
a list containing (i) a list of the simulated expression matrices, (ii) a character vector containing the names of the genes (or NULL if the gene modality was not simulated), and (iii) a character vector specifying the batch of each cell.
Examples
set.seed(4)
n_features <- c(1000, 40, 400)
modalities <- c("gene", "protein", "grna")
n_cells <- 10000
n_batch <- 2
directory_to_write <- tempdir()
p_set_col_zero <- 0
out <- write_example_cellranger_dataset(
  n_features = n_features,
  n_cells = n_cells,
  n_batch = n_batch,
  modalities = modalities,
  directory_to_write = directory_to_write,
  p_set_col_zero = p_set_col_zero
)
# directories written to directory_to_write
# note: `pattern` is a regular expression, not a glob, so no trailing "*" is needed
fs <- list.files(directory_to_write, pattern = "batch", full.names = TRUE)
# files contained within the directories
list.files(fs[1])
#> [1] "barcodes.tsv.gz" "features.tsv.gz" "matrix.mtx.gz"
list.files(fs[2])
#> [1] "barcodes.tsv.gz" "features.tsv.gz" "matrix.mtx.gz"
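Each directory holds the three files of a Cell Ranger feature barcode matrix. As a rough sketch of how such a directory could be read back into R, the snippet below defines a hypothetical helper, read_cellranger_dir() (not part of the package), and exercises it on a tiny mock directory built by hand so that the code runs on its own. It assumes the Matrix package is installed and that features.tsv.gz stores the feature ID in its first column.

```r
library(Matrix)

# Hypothetical helper (not part of the package): read one Cell Ranger
# feature barcode directory into a sparse feature-by-cell matrix.
read_cellranger_dir <- function(dir) {
  mat <- Matrix::readMM(file.path(dir, "matrix.mtx.gz"))
  features <- read.delim(file.path(dir, "features.tsv.gz"),
                         header = FALSE, stringsAsFactors = FALSE)
  barcodes <- readLines(file.path(dir, "barcodes.tsv.gz"))
  rownames(mat) <- features[[1]]  # assumed: first column holds feature IDs
  colnames(mat) <- barcodes
  mat
}

# Small utility to write text lines to a gzip-compressed file.
write_gz_lines <- function(lines, path) {
  con <- gzfile(path, open = "w")
  on.exit(close(con))
  writeLines(lines, con)
}

# Build a mock directory with 3 features and 2 cells (illustrative values).
mock_dir <- file.path(tempdir(), "mock_cellranger_dir")
dir.create(mock_dir, showWarnings = FALSE)

counts <- Matrix::sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2),
                               x = c(5, 2, 7), dims = c(3, 2))
mtx_tmp <- tempfile(fileext = ".mtx")
Matrix::writeMM(counts, file = mtx_tmp)
write_gz_lines(readLines(mtx_tmp), file.path(mock_dir, "matrix.mtx.gz"))
write_gz_lines(c("ENSG_A\tGENE_A\tGene Expression",
                 "ENSG_B\tGENE_B\tGene Expression",
                 "ENSG_C\tGENE_C\tGene Expression"),
               file.path(mock_dir, "features.tsv.gz"))
write_gz_lines(c("AAAC-1", "AAAG-1"), file.path(mock_dir, "barcodes.tsv.gz"))

# Read the mock directory back and inspect the result.
mat_back <- read_cellranger_dir(mock_dir)
dim(mat_back)
```

After running the example above, the same helper could be pointed at fs[1] or fs[2] to load the simulated batches, provided the files follow the layout assumed here.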