Appendix D — Frequently asked questions

D.1 How can I improve the speed of sceptre (and ondisc)?

Create the file Makevars inside of your ~/.R directory (if it does not yet exist):

Terminal
touch ~/.R/Makevars

Add the following line to your Makevars file:

~/.R/Makevars
CXXFLAGS = -O3 -Wall

This line tells the C++ compiler to use aggressive optimization in compiling C++ code contained within an R package. Finally, reinstall sceptre (and ondisc) from source.

R
devtools::install_github("katsevich-lab/sceptre")
devtools::install_github("timothy-barry/ondisc")

D.2 I am trying to install sceptre, but I am getting an error. What should I do?

The first thing to do is to remove any previous installations of sceptre and then try again. First, determine the directory on your computer in which R packages are stored by executing .libPaths() in the R console.

[1] "/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library"

Next, within your terminal, cd to this directory and then execute rm -rf sceptre.

Terminal
cd /Library/Frameworks/R.framework/Versions/4.2/Resources/library # change me!
rm -rf sceptre

Now, try to install sceptre again.

D.3 How can I install a previous version of sceptre?

See the instructions here.

D.4 sceptre crashed, or sceptre is using an unexpectedly large amount of memory or taking an unexpectedly long amount of time to run. What should I do?

The most likely explanation is that something is going wrong with the parallel functionality of the package. Consider setting parallel = FALSE and then trying again. If the situation improves, consider setting parallel = TRUE and setting n_processors to a small number, such as 2 or 3. Note that the parallel = TRUE option fails on clusters and Windows machines; to run sceptre in parallel on one of these platforms, use the Nextflow pipeline.

D.5 My negative control p-values are miscalibrated. What should I do?

Not to worry! First of all, the extent of the miscalibration may be mild enough not to cause significant issues with your analysis; see Table 5.1 for assessing the severity of your miscalibration. If the miscalibration is moderate to severe, then see Section 5.4 for several suggestions to improve calibration.

D.6 How many negative control gRNAs do I need?

The table below summarizes the minimum number of negative control gRNAs required to run an analysis as a function of the control group (either NT cells or complement set) and analysis type (either calibration check, discovery analysis, or power check). Recall that the default control group for low-MOI screens is the set of NT cells, while the default control group for high-MOI screens is the complement set.

Minimum number of negative control gRNAs required to run an analysis as a function of the control group (vertical axis) and analysis type (horizontal axis).
Calibration check Discovery analysis or power check
NT cells 2 1
Complement set 0 0

In general having more negative control gRNAs is better. We recommend including at least ten to fifteen negative control gRNAs in the assay for the best chance of obtaining high-quality results.

D.7 How should I run multiple sceptre analyses?

Sometimes users wish to carry out multiple sceptre analyses on a single dataset, such as

  • running cis and trans analyses;

  • running analyses with gRNAs grouped based on target and with singleton gRNAs.

Here we demonstrate how to do this efficiently. We begin by loading the sceptre package.

D.7.1 Running cis and trans analyses

Some users may wish to run both a cis analysis and a trans analysis. We recommend that such users apply the sceptre pipeline twice: once to analyze the cis pairs and once to analyze the trans pairs. We illustrate this approach on the high-MOI CRISPRi data; we begin by creating a sceptre_object to represent these data.

sceptre_object <- import_data(
  response_matrix = highmoi_example_data$response_matrix,
  grna_matrix = highmoi_example_data$grna_matrix,
  grna_target_data_frame = grna_target_data_frame_highmoi,
  moi = "high",
  extra_covariates = highmoi_example_data$extra_covariates,
  response_names = highmoi_example_data$gene_names
)

First, we carry out an analysis of the cis pairs, storing the results in the directory "~/sceptre_results_cis".

# positive control pairs
positive_control_pairs <- construct_positive_control_pairs(sceptre_object)

# cis pairs
discovery_pairs_cis <- construct_cis_pairs(
  sceptre_object = sceptre_object,
  positive_control_pairs = positive_control_pairs,
  distance_threshold = 5e6
)
# run cis analysis
sceptre_object <- sceptre_object |>
  set_analysis_parameters(
    discovery_pairs = discovery_pairs_cis,
    positive_control_pairs = positive_control_pairs,
    side = "left"
  ) |>
  assign_grnas(parallel = TRUE) |>
  run_qc(p_mito_threshold = 0.075) |>
  run_calibration_check(parallel = TRUE) |>
  run_power_check(parallel = TRUE) |>
  run_discovery_analysis(parallel = TRUE)

# write outputs
write_outputs_to_directory(
  sceptre_object = sceptre_object,
  directory = "~/sceptre_results_cis"
)

Next, we carry out an analysis of the trans pairs, storing the results in "~/sceptre_results_trans".

# trans pairs
discovery_pairs_trans <- construct_trans_pairs(
  sceptre_object = sceptre_object,
  positive_control_pairs = positive_control_pairs,
  exclude_positive_control_targets = TRUE
)
# run trans analysis
sceptre_object <- sceptre_object |>
  set_analysis_parameters(
    discovery_pairs = discovery_pairs_trans,
    positive_control_pairs = positive_control_pairs,
    side = "both"
  ) |>
  assign_grnas(parallel = TRUE) |>
  run_qc(p_mito_threshold = 0.075) |>
  run_calibration_check(parallel = TRUE) |>
  run_power_check(parallel = TRUE) |>
  run_discovery_analysis(parallel = TRUE)

# write outputs
write_outputs_to_directory(
  sceptre_object = sceptre_object,
  directory = "~/sceptre_results_trans"
)

Note that we update the sceptre_object with the output of the cis analysis. We then use the same sceptre_object to carry out the trans analysis. Under the hood sceptre stores (or “caches”) intermediate computations carried out as part of the cis analysis inside the sceptre_object. These intermediate computations are then recycled to carry out the trans analysis, thereby reducing compute.

D.7.2 Running singleton and grouped analyses

Another common analysis paradigm is to run both a grouped analysis (in which gRNAs targeting the same site are integrated or “combined”) and a singleton analysis (in which gRNAs targeting the same site are analyzed individually). We carried out a grouped analysis of the cis pairs on the high-MOI CRISPRi data above. Below, we conduct a singleton analysis on the same set of pairs, storing the result in "~/sceptre_results_cis_singleton". We operate on the same sceptre_object so as to exploit caching.

# singleton cis analysis
sceptre_object <- sceptre_object |>
  set_analysis_parameters(
    discovery_pairs = discovery_pairs_cis,
    positive_control_pairs = positive_control_pairs,
    side = "left",
    grna_integration_strategy = "singleton"
  ) |>
  assign_grnas(parallel = TRUE) |>
  run_qc(p_mito_threshold = 0.075) |>
  run_calibration_check(parallel = TRUE) |>
  run_power_check(parallel = TRUE) |>
  run_discovery_analysis(parallel = TRUE)

# write outputs
write_outputs_to_directory(
  sceptre_object = sceptre_object,
  directory = "~/sceptre_results_cis_singleton"
)

In summary, to carry out multiple analyses on the same dataset, users should apply the sceptre pipeline multiple times, reusing the underlying sceptre_object to exploit caching and reduce compute.