This document demonstrates how to use the VeloSim method; we hope you find it helpful.
Before simulating datasets, it is important to estimate some essential parameters from a real dataset so that the simulated data are more realistic.
library(simmethods)
library(SingleCellExperiment)
# Load data
ref_data <- simmethods::data
estimate_result <- simmethods::VeloSim_estimation(
  ref_data = ref_data,
  verbose = TRUE,  # TRUE rather than T, which can be overwritten by a variable
  seed = 10
)
# Estimating parameters using VeloSim
# Loading required package: amap
# Computing nearest neighbor graph
# Computing SNN
# Your data has 3 groups
Users can also supply the group information of the cells:
group <- as.numeric(simmethods::group_condition)
estimate_result <- simmethods::VeloSim_estimation(
  ref_data = ref_data,
  other_prior = list(group.condition = group),
  verbose = TRUE,
  seed = 10
)
# Estimating parameters using VeloSim
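The group vector above is numeric. As a minimal sketch, assuming `group.condition` expects one numeric label per cell in the same order as the columns of the reference matrix, character labels could be converted as follows. The names `toy_counts` and `cell_labels` are hypothetical stand-ins, not part of simmethods:

```r
# Toy reference matrix: 5 genes x 6 cells (hypothetical stand-in for ref_data)
set.seed(1)
toy_counts <- matrix(rpois(30, lambda = 5), nrow = 5,
                     dimnames = list(paste0("gene", 1:5), paste0("cell", 1:6)))

# Hypothetical character labels, one per cell, in column order
cell_labels <- c("A", "A", "B", "B", "C", "C")

# Convert the labels to numeric codes (1, 2, 3, ...) via a factor
group <- as.numeric(factor(cell_labels))

# Assumption: the vector must line up with the cells (columns)
stopifnot(length(group) == ncol(toy_counts))
table(group)
```

Going through a factor keeps the mapping from label to code explicit, and the length check catches a mismatch between labels and cells before estimation starts.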
After estimating parameters from a real dataset, we simulate datasets from the learned parameters under different scenarios.
The reference data contains 160 cells and 4000 genes. If we simulate with the default parameters, we obtain a new dataset of the same size as the reference data.
simulate_result <- simmethods::VeloSim_simulation(
  parameters = estimate_result[["estimate_result"]],
  other_prior = NULL,
  return_format = "SCE",
  seed = 111
)
# nCells: 160
# nGenes: 4000
SCE_result <- simulate_result[["simulate_result"]]
dim(SCE_result)
# [1] 4000 160
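With `return_format = "SCE"`, the result is a SingleCellExperiment object. The sketch below shows how a count matrix is usually pulled out of such an object with `counts()`. It builds a small toy object (`toy_sce`, a hypothetical name) and assumes the simulated object stores its data in an assay named `counts`, which the source does not state:

```r
library(SingleCellExperiment)

# Toy count matrix: 4 genes x 5 cells (hypothetical stand-in for simulated data)
set.seed(1)
toy_counts <- matrix(rpois(20, lambda = 3), nrow = 4,
                     dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5)))

# Wrap the matrix in a SingleCellExperiment, storing it as the "counts" assay
toy_sce <- SingleCellExperiment(assays = list(counts = toy_counts))

# Extract the gene-by-cell count matrix again
count_matrix <- counts(toy_sce)
dim(count_matrix)
# [1] 4 5
```

If the assay carries a different name, `assayNames(toy_sce)` lists the available assays.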
In VeloSim, we can set nCells and nGenes to specify the number of cells and genes.
Here, we simulate a new dataset with 200 cells and 2000 genes:
simulate_result <- simmethods::VeloSim_simulation(
  parameters = estimate_result[["estimate_result"]],
  return_format = "list",
  other_prior = list(nCells = 200,
                     nGenes = 2000),
  seed = 111
)
# nCells: 200
# nGenes: 2000
result <- simulate_result[["simulate_result"]][["count_data"]]
dim(result)
# [1] 2000 200
To infer a trajectory from the simulated data, make sure that the following R packages are installed:
if(!requireNamespace("dynwrap", quietly = TRUE)){install.packages("dynwrap")}
if(!requireNamespace("dyndimred", quietly = TRUE)){install.packages("dyndimred")}
if(!requireNamespace("dynplot", quietly = TRUE)){install.packages("dynplot")}
if(!requireNamespace("tislingshot", quietly = TRUE)){devtools::install_github("dynverse/ti_slingshot/package/")}
First, we wrap the data into a standard dynwrap object. The count matrix is transposed because dynwrap expects cells in rows and genes in columns:
dyn_object <- dynwrap::wrap_expression(counts = t(result),
                                       expression = log2(t(result) + 1))
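The transposition and log transform used for wrapping can be checked on a toy matrix before handing real data to dynwrap. This is a minimal base-R sketch with hypothetical names (`toy_counts`, `cells_by_genes`, `log_expr`); the dimnames check reflects our assumption that the wrapper needs cell and gene identifiers:

```r
# Toy count matrix: 3 genes x 4 cells (hypothetical stand-in for `result`)
set.seed(1)
toy_counts <- matrix(rpois(12, lambda = 4), nrow = 3,
                     dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Transpose so that cells are rows and genes are columns
cells_by_genes <- t(toy_counts)

# log2(x + 1) keeps zero counts at zero and compresses large counts
log_expr <- log2(cells_by_genes + 1)

# Assumption: both cell IDs (rows) and gene IDs (columns) must be present
stopifnot(!is.null(rownames(cells_by_genes)),
          !is.null(colnames(cells_by_genes)))
dim(cells_by_genes)
# [1] 4 3
```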
Next, we infer the trajectory using Slingshot, which is among the best-performing trajectory inference methods:
model <- dynwrap::infer_trajectory(dataset = dyn_object,
                                   method = tislingshot::ti_slingshot(),
                                   parameters = NULL,
                                   give_priors = NULL,
                                   seed = 111,
                                   verbose = TRUE)
# Executing 'slingshot' on '20230816_112827__data_wrapper__zj4D9ACXnS'
# With parameters: list(cluster_method = "pam", ndim = 20L, shrink = 1L, reweight = TRUE, reassign = TRUE, thresh = 0.001, maxit = 10L, stretch = 2L, smoother = "smooth.spline", shrink.method = "cosine")
# inputs: expression
# priors :
# Using full covariance matrix
Finally, we can plot the trajectory after performing dimensionality reduction:
dimred <- dyndimred::dimred_umap(dyn_object$expression)
dynplot::plot_dimred(model, dimred = dimred)
# Coloring by milestone
# Using milestone_percentages from trajectory
For more details about trajectory inference and visualization, please check dynverse.