hexDimRed
Plots a hexbin version of a dimensional reduction representing values from a series of columns of colData.
This is exactly what schex (https://github.com/SaskiaFreytag/schex) does, but with added graphical features afforded by ggplot.multistats (https://github.com/flying-sheep/ggplot.multistats) and a slightly more flexible approach.
- for numeric variables, the `fill` aesthetic is given by a function that calculates a summary statistic (e.g. mean, median, var) per bin. The function is specified by `stat_fun_numeric` and is applied to each bin separately.
- for categorical variables, the `fill` aesthetic can be given either by the most represented factor level in each bin (`stat_fun_categorical` set to `"maj"`) or by the proportion of a single reference level against all others (`"prop"`). The proportion is obtained by setting all reference-level observations to 1 and all others to 0, then calculating the mean. The reference level is chosen as the most abundant one in the column. In the future I may implement a choice for the reference level, although it could be tedious for the user, who would have to set the level independently for each column of `colData`.
- the `alpha` aesthetic is set independently by `alpha_by` and can be the number (length) of observations, or any other operation performed on each bin.
- gene expression values can be plotted by specifying the `assay`
- the user can supply custom color scales
The output is a list of ggplot objects.
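The per-bin summaries described above can be sketched in base R. This is only an illustration of the arithmetic, not the package's internal code: the bin ids, the toy values, and the helper name `majority` are all made up for the example.

```r
# Toy data: six cells assigned to two bins (bin ids are hypothetical).
bin     <- c("a", "a", "a", "b", "b", "b")
n_umi   <- c(100, 200, 300, 10, 20, 60)
doublet <- factor(c("singlet", "singlet", "doublet",
                    "singlet", "singlet", "singlet"))

# Numeric variable: one summary statistic per bin (here the mean).
fill_numeric <- tapply(n_umi, bin, mean)

# Categorical, "maj": the most represented level in each bin.
# `majority` is a hypothetical helper, not part of hexDimRed.
majority <- function(x) names(which.max(table(x)))
fill_maj <- tapply(as.character(doublet), bin, majority)

# Categorical, "prop": take the most abundant level overall as reference,
# recode reference as 1 and everything else as 0, then average per bin.
ref_level <- names(which.max(table(doublet)))
fill_prop <- tapply(as.integer(doublet == ref_level), bin, mean)

# Alpha: number of observations per bin.
alpha_n <- tapply(n_umi, bin, length)
```

Because the recoded values are 0 or 1, their mean in a bin is exactly the fraction of cells in that bin belonging to the reference level.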
Usage:
This is the traditional PCA plot as delivered by scater:
p1 <- plotReducedDim(sce, dimred = "PCA",
colour_by = "individual"
) + labs(title = "PCA - by individual")
p2 <- plotReducedDim(sce, dimred = "PCA",
colour_by = "n_UMI"
) + labs(title = "PCA - by total UMI")
p3 <- plotReducedDim(sce, dimred = "PCA",
colour_by = "subset_mito_pct"
) + labs(title = "PCA - by % mitochondrial expression")
p4 <- plotReducedDim(sce, dimred = "PCA",
colour_by = "scDblFinder.class"
) + labs(title = "PCA - by doublet class")
gridExtra::grid.arrange(grobs = list(p1, p2), ncol = 2)
gridExtra::grid.arrange(grobs = list(p3, p4), ncol = 2)
It is hard to make out the relative contribution of each level because of how dots are superimposed.
Using hexDimRed:
sce_hexdim <- hexDimRed(sce,
stat_fun_categorical = "prop",
other_fields = c("individual",
"subset_mito_pct",
"scDblFinder.class"
)
)
sce_hexdim2 <- hexDimRed(sce,
stat_fun_categorical = "maj",
other_fields = c("individual",
"scDblFinder.class",
"batch"
)
)
gridExtra::grid.arrange(grobs = sce_hexdim[c(1,4)], ncol = 2)
gridExtra::grid.arrange(grobs = c(sce_hexdim[4], sce_hexdim2[2]), ncol = 2)
gridExtra::grid.arrange(grobs = sce_hexdim2[c(3,4)], ncol = 2)
As of June 17 2020, there are some additional functionalities:
- if there is no `n_UMI` column in the object, a simple density of cells in the bin will be plotted
- support for user supplied scales, both continuous (`custom_cont_cols`) and discrete/categorical (`custom_cat_cols`). Importantly, for the categorical variables, the color vector order will be matched with the factor order specified in `other_fields` using `stat_fun_categorical = "maj"`.
- gene expression can be plotted via `plot_genes`, supplying the expression assay (default is `lognormcounts`)
- labels can be added with 2 styles: `sticker` (rounded box with solid background) or `text` (text only with no box)
- dark theme enabled
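Since the color vector is matched to the factor order, it helps to build it in the same order as the factor levels. The sketch below assumes that `custom_cat_cols` accepts a plain character vector of colors. The cluster labels, the hex codes, and the `level_colors` name are illustrative only and do not come from the package.

```r
# Hypothetical cluster labels; a real object would use its own colData column.
clustering_result <- factor(c("B", "T", "NK", "T", "B"),
                            levels = c("T", "B", "NK"))

# One color per factor level, listed in the same order as levels().
custom.colors <- c("#1b9e77", "#d95f02", "#7570b3")

# Pairing by position makes the intended mapping explicit:
# first color -> first level, second color -> second level, and so on.
level_colors <- setNames(custom.colors, levels(clustering_result))
```

Printing `level_colors` before plotting is a quick way to confirm which color each level will receive.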
sce_hexdim3 <- hexDimRed(sce,
main_field = NULL,
other_fields = "clustering_result",
stat_fun_categorical = "maj",
dimred = "UMAP",
nbins = 60,
labels = "sticker",
custom_cat_cols = custom.colors,
plot_genes = "ENSG00000198963")
sce_hexdim3[[1]]
sce_hexdim3[[2]]
sce_hexdim3[[3]]
Dark mode:
sce_hexdim4 <- hexDimRed(sce,
main_field = NULL,
other_fields = "clustering_result",
stat_fun_categorical = "maj",
dimred = "UMAP",
nbins = 60,
labels = "sticker",
custom_cat_cols = custom.colors,
dark = TRUE)
sce_hexdim4[[2]]