Rank Scatter Plot — PlotRank • UtilsR

Ranks features (genes, pathways, regulons, etc.) by their scores and highlights the top-ranked ones with coloured points and text labels. Designed for DNN model interpretation outputs (Integrated Gradients importance scores) but works with any named score data.

Usage

PlotRank(
  data,
  group_col = "cell_type",
  name_col = "gene",
  value_col = "importance",
  groups = NULL,
  group_levels = NULL,
  top_n = 5L,
  max_show = 200L,
  value_scale = c("none", "group", "top_n"),
  highlight_color = "#007D9B",
  base_color = "#BECEE3",
  label_size = 4,
  point_size = 3,
  title = NULL,
  ylab = "Importance",
  base_size = 12,
  ncol = 4L,
  clean_names = TRUE,
  return_type = c("plot", "data", "both"),
  filename = NULL,
  width = 12,
  height = 10,
  dpi = 300
)

Arguments

data

Input data in one of the following formats:

data.frame: Must contain the columns named by name_col and value_col. If the column named by group_col is also present, a faceted multi-panel plot is produced, with one panel per group.
Named numeric vector: Values are scores; names are feature labels. Produces a single-panel plot.
Matrix: Rows = features, columns = groups (e.g. cell types). Use groups to select specific columns.
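
As a rough illustration of the three input shapes described above, the sketch below builds the same toy scores as a long-format data.frame, a named numeric vector, and a features-by-groups matrix using base R only. The object names (df, vec, mat, long_from_mat) and the toy values are invented for this sketch; only the default column names come from the usage signature.

```r
# Illustrative only: object names and values are made up for this sketch.
set.seed(1)

# 1. Long-format data.frame using the default column names
df <- data.frame(
  cell_type  = rep(c("A", "B"), each = 3),
  gene       = rep(c("G1", "G2", "G3"), times = 2),
  importance = runif(6)
)

# 2. Named numeric vector: values are scores, names are feature labels
vec <- setNames(df$importance[df$cell_type == "A"], c("G1", "G2", "G3"))

# 3. Matrix with features in rows and groups in columns
mat <- matrix(df$importance, nrow = 3,
              dimnames = list(c("G1", "G2", "G3"), c("A", "B")))

# The matrix carries the same information as the long data.frame
long_from_mat <- data.frame(
  cell_type  = rep(colnames(mat), each = nrow(mat)),
  gene       = rep(rownames(mat), times = ncol(mat)),
  importance = as.vector(mat)
)
```

Each object could then be passed as data, for example PlotRank(df), PlotRank(vec), or PlotRank(mat, groups = "A").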

group_col

Column name for the grouping variable (e.g. cell type). Default "cell_type". Ignored for vector / matrix input.

name_col

Column name for the feature names. Default "gene". Ignored for vector / matrix input.

value_col

Column name for the numeric scores. Default "importance". Ignored for vector / matrix input.

groups

Character vector of groups to plot (for matrix or data.frame input). NULL (default) = all groups.

group_levels

Character vector specifying the display order of the groups. NULL = factor levels or order of appearance in the data.

top_n

Integer. Number of top-ranked features to highlight per group. Default 5.

max_show

Integer. Maximum features to display per panel. Default 200.

value_scale

Per-group score scaling strategy. Three options:

"none" (default): No scaling; raw scores are plotted as-is.
"group": Scale each group independently to [0, 1] using the group's full value range from the original data (before max_show truncation). Useful when absolute score magnitudes differ greatly across groups.
"top_n": Scale each group to [0, 1] using only the displayed (post-max_show) values. Stretches the visible range within each panel to maximise visual separation.

highlight_color

Colour for top-ranked points. Default "#007D9B".

base_color

Colour for remaining points. Default "#BECEE3".

label_size

Numeric. Text label size. Default 4.

point_size

Numeric. Point size. Default 3.

title

Character. Plot title. NULL = a title is generated automatically.

ylab

Character. Y-axis label. Default "Importance".

base_size

Numeric. Base font size. Default 12.

ncol

Integer. Number of columns in faceted layout. Default 4.

clean_names

Logical. Strip common prefixes (HALLMARK_, KEGG_, etc.) and replace underscores with spaces. Default TRUE.
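
The kind of label cleaning described above might look like the following base R sketch. clean_label is a hypothetical helper, and its prefix list covers only the two prefixes named in this documentation; the full set of prefixes PlotRank strips is not specified here.

```r
# Hypothetical helper; the prefix list is an assumption for this sketch
clean_label <- function(x, prefixes = c("HALLMARK_", "KEGG_")) {
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
  x <- sub(pattern, "", x)          # drop a leading collection prefix, if any
  gsub("_", " ", x, fixed = TRUE)   # replace remaining underscores with spaces
}

labels <- clean_label(c("HALLMARK_TNFA_SIGNALING_VIA_NFKB", "KEGG_CELL_CYCLE", "TP53"))
# labels is c("TNFA SIGNALING VIA NFKB", "CELL CYCLE", "TP53")
```

Setting clean_names = FALSE would keep identifiers such as gene symbols containing underscores unchanged.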

return_type

What to return: "plot" (default), "data" (the ranked data.frame), or "both".

filename

Output file path. NULL = the plot is not saved to disk.

width

Output width in inches. Default 12.

height

Output height in inches. Default 10.

dpi

Output resolution in dots per inch (DPI). Default 300.

Value

Depends on return_type:

"plot": A ggplot object (default).
"data": A data.frame with columns: Group, Rank, Score, Label, IsTop.
"both": A list with elements plot and data.

Examples

if (FALSE) { # \dontrun{
library(ToyData)
data(Toy_gene_importance)

# Faceted rank scatter (all cell types)
PlotRank(Toy_gene_importance, top_n = 10, ncol = 4, ylab = "Gene Importance (IG)")

# Single cell type
PlotRank(Toy_gene_importance, groups = "Parathyroid cells", top_n = 15)

# With value scaling
PlotRank(Toy_gene_importance, top_n = 10, value_scale = "group")

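# Pathway scores matrix (`pred` is not defined in this example; it is assumed to be
# an existing result object whose pathway_scores element is a features x groups matrix)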
PlotRank(pred$pathway_scores, top_n = 10, ylab = "Pathway Score")

# Named numeric vector
scores <- setNames(runif(100), paste0("Gene", 1:100))
PlotRank(scores, top_n = 10)
} # }