scLAB Documentation

Getting Started

This guide helps you get up and running with single-cell RNA-seq (scRNA-seq) analysis. scLAB provides a complete workflow for analyzing single-cell data, with no coding required.

Loading Data

scLAB supports two types of scRNA-seq data files:

1. Cell Ranger Output h5 Files

Load raw count data from Cell Ranger output. You can also load multiple h5 files (one per sample), and scLAB will concatenate them into a single dataset.

2. Processed h5ad Files

Load already-processed AnnData files (h5ad format) for visualization and secondary analysis.

Preprocessing Pipeline

Informed by Scanpy, scLAB supports the following preprocessing workflow:

1. Quality Control (QC)

Quality control is divided into basic and advanced steps:

Basic quality control: Filter cells and genes using minimum genes per cell and minimum cells per gene thresholds:
- Minimum genes per cell: Remove cells with too few genes (default: 200)
- Minimum cells per gene: Remove genes expressed in too few cells (default: 3)
Advanced quality control: Calculate standard quality metrics:
- n_genes_by_counts: Number of genes with at least one count in each cell
- total_counts: Total counts (UMIs) per cell
- (Optional) pct_counts_mt: Percentage of mitochondrial gene counts per cell
- (Optional) rb_counts_mt: Percentage of ribosomal gene counts per cell
- (Optional) hb_counts_mt: Percentage of hemoglobin gene counts per cell
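
The two QC steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not scLAB's own code: the toy matrix, the function names basic_filter and qc_metrics, the "MT-" gene-name prefix for mitochondrial genes, and the order of operations (filter first, then compute metrics) are all hypothetical.

```python
# Toy count matrix (hypothetical data): rows are cells, columns are genes.
genes = ["MT-CO1", "ACTB", "GAPDH", "CD3E"]
counts = [
    [5, 10, 5, 0],  # cell_0
    [0, 3, 0, 0],   # cell_1: only one gene detected
    [1, 4, 4, 1],   # cell_2
    [2, 6, 0, 0],   # cell_3
]


def basic_filter(counts, genes, min_genes, min_cells):
    """Drop cells with too few detected genes, then genes seen in too few cells."""
    kept_cells = [row for row in counts if sum(c > 0 for c in row) >= min_genes]
    kept_idx = [
        j for j in range(len(genes))
        if sum(row[j] > 0 for row in kept_cells) >= min_cells
    ]
    filtered = [[row[j] for j in kept_idx] for row in kept_cells]
    return filtered, [genes[j] for j in kept_idx]


def qc_metrics(counts, genes, mt_prefix="MT-"):
    """Compute per-cell metrics; the mitochondrial prefix is an assumption."""
    metrics = []
    for row in counts:
        total = sum(row)
        mt = sum(c for c, g in zip(row, genes) if g.startswith(mt_prefix))
        metrics.append({
            "n_genes_by_counts": sum(c > 0 for c in row),
            "total_counts": total,
            "pct_counts_mt": 100.0 * mt / total if total else 0.0,
        })
    return metrics


# Small thresholds so the toy data survives; the documented defaults are 200 and 3.
filtered, kept_genes = basic_filter(counts, genes, min_genes=2, min_cells=2)
metrics = qc_metrics(filtered, kept_genes)
```

With these toy thresholds, cell_1 is removed because it expresses a single gene, and CD3E is removed because only one remaining cell expresses it.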

2. Normalization

Normalize gene expression to account for sequencing depth differences between cells. scLAB uses standard log-normalization: each cell's counts are scaled to a common target total and then log-transformed. The default target is 10,000 counts per cell. Alternatively, users can normalize to the median of total counts per cell.
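
As a rough sketch of what log-normalization does, the snippet below scales each cell to a target total and applies log(1 + x). The function name log_normalize and the toy counts are hypothetical, and using log1p as the log transform is an assumption about what "standard log-normalization" means here.

```python
import math
from statistics import median


def log_normalize(counts, target_sum=None):
    """Scale each cell to target_sum total counts, then apply log(1 + x).

    If target_sum is None, the median of per-cell totals is used instead.
    """
    totals = [sum(row) for row in counts]
    if target_sum is None:
        target_sum = median(t for t in totals if t > 0)
    normalized = []
    for row, total in zip(counts, totals):
        scale = target_sum / total if total else 0.0
        normalized.append([math.log1p(c * scale) for c in row])
    return normalized


# Cells 0 and 1 have the same composition but a 10x difference in depth.
counts = [[1, 3], [10, 30], [2, 2]]
norm_fixed = log_normalize(counts, target_sum=10_000)  # fixed target
norm_median = log_normalize(counts)                    # median target
```

After normalization, the first two cells have identical profiles, which is the point of correcting for sequencing depth.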

3. Feature Selection

Identify highly variable genes that drive biological variation in your dataset. By default, scLAB selects the top 2,000 highly variable genes. Optionally, highly variable genes can be identified per batch to account for batch effects.
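
The idea behind feature selection can be illustrated by ranking genes on a simple dispersion score (variance divided by mean). This is a deliberately simplified sketch: Scanpy-style methods normalize dispersion within expression bins or fit a mean-variance trend, and batch-aware selection is not shown. The function name top_variable_genes and the gene labels are hypothetical.

```python
from statistics import mean, pvariance


def top_variable_genes(matrix, genes, n_top):
    """Rank genes by dispersion (variance / mean) and return the top n_top names."""
    scored = []
    for j, gene in enumerate(genes):
        column = [row[j] for row in matrix]
        mu = mean(column)
        dispersion = pvariance(column, mu) / mu if mu > 0 else 0.0
        scored.append((dispersion, gene))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [gene for _, gene in scored[:n_top]]


# Rows are cells, columns are genes (hypothetical expression values).
genes = ["HOUSEKEEPING", "MARKER_A", "MARKER_B"]
matrix = [
    [5.0, 0.0, 9.0],
    [5.0, 8.0, 0.0],
    [5.0, 0.0, 9.0],
    [5.0, 8.0, 1.0],
]
hvgs = top_variable_genes(matrix, genes, n_top=2)
```

The constant gene carries no information about differences between cells, so it is the one left out.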

4. Scaling

Scale each gene's expression to zero mean and unit variance across cells so that highly expressed genes do not dominate downstream analysis.
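
A minimal sketch of per-gene scaling (z-scoring) follows. The function name scale_genes is hypothetical, and the use of the population standard deviation is an assumption; a given library may use the sample estimate or clip extreme values.

```python
from statistics import mean, pstdev


def scale_genes(matrix):
    """Z-score each gene (column) across cells; constant genes become 0."""
    n_genes = len(matrix[0])
    scaled = [[0.0] * n_genes for _ in matrix]
    for j in range(n_genes):
        column = [row[j] for row in matrix]
        mu = mean(column)
        sd = pstdev(column, mu)
        for i, value in enumerate(column):
            scaled[i][j] = (value - mu) / sd if sd > 0 else 0.0
    return scaled


# Rows are cells, columns are genes; the second gene is constant.
matrix = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = scale_genes(matrix)
```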

Analysis Methods

Dimensionality Reduction

PCA (Principal Component Analysis)

Reduces high-dimensional gene expression data to a smaller number of principal components. Default: 50 components.
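
To make the idea concrete, the sketch below finds only the first principal component using power iteration on the covariance matrix. This is an illustration, not how scLAB computes PCA: production implementations typically use SVD-based solvers and return many components. The function name first_principal_component and the toy data are hypothetical.

```python
import math


def first_principal_component(matrix, n_iter=200):
    """Return (direction, per-cell scores) for the first principal component."""
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in matrix]
    # Sample covariance matrix between genes (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration; the all-ones start vector is an arbitrary choice and
    # would fail only if it were exactly orthogonal to the component.
    vec = [1.0] * d
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    scores = [sum(r[j] * vec[j] for j in range(d)) for r in centered]
    return vec, scores


# Two perfectly correlated genes: all variation lies along one direction.
matrix = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, scores = first_principal_component(matrix)
```

Here two genes collapse into a single coordinate per cell without losing any variation, which is the same compression PCA applies to thousands of genes.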

UMAP (Uniform Manifold Approximation and Projection)

Creates a 2D visualization of your cells based on gene expression similarity. Parameters:

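- n_neighbors: Balances local versus global structure; smaller values emphasize local detail, larger values emphasize the overall layout (default: 15)
- min_dist: Controls how tightly points are packed together in the embedding (default: 0.1)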
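
UMAP starts from a nearest-neighbor graph, so the effect of n_neighbors can be illustrated with a brute-force neighbor search. This sketch covers only that first step, not the embedding optimization, and the function name k_nearest_neighbors and the toy coordinates are hypothetical.

```python
import math


def k_nearest_neighbors(points, n_neighbors):
    """For each point, return the indices of its n_neighbors closest points."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in dists[:n_neighbors]])
    return neighbors


# Two well-separated groups of three cells each (e.g. in PCA space).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
]
local = k_nearest_neighbors(points, n_neighbors=2)  # neighbors stay in-group
broad = k_nearest_neighbors(points, n_neighbors=4)  # neighbors cross groups
```

With a small n_neighbors each cell connects only to its own group, while a larger value forces connections across groups, pulling more global structure into the graph.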

Clustering

scLAB uses the Leiden algorithm for community detection. Adjust the resolution parameter to control cluster granularity:

- Lower resolution (0.4-0.8): Fewer, larger clusters
- Higher resolution (1.0-2.0): More, smaller clusters
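
Why resolution changes cluster granularity can be seen from a modularity score with a resolution term, a quality function commonly optimized by Leiden. Whether scLAB uses exactly this quality function is an assumption, and the sketch only scores two fixed partitions rather than running Leiden. The function name modularity and the toy graph are hypothetical.

```python
def modularity(edges, communities, resolution=1.0):
    """Modularity with a resolution term for an undirected, unweighted graph.

    Q = sum over communities of [e_c / m - resolution * (K_c / (2m))**2],
    where e_c is the number of internal edges and K_c the total degree.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total_degree = {}
    for node, deg in degree.items():
        label = communities[node]
        total_degree[label] = total_degree.get(label, 0) + deg
    internal = {}
    for u, v in edges:
        if communities[u] == communities[v]:
            label = communities[u]
            internal[label] = internal.get(label, 0) + 1
    return sum(
        internal.get(label, 0) / m - resolution * (k / (2 * m)) ** 2
        for label, k in total_degree.items()
    )


# Two triangles (0-1-2 and 3-4-5) joined by three cross edges.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
         (2, 3), (1, 4), (0, 5)]
merged = {node: "A" for node in range(6)}
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

low_prefers_merged = modularity(edges, merged, 0.5) > modularity(edges, split, 0.5)
high_prefers_split = modularity(edges, split, 1.5) > modularity(edges, merged, 1.5)
```

At resolution 0.5 the single merged cluster scores higher, while at 1.5 the two-cluster split scores higher, mirroring the behavior described above.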

Differential Expression Analysis

Find marker genes that distinguish cell clusters, or compare expression between conditions. scLAB uses the Wilcoxon rank-sum test by default.
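
For one gene, the test compares expression ranks in one cluster against all other cells. The sketch below is a simplified version using the normal approximation, with average ranks for ties but no tie correction and no multiple-testing adjustment; it is not scLAB's implementation. The function name rank_sum_test and the expression values are hypothetical.

```python
import math


def rank_sum_test(group_a, group_b):
    """Return (U statistic for group_a, z-score, two-sided p-value)."""
    combined = sorted((v, idx) for idx, v in enumerate(group_a + group_b))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and give each the average 1-based rank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[combined[k][1]] = avg_rank
        i = j + 1
    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, z, p_value


# Expression of one gene in a cluster versus all remaining cells.
in_cluster = [5.1, 6.0, 5.5, 6.2, 5.8]
other_cells = [0.2, 1.1, 0.0, 0.9, 1.5]
u_stat, z_score, p_value = rank_sum_test(in_cluster, other_cells)
```

A gene whose values in the cluster all outrank the rest, as here, gets the maximum U statistic and a small p-value, making it a candidate marker.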

Visualization

scLAB provides interactive visualizations powered by Plotly:

UMAP Plots

Visualize cells in 2D space, colored by clusters, gene expression, or metadata.

Violin Plots

Compare gene expression distributions across clusters.

Heatmaps

Display expression patterns of marker genes across clusters.

Interactive Features

- Zoom and pan
- Hover for cell information
- Export plots as PNG or SVG