scLAB Documentation
Getting Started
This guide will help you get up and running with single-cell RNA-seq (scRNA-seq) analysis. scLAB provides a complete workflow for analyzing your single-cell data without any coding required.
Loading Data
scLAB supports two types of scRNA-seq data files:
1. Cell Ranger Output h5 Files
Load raw count data from Cell Ranger output. You can also load multiple h5 files (one per sample), and scLAB will concatenate them.
2. Processed h5ad Files
Load already-processed AnnData files (h5ad format) for visualization and secondary analysis.
Preprocessing Pipeline
Informed by Scanpy, scLAB supports the following preprocessing workflow:
1. Quality Control (QC)
Quality control is divided into basic and advanced steps:
- Basic quality control: Filter cells and genes using minimum genes per cell and minimum cells per gene thresholds:
  - Minimum genes per cell: Remove cells with too few detected genes (default: 200)
  - Minimum cells per gene: Remove genes expressed in too few cells (default: 3)
- Advanced quality control: Calculate standard quality metrics:
  - n_genes_by_counts: Number of genes with at least one count in each cell
  - total_counts: Total counts (UMIs) per cell
  - (Optional) pct_counts_mt: Percentage of mitochondrial gene counts per cell
  - (Optional) rb_counts_mt: Percentage of ribosomal gene counts per cell
  - (Optional) hb_counts_mt: Percentage of hemoglobin gene counts per cell
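The thresholds and metrics above can be illustrated on a toy count matrix. This is a minimal plain-Python sketch, not scLAB's implementation: the data, the gene names, the `MT-` prefix convention for mitochondrial genes, and the helper names `qc_metrics` and `basic_filter` are all assumptions made for illustration.

```python
# Toy count matrix: rows are cells, columns are genes (hypothetical data).
genes = ["MT-CO1", "ACTB", "GAPDH", "CD3E"]
counts = [
    [5, 10, 0, 5],   # cell 0
    [0, 0, 0, 2],    # cell 1
    [1, 3, 6, 0],    # cell 2
]


def qc_metrics(row, gene_names, mt_prefix="MT-"):
    """Per-cell metrics analogous to n_genes_by_counts, total_counts, pct_counts_mt."""
    total = sum(row)
    n_genes = sum(1 for c in row if c > 0)
    mt = sum(c for c, g in zip(row, gene_names) if g.startswith(mt_prefix))
    pct_mt = 100.0 * mt / total if total else 0.0
    return {"n_genes_by_counts": n_genes, "total_counts": total, "pct_counts_mt": pct_mt}


def basic_filter(matrix, gene_names, min_genes=200, min_cells=3):
    """Drop cells with too few detected genes, then genes detected in too few cells."""
    kept_cells = [r for r in matrix if sum(1 for c in r if c > 0) >= min_genes]
    kept_cols = [
        j for j in range(len(gene_names))
        if sum(1 for r in kept_cells if r[j] > 0) >= min_cells
    ]
    filtered = [[r[j] for j in kept_cols] for r in kept_cells]
    return filtered, [gene_names[j] for j in kept_cols]


metrics = [qc_metrics(row, genes) for row in counts]
# Tiny thresholds so the toy data survives; the defaults listed above are 200 and 3.
filtered, kept_genes = basic_filter(counts, genes, min_genes=2, min_cells=2)
```

In this toy example, cell 1 is removed because it has only one detected gene, and the two genes that are then detected in a single remaining cell are removed as well.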
2. Normalization
Normalize gene expression to account for sequencing depth differences between cells. scLAB uses standard log-normalization: counts in each cell are scaled to a common total (default: 10,000 counts per cell) and then log-transformed. Alternatively, you can normalize to the median total counts per cell.
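The arithmetic behind this step can be sketched in a few lines of plain Python. This is an illustration only, not scLAB's code: the function name `log_normalize` is hypothetical, and the use of log(1 + x) as the log transform is an assumption based on the Scanpy convention.

```python
import math
from statistics import median


def log_normalize(matrix, target_sum=10_000):
    """Scale each cell to target_sum total counts, then apply log(1 + x).

    If target_sum is None, the median of the per-cell totals is used instead.
    """
    totals = [sum(row) for row in matrix]
    if target_sum is None:
        target_sum = median(t for t in totals if t > 0)
    normalized = []
    for row, total in zip(matrix, totals):
        factor = target_sum / total if total else 0.0
        normalized.append([math.log1p(c * factor) for c in row])
    return normalized


counts = [
    [1, 3, 6],     # shallow cell: 10 counts in total
    [10, 30, 60],  # same expression profile, sequenced 10x deeper
]
norm = log_normalize(counts)                          # 10,000 counts per cell
norm_median = log_normalize(counts, target_sum=None)  # median total counts per cell
```

The two toy cells have the same relative expression profile, so they end up with the same normalized values even though one was sequenced ten times deeper.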
3. Feature Selection
Identify highly variable genes that drive biological variation in your dataset. By default, scLAB selects the top 2000 highly variable genes. Optionally, highly variable genes can be identified in a way that accounts for batch effects.
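As a rough illustration of what "highly variable" means, the sketch below ranks genes by dispersion (variance divided by mean) and keeps the top few. This is a deliberate simplification and an assumption on our part: it omits the mean-binning and batch-aware variants that tools such as Scanpy offer, and the function name `top_variable_genes` is hypothetical.

```python
from statistics import mean, pvariance


def top_variable_genes(matrix, gene_names, n_top=2000):
    """Rank genes by dispersion (variance / mean) and return the n_top highest."""
    scored = []
    for j, name in enumerate(gene_names):
        column = [row[j] for row in matrix]
        mu = mean(column)
        dispersion = pvariance(column, mu) / mu if mu > 0 else 0.0
        scored.append((dispersion, name))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:n_top]]


# Hypothetical expression values: rows are cells, columns are genes.
genes = ["flat", "variable", "silent"]
expr = [
    [5.0, 0.0, 0.0],
    [5.0, 10.0, 0.0],
    [5.0, 0.0, 0.0],
    [5.0, 10.0, 0.0],
]
hvgs = top_variable_genes(expr, genes, n_top=1)
```

Here the gene that switches between 0 and 10 is selected, while the gene with constant expression and the gene with no expression are not.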
4. Scaling
Scale gene expression to zero mean and unit variance for downstream analysis.
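Scaling to zero mean and unit variance is a per-gene z-score. The sketch below shows the idea in plain Python; it is not scLAB's implementation, the function name `scale_genes` is hypothetical, and the use of the population standard deviation is an assumption (implementations differ on this detail).

```python
from statistics import mean, pstdev


def scale_genes(matrix):
    """Center each gene (column) at zero and divide by its standard deviation."""
    n_genes = len(matrix[0])
    scaled = [[0.0] * n_genes for _ in matrix]
    for j in range(n_genes):
        column = [row[j] for row in matrix]
        mu = mean(column)
        sd = pstdev(column, mu)
        for i, value in enumerate(column):
            # Genes with zero variance stay at 0 rather than dividing by zero.
            scaled[i][j] = (value - mu) / sd if sd > 0 else 0.0
    return scaled


# Hypothetical normalized expression: rows are cells, columns are genes.
expr = [
    [1.0, 4.0],
    [2.0, 4.0],
    [3.0, 4.0],
]
scaled = scale_genes(expr)
```

After scaling, every gene contributes on the same scale, so highly expressed genes do not dominate downstream steps such as PCA.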
Analysis Methods
Dimensionality Reduction
PCA (Principal Component Analysis)
Reduces high-dimensional gene expression data to a smaller number of principal components. Default: 50 components.
UMAP (Uniform Manifold Approximation and Projection)
Creates a 2D visualization of your cells based on gene expression similarity. Parameters:
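- n_neighbors: Number of neighboring cells considered for each cell; smaller values emphasize local structure, larger values emphasize global structure (default: 15)
- min_dist: Minimum distance between points in the embedding; smaller values pack points more tightly (default: 0.1)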
Clustering
scLAB uses the Leiden algorithm for community detection. Adjust the resolution parameter to control cluster granularity:
- Lower resolution (0.4-0.8): Fewer, larger clusters
- Higher resolution (1.0-2.0): More, smaller clusters
Differential Expression Analysis
Find marker genes that distinguish cell clusters, or compare expression between conditions. scLAB uses the Wilcoxon rank-sum test by default.
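To show what the test measures, here is a minimal plain-Python sketch of a Wilcoxon rank-sum test for a single gene. It is an illustration under stated assumptions, not scLAB's code: the function name `rank_sum_test` and the expression values are hypothetical, and the p-value uses a normal approximation without tie correction.

```python
import math


def rank_sum_test(group_a, group_b):
    """Wilcoxon rank-sum test with a normal approximation (no tie correction)."""
    pooled = sorted(
        [(v, 0) for v in group_a] + [(v, 1) for v in group_b], key=lambda t: t[0]
    )
    # Assign average (mid) ranks to tied values.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(r for r, (_, label) in zip(ranks, pooled) if label == 0)
    expected = n_a * (n_a + n_b + 1) / 2
    sd = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (rank_sum_a - expected) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value


# Hypothetical expression of one gene in a cluster versus all other cells.
in_cluster = [5.1, 6.3, 7.2, 8.0, 6.8]
other_cells = [0.0, 0.4, 1.1, 0.9, 2.0]
z, p = rank_sum_test(in_cluster, other_cells)
```

A large positive z score with a small p-value indicates that the gene tends to be more highly expressed in the cluster than elsewhere, which is what makes it a candidate marker gene.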
Visualization
scLAB provides interactive visualizations powered by Plotly:
UMAP Plots
Visualize cells in 2D space, colored by clusters, gene expression, or metadata.
Violin Plots
Compare gene expression distributions across clusters.
Heatmaps
Display expression patterns of marker genes across clusters.
Interactive Features
- Zoom and pan
- Hover for cell information
- Export plots as PNG or SVG