Granatum

Welcome to Granatum! This is a graphical single-cell RNA-seq (scRNA-seq) analysis pipeline for genomics scientists. The pipeline will graphically guide you through the analysis of scRNA-seq data, starting from expression and metadata tables. It uses a comprehensive set of modules for quality control / normalization, clustering, differential gene expression / enrichment analysis, protein network interaction visualization, and cell pseudo-time pathway construction.

We welcome collaboration! If you are interested in getting help beyond standard troubleshooting, please contact our PI, Lana Garmire, at lgarmire@gmail.com

Please cite: Zhu, Xun et al. “Granatum: A Graphical Single-Cell RNA-Seq Analysis Pipeline for Genomics Scientists.” Genome Medicine 9.1 (2017)

Note 1: if the browser window (or tab) is accidentally closed, you may resume from where you left off by opening the last page in your browser history.

Note 2: depending on dataset size, some steps may take time. Please allow computations to complete even if your browser appears to hang.

Note 3: if you encounter an error, please create a New Issue in our GitHub repository.

Background

Video tutorial: link to the video

Survey (suggestions are welcome!): link to the survey

Manuscript: link to the manuscript

Manual: download PDF

License: download text

DIY

To run the server on your own computer, download it from this link:

Download server file

To use the file, have VirtualBox installed:

Download VirtualBox

After starting VirtualBox, click "File" -> "Import Appliance...", provide the file, and perform the import.

Then launch Granatum, wait for it to load, and point your web browser to the following address:

http://localhost:8028/

A video of this can be viewed as well:

View video on YouTube

Thank you! If there are any questions please contact us: lana.garmire.group@gmail.com

Upload

You can upload your own data or try Granatum on our sample data.

Is your data from human or mouse? Make a selection under "Species". Then provide your Expression and Metadata tables as comma-separated values (CSV) files.

Species

Human

Mouse

Before uploading your data, please refer to our format specification.

Example human data (Kim, et al. 2016):

Expression Table

Browse...

No metadata

Metadata Table

Browse...

If you would like to add more datasets, click "Add another dataset" on the next page.

Maximum number of cores Granatum can use:

Summary of datasets uploaded

Last dataset uploaded

Expression Table
Metadata Table

Batch-effect removal

Remove confounding effects from data generated in batches. Box plots give expression statistics for a random sampling of up to 96 cells. Select a batch grouping label (factor) then click "Remove batch effect". If multiple datasets were separately uploaded, the "dataset" factor can be used.

Batch factor:

Batch-effect removal methods

ComBat

Simple median alignment
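To give an intuition for what a median alignment does, here is a minimal sketch. It is not Granatum's implementation: it assumes log-scale expression values and an additive shift per batch, and the function name `align_batch_medians` and the toy data are hypothetical.

```python
from statistics import median

def align_batch_medians(expr, batches):
    """Shift every cell so that all batch medians equal the overall median.

    expr:    dict mapping cell name -> list of (log-scale) expression values
    batches: dict mapping cell name -> batch label
    Assumption: an additive shift on log-scale values is appropriate.
    """
    # Pool the expression values of all cells belonging to each batch.
    pooled = {}
    for cell, values in expr.items():
        pooled.setdefault(batches[cell], []).extend(values)

    batch_median = {b: median(v) for b, v in pooled.items()}
    overall = median([v for values in expr.values() for v in values])

    # Move each batch by the difference between the overall and batch median.
    return {
        cell: [v + (overall - batch_median[batches[cell]]) for v in values]
        for cell, values in expr.items()
    }

# Toy data (made up for illustration): batch B is shifted upwards.
expr = {"c1": [1, 2, 3], "c2": [2, 3, 4], "c3": [11, 12, 13]}
batches = {"c1": "A", "c2": "A", "c3": "B"}
aligned = align_batch_medians(expr, batches)
```

After the shift, the pooled medians of batch A and batch B coincide, which is the effect the box plots would show.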

Outlier removal

Remove unusual cells, e.g., those damaged by capture. Select cells by clicking points in the plot and/or using "Auto-identify", then click "Remove selected".

Cell labels (from metadata)

Cluster using only top expressed genes (helps to identify outliers)

Selected cells:

Normalization

Adjust expression levels to correct for artificial differences between cells, e.g., differences in sequencing depth. When a rescaling/normalization button is clicked, the box plot (showing expression statistics for up to 96 randomly selected cells) will reflect changes. For example, clicking "Rescaling to geometric mean" will cause red dots (geometric means) to align. Note that clicking more than one rescaling/normalization button will apply adjustments on already adjusted values (use "Reset" to go back to unadjusted data).
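The idea of rescaling to the geometric mean can be sketched as follows. This is an illustration under stated assumptions, not Granatum's code: zeros are excluded when computing the geometric mean, every cell has at least one positive value, and the common target is the geometric mean of the per-cell geometric means. The function names and toy values are hypothetical.

```python
import math

def geometric_mean(values):
    """Geometric mean of the strictly positive entries (zeros are ignored)."""
    positive = [v for v in values if v > 0]
    return math.exp(sum(math.log(v) for v in positive) / len(positive))

def rescale_to_geometric_mean(cells):
    """Multiply each cell by a factor so all cells share one geometric mean.

    cells: dict mapping cell name -> list of non-negative expression values
    """
    per_cell = {name: geometric_mean(vals) for name, vals in cells.items()}
    target = geometric_mean(list(per_cell.values()))
    return {
        name: [v * target / per_cell[name] for v in vals]
        for name, vals in cells.items()
    }

# Toy data (made up): cell "b" was sequenced ten times deeper than cell "a".
cells = {"a": [1, 4, 0], "b": [10, 40, 0]}
rescaled = rescale_to_geometric_mean(cells)
```

Because the adjustment is a per-cell multiplication, zeros stay zero, and applying a second rescaling operates on the already adjusted values, as noted above.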

Download the normalized matrix

Imputation

The large number of drop-outs might pose problems for downstream analyses. It is thus often appropriate to try to infer whether a zero in the dataset is a drop-out (that is, a non-zero expression level incorrectly assayed as zero) and, if it is a drop-out, to infer its original expression level.

Gene filtering

Remove genes having very low expression and/or those with little variation (dispersion) by moving the sliders. It is recommended to keep at least 2,000 genes.

Log Mean Expression Threshold

Dispersion Fit Threshold

Starting number of genes:

Post-filtering number of genes:
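As a rough illustration of how two such thresholds interact, the sketch below keeps a gene only if its log mean expression and its dispersion both reach a cutoff. It is a simplification under stated assumptions: dispersion is taken as variance divided by mean, whereas the "Dispersion Fit Threshold" slider refers to a fitted dispersion that is not reproduced here. The function name `filter_genes` and the toy values are hypothetical.

```python
import math
from statistics import mean, pvariance

def filter_genes(expr, min_log_mean, min_dispersion):
    """Keep genes whose log mean expression and dispersion both pass.

    expr: dict mapping gene name -> list of expression values across cells
    Assumption: dispersion = population variance / mean (no fitted trend).
    """
    kept = {}
    for gene, values in expr.items():
        mu = mean(values)
        if mu <= 0:
            continue  # never-expressed genes have no defined log mean
        dispersion = pvariance(values) / mu
        if math.log(mu) >= min_log_mean and dispersion >= min_dispersion:
            kept[gene] = values
    return kept

# Toy data (made up): only g3 is both expressed and variable.
expr = {
    "g1": [0, 0, 0, 0],      # not expressed
    "g2": [5, 5, 5, 5],      # expressed but constant
    "g3": [1, 9, 2, 8],      # expressed and variable
    "g4": [0.1, 0, 0.1, 0],  # very low expression
}
kept = filter_genes(expr, min_log_mean=0.0, min_dispersion=0.5)
print(f"Starting number of genes: {len(expr)}")
print(f"Post-filtering number of genes: {len(kept)}")
```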

Clustering

Select a clustering method and enter a number of clusters (or check the box for auto selection), then click "Run clustering".

Cell labels

Clustering method

K-means (Euclidean)

K-means (correlation t-SNE)

Hierarchical clustering (Euclidean) with heatmap

Hierarchical clustering (correlation t-SNE)

Non-negative matrix factorization

Automatically choose the number of clusters (might take a long time)

Number of clusters

Download the filtered matrix

Download CSV table

Differential expression

Identify differentially expressed genes between clusters. As a reference point, with the number of cores set to 2, this step runs for approximately 30 minutes on the Kim, et al. 2016 dataset (116 cells, 3,788 genes, 3 clusters) when using a VirtualBox appliance with 8 GB RAM and an Intel i7 processor. Note: the progress bar does not accurately reflect progress; please give the calculations time to complete.

Once complete, the enrichment of differentially expressed genes in KEGG pathways and GO terms can be calculated.

Method

NODES (fast)

limma (fast)

edgeR (fast)

SCDE (slow)

Which factor to use for DE?

Cell labels

Tabs indicate cluster numbers. Genes are sorted by absolute Z-score.
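If you want to reproduce this ordering on a downloaded table, a minimal sketch follows. The column names `gene` and `z` and the values are hypothetical placeholders; check the header of your own CSV file before adapting it.

```python
def sort_by_abs_z(rows):
    """Order rows from the largest to the smallest absolute Z-score.

    rows: list of dicts with (hypothetical) keys "gene" and "z"
    """
    return sorted(rows, key=lambda row: abs(row["z"]), reverse=True)

# Placeholder rows (made up for illustration, not real results).
rows = [
    {"gene": "geneA", "z": 1.2},
    {"gene": "geneB", "z": -3.5},
    {"gene": "geneC", "z": 2.8},
]
ranked = sort_by_abs_z(rows)
```

Sorting on the absolute value places strongly down-regulated genes (large negative Z-scores) alongside strongly up-regulated ones at the top of the table.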

Download CSV table

Protein network

Proteins from top differentially expressed genes are visualized with connecting lines indicating documented biochemical interactions. Go to the next step by clicking "Proceed" (bottom right of page).

Pseudo-time construction

Cells are ordered in pseudo-time using similarities between their expression profiles.

Cell labels