Welcome to Granatum! This is a graphical single-cell RNA-seq (scRNA-seq) analysis pipeline for genomics scientists. The pipeline will graphically guide you through the analysis of scRNA-seq data, starting from expression and metadata tables. It uses a comprehensive set of modules for quality control / normalization, clustering, differential gene expression / enrichment analysis, protein network interaction visualization, and cell pseudo-time pathway construction.
We welcome collaboration! If you are interested in getting help beyond standard troubleshooting, please contact our PI, Lana Garmire, at: email@example.com
Note 1: if the browser window (or tab) is accidentally closed, you may resume from where you left off by opening the last page in your browser history.
Note 2: depending on dataset size, some steps may take time. Please allow computations to complete even if your browser appears to hang.
Note 3: if you encounter an error, please create a New Issue in our GitHub repository.
Please cite: Zhu, Xun et al. “Granatum: A Graphical Single-Cell RNA-Seq Analysis Pipeline for Genomics Scientists.” Genome Medicine 9.1 (2017)
Video tutorial: link to the video
Survey (suggestions are welcome!): link to the survey
Manuscript: link to the manuscript
Manual: download PDF
License: download text
To run the server on your own computer, download it from this link:
To use the file, have VirtualBox installed:
After starting VirtualBox, click "File" -> "Import Appliance...", provide the file, and perform the import.
Then launch Granatum, wait for it to load, and point your web browser to the following address:
A video of this can be viewed as well:
Thank you! If there are any questions please contact us: firstname.lastname@example.org
You can upload your own data or try Granatum on our sample data.
Is your data Human or Mouse? Make a selection under "Species". Then provide your Expression and Metadata tables as comma-separated values (CSV) files.
Example human data (Kim, et al. 2016):
If you would like to add more datasets, click "Add another dataset" on the next page.
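Before uploading, it can help to confirm that the Expression and Metadata tables describe the same cells. The sketch below is only an illustration: it assumes the expression table has cell IDs in its header row and gene IDs in its first column, and that the metadata table has cell IDs in its first column. That layout, the function names, and the toy data are assumptions, not something stated on this page.

```python
import csv
import io


def read_table(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def check_tables(expression_csv, metadata_csv):
    """Return cell IDs that appear in one table but not in the other.

    Assumed layout (hypothetical):
    - expression: header row holds cell IDs, first column holds gene IDs
    - metadata: first column holds cell IDs, other columns hold labels
    """
    expr_rows = read_table(expression_csv)
    meta_rows = read_table(metadata_csv)
    expr_cells = set(expr_rows[0][1:])                      # skip the gene-ID header cell
    meta_cells = {row[0] for row in meta_rows[1:] if row}   # skip the header row
    return sorted(expr_cells - meta_cells), sorted(meta_cells - expr_cells)


# Toy tables: cell3 has no metadata, cell4 has no expression values.
expression_csv = "gene,cell1,cell2,cell3\nGAPDH,10,0,7\nACTB,5,3,0\n"
metadata_csv = "cell,batch\ncell1,A\ncell2,B\ncell4,B\n"

missing_meta, missing_expr = check_tables(expression_csv, metadata_csv)
print("Cells without metadata:", missing_meta)
print("Cells without expression:", missing_expr)
```

If both lists are empty, the two files refer to the same set of cells under the assumed layout.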
Remove confounding effects from data generated in batches. Box plots give expression statistics for a random sampling of up to 96 cells. Select a batch grouping label (factor) then click "Remove batch effect". If multiple datasets were separately uploaded, the "dataset" factor can be used.
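To give an intuition for what removing a batch effect means, the sketch below shifts each batch of one gene so that its mean matches the overall mean. This is a deliberately simplified stand-in chosen for illustration; it is not claimed to be the method that runs when "Remove batch effect" is clicked, and the function name and data are hypothetical.

```python
from collections import defaultdict


def center_batches(values, batches):
    """Shift each batch so its mean equals the overall mean (one gene)."""
    overall = sum(values) / len(values)
    groups = defaultdict(list)
    for value, batch in zip(values, batches):
        groups[batch].append(value)
    batch_mean = {b: sum(vs) / len(vs) for b, vs in groups.items()}
    # Subtract the batch-specific mean, then restore the overall level.
    return [v - batch_mean[b] + overall for v, b in zip(values, batches)]


# Toy example: batch B is systematically 4 units higher than batch A.
gene = [1.0, 3.0, 5.0, 7.0]
batches = ["A", "A", "B", "B"]
adjusted = center_batches(gene, batches)
print(adjusted)
```

After the adjustment both batches have the same mean, while differences between cells within a batch are preserved.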
Remove unusual cells, e.g., those damaged by capture. Select cells by clicking points in the plot and/or using "Auto-identify", then click "Remove selected".
Adjust expression levels to correct for artificial differences between cells, e.g., differences in sequencing depth. When a rescaling/normalization button is clicked, the box plot (showing expression statistics for up to 96 randomly selected cells) will reflect the changes. For example, clicking "Rescaling to geometric mean" will cause the red dots (geometric means) to align. Note that clicking more than one rescaling/normalization button applies each adjustment to the already-adjusted values (use "Reset" to go back to unadjusted data).
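The idea of aligning geometric means can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's own code: the geometric mean is taken over strictly positive values only, every cell is assumed to have at least one positive value, and the shared target is the geometric mean of the per-cell geometric means.

```python
import math


def geometric_mean(values):
    """Geometric mean of the strictly positive entries (zeros are skipped)."""
    logs = [math.log(v) for v in values if v > 0]
    return math.exp(sum(logs) / len(logs))


def rescale_to_geometric_mean(cells):
    """Multiply each cell by a factor so all cells share one geometric mean."""
    gms = [geometric_mean(cell) for cell in cells]
    target = geometric_mean(gms)  # assumed choice of common target
    return [[v * target / gm for v in cell] for cell, gm in zip(cells, gms)]


# Toy example: the second cell was sequenced four times deeper than the first.
cells = [[1.0, 4.0, 0.0], [4.0, 16.0, 0.0]]
rescaled = rescale_to_geometric_mean(cells)
print([geometric_mean(cell) for cell in rescaled])
```

Because each cell is multiplied by a single factor, zeros stay zero and the relative expression of genes within a cell is unchanged.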
The large number of drop-outs might pose problems for downstream analyses. It is thus often appropriate to infer whether a zero in the dataset is a drop-out (that is, a non-zero expression level incorrectly assayed as zero) and, if it is, to infer its original expression level.
Remove genes having very low expression and/or those with little variation (dispersion) by moving the sliders. It is recommended to keep at least 2,000 genes.
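The two sliders can be thought of as two thresholds applied to each gene. The sketch below assumes dispersion is measured as variance divided by mean; that definition, the function name, and the threshold values are assumptions made for illustration and may differ from what the sliders actually compute.

```python
from statistics import mean, pvariance


def filter_genes(expression, min_mean, min_dispersion):
    """Keep genes whose mean and dispersion both reach the thresholds.

    expression maps a gene name to its expression values across cells.
    Dispersion is assumed here to be variance / mean.
    """
    kept = {}
    for gene, values in expression.items():
        m = mean(values)
        if m == 0 or m < min_mean:
            continue  # too lowly expressed (also avoids dividing by zero)
        if pvariance(values) / m >= min_dispersion:
            kept[gene] = values
    return kept


expression = {
    "low": [0.0, 0.0, 1.0, 0.0],        # mean 0.25: fails the expression threshold
    "flat": [5.0, 5.0, 5.0, 5.0],       # variance 0: fails the dispersion threshold
    "variable": [0.0, 10.0, 2.0, 8.0],  # mean 5, variance 17: kept
}
kept = filter_genes(expression, min_mean=1.0, min_dispersion=0.5)
print(list(kept))
```

Raising either threshold removes more genes, so the count of remaining genes should be checked against the recommended minimum after each change.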
Select a clustering method and enter a number of clusters (or check the box for auto selection), then click "Run clustering".
Identify differentially expressed genes between clusters. With the number of cores set to 2, this step runs for approximately 30 minutes on the Kim, et al. 2016 dataset (116 cells, 3,788 genes, 3 clusters) when using a VirtualBox appliance with 8 GB RAM and an Intel i7 processor. Note: the progress bar does not accurately reflect progress; please give the calculations time to complete.
Once complete, the enrichment of differentially expressed genes in KEGG pathways and GO terms can be calculated.
Tabs indicate cluster numbers. Genes are sorted by absolute Z-score.
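Sorting by absolute Z-score places strongly up-regulated and strongly down-regulated genes together at the top. A minimal sketch, assuming the largest absolute value comes first; the gene names and scores are made up for illustration.

```python
def sort_by_abs_z(z_scores):
    """Return (gene, z) pairs ordered by |z|, largest first (assumed order)."""
    return sorted(z_scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical Z-scores for one cluster.
z_scores = {"GENE_A": 1.2, "GENE_B": -3.5, "GENE_C": 2.1}
ranked = sort_by_abs_z(z_scores)
for gene, z in ranked:
    print(gene, z)
```

The sign of each score is kept in the output, so the direction of the change can still be read from the sorted table.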
Proteins from top differentially expressed genes are visualized with connecting lines indicating documented biochemical interactions. Go to the next step by clicking "Proceed" (bottom right of page).
Cells are ordered in pseudo-time using similarities between their expression profiles.