Database Aims



scLTdb is a comprehensive single-cell lineage tracing database with multi-functional modules for the analysis of gene expression, clonal compositions, fate outcomes, lineage relationships and potential regulators of cell fate determination. It currently comprises 109 datasets covering three species, 13 tissue sources, 2.8 million cells and 36 scLT technologies.

scLTdb provides:

1. Search and query scLT datasets by species, tissue sources, barcode types, and technologies.

2. Re-analyze and visualize scLT datasets through three interactive modules: the single-cell module, the lineage tracing module, and the integration module.

3. Download well-processed scLT datasets in h5ad and rds format.

4. Online tools to analyze single-cell lineage tracing and cellular barcoding data.

5. A step-by-step tutorial for users to use scLTdb. Users can access our tutorial by clicking the Tutorial button located in the navigation bar.

Please note that when you click on the 'Explore dataset' button on the Search page, it may take approximately 30 seconds to load the dataset. This delay is caused by the large number of cells in the dataset and the limitations of computational resources. In case you encounter any difficulty loading the dataset, kindly refresh the page.


Introduction to single-cell lineage tracing


In recent years, next-generation lineage tracing has relied on cellular barcoding, which uses a large number of synthetic DNA sequences to uniquely label cells, providing quantitative insights into stem cell dynamics and cell fate outcomes. This strategy can be achieved using various barcode types, including integration of exogenous random DNA sequences into the cell genome, recombination of endogenous DNA units, or genome editing-mediated DNA insertions and deletions (indels). Cellular barcoding allows researchers to distinguish individual cells based on specific DNA barcode sequences at clonal resolution. Combining cellular barcoding with single-cell genomics, also known as single-cell lineage tracing (scLT), has generated rich datasets that resolve the cell fate and the transcriptional or epigenetic state of each cell. In addition, by using mutations and DNA variations that have accumulated in humans as natural barcodes, scLT can reconstruct high-resolution lineage trees in human hematopoiesis and cancers, which were previously lacking in medical research. Given the significance of scLT in biology and medicine, the publicly available scLT data call for in-depth and integrated analysis to yield new insights into cell fate determination in development and disease.



For users who want to learn more about single-cell lineage tracing, we recommend this review paper: https://www.nature.com/articles/s41576-020-0223-2 (Wagner and Klein, Nature Reviews Genetics, 2020).



Statistics




Contact:

School of Life Sciences, Westlake University

Weike Pei (correspondence): peiweike@westlake.edu.cn

Junyao Jiang (maintainer): jiangjunyao@westlake.edu.cn

Xing Ye (maintainer): yexing@mail.ustc.edu.cn

Dataset overview



This is the single-cell module of scLTdb, which provides a variety of functions for visualizing and re-analyzing gene expression, DNA methylation, and chromatin accessibility within scLT datasets.

1. Cell annotation (Pseudotime) embedding: visualize cell type (state) annotation and pseudotime. It is important to note that certain datasets do not support pseudotime inference. In such cases, scLTdb will display the following message: "This dataset does not have pseudotime."

2. Clone embedding: visualize a single-cell-derived clone or a selection of clones on the transcriptional landscape.

3. Cell type (state) number barplot: visualize cell type (state) number and fraction.

4. Gene expression: plot the gene of interest on the transcriptional landscape. Please note that this function does not work for datasets with DNA methylation or chromatin accessibility modalities.

5. Marker gene information for each cell type: top 5 marker genes for each cell type (ranked by log2 fold change).

If the number of cells in the selected dataset exceeds 10,000, the following interactive plots are generated from a subsample of 10,000 cells.
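
The subsampling rule above can be sketched in a few lines of Python. This is only an illustration: the text does not say how scLTdb picks the 10,000 cells, so uniform random sampling with a fixed seed is an assumption, and the function name `subsample_cells` is hypothetical.

```python
import random

MAX_CELLS = 10_000  # threshold stated in the text


def subsample_cells(cell_ids, max_cells=MAX_CELLS, seed=0):
    """Return every cell if the dataset is within the limit,
    otherwise a random subsample of `max_cells` cells.

    Uniform random sampling is an assumption; the database may use
    a different strategy.
    """
    cell_ids = list(cell_ids)
    if len(cell_ids) <= max_cells:
        return cell_ids
    rng = random.Random(seed)  # fixed seed keeps the plots reproducible
    return rng.sample(cell_ids, max_cells)


small = subsample_cells(range(5))
large = subsample_cells(range(25_000))
print(len(small), len(large))
```

A fixed seed is used so that repeated visits to the same dataset would show the same cells; whether the site behaves this way is not stated.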


Cell embedding
Clone embedding
Cell type (state) number barplot
Gene expression
Marker gene information for each cell type

This is the lineage tracing module of scLTdb, which offers a range of functions to visualize and re-analyze lineage tracing (cell fate) data modalities within scLT datasets.

This page offers five functions:

1. Unique barcode number in each cell type (state).

2. Number of cells with detected barcodes in each cell type (state).

3. Clone size statistics: visualize the number of cells carrying the same clonal barcode.

4. Clone size ranges: visualize clone size across different ranges.

5. Correlation between clone size and clone related cell type number: violin plots of clone size (number of cells) against the number of cell types in which each clone is found.
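
The quantities behind these plots are simple counts. As a minimal sketch, assuming each cell is described by a cell ID, a lineage barcode, and a cell type (the toy records and function names below are hypothetical, not part of scLTdb):

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (cell_id, barcode, celltype)
cells = [
    ("c1", "BC1", "HSC"),
    ("c2", "BC1", "Monocyte"),
    ("c3", "BC1", "Monocyte"),
    ("c4", "BC2", "HSC"),
    ("c5", "BC2", "HSC"),
    ("c6", "BC3", "Neutrophil"),
]


def clone_sizes(records):
    """Clone size: number of cells carrying the same barcode."""
    return Counter(barcode for _, barcode, _ in records)


def celltypes_per_clone(records):
    """Number of distinct cell types in which each clone is found."""
    seen = defaultdict(set)
    for _, barcode, celltype in records:
        seen[barcode].add(celltype)
    return {barcode: len(types) for barcode, types in seen.items()}


def unique_barcodes_per_celltype(records):
    """Number of distinct barcodes detected in each cell type."""
    seen = defaultdict(set)
    for _, barcode, celltype in records:
        seen[celltype].add(barcode)
    return {celltype: len(barcodes) for celltype, barcodes in seen.items()}


print(clone_sizes(cells))
print(celltypes_per_clone(cells))
print(unique_barcodes_per_celltype(cells))
```

Pairing `clone_sizes` with `celltypes_per_clone` for each barcode gives the two values compared in the correlation plot.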


Unique barcode number in each cell type
Number of cells with barcodes in each cell type
Clone size statistics
Clone size ranges
Correlation between clone size and clone related cell type number

This is the lineage tracing module of scLTdb, which offers a range of functions to visualize and re-analyze lineage tracing (cell fate) data modalities within scLT datasets.

1. Fate outcomes: re-map the cell fates of targeted populations by visualizing barcode propagation across cell types or states.

Users can adjust the ‘fate outcome heatmap’ in the following two ways:

(1). Select the normalization method with the button at the top of the heatmap. Two methods are available for barcode count normalization: 'log' and 'proportion'. The 'log' method applies log10 normalization to the barcode counts, whereas the 'proportion' method shows the proportion of each barcode across all cell types (states).

(2). Adjust the column order (cell type order) of the heatmap by moving the cell type (state) bars in the left panel of the heatmap.

2. Fate bias summary: visualize clone fate bias in specific cell types (states). In single-cell lineage tracing data, clone fate bias refers to a particular clone showing a differentiation bias into certain downstream cell types or states.
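
The two normalization options for the fate outcome heatmap can be illustrated with a small sketch. The barcode-by-cell-type count matrix below is hypothetical, and the pseudocount of 1 in the 'log' method is an assumption made so that zero counts stay defined; the text only states that log10 normalization is used.

```python
import math

# Hypothetical barcode-by-celltype count matrix
counts = {
    "BC1": {"Monocyte": 9, "Neutrophil": 1},
    "BC2": {"Monocyte": 0, "Neutrophil": 4},
}


def normalize_log(matrix, pseudocount=1):
    """'log' method: log10 of each barcode count.

    The pseudocount is an assumption, added so that log10(0) is avoided.
    """
    return {
        barcode: {ct: math.log10(n + pseudocount) for ct, n in row.items()}
        for barcode, row in matrix.items()
    }


def normalize_proportion(matrix):
    """'proportion' method: share of each barcode across all cell types."""
    result = {}
    for barcode, row in matrix.items():
        total = sum(row.values())
        result[barcode] = {
            ct: (n / total if total else 0.0) for ct, n in row.items()
        }
    return result


print(normalize_log(counts))
print(normalize_proportion(counts))
```

With 'proportion', each barcode's row sums to 1, which makes clones of different sizes comparable; 'log' keeps size information while compressing very large clones.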


Fate outcomes
Adjust the column order of the heatmap
Fate bias summary

This is the lineage tracing module of scLTdb, which offers a range of functions to visualize and re-analyze lineage tracing (cell fate) data modalities within scLT datasets.

This page provides the 'Lineage relationships' function of the lineage tracing module. It compares the lineage similarity between cell types (states) based on the number and frequency of barcodes present in each cell type (state). If two cell types share many barcodes at similar frequencies, they are likely to have arisen from a common developmental pathway; if not, they probably developed more independently.
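
One way to make "sharing barcodes at similar frequencies" concrete is to compare the barcode frequency profiles of two cell types. The sketch below uses cosine similarity as a stand-in metric; the text does not state which similarity measure scLTdb uses, and the count matrix and function names are hypothetical.

```python
import math

# Hypothetical barcode-by-celltype count matrix
counts = {
    "BC1": {"Monocyte": 8, "Neutrophil": 6, "B cell": 0},
    "BC2": {"Monocyte": 4, "Neutrophil": 5, "B cell": 0},
    "BC3": {"Monocyte": 0, "Neutrophil": 0, "B cell": 7},
}


def barcode_profile(matrix, celltype):
    """Frequency of each barcode within one cell type (sums to 1)."""
    barcodes = sorted(matrix)
    vector = [matrix[bc].get(celltype, 0) for bc in barcodes]
    total = sum(vector)
    return [v / total if total else 0.0 for v in vector]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def lineage_similarity(matrix, type_a, type_b):
    """Similarity of two cell types based on their barcode profiles."""
    return cosine_similarity(
        barcode_profile(matrix, type_a), barcode_profile(matrix, type_b)
    )


print(lineage_similarity(counts, "Monocyte", "Neutrophil"))
print(lineage_similarity(counts, "Monocyte", "B cell"))
```

In this toy example, monocytes and neutrophils share both of their barcodes at similar frequencies and score close to 1, while B cells share no barcodes with monocytes and score 0.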

Users can select specific cell types to reconstruct lineage relationships by clicking on ‘Select cell type (state)’.

If the dataset contains Cas9-edited barcodes, this page also presents the phylogenetic analysis results under the 'Phylogenetic tree' function.


Lineage relationship
Phylogenetic tree

This is the integration module of scLTdb, which offers a range of functions to perform integrative analysis of cell fate and gene expression.

This page offers three functions:

1. Differentially Expressed Genes (DEGs) analysis for cells with different cell fate biases. To utilize this function, users are required to first choose a specific cell type (state) and then select a particular cell fate bias group. This analysis reveals the transcriptional hallmarks between the two different cell fate biases within the selected cell type (state).

2. Enrichment analysis (gene ontology) for cell fate related DEGs.

3. Motif enrichment analysis for cell fate related differentially accessible regions (DARs). This function only works for datasets that contain the chromatin accessibility modality.

Please note that if the selected cell fate bias group has no significant DEGs, scLTdb will display the message ‘No significant DEGs for selected fate bias group’. If cell fate related DEGs cannot be enriched for any functions in the gene ontology database, scLTdb will display the message ‘Enrichment analysis cannot identify any significant Gene Ontology (GO) terms’.



This is the integration module of scLTdb, which offers a range of functions to perform integrative analysis of cell fate and gene expression.

This page offers three functions:

1. Differentially Expressed Genes (DEGs) analysis for two clones within a specific cell type (state). To utilize this function, users are required to first choose a specific cell type (state) and then select two large clones within this cell type (state). This analysis reveals the transcriptional hallmarks between the two selected large clones.

2. Enrichment analysis (gene ontology) for clone size related DEGs.

3. Motif enrichment analysis for cell fate related DARs. This function only works for datasets that contain the chromatin accessibility modality.

Please note that if this dataset has no significant large clone DEGs, scLTdb will display the message ‘No significant DEGs for selected two large clones’. If clone DEGs cannot be enriched for any functions in the gene ontology database, scLTdb will display the message ‘Enrichment analysis cannot identify any significant Gene Ontology (GO) terms’.

The table below presents the top 5 clones with the largest size in a specific cell type (state).


This is the integration module of scLTdb, which offers a range of functions to perform integrative analysis of cell fate and gene expression.

This page presents a violin plot to display gene expression variation across different cell fate biases within a particular cell type. To utilize this function, users need to first select a specific cell type (state) and enter the gene name into the input box.

This function is only designed for datasets that include the gene expression modality.





Dataset overview

The key columns in 'obs' (h5ad data download) and 'metadata' (R data download) are explained below:

celltype: state annotation of cells

time information: temporal annotation of cells

experimental condition: experimental condition annotation of cells

mouse ID/patient ID: the mouse or patient from which the cells originate

barcodes: cell lineage barcodes

raw_barcodes: raw lineage barcode sequences for Cas9-edited data

For users who have difficulty downloading datasets from Cowtransfer, we also provide a Zenodo repository with our well-processed single-cell lineage tracing datasets (https://zenodo.org/records/12176634).




Clone analysis tools
Welcome to the clone analysis tools of scLTdb.

Our tools offer three functions:
(1) Clone size function: visualize the number of cells carrying the same clonal barcode.
(2) Fate outcomes function: re-map the cell fates of targeted populations by visualizing barcode propagation across cell types or states.
(3) Lineage relationships function: compare the lineage similarity between cell types (states) based on the number and frequency of barcodes present in each cell type (state). If two cell types share many barcodes at similar frequencies, they are likely to have arisen from a common developmental pathway; if not, they probably developed more independently.

Please find below the details regarding the demo data for online tools. To access the demo data, you can simply click on the provided hyperlink. It is worth noting that our online tool supports two types of input formats: table and matrix.
Table format: demo data
Matrix format: demo data
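
To illustrate how the two input formats relate, the sketch below converts a long table into a count matrix. The exact layout of the demo files is not described here, so both layouts are assumptions: the table is taken to have one row per cell with `barcode` and `celltype` fields, and the matrix is taken to be barcode-by-cell-type counts. Please check the demo data for the layout the tool actually expects.

```python
from collections import defaultdict

# Assumed table format: one row per cell (field names are hypothetical)
table_rows = [
    {"barcode": "BC1", "celltype": "HSC"},
    {"barcode": "BC1", "celltype": "Monocyte"},
    {"barcode": "BC1", "celltype": "Monocyte"},
    {"barcode": "BC2", "celltype": "HSC"},
]


def table_to_matrix(rows):
    """Count cells per (barcode, celltype) pair.

    Returns the sorted cell types (column order) and a
    barcode -> {celltype: count} mapping with zeros filled in.
    """
    celltypes = sorted({row["celltype"] for row in rows})
    matrix = defaultdict(lambda: dict.fromkeys(celltypes, 0))
    for row in rows:
        matrix[row["barcode"]][row["celltype"]] += 1
    return celltypes, dict(matrix)


columns, matrix = table_to_matrix(table_rows)
print(columns)
for barcode, row in matrix.items():
    print(barcode, [row[ct] for ct in columns])
```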

Please upload your lineage tracing data using this button to initiate the analysis.




Fate bias DEGs/DARs analysis
Welcome to the fate bias analysis tools of scLTdb.

This function facilitates integrative analysis of lineage tracing barcodes and single-cell transcriptomic (scRNA-seq) or epigenomic (scATAC-seq) data. It aims to identify differentially expressed genes (DEGs) or differentially accessible regions (DARs) indicative of fate bias.
This tool first calculates the fate bias of the user-selected progenitor cell type toward mature cells. It then computes DEGs (or DARs) among progenitor cells exhibiting distinct fate biases. To use this tool, upload a gene-by-cell matrix or a peak-by-cell matrix, along with metadata for each cell, including the relevant barcodes and cell types, and then select the progenitor cell type. The tool generates a heatmap and a volcano plot to display the DEGs or DARs.
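
The two steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's actual method: a clone's fate bias is taken to be the mature cell type holding most of its non-progenitor cells, and genes are compared by the log2 fold change of mean expression with a pseudocount of 1, without any significance test. All data and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical per-cell metadata and gene-by-cell expression values
metadata = [
    {"cell": "p1", "barcode": "BC1", "celltype": "Progenitor"},
    {"cell": "p2", "barcode": "BC2", "celltype": "Progenitor"},
    {"cell": "m1", "barcode": "BC1", "celltype": "Monocyte"},
    {"cell": "m2", "barcode": "BC1", "celltype": "Monocyte"},
    {"cell": "n1", "barcode": "BC2", "celltype": "Neutrophil"},
]
expression = {
    "GeneA": {"p1": 7.0, "p2": 1.0},
    "GeneB": {"p1": 1.0, "p2": 1.0},
}


def clone_fate_bias(rows, progenitor):
    """Step 1: label each clone with its most frequent mature cell type
    (a simplifying assumption about how fate bias is defined)."""
    fates = defaultdict(Counter)
    for row in rows:
        if row["celltype"] != progenitor:
            fates[row["barcode"]][row["celltype"]] += 1
    return {bc: c.most_common(1)[0][0] for bc, c in fates.items()}


def log2_fold_change(expr, group_a, group_b, pseudocount=1.0):
    """Step 2: log2 ratio of mean expression between two cell groups."""
    result = {}
    for gene, values in expr.items():
        mean_a = sum(values.get(c, 0.0) for c in group_a) / len(group_a)
        mean_b = sum(values.get(c, 0.0) for c in group_b) / len(group_b)
        result[gene] = math.log2((mean_a + pseudocount) / (mean_b + pseudocount))
    return result


def fate_bias_lfc(rows, expr, progenitor, fate):
    """Compare progenitors biased toward `fate` with other barcoded progenitors."""
    bias = clone_fate_bias(rows, progenitor)
    biased, others = [], []
    for row in rows:
        if row["celltype"] == progenitor and row["barcode"] in bias:
            target = biased if bias[row["barcode"]] == fate else others
            target.append(row["cell"])
    if not biased or not others:
        raise ValueError("both fate bias groups need at least one progenitor cell")
    return log2_fold_change(expr, biased, others)


print(fate_bias_lfc(metadata, expression, "Progenitor", "Monocyte"))
```

A real analysis would add a statistical test and multiple-testing correction before calling a gene or region differential; the fold change here only shows how the barcode-derived groups feed into the comparison.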

Please find below the details regarding the demo data for Fate bias DEGs/DARs analysis. To access the demo data, you can simply click on the provided hyperlink.
Gene-by-cell matrix (or peak-by-cell matrix): demo data
Metadata of cells: demo data

Please first upload your gene-by-cell matrix (or peak-by-cell matrix) and metadata.

Then, choose the progenitor cell type and fate bias.



PDF tutorial


