Benchmarking of Tumour Microenvironment Cell Type Estimation From Bulk RNA

Run ConsensusTME web application

1. Upload gene expression matrix


ConsensusTME performs best following TPM normalisation for RNA-seq or quantile normalisation for microarray data.
File format should be: samples as columns and HUGO gene names as rows (see the example TCGA Ovarian file).

N.B. The app may crash due to memory restrictions; for larger files, please use the R package.
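The TPM normalisation recommended above can be sketched as follows. This is a minimal Python illustration, not part of ConsensusTME or its R package; the function name `counts_to_tpm`, the gene lengths and the counts are all hypothetical. The matrix follows the layout described above, with gene names as rows and samples as columns.

```python
def counts_to_tpm(counts, gene_lengths_kb):
    """Convert a genes x samples count matrix to TPM.

    counts: dict mapping gene name -> list of raw counts, one per sample
    gene_lengths_kb: dict mapping gene name -> transcript length in kilobases
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    # Step 1: length-normalise each count (reads per kilobase).
    rpk = {g: [c / gene_lengths_kb[g] for c in counts[g]] for g in genes}
    # Step 2: scale each sample (column) so that its values sum to one million.
    col_sums = [sum(rpk[g][j] for g in genes) for j in range(n_samples)]
    return {g: [rpk[g][j] / col_sums[j] * 1e6 for j in range(n_samples)] for g in genes}


# Hypothetical toy input: three genes (rows) and two samples (columns).
counts = {"CD8A": [10, 0], "CD4": [20, 30], "PTPRC": [30, 60]}
lengths = {"CD8A": 2.0, "CD4": 4.0, "PTPRC": 5.0}  # illustrative lengths only
tpm = counts_to_tpm(counts, lengths)
```

After conversion every sample column sums to one million, which makes expression values comparable between samples of different sequencing depth.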



Running ConsensusTME within the R environment

Tumour Purity Benchmark


This benchmark assesses the ability of the immune estimation tools to capture the global level of immune infiltration into the tumour. Tumour purity is derived from copy number and mutation data available from TCGA. Tools that estimate immune infiltration well show a strong negative correlation with tumour purity.
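As a rough illustration of this comparison, the sketch below correlates a per-sample immune score with tumour purity. The portal text does not state which correlation coefficient is used, so Spearman's rank correlation is an assumption here, and the purity values and scores are invented toy data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one purity value and one immune score per sample.
purity = [0.90, 0.75, 0.60, 0.40, 0.85]
immune_score = [0.10, 0.30, 0.45, 0.80, 0.20]
rho = spearman(purity, immune_score)
```

In this toy data the immune score falls as purity rises, so `rho` is strongly negative, which is the pattern a well-performing tool is expected to show.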


Leukocyte Methylation Benchmark


This benchmark assesses the ability of the immune estimation tools to accurately predict leukocyte infiltration in the tumour, using methylation data from TCGA samples. Previous work (Hoadley et al. Cell 2008) identified loci across the epigenome with differential methylation in immune vs tumour populations and used this information to estimate a leukocyte fraction for each sample within TCGA. We assess method accuracy on the assumption that variation in leukocyte fraction should be explained by variation in the estimates for those cell types that are leukocytes. Multiple linear regression is performed for each cancer type, and several goodness of fit metrics are used to assess performance.
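The regression and the three fit metrics used in this benchmark can be sketched as below. This is an illustrative pure-Python implementation, not the code behind the portal. The AIC and BIC use the common Gaussian forms with additive constants dropped, which is an assumption; statistical packages may report values shifted by a constant. All data are invented.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_metrics(predictors, response):
    """Ordinary least squares with an intercept, plus three fit metrics.

    predictors: one row per sample, each a list of cell type estimates
    response: one leukocyte fraction per sample
    """
    n = len(response)
    design = [[1.0] + list(row) for row in predictors]  # prepend intercept
    p = len(design[0])  # fitted coefficients, intercept included
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    rss = sum((y - f) ** 2 for y, f in zip(response, fitted))
    mean_y = sum(response) / n
    tss = sum((y - mean_y) ** 2 for y in response)
    adj_r2 = 1 - (rss / tss) * (n - 1) / (n - p)
    # Gaussian forms with additive constants dropped (assumption).
    aic = n * math.log(rss / n) + 2 * p
    bic = n * math.log(rss / n) + p * math.log(n)
    return {"adj_r2": adj_r2, "aic": aic, "bic": bic}


# Hypothetical toy data: two leukocyte cell type estimates per sample.
cell_estimates = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.5, 0.3], [0.4, 0.6], [0.7, 0.5]]
leukocyte_fraction = [0.12, 0.17, 0.23, 0.33, 0.39, 0.50]
metrics = fit_metrics(cell_estimates, leukocyte_fraction)
print(metrics)
```

All three metrics penalise extra terms, which matters here because the tools estimate different numbers of cell types and so produce models with different numbers of explanatory variables.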



Fit Metric:


Adjusted R-Squared: Higher Value = Better Model



Akaike Information Criterion: Lower Value = Better Model



Bayesian Information Criterion: Lower Value = Better Model



Image Analysis Benchmark


This benchmark assesses the ability of the immune estimation tools to accurately predict lymphocyte infiltration in the tumour, using H&E pathology slides available from TCGA. Previous work by the Thorsson group (Saltz et al. Cell Reports 2018) used a convolutional neural network to generate tumour-infiltrating lymphocyte estimates by image analysis. In this benchmark we use the lymphocyte score as the response variable for multiple linear regression, with the estimates for each cell type that is a lymphocyte as explanatory variables. Models were assessed using three goodness of fit metrics to account for varying numbers of terms (i.e. cell types) in the models. H&E slides are available for a smaller number of cancer types.



Fit Metric:


Adjusted R-Squared: Higher Value = Better Model



Akaike Information Criterion: Lower Value = Better Model



Bayesian Information Criterion: Lower Value = Better Model



MCP-Counter Colon Cancer IHC Benchmark


This benchmark replicates a validation experiment carried out in the original MCP-Counter manuscript (Becht et al. Genome Biology 2016). Here the authors used a collection of colon cancer samples with matching RNA and immunohistochemistry (IHC) staining. Performance was assessed by generating estimates from RNA with each method and correlating these against estimates from IHC. Because the methods estimate different cell types, cell sub-types are combined where appropriate to match the IHC marker categories. Although it assesses performance for only three cell types, the advantage of this benchmark is that the samples are in vivo tumour samples, recapitulating the transcriptome in which the immune estimation tools are intended to be used.
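Combining sub-types to match a coarser reference category might look like the sketch below. The category names, sub-type names and values are hypothetical placeholders, and summing is an assumption: the text above says only that sub-types are "combined", and for score-based methods another aggregation may be more appropriate.

```python
# Hypothetical mapping from method-specific sub-types to reference categories.
REFERENCE_CATEGORIES = {
    "T_cells": ["T_cells_CD4", "T_cells_CD8", "T_regulatory_cells"],
    "Cytotoxic_T_cells": ["T_cells_CD8"],
    "Macrophages": ["Macrophages_M1", "Macrophages_M2"],
}


def combine_subtypes(estimates, categories):
    """Collapse per-sample sub-type estimates into broader categories by summing.

    estimates: dict mapping sub-type name -> list of estimates, one per sample
    categories: dict mapping category name -> list of sub-type names
    """
    combined = {}
    for category, subtypes in categories.items():
        present = [s for s in subtypes if s in estimates]
        if not present:
            continue  # this method does not estimate the category at all
        n_samples = len(estimates[present[0]])
        combined[category] = [
            sum(estimates[s][j] for s in present) for j in range(n_samples)
        ]
    return combined


# Hypothetical output of one method for three samples.
method_estimates = {
    "T_cells_CD4": [0.10, 0.20, 0.05],
    "T_cells_CD8": [0.05, 0.10, 0.02],
    "Macrophages_M2": [0.20, 0.10, 0.30],
}
combined = combine_subtypes(method_estimates, REFERENCE_CATEGORIES)
```

The combined per-sample values can then be correlated against the reference measurements for the matching category.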


TIMER Bladder Carcinoma Pathologist Estimation Benchmark


This benchmark replicates a validation experiment carried out in the original TIMER manuscript (Li et al. Genome Biology 2016). Here the authors used the haematoxylin and eosin (H&E) slides from the Bladder Carcinoma study (BLCA) in TCGA. A pathologist reviewed 404 slides and classified each slide as having "Low", "Medium" or "High" neutrophil abundance. Performance was then assessed by generating neutrophil estimates from RNA for all corresponding samples and comparing the values in each category. Analysis of variance (ANOVA) with Tukey Honest Significant Difference (HSD) post-hoc tests was used to assess differences between RNA-derived scores across the pathologist estimation categories.
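The ANOVA step can be illustrated with a short sketch that computes the one-way F statistic across the three pathologist categories. The scores are invented, and the Tukey HSD post-hoc comparison is deliberately left out because it needs the studentised range distribution; in practice a statistics library would supply both.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA.

    groups: dict mapping category label -> list of scores in that category
    """
    all_values = [v for g in groups.values() for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical RNA-derived neutrophil scores grouped by pathologist category.
neutrophil_scores = {
    "Low": [1.0, 2.0, 3.0],
    "Medium": [2.0, 3.0, 4.0],
    "High": [5.0, 6.0, 7.0],
}
f_stat, df_between, df_within = one_way_anova_f(neutrophil_scores)
```

A large F statistic indicates that the mean RNA-derived score differs between at least two categories; the post-hoc test then identifies which pairs of categories differ.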


xCell PBMC Benchmark


This benchmark replicates validation experiments carried out in the original xCell manuscript (Aran et al. Genome Biology 2017). Here the authors used data collected from ImmPort, accessions SDY311 and SDY420; these contained PBMC samples from 61 and 104 healthy individuals respectively. Performance of immune estimation methods was assessed using matching RNA-Seq and CyTOF data for each of the samples. Because the methods estimate different cell types, cell sub-types are combined where appropriate to match the CyTOF categories. Although it gives an estimate of cell type specific performance, this form of benchmark uses immune cells from peripheral blood, which do not reflect the complexity of the tumour microenvironment.


CIBERSORT PBMC Benchmark


This benchmark replicates a validation experiment carried out in the original CIBERSORT manuscript (Newman et al. Nature Methods 2015). Here the authors used a collection of peripheral blood mononuclear cells (PBMCs) isolated from 20 adults receiving influenza immunization. Performance was assessed by generating estimates from RNA with each method and correlating these against flow cytometry fractions. Because the methods estimate different cell types, cell sub-types are combined where appropriate to match the flow cytometry categories. Although it gives an estimate of cell type specific performance, this form of benchmark uses immune cells from the peripheral blood of healthy individuals, which do not reflect the complexity of the tumour microenvironment.


High Grade Serous Ovarian Cancer Benchmark


This benchmark replicates a validation experiment carried out in an independent study involving ConsensusTME (Jimenez-Sanchez et al. 2020, Nature Genetics, free access article). Tumours from patients with high-grade serous ovarian cancer (HGSOC) were used. Methods were benchmarked by correlating bulk tumour mRNA-based immune estimates against IHC counts for CD4+ T Cells, CD8+ T Cells and T Regulatory Cells. N.B. This dataset was used in the original development of ConsensusTME.




Results Overview


Here the overall performance of methods across all benchmarks can be visualised. Benchmarking experiments fall into three broad categories: 1) TCGA - Benchmarks carried out using orthogonal inferences from TCGA data. 2) PBMCs - Benchmarks in which the authors used peripheral blood mononuclear cells (PBMCs) derived from circulation. 3) Bulk Tumour - Benchmarks using samples derived from the setting of a bulk tumour. The mean rank of each method is also plotted.
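One way a mean rank across benchmarks could be computed is sketched below. The method names and scores are invented, and the tie handling (tied methods share the mean of their ranks) is an assumption. Scores are treated as higher is better, so metrics where lower is better, such as AIC and BIC, would need their sign flipped first.

```python
def rank_methods(scores):
    """Rank methods within one benchmark; 1 is best (highest score).

    Methods with equal scores share the mean of the ranks they span.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        for name in ordered[i:j + 1]:
            ranks[name] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mean_rank(benchmarks):
    """Average each method's rank over every benchmark it appears in."""
    collected = {}
    for scores in benchmarks.values():
        for method, rank in rank_methods(scores).items():
            collected.setdefault(method, []).append(rank)
    return {m: sum(r) / len(r) for m, r in collected.items()}


# Hypothetical scores (higher is better) for three methods in three categories.
benchmarks = {
    "TCGA": {"method_a": 0.8, "method_b": 0.6, "method_c": 0.7},
    "PBMCs": {"method_a": 0.5, "method_b": 0.9, "method_c": 0.4},
    "Bulk Tumour": {"method_a": 0.7, "method_b": 0.6, "method_c": 0.6},
}
overall = mean_rank(benchmarks)
```

Ranking within each benchmark before averaging keeps benchmarks with different score scales from dominating the overall comparison.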


Overview of approaches & tools for cell type estimation

To suggest additional tools to be added to this list please contact Oliver.Cast@cruk.cam.ac.uk

Datasets for benchmarking & validation of cell estimation tools

To suggest additional datasets to be added to this list please contact Oliver.Cast@cruk.cam.ac.uk

Cell type estimation review and benchmarking studies

To suggest additional articles to be added to this list please contact Oliver.Cast@cruk.cam.ac.uk

Overview


This online portal serves as a companion to the manuscript published in Cancer Research:

"Comprehensive Benchmarking and Integration of Tumour Microenvironment Cell Estimation Methods"

The portal serves a dual purpose: it allows interactive exploration of large, multi-dimensional data, and it allows the benchmarking results to evolve. As new methods and new benchmarking datasets become available, the portal can be updated so that the original benchmarks are more than a snapshot in time of method performance.


ConsensusTME Approach



The approach taken by ConsensusTME is fully described in the manuscript. To summarise briefly, the approach consists of 6 main steps:

1) Multiple sources are used to compile the initial gene sets: either pre-existing, carefully curated gene sets (e.g. Danaher gene sets) or analysis of signature matrices leveraged by other methods (e.g. the LM22 matrix used by CIBERSORT).

2) Define cell types for which gene sets can be derived from at least two methods.

3) Create a unique union of genes from multiple sources.

4) Using an approach originally used by the TIMER tool, we ensure that only genes whose expression has a negative correlation with tumour purity are included. This is done on a cancer-by-cancer basis for each of the cancer types available from TCGA. It increases the confidence that an increase in the expression of a gene is due to the presence of an immune cell rather than spurious up-regulation by cancer cells.

5) Use the gene sets with a statistical framework to produce normalised enrichment scores (NESs). Currently single sample gene set enrichment analysis (ssGSEA) is employed and benchmarked. Future versions of ConsensusTME may use different approaches as they become available.

6) NES output can be used to identify differences in immune cell subtype abundance between patients.
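Steps 2 to 4 above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the ConsensusTME R package code: the function name `consensus_gene_sets`, the source names, the gene lists and the correlation values are all hypothetical, and the enrichment scoring of step 5 (ssGSEA) is not shown.

```python
def consensus_gene_sets(sources, purity_correlation, min_sources=2):
    """Build one gene set per cell type from several source gene sets.

    sources: dict mapping source name -> {cell type -> iterable of genes}
    purity_correlation: dict mapping gene -> correlation between its expression
        and tumour purity in a single cancer type
    """
    # Step 2: count how many sources define each cell type.
    support = {}
    for gene_sets in sources.values():
        for cell_type in gene_sets:
            support[cell_type] = support.get(cell_type, 0) + 1

    consensus = {}
    for cell_type, n_sources in support.items():
        if n_sources < min_sources:
            continue  # defined by too few sources
        # Step 3: unique union of genes across all sources.
        union = set()
        for gene_sets in sources.values():
            union.update(gene_sets.get(cell_type, ()))
        # Step 4: keep only genes negatively correlated with tumour purity.
        # Genes with no correlation value are dropped (assumption).
        consensus[cell_type] = sorted(
            g for g in union if purity_correlation.get(g, 0.0) < 0
        )
    return consensus


# Hypothetical inputs for a single cancer type.
sources = {
    "source_a": {"T_cells_CD8": ["CD8A", "CD8B", "GZMK"], "NK_cells": ["NCR1"]},
    "source_b": {"T_cells_CD8": ["CD8A", "PRF1"]},
}
purity_correlation = {
    "CD8A": -0.4, "CD8B": -0.3, "GZMK": 0.1, "PRF1": -0.2, "NCR1": -0.5,
}
gene_sets = consensus_gene_sets(sources, purity_correlation)
print(gene_sets)
```

Because the purity filter is applied per cancer type, the same call would be repeated with a different `purity_correlation` table for each cancer type, giving cancer-specific gene sets.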

ConsensusTME is available as an R package on GitHub: https://github.com/cansysbio/ConsensusTME

Contact


Oliver Cast: Oliver.Cast@cruk.cam.ac.uk

Alejandro Jiménez-Sánchez: ajs.scientia@gmail.com

Martin Miller
For suggestions & queries regarding the portal:

Oliver.Cast@cruk.cam.ac.uk