
Variant annotations

Cases are processed by the bioinformatics pipeline, and annotations are added by CellBase. Once a case arrives in the New IB, additional annotations are added from resources that are currently unavailable in CellBase.

| Annotation | Description | Source | Static / Live |
| --- | --- | --- | --- |
| Tiering results | Tier, segregation, mode of inheritance, penetrance | Genomics England Rare Disease Tiering, Genomics England Rare Disease pipeline | Static (Annotated by the Genomics England Rare Disease Tiering pipeline) |
| GMS PanelApp panels | Panel name and version, gene and green status | GMS PanelApp | Static (Annotated by the Genomics England Rare Disease Tiering pipeline) |
| Exomiser results | Exomiser prioritisation results, with associated scores, ranks and allele frequencies | Genomics England Rare Disease Tiering, Genomics England Rare Disease pipeline | Static (Annotated by Exomiser, as part of the Genomics England Rare Disease pipeline) |
| PanelApp panels | Panel name and version, gene status, gene mode of inheritance | PanelApp | Static (Annotated at case ingestion. The underlying dataset used for annotation is updated with the most recent PanelApp data every hour) |
| OMIM | Associated diseases and gene-phenotype mode of inheritance | CellBase | Static (Annotated at the point of case ingestion. The underlying data used for annotation is updated with the most recent OMIM data with every New IB release) |
| Gene, transcript and protein annotations (ID, flags, consequences) | Gene, transcript and associated annotations | CellBase | Static (Limited by CellBase version. Annotated at case ingestion) |
| gnomAD population frequency | Population germline allele frequency database | CellBase | Static (Limited by CellBase version. Annotated at case ingestion) |
| GEL population frequency | Internal germline allele frequency database | Genomics England dataset | Static (Annotated at the point of case ingestion. Also used within the Genomics England Rare Disease pipeline) |
| CVA | CVA entr(y/ies) for the variant, with classification and count | CVA | Live |
| ClinVar submissions | ClinVar Germline Database - accession, review status, submissions, classification, interpretation, latest submission | CellBase | Static (Limited by CellBase version. Annotated at case ingestion) |
| Segregation, MOI, penetrance | Segregation, MOI and penetrance groupings from the Tiering events | Genomics England Rare Disease Tiering, Genomics England Rare Disease pipeline | Static (Annotated by the Genomics England Rare Disease Tiering pipeline) |
| Zygosity | Variant zygosity | Variant genotype, from the variant VCF file (generated by the Genomics England Rare Disease pipeline) | Static (Annotated at case ingestion, using the VCF file) |
| Ref / alt / total reads | Reference / alternative / total read counts | Variant VCF file (generated by the Genomics England Rare Disease pipeline) | Static (Annotated at case ingestion, using the VCF file) |
| QC filter data | Indicates whether the variant passed QC filters applied during the analysis | Variant VCF file (generated by the Genomics England Rare Disease pipeline) | Static (Annotated at case ingestion, using the VCF file) |

Data sources

The sections below describe each data source used within the product and where it is used.

GEL Tiering

Prioritisation results are the results from the Genomics England Rare Disease Tiering pipeline for the variant and the applied panels. They comprise various metadata, including Tier, associated gene(s), segregation, mode of inheritance, and penetrance. These are displayed in the app in the Tier overlay, the Tier columns in the variant grids, and the SNV details pages.

For further information on GEL Tiering, visit the Tiering section in the Genomics England Rare Disease Pipeline user guide.

GMS PanelApp

GMS PanelApp panels (including version, gene and green status) are annotated by the Genomics England Rare Disease Tiering pipeline. These are displayed in the app at locations including the case summary page and the Tier overlay.

Exomiser

Exomiser is a program that finds potential disease-causing variants from whole-exome or whole-genome sequencing data. All rare disease cases are run through the Exomiser automated variant prioritisation framework. Exomiser results and explainability data are displayed in the app in the Exomiser column in the SNV grid, the accompanying Exomiser overlay, and the SNV details page.

For further information on Exomiser prioritisation, visit the Exomiser page in the Genomics England Rare Disease Pipeline user guide.

Segregation, MOI, penetrance

The segregation, MOI and penetrance groupings displayed within the New IB currently come from the GEL Tiering events. In future, it is planned that the same data will also be provided for Exomiser events and for untiered variants.

Zygosity

Zygosity is annotated per variant per individual in the case, at the point of case ingestion, using the variant VCF file. The variant VCF file is generated by the Genomics England Rare Disease pipeline.
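As an illustration of how zygosity can be derived from a VCF genotype, the sketch below classifies the GT value for one individual. The function name `zygosity_from_gt` and the labels it returns are assumptions made for this example; they are not taken from the New IB or the Genomics England Rare Disease pipeline, whose exact rules may differ.

```python
import re


def zygosity_from_gt(gt: str) -> str:
    """Classify a VCF GT value (e.g. '0/1' or '1|1') into a zygosity label.

    Hypothetical helper for illustration only; the labels are assumptions.
    """
    # Alleles are separated by '/' (unphased) or '|' (phased).
    alleles = re.split(r"[/|]", gt)
    if any(allele == "." for allele in alleles):
        return "unknown"  # at least one allele was not called
    if all(allele == "0" for allele in alleles):
        return "reference"  # no alternative allele present
    if len(alleles) == 1:
        return "hemizygous"  # single (haploid) non-reference call
    if len(set(alleles)) == 1:
        return "homozygous"  # same non-reference allele on both copies
    return "heterozygous"  # two different alleles


if __name__ == "__main__":
    for gt in ("0/1", "1|1", "0/0", "./.", "1"):
        print(gt, zygosity_from_gt(gt))
```

Because the value is read once at case ingestion, the label shown for a variant does not change unless the case is ingested again.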

Ref / alt / total reads

Reference / alternative / total read counts are annotated per individual in the case, at the point of case ingestion, using the variant VCF file. The variant VCF file is generated by the Genomics England Rare Disease pipeline.
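The sketch below shows one way read counts could be extracted from the FORMAT and sample columns of a VCF record, using the standard AD (allelic depths) and DP (read depth) keys. The function name `read_counts` is hypothetical, and the choice to take the total from DP when it is present, falling back to the sum of AD otherwise, is an assumption rather than the documented behaviour of the New IB.

```python
def read_counts(format_field: str, sample_field: str, alt_index: int = 1) -> dict:
    """Return ref / alt / total read counts for one individual.

    format_field: the VCF FORMAT column, e.g. 'GT:AD:DP'
    sample_field: the matching sample column, e.g. '0/1:12,9:21'
    alt_index:    position of the alternative allele within AD (1 = first ALT)
    """
    # Pair each FORMAT key with the value in the sample column.
    values = dict(zip(format_field.split(":"), sample_field.split(":")))

    # AD lists the depth for the reference allele first, then each ALT allele.
    depths = [int(x) if x != "." else 0 for x in values["AD"].split(",")]
    ref, alt = depths[0], depths[alt_index]

    # Assumption: prefer DP for the total, otherwise sum the allelic depths.
    dp = values.get("DP", ".")
    total = int(dp) if dp != "." else sum(depths)

    return {"ref": ref, "alt": alt, "total": total}


if __name__ == "__main__":
    print(read_counts("GT:AD:DP", "0/1:12,9:21"))
```

A record without an AD key raises a `KeyError` in this sketch; a production implementation would need to decide how to handle such records.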

QC filter data

This indicates whether the variant passed the QC filters applied during the analysis. It is annotated by the Genomics England Rare Disease pipeline and stored in the VCF file.
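In the VCF format, the FILTER column holds `PASS` when a record passed all filters, `.` when no filters were applied, and otherwise a semicolon-separated list of the filters that failed. The minimal sketch below interprets that column; the function name `qc_status` and the shape of the returned dictionary are assumptions for illustration, and the filter codes in the usage example are placeholders rather than filters known to be used by the Genomics England pipeline.

```python
def qc_status(filter_field: str) -> dict:
    """Interpret the FILTER column of a VCF record.

    Hypothetical helper: returns whether the record passed and which
    filters (if any) it failed.
    """
    if filter_field == "PASS":
        return {"passed": True, "failed_filters": []}
    if filter_field == ".":
        # Filters were not applied, so pass/fail is unknown.
        return {"passed": None, "failed_filters": []}
    # Any other value is a semicolon-separated list of failed filter codes.
    return {"passed": False, "failed_filters": filter_field.split(";")}


if __name__ == "__main__":
    print(qc_status("PASS"))
    print(qc_status("q10;s50"))  # placeholder filter codes
```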

Gene, transcript and protein annotations (Ensembl / RefSeq)

Ensembl is a genome browser that provides gene annotations.

  • CellBase annotates each variant with gene, transcript and associated annotations from Ensembl
  • The New IB displays these annotations in the variant grid and variant details page
  • The database version of Ensembl used by CellBase may not always be the most recent. Please see the latest version of the Rare Disease Genome Analysis guide and CellBase for further details.

gnomAD population frequencies

The Genome Aggregation Database (gnomAD) is a population frequency database of large-scale sequencing projects from healthy cohorts.

  • CellBase annotates whether a variant is present in gnomAD (exomes, genomes and mitochondrial datasets)
  • The New IB displays the total population germline allele frequency as well as the subpopulations
  • The database version of gnomAD used by CellBase is displayed in the New IB and may not always be the most recent.

GEL population frequencies

SNVs and CNVs are annotated with the GEL internal population allele frequency dataset upon case ingestion. N.B. this dataset is also used within the Genomics England Rare Disease pipeline; for further details, please see the Rare Disease Genome Analysis Guide.

ClinVar

ClinVar is a public database of variant and phenotype relationships alongside supporting evidence and clinical interpretation histories.

  • CellBase annotates whether a variant is present in ClinVar.
  • The New IB displays the ClinVar ID, interpretation, review status and number of entries alongside a link to the ClinVar record in the variant grid and variant details page for germline variants.
  • The database version of ClinVar used by CellBase is displayed in the New IB and may not always be the most recent. Please see the latest version of the Rare Disease Genome Analysis guide and CellBase for further details.

PanelApp

The GEL PanelApp knowledgebase allows virtual gene panels related to human disorders to be created, stored and queried and is used as the platform for achieving consensus on gene panels in the NHS Genomic Medicine Service (GMS).

  • The Ensembl gene ID of each variant is used to match the variant to Panels in PanelApp
  • The Panels in which the affected gene is present are then shown on the variant grid and variant details page
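The matching step described above can be sketched as a lookup keyed on Ensembl gene ID. Everything in this example is illustrative: the `PanelEntry` fields, the `panels_for_variant` function, and the panel names and versions are made up for the sketch and do not reflect the New IB implementation or real PanelApp content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelEntry:
    """One gene's entry on a panel (hypothetical structure)."""
    panel_name: str
    panel_version: str
    gene_status: str


def panels_for_variant(gene_ids, panel_index):
    """Collect panel entries for each Ensembl gene ID annotated on a variant.

    gene_ids:    Ensembl gene IDs from the variant's annotations
    panel_index: dict mapping Ensembl gene ID -> list of PanelEntry
    """
    matches = []
    for gene_id in gene_ids:
        # Genes that appear on no panel simply contribute nothing.
        matches.extend(panel_index.get(gene_id, []))
    return matches


if __name__ == "__main__":
    # Made-up index for illustration; panel names and versions are not real.
    index = {
        "ENSG00000139618": [
            PanelEntry("Example panel A", "1.0", "green"),
            PanelEntry("Example panel B", "2.3", "amber"),
        ],
    }
    for entry in panels_for_variant(["ENSG00000139618"], index):
        print(entry.panel_name, entry.panel_version, entry.gene_status)
```

Keying the index on gene ID means a variant that overlaps several genes is shown with the panels for each of them.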

CVA

The Clinical Variant Ark (CVA) is the knowledge base built from NHS Clinical Scientist interpretations of rare disease patients in the 100,000 Genomes Project and the NHS Genomic Medicine Service. For further information, visit the CVA user guide.

OMIM

OMIM data is used to annotate variants with associated diseases and gene-phenotype mode of inheritance. The annotation is static: it is applied at the point of case ingestion, and the underlying data used for annotation is updated with each New IB release.

Abbreviations
Abbreviation Definition
ACGS Association for Clinical Genomic Science
ACMG American College of Medical Genetics and Genomics
CDS Coding DNA Sequence
CIP-API Genomics England Clinical Interpretation API
CNV Copy Number Variant
CVA Clinical Variant Ark
EQ Exit Questionnaire
New IB New Interpretation Browser
GEL Genomics England
GMS Genomic Medicine Service
GLH Genomic Laboratory Hub
HGVS Human Genome Variation Society
HTML Hyper Text Markup Language
HSCN Health and Social Care Network (N3)
IGV Integrative Genomics Viewer
IB Interpretation Browser
IP Interpretation Portal
NGIS National Genomics Informatics System
PID Patient Identifiable Data
QC Quality Control
SoF Summary of Findings
SO Sequence Ontology
SNV Single Nucleotide Variant
SV Structural Variant
TOMS Test Order Management Service
UAT User Acceptance Testing
VCF Variant Call Format
VILs Variant Interpretation Logs
WGS Whole Genome Sequencing