Variant annotations
Cases are processed by the bioinformatics pipeline, and annotations are added by CellBase. Once a case arrives in the New IB, additional annotations are added from resources that are currently unavailable in CellBase.
| Annotation | Description | Source | Static / Live |
|---|---|---|---|
| Tiering results | Tier, segregation, mode of inheritance, penetrance | Genomics England Rare Disease Tiering, Genomics England Rare Disease pipeline | Static (Annotated by the Genomics England Rare Disease Tiering pipeline) |
| GMS PanelApp panels | Panel name and version, gene and green status | GMS PanelApp | Static (Annotated by the Genomics England Rare Disease Tiering pipeline) |
| Exomiser results | Exomiser prioritisation results, with associated scores, ranks and allele frequencies | Genomics England Rare Disease Tiering, Genomics England Rare Disease pipeline | Static (Annotated by Exomiser, as part of the Genomics England Rare Disease pipeline) |
| PanelApp panels | Panel name and version, gene status, gene mode of inheritance | PanelApp | Static (Annotated at case ingestion. The underlying dataset used for annotation is updated with the most recent PanelApp data every hour) |
| OMIM | Associated diseases and gene-phenotype mode of inheritance | CellBase | Static (Annotated at the point of case ingestion. The underlying data used for annotation is updated with the most recent OMIM data with every New IB release) |
| Gene, transcript and protein annotations (ID, flags, consequences) | Gene, transcript and associated annotations | CellBase | Static (Limited by CellBase version. Annotated at case ingestion) |
| gnomAD population frequency | Population germline allele frequency database | CellBase | Static (Limited by CellBase version. Annotated at case ingestion) |
| GEL population frequency | Internal germline allele frequency database | Genomics England dataset | Static (Annotated at the point of case ingestion. Also used within the Genomics England Rare Disease pipeline) |
| CVA | CVA entry or entries for the variant, with classification and count | CVA | Live |
| ClinVar submissions | ClinVar Germline Database - accession, review status, submissions, classification, interpretation, latest submission | CellBase | Static (Limited by CellBase version. Annotated at case ingestion) |
| Segregation, MOI, penetrance | Segregation, MOI and penetrance groupings from the Tiering events | Genomics England Rare Disease Tiering, Genomics England Rare Disease pipeline | Static (Annotated by the Genomics England Rare Disease Tiering pipeline) |
| Zygosity | Variant zygosity | Variant genotype, from the variant VCF file (generated by the Genomics England Rare Disease pipeline) | Static (Annotated at case ingestion, using the VCF file) |
| Ref / alt / total reads | Reference / alternative / total read counts | Variant VCF file (generated by the Genomics England Rare Disease pipeline) | Static (Annotated at case ingestion, using the VCF file) |
| QC filter data | Indicates whether the variant passed QC filters applied during the analysis | Variant VCF file (generated by the Genomics England Rare Disease pipeline) | Static (Annotated at case ingestion, using the VCF file) |
Data sources
To read more about each data source used within the product, and where it is used, select the tabs below.
GEL Tiering
Prioritisation results are the results from the Genomics England Rare Disease Tiering pipeline for the variant and the applied panels. They comprise various metadata, including Tier, associated gene(s), segregation, mode of inheritance and penetrance. These are displayed in the app in the Tier overlay, the Tier columns in the variant grids, and the SNV details pages.
For further information on GEL Tiering, visit the Tiering section in the Genomics England Rare Disease Pipeline user guide.
GMS PanelApp
GMS PanelApp panels (including version, gene and green status) are annotated by the Genomics England Rare Disease Tiering pipeline. These are displayed in the app in locations including the case summary page and the Tier overlay.
Exomiser
Exomiser is a program that finds potential disease-causing variants from whole-exome or whole-genome sequencing data. All rare disease cases are run through the Exomiser automated variant prioritisation framework. Exomiser results and explainability data are displayed in the app in the Exomiser column in the SNV grid, the accompanying Exomiser overlay, and the SNV details page.
For further information on Exomiser prioritisation, visit the Exomiser page in the Genomics England Rare Disease Pipeline user guide.
Segregation, MOI, penetrance
The segregation, MOI and penetrance groupings displayed within the New IB currently come from the GEL Tiering events. In future, it is planned to add the same data for Exomiser events and for untiered variants.
Zygosity
Zygosity is annotated per variant per individual in the case, at the point of case ingestion, using the variant VCF file. The variant VCF file is generated by the Genomics England Rare Disease pipeline.
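As an illustration only, the following Python sketch shows one way zygosity could be derived per individual from the `GT` (genotype) field of a VCF record. The function names and zygosity labels are hypothetical and are not taken from the New IB or the pipeline. The sketch assumes a tab-delimited VCF line with `GT` listed in the FORMAT column.

```python
def zygosity_from_gt(gt: str) -> str:
    """Classify a VCF GT value such as '0/1', '1|1' or './.'."""
    # Phased ('|') and unphased ('/') genotypes are treated alike here.
    alleles = gt.replace("|", "/").split("/")
    if any(allele == "." for allele in alleles):
        return "unknown"  # missing call
    if len(alleles) == 1:
        # Haploid call, for example on chrX in a male sample.
        return "reference" if alleles[0] == "0" else "hemizygous"
    if all(allele == "0" for allele in alleles):
        return "reference_homozygous"
    if len(set(alleles)) == 1:
        return "alternate_homozygous"
    return "heterozygous"


def zygosity_per_sample(header_line: str, record_line: str) -> dict:
    """Map each sample name in the #CHROM header line to its zygosity."""
    samples = header_line.rstrip("\n").split("\t")[9:]
    fields = record_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # position of GT in FORMAT
    return {
        name: zygosity_from_gt(value.split(":")[gt_index])
        for name, value in zip(samples, fields[9:])
    }
```

Calling `zygosity_per_sample` with the `#CHROM` header line and one data line returns a dictionary keyed by sample name, which mirrors the per-variant, per-individual annotation described above.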
Ref / alt / total reads
Reference / alternative / total read counts are annotated per individual in the case, at the point of case ingestion, using the variant VCF file. The variant VCF file is generated by the Genomics England Rare Disease pipeline.
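The sketch below is a hedged example of how read counts might be pulled from one sample column of a VCF record. It assumes the per-allele depths are stored in the `AD` FORMAT key (a reserved key in the VCF specification) and takes the total as the sum of those depths. Which field the pipeline or the New IB actually uses is not stated here, and the function name `read_counts` is hypothetical.

```python
from typing import Optional


def read_counts(format_keys: str, sample_value: str, alt_index: int = 1) -> Optional[dict]:
    """Return ref / alt / total read counts for one sample, or None if unavailable.

    format_keys  -- the FORMAT column, e.g. 'GT:AD:DP'
    sample_value -- the sample column, e.g. '0/1:10,12:22'
    alt_index    -- which allele to report as 'alt' (1 = first ALT allele)
    """
    sample = dict(zip(format_keys.split(":"), sample_value.split(":")))
    ad = sample.get("AD")
    if ad is None or "." in ad.split(","):
        return None  # no allelic depths recorded for this sample
    depths = [int(value) for value in ad.split(",")]
    if alt_index >= len(depths):
        return None  # requested allele is not present in AD
    # Assumption: total is the sum of allelic depths; DP could be used instead.
    return {"ref": depths[0], "alt": depths[alt_index], "total": sum(depths)}
```

Note that `DP` can differ from the sum of `AD` in some callers, so a real implementation would need to follow whichever definition the pipeline documents.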
QC filter data
This indicates whether the variant passed the QC filters applied during the analysis. The filter status is set by the Genomics England Rare Disease pipeline and stored in the variant VCF file.
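For readers unfamiliar with how a VCF file records filter status, here is a minimal sketch that reads the standard FILTER column (the seventh tab-delimited column). In the VCF specification, `PASS` means all filters were passed, `.` means no filters were applied, and any other value is a semicolon-separated list of failed filters. The function name and the returned dictionary shape are hypothetical; the filter names used by the pipeline are not listed here.

```python
def filter_status(record_line: str) -> dict:
    """Summarise the FILTER column of a single VCF data line."""
    filter_field = record_line.rstrip("\n").split("\t")[6]
    if filter_field == "PASS":
        return {"passed": True, "failed_filters": []}
    if filter_field == ".":
        # Filters were not applied, so pass/fail is unknown.
        return {"passed": None, "failed_filters": []}
    return {"passed": False, "failed_filters": filter_field.split(";")}
```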
Gene, transcript and protein annotations (Ensembl / RefSeq)
Ensembl is a genome browser that provides gene annotations.
- CellBase annotates each variant with gene, transcript and associated annotations from Ensembl.
- The New IB displays these annotations in the variant grid and variant details page.
- The database version of Ensembl used by CellBase may not always be the most recent. Please see the latest version of the Rare Disease Genome Analysis guide and CellBase for further details.
gnomAD population frequencies
The Genome Aggregation Database (gnomAD) is a population frequency database that aggregates data from large-scale sequencing projects of healthy cohorts.
- CellBase annotates whether a variant is present in gnomAD (exomes, genomes and mitochondrial datasets).
- The New IB displays the total population germline allele frequency as well as the frequency in each subpopulation.
- The database version of gnomAD used by CellBase is displayed in the New IB and may not always be the most recent.
GEL population frequencies
SNVs and CNVs are annotated with the GEL internal population allele frequency dataset upon case ingestion. N.B. this dataset is also used within the Genomics England Rare Disease pipeline; for further details, please see the Rare Disease Genome Analysis Guide.
ClinVar
ClinVar is a public database of variant and phenotype relationships alongside supporting evidence and clinical interpretation histories.
- CellBase annotates whether a variant is present in ClinVar.
- The New IB displays the ClinVar ID, interpretation, review status and number of entries alongside a link to the ClinVar record in the variant grid and variant details page for germline variants.
- The database version of ClinVar used by CellBase is displayed in the New IB and may not always be the most recent. Please see the latest version of the Rare Disease Genome Analysis guide and CellBase for further details.
PanelApp
The GEL PanelApp knowledgebase allows virtual gene panels related to human disorders to be created, stored and queried. It is used as the platform for achieving consensus on gene panels in the NHS Genomic Medicine Service (GMS).
- The Ensembl gene ID of each variant is used to match the variant to panels in PanelApp.
- The panels in which the affected gene is present are then shown on the variant grid and variant details page.
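The matching step described in the list above can be sketched as a simple lookup keyed by Ensembl gene ID. This is an illustrative outline only: the dictionary layout, field names (`ensembl_id`, `status`, `moi`) and function names are assumptions made for the example and do not reflect the PanelApp data model or the New IB implementation.

```python
from collections import defaultdict


def index_panels_by_gene(panels):
    """Build a lookup from Ensembl gene ID to the panels containing that gene."""
    index = defaultdict(list)
    for panel in panels:
        for gene in panel["genes"]:
            index[gene["ensembl_id"]].append(
                {
                    "panel": panel["name"],
                    "version": panel["version"],
                    "gene_status": gene["status"],
                    "moi": gene.get("moi"),  # mode of inheritance, if recorded
                }
            )
    return index


def panels_for_variant(variant_gene_ids, index):
    """Return every panel entry whose gene matches one of the variant's genes."""
    matches = []
    for gene_id in variant_gene_ids:
        # .get avoids creating empty entries for genes that are on no panel.
        matches.extend(index.get(gene_id, []))
    return matches
```

Building the index once and reusing it for every variant keeps each lookup cheap, which suits a dataset that is refreshed periodically and queried at case ingestion.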
CVA
The Clinical Variant Ark (CVA) is the knowledge base built from NHS Clinical Scientist interpretations of rare disease cases in the 100,000 Genomes Project and the NHS Genomic Medicine Service. For further information, visit the CVA user guide.
Abbreviations
| Abbreviation | Definition |
|---|---|
| ACGS | Association for Clinical Genomic Science |
| ACMG | American College of Medical Genetics and Genomics |
| CDS | Coding DNA Sequence |
| CIP-API | Genomics England Clinical Interpretation API |
| CNV | Copy Number Variant |
| CVA | Clinical Variant Ark |
| EQ | Exit Questionnaire |
| New IB | New Interpretation Browser |
| GEL | Genomics England |
| GMS | Genomic Medicine Service |
| GLH | Genomic Laboratory Hub |
| HGVS | Human Genome Variation Society |
| HTML | Hyper Text Markup Language |
| HSCN | Health and Social Care Network (N3) |
| IGV | Integrative Genomics Viewer |
| IB | Interpretation Browser |
| IP | Interpretation Portal |
| NGIS | National Genomics Informatics System |
| PID | Patient Identifiable Data |
| QC | Quality Control |
| SoF | Summary of Findings |
| SO | Sequence Ontology |
| SNV | Single Nucleotide Variant |
| SV | Structural Variant |
| TOMS | Test Order Management Service |
| UAT | User Acceptance Testing |
| VCF | Variant Call Format |
| VILs | Variant Interpretation Logs |
| WGS | Whole Genome Sequencing |