Gene-Mendelian disease mapping


While OMIM is the established reference for mapping Mendelian diseases to genes, adding gene-disease associations from DECIPHER, Orphanet and Genomics England panels in MendelVar increases discovery compared with OMIM alone (500 more genes and up to 5,000 more gene-disease relationships).

Short disease descriptions in MendelVar were sourced from OMIM, Orphanet, UniProt and DO.

Overlapping pathogenic variants


Variants downloaded from ClinVar in the variant_summary.txt.gz file were filtered to keep only those that contain "pathogenic", "likely pathogenic" or "risk factor" among their effects. Variants spanning many genes were removed, so that the retained variants are directly linked to a single gene or a very small number of genes. Variants with missing coordinates were also discarded, and phenotype descriptions were matched to OMIM (MIM) IDs whenever possible. This filtering strategy retained approximately 20% of variants (~105,000 variants per genome build). Depending on the input, user-provided or generated intervals are looked up in the GRCh37 or GRCh38 subset of the ClinVar database.
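The filtering steps above can be sketched in Python as follows. This is a minimal illustration, not the MendelVar implementation: the column names (`ClinicalSignificance`, `GeneSymbol`, `Assembly`, `Chromosome`, `Start`, `Stop`, `PhenotypeIDS`) are assumed to match the ClinVar variant_summary header, the `max_genes` cut-off of 3 is a hypothetical value (the threshold for "many genes" is not stated here), and the sample rows are toy data.

```python
import re

WANTED = {"pathogenic", "likely pathogenic", "risk factor"}
MISSING = {"", "na", "-1"}  # assumed placeholders for absent coordinates


def significance_terms(value):
    """Split a clinical significance string into lower-cased individual terms."""
    return {t.strip().lower() for t in re.split(r"[/,;|]", value) if t.strip()}


def gene_count(value):
    """Count gene symbols in a delimiter-separated GeneSymbol field."""
    return len([g for g in re.split(r"[;,|]", value) if g.strip() not in ("", "-")])


def omim_ids(value):
    """Extract MIM numbers from a PhenotypeIDS string such as 'MedGen:C1,OMIM:100000'."""
    return sorted(set(re.findall(r"OMIM:(\d+)", value)))


def filter_variants(rows, assembly, max_genes=3):
    """Keep rows for one genome build that pass all filters; add a 'MIM' list."""
    kept = []
    for row in rows:
        if row["Assembly"] != assembly:
            continue
        if not significance_terms(row["ClinicalSignificance"]) & WANTED:
            continue
        if gene_count(row["GeneSymbol"]) > max_genes:  # hypothetical threshold
            continue
        if any(row[c].strip() in MISSING for c in ("Chromosome", "Start", "Stop")):
            continue
        kept.append({**row, "MIM": omim_ids(row["PhenotypeIDS"])})
    return kept


def _row(sig, genes, assembly, start, pheno):
    # Helper for building toy rows; all values are invented for illustration.
    return {"ClinicalSignificance": sig, "GeneSymbol": genes, "Assembly": assembly,
            "Chromosome": "1", "Start": start, "Stop": start, "PhenotypeIDS": pheno}


SAMPLE_ROWS = [
    _row("Pathogenic", "GENE1", "GRCh38", "1000", "MedGen:C0000000,OMIM:100000"),
    _row("Conflicting interpretations of pathogenicity", "GENE2", "GRCh38", "2000", ""),
    _row("Pathogenic/Likely pathogenic", "G1;G2;G3;G4;G5", "GRCh38", "3000", ""),
    _row("risk factor", "GENE3", "GRCh38", "-1", ""),
    _row("Likely pathogenic, risk factor", "GENE4", "GRCh37", "4000", ""),
]

kept = filter_variants(SAMPLE_ROWS, "GRCh38")
```

Splitting the significance string into individual terms before matching is a design choice of this sketch: a plain substring test for "pathogenic" would also match values such as "Conflicting interpretations of pathogenicity".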

Gene set enrichment testing


Following identification of Mendelian disease-associated genes overlapping the GWAS loci, MendelVar uses INRICH to test for enrichment of terms associated with those genes relative to the background of all Mendelian disease-associated genes in the whole genome. Of particular interest for GWAS of complex traits, and unlike other tools, MendelVar allows simultaneous testing using Human Phenotype Ontology (HPO), Disease Ontology (DO) and Freund et al. (2018) gene sets. To give the user an overview of the general categories associated with both ontologies, MendelVar includes the official DO slim (24 terms) and a custom-generated HPO slim (mapping all terms to the 25 direct descendants of the root term HP:0000118 “Phenotypic abnormality”). In addition, more general enrichment testing is available with the human Gene Ontology and its slim, and pathway enrichment testing with human ConsensusPathDB, PathwayCommons and Reactome.
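The idea behind the custom HPO slim, mapping every term to the direct descendants of a chosen root that lie on its path to that root, can be sketched as below. This is only an illustration under stated assumptions: the `parents` dictionary and the `TOY:` identifiers are a made-up hierarchy, not real HPO terms, and the function names are hypothetical.

```python
def ancestors(term, parents):
    """Return every ancestor of `term` (excluding `term` itself)."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def slim_terms(term, parents, slim_root):
    """Map `term` to the direct children of `slim_root` that it descends from."""
    slim = {child for child, ps in parents.items() if slim_root in ps}
    return sorted((ancestors(term, parents) | {term}) & slim)


# Toy hierarchy (child -> list of parents); identifiers are invented.
TOY_PARENTS = {
    "TOY:abnormality": ["TOY:root"],
    "TOY:nervous": ["TOY:abnormality"],
    "TOY:skeletal": ["TOY:abnormality"],
    "TOY:seizure": ["TOY:nervous"],
    "TOY:macrocephaly": ["TOY:nervous", "TOY:skeletal"],
}

seizure_slim = slim_terms("TOY:seizure", TOY_PARENTS, "TOY:abnormality")
macro_slim = slim_terms("TOY:macrocephaly", TOY_PARENTS, "TOY:abnormality")
```

Because a term can have more than one parent, it may map to several slim categories, as the second toy term does here.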

We subsetted all the ontologies to only the Mendelian disease genes in the MendelVar database and removed annotations of genes with no evidence for disease causality. Similarly, the genome annotation supplied to INRICH as background contained only those disease genes, because we want to test for enrichment relative to genes linked to Mendelian disease rather than to any gene.

<aside> 💡 The R package ontologyIndex was used to propagate annotations from child terms to their parents, all the way up to the root of each ontology (DO, HPO, GO), excluding the root terms themselves because these are uninformative. For HPO, we retained only the “Phenotypic abnormality” sub-ontology and discarded children of HP:0000005 Mode of inheritance, HP:0031797 Clinical course, HP:0040279 Frequency and HP:0012823 Clinical modifier, as these encompass a small number of child terms and do not correlate with the mechanistic basis of disease.

</aside>
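The propagation described in the note above was done with the R package ontologyIndex. The same idea is sketched here in plain Python instead, purely for illustration: each gene's annotated terms are expanded with all of their ancestors, and the root terms are then dropped. The hierarchy, gene names and function names are hypothetical.

```python
def ancestors(term, parents):
    """Return every ancestor of `term` (excluding `term` itself)."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def propagate(annotations, parents, roots):
    """Add all ancestor terms to each gene's annotation set, minus root terms."""
    expanded = {}
    for gene, terms in annotations.items():
        full = set()
        for term in terms:
            full.add(term)
            full |= ancestors(term, parents)
        expanded[gene] = full - set(roots)
    return expanded


# Toy ontology (child -> list of parents) and toy gene annotations.
PARENTS = {
    "T:b": ["T:root"],
    "T:c": ["T:b"],
    "T:d": ["T:b"],
    "T:e": ["T:c", "T:d"],
}
ANNOTATIONS = {"GENE1": {"T:e"}, "GENE2": {"T:c"}}

propagated = propagate(ANNOTATIONS, PARENTS, roots=["T:root"])
```

After propagation, a gene annotated only with a specific term also counts towards every broader term above it, which is what allows enrichment to be detected at more general levels of the ontology.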