Center for the Study of Systems Biology

  • Home
  • About us
    • Members
  • Skolnick Research Group
    • Jeffrey Skolnick
    • Maximilian Brogi
    • Brendon Cara
    • Chad Choudhry
    • Brice Edelman
    • Jonathan Feldman
    • Jessica Gilmore Forness
    • Bartosz Ilkowski
    • Preetam Jukalkar
    • Giselle McPhilliamy
    • Asha Mira Rao
    • Hargobind Singh
    • Kyle Xu
    • Hongyi Zhou
    • Former Group Members
  • Publications
    • Publications
    • Preprints
  • Software and Services
    • Services
      • DESTINI
      • DR. PRODIS
      • ENTPRISE
      • ENTPRISE-X
      • FINDSITEcomb
      • FINDSITEcomb2.0
      • FRAGSITE
      • FRAGSITE2
      • FRAGSITEcombM
      • Know-GENE
      • LeMeDISCO
      • MEDICASCY
      • MOATAI-VIR
      • PHEVIR
      • PICMOA
    • Downloads
      • AF2Complex
      • AF3Complex
      • APoc
      • Cavitator
      • DBD-Hunter
      • DBD-Threader
      • EFICAz2.5
      • Fr-TM-align
      • GOAP
      • iAlign
      • IS-score
      • LIGSIFT
      • MENDELSEEK
      • PULCHRA
      • SAdLSA
      • Valsci
    • Databases
      • Apo and Holo Pairs
      • New Human GPCR Modeling and Virtual Screening
      • PDB-like Structures
    • Simulations
      • E. coli Intracellular Dynamics
  • News & Events
    • News
    • ARCHIVE: Distinguished Lecture Series in Systems Biology
  • Jobs

MENDELSEEK: An algorithm that predicts Mendelian Genes and elucidates what makes them special

MENDELSEEK

While individual Mendelian diseases (diseases caused by a single gene) are rare, their aggregate number is significant. Discovering which gene causes a Mendelian disease is crucial for accurate diagnosis and treatment. Despite decades of effort, the genetic cause of over half of identified Mendelian diseases remains unknown. To address this, we describe MENDELSEEK, a machine learning approach that predicts Mendelian genes by integrating a gene’s aggregate residue variation score with information from its pathways, Gene Ontology processes, and protein language models. In benchmarking on 16,946 human genes with 10-fold cross-validation, MENDELSEEK achieves an area under the receiver operating characteristic curve (AUC) of 0.850 and an area under the precision-recall curve (AUPR) of 0.695. By comparison, the second-best method that uses residue variation, ENTPRISE+ENTPRISE-X, scores 0.781 and 0.604, and the third-best approach, REVEL, scores 0.597 and 0.390. Mendelian genes have significantly more protein-protein interactions than non-Mendelian genes and are evolutionarily ancient. Applied to 17,858 genes across the human genome, MENDELSEEK predicts 1,024 novel Mendelian genes with a precision >0.7. Thus, MENDELSEEK represents a major improvement over the state of the art and provides valuable insights into the biochemical features that distinguish Mendelian from non-Mendelian genes.
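For readers less familiar with the evaluation described above, here is a minimal sketch of how AUC and AUPR can be computed and how genes can be split into 10 cross-validation folds. It is not MENDELSEEK code: the function names (roc_auc, average_precision, kfold_indices) are hypothetical, AUPR is approximated here by average precision, and the labels and scores are toy values invented for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision measured at the rank of
    each positive. Tied scores keep their input order (assumption)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


def kfold_indices(n_items, n_folds=10):
    """Assign item i to fold i % n_folds and return (train, test) index
    lists, so every item is tested exactly once."""
    splits = []
    for f in range(n_folds):
        test = [i for i in range(n_items) if i % n_folds == f]
        train = [i for i in range(n_items) if i % n_folds != f]
        splits.append((train, test))
    return splits


# Toy data (invented): 1 = Mendelian gene, 0 = non-Mendelian gene.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print("AUC :", round(roc_auc(labels, scores), 3))
print("AUPR:", round(average_precision(labels, scores), 3))
print("folds:", len(kfold_indices(20, n_folds=10)))
```

In a cross-validated benchmark, each gene's score would come from a model trained on the folds that exclude that gene, and the metrics would then be computed over the held-out predictions.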

Citation: Zhou H, Skolnick J. MENDELSEEK: An algorithm that predicts Mendelian Genes and elucidates what makes them special. Submitted.

Source code: The source code of MENDELSEEK is freely available on GitHub.


Copyright © 2025