The European High Performance Computing Joint Undertaking (EuroHPC JU)

Multi-layered Assessment of VarIants by Structure for Proteins (MAVISp) to classify variants in cancer driver genes

Allocated Resources (in node hours): 78,125
System Partition: Discoverer CPU
Allocation Period: 7 July 2023 - 6 July 2024

Cancer arises from genetic alterations, particularly in cancer driver genes, which are classified as tumor suppressors or oncogenes. Many of these alterations are variants of uncertain significance (VUS), whose pathogenic and functional impacts are unclear. Understanding whether VUS affect proteins, and how they act at the molecular level, is crucial for expanding our knowledge of cancer, developing personalized prognosis and treatments, and assisting genetic counseling.

To address this knowledge gap, the Cancer Structural Biology group at the Danish Cancer Society Research Center (DCRC) is developing MAVISp (Multi-layered Assessment of VarIants by Structure for Proteins), a modular framework that uses structure-based methods to predict the impact of variants in the coding regions of genes. It provides pathogenicity scores together with an explanation of the underlying molecular mechanisms and their effects on biological readouts and protein function. MAVISp aims to annotate, classify, and rationalize the effects of variants on the corresponding proteins in a standardized, high-throughput manner.

MAVISp features a publicly accessible web application (https://github.com/ELELAB/MAVISp) where the data generated by the assessment are made available to the community for consultation or further studies. A publicly hosted version of this web app (i.e. a website) is also under development. The most recent version of MAVISp, itself still under development, includes information for more than 100 different proteins and predictions for 45,000 variants. These were obtained in simple mode only, which contains no information from structural ensembles.

So far, the project has applied MAVISp only on a small scale, because scaling up requires massive computing capacity. Here, the project will apply MAVISp to collect structural ensembles from biomolecular simulations and mutational scans based on free energy calculations, and to predict the effects of tens of thousands of variants from human cancers in more than 400 tumor suppressors, oncogenes and their complexes.

The project is a key step in the development of MAVISp and will generate a wealth of data on variants in cancer, including pathogenic potential, effects at the molecular level, and links to phenotypes. The data can be harnessed for machine learning-based variant assessment and classification, or to gain a deeper understanding of the significance of certain variants in relation to clinical factors, including resistance to therapy and risk of relapse. The project has potential future applications in clinical practice, as it could enable more accurate classification of cancer-related variants and support treatment decisions.

Furthermore, MAVISp could have applications in fields beyond cancer research, such as protein engineering and drug discovery. The project's atlas of variants will be curated, deposited and maintained in the open-access MAVISp web-based platform and, once the predictions are experimentally validated, will be a valuable resource to guide cellular biologists and health professionals in cancer research.