AI Technology: Machine Learning, Deep Learning, Generative Language Modeling
This project proposes the development of a foundation model for mass spectrometry proteomics, leveraging the computing capabilities of EuroHPC resources.
By employing transformer neural networks, the model aims to learn fundamental properties of protein sequences directly from mass spectrometry data.
Such a model could serve as a versatile foundation for downstream tasks, such as predicting mass spectra from peptide sequences, and could overcome the limitations of traditional database-dependent methods and existing de novo sequencing techniques.
The project's interdisciplinary team, comprising academic researchers and the leading Bio-AI company InstaDeep, brings together years of experience in artificial intelligence, bioinformatics, and proteomics.
This approach promises not only to revolutionize protein sequencing but also to provide a robust tool for a wide range of applications within proteomics and beyond.
With a well-structured personnel and management plan, we are poised to navigate the complexities of this ambitious project, driving significant advancements in proteomics research and opening new avenues in biomarker discovery, drug development, and personalized medicine.
Timothy Jenkins, University of Denmark - Denmark