LIGATE Pose Selector Training Data

Awarded Resources (in core hours): 10,000,000

System Partition: MeluXina CPU

Allocation Period: 8 March 2023 - 7 March 2024

Computer-aided drug design can significantly reduce the time and resources needed for drug discovery because experimental high-throughput assays can be replaced with virtual screens.

Given the huge number of compounds to be tested, only molecular docking is efficient enough to screen the initial compound library to identify promising drug candidates to be studied further. However, docking poses are scored with the help of approximate energy functions that often capture solvent effects inadequately and that completely ignore entropic contributions to protein-ligand binding.

Due to this systematic bias, the ensemble of docking structures may differ significantly from the biologically relevant ensemble of the respective protein-ligand complex in solution. To overcome these shortcomings, we plan to perform short MD simulations with explicit solvent for 4 million docking poses generated for 20,000 protein-ligand complexes curated in PDBbind.
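To give a sense of scale, the figures quoted in this summary can be turned into a rough per-pose budget. The sketch below is only back-of-the-envelope arithmetic: it assumes that the whole allocation of 10,000,000 core hours goes into the MD simulations and that poses and core hours are spread evenly, neither of which is stated in the summary. The helper name `per_unit` is hypothetical.

```python
# Figures quoted in the project summary.
AWARDED_CORE_HOURS = 10_000_000
DOCKING_POSES = 4_000_000
COMPLEXES = 20_000


def per_unit(total: float, units: int) -> float:
    """Return the even share of `total` across `units` items."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total / units


# Assumption: poses are distributed evenly over the complexes.
poses_per_complex = per_unit(DOCKING_POSES, COMPLEXES)

# Assumption: the full allocation is spent on the MD simulations,
# with the same cost for every pose.
core_hours_per_pose = per_unit(AWARDED_CORE_HOURS, DOCKING_POSES)
core_hours_per_complex = per_unit(AWARDED_CORE_HOURS, COMPLEXES)

print(f"Poses per complex:      {poses_per_complex:.0f}")
print(f"Core hours per pose:    {core_hours_per_pose:.1f}")
print(f"Core hours per complex: {core_hours_per_complex:.0f}")
```

Under these assumptions, each complex contributes about 200 poses and each pose receives about 2.5 core hours, which is consistent with the description of the MD simulations as short.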

From these MD simulations, we calculate absolute binding free energies and assign them to the respective docking pose as its new docking score. The free energies computed within this project will provide the training data for a machine-learning model encoding this free energy docking score, equipping the software package LiGen with a docking score that incorporates both solvent and entropic effects in the evaluation of docking poses.


Country and Research Team Institution