Isfahan University of Medical Sciences

Science Communicator Platform

DrugTar Improves Druggability Prediction by Integrating Large Language Models and Gene Ontologies



Authors: Niloofar Borhani; Iman Izadi; Ali Motahharynia; Mahsa Sheikholeslami; Yousof Gheisari

Source: Bioinformatics; Published: 2025


Abstract

Motivation: Target discovery is crucial in drug development, especially for complex chronic diseases. Recent advances in high-throughput technologies and the explosion of biomedical data have highlighted the potential of computational druggability prediction methods. However, most current methods combine sequence-based features with machine learning and often face challenges related to hand-crafted features, reproducibility, and accessibility. Moreover, the potential of raw sequence and protein structure has not been fully investigated.

Results: Here, we leveraged both protein sequence and structure using deep learning techniques, revealing that protein sequence, especially in the form of pre-trained embeddings, is more informative than protein structure. Next, we developed DrugTar, a high-performance deep learning algorithm that integrates sequence embeddings from the ESM-2 pre-trained protein language model with gene ontologies to predict druggability. DrugTar achieved area under the curve (AUC) and area under the precision-recall curve (AUPRC) values of 0.94, outperforming state-of-the-art methods. In conclusion, DrugTar streamlines target discovery, a bottleneck in developing novel therapeutics.

Availability and implementation: DrugTar is available as a web server at www.DrugTar.com. The data and source code are at https://github.com/NBorhani/DrugTar.

© 2025 Elsevier B.V., All rights reserved.
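The feature-fusion idea in the abstract (a sequence embedding combined with gene ontology annotations and passed to a neural classifier) can be illustrated with a small Python sketch. This is not the DrugTar implementation: the function names, layer sizes, toy 8-dimensional embedding, four-term GO vocabulary, and random untrained weights are all assumptions made for illustration, and a real pipeline would take its embedding from a protein language model such as ESM-2 rather than from random numbers.

```python
import math
import random


def fuse_features(seq_embedding, go_terms, go_vocab):
    """Concatenate a sequence embedding with a multi-hot GO annotation vector."""
    go_vector = [1.0 if term in go_terms else 0.0 for term in go_vocab]
    return list(seq_embedding) + go_vector


def init_layer(n_in, n_out, rng):
    """Create random weights and zero biases for one dense layer (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply a dense layer: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def predict_druggability(features, hidden_layer, output_layer):
    """Forward pass: dense -> ReLU -> dense -> sigmoid, giving a score in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(features, hidden_layer)]
    logit = dense(hidden, output_layer)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Hypothetical toy inputs: a stand-in embedding and an illustrative GO vocabulary.
EMBED_DIM = 8  # far smaller than real protein language model embeddings
GO_VOCAB = ["GO:0005886", "GO:0004672", "GO:0005634", "GO:0006468"]
seq_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
protein_go_terms = {"GO:0005886", "GO:0004672"}

features = fuse_features(seq_embedding, protein_go_terms, GO_VOCAB)
hidden_layer = init_layer(len(features), 16, rng)
output_layer = init_layer(16, 1, rng)
score = predict_druggability(features, hidden_layer, output_layer)

print(f"feature length: {len(features)}, untrained score: {score:.3f}")
```

Because the weights are random, the printed score carries no biological meaning; the sketch only shows how the two feature sources can be joined into a single input vector before classification.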