Isfahan University of Medical Sciences

Science Communicator Platform

Predicting Deleterious Missense Genetic Variants Via Integrative Supervised Nonnegative Matrix Tri-Factorization



Authors
Arani AA (1, 2); Sehhati M (3, 4); Tabatabaiefar MA (4, 5)
Affiliations
  1. Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  2. Student Research Committee, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  3. Department of Bioinformatics, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  4. Deputy of Research and Technology, GTaC Corp, Isfahan University of Medical Sciences, Isfahan, Iran
  5. Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran

Source: Scientific Reports. Published: 2021


Abstract

Among the many types of genetic variation, missense variants are a major class, and a small subset of them may disrupt protein function and ultimately lead to human disease. Various machine learning methods have been proposed to differentiate deleterious from benign missense variants using a large number of features, including structure, sequence, interaction networks, gene-disease associations, and phenotypes. However, a reliable and accurate algorithm for merging heterogeneous information is still highly needed, as it could capture the information contained in the complex interaction networks in which genes participate. In this study, we propose a new method based on the non-negative matrix tri-factorization clustering method. We outline two versions of the proposed method: a two-source and a three-source algorithm. The two-source algorithm aggregates individual deleteriousness prediction methods and the protein-protein interaction (PPI) network, and the three-source algorithm additionally incorporates gene-disease associations. Four benchmark datasets were employed for internal and external validation of both algorithms. The results on all datasets confirmed that our method outperforms most state-of-the-art variant prediction tools. Two key features of our variant effect prediction method are worth mentioning. First, although incorporating gene-disease information in the three-source algorithm can improve prediction performance compared with the two-source algorithm, our method is not hindered by type 2 circularity error, unlike some recent ensemble-based prediction methods. Type 2 circularity error occurs when a predictor annotates variants on the basis of the genes on which they are located. Second, our predictor performs better than other ensemble-based methods for variants located on genes for which little pathogenicity information is available. © 2021, The Author(s).
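For readers unfamiliar with the factorization named in the abstract, the following is a minimal sketch of plain non-negative matrix tri-factorization, which approximates a non-negative matrix X as F S Gᵀ with all three factors non-negative. It uses the standard unconstrained multiplicative updates and is not the supervised, multi-source algorithm of the paper. The function name `nmtf`, the toy matrix, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def nmtf(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Generic NMTF sketch: X (n x m) ~ F (n x k1) @ S (k1 x k2) @ G.T (k2 x m).

    Hypothetical helper, not the authors' implementation. Uses plain
    multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k1))
    S = rng.random((k1, k2))
    G = rng.random((m, k2))
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a ratio of non-negative terms,
        # so the factors stay non-negative; eps avoids division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors

# Toy relation matrix with two row blocks and two column blocks (made-up data).
X = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 0, 1, 0],
    [5, 5, 1, 0, 0],
    [0, 0, 4, 5, 5],
    [0, 1, 5, 4, 5],
    [1, 0, 5, 5, 4],
], dtype=float)

F, S, G, errors = nmtf(X, k1=2, k2=2)

# Rows and columns can be assigned to clusters by their largest factor entry.
row_clusters = F.argmax(axis=1)
col_clusters = G.argmax(axis=1)
print("reconstruction error:", errors[-1])
print("row clusters:", row_clusters)
print("column clusters:", col_clusters)
```

In this sketch, F and G act as soft cluster indicators for the rows and columns of X, and S captures how row clusters relate to column clusters. An integrative variant would share factors across several relation matrices; how the paper does this is not specified in the abstract.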