Isfahan University of Medical Sciences

Science Communicator Platform

Genetic Variant Effect Prediction by Supervised Nonnegative Matrix Tri-Factorization



Authors: Arani AA (1); Sehhati M (2); Tabatabaiefar MA (3, 4)
Affiliations:
  1. Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  2. Department of Bioinformatics, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  3. Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
  4. GTaC Corp., Deputy of Research and Technology, Isfahan University of Medical Sciences, Isfahan, Iran

Source: Molecular Omics. Published: 2021


Abstract

Discriminating between deleterious and neutral mutations among the numerous non-synonymous single nucleotide variants (nsSNVs) observed through whole exome sequencing (WES) is a major challenge. Many machine learning methods have been developed to predict variant consequences from protein amino acid sequences, from protein structures, or from their integration with features extracted from various gene-level data and phenotype information. Because the features are numerous and the sources heterogeneous, the choice of integration method plays an important role in predictive models. In this study, we proposed a novel supervised nonnegative matrix tri-factorization (sNMTF) algorithm to integrate current variant prediction scores with gene-level data and disease networks. A new feature space was constructed by integrating all input data using sNMTF, providing appropriate inputs for training a classifier. We assessed the proposed model on two benchmark datasets. The first contained 11 207 deleterious and 19 839 neutral nsSNPs; the second contained 4416 deleterious and 4960 neutral nsSNPs. On both datasets, our supervised NMTF method outperformed existing nsSNV effect prediction approaches, whether ensemble-based or not, with a prediction accuracy on average 15% higher than that of other ensemble scores. In addition, excluding any one kind of data integrated into the final model led to a substantial decrease in deleterious variant prediction performance. The proposed model can be used as an extensible framework for integrating more heterogeneous sources. © The Royal Society of Chemistry 2021.
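To make the factorization idea concrete, here is a minimal sketch of plain (unsupervised) nonnegative matrix tri-factorization, X ≈ F S Gᵀ, fitted with the standard multiplicative updates for the Frobenius-norm objective. It is not the authors' sNMTF: the abstract does not describe the supervised term or how the gene-level data and disease networks enter the model, so none of that is shown. The function name `nmtf`, the ranks `k1` and `k2`, and the random toy matrix standing in for a variants-by-scores matrix are all assumptions made for illustration.

```python
import numpy as np

def nmtf(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (n x m) as F @ S @ G.T.

    F (n x k1), S (k1 x k2) and G (m x k2) stay nonnegative because each
    update multiplies the current factor by a ratio of nonnegative terms.
    This is generic unsupervised NMTF, not the paper's supervised variant.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k1))
    S = rng.random((k1, k2))
    G = rng.random((m, k2))
    # Track the reconstruction error, starting from the random initialization.
    errors = [np.linalg.norm(X - F @ S @ G.T)]
    for _ in range(n_iter):
        # Each factor is updated in turn with the other two held fixed;
        # eps guards against division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors

# Hypothetical toy input: 30 "variants" described by 8 nonnegative "scores".
X = np.random.default_rng(1).random((30, 8))
F, S, G, errors = nmtf(X, k1=4, k2=3)
print(F.shape, S.shape, G.shape)
print("initial error:", errors[0], "final error:", errors[-1])
```

In a pipeline of the kind the abstract describes, the rows of a factor such as `F` would play the role of the new feature space passed to a classifier; how the published method constructs that space is specified in the paper itself, not in this sketch.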