Publication
LGCA-VHPPI: A local-global residue context aware viral-host protein-protein interaction predictor
Muhammad Nabeel Asim; Muhammad Ali Ibrahim; Muhammad Imran Malik; Andreas Dengel; Sheraz Ahmed
In: PLoS ONE (PLOS), Vol. 7, Pages 1-26, PLOS, 2022.
Abstract
Viral-host protein-protein interaction (PPI) analysis is essential for decoding the molecular
mechanisms of viral pathogen and host immunity processes, which ultimately helps to control
viral diseases and optimize therapeutics. The state-of-the-art viral-host PPI predictor leverages
an unsupervised embedding learning technique (doc2vec) to generate statistical representations of
viral-host protein sequences and a Random Forest classifier for interaction
prediction. However, the doc2vec approach generates the statistical representations of viral-host protein
sequences by merely modelling the local context of residues, which only partially
captures residue semantics. The paper in hand proposes a novel technique for generating
better statistical representations of viral and host protein sequences based on the infusion
of comprehensive local and global contextual information of the residues. Local residue
context aware encoding captures semantic relatedness and short-range dependencies
of residues, while global residue context aware encoding captures comprehensive long-range
residue dependencies, positional invariance of residues, and unique residue combination
distribution important for interaction prediction. Using concatenated rich statistical representations
of viral and host protein sequences, a robust machine learning framework, LGCA-VHPPI, is developed,
which uses a deep forest model to effectively model the complex non-linearity of viral-host PPI sequences.
An in-depth performance comparison of the proposed LGCA-VHPPI framework with existing viral-host PPI predictors
based on diverse sequence encoding schemes reveals that LGCA-VHPPI outperforms the state-of-the-art predictor
by 6%, 2%, and 2% in terms of Matthews correlation coefficient on 3 different benchmark
viral-host PPI prediction datasets.
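
For readers unfamiliar with the reported metric, the following is a minimal sketch of how the Matthews correlation coefficient is computed for a binary interaction/non-interaction classifier from confusion-matrix counts. The function name `matthews_corrcoef_from_counts` and the example counts are illustrative assumptions and are not taken from the paper; returning 0.0 when the denominator is zero is a common convention, also assumed here.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined MCC (an empty row or column) is reported as 0.
        return 0.0
    return numerator / denominator


# Hypothetical counts for illustration only (not results from the paper).
score = matthews_corrcoef_from_counts(tp=90, tn=80, fp=20, fn=10)
print(f"MCC = {score:.3f}")
```

Unlike accuracy, this score uses all four confusion-matrix cells, which is why it is often preferred when interacting and non-interacting pairs are imbalanced.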