

Briefings in Bioinformatics paper notes

Current problem: These methods are mostly based on the features of sequence neighbors, and are thus limited in their ability to capture spatial information.


PPI site prediction is recast as a graph node classification task.

Key techniques: initial residual and identity mapping.

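Idea: treat the protein as an undirected graph and protein site prediction as a node classification problem. Evolutionary and structural information is integrated to construct the node features, and pairwise amino acid distances are calculated to construct the adjacency matrix. Initial residual and identity mapping are then used to build a deep graph convolutional framework that captures information from high-order amino acid neighborhoods.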



To ensure that the training and test sets follow similar distributions in terms of interacting percentages (i.e., the same proportion of PPI sites in both?), the datasets are merged.

Final split: 335 protein chains (Train_335) and 60 chains (Test_60).

Test_315: used to validate the model's generalization.

UBtest_31: used to evaluate the robustness of GraphPPIS and the impact of conformational changes on method performance, using the corresponding unbound structures.

Question 1: how is the pairwise amino acid distance computed?
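One possible answer, as a minimal sketch: take one representative 3D coordinate per residue and compute the Euclidean distance between every pair. The choice of the Cα atom and the function name `pairwise_distance_map` are assumptions made for illustration; these notes do not record which atom the paper uses.

```python
import numpy as np

def pairwise_distance_map(coords):
    """Return the L x L matrix of Euclidean distances between residues.

    coords: array of shape (L, 3), one representative atom per residue
    (assumed here to be the C-alpha atom; this is an assumption).
    """
    coords = np.asarray(coords, dtype=float)
    # Broadcasting builds all pairwise coordinate differences: shape (L, L, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy coordinates in angstroms (made-up values, not real protein data).
toy_coords = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 20.0]]
dist = pairwise_distance_map(toy_coords)
```

The result is symmetric with a zero diagonal, which is the shape of input a distance-cutoff step expects.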

Node features:

Two groups of amino acid features: evolutionary information (PSSM and HMM) and structural properties (DSSP).

Evolutionary information:

PSSM: position-specific scoring matrix

HMM: hidden Markov model

Structural properties:

DSSP provides three kinds of structural information.
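The node feature construction described above can be sketched as a column-wise concatenation of the per-residue feature groups. This is only an illustration: the function name `build_node_features`, the random values, and the column counts in the toy example are placeholders, not figures taken from these notes.

```python
import numpy as np

def build_node_features(pssm, hmm, dssp):
    """Concatenate per-residue feature groups column-wise.

    Each input has shape (L, d_k): one row per residue, where d_k is
    the width of that feature group.
    """
    groups = [np.asarray(g, dtype=float) for g in (pssm, hmm, dssp)]
    n_residues = {g.shape[0] for g in groups}
    if len(n_residues) != 1:
        raise ValueError("all feature groups must cover the same residues")
    return np.concatenate(groups, axis=1)

# Toy example: 5 residues, placeholder widths (assumed for illustration).
L = 5
rng = np.random.default_rng(0)
pssm = rng.random((L, 20))  # evolutionary information
hmm = rng.random((L, 20))   # evolutionary information
dssp = rng.random((L, 14))  # structural properties
X = build_node_features(pssm, hmm, dssp)
```

Each row of `X` is then the feature vector of one graph node (one residue).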

How the adjacency matrix is obtained


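Step 2: convert the protein distance map into an adjacency matrix by setting entries less than or equal to the chosen cutoff to 1 and entries greater than the cutoff to 0. Cutoff: 14 Å.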
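The thresholding step can be sketched in a few lines. The function name `distance_map_to_adjacency` and the toy distance values are hypothetical; only the "less than or equal to the cutoff gives 1" rule and the 14 Å cutoff come from these notes.

```python
import numpy as np

CUTOFF = 14.0  # angstroms, the cutoff recorded in these notes

def distance_map_to_adjacency(dist, cutoff=CUTOFF):
    """Binarize a distance map: 1 where distance <= cutoff, else 0."""
    dist = np.asarray(dist, dtype=float)
    return (dist <= cutoff).astype(np.int64)

# Toy distance map in angstroms (made-up values).
toy_dist = np.array([[0.0, 5.0, 20.0],
                     [5.0, 0.0, 14.0],
                     [20.0, 14.0, 0.0]])
adj = distance_map_to_adjacency(toy_dist)
```

Because each residue is at distance 0 from itself, the diagonal of `adj` is 1 under this rule, so every node carries a self-loop.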

Different strategy: process the protein distance maps into continuous matrices whose values lie in the range of 0 to 1. Entries less than or equal to the cutoff are normalized with a formula.

GCN with initial residual and identity mapping
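To make the two techniques concrete, here is a minimal NumPy sketch of a single layer, assuming it follows the published GCNII formulation, H' = ReLU(((1 - α) P H + α H0)((1 - β) I + β W)). The function names, the toy graph, and the values of `alpha` and `beta` are assumptions for illustration, not settings recorded in these notes.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcnii_layer(h, h0, p, w, alpha, beta):
    """One layer: ReLU(((1-alpha) P H + alpha H0) ((1-beta) I + beta W))."""
    # Initial residual: mix the propagated features with the first-layer input.
    support = (1.0 - alpha) * (p @ h) + alpha * h0
    # Identity mapping: blend the weight matrix with the identity matrix.
    mix = (1.0 - beta) * np.eye(w.shape[0]) + beta * w
    return np.maximum(support @ mix, 0.0)

# Toy graph with self-loops and random features (made-up values).
rng = np.random.default_rng(0)
adj = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
p = normalize_adjacency(adj)
h0 = rng.random((3, 4))
w = rng.normal(size=(4, 4))
out = gcnii_layer(h0, h0, p, w, alpha=0.7, beta=0.0)
```

With `beta=0.0` the weight term reduces to the identity, so the layer only mixes each node's neighborhood average with its initial features. That is the property that lets many such layers be stacked without the node representations collapsing toward one another.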



