
Briefings in Bioinformatics paper walkthrough

Current problem: These methods are mostly based on features of neighboring residues in the sequence, and are thus limited in capturing spatial information.

Model: GraphPPIS

PPI site prediction is recast as a graph node classification task.

Key techniques: initial residual connections and identity mapping.

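Idea: treat the protein as an undirected graph and PPI site prediction as a node classification problem. Evolutionary and structural information is integrated to construct the node features, and pairwise amino acid distances are calculated to construct the adjacency matrix. Initial residual connections and identity mapping are then used to build a deep graph convolutional framework that captures information from high-order amino acid neighborhoods.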

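Three datasets:

Dset_186, Dset_72, Dset_164

To ensure that the training and test sets follow similar distributions in terms of the percentage of interacting residues (i.e. a similar share of PPI sites in each?), the three datasets are merged.

Final split: 335 protein chains for training (Train_335) and 60 chains for testing (Test_60).

Test_315: used to check how well the model generalizes.

UBtest_31: used to evaluate the robustness of GraphPPIS and the impact of conformational changes on method performance; it contains the corresponding unbound structures.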

Question 1: how is the pairwise amino acid distance computed?

Node features:

Two groups of amino acid features: evolutionary information (PSSM and HMM) and structural properties (DSSP).

Evolutionary information:

PSSM: position-specific scoring matrix

HMM: hidden Markov model

Structural properties:

DSSP provides three kinds of structural information.
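
A minimal sketch of how the two feature groups could be combined into one node feature matrix. The notes do not give the feature dimensions, so the widths used here (20 for PSSM, 20 for HMM, 14 for DSSP) and the function name `build_node_features` are assumptions made for illustration, and the random arrays merely stand in for real profiles.

```python
import numpy as np

def build_node_features(pssm, hmm, dssp):
    """Concatenate per-residue feature blocks column-wise.

    Each input has shape (n_residues, n_features_of_that_block).
    Returns an array of shape (n_residues, total_features).
    """
    blocks = [np.asarray(b, dtype=np.float32) for b in (pssm, hmm, dssp)]
    n_residues = blocks[0].shape[0]
    for block in blocks:
        if block.ndim != 2 or block.shape[0] != n_residues:
            raise ValueError("every block needs shape (n_residues, k)")
    return np.concatenate(blocks, axis=1)

# Toy protein with 5 residues; random values replace real profiles.
# The block widths 20 / 20 / 14 are assumptions, not taken from the notes.
rng = np.random.default_rng(0)
n = 5
pssm = rng.random((n, 20))
hmm = rng.random((n, 20))
dssp = rng.random((n, 14))

X = build_node_features(pssm, hmm, dssp)
print(X.shape)  # (5, 54)
```

Each row of `X` is the feature vector of one residue, i.e. one node of the protein graph.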

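How the adjacency matrix is obtained:

Step 1: from the protein's PDB file, obtain the atomic coordinates of each amino acid, then compute the Euclidean distance between every pair of residues to form a distance map.

Step 2: convert the distance map into an adjacency matrix by setting entries less than or equal to the chosen cutoff to 1 and entries greater than the cutoff to 0. Cutoff: 14 Å.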
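
The two steps above can be sketched with NumPy as follows. This is only an illustration: it assumes each residue is represented by a single 3D coordinate (which atom to use is not stated in these notes), and the helper names `distance_map` and `adjacency_from_distances` are made up for the example. Parsing the PDB file is left out.

```python
import numpy as np

def distance_map(coords):
    """Pairwise Euclidean distances between residues.

    coords: array of shape (n_residues, 3), one point per residue
    (assumption: a single representative atom per residue).
    """
    coords = np.asarray(coords, dtype=np.float64)
    diff = coords[:, None, :] - coords[None, :, :]   # (n, n, 3)
    return np.sqrt((diff ** 2).sum(axis=-1))         # (n, n)

def adjacency_from_distances(dist, cutoff=14.0):
    """Binary adjacency: 1 where distance <= cutoff, else 0."""
    return (np.asarray(dist) <= cutoff).astype(np.float32)

# Three toy residues placed on a line, coordinates in angstroms.
coords = np.array([[0.0, 0.0, 0.0],
                   [10.0, 0.0, 0.0],
                   [30.0, 0.0, 0.0]])
dist = distance_map(coords)
adj = adjacency_from_distances(dist, cutoff=14.0)
print(adj)
# [[1. 1. 0.]
#  [1. 1. 0.]
#  [0. 0. 1.]]
```

Because the distance of a residue to itself is 0, the diagonal is always 1, so the resulting graph carries self-loops without any extra step.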

A different strategy: process the protein distance maps into continuous matrices whose values lie in the range 0 to 1. Entries less than or equal to the cutoff are normalized with a formula.

GCN with initial residual and identity mapping
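
As a rough illustration of these two techniques, here is a NumPy sketch of one graph convolution layer in the commonly used GCNII form, H_next = ReLU(((1 - alpha) * P @ H + alpha * H0) @ ((1 - beta) * I + beta * W)). Whether GraphPPIS matches this form exactly is an assumption here; the values of `alpha` and `lam`, the layer count, and the random weights are hypothetical and only serve the demo.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 A D^-1/2 (A already has self-loops)."""
    adj = np.asarray(adj, dtype=np.float64)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcnii_layer(h, h0, p, weight, alpha, beta):
    """One layer with initial residual (alpha) and identity mapping (beta)."""
    # Initial residual: mix propagated features with the first-layer input h0.
    support = (1.0 - alpha) * (p @ h) + alpha * h0
    # Identity mapping: support @ ((1 - beta) * I + beta * W).
    out = (1.0 - beta) * support + beta * (support @ weight)
    return np.maximum(out, 0.0)  # ReLU

# Toy graph: 3 nodes, self-loops on the diagonal, 4 hidden features.
adj = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
p = normalize_adjacency(adj)

rng = np.random.default_rng(0)
h0 = rng.random((3, 4))
alpha, lam, n_layers = 0.5, 1.0, 4   # hypothetical hyperparameters

h = h0
for layer in range(1, n_layers + 1):
    beta = np.log(lam / layer + 1.0)           # decays with depth
    w = rng.normal(scale=0.1, size=(4, 4))     # stand-in for learned weights
    h = gcnii_layer(h, h0, p, w, alpha, beta)

print(h.shape)  # (3, 4)
```

The `alpha * h0` term keeps every layer tied to the input features, and the shrinking `beta` keeps deep layers close to an identity transform; together these are what allow many layers to be stacked so that information from high-order neighborhoods can be aggregated.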

 

 
