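ProtPIC (Protein and Peptide Isoelectric Point Calculator) is a machine-learning-based bioinformatics tool for predicting the isoelectric point (pI) of proteins and peptides, along with residue-level pKa values. The tool automatically detects the input sequence type: short sequences (≤60 aa) are handled by the peptide model, which needs only the amino acid sequence; long sequences (>60 aa) are handled by the protein model, which first computes structural features (RSA, pLDDT, secondary structure) and then incorporates them into the prediction.
The page returns the isoelectric point; for proteins, it also returns the pKa values of the charged residues.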
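The length-based routing described above can be sketched as follows. This is only an illustration: `parse_fasta`, `choose_model`, and the constant names are hypothetical and not part of ProtPIC; the 60-residue threshold and the 10-record limit are taken from this page.

```python
from typing import List, Tuple

PEPTIDE_MAX_LEN = 60  # sequences up to this length go to the peptide model
MAX_RECORDS = 10      # the input form accepts up to 10 FASTA records


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs (hypothetical helper)."""
    records: List[Tuple[str, str]] = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None or chunks:
                records.append((header or "unnamed", "".join(chunks)))
            header, chunks = line[1:].strip() or "unnamed", []
        else:
            chunks.append(line.upper())
    if header is not None or chunks:
        records.append((header or "unnamed", "".join(chunks)))
    if len(records) > MAX_RECORDS:
        raise ValueError(f"at most {MAX_RECORDS} FASTA records are supported")
    return records


def choose_model(sequence: str) -> str:
    """Return 'peptide' for sequences of <=60 residues, otherwise 'protein'."""
    return "peptide" if len(sequence) <= PEPTIDE_MAX_LEN else "protein"


if __name__ == "__main__":
    demo = ">short\nACDEFGHIK\n>long\n" + "A" * 61
    for name, seq in parse_fasta(demo):
        print(name, len(seq), choose_model(seq))
```

In this sketch a sequence of exactly 60 residues is routed to the peptide model, matching the "≤60 aa" rule stated above.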
1. Amino acid sequences (up to 10 FASTA records):
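Technical highlights
- Structural information fusion: integrates RSA (relative solvent accessibility), pLDDT (structure confidence), secondary structure, and related features
- Physical priors: includes the theoretical values computed from the Henderson-Hasselbalch equation as input features
- High-accuracy prediction: pKa prediction MAE = 0.338 pH units; protein pI prediction MAE = 0.581 pH units; peptide pI prediction MAE = 0.118 pH units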
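To illustrate the kind of Henderson-Hasselbalch value mentioned in the list above, here is a minimal sketch that computes a theoretical pI by bisection on the net charge. The pKa constants are approximate, commonly used textbook-style values chosen for illustration; they are an assumption and not necessarily the set ProtPIC uses. The function names are hypothetical as well.

```python
# Illustrative pKa values (assumption, not ProtPIC's own parameters).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}           # basic side chains
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}  # acidic side chains
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    # Fraction protonated (charged) for basic groups, deprotonated for acidic.
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in sequence.upper():
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def theoretical_pi(sequence: str, tol: float = 1e-4) -> float:
    """Find the pH where the net charge is zero by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # Net charge decreases monotonically with pH.
        if net_charge(sequence, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for seq in ("ACDK", "KKKK", "DDDD"):
        print(seq, round(theoretical_pi(seq), 2))
```

A value computed this way ignores structural context, which is presumably why it is used here as one feature among others rather than as the final prediction.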
Model performance metrics
Performance comparison (vs. IPC 2.0)
------------------------------------------------------------
[pKa prediction] Rosetta dataset, 260 residues
------------------------------------------------------------
Metric             IPC 2.0          ProtPIC (ours)
------------------------------------------------------------
MAE                0.3364           0.3011
RMSE               0.5762           0.6589
Outliers (>0.5)    54 (20.8%)       36 (15.9%)
------------------------------------------------------------
------------------------------------------------------------
[Protein pI prediction] 581 proteins, IPC2.protein.svr.19
------------------------------------------------------------
Metric             IPC 2.0          ProtPIC (ours)
------------------------------------------------------------
MAE                0.5906           0.5760
RMSE               0.8479           0.8356
R²                 0.5934           0.6077
Outliers (>0.5)    247 (42.5%)      232 (39.9%)
------------------------------------------------------------
------------------------------------------------------------
[Peptide pI prediction] 29,774 peptides, IPC2.peptide.Conv2D
------------------------------------------------------------
Metric             IPC 2.0          ProtPIC (ours)
------------------------------------------------------------
MAE                0.1216           0.118
RMSE               0.2216           0.228
R²                 0.9761           0.975
Outliers (>0.25)   2,691 (9.0%)     2,878 (9.7%)
------------------------------------------------------------
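For readers who want to reproduce metrics of the kind shown in the tables above on their own predictions, the following sketch computes MAE, RMSE, R², and the outlier count. It assumes R² is the coefficient of determination (1 − SS_res/SS_tot) and that an outlier is a prediction whose absolute error exceeds the threshold (0.5 for pKa and protein pI, 0.25 for peptide pI); the page does not spell out either definition, and the function name is hypothetical.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    outlier_threshold: float = 0.5,
) -> Dict[str, float]:
    """Compute MAE, RMSE, R^2 and outlier statistics for paired values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    n_outliers = sum(1 for e in errors if abs(e) > outlier_threshold)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when all true values are identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "outliers": n_outliers,
        "outlier_pct": 100.0 * n_outliers / n,
    }


if __name__ == "__main__":
    observed = [4.0, 5.0, 6.0, 7.0]
    predicted = [4.1, 5.0, 5.4, 7.2]
    print(regression_metrics(observed, predicted, outlier_threshold=0.5))
```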
Performance by pKa / pI range
------------------------------------------------------------
[pKa] ProtPIC (ours)
------------------------------------------------------------
pKa Range    Count    MAE      RMSE     Outliers (>0.5)
------------------------------------------------------------
pKa < 4      89       0.239    0.557    13 (14.6%)
pKa 4-6      70       0.231    0.484    7 (10.0%)
pKa 6-8      24       0.345    0.616    4 (16.7%)
pKa 8-10     7        0.646    0.777    4 (57.1%)
pKa > 10     36       0.494    1.065    8 (22.2%)
------------------------------------------------------------
------------------------------------------------------------
[Protein pI] ProtPIC (ours), 581 proteins
------------------------------------------------------------
pI Range           Count    MAE      RMSE     Outliers (>0.5)
------------------------------------------------------------
Acidic (<5)        156      0.560    0.814    64 (41.0%)
Neutral (5-7)      328      0.454    0.676    101 (30.8%)
Basic (7-9)        76       0.916    1.164    48 (63.2%)
Very Basic (>9)    21       1.376    1.531    19 (90.5%)
------------------------------------------------------------
------------------------------------------------------------
[Peptide pI] ProtPIC (ours), 29,774 peptides
------------------------------------------------------------
pI Range                Count    MAE      RMSE     Outliers (>0.25)
------------------------------------------------------------
Very Acidic (<4)        4,578    0.064    0.098    130 (2.8%)
Acidic (4-5)            9,879    0.087    0.156    330 (3.3%)
Neutral-Acidic (5-6)    2,160    0.323    0.423    1,167 (54.0%)
Neutral (6-7)           7,297    0.145    0.273    818 (11.2%)
Basic (7-9)             5,795    0.099    0.230    373 (6.4%)
Very Basic (>9)         65       0.534    0.630    60 (92.3%)
------------------------------------------------------------
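As a consistency check, per-range rows can be pooled back into overall figures: MAE pools as a count-weighted mean, and RMSE as the square root of the count-weighted mean of the squared per-range RMSE. The sketch below does this for the protein pI rows; the function name is hypothetical, and because the table values are rounded to three decimals the pooled numbers agree with the overall table only approximately (about 0.576 for MAE and 0.835 for RMSE, against the reported 0.5760 and 0.8356).

```python
import math
from typing import List, Tuple

# (count, MAE, RMSE, outliers) per range, copied from the protein pI table.
PROTEIN_PI_RANGES: List[Tuple[int, float, float, int]] = [
    (156, 0.560, 0.814, 64),   # Acidic (<5)
    (328, 0.454, 0.676, 101),  # Neutral (5-7)
    (76, 0.916, 1.164, 48),    # Basic (7-9)
    (21, 1.376, 1.531, 19),    # Very Basic (>9)
]


def pool_ranges(
    rows: List[Tuple[int, float, float, int]]
) -> Tuple[int, float, float, int]:
    """Pool per-range statistics into (total, MAE, RMSE, outliers)."""
    total = sum(n for n, _, _, _ in rows)
    # MAE is a mean of absolute errors, so it pools as a weighted mean.
    mae = sum(n * m for n, m, _, _ in rows) / total
    # RMSE pools through the mean squared error, not through RMSE itself.
    rmse = math.sqrt(sum(n * r * r for n, _, r, _ in rows) / total)
    outliers = sum(o for _, _, _, o in rows)
    return total, mae, rmse, outliers


if __name__ == "__main__":
    total, mae, rmse, outliers = pool_ranges(PROTEIN_PI_RANGES)
    print(total, round(mae, 3), round(rmse, 3), outliers,
          f"{100.0 * outliers / total:.1f}%")
```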
Last updated: 2026-05-13
References
Kozlowski LP. IPC 2.0: prediction of isoelectric point and pKa dissociation constants. Nucleic Acids Res. 2021 Jul 2;49(W1):W285-W292. doi: 10.1093/nar/gkab295. PMID: 33905510; PMCID: PMC8262712.