Overview
我们将学习设计蛋白质的方法,其中涉及灵感也是骨架的固定。
这个方案是在FastRelax过程中设计蛋白质,一个单独的类-FastDesign-相对比FastRelax有更多的参数可以选择,但是从根本上来说,他们是一样的。Baker lab 的诸多文章中都包含了FastDesign/RelaxedDesign方案。
Resfile语法
import logging
logging.basicConfig(level=logging.INFO)
import pyrosetta
import pyrosetta.toolbox
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
Initialize PyRosetta
pyrosetta.init("-ignore_unrecognized_res 1 -ex1 -ex2aro -detect_disulf 0")
start_pose = pyrosetta.toolbox.rcsb.pose_from_rcsb("1AB1", ATOM=True, CRYS=False)
pose = start_pose.clone()
scorefxn = pyrosetta.create_score_function("ref2015_cart.wts")
Make of list of which residues are cysteine:
cys_res = []
for i, aa in enumerate(start_pose.sequence(), start=1):
if aa == "C":
cys_res.append(i)
print(cys_res)
Design strategy:
设计半胱氨酸(二硫键)。然后使用resfile re-pack所有的侧链,通过FastRelax使主链和侧链的二面角最小化。
要写resfile,我们需要了解需要改变哪一条链。通过pose.pdb_info()可以看到这个pose中只含有一条链。
更加程序化,可以通过pyrosetta.rosetta.core.pose.conf2pdb_chain(pose)来查看pose中的链,它将返回pyrosetta.rosetta.std.map_unsigned_long_char。
print(pyrosetta.rosetta.core.pose.conf2pdb_chain(pose))
map_unsigned_long_char{1: A}
for k, v in pyrosetta.rosetta.core.pose.conf2pdb_chain(pose).items():
print(v)
A
然后我们可以写一个resfile 只改变A链中的半胱氨酸。
resfile = "./outputs/resfile"
with open(resfile, "w") as f:
f.write("NATAA\n")
f.write("start\n")
for i in cys_res:
f.write("{0} {1} ALLAAxc\n".format(i, pyrosetta.rosetta.core.pose.conf2pdb_chain(pose)[1]))
!cat {resfile}
NATAA
start
3 A ALLAAxc
4 A ALLAAxc
16 A ALLAAxc
26 A ALLAAxc
32 A ALLAAxc
40 A ALLAAxc
我们同样可以通过XML文件中的ResfileCommandOperation来实现。
然后我们需要给FastRelax设置一个TaskOperations,用来告诉FastRelax在packing的时候去design和pack哪一个氨基酸。使用IncludeCurrent 来包含在packing过程中的侧链构象。
# The task factory accepts all the task operations
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
# These are pretty standard
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())
# Include the resfile
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.ReadResfile(resfile))
# Convert the task factory into a PackerTask to take a look at it
packer_task = tf.create_task_and_apply_taskoperations(pose)
# View the PackerTask
print(packer_task)
#Packer_Task
Threads to request: ALL AVAILABLE
resid pack? design? allowed_aas
1 TRUE FALSE THR:NtermProteinFull
2 TRUE FALSE THR
3 TRUE TRUE ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4 TRUE TRUE ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
5 TRUE FALSE PRO
6 TRUE FALSE SER
7 TRUE FALSE ILE
8 TRUE FALSE VAL
9 TRUE FALSE ALA
10 TRUE FALSE ARG
11 TRUE FALSE SER
12 TRUE FALSE ASN
13 TRUE FALSE PHE
14 TRUE FALSE ASN
15 TRUE FALSE VAL
16 TRUE TRUE ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
17 TRUE FALSE ARG
18 TRUE FALSE LEU
19 TRUE FALSE PRO
20 TRUE FALSE GLY
21 TRUE FALSE THR
22 TRUE FALSE SER
23 TRUE FALSE GLU
24 TRUE FALSE ALA
25 TRUE FALSE ILE
26 TRUE TRUE ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
27 TRUE FALSE ALA
28 TRUE FALSE THR
29 TRUE FALSE TYR
30 TRUE FALSE THR
31 TRUE FALSE GLY
32 TRUE TRUE ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
33 TRUE FALSE ILE
34 TRUE FALSE ILE
35 TRUE FALSE ILE
36 TRUE FALSE PRO
37 TRUE FALSE GLY
38 TRUE FALSE ALA
39 TRUE FALSE THR
40 TRUE TRUE ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
41 TRUE FALSE PRO
42 TRUE FALSE GLY
43 TRUE FALSE ASP
44 TRUE FALSE TYR
45 TRUE FALSE ALA
46 TRUE FALSE ASN:CtermProteinFull
继续设置MoveMap 和 MoveMapFactory,使在FastDesign里面最小化的时候可以改变转角。
# Set up a MoveMapFactory
mmf = pyrosetta.rosetta.core.select.movemap.MoveMapFactory()
mmf.all_bb(setting=True)
mmf.all_bondangles(setting=True)
mmf.all_bondlengths(setting=True)
mmf.all_chi(setting=True)
mmf.all_jumps(setting=True)
mmf.set_cartesian(setting=True)
一些Movers只接受MoveMap作为输入,为了向后兼容,可以使用MoveMapFactory 中的create_movemap_from_pose(pose)来创造一个MoveMap。
# Set up a MoveMap
# mm = pyrosetta.rosetta.core.kinematics.MoveMap()
# mm.set_bb(True)
# mm.set_chi(True)
# mm.set_jump(True)
# If needed, you could turn off bb and chi torsions for individual residues like this:
# vector1 of true/false for each residue in the pose
# subset_to_minimize = do_something_set.apply(pose)
# for i in range(1, pose.size() + 1):
# if (not subset_to_minimize[i]):
# mm.set_bb(i, False)
# mm.set_chi(i, False)
我们 double-check一下pose来观察是否已经为FastRelax做好准备。
display_pose = pyrosetta.rosetta.protocols.fold_from_loops.movers.DisplayPoseLabelsMover()
display_pose.tasks(tf)
display_pose.movemap_factory(mmf)
display_pose.apply(pose)
fr = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=scorefxn, standard_repeats=1)
fr.cartesian(True)
fr.set_task_factory(tf)
fr.set_movemap_factory(mmf)
fr.min_type("lbfgs_armijo_nonmonotone") # For non-Cartesian scorefunctions, use "dfpmin_armijo_nonmonotone"
#Note that this min_type is automatically set when you set the cartesian option.
# But it is good to be aware of this - as not all protocols will do this for you.
#fr.set_movemap(mm) # Could have optionally specified a MoveMap instead of MoveMapFactory
#fr.minimize_bond_angles(True) # If not using MoveMapFactory, could specify bond angle minimization here
#fr.minimize_bond_lengths(True) # If not using MoveMapFactory, could specify bond length minimization here
Run Fast(Design)!
%time fr.apply(pose)
Analysis
计算Cα的RMSD
pyrosetta.rosetta.core.scoring.CA_rmsd(start_pose, pose)
0.798085629940033
design前后pose的得分变化
delta_total_score = scorefxn(pose) - scorefxn(start_pose)
print(delta_total_score)
-1377.51918428
计算改变的氨基酸前后能量的变化
for i in cys_res:
pose_total_score = pyrosetta.rosetta.protocols.relax.get_per_residue_scores(pose, pyrosetta.rosetta.core.scoring.ScoreType.total_score)[i - 1]
start_pose_total_score = pyrosetta.rosetta.protocols.relax.get_per_residue_scores(start_pose, pyrosetta.rosetta.core.scoring.ScoreType.total_score)[i - 1]
print("The delta total_score for residue {} between start_pose and pose is {}".format(i, pose_total_score - start_pose_total_score))
The delta total_score for residue 3 between start_pose and pose is -12.881750115
The delta total_score for residue 4 between start_pose and pose is -162.36925048
The delta total_score for residue 16 between start_pose and pose is -34.5286984711
The delta total_score for residue 26 between start_pose and pose is -7.32796169396
The delta total_score for residue 32 between start_pose and pose is -6.70455775554
The delta total_score for residue 40 between start_pose and pose is -7.55635449526