pyrosetta-15 Ligand Refinement in PyRosetta (a.k.a. High-Resolution Local Docking) Using the liga...

初始化PyRosetta并设置输入姿势:

import logging
logging.basicConfig(level=logging.INFO)
import os
import pyrosetta
import pyrosetta.distributed
import pyrosetta.distributed.viewer as viewer

# 其中TPA指的是配体
params_file = "inputs/TPA.gasteiger.fa.params"
flags = f"""
-extra_res_fa {params_file} # Provide a custom TPA .params file
-ignore_unrecognized_res 1
-mute all
"""
pyrosetta.distributed.init(flags)
pose = pyrosetta.io.pose_from_file("inputs/test_lig.pdb")

在进行配体优化之前,让我们使用pyrosetta.distributed.viewer来查看输入的pdb文件的情况:

chE = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("E")

view = viewer.init(pose)
view.add(viewer.setStyle())
view.add(viewer.setStyle(command=({"hetflag": True}, {"stick": {"colorscheme": "brownCarbon", "radius": 0.2}})))
view.add(viewer.setSurface(residue_selector=chE, opacity=0.7, color='white'))
view.add(viewer.setHydrogenBonds())
view()
输入的pdb

重新启动Jupyter Notebook内核以正确地重新初始化PyRosetta

import logging
logging.basicConfig(level=logging.INFO)
import os
import pyrosetta
import pyrosetta.distributed
import pyrosetta.distributed.viewer as viewer

下面的配体改进的例子来自于~Rosetta/main/source/src/python/PyRosetta/src/demo/D120_Ligand_interface.py:

使用rosetta全原子对接(DockingHighRes)来进行配体与蛋白的对接,输入为配体-蛋白的复合物 <pdb_filename>以及相关的链<partners>。如果配体的参数文件没有默认倍载入到PyRosetta中,<ligand_params>必须提供包括配体的文件列表及参数。<jobs>与输出一起执行名为<job_output>_(job#).pdb的结构。

注:全局对接是罗塞塔对接协议解决的一个问题,与其它protocols一样需要结合界面的检测和优化,这些任务分为质心(结合假面检测)和无质心的高分辨率(界面优化)方法,低分辨率配体蛋白预测是不可能的,因此,只有高分辨率的配体蛋白界面优化是可用的。如果添加扰动或随机化步骤,高分辨率阶段可能会失败。扰动步骤可以使其成为全局对接算法,然而刚体在优化之前进行采样需要大量采样才能产生精确的结果,并且该算法将大部分精力花在优化(这对于界面的预测可能是无用的)。

此脚本执行配体-蛋白质界面结构预测,但不执行进行全局配体蛋白对接。由于没有通用的结合界面检测,输入PDB文件必须将配体放置在界面附近。如果DockMCMProtocol应用于pose在没有放置在界面附近的情况下,这个优化可能:
-浪费在错误的结合界面的采样
-通过预测与真实界面相距甚远的错误界面而失败
-通过从蛋白质中分离配体而失败(通常是由于冲突)
DockMCMProtocol不需要独立的随机化或扰动步骤来“seed”其预测。

额外的细化步骤可以提高预测构象的准确性(refinement.py)。构象剧烈变化应该避免;

关于FoldTree的事项,可看我之前的文章。

def sample_ligand_interface(pdb_filename,
                            partners,
                            ligand_params=[""],
                            jobs=1,
                            job_output="ligand_output"):
    """
    Performs ligand-protein docking using Rosetta fullatom docking
    (DockingHighRes) on the ligand-protein complex in  <pdb_filename>
    using the relative chain  <partners>. If the ligand parameters
    (a .params file) are not defaultly loaded into PyRosetta,
    <ligand_params> must supply the list of files including the ligand
    parameters. <jobs>  trajectories are performed with output
    structures named <job_output>_(job#).pdb.
    
    Note: Global docking, a problem solved by the Rosetta DockingProtocol,
    requires interface detection and refinement as with other protocols,
    these tasks are split into centroid (interface detection) and
    high-resolution (interface refinement) methods without a centroid 
    representation, low-resolution ligand-protein prediction is not
    possible and as such, only the high-resolution ligand-protein 
    interface refinement is available. If you add a perturbation or 
    randomization step, the high-resolution stages may fail. A perturbation
    step CAN make this a global docking algorithm however the rigid-body
    sampling preceding refinement requires extensive sampling to produce
    accurate results and this algorithm spends most of its effort in
    refinement (which may be useless for the predicted interface).
    
    This script performs ligand-protein interface structure prediction but does NOT
    perform global ligand-protein docking. Since there is no generic interface
    detection, the input PDB file must have the ligand placed near the interface
    that will be refined. If the DockMCMProtocol is applied to a pose
    without placement near the interface, then the refinement may:
        -waste steps sampling the wrong interface
        -fail by predicting an incorrect interface very far from the true interface
        -fail by separating the ligand from the protein (usually due to a clash)
    DockMCMProtocol does not require an independent randomization or perturbation
    step to "seed" its prediction.
    
    Additional refinement steps may increase the accuracy of the predicted
    conformation (see refinement.py). Drastic moves (large conformational changes)
    should be avoided; if they precede the protocol, the problems above may occur,
    if they succeed the protocol, the protocol results may be lost.
    """

    # Declare working directory and output directory
    working_dir = os.getcwd()
    output_dir = "outputs"
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    
    # Initialize PyRosetta
    pyrosetta.init()
    
    # Create an empty pose from the desired PDB file
    pose = pyrosetta.rosetta.core.pose.Pose()

    # If the params list has contents, load .params files
    # Note: this method of adding ligands to the ResidueTypeSet is unnecessary
    # if you call pyrosetta.init("-extra_res_fa {}".format(ligand_params))
    if len(ligand_params) != 0 and ligand_params[0] != "":
        ligand_params = pyrosetta.Vector1(ligand_params)
        res_set = pose.conformation().modifiable_residue_type_set_for_conf()
        res_set.read_files_for_base_residue_types(ligand_params)
        pose.conformation().reset_residue_type_set_for_conf(res_set)
    
    # Load pdb_filename into pose
    pyrosetta.io.pose_from_file(pose, pdb_filename)

    # Setup the docking FoldTree
    # the method setup_foldtree takes an input pose and sets its
    #    FoldTree to have jump 1 represent the relation between the two docking
    #    partners, the jump points are the residues closest to the centers of
    #    geometry for each partner with a cutpoint at the end of the chain,
    # the second argument is a string specifying the relative chain orientation
    #    such as "A_B" of "LH_A", ONLY TWO BODY DOCKING is supported and the
    #    partners MUST have different chain IDs and be in the same pose (the
    #    same PDB), additional chains can be grouped with one of the partners,
    #    the "_" character specifies which bodies are separated
    # the third argument...is currently unsupported but must be set (it is
    #    supposed to specify which jumps are movable, to support multibody
    #    docking...but Rosetta doesn't currently)
    # the FoldTrees setup by this method are for TWO BODY docking ONLY!
    dock_jump = 1 # jump number 1 is the inter-body jump
    pyrosetta.rosetta.protocols.docking.setup_foldtree(pose,
                                                       partners,
                                                       pyrosetta.Vector1([dock_jump]))

    # Create ScoreFunctions for centroid and fullatom docking
    scorefxn = pyrosetta.create_score_function("ligand.wts")

    # Setup the high resolution (fullatom) docking protocol using DockMCMProtocol.
    docking = pyrosetta.rosetta.protocols.docking.DockMCMProtocol()
    # Many of its options and settings can be set using the setter methods.
    docking.set_scorefxn(scorefxn)

    # Change directory temporarily for output
    os.chdir(output_dir)
    
    # Setup the PyJobDistributor
    jd = pyrosetta.toolbox.py_jobdistributor.PyJobDistributor(job_output,
                                                              jobs, scorefxn,
                                                              compress=False)
    
    # Set the native pose so that the output scorefile contains the pose rmsd metric
    jd.native_pose = pose 
    
    # Optional: setup a PyMOLObserver
    # pyrosetta.rosetta.protocols.moves.AddPyMOLObserver(test_pose, True)

    # Perform protein-ligand docking
    # counter = 0 # for pretty output to PyMOLObserver
    
    while not jd.job_complete:
        test_pose = pose.clone() # Reset test pose to original structure
        
        # counter += 1 # Change the pose name, for pretty output to PyMOLObserver
        # test_pose.pdb_info().name(job_output + '_' + str(counter))
        
        # Perform docking and output to PyMOL:
        docking.apply(test_pose) 

        # Write the decoy structure to disk:
        jd.output_decoy(test_pose)
    
    os.chdir(working_dir)

然后我们测试一下,并只执行一轮,

if not os.getenv("DEBUG"):
    sample_ligand_interface("inputs/test_lig.pdb", "E_X",
                                  ligand_params=["inputs/TPA.gasteiger.fa.params"],
                                  jobs=1,
                                  job_output="test_lig")

结果说明:
PyJobDistributor将输出每个轨迹的最低得分pose(作为.pdb文件),并在输出/<job_output>.fasc中记录得分。通常,得分最低的decoy包含对蛋白质配体构象的最佳预测。对接产生的PDB文件将在其预测构象中包含两个对接的partners。当检查这些PDB文件(或PyMOLObserver输出)时,请注意PyMOL可以引入或预测不存在的键,特别是对于近原子。当使用PyMOLMover.keep_history功能时,这种情况很少发生(因为PyRosetta将对一些有冲突的构象空间进行采样)。

PyMOLOObserver将输出一系列由DockingProtocol直接生成的结构。不幸的是,这可能包括Protocol中没有任何了解的中间结构。大量结构输出到PyMOL,您的机器可能难以加载所有这些结构。如果出现这种情况,请尝试将PyMOLBobserver keep_history更改为False,或者在不使用PyMOLBObserver的情况下运行Protocol。

**界面结构预测有助于考虑哪些物理性质在结合事件中很重要以及发生了哪些构象变化。一旦有了使用PyRosetta的经验,您就可以轻松地编写脚本来研究Rosetta分数术语和结构特征。配体结合结果没有一般的解释。尽管Rosetta评分不能直接转化为物理意义(它不是物理能量),但拆分对接的partners并比较评分(packing or refinement后)可以表明键合相互作用的强度。

Restart Jupyter Notebook kernel to properly re-initialize PyRosetta

import logging
logging.basicConfig(level=logging.INFO)
import os, sys
import pyrosetta
import pyrosetta.distributed
import pyrosetta.distributed.viewer as viewer

params_file = "inputs/TPA.gasteiger.fa.params"
flags = f"""
-extra_res_fa {params_file}
-ignore_unrecognized_res 1
-mute all
"""
pyrosetta.distributed.init(flags)
pose = pyrosetta.io.pose_from_file("expected_outputs/test_lig_0.pdb")

配体优化完成后查看结果如何

chE = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("E")

view = viewer.init(pose)
view.add(viewer.setStyle())
view.add(viewer.setStyle(command=({"hetflag": True}, {"stick": {"colorscheme": "brownCarbon", "radius": 0.2}})))
view.add(viewer.setSurface(residue_selector=chE, opacity=0.7, color='white'))
view.add(viewer.setHydrogenBonds())
view()
配体优化后

下面,编写一个名为ligand_refinement_from_command_line.py的函数sample_ligand_interface的替代版本,并进行以下修改:

  1. 使用pyrossetta.init()方法将配体加载到Rosetta数据库中,而不是修改ResidueTypeSet数据库。
  2. 将scorefunction更改为talaris2014
    从命令行运行它(注意:已经为您添加了optparse模块)。
    注意:注意,以下单元格的第一行使用ipython魔术命令%%file,该文件将单元格内容的剩余部分写入文件outputs/ligand_refinement_from_command_line.py:
%%file outputs/ligand_refinement_from_command_line.py
import optparse
import os
import pyrosetta


def sample_ligand_interface(pdb_filename,
                            partners,
                            ligand_params=[""],
                            jobs=1,
                            job_output="ligand_output"):
    """
    Performs ligand-protein docking using Rosetta fullatom docking
    (DockingHighRes) on the ligand-protein complex in  <pdb_filename>
    using the relative chain  <partners>. If the ligand parameters
    (a .params file) are not defaultly loaded into PyRosetta,
    <ligand_params> must supply the list of files including the ligand
    parameters. <jobs>  trajectories are performed with output
    structures named <job_output>_(job#).pdb.
    
    Note: Global docking, a problem solved by the Rosetta DockingProtocol,
    requires interface detection and refinement as with other protocols,
    these tasks are split into centroid (interface detection) and
    high-resolution (interface refinement) methods without a centroid 
    representation, low-resolution ligand-protein prediction is not
    possible and as such, only the high-resolution ligand-protein 
    interface refinement is available. If you add a perturbation or 
    randomization step, the high-resolution stages may fail. A perturbation
    step CAN make this a global docking algorithm however the rigid-body
    sampling preceding refinement requires extensive sampling to produce
    accurate results and this algorithm spends most of its effort in
    refinement (which may be useless for the predicted interface).
    
    This script performs ligand-protein interface structure prediction but does NOT
    perform global ligand-protein docking. Since there is no generic interface
    detection, the input PDB file must have the ligand placed near the interface
    that will be refined. If the DockMCMProtocol is applied to a pose
    without placement near the interface, then the refinement may:
        -waste steps sampling the wrong interface
        -fail by predicting an incorrect interface very far from the true interface
        -fail by separating the ligand from the protein (usually due to a clash)
    DockMCMProtocol does not require an independent randomization or perturbation
    step to "seed" its prediction.
    
    Additional refinement steps may increase the accuracy of the predicted
    conformation (see refinement.py). Drastic moves (large conformational changes)
    should be avoided; if they precede the protocol, the problems above may occur,
    if they succeed the protocol, the protocol results may be lost.
    """

    # Declare working directory and output directory
    working_dir = os.getcwd()
    output_dir = "outputs"
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    
    # Initialize PyRosetta
    pyrosetta.init()
    
    # Create an empty pose from the desired PDB file
    pose = pyrosetta.rosetta.core.pose.Pose()

    # If the params list has contents, load .params files
    # Note: this method of adding ligands to the ResidueTypeSet is unnecessary
    # if you call pyrosetta.init("-extra_res_fa {}".format(ligand_params))
    if len(ligand_params) != 0 and ligand_params[0] != "":
        ligand_params = pyrosetta.Vector1(ligand_params)
        res_set = pose.conformation().modifiable_residue_type_set_for_conf()
        res_set.read_files_for_base_residue_types(ligand_params)
        pose.conformation().reset_residue_type_set_for_conf(res_set)
    
    # Load pdb_filename into pose
    pyrosetta.io.pose_from_file(pose, pdb_filename)

    # Setup the docking FoldTree
    # the method setup_foldtree takes an input pose and sets its
    #    FoldTree to have jump 1 represent the relation between the two docking
    #    partners, the jump points are the residues closest to the centers of
    #    geometry for each partner with a cutpoint at the end of the chain,
    # the second argument is a string specifying the relative chain orientation
    #    such as "A_B" of "LH_A", ONLY TWO BODY DOCKING is supported and the
    #    partners MUST have different chain IDs and be in the same pose (the
    #    same PDB), additional chains can be grouped with one of the partners,
    #    the "_" character specifies which bodies are separated
    # the third argument...is currently unsupported but must be set (it is
    #    supposed to specify which jumps are movable, to support multibody
    #    docking...but Rosetta doesn't currently)
    # the FoldTrees setup by this method are for TWO BODY docking ONLY!
    dock_jump = 1 # jump number 1 is the inter-body jump
    pyrosetta.rosetta.protocols.docking.setup_foldtree(pose,
                                                       partners,
                                                       pyrosetta.Vector1([dock_jump]))

    # Create a copy of the pose for testing
    test_pose = pose.clone()

    # Create ScoreFunctions for centroid and fullatom docking
    scorefxn = pyrosetta.create_score_function("ligand")

    # Setup the high resolution (fullatom) docking protocol using DockMCMProtocol.
    docking = pyrosetta.rosetta.protocols.docking.DockMCMProtocol()
    # Many of its options and settings can be set using the setter methods.
    docking.set_scorefxn(scorefxn)

    # Change directory temporarily for output
    os.chdir(output_dir)
    
    # Setup the PyJobDistributor
    jd = pyrosetta.toolbox.py_jobdistributor.PyJobDistributor(job_output,
                                                              jobs, scorefxn,
                                                              compress=False)

    # Set the native pose so that the output scorefile contains the pose rmsd metric
    jd.native_pose = pose 
    
    # Optional: setup a PyMOLObserver
    # pyrosetta.rosetta.protocols.moves.AddPyMOLObserver(test_pose, True)

    # Perform protein-ligand docking
    # counter = 0 # for pretty output to PyMOLObserver
    
    while not jd.job_complete:
        test_pose = pose.clone() # Reset test pose to original structure
        
        # counter += 1 # Change the pose name, for pretty output to PyMOLObserver
        # test_pose.pdb_info().name(job_output + '_' + str(counter))
        
        docking.apply(test_pose) # Perform docking and output to PyMOL

        # Write the decoy structure to disc
        jd.output_decoy(test_pose)
    
    os.chdir(working_dir)

if __name__ == "__main__":
    
    # Declare parser object for managing input options
    parser = optparse.OptionParser()
    parser.add_option("--pdb_filename",
                      dest="pdb_filename",
                      help="The PDB file containing the ligand and protein to dock.")
    parser.add_option("--partners", 
                      dest="partners",
                      default = "A_X",
                      help="The relative chain partners for docking.")
    parser.add_option("--ligand_params",
                      dest="ligand_params",
                      help="The ligand residue parameter file.")
    parser.add_option("--jobs", 
                      dest="jobs",
                      default="1",
                      help="The number of jobs (trajectories) to perform.")
    parser.add_option("--job_output", 
                      dest="job_output",
                      default = "ligand_output",
                      help="The name preceding all output, output PDB files and scorefile.")
    (options, args) = parser.parse_args()
    
    # Catch input erros
    if not options.pdb_filename:
        parser.error("pdb_filename not given!")
    if not options.ligand_params:
        parser.error("ligand_params not given!")

    # Run ligand refinement protocol
    sample_ligand_interface(pdb_filename=options.pdb_filename,
                            partners=options.partners,
                            ligand_params=options.ligand_params.split(","),
                            jobs=int(options.jobs),
                            job_output=options.job_output)

覆盖outputs/ligand_refinement_from_command_line.py
从这个Jupyter笔记本的命令行运行outputs/ligand_refinement_from_command_line.py

%%file outputs/ligand_refinement_from_command_line.py
import optparse
import os
import pyrosetta


def sample_ligand_interface(pdb_filename,
                            partners,
                            ligand_params=[""],
                            jobs=1,
                            job_output="ligand_output"):
    """
    Performs ligand-protein docking using Rosetta fullatom docking
    (DockingHighRes) on the ligand-protein complex in  <pdb_filename>
    using the relative chain  <partners>. If the ligand parameters
    (a .params file) are not defaultly loaded into PyRosetta,
    <ligand_params> must supply the list of files including the ligand
    parameters. <jobs>  trajectories are performed with output
    structures named <job_output>_(job#).pdb.
    
    Note: Global docking, a problem solved by the Rosetta DockingProtocol,
    requires interface detection and refinement as with other protocols,
    these tasks are split into centroid (interface detection) and
    high-resolution (interface refinement) methods without a centroid 
    representation, low-resolution ligand-protein prediction is not
    possible and as such, only the high-resolution ligand-protein 
    interface refinement is available. If you add a perturbation or 
    randomization step, the high-resolution stages may fail. A perturbation
    step CAN make this a global docking algorithm however the rigid-body
    sampling preceding refinement requires extensive sampling to produce
    accurate results and this algorithm spends most of its effort in
    refinement (which may be useless for the predicted interface).
    
    This script performs ligand-protein interface structure prediction but does NOT
    perform global ligand-protein docking. Since there is no generic interface
    detection, the input PDB file must have the ligand placed near the interface
    that will be refined. If the DockMCMProtocol is applied to a pose
    without placement near the interface, then the refinement may:
        -waste steps sampling the wrong interface
        -fail by predicting an incorrect interface very far from the true interface
        -fail by separating the ligand from the protein (usually due to a clash)
    DockMCMProtocol does not require an independent randomization or perturbation
    step to "seed" its prediction.
    
    Additional refinement steps may increase the accuracy of the predicted
    conformation (see refinement.py). Drastic moves (large conformational changes)
    should be avoided; if they precede the protocol, the problems above may occur,
    if they succeed the protocol, the protocol results may be lost.
    """

    # Declare working directory and output directory
    working_dir = os.getcwd()
    output_dir = "outputs"
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    
    # Initialize PyRosetta
    # pyrosetta.init()
    pyrosetta.init("-extra_res_fa {}".format(ligand_params))
    
    # Create an empty pose from the desired PDB file
    pose = pyrosetta.rosetta.core.pose.Pose()

    # If the params list has contents, load .params files
    # Note: this method of adding ligands to the ResidueTypeSet is unnecessary
    # if you call pyrosetta.init("-extra_res_fa {}".format(ligand_params))
    # if len(ligand_params) != 0 and ligand_params[0] != "":
    #     ligand_params = pyrosetta.Vector1(ligand_params)
    #     res_set = pose.conformation().modifiable_residue_type_set_for_conf()
    #     res_set.read_files_for_base_residue_types(ligand_params)
    #     pose.conformation().reset_residue_type_set_for_conf(res_set)
    
    # Load pdb_filename into pose
    pyrosetta.io.pose_from_file(pose, pdb_filename)

    # Setup the docking FoldTree
    # the method setup_foldtree takes an input pose and sets its
    #    FoldTree to have jump 1 represent the relation between the two docking
    #    partners, the jump points are the residues closest to the centers of
    #    geometry for each partner with a cutpoint at the end of the chain,
    # the second argument is a string specifying the relative chain orientation
    #    such as "A_B" of "LH_A", ONLY TWO BODY DOCKING is supported and the
    #    partners MUST have different chain IDs and be in the same pose (the
    #    same PDB), additional chains can be grouped with one of the partners,
    #    the "_" character specifies which bodies are separated
    # the third argument...is currently unsupported but must be set (it is
    #    supposed to specify which jumps are movable, to support multibody
    #    docking...but Rosetta doesn't currently)
    # the FoldTrees setup by this method are for TWO BODY docking ONLY!
    dock_jump = 1 # jump number 1 is the inter-body jump
    pyrosetta.rosetta.protocols.docking.setup_foldtree(pose,
                                                       partners,
                                                       pyrosetta.Vector1([dock_jump]))

    # Create a copy of the pose for testing
    test_pose = pose.clone()

    # Create ScoreFunctions for centroid and fullatom docking
    scorefxn = pyrosetta.create_score_function("talaris2014")

    # Setup the high resolution (fullatom) docking protocol using DockMCMProtocol.
    docking = pyrosetta.rosetta.protocols.docking.DockMCMProtocol()
    # Many of its options and settings can be set using the setter methods.
    docking.set_scorefxn(scorefxn)

    # Change directory temporarily for output
    os.chdir(output_dir)
    
    # Setup the PyJobDistributor
    jd = pyrosetta.toolbox.py_jobdistributor.PyJobDistributor(job_output,
                                                              jobs, scorefxn,
                                                              compress=False)

    # Set the native pose so that the output scorefile contains the pose rmsd metric
    jd.native_pose = pose 
    
    # Optional: setup a PyMOLObserver
    # pyrosetta.rosetta.protocols.moves.AddPyMOLObserver(test_pose, True)

    # Perform protein-ligand docking
    # counter = 0 # for pretty output to PyMOLObserver
    
    while not jd.job_complete:
        test_pose = pose.clone() # Reset test pose to original structure
        
        # counter += 1 # Change the pose name, for pretty output to PyMOLObserver
        # test_pose.pdb_info().name(job_output + '_' + str(counter))
        
        docking.apply(test_pose) # Perform docking and output to PyMOL

        # Write the decoy structure to disc
        jd.output_decoy(test_pose)
    
    os.chdir(working_dir)

if __name__ == "__main__":
    
    # Declare parser object for managing input options
    parser = optparse.OptionParser()
    parser.add_option("--pdb_filename",
                      dest="pdb_filename",
                      help="The PDB file containing the ligand and protein to dock.")
    parser.add_option("--partners", 
                      dest="partners",
                      default = "A_X",
                      help="The relative chain partners for docking.")
    parser.add_option("--ligand_params",
                      dest="ligand_params",
                      help="The ligand residue parameter file.")
    parser.add_option("--jobs", 
                      dest="jobs",
                      default="1",
                      help="The number of jobs (trajectories) to perform.")
    parser.add_option("--job_output", 
                      dest="job_output",
                      default = "ligand_output",
                      help="The name preceding all output, output PDB files and scorefile.")
    (options, args) = parser.parse_args()
    
    # Catch input erros
    if not options.pdb_filename:
        parser.error("pdb_filename not given!")
    if not options.ligand_params:
        parser.error("ligand_params not given!")

    # Run ligand refinement protocol
    sample_ligand_interface(pdb_filename=options.pdb_filename,
                            partners=options.partners,
                            ligand_params=options.ligand_params.split(","),
                            jobs=int(options.jobs),
                            job_output=options.job_output)

其他挑战:
上面使用的所有默认变量和参数都是特定于具有inputs/test_lig.pdb和inputs/TPA.gasteiger.fa.params的示例的,这被认为是简单、直接和快速的。下面是一个更实用的例子:
肯普消除一直是使用Rosetta进行酶设计的靶向反应。假设你想更好地了解这些酶的活性位点,并决定使用PyRosetta进行研究。

1.下载RCSB PDB文件3NZ1的副本(移除水和任何其他HETATM),并查看结构情况。

  1. 提取5-硝基-苯并三唑底物(最好是.mol2文件)(注意:使用PyMOL,可以使用.mol2扩展名保存分子)
  2. 编辑PDB文件,去除水分、硫酸盐和酒石酸.
pose = pyrosetta.toolbox.pose_from_rcsb("3NZ1")

from pyrosetta.toolbox import cleanATOM
cleanATOM("3NZ1.pdb")

print(pose.pdb_info())

chE = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A")

view = viewer.init(pose)
view.add(viewer.setStyle())
view.add(viewer.setStyle(command=({"hetflag": True}, {"stick": {"colorscheme": "brownCarbon", "radius": 0.2}})))
view.add(viewer.setSurface(residue_selector=chE, opacity=0.7, color='white'))
view.add(viewer.setHydrogenBonds())
view()
  1. 生成5-硝基-苯并三唑的.params文件(在PDB文件中列为链X resdidue 3NY),假设底物保存为名为3NY.mol2的.mol2文件
    python rosetta_source/scripts/python/public/molfile_to_params.py 3NY.mol2 -n 3NY
  2. 创建一个包含以下内容的目录(把上面生成的这些中间文件都放进去):
  • 3NZ1的PDB文件(已清除水和非基质HETATM)在此命名为3NZ1.clean.pdb
  • 这个示例脚本
  1. 使用适当的参数从命令行运行脚本:
    python ligand_refinement_from_command_line.py --pdb_filename 3NZ1.clean.pdb --partners A_X --ligand_params 3NY.params --jobs 400 --job_output 3NZ1_docking_output
  • 应使用上述临时方法(方法1)提供ligand.params文件,因为此脚本是为了执行此操作而设置的,文件3NY.params应已在步骤4中成功生成。如果将3NY.param永久添加到化学数据库,则无需为--ligand_params选项提供任何内容。
  • partners 选项A_X是PDB特定的,如果您选择保留不同的链(在步骤3中)或以其他方式更改3NZ1中的链ID,请确保此字符串与所需的链交互匹配。注意要包含配体。
  • 400个轨迹很低,很难对对接构象进行采样,通常要尝试数千(800-1000)个轨迹。
  • 这个脚本的特点是PyMOLBobserver(注释掉以避免使用它),蒙特卡洛模拟预计不会产生动力学上有意义的结果,因此,查看中间体只有在理解protocol时才有用,很少能产生超出最终输出的结果。因此,请确保注释掉用于大规模采样的PyMOLObserver行。
  1. 等待输出,这将需要一段时间(执行DockingHighRes的400个轨迹是相对较慢的)。
  2. 通过使用matplotlib模块绘制配体rmsd与total_score来分析结果。

参考

Ligand-Docking-PyRosetta.ipynb

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 194,457评论 5 459
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 81,837评论 2 371
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 141,696评论 0 319
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 52,183评论 1 263
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 61,057评论 4 355
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 46,105评论 1 272
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 36,520评论 3 381
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 35,211评论 0 253
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 39,482评论 1 290
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 34,574评论 2 309
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 36,353评论 1 326
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 32,213评论 3 312
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 37,576评论 3 298
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 28,897评论 0 17
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 30,174评论 1 250
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 41,489评论 2 341
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 40,683评论 2 335

推荐阅读更多精彩内容