Game AI Pro
Exploring HTN Planners through Example by Troy Humphreys
Introduction
As programmers we may find ourselves perpetually looking for that “better solution” to whatever problems we’ve encountered: better performance, maintainability, or usability. It’s only after we implement those solutions that we understand some of the nuances that come with them. Often, these nuances are the deciding factor in which solution we go with.
In AI development, a common problem to solve is behavior selection. There are many solutions to this problem, such as finite-state machines, behavior trees, utility-based selection, neural networks, and planners. This article aims to explore the nuances of a type of planner called a hierarchical task network (HTN) by using real-world examples that one can run into during development.
Planning architectures such as HTN take a problem as input and supply a series of steps that solves it. In HTN terms, the series of steps is called a plan. What sets hierarchical task networks apart from other planners is that they allow us to represent the problem as a very high level task and, through the planning process, recursively break this task into smaller tasks. When this process is completed, we are left with a series of atomic tasks that represent a plan.
Breaking up high level tasks into smaller ones is a very natural way of solving many sorts of problems. In our case, the problem is simply “figuring out what to do.” With a high degree of modularity and fast run time execution, HTNs make an attractive choice as a solution. For those of you who are familiar with behavior trees, these benefits might also seem familiar. Unlike behavior trees, however, HTN planners can reason about the effects of possible actions. This ability to reason about the future allows HTN planners to be incredibly expressive in how they describe behavior.
There have been many different systems used for HTN planning [Erol 95]. The system we will be exploring is the one we used on Transformers: Fall of Cybertron [HighMoon 12], which is based on a total-order forward decomposition planner. The following sections walk through some of the challenges we faced and the benefits we received during development by using a simplified, fictional example.
For our example, we will use a troll NPC called a “Trunk Thumper.” The designer’s initial description is that he’s a big, nasty, lumbering troll that patrols his numerous bridges and attacks passing enemies with a large tree trunk. And just like development in the real world, this design is bound to change.
Building Blocks of HTN
Before building the behavior for our Trunk Thumper, it’s important to go over the basic building blocks of hierarchical task networks so you can get an idea of how it all works. An NPC, in our case the Trunk Thumper, has a planner that uses a domain and world state to build a sequence of tasks called a plan. This plan will be run by the Trunk Thumper’s plan runner. The world state is updated by the NPC’s sensors and by the successfully completed tasks executed by the plan runner. Figure 12.1 shows a diagram of the system.
The World State
Like any type of behavior algorithm, hierarchical task networks need some type of knowledge representation that describes the current problem space. In the case of our Trunk Thumper, this would be a representation that describes what our troll knows about the world and himself in it. Other types of behavior algorithms might query the actual state of different objects in the world, for example an object’s location or its health. But with HTN, this information needs to be encoded into something the planner can understand, called the world state. The world state is essentially a vector of properties that describe what our HTN is going to reason about. Here is some simple pseudocode:
enum EHtnWorldStateProperties
{
    WsEnemyRange,
    WsHealth,
    WsIsTired,
    …
}

enum EEnemyRange
{
    MeleeRange,
    ViewRange,
    OutOfRange,
    …
}

vector<byte> CurrentWorldState;
EEnemyRange currentRange = CurrentWorldState[WsEnemyRange]; // read a property
CurrentWorldState[WsEnemyRange] = MeleeRange;               // write a property
As you can see from the pseudocode, the world state can simply be an array or vector indexed by an enum such as EHtnWorldStateProperties.
Each entry in the world state can have its own set of values. In the case of WsIsTired, the byte can represent the Boolean values zero and one. With WsEnemyRange, the values in the enum EEnemyRange are used. It’s important to note that the world state only needs to represent what is needed for the HTN to make decisions. That’s why WsEnemyRange is represented by abstract values instead of the actual range. (Translator’s note: since decisions are made from these abstract values, an open question is whether the byte-based world state could be adapted to supply actual range values if a decision ever needed them.) The goal of the world state isn’t to represent every possible state of every possible object in the game. It only needs to represent the problem space that our planner needs to make decisions. What this means for our example, of course, is that it only needs to represent what the Trunk Thumper needs to make decisions.
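To make the encoding concrete, here is a minimal Python sketch of the same idea. It mirrors the pseudocode's two enums and byte vector, but the `encode_enemy_range` helper and its distance thresholds are assumptions made up for illustration; the chapter does not say how raw distances are bucketed.

```python
from enum import IntEnum


class WorldStateProperty(IntEnum):
    """Indices into the world state vector (mirrors EHtnWorldStateProperties)."""
    ENEMY_RANGE = 0
    HEALTH = 1
    IS_TIRED = 2


class EnemyRange(IntEnum):
    """Abstract range buckets (mirrors EEnemyRange)."""
    MELEE_RANGE = 0
    VIEW_RANGE = 1
    OUT_OF_RANGE = 2


def encode_enemy_range(distance, melee_limit=2.0, view_limit=30.0):
    """Collapse a raw distance into the abstract value the planner reasons about.

    The thresholds are hypothetical and only exist for this example.
    """
    if distance <= melee_limit:
        return EnemyRange.MELEE_RANGE
    if distance <= view_limit:
        return EnemyRange.VIEW_RANGE
    return EnemyRange.OUT_OF_RANGE


# One byte per property, like the pseudocode's vector<byte>.
world_state = bytearray(len(WorldStateProperty))
world_state[WorldStateProperty.ENEMY_RANGE] = encode_enemy_range(12.5)
world_state[WorldStateProperty.IS_TIRED] = 1  # Boolean stored as 0 or 1

current_range = EnemyRange(world_state[WorldStateProperty.ENEMY_RANGE])
```

The raw distance never reaches the planner; only the bucket does, which keeps the problem space small.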
Sensors
If you recall, an HTN outputs a plan, or sequence of tasks. These tasks will have an effect on the world state as they are executed. There are outside influences, however, such as the player or other NPCs, that will affect the world state as well. For example, both the enemy and the troll can affect the world state property WsEnemyRange. The tasks executed by the troll could update this property if they were to move the troll. There is nothing in the HTN planner to handle changes produced by the enemy moving, however.
There are many different ways these changes can be translated into the world state. One preferable way is a simple sensor system that manages a set of time-sliced sensors. (Translator’s note: time-slicing here presumably means spreading the sensors’ update work across multiple frames to reduce the cost of each frame.) Each sensor can manage different world state properties. Examples of some different sensors include vision, hearing, range, and health sensors. These sensors would work the same as in any other AI system, with an added step of encoding their information into the world state that our HTN can understand.
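One way such a sensor system might look is sketched below. Everything here is an assumption for illustration: the class names, the round-robin `budget` scheme used as the time-slicing policy, the range thresholds, and the two-entry world state layout are not taken from the chapter.

```python
from collections import deque

ENEMY_RANGE, HEALTH = 0, 1            # world state indices (hypothetical layout)
MELEE, VIEW, OUT_OF_RANGE = 0, 1, 2   # abstract range values


class RangeSensor:
    def __init__(self, get_distance):
        self.get_distance = get_distance  # callable returning a raw distance

    def update(self, world_state):
        d = self.get_distance()
        # Encode the raw value into the abstract bucket the planner understands.
        world_state[ENEMY_RANGE] = MELEE if d <= 2 else VIEW if d <= 30 else OUT_OF_RANGE


class HealthSensor:
    def __init__(self, get_health):
        self.get_health = get_health

    def update(self, world_state):
        world_state[HEALTH] = min(255, max(0, int(self.get_health())))


class SensorSystem:
    """Updates at most `budget` sensors per tick, in round-robin order."""

    def __init__(self, sensors, budget=1):
        self.queue = deque(sensors)
        self.budget = budget

    def tick(self, world_state):
        """Return True if any property changed, which is a cue to re-plan."""
        before = bytes(world_state)
        for _ in range(min(self.budget, len(self.queue))):
            sensor = self.queue.popleft()
            sensor.update(world_state)
            self.queue.append(sensor)
        return bytes(world_state) != before


world_state = bytearray([OUT_OF_RANGE, 0])
sensors = SensorSystem([RangeSensor(lambda: 1.5), HealthSensor(lambda: 80)], budget=1)
changed_first = sensors.tick(world_state)    # only the range sensor runs
changed_second = sensors.tick(world_state)   # only the health sensor runs
```

With a budget of one, each frame pays for a single sensor update, at the price of the world state being slightly stale for the others.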
Primitive Tasks
As we mentioned already, a hierarchical task network is made up of tasks. There are two types of tasks used to build an HTN, called compound tasks and primitive tasks. Primitive tasks represent a single step that can be performed by our NPC. In our Trunk Thumper example, uprooting a tree or attacking with a trunk slam would be examples of primitive tasks. An ordered set of primitive tasks is the plan that we are ultimately getting out of the HTN. A primitive task consists of an operator and sets of effects and conditions.
In order for a primitive task to execute, its set of conditions must be valid. This allows the task’s implementer to ensure the correct conditions are met for the task to run. It’s important to note that a primitive task’s conditions are not a requirement for the implementation of HTN. They are, however, recommended, because they reduce the redundant checks that would otherwise be needed higher in the HTN hierarchy. In addition, they avoid potential bugs that can arise from having to do these checks in multiple places.
A primitive task’s effects describe how the success of the task will affect the NPC’s world state. For example, the task DoTrunkSlam executes the troll’s tree trunk melee attack and results in the troll becoming tired. DoTrunkSlam’s effects are how we describe this result. This allows the HTN to reason about the “future,” as was mentioned earlier. Since the effect of “being tired” is represented, our Trunk Thumper is able to make a better decision about what to do after DoTrunkSlam, or whether it’s even worth doing at all.
The operator represents an atomic action that an NPC can do. This might sound exactly like the primitive task itself. The difference is that the primitive task, along with its effects and conditions, describes what the operator means in terms of the HTN we are building.
As an example, let’s take the two tasks SprintToEnemy and WalkToNextBridge. Both of these tasks use the NavigateTo operator, but the two tasks change the state of our NPC in different ways. On the successful completion of SprintToEnemy, our NPC will be at the enemy and tired, as specified by the task’s effects. The WalkToNextBridge task’s effects would set the NPC’s location to the bridge and make him a little more bored. As you can see, we are able to use the same operator but describe two different uses for it in terms of our network. Here is the notation we will use to describe a primitive task going forward, along with the SprintToEnemy and WalkToNextBridge tasks as examples.
Primitive Task [TaskName(term1, term2, ...)]
    Preconditions [Condition1, Condition2, …] // optional
    Operator [OperatorName(term1, term2, ...)]
    Effects [WorldState op value, WorldState = value, WorldState += value] // optional

Primitive Task [SprintToEnemy]
    Preconditions [WsHasEnemy == true]
    Operator [NavigateTo(EnemyLoc, Speed_Fast)]
    Effects [WsLocation = EnemyLoc, WsIsTired = true]

Primitive Task [WalkToNextBridge]
    Operator [NavigateTo(BridgeLoc, Speed_Slow)]
    Effects [WsLocation = BridgeLoc, WsBored += 1]
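The notation maps fairly directly onto a small data structure. The following Python sketch is one possible representation, not the one used in the shipped game: the `PrimitiveTask` dataclass, the `set_value` and `add_value` helpers, and the use of a dictionary for the world state (rather than the byte vector shown earlier) are all assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

WorldState = Dict[str, object]


@dataclass
class PrimitiveTask:
    name: str
    operator: Tuple[str, tuple]  # (operator name, arguments)
    preconditions: List[Callable[[WorldState], bool]] = field(default_factory=list)
    effects: List[Callable[[WorldState], None]] = field(default_factory=list)

    def conditions_met(self, ws):
        return all(cond(ws) for cond in self.preconditions)

    def apply_effects(self, ws):
        for effect in self.effects:
            effect(ws)


def set_value(key, value):
    """Effect of the form `WorldState = value`."""
    def effect(ws):
        ws[key] = value
    return effect


def add_value(key, amount):
    """Effect of the form `WorldState += value`."""
    def effect(ws):
        ws[key] = ws.get(key, 0) + amount
    return effect


# Two tasks that share one operator but mean different things to the planner.
sprint_to_enemy = PrimitiveTask(
    name="SprintToEnemy",
    operator=("NavigateTo", ("EnemyLoc", "Speed_Fast")),
    preconditions=[lambda ws: ws["WsHasEnemy"]],
    effects=[set_value("WsLocation", "EnemyLoc"), set_value("WsIsTired", True)],
)

walk_to_next_bridge = PrimitiveTask(
    name="WalkToNextBridge",
    operator=("NavigateTo", ("BridgeLoc", "Speed_Slow")),
    effects=[set_value("WsLocation", "BridgeLoc"), add_value("WsBored", 1)],
)

ws = {"WsHasEnemy": True, "WsLocation": "Camp", "WsIsTired": False, "WsBored": 0}
if sprint_to_enemy.conditions_met(ws):
    sprint_to_enemy.apply_effects(ws)
walk_to_next_bridge.apply_effects(ws)
```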
Compound Tasks
Compound tasks are where HTNs get their “hierarchical” nature. You can think of a compound task as a high level task that has multiple ways of being accomplished. Using the Trunk Thumper as an example, he may have the task AttackEnemy. Our Thumper may have different ways of accomplishing this task. If he has access to a tree trunk, he may run to his target and use it as a melee weapon to “thump” his enemy. If no tree trunks are available, he can pull large boulders from the ground and toss them at his enemy. He may have a multitude of other approaches if the conditions are right.
In order to determine which approach we take to accomplish a compound task, we need to select the right method. A method consists of a set of conditions and a set of tasks. For the method to be the selected approach, its conditions must validate against the world state. The set of tasks, or subtasks, represents the method’s approach. This subtask set can contain primitive tasks as well as compound tasks. The ability to put compound tasks into the methods of other compound tasks is where hierarchical task networks get their hierarchical nature. Here is the notation we will use to describe a compound task going forward.
Compound Task [TaskName(term1, term2, ...)]
    Method 0 [Condition1, Condition2, ...]
        Subtasks [task1(term1, term2, ...), task2(term1, term2, ...), ...]
    Method 1 [Condition1, Condition2, ...]
        Subtasks [task1(term1, term2, ...), task2(term1, term2, ...), ...]
In our previous example, using the tree trunk as a melee weapon and throwing boulders are both methods of the AttackEnemy compound task. Which method we use depends on whether the troll has a tree trunk or not. Here is an example of the AttackEnemy task using the notation above.
Compound Task [AttackEnemy]
    Method 0 [WsHasTreeTrunk == true]
        Subtasks [NavigateTo(EnemyLoc), DoTrunkSlam()]
    Method 1 [WsHasTreeTrunk == false]
        Subtasks [LiftBoulderFromGround(), ThrowBoulderAt(EnemyLoc)]
By understanding how compound tasks work, it’s easy to imagine how we could have a large hierarchy that starts with a BeTrunkThumper compound task that is broken down into sets of smaller tasks, each of which is then broken into smaller tasks, and so on. This is how an HTN forms a hierarchy that describes how our troll NPC is going to behave.
It’s important to understand that compound tasks are really just containers for a set of methods that represent different ways to accomplish some high level task. There is no compound task code running during plan execution.
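Since a compound task is only a container of methods, selecting an approach amounts to scanning those methods in order. The sketch below shows this for the AttackEnemy task; the `Method` and `CompoundTask` classes and the dictionary world state are assumptions for illustration, and subtasks are referenced by name only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

WorldState = Dict[str, object]


@dataclass
class Method:
    conditions: List[Callable[[WorldState], bool]]
    subtasks: List[str]  # names of primitive or compound tasks


@dataclass
class CompoundTask:
    name: str
    methods: List[Method] = field(default_factory=list)

    def find_satisfied_method(self, ws) -> Optional[Method]:
        """Return the first method whose conditions all hold, or None."""
        for method in self.methods:
            if all(cond(ws) for cond in method.conditions):
                return method
        return None


attack_enemy = CompoundTask("AttackEnemy", [
    Method([lambda ws: ws["WsHasTreeTrunk"]],
           ["NavigateTo", "DoTrunkSlam"]),
    Method([lambda ws: not ws["WsHasTreeTrunk"]],
           ["LiftBoulderFromGround", "ThrowBoulderAt"]),
])

with_trunk = attack_enemy.find_satisfied_method({"WsHasTreeTrunk": True})
without_trunk = attack_enemy.find_satisfied_method({"WsHasTreeTrunk": False})
```

Because the scan stops at the first match, the order of the methods doubles as their priority.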
Putting Together an HTN Domain
Now that we have an overview of the main building blocks of HTN, we can build a simple domain for our Trunk Thumper to illustrate how it works. A domain is the term used to describe the entire task hierarchy. As we mentioned before, our troll actively patrols his numerous bridges and attacks enemies with a large tree trunk. We start with a compound task called BeTrunkThumper. This root task encapsulates the “main idea” of what it means to be a Trunk Thumper.
Compound Task [BeTrunkThumper]
    Method [WsCanSeeEnemy == true]
        Subtasks [NavigateToEnemy(), DoTrunkSlam()]
    Method [true]
        Subtasks [ChooseBridgeToCheck(), NavigateToBridge(), CheckBridge()]
As you can see with this root compound task, the first method defines the troll’s highest priority. If he can see the enemy, he will navigate using the NavigateToEnemy task and attack his enemy with the DoTrunkSlam task. If not, he will fall through to the next method. This next method will run three tasks: choose the next bridge to check, navigate to that bridge, and check the bridge for enemies. Let’s take a look at the primitive tasks that make up these methods and the rest of the domain.
Primitive Task [DoTrunkSlam]
    Operator [AnimatedAttackOperator(TrunkSlamAnimName)]

Primitive Task [NavigateToEnemy]
    Operator [NavigateToOperator(EnemyLocRef)]
    Effects [WsLocation = EnemyLocRef]

Primitive Task [ChooseBridgeToCheck]
    Operator [ChooseBridgeToCheckOperator]

Primitive Task [NavigateToBridge]
    Operator [NavigateToOperator(NextBridgeLocRef)]
    Effects [WsLocation = NextBridgeLocRef]

Primitive Task [CheckBridge]
    Operator [CheckBridgeOperator(SearchAnimName)]
The first task, DoTrunkSlam, is an example of how a primitive task can describe an operator in terms of the HTN domain. Here, the task is really executing an animated attack operator, and the animation name is passed in as a term. The next task, NavigateToEnemy, is also an example of this, but on the successful completion of this task, the world state property WsLocation is set to EnemyLocRef via the primitive task’s effect.
Finding a Plan
With a domain made up of compound and primitive tasks, we are starting to form an image of how these are put together to represent an NPC. Combine that with the world state and we can talk about the workhorse of our HTN, the planner. There are three conditions that will force the planner to find a new plan: the NPC finishes or fails the current plan, the NPC does not have a plan, or the NPC’s world state changes via a sensor. If any of these cases occur, the planner will attempt to generate a plan. To do this, the planner starts with a root compound task that represents the problem domain we are trying to plan for. Using our earlier example, this root task would be the BeTrunkThumper task. This root task is pushed onto the TasksToProcess stack. Next, the planner creates a copy of the world state. The planner will be modifying this working world state to “simulate” what will happen as tasks are executed.
After these initialization steps are taken, the planner begins to iterate on the tasks to process. On each iteration, the planner pops the next task off the TasksToProcess stack. If it is a compound task, the planner tries to decompose it by searching through its methods for the first one whose conditions are valid. If a method is found, that method’s subtasks are added onto the TasksToProcess stack. If a valid method is not found, the planner’s state is rolled back to the last compound task that was decomposed. We will go into more detail about restoring the planner’s state later.
If the next task is primitive, we need to check its preconditions against the working world state. If the conditions are met, the task is added to the final plan and its effects are applied to the working world state. The effects are applied because the planner assumes that the task is going to succeed. This allows future methods to consider that new state. If the primitive task’s conditions are not met, the planner’s state is rolled back, just as it is for a compound task. This iteration process continues until the TasksToProcess stack is empty. Upon completion, the planner will either end up with a list of primitive tasks or will have rolled back far enough that the result is no plan. Below is example pseudocode that shows this process.
WorkingWS = CurrentWorldState
TasksToProcess.Push(RootTask)
while TasksToProcess.NotEmpty
{
    CurrentTask = TasksToProcess.Pop()
    if CurrentTask.Type == CompoundTask
    {
        SatisfiedMethod = CurrentTask.FindSatisfiedMethod(WorkingWS)
        if SatisfiedMethod != null
        {
            RecordDecompositionOfTask(CurrentTask, FinalPlan, DecompHistory)
            TasksToProcess.InsertTop(SatisfiedMethod.SubTasks)
        }
        else
        {
            RestoreToLastDecomposedTask()
        }
    }
    else // Primitive Task
    {
        if PrimitiveConditionMet(CurrentTask)
        {
            WorkingWS.ApplyEffects(CurrentTask.Effects)
            FinalPlan.PushBack(CurrentTask)
        }
        else
        {
            RestoreToLastDecomposedTask()
        }
    }
}
There is a bit of magic going on in the RecordDecompositionOfTask and RestoreToLastDecomposedTask functions that should be explained in more detail. The record function records the planner’s state onto the DecompHistory stack. This includes the TasksToProcess and FinalPlan containers as well as the method chosen for the decomposition and its owning compound task. By popping this recorded state back into the planner via the restore function, the planner can backtrack either when a compound task cannot be decomposed or when a primitive task’s conditions aren’t satisfied.
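The loop and its record and restore steps can be sketched in runnable form as follows. This is one interpretation rather than the shipped implementation. The class names, the dictionary world state, and the `domain` lookup by task name are assumptions. The sketch also snapshots the working world state alongside the plan and the task stack, which the text does not list explicitly but which a rollback needs in order to undo applied effects. The precondition on DoTrunkSlam is hypothetical and exists only to force a backtrack.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

WorldState = Dict[str, object]

@dataclass
class Primitive:
    name: str
    preconditions: List[Callable[[WorldState], bool]] = field(default_factory=list)
    effects: List[Callable[[WorldState], None]] = field(default_factory=list)

@dataclass
class Method:
    conditions: List[Callable[[WorldState], bool]]
    subtasks: List[str]

@dataclass
class Compound:
    name: str
    methods: List[Method]

def find_plan(domain, root, current_ws):
    working_ws = dict(current_ws)   # simulate on a copy
    tasks_to_process = [root]       # index 0 is the top of the stack
    final_plan = []
    history = []                    # decomposition history for backtracking
    next_method = 0                 # first method to try on the next compound task

    def restore():
        """Roll back to the last decomposition; False if there is none left."""
        nonlocal working_ws, tasks_to_process, final_plan, next_method
        if not history:
            return False
        tasks_to_process, final_plan, working_ws, next_method = history.pop()
        return True

    while tasks_to_process:
        name = tasks_to_process.pop(0)
        task = domain[name]
        start, next_method = next_method, 0
        if isinstance(task, Compound):
            for i in range(start, len(task.methods)):
                if all(c(working_ws) for c in task.methods[i].conditions):
                    # Record state so that a retry resumes at method i + 1.
                    history.append(([name] + tasks_to_process, list(final_plan),
                                    dict(working_ws), i + 1))
                    tasks_to_process = task.methods[i].subtasks + tasks_to_process
                    break
            else:
                if not restore():
                    return None
        elif all(c(working_ws) for c in task.preconditions):
            for effect in task.effects:
                effect(working_ws)
            final_plan.append(name)
        elif not restore():
            return None
    return final_plan

def set_value(key, value):
    def effect(ws):
        ws[key] = value
    return effect

domain = {
    "BeTrunkThumper": Compound("BeTrunkThumper", [
        Method([lambda ws: ws["WsCanSeeEnemy"]], ["NavigateToEnemy", "DoTrunkSlam"]),
        Method([], ["ChooseBridgeToCheck", "NavigateToBridge", "CheckBridge"]),
    ]),
    "NavigateToEnemy": Primitive("NavigateToEnemy",
                                 effects=[set_value("WsLocation", "EnemyLocRef")]),
    # Hypothetical precondition, added only to force a backtrack in this demo.
    "DoTrunkSlam": Primitive("DoTrunkSlam",
                             preconditions=[lambda ws: ws["WsHasTreeTrunk"]]),
    "ChooseBridgeToCheck": Primitive("ChooseBridgeToCheck"),
    "NavigateToBridge": Primitive("NavigateToBridge",
                                  effects=[set_value("WsLocation", "NextBridgeLocRef")]),
    "CheckBridge": Primitive("CheckBridge"),
}

start_ws = {"WsCanSeeEnemy": True, "WsHasTreeTrunk": False}
armed = find_plan(domain, "BeTrunkThumper", {"WsCanSeeEnemy": True, "WsHasTreeTrunk": True})
unarmed = find_plan(domain, "BeTrunkThumper", start_ws)
```

In the `unarmed` case the planner first commits to the attack method, adds NavigateToEnemy, fails on DoTrunkSlam, and then rolls back to BeTrunkThumper and tries its second method, so the discarded NavigateToEnemy step does not leak into the patrol plan. Note that this sketch has no guard against domains that recurse forever.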
As you might have realized, the planner uses a depth-first search to find a valid plan. This does mean that you may have to explore the whole domain to find a valid plan. However, it’s important to remember that you are traversing a hierarchy of tasks. This hierarchy allows the planner to cull large sections of the network via the compound tasks’ methods. Because we aren’t using a heuristic or cost, as A* and Dijkstra searches do, we can skip any kind of sorting. These features allowed the HTN planner in Transformers: Fall of Cybertron to be considerably faster than the GOAP system we used in Transformers: War for Cybertron [HighMoon 10].
Now that the planner has been explained, we can expand our example and see how a modified version of the Trunk Thumper domain might decompose (Figure 12.2). This domain’s root task is still BeTrunkThumper, but DoTrunkSlam is now a compound task. DoTrunkSlam has two methods, each doing a different version of the trunk slam. The method conditions for both compound tasks have been omitted for simplicity. Underneath the domain you can see the planner’s iterations going from top to bottom. For each iteration, you can see the left-most task in the TasksToProcess stack being processed.
Running the Plan
Running an HTN plan is pretty straightforward. The NPC’s plan runner will attempt to execute each primitive task’s operator in sequence. As each task completes successfully, its effects are applied to the world state. If the task fails for some reason that is specific to the operator it’s running, the plan also fails and forces a re-plan.
The plan can also fail if the conditions of the current task or any of the remaining tasks become invalid. The plan runner monitors these tasks’ preconditions against a “working world state,” much like the planner. As it confirms each task’s preconditions, its effects are applied to the working world state. It’s important that it applies the effects because the following tasks’ preconditions might rely on these effects being applied in order to be valid. This plan validation allows the HTN domain to be a bit more expressive and reactive to changes in the world state.
Figure 12.2 Decomposition of the Trunk Thumper domain, showing the resulting plan if BeTrunkThumper.Method0 and DoTrunkSlam.Method1 were chosen.
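A plan runner along these lines might look like the sketch below. It is a simplification under stated assumptions: operators are modeled as plain callables that finish immediately and return True on success, whereas real operators typically run across many frames, and the `Task` class, the dictionary world state, and the location precondition on DoTrunkSlam are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

WorldState = Dict[str, object]


@dataclass
class Task:
    name: str
    operator: Callable[[], bool]  # returns True on success
    preconditions: List[Callable[[WorldState], bool]] = field(default_factory=list)
    effects: List[Callable[[WorldState], None]] = field(default_factory=list)


def plan_is_valid(remaining, world_state):
    """Check the remaining tasks against a working copy, applying effects as we go."""
    working = dict(world_state)
    for task in remaining:
        if not all(cond(working) for cond in task.preconditions):
            return False
        for effect in task.effects:
            effect(working)
    return True


def run_plan(plan, world_state):
    """Return True if the whole plan ran, False if a re-plan is needed."""
    for i, task in enumerate(plan):
        if not plan_is_valid(plan[i:], world_state):
            return False          # a current or future precondition broke
        if not task.operator():
            return False          # operator-specific failure
        for effect in task.effects:
            effect(world_state)   # success: commit effects to the real state
    return True


def set_value(key, value):
    def effect(ws):
        ws[key] = value
    return effect


ws = {"WsTrunkHealth": 1, "WsLocation": "Bridge"}
plan = [
    Task("NavigateToEnemy", lambda: True,
         effects=[set_value("WsLocation", "EnemyLocRef")]),
    Task("DoTrunkSlam", lambda: True,
         # Hypothetical precondition that depends on the previous task's effect.
         preconditions=[lambda s: s["WsLocation"] == "EnemyLocRef"],
         effects=[set_value("WsTrunkHealth", 0)]),
]
completed = run_plan(plan, ws)
```

The DoTrunkSlam precondition only holds once NavigateToEnemy's effect has been applied to the working copy, which is why validation has to simulate effects rather than check each precondition against the current state in isolation.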
Using Recursion for Greater Expressiveness
After seeing our troll in game, the designers think that the tree trunk attack is a little overpowered. They suggest that the trunk should break after three attacks, forcing the troll to search for another one. First, we can add the property WsTrunkHealth to the world state. By wrapping up the attack method into its own compound task and adding a little recursion, we will be able to modify the troll’s attack behavior. The changed domain would now be:
Compound Task [BeTrunkThumper]
    Method [WsCanSeeEnemy == true]
        Subtasks [AttackEnemy()] // using the new compound task
    Method [true]
        Subtasks [ChooseBridgeToCheck(), NavigateToBridge(), CheckBridge()]

Compound Task [AttackEnemy] // new compound task
    Method [WsTrunkHealth > 0]
        Subtasks [NavigateToEnemy(), DoTrunkSlam()]
    Method [true]
        Subtasks [FindTrunk(), NavigateToTrunk(), UprootTrunk(), AttackEnemy()]

Primitive Task [DoTrunkSlam]
    Operator [DoTrunkSlamOperator]
    Effects [WsTrunkHealth += -1]

Primitive Task [UprootTrunk]
    Operator [UprootTrunkOperator]
    Effects [WsTrunkHealth = 3]

Primitive Task [NavigateToTrunk]
    Operator [NavigateToOperator(FoundTrunk)]
    Effects [WsLocation = FoundTrunk]
When our troll can see the enemy, he will attack just as before, only now the behavior is wrapped up in a new compound task called AttackEnemy. This task’s high priority method performs the navigate and slam like the original domain, but now has the condition that the trunk has some health. The change to the DoTrunkSlam task decrements the trunk’s health on every successful attack. This allows the planner to drop to the lower priority method when it has to accommodate a broken tree trunk.
The second method of AttackEnemy handles getting a new tree trunk. It first chooses a new tree to use, navigates to that tree, and uproots it, after which it is able to AttackEnemy. Here is where the recursion comes in. When the planner goes to decompose the AttackEnemy task again, it can consider the methods again. If the tree trunk’s health were still zero, this would cause the planner to loop forever. But the new task UprootTrunk’s effect sets WsTrunkHealth back to three, allowing us to have the plan FindTrunk → NavigateToTrunk → UprootTrunk → NavigateToEnemy → DoTrunkSlam. This new domain allows us to reuse methods already in the domain to get the troll back to thumping.
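To check that the recursive domain really produces this plan, here is a compact recursive variant of a total-order forward decomposition in Python. It is a sketch under assumptions: the dictionary-based domain encoding, the `decompose` function, and the `max_depth` guard are illustrative inventions, and backtracking falls out of the recursion instead of an explicit history stack.

```python
def decompose(domain, tasks, ws, depth=0, max_depth=64):
    """Total-order forward decomposition of `tasks`; returns a plan or None."""
    if not tasks:
        return []
    if depth > max_depth:  # guard against runaway recursion (an added safety net)
        return None
    head, rest = tasks[0], tasks[1:]
    task = domain[head]
    if "methods" in task:  # compound task: try methods in priority order
        for condition, subtasks in task["methods"]:
            if condition(ws):
                plan = decompose(domain, subtasks + rest, ws, depth + 1, max_depth)
                if plan is not None:
                    return plan
        return None
    # Primitive task: check the precondition, then simulate its effects.
    if not task.get("precondition", lambda s: True)(ws):
        return None
    next_ws = dict(ws)
    next_ws.update(task.get("effects", lambda s: {})(ws))
    plan = decompose(domain, rest, next_ws, depth + 1, max_depth)
    return None if plan is None else [head] + plan


domain = {
    "BeTrunkThumper": {"methods": [
        (lambda ws: ws["WsCanSeeEnemy"], ["AttackEnemy"]),
        (lambda ws: True, ["ChooseBridgeToCheck", "NavigateToBridge", "CheckBridge"]),
    ]},
    "AttackEnemy": {"methods": [
        (lambda ws: ws["WsTrunkHealth"] > 0, ["NavigateToEnemy", "DoTrunkSlam"]),
        (lambda ws: True, ["FindTrunk", "NavigateToTrunk", "UprootTrunk", "AttackEnemy"]),
    ]},
    "DoTrunkSlam": {"effects": lambda ws: {"WsTrunkHealth": ws["WsTrunkHealth"] - 1}},
    "UprootTrunk": {"effects": lambda ws: {"WsTrunkHealth": 3}},
    "NavigateToTrunk": {"effects": lambda ws: {"WsLocation": "FoundTrunk"}},
    "NavigateToEnemy": {"effects": lambda ws: {"WsLocation": "EnemyLocRef"}},
    "FindTrunk": {},
    "ChooseBridgeToCheck": {},
    "NavigateToBridge": {},
    "CheckBridge": {},
}

broken_trunk = decompose(domain, ["BeTrunkThumper"],
                         {"WsCanSeeEnemy": True, "WsTrunkHealth": 0})
healthy_trunk = decompose(domain, ["BeTrunkThumper"],
                          {"WsCanSeeEnemy": True, "WsTrunkHealth": 2})
```

With a broken trunk, the second decomposition of AttackEnemy sees the simulated WsTrunkHealth of three left behind by UprootTrunk, so its first method now applies and the recursion terminates.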
Planning for World State Changes not Controlled by Tasks
So far, all of the plans we have been building depend on the primitive tasks’ effects changing the world state. What happens when the world state is changed outside the control of primitive tasks, however? To explore this, let’s modify our example once again. Let us assume that a designer notices that when the troll can’t see the enemy, he simply goes back to patrolling the bridges. The designer asks you to implement a behavior that will chase after the enemy and react once he sees the enemy again. Let’s look at the changes we could make to the domain to handle this issue.
Compound Task [BeTrunkThumper]
    Method [WsCanSeeEnemy == true]
        Subtasks [AttackEnemy()]
    Method [WsHasSeenEnemyRecently == true] // new method
        Subtasks [NavToLastEnemyLoc(), RegainLOSRoar()]
    Method [true]
        Subtasks [ChooseBridgeToCheck(), NavigateToBridge(), CheckBridge()]

Primitive Task [NavToLastEnemyLoc]
    Operator [NavigateToOperator(LastEnemyLocation)]
    Effects [WsLocation = LastEnemyLocation]

Primitive Task [RegainLOSRoar]
    Preconditions [WsCanSeeEnemy == true]
    Operator [RegainLOSRoar()]
With this rework, if the Trunk Thumper can’t see the enemy, the planner will drop down to the new method, which relies on the WsHasSeenEnemyRecently world state property. This method’s tasks will navigate to the last place the enemy was seen and do a big animated “roar” if he once again sees the enemy. The problem here is that the RegainLOSRoar task has a precondition of WsCanSeeEnemy being true. That world state property is handled by the troll’s vision sensor. When the planner goes to put the RegainLOSRoar task on the final task list, it will fail its precondition check, because there is nothing in the domain that represents what the expected world state will be when the navigation completes.
To solve this, we are going to introduce the concept of expected effects. Expected effects are effects that get applied to the world state only during planning and plan validation. The idea here is that you can express changes in the world state that should happen as a result of tasks being executed. This allows the planner to keep planning farther into the future based on what it believes will be accomplished along the way. Remember that a key advantage planners have in decision making is that they can reason about the future, which helps them make better decisions about what to do next. To accommodate this, we can change NavToLastEnemyLoc in the domain to:
为了解决这个问题,我们将引入预期影响(expected effect)的概念。预期影响(expected effect)是只在规划期间和计划验证期间应用到世界状态的效果。这里的思想是,您可以表达世界状态中的变化,这些变化应该基于正在执行的任务而发生。这使得规划器(planner)能够根据自己认为在这一过程中会完成的事情,对未来进行更深入的规划。记住,规划器(planner)在做决策时的一个关键优势是他们可以对未来进行推理,帮助他们更好地决定下一步该做什么。为了适应这一点,我们可以将域内的NavToLastEnemyLoc改为:
Primitive Task [NavToLastEnemyLoc]
Operator [NavigateToOperator(LastEnemyLocation)]
Effects [WsLocation = LastEnemyLocation]
ExpectedEffects [WsCanSeeEnemy = true]
Now when this task gets popped off the decomposition list, the working world state will get updated with the expected effect, and the RegainLOSRoar task will pass its precondition check and be added to the chain. This simple behavior could have been implemented a couple of different ways, but expected effects came in handy more than a few times during the development of Transformers: Fall of Cybertron. They are a simple way to be just a little more expressive in an HTN domain.
现在,当这个任务从分解列表(decomposition list)中弹出时,当前的世界状态(world state)将被更新为预期的效果,这时RegainLOSRoar任务将被允许继续向最终任务的链中添加任务。这种简单的行为可以用多种不同的方式实现,但在《变形金刚:塞伯坦的秋天》的制作过程中,预期的效果多次派上了用场。它们是在HTN域中让表达性更强的一种简单方法。
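The split between real effects and expected effects can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PrimitiveTask` class, `build_chain`, and `apply_on_success` are hypothetical names invented here, and the planner is reduced to checking a fixed chain of primitive tasks against a working copy of the world state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

WorldState = Dict[str, object]


@dataclass
class PrimitiveTask:
    name: str
    preconditions: List[Callable[[WorldState], bool]] = field(default_factory=list)
    effects: Dict[str, object] = field(default_factory=dict)
    # Applied to the planner's working copy only, never to the live world state.
    expected_effects: Dict[str, object] = field(default_factory=dict)


def build_chain(tasks: List[PrimitiveTask], world_state: WorldState) -> Optional[List[str]]:
    """Planner side: check preconditions against a working copy of the world state."""
    working = dict(world_state)
    plan: List[str] = []
    for task in tasks:
        if not all(check(working) for check in task.preconditions):
            return None  # this chain is rejected
        plan.append(task.name)
        working.update(task.effects)
        working.update(task.expected_effects)
    return plan


def apply_on_success(task: PrimitiveTask, live_world_state: WorldState) -> None:
    """Plan-runner side: only real effects are written when the operator succeeds."""
    live_world_state.update(task.effects)


world = {"WsCanSeeEnemy": False, "WsHasSeenEnemyRecently": True, "WsLocation": "Bridge"}
roar = PrimitiveTask("RegainLOSRoar", preconditions=[lambda ws: ws["WsCanSeeEnemy"] is True])
nav_plain = PrimitiveTask("NavToLastEnemyLoc", effects={"WsLocation": "LastEnemyLocation"})
nav_expected = PrimitiveTask(
    "NavToLastEnemyLoc",
    effects={"WsLocation": "LastEnemyLocation"},
    expected_effects={"WsCanSeeEnemy": True},
)

without_expected = build_chain([nav_plain, roar], world)  # rejected: the roar precondition fails
with_expected = build_chain([nav_expected, roar], world)  # both tasks are accepted
apply_on_success(nav_expected, world)  # the vision sensor still owns WsCanSeeEnemy
```

In this sketch the live `WsCanSeeEnemy` value is untouched after the navigation task succeeds, so the roar still depends on what the sensor actually reports at execution time.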
How to Handle Higher Priority Plans
如何处理优先级更高的计划
To this point, we have been decomposing compound tasks based on the order of the task’s methods. This tends to be a natural way of going about our search, but consider these attack changes to our Trunk Thumper domain.
至此,我们已经根据任务方法的顺序分解了复合任务。这是一种很自然的搜索方式,但是必须考虑到这些攻击会对 Trunk Thumper 域的改变。
Compound Task [AttackEnemy]
Method [WsTrunkHealth > 0, AttackedRecently == false, CanNavigateToEnemy == true]
Subtasks [NavigateToEnemy(), DoTrunkSlam(), RecoveryRoar()]
Method [WsTrunkHealth == 0]
Subtasks [FindTrunk(), NavigateToTrunk(), UprootTrunk(), AttackEnemy()]
Method [true]
Subtasks [PickupBoulder(), ThrowBoulder()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Effects [WsTrunkHealth += -1, AttackedRecently = true]
Primitive Task [RecoveryRoar]
Operator [PlayAnimation(TrunkSlamRecoverAnim)]
Primitive Task [PickupBoulder]
Operator [PickupBoulder()]
Primitive Task [ThrowBoulder]
Operator [ThrowBoulder()]
After some play testing, our designer commented that our troll is pretty punishing. It only lets up on its attack against the player when it goes to grab another tree trunk. The designer suggests putting in a recovery animation after the trunk slam and a new condition that disallows the slam attack if the troll has attacked recently. Our designer has also noticed that our troll behaves strangely if he cannot navigate to his enemy (due to an obstacle, for example), and decided to add a low priority attack that throws a boulder when this happens.
在一些游戏测试后,我们的设计师觉得我们的巨魔是过于强力攻击。只有当它去抓另一个树干时,它才会放松对玩家的攻击。设计师建议在trunk slam后加入一个回复动画,并在巨魔最近攻击后设置一个不允许slam攻击的新条件。我们的设计师也注意到,如果我们的巨魔不能导航到他的敌人那里,他的行为就会很奇怪(例如,由于障碍)。他决定采取低优先级攻击,如果发生这种情况,就扔一块大石头。
Everything about these behavior changes seems fairly straightforward, but we need to take a closer look at what could happen while running the trunk slam plan. After the actual slam action, we start running the RecoveryRoar task. If, while executing this roar, the world state were to change and cause a re-plan, the RecoveryRoar task will be aborted. The reason is that DoTrunkSlam completed successfully, so its effect has already set the AttackedRecently world state to true by the time the planner reaches the method that handles the slam. This causes the planner to skip the “slam” method and fall through to the new “throw boulder” method, resulting in a new plan. That new plan replaces the running one and aborts the RecoveryRoar task mid-execution, even though the currently running plan is still valid.
关于这些行为变化的一切似乎都相当简单,但是我们需要仔细看看在运行trunk slam计划时会发生什么。在实际slam动作之后,我们开始运行RecoveryRoar任务。如果在执行此任务时,世界状态(world state)发生改变并导致重新规划,RecoveryRoar任务将被中止。原因是,当规划器(planner)运行到处理slam方法时,世界状态(world state)中的AttackRecently将被设置为true,因为DoTrunkSlam任务成功完成,它的影响(effect)应用到了世界状态。这将导致规划器(planner)跳过“slam”方法任务,转而采用新的“throw boulder”方法,从而产生新的计划。这将导致RecoveryRoar任务在执行过程中被中止,即使当前运行的计划仍然有效。
In this case, we need a way to identify the “priority” of a running plan. There are a couple ways of solving this. Since HTN is a graph, we can use some form of a cost-based search such as A* or Dijkstra, for example. This would involve binding some sort of cost to our tasks or even methods. Unfortunately, tuning these costs can be pretty tricky in practice. Not only that, we would now have to add sorting to our planner, which will slow its execution.
在这种情况下,我们需要一种方法来确定运行计划的“优先级”。有几种方法可以解决这个问题。由于HTN是一个图数据结构,我们可以使用某种形式的基于成本的搜索,例如A*或Dijkstra。这将涉及到将某种成本绑定到我们的任务甚至方法上。不幸的是,在实践中调优这些成本非常棘手。不仅如此,我们现在还必须在规划器(planner)中添加排序,这将减慢它的执行速度。
Instead we would like to keep the simplicity and readability of “in-order priority” for our methods. The problem is a plan does not know the decomposition order of compound tasks that the planner took to arrive at the plan—it just executes primitive tasks’ operators.
相反,我们希望保持方法的“顺序优先级”的简单性和可读性。问题是计划(plan)不知道规划器(planner)为达成计划而进行的复合任务的分解顺序——它只执行原子任务(primitive task)的操作符。
Figure 12.3. All possible plans with the Trunk Thumper domain and the Method Traversal Record for each plan, sorted by priority.
图12.3 Trunk Thumper域里所有可能的计划和每个计划的方法遍历记录,按优先级排序
The order of a compound task’s methods is what we want to use to define priority, yet the plan isn’t aware of what a compound task is. To get around this, we can encode our traversal through the HTN domain as we search for a plan. This method traversal record (MTR) simply stores the method index chosen for each compound task that was decomposed to create the plan. Now that we have the MTR, we can use it in two different ways to help us find the better plan. The simplest method would be to plan normally and compare the newly found plan’s MTR with the currently running plan’s MTR. If all of the method indexes chosen in the new plan are of equal or higher priority, we have found our new plan. An example is shown in Figure 12.3.
我们希望用复合任务的方法(method)的顺序来定义优先级——但是计划并不知道复合任务是什么。为了解决这个问题,我们可以在搜索计划中对HTN域的遍历时进行编码。方法(method)遍历记录(MTR)仅存储为创建计划而分解的每个复合任务选择的方法指数。现在我们有了MTR,我们可以用两种不同的方式来帮助我们找到更好的计划。最简单的方法是正常规划,并将新发现的计划的MTR与当前运行的计划的MTR进行比较。如果新计划中选择的所有方法索引都具有同等或更高的优先级,那么我们就找到了新计划。图12.3显示了一个示例
We can also choose to use the current plan’s MTR during the planning process, as we decompose compound tasks in the new search. As we search for a valid method, we can use the MTR to allow only methods that are of equal or higher priority. This lets us cull whole branches of our HTN based on the current plan’s MTR. Our first method is the easier of the two, but if you find you’re spending a lot of your processing time in your planner, the second method could help speed that up.
当我们分解新的搜索中的复合任务时还可以在规划过程中选择使用当前计划的MTR。我们可以使用MTR来搜索一个有效的方法,只允许具有相等或更高优先级的方法。这允许我们基于当前计划(plan)的MTR去剔除HTN里的的整个分支。第一种方法是两种方法中比较容易的一种,但是如果你发现你在计划中花费了大量的处理时间,第二种方法可以帮助你加快速度。
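Both uses of the MTR can be sketched as small Python helpers. This is a hedged sketch, not the shipped implementation: the function names are hypothetical, a lower method index is assumed to mean higher priority (matching the in-order domains above), and the comparison is assumed to walk both records position by position and let the first difference decide.

```python
from typing import List, Sequence


def is_equal_or_higher_priority(new_mtr: Sequence[int], current_mtr: Sequence[int]) -> bool:
    """First approach: plan normally, then compare the new MTR with the running one."""
    for new_index, current_index in zip(new_mtr, current_mtr):
        if new_index != current_index:
            # The first differing compound task decides; a lower index wins.
            return new_index < current_index
    return True  # identical along the shared prefix: treated as equal priority


def candidate_methods(method_count: int, depth: int,
                      current_mtr: Sequence[int], prefix_matches: bool) -> List[int]:
    """Second approach: cull lower priority methods while decomposing.

    `depth` is the position in the MTR of the compound task being decomposed.
    `prefix_matches` is True while the new search has made the same choices
    as the running plan at every earlier position.
    """
    if not prefix_matches or depth >= len(current_mtr):
        return list(range(method_count))
    return list(range(min(method_count, current_mtr[depth] + 1)))


# Running plan: BeTrunkThumper method 0, then AttackEnemy method 0 (the slam).
running_slam_plan = [0, 0]
# Re-plan during RecoveryRoar: AttackEnemy falls through to method 2 (the boulder).
boulder_replan = [0, 2]
keep_running = not is_equal_or_higher_priority(boulder_replan, running_slam_plan)
```

With these records the boulder plan is rejected, so the running slam plan is kept and RecoveryRoar is allowed to finish.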
Now that we have the ability to abort currently running plans in favor of higher priority plans, there is a subtle implementation detail that can cause unexpected behaviors in your NPCs. If you set up your planner to re-plan on world state changes, the planner will try to re-plan when tasks apply their effects on successful execution. Consider this altered subsection of the Trunk Thumper’s domain below.
既然我们有能力为更高优先级的计划中止当前正在运行的计划,那么有一个微妙的实现细节可能会导致npc出现意外行为。如果您设置您的规划器(planner)对世界状态(world stat)变化重新进行规划,那么规划器(planner)将在任务对成功执行的影响(effect)后尝试重新进行规划。思考下面的Trunk Thumper域分段的修改。
Compound Task [AttackEnemy]
Method [WsPowerUp == 3]
Subtasks [DoWhirlwindTrunkAttack(), DoRecovery()]
Method [WsEnemyRange > MeleeRange]
Subtasks [DoTrunkSlam(), DoRecovery()]
Primitive Task [DoTrunkSlam]
Operator [AnimatedAttackOperator(TrunkSlamAnimName)]
Effects [WsPowerUp += 1]
Primitive Task [DoWhirlwindTrunkAttack]
Operator [DoWhirlwindTrunkAttack()]
Effects [WsPowerUp = 0]
Primitive Task [DoRecovery]
Operator [PlayAnimation(TrunkSlamRecoveryAnim)]
This new behavior is designed to have the troll do the DoWhirlwindTrunkAttack task, after executing the DoTrunkSlam three times. This is accomplished by having the DoTrunkSlam task’s effect increase the WsPowerUp property by one each time it executes. This might seem fine at first glance, but you will have designers at your desk informing you that the troll now combos a trunk slam directly into a whirlwind attack every time. The problem arises on the third execution of DoTrunkSlam. The task’s effects are applied and the planner forces a re-plan. With WsPowerUp equal to three, the planner will pick the higher priority Whirlwind attack method. This cancels the DoRecovery task that is designed to break the attacks up, allowing the player some time to react.
这个新行为旨在让巨魔在执行DoTrunkSlam三次之后执行DoWhirlwindTrunkAttack任务。这是通过让DoTrunkSlam任务每次执行后的影响(effect)让WsPowerUp属性增加1来实现的。乍一看,这似乎很好,但设计师在你的办公桌上告诉你,巨魔现在每次都用一个trunk slam直接造成whirlwind attack。问题出现在DoTrunkSlam的第三次执行时任务的影响(effect)被应用,规划器(planner)强制重新规划。当WsPowerUp等于3时,规划器(planner)将选择优先级更高的DoWhirlwindTrunkAttack方法。这样就取消了DoRecovery任务,这个任务原本是设计用来打断连续攻击的,让玩家有一些时间做出反应。
Normally, the whirlwind method should be able to cancel plans of lower priority. But the currently running plan is still valid, and the only reason this bug is occurring is that the planner is replanning on all world state changes, including changes made by the effects of successfully completed primitive tasks. Simply not replanning when the world state changes via effects applied by primitive tasks will solve this problem. That is fine, because the plan was found with those world state changes in mind anyway. While this is a good change to make, it won’t be the full solution. Any world state change that comes from outside the tasks the plan runner is executing will still force a replan and cause the bug to resurface.
通常,whirlwind方法应该能够取消较低优先级的计划。但是当前运行的计划仍然有效,出现此错误的唯一原因是在世界状态(world state)变化时规划器(planner)重新规划了计划,包括通过成功完成原子任务的影响(effect)对世界状态(world state)进行的更改。简单地当世界状态通过应用原子任务的影响(effect)发生变化时,只要不重新规划就可以解决这个问题——这很好,因为无论如何,这个计划都是在考虑世界状态变化的情况下制定的。虽然这是一个很好的改变,但它不是完整的解决方案。规划器(planner)正在执行的任务之外的任何世界状态更改都将迫使重新规划并导致错误重新出现。
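One way to skip replanning for effect-driven changes is to tag every world state write with its source. The sketch below is an assumption about how that could look; the `ChangeSource` enum, the `WorldState` class, and the `replan_requested` flag are hypothetical names, not part of the domain language used in this chapter.

```python
from enum import Enum, auto


class ChangeSource(Enum):
    TASK_EFFECT = auto()  # written by the plan runner when a primitive task succeeds
    EXTERNAL = auto()     # written by sensors, damage events, and so on


class WorldState:
    def __init__(self, **props):
        self._props = dict(props)
        self.replan_requested = False

    def get(self, key):
        return self._props[key]

    def set(self, key, value, source: ChangeSource) -> None:
        if self._props.get(key) == value:
            return  # nothing changed, nothing to react to
        self._props[key] = value
        # The running plan was built with its own effects in mind,
        # so only changes from outside the plan request a replan.
        if source is ChangeSource.EXTERNAL:
            self.replan_requested = True


ws = WorldState(WsPowerUp=2)
ws.set("WsPowerUp", 3, ChangeSource.TASK_EFFECT)   # third DoTrunkSlam finished
replan_after_effect = ws.replan_requested
ws.set("WsEnemyRange", 5.0, ChangeSource.EXTERNAL)  # a sensor update
replan_after_sensor = ws.replan_requested
```

As the text notes, this only narrows the problem: the sensor update in the last lines still triggers a replan while WsPowerUp is three.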
The real problem here is the domain and how it’s currently set up. There are a couple of different ways we can solve this, and which one fits depends on how you view the problem. One could say that the recovery animation is part of the attack, so it might be worth incorporating that animation into the attack animation. That way the recovery always plays after the slam attack. However, this hurts the modularity of the domain. What if the designers want to chain three slams and then do a recovery?
这里核心的问题是域(domain)和它目前是如何设置的。这里有几种不同的方法来解决这个问题,你怎么考虑它很重要。如果说恢复动画是攻击的一部分,那将该动画合并到攻击动画中是值得的。这样的话,在slam attack之后总是应该进行恢复。但这损害了域的模块化。比如设计师想要三连slams,然后做一个恢复动画呢?
A better way would be to use world state to describe the reason that DoRecovery is needed. Consider the change below:
更好的方法是使用世界状态(world state)来描述需要DoRecovery的原因。考虑下以下的修改:
Compound Task [AttackEnemy]
Method [WsPowerUp == 3]
Subtasks [DoWhirlwindTrunkAttack(), DoRecovery()]
Method [WsEnemyRange > MeleeRange]
Subtasks [DoTrunkSlam(), DoRecovery()]
Primitive Task [DoTrunkSlam]
Operator [AnimatedAttackOperator(TrunkSlamAnimName)]
Effects [WsPowerUp += 1, WsIsTired = true]
Primitive Task [DoWhirlwindTrunkAttack]
Preconditions [WsIsTired == false]
Operator [DoWhirlwindTrunkAttack()]
Effects [WsPowerUp = 0]
Primitive Task [DoRecovery]
Operator [PlayAnimation(TrunkSlamRecoveryAnim)]
Effects [WsIsTired = false]
Using the WsIsTired world state, we can properly describe the reason we need the DoRecovery task. The DoTrunkSlam task now makes the Trunk Thumper tired, and he can’t execute DoWhirlwindTrunkAttack until he gets a chance to recover. Now, when the world state changes, the DoRecovery task won’t be interrupted, and yet we preserve the modularity of DoTrunkSlam and DoRecovery. When implementing priority plan picking, these subtle details can really throw a wrench in your HTN behaviors.
使用世界状态(world state)下的WsIsTired,我们可以正确地描述需要DoRecovery任务的原因。DoTrunkSlam任务现在会使Trunk Thumper感到疲劳,直到他有机会恢复,他才能执行DoWhirlwindTrunkAttack任务。现在,当世界状态改变时,DoRecovery任务不会被中断,但是我们保留了DoTrunkSlam和DoRecovery的模块。当按照计划优先级选择执行时,这些细微的细节确实会对您的HTN行为造成困扰。
It’s important to ask yourself if you are properly representing the world when you run into these types of behavior issues. As we saw in this case, a simple world state is all that was needed.
当你遇到这些类型的行为问题时,问问自己是否正确地表示了这个世界是很重要的。正如我们在本例中看到的,一个简单的世界状态(world state)是所需要的全部。
Managing Simultaneous Behaviors
管理同时发生的行为
A lot of different behavior selection algorithms are very good at doing one thing at a time, but complications arise when it comes time to do two things at once. Luckily, there are a couple of ways you can handle this problem with HTN.
很多不同的行为选择算法都很擅长一次只做一件事,但当同时做两件事时,就会出现复杂的情况。幸运的是,有几种方法可以让HTN处理这个问题。
One’s first reaction might be to roll multiple operators into one. This will work, but it has a few pitfalls: it removes the ability to reuse operators we have already developed, the combining of multiple operators brings an added complexity that hurts maintainability, and any variation on this combined operator can force us to duplicate code if not handled correctly. Chances are you are going to run into behaviors that need to do multiple things at once often enough that you are going to want to avoid this method.
人的第一反应可能是将多个操作符合并为一个操作符。这是可行的,但有几个缺陷:它丧失了重用我们已经开发的操作符的能力,多个操作符的组合带来了额外的复杂性,损害了可维护性,如果处理不当,对组合操作符的任何变化都可能迫使我们引入重复的代码。您可能会遇到需要同时做多个事情的行为,通常情况下您会想要避免使用这种方法。
A more intuitive way to handle this is to build a separate HTN domain for each component of your NPC. Using our troll example, we might have a behavior where we need him to navigate towards his enemy but guard himself from incoming ranged attacks. We can break this up into multiple operators that control different parts of the body: a navigation operator that handles the lower body and a guard operator that handles the upper body. Knowing that, we can build two domains and use two planners to deal with the upper and lower bodies.
更直观的处理方法是构建一个单独的HTN域来处理NPC的不同组件。以我们的巨魔为例,我们可能会有这样的行为,我们需要他导航到他的敌人,但同时保护自己免受攻击。我们可以将其分解为多个operators来控制身体的不同部分,一个导航operator负责下半身,一个防守operator负责上半身。知道了这一点,我们可以建立两个领域,并使用两个规划器(planner)来处理上层和下层主体
You may find early on that this can be tricky to implement. The issue that arises is that you need to sync up the tasks in each planner. You can accomplish this by making sure you have world state that describes what’s going on in each planner. In our troll example, we can have a world state called Navigating that will be set to true when any lower body navigation task is running. This will allow the upper body planner to make decisions based on this information. Below is an example of how these two domains might be set up.
您可能在早期就发现这可能很难实现。关键点是你需要同步每个规划器(planner)中的任务。你可以通过确保你在每个规划器(planner)里的知道世界状态(world state)中正在发生的事情来做到这点。在我们的troll示例中,我们世界状态(world state)里有一个Navigating属性,在运行任何下半身navigation任务时将其设置为true。这将允许上半身规划器(planner)根据这些信息做出决策。下面是如何设置这两个域的示例。
Compound Task [BeTrunkThumperUpper]//Upper domain
Method [WsHasEnemy == true, WsEnemyRange <= MeleeRange]
Subtasks [DoTrunkSlam()]
Method [Navigating == true, HitByRangedAttack == true]
Subtasks [GuardFaceWithArm()]
Method [true]
Subtasks [Idle()]
Compound Task [BeTrunkThumperLower]//Lower domain
Method [WsHasEnemy == true, WsEnemyRange > MeleeRange]
Subtasks [NavigateToEnemy(), BeTrunkThumperLower()]
Method [true]
Subtasks [Idle()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Primitive Task [GuardFaceWithArm]
Operator [GuardFaceWithArmOperator]
Primitive Task [NavigateToEnemy]
Operator [NavigateToOperator(Enemy)]
Effects [WsLocation = Enemy]
Primitive Task [Idle]
Operator [IdleOperator]
Now this works great, but there are a few minor problems with it. A second planner will add a bit of a performance hit. Keeping these domains synchronized will hurt their maintainability. Lastly, you will not gain any friends when other programmers run into the debugging headache you just created with your multiple planners. Trust me.
这个很好用,但是有一些小问题。第二个计划器会增加一点性能损失。保持这些域的同步将损害它们的可维护性。最后,相信我,当其他程序员遇到您刚刚用多个规划器(planner)创建的调试问题时,调试将会变得非常困难,您将没任何朋友。
There is another alternative for our troll shielding example that does not involve two planners. Currently, navigation tasks complete after successfully arriving at the destination. Instead, we can have the navigation task start the path following and complete immediately, since the path following is happening in the background and not as a task in the plan runner. This frees us to plan during navigation, which allows us to put an arm up to shield the troll from incoming fire. This works as long as we have world state that describes that we are navigating and the current distance to the destination. With this we can detect when we arrive and plan accordingly. Below is an example of how the domain would look.
对于我们的巨魔防护例子,还有另一种不需要两个规划器(planner)参与的办法。目前,成功到达目的地后导航任务完成。相反,我们可以让导航任务启动路径跟随并立即完成,因为路径跟随是在后台发生的,而不是作为plan runner中的一个任务。这使我们可以在导航的过程中进行规划其他任务,这样我们就可以举起武器来保护巨魔不受攻击。只要我们在世界状态(world state)有描述我们正在导航和当前距离目的地的属性,这就可以工作。有了它,我们就可以知道什么时候到达目的地,并据此制定计划。下面这个例子是展示域应该的样子。
Compound Task [BeTrunkThumper]
Method [WsHasEnemy == true, WsEnemyRange <= MeleeRange]
Subtasks [DoTrunkSlam()]
Method [WsHasEnemy == true, WsEnemyRange > MeleeRange]
Subtasks [NavigateToEnemy()]
Method [Navigating == true, HitByRangedAttack == true]
Subtasks [GuardFaceWithArm()]
Method [true]
Subtasks [Idle()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Primitive Task [GuardFaceWithArm]
Operator [GuardFaceWithArmOperator]
Primitive Task [NavigateToEnemy]
Operator [NavigateToOperator(Enemy)]
Effects [Navigating = true]
Primitive Task [Idle]
Operator [IdleOperator]
As you can see, this domain is similar to our dual domain approach. Both approaches rely on world state to work correctly. With the dual domain, the Navigating world state was used to keep the planners in sync. In the latter approach, world state was used to represent the path following happening in the background, but without the need for two domains and two planners running.
正如您所看到的,这个域类似于两个域处理方式。这两种方法都依赖于世界状态(world state)才能正确工作。在双域中,世界状态(world state)的Navigating属性被用来保持计划者的同步。在后一种方法中,世界状态(world state)属性用于表示在后台下路径跟随的事件,而不需要两个域和两个规划器(planner)运行。
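A navigation operator that starts path following and returns right away might look like the following sketch. Everything here is an assumption made for illustration: `PathFollower` is a hypothetical one-dimensional mover ticked by the game loop, `WsDistanceToDestination` is an invented property name, and the operator writes Navigating itself rather than going through the task’s Effects entry as the domain above does.

```python
from dataclasses import dataclass
from typing import Dict

WorldState = Dict[str, object]


@dataclass
class PathFollower:
    """Hypothetical background mover, ticked by the game loop, not the plan runner."""
    position: float = 0.0
    destination: float = 0.0
    speed: float = 1.0
    active: bool = False

    def start(self, destination: float, world_state: WorldState) -> None:
        self.destination = destination
        self.active = True
        world_state["Navigating"] = True
        world_state["WsDistanceToDestination"] = abs(destination - self.position)

    def tick(self, world_state: WorldState, dt: float) -> None:
        if not self.active:
            return
        remaining = self.destination - self.position
        max_step = self.speed * dt
        if abs(remaining) <= max_step:
            # Arrived: publish it so the next plan can react to it.
            self.position = self.destination
            self.active = False
            world_state["Navigating"] = False
        else:
            self.position += max_step if remaining > 0 else -max_step
        world_state["WsDistanceToDestination"] = abs(self.destination - self.position)


def navigate_to_operator(follower: PathFollower, world_state: WorldState,
                         destination: float) -> bool:
    """Kicks off path following and reports success immediately."""
    follower.start(destination, world_state)
    return True


world = {"Navigating": False, "HitByRangedAttack": False}
follower = PathFollower()
finished_immediately = navigate_to_operator(follower, world, destination=3.0)
navigating_after_start = world["Navigating"]
for _ in range(3):  # three one-second game ticks while other tasks are planned and run
    follower.tick(world, dt=1.0)
```

Because the operator finishes at once, the plan runner is free to pick up GuardFaceWithArm while `tick` keeps moving the troll in the background.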
Speeding up Planning with Partial Plans
通过局部规划加快规划速度
Let us assume that we have built the Trunk Thumper’s domain into a pretty large network. After optimizing the planner itself, you have found the need to knock a couple of milliseconds off your planning time. There are a couple of ways we can still eke more performance out of it. As we explained, HTN naturally culls out large portions of the search space via the methods in compound tasks. There may be instances, however, where we can add a few more methods to cull more search space. In order to do this, we need to have the right world state representation.
让我们假设我们已经将Trunk Thumper的域构建为一个相当大的网络。在优化规划器(planner)本身之后,您发现需要减少几毫秒的规划时间。有很多方法可以提高它的性能。正如我们所解释的,HTN通过复合任务中的方法自然地剔除了很大一部分搜索空间。然而,在某些情况下,我们可以添加更多的方法来剔除更多的搜索空间。为了做到这一点,我们需要有恰当地世界状态(world state)表现。
If those techniques don’t get you the speed you need, partial planning should. Partial planning is one of the most powerful features of HTN. In simplest terms, it allows the planner the ability to not fully decompose a complete plan. HTN is able to do this because it uses forward decomposition or forward search to find plans. That is, the planner starts with the current world state and plans forward in time from that. This allows the planner to only plan ahead a few steps.
如果这些技术不能让你达到你需要的速度,局部规划可以。局部规划是HTN最强大的功能之一。简单地说,它允许规划器(planner)有能力不完全分解出一个完整的计划(plan)。HTN能够做到这一点是因为它使用前进式分解或前进式搜索来查找计划。也就是说,规划器(planner)从当前的世界状态开始,并从那时开始进行规划。这允许计划者只向前地计划几个步骤。
GOAP and STRIPS planner variants, on the other hand, use a backward search [Jorkin 04]. This means the search makes its way from a desired goal state toward the current world state. Searching this way means the planner has to complete the entire search in order to know what first step to take. We will go back to a simple version of our Trunk Thumper domain to demonstrate how to break it up into a partial plan domain.
GOAP和STRIPS规划器的变体,另一方面,使用向后搜索[Jorkin 04]。这意味着搜索方式是从期望的目标状态到当前的世界状态。这样搜索意味着规划器(planner)必须完成整个搜索,以便知道第一步要做什么。我们将退回到Trunk Thumper域的一个简单版本,并演示如何将其分解为局部计划域。
Compound Task [BeTrunkThumper]
Method [WsCanSeeEnemy == true]
Subtasks [NavigateToEnemy(), DoTrunkSlam()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Compound Task [NavigateToEnemy]
Method […]
Subtasks […]
Here, we have a method that will expand both the NavigateToEnemy and DoTrunkSlam tasks if WsCanSeeEnemy is true. Since whatever tasks make up NavigateToEnemy might take a long time, this method is a good candidate to split into a partial plan. There isn’t much point in planning too far into the future, since there is a good chance the world state will change and force our troll to make a different decision. We can convert this particular plan into a partial plan:
这里,我们有一个方法(method),如果WsCanSeeEnemy为真,它将同时展开NavigateToEnemy和DoTrunkSlam任务。因为任何由NavigateToEnemy组成的任务都可能需要很长时间,所以把它分成一个局部计划是个不错的选择。因为世界状态随时有可能会改变,迫使我们的巨魔做出不同的决定,所以没有太多的意义去规划太远的未来。我们可以将这一特定计划转换为局部计划:
Compound Task [BeTrunkThumper]
Method [WsCanSeeEnemy == true, WsEnemyRange > MeleeRange]
Subtasks [NavigateToEnemy()]
Method [WsCanSeeEnemy == true]
Subtasks [DoTrunkSlam()]
Primitive Task [DoTrunkSlam]
Operator [DoTrunkSlamOperator]
Compound Task [NavigateToEnemy]
Method […]
Subtasks […]
Here, we have broken the previous method into two methods. The new high priority method will navigate to the enemy only if the troll is currently out of range. If the troll is not outside of melee range, he will perform the trunk slam attack. Navigation tasks are also prime targets for partial plans, since they often take a long time to complete. It’s important to point out that splitting this plan is only doable if there is a world state available to differentiate the split.
这里,我们将前面的方法分解为两个方法。新的高优先级方法判断如果巨魔目前在近战范围之外就导航到敌人。如果巨魔在近战范围之内,他将执行trunk slam attack。导航任务也是局部计划的主要目标,因为它们通常需要很长时间才能完成。重要的是要指出需要有一个世界状态的WsEnemyRange属性可用来区分计划,分离这个计划才是可行的。
This method of partial planning requires the author of the domain to create the split themselves. But there is a way to automate this process. By assigning the concept of “time” to primitive tasks, the planner can keep track of how far into the future it has already planned. There are a couple of issues with this approach, however. Consider the domain below.
这种局部规划的方法要求域的作者自己创建分离。但是有一种方法可以自动化这个过程。通过给原子任务指定出一个“时间”的概念,规划器(planner)可以跟踪它已经计划的未来有多远。然而,结合到域这种方法有几个问题。
Compound Task [BeTrunkThumper]
Method [WsCanSeeEnemy == true]
Subtasks [NavigateToEnemy(), DoTrunkSlam()]
Primitive Task [DoTrunkSlam]
Preconditions [WsStamina > 0]
Operator [DoTrunkSlamOperator]
Compound Task [NavigateToEnemy]
Method […]
Subtasks […]
With this domain, assume the primitive tasks that make up the navigation cross the time threshold that is set in the planner. This would cause the troll to start navigating to the enemy. But if the world state property WsStamina is zero, the troll can’t execute DoTrunkSlam anyway because of its precondition. The automated partial plan split removed the ability to validate the plan properly. Of course, the method can be written to include the stamina check to avoid this problem. But since both ways are valid, it is better to ensure both will produce the same results. Not doing so will cause subtle bugs in your game.
对于这个域,假设由导航构成的原子任务越过在规划器里设置的时间极限值。这将使巨魔导航到了敌人面前。但是,如果世界状态WsStamina属性为零,由于它的先决条件无法满足巨魔就无法执行DoTrunkSlam。自动化的局部计划分离忽略了正确验证计划的能力。当然,可以编写该方法来包括持久力检查以避免此问题。但既然两种方式都是有效的,最好确保两种方法都能产生相同的结果。如果不这样做,将会导致游戏中出现一些不易察觉的bug。
Even if you feel that this isn’t a real concern, there is also the question of how to continue where the partial plan left off. We could just replan from the root, but that would require us to change the domain in some way to understand that it’s completed the first part of the full plan. In the case of our example, we would have to add a higher priority method that checks to see if we are in range to do the melee attack. But if we have to do this, what’s the point of the automated partial planning?
即使你觉得这不是一个真正的问题,还有一个问题是如何使局部计划的失败继续。我们可以从根任务再重新,但这需要我们以某种方式改变域,以让它理解已经完成了整个计划的第一部分。在我们的例子中,我们必须添加一个更高优先级的方法来检查我们是否在进行近战攻击的范围内。但是如果我们必须这样做,那么自动化部分规划的意义是什么呢?
A better solution would be to record the state of the unprocessed list. With that we can modify the planner to start with a list of tasks, instead of the one root task. This would allow us to continue the search where we left off. Of course, we would not be able to roll back to before the start of the second part of the plan. Running into this case would mean that you’ve already run tasks that you should not have. So if the user runs into this case, they can’t use partial planning because there are tasks later in the plan that need to be validated in order to get the correct behavior.
更好的解决方案是记录未处理列表的状态。这样,我们就可以修改规划器(planner),使其从任务列表开始,而不是从一个根任务开始。这样我们就可以从中断的地方开始继续搜索。当然,我们不能回滚计划第二部分开始前。遇到这种情况意味着您已经运行了不应该运行的任务。因此,如果用户遇到这种情况,他们就不能使用局部计划,因为计划中稍后的任务需要进行验证,以获得正确的行为。
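The time-threshold split and the recorded unprocessed list can be sketched together. This is a simplified illustration, not the planner described in this chapter: it has no backtracking, `plan_partial`, `Primitive`, `Compound`, and the `duration` field are hypothetical, and NavigateToEnemy is collapsed into a single stand-in primitive because its methods are elided in the domain above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple, Union

WorldState = Dict[str, object]
Condition = Callable[[WorldState], bool]


@dataclass
class Primitive:
    name: str
    duration: float  # hypothetical "time" assigned to the task
    precondition: Optional[Condition] = None
    effects: Dict[str, object] = field(default_factory=dict)


@dataclass
class Compound:
    name: str
    methods: List[Tuple[Condition, List[str]]]


Domain = Dict[str, Union[Primitive, Compound]]


def plan_partial(domain: Domain, tasks: List[str], world_state: WorldState,
                 time_budget: float) -> Optional[Tuple[List[str], List[str]]]:
    """Decompose `tasks` in order until the planned time reaches `time_budget`.

    Returns (plan, unprocessed task names), or None if decomposition fails.
    """
    working = dict(world_state)
    todo = list(tasks)
    plan: List[str] = []
    planned_time = 0.0
    while todo and planned_time < time_budget:
        task = domain[todo.pop(0)]
        if isinstance(task, Compound):
            for condition, subtasks in task.methods:
                if condition(working):
                    todo = list(subtasks) + todo
                    break
            else:
                return None  # no method applies
        else:
            if task.precondition is not None and not task.precondition(working):
                return None
            working.update(task.effects)
            plan.append(task.name)
            planned_time += task.duration
    return plan, todo


DOMAIN: Domain = {
    "BeTrunkThumper": Compound("BeTrunkThumper", [
        (lambda ws: ws["WsCanSeeEnemy"] is True, ["NavigateToEnemy", "DoTrunkSlam"]),
    ]),
    "NavigateToEnemy": Primitive("NavigateToEnemy", duration=5.0),  # stand-in primitive
    "DoTrunkSlam": Primitive("DoTrunkSlam", duration=1.0,
                             precondition=lambda ws: ws["WsStamina"] > 0),
}

world = {"WsCanSeeEnemy": True, "WsStamina": 0}
partial = plan_partial(DOMAIN, ["BeTrunkThumper"], world, time_budget=3.0)
resumed = plan_partial(DOMAIN, partial[1], world, time_budget=3.0)
full = plan_partial(DOMAIN, ["BeTrunkThumper"], world, time_budget=float("inf"))
```

Here the partial search hands back the navigation step plus the unprocessed DoTrunkSlam, and the stamina failure only shows up when the search resumes, whereas the full search rejects the plan before the troll moves at all.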
With Transformers: Fall of Cybertron, we simply built the partial plans into the domains. For us, the chance of putting subtle bugs into the game was high, and we found that we were naturally putting partial plans in our NPC domains anyway when full plan validation wasn’t necessary. A lot of our NPCs were using the last example from Section 12.9 for navigation, which is also an example of partial planning.
在《变形金刚:塞伯坦的秋天》中,我们只是将局部计划构建到域中。对于我们来说,在游戏中产生细微bug的几率很高,我们发现在没有必要验证完整计划时,我们会自然地将局部计划放到NPC域中。我们的许多npc使用了12.9节中的最后一个例子来导航,这也是局部规划的一个例子。
Conclusion
结论
Going through the process of creating a simple NPC can be a real eye-opener to the details involved with implementation of any behavior selection system. Hopefully we have explored enough of hierarchical task networks to show its natural approach to describing behaviors, the re-usability and modularity of its primitive tasks. HTN’s ability to reason about the future allows an expressiveness only found with planners. We have also attempted to point out potential problems a developer may come across when implementing it. Hierarchical task networks were a real benefit to the AI programmers on Transformers: Fall of Cybertron and we’re sure it will be the same for you.
通过创建一个简单的NPC的过程,可以让您对任何行为选择系统的实现所涉及的细节都大开眼界。希望我们已经对HTN进行了足够的探索,以展示其描述行为的自然方法、原子任务的可重用性和模块化。HTN对未来进行推理的能力允许只有规划器才能具备的一种表达方式。我们还试图指出开发人员在实现它时可能遇到的潜在问题。HTN对《变形金刚:塞伯坦的秋天》的AI程序员来说是真正有好处的,我们相信对你来说也一样。
References
引用
[Erol et al. 94] K. Erol, D. Nau, and J. Hendler. “HTN planning: Complexity and expressivity.” AAAI-94 Proceedings, 1994.
[Erol et al. 95] K. Erol, J. Hendler, and D. Nau. “Semantics for Hierarchical Task-Network Planning.” Technical report TR 95-9. The Institute for Systems Research, 1995.
[Ghallab et al. 04] M. Ghallab, D. Nau, and P. Traverso. Automated Planning. San Francisco, CA: Elsevier, 2004, pp. 229–259.
[HighMoon 10] Transformers: War for Cybertron, High Moon Studios/Activision Publishing, 2010.
[HighMoon 12] Transformers: Fall of Cybertron, High Moon Studios/Activision Publishing, 2012.
[Jorkin 04] Jeff Orkin. “Applying goal-oriented action planning to games.” In AI Game Programming Wisdom 2, edited by Steve Rabin. Hingham, MA: Charles River Media, 2004, pp. 217–227.