Agent:
An agent takes the current percept as input from its sensors and returns an action to its actuators.
Rationality depends on:
• The performance measure that defines the criterion of success.
• The agent’s prior knowledge of the environment.
• The actions that the agent can perform.
• The agent’s percept sequence to date.
Simple reflex agents:
- These agents select actions on the basis of the current percept, ignoring the rest of the percept history; the action is chosen from the current input alone.
- Condition–action rule: if ... then ...
- Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments; a simple reflex agent works correctly only if the environment is fully observable. Escaping from infinite loops is possible if the agent can randomize its actions.
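A condition–action rule table can be sketched in a few lines. This is a minimal illustration using the toy vacuum world (the percept format and action names here are invented for the example): the agent looks only at the current percept, never at history.

```python
# Simple reflex agent for a toy two-square vacuum world (illustrative).
# The percept is a (location, status) pair; the agent maps it directly
# to an action via condition-action rules, with no internal state.

def simple_reflex_agent(percept):
    location, status = percept
    if status == "dirty":          # rule 1: if dirty then suck
        return "suck"
    elif location == "A":          # rule 2: if at A then go right
        return "move_right"
    else:                          # rule 3: if at B then go left
        return "move_left"

print(simple_reflex_agent(("A", "dirty")))  # suck
print(simple_reflex_agent(("A", "clean")))  # move_right
```

Note that if both squares are clean, this agent shuttles between A and B forever: exactly the kind of loop the bullet above describes.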
model-based agents:
- the agent should maintain some sort of internal state that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state.
- An agent that uses such a model is called a model-based agent
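Adding internal state to the reflex agent above gives a model-based agent. The sketch below is a hypothetical extension of the same toy vacuum world: the agent remembers what it has observed, so it can act sensibly even when the current percept does not reveal the whole state.

```python
# Model-based reflex agent sketch (toy vacuum world, names invented).
# The internal state records the last observed status of each square,
# reflecting aspects of the world not visible in the current percept.

class ModelBasedAgent:
    def __init__(self):
        self.state = {}  # internal model: location -> last seen status

    def update_state(self, percept):
        location, status = percept
        self.state[location] = status

    def agent_program(self, percept):
        self.update_state(percept)
        location, status = percept
        if status == "dirty":
            return "suck"
        # Use the model: if the other square was already seen clean,
        # there is nothing left to do.
        other = "B" if location == "A" else "A"
        if self.state.get(other) == "clean":
            return "no_op"
        return "move_right" if location == "A" else "move_left"
```

Unlike the simple reflex agent, this one can stop (return "no_op") once its model says both squares are clean, avoiding the infinite shuttle.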
goal-based agents:
- the agent needs some sort of goal information that describes situations that are desirable
- Although the goal-based agent appears less efficient, it is more flexible because the knowledge that supports its decisions is represented explicitly and can be modified.
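The flexibility claim can be made concrete with a tiny sketch (the world model and helper names below are invented): the agent picks whichever action's predicted result satisfies the goal test, so changing the goal changes behavior without rewriting any rules.

```python
# Goal-based agent sketch (illustrative). The agent uses a model of
# action outcomes (result) plus explicit goal knowledge (goal_test);
# swapping in a different goal_test changes its behavior for free.

def goal_based_agent(state, actions, result, goal_test):
    for action in actions:
        if goal_test(result(state, action)):
            return action
    return None  # no single action reaches the goal

# Toy world: the state is an integer position, goal is position 2.
result = lambda s, a: s + 1 if a == "right" else s - 1
goal_test = lambda s: s == 2
print(goal_based_agent(1, ["left", "right"], result, goal_test))  # right
```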
Properties of task environments
Fully observable vs. partially observable
If an agent’s sensors give it access to the complete state of the environment at each point in time, then we say that the task environment is fully observable
Single agent vs. multiagent
The number of agents: driving a taxi is a single-agent environment; playing chess is a two-agent environment.
Deterministic vs. stochastic
If the next state of the environment is completely determined by the current state and the action executed by the agent, then we say the environment is deterministic; otherwise, it is stochastic.
In principle, an agent need not worry about uncertainty in a fully observable, deterministic environment.
Episodic vs. sequential
the next episode does not depend on the actions taken in previous episodes; episodes are independent of one another.
In sequential environments, on the other hand, the current decision could affect all future decisions. Examples: chess and taxi driving.
Static vs. dynamic
If the environment can change while an agent is deliberating, then we say the environment is dynamic for that agent; otherwise, it is static.
Discrete vs. continuous
The discrete/continuous distinction applies to the state of the environment, to the way time is handled, and to the percepts and actions of the agent. Chess: discrete; taxi driving: continuous.
Known vs. unknown
This distinction refers to the agent’s (or designer’s) state of knowledge about the “laws of physics” of the environment. In a known environment, the outcomes (or outcome probabilities if the environment is stochastic) for all actions are given.
What is an admissible heuristic function?
In computer science, specifically in algorithms related to pathfinding, a heuristic function is said to be admissible if it never overestimates the cost of reaching the goal, i.e. the cost it estimates to reach the goal is not higher than the lowest possible cost from the current point in the path.
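A standard concrete example (a toy grid, assumed here for illustration): on a 4-connected grid with unit step cost, Manhattan distance is admissible because any path must make at least |dx| + |dy| unit moves, so the estimate never exceeds the true cost.

```python
# Manhattan distance as an admissible heuristic on a 4-connected grid
# with unit step costs (illustrative toy example).

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

start, goal = (0, 0), (3, 4)
h = manhattan(start, goal)
# On an empty grid the true shortest path is 3 horizontal + 4 vertical
# unit moves = 7, so h(n) <= true cost holds (here with equality).
true_cost = 7
assert h <= true_cost
print(h)  # 7
```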
Greedy Best-first Search
Each node has a heuristic value h(n), the estimated cost to get from the node to the goal. At each step the search greedily expands the frontier node with the smallest h(n), repeating until the goal is reached.
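The procedure above can be sketched with a priority queue keyed on h(n). The graph and heuristic values below are made up for illustration; note that g(n), the cost so far, is deliberately ignored, which is what distinguishes greedy best-first search from A*.

```python
# Greedy best-first search sketch: always expand the frontier node
# with the smallest h(n) (estimated cost to goal), ignoring path cost.

import heapq

def greedy_best_first(graph, h, start, goal):
    frontier = [(h[start], start, [start])]  # (h-value, node, path)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nbr in graph.get(node, []):
            if nbr not in visited:
                heapq.heappush(frontier, (h[nbr], nbr, path + [nbr]))
    return None  # goal unreachable

# Toy graph (invented): S has two routes to G.
graph = {"S": ["A", "B"], "A": ["G"], "B": ["G"]}
h = {"S": 3, "A": 1, "B": 2, "G": 0}
print(greedy_best_first(graph, h, "S", "G"))  # ['S', 'A', 'G']
```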
Minimax and alpha-beta pruning
The minimax algorithm is a way of finding an optimal move in a two player game. Alpha-beta pruning is a way of finding the optimal minimax solution while avoiding searching subtrees of moves which won't be selected.
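Both ideas fit in one short function. The sketch below runs on an explicit game tree (the leaf values are the textbook-style example, assumed here): whenever alpha >= beta, the remaining children of a node cannot change the result and are pruned, yet the returned value is exactly the full minimax value.

```python
# Minimax with alpha-beta pruning on an explicit game tree sketch.
# Leaves are numbers (static evaluations); internal nodes are lists
# of children. Pruning skips subtrees that cannot affect the result.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):   # leaf: return its evaluation
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:            # beta cutoff: MIN avoids this line
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:            # alpha cutoff: MAX avoids this line
                break
        return value

# Two-ply example: MAX to move at the root, three MIN nodes below.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))  # 3
```

On this tree, plain minimax would inspect all nine leaves; alpha-beta skips two of them (4 and 6 in the second MIN node, and 2 in the third) while returning the same value.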
Some true or false:
- Hill-climbing is an entirely deterministic algorithm. F: stochastic hill-climbing chooses at random among the available uphill moves.
- DFS has lower asymptotic space complexity than BFS. (Asymptotic space complexity is the limiting behavior of an algorithm's memory use as the problem size goes to infinity.) The given answer is F, but I think it is the opposite: DFS stores O(bm) nodes (linear in the maximum depth m) while BFS stores O(b^d) nodes (exponential in the solution depth d), so DFS has the lower space complexity.
- During search, one usually applies the goal test onto newly expanded children, before queuing up these children. F: most algorithms queue the children first and apply the goal test only when a node is selected for expansion.
- A contingency problem involves a nondeterministic and accessible environment. F: the environment is inaccessible, not accessible.
- A* is an admissible algorithm. T, but with a precondition: the heuristic function must itself be admissible.
- When using the correct temperature decrease schedule, simulated annealing is guaranteed to find the global optimum in finite time. F: with a sufficiently slow schedule the guarantee holds only in the limit, not in finite time.
- Alpha-beta pruning accelerates game playing at the cost of being an approximation to full minimax. F: alpha-beta pruning returns exactly the same decision and value as full minimax; it only prunes subtrees that cannot affect the result, so it is not an approximation.
- Genetic algorithms use a step called “failover”. F: the standard steps are selection, crossover, and mutation; “crossover”, not “failover”.
- A perfectly rational backgammon-playing agent never loses. F: backgammon involves dice rolls, so even a perfectly rational agent can lose to bad luck.
- Hill climbing search is best used for problem domains with densely packed goals T
- The exact evaluation function values do not affect the minimax decision as long as the ordering of these values is maintained. T for deterministic games: any order-preserving transformation of the leaf values yields the same minimax decision. For games with chance nodes (expectiminimax), however, the magnitudes do matter, so there the answer is F.