  • Chapter 9

    Chapter 9: On-policy Prediction with Approximation From this chapter, we move from tabu...

  • Chapter 7

    Chapter 7: n-step Bootstrapping n-step TD methods span a spectrum with MC methods at on...

  • Chapter 6

    Chapter 6: Temporal-Difference Learning Temporal-difference (TD) learning is a combinat...

  • Chapter 5

    Chapter 5: Monte Carlo Methods Monte Carlo (MC) methods are learning methods for estima...

  • Chapter 4

    Chapter 4: Dynamic Programming Dynamic programming computes optimal policies given a pe...

  • Chapter 3

    Chapter 3: Finite Markov Decision Processes Basic Definitions MDP is the most basic for...

  • Chapter 2

    Chapter 2: Multi-armed Bandits Multi-armed bandits can be seen as the simplest form of ...

  • Pointer Networks

    Pointer Networks. Oriol Vinyals, Meire Fortunato, Navdeep Jaitly. Google, Berkeley. NIPS 201...

  • Neural Computation of Decisions in Optimization Problems

    Neural Computation of Decisions in Optimization Problems. J. J. Hopfield, D. W. Tank. Biol...

  • Attention, Learn to Solve Routing Problems

    Attention, Learn to Solve Routing Problems. Wouter Kool, Herke van Hoof, Max Welling. Univ...

  • Machine Learning for Combinatorial Optimization

    Machine Learning for Combinatorial Optimization 1 Introduction 1.1 Background Operation...

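  • What Kind of Artificial Intelligence Do We Actually Need?

    A few days ago, a Tesla self-driving car was involved in an accident and its owner was killed. Artificial intelligence and driverless cars are both in the spotlight lately, and everyone from internet giants to traditional automakers is building driverless cars. On the other hand, many researchers who actually work on the front line of autonomous-driving R&D...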

  • Understanding LSTM Networks

    Author: Christopher Olah (OpenAI). Translator: 朱小虎 Xiaohu (Neil) Zhu (CSAGI / University AI). Original link: https:...