https://arxiv.org/pdf/1612.00796v1.pdf
The ability to learn tasks in a sequential fashion is crucial to the development of
artificial intelligence. Neural networks are not, in general, capable of this and it
has been widely thought that catastrophic forgetting is an inevitable feature of
connectionist models. We show that it is possible to overcome this limitation and
train networks that can maintain expertise on tasks which they have not experienced
for a long time. Our approach remembers old tasks by selectively slowing down
learning on the weights important for those tasks. We demonstrate our approach is
scalable and effective by solving a set of classification tasks based on the MNIST
hand written digit dataset and by learning several Atari 2600 games sequentially.