Approach
After an initial training phase, we remove all connections whose weight is below a threshold. This pruning converts a dense, fully connected layer into a sparse layer. The first phase thus learns the topology of the network: it identifies which connections are important so that the unimportant ones can be removed. We then retrain the sparse network so that the remaining connections can compensate for those that were removed. Pruning and retraining may be repeated iteratively to further reduce network complexity.
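The thresholding step can be sketched in a few lines of NumPy. This is only an illustration, not the paper's implementation: it assumes that "lower than a threshold" refers to the absolute value of the weight, and the function name prune_by_magnitude, the 4×6 weight matrix, and the threshold of 0.5 are all made up for the example.

```python
import numpy as np

def prune_by_magnitude(weights, threshold):
    """Zero out weights whose magnitude is below `threshold` (assumed criterion).

    Returns the pruned weights and the boolean mask of surviving connections.
    """
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Hypothetical dense layer: 4 inputs x 6 outputs with random weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))

W_pruned, mask = prune_by_magnitude(W, threshold=0.5)
sparsity = 1.0 - mask.mean()  # fraction of connections removed
print(f"removed {sparsity:.0%} of the connections")
```

Keeping the mask alongside the weights is one simple way to make sure that removed connections stay removed during retraining.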
Learning the right connections is an iterative process. Pruning followed by retraining is one iteration; after many such iterations, the minimum number of connections can be found. Without loss of accuracy, this method boosts the pruning rate on AlexNet from 5× to 9× compared with single-step aggressive pruning. Each iteration is a greedy search in that we find the best connections.
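The prune-and-retrain loop might look roughly like the sketch below. It uses a toy linear model trained by gradient descent instead of a deep network, and everything specific in it is an assumption made for illustration: the synthetic data, the threshold schedule (0.05, 0.1, 0.2), the learning rate, the magnitude-based criterion, and the choice to retrain from the surviving weights rather than from scratch.

```python
import numpy as np

def train(X, y, w, mask, lr=0.1, steps=300):
    """Gradient descent on mean squared error; masked weights are held at zero."""
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask  # pruned connections never come back
    return w

def iterative_prune(X, y, thresholds, lr=0.1, steps=300):
    """Train densely, then alternate pruning and retraining once per threshold."""
    n_features = X.shape[1]
    mask = np.ones(n_features)
    w = train(X, y, np.zeros(n_features), mask, lr, steps)  # initial dense training
    history = []
    for t in thresholds:
        mask = mask * (np.abs(w) >= t)              # prune small surviving weights
        w = train(X, y, w * mask, mask, lr, steps)  # retrain the survivors
        history.append(int(mask.sum()))
    return w, mask, history

# Synthetic regression problem (hypothetical): a few strong and a few weak weights.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.5, 1.0, 0.15, -0.08, 0.0, 0.0, 0.0, 0.0, 0.0])
X = rng.normal(size=(200, 10))
y = X @ true_w + 0.01 * rng.normal(size=200)

w, mask, history = iterative_prune(X, y, thresholds=(0.05, 0.1, 0.2))
print("connections left after each iteration:", history)
```

Raising the threshold gradually gives the surviving weights a chance to adapt between rounds. In this toy setup the later rounds remove weights that do carry some signal, so in practice one would also track validation accuracy after each iteration and stop before it degrades.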
Experiment
References:
Song Han et al., "Learning both Weights and Connections for Efficient Neural Networks", Neural Information Processing Systems (NIPS), 2015.