https://www.zhihu.com/question/362455124?sort=created
Intel's model compression tool: distiller (key resource)
https://github.com/NervanaSystems/distiller (supports PyTorch; officially tested on PyTorch 1.3; top-starred result when searching "PyTorch pruning" on GitHub)
Microsoft's AutoML toolkit nni also has a model compression module: https://github.com/microsoft/nni/blob/master/examples/model_compress/QAT_torch_quantizer.py
https://nni.readthedocs.io/zh/latest/Compressor/Quantizer.html
https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py
PyTorch's built-in quantization tooling (PyTorch >= 1.3)
https://zhuanlan.zhihu.com/p/81026071
https://github.com/pytorch/glow/blob/master/docs/Quantization.md
https://github.com/pytorch/QNNPACK
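For orientation before diving into the links above: the 8-bit scheme behind these PyTorch tools is an affine (asymmetric) map, real ≈ scale * (q - zero_point) with q in [0, 255]. A minimal pure-Python sketch of that math (my own illustration, not the torch.quantization API):

```python
# Affine quantization: real ~= scale * (q - zero_point), q clamped to [qmin, qmax].

def choose_qparams(xs, qmin=0, qmax=255):
    """Pick scale/zero_point covering [min(xs), max(xs)]; 0.0 must be exactly representable."""
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    # Round onto the integer grid, then clamp into the representable range.
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    return [scale * (q - zero_point) for q in qs]

xs = [-1.0, 0.0, 0.5, 2.0]
scale, zp = choose_qparams(xs)
recon = dequantize(quantize(xs, scale, zp), scale, zp)  # lossy round-trip
```

The round-trip error for in-range values is bounded by scale/2, which is why picking a tight [min, max] range matters so much for accuracy.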
The NNI quantizers look fairly approachable judging from the examples (PyTorch's official quantization docs are overwhelming and not suited to quantization beginners)
NNI offers four quantizers:
1. Naive Quantizer, 2. QAT Quantizer, 3. DoReFa Quantizer, 4. BNN Quantizer
Of these, 1 seems the most basic: it just converts 32-bit weights to 8-bit at inference time, so I'd rather skip it. As for 4, what is a binarized neural network anyway? Bitwise operations sound appealing, but I don't know whether deployment has pitfalls
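On the BNN question: "binary" means weights and activations are collapsed to ±1 via sign(), so a dot product can be computed with XNOR + popcount on hardware. A toy sketch of the idea (my own illustration, not NNI's BNN Quantizer API):

```python
# Deterministic binarization as in Binarized Neural Networks: x -> sign(x) in {-1, +1}.

def binarize(xs):
    return [1 if x >= 0 else -1 for x in xs]

def binary_dot(ws, xs):
    """Dot product of two +/-1 vectors; equals len - 2 * (number of sign mismatches),
    which is what an XNOR + popcount kernel computes."""
    return sum(w * x for w, x in zip(ws, xs))

w = binarize([0.3, -1.2, 0.7, -0.1])   # [1, -1, 1, -1]
x = binarize([0.5, 0.5, -2.0, -0.4])   # [1, 1, -1, -1]
out = binary_dot(w, x)
```

The deployment question is fair: the speedup only materializes if the runtime actually has bit-packed XNOR kernels, otherwise the ±1 values are stored as ordinary floats.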
2: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018). NNI's docs say batch-norm folding is not supported; unclear how much that matters
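Since the missing piece flagged above is batch-norm folding, here is what folding actually does: at inference a Conv followed by BatchNorm can be merged into a single Conv, so only the folded weights need quantizing. A scalar sketch of the algebra (hypothetical helper, not an NNI or PyTorch API):

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Return (w', b') such that
    gamma * (w*x + b - mean) / sqrt(var + eps) + beta == w'*x + b' for all x."""
    s = gamma / math.sqrt(var + eps)
    return w * s, (b - mean) * s + beta

# Conv parameters (scalar stand-ins) followed by BN statistics:
w, b = 2.0, 0.5
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0
wf, bf = fold_bn(w, b, gamma, beta, mean, var)  # single folded linear layer
```

Without folding, the quantizer sees the unfolded Conv weights during training, but deployment frameworks typically fold before quantizing, so the quantization ranges can mismatch.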
3: DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (Face++, 2016). The example looks simple. https://arxiv.org/abs/1606.06160, https://blog.csdn.net/langzi453/article/details/88172080
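The DoReFa weight rule from the paper, sketched in plain Python (an illustration of the formula, not NNI's DoReFaQuantizer API): squash weights with tanh, map affinely to [0, 1], round onto 2^k levels, then map back to [-1, 1].

```python
import math

def quantize_k(r, k):
    """Round r in [0, 1] onto 2^k evenly spaced levels in [0, 1]."""
    n = (1 << k) - 1
    return round(r * n) / n

def dorefa_weights(ws, k):
    """DoReFa k-bit weight quantization:
    w_q = 2 * quantize_k( tanh(w) / (2 * max|tanh(W)|) + 1/2 ) - 1."""
    m = max(abs(math.tanh(w)) for w in ws)
    return [2 * quantize_k(math.tanh(w) / (2 * m) + 0.5, k) - 1 for w in ws]

qs = dorefa_weights([0.5, -1.0, 2.0], k=2)  # values on a 4-level grid in [-1, 1]
```

With k=1 this degenerates to binary ±1 weights, which is why DoReFa is often discussed alongside BNNs.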
Is NNI's quantization only simulated, with no real speedup? https://github.com/microsoft/nni/issues/2332
https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py
Papers with Code GitHub ranking for quantization:
https://paperswithcode.com/search?q_meta=&q=Quantizer
No. 1: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018)
https://paperswithcode.com/paper/quantization-and-training-of-neural-networks (TensorFlow only — a turn-off)
No. 2: Training with Quantization Noise for Extreme Model Compression (2020)
https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme (PyTorch implementation)
Model compression benchmarks: https://paperswithcode.com/task/model-compression
Quantization benchmarks: https://paperswithcode.com/task/quantization
Another DoReFa implementation: https://github.com/666DZY666/model-compression
Another QAT implementation: https://github.com/Xilinx/brevitas