https://www.zhihu.com/question/362455124?sort=created
Intel's model compression tool: distiller (key resource)
https://github.com/NervanaSystems/distiller (supports PyTorch; officially tested on PyTorch 1.3; top-starred result when searching "PyTorch pruning" on GitHub)
Microsoft's AutoML toolkit nni also has a model compression module: https://github.com/microsoft/nni/blob/master/examples/model_compress/QAT_torch_quantizer.py
https://nni.readthedocs.io/zh/latest/Compressor/Quantizer.html
https://github.com/microsoft/nni/blob/master/examples/model_compress/DoReFaQuantizer_torch_mnist.py
PyTorch's built-in quantization tooling (PyTorch >= 1.3)
https://zhuanlan.zhihu.com/p/81026071
https://github.com/pytorch/glow/blob/master/docs/Quantization.md
https://github.com/pytorch/QNNPACK
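For orientation before diving into the links above: the 8-bit scheme behind these PyTorch tools is an affine (asymmetric) map, real ≈ scale * (q - zero_point) with q in [0, 255]. A minimal pure-Python sketch of that math (my own illustration, not the torch.quantization API):

```python
# Affine quantization: real ~= scale * (q - zero_point), q clamped to [qmin, qmax].

def choose_qparams(xs, qmin=0, qmax=255):
    """Pick scale/zero_point covering [min(xs), max(xs)]; 0.0 must be exactly representable."""
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    # Round onto the integer grid, then clamp into the representable range.
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    return [scale * (q - zero_point) for q in qs]

xs = [-1.0, 0.0, 0.5, 2.0]
scale, zp = choose_qparams(xs)
recon = dequantize(quantize(xs, scale, zp), scale, zp)  # lossy round-trip
```

The round-trip error for in-range values is bounded by scale/2, which is why picking a tight [min, max] range matters so much for accuracy.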
The NNI quantizers look fairly approachable judging from the examples (PyTorch's official quantization docs are overwhelming and not suited to quantization beginners)
NNI offers four quantizers:
1. Naive Quantizer, 2. QAT Quantizer, 3. DoReFa Quantizer, 4. BNN Quantizer
Of these, 1 seems the most basic: it just converts 32-bit weights to 8-bit at inference time, so I'd rather skip it. As for 4, what is a binarized neural network anyway? Bitwise operations sound appealing, but I don't know whether deployment has pitfalls
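On the BNN question: "binary" means weights and activations are collapsed to ±1 via sign(), so a dot product can be computed with XNOR + popcount on hardware. A toy sketch of the idea (my own illustration, not NNI's BNN Quantizer API):

```python
# Deterministic binarization as in Binarized Neural Networks: x -> sign(x) in {-1, +1}.

def binarize(xs):
    return [1 if x >= 0 else -1 for x in xs]

def binary_dot(ws, xs):
    """Dot product of two +/-1 vectors; equals len - 2 * (number of sign mismatches),
    which is what an XNOR + popcount kernel computes."""
    return sum(w * x for w, x in zip(ws, xs))

w = binarize([0.3, -1.2, 0.7, -0.1])   # [1, -1, 1, -1]
x = binarize([0.5, 0.5, -2.0, -0.4])   # [1, 1, -1, -1]
out = binary_dot(w, x)
```

The deployment question is fair: the speedup only materializes if the runtime actually has bit-packed XNOR kernels, otherwise the ±1 values are stored as ordinary floats.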
2: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018). NNI's docs say batch-norm folding is not supported; unclear how much that matters
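Since the missing piece flagged above is batch-norm folding, here is what folding actually does: at inference a Conv followed by BatchNorm can be merged into a single Conv, so only the folded weights need quantizing. A scalar sketch of the algebra (hypothetical helper, not an NNI or PyTorch API):

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Return (w', b') such that
    gamma * (w*x + b - mean) / sqrt(var + eps) + beta == w'*x + b' for all x."""
    s = gamma / math.sqrt(var + eps)
    return w * s, (b - mean) * s + beta

# Conv parameters (scalar stand-ins) followed by BN statistics:
w, b = 2.0, 0.5
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0
wf, bf = fold_bn(w, b, gamma, beta, mean, var)  # single folded linear layer
```

Without folding, the quantizer sees the unfolded Conv weights during training, but deployment frameworks typically fold before quantizing, so the quantization ranges can mismatch.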
3: DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients (Face++, 2016). The example looks simple. https://arxiv.org/abs/1606.06160, https://blog.csdn.net/langzi453/article/details/88172080
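The DoReFa weight rule from the paper, sketched in plain Python (an illustration of the formula, not NNI's DoReFaQuantizer API): squash weights with tanh, map affinely to [0, 1], round onto 2^k levels, then map back to [-1, 1].

```python
import math

def quantize_k(r, k):
    """Round r in [0, 1] onto 2^k evenly spaced levels in [0, 1]."""
    n = (1 << k) - 1
    return round(r * n) / n

def dorefa_weights(ws, k):
    """DoReFa k-bit weight quantization:
    w_q = 2 * quantize_k( tanh(w) / (2 * max|tanh(W)|) + 1/2 ) - 1."""
    m = max(abs(math.tanh(w)) for w in ws)
    return [2 * quantize_k(math.tanh(w) / (2 * m) + 0.5, k) - 1 for w in ws]

qs = dorefa_weights([0.5, -1.0, 2.0], k=2)  # values on a 4-level grid in [-1, 1]
```

With k=1 this degenerates to binary ±1 weights, which is why DoReFa is often discussed alongside BNNs.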
Is NNI's quantization only simulated, with no real speedup? https://github.com/microsoft/nni/issues/2332
https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py
Papers with Code GitHub ranking for quantization:
https://paperswithcode.com/search?q_meta=&q=Quantizer
No. 1: Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Google, 2018)
https://paperswithcode.com/paper/quantization-and-training-of-neural-networks (TensorFlow only — a turn-off)
No. 2: Training with Quantization Noise for Extreme Model Compression (2020)
https://paperswithcode.com/paper/training-with-quantization-noise-for-extreme (PyTorch implementation)
Model compression benchmarks: https://paperswithcode.com/task/model-compression
Quantization benchmarks: https://paperswithcode.com/task/quantization
Another DoReFa implementation: https://github.com/666DZY666/model-compression
Another QAT implementation: https://github.com/Xilinx/brevitas