A paper tally by an academic rookie; please ignore.
2015
#1.Learning both Weights and Connections for Efficient Neural Networks:2745 P3
First determine which connections are important, then prune, then fine-tune.
Compares L1- and L2-regularized pruning; retraining plus iterative pruning gives the best results (sketch below).
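A minimal PyTorch sketch of the prune-then-fine-tune loop, assuming simple per-layer magnitude thresholds; the tiny MLP, random data, and sparsity schedule are illustrative, not the paper's exact setup.

```python
import torch
import torch.nn as nn

def magnitude_masks(model, sparsity):
    """Per-layer binary masks that zero out the smallest-magnitude weights."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:                     # skip biases and other 1-D parameters
            continue
        k = max(int(sparsity * p.numel()), 1)
        threshold = p.detach().abs().flatten().kthvalue(k).values
        masks[name] = (p.detach().abs() > threshold).float()
    return masks

def apply_masks(model, masks):
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)

for sparsity in (0.5, 0.7, 0.9):            # iterative: prune a little, retrain, repeat
    masks = magnitude_masks(model, sparsity)
    apply_masks(model, masks)
    for _ in range(100):                    # fine-tune with pruned weights held at zero
        x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        apply_masks(model, masks)           # re-zero pruned weights after each update
```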
2016
#2.Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding:4101 CA
Prune + fine-tune, quantize weights (weight sharing), Huffman coding; see the quantization sketch below.
3-4x speedup
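A rough NumPy sketch of the weight-sharing (trained-quantization) step only: cluster a layer's weights with k-means so that just a small codebook plus per-weight indices need to be stored. The cluster count (16, i.e. 4-bit indices) and the plain Lloyd iterations are assumptions; the pruning and Huffman-coding stages are omitted.

```python
import numpy as np

def kmeans_1d(values, k=16, iters=20):
    """Cluster scalar weights into k shared values (plain Lloyd iterations)."""
    centroids = np.linspace(values.min(), values.max(), k)   # evenly spaced initial centroids
    for _ in range(iters):
        idx = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        for c in range(k):
            if np.any(idx == c):
                centroids[c] = values[idx == c].mean()
    return centroids, idx

weights = np.random.randn(1000)                 # stand-in for one layer's surviving weights
codebook, indices = kmeans_1d(weights, k=16)    # 16 clusters => 4-bit indices
quantized = codebook[indices]                   # shared-weight reconstruction
print("distinct weight values after sharing:", np.unique(quantized).size)
```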
#3.Dynamic Network Surgery for Efficient DNNs:502 CA
Dynamic pruning, with a splicing step added to recover incorrectly pruned connections.
2017
#4.Pruning Filters for Efficient ConvNets:1319 P3
Prunes filters, i.e., prunes the output channels (cout); see the sketch below.
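A small sketch of filter (cout) pruning, assuming the per-filter L1-norm criterion; dropping filter j of one conv also drops input channel j of the next conv. Layer sizes and the keep count are illustrative.

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)

n_keep = 24                                              # keep 24 of 32 filters (assumed)
l1 = conv1.weight.detach().abs().sum(dim=(1, 2, 3))      # one L1 norm per output filter
keep = torch.argsort(l1, descending=True)[:n_keep].sort().values

pruned1 = nn.Conv2d(16, n_keep, kernel_size=3, padding=1)
pruned2 = nn.Conv2d(n_keep, 64, kernel_size=3, padding=1)
with torch.no_grad():
    pruned1.weight.copy_(conv1.weight[keep])             # surviving filters (cout)
    pruned1.bias.copy_(conv1.bias[keep])
    pruned2.weight.copy_(conv2.weight[:, keep])          # drop the matching cin of the next layer
    pruned2.bias.copy_(conv2.bias)

x = torch.randn(1, 16, 8, 8)
print(pruned2(torch.relu(pruned1(x))).shape)             # torch.Size([1, 64, 8, 8])
```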
#5.Pruning Convolutional Neural Networks for Resource Efficient Inference:785 T3
(1) Iteratively prunes the least important parameters.
(2) Uses a Taylor expansion to decide which to prune.
(3) Normalizes the criterion within each layer (see the sketch after this list).
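A sketch of the first-order Taylor criterion on feature maps with layer-wise L2 normalization; the tiny model, the stand-in loss, and the hook bookkeeping are my own scaffolding, not the paper's code.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(8, 16, 3, padding=1), nn.ReLU())
acts = {}

def save_act(name):
    def hook(module, inputs, output):
        output.retain_grad()            # keep this feature map's gradient after backward
        acts[name] = output
    return hook

for name, m in model.named_modules():
    if isinstance(m, nn.Conv2d):
        m.register_forward_hook(save_act(name))

x = torch.randn(4, 3, 16, 16)
loss = model(x).mean()                  # stand-in for the task loss
loss.backward()

scores = {}
for name, a in acts.items():
    # |mean over pixels of activation * gradient| per channel, averaged over the batch
    taylor = (a * a.grad).mean(dim=(2, 3)).abs().mean(dim=0)
    scores[name] = taylor / (taylor.norm() + 1e-8)       # layer-wise L2 normalization
print({name: s.shape for name, s in scores.items()})
```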
#6.Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee:79 TA
Unstructured (irregular) pruning; skipping for now.
#7.Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon:135 PA
Needs little retraining; defines a layer-wise error and computes its second-order derivatives. How does this differ from paper #5?
#8.Runtime Neural Pruning:186 N
Dynamically and adaptively prunes input channels (cin) conditioned on the input.
Prunes layer by layer, top-down.
#9.Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning:378 N
Energy-aware channel pruning.
After each layer is pruned, a least-squares fine-tune quickly recovers accuracy; once all layers are pruned, a global back-propagation fine-tune follows.
#10.ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression:688 CA P3
Filter pruning; uses statistics from the next layer to decide which filters of the current layer to prune. How does this differ from paper #5?
#11.Channel Pruning for Accelerating Very Deep Neural Networks:865 CA
What is the difference from paper #12? To read in detail.
#12.Learning Efficient Convolutional Networks Through Network Slimming:663 PA
Prunes input channels (cin), using the BN scaling factor to judge each channel's importance.
During training, an L1 penalty on the per-channel scale factors pushes them toward zero (sketch below).
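A sketch of the slimming recipe, assuming an L1 penalty on BatchNorm gammas during training and a hard threshold afterwards; the penalty strength, threshold, and stand-in loss are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
opt = torch.optim.SGD(model.parameters(), lr=0.1)
l1_lambda = 1e-4                                  # strength of the sparsity penalty (assumed)

for step in range(200):
    x = torch.randn(8, 3, 16, 16)
    loss = model(x).pow(2).mean()                 # stand-in for the task loss
    for m in model.modules():                     # L1 penalty on every BN scale factor (gamma)
        if isinstance(m, nn.BatchNorm2d):
            loss = loss + l1_lambda * m.weight.abs().sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

bn = model[1]
keep = (bn.weight.detach().abs() > 1e-2).nonzero().flatten()
print(f"keeping {keep.numel()} of {bn.weight.numel()} channels")
```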
2018
#13.Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers:136 TA P3
Prior work assumes that smaller weights or feature maps are less informative; this paper does not rely on that assumption.
Trains the model so that some channels produce constant outputs, then prunes those channels?
#14.To prune, or not to prune: exploring the efficacy of pruning for model compression:254 N
Large sparse networks outperform small dense networks.
Gradual pruning: the sparsity target grows over the course of training, so the network is sparsified continuously while it trains (see the schedule sketch below).
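A sketch of a gradual sparsity schedule in plain Python; the cubic ramp matches my reading of the paper, and the step counts are illustrative.

```python
def sparsity_at(step, s_initial=0.0, s_final=0.9, begin_step=0, end_step=10_000):
    """Target sparsity at a given training step (cubic ramp from s_initial to s_final)."""
    if step < begin_step:
        return s_initial
    if step >= end_step:
        return s_final
    progress = (step - begin_step) / (end_step - begin_step)
    return s_final + (s_initial - s_final) * (1.0 - progress) ** 3

for step in (0, 2_500, 5_000, 7_500, 10_000):
    print(step, round(sparsity_at(step), 3))
# Every N steps the smallest-magnitude weights are masked until the current
# target sparsity is reached, and the masks stay in place for the rest of training.
```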
#15.Discrimination-aware Channel Pruning for Deep Neural Networks:176 TA
To read in detail.
#16.Frequency-Domain Dynamic Pruning for Convolutional Neural Networks:27 N
#17.Learning Sparse Neural Networks via Sensitivity-Driven Regularization:21 N
Quantifies how sensitive the output is to each parameter (a correlation-like measure); introduces a regularization term that shrinks the absolute values of low-sensitivity parameters, and parameters falling below a threshold are set directly to zero.
Might later be useful for inter-layer adaptation; next to read.
#18.AMC: AutoML for Model Compression and Acceleration on Mobile Devices:414 T3
Uses AutoML; worth reading in detail later.
#19.Data-Driven Sparse Structure Selection for Deep Neural Networks:169 MA
Scales the output of a structure (group, block, or neuron) with a learnable scaling factor, applies sparsity regularization to that factor, optimizes with Accelerated Proximal Gradient, and then removes structures whose factor goes to zero (feels somewhat similar to paper #17).
#20.Coreset-Based Neural Network Compression:27 PA
No retraining needed; includes quantization and Huffman coding. Not sure what it really does.
#21.Constraint-Aware Deep Neural Network Compression:24 SkimCA
#22.A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers:111 CA
Skipped.
#23.PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning:167 PA
Prunes parameters; the freed-up parameters are then used to train new tasks.
#24.NISP: Pruning Networks using Neuron Importance Score Propagation:256 N
Earlier work only considers the error of one or two layers and ignores the effect on the whole network. This paper formulates a unified objective: minimize the reconstruction error of the important responses in the "final response layer" (FRL), the second-to-last layer before classification, and proposes Neuron Importance Score Propagation (NISP), which propagates the importance scores of the final responses back to every neuron in the network.
#25.CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization:78 N
Joint pruning and quantization; skipped.
#26.“Learning-Compression” Algorithms for Neural Net Pruning:61 N
Automatically learns how much to prune in each layer.
#27.Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks: 209 PA
(1) Pruned filters are not permanently fixed at zero.
(2) Can train from scratch, pruning while training.
Worth reading in detail.
#28.Accelerating Convolutional Networks via Global & Dynamic Filter Pruning:72 N
Global dynamic pruning; mistakenly pruned filters can be recovered.
2019
#29.Network Pruning via Transformable Architecture Search:38 PA
Both NAS and distillation.
#30.Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks:39 PA
Attaches a gate factor to each channel and removes a channel when its gate is 0; uses a Taylor expansion to estimate the effect on the loss of setting a gate to 0. Global pruning, with a Tick-Tock training framework.
#31.Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask:64 TA
Skipping for now.
#32.One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers:41 N
Skipping for now.
#33.Global Sparse Momentum SGD for Pruning Very Deep Neural Networks:15 PA
Global: finds the sparsity ratio of every layer; trains end-to-end; needs no retraining; works better than the lottery ticket approach.
#34.AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters:12 N
Weight pruning usually hurts robustness or requires prior knowledge to set the hyperparameters; this paper's AutoPrune method, which prunes via auxiliary parameters with new update rules, alleviates both problems. Still the three-step pre-train / prune / fine-tune pipeline.
#35.Model Compression with Adversarial Robustness: A Unified Optimization Framework:19 PA
Compression that does not sacrifice adversarial robustness.
#36.MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning:40 PA
#37.Accelerate CNN via Recursive Bayesian Pruning: 14 PA
Layer-by-layer pruning; worth reading in detail later.
#38.Adversarial Robustness vs Model Compression, or Both?:15 PA
About robustness; skipping for now.
#39.Learning Filter Basis for Convolutional Neural Network Compression:9 N
#40.Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration:163 PA
Prior work assumes small-valued filters are unimportant, which requires two preconditions: (1) the deviation of the filter norms is large, and (2) the smallest norms are close to zero. This paper instead proposes filter pruning based on the geometric median (see the sketch below).
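A sketch of the geometric-median idea: instead of dropping small-norm filters, drop the filters nearest the layer's geometric median, i.e. the most replaceable ones. The median is approximated here by the summed pairwise distance; the layer shape and pruning count are illustrative.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
flat = conv.weight.detach().flatten(start_dim=1)   # one row per output filter
dists = torch.cdist(flat, flat)                    # pairwise L2 distances between filters
redundancy = dists.sum(dim=1)                      # small sum => close to the geometric median

n_prune = 8
order = torch.argsort(redundancy)
prune_idx, keep_idx = order[:n_prune], order[n_prune:]
print("pruned (most replaceable) filters:", prune_idx.tolist())
print("surviving filters:", keep_idx.numel())
```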
#41.Towards Optimal Structured CNN Pruning via Generative Adversarial Learning:82 PA
Uses a GAN; skipped.
#42.Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure:37 PA
Skipped.
#43.On Implicit Filter Level Sparsity in Convolutional Neural Networks:11 PA
Skipped.
#44.Structured Pruning of Neural Networks with Budget-Aware Regularization:20 N
Can control how much and how fast to prune; also uses distillation. Skipped.
#45.Importance Estimation for Neural Network Pruning:80 PA
Estimates each neuron's effect on the final loss and iteratively removes the least important one. Uses first- and second-order Taylor expansions rather than per-layer sensitivity analysis.
Feels important; read in detail later.
#46.OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks:12 N
Earlier work only considers relationships within a layer, not between layers; this paper models consecutive layers jointly, i.e., the current layer's output channels and the next layer's input channels.
#47.Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search:35 TA
Trades off speed against accuracy; skipped.
#48.Variational Convolutional Neural Network Pruning:54 N
Variational Bayesian; no retraining needed.
#49.The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks:204 TA
Skipping for now.
#50.Rethinking the Value of Network Pruning:303 PA
Automatically determines the per-layer sparsity rates.
A network that is pruned and then fine-tuned can be worse than the same architecture trained from scratch; the pruned architecture should not reuse the weights of the previously trained model, so it should be trained from scratch.
#51.Dynamic Channel Pruning: Feature Boosting and Suppression:50 TA
Rather than removing structures outright as plain pruning does, FBS dynamically boosts important channels and skips unimportant ones.
#52.SNIP: Single-shot Network Pruning based on Connection Sensitivity:121 TA
Instead of train-then-prune, it prunes first and then trains from scratch. It still runs a connection-sensitivity analysis, essentially a first-order Taylor expansion followed by a softmax-style normalization, and prunes in a single shot (sketch below).
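A sketch of single-shot pruning at initialization, assuming the |gradient × weight| sensitivity computed on one mini-batch; the MLP, random batch, and keep ratio are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                                     # gradients at initialization

# sensitivity of each connection: |gradient * weight|, normalized over the network
scores = {id(p): (p.grad * p).abs() for p in model.parameters() if p.dim() >= 2}
total = sum(s.sum() for s in scores.values())
scores = {k: v / total for k, v in scores.items()}

keep_ratio = 0.1                                    # keep 10% of connections (assumed)
flat = torch.cat([v.flatten() for v in scores.values()])
threshold = torch.topk(flat, int(keep_ratio * flat.numel())).values.min()

with torch.no_grad():
    for p in model.parameters():
        if p.dim() >= 2:
            p.mul_((scores[id(p)] >= threshold).float())   # prune once, then train as usual
```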
#53.Dynamic Sparse Graph for Efficient Deep Learning:16 CUDA3
Can also be used for training; will read when there is time.
#54.Collaborative Channel Pruning for Deep Networks:28 N
Prunes channels; analyzes each channel's effect on the loss and uses CCP to approximate the Hessian matrix.
#55.Approximated Oracle Filter Pruning for Destructive CNN Width Optimization:28 N
Oracle pruning evaluates filter importance but is expensive and needs the resulting width to be specified in advance; this paper optimizes it with an approximated oracle.
#56.EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis:15 PA
Reparameterizes the network in the Kronecker-factored eigenbasis (KFE) and applies Hessian-based structured pruning on top of it.
#57.EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning:1 PA
Filter pruning; skipping for now.
#58.DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation:0 N
Makes the per-layer sparsity ratios differentiable, searches them with gradients, and can train from scratch.
#59.DHP: Differentiable Meta Pruning via HyperNetworks:2 PA
AutoML; skipped.
#60.Meta-Learning with Network Pruning:0 N
Applies pruning to meta-learning; skipped.
#61.Accelerating CNN Training by Pruning Activation Gradients:1 N
Most activation gradients in back-propagation are very small; prunes them with a stochastic scheme whose threshold is derived from the gradient distribution, with theoretical analysis (sketch below).
Read in detail later.
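A sketch of unbiased stochastic pruning of small gradients: anything below a threshold tau is either rounded up to ±tau with probability |g|/tau or zeroed, so the expectation is preserved. Picking tau as a simple percentile is my simplification of the distribution-based threshold.

```python
import torch

def stochastic_prune(grad, tau):
    """Zero small gradients, but round some up to +/-tau so the mean is preserved."""
    small = grad.abs() < tau
    keep = torch.rand_like(grad) < grad.abs() / tau       # keep with probability |g| / tau
    rounded = torch.sign(grad) * tau
    return torch.where(small, torch.where(keep, rounded, torch.zeros_like(grad)), grad)

g = torch.randn(1_000_000) * 1e-3                         # stand-in activation gradients
tau = g.abs().quantile(0.9)                               # simplified: 90th-percentile threshold
pruned = stochastic_prune(g, tau)
print("sparsity:", (pruned == 0).float().mean().item())
print("mean before/after:", g.mean().item(), pruned.mean().item())
```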
#62.DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search:1 N
NAS; skipped.
#63.Differentiable Joint Pruning and Quantization for Hardware Efficiency:0 N
Joint pruning and quantization; skipped.
#64.Channel Pruning via Automatic Structure Search:5 PA
Skipped.
#65.Adversarial Neural Pruning with Latent Vulnerability Suppression:3 N
Skipped.
#66.Proving the Lottery Ticket Hypothesis: Pruning is All You Need:14 N
A strengthened lottery ticket hypothesis.
#67.Soft Threshold Weight Reparameterization for Learnable Sparsity:6 PA
Automatically learns the sparsity thresholds.
#68.Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection:7 N
Rather than deleting neurons from the network, it greedily adds neurons from the original network into an initially empty one (see the sketch below).
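A toy sketch of greedy forward selection on a single hidden layer: start from an empty set and repeatedly add the trained neuron that lowers the loss the most. Freezing the output weights and the tiny random data are simplifications of my own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
w1 = torch.randn(64, 32)                    # stand-in for a trained hidden layer (64 neurons)
w2 = torch.randn(10, 64)                    # stand-in output layer, kept frozen here
x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))

def subnet_loss(selected):
    mask = torch.zeros(64)
    mask[list(selected)] = 1.0
    hidden = torch.relu(x @ w1.t()) * mask  # only the selected neurons are active
    return nn.functional.cross_entropy(hidden @ w2.t(), y).item()

selected, budget = set(), 16
for _ in range(budget):
    best = min((n for n in range(64) if n not in selected),
               key=lambda n: subnet_loss(selected | {n}))
    selected.add(best)
print(sorted(selected), subnet_loss(selected))
```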
#69.Operation-Aware Soft Channel Pruning using Differentiable Masks:0 N
Skipped.
#70.DropNet: Reducing Neural Network Complexity via Iterative Pruning:0 N
Each round removes the node whose post-activation value, averaged over the training samples, is lowest.
#71.Towards Efficient Model Compression via Learned Global Ranking:1 F
Learns a global ranking of filters across layers; pruning the lowest-ranked filters yields a set of architectures with different accuracy/latency trade-offs.
#72.HRank: Filter Pruning using High-Rank Feature Map:18 PA
Skipped.
#73.Neural Network Pruning with Residual-Connections and Limited-Data:2 N
Skipped.
#74.Multi-Dimensional Pruning: A Unified Framework for Model Compression:1 N
Skipped.
#75.DMCP: Differentiable Markov Channel Pruning for Neural Networks:0 TA
Skipped.
#76.Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression:8 PA
Uses low-rank decomposition and pruning together; global compression.
#77.Few Sample Knowledge Distillation for Efficient Network Compression:8 N
Distillation; skipped.
#78.Discrete Model Compression With Resource Constraint for Deep Neural Networks:1 N
Skipped.
#79.Structured Compression by Weight Encryption for Unstructured Pruning and Quantization:2 N
Encrypts the unstructured sparse weights and decodes them with XOR gates at inference time.
#80.Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration:2 N
Each layer adaptively selects a different pruning criterion.
#81.APQ: Joint Search for Network Architecture, Pruning and Quantization Policy:7
Joint NAS, pruning, and quantization.
#82.Comparing Rewinding and Fine-tuning in Neural Network Pruning:23 TA
Compares the rewinding and fine-tuning approaches.
#83.A Signal Propagation Perspective for Pruning Neural Networks at Initialization:14 N
Explains why pruning a network that has only been initialized, before any training, can be effective.
#84.ProxSGD: Training Structured Neural Networks under Regularization and Constraints:1 TA PA
Skipped.
#85.One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation:2 N
One-shot pruning of RNNs.
#86.Lookahead: A Far-sighted Alternative of Magnitude-based Pruning:5 PA
Magnitude-based pruning does minimize the Frobenius distortion of a single linear layer; the authors extend this single-layer optimization to multiple layers and propose a simple method, lookahead pruning.
#87.Dynamic Model Pruning with Feedback:9 N
Reactivates weights that were pruned early on, via a feedback mechanism.
#89.Provable Filter Pruning for Efficient Neural Networks:9 N
Skipped.
#90.Data-Independent Neural Pruning via Coresets:5 N
Skipped.
#91.AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates:13 N
Skipped.
#92.DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks
Guided by the obtained insights, the paper first proposes a block-based maximum weighted masking (BMWM) method that effectively preserves salient weights while imposing a high degree of regularity on the weight matrix; as a further optimization, it proposes a density-adaptive regular-block (DARB) pruning method.