Paper: Decoupled Networks
Core Idea
The paper starts from the observation that convolutional networks can model intra-class distance (small) and inter-class distance (large) well, but that vanilla convolution is not the optimal way to do so. By interpreting convolution as an inner product, the paper decomposes that inner product and thereby decouples the two.
The concrete motivation is shown in the figure above. Samples of the same class, e.g. the 8s, all lie along one principal axis in feature space; samples of different classes, e.g. 1, 4, and 5, differ greatly in angle. Once convolution is viewed as an inner product, the inner product expands into the product of the feature norms and the cosine of the angle between them, where the norm corresponds to intra-class distance and the angle to inter-class distance. Concretely:

f(w, x) = ⟨w, x⟩ = ‖w‖ ‖x‖ cos θ_(w,x)
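This decomposition can be checked numerically. A minimal sketch with toy vectors (the data and variable names here are illustrative, not from the paper):

```python
import numpy as np

# Toy example: treat one filter and one image patch as flattened vectors
# (illustrative data, not from the paper).
w = np.array([1.0, -2.0, 0.5, 3.0])   # flattened filter
x = np.array([0.5, 1.0, -1.0, 2.0])   # flattened input patch

inner = np.dot(w, x)                   # what convolution computes at one location
norm_w = np.linalg.norm(w)             # magnitude term (intra-class distance)
norm_x = np.linalg.norm(x)
cos_theta = inner / (norm_w * norm_x)  # angular term (inter-class distance)

# The inner product factors exactly into the norms times the cosine of the angle.
assert np.isclose(inner, norm_w * norm_x * cos_theta)
```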
The authors argue that convolution in this coupled form cannot separate the intra-class and inter-class relations well. They therefore generalize it to a decoupled form, using a function h for the magnitude (intra-class) part and a function g for the angular (inter-class) part:

f_d(w, x) = h(‖w‖, ‖x‖) · g(θ_(w,x))

and experiment with different choices of h and g. The visualization is shown below:
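A hedged sketch of this generalization (the function names and toy data below are mine; the paper's operator family includes instances such as SphereConv, where the magnitude term is held constant so the output depends only on the angle):

```python
import numpy as np

def decoupled(w, x, h, g):
    """Generalized decoupled operator f_d(w, x) = h(|w|, |x|) * g(theta).
    A toy sketch of the paper's decoupled form, not its implementation."""
    nw, nx = np.linalg.norm(w), np.linalg.norm(x)
    cos_t = np.dot(w, x) / (nw * nx)
    theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
    return h(nw, nx) * g(theta)

# Ordinary convolution as a special case: h = |w||x|, g = cos(theta).
h_conv = lambda nw, nx: nw * nx
# SphereConv-style bounded operator: magnitude fixed to a constant alpha.
alpha = 1.0
h_sphere = lambda nw, nx: alpha

w = np.array([1.0, 2.0, 2.0])   # |w| = 3
x = np.array([2.0, 0.0, 0.0])   # |x| = 2, so cos(theta) = 1/3

print(decoupled(w, x, h_conv, np.cos))    # matches np.dot(w, x) up to rounding
print(decoupled(w, x, h_sphere, np.cos))  # bounded by alpha, angle-only
```

Choosing a bounded h (as in SphereConv) is what the paper links to faster convergence and robustness; an unbounded h keeps more representational power.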
In the figure, the green arrows are the original feature vectors, the red ones are the transformed feature vectors, and the black double-headed arrows indicate how the angle is measured. Different transformations of the norm and the angle yield different visualizations; see the original paper for the detailed comparison experiments.
A Thought
I am not fully convinced that treating convolution as equivalent to an inner product is appropriate.
Writing Takeaways
1. Inspired by the observation that CNN-learned features are naturally decoupled with the norm of features corresponding to the intra-class variation and the angle corresponding to the semantic difference, we propose a generic decoupled learning framework which models the intra-class variation and semantic difference independently.
2. Extensive experiments show that such decoupled reparameterization renders significant performance gain with easier convergence and stronger robustness.
3. Convolutional neural networks have pushed the boundaries on a wide variety of vision tasks
4. Despite these advances, understanding how convolution naturally leads to discriminative representation and good generalization remains an interesting problem.
5. Our direct intuition comes from the observation in Fig.
6. On top of the idea to decouple the norm and the angle in an inner product, we propose a novel decoupled network (DCNet) by generalizing traditional inner product based convolution operators to ....
7. We present multiple instances for each type of decoupled operators. Empirically, the bounded operators may yield faster convergence and better robustness against adversarial attacks, and the unbounded operators may have better representational power.
Closing
This summary and the English-writing notes are mainly for my own improvement; if the paper itself interests you, please read the original.
https://arxiv.org/pdf/1804.08071.pdf