DCGANs -- Deep Convolutional Generative Adversarial Networks
DCGAN reference:
Radford et al., 2016, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", Facebook AI Research
This paper by Radford et al. is a must-read classic in the GAN field; DCGAN is essentially a fusion of CNNs and GANs. Before introducing the model, here are the contributions the authors list in the paper:
• We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN)
• We use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms.
• We visualize the filters learnt by GANs and empirically show that specific filters have learned to draw specific objects.
• We show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.
DCGAN was the breakthrough that followed Ian Goodfellow's original GAN proposal in 2014. Before it, GAN training was known to be very unstable and prone to producing meaningless outputs. By imposing a set of constraints on the CNN topology, DCGAN greatly improved training stability; in addition, using the trained discriminator for image classification tasks also achieved good results.
Details of the DCGAN architecture guidelines (a minimal code sketch of all five follows the list):
1. Replace all deterministic spatial pooling layers (such as max pooling) with strided convolutions, letting the network learn its own spatial downsampling.
2. Use transposed convolutions (fractionally-strided convolutions) in the generator for upsampling.
3. Eliminate fully connected hidden layers. (The first layer of the generator, which takes a uniform noise distribution Z as input, could be called fully connected since it is just a matrix multiplication, but the result is reshaped into a 4-dimensional tensor and used as the start of the convolution stack. For the discriminator, the last convolutional layer is flattened and fed into a single sigmoid output.)
4. Use batch normalization everywhere except the generator's output layer and the discriminator's input layer; applying it to those two layers causes sample oscillation and model instability. BatchNorm stabilizes learning by normalizing the input to each unit to zero mean and unit variance, which helps with poorly initialized networks and improves gradient flow in deep models.
5. Use ReLU activations in the generator for all layers except the output, which uses Tanh. Use LeakyReLU activations in the discriminator for all layers.
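To make these guidelines concrete, here is a minimal PyTorch sketch of a DCGAN generator and discriminator for 64x64 RGB images. This is my own illustration, not the paper's code: the latent size nz=100 matches the paper, while the channel widths ngf/ndf follow the usual convention.

```python
import torch
import torch.nn as nn

nz, ngf, ndf, nc = 100, 64, 64, 3  # latent dim, G/D base widths, image channels

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # Guideline 3: no FC layer; project z via a transposed conv and treat
            # the (N, ngf*8, 4, 4) result as the start of the convolution stack.
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),       # 1x1 -> 4x4
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),  # 4x4 -> 8x8
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 8x8 -> 16x16
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 16x16 -> 32x32
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # Guidelines 4 and 5: no BatchNorm on the output layer; Tanh output.
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),           # 32x32 -> 64x64
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # Guidelines 1 and 4: strided conv instead of pooling,
            # and no BatchNorm on the input layer.
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),           # 64x64 -> 32x32
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),      # 32x32 -> 16x16
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),  # 16x16 -> 8x8
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),  # 8x8 -> 4x4
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # Guideline 3: the final conv collapses to a single sigmoid output.
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),        # 4x4 -> 1x1
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)

# Quick shape check: 8 latent vectors -> 8 fake images -> 8 real/fake scores.
z = torch.randn(8, nz, 1, 1)
fake = Generator()(z)          # (8, 3, 64, 64)
score = Discriminator()(fake)  # (8,)
```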
Beyond the architecture guidelines, the paper also demonstrates the effectiveness of GANs as feature extractors: training an L2-SVM classifier on top of the discriminator's features yields solid results on both the CIFAR-10 and SVHN datasets. A rough sketch of this pipeline is shown below.
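The paper's recipe is to max-pool each of the discriminator's convolutional layer representations down to a 4x4 spatial grid, flatten and concatenate them into one feature vector per image, and train a regularized linear L2-SVM on top. Here is a rough sketch of that recipe, reusing the Discriminator class from above; sklearn's LinearSVC (squared hinge loss by default) stands in for the L2-SVM, and the dataset variables are assumed to exist.

```python
import torch
import torch.nn.functional as F
from sklearn.svm import LinearSVC

def extract_features(disc, images):
    """Max-pool each conv block's activations to 4x4, flatten, concatenate."""
    feats = []
    x = images
    with torch.no_grad():
        for layer in disc.main:
            x = layer(x)
            if isinstance(layer, torch.nn.LeakyReLU):  # post-activation map of each block
                pooled = F.adaptive_max_pool2d(x, 4)   # (N, C, 4, 4)
                feats.append(pooled.flatten(start_dim=1))
    return torch.cat(feats, dim=1).numpy()

# Usage sketch: netD is a trained Discriminator; X_train/y_train/X_test/y_test
# are hypothetical CIFAR-10 image tensors (scaled to [-1, 1]) and labels.
# clf = LinearSVC().fit(extract_features(netD, X_train), y_train)
# print(clf.score(extract_features(netD, X_test), y_test))
```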
The paper closes with an investigation and visualization of DCGAN's internal feature representations: by continuously varying the input latent vector, the authors probe how the latent space shapes the generated images, and check whether the network has learned semantics or merely memorized the training images. These experiments are discussed in the paper's 'walking in the latent space' section, so I won't record them in detail here; a small interpolation sketch follows.
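The flavor of those latent-space walks is easy to reproduce: interpolate linearly between two latent points and decode every intermediate vector. Smooth transitions between the generated images suggest learned semantic structure rather than memorization. A minimal sketch, again reusing Generator and nz from the block above; netG would be a trained model in practice.

```python
import torch

netG = Generator()                 # in practice, load trained weights here
z0 = torch.randn(1, nz, 1, 1)
z1 = torch.randn(1, nz, 1, 1)
steps = torch.linspace(0.0, 1.0, 9).view(-1, 1, 1, 1)
z_path = z0 + steps * (z1 - z0)    # 9 evenly spaced points on the line z0 -> z1
with torch.no_grad():
    frames = netG(z_path)          # (9, 3, 64, 64): view side by side to judge
                                   # whether the morph between images is smooth
```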
Finally, the best way to really get a handle on DCGAN is still to implement it yourself.
Resources
Paper: http://arxiv.org/abs/1511.06434
GitHub code: https://github.com/Newmu/dcgan_code