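LeNet was proposed in 1998 by Yann LeCun, one of the three pioneers of deep learning, as a neural network for recognizing and classifying handwritten characters. For details, see: http://yann.lecun.com/exdb/lenet/index.html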
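LeNet matters in the history of deep learning for the following reason. Before LeNet, character recognition relied on hand-designed feature-extraction algorithms whose output was fed into a traditional machine learning model, so the model classified hand-extracted features rather than the raw image data. LeNet showed that a model can extract features automatically, and that the model's internal feature representation is better than hand-crafted features and improves classification accuracy. This makes it possible to feed raw data straight into the model for classification: LeNet classifies the raw image data.
LeNet-5 layer 1: convolutional layer C1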
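C1 is a convolutional layer with 6 filters (num_filters) that maps 1 input channel to 6 output channels. Each filter (filter_size) is 5x5 and the window slides one pixel at a time (stride=1), so each feature map has a side length of (32-5)/1+1=28, i.e. 28x28. C1 has (5x5+1)x6=156 trainable parameters.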
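The two calculations for C1 can be checked with a short sketch. The helper names `conv_output_size` and `conv_param_count` are hypothetical and not part of PaddlePaddle, and the `padding` argument is an assumption added for generality (C1 uses no padding).

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Side length of the feature map produced by a square convolution."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def conv_param_count(in_channels, num_filters, filter_size):
    """Weights plus one bias per filter."""
    return (filter_size * filter_size * in_channels + 1) * num_filters


print(conv_output_size(32, 5, stride=1))  # 28
print(conv_param_count(1, 6, 5))          # 156
```

The same helpers apply to any other layer by changing the arguments.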
LeNet-5 layer 2: pooling layer S2
S2 is a downsampling layer, i.e. a pooling layer (Pool), with a 2x2 pooling window. It reduces the amount of data.
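To make the effect of a 2x2 pooling window concrete, here is a minimal pure-Python sketch. It assumes max pooling with stride 2, which is what the PaddlePaddle code in this article uses; `max_pool_2x2` is a hypothetical helper name.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a list-of-lists feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]


fm = [[1, 2, 3, 4],
      [5, 6, 7, 8],
      [9, 10, 11, 12],
      [13, 14, 15, 16]]
pooled = max_pool_2x2(fm)
print(pooled)  # [[6, 8], [14, 16]]
```

Each side is halved, so a 28x28 feature map becomes 14x14 and the amount of data drops to a quarter.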
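LeNet-5 layer 3: convolutional layer C3
C3 is a convolutional layer with 16 filters that maps 6 input channels to 16 output channels, meaning each filter has 6 input channels and there are 16 filters in total. Each filter (filter_size) is 5x5 and the window slides one pixel at a time (stride=1).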
LeNet-5 layer 4: pooling layer S4
S4 is a downsampling layer, i.e. a pooling layer (Pool), with a 2x2 pooling window. It reduces the amount of data.
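LeNet-5 layer 5: convolutional layer C5
C5 is a convolutional layer. Because its input is 5x5 and its kernel is also 5x5, each output feature map is 1x1, so the convolution effectively acts as a flattened, fully connected layer.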
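The claim that a 5x5 kernel on a 5x5 input collapses to a single value can be illustrated with a single-channel sketch. The function `conv2d_valid` and the sample `image` and `kernel` values are made up for illustration. The real C5 sums over 16 input channels and has 120 filters.

```python
def conv2d_valid(image, kernel):
    """Single-channel convolution (cross-correlation), no padding, stride 1."""
    k = len(kernel)
    out = len(image) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out)]
            for i in range(out)]


# Illustrative 5x5 input and 5x5 kernel (values are arbitrary).
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1 if r == c else 0 for c in range(5)] for r in range(5)]

feature_map = conv2d_valid(image, kernel)

# The same number as a dot product of the flattened input and weights,
# which is what a fully connected unit computes (bias omitted).
flat_dot = sum(x * w
               for row_x, row_w in zip(image, kernel)
               for x, w in zip(row_x, row_w))

print(feature_map)  # [[60]]
print(flat_dot)     # 60
```

The kernel fits the input in exactly one position, so the output is 1x1 and equals the weighted sum of all inputs.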
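LeNet-5 layer 6: fully connected layer F6
F6 is a fully connected layer with 120 inputs and 84 outputs.
LeNet-5 layer 7: fully connected layer Output
It has 84 inputs and 10 outputs.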
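The seven layers described above can be traced end to end with a small sketch that needs no deep learning framework. It assumes 5x5 convolutions with stride 1, 2x2 pooling with stride 2, and that every C3 filter sees all 6 input channels (as `Conv2D(6, 16, 5)` does). `trace_lenet` is a hypothetical helper name.

```python
def trace_lenet(input_size=32, num_classes=10):
    """Return (layer, output shape, parameter count) for each LeNet-5 layer."""
    rows = []
    channels, size = 1, input_size
    # (name, kind, output channels)
    for name, kind, out_channels in [("C1", "conv", 6), ("S2", "pool", 6),
                                     ("C3", "conv", 16), ("S4", "pool", 16),
                                     ("C5", "conv", 120)]:
        if kind == "conv":
            params = (5 * 5 * channels + 1) * out_channels
            size = (size - 5) // 1 + 1
        else:
            params = 0  # max pooling has no trainable parameters
            size = (size - 2) // 2 + 1
        channels = out_channels
        rows.append((name, (channels, size, size), params))
    features = channels * size * size  # flattened C5 output
    for name, out_features in [("F6", 84), ("Output", num_classes)]:
        rows.append((name, (out_features,), (features + 1) * out_features))
        features = out_features
    return rows


rows = trace_lenet()
for name, shape, params in rows:
    print(name, shape, params)
total_params = sum(params for _, _, params in rows)
print("total", total_params)
```

For a 32x32 input this gives 6@28x28, 6@14x14, 16@10x10, 16@5x5 and 120@1x1 for the convolutional part, followed by 84 and 10 outputs.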
Based on this, the LeNet network structure can be defined in PaddlePaddle as follows:
import numpy as np
import paddle
import paddle.fluid as fluid
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, Linear

# Define the LeNet network structure
class LeNet(fluid.dygraph.Layer):
    def __init__(self, num_classes=1):
        super().__init__()
        # C1: 1@32x32 -> feature maps 6@28x28
        self.conv1 = Conv2D(1, 6, 5, act='sigmoid')
        # S2: max pooling, 6@28x28 -> 6@14x14
        self.maxpool1 = Pool2D(pool_size=2, pool_type='max', pool_stride=2)
        # C3: 6@14x14 -> feature maps 16@10x10
        self.conv2 = Conv2D(6, 16, 5, act='sigmoid')
        # S4: max pooling, 16@10x10 -> 16@5x5
        self.maxpool2 = Pool2D(pool_size=2, pool_type='max', pool_stride=2)
        # C5: 16@5x5 -> feature maps 120@1x1
        self.conv3 = Conv2D(16, 120, 5, act='sigmoid')
        # F6: fully connected, 120 -> 84
        self.fc1 = Linear(120, 84, act='sigmoid')
        # Output: fully connected, 84 -> num_classes
        self.fc2 = Linear(84, output_dim=num_classes)

    # Define the forward pass
    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        # Flatten [N, 120, 1, 1] to [N, 120] before the fully connected layers
        x = fluid.layers.reshape(x, (x.shape[0], -1))
        x = self.fc1(x)
        x = self.fc2(x)
        return x
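Neural network computation is very sensitive to the shape of the data. LeNet expects 32x32 input, so the code below builds test data and prints information about each layer of the network: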
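# Inspect the output shape of every LeNet-5 layer
x = np.random.randn(*[20, 1, 32, 32]).astype('float32')
with fluid.dygraph.guard():
    model = LeNet(num_classes=10)
    print(model.sublayers())
    x = fluid.dygraph.to_variable(x)
    for item in model.sublayers():
        try:
            x = item(x)
        except Exception:
            # The layer rejected the 4-D tensor (expected at the first Linear
            # layer): flatten to [N, -1] and call the layer again
            print(x.shape)
            x = fluid.layers.reshape(x, (x.shape[0], -1))
            print(x.shape)
            x = item(x)
            print(x.shape)
        if len(item.parameters()) == 2:
            # Layers with a weight and a bias: print both parameter shapes
            print(item.full_name(), x.shape, item.parameters()[0].shape, item.parameters()[1].shape)
        else:
            # Pooling layers have no parameters
            print(item.full_name(), x.shape)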
Output:
W0113 16:56:11.127960 4340 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 11.1, Runtime API Version: 10.0
W0113 16:56:11.134941 4340 device_context.cc:260] device: 0, cuDNN Version: 7.6.
[<paddle.fluid.dygraph.nn.Conv2D object at 0x00000199026A6108>, <paddle.fluid.dygraph.nn.Pool2D object at 0x00000199026A6228>, <paddle.fluid.dygraph.nn.Conv2D object at 0x00000199026A6288>, <paddle.fluid.dygraph.nn.Pool2D object at 0x00000199026A63A8>, <paddle.fluid.dygraph.nn.Conv2D object at 0x00000199026A6408>, <paddle.fluid.dygraph.nn.Linear object at 0x00000199026A6528>, <paddle.fluid.dygraph.nn.Linear object at 0x00000199026A6648>]
conv2d_0 [20, 6, 28, 28] [6, 1, 5, 5] [6]
pool2d_0 [20, 6, 14, 14]
conv2d_1 [20, 16, 10, 10] [16, 6, 5, 5] [16]
pool2d_1 [20, 16, 5, 5]
conv2d_2 [20, 120, 1, 1] [120, 16, 5, 5] [120]
[20, 120, 1, 1]
[20, 120]
[20, 84]
linear_0 [20, 84] [120, 84] [84]
linear_1 [20, 10] [84, 10] [10]
If the test data is changed to:
x = np.random.randn(*[20,1,28,28]).astype('float32')
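the run fails with a data shape mismatch error. With a 28x28 input, the feature map after S4 is 16@4x4, which is smaller than the 5x5 kernel of C5, so the first call to conv3 fails. The except branch then flattens the tensor to [20, 256] (16x4x4=256) and calls conv3 again, which passes a 2-D tensor to the convolution and produces the message below.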
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: The input of Op(Conv) should be a 4-D or 5-D Tensor. But received: input's dimension is 2, input's shape is [20, 256].
[Hint: Expected in_dims.size() == 4 || in_dims.size() == 5 == true, but received in_dims.size() == 4 || in_dims.size() == 5:0 != true:1.] at (D:\1.8.5\paddle\paddle\fluid\operators\conv_op.cc:59)
[operator < conv2d > error]
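A quick way to see which input sizes the network accepts is to follow the side length through the convolution and pooling layers before running the model. This is a minimal sketch that assumes the kernel sizes and strides used in this article. `check_input` is a hypothetical helper, not a PaddlePaddle function.

```python
def check_input(size):
    """Follow a square input through C1, S2, C3, S4 and C5."""
    # (layer name, kernel size, stride)
    steps = [("C1", 5, 1), ("S2", 2, 2), ("C3", 5, 1), ("S4", 2, 2), ("C5", 5, 1)]
    for name, kernel, stride in steps:
        if size < kernel:
            return (f"{name} fails: {size}x{size} input is smaller "
                    f"than the {kernel}x{kernel} kernel")
        size = (size - kernel) // stride + 1
    return f"ok: C5 output is {size}x{size}"


print(check_input(32))  # ok: C5 output is 1x1
print(check_input(28))  # C5 fails: 4x4 input is smaller than the 5x5 kernel
```

For 28x28 input the sizes go 24, 12, 8, 4, so C5 receives 16@4x4, which is the source of the 256 in the shape [20, 256] reported above.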