50 Exercises: A True One-Article Introduction to TensorFlow 2.x


(This article is written for TensorFlow 2.x.)

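Hello everyone. After more than a month of squeezing it out bit by bit, the TensorFlow 2.x tutorial is finally done. It is built on TensorFlow's low-level API. If you want to learn the high-level Keras API instead, go straight to my Keras tutorial, "40题刷爆Keras，人生苦短我选Keras".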

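TensorFlow is Google's second-generation machine learning system, developed as the successor to DistBelief. Its name reflects how it works: a tensor is an N-dimensional array, and flow refers to computation based on dataflow graphs, so "TensorFlow" describes tensors flowing from one end of a graph to the other as the computation proceeds. In other words, it is a system that feeds complex data structures into neural networks for analysis and processing. TensorFlow is used in many machine learning and deep learning fields, such as speech recognition and image recognition. It improves in many respects on DistBelief, the deep learning infrastructure developed in 2011, and it runs on devices ranging from a single smartphone to thousands of data-center servers. TensorFlow is fully open source and free for anyone to use.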


# Import the required libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import pickle

1. Import the tensorflow library under the alias tf and print its version

import tensorflow as tf

tf.__version__

'2.1.0'

I. Tensors

Constants

2. Create a 3x3 constant tensor of zeros

c = tf.zeros([3, 3])

3. Create a constant tensor of ones with the same shape as the tensor from the previous exercise

tf.ones_like(c)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

4. Create a 2x3 constant tensor in which every value is 6

tf.fill([2, 3], 6)  # 2x3 constant tensor filled with 6

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[6, 6, 6],
       [6, 6, 6]], dtype=int32)>

5. Create a 3x3 tensor of random values (drawn here from a standard normal distribution)

tf.random.normal([3,3])

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[-1.1090602 , -0.14372216, -2.0020268 ],
       [-1.246778  , -0.155268  ,  1.3298218 ],
       [-0.47514197, -0.49891278,  0.6524196 ]], dtype=float32)>

6. Create a constant tensor from a two-dimensional array

a = tf.constant([[1, 2], [3, 4]])
a

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]], dtype=int32)>

7. Extract the NumPy array from a tensor

a.numpy()

array([[1, 2],
       [3, 4]], dtype=int32)

8. Take 5 evenly spaced values from 1.0 to 10.0 to form a constant tensor

tf.linspace(1.0, 10.0, 5)

<tf.Tensor: shape=(5,), dtype=float32, numpy=array([ 1.  ,  3.25,  5.5 ,  7.75, 10.  ], dtype=float32)>

9. Starting at 1, take every second integer, stopping before the limit of 10 is reached

tf.range(start=1, limit=10, delta=2)

<tf.Tensor: shape=(5,), dtype=int32, numpy=array([1, 3, 5, 7, 9], dtype=int32)>
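
The difference between exercises 8 and 9 is easy to miss: tf.linspace fixes the number of points and includes both endpoints, while tf.range fixes the step and excludes the limit. The plain-Python sketch below reproduces both sequences. The helper name linspace_values is hypothetical, and this is only an illustration of the arithmetic, not how TensorFlow implements either function.

```python
def linspace_values(start, stop, num):
    """Return num evenly spaced values from start to stop, both included."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

# Fixed count, both endpoints included (mirrors exercise 8)
evenly_spaced = linspace_values(1.0, 10.0, 5)

# Fixed step, limit excluded (mirrors exercise 9)
stepped = list(range(1, 10, 2))

print(evenly_spaced)
print(stepped)
```

Both lists match the tensors printed above, which is a quick way to convince yourself which endpoint rules apply.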

Operations

10. Add two tensors

a + a

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[2, 4],
       [6, 8]], dtype=int32)>

11. Multiply two tensors as matrices

tf.matmul(a, a)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 7, 10],
       [15, 22]], dtype=int32)>

12. Multiply two tensors element-wise

tf.multiply(a, a)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 1,  4],
       [ 9, 16]], dtype=int32)>
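
Exercises 11 and 12 are often confused, so here is the same pair of operations in NumPy for comparison. This is a minimal sketch assuming the same 2x2 matrix as above; the variable names are illustrative only.

```python
import numpy as np

a_demo = np.array([[1, 2], [3, 4]])

# Matrix product: each entry is a row of a_demo dotted with a column of a_demo
matrix_product = a_demo @ a_demo

# Element-wise product: each entry is multiplied by the entry in the same position
elementwise_product = a_demo * a_demo

print(matrix_product)
print(elementwise_product)
```

The `@` operator corresponds to tf.matmul and `*` corresponds to tf.multiply, and the results agree with the two tensors printed above.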

13. Transpose a tensor

tf.linalg.matrix_transpose(c)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]], dtype=float32)>

14. Reshape a 12-element tensor into a tensor with 3 rows

b = tf.linspace(1.0, 10.0, 12)
tf.reshape(b,[3,4])

# Method 2: pass -1 and let the remaining dimension be inferred
tf.reshape(b,[3,-1])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 1.       ,  1.8181818,  2.6363635,  3.4545455],
       [ 4.272727 ,  5.090909 ,  5.909091 ,  6.7272725],
       [ 7.5454545,  8.363636 ,  9.181818 , 10.       ]], dtype=float32)>
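
The -1 in the second method tells reshape to work out that dimension from the total number of elements. A rough sketch of that inference in plain Python follows; infer_shape is a hypothetical helper written for illustration, not a TensorFlow function.

```python
def infer_shape(total_elements, shape):
    """Replace a single -1 in shape with the size that makes the element count match."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != total_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)
    if known == 0 or total_elements % known != 0:
        raise ValueError("cannot infer the missing dimension")
    return [total_elements // known if dim == -1 else dim for dim in shape]

print(infer_shape(12, [3, -1]))
```

With 12 elements and 3 rows, the only consistent column count is 4, which is why both methods give the same result.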

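II. Automatic Differentiation

This part computes the derivative of $y=x^2$ at $x=1$.

Variables

15. Create a variable holding the single value 1.0 (shape (1,))

x = tf.Variable([1.0])  # create the variable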
x

<tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([1.], dtype=float32)>

16. Create a GradientTape to track gradients, and write the expression to differentiate inside it

with tf.GradientTape() as tape:  # record operations for differentiation
    y = x * x

17. Compute the derivative of y with respect to x

grad = tape.gradient(y, x)  # compute the gradient
grad

<tf.Tensor: shape=(1,), dtype=float32, numpy=array([2.], dtype=float32)>
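
A result from automatic differentiation can be sanity-checked with a numerical approximation. The sketch below uses a central difference in plain Python; the function names and the step size h are assumptions chosen for illustration.

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate f'(x) with the central difference (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    return x * x

approx = numerical_derivative(square, 1.0)
print(approx)
```

Analytically $dy/dx = 2x$, so the value at $x=1$ should be very close to 2, in agreement with the gradient returned by the tape.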

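III. Linear Regression Example

This part generates 100 data points along $y=3x+2$ with random noise added, then fits a line to them.

18. Generate the data X and y: X is 100 random numbers, and y = 3X + 2 + noise, where noise is also 100 random numbers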

X = tf.random.normal([100, 1]).numpy()
noise = tf.random.normal([100, 1]).numpy()

y = 3*X+2+noise

Visualize the points

plt.scatter(X, y)

image

19. Create the parameters W and b to be learned (as variable tensors)

W = tf.Variable(np.random.randn())
b = tf.Variable(np.random.randn())

print('W: %f, b: %f'%(W.numpy(), b.numpy()))

W: 0.546446, b: -0.565772

20. Define the linear regression model

def linear_regression(x):
    return W * x + b

21. Define the loss function. Here we use the mean of the squared differences between the true and predicted values: $loss=\frac{1}{n}\sum^n_{i=1} (y_i-h(x_i))^2$

def mean_square(y_pred, y_true):
    return tf.reduce_mean(tf.square(y_pred-y_true))

22. Create a GradientTape and record the computation to be differentiated

with tf.GradientTape() as tape:
    pred = linear_regression(X)
    loss = mean_square(pred, y)

23. Compute the partial derivatives of loss with respect to W and b

dW, db = tape.gradient(loss, [W, b])
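
For this particular model the gradients can also be written by hand: $\frac{\partial loss}{\partial W}=\frac{2}{n}\sum_{i=1}^n (Wx_i+b-y_i)\,x_i$ and $\frac{\partial loss}{\partial b}=\frac{2}{n}\sum_{i=1}^n (Wx_i+b-y_i)$. The NumPy sketch below computes them on freshly generated data and compares them against finite differences. All names ending in _demo, the seed, and the starting values are assumptions for illustration; the numbers will not match the run above.

```python
import numpy as np

def mse_loss(W, b, X, y):
    """Mean squared error of the line W*X + b against y."""
    return np.mean((W * X + b - y) ** 2)

def mse_gradients(W, b, X, y):
    """Closed-form partial derivatives of the MSE with respect to W and b."""
    residual = W * X + b - y
    dW = 2.0 * np.mean(residual * X)
    db = 2.0 * np.mean(residual)
    return dW, db

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 1))
y_demo = 3 * X_demo + 2 + rng.normal(size=(100, 1))

W_demo, b_demo = 0.5, -0.5
dW_demo, db_demo = mse_gradients(W_demo, b_demo, X_demo, y_demo)

# Finite-difference check of both partial derivatives
eps = 1e-6
dW_numeric = (mse_loss(W_demo + eps, b_demo, X_demo, y_demo)
              - mse_loss(W_demo - eps, b_demo, X_demo, y_demo)) / (2 * eps)
db_numeric = (mse_loss(W_demo, b_demo + eps, X_demo, y_demo)
              - mse_loss(W_demo, b_demo - eps, X_demo, y_demo)) / (2 * eps)

print(dW_demo, dW_numeric)
print(db_demo, db_numeric)
```

The value of GradientTape is that it produces these derivatives without anyone deriving them, which matters once the model is more complicated than a line.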

24. Update W and b with plain gradient descent, using a learning_rate of 0.1

W.assign_sub(0.1*dW)
b.assign_sub(0.1*db)
print('W: %f, b: %f'%(W.numpy(), b.numpy()))

W: 0.958635, b: -0.127122

25. The steps above make up a single iteration. Now run 20 more iterations in a loop and record loss, W, and b at each step

for i in range(20):
    with tf.GradientTape() as tape:
        pred = linear_regression(X)
        loss = mean_square(pred, y)
        
    dW, db = tape.gradient(loss, [W, b])

    W.assign_sub(0.1*dW)
    b.assign_sub(0.1*db)

    print("step: %i, loss: %f, W: %f, b: %f" % (i+1, loss, W.numpy(), b.numpy()))

step: 1, loss: 8.821126, W: 1.306481, b: 0.235136
step: 2, loss: 6.507570, W: 1.599810, b: 0.534510
step: 3, loss: 4.896123, W: 1.846998, b: 0.782077
step: 4, loss: 3.773298, W: 2.055174, b: 0.986930
step: 5, loss: 2.990682, W: 2.230394, b: 1.156539
step: 6, loss: 2.445040, W: 2.377799, b: 1.297045
step: 7, loss: 2.064524, W: 2.501743, b: 1.413504
step: 8, loss: 1.799106, W: 2.605914, b: 1.510081
step: 9, loss: 1.613936, W: 2.693432, b: 1.590208
step: 10, loss: 1.484730, W: 2.766930, b: 1.656716
step: 11, loss: 1.394562, W: 2.828633, b: 1.711945
step: 12, loss: 1.331630, W: 2.880417, b: 1.757825
step: 13, loss: 1.287701, W: 2.923863, b: 1.795953
step: 14, loss: 1.257035, W: 2.960305, b: 1.827651
step: 15, loss: 1.235626, W: 2.990863, b: 1.854011
step: 16, loss: 1.220678, W: 3.016481, b: 1.875940
step: 17, loss: 1.210241, W: 3.037954, b: 1.894188
step: 18, loss: 1.202953, W: 3.055948, b: 1.909377
step: 19, loss: 1.197864, W: 3.071024, b: 1.922023
step: 20, loss: 1.194310, W: 3.083654, b: 1.932554
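
Because the noise has unit variance, the loss cannot fall much below 1, and W and b should settle near 3 and 2. One way to see where gradient descent is heading is to solve the same problem in closed form with least squares. The sketch below does this with np.linalg.lstsq on freshly generated data; the seed and the _demo names are assumptions, so the fitted values will differ slightly from the run above.

```python
import numpy as np

rng = np.random.default_rng(0)
x_demo = rng.normal(size=100)
y_demo = 3 * x_demo + 2 + rng.normal(size=100)

# Design matrix with a column of ones so the intercept is fitted too
design = np.column_stack([x_demo, np.ones_like(x_demo)])
solution = np.linalg.lstsq(design, y_demo, rcond=None)[0]
w_fit, b_fit = solution

print("W: %f, b: %f" % (w_fit, b_fit))
```

Gradient descent on the MSE converges toward this same least-squares solution, provided the learning rate is small enough.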

Plot the final fitted line

plt.plot(X, y, 'ro', label='Original data')
plt.plot(X, np.array(W * X + b), label='Fitted line')
plt.legend()
plt.show()

image

IV. Neural Network Example

This part trains a LeNet5 model on the CIFAR10 dataset.
The model structure is shown below:

image

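CIFAR10 images are 32x32 with 3 channels, and the labels cover 10 classes.

Define the parameters

26. Define the parameters for step ①, a convolutional layer
Input image: 32×32×3 (height × width × channels, the layout used in the code below)
Kernel size: 5×5
Number of kernels: 6
So we need 5×5×3×6 weight variables and 6 bias variables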

conv1_w = tf.Variable(tf.random.truncated_normal([5,5,3,6], stddev=0.1))
conv1_b = tf.Variable(tf.zeros([6]))

27. Define the parameters for step ③, a convolutional layer
Input: 14×14×6
Kernel size: 5×5
Number of kernels: 16
So we need 5×5×6×16 weight variables and 16 bias variables

conv2_w = tf.Variable(tf.random.truncated_normal([5, 5, 6, 16], stddev=0.1))
conv2_b = tf.Variable(tf.zeros([16]))

28. Define the parameters for step ⑤, a fully connected layer
Input: 5×5×16
Output: 120

fc1_w = tf.Variable(tf.random.truncated_normal([5*5*16, 120], stddev=0.1))
fc1_b = tf.Variable(tf.zeros([120]))

29. Define the parameters for step ⑥, a fully connected layer
Input: 120
Output: 84

fc2_w = tf.Variable(tf.random.truncated_normal([120, 84], stddev=0.1))
fc2_b = tf.Variable(tf.zeros([84]))

30. Define the parameters for step ⑦, a fully connected layer
Input: 84
Output: 10

fc3_w = tf.Variable(tf.random.truncated_normal([84, 10], stddev=0.1))
fc3_b = tf.Variable(tf.zeros([10]))
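
It can be useful to know how many trainable values these ten variables hold in total. The plain-Python sketch below adds them up from the shapes used in exercises 26 to 30; the dictionary layout and helper name are assumptions for illustration.

```python
def num_elements(shape):
    """Multiply the dimensions of a shape together."""
    count = 1
    for dim in shape:
        count *= dim
    return count

# (weight shape, bias shape) for each layer, copied from the definitions above
layer_shapes = {
    "conv1": ([5, 5, 3, 6], [6]),
    "conv2": ([5, 5, 6, 16], [16]),
    "fc1": ([5 * 5 * 16, 120], [120]),
    "fc2": ([120, 84], [84]),
    "fc3": ([84, 10], [10]),
}

param_counts = {
    name: num_elements(w_shape) + num_elements(b_shape)
    for name, (w_shape, b_shape) in layer_shapes.items()
}
total_params = sum(param_counts.values())

print(param_counts)
print(total_params)
```

Most of the parameters sit in the first fully connected layer, which is typical for this kind of small convolutional network.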

Build the model

def lenet5(input_img):
    ## 31. Build the INPUT->C1 step
    conv1_1 = tf.nn.conv2d(input_img, conv1_w, strides=[1,1,1,1], padding="VALID")
    conv1_2 = tf.nn.relu(tf.nn.bias_add(conv1_1,conv1_b))
    
    ## 32. Build the C1->S2 step
    pool1 = tf.nn.max_pool(conv1_2,ksize=[1,2,2,1],strides=[1,2,2,1],padding="VALID")
    
    ## 33. Build the S2->C3 step
    conv2_1 = tf.nn.conv2d(pool1,conv2_w,strides=[1,1,1,1],padding="VALID")
    conv2_2 = tf.nn.relu(tf.nn.bias_add(conv2_1,conv2_b))
    
    ## 34. Build the C3->S4 step
    pool2 = tf.nn.max_pool(conv2_2,ksize=[1,2,2,1],strides=[1,2,2,1],padding="VALID")
    
    ## 35. Flatten the output of S4
    reshaped = tf.reshape(pool2,[-1, 16*5*5])
    
    ## 35. Build the S4->C5 step
    fc1 = tf.nn.relu(tf.matmul(reshaped,fc1_w) + fc1_b)
    
    ## 36. Build the C5->F6 step
    fc2 = tf.nn.relu(tf.matmul(fc1,fc2_w) + fc2_b)
    
    ## 37. Build the F6->OUTPUT step
    OUTPUT = tf.nn.softmax(tf.matmul(fc2,fc3_w) + fc3_b)
    
    return OUTPUT
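
The sizes 14×14×6 and 5×5×16 quoted in the parameter definitions follow from the rule for "VALID" padding: output size = (input size − window size) // stride + 1. The plain-Python sketch below traces the spatial size through the network; valid_output_size is a hypothetical helper, not a TensorFlow function.

```python
def valid_output_size(input_size, window, stride):
    """Spatial output size of a convolution or pooling layer with VALID padding."""
    return (input_size - window) // stride + 1

sizes = {"INPUT": 32}
sizes["C1"] = valid_output_size(sizes["INPUT"], window=5, stride=1)  # 5x5 convolution
sizes["S2"] = valid_output_size(sizes["C1"], window=2, stride=2)     # 2x2 max pooling
sizes["C3"] = valid_output_size(sizes["S2"], window=5, stride=1)     # 5x5 convolution
sizes["S4"] = valid_output_size(sizes["C3"], window=2, stride=2)     # 2x2 max pooling

# S4 has 16 channels, so this is the length of each flattened sample
flattened_length = sizes["S4"] * sizes["S4"] * 16

print(sizes)
print(flattened_length)
```

The flattened length of 400 is exactly the 16*5*5 used in the reshape and the first dimension of fc1_w.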

38. Create an Adam optimizer with a learning rate of 0.02

optimizer = tf.optimizers.Adam(learning_rate=0.02)

Verify the network

(Make up some data to check that everything runs end to end.)

39. Create a random x, y pair, where x has shape (1,32,32,3) and y has shape (10,)

test_x = tf.Variable(tf.random.truncated_normal([1,32,32,3]))
test_y = [1,0,0,0,0,0,0,0,0,0]

Feed the data into the model and run backpropagation

with tf.GradientTape() as tape:  
    ## 40. Feed the data into the model
    prediction = lenet5(test_x)
    print("First prediction:", prediction)
    ## 41. Use cross-entropy as the loss function and compute the loss
    cross_entropy = -tf.reduce_sum(test_y * tf.math.log(prediction))
## 42. Compute the gradients
trainable_variables = [conv1_w, conv1_b, conv2_w, conv2_b, fc1_w, fc1_b, fc2_w, fc2_b, fc3_w, fc3_b]  # parameters to optimize
grads = tape.gradient(cross_entropy, trainable_variables) 
## 43. Apply the gradients to update the parameters
optimizer.apply_gradients(zip(grads, trainable_variables))

print("Prediction after backpropagation:", lenet5(test_x))

First prediction: tf.Tensor(
[[0.04197786 0.25509077 0.10203867 0.1586983  0.08596517 0.08058039
  0.03977505 0.1154211  0.03524319 0.08520938]], shape=(1, 10), dtype=float32)
Prediction after backpropagation: tf.Tensor(
[[9.9998593e-01 7.0147706e-08 2.0064485e-08 4.0835562e-06 2.0444522e-06
  1.0025160e-08 3.7448625e-10 6.8495629e-06 1.0462685e-06 1.5679738e-08]], shape=(1, 10), dtype=float32)
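
One caveat with taking the log of a softmax output: when a probability underflows to 0, the log is undefined, and in floating-point tensor code the loss can turn into inf or nan. A common remedy is to compute the log-softmax directly from the logits with the log-sum-exp trick. The plain-Python sketch below shows the idea; the function names are hypothetical, and it is not the implementation TensorFlow uses. TensorFlow also provides tf.nn.softmax_cross_entropy_with_logits, which takes logits rather than probabilities.

```python
import math

def log_softmax(logits):
    """Log of the softmax, computed stably by subtracting the maximum logit first."""
    shift = max(logits)
    log_sum = math.log(sum(math.exp(v - shift) for v in logits))
    return [v - shift - log_sum for v in logits]

def stable_cross_entropy(logits, one_hot):
    """Cross-entropy computed from logits without ever forming log(0)."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))

def naive_cross_entropy(logits, one_hot):
    """Softmax followed by log, as in the tutorial; fine for moderate logits."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return -sum(t * math.log(e / total) for t, e in zip(one_hot, exps))

moderate = [2.0, 1.0, 0.1]
target = [1, 0, 0]
print(stable_cross_entropy(moderate, target), naive_cross_entropy(moderate, target))

# Extreme logits: the naive version would overflow here, the stable one does not
extreme_loss = stable_cross_entropy([1000.0, 0.0, -1000.0], [0, 0, 1])
print(extreme_loss)
```

For moderate logits the two versions agree, so switching to the stable form changes nothing in the ordinary case and only protects against the extreme one.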

Load and preprocess the data

def load_cifar_batch(filename):
    """Load a single CIFAR-10 batch file."""
    with open(filename, 'rb') as f:
        datadict = pickle.load(f, encoding='iso-8859-1')
        X = datadict['data']
        Y = datadict['labels']
        # Each row is stored channel-first; convert to (N, 32, 32, 3)
        X = X.reshape(10000, 3, 32, 32).transpose(0,2,3,1).astype("uint8")
        Y = np.array(Y)
        return X,Y
def load_cifar(ROOT):
    """Load the five training batches and the test batch found under ROOT."""
    data_X, data_Y = load_cifar_batch(os.path.join(ROOT, 'data_batch_1'))
    for b in range(2,6):
        f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
        batch_X, batch_Y = load_cifar_batch(f)
        data_X = np.concatenate([data_X,batch_X])
        data_Y = np.concatenate([data_Y,batch_Y])
    data_test_X, data_test_Y  = load_cifar_batch(os.path.join(ROOT, 'test_batch'))
    return data_X, data_Y, data_test_X, data_test_Y

train_X, train_Y, test_X, test_Y = load_cifar('/home/kesci/input/cifar10')
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
train_X.shape, train_Y.shape, test_X.shape, test_Y.shape

((50000, 32, 32, 3), (50000,), (10000, 32, 32, 3), (10000,))

Pull out one sample to see what it looks like

plt.imshow(train_X[0])
plt.show()
print(classes[train_Y[0]])

image

frog

44. Preprocessing 1: normalize train_X and test_X by scaling the pixel values to the range [0, 1]

train_X = tf.cast(train_X, dtype=tf.float32) / 255
test_X = tf.cast(test_X, dtype=tf.float32) / 255

45. Preprocessing 2: one-hot encode train_Y and test_Y

train_Y = tf.one_hot(train_Y, depth=10)
test_Y = tf.one_hot(test_Y, depth=10)
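
To make the two preprocessing steps concrete, here is what they do to a handful of values, written in NumPy. The sample labels and pixel values are made up for illustration, and indexing an identity matrix is just one convenient way to build one-hot rows.

```python
import numpy as np

# One-hot encoding: label k becomes a length-10 row with a 1 at index k
labels_demo = np.array([6, 9, 0])
one_hot_demo = np.eye(10, dtype=np.float32)[labels_demo]

# Normalization: uint8 pixel values 0..255 are scaled to floats in [0, 1]
pixels_demo = np.array([0, 128, 255], dtype=np.uint8)
scaled_demo = pixels_demo.astype(np.float32) / 255

print(one_hot_demo)
print(scaled_demo)
```

One-hot labels are what allow the cross-entropy above to be written as a sum of label times log-probability, since every term except the true class is multiplied by zero.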

Train the network

The parameters were modified during the experiment above, so all of them need to be re-initialized.

conv1_w = tf.Variable(tf.random.truncated_normal([5,5,3,6], stddev=0.1))
conv1_b = tf.Variable(tf.zeros([6])) 
conv2_w = tf.Variable(tf.random.truncated_normal([5, 5, 6, 16], stddev=0.1))
conv2_b = tf.Variable(tf.zeros([16]))
fc1_w = tf.Variable(tf.random.truncated_normal([5*5*16, 120], stddev=0.1))
fc1_b = tf.Variable(tf.zeros([120]))
fc2_w = tf.Variable(tf.random.truncated_normal([120, 84], stddev=0.1))
fc2_b = tf.Variable(tf.zeros([84]))
fc3_w = tf.Variable(tf.random.truncated_normal([84, 10], stddev=0.1))
fc3_b = tf.Variable(tf.zeros([10]))

Then define a new optimizer

optimizer2 = tf.optimizers.Adam(learning_rate=0.002)

A quick, simple function for computing accuracy

def accuracy_fn(y_pred, y_true):
    preds = tf.argmax(y_pred, axis=1)  # index of the largest value, which is the predicted class label
    labels = tf.argmax(y_true, axis=1)
    return tf.reduce_mean(tf.cast(tf.equal(preds, labels), tf.float32))

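46. Feed the data into the model and start training: iterate over the training set 5 times, with each pass split into 25 batches of 2000 samples, and print the training-set accuracy after each pass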

EPOCHS = 5  # number of passes over the whole dataset

for epoch in range(EPOCHS):
    for i in range(25):  # the 50000 training samples are split into 25 batches of 2000
        with tf.GradientTape() as tape:
            prediction = lenet5(train_X[i*2000:(i+1)*2000])
            cross_entropy = -tf.reduce_sum(train_Y[i*2000:(i+1)*2000] * tf.math.log(prediction))
    
        trainable_variables = [conv1_w, conv1_b, conv2_w, conv2_b, fc1_w, fc1_b, fc2_w, fc2_b, fc3_w, fc3_b]  # parameters to optimize
        grads = tape.gradient(cross_entropy, trainable_variables)  # compute the gradients
    
        optimizer2.apply_gradients(zip(grads, trainable_variables))  # apply the gradients to update the parameters
        
    # After each epoch, report the accuracy on the training set
    # (the loss shown is the per-sample mean over the last batch only)
    accuracy = accuracy_fn(lenet5(train_X), train_Y)
    print('Epoch [{}/{}], Train loss: {:.3f}, Train accuracy: {:.3f}'
              .format(epoch+1, EPOCHS, cross_entropy/2000, accuracy))

Epoch [1/5], Train loss: 1.934, Train accuracy: 0.309
Epoch [2/5], Train loss: 1.758, Train accuracy: 0.376
Epoch [3/5], Train loss: 1.638, Train accuracy: 0.422
Epoch [4/5], Train loss: 1.564, Train accuracy: 0.442
Epoch [5/5], Train loss: 1.488, Train accuracy: 0.472
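
The loop above visits the same 25 contiguous slices in the same order every epoch. A common variation is to reshuffle the sample order at the start of each epoch. The NumPy sketch below generates such batches of indices; shuffled_batches is a hypothetical helper and the seed is arbitrary.

```python
import numpy as np

def shuffled_batches(num_samples, batch_size, rng):
    """Yield arrays of sample indices that cover the dataset once in random order."""
    order = rng.permutation(num_samples)
    for start in range(0, num_samples, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
batches = list(shuffled_batches(50000, 2000, rng))

print(len(batches), batches[0][:5])
```

Each index array could then be used to select a batch, for example with tf.gather(train_X, batch_indices), in place of the fixed slices. Whether this improves the results for this small experiment is not something the tutorial measures.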

Use the network for prediction

47. Predict on the test set

test_prediction = lenet5(test_X)
test_acc = accuracy_fn(test_prediction, test_Y)
test_acc.numpy()

0.4665

Look at the predictions for a few samples

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.imshow(test_X[i], cmap=plt.cm.binary)
    title=classes[np.argmax(test_Y[i])]+'=>'
    title+=classes[np.argmax(test_prediction[i])]
    plt.xlabel(title)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)

image

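V. Saving and Restoring Variables

This part implements the simplest possible saving and restoring of variable values.

48. Create a Checkpoint object and attach one of the freshly trained variables to it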

save = tf.train.Checkpoint()
save.listed = [fc3_b]
save.mapped = {'fc3_b': save.listed[0]}

49. Save with the save() method and record the save path it returns

save_path = save.save('/home/kesci/work/data/tf_list_example')
print(save_path)

/home/kesci/work/data/tf_list_example-1

50. Create a new Checkpoint object and read the data back from it

restore = tf.train.Checkpoint()
fc3_b2 = tf.Variable(tf.zeros([10]))
print(fc3_b2.numpy())
restore.mapped = {'fc3_b': fc3_b2}
restore.restore(save_path)
print(fc3_b2.numpy())

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[-0.04522281 -0.03583474  0.06039428 -0.00184523  0.10119602 -0.04645465
  0.06197326 -0.02776093 -0.00871902 -0.04295361]