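Introduction
These notes collect questions and tricks from the Keras documentation FAQ that I have actually used at work. They are adapted from the Keras FAQ.
Topics covered:
- Multi-GPU training
- Getting the output of an intermediate layer
- Freezing layers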
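Running on multiple GPUs
There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism.
Data parallelism
Data parallelism replicates the model once on each GPU. Every replica processes a different slice of each input batch at the same time, which speeds up training.
Keras has a built-in utility, keras.utils.multi_gpu_model, which can produce a data-parallel version of any model and achieves quasi-linear speedup on multiple GPUs.
See the multi_gpu_model documentation for more details.
Here is an example: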
from keras.utils import multi_gpu_model
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# This `fit` call will be distributed on 8 GPUs.
parallel_model.fit(x, y, epochs=20, batch_size=256) # batch size: 256, each GPU will process 32 samples.
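To make the batch arithmetic in the comment above concrete, here is a minimal framework-free sketch of how a data-parallel wrapper might split one batch into per-replica slices and then merge the results. The helper names split_batch and run_data_parallel are hypothetical, and plain Python lists stand in for tensors and GPUs. It illustrates the idea only and is not how multi_gpu_model is implemented.

```python
def split_batch(batch, num_replicas):
    """Split a batch into num_replicas contiguous, near-equal slices."""
    base, extra = divmod(len(batch), num_replicas)
    slices, start = [], 0
    for i in range(num_replicas):
        # The first `extra` replicas take one additional sample each.
        size = base + (1 if i < extra else 0)
        slices.append(batch[start:start + size])
        start += size
    return slices


def run_data_parallel(model_fn, batch, num_replicas):
    """Apply the same model_fn to every slice, then concatenate the outputs."""
    outputs = []
    for replica_slice in split_batch(batch, num_replicas):
        # On real hardware each slice would be processed on its own GPU.
        outputs.extend(model_fn(replica_slice))
    return outputs


batch = list(range(256))
slices = split_batch(batch, num_replicas=8)
print([len(s) for s in slices])  # eight slices of 32 samples each

# A stand-in "model" that doubles every sample.
doubled = run_data_parallel(lambda xs: [2 * x for x in xs], batch, num_replicas=8)
```

Because the slices are concatenated in order, the merged output lines up with the original batch, which is why the caller can treat the parallel model like the single-GPU one.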
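Device parallelism
Device parallelism runs different branches of the same model on different GPUs. It is mostly useful for models with a parallel architecture, for example AlexNet, whose convolutions were split across multiple GPUs. Here is an example: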
import keras
import tensorflow as tf

# Model where a shared LSTM is used to encode two different sequences in parallel
input_a = keras.Input(shape=(140, 256))
input_b = keras.Input(shape=(140, 256))
shared_lstm = keras.layers.LSTM(64)

# Process the first sequence on one GPU
with tf.device('/gpu:0'):
    encoded_a = shared_lstm(input_a)
# Process the next sequence on another GPU
with tf.device('/gpu:1'):
    encoded_b = shared_lstm(input_b)
# Concatenate results on CPU
with tf.device('/cpu:0'):
    merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
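How to get the output of an intermediate layer
Getting the output from a Model
Create a new Model that shares the original model's input and outputs the layer you are interested in. Calling predict on it returns that layer's output directly, as shown below.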
from keras.models import Model
model = ... # create the original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
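Conceptually, the truncated model runs the original layers up to the named layer and stops there. The following framework-free sketch illustrates that idea with plain Python functions standing in for layers. The layer names and the helper run_until are hypothetical and are not part of the Keras API.

```python
# Ordered (name, function) pairs stand in for the layers of a sequential model.
layers = [
    ("scale", lambda xs: [2.0 * x for x in xs]),
    ("my_layer", lambda xs: [x + 1.0 for x in xs]),
    ("head", lambda xs: [sum(xs)]),
]


def run_until(layers, data, layer_name=None):
    """Run layers in order; stop after layer_name if one is given."""
    out = data
    for name, fn in layers:
        out = fn(out)
        if name == layer_name:
            return out
    if layer_name is not None:
        raise ValueError("no layer named %r" % layer_name)
    return out


data = [1.0, 2.0, 3.0]
full_output = run_until(layers, data)                      # output of the last layer
intermediate_output = run_until(layers, data, "my_layer")  # output of 'my_layer'
```

Both calls reuse the same layer functions, which mirrors how the intermediate Keras model shares weights with the original model instead of copying them.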
Using a Keras backend function
from keras import backend as K
# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]
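Models with Dropout or BatchNormalization
If the model contains layers that behave differently during training and testing, such as Dropout or BatchNormalization (BN), you must also pass the learning-phase flag to the function, as shown below.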
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()], [model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([x, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([x, 1])[0]
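The reason the flag matters is that such layers compute different things in the two phases. As a rough illustration, here is a framework-free sketch of inverted dropout controlled by a learning-phase flag. The function name dropout and its parameters are hypothetical, and this is not the Keras implementation.

```python
import random


def dropout(values, rate, learning_phase, rng=random):
    """Inverted dropout: active when learning_phase == 1, identity when 0."""
    if learning_phase == 0:
        # Test mode: the layer passes its input through unchanged.
        return list(values)
    keep_prob = 1.0 - rate
    # Train mode: zero each value with probability `rate` and rescale the rest
    # so the expected value of every output matches its input.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


x = [1.0, 2.0, 3.0, 4.0]
test_output = dropout(x, rate=0.5, learning_phase=0)
train_output = dropout(x, rate=0.5, learning_phase=1, rng=random.Random(0))
```

In test mode the output equals the input, while in train mode every entry is either zero or the input divided by the keep probability, so the same input can yield different intermediate outputs depending on the flag.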
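How to freeze layers
Freezing a layer means its weights are not updated during training. This is mostly used when fine-tuning a model.
To freeze a layer, pass trainable=False when constructing it.
from keras.layers import Dense
frozen_layer = Dense(32, trainable=False)
You can also set the trainable attribute after the layer has been created. The change only takes effect once the model is compiled, so call compile() after modifying it, as shown below.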
from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)
frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')
layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')
frozen_model.fit(data, labels) # this does NOT update the weights of `layer`
trainable_model.fit(data, labels) # this updates the weights of `layer`