tf.nn.embedding_lookup
tf.nn.embedding_lookup selects the rows of a tensor that correspond to a list of indices. In tf.nn.embedding_lookup(tensor, ids), tensor is the tensor to look up from and ids are the indices to fetch; the remaining parameters are not covered here.
For example:
import tensorflow as tf
import numpy as np

c = np.random.random([10, 1])
b = tf.nn.embedding_lookup(c, [1, 3])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(b))
    print(c)
Output:
[[ 0.77505197]
[ 0.20635818]]
[[ 0.23976515]
[ 0.77505197]
[ 0.08798201]
[ 0.20635818]
[ 0.37183035]
[ 0.24753178]
[ 0.17718483]
[ 0.38533808]
[ 0.93345168]
[ 0.02634772]]
Analysis: the output consists of the rows of c at indices 1 and 3 (the second and fourth rows, since indexing starts at 0).
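In practice, embedding_lookup is most often used to map integer token ids to rows of an embedding matrix. Below is a minimal sketch of that use; the names (embeddings, word_ids) and sizes are made up for illustration:

import tensorflow as tf

# hypothetical embedding table: 100 tokens, 8-dimensional embeddings
embeddings = tf.get_variable('embeddings', shape=[100, 8], dtype=tf.float32)

# a batch of 2 sequences, 3 token ids each
word_ids = tf.constant([[4, 7, 1],
                        [2, 0, 9]])

# result has shape [2, 3, 8]: one embedding row per token id
embedded = tf.nn.embedding_lookup(embeddings, word_ids)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(embedded).shape)  # (2, 3, 8)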
tf.truncated_normal_initializer
tf.truncated_normal_initializer generates random values from a truncated normal distribution.
The values follow a normal distribution with the specified mean and standard deviation, except that values more than two standard deviations from the mean are discarded and re-drawn.
Args:
mean: a Python scalar or a scalar tensor; the mean of the random values to generate.
stddev: a Python scalar or a scalar tensor; the standard deviation of the random values to generate.
seed: a Python integer used to create the random seed; see tf.set_random_seed for details.
dtype: the data type; only floating-point types are supported.
This is the recommended initializer for neural-network weights and filters.
import tensorflow as tf

t = tf.truncated_normal_initializer(stddev=0.1, seed=1)
v = tf.get_variable('v', [1], initializer=t)

with tf.Session() as sess:
    for i in range(1, 10, 1):
        sess.run(tf.global_variables_initializer())
        print(sess.run(v))
Output:
[-0.08113182]
[ 0.06396971]
[ 0.13587774]
[ 0.05517125]
[-0.02088852]
[-0.03633211]
[-0.06759059]
[-0.14034753]
[-0.16338211]
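Since it is the recommended initializer for weights and filters, it is typically passed to get_variable or to a layer's kernel_initializer argument. A minimal sketch (the layer sizes here are arbitrary):

import tensorflow as tf

init = tf.truncated_normal_initializer(stddev=0.1)

# a weight matrix initialized from a truncated normal distribution
W = tf.get_variable('W', shape=[64, 10], initializer=init)

# the same initializer used for a convolution kernel
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
conv = tf.layers.conv2d(x, filters=32, kernel_size=3, kernel_initializer=init)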
tf.reduce_mean
TensorFlow has a family of functions that reduce a tensor along a given dimension, for example:
Maximum: tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)
Mean: tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)
Parameter 1, input_tensor: the tensor to reduce.
Parameter 2, reduction_indices: the dimension along which to reduce.
Parameters 3 and 4 can usually be left at their defaults.
An example:
# 'x' is [[1., 2.]
# [3., 4.]]
x is a 2-D array; the reduce_* calls behave as follows.
First, the mean:
tf.reduce_mean(x) ==> 2.5            # with no second argument, the mean is taken over all elements
tf.reduce_mean(x, 0) ==> [2., 3.]    # with 0, the reduction runs over the first dimension, i.e. the mean of each column
tf.reduce_mean(x, 1) ==> [1.5, 3.5]  # with 1, the reduction runs over the second dimension, i.e. the mean of each row
Likewise, tf.reduce_max() computes the maximum, and so on. A runnable version is sketched below.
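A minimal runnable sketch of the example above (the values match the comments):

import tensorflow as tf

x = tf.constant([[1., 2.],
                 [3., 4.]])

with tf.Session() as sess:
    print(sess.run(tf.reduce_mean(x)))     # 2.5
    print(sess.run(tf.reduce_mean(x, 0)))  # [2. 3.]
    print(sess.run(tf.reduce_mean(x, 1)))  # [1.5 3.5]
    print(sess.run(tf.reduce_max(x, 0)))   # [3. 4.]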
name_scope, variable_scope
TensorFlow has two ways of creating variables, tf.get_variable() and tf.Variable(); their difference shows up most clearly in how they interact with scopes.
Inside a tf.name_scope, variables created with tf.get_variable() are not affected by the name_scope, and if a name is reused without variable sharing enabled, an error is raised; tf.Variable() detects name collisions automatically and resolves them itself (by appending a suffix).
import tensorflow as tf

with tf.name_scope('name_scope_x'):
    var1 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32)
    var3 = tf.Variable(name='var2', initial_value=[2], dtype=tf.float32)
    var4 = tf.Variable(name='var2', initial_value=[2], dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var1.name, sess.run(var1))
    print(var3.name, sess.run(var3))
    print(var4.name, sess.run(var4))
# Output:
# var1:0 [-0.30036557]          (note the name is not prefixed with 'name_scope_x')
# name_scope_x/var2:0 [ 2.]
# name_scope_x/var2_1:0 [ 2.]   (the name was automatically changed to 'var2_1' to avoid clashing with 'var2')
If variables are created with tf.get_variable() and sharing is not enabled, reusing a name raises an error:
import tensorflow as tf

with tf.name_scope('name_scope_1'):
    var1 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32)
    var2 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var1.name, sess.run(var1))
    print(var2.name, sess.run(var2))
# ValueError: Variable var1 already exists, disallowed. Did you mean
# to set reuse=True in VarScope? Originally defined at:
# var1 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32)
So to share variables, use tf.variable_scope():
import tensorflow as tf

with tf.variable_scope('variable_scope_y') as scope:
    var1 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32)
    scope.reuse_variables()  # enable variable sharing
    var1_reuse = tf.get_variable(name='var1')
    var2 = tf.Variable(initial_value=[2.], name='var2', dtype=tf.float32)
    var2_reuse = tf.Variable(initial_value=[2.], name='var2', dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var1.name, sess.run(var1))
    print(var1_reuse.name, sess.run(var1_reuse))
    print(var2.name, sess.run(var2))
    print(var2_reuse.name, sess.run(var2_reuse))
# Output:
# variable_scope_y/var1:0 [-1.59682846]
# variable_scope_y/var1:0 [-1.59682846]   (var1_reuse is the same variable as var1)
# variable_scope_y/var2:0 [ 2.]
# variable_scope_y/var2_1:0 [ 2.]
The same can also be written as:
with tf.variable_scope('foo') as foo_scope:
    v = tf.get_variable('v', [1])
with tf.variable_scope('foo', reuse=True):
    v1 = tf.get_variable('v')
assert v1 == v
Or like this:
with tf.variable_scope('foo') as foo_scope:
    v = tf.get_variable('v', [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable('v')
assert v1 == v
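Variable sharing is what lets the same graph-building function be called several times while reusing one set of weights. A minimal sketch under that assumption (the helper name dense_layer and its sizes are made up for illustration):

import tensorflow as tf

def dense_layer(x):
    # get_variable either creates 'w' or, under reuse=True, returns the existing one
    w = tf.get_variable('w', shape=[4, 2], dtype=tf.float32)
    return tf.matmul(x, w)

x1 = tf.placeholder(tf.float32, [None, 4])
x2 = tf.placeholder(tf.float32, [None, 4])

with tf.variable_scope('shared'):
    y1 = dense_layer(x1)
with tf.variable_scope('shared', reuse=True):
    y2 = dense_layer(x2)  # reuses shared/w instead of creating a new variable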
Getting the shape of a variable
Getting a tensor's shape is a very common operation; TensorFlow offers three main ways to do it:
- Tensor.shape
- Tensor.get_shape()
- tf.shape(input,name=None,out_type=tf.int32)
A quick comparison of the three (call them A, B, and C):
A and B are essentially the same; the former is a property of the Tensor, the latter a method of the Tensor.
A and B both return a TensorShape, while C returns a 1-D Tensor of type out_type.
A and B can be used anywhere in the code, while C must be evaluated inside a Session.
A and B return the static shape, which may be only partially defined; C returns the dynamic shape, which must be fully defined (so placeholders must actually be fed).
Ways to get a concrete dimension value out of a TensorShape:
# read the i-th dimension of the TensorShape directly
x.shape[i].value
x.get_shape()[i].value
# convert the TensorShape to a Python list, then index into it
x.get_shape().as_list()[i]
A complete example:
import tensorflow as tf

x1 = tf.constant([[1, 2, 3], [4, 5, 6]])
# a placeholder whose first dimension is None, i.e. left unspecified for now
x2 = tf.placeholder(tf.float32, [None, 2, 3])

print('x1.shape:', x1.shape)
print('x2.shape:', x2.shape)
print('x2.shape[1].value:', x2.shape[1].value)
print('tf.shape(x1):', tf.shape(x1))
print('tf.shape(x2):', tf.shape(x2))
print('x1.get_shape():', x1.get_shape())
print('x2.get_shape():', x2.get_shape())
print('x2.get_shape.as_list[1]:', x2.get_shape().as_list()[1])

shapeOP1 = tf.shape(x1)
shapeOP2 = tf.shape(x2)

with tf.Session() as sess:
    print('Within session, tf.shape(x1):', sess.run(shapeOP1))
    # x2 has not been fed, so its shape is incomplete and running the next line would fail
    # print('Within session, tf.shape(x2):', sess.run(shapeOP2))  # this would raise an error
The output is:
x1.shape: (2, 3)
x2.shape: (?, 2, 3)
x2.shape[1].value: 2
tf.shape(x1): Tensor("Shape:0", shape=(2,), dtype=int32)
tf.shape(x2): Tensor("Shape_1:0", shape=(3,), dtype=int32)
x1.get_shape(): (2, 3)
x2.get_shape(): (?, 2, 3)
x2.get_shape.as_list[1]: 2
Within session, tf.shape(x1): [2 3]
tf.expand_dims
tf.expand_dims inserts a dimension of size 1 at a given axis. For example, to add a batch dimension and a channel dimension to a single image tensor:
one_img = tf.expand_dims(one_img, 0)
one_img = tf.expand_dims(one_img, -1)  # -1 means the last dimension
Finally, the official example and documentation:
# 't' is a tensor of shape [2]
shape(expand_dims(t, 0)) ==> [1, 2]
shape(expand_dims(t, 1)) ==> [2, 1]
shape(expand_dims(t, -1)) ==> [2, 1]
# 't2' is a tensor of shape [2, 3, 5]
shape(expand_dims(t2, 0)) ==> [1, 2, 3, 5]
shape(expand_dims(t2, 2)) ==> [2, 3, 1, 5]
shape(expand_dims(t2, 3)) ==> [2, 3, 5, 1]
Args:
input: A Tensor.
dim: A Tensor. Must be one of the following types: int32, int64. 0-D (scalar). Specifies the dimension index at which to expand the shape of input.
name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input. Contains the same data as input, but its shape has an additional dimension of size 1 added.
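A minimal runnable sketch of the shape changes above (the static shapes can be printed without a Session):

import tensorflow as tf

t = tf.zeros([2])         # shape [2]
t2 = tf.zeros([2, 3, 5])  # shape [2, 3, 5]

print(tf.expand_dims(t, 0).shape)   # (1, 2)
print(tf.expand_dims(t, 1).shape)   # (2, 1)
print(tf.expand_dims(t, -1).shape)  # (2, 1)
print(tf.expand_dims(t2, 2).shape)  # (2, 3, 1, 5)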
tf.cast
cast(x, dtype, name=None)
Casts x to dtype. For example, if x originally has dtype bool, casting it to float turns it into a sequence of 0s and 1s; the reverse direction also works (zero becomes False, non-zero becomes True).
a = tf.Variable([1, 0, 0, 1, 1])
b = tf.cast(a, dtype=tf.bool)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(b))
# [ True False False  True  True]
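The reverse direction, casting a bool tensor back to floats, is a minimal sketch of the same call:

import tensorflow as tf

mask = tf.constant([True, False, True])
as_float = tf.cast(mask, dtype=tf.float32)

with tf.Session() as sess:
    print(sess.run(as_float))  # [1. 0. 1.]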
Padding modes "SAME" and "VALID"
As the example below shows, "SAME" padding ends up covering one more column than "VALID".
Let x be a 2x3 matrix, with a 2x2 max-pooling window and a stride of 2 in both dimensions.
The first window fits entirely within the input, so the first max-pool step is unproblematic:
| 1 | 2 | 3 |
|---|---|---|
| 4 | 5 | 6 |
Here is where "SAME" and "VALID" differ. With a stride of 2, after sliding the window to the right the remaining region is narrower than 2x2. "VALID" simply drops the third column, while "SAME" keeps it; but a single column is not enough for a 2x2 window, so what happens? Padding!
| 1 | 2 | 3 | 0 |
|---|---|---|---|
| 4 | 5 | 6 | 0 |
As shown above, "SAME" adds a fourth column so that a full 2x2 window fits; to avoid distorting the original pixel information, the padding is filled with zeros. (Tables are used here in place of the original figures.) This is why different padding modes produce outputs of different shapes; a runnable comparison is sketched below.
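A minimal sketch comparing the two padding modes on the 2x3 matrix above (the input is reshaped to the [batch, height, width, channels] layout that tf.nn.max_pool expects):

import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1])  # [batch, height, width, channels]

valid = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
same = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    print('VALID:', sess.run(valid).reshape(-1))  # [5.]    -> a single 1x1 output
    print('SAME :', sess.run(same).reshape(-1))   # [5. 6.] -> a 1x2 output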
tf.nn.dropout
tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
The most commonly used parameters are the first two:
x: the input tensor.
keep_prob: the probability that each element is kept. It is usually created as a placeholder, keep_prob = tf.placeholder(tf.float32), and the concrete value is supplied through feed_dict at run time, e.g. keep_prob: 0.5.
name: an optional name for the operation.
A minimal standalone sketch of the call follows; a full training example comes after it.
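The sketch below shows what tf.nn.dropout actually does: each element is kept with probability keep_prob, and the kept elements are scaled by 1/keep_prob so the expected sum is unchanged (the exact zero pattern is random):

import tensorflow as tf

keep_prob = tf.placeholder(tf.float32)
x = tf.ones([10])
y = tf.nn.dropout(x, keep_prob)

with tf.Session() as sess:
    # roughly half of the elements are zeroed; the survivors become 1/0.5 = 2.0
    print(sess.run(y, feed_dict={keep_prob: 0.5}))
    # with keep_prob = 1.0, dropout is a no-op
    print(sess.run(y, feed_dict={keep_prob: 1.0}))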
A full usage example:
from __future__ import print_function
import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

# load data
digits = load_digits()
X = digits.data
y = digits.target
y = LabelBinarizer().fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)


def add_layer(inputs, in_size, out_size, layer_name, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    # here to dropout
    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    tf.summary.histogram(layer_name + '/outputs', outputs)
    return outputs


# define placeholder for inputs to network
keep_prob = tf.placeholder(tf.float32)
xs = tf.placeholder(tf.float32, [None, 64])  # 8x8
ys = tf.placeholder(tf.float32, [None, 10])

# add output layer
l1 = add_layer(xs, 64, 50, 'l1', activation_function=tf.nn.tanh)
prediction = add_layer(l1, 50, 10, 'l2', activation_function=tf.nn.softmax)

# the loss between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))  # loss
tf.summary.scalar('loss', cross_entropy)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.Session()
merged = tf.summary.merge_all()
# summary writer goes in here
train_writer = tf.summary.FileWriter("logs/train", sess.graph)
test_writer = tf.summary.FileWriter("logs/test", sess.graph)

sess.run(tf.global_variables_initializer())

for i in range(500):
    # here to determine the keeping probability
    sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 0.5})
    if i % 50 == 0:
        # record loss
        train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})
        test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})
        train_writer.add_summary(train_result, i)
        test_writer.add_summary(test_result, i)
Notes:
1. To visualize both the training and the test loss, we need two writers, one for each set of results, e.g. logs/train and logs/test:
sess = tf.Session()
merged = tf.summary.merge_all()
# summary writer goes in here
train_writer = tf.summary.FileWriter("logs/train", sess.graph)
test_writer = tf.summary.FileWriter("logs/test", sess.graph)
2. Then, during training, record the train and test summaries at regular intervals:
for i in range(500):
    sess.run(train_step, feed_dict={xs: X_train, ys: y_train})
    if i % 50 == 0:
        train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train})
        test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test})
        train_writer.add_summary(train_result, i)
        test_writer.add_summary(test_result, i)
3. Dropout needs the probability keep_prob, which is itself a placeholder, just like the inputs:
keep_prob = tf.placeholder(tf.float32)
4. Dropout should only be active during training; when evaluating on the training and test sets, it should be disabled by feeding keep_prob: 1:
sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 0.5})
train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})
test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})
5. In TensorFlow, dropout comes down to a single function call that deactivates each neuron with a fixed probability:
def add_layer(inputs, in_size, out_size, layer_name, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    # here to dropout
    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    tf.summary.histogram(layer_name + '/outputs', outputs)
    return outputs
6. Note: with dropout applied, the training and test errors stay close to each other, i.e. overfitting is reduced.
tf.argmax
tf.argmax(vector, 1) returns the index of the maximum value along axis 1. If the input is a single row, it returns one index; if it is a matrix with several rows, it returns a vector containing, for each row, the index of that row's largest element.
import tensorflow as tf
import numpy as np

A = [[1, 3, 4, 5, 6]]
B = [[1, 3, 4], [2, 4, 1]]

with tf.Session() as sess:
    print(sess.run(tf.argmax(A, 1)))
    print(sess.run(tf.argmax(B, 1)))
Output:
[4]
[2 1]
tf.equal
tf.equal(A, B) compares the two tensors (matrices or vectors) element-wise, returning True where the elements are equal and False otherwise; the result has the same shape as A.
import tensorflow as tf
import numpy as np

A = [[1, 3, 4, 5, 6]]
B = [[1, 3, 4, 3, 2]]

with tf.Session() as sess:
    print(sess.run(tf.equal(A, B)))
Output:
[[ True True True False False]]
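tf.argmax, tf.equal, tf.cast, and tf.reduce_mean are often combined to compute classification accuracy. A minimal sketch of that pattern, assuming prediction and ys stand for the softmax output and one-hot labels of a model like the one in the dropout section:

import tensorflow as tf

# placeholders standing in for model output and one-hot labels
prediction = tf.placeholder(tf.float32, [None, 10])
ys = tf.placeholder(tf.float32, [None, 10])

correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(ys, 1))  # bool per example
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))         # fraction of correct predictions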