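NumPy lets us work efficiently with arrays and matrices in Python.
Below are the most basic NumPy essentials; this post will be expanded over time.
Creating an Array
You can build a list and convert it into an array with np.array(mylist):
import numpy as np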
mylist = [1, 2, 3]
x = np.array(mylist)
x
>>>array([1, 2, 3])
Or, more directly:
y = np.array([4, 5, 6])
m = np.array([[7, 8, 9], [10, 11, 12]])
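To see what np.array actually produced, it can help to look at the number of dimensions and the shape. The following is a minimal sketch; the name arr is only illustrative and does not come from the original post.

```python
import numpy as np

x = np.array([1, 2, 3])
m = np.array([[7, 8, 9], [10, 11, 12]])

# a flat list gives a 1-D array, a list of lists gives a 2-D array
print(x.ndim, x.shape)   # 1 (3,)
print(m.ndim, m.shape)   # 2 (2, 3)

# np.array copies the list's data, so later edits to the list
# do not change the array
mylist = [1, 2, 3]
arr = np.array(mylist)
mylist[0] = 99
print(arr)               # [1 2 3]
```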
The arange function takes a start, a stop, and a step value, and returns evenly spaced values within the given interval. The stop value itself is not included.
n = np.arange(0, 30, 2) # start at 0 count up by 2, stop before 30
>>>
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
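As a quick check on how the interval works, here is a small sketch (the variable names are illustrative) showing that the stop value is excluded, that a single argument counts up from 0, and that the step may be negative.

```python
import numpy as np

# stop is exclusive: 30 itself never appears in the result
evens = np.arange(0, 30, 2)
print(evens[-1])      # 28
print(len(evens))     # 15

# with a single argument, arange counts from 0 in steps of 1
first_five = np.arange(5)
print(first_five)     # [0 1 2 3 4]

# a negative step counts down; the stop value is still excluded
countdown = np.arange(10, 0, -3)
print(countdown)      # [10  7  4  1]
```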
To reshape this array into a 3x5 array:
n = n.reshape(3, 5) # reshape array to be 3x5
n
>>>
array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])
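Reshaping only works when the new shape holds exactly the same number of elements. The sketch below, with illustrative variable names, shows the -1 shortcut that lets NumPy work out one dimension, and what happens when the sizes do not match.

```python
import numpy as np

n = np.arange(0, 30, 2)      # 15 elements

# -1 asks NumPy to infer that dimension from the array's size
a = n.reshape(3, -1)
print(a.shape)               # (3, 5)

# the total number of elements must match, otherwise ValueError is raised
try:
    n.reshape(4, 4)
except ValueError as err:
    print("cannot reshape:", err)
```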
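The linspace function is similar to arange, except that you tell it how many numbers to return and it divides the interval accordingly. By default the stop value is included.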
o = np.linspace(0, 4, 9) # return 9 evenly spaced values from 0 to 4
o
>>> array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
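To make the difference between the two functions concrete, here is a rough side-by-side sketch. The variable names are made up for this example.

```python
import numpy as np

# linspace: you choose the number of points; the endpoint 4 is included
by_count = np.linspace(0, 4, 9)

# arange: you choose the step; stop has to go past 4 for 4 to be included
by_step = np.arange(0, 4.5, 0.5)

print(np.allclose(by_count, by_step))   # True

# endpoint=False drops the final value, like arange's exclusive stop
half_open = np.linspace(0, 4, 8, endpoint=False)
print(half_open[0], half_open[-1])      # 0.0 3.5
```

With floating-point steps, linspace is usually the safer choice when you need an exact number of points.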
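NumPy also has several built-in functions and shortcuts for creating arrays.
- ones returns an array filled with 1s
- zeros returns an array filled with 0s
- eye returns an array with 1s on the diagonal and 0s everywhere else
- diag constructs a diagonal array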
np.ones((3, 2))
>>>
array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])
np.zeros((2, 3))
>>>
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
np.eye(3)
>>>
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
y = np.array([4, 5, 6])
np.diag(y)
>>>
array([[4, 0, 0],
       [0, 5, 0],
       [0, 0, 6]])
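A few details about these helpers are easy to miss, so here is a short sketch. The names int_ones and d are illustrative only.

```python
import numpy as np

# ones and zeros return floats by default; dtype overrides that
int_ones = np.ones((3, 2), dtype=int)
print(int_ones.dtype)          # an integer dtype such as int64 (platform dependent)

# diag works both ways: a 1-D input builds a matrix,
# a 2-D input extracts the diagonal
y = np.array([4, 5, 6])
d = np.diag(y)
print(np.diag(d))              # [4 5 6]

# the identity matrix equals a diagonal matrix of ones
print(np.array_equal(np.eye(3), np.diag(np.ones(3))))   # True
```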
Indexing and Slicing
Create an array holding the squares of the integers 0 through 12
s = np.arange(13)**2
s
>>> array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144])
- Use square brackets with a number inside to get the value at a specific index
s[0], s[4], s[-1]
>>>(0, 16, 144)
- Use the colon (:) notation to get a range
- The first example is the range that starts at index 1 and stops before index 5
s[1:5]
>>>array([ 1, 4, 9, 16])
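- Use negative numbers to count from the end of the array
- Specifying a start or end index is optional; just leave that side of the : empty
- The last four elements of the array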
s[-4:]
>>>array([ 81, 100, 121, 144])
- Start at the fifth element from the end of the array and step backwards by two
A second : can be used to indicate step-size. array[start:stop:stepsize]
s[-5::-2]
>>> array([64, 36, 16, 4, 0])
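The start:stop:stepsize pattern is easier to remember after a few more cases. This is a small sketch that reuses the array s from above.

```python
import numpy as np

s = np.arange(13) ** 2

# array[start:stop:stepsize]; any of the three parts may be omitted
print(s[::3])       # [  0   9  36  81 144]
print(s[2:10:4])    # [ 4 36]
print(s[::-1][:3])  # [144 121 100]

# s[-5::-2] starts at index len(s) - 5 = 8 and walks towards index 0
print(np.array_equal(s[-5::-2], s[8::-2]))   # True
```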
Multidimensional Arrays
r = np.arange(36)
r.resize((6, 6))
r
>>>
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])
#Use bracket notation to slice: array[row, column]
r[2, 2]
>>> 14
# select a range of rows or columns
r[3, 3:6]
>>>
array([21, 22, 23])
- The first two rows, and every column except the last
# select all the rows up to (and not including) row 2
# and all the columns up to (and not including) the last column
r[:2, :-1]
>>>
array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])
# select every second element of the last row
r[-1, ::2]
>>>
array([30, 32, 34])
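Here are a few more array[row, column] combinations, as a rough sketch. It builds r with reshape instead of resize, which is a choice made for this example so that the block is self-contained.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

# a single index selects a whole row
print(r[1])                # [ 6  7  8  9 10 11]

# ':' keeps every row, 1 picks the second column
print(r[:, 1])             # [ 1  7 13 19 25 31]

# rows 1-2 and columns 4-5
block = r[1:3, 4:]
print(block.tolist())      # [[10, 11], [16, 17]]

# every 2nd row and every 3rd column
print(r[::2, ::3].shape)   # (3, 2)
```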
- Use the bracket operator for conditional indexing
# conditional indexing
r[r > 30]
>>>
array([31, 32, 33, 34, 35])
# assign 30 to every element greater than 30
r[r > 30] = 30
r
>>>
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
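Conditional indexing works because the comparison itself returns an array of booleans. The sketch below spells that out; mask, even_and_large, and capped are illustrative names, and np.minimum is shown only as one possible non-destructive alternative to the in-place assignment.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

# the comparison produces a boolean array of the same shape
mask = r > 30
print(mask.shape, mask.dtype)   # (6, 6) bool
print(mask.sum())               # 5

# combine conditions with & (and), | (or), ~ (not); parentheses are required
even_and_large = r[(r % 2 == 0) & (r > 28)]
print(even_and_large)           # [30 32 34]

# np.minimum caps the values without modifying r itself
capped = np.minimum(r, 30)
print(capped.max(), r.max())    # 30 35
```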
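Copying Data
- Create a new array r2 that is a slice of the array r
- Set all the elements of r2 to zero
- Looking at the original array r, we can see that the same slice of r has changed too
- This is something to remember and be careful about when working with NumPy arrays: a slice is a view onto the same data. If we want a copy that leaves the original array r untouched, we can use r.copy()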
r2 = r[:3,:3]
r2
>>>
array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14]])
r2[:] = 0
r2
>>>
array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])
r
>>>
array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
# use r.copy to create a copy that will not affect the original array
r_copy = r.copy()
r_copy
>>>
array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
r_copy[:] = 10
print(r_copy, '\n')
print(r)
>>>
[[10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]]

[[ 0  0  0  3  4  5]
 [ 0  0  0  9 10 11]
 [ 0  0  0 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 30 30 30 30 30]]
- After every element of r_copy is set to 10, the original array r remains unchanged
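If you are ever unsure whether two arrays share data, NumPy can tell you. This is a minimal sketch using np.shares_memory and the base attribute; the names view and copy are illustrative.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]          # basic slicing returns a view
copy = r[:3, :3].copy()   # an explicit, independent copy

print(np.shares_memory(r, view))   # True
print(np.shares_memory(r, copy))   # False

# a view's base is the array that owns the data; a copy owns its own
print(view.base is not None, copy.base is None)   # True True

# `view = 0` would only rebind the name; `view[:] = 0` writes through to r
view[:] = 0
print(r[0, 0], r[2, 2], r[3, 3])   # 0 0 21
```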
Iterating Over Arrays
test = np.random.randint(0, 10, (4,3))
test
>>>
array([[4, 8, 7],
       [7, 8, 3],
       [9, 1, 3],
       [7, 5, 4]])
#Iterate by row:
for row in test:
    print(row)
>>>
[4 8 7]
[7 8 3]
[9 1 3]
[7 5 4]
#Iterate by index:
for i in range(len(test)):
    print(test[i])
>>>
[4 8 7]
[7 8 3]
[9 1 3]
[7 5 4]
#Iterate by row and index:
for i, row in enumerate(test):
    print('row', i, 'is', row)
>>>
row 0 is [4 8 7]
row 1 is [7 8 3]
row 2 is [9 1 3]
row 3 is [7 5 4]
#Use zip to iterate over multiple iterables
test2 = test**2
test2
>>>
array([[16, 64, 49],
       [49, 64,  9],
       [81,  1,  9],
       [49, 25, 16]])
for i, j in zip(test, test2):
    print(i,'+',j,'=',i+j)
>>>
[4 8 7] + [16 64 49] = [20 72 56]
[7 8 3] + [49 64  9] = [56 72 12]
[9 1 3] + [81  1  9] = [90  2 12]
[7 5 4] + [49 25 16] = [56 30 20]
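Looping row by row is fine for printing, but the same arithmetic can usually be written as a single array expression. The sketch below hard-codes the sample values of test shown above (the original came from np.random.randint) and also shows np.ndenumerate, one way to loop over individual elements together with their indices.

```python
import numpy as np

# fixed copy of the random sample shown earlier, so the output is repeatable
test = np.array([[4, 8, 7],
                 [7, 8, 3],
                 [9, 1, 3],
                 [7, 5, 4]])
test2 = test ** 2

# the zip loop adds matching rows; the array expression adds everything at once
row_sums = [i + j for i, j in zip(test, test2)]
total = test + test2
print(np.array_equal(np.array(row_sums), total))   # True

# np.ndenumerate yields ((row, col), value) pairs for element-wise loops
found = None
for (i, j), value in np.ndenumerate(test):
    if value == 9:
        found = (i, j)
print("found 9 at", found)   # found 9 at (2, 0)
```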