Pandas_Select_Data_Boolean
import pandas as pd
import numpy as np
iris = pd.read_csv('iris.csv')
iris.head(2)
out:
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
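Another common operation is filtering data with a boolean vector.
The operators are: | for or, & for and, and ~ for not.
Each condition must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as > and <.
Indexing a Series with a boolean vector works exactly as it does for a NumPy ndarray: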
iris[iris.sepal_length>7]
out:
sepal_length sepal_width petal_length petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
iris.loc[iris.sepal_length>7]
out:
sepal_length sepal_width petal_length petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
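The two outputs above are identical because both `[]` and `.loc` accept a boolean mask as a row filter. The following is a minimal sketch of that equivalence; the small `df` is a hypothetical stand-in for a few rows of iris.csv, and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of iris.csv
df = pd.DataFrame({
    "sepal_length": [5.1, 7.1, 7.6, 4.9],
    "sepal_width": [3.5, 3.0, 3.0, 3.0],
    "species": ["setosa", "virginica", "virginica", "setosa"],
})

mask = df["sepal_length"] > 7      # boolean Series sharing df's index

by_brackets = df[mask]             # row filter via []
by_loc = df.loc[mask]              # same row filter via .loc

same_rows = by_brackets.equals(by_loc)

# .loc also accepts a column selector alongside the mask
widths = df.loc[mask, "sepal_width"]
```

One reason to prefer `.loc` is the last line: rows and columns can be selected in a single step.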
& (and)
iris[(iris.sepal_length>7) & (iris.sepal_width<3)]
out:
sepal_length sepal_width petal_length petal_width species
107 7.3 2.9 6.3 1.8 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
130 7.4 2.8 6.1 1.9 virginica
iris.loc[(iris.sepal_length>7) & (iris.sepal_width<3)]
out:
sepal_length sepal_width petal_length petal_width species
107 7.3 2.9 6.3 1.8 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
130 7.4 2.8 6.1 1.9 virginica
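The parentheses around each comparison are not optional. As a rough illustration, the sketch below runs the same filter with and without them on a small hypothetical `df` (not the real iris data); without parentheses Python groups `7 & df["sepal_width"]` first, and the expression fails.

```python
import pandas as pd

# Hypothetical stand-in data, not the real iris.csv
df = pd.DataFrame({
    "sepal_length": [7.3, 7.1, 5.0],
    "sepal_width": [2.9, 3.0, 2.5],
})

# Correct: each comparison is evaluated first, then the masks are combined
good = df[(df["sepal_length"] > 7) & (df["sepal_width"] < 3)]

# Without parentheses, Python reads the expression as
#   df["sepal_length"] > (7 & df["sepal_width"]) < 3
# which raises an exception instead of filtering
try:
    df[df["sepal_length"] > 7 & df["sepal_width"] < 3]
    failed = False
except (TypeError, ValueError):
    failed = True
```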
| (or)
iris[(iris.sepal_length>7) | (iris.sepal_width>4)]
out:
sepal_length sepal_width petal_length petal_width species
15 5.7 4.4 1.5 0.4 setosa
32 5.2 4.1 1.5 0.1 setosa
33 5.5 4.2 1.4 0.2 setosa
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
iris.loc[(iris.sepal_length>7) | (iris.sepal_width>4)]
out:
sepal_length sepal_width petal_length petal_width species
15 5.7 4.4 1.5 0.4 setosa
32 5.2 4.1 1.5 0.1 setosa
33 5.5 4.2 1.4 0.2 setosa
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
~ (not)
iris[~(iris.sepal_length<=7)]
out:
sepal_length sepal_width petal_length petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
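Unary "-" can have the same effect on a boolean Series. Plain NumPy boolean arrays do not support unary "-", so "~" is the more portable choice.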
iris[-(iris.sepal_length<=7)]
out:
sepal_length sepal_width petal_length petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
iris.loc[~(iris.sepal_length<=7)]
out:
sepal_length sepal_width petal_length petal_width species
102 7.1 3.0 5.9 2.1 virginica
105 7.6 3.0 6.6 2.1 virginica
107 7.3 2.9 6.3 1.8 virginica
109 7.2 3.6 6.1 2.5 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
125 7.2 3.2 6.0 1.8 virginica
129 7.2 3.0 5.8 1.6 virginica
130 7.4 2.8 6.1 1.9 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
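Negation also applies to combined masks. The sketch below, on a small hypothetical `df`, shows `~` applied to an `|` expression, the equivalent form given by De Morgan's law, and `np.logical_not` as the function form of `~`. The names `long_` and `wide` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data, not the real iris.csv
df = pd.DataFrame({
    "sepal_length": [7.6, 5.7, 5.0, 7.2],
    "sepal_width": [3.0, 4.4, 3.0, 3.6],
})

long_ = df["sepal_length"] > 7
wide = df["sepal_width"] > 4

# Rows that are neither long nor wide
neither = df[~(long_ | wide)]

# De Morgan's law: not (a or b) == (not a) and (not b)
neither_alt = df[~long_ & ~wide]

# np.logical_not is the function form of ~ for boolean data
via_numpy = df[np.logical_not(long_ | wide)]
```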
Using map()
criterion = iris.sepal_length.map(lambda s: s>7.5)
iris[criterion]
out:
sepal_length sepal_width petal_length petal_width species
105 7.6 3.0 6.6 2.1 virginica
117 7.7 3.8 6.7 2.2 virginica
118 7.7 2.6 6.9 2.3 virginica
122 7.7 2.8 6.7 2.0 virginica
131 7.9 3.8 6.4 2.0 virginica
135 7.7 3.0 6.1 2.3 virginica
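For a simple threshold like this one, `map()` with a lambda produces the same mask as a direct comparison. A minimal sketch on hypothetical values; the second part assumes a made-up rule on string labels, to show a case where `map()` is the more natural tool.

```python
import pandas as pd

# Hypothetical values standing in for the sepal_length column
s = pd.Series([7.6, 7.1, 7.9, 5.0], name="sepal_length")

by_map = s.map(lambda v: v > 7.5)   # element-wise Python call
by_vector = s > 7.5                 # vectorized comparison

same_mask = bool((by_map == by_vector).all())

# map() is more useful when the test is not a plain comparison,
# e.g. a hypothetical rule on string labels
species = pd.Series(["setosa", "virginica", "versicolor"])
starts_with_v = species.map(lambda name: name.startswith("v"))
```

The vectorized form is usually the better default for numeric thresholds, since it avoids a Python-level call per element.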