Common pandas functions:
In general, we use the following abbreviations:
df: any pandas DataFrame object
s: any pandas Series object
Importing the modules:
import pandas as pd
import numpy as np
https://www.cnblogs.com/ly803744/p/10468307.html
Series
https://blog.csdn.net/kingov/article/details/79513322
Pipeline
https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html
https://blog.csdn.net/qq_19528953/article/details/79348929
- fillna() function
https://blog.csdn.net/qq_21840201/article/details/81008566
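A minimal sketch of fillna() on a Series and on a DataFrame. The column names and values are made up for illustration; they do not come from the linked article.

```python
import numpy as np
import pandas as pd

# Hypothetical example data with missing values.
df = pd.DataFrame({"age": [22.0, np.nan, 35.0], "city": ["A", None, "B"]})

# Series: replace missing values with one constant.
age_zero = df["age"].fillna(0)

# DataFrame: pass a dict to give each column its own fill value.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})
```

fillna() returns a new object by default, so df itself still contains the missing values.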
3. One-hot encoding
https://blog.csdn.net/yueyao121107/article/details/79730934
4. pandas get_dummies
https://www.jianshu.com/p/c324f4101785
get_dummies converts a categorical variable into 0/1 indicator columns, one column per distinct value.
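A small sketch of what that looks like, using a made-up "color" column. dtype=int is passed explicitly because recent pandas versions return boolean columns by default.

```python
import pandas as pd

# Hypothetical categorical column with two distinct values.
df = pd.DataFrame({"color": ["red", "green", "red"]})

# One indicator column per distinct value, encoded as 0/1 integers.
dummies = pd.get_dummies(df["color"], dtype=int)
```

Here dummies has the columns green and red, and each row has a 1 in exactly one of them.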
5. pandas.DataFrame.notnull
http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.notnull.html
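Marks the non-null values.
notnull() returns a boolean object of the same shape as the input (True where the value is not missing). Indexing with a boolean Series, e.g. df[df['id'].notnull()], then returns the rows whose id is non-null.
isnull() works the same way, but marks the missing values instead.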
https://blog.csdn.net/waiwai3/article/details/80736714
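A short sketch of the mask-then-filter pattern described in this item. The DataFrame and its "id" column are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing id and one missing name.
df = pd.DataFrame({"id": [1.0, np.nan, 3.0], "name": ["a", "b", None]})

# Boolean mask with the same shape as df.
mask = df.notnull()

# Keep only the rows where id is present / where id is missing.
rows_with_id = df[df["id"].notnull()]
rows_missing_id = df[df["id"].isnull()]
```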
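6. pandas.DataFrame.as_matrix
DataFrame.as_matrix([columns]) converts the DataFrame to a NumPy array. It was deprecated in pandas 0.23 and removed in 1.0; on current versions use df[columns].to_numpy() (or df[columns].values) instead.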
https://blog.csdn.net/hhtnan/article/details/80080240
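On pandas versions where as_matrix is no longer available, the same conversion can be sketched with to_numpy(). The column names are hypothetical.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one text column.
df = pd.DataFrame({"age": [22, 35], "score": [88.5, 92.0], "name": ["x", "y"]})

# Select the columns first, then convert to a 2-D NumPy array.
# Mixing int and float columns yields a float64 array.
arr = df[["age", "score"]].to_numpy()
```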
7. pandas.DataFrame.loc
http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
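iloc ("integer location") selects by integer position, so its arguments are integers, e.g. df.iloc[10:20, 3:5]. The end of a slice is excluded, as in ordinary Python slicing.
loc selects by row (index) labels and column names, and label slices include both endpoints, e.g.
df.loc['image1':'image10', 'age':'score']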
https://blog.csdn.net/missyougoon/article/details/83375375
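A minimal sketch contrasting the two indexers on a tiny DataFrame. The labels mirror the image/age/score names used above; the "height" column and all values are invented.

```python
import pandas as pd

# Hypothetical data indexed by string labels.
df = pd.DataFrame(
    {"age": [20, 31, 42], "height": [170, 165, 180], "score": [88, 92, 79]},
    index=["image1", "image2", "image3"],
)

# iloc: integer positions, end of the slice excluded.
by_position = df.iloc[0:2, 1:3]

# loc: labels, both ends of the slice included.
by_label = df.loc["image1":"image2", "age":"height"]
```

by_position holds the height and score columns of the first two rows, while by_label holds age and height for image1 through image2 inclusive.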
8. groupby()
https://www.zhangshengrong.com/p/q0Xpqj4j1K/
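A minimal groupby() sketch for quick reference: split the rows by a key column, then aggregate each group. The "team" and "score" columns are hypothetical.

```python
import pandas as pd

# Hypothetical data: two teams with two scores each.
df = pd.DataFrame({"team": ["a", "b", "a", "b"], "score": [1, 2, 3, 4]})

# Group rows by team and sum the scores within each group.
totals = df.groupby("team")["score"].sum()
```

The result is a Series indexed by the group keys.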