sklearn.preprocessing
1. Encode categorical values as integer labels
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df_user3['user_id']=le.fit_transform(list(df_user3['user_id']))
le.classes_ shows the original user_id for each code (the array index is the code)
Build the code-to-user_id lookup table
d = pd.Series(le.classes_)
print(d)
d.to_csv('D:/huashuData/dianhui/dianhui_user_code.csv')
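The label-encoding steps above can be sketched on a small, self-contained example. The DataFrame contents and the names df, codes, mapping and restored are hypothetical and only for illustration; the real df_user3 is not shown in these notes.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data standing in for df_user3.
df = pd.DataFrame({"user_id": ["u42", "u07", "u42", "u13"]})

le = LabelEncoder()
codes = le.fit_transform(df["user_id"])  # integer code per row

# classes_ is sorted; position i holds the original value for code i.
mapping = pd.Series(le.classes_, name="user_id")
mapping.index.name = "code"
print(mapping)

# inverse_transform recovers the original IDs from the integer codes.
restored = le.inverse_transform(codes)
```

Keeping the fitted encoder (or the saved lookup table) around is what makes it possible to map codes back to the original IDs later.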
2. Binary (0/1) encoding of multi-valued categorical features
from sklearn.preprocessing import MultiLabelBinarizer
mlb=MultiLabelBinarizer()
mo = mo.join(pd.DataFrame(mlb.fit_transform(mo.pop('genres')),
                          columns=mlb.classes_,
                          index=mo.index))
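A minimal sketch of the same pattern on made-up data. It assumes each cell of 'genres' already holds a list of labels; the titles and genre values are hypothetical, since the real mo DataFrame is not shown in these notes. If the column holds delimited strings instead, they would need to be split into lists first.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical sample data: every 'genres' cell is a list of labels.
mo = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genres": [["Action", "Comedy"], ["Drama"], ["Comedy", "Drama"]],
})

mlb = MultiLabelBinarizer()

# pop() removes 'genres' from mo and returns it; fit_transform yields
# one 0/1 column per distinct label, ordered as in mlb.classes_.
encoded = pd.DataFrame(mlb.fit_transform(mo.pop("genres")),
                       columns=mlb.classes_,
                       index=mo.index)

# Reusing mo.index keeps the indicator rows aligned when joining back.
mo = mo.join(encoded)
print(mo)
```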