While using scikit-learn's random forest interface, the prediction statement failed with ValueError: buffer source array is read-only
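from sklearn.ensemble import RandomForestClassifier
# X_train, y_train and X_test are assumed to be loaded already
clf = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
clf.fit(X_train, y_train)
preds = clf.predict(X_test)  # the ValueError is raised at prediction time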
Solution:
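Judging from the error message, this may be a Cython-related error: the traceback ends inside the compiled extension _tree.cpython-36m-x86_64-linux-gnu.so. See some of the related issue discussions on GitHub, as well as this one (Figure 1).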
Check the package versions that pandas reports as installed:
import pandas as pd
pd.show_versions()
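pd.show_versions() prints a long list. To look only at the packages involved here, a minimal sketch using the standard library's importlib.metadata could look like the following. It assumes Python 3.8 or newer (the traceback in this post comes from Python 3.6, where this module is not available), and installed_version is a hypothetical helper name, not part of pandas or scikit-learn.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(package):
    """Return the installed version of `package` as a string, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper, used here only to list the packages relevant to this error.
for name in ("Cython", "scikit-learn", "numpy", "pandas", "joblib"):
    print(f"{name}: {installed_version(name)}")
```

A package that is not installed shows up as None, which matches how pd.show_versions() reports it.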
Cython was originally listed as None, so I tried installing Cython, following the official documentation (English, Chinese):
pip install --user Cython
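Once the installation finishes, the Cython version is immediately visible from the running Jupyter notebook (see Figure 2). However, Jupyter Notebook must be restarted! Without a restart the change does not take effect. I lost an hour to exactly this, convinced the whole time that the problem was the format or size of my data.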
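If the data is still under suspicion after the restart, one thing that can be checked directly is whether the input array is read-only, since the error message is raised when a memoryview is created over a read-only buffer. The following is a minimal sketch, assuming the input is (or can be converted to) a NumPy array. ensure_writeable and X_demo are hypothetical names for illustration, and whether a read-only input was the cause in this particular case is not confirmed.

```python
import numpy as np


def ensure_writeable(arr):
    """Return `arr` unchanged if it is writeable, otherwise a writeable copy."""
    arr = np.asarray(arr)
    if arr.flags.writeable:
        return arr
    return arr.copy()  # a fresh copy owns its data and is writeable


# Stand-in for X_test: a small array deliberately marked read-only.
X_demo = np.arange(6, dtype=np.float32).reshape(3, 2)
X_demo.setflags(write=False)

X_fixed = ensure_writeable(X_demo)
print(X_demo.flags.writeable, X_fixed.flags.writeable)  # False True
```

In the snippet at the top of the post, the equivalent check would be X_test.flags.writeable if X_test is a NumPy array; the copy costs extra memory, so it is only worth making when the flag is actually False.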
Full error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-27-e487f74f048d> in <module>
----> 1 y_pred_rt = pipeline.predict_proba(nd_X_test)[:, 1]
      2 fpr_rt_lm, tpr_rt_lm, _ = roc_curve(nd_y_test, y_pred_rt)

~/.local/lib/python3.6/site-packages/sklearn/utils/metaestimators.py in <lambda>(*args, **kwargs)
    114
    115         # lambda, but not partial, allows help() to work with update_wrapper
--> 116         out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
    117         # update the docstring of the returned function
    118         update_wrapper(out, self.fn)

~/.local/lib/python3.6/site-packages/sklearn/pipeline.py in predict_proba(self, X)
    469         Xt = X
    470         for _, name, transform in self._iter(with_final=False):
--> 471             Xt = transform.transform(Xt)
    472         return self.steps[-1][-1].predict_proba(Xt)
    473

~/.local/lib/python3.6/site-packages/sklearn/ensemble/_forest.py in transform(self, X)
   2251         """
   2252         check_is_fitted(self)
-> 2253         return self.one_hot_encoder_.transform(self.apply(X))

~/.local/lib/python3.6/site-packages/sklearn/ensemble/_forest.py in apply(self, X)
    226                            **_joblib_parallel_args(prefer="threads"))(
    227             delayed(tree.apply)(X, check_input=False)
--> 228             for tree in self.estimators_)
    229
    230         return np.array(results).T

~/.local/lib/python3.6/site-packages/joblib/parallel.py in __call__(self, iterable)
   1002             # remaining jobs.
   1003             self._iterating = False
-> 1004             if self.dispatch_one_batch(iterator):
   1005                 self._iterating = self._original_iterator is not None
   1006

~/.local/lib/python3.6/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    833                 return False
    834             else:
--> 835                 self._dispatch(tasks)
    836                 return True
    837

~/.local/lib/python3.6/site-packages/joblib/parallel.py in _dispatch(self, batch)
    752         with self._lock:
    753             job_idx = len(self._jobs)
--> 754             job = self._backend.apply_async(batch, callback=cb)
    755             # A job can complete so quickly than its callback is
    756             # called before we get here, causing self._jobs to

~/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    207     def apply_async(self, func, callback=None):
    208         """Schedule a func to be run"""
--> 209         result = ImmediateResult(func)
    210         if callback:
    211             callback(result)

~/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    588         # Don't delay the application, to avoid keeping the input
    589         # arguments in memory
--> 590         self.results = batch()
    591
    592     def get(self):

~/.local/lib/python3.6/site-packages/joblib/parallel.py in __call__(self)
    254         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    255             return [func(*args, **kwargs)
--> 256                     for func, args, kwargs in self.items]
    257
    258     def __len__(self):

~/.local/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0)
    254         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    255             return [func(*args, **kwargs)
--> 256                     for func, args, kwargs in self.items]
    257
    258     def __len__(self):

~/.local/lib/python3.6/site-packages/sklearn/tree/_classes.py in apply(self, X, check_input)
    471         check_is_fitted(self)
    472         X = self._validate_X_predict(X, check_input)
--> 473         return self.tree_.apply(X)
    474
    475     def decision_path(self, X, check_input=True):

sklearn/tree/_tree.pyx in sklearn.tree._tree.Tree.apply()

sklearn/tree/_tree.pyx in sklearn.tree._tree.Tree.apply()

sklearn/tree/_tree.pyx in sklearn.tree._tree.Tree._apply_dense()

~/.local/lib/python3.6/site-packages/sklearn/tree/_tree.cpython-36m-x86_64-linux-gnu.so in View.MemoryView.memoryview_cwrapper()

~/.local/lib/python3.6/site-packages/sklearn/tree/_tree.cpython-36m-x86_64-linux-gnu.so in View.MemoryView.memoryview.__cinit__()

ValueError: buffer source array is read-only
Screenshot of the error: