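To start, here is the current setup on my machine:
- Hardware
  - Dell workstation
  - ASUS 1080 Ti, 11 GB of VRAM
- Software
  - Anaconda 2 installed
  - CUDA installed
  - cuDNN installed
  - TensorFlow and PyTorch installed, each in its own Anaconda virtual environment
The goal now is to install Caffe into a new Anaconda virtual environment, so that training a network on a given framework only requires activating the matching environment. The Caffe source used here was partially modified by someone else, and running his code requires compiling his modified Caffe source.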
Installing the dependencies
The official page http://caffe.berkeleyvision.org/installation.html lists the libraries needed to compile Caffe.
First activate your Caffe virtual environment:
source activate gcf_caffe
CUDA is already installed, so skip it and install the remaining dependencies:
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
Then install:
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install libatlas-dev
sudo apt-get install liblapack-dev
sudo apt-get install libatlas-base-dev
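Reference: https://blog.csdn.net/u011974639/article/details/78804299
If the installation reports "Unable to locate package", the package sources need to be updated:
sudo apt-get update
While updating, apt may complain that the repository "http://ppa.launchpad.net/fcitx-team/nightly/ubuntu xenial Release" does not have a Release file.
To fix this, change to the directory that holds the PPA list files:
cd /etc/apt/sources.list.d
and rename the offending list file so that apt ignores it:
sudo mv fcitx-team-ubuntu-nightly-xenial.list fcitx-team-ubuntu-nightly-xenial.list.bak
Reference: https://www.cnblogs.com/wenzheshen/p/6599636.html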
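The rename step above can be wrapped in a small function so that it is easy to try on a scratch directory first. This is only a sketch: disable_apt_source is a hypothetical helper name, and the function does nothing beyond the mv shown above plus an existence check.

```shell
# Hypothetical helper: disable one apt source list by renaming it to *.bak.
# Usage: disable_apt_source <sources_dir> <list_file_name>
disable_apt_source() {
    dir="$1"
    name="$2"
    # Refuse to continue if the list file is not there.
    if [ ! -f "$dir/$name" ]; then
        echo "not found: $dir/$name" >&2
        return 1
    fi
    # apt only reads files ending in .list, so the .bak copy is ignored.
    mv "$dir/$name" "$dir/$name.bak"
    echo "disabled: $name"
}
```

On the real /etc/apt/sources.list.d the rename needs root privileges, and sudo apt-get update has to be run again afterwards.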
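Configuring Caffe's Makefile.config
Most build problems come from mistakes in this file, for example OpenCV conflicts or protobuf conflicts. It needs to be set up carefully, although there is not much in it.
Here is the exact configuration used on my machine: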
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1
# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++
# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /opt/OpenBLAS/include
# BLAS_LIB := /opt/OpenBLAS/lib
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# PYTHON_INCLUDE := /usr/include/python2.7 \
# /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/anaconda2/envs/gcf_caffe
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
$(ANACONDA_HOME)/include/python2.7 \
$(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
# /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
# PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib
LINKFLAGS := -Wl,-rpath,$(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1
# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1
# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
# enable pretty build (comment to see full commands)
Q ?= @
Three points to note:
- All my dependencies are installed in the Caffe virtual environment gcf_caffe, hence
ANACONDA_HOME := $(HOME)/anaconda2/envs/gcf_caffe
- Set the hdf5 paths:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
- Change one line in Caffe's Makefile.
Change: LIBRARIES += glog gflags protobuf boost_system boost_filesystem m
to: LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial
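Before adding hdf5_serial_hl and hdf5_serial to LIBRARIES, it may help to confirm that libraries with those names exist in one of the LIBRARY_DIRS. A minimal sketch, assuming the files are named lib<name>.so; check_libs is a hypothetical helper, not part of Caffe.

```shell
# Report which of the given library names exist as lib<name>.so in a directory.
# Usage: check_libs <lib_dir> <name>...
check_libs() {
    lib_dir="$1"
    shift
    for name in "$@"; do
        if [ -e "$lib_dir/lib$name.so" ]; then
            echo "found   lib$name.so"
        else
            echo "missing lib$name.so"
        fi
    done
}
```

For example, check_libs /usr/lib/x86_64-linux-gnu hdf5_serial hdf5_serial_hl checks one of the directories listed in LIBRARY_DIRS above.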
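protoc conflicts
A protoc conflict is very easy to run into. Because I already had a TensorFlow environment installed, there is a protoc 3.5 under /anaconda/bin/ and also a protoc 2.6 under the system /usr/bin/. When compiling Caffe, the protoc under /anaconda/bin is used by default, and the build fails with: error This file was generated by an older version of protoc which is
This means the protoc being used is too new.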
- There are several ways to install protobuf:
sudo apt-get install libprotobuf-dev protobuf-compiler # system-wide install on Linux
sudo pip install protobuf # install for Python 2.7
sudo pip3 install protobuf # install for Python 3.5
conda install protobuf # install into Anaconda
- To inspect the protobuf installations on the system:
whereis protoc # list the paths where protoc is installed
which protoc # show the protoc that is used by default
protoc --version # show the version of the default protoc
sudo protoc --version # show the version of the system protoc
Reference: https://blog.csdn.net/m0_38082419/article/details/80117132
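The which and whereis commands above each show only part of the picture. The sketch below walks a PATH-style string and prints every protoc it finds, in lookup order, so the first line is the one a bare protoc call would run. list_protoc is a hypothetical helper name.

```shell
# List every protoc executable on a PATH-style string, in lookup order.
# Usage: list_protoc "$PATH"
list_protoc() {
    old_ifs="$IFS"
    IFS=':'
    # With IFS set to ':', the unquoted $1 splits into one directory per word.
    for dir in $1; do
        if [ -n "$dir" ] && [ -x "$dir/protoc" ] && [ ! -d "$dir/protoc" ]; then
            echo "$dir/protoc"
        fi
    done
    IFS="$old_ifs"
}
```

Running list_protoc "$PATH" inside the activated environment shows whether the Anaconda protoc shadows /usr/bin/protoc.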
In my case the default protoc is too new, so all I need is to use an older one. Edit Caffe's Makefile and change the following lines:
$(Q)protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<
$(Q)protoc --proto_path=$(PROTO_SRC_DIR) --python_out=$(PY_PROTO_BUILD_DIR) $<
to:
$(Q)/usr/bin/protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<
$(Q)/usr/bin/protoc --proto_path=$(PROTO_SRC_DIR) --python_out=$(PY_PROTO_BUILD_DIR) $<
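The same edit can be scripted instead of done by hand. This is a sketch under the assumption that the recipe lines look exactly like the ones quoted above, that is, $(Q)protoc followed by a space; pin_protoc is a hypothetical helper name, and it keeps a copy of the untouched file as <makefile>.orig.

```shell
# Rewrite bare `protoc` recipe calls in a Makefile to an absolute protoc path.
# Usage: pin_protoc <makefile> <absolute_protoc_path>
pin_protoc() {
    makefile="$1"
    protoc_path="$2"
    # Keep the original so the change is easy to undo.
    cp "$makefile" "$makefile.orig"
    # \$ matches a literal dollar sign; ( and ) are literal in basic regex.
    sed 's|\$(Q)protoc |$(Q)'"$protoc_path"' |' "$makefile.orig" > "$makefile"
}
```

For the change described above the call would be pin_protoc Makefile /usr/bin/protoc, run from the Caffe source directory.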
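One more question: how do you find out which protoc version Caffe needs?
Open caffe.pb.h, the file that fails to compile; it contains the relevant error message.
If a build fails, remember to run sudo make clean before rebuilding.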
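Headers generated by protoc normally start with a guard of the form #if GOOGLE_PROTOBUF_VERSION < N, where N encodes a version as major*1000000 + minor*1000 + patch. The sketch below decodes that number, which tells you which protoc produced the header. Both function names are hypothetical, and the guard format is an assumption worth checking against your own caffe.pb.h.

```shell
# Decode a protobuf version integer (major*1000000 + minor*1000 + patch).
# Usage: decode_protobuf_version 2006001
decode_protobuf_version() {
    n="$1"
    echo "$((n / 1000000)).$((n / 1000 % 1000)).$((n % 1000))"
}

# Print the version found in the first "#if GOOGLE_PROTOBUF_VERSION < N" guard.
# Usage: header_protobuf_version path/to/caffe.pb.h
header_protobuf_version() {
    n=$(sed -n 's/^#if GOOGLE_PROTOBUF_VERSION < \([0-9][0-9]*\).*/\1/p' "$1" | head -n 1)
    # Fail if the header has no such guard.
    [ -n "$n" ] || return 1
    decode_protobuf_version "$n"
}
```

Comparing this result with protoc --version and with the installed protobuf headers shows which side is out of step.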
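Building
- sudo make all -j8
- sudo make test -j4
- make runtest -j4
- sudo make pycaffe
- import caffe
If this reports ImportError: No module named caffe, the python directory under Caffe needs to be added to the environment. Open ~/.bashrc with
sudo vim ~/.bashrc
and add this as the last line:
export PYTHONPATH="/home/ilab-gcf/桌面/CASENet_Codes/caffe/python:$PYTHONPATH"
Replace the path with your own, and remember to run source ~/.bashrc afterwards. Otherwise Caffe can only be imported when Python is started from that directory.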
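Editing ~/.bashrc by hand makes it easy to add the same export twice. The sketch below appends the line only if it is not already there. add_caffe_pythonpath is a hypothetical helper name, and the directory passed to it stands in for your own caffe/python path.

```shell
# Append a PYTHONPATH export for <caffe_python_dir> to <rc_file>, once only.
# Usage: add_caffe_pythonpath <rc_file> <caffe_python_dir>
add_caffe_pythonpath() {
    rc_file="$1"
    py_dir="$2"
    # \$PYTHONPATH stays unexpanded so the rc file expands it at login time.
    line="export PYTHONPATH=\"$py_dir:\$PYTHONPATH\""
    touch "$rc_file"
    # -x matches whole lines, -F treats the line as a fixed string.
    if grep -qxF -- "$line" "$rc_file"; then
        echo "already present"
    else
        printf '%s\n' "$line" >> "$rc_file"
        echo "added"
    fi
}
```

After calling it on ~/.bashrc, source ~/.bashrc is still needed for the current shell.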
If the error is
can not find module skimage.io
install scikit-image:
conda install scikit-image
If the error is No module named google.protobuf.internal, install protobuf:
pip install protobuf
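Using Caffe in PyCharm
PyCharm does not read .bashrc, so import caffe can work on the command line and still fail inside PyCharm. The fix is to launch PyCharm from a bash shell so that it inherits the environment variables set in .bashrc.
From the bin folder of the PyCharm installation, run:
sh ./pycharm.sh
Reference: https://www.cnblogs.com/darkknightzh/p/5896446.html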
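Summary
This part is just rambling and can be skipped. Building Caffe cost me a lot of time. Whenever I hit an error I wanted to give up, I worked in fits and starts, and I still tend to run away from problems. That is a bad habit; the road has to be walked one step at a time. Advice to myself:
- Set a small goal every day and work through it steadily.
- Try more often to see things from the other side.
- Build up resilience.
- Don't be aloof.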
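The following was added on May 24, 2019:
Background: I needed to set up Caffe on a new machine with a 2080 Ti, CUDA 10.0 and cuDNN 7.3.1. I tried many approaches and it took most of a day; keeping a good attitude matters.
- libopencv_core.so needs libcudart.so.9.0, while libcaffe.so needs libcudart.so.10.0?
With help from an expert, I ended up with two copies of libcudart in the virtual environment (in practice, one was simply copied over from another environment).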
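To see which libcudart versions a set of libraries asks for, the output of ldd can be filtered. A minimal sketch; cudart_versions is a hypothetical helper that reads ldd output on standard input.

```shell
# Read `ldd` output on stdin and print each distinct libcudart soname in it.
# Usage: ldd <some_library.so> | cudart_versions
cudart_versions() {
    grep -o 'libcudart\.so\.[0-9][0-9.]*' | sort -u
}
```

Piping the ldd output of libcaffe.so and of libopencv_core.so through it shows the two versions side by side.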
- An error like this:
CXX/LD -o .build_release/examples/cifar10/convert_cifar_data.bin
CXX/LD -o .build_release/examples/cpp_classification/classification.bin
.build_release/tools/convert_imageset.o: In function `std::string* google::MakeCheckOpString<unsigned long, int>(unsigned long const&, int const&, char const*)':
convert_imageset.cpp:(.text._ZN6google17MakeCheckOpStringImiEEPSsRKT_RKT0_PKc[_ZN6google17MakeCheckOpStringImiEEPSsRKT_RKT0_PKc]+0x50): undefined reference to `google::base::CheckOpMessageBuilder::NewString()'
.build_release/tools/convert_imageset.o: In function `main':
convert_imageset.cpp:(.text.startup+0x347): undefined reference to `google::SetUsageMessage(std::string const&)'
convert_imageset.cpp:(.text.startup+0xd2a): undefined reference to `google::protobuf::MessageLite::SerializeToString(std::string*) const'
.build_release/lib/libcaffe.so: undefined reference to `google::protobuf::Message::InitializationErrorString() const'
.build_release/lib/libcaffe.so: undefined reference to `google::protobuf::internal::WireFormatLite::WriteStringMaybeAliased(int, std::string const&, google::protobuf::io::CodedOutputStream*)'
....
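Check the current compiler version with gcc --version. Mine was 4.8, so I upgraded it to 5.4. Thanks to this blogger's article: https://blog.csdn.net/chenshuibiao/article/details/78734957
The undefined references all involve std::string, which is consistent with a libstdc++ ABI mismatch between code compiled by gcc 4.8 and libraries built with gcc 5 or later.
How to upgrade or switch GCC:
Thanks to this blogger's article: https://www.cnblogs.com/loadofleaf/p/5667989.html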
- Add the repository
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
- Install the required package (only needed if an error occurs)
sudo apt-get install software-properties-common
- Install the new compiler
sudo apt-get install gcc-5 g++-5
- Update the symlinks
sudo ln -s /usr/bin/gcc-5 /usr/bin/gcc -f
sudo ln -s /usr/bin/g++-5 /usr/bin/g++ -f
Important:
Compile Caffe with gcc 5.0 or later and protobuf below 3.0 (2.6.1 recommended).
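The gcc requirement above can be checked before starting a long build. A minimal sketch; gcc_at_least_5 is a hypothetical helper that takes a dotted version string such as the one printed by gcc -dumpversion.

```shell
# Succeed if a dotted gcc version string has a major version of 5 or more.
# Usage: gcc_at_least_5 "$(gcc -dumpversion)"
gcc_at_least_5() {
    # Strip everything from the first dot onward, leaving the major number.
    major="${1%%.*}"
    [ "$major" -ge 5 ]
}
```

For example, gcc_at_least_5 "$(gcc -dumpversion)" || echo "gcc is too old for this build" prints a warning on a machine that still has 4.8.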
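- Error: ImportError: dynamic module does not define module export function (PyInit__caffe)
Do not use Python 3.5 or later; 2.7 is recommended.
- Error: .build_release/lib/libcaffe.so: undefined reference to `cv::imread(cv::String const&, int)'
OpenCV on this machine turned out to be 3.0, so uncomment OPENCV_VERSION := 3 in Makefile.config.
- Invisible errors are the deadliest!
Sometimes, after the environment has been set up successfully, a later rebuild fails, probably because some libraries in the Anaconda virtual environment conflict. In that case I usually delete the Anaconda virtual environment, create a new one and build again.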