I previously ran im2txt on a Mac.
This time I am trying it on an Ubuntu virtual machine and writing down the steps.
1. Get the code
git clone https://github.com/tensorflow/models.git
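2. Set up the environment
My setup: Ubuntu 14.04 + GPU
TensorFlow 1.0.1
Related paper: "Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge"
It was open-sourced just last September: GitHub
Following the GitHub README, install the prerequisites first.
1: Bazel
Following the official site: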
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel
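This failed with:
Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
 bazel : Depends: google-jdk but it is not installable or
                  java8-jdk but it is not installable or
                  java8-sdk but it is not installable or
                  oracle-java8-installer but it is not installable
E: Unable to correct problems, you have held broken packages.
I tried countless fixes from the web and switched package sources several times, and none of it helped, until I noticed this line on the official site: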
If you want to use the JDK 7, please replace jdk1.8 with jdk1.7 and
if you want to install the testing version of Bazel, replace stable with testing.
The cause is most likely that my system is Ubuntu 14.04, which uses JDK 7, while the jdk1.8 repository requires a Java 8 package:
$ cat /etc/issue
Ubuntu 14.04.5 LTS \n \l
$ update-java-alternatives -l
java-1.7.0-openjdk-amd64 1071 /usr/lib/jvm/java-1.7.0-openjdk-amd64
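The choice between the two repository flavors follows from the installed Java version. Here is a minimal sketch of that decision. The function name `bazel_apt_flavor` is hypothetical, and it assumes the version string has the `1.7.0_xx` / `1.8.0_xx` form that JDK 7 and JDK 8 report.

```shell
# Hypothetical helper: map a Java version string to the Bazel apt flavor
# named in the official instructions quoted above (jdk1.7 or jdk1.8).
bazel_apt_flavor() {
    case "$1" in
        1.7.*) echo "jdk1.7" ;;
        1.8.*) echo "jdk1.8" ;;
        *)     echo "unknown"; return 1 ;;
    esac
}

# Example: build the sources.list line for a JDK 7 system.
# The version string here is an assumed sample value.
flavor="$(bazel_apt_flavor "1.7.0_131")"
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable ${flavor}"
```

The printed line is what would go into /etc/apt/sources.list.d/bazel.list.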
# Continue following the official instructions, this time with jdk1.7
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.7" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel
$ sudo apt-get upgrade bazel
$ wget https://github.com/bazelbuild/bazel/releases/download/0.5.0/bazel-0.5.0-installer-linux-x86_64.sh
$ chmod +x bazel-0.5.0-installer-linux-x86_64.sh
$ ./bazel-0.5.0-installer-linux-x86_64.sh --user
$ export PATH="$PATH:$HOME/bin"
# Check that the installation succeeded
root@cf644f163c6d:~# $HOME/bin/bazel version
Extracting Bazel installation...
Build label: 0.5.0
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri May 26 12:11:50 2017 (1495800710)
Build timestamp: 1495800710
Build timestamp as int: 1495800710
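If you want to script this check instead of reading the output by eye, something like the following might work. It is only a sketch: `bazel_build_label` is a hypothetical helper, and the sample text is copied from the output above rather than taken from a live `bazel version` call.

```shell
# Hypothetical helper: read `bazel version` output on stdin
# and print only the value of the "Build label:" line.
bazel_build_label() {
    sed -n 's/^Build label: //p'
}

# Sample output copied from the run above; in practice you would pipe in
# the real command, e.g.  "$HOME/bin/bazel" version | bazel_build_label
sample='Build label: 0.5.0
Build time: Fri May 26 12:11:50 2017 (1495800710)'

label="$(printf '%s\n' "$sample" | bazel_build_label)"
if [ "$label" = "0.5.0" ]; then
    echo "Bazel $label is installed"
else
    echo "unexpected Bazel version: $label" >&2
fi
```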
Next, install the other two libraries, NumPy and NLTK.
2: NumPy
Official installation documentation:
[https://www.scipy.org/install.html](https://www.scipy.org/install.html)
$ python -m pip install --upgrade pip
$ pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose
Test:
$ python
>>> import scipy
>>> import numpy
>>> scipy.test()
>>> numpy.test()
Some posts online say you can also install them this way. I am not sure how this differs from the method at the link given on GitHub:
$ sudo apt-get install python-scipy
$ sudo apt-get install python-numpy
$ sudo apt-get install python-matplotlib
3: NLTK
Natural Language Toolkit (NLTK):
First install NLTK:
[http://www.nltk.org/install.html](http://www.nltk.org/install.html)
$ sudo pip install -U nltk
$ sudo pip install -U numpy
$ python
>>> import nltk
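The interactive import checks above can also be run in one go from the shell. This is a rough sketch, not part of the official instructions. `check_py_modules` is a hypothetical helper, and `PYTHON` is an assumed variable naming the interpreter to test (it defaults to `python`).

```shell
# Hypothetical helper: try to import each named module and report the result.
# Returns non-zero if at least one module cannot be imported.
check_py_modules() {
    missing=0
    for m in "$@"; do
        if "${PYTHON:-python}" -c "import $m" >/dev/null 2>&1; then
            echo "ok: $m"
        else
            echo "missing: $m"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (run manually once the installs above have finished):
#   check_py_modules numpy scipy nltk
```

A non-zero exit status makes it easy to stop a setup script before the Bazel build if a library is missing.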
Continue with the steps from the GitHub README:
# Location to save the MSCOCO data.
MSCOCO_DIR="${HOME}/im2txt/data/mscoco"
# Build the preprocessing script. The cloned models directory must first be renamed to tensorflow-models.
cd tensorflow-models/im2txt
$HOME/bin/bazel build //im2txt:download_and_preprocess_mscoco
My output was:
root@cf644f163c6d:~/tensorflow-models/im2txt# /root/bin/bazel build //im2txt:download_and_preprocess_mscoco
..................
INFO: Found 1 target...
Target //im2txt:download_and_preprocess_mscoco up-to-date:
bazel-bin/im2txt/download_and_preprocess_mscoco
INFO: Elapsed time: 5.003s, Critical Path: 0.03s
The third step takes a while, because it downloads about 13 GB of files.
# Run the preprocessing script.
bazel-bin/im2txt/download_and_preprocess_mscoco "${MSCOCO_DIR}"
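Since the download is large, it may be worth checking free disk space before starting. The sketch below is one way to do that. `has_free_kb` is a hypothetical helper, and the threshold uses only the 13 GB figure mentioned above, so the unpacked and preprocessed data will likely need more room than this check asks for.

```shell
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 kilobytes available (column 4 of POSIX `df -Pk`).
has_free_kb() {
    avail="$(df -Pk "$1" | awk 'NR==2 {print $4}')"
    [ "$avail" -ge "$2" ]
}

# Assumed threshold: 13 GB for the download alone, expressed in KB.
NEEDED_KB=$((13 * 1024 * 1024))

# MSCOCO_DIR may not exist yet, so check the home directory that contains it.
if has_free_kb "${HOME:-/}" "$NEEDED_KB"; then
    echo "enough space for the MSCOCO download"
else
    echo "less than 13 GB free under ${HOME:-/}" >&2
fi
```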