Wednesday, March 15, 2017

Installing keras

Keras is great. It makes building convolutional networks so much easier. I am using Keras with the TensorFlow backend, but it can also run on Theano. I tried three methods:

1. Install keras using conda
 $ conda install keras  
 Fetching package metadata ...........  
 PackageNotFoundError: Package not found: Conda could not find '  
--> Obviously it failed

2. Install keras using pip3
 $ pip3 install keras  
 ...  
 Successfully installed keras-2.0.0 pyyaml-3.12  
--> Can't import keras on python 3.6

3. Install keras using pip
 $ pip install keras  
 ...  
 Successfully installed keras-2.0.0 pyyaml-3.12  
--> Now I can import keras on python 3.6
  • Anaconda is great, but I still don't understand what you can and cannot install with conda
  • I am using Python 3.6, and I still don't understand whether I should use pip or pip3 for individual modules
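
One way to see what pip and pip3 each point at is to ask them. The sketch below is only an illustration, not part of the original install log: it assumes the launchers are named `pip` and `pip3` on PATH, and the helper names `describe_launcher` and `is_importable` are made up for this example.

```python
import importlib.util
import shutil
import subprocess
import sys


def describe_launcher(name):
    """Return (path, version line) for a pip launcher on PATH, or None if absent."""
    path = shutil.which(name)
    if path is None:
        return None
    # `pip --version` reports the site-packages directory and the Python
    # version that this particular launcher installs into.
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return path, result.stdout.strip()


def is_importable(module_name):
    """Return True if this interpreter can find the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("interpreter:", sys.executable)
for launcher in ("pip", "pip3"):
    print(launcher, "->", describe_launcher(launcher))
print("keras importable here:", is_importable("keras"))
```

If the two launchers report different site-packages directories, running `python -m pip install keras` with the interpreter you actually use avoids the ambiguity, because the package goes into whichever Python runs the command.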

Monday, March 13, 2017

Compile OpenCV 3.2 for Anaconda Python 3.6

I need OpenCV to do some image processing and visualization. OpenCV seems to be the equivalent of MATLAB's "Image Processing Toolbox". However, it turns out to be a huge headache to install OpenCV 3 in an Anaconda Python 3.6 environment because of dependency issues. After an extensive web search I found this site. I tested it on my MacBook Air and it seems to be working well.

1. Setup
Linux
 $ sudo apt install gcc g++ git libjpeg-dev libpng-dev libtiff5-dev libjasper-dev libavcodec-dev libavformat-dev libswscale-dev pkg-config cmake libgtk2.0-dev libeigen3-dev libtheora-dev libvorbis-dev libxvidcore-dev libx264-dev sphinx-common libtbb-dev yasm libfaac-dev libopencore-amrnb-dev libopencore-amrwb-dev libopenexr-dev libgstreamer-plugins-base1.0-dev libavutil-dev libavfilter-dev libavresample-dev

OSX
 $ brew install git cmake pkg-config jpeg libpng libtiff openexr eigen tbb

2. Download OpenCV (http://opencv.org/)
$ cd ~/Downloads
$ unzip opencv-3.2.0.zip
$ cd opencv-3.2.0

3. Cmake configuration: OpenCV for Python 3
$ mkdir release
$ cd release
$ sudo cmake -DBUILD_TIFF=ON -DBUILD_opencv_java=OFF -DWITH_CUDA=OFF -DENABLE_AVX=ON -DWITH_OPENGL=ON -DWITH_OPENCL=ON -DWITH_IPP=OFF -DWITH_TBB=ON -DWITH_EIGEN=ON -DWITH_V4L=ON -DWITH_VTK=OFF -DBUILD_TESTS=OFF -DBUILD_PERF_TESTS=OFF -DCMAKE_BUILD_TYPE=RELEASE -DBUILD_opencv_python2=OFF -DCMAKE_INSTALL_PREFIX=$(python3 -c "import sys; print(sys.prefix)") -DPYTHON3_EXECUTABLE=$(which python3) -DPYTHON3_INCLUDE_DIR=$(python3 -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") -DPYTHON3_PACKAGES_PATH=$(python3 -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") .. 

4. Compile and install OpenCV source
$ sudo make -j4
$ sudo make install

5. Fix a couple of things (for Linux only)
$ mv libstdc++.so.6 libstdc++.so.6.bak
$ mv libgomp.so.1 libgomp.so.1.bak

6. Test installation
$ python3
>>> import cv2
>>>
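
To double-check where the build ended up, the paths passed to cmake in step 3 can be printed from Python, together with the file that `import cv2` would load. This is a sketch under assumptions: it uses `sysconfig` rather than the `distutils` calls in the cmake line (`distutils` was removed in Python 3.12), the two usually agree but may not on every install, and the helper names are hypothetical.

```python
import importlib.util
import sys
import sysconfig


def cmake_python_paths():
    """Collect the interpreter paths that the cmake flags in step 3 ask for."""
    paths = sysconfig.get_paths()
    return {
        "CMAKE_INSTALL_PREFIX": sys.prefix,
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": paths["include"],
        "PYTHON3_PACKAGES_PATH": paths["purelib"],
    }


def module_location(name):
    """Return the file a module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


for flag, value in cmake_python_paths().items():
    print("-D{}={}".format(flag, value))
print("cv2 would load from:", module_location("cv2"))
```

If the cv2 location is not under the printed PYTHON3_PACKAGES_PATH, the import is probably picking up a different OpenCV build than the one just compiled.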

Thursday, March 9, 2017

Re-installing tensorflow

I finally understood why my friend recommended Anaconda. It's so much easier to install Anaconda, with all the goodies that come with it, than to install each component on an as-needed basis (e.g. Spyder, IPython, etc.). The only reason I didn't go with the Anaconda installation of tensorflow previously was that the official tensorflow documentation did not strongly recommend it. I used Anaconda to install tensorflow on my MacBook Air and have had no issues so far. So why not use it for the GPU box as well?

1. Install Anaconda (Python 3.6 at the time of this writing)

2. Create a conda environment named tensorflow to run Python 3.6:

$ conda create -n tensorflow

3. Activate the conda environment:

$ source activate tensorflow
  (tensorflow) username$ # Your prompt should change

4. Install tensorflow within the conda environment:

(tensorflow) username$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp36-cp36m-linux_x86_64.whl   

# This is for Python 3.6 with GPU support

I tried pip3 at the beginning and got an error. pip works without an issue. I still don't quite understand pip3 (python3) vs. pip (python2).

5. Verify that tensorflow is installed within Anaconda

(tensorflow) username$ ls /home/username/anaconda3/envs/tensorflow/bin/

activate  conda  deactivate


6. Install ipython and jupyter

(tensorflow) username$ conda install ipython


(tensorflow) username$ pip3 install jupyter  


7. Verify that ipython and jupyter are installed within Anaconda

(tensorflow) username$ ls /home/username/anaconda3/envs/tensorflow/bin/

2to3          easy_install-3.6  pydoc       python3.6-config   tclsh8.5
2to3-3.6      idle3             pydoc3      python3.6m         unxz
activate      idle3.6           pydoc3.6    python3.6m-config  wheel
conda         ipython           pygmentize  python3-config     wish8.5
c_rehash      ipython3          python      pyvenv             xz
deactivate    openssl           python3     pyvenv-3.6
easy_install  pip               python3.6   sqlite3 



Now I can use tensorflow in ipython within Anaconda!
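
The `ls` checks above can also be done from inside Python, which shows whether a command on PATH really comes from the active environment. A minimal sketch, assuming Linux or macOS and that it is run with the environment's own interpreter; `tool_in_env` is a made-up helper name.

```python
import os
import shutil
import sys


def tool_in_env(name, prefix=sys.prefix):
    """Return True if `name` on PATH resolves to a file under `prefix`."""
    path = shutil.which(name)
    if path is None:
        return False
    prefix = os.path.abspath(prefix)
    # commonpath equals the prefix only when the tool lives below it.
    return os.path.commonpath([os.path.abspath(path), prefix]) == prefix


for tool in ("python", "pip", "ipython", "jupyter"):
    print("{:8s} inside {}: {}".format(tool, sys.prefix, tool_in_env(tool)))
```

A `False` next to pip would mean that `pip install` is being served by some other Python installation, not by the activated environment.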

Friday, February 10, 2017

Installing tensorflow

This took 2 days. The issue was that I started out with a pip install, which could affect the existing Python (2.7) programs on the GPU box. I eventually went with a virtualenv install.

Install required (and some not strictly required) packages

$ sudo apt-get install openjdk-8-jdk git python-dev python3-dev python-numpy python3-numpy build-essential python-pip python3-pip python-virtualenv swig python-wheel libcurl3-dev

Create a Virtualenv environment in the directory ~/tensorflow:

$ virtualenv --system-site-packages ~/tensorflow
 

Activate the environment:

$ source ~/tensorflow/bin/activate

(tensorflow)$ 

Pick the right tensorflow binary package (Ubuntu/Linux 64-bit, GPU enabled, Python 3.5)

(tensorflow)$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp35-cp35m-linux_x86_64.whl


Install tensorflow
 
(tensorflow)$ pip3 install --upgrade $TF_BINARY_URL
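
Picking the right binary mostly comes down to the tags at the end of the wheel filename. The sketch below splits those tags out and compares the Python tag with the running interpreter. It assumes CPython and the standard wheel naming scheme (name-version-python-abi-platform); the helper names are hypothetical.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into its (python, abi, platform) tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def running_cpython_tag():
    """Tag of the running interpreter, e.g. 'cp36' for CPython 3.6."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


for wheel in (
    "tensorflow_gpu-0.12.1-cp35-cp35m-linux_x86_64.whl",
    "tensorflow_gpu-1.0.1-cp36-cp36m-linux_x86_64.whl",
):
    python_tag = wheel_tags(wheel)[0]
    print(wheel, "matches this interpreter:", python_tag == running_cpython_tag())
```

A cp35 wheel will only install into Python 3.5, so a mismatch here is a likely reason for pip refusing the package.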


Add commands to ~/.bash_profile

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda


Done!

Updating bash file

The next time I logged in, my GPU box couldn't find nvcc. I panicked: do I need to install CUDA again?? I frantically searched for an answer on the web, and came to the conclusion that I hadn't updated the bash file.

$ gedit ~/.bashrc

Add the following lines at the bottom of the bash file:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda


Save and close the text file.
Type the following command to reload the .bashrc file:

$ source ~/.bashrc
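
After reloading, a quick way to confirm what a new shell actually sees is to inspect the environment from Python. This is a rough sketch, not a definitive check: `cuda_env_report` is a made-up helper, and matching library directories on the substring "cuda" is an assumption. Note that nvcc is looked up through PATH, not LD_LIBRARY_PATH, so it is reported separately.

```python
import os
import shutil


def cuda_env_report(environ=None):
    """Summarize the CUDA-related settings visible in an environment mapping."""
    environ = os.environ if environ is None else environ
    # Drop empty entries, which appear when LD_LIBRARY_PATH starts out unset.
    ld_dirs = [d for d in environ.get("LD_LIBRARY_PATH", "").split(":") if d]
    return {
        "nvcc": shutil.which("nvcc", path=environ.get("PATH", "")),
        "CUDA_HOME": environ.get("CUDA_HOME"),
        "cuda_lib_dirs": [d for d in ld_dirs if "cuda" in d],
    }


for key, value in cuda_env_report().items():
    print("{}: {}".format(key, value))
```

If `nvcc` comes back as `None`, the CUDA bin directory is missing from PATH in that shell.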

Installing cuDNN

This was relatively painless.

Go to the NVIDIA website, log in as a developer, and download cuDNN Library v5.1 for Linux (cudnn-8.0-linux-x64-v5.1.tgz)

$ cd /usr/local/cuda
$ tar xvzf cudnn-8.0-linux-x64-v5.1.tgz
$ sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
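
To confirm that the copy and chmod worked, the header and libraries can be checked for readability from Python. A minimal sketch, assuming the same /usr/local/cuda layout as the commands above; `cudnn_files` is a hypothetical helper name.

```python
import glob
import os


def cudnn_files(cuda_root="/usr/local/cuda"):
    """Return the readable cuDNN header and libraries found under cuda_root."""
    header = os.path.join(cuda_root, "include", "cudnn.h")
    libs = sorted(glob.glob(os.path.join(cuda_root, "lib64", "libcudnn*")))
    return {
        "header": header if os.access(header, os.R_OK) else None,
        "libs": [path for path in libs if os.access(path, os.R_OK)],
    }


found = cudnn_files()
print("cudnn.h:", found["header"])
print("libraries:", found["libs"])
```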



Monday, February 6, 2017

Installing CUDA

It took me about a week, but I finally got it to work. I won't belabor the details, but here is the summary:

1. Download CUDA toolkit 8.0
I used the Ubuntu 16.04 LTS version of the runfile (local)

2. Compute md5 sum:
$ md5sum cuda_8.0.44_linux.run

3. Remove CUDA toolkit 7.5
$ sudo apt-get purge nvidia-cuda*
$ sudo apt-get purge nvidia-*
(redundant but I did it to make sure)

4. Go to a terminal session
(ctrl+alt+F2)

5. Stop lightdm
$ sudo service lightdm stop

6. Install CUDA runfile
$ sudo sh cuda_8.0.44_linux.run --override

7. Start lightdm again
$ sudo service lightdm start 

8. Modify PATH
$ export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
$ export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
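
The checksum in step 2 is only useful when compared against the value published on the download page. The sketch below does that comparison in Python. It is an illustration under assumptions: `EXPECTED_MD5` is a placeholder to be filled in by hand (no real checksum is given here), and the runfile is assumed to sit in the current directory.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


RUNFILE = "cuda_8.0.44_linux.run"
EXPECTED_MD5 = ""  # placeholder: paste the checksum from the download page

if os.path.exists(RUNFILE):
    actual = md5_of_file(RUNFILE)
    print("md5:", actual)
    if EXPECTED_MD5:
        print("matches published checksum:", actual == EXPECTED_MD5.lower())
else:
    print(RUNFILE, "not found in the current directory")
```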

I basically followed the Installation Guide. Easy, right?

Here is my mistake: I realized that the CUDA driver version was 367 by default at this point.

9. Install the latest CUDA driver for Tesla K80 (= 375 at the time of this writing)
$ sudo apt-get install nvidia-375

Somehow driver 375 doesn't seem to do well on my system. When I run

$ nvidia-smi

I get the following error.

Failed to initialize NVML: Driver/library version mismatch 

After multiple repetitions of steps 1-9 above, I just gave up on installing driver 375, and everything seems to be working well. Now I have

CUDA Toolkit 8.0
CUDA driver 367
and... MATLAB does recognize Tesla K80!
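
Since the mismatch error above is easy to miss after a driver change, here is a small sketch that runs nvidia-smi and flags that specific message. The function names and the status labels ("ok", "mismatch", "error", "missing") are made up for this example. It only looks for the exact error text quoted above.

```python
import shutil
import subprocess


def classify_smi_output(returncode, output):
    """Map nvidia-smi's exit code and text output to a short status label."""
    if "Driver/library version mismatch" in output:
        return "mismatch"
    return "ok" if returncode == 0 else "error"


def check_nvidia_smi():
    """Run nvidia-smi if it is on PATH and classify the result."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "missing"
    result = subprocess.run([exe], capture_output=True, text=True)
    # Combine both streams, since the message may arrive on either one.
    return classify_smi_output(result.returncode, result.stdout + result.stderr)


print("nvidia-smi status:", check_nvidia_smi())
```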


Initial condition

Here is the spec of the liquid-cooled GPU box that I recently purchased to meet the high-performance computing demands in my laboratory:

Motherboard:  Xeon E5-2600/1600 v3 C612 Chipset
CPU:          Intel® Xeon® Processor E5-2680 v4 (14-core, 35M Cache, 2.40 GHz)
Memory:       DDR4 ECC Reg SO-DIMM 128GB (= 4x32GB)
Storage:      2.5" SATA 6Gb/s Internal SSD 1TB
GPU:          2 x NVIDIA Tesla K80 24GB Passive Cooling PCI-E 3.0 x16 GPU
Cooling:      2-Phase Liquid Cooling Kit for GPU and CPU by Ebullient
OS:           Ubuntu 16.04 LTS


Here are the problems that I found as soon as it arrived:

1. Can't log in to Ubuntu using Unity
The vendor kindly installed Openbox, which allows login to Ubuntu with no issues.

2. An older version of the CUDA toolkit was installed
The current version of the CUDA toolkit at the time of this writing is 8.0. However, toolkit 7.5 was installed.

3. An older version of the CUDA driver was installed
The current version of the CUDA driver for Tesla K80 at the time of this writing is 375. However, version 367 was installed.

4. MATLAB doesn't recognize the NVIDIA Tesla K80
In MATLAB 2016b with Parallel Computing Toolbox:
>> gpuDeviceCount

ans =

     0
 

Oh, noooo!

The purpose of this blog is to document the solutions (and the struggles) so no one needs to waste their time trying to solve the same problems that I had.