In this tutorial, I will guide you through setting up your computer as a workstation for serious deep learning while managing different development environments with Docker. First, let’s get the machine running without any Docker.
- Install Ubuntu 16.04 (the latest LTS version at the time of writing); there is an updated version of this guide for Ubuntu 18.04.
- Install the latest Nvidia drivers supported by your GPU.
- Install CUDA (which allows fast computation on your GPU).
- Install the CUDA Toolkit.
- Install cuDNN (install it from the Debian packages).
- Now, to verify that everything works as expected, follow the steps. I had to fix some bugs in the sample code, as described in this blog post.
- Now run it again; ‘all test passed!’ should be the last line of output. (A short sketch of these verification checks follows this list.)
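For reference, here is a minimal sketch of the checks, assuming CUDA 9.x in the default location and cuDNN 7 installed from the Debian packages; adjust the paths and version numbers to your setup.
# Driver check: prints the driver version and lists your GPU
nvidia-smi
# CUDA toolkit check: prints the installed CUDA release
nvcc --version
# Build and run the deviceQuery CUDA sample (path assumes the default install location)
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery    # should end with "Result = PASS"
# Build and run the cuDNN sample that ships with the Debian packages
cp -r /usr/src/cudnn_samples_v7/ ~/
cd ~/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN     # should end with "Test passed!"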
Now let’s focus on Docker.
- Install nvidia-docker in order to get access to the GPU inside the Docker container.
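As a rough sketch of the installation on Ubuntu (based on the nvidia-docker2 instructions at the time of writing; check the project’s README for the current steps):
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
# Quick test: should print the same table as nvidia-smi on the host
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi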
- Write a script to run a Docker container. This pulls an image from floydhub, removes the need for a password with Jupyter, persists Jupyter notebooks in $HOME/jupyter, and makes the notebook accessible on port 8888.
#!/usr/bin/env bash
docker run -it -p 8888:8888 --runtime=nvidia \
    --volume /home/filter/jupyter:/jupyter \
    --name pytorch \
    floydhub/pytorch:0.3.1-gpu.cuda9cudnn7-py3.27 \
    /bin/sh -c "./run_jupyter.sh --NotebookApp.token='' --notebook-dir=/jupyter"
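Assuming you save the script as run_pytorch.sh (the filename is just an example), starting the container looks like this:
chmod +x run_pytorch.sh
./run_pytorch.sh
# Jupyter is now reachable on the workstation at http://localhost:8888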
- Now verify that the Docker container is running:
docker ps
- To stop it, run:
docker stop pytorch
or to restart it:
docker restart pytorch
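Two related Docker commands can be handy here (they are not part of the setup above): inspecting the Jupyter log and removing the container if you ever need to recreate it.
docker logs pytorch
docker stop pytorch && docker rm pytorch    # rerun the script afterwards to create a fresh container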
- To access the notebooks from your local machine, tunnel the traffic via SSH:
ssh -N -f -L localhost:8888:localhost:8888 XX@XX.TLD
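If you tunnel regularly, you can put the same forwarding into ~/.ssh/config; the host alias and address below are placeholders:
# ~/.ssh/config
Host gpu-workstation
    HostName XX.TLD
    User XX
    LocalForward 8888 localhost:8888
# then simply: ssh -N -f gpu-workstation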
- (optional) If you happen to have a special setup (like mine) where you don’t have an IPv6 address, you may need to tunnel over another server. In this case:
- Tunnel from the local machine to the server:
ssh -N -f -L localhost:8888:localhost:64444 XX@XX.TLD
- Tunnel from that server to your GPU workstation:
ssh -N -f -L localhost:64444:localhost:8888 XX@XX.TLD
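Alternatively, with OpenSSH 7.3 or newer you can combine both hops into one command using a jump host; XX@XX.TLD is the intermediate server and XX@WORKSTATION.TLD is a placeholder for the GPU machine:
ssh -N -f -L localhost:8888:localhost:8888 -J XX@XX.TLD XX@WORKSTATION.TLD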
That’s it. I hope this post was helpful and that you are now ready to train deep neural networks on large datasets. ;)