I have been using Jupyter notebooks in a virtual environment for some time now. I would compile the version of Python that I wanted into a local folder that did not require any special permissions, create a virtual environment for Jupyter, and install what I needed. Once that was done, I created the requirements file. It was fairly easy to update items, though a bit time-consuming and not fully automated. This approach didn't work well on Windows; I have to use conda on that platform.
I looked into using Docker to run the Jupyter notebook application. There were a lot of options available over at Docker Hub. A lot of options! I tried some of them but couldn't figure out how to use them. That is when I decided to build from scratch. I didn't know at the time that there were known issues. That route was interesting but not fruitful, so I decided to take a look at the pre-built images. After a little reading, I settled on the minimal notebook. It gave me a fully functioning notebook implementation; I just had to add what I needed to the mix.
I decided to use docker-compose to create the new image. It is driven by a YAML-based configuration file that scripts building a Docker image. It took me a bit of time to figure out how it worked: you create a DockerFile and use the docker-compose.yml to script the process. The DockerFile describes what goes into the image. I like to think of it as a command-line capture of building the actual system, as if you were typing the commands directly into the file. The docker-compose.yml takes the DockerFile and automates some of the other drudgery.
Here is the DockerFile that I came up with:
FROM jupyter/minimal-notebook

LABEL maintainer="Troy Williams <firstname.lastname@example.org>"

# install the stack that I need for my work
RUN conda install numpy
RUN conda install pandas
RUN conda install scipy
RUN conda install matplotlib
RUN conda install seaborn
RUN conda install ipywidgets

# install the notebook extensions
RUN conda install -c conda-forge jupyter_contrib_nbextensions
The file is short and to the point. I had trouble at first because I was using apt-get, assuming it was a Debian distro, and trying to use pip to install my requirements. The image's author chose the conda-based Python distribution. Once I realized this was a conda-based distribution, it was easy to figure out the rest.
The nice thing about this process is that when I rebuild the image, it will check for updates and download them before assembling the final image. It is an easy way to ensure the image is always up to date.
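Docker caches build layers and reuses a base image it already has, so pulling a fresh base has to be requested explicitly. A sketch of the update step, assuming the service is named jupyter as in the compose file below:

```shell
# Pull a newer jupyter/minimal-notebook (if one exists) and rebuild on top of it
docker-compose build --pull jupyter
```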
NOTE: It is possible to pin specific version numbers if that is important to your workflow.
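For example, a conda install line in the DockerFile can pin a package to an exact version (the version numbers shown here are only placeholders):

```dockerfile
# Pin exact versions instead of taking the latest (versions are examples)
RUN conda install numpy=1.13.3 pandas=0.21.0
```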
version: '2'

services:
  jupyter:
    build:
      context: ./
      dockerfile: DockerFile
    volumes:
      - /home/troy/sync/misc/jupyter notebooks/:/home/jovyan/work
      - /home/troy/docker/jupyter/jupyter.pem:/etc/ssl/jupyter.pem
    environment:
      - NB_UID=1000
      - NB_GID=1000
    ports:
      - "8888:8888"
    command: ["start-notebook.sh",
              "--NotebookApp.certfile=/etc/ssl/jupyter.pem",
              "--NotebookApp.password='sha1:8ced77887f24:2631e006832e185c867d3e482e6f2ee8eca76885'"]

# How to launch Jupyter
# $ docker-compose run --service-ports --user="root" jupyter
The docker-compose format is well described here.
I had to explicitly specify the root folder where the DockerFile was located, as well as its exact name; otherwise the build wouldn't work correctly.
build:
  context: ./
  dockerfile: DockerFile
The volumes are straightforward: I mapped paths on my system to the file system in the container. In this case, /home/jovyan/work is the path where Jupyter is directed to open and save notebooks. The second volume is a path to a custom self-signed certificate for the Jupyter server so that it can use SSL. The server has an option to generate a certificate every time it is launched, but this caused problems with the TLS handshake taking forever in Firefox.
volumes:
  - /home/troy/sync/misc/jupyter notebooks/:/home/jovyan/work
  - /home/troy/docker/jupyter/jupyter.pem:/etc/ssl/jupyter.pem
On Linux it is quite easy to generate the self-signed certificate.
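A sketch of one way to do it with openssl, producing a single .pem file that bundles the key and certificate (the file names and one-year lifetime are only examples):

```shell
# Generate a self-signed certificate and key (paths and lifetime are examples)
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout jupyter.key -out jupyter.crt -subj "/CN=localhost"

# Bundle the key and certificate into the single PEM the compose file mounts
cat jupyter.key jupyter.crt > jupyter.pem
```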
These environment variables need to be set so that when the notebooks are saved they have the correct UID and GID for the user on the local host, avoiding permission issues.
environment:
  - NB_UID=1000
  - NB_GID=1000
The image that we are using as a base runs the notebook server on port 8888. We need to map that port from the inside of the container to the outside:
ports:
  - "8888:8888"
NOTE: You can map any external port to the container's port 8888.
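For example, to reach the server on host port 9999 (an arbitrary choice) while the container still listens on 8888:

```yaml
ports:
  - "9999:8888"
```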
Last, we set up the command to launch the container properly:
command: ["start-notebook.sh",
          "--NotebookApp.certfile=/etc/ssl/jupyter.pem",
          "--NotebookApp.password='sha1:8ced77887f24:2631e006832e185c867d3e482e6f2ee8eca76885'"]
The first entry, start-notebook.sh, is the script that launches the Jupyter notebook server and sets the UID and GID of the user so that files written to the system have the correct ownership. The --NotebookApp.certfile option should point to the certificate. The other one, --NotebookApp.password, is a SHA-1 hash of the password that we want to use to access the notebook. Without this option, you have to enter a special token every time you access the server; I find it easier to type a password.
To create the hash, I launch the Jupyter notebook, enter the token, open a new notebook, and enter the following:
from notebook.auth import passwd; passwd()
That command will prompt you to enter a password twice and then print the SHA-1 hash. That hash is used as the app password.
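The format of the hash itself is simple. Here is a minimal sketch that reproduces the same "sha1:salt:digest" layout, assuming the digest is the SHA-1 of the password concatenated with the salt; passwd() from notebook.auth remains the canonical way to generate one:

```python
import hashlib
import secrets

def notebook_password_hash(password: str) -> str:
    """Sketch of the 'sha1:<salt>:<digest>' layout used by notebook.auth.passwd()."""
    salt = secrets.token_hex(6)  # 12 hex characters, as in the hash above
    digest = hashlib.sha1((password + salt).encode("utf-8")).hexdigest()
    return f"sha1:{salt}:{digest}"
```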
So now I have a simple mechanism to launch and maintain Jupyter notebooks without having to mess about with things on the host OS. It should make using this setup on different computers child's play.
To launch the Jupyter notebook server, use the following command:
$ docker-compose run --service-ports --user="root" jupyter
It is run a little differently than using docker-compose up, but it works as expected.