Commit 09c13875 authored by Belhorn, Matt

Improves instructions for running Jupyter servers on Rhea.

parent f0d17058
+44 −28
The following approach is a hack-ish way to run jupyter on Rhea. The jupyter
compute kernels should be run on reserved batch nodes (i.e., not shared login
nodes where they can be killed without warning) and the web browser used to
access the notebook interface is best run on your local machine as you are
already used to.

The batch script `jupyter-on-rhea.pbs` launches a jupyter notebook server on a
Setting up Jupyter on Rhea
==========================

The following procedure can be used to run Jupyter server instances on Rhea
while allowing connections to them from a local browser (e.g., on a laptop).
This is a band-aid procedure in lieu of dedicated infrastructure for spinning up
Jupyter instances at the OLCF. Ideally we would offer a host running a
JupyterHub service where users could spin up secured, private Jupyter servers
that offload work transparently to dynamically started ipcluster backend jobs
through the batch system.

If you like this idea, please mention it in the OLCF User Survey. With enough
voices and support behind it, we could push to acquire the necessary
infrastructure.

Meanwhile
---------

The Jupyter compute kernels should be run on reserved batch nodes (i.e., not
shared login nodes where they can be killed without warning), and the web
browser used to access the notebook interface is best run on your local
machine, as you are already used to.

The `jupyter-on-rhea.pbs` batch script launches a jupyter notebook server on a
single batch node and sets up a script to create the necessary SSH tunnel to
access it. In order to work, you will need to have jupyter installed somewhere
in your PYTHONPATH. This can either be in `/ccs/proj/...` or simply in your home
directory using pip:
in your PYTHONPATH. This can either be in `/ccs/proj/...` or (for Rhea) simply
in your home directory using pip:

$ module load python/2.7.9 python_setuptools python_pip
$ pip install --user jupyter
@@ -17,12 +34,10 @@ You should then create a skeleton configuration file and set an access password
(see the caveats below for an explanation):

$ jupyter notebook --generate-config
$ python -c "from IPython.lib import passwd; p=passwd(); print
'''c.NotebookApp.password = u'%s' ''' % p"
$ python -c "from IPython.lib import passwd; p=passwd(); print '''c.NotebookApp.password = u'%s' ''' % p"
Enter password: 
Verify password: 
c.NotebookApp.password =
'sha1:123:some:456:secret:789:password:012:hash:3456789abcd'
c.NotebookApp.password = 'sha1:123:some:456:secret:789:password:012:hash:3456789abcd'

Add the output hashed password line to the profile config file (typically
`$HOME/.jupyter/jupyter_notebook_config.py`).
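
For orientation, the hashed value has the shape `algorithm:salt:digest`. The sketch below reproduces that shape with the standard library only. It assumes the digest is SHA-1 over the passphrase followed by the salt, which is my understanding of what the `passwd()` helper does; treat that as an assumption rather than a specification. The names `hash_password` and `check_password` are hypothetical and are not part of IPython or Jupyter. It is written for Python 3.

```python
import hashlib
import hmac
import secrets


def hash_password(passphrase, salt=None):
    """Return 'sha1:<salt>:<hexdigest>' for the given passphrase.

    Assumption: the digest covers the passphrase followed by the salt.
    """
    if salt is None:
        salt = secrets.token_hex(6)  # 12 random hex characters
    digest = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
    return ":".join(("sha1", salt, digest))


def check_password(hashed, passphrase):
    """Return True if the passphrase matches a value from hash_password()."""
    try:
        algorithm, salt, digest = hashed.split(":", 2)
    except ValueError:
        return False  # not in algorithm:salt:digest form
    if algorithm != "sha1":
        return False
    candidate = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, digest)
```

This is only meant to show why the config line stores a salted hash instead of the password itself. Use the `passwd()` helper shown above to generate the value you actually put in the config file.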
@@ -30,14 +45,13 @@ add the output hashed password line to the profile config file (typically
Starting the server is done by launching the PBS script with appropriate PBS
options:

$ qsub ok_jupyter.pbs
$ qsub jupyter-on-rhea.pbs

The job will place an executable at `$HOME/.jupyter_connect` which contains
instructions on how to attach to the server. Typically you would issue a
variation of

`ssh -f -L 127.0.0.1:8080:127.0.0.1:XXXX $USER@rhea.ccs.ornl.gov
/ccs/home/$USER/.jupyter_connect`
`ssh -f -L 127.0.0.1:8080:127.0.0.1:8081 $USER@rhea.ccs.ornl.gov /ccs/home/$USER/.jupyter_connect`

on your local workstation and direct your local browser to
`http://127.0.0.1:8080`.
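
If you reconnect often, it may help to assemble the tunnel command programmatically. The following is a minimal sketch, not part of the repository: `build_tunnel_command` is a hypothetical helper, and the remote port has to be whatever the batch job reports in `$HOME/.jupyter_connect` (8081 in the example above).

```python
def build_tunnel_command(user, remote_port, local_port=8080,
                         host="rhea.ccs.ornl.gov"):
    """Return the ssh argument list for forwarding the notebook port.

    `user` is your account name on Rhea, which may differ from your
    local user name.
    """
    forward = "127.0.0.1:{}:127.0.0.1:{}".format(local_port, remote_port)
    return [
        "ssh", "-f",                     # go to background after authenticating
        "-L", forward,                   # local_port -> remote_port via loopback
        "{}@{}".format(user, host),
        "/ccs/home/{}/.jupyter_connect".format(user),
    ]
```

Passing the returned list to `subprocess.call` on your workstation would run the same command as the one-liner above. Building it as a list avoids shell quoting issues.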
@@ -45,30 +59,32 @@ on your local workstation and direct your local browser to
A few caveats to this approach are:

1.) The connection to the server is not encrypted nor password protected by
default. As anyone can SSH to any node on Rhea, it is possible for other users
default.

As anyone can SSH to any node on Rhea, it is possible for other users
to connect to your notebook server and then generate and run code as your user
if left unsecured. It is highly recommended to set up TLS/SSL encryption to prevent
the possibility of someone sniffing data packets sent to the notebook server. To
enable encryption, one can obtain or generate a self-signed x509 server
if left unsecured. It is a good idea to set up TLS/SSL encryption if you are
concerned about the possibility of someone sniffing data packets sent to the
notebook server. To enable encryption, generate a self-signed x509 server
certificate/key pair. The DN data used is mostly arbitrary, but you must use
`127.0.0.1` for the hostname/server name as modern browsers will reject the
certificate if it does not match the URL in the browser:

$ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout
~/.jupyter/mykey.key -out ~/.jupyter/mycert.pem
$ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout ~/.jupyter/mykey.key -out ~/.jupyter/mycert.pem
$ chmod go= ~/.jupyter/mykey.key

and set the options in `$HOME/.jupyter/jupyter_notebook_config.py`:
c.NotebookApp.certfile = '/ccs/home/$USER/.jupyter/mycert.pem'
c.NotebookApp.keyfile = '/ccs/home/$USER/.jupyter/mykey.key'

When using TLS encryption, connections must use `https://127.0.0.1:8080`
If you add TLS encryption (*you should*), you must connect using `https://127.0.0.1:8080`
instead of `http`.
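
Note that a Python string literal does not expand shell variables, so `$USER` in the config file should be read as a placeholder for your user name. One way to avoid hard-coding it is to build the paths with `os.path`. A minimal sketch, assuming the file names written by the `openssl` command above (`mycert.pem` and `mykey.key`); the `tls_options` helper is hypothetical:

```python
import os


def tls_options(config_dir=None):
    """Return absolute certfile/keyfile paths under the Jupyter config dir."""
    if config_dir is None:
        # Default to ~/.jupyter in the current user's home directory.
        config_dir = os.path.join(os.path.expanduser("~"), ".jupyter")
    return {
        "certfile": os.path.join(config_dir, "mycert.pem"),
        "keyfile": os.path.join(config_dir, "mykey.key"),
    }


# Inside jupyter_notebook_config.py, where Jupyter provides the `c` object:
#   opts = tls_options()
#   c.NotebookApp.certfile = opts["certfile"]
#   c.NotebookApp.keyfile = opts["keyfile"]
```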

2.) See the notebook `interactive_notebooks_with_mpi_on_rhea.ipynb` for how to
enable MPI parallelism in an interactive notebook. The method described is also
a hack to overcome infrastructure and policy limitations that, if resolved,
would make this process both easier and more allocation efficient.
2.) The kernels all run on a single node. It is possible to extend this
setup to use the IPython cluster (`ipcluster`) backend and `$PBS_NODEFILE` to
allow the kernel to run parallel tasks. See the notebook
`interactive_notebooks_with_mpi_on_rhea.ipynb` in this repo for instructions on
setting up an ipcluster backend.
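
As a starting point for that kind of extension, the node list can be read from inside the job. The sketch below assumes the usual PBS convention that `$PBS_NODEFILE` points to a file with one host name per line, repeated once per allocated slot. Both function names are hypothetical, and the sketch only collects the hosts; it does not start an ipcluster backend.

```python
import os
from collections import OrderedDict


def hosts_from_nodefile(text):
    """Map each host name to the number of slots listed for it."""
    slots = OrderedDict()
    for line in text.splitlines():
        host = line.strip()
        if host:  # skip blank lines
            slots[host] = slots.get(host, 0) + 1
    return slots


def read_pbs_nodefile(environ=os.environ):
    """Read the file named by PBS_NODEFILE, or return an empty mapping."""
    path = environ.get("PBS_NODEFILE")
    if path is None:
        return OrderedDict()  # not running inside a batch job
    with open(path) as handle:
        return hosts_from_nodefile(handle.read())
```

Calling `read_pbs_nodefile()` in a notebook cell running on the batch node would give the hosts available to the job, in the order PBS listed them.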

3.) The server is killed at least every 48 hours so you will want to make sure
your work is saved often. You can add a line like:
@@ -80,7 +96,7 @@ server up, but you will still need to re-establish the tunnel each time it goes
down. 

4.) This does consume your Rhea allocation so just keeping the server up and not
using it to crunch numbers could be wasteful. It is perhaps the best practice to
using it to crunch numbers is wasteful. It is probably best practice to
do interactive development work on a local jupyter instance and then run a
dedicated python script in a batch job to make the most efficient use of your
allocation.