The following approach is a hack-ish way to run jupyter on Rhea. The jupyter compute kernels should be run on reserved batch nodes (i.e., not on shared login nodes, where they can be killed without warning), and the web browser used to access the notebook interface is best run on your local machine, as you are already used to.

The batch script `jupyter-on-rhea.pbs` launches a jupyter notebook server on a single batch node and sets up a script to create the necessary SSH tunnel to access it.

For this to work, you will need to have jupyter installed somewhere in your PYTHONPATH. This can either be in `/ccs/proj/...` or simply in your home directory using pip:

    $ module load python/2.7.9 python_setuptools python_pip
    $ pip install --user jupyter

You should then create a skeleton configuration file and set an access password (see the caveats below for an explanation):

    $ jupyter notebook --generate-config
    $ python -c "from IPython.lib import passwd; p=passwd(); print '''c.NotebookApp.password = u'%s' ''' % p"
    Enter password:
    Verify password:
    c.NotebookApp.password = u'sha1:123:some:456:secret:789:password:012:hash:3456789abcd'

Add the hashed password line from the output to the config file (typically `$HOME/.jupyter/jupyter_notebook_config.py`).

Start the server by submitting the PBS script with the appropriate PBS options:

    $ qsub ok_jupyter.pbs

The job will place an executable at `$HOME/.jupyter_connect` which contains instructions on how to attach to the server. Typically you would issue a variation of `ssh -f -L 127.0.0.1:8080:127.0.0.1:XXXX $USER@rhea.ccs.ornl.gov /ccs/home/$USER/.jupyter_connect` on your local workstation and direct your local browser to `http://127.0.0.1:8080`.

A few caveats to this approach are:

1.) The connection to the server is neither encrypted nor password protected by default. As anyone can SSH to any node on Rhea, it is possible for other users to connect to your notebook server and then generate and run code as your user if it is left unsecured. It is highly recommended to set up TLS/SSL encryption so that nobody can sniff data packets sent to the notebook server. To enable encryption, obtain or generate a self-signed x509 server certificate/key pair. The DN data used is mostly arbitrary, but you must use `127.0.0.1` for the hostname/server name, as modern browsers will reject the certificate if it does not match the URL in the browser:

    $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout ~/.jupyter/mykey.key -out ~/.jupyter/mycert.pem
    $ chmod go= ~/.jupyter/mykey.key

and set the options in `$HOME/.jupyter/jupyter_notebook_config.py` (replace `$USER` with your username; the shell variable is not expanded inside the Python config file):

    c.NotebookApp.certfile = '/ccs/home/$USER/.jupyter/mycert.pem'
    c.NotebookApp.keyfile = '/ccs/home/$USER/.jupyter/mykey.key'

When using TLS encryption, connections must use `https://127.0.0.1:8080` instead of `http`.

2.) See the notebook `interactive_notebooks_with_mpi_on_rhea.ipynb` for how to enable MPI parallelism in an interactive notebook. The method described is also a hack to overcome infrastructure and policy limitations that, if resolved, would make this process both easier and more allocation efficient.

3.) The server is killed at least every 48 hours, so make sure your work is saved often. You can add a line like:

    qsub -W depend=afternotok:$PBS_JOBID ok_jupyter.pbs

near the top of `ok_jupyter.pbs` to resubmit the job automatically and keep a server up, but you will still need to re-establish the tunnel each time it goes down.

4.) This does consume your Rhea allocation, so keeping the server up without using it to crunch numbers is wasteful. It is probably best practice to do interactive development work on a local jupyter instance and then run a dedicated python script in a batch job to make the most efficient use of your allocation.

5.) Any of the configuration details should be tuned to your needs. In particular, the ports may need to be different in your case. You may also want to change `c.NotebookApp.notebook_dir` to a path other than the default so as to keep your top-level $HOME directory tidy.
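The SSH tunnel command shown earlier can also be assembled programmatically on your local workstation once the remote port is known. The following is only a minimal sketch for Python 3: `tunnel_command` is a hypothetical helper that is not part of the scripts described here, and the port `8888` in the demo is a placeholder for the `XXXX` value, which you still have to take from the job's output.

```python
import getpass
import shlex


def tunnel_command(remote_port, user=None, local_port=8080,
                   host="rhea.ccs.ornl.gov"):
    """Return the ssh argument list for the tunnel described in this README.

    Hypothetical helper: `remote_port` is the port the notebook server
    listens on (the XXXX in the README), as reported by the batch job.
    """
    if not 0 < int(remote_port) < 65536:
        raise ValueError("remote_port must be a valid TCP port")
    user = user or getpass.getuser()
    # local 127.0.0.1:<local_port> is forwarded to 127.0.0.1:<remote_port>
    forward = "127.0.0.1:{0}:127.0.0.1:{1}".format(local_port, int(remote_port))
    return [
        "ssh", "-f",
        "-L", forward,
        "{0}@{1}".format(user, host),
        "/ccs/home/{0}/.jupyter_connect".format(user),
    ]


if __name__ == "__main__":
    # 8888 and "someuser" are placeholders for illustration only.
    cmd = tunnel_command(8888, user="someuser")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning an argument list rather than a single string means the result can be passed directly to `subprocess.run` without any shell quoting concerns.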
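For the TLS setup in caveat 1, the `chmod go=` step is what keeps other users on the shared filesystem from reading your private key. As a rough sketch, the check below confirms that a key file grants no permissions to group or others; `key_is_private` is a hypothetical helper, and the key path is the one assumed in the `openssl` example.

```python
import os
import stat


def key_is_private(path):
    """Return True if `path` grants no permissions to group or others."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Path taken from the openssl example; adjust if you stored the key elsewhere.
    key = os.path.expanduser("~/.jupyter/mykey.key")
    if not os.path.exists(key):
        print("no key found at", key)
    elif key_is_private(key):
        print("key permissions look fine")
    else:
        print("key is accessible to group/others; run: chmod go=", key)
```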