How to run an R Jupyter notebook on a deepthought node
This blog post covers how to set up an R environment to run on a deepthought node. This is useful if you have analysis to do in R that requires too much memory to run locally!
If you haven’t already gotten access to the DeepThought HPC follow the instructions here and then set up conda following the instructions here. Then come back for the rest of the tutorial.
Installing R
This should be pretty straightforward. We will create a conda environment named R_notebook_env
and then install R into it.
mamba create -n R_notebook_env conda-forge::r-base
Note 1: I use R_notebook_env
as the name, and that name appears in the slurm script below. If you use a different environment name, be sure to change it below
Note 2: This will install the latest version of R that is available on conda. If you know you need a specific version you can specify this as mamba create -n R_notebook_env conda-forge::r-base=4.X.X
Installing Jupyter
In order to run a jupyter session we need jupyter! We are going to activate our conda environment:
mamba activate R_notebook_env
Then install jupyter:
mamba install anaconda::jupyterlab
BUT in order for R and jupyter to talk to each other we need to install some additional kernels:
mamba install conda-forge::r-irkernel
mamba install nb_conda_kernels
Now our R jupyter environment is ready to go!
Setting up a slurm script
You can use this slurm script to submit your R job to DeepThought. I’ve downloaded this script and saved it in a directory called slurm
.
Submit your script to the cluster by running these commands:
mkdir -p jupyter # you only need to do this once!
sbatch slurm/jupyter_deepthought.slurm
It will put the log files in ~/jupyter
and the notebooks by default will be where ever you run the sbatch
command from.
It will also print a bunch of information into a log file in ~/jupyter
that you will need to ssh tunnel to your session.
ssh tunnelling
We will use the log file that has just been created in your ~/jupyter
directory to connect to your jupyter session. You should have something that looks like this:
Windows MobaXterm info
Forwarded port: 8100
Remote server: hpc-node023
Remote port: 8281
SSH server: deepthought.flinders.edu.au
SSH login: grig0076
SSH port: 22
Use a Browser on your local machine to go to:
localhost:8281 (prefix w/ https:// if using password)
If ypu have a command line, you can also use this version
ssh grig0076@deepthought.flinders.edu.au -p 22 -L 8100:hpc-node023:8281
and then open chrome and go to
https://localhost:8100
[W 2024-06-03 14:14:41.207 ServerApp] ServerApp.password config is deprecated in 2.0. Use PasswordIdentityProvider.hashed_password.
[W 2024-06-03 14:14:41.321 ServerApp] A `_jupyter_server_extension_points` function was not found in nbclassic. Instead, a `_jupyter_server_extension_paths` function was found and will be used for now. This function name will be deprecated in future releases of Jupyter Server.
[I 2024-06-03 14:14:41.328 ServerApp] jupyter_server_fileid | extension was successfully linked.
[I 2024-06-03 14:14:41.333 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2024-06-03 14:14:41.338 ServerApp] jupyter_server_ydoc | extension was successfully linked.
...
[I 2024-06-03 14:14:43.563 LabApp] JupyterLab extension loaded from /home/grig0076/miniconda3/envs/R_notebook_env/lib/python3.11/site-packages/jupyterlab
[I 2024-06-03 14:14:43.563 LabApp] JupyterLab application directory is /home/grig0076/miniconda3/envs/R_notebook_env/share/jupyter/lab
[I 2024-06-03 14:14:43.567 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-06-03 14:14:43.574 ServerApp] nbclassic | extension was successfully loaded.
[I 2024-06-03 14:14:43.575 ServerApp] Serving notebooks from local directory: /home/grig0076
[I 2024-06-03 14:14:43.576 ServerApp] Jupyter Server 2.14.1 is running at:
[I 2024-06-03 14:14:43.576 ServerApp] http://hpc-node023:8281/lab
[I 2024-06-03 14:14:43.576 ServerApp] http://127.0.0.1:8281/lab
Rather than explain this again, use this output and refer to the instructions here to create your ssh tunnel. If you have a Windows device, you will need to use mobaXterm and if you have Windows Subsystem for Linux, Mac, Chromebook or Linux, you can set up an ssh tunnel via the command line.
Troubleshooting
Sometimes R can be picky about what versions a package will run with. Sometimes you might need to install a package manually (I have needed to do this to get the MASS and Matrix packages to run). You can do this by downloading the source code for your desired package then run an R session on the head node by running:
R
Then install your package:
install.packages(path_to_package, repos = NULL, type="source")
Working with Jupyter
Jupyter is little different from RStudio which you might be used to if you work with R often. In Jupyter you can create blocks of code that can be run individually. Any figures or output will be shown directly below your block.
You can install packages that you might need from inside these code blocks.