Desktop On Demand

Objectives

You will learn:

  • What On-Demand is and when to use it

  • Which interface to use on each resource and how to start them

  • How to set the job parameters for your application

What is Desktop On Demand? Is it right for my job?

On Cosmos (LUNARC), Kebnekaise (HPC2N), Alvis (C3SE), and Dardel (PDC), some applications are available through an On-Demand service. On-Demand applications provide an interactive environment for scheduling jobs on compute nodes through a graphical user interface (GUI) instead of the typical batch submission script. How you reach this interface depends on the system you use and its choice of On-Demand client.

Desktop On-Demand is most appropriate for interactive work requiring small-to-medium amounts of computing resources. Non-interactive jobs and jobs that take more than a day or so should generally be submitted as batch jobs. If you have a longer job that requires an interactive interface to submit, keep track of the wall-time limits at your facility: some Desktop On-Demand services allow at most 12 hours of allocation at a time.
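If your job does not actually need interaction, the same resources can be requested with a short batch script instead. The following is a minimal sketch, assuming a SLURM system; the job name, task count, project ID, and payload line are all placeholders that vary by facility:

```shell
#!/bin/bash
# Minimal SLURM job script sketch (names are placeholders; adapt to your site).
#SBATCH --job-name=my-analysis
#SBATCH --time=02:00:00          # wall time in HH:MM:SS; batch jobs can exceed On-Demand limits
#SBATCH --ntasks-per-node=4      # mirrors the GfxLauncher "tasks per node" setting
#SBATCH --account=project-id     # placeholder: your project/allocation ID

echo "replace this line with your actual program, e.g. python3 analysis.py"
```

You would submit such a script with `sbatch jobscript.sh` and monitor it with `squeue --me`.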

On-Demand applications are not accessible via SSH; you must use either ThinLinc (Cosmos and Dardel) or the dedicated web portal (Kebnekaise and Alvis).

Important

On-Demand App Availability for this Course

  • Jupyter (Lab and/or Notebook) is available as an On-Demand application at all four facilities covered on this page. For Cosmos specifically, it can also load custom conda environments (but NOT pip environments).

  • On Alvis, Cosmos and Kebnekaise, VSCode can also be run via On-Demand.

  • Spyder can be run via On-Demand on Cosmos only. It also supports custom conda environments.

  • On Cosmos, there are also interactive On-Demand command lines (for CPUs and GPUs) under Applications - General that can be used to start Jupyter or Spyder with a custom pip-based environment.
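As a sketch of that workflow, once one of those terminals is open you could prepare a pip-based environment like this (the environment path is illustrative, not a site convention):

```shell
# Create and activate a pip-based virtual environment from a compute-node
# terminal; the path below is an example, not a site convention.
ENVDIR="$HOME/envs/ondemand-demo"
python3 -m venv "$ENVDIR"      # create the environment (first time only)
. "$ENVDIR/bin/activate"       # activate it for this shell session
python -m pip --version        # pip now runs from inside the environment
# python -m pip install jupyterlab   # install what you need, then launch, e.g.:
# jupyter lab
```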

Warning

Dardel also has On-Demand applications in the equivalent place on its remote desktop, but only 30 ThinLinc licenses are available for the whole facility. Talks are ongoing about whether, and by how much, to increase the number of licenses, but until any changes are implemented, we advise using SSH with X11 forwarding (-X) instead. In our experience, ThinLinc access to Dardel is not reliably available, and even when the connection succeeds, queue times for On-Demand applications can be very long.
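A minimal sketch of that SSH alternative (the username is a placeholder; confirm the current login address in PDC's documentation):

```shell
ssh -X your_username@dardel.pdc.kth.se   # -X enables X11 forwarding for graphical programs
```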

Starting the On-Demand Interface

For most programs, the start-up process is roughly the same:

  1. Log into Cosmos (or Dardel) via your usual ThinLinc client or browser interface to start an HPC Desktop session.

  2. Click Applications in the top left corner, hover over the items prefixed with Applications - until you find your desired application (on Dardel, On-Demand applications are prefixed with PDC-), and click it. The top-level Applications menu on Cosmos looks like this:

../_images/Cosmos-AppMenu.png

Warning

If you start a terminal session or another application from Favorites, System Tools, or other menu headings not prefixed with Applications - or PDC-, and launch an interactive program from that, it will run on a login node. Do not run intensive programs this way!
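If you are ever unsure where a terminal or program is running, printing the hostname is a quick sanity check (the actual naming scheme is site-specific, so compare the result with your facility's documentation):

```shell
uname -n   # prints the node's hostname; login and compute nodes are named
           # differently, but the exact scheme varies by site
```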

Setting Job Parameters

Upon clicking your chosen application, a pop-up interface called the GfxLauncher will appear and let you set the following options:

  1. Wall time - how long your interactive session will remain open. When it ends, the whole window closes immediately and any unsaved work is lost. You can select the time from a drop-down menu or type it in manually. On Cosmos, CPU-only applications (indicated with “(CPU)” in the name) can run for up to 168 hours (7 days), while the rest are limited to 48 hours. The default is 30 minutes.

  2. Requirements - how many tasks per node you need. The default is usually 1 or 4 tasks per node. There is also a gear icon to the right of this box that can pull up a second menu (see figure below) where you can set

  • the name of your job,

  • the number of tasks per node,

  • the amount of memory per CPU core, and/or

  • whether or not to use a full node.

../_images/cosmos-on-demand-job-settings.png

The GfxLauncher GUI (here used to launch Jupyter Lab). The box on the left is the basic menu and the box on the right is what pops up when the gear/cog icon next to Requirements is clicked.

  3. Resource - which kind of node you want in terms of the architecture (AMD or Intel) and the number of cores in the CPU (or GPU). Options and defaults vary by program, and the option to change this is not always available.

  4. Project - choose from a drop-down menu the project with which your work is associated. This is mainly to keep your usage in line with your allocations and permissions, and to send any applicable invoices to the correct PI.

When you’re happy with your settings, click “Start”. The GfxLauncher menu will stay open in the background so that you can monitor your wall time usage with the Usage bar. Leave this window open—your application depends on it!

Warning

Closing the GfxLauncher popup after your application starts will kill the application immediately!

If you want, you can also look at the associated SLURM scripts by clicking the “More” button at the bottom of the GfxLauncher menu and clicking the “Script” tab (example below), or view the logs under the “Logg” tab.

../_images/cosmos-on-demand-jupyter-more.png

If an app fails to start, the first step of troubleshooting will always be to check the “Logg” tab.

Tip

Terminals on compute nodes. If you don’t see the program you want to run interactively listed under any other Applications sub-menus, or if the usual menu item fails to launch the application, you may still be able to launch it via one of the terminals under Applications - General, or the GPU Accelerated Terminal under Applications - Visualization.

The CPU terminal allows for a wall time of up to 168 hours (7 days), while the two GPU terminals can only run for 48 hours (2 days) at most. For more on the specifications of the different nodes these terminals can run on, see LUNARC’s webpage on COSMOS.
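To check the wall-time limits that actually apply on your system, you can query SLURM from any terminal on the cluster; this sketch assumes the standard `sinfo` client is on your PATH:

```shell
sinfo -o "%P %l"   # list each partition with its maximum wall time
                   # (e.g. 7-00:00:00 means 7 days)
```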

If you finish before your wall time is up and close the app, the app should stop in the GfxLauncher window within a couple of minutes, but you can always force it to stop by clicking the “Stop” button. This may be necessary for Jupyter Lab.

Job Parameters - GPUs

Most settings are the same, with the exception of the “Resource” or “Node Type”/“Core Number” settings. Below is how it looks on Dardel, which has a GfxLauncher setup very similar to that of Cosmos:

../_images/dardel-thinlinc.png
../_images/dardel-thinlinc-gfx.png
../_images/dardel-thinlinc-gfx-settings.png
../_images/dardel-thinlinc-gfx-starting.png
../_images/dardel-thinlinc-gfx-jupyterlab.png
Summary

  • At centres that have OpenOnDemand installed, you do not have to submit a batch job; you can run directly on the already allocated resources.

  • OpenOnDemand is a good option for interactive tasks, graphical applications/visualization, and simpler job submissions. It can also be more user-friendly.

  • Nevertheless, there are many situations where submitting a batch job is the better option, including jobs that need many resources (time, memory, multiple cores, multiple GPUs) or multiple jobs that run concurrently or in a specified succession without manual intervention. Batch jobs are often also preferred for automation (scripts) and reproducibility. Many types of application software fall into this category.

  • At centres that have ThinLinc, you can usually submit MATLAB jobs to compute resources from within MATLAB.