Slurm, Realistic Examples
Of course, our initial example script was a bit silly. We now list a few more complicated scripts that should help you get started with writing your own production-grade scripts.
The following examples are covered:
- Download of NWP Forecasts from MetNo
- Calling Python Script, Using Multiple Cores
- Using Conda Environments
- Get Verbosity
Example 01: Download of NWP Forecasts from MetNo
This job downloads a large number of weather forecasts from the Norwegian Meteorological Institute (MetNo).
volkerh@ds01:~/Jobs/NWP$ cat 2018_MEPS_Download_Q1.job
#!/bin/bash
#SBATCH --output /home/volkerh/Jobs/NWP/0000_Logs/2018_MEPS_Download_Q1-%j.out
#SBATCH --job-name 2018/MEPS/Download/Q1
#SBATCH --partition sintef
#SBATCH --ntasks 1
#SBATCH --mem=32MB
#SBATCH --time 7-00:00:00
home=/home/volkerh
data=/data/volkerh
outdir=$data/nwp/2018_meps
cd $outdir
export DATE=`date +%F_%H%M`
srun ./sync_q1.sh > Sync_Q1_$DATE.log 2>&1
Note:
- Directives starting with #XSBATCH (instead of #SBATCH) are ignored.
- The script doing the actual heavy lifting is the one called by srun ./sync_q1.sh in the directory we have changed to. It looks like this:
volkerh@ds01:/data/volkerh/nwp/2018_meps$ cat sync_q1.sh
#!/bin/bash
year=2018
# http://thredds.met.no/thredds/fileServer/meps25epsarchive/2018/01/01/meps_mbr0_pp_2_5km_20180101T00Z.nc
# January
nmonth=01
for nday in `seq -f %02g 1 31`; do
wget --no-verbose http://thredds.met.no/thredds/fileServer/meps25epsarchive/${year}/${nmonth}/${nday}/meps_mbr0_pp_2_5km_${year}${nmonth}${nday}T00Z.nc
done
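To actually run this example, you submit the job script (not sync_q1.sh itself) with sbatch and can keep an eye on it with squeue, e.g.
volkerh@ds01:~/Jobs/NWP$ sbatch 2018_MEPS_Download_Q1.job
volkerh@ds01:~/Jobs/NWP$ squeue --user volkerh
The log file then appears at the location given by the --output directive, with %j replaced by the job ID.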
Example 02: Calling Python Script, Using Multiple Cores
Here's a job script that calls a Python script which can use multiple cores.
volkerh@ds01:~/Jobs/Energytics$ cat AMS_Hafslund.job
#!/bin/bash
#SBATCH --output /home/volkerh/Jobs/Energytics/0000_Logs/AMS_Hafslund-%j.out
#SBATCH --job-name AMS/Hafslund
#SBATCH --partition sintef
#SBATCH --ntasks 1
#SBATCH --cpus-per-task=12
#SBATCH --mem=24GB
#SBATCH --time 00-08:00:00
home=/home/volkerh
data=/data/volkerh
srcdir=$home/Source/Energytics/Playground/AMS_Loaders
outdir=$data/energytics/ams_hafslund
cd $outdir
export DATE=`date +%F_%H%M`
srun python -u ${srcdir}/hafslund_csv2hdf5.py > Run_$DATE.log 2>&1
Note:
- The Python interpreter is invoked with the -u flag. This forces the output to be unbuffered. If output is buffered, any messages the script may emit will not immediately show up in the log.
- The directive #SBATCH --cpus-per-task=12 assigns 12 CPUs (cores) to the job; see the sketch below for one way to pass this number on to your program.
Confusion potential
Be careful not to confuse --cpus-per-task with --ntasks. The latter is only relevant when using MPI parallelism.
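How the Python script actually exploits the 12 cores is up to the script itself. Slurm exports the value of --cpus-per-task to the job as the environment variable SLURM_CPUS_PER_TASK, so a common pattern is to hand that number on from the job script, roughly like this (the --workers option is made up for illustration; hafslund_csv2hdf5.py may not accept it):
# Let threaded libraries (OpenMP, MKL, ...) know how many cores they may use.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Or pass the core count to the script explicitly (hypothetical --workers flag).
srun python -u ${srcdir}/hafslund_csv2hdf5.py --workers ${SLURM_CPUS_PER_TASK} > Run_$DATE.log 2>&1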
Example 03: Using Conda Environments
If you want to use a virtual Python environment (with Anaconda/Miniconda), you will need two additional steps in the job script, viz.
volkerh@ds01:/home/davidg/Jobs$ cat 6hforecast.job
#!/bin/bash
#SBATCH --output /home/davidg/Jobs/0000_Logs/6hforecast-%j.out
#SBATCH --job-name 6hforecast
#SBATCH --partition sintef
#SBATCH --ntasks 1
#SBATCH --cpus-per-task=1
#SBATCH --mem=24GB
#SBATCH --time 07-00:00:00
# ENABLE ACCESS TO CONDA ENVIRONMENTS
. "/opt/miniforge3/etc/profile.d/conda.sh"
# ACTIVATE CONDA ENVIRONMENT
conda activate dgenv2
cd /home/davidg/
export DATE=`date +%F_%H%M`
srun python -u /home/davidg/6h_forecast_interpolator.py > Run_$DATE.log 2>&1
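The environment (dgenv2 in this example) must of course exist before the job is submitted. If you have not created one yet, something along these lines will do (the Python version and package list are only placeholders; pick whatever your script needs):
. "/opt/miniforge3/etc/profile.d/conda.sh"
conda create --name dgenv2 python=3.11 numpy pandas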
Example 04: Get Verbosity
Sometimes you may want to time your jobs a bit better. You can do this by printing timestamps at various steps in the script, viz.
volkerh@ds01:~/Jobs/Energytics$ cat AMS_Hafslund.job
#!/bin/bash
#SBATCH --output /home/volkerh/Jobs/Energytics/0000_Logs/AMS_Hafslund-%j.out
#SBATCH --job-name AMS/Hafslund
#SBATCH --partition sintef
#SBATCH --ntasks 1
#SBATCH --cpus-per-task=12
#SBATCH --mem=24GB
#SBATCH --time 00-08:00:00
home=/home/volkerh
data=/data/volkerh
srcdir=$home/Source/Energytics/Playground/AMS_Loaders
outdir=$data/energytics/ams_hafslund
echo ""
echo "***** LAUNCHING *****"
echo `date '+%F %H:%M:%S'`
echo ""
echo "outdir="$outdir
echo "hostname="`hostname`
echo ""
echo "***"
echo ""
cd $outdir
export DATE=`date +%F_%H%M`
srun python -u ${srcdir}/hafslund_csv2hdf5.py > Run_$DATE.log 2>&1
echo ""
echo "***** DONE *****"
echo `date '+%F %H:%M:%S'`
echo ""