Better logging

By default, the standard output and standard error of the LAM model are stored in a single $RUN_DIR/**/out_execution file. This is not ideal because logs from different MPI processes and components (LMDZ, DYNAMICO, etc.) are all mixed together. Fortunately, we can improve the situation with the following tips.

Use srun process labels

The srun command of SLURM can distinguish log lines coming from different (MPI) processes. In the Job_<JobName> script (e.g. Job_CREATE-amip-ERA5-LAM.01) generated by libIGCM, add the -l or --label argument to srun, which prepends the task (process) number to each log line:

/usr/bin/time srun \
    --label \
    --multi-prog \
    ./run_file >out_execution 2>&1

The --label option prepends the task (process) number to each line of the standard output and error streams (see the srun --label documentation).

This should generate a $RUN_DIR/**/out_execution log similar to this:

22:  USING DEFAULTS : area_radius1 =   3360.00000000000
26:  USING DEFAULTS : area_radius1 =   3360.00000000000
 0:  USING DEFAULTS : area_radius1 =   3360.00000000000
 0:  GETIN area_radius1 =    3360.00000000000
12:  USING DEFAULTS : area_rotation_pre =  0.000000000000000E+000
12:  USING DEFAULTS : area_rotation =  0.000000000000000E+000
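
Here, if MPI is used, the task numbers run from 0 to N - 1, where N is the number of launched processes (the value returned by MPI_Comm_size). Each task number normally matches the MPI rank of the process.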

Separating the labelled logs

The log file can then be split per process with the ipsl_slurm_logs script from the ipsl-common Python package:

pip install git+https://gitlab.in2p3.fr/patryk.kiepas/ipsl-common.git
mkdir separated_logs/
ipsl_slurm_logs <YOUR_LOG_FILE> --output-dir separated_logs/

The separated logs should appear in separated_logs/output_<ID>.log files, one per process. Run ipsl_slurm_logs --help to find out more.
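
To illustrate what separating a labelled log involves, here is a minimal Python sketch. It is not the implementation of ipsl_slurm_logs: the function names split_labelled_log and write_task_logs are hypothetical. It assumes every relevant line starts with the "<task number>:" prefix shown above, and it only mimics the output_<ID>.log naming.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Matches the "<task number>: " prefix added by `srun --label`.
# Task numbers may be left-padded with spaces, as in " 0: ...".
LABEL_RE = re.compile(r"^\s*(\d+): ?(.*)$")


def split_labelled_log(lines):
    """Group labelled log lines by task number, keeping their original order."""
    per_task = defaultdict(list)
    for line in lines:
        match = LABEL_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # unlabelled line (assumption: safe to skip in this sketch)
        per_task[int(match.group(1))].append(match.group(2))
    return dict(per_task)


def write_task_logs(per_task, output_dir):
    """Write one output_<ID>.log file per task (hypothetical helper)."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for task_id, task_lines in sorted(per_task.items()):
        path = output_dir / f"output_{task_id}.log"
        path.write_text("\n".join(task_lines) + "\n")
        paths.append(path)
    return paths


# Lines taken from the example log above.
sample = [
    "22:  USING DEFAULTS : area_radius1 =   3360.00000000000",
    " 0:  USING DEFAULTS : area_radius1 =   3360.00000000000",
    " 0:  GETIN area_radius1 =    3360.00000000000",
]

per_task = split_labelled_log(sample)
with tempfile.TemporaryDirectory() as tmp:
    written = [p.name for p in write_task_logs(per_task, tmp)]
print(written)
```

For real runs, prefer the ipsl_slurm_logs script, which may treat unlabelled lines and other edge cases differently.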

The potential of separated logs

Separated logs make it easier to debug model errors, such as paprs bad order, which may occur only at specific grid points (often on the domain boundary) and therefore only in some processes.
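
As a hedged example of this kind of debugging, the sketch below lists the task numbers whose labelled lines contain a given message. The function name tasks_reporting is hypothetical, and the sample lines are made up for illustration. In particular, the exact wording of the paprs error in a real log is an assumption.

```python
import re

# "<task number>:" prefix added by `srun --label`, followed by the message.
LABEL_RE = re.compile(r"^\s*(\d+):(.*)$")


def tasks_reporting(lines, needle):
    """Return the sorted task numbers whose labelled lines contain `needle`."""
    hits = set()
    for line in lines:
        match = LABEL_RE.match(line)
        if match and needle in match.group(2):
            hits.add(int(match.group(1)))
    return sorted(hits)


# Made-up sample; the error lines are illustrative, not copied from a real run.
sample = [
    " 3:  paprs bad order",
    "12:  USING DEFAULTS : area_rotation =  0.000000000000000E+000",
    "17:  paprs bad order",
    " 3:  paprs bad order",
]

failing_tasks = tasks_reporting(sample, "paprs bad order")
print(failing_tasks)  # [3, 17]
```

Knowing which tasks report the error narrows the search to the part of the domain those processes handle.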

Use sbatch output

We can also change how sbatch creates the log files so that the output is written to per-process files.