Using Linaro DDT Parallel Debugger, MAP profiler and Performance Reports

Introduction

DDT is a parallel debugger from Linaro. It has a graphical user interface and can be used for debugging Fortran, C, and C++ programs that have been parallelized with MPI and OpenMP, as well as UPC programs. Additional information can be found at https://www.linaroforge.com/about/

The software was formerly developed by Allinea, so some references to Allinea necessarily remain in this documentation. Allinea was acquired by Arm in 2016, and Linaro acquired Arm Forge in 2023. Older versions of the software are available as allinea or arm modules (use "module load allinea" or "module load arm").

MAP is a low-overhead profiler from Linaro for both scalar and MPI programs.

Both tools share a common environment.

In most cases, finding an error in the code, or a way to improve its performance, is much faster and easier with a debugger and a profiler than with numerous print statements.

Note that even though DDT and MAP support GPU languages, such as HMPP, OpenMP Accelerators, CUDA and CUDA Fortran, we don't have a license to use DDT and MAP on the GPUs.

Linaro Performance Reports is a low-overhead tool that produces one-page text and HTML reports summarizing and characterizing both scalar and MPI application performance.

The Linaro DDT, MAP and Performance Reports user guides and tutorials can be found at https://www.linaroforge.com/documentation/

Notes:

The program should run for a minimum of a few seconds in order for reports to be generated.

Note that we have a perpetual ARM/Allinea license for versions up to 19.0.3, so you can continue using the ARM/Allinea tools even when we no longer have support.

Starting DDT and MAP

To use DDT and MAP, load the linaro module:
```
	module load linaro
```
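After loading the module, it can help to confirm that the tool launchers are on your PATH. The following is a minimal sketch, assuming the module provides the usual Forge launcher commands ddt, map and perf-report. The function name check_forge_tools is made up for this illustration and is not part of Linaro Forge.
```shell
# check_forge_tools: hypothetical helper, not part of Linaro Forge.
# Prints where each launcher was found, or reports that it is missing.
check_forge_tools() {
	for tool in ddt map perf-report; do
		if command -v "$tool" >/dev/null 2>&1; then
			echo "$tool: $(command -v "$tool")"
		else
			echo "$tool: not found (is the module loaded?)"
		fi
	done
}

check_forge_tools
```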
Next, check whether your desktop computer accepts X connections by issuing:
```
	xterm &
```
If your desktop machine accepts X connections, then this command will open a window.
If no window opens, try logging in to the cluster with X forwarding enabled:
```
	ssh -X nova.its.iastate.edu
```
Once logged in to nova, issue "xterm &" again.
You can also try setting the DISPLAY environment variable to tell nova which machine to display the DDT window on:
```
	setenv DISPLAY desktop_machine_name:0.0
```
To find the desktop_machine_name, issue the following on nova:
```
	echo $REMOTEHOST 
```
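The setenv command above uses csh/tcsh syntax. If your login shell is bash, the equivalent uses export. A minimal sketch, in which my-desktop.example.edu is a placeholder rather than a real host name:
```shell
# Placeholder: replace with your actual desktop machine name.
desktop_machine_name="my-desktop.example.edu"

# bash/sh equivalent of "setenv DISPLAY desktop_machine_name:0.0"
export DISPLAY="${desktop_machine_name}:0.0"

# Show the value that X programs such as xterm and ddt will use.
echo "DISPLAY=$DISPLAY"
```
When you log in with "ssh -X", ssh normally sets DISPLAY for you, so setting it by hand should only be needed if "xterm &" still fails.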
Alternatively, you can install a remote client on your desktop; it can be downloaded from https://www.linaroforge.com/downloadForge_OldVersion/. The remote client version should match the version of Linaro Forge installed on the cluster (for example, client version 20.0.3 works with server version 20.0.3). After installing the remote client, start the program and set up the Remote Launch by entering the following in the Host Name field:
```
	<user>@nova.its.iastate.edu 
```
where <user> is your user name on nova. In the Remote Installation Directory field enter
```
	/shared/hpc/linaro/forge
```
See details in the Linaro DDT and MAP User Guide.

To give the debugger information about the program, compile the program with a debug flag. For the compilers installed on the Condo cluster, this is "-g". To start DDT, type the following:
```
	ddt [program_name [arguments]]
```
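Putting the two steps together, a session might look like the sketch below. The file names myprog.c, myprog and input.dat are hypothetical, mpiicc is the Intel MPI compiler wrapper used in the example further down this page, and the helper function build_and_debug exists only so that the sketch reports a missing tool instead of failing.
```shell
# build_and_debug: hypothetical helper for illustration only.
# $1 = C source file, $2 = executable name, remaining arguments go to the program.
build_and_debug() {
	src=$1; exe=$2; shift 2
	if ! command -v mpiicc >/dev/null 2>&1 || ! command -v ddt >/dev/null 2>&1; then
		echo "mpiicc or ddt not found: load the compiler and debugger modules first"
		return 1
	fi
	# -g adds the debug information DDT needs to show source lines and variables.
	mpiicc -g -o "$exe" "$src" || return 1
	# Same form as "ddt [program_name [arguments]]", started in the background.
	ddt "./$exe" "$@" &
}

build_and_debug myprog.c myprog input.dat || echo "setup incomplete, nothing launched"
```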
The Linaro DDT and MAP User Guide explains in more detail how to debug and profile programs. You can also use the following example to get acquainted with DDT.

Example

This example shows how to compile an MPI C program and run it under the DDT debugger on two compute nodes. The provided example does not contain an error. After running it as explained below, try introducing errors into the code (for example, declare the array name to be of size 10) and run it again.

Copy the example program to your directory:
```
	cp /home/SAMPLES/mpihello.c ./
```
Load the intel and arm environment modules:
```
	module load intel arm
```
Compile the program using the debug compiler flag "-g":
```
	mpiicc -g mpihello.c 
```
Start the DDT debugger:
```
	ddt &
```
First a small window saying "Linaro FORGE" will open, then it will be replaced by a larger window.
Click on RUN to run and debug a program.
In the application field, enter the full path of the executable that you would like to debug. If you issued the cp command above from your home directory, enter "/home/<user>/a.out", where <user> is your user name on nova. You can also browse for the executable by clicking on the yellow folder button to the right of the application field.
Select the checkboxes next to the "MPI" and "Submit to Queue" options.
Click on the "Configure..." button in the "Submit to Queue" option.
In the "Submission template file" field enter "/shared/hpc/arm/forge/templates/slurm.qtf". After entering this, four other fields will be populated.
Click on the "OK" button.
Click on the "Parameters..." button and enter the required Wall Clock Limit (the default is 30 minutes). You may also want to enter the queue name "debug", since debug nodes are usually available. However, the debug queue has a two-hour limit; if your job needs more than 2 hours, do NOT enter "debug" in the Queue field. You can also use the "free" queue, which is usually less busy than the regular queue.
In the MPI section select 4 processes and 2 nodes.
Click on the "Submit" button at the bottom.
DDT will submit your job to the queue. When the job starts running, DDT will attach to the processes. In the central part of the window you will see the code.
Right-click on line 18 (on the line number) and select "Add breakpoint for All". Do the same at line 21.
Click on the green triangle in the upper left corner of the window. A small window should appear reporting that processes 0-3 stopped at the breakpoint on line 18. Click on "Pause".
Explore the information provided in the lower part of the window (click on the different tabs) and the values of variables displayed on the right. Select the squares numbered 0 through 3 to see information for the various processes. Notice that MyId has different values for different processes (select the Locals tab to see local variables), while name still contains garbage.
Click on the green triangle in the upper left corner of the window once again. The processes will stop at the next breakpoint, and a small notification window will appear. Select "Pause" and see what has changed. In the Input/Output window you should now see Hello messages from all 4 processes, and in the variable window name should now have the correct value.
To finish the execution, click on the green triangle, and when prompted about restarting the session, select "No".
You can start a new session by selecting the appropriate option in the File menu.

DDT Reference

The following is a list of important capabilities provided by DDT:

Stepping through the program one statement at a time and displaying all variable values.
Running the program to a selected statement where a breakpoint has been set and then displaying variable values.
Stepping into and out from functions/subprograms.
Running a program until it stops where a problem occurs, which shows where execution stopped and the values of variables at that point.
Stepping can be done with all MPI processes stepping together or selected MPI processes stepping independently.

The following summarizes the most useful items from the toolbars at the top of the DDT window:

File: restart and exit sessions, configure DDT
Edit/Go to Line: allows one to easily go to any line of the program displayed in the middle window and to search for strings.
Window: allows one to turn on and off viewing of various items.
Help: allows one to view and print the DDT User Guide and displays a FAQ.
Control/Focus on current: allows one to select a group, process or thread for stepping through the program.
Play: starts program execution from where it was stopped and runs to the next breakpoint or until the program stops.
Pause: pauses program execution on all MPI processes.
Step Into: execution proceeds into the function/subroutine if stopped at a function or subroutine; otherwise, execution continues to the next statement.
Step Over: execution continues to the next statement and not into a function/subroutine.
Step Out: allows one to step out from a function/subroutine.

Values of local variables at the point where execution stopped are displayed in the right center window for the selected MPI process. Multi-dimensional array values can be viewed by selecting Multi-Dimensional Array Viewer listed under View.