Using ARM DDT Parallel Debugger, MAP profiler and Performance Reports
This guide uses the condo2017 cluster in its examples, but the software is available on all of our clusters; substitute the name of another cluster as appropriate.
DDT is a parallel debugger from Arm. It has a graphical user interface and can be used for debugging Fortran, C, and C++ programs that have been parallelized with MPI and OpenMP, as well as UPC programs. Additional information can be found at https://www.arm.com/products/development-tools/server-and-hpc/forge
The developer of these tools was formerly known as Allinea, so some references to Allinea remain in this documentation. Older versions of the software are available as allinea modules (use "module load allinea").
MAP is a low-overhead profiler from Arm for both scalar and MPI programs.
Both tools share a common environment.
In most cases, finding an error in the code or a way to improve performance is much faster and easier with a debugger and a profiler than with numerous print statements.
The ARM DDT and MAP user guides and tutorials can be found at https://developer.arm.com/tools-and-software/server-and-hpc/help
The Arm forum and blog pages have various articles on how to best utilize the Allinea tools, such as how to fix dangling pointers or debug mixed Python and Fortran codes.
Note that even though DDT and MAP support GPU languages, such as HMPP, OpenMP Accelerators, CUDA and CUDA Fortran, we don't have a license to use DDT and MAP on the GPUs.
ARM Performance Reports is a low-overhead tool that produces one-page text and HTML reports summarizing and characterizing both scalar and MPI application performance. The Performance Reports user guide can be downloaded from https://developer.arm.com/docs/101137/latest
- The program should run for a minimum of a few seconds in order for reports to be generated.
- The license for "reports" has not been renewed and is substantially older than the rest of the suite.
Note that we have a perpetual ARM/Allinea license, so you can continue using the ARM/Allinea tools even when we no longer have support.
Starting DDT and MAP
To use DDT and MAP, load the arm module:
module load arm
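As a quick check that the module load took effect, the sketch below looks for the tools on the PATH. The helper name check_tools is hypothetical, and the command name map for the MAP profiler is an assumption (only the ddt command appears in this guide).

```shell
# Report whether each named command is on the PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# "map" is an assumed command name; adjust it if your installation differs.
check_tools ddt map || echo "Some tools were not found; run \"module load arm\" first."
```

If a tool is reported as missing after loading the module, "module list" shows which modules are currently loaded.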
Next, check whether your desktop computer accepts X connections. Issue
xterm &
If your desktop machine accepts X connections, then this command will open a window.
If no window opened, try logging in to the Condo cluster with the following command:
ssh -X condo2017.its.iastate.edu
Once logged in to condo2017, issue "xterm &".
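When X forwarding is active, ssh normally sets the DISPLAY variable on the remote side. A minimal sketch for checking this before starting a graphical tool follows; the function name display_is_set is hypothetical.

```shell
# Return 0 when DISPLAY is set to a non-empty value, 1 otherwise.
display_is_set() {
    [ -n "${DISPLAY:-}" ]
}

if display_is_set; then
    echo "DISPLAY is set to $DISPLAY"
else
    echo "DISPLAY is not set; X forwarding is probably not active"
fi
```

A set DISPLAY does not guarantee that windows can actually be opened, so the xterm test remains the more reliable check.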
You can also try setting the environment variable DISPLAY to tell condo2017 which machine to display the DDT window on:
setenv DISPLAY desktop_machine_name:0.0
(setenv is csh/tcsh syntax; in bash the equivalent is "export DISPLAY=desktop_machine_name:0.0".)
To find out the desktop_machine_name, issue on condo2017:
Alternatively you can install a remote client on your desktop, which you can download from https://developer.arm.com/tools-and-software/server-and-hpc/downloads/arm-forge
The remote client version should correspond to the version of Arm Forge installed on the cluster. (Client version 20.0.3 will work with the server version 20.0.3.) After installing the remote client, start the program and set up the Remote Launch by entering the following in the Host Name field:
<user>@condo2017.its.iastate.edu
where <user> is your user name on condo2017. In the Remote Installation Directory field enter
/shared/hpc/arm/forge
See details in the Arm DDT and MAP User Guide.
To give the debugger information about the program, compile the program with a debug flag. For the compilers installed on the Condo cluster, this flag is "-g". To start DDT, type the following:
ddt [program_name [arguments] ]
- The Arm DDT and MAP User Guide explains in more detail how to debug and profile programs. You can also use the following example to get acquainted with DDT.
This example shows how to compile an MPI C program and to run it under DDT debugger on two compute nodes. The provided example does not have an error. After running it as explained below, try to introduce errors in the code (for example, declare array name to be of size 10) and run again.
- Copy the example program to your directory:
cp /home/SAMPLES/mpihello.c ./
- Load the intel and arm environment modules:
module load intel arm
- Compile the program using the debug compiler flag "-g":
mpiicc -g mpihello.c
- Start DDT debugger:
ddt &
First a small window saying "Arm FORGE" will open; it will then be replaced by a larger window.
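The command-line steps above, up to starting DDT, can be collected into one shell function. This is a sketch, not a supported script: the function name build_example is hypothetical, and it assumes a condo2017 login shell where the module command, the sample file, and mpiicc are available. Elsewhere it prints a message and does nothing.

```shell
# Copy the sample, load the modules, and compile with the debug flag.
# Returns non-zero as soon as any step fails.
build_example() {
    src=$1
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    cp "$src" ./ || return 1
    module load intel arm || return 1
    # No output name is given, so the compiler writes ./a.out
    mpiicc -g "$(basename "$src")" || return 1
    echo "Executable for the DDT application field: $PWD/a.out"
}

build_example /home/SAMPLES/mpihello.c || echo "example not built (not on the cluster?)"
```

Calling the function directly, rather than inside $( ), keeps the loaded modules in the current shell, so ddt can be started from the same shell afterwards.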
Click on RUN to run and debug a program.
In the application field, enter the full path of the executable that you would like to debug. If you issued the cp command above from your home directory, enter "/home/<user>/a.out", where <user> is your user name on condo2017. You can also search for the executable by clicking on the yellow folder button to the right of the application field.
Select the checkboxes next to the "MPI" and "Submit to Queue" options.
Click on the "Configure..." button in the "Submit to Queue" option.
In the "Submission template file" field enter "/shared/hpc/arm/forge/templates/slurm.qtf". After entering this, four other fields will be populated.
Click on "OK" button.
Click on the "Parameters..." button and type in the required Wall Clock Limit (default is 30 minutes). You may also want to type in the queue name "debug", since debug nodes are usually available. However, the debug queue has a two-hour limit; if your job needs more than 2 hours, do NOT type "debug" in the Queue field. You can also use the "free" queue, which is usually less busy than the regular queue.
In the MPI section select 4 processes and 2 nodes.
Click on "Submit" button at the bottom.
DDT will submit your job to the queue. When the job starts running, DDT will attach to the processes. In the central part of the window you will see the code.
Right-click on line 18 (on the line number) and select "Add breakpoint for All". Do the same at line 21.
Click on the green triangle in the upper left corner of the window. A small window should appear telling that processes 0-3 stopped at breakpoint in line 18. Click on "Pause".
Explore the information provided in the lower part of the window (click on different tabs) and the values of variables displayed on the right. Select the squares numbered 0 through 3 to see information for the individual processes. Notice that MyId has a different value in each process (select the Locals tab to see local variables) and that name still contains garbage at this point.
Once again click on the green triangle in the upper left corner of the window. The processes will be stopped at the next breakpoint, and a small notification window will appear. Select "Pause" and see what changed. In the Input/Output window you should now see Hello messages from all 4 processes, and in the variable window name should now have the correct value.
To finish the execution, click on the green triangle, and when prompted about restarting session, select "No".
You can start a new session by selecting the appropriate option in the File menu.
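The queue choice described in the Parameters step of the walkthrough can be sketched as a small check. The function name check_queue_limit is hypothetical, and the HH:MM:SS output is simply the form that Slurm's --time option accepts; whether the DDT dialog expects the same format is not verified here.

```shell
# Arguments: queue name (may be empty) and wall clock limit in minutes.
# Fails if the debug queue is requested for more than two hours;
# otherwise prints the limit as HH:MM:SS.
check_queue_limit() {
    queue=$1
    minutes=$2
    if [ "$queue" = "debug" ] && [ "$minutes" -gt 120 ]; then
        echo "debug queue is limited to 120 minutes; do not enter \"debug\" in the Queue field" >&2
        return 1
    fi
    printf '%02d:%02d:00\n' $((minutes / 60)) $((minutes % 60))
}

check_queue_limit debug 30    # prints 00:30:00
check_queue_limit debug 180 || echo "use the regular or free queue instead"
```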
The following is a list of important functions provided by DDT:
- Stepping through the program one statement at a time and displaying all variable values.
- Running the program to a selected statement where a breakpoint has been set and then displaying variable values.
- Stepping into and out of functions/subprograms.
- Running a program until it stops where a problem occurs, which shows where execution stopped and the values of variables at that point.
- Stepping with all MPI processes together or with selected MPI processes independently.
The following summarizes the most useful items from the toolbars at the top of the DDT window:
- File: restart and exit sessions, configure DDT.
- Edit/Go to Line: allows one to easily go to any line of the program displayed in the middle window and to search for strings.
- Window: allows one to turn on and off viewing of various items.
- Help: allows one to view and print the DDT User Guide and displays a FAQ.
- Control/Focus on current: allows one to select a group, process or thread for stepping through the program.
- Play: starts program execution from where it was stopped and runs to the next breakpoint or until the program stops.
- Pause: pauses program execution on all MPI processes.
- Step Into: execution proceeds into the function/subroutine if stopped at a function or subroutine; otherwise, execution continues to the next statement.
- Step Over: execution continues to the next statement and not into a function/subroutine.
- Step Out: allows one to step out from a function/subroutine.
Values of local variables are displayed in the right center window for the MPI process selected for the place where execution has stopped. Multi-dimensional array values can be viewed by selecting Multi-Dimensional Array Viewer listed under View.