CHAPTER 4

Debugging a Program

This chapter discusses how to debug message-passing programs in the Prism environment. It also describes how to use events to control the execution of a program.

Note that many principles that apply to debugging serial programs also apply to debugging message-passing programs. However, debugging a message-passing program can be considerably more complex than debugging a serial program, since you are in effect debugging multiple individual programs concurrently. The Prism environment's concept of psets lets you focus your debugging efforts on the processes that are of particular interest.

For information about debugging serial programs, see Appendix C.

This chapter is organized into the following sections:


Overview of Events

A typical approach to debugging is to stop the execution of a program at different points so that you can perform various actions, such as checking the values of variables. You stop execution by setting a breakpoint. If you perform a trace, execution stops, then automatically continues.

In the Prism environment, breakpoints and traces are referred to as events. Before the execution of a program begins, you can specify what events are to take place during execution. When an event occurs:

1. The execution pointer moves to the current execution point.

2. A message is printed in the command window.

3. If you specified that an action was to accompany the event, it is performed. An example of this might be to print a variable's value.

4. If the event is a trace, execution then continues. If it is a breakpoint, execution does not resume until you explicitly order it to.

The Prism environment provides various ways of creating these events -- for example, by issuing commands or by using the mouse in the source window. Setting Breakpoints describes how to create breakpoint events; Tracing Program Execution describes how to create trace events. Using the Event Table describes the Event Table, which provides a unified method for listing, creating, editing, and deleting events.

See Events Taking Pset Qualifiers for a discussion of events in the Prism environment.

You can define events so that they occur:

You can include one or more Prism commands as actions that are to take place as part of the event. One example of this would be to define an event that tells the Prism environment to stop at line 25, print the value of x, and do a stack trace.


Using the Event Table

The Event Table provides a unified method for controlling the execution of a program. Creating an event in any of the ways discussed later in this chapter adds an event to the list in this table. You can also display the Event Table and modify its contents directly by:

To display the Event Table, select Event Table from the Events menu.

This section describes the general process of using the Event Table.

Description of the Event Table

FIGURE 4-1 shows the Event Table.

The top area of the Event Table is the event list -- a scrollable region in which events are listed. When you execute the program, the Prism environment uses the events in this list to control execution. Each event is listed in the format you would use to enter it as a command in the command window. It is prefaced by an ID number assigned by the Prism environment, which is 1 in the FIGURE 4-1 example.

The middle area of the Event Table is a series of fields that you fill in when editing or adding an event; not all of the fields are relevant to every event. The fields are:

 FIGURE 4-1 Event Table

Screenshot of the Event Table. The standard (not shortcut) buttons are New, Save, Replace, Delete, Close, and Help.

The buttons beneath these fields are for use in creating and deleting events; they are described below.

The area headed Common Events contains buttons that provide shortcuts for creating certain standard events.

Click on Close or press the Esc key to cancel the Event Table window.

Adding an Event

You can either add an event explicitly, editing field by field, or you can use the Common Events buttons to automatically fill in some of the fields for you. You can add an event from the beginning if it is not similar to any of the categories covered by the Common Events buttons.


To Add an Event, Editing Field by Field

1. Click on the New button.

All values currently in the fields are cleared.

2. Fill in the relevant fields to create the event.

3. Click on the Save button to save the new event.

It appears in the event list.


To Add an Event, Using Common Events Buttons

1. Click on the button for the event you want to add -- for example, Print.

This fills in certain fields and highlights all fields that you need to fill in.

2. Fill in the highlighted field(s).

You can also edit other fields if you like.

3. Click on Save to add the event to the event list.

Most of these Common Events buttons are also available as separate selections in the Events menu. This lets you add one of these events without having to display the entire Event Table. The menu selections, however, prompt you only for the field(s) you must fill in. You cannot edit other fields.

Individual Common Events buttons are discussed throughout the remainder of this guide.

You can also create a new event by editing an existing event; see Editing an Existing Event.

Deleting an Existing Event

You can delete events using the Event Table or the Delete selection from the Events menu.


To Delete an Existing Event

1. Click on the line representing the event in the Event Table or move to it with the up and down arrow keys.

This causes the components of the event to be displayed in the appropriate fields beneath the list.

2. Click on the Delete button.

You can also select Delete from the Events menu to display the Event Table. You can then follow the procedure described above.

Deleting a breakpoint at a program location also deletes the B in the line-number region at that location.

Editing an Existing Event

You can edit an existing event to change it or to create a new event similar to it.


To Edit an Existing Event

1. Click on the line representing the event in the event list or move to it with the up or down arrow keys.

This causes the components of the event to be displayed in the appropriate fields beneath the list.

2. Edit these fields.

You can, for example, change the Location field to specify a different location in the program.

3. Save the newly edited event.

Click on the Save button to save the new event in addition to the original version of the event; it is given a new ID and is added to the end of the event list. Clicking on Save is a quick way of creating a new event similar to an event you have already created.

Disabling and Enabling Events

You can disable and enable events. When you disable an event, the Prism environment keeps it in the event list, but it no longer affects execution. You can subsequently enable it when you once again want it to affect execution. This can be more convenient than deleting events and then redefining them.


To Disable an Event

Perform one of the following:

In the following example, the sequence of commands displays the event list, then disables an event, and then redisplays the event list:

(prism all) show events
(1) trace
(2) when stopped { print board }
(prism all) disable 1
event 1 disabled
(prism all) show events
(1) trace (disabled)
(2) when stopped { print board }


To Enable an Event

Type:

(prism all) enable event_ID

This re-enables event_ID.
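
The bookkeeping described in this section can be sketched in a few lines of Python. This is only an illustrative model, not the Prism environment's implementation; the `Event` and `EventList` names are hypothetical. It shows that a disabled event stays in the list under its ID, is marked in the listing, and no longer counts among the events that affect execution.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: int
    command: str
    enabled: bool = True


@dataclass
class EventList:
    """Toy model of an event list (hypothetical, for illustration only)."""
    events: list = field(default_factory=list)
    next_id: int = 1

    def add(self, command):
        # Each new event receives the next ID and goes to the end of the list.
        event = Event(self.next_id, command)
        self.next_id += 1
        self.events.append(event)
        return event.event_id

    def _find(self, event_id):
        for event in self.events:
            if event.event_id == event_id:
                return event
        raise KeyError(f"no event with ID {event_id}")

    def disable(self, event_id):
        # The event is kept in the list but stops affecting execution.
        self._find(event_id).enabled = False

    def enable(self, event_id):
        self._find(event_id).enabled = True

    def show(self):
        # Mirrors the listing format used in the examples in this chapter.
        return [
            f"({e.event_id}) {e.command}" + ("" if e.enabled else " (disabled)")
            for e in self.events
        ]

    def active(self):
        # Only enabled events influence execution.
        return [e.command for e in self.events if e.enabled]


if __name__ == "__main__":
    table = EventList()
    table.add("trace")
    table.add("when stopped { print board }")
    table.disable(1)
    print("\n".join(table.show()))
```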

Saving Events

Events that you create for a program are automatically maintained when you reload the same program during a Prism session. This saves you the effort of redefining these events each time you reload a program.

Note these points:

  • The Prism environment prints a warning message if it can't maintain an event. This would happen, for example, if an event is supposed to occur at a source line that no longer exists.
  • Changing a program can also change the meaning of events. A breakpoint set at line 32, for example, may still be a valid event, but it may not be the event you want if you have deleted lines earlier in the program.
  • Disabled events become enabled when a program is reloaded.
  • Events are deleted when you leave the Prism environment.

To Save Events to a File

You can use Prism commands to save your events to a file and then execute them from the file rather than interactively.

1. Redirect the output to a file.

For example, the following redirects the list of events to the file primes.events:

(prism all) show events @ primes.events

2. Edit primes.events to remove the ID number at the beginning of each event.

This leaves you with a list of Prism commands.

3. Type:

(prism all) source primes.events

This reads in and executes the commands from primes.events.
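
Step 2 of this procedure can be automated outside the Prism environment. The following Python sketch strips the leading ID from each saved line. It assumes the file has the same layout as the show events listings in this chapter, including a trailing "(disabled)" marker on disabled events; the function name `events_to_commands` and the `keep_disabled` option are hypothetical.

```python
import re

# A leading event ID such as "(12) ", as printed by show events.
_ID_PREFIX = re.compile(r"^\(\d+\)\s*")
# Assumed marker for disabled events, taken from the listings in this chapter.
_DISABLED_SUFFIX = " (disabled)"


def events_to_commands(lines, keep_disabled=False):
    """Turn saved event-list lines into a list of plain Prism commands."""
    commands = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        text = _ID_PREFIX.sub("", text)
        if text.endswith(_DISABLED_SUFFIX):
            if not keep_disabled:
                continue  # leave disabled events out of the command file
            text = text[: -len(_DISABLED_SUFFIX)]
        commands.append(text)
    return commands


if __name__ == "__main__":
    saved = ["(1) trace (disabled)", "(2) when stopped { print board }"]
    for command in events_to_commands(saved):
        print(command)
```

Dropping disabled events by default is a design choice in this sketch: a command read back with source would otherwise be created as an enabled event.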

Events Taking Pset Qualifiers

Events in the Prism environment can take a pset qualifier.


To Specify a Pset Qualifier

Type the pset name in the Pset field in the Event Table, as shown in FIGURE 4-2.

 FIGURE 4-2 Pset Field in the Prism Environment's Event Table

Screenshot of the Pset field in the Prism Environment's Event Table. Buttons are New, Save, Replace, and Delete.

If you do not supply a pset qualifier, the event applies to the current pset.

In the following example, the current pset is all.

(prism all) stop in receive pset notx

Because the pset notx is specified, this command sets a breakpoint in the receive routine for the processes in the set notx. Each process in pset notx stops when it reaches this routine. It is possible, of course, that some processes may never reach this routine. This might become an issue when you include actions in an event.

The following command stops execution for any process in the current pset if the process's value for the variable x is greater than 10.

(prism all) stop if x > 10

Because no other pset was specified in this example, this event applies to the current pset, which is all. The Prism environment evaluates the expression in the condition locally -- that is, separately for each process. Similarly, if a and b are arrays, the following command stops execution for a process in the current set if the sum of the values of a in that process is greater than the sum of the values of b:

(prism all) stop if sum(a) > sum(b)

All processes that are stopped at breakpoints are members of the predefined pset break.
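
Local evaluation can be pictured with a short Python sketch. It is an illustration only, not how the Prism environment is implemented; the function name `processes_that_stop` and the sample values are made up. Each process holds its own value, the condition is tested once per process, and only the processes for which it is true stop.

```python
def processes_that_stop(values_by_process, condition):
    """Return the process numbers whose own value satisfies the condition."""
    return sorted(
        process
        for process, value in values_by_process.items()
        if condition(value)
    )


if __name__ == "__main__":
    # Hypothetical per-process values of x, as in "stop if x > 10".
    x_by_process = {0: 4, 1: 11, 2: 10, 3: 25}
    print(processes_that_stop(x_by_process, lambda x: x > 10))

    # Hypothetical per-process arrays (a, b), as in "stop if sum(a) > sum(b)".
    arrays_by_process = {0: ([1, 2], [5]), 1: ([4, 4], [3])}
    print(processes_that_stop(arrays_by_process, lambda ab: sum(ab[0]) > sum(ab[1])))
```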


To Continue All the Processes in a Pset

Type:

(prism all) cont

The following command causes the processes in pset notx to continue running:

(prism all) cont pset notx

Events and Dynamic Psets

If you use a dynamic pset as a qualifier for an event, its membership is evaluated when you issue the command defining the event. Thus, the following command creates a breakpoint only in the processes that are interrupted at the time the command is issued:

(prism all) stop at 10 pset interrupted

If no processes are currently interrupted, you will receive an error message.

One result of this is that you cannot define events that involve dynamic psets before the program starts execution.
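
The evaluate-at-definition behavior can be modeled as taking a snapshot of the pset's membership. The Python sketch below is an illustration under that assumption; `define_event` and the dictionary layout are hypothetical and do not reflect Prism internals.

```python
def define_event(location, pset_members):
    """Create an event whose process set is fixed when the event is defined."""
    if not pset_members:
        # Mirrors the error reported when the dynamic pset has no members.
        raise ValueError("pset has no members; the event cannot be created")
    # frozenset() copies the membership, so later changes do not affect it.
    return {"location": location, "processes": frozenset(pset_members)}


if __name__ == "__main__":
    interrupted = {2, 5}                  # members when the command is issued
    event = define_event(10, interrupted)
    interrupted.add(7)                    # process 7 is interrupted later
    print(sorted(event["processes"]))     # the event still covers only 2 and 5
```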

Events and Variable Psets

If you specify a user-defined variable pset as a qualifier, its membership is determined by the most recent eval pset command issued for that pset.

As is the case with dynamic psets, you cannot define events that involve variable psets before the program starts execution.

Actions in Events

Events in the Prism environment can take action clauses. For example, the following action clause prints x for the pset foo when the members of foo are stopped at line 10:

(prism all) stop at 10 {print x} pset foo



Note - Associating an action with an event forces a global synchronization at the breakpoint or tracepoint. In the example above, every process in pset foo must stop at line 10 before x can be printed. If a member does not stop at line 10, the action never takes place. In a trace event, all processes in the pset must stop at the specified place and synchronize; the action then takes place, and the processes automatically continue execution.
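
The synchronization rule in the note above can be expressed as a simple check. This Python sketch is only a model of the rule; the function name `action_can_run` and the data layout are hypothetical.

```python
def action_can_run(pset_members, stopped_at, location):
    """Return True only when every member of the pset is stopped at location.

    stopped_at maps a process number to the line where it is stopped;
    processes that are still running are simply absent from the mapping.
    """
    return all(stopped_at.get(process) == location for process in pset_members)


if __name__ == "__main__":
    foo = {0, 1, 2}
    # Process 2 has not reached line 10 yet, so the action must wait.
    print(action_can_run(foo, {0: 10, 1: 10}, 10))
    # All three members are stopped at line 10, so x can be printed.
    print(action_can_run(foo, {0: 10, 1: 10, 2: 10}, 10))
```

If one member never reaches the location, the check never becomes true, which corresponds to the case where the action never takes place.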



You can include an eval pset command as an event action. For example, this evaluates the pset sending when all the members of the current pset are stopped in send:

(prism all) stop in send {eval pset sending}

You receive error messages if it is impossible to evaluate membership in a pset. This would happen, for example, if a variable in the set definition is not active.

Note these limitations in using event actions:

  • You cannot include the following commands that manipulate psets:
    • define pset
    • delete pset
    • process
    • pset
  • You cannot include a pset qualifier in the action. The command in the action clause takes its pset from the pset of the event.
  • You cannot include commands that affect program execution; these are
    • cont and contw
    • run
    • step and stepi
    • next and nexti
    • wait
  • You cannot include the load, reload, return, and core commands.
  • You cannot use an unbounded pset as the context for an event specification. For information about unbounded psets, see Using Unbounded Psets in Commands.

To Display Events by Process

Type:

(prism all) show events (processnumber)

This displays all events associated with the specified process.

Issuing show events with no arguments has its standard behavior. That is, it prints out all events, as shown in the following example:

(prism all) show events
(1) trace
(2) when stopped { print board }
(prism all) disable 1
event 1 disabled
(prism all) show events
(1) trace (disabled)
(2) when stopped { print board }

Events and Deleted Psets

If you create an event that applies to a particular pset and subsequently delete the pset, the event continues to exist. Its printed representation, however, is changed so that it shows the processes that were members of the pset at the time you deleted the set.


Setting Breakpoints

A breakpoint stops execution of a program when a specific location is reached, if a variable or expression changes its value, or if a certain condition is met. This section describes the methods available in the Prism environment for setting a breakpoint.

You can set a breakpoint in the following ways:

  • By using the line-number region
  • By using the Event Table and the Events menu
  • By issuing the command stop or when in the command window

The line-number region is easiest for setting simple breakpoints. However, the other two methods give you greater flexibility, such as in setting up a condition under which the breakpoint is to take place.

In all cases, an event is added to the list in the Event Table. If you delete the breakpoint using any of the methods described in this section, the corresponding event is deleted from the event list. If you set a breakpoint at a program location, a B appears next to the line number in the line-number region.



Note - Secondary (spawned) Prism sessions do not inherit breakpoints set within primary Prism sessions.



Using the Line-Number Region

To use the line-number region to set a breakpoint, the line at which you want to stop execution must appear in the source window. If it does not, you can scroll through the source window (if the line is in the current file) or use the File or Func selection from the File menu to display the source file you are interested in.


To Set a Breakpoint in the Line-Number Region

1. Position the mouse pointer to the right of the line numbers.

The pointer turns into a B.

2. Move the pointer next to the line at which you want to stop execution.

3. Left-click the mouse.

A B is displayed, indicating that a breakpoint has been set for that line.

A message appears in the command window confirming the breakpoint, and an event is added to the event list.

The source line you choose must contain executable code; if it does not, you receive a warning in the command window, and no B appears where you clicked.

4. Shift-click on the letter in the line-number region to display the complete event (or events) associated with it.

See Using the Line-Number Region for more information on the line-number region.


To Delete Breakpoints Using the Line-Number Region

Left-click on the B that represents the breakpoint you want to delete.

The B disappears; a message appears in the command window, confirming the deletion.

What Happens in a Split Source Window

As described in Moving Through the Source Code, you can split the source window to display source code and the corresponding assembly code.

You can set a breakpoint in either pane. The B appears in the line-number region of both panes, unless you set the breakpoint at an assembly code line for which there is no corresponding source line.

Deleting a breakpoint from one pane of the split source window deletes it from the other pane as well.

Using the Event Table and the Events Menu


To Set a Breakpoint Using the Event Table

1. Select Stop <loc> or Stop <var> from the Events menu.

These choices are also available as Common Events buttons within the Event Table itself; see Adding an Event.

2. Perform one of the following:

  • Stop <loc> prompts for a location at which to stop the program. If you want to set the breakpoint at a particular line in the program, enter the line number. Or, to set the breakpoint at the first line of a function or procedure, enter the function or procedure name instead.

 FIGURE 4-3 Stop <loc> Dialog Box

Screenshot of the Stop <loc> dialog box. Buttons are OK, Cancel, and Help.
  • Stop <var> prompts for a variable name. The program stops when the variable's value changes. The variable can be an array, in which case execution stops whenever any element of the array changes. This slows execution considerably.
  • Stop <cond> (available as a Common Events button on the Event Table) prompts for a condition, which can be any expression that evaluates to true or false. The program stops when the condition is met. This slows execution considerably.

See Writing Expressions in the Prism Environment for more information on expressions.

You can also use the Event Table to create combinations of these breakpoints; for example, you can create a breakpoint that stops at a location if a condition is met.
In addition, you can use the Actions field of the Event Table to specify the Prism commands that are to be executed when execution stops.


To Delete Breakpoints Using the Event Table

Perform one of the following:

  • From the Events menu, select Delete.
  • From the Event Table, click on the Delete button.

For more information about deleting events, see Deleting an Existing Event.

Setting a Breakpoint Using Commands


To Set a Breakpoint Using Commands

Type:

(prism all) stop

or

(prism all) when

The when command is an alias for the stop command.

The syntax of the stop command is also used by the stopi, when, trace, and tracei commands. The general syntax for these commands is:

command [variable | at line | in func] [if expr] [{cmd[; cmd...]}] [after n]

where

  • command - Is the name of a command, which can be stop, stopi, when, trace, or tracei.
  • variable - Is the name of a variable. The command is executed if the value of the variable changes. If the variable is an array, an array section, or a parallel variable, the command is executed if the value of any element changes. This form of the command slows execution considerably. You cannot specify both a variable and a program location.
  • line - Specifies the line number where the stop or trace is to be executed. If the line is not in the current file, use the format:
at filename:line-number
  • func - Is the name of the function or procedure in which the stop or trace is to be executed.
  • expr - Is any language expression that evaluates to true or false. This argument specifies the logical condition, if any, under which the stop or trace is to be executed. For example, the following expression evaluates to true whenever variable a is greater than 1.
if a .GT. 1
This form of the command slows execution considerably, unless you combine it with the at line syntax. See Writing Expressions in the Prism Environment for more information on writing expressions in the Prism environment.
  • cmd - Is any Prism command (except attach, core, detach, load, return, run, or step). This argument specifies the actions, if any, that are to accompany the execution of the stop or trace. For example, {print a} prints the value of a. If you include multiple commands, separate them with semicolons.
  • n - Is an integer that specifies how many times a triggering condition is to be reached before the stop or trace is executed; see Overview of Events for a discussion of triggering conditions. This is referred to as an after count. The default is 1. Once the stop or trace is executed, the count is reset to its original value. Note that if there is both a condition and an after count, the condition is checked first.

The first option listed (specifying the location or the name of the variable) must come first on the command line. The other options, if you include them, can be in any order.

For the when command, you can use the keyword stopped to specify that the actions are to occur whenever the program stops execution.

When you issue the command, an event is added to the event list. If the command sets a breakpoint at a program location, a B appears in the line-number region next to the location.
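
As an aid to reading the general syntax, the following Python sketch assembles a command string from its optional parts. It is not part of the Prism environment; the function `build_event_command` and its parameter names are hypothetical, and the trailing pset qualifier follows the placement used in the examples earlier in this chapter rather than the syntax line above.

```python
def build_event_command(command, variable=None, line=None, func=None,
                        condition=None, actions=(), after=None, pset=None):
    """Compose a stop, stopi, when, trace, or tracei command string."""
    if command not in {"stop", "stopi", "when", "trace", "tracei"}:
        raise ValueError(f"unsupported command: {command}")
    targets = [t for t in (variable, line, func) if t is not None]
    if len(targets) > 1:
        # A variable and a program location cannot both be specified.
        raise ValueError("give at most one of variable, line, or func")

    parts = [command]
    # The variable or location, if any, must come first.
    if variable is not None:
        parts.append(str(variable))
    elif line is not None:
        parts.append(f"at {line}")
    elif func is not None:
        parts.append(f"in {func}")
    if condition is not None:
        parts.append(f"if {condition}")
    if actions:
        # Multiple action commands are separated by semicolons.
        parts.append("{" + "; ".join(actions) + "}")
    if after is not None:
        parts.append(f"after {after}")
    if pset is not None:
        parts.append(f"pset {pset}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_event_command("stop", func="foo", actions=["print a"], after=10))
    print(build_event_command("stop", line='"bar":17', condition="a == 0"))
```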

Examples of the stop Command

To stop execution the tenth time function foo is reached and print a, type:

(prism all) stop in foo {print a} after 10

To stop at line 17 of file bar if a is equal to 0, type:

(prism all) stop at "bar":17 if a == 0

To stop whenever a changes, type:

(prism all) stop a

To stop the third time a equals 5, type:

(prism all) stop if a .eq. 5 after 3

To print a and do a stack trace every time the program stops execution, type:

(prism all) when stopped {print a; where}
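
The interaction between a condition and an after count can be modeled as a small counter. The Python sketch below is one reading of the description of n given earlier: the condition is checked first, a pass that fails the condition does not count, and the count is reset once the stop or trace executes. The class name `AfterCount` is hypothetical.

```python
class AfterCount:
    """Model of an event with an optional condition and an after count."""

    def __init__(self, n=1, condition=None):
        if n < 1:
            raise ValueError("the after count must be at least 1")
        self.n = n
        self.condition = condition
        self.remaining = n

    def hit(self, state):
        """Record one pass through the event; return True if it executes."""
        # The condition is checked first; a failed check leaves the count alone.
        if self.condition is not None and not self.condition(state):
            return False
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self.n  # reset to the original value
            return True
        return False


if __name__ == "__main__":
    # Models "stop if a .eq. 5 after 3" over a made-up sequence of values of a.
    event = AfterCount(3, lambda a: a == 5)
    print([event.hit(a) for a in [5, 1, 5, 5, 5, 5, 5]])
```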

To Set a Breakpoint Using Machine Instructions

Type:

(prism all) stopi at machine-address

For example, the following command stops execution at address 1000 (hex):

(prism all) stopi at 0x1000

The history region displays the address and the machine instruction. The source pointer moves to the source line being executed.


To Delete Breakpoints Using the Command Window

1. Type:

(prism all) show events

This prints out the event list. Each event has an ID number associated with it.

2. Type:

(prism all) delete ID [ID ...]

List the ID numbers of the events you want to delete; separate multiple IDs with one or more blank spaces. Use the argument all to delete all existing events. For example, the following command deletes the events with IDs 1 and 3:

delete 1 3


Tracing Program Execution

You can trace program execution by using the Event Table or Events menu or by issuing commands. All methods add an event to the Event Table. If you trace a source line, the Prism environment displays a T next to the line in the line-number region.

Tracing is essentially the same as setting a breakpoint, except that execution continues automatically after the breakpoint is reached. When tracing source lines, the Prism environment steps into procedures if they were compiled with the -g option; otherwise it steps over them as if it had issued a next command.

Using the Event Table and Events Menu


To Trace Program Execution Using the Event Table and the Events Menu

Select Trace, Trace <loc>, or Trace <var> from the Events menu.

These choices are also available as Common Events buttons within the Event Table itself.

  • Trace displays source lines in the command window before they are executed.
  • Trace <loc> prompts for a source line. The Prism environment displays a message immediately prior to the execution of this source line.
  • Trace <var> prompts for a variable name. A message is printed when the variable's value changes. The variable can be an array, an array section, or a parallel variable, in which case a message is printed any time any element changes. This slows execution considerably.
  • Trace <cond> (available as a Common Events button) prompts for a condition, which can be any expression that evaluates to true or false; see Writing Expressions in the Prism Environment for more information on writing expressions. The program displays a message when the condition is met. This also slows execution considerably.

For variations of these traces, you can create your own event in the Event Table. You can also use the Actions field to specify Prism commands that are to be executed along with the trace.


To Delete Traces

Choose the Delete selection from the Events menu, or use the Delete button in the Event Table.

For more information about deleting existing events, see Deleting an Existing Event.

Using the Command Window


To Trace Program Execution Using Commands

Type:

(prism all) trace

Issuing trace with no arguments causes each source line in the program to be displayed in the command window before it is executed.

The trace command uses the same syntax as the stop command; see Setting a Breakpoint Using Commands. For example:

To trace and print a on every source line, type:

(prism all) trace {print a}

To trace line 17 if a is greater than 10, type:

(prism all) trace at 17 if a .GT. 10

In addition, the Prism environment interprets these two commands as being the same:

(prism all) trace line-number

(prism all) trace at line-number


To Trace Machine Instructions

Type:

(prism all) tracei address

When tracing machine instructions, the Prism environment follows all procedure calls down. The tracei command has the same syntax as the stop command; see Setting a Breakpoint Using Commands.

The history region displays the address and the machine instruction. The execution pointer moves to the next source line to be executed.


To Delete Traces

1. Type:

(prism all) show events

This obtains the ID associated with the trace.

2. Type:

(prism all) delete ID

For further information, see Setting a Breakpoint Using Commands.


Displaying and Moving Through the Call Stack

The call stack is the list of procedures and functions currently active in a program. The Prism environment provides you with methods for examining the contents of the call stack.

See Displaying the Where Graph for a discussion of displaying the call stack graphically in the Prism environment.


To Display the Call Stack

Perform one of the following:

  • From the menu bar - Select Where from the Debug menu. The Where window is displayed; see FIGURE 4-4. The window contains the call stack. It is updated automatically when execution stops or when you issue commands that change the stack.

 FIGURE 4-4 Where Window

Screenshot of the Where window. Buttons are Cancel and Help.
  • From the command window - Type where on the Prism command line. If you include a number, it specifies how many active procedures are to be displayed; otherwise, all active procedures are displayed in the history region.
  • From the command window - Type where on snapshot on the Prism command line to put the history-region output into a window.

Values of arguments in displayed procedures are shown in the default radix, which is decimal unless you change it via the set $radix command; see To Change the Default Radix.

Moving Through the Call Stack

Moving up through the call stack means heading toward the main procedure. Moving down through the call stack means heading toward the current stopping point in the program.

Moving through the call stack changes the current function and repositions the source window at this function. It also affects the scope that the Prism environment uses for interpreting the names of variables in expressions and commands. See Scope in the Prism Environment for more information.


To Move Through the Call Stack

Perform one of the following:

  • From the menu bar - Choose Up or Down from the Debug menu. Up moves up one level in the call stack; Down moves down one level. These selections are available by default in the tear-off region.
  • From the command window - Enter up or down on the command line to move up or down one level. To move more than one level, specify an integer argument.
  • From the Where window - If the Where window is displayed, clicking on a function in it changes the stack level to make that function current.
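
The direction of the up and down movements can be illustrated with a small Python model. It is a sketch only; the `CallStack` class is hypothetical, and clamping at either end of the stack is an assumption made for the example (the Prism environment may report an error instead).

```python
class CallStack:
    """Toy call stack: frames[0] is main, frames[-1] is the stopping point."""

    def __init__(self, frames):
        self.frames = list(frames)
        # Start at the current stopping point in the program.
        self.level = len(self.frames) - 1

    def up(self, count=1):
        # Moving up heads toward the main procedure.
        self.level = max(0, self.level - count)
        return self.frames[self.level]

    def down(self, count=1):
        # Moving down heads toward the current stopping point.
        self.level = min(len(self.frames) - 1, self.level + count)
        return self.frames[self.level]


if __name__ == "__main__":
    stack = CallStack(["main", "solve", "receive"])
    print(stack.up())      # one level toward main
    print(stack.down())    # back toward the stopping point
```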

Displaying the Where Graph

Selecting Where from the Debug menu displays the call stacks for the program being debugged. A multiprocess program can have multiple call stacks, one for each process. A threaded program can have a separate stack for each thread in each process.

To show the relationships among these call stacks, the Prism environment provides a Where graph; this window displays a snapshot of the dynamic call graph of the program. The Where graph displays information about all processes that are not running.


To Display the Where Graph

Perform one of the following:

  • From the menu bar - Choose Where from the Debug menu.
  • From the command line - Type where on dedicated.

A window like the one shown in FIGURE 4-5 is displayed.

 FIGURE 4-5 Where Graph

Screenshot of the Where Graph. Buttons are Cancel and Help.

The Where graph centers on the current process of the current pset. That is, the processes related to it are lined up in a single column. In FIGURE 4-5, process 0 is the current process. If you change the current process, the Where graph rearranges itself. The default zoom level of the Where graph shows the arguments for the current process.

The line numbers at the bottom of each box indicate where processes branch.


To Display Processes Containing a Specific Function in Their Call Stacks

Shift-click on the function's box.

This displays a pop-up window showing the numbers of the processes with this function in their call stack, along with their arguments.

Panning and Zooming in the Where Graph

As FIGURE 4-6 shows, the Where graph can get quite large, so the Prism environment provides methods for panning through it and zooming in and out.

The white box in the navigator rectangle at the top of the window shows the position of the display area relative to the entire Where graph.


To Move the Position Displayed in the Where Graph

Perform one of the following:

  • Drag the box.
  • Click at a spot in the navigator.

The box moves to that spot, and the window shows the Where graph in this area of the total display.


To Display More of the Where Graph

Click on the Zoom down arrow to the right of the navigator.

This reduces the size of the boxes representing the functions and removes information. FIGURE 4-6 shows the Where graph of FIGURE 4-5, zoomed out one level. Note that the information about the current process's arguments is gone.

As you zoom further out, the Where graph removes the line numbers, and one more level after that removes the function names, leaving only boxes connected by lines.

 FIGURE 4-6 Where Graph, Zoomed Out One Level

Screenshot of the Where Graph, zoomed out one level.

To Display Additional Information About a Box in the Where Graph

Shift-click on a box to display information about it.

If your program is multithreaded, its call stacks are not rooted at main. Thus, at maximum zoom, the Where graph displays the call stacks as multiple trees, as shown in FIGURE 4-7.

 FIGURE 4-7 Where Graph, Zoomed Out to the Maximum

Screenshot of the Where Graph, zoomed out to the maximum.

To Increase the Size of the Where Graph's Function Boxes

Click on the Zoom up arrow.

This increases the size of the function boxes and includes more information in them. FIGURE 4-8 shows the Where graph of FIGURE 4-5, zoomed in. In this case, the Where graph shows, for each function, the processes that have that function in their call stack. As in the Psets window, the processes are represented as bitmaps of cells, with numbering starting at the upper left, increasing from left to right and then jumping to the next row.
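The cell numbering described above is row-major. As a minimal sketch, assuming a hypothetical bitmap that is `width` cells wide (the actual width is not specified here and depends on the display), the position of a process's cell could be computed like this:

```python
def cell_position(process, width):
    """Return (row, column) of a process's cell, both counted from 0.

    Cells are numbered from the upper left, left to right, then row by row.
    `width` (cells per row) is a hypothetical parameter for illustration.
    """
    return divmod(process, width)


# With 8 cells per row, process 10 falls in the second row, third column.
print(cell_position(10, 8))
```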

If your Where graph displays a threaded program, you can zoom in to the level shown in FIGURE 4-9.

 FIGURE 4-8 Where Graph, Zoomed In

Screenshot of the Where Graph, zoomed in.

Zooming in another level shows all arguments for all processes.

 FIGURE 4-9 Where Graph of a Threaded Program, Zoomed in to Show Thread Stripes

Screen shot of the Where Graph, zoomed in to show thread stripes.

To View Information About Individual Threads

Shift-click on the individual stripes.

This displays information about the corresponding threads.


To Shrink Selected Portions of the Where Graph

You can shrink selected portions of the Where graph. This is useful if you want to see the overall structure of the graph but also want to focus on certain functions.

Perform one of the following:

  • Middle-click on a function to iconify it and all of its children. Middle-click on an iconified function to re-expand it and its children to the current zoom level.
  • Alternatively, you can click on the (De)iconify Node button next to the Zoom arrows at the top of the Where graph. This changes the mouse pointer to a target. You can then left-click on a function to iconify it and its children. If it is already iconified, left-clicking on it will re-expand it and its children. To cancel the operation, left-click anywhere outside of the boxes surrounding the functions.

To Move Through the Where Graph

When you first display the Where graph, the main function is highlighted.

Left-click on a function to highlight it, or move through the Where graph using the keyboard:

  • Use the up arrow key to move to the parent of the highlighted function.
  • If line numbers are visible in the highlighted function, by default the leftmost number is selected by having a box drawn around it. Use the left and right arrows to select other line numbers in the function. You can then use the down arrow key to highlight the function called at the selected line.

To Make a Function the Current Pset

Press the spacebar while in the Where graph.

The following actions occur:

  • The current function changes to the function that is highlighted in the Where graph.
  • The highlighted function is displayed in the source window.
  • A new current pset is created, with the same name as the function and containing the processes with this function in their call stack. The current process of this pset is the lowest-numbered process in the set.


Combining Debug and Optimization Options

When you use the Prism environment on programs that have been compiled with optimization options, Prism commands behave differently and the visibility of variables in the optimized programs changes.

Interpreting Interaction Between an Optimized Program and the Prism Environment

When the control flow is inside a routine that has been compiled with both -g and an optimization option (a debuggable optimized routine), the next and step commands change their behavior:

  • next steps out of the current routine and stops in the next debuggable routine that differs from the original routine.
  • step stops in the next debuggable routine (including recursive calls of the original routine).

You can set breakpoints using the stop at command inside debuggable optimized routines only at the first line of such a routine. If the routine name is foo and the first instruction in foo is ADDR_INSTR, then the breakpoint is set as if you had used stop in foo or stopi at ADDR_INSTR.

Note that the following commands are unaffected:

  • nexti
  • stepi
  • stopi

When either return or stepout is used to return control flow to a debuggable optimized routine, the Prism environment assumes that the current position is at the first line of the current routine. The Prism environment makes the same assumption when the source file position is updated as a result of up or down commands that result in a debuggable optimized routine.

Accessing Variables in Optimized Routines

Because optimization changes where variables are stored in an executable program, the Prism environment cannot access all variables at all times.

The accessibility of variables can be defined by whether the variables can be used in expressions that require the right value of the variable (such as print X or call foo(X)) or the left value of the variable (such as assign X=1).

The limits of accessibility can be described by the flow of control in an optimized program. When the flow of control is in a routine compiled with both -g and an optimization flag, the following conditions apply:

  • If the control flow is at the first machine instruction of the routine (which has not yet been executed), then all global variables and the routine's arguments are accessible. No other local variable is accessible.
  • If the first machine instruction of the current routine has already been executed, then only the global variables are accessible. No local variable is accessible.
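The two conditions above can be summarized as a small decision function. This is only an illustrative model of the rules as stated, not part of the Prism environment; the function name and the labels "global", "argument", and "local" are hypothetical.

```python
def is_accessible(kind, first_instruction_executed):
    """Model variable accessibility inside a debuggable optimized routine.

    kind: "global", "argument", or "local" (hypothetical labels).
    first_instruction_executed: True once the routine's first machine
    instruction has been executed.
    """
    if kind == "global":
        return True                             # globals are always accessible
    if kind == "argument":
        return not first_instruction_executed   # only at routine entry
    return False                                # other locals are never accessible


# Print accessibility at routine entry and after the first instruction.
for kind in ("global", "argument", "local"):
    print(kind, is_accessible(kind, False), is_accessible(kind, True))
```

In practice this means that a routine's arguments are best inspected while stopped at the first line of the routine, before continuing.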

The following commands can use only accessible variables:

  • assign
  • call
  • display
  • dump
  • print
  • trace
  • tracei
  • varsave
  • when
  • where

The where command reports all active stack frames that have a frame pointer. It does not report routines that have no frame pointer or routines that have been inlined.



Note - The where stack displays values only for accessible arguments and `???' for all others.




Debugging Spawned Sun MPI Processes

When debugging Sun MPI jobs that spawn other Sun MPI jobs, you should be especially careful to ensure that Sun MPI or Prism processes do not exit while other processes depend on communicating with them.

For example, suppose MPI job foo spawns MPI job bar, job foo uses MPI_Send to communicate with a process in job bar, and job bar uses MPI_Recv to handle a message from job foo.

If you are debugging both jobs in the Prism environment and you issue the Prism quit command in the primary Prism session (foo) before the process in foo calls the MPI_Send function, then job foo will exit. However, bar (which you are still debugging in a secondary Prism session) cannot continue past the MPI_Recv call, because foo has already exited.

If you issue a quit -all command in the primary Prism session while debugging a job that has many deeply nested MPI_Comm_spawn calls, it may not terminate all spawned secondary Prism sessions. To terminate a secondary debug session, you must manually issue the quit command in the secondary Prism session(s).

Debugging Spawned Sessions Using the Commands-Only Interface

When the Prism environment is started with the -CX option, it opens new X terminal windows in response to the spawning of new processes. It labels a new window with the title aout:jid, where jid is the job ID of the spawned process.

You must set the DISPLAY variable if you debug programs with calls to MPI_Comm_spawn() or MPI_Comm_spawn_multiple(), even when launching the Prism environment with the commands-only interface. For more information about the commands-only interface, see Appendix A.

Prism Commands With Special Functions in Spawned Sessions

Several Prism commands perform special functions in spawned Prism sessions.

  • attach - The Prism attach command enables you to attach to an executable without issuing a prior load command. You can simply attach to the process ID or job ID, as follows:
(prism all) attach jid
The attach command will clean up the current session before attaching to the jid (job ID) specified in the command.
The attach command does not accept multiple job IDs.
However, if the specified job ID is a result of an MPI_Comm_spawn_multiple(), multiple Prism sessions get created.
  • detach - The detach command only applies to the Prism session where it is invoked. If you issue the detach command in a primary session, it is not propagated down to secondary sessions.
  • run and rerun - When you issue the run or rerun commands in the primary Prism session, the Prism environment will clean up all the secondary Prism sessions. That is, the Prism environment will shut down the secondary Prism sessions and the debuggees.
The run and rerun commands are not valid in the secondary Prism sessions.
  • kill - If you issue a kill command in a primary Prism session, the command propagates to the secondary Prism sessions. That is, the Prism environment shuts down the secondary Prism sessions and the debuggees.
  • quit - The quit command does not propagate down to the secondary sessions unless you issue the command with the -all option.
To quit all Prism sessions, you must type:
(prism all) quit -all
If the job was run by the primary Prism session, the command quit -all kills the debuggees in the primary as well as the secondary Prism sessions and closes all the Prism sessions.
If you attached to the job in the primary Prism session, quit -all leaves the debuggees running and closes all the Prism sessions.
The -all option is valid only in the primary Prism session.
For convenience, you can add the quit command with the -all option to the tear-off region of the Prism graphical interface. For example,
(prism all) pushbutton quitall "quit -all"
This creates a button labeled quitall in the tear-off region.
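The propagation rules in the list above can be collected into a small lookup. This is a sketch for reference only: the dictionary, the effect labels, and the helper function are illustrative and are not part of the Prism environment.

```python
# Effect on secondary Prism sessions of a command issued in the PRIMARY
# session, as described in the list above. The labels are illustrative.
EFFECT_ON_SECONDARIES = {
    "detach": "unaffected",   # applies only to the session where it is invoked
    "run": "shut down",       # secondary sessions and debuggees are cleaned up
    "rerun": "shut down",
    "kill": "shut down",      # propagates to the secondary sessions
    "quit": "unaffected",     # does not propagate without -all
    "quit -all": "closed",    # closes every Prism session
}

# Commands that are not valid when issued in a secondary session.
PRIMARY_ONLY = {"run", "rerun", "quit -all"}


def allowed_in_secondary(command):
    """Return True if the command may be issued in a secondary session."""
    return command not in PRIMARY_ONLY


print(EFFECT_ON_SECONDARIES["quit"], allowed_in_secondary("rerun"))
```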

Error Conditions Arising From Spawned Sessions

TABLE 4-1 lists and explains error messages that may be displayed when error conditions are encountered in debugging spawned processes.

TABLE 4-1 Error Messages Related to Debugging of Spawned Processes

Error Message

Description

Command not allowed in spawned prisms

The Prism environment displays this error message when a user attempts to issue a run, rerun, or quit -all command in a secondary Prism session.

Timed out waiting for spawned prism

The Prism environment displays this message when system conditions prevent a secondary Prism session from starting and successfully communicating its status to the primary Prism session.

In such a situation, quit the primary Prism session and all secondary sessions by issuing a quit -all command in the primary Prism session. Then repeat the debugging session with follow_spawn disabled.

Nodal startup failed in spawned prism

The Prism environment displays this error message when the Prism executable on a specific node fails to start due to a system error on that node. Quit and repeat the debugging session with follow_spawn disabled.

Timed out connecting to parent prism

The Prism environment displays this error message in the secondary Prism session if it fails to connect to the primary Prism session and exchange status information with it.

Failed to spawn new prism

The Prism environment displays this error message in the primary Prism session if it fails to spawn a new Prism session, either to debug a newly spawned Sun MPI debuggee or to debug additional Sun MPI jobs specified on the prism command line.

Could not continue stopped processes

The Prism environment displays this error message when it fails to change debuggee process status from the stopped state to the running state. In such cases, partial debugging of Sun MPI jobs with different executables (debugging only some of the processes) cannot usually be continued because the debugged processes cannot communicate with the non-debugged ones.

follow_spawn requires the MPI library to be linked in

The Prism environment displays this error message when you attempt to use spawn-related Prism commands, such as

set follow_spawn = on

set debug_spawn_aout = list

while debugging a non-MIMD executable--that is, an executable that does not have the Sun MPI library linked in.


For more information about using the Prism environment with Sun MPI programs that issue calls to MPI_Comm_spawn() or MPI_Comm_spawn_multiple(), see Enabling Support for Spawned MPI Processes.


Examining the Contents of Memory and Registers

You can issue commands in the command window to display the contents of memory addresses and registers.


To Display Memory

Specify the address on the command line, followed by a slash (/).

The following displays the memory contents at address 0x10000 (hexadecimal).

(prism all) 0x10000/

If you specify the address as a period, the Prism environment displays the contents of the memory address immediately following the one printed most recently.

Specify a symbolic address by preceding the name with an &. For example, this prints the contents of memory for variable x:

(prism all) &x/

The address you specify can be an expression made up of other addresses and the operators +, -, and indirection (unary *). For example, this prints the contents of the location 100 addresses above address 0x1000:

(prism all) 0x1000+100/
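In this expression the base is written in hexadecimal, while the offset 100 is assumed to be decimal, which is an inference from the wording above rather than something stated explicitly. Under that assumption, the resulting address can be checked with ordinary arithmetic:

```python
base = 0x1000          # base address from the example above
offset = 100           # assumed to be decimal
address = base + offset
print(hex(address))    # 0x1064
```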

After the slash you can specify how memory is to be displayed. TABLE 4-2 lists the supported memory address formats.

TABLE 4-2 Memory Address Formats

Format   Description
d        Print a short word in decimal
D        Print a long word in decimal
o        Print a short word in octal
O        Print a long word in octal
x        Print a short word in hexadecimal
X        Print a long word in hexadecimal
b        Print a byte in octal
c        Print a byte as a character
s        Print a string of characters terminated by a null byte
f        Print a single-precision real number
F        Print a double-precision real number
i        Print the machine instruction


The initial format is X. If you omit the format in your command, you get either X (if you haven't previously specified a format) or the format you specified previously.

You can print the contents of multiple addresses by specifying a number after the slash (and before the format). For example, this displays the contents of eight memory locations starting at address 0x1000:

(prism all) 0x1000/8X

These contents are displayed as hexadecimal long words.
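To make the combination of count and format concrete, the following sketch decodes a block of bytes the way the integer formats in TABLE 4-2 describe. It is an illustration, not the Prism implementation. Several details are assumptions that the text does not state: a short word is taken to be 2 bytes, a long word 4 bytes, decimal formats are treated as signed, and bytes are read in big-endian order. The names `FORMATS` and `examine` are hypothetical.

```python
import struct

# Assumptions, not stated in the text: a short word is 2 bytes, a long word
# is 4 bytes, decimal formats are signed, and bytes are read big-endian.
FORMATS = {
    "d": (">h", "d"),   # short word, decimal
    "D": (">i", "d"),   # long word, decimal
    "o": (">H", "o"),   # short word, octal
    "O": (">I", "o"),   # long word, octal
    "x": (">H", "x"),   # short word, hexadecimal
    "X": (">I", "x"),   # long word, hexadecimal
    "b": (">B", "o"),   # byte, octal
}


def examine(memory, offset, count=1, fmt="X"):
    """Decode `count` consecutive items from `memory`, starting at `offset`."""
    code, radix = FORMATS[fmt]
    size = struct.calcsize(code)
    return [
        format(struct.unpack_from(code, memory, offset + i * size)[0], radix)
        for i in range(count)
    ]


memory = bytes(range(16))            # bytes 00 01 02 ... 0f
print(examine(memory, 0, 2, "X"))    # two long words in hexadecimal
print(examine(memory, 0, 2, "x"))    # two short words in hexadecimal
```

The same sixteen bytes yield different items depending on the format letter, because the letter selects both the item size and the radix, while the count only says how many consecutive items to show.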


To Display the Contents of Registers

You can examine the contents of registers in the same way that you examine the contents of memory.

Specify a register by preceding its name with a dollar sign.

For example, this prints the contents of the f0 register:

(prism all) $f0/

Specify a number after the slash to print the contents of multiple registers. For example, this prints the contents of registers f0, f1, and f2:

(prism all) $f0/3

The order in which the registers are displayed is that shown in TABLE 4-3.

You can also specify a format, as described above. The format specifier controls the display of the output; it doesn't affect how much of the register contents is displayed. Thus, this displays three registers:

(prism all) $f0/3X

The output is displayed as hexadecimal long words. The following table lists the names and descriptions of the UltraSPARC registers.

TABLE 4-3 UltraSPARC Registers

Name             Register
$g0-$g7          Global registers (64 bits)
$o0-$o7          Output registers (64 bits)
$l0-$l7          Local registers
$i0-$i7          Input registers
$psr             Processor state register
$pc              Program counter
$npc             Next program counter
$y               Y register
$wim             Window invalid mask
$tbr             Trap base register
$f0-$f31         Floating-point registers
$fsr             Floating status register (64 bits)
$f0f1-$f62f63    Floating-point registers
$xg0-$xg7        Upper 32 bits of $g0-$g7 (SPARC V8 plus only, or higher)
$xo0-$xo7        Upper 32 bits of $o0-$o7 (SPARC V8 plus only, or higher)
$xfsr            Upper 32 bits of $fsr (SPARC V8 plus only, or higher)
$fprs            Floating-point registers state (SPARC V8 plus only, or higher)
$tstate          Trap state register (SPARC V8 plus only, or higher)
$fp              Frame pointer (synonym for $i6)
$sp              Stack pointer (synonym for $o6)