DIGITAL Fortran 90
User Manual for
DIGITAL UNIX Systems



5.6.3 Use Efficient Data Types

In cases where more than one data type can be used for a variable, consider selecting the data types based on the following hierarchy, listed from most to least efficient:

However, keep in mind that in an arithmetic expression, you should avoid mixing integer and floating-point (REAL) data (see Section 5.6.2).

5.6.4 Avoid Using Slow Arithmetic Operators

Before you modify source code to avoid slow arithmetic operators, be aware that optimizations convert many slow arithmetic operators to faster arithmetic operators. For example, the compiler optimizes the expression H=J**2 to be H=J*J.

Consider also whether replacing a slow arithmetic operator with a faster arithmetic operator will change the accuracy of the results or impact the maintainability (readability) of the source code.

Replacing slow arithmetic operators with faster ones should be reserved for critical code areas. The following hierarchy lists the DIGITAL Fortran 90 arithmetic operators, from fastest to slowest:

5.6.5 Avoid EQUIVALENCE Statement Use

Avoid using EQUIVALENCE statements. EQUIVALENCE statements can:

5.6.6 Use Statement Functions and Internal Subprograms

Whenever the DIGITAL Fortran 90 compiler has access to the use and definition of a subprogram during compilation, it may choose to inline the subprogram. Using statement functions and internal subprograms maximizes the number of subprogram references that will be inlined, especially when multiple source files are compiled together at optimization level -O4 or higher.

For more information, see Section 5.1.2.
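As a minimal sketch of the two forms named above, the following program defines one statement function and one internal function. The names INLINE_DEMO, CIRCUM, and AREA are hypothetical and chosen only for illustration. Whether the compiler actually inlines either reference depends on the optimization level and on its own heuristics.

```fortran
PROGRAM INLINE_DEMO
  IMPLICIT NONE
  REAL, PARAMETER :: PI = 3.1415927
  REAL :: R, CIRCUM, TOTAL
  INTEGER :: I

  ! Statement function: a one-line function local to this program unit.
  ! It must appear before the first executable statement.
  CIRCUM(R) = 2.0*PI*R

  TOTAL = 0.0
  DO I = 1, 4
     TOTAL = TOTAL + AREA(REAL(I)) + CIRCUM(REAL(I))
  END DO
  PRINT *, 'TOTAL =', TOTAL

CONTAINS

  ! Internal function: defined inside the program unit that calls it,
  ! so its definition is visible whenever its caller is compiled.
  FUNCTION AREA(RADIUS)
    REAL, INTENT(IN) :: RADIUS
    REAL :: AREA
    AREA = PI*RADIUS*RADIUS
  END FUNCTION AREA

END PROGRAM INLINE_DEMO
```

In both cases the definition and every reference sit in the same program unit, which is the situation the paragraph above describes as giving the compiler access to both.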

5.6.7 Code DO Loops for Efficiency

Minimize the arithmetic operations and other operations in a DO loop whenever possible. Moving unnecessary operations outside the loop will improve performance (for example, when the intermediate nonvarying values within the loop are not needed).
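The following sketch shows one way to move a nonvarying operation out of a DO loop by hand. The program name HOIST_DEMO and the variables SCALE, X, and Y are assumptions made for this example. The invariant factor is grouped in parentheses in the first loop so that both loops perform the same arithmetic in the same order.

```fortran
PROGRAM HOIST_DEMO
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 1000
  REAL :: A(N), B(N), SCALE, X, Y
  INTEGER :: I

  X = 2.0
  Y = 0.5
  A = 1.0

  ! Before: SQRT(X)*Y is written inside the loop although it
  ! does not depend on I.
  DO I = 1, N
     B(I) = A(I)*(SQRT(X)*Y)
  END DO

  ! After: the loop-invariant factor is computed once, outside the loop.
  SCALE = SQRT(X)*Y
  DO I = 1, N
     A(I) = A(I)*SCALE
  END DO

  ! Largest difference between the two results.
  PRINT *, MAXVAL(ABS(A - B))
END PROGRAM HOIST_DEMO
```

If the original expression were written without the parentheses, hoisting would regroup the multiplication and could change the rounding of the result in the last bits.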

For More Information:

5.7 Optimization Levels: the -Onum Option

DIGITAL Fortran 90 performs many optimizations by default. You do not have to recode your program to use them. However, understanding how optimizations work helps you remove any inhibitors to their successful function.

Generally, DIGITAL Fortran 90 accepts a longer compile time in exchange for a shorter run time. If an operation can be performed, eliminated, or simplified at compile time, DIGITAL Fortran 90 does so, rather than have it done at run time. The time required to compile the program usually increases as more optimizations occur.

The program will likely execute faster when compiled at -O4, but compiling at -O4 takes longer than compiling at a lower level of optimization.

The size of the object file varies with the optimizations requested. Factors that can increase object file size include increased loop unrolling and procedure inlining.

Table 5-4 lists the levels of DIGITAL Fortran 90 optimization available with the different -O options. For example, -O0 specifies no selectable optimizations (some optimizations always occur), and -O5 specifies all levels of optimization, including loop transformation and software pipelining.

Table 5-4 Levels of Optimization with Different -Onum Options

Optimization Type                             -O0  -O1  -O2  -O3  -O4  -O5
Loop transformation and software pipelining                             X
Automatic inlining                                                 X    X
Additional global optimizations                               X    X    X
Global optimizations                                     X    X    X    X
Local (minimal) optimizations                       X    X    X    X    X

The default is -O4 (same as -O). However, if -g2, -g, or -gen_feedback is also specified, the default is -O0 (no optimizations).

In Table 5-4, the following terms are used to describe the levels of optimization (described in detail in Section 5.7.1 to Section 5.7.6):

5.7.1 Optimizations Performed at All Optimization Levels

The following optimizations occur at any optimization level (-O0 through -O5):

5.7.2 Local (Minimal) Optimizations

To enable local optimizations, use -O1 or a higher optimization level (-O2, -O3, -O4, or -O5).

To prevent local optimizations, specify the -O0 option.

5.7.2.1 Common Subexpression Elimination

If the same subexpressions appear in more than one computation and the values do not change between computations, DIGITAL Fortran 90 computes the result once and replaces the subexpressions with the result itself:


DIMENSION A(25,25), B(25,25) 
A(I,J) = B(I,J) 

Without optimization, these statements can be compiled as follows:


t1 = ((J-1)*25+(I-1))*4 
t2 = ((J-1)*25+(I-1))*4 
A(t1) = B(t2) 

Variables t1 and t2 represent equivalent expressions. DIGITAL Fortran 90 eliminates this redundancy by producing the following:


t = ((J-1)*25+(I-1))*4 
A(t) = B(t) 
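The shared subscript calculation can be checked with a short program. This is only a sketch: the names CSE_DEMO, BFLAT, and T are hypothetical, and T here is a zero-based element offset. The expression in the text additionally multiplies by 4 to form a byte offset for 4-byte REAL elements.

```fortran
PROGRAM CSE_DEMO
  IMPLICIT NONE
  REAL :: A(25,25), B(25,25), BFLAT(625)
  INTEGER :: I, J, T

  ! Give every element of B a distinct value.
  DO J = 1, 25
     DO I = 1, 25
        B(I,J) = REAL(100*J + I)
     END DO
  END DO

  ! One-dimensional copy of B in array element (column-major) order.
  BFLAT = RESHAPE(B, (/ 625 /))

  I = 7
  J = 12
  ! The offset is the same for A(I,J) and B(I,J), so it needs to be
  ! computed only once.
  T = (J-1)*25 + (I-1)
  A(I,J) = B(I,J)

  ! BFLAT is one-based, so the element at offset T is BFLAT(T+1).
  PRINT *, A(I,J) == BFLAT(T+1)
END PROGRAM CSE_DEMO
```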

5.7.2.2 Integer Multiplication and Division Expansion

Expansion of multiplication and division refers to bit shifts that allow faster multiplication and division while producing the same result. For example, the integer expression (I*17) can be calculated as I with a 4-bit shift plus the original value of I. This can be expressed using the DIGITAL Fortran 90 ISHFT intrinsic function:


J1 = I*17 
J2 = ISHFT(I,4) + I     ! equivalent expression for I*17 

The optimizer uses machine code that, like the ISHFT intrinsic function, shifts bits to expand multiplication and division by literals.
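The equivalence can be confirmed over a range of values with a sketch such as the following. The program name SHIFT_DEMO and the variable OK are assumptions. The test is limited to nonnegative values of I because ISHFT performs a logical shift, so the right-shift form of division is shown here only for that case.

```fortran
PROGRAM SHIFT_DEMO
  IMPLICIT NONE
  INTEGER :: I
  LOGICAL :: OK

  OK = .TRUE.
  DO I = 0, 1000
     ! 17 = 16 + 1, so I*17 is I shifted left by 4 bits, plus I.
     IF (I*17 /= ISHFT(I,4) + I) OK = .FALSE.
     ! 8 = 2**3, so I/8 is I shifted right by 3 bits
     ! (for the nonnegative values tested here).
     IF (I/8 /= ISHFT(I,-3)) OK = .FALSE.
  END DO

  PRINT *, 'Shift forms agree: ', OK
END PROGRAM SHIFT_DEMO
```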

5.7.2.3 Compile-Time Operations

DIGITAL Fortran 90 does as many operations as possible at compile time rather than having them done at run time.

Constant Operations

DIGITAL Fortran 90 can perform many operations on constants (including PARAMETER constants):

Algebraic Reassociation Optimizations

DIGITAL Fortran 90 delays operations to see whether they have no effect or can be transformed to have no effect. If they have no effect, these operations are removed. A typical example involves unary minus and .NOT. operations:


X = -Y * -Z            ! Becomes: Y * Z 
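A small check of both cases mentioned above, unary minus and .NOT., might look like this. The names are hypothetical. The negations are parenthesized because standard Fortran does not allow two operators to appear consecutively.

```fortran
PROGRAM REASSOC_DEMO
  IMPLICIT NONE
  REAL :: X1, X2, Y, Z
  LOGICAL :: P, Q1, Q2

  Y = 1.5
  Z = -2.25

  X1 = (-Y)*(-Z)         ! two negations and one multiplication
  X2 = Y*Z               ! the negations cancel, one multiplication remains

  P  = .TRUE.
  Q1 = .NOT. (.NOT. P)   ! two .NOT. operations
  Q2 = P                 ! the .NOT. operations cancel

  PRINT *, X1 == X2, Q1 .EQV. Q2
END PROGRAM REASSOC_DEMO
```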

5.7.2.4 Value Propagation

DIGITAL Fortran 90 tracks the values assigned to variables and constants, including those from DATA statements, and traces them to every place they are used. DIGITAL Fortran 90 uses the value itself when it is more efficient to do so.

When compiling subprograms, DIGITAL Fortran 90 analyzes the program to ensure that propagation is safe if the subroutine is called more than once.

Value propagation frequently leads to more value propagation. DIGITAL Fortran 90 can eliminate run-time operations, comparisons and branches, and whole statements.

In the following example, constants are propagated, eliminating multiple operations from run time:
Original code:

PI = 3.14 
   .
   .
   .
PIOVER2 = PI/2 
   .
   .
   .
I = 100 
   .
   .
   .
IF (I.GT.1) GOTO 10 
10 A(I) = 3.0*Q 

Optimized code:

   .
   .
   .
PIOVER2 = 1.57 
   .
   .
   .
I = 100 
   .
   .
   .
10 A(100) = 3.0*Q 

5.7.2.5 Dead Store Elimination

If a variable is assigned but never used, DIGITAL Fortran 90 eliminates the entire assignment statement:


X = Y*Z 
   .
   .
   .            ! If X is not used in between, X=Y*Z is eliminated. 
 
X = A(I,J)*PI 

Programs used for performance analysis often contain such unnecessary operations. When you measure the performance of such programs compiled with DIGITAL Fortran 90, they may show unrealistically good results. Realistic results are possible only when program units use their results in output statements.
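A timing program can keep its computation live by printing the result, as in the following sketch. The program name TIMING_DEMO, the loop body, and the loop count are assumptions chosen for illustration. SYSTEM_CLOCK is the standard Fortran 90 intrinsic subroutine.

```fortran
PROGRAM TIMING_DEMO
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 1000000
  INTEGER :: I, COUNT0, COUNT1, RATE
  REAL :: TOTAL

  CALL SYSTEM_CLOCK(COUNT0, RATE)

  TOTAL = 0.0
  DO I = 1, N
     TOTAL = TOTAL + SQRT(REAL(I))
  END DO

  CALL SYSTEM_CLOCK(COUNT1)

  ! Printing TOTAL uses the value computed in the loop, so the
  ! assignments to TOTAL are not dead stores.
  PRINT *, 'Sum     =', TOTAL
  ! RATE is zero if the system has no clock.
  IF (RATE > 0) PRINT *, 'Seconds =', REAL(COUNT1 - COUNT0)/REAL(RATE)
END PROGRAM TIMING_DEMO
```

Without the first PRINT statement, TOTAL would be assigned but never used, and the measured time might not reflect the work the loop was meant to represent.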

5.7.2.6 Register Usage

A large program usually has more data that would benefit from being held in registers than there are registers to hold the data. In such cases, DIGITAL Fortran 90 typically tries to use the registers according to the following descending priority list:

  1. For temporary operation results, including array indexes
  2. For variables
  3. For addresses of arrays (base address)
  4. All other usages

DIGITAL Fortran 90 uses heuristic algorithms and a modest amount of computation to attempt to determine an effective usage for the registers.

Holding Variables in Registers

Because operations using registers are much faster than using memory, DIGITAL Fortran 90 generates code that uses the Alpha 64-bit integer and floating-point registers instead of memory locations. Knowing when DIGITAL Fortran 90 uses registers may be helpful when doing certain forms of debugging.

DIGITAL Fortran 90 uses registers to hold the values of variables whenever the Fortran language does not require them to be held in memory. For example, the temporary results of subexpressions are held in registers even if -O0 (no optimization) is specified.

DIGITAL Fortran 90 may hold the same variable in different registers at different points in the program:


V = 3.0*Q 
   .
   .
   .
X = SIN(Y)*V 
   .
   .
   .
V = PI*X 
   .
   .
   .
Y = COS(Y)*V 

DIGITAL Fortran 90 may choose one register to hold the first use of V and another register to hold the second. Both registers can be used for other purposes at points in between. There may be times when the value of the variable does not exist anywhere in the registers. If the value of V is never needed in memory, it is never stored there.

DIGITAL Fortran 90 uses registers to hold the values of I, J, and K (so long as there are no other optimization effects, such as loops involving the variables):


A(I) = B(J) + C(K) 

More typically, an expression uses the same index variable:


A(K) = B(K) + C(K) 

In this case, K is loaded into only one register and is used to index all three arrays at the same time.

5.7.2.7 Mixed Real/Complex Operations

In mixed REAL/COMPLEX operations, DIGITAL Fortran 90 avoids the conversion and performs a simplified operation on:

For example, if variable R is REAL and A and B are COMPLEX, no conversion occurs with the following:


COMPLEX A, B 
   .
   .
   .
B = A + R 
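The effect of the simplified operation can be illustrated by comparing the mixed expression with a hand-written form that touches only the real part. This sketch shows the arithmetic result only, not the code the compiler generates, and the names MIXED_DEMO and C are hypothetical.

```fortran
PROGRAM MIXED_DEMO
  IMPLICIT NONE
  COMPLEX :: A, B, C
  REAL :: R

  A = (1.0, 2.0)
  R = 0.5

  ! Mixed REAL/COMPLEX addition.
  B = A + R

  ! The same result written out by hand: R is added to the real part
  ! and the imaginary part of A is carried over unchanged, so R never
  ! has to be converted to a COMPLEX value.
  C = CMPLX(REAL(A) + R, AIMAG(A))

  PRINT *, B == C
END PROGRAM MIXED_DEMO
```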

5.7.3 Global Optimizations

To enable global optimizations, use -O2 or a higher optimization level (-O3, -O4, or -O5). Using -O2 or higher also enables local optimizations (-O1).

Global optimizations include:

Data-flow and split lifetime analysis (global data analysis) traces the values of variables and whole arrays as they are created and used in different parts of a program unit. During this analysis, DIGITAL Fortran 90 assumes that any pair of array references to a given array might access the same memory location, unless a constant subscript is used in both cases.

To eliminate unnecessary recomputations of invariant expressions in loops, DIGITAL Fortran 90 hoists them out of the loops so they execute only once.

Global data analysis includes determining which data items are selected for analysis. Some data items are analyzed as a group and some are analyzed individually. DIGITAL Fortran 90 limits or may disqualify data items that participate in the following constructs, generally because it cannot fully trace their values.

Data items in the following constructs can make global optimizations less effective:

