CHAPTER 10
Dense Matrix Routines
Sun S3L includes support for matrix-matrix and matrix-vector multiplication, inner- and outer-product computation, and 2-norm computation. The routines that support these operations are discussed in the sections that follow.
The dense matrix routines, like most other Sun S3L routines, can be used in either single-instance or multiple-instance contexts. For example, the matrix-matrix multiplication routine can perform a single multiplication of two matrices or, in one call, multiple independent instances of the multiplication.
In the single-instance case, the operation is performed either on a single process, with both arrays local to that process, or on multiple processes, with the two arrays block-distributed across the processes.
In the multiple-instance case, each instance of the multiplication operation is performed on a different process, with each process holding its own pair of instances of the two arrays.
All of the dense matrix routines operate on at least one Sun S3L array, which would ordinarily be created by a call to S3L_declare or S3L_declare_detailed. See Creating and Destroying Array Handles for Dense Sun S3L Arrays for information on how to create and deallocate dense Sun S3L arrays.
The balance of this chapter discusses the various Sun S3L dense matrix routines more closely.
Sun S3L provides 18 versions of matrix multiplication routines. These are listed in TABLE 10-1.
In each routine, two Sun S3L arrays, represented by A and B, are multiplied. A third Sun S3L array, represented by C, will hold the results of the operation. Other aspects of the operation vary from routine to routine as follows:
The argument syntax for the matrix-matrix multiply routines is summarized below:
S3L_mat_mult(A, B, C, row_axis, col_axis, ier)
S3L_mat_mult_noadd(A, B, C, row_axis, col_axis, ier)
S3L_mat_mult_addto(A, B, C, D, row_axis, col_axis, ier)
A, B, C, and D are Sun S3L array handles returned by earlier calls to S3L_declare or S3L_declare_detailed.
A and B represent the multiplication operand matrices. C represents the matrix that stores the result of the operation. A, B, and C must all have the same rank.
D is used only in the _addto class of routines, when its contents are added to the product of A and B. D must have the same shape as C.
Note - The argument D can be identical to C in all matrix multiply _addto routines except the _t1_t2_addto routine (in which both A and B are transposed).
The contents of A and B are not changed in any of the matrix multiply routines. If D is distinct from C, its contents are not changed either. If D and C are the same variable, its contents are overwritten by the result of the matrix multiply operation.
row_axis is a scalar integer that specifies which axis of A, B, C, and D counts the rows of the embedded matrix or matrices. It must be nonnegative and less than the rank of C.
col_axis is a scalar integer that specifies which axis of A, B, C, and D counts the columns of the embedded matrix or matrices. It must be nonnegative and less than the rank of C.
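The role of row_axis and col_axis in a multiple-instance call can be pictured with a small indexing sketch. The helper below is hypothetical and not part of Sun S3L. It assumes a rank-3 array held in C (row-major) order in local memory, which says nothing about how Sun S3L actually distributes or stores its arrays, and it assumes zero-based axis numbers, consistent with the "nonnegative and less than the rank" rule. The axis that is neither row_axis nor col_axis indexes the matrix instances.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Sun S3L routine): offset of element (i, j)
 * of embedded matrix instance k within a rank-3 array of extents dims[],
 * assumed here to be stored in row-major order. */
size_t embedded_offset(const size_t dims[3], int row_axis, int col_axis,
                       size_t k, size_t i, size_t j)
{
    assert(row_axis >= 0 && row_axis < 3);
    assert(col_axis >= 0 && col_axis < 3);
    assert(row_axis != col_axis);

    /* The three axis numbers sum to 0 + 1 + 2 = 3, so the remaining
     * axis is the instance axis. */
    int inst_axis = 3 - row_axis - col_axis;

    size_t idx[3];
    idx[row_axis]  = i;   /* row index within the embedded matrix    */
    idx[col_axis]  = j;   /* column index within the embedded matrix */
    idx[inst_axis] = k;   /* which matrix instance                   */

    assert(idx[0] < dims[0] && idx[1] < dims[1] && idx[2] < dims[2]);
    return (idx[0] * dims[1] + idx[1]) * dims[2] + idx[2];
}
```

With dims = {2, 3, 4}, row_axis = 1 and col_axis = 2, the array holds two 3 x 4 matrices indexed by axis 0.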
For detailed descriptions of the Fortran and C bindings for the matrix-matrix multiply routines, see the S3L_mat_mult(3) man page or the corresponding descriptions in the Sun S3L Software Reference Manual.
For calls that do not transpose either matrix A or B, the variables conform correctly with the axis lengths for row_axis and col_axis shown in TABLE 10-2.
TABLE 10-2
Variable   row_axis length   col_axis length
A          p                 q
B          q                 r
C          p                 r
D          p                 r
For calls that transpose matrix A (AT), the variables conform correctly with the axis lengths for row_axis and col_axis shown in TABLE 10-3.
TABLE 10-3
Variable   row_axis length   col_axis length
A          q                 p
B          q                 r
C          p                 r
D          p                 r
For calls that transpose matrix B (BT), the variables conform correctly with the axis lengths for row_axis and col_axis shown in TABLE 10-4.
TABLE 10-4
Variable   row_axis length   col_axis length
A          p                 q
B          r                 q
C          p                 r
D          p                 r
For calls that transpose both A and B (ATBT), the variables conform correctly with the axis lengths for row_axis and col_axis shown in TABLE 10-5.
TABLE 10-5
Variable   row_axis length   col_axis length
A          q                 p
B          r                 q
C          p                 r
D          p                 r
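The conformance rules in these tables can be checked against a plain reference loop. The following is a minimal single-instance sketch, not the Sun S3L implementation and not one of its algorithms. The function names, the row-major storage, and the trans_a/trans_b flags are assumptions made for illustration. It computes the _addto form, C = D + op(A) op(B), where op(A) is p x q and op(B) is q x r, so the stored A is q x p when transposed and the stored B is r x q when transposed.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of op(M), where M is stored row-major with `cols`
 * columns and op(M) is M itself or its transpose. */
static double op_elem(const double *M, size_t cols, int transposed,
                      size_t i, size_t j)
{
    return transposed ? M[j * cols + i] : M[i * cols + j];
}

/* Hypothetical reference routine: C = D + op(A) * op(B).
 * C and D are p x r; op(A) is p x q; op(B) is q x r. */
void ref_mat_mult_addto(const double *A, const double *B, double *C,
                        const double *D, size_t p, size_t q, size_t r,
                        int trans_a, int trans_b)
{
    assert(A && B && C && D);

    size_t a_cols = trans_a ? p : q;  /* stored A is q x p if transposed */
    size_t b_cols = trans_b ? q : r;  /* stored B is r x q if transposed */

    for (size_t i = 0; i < p; i++) {
        for (size_t j = 0; j < r; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < q; k++)
                sum += op_elem(A, a_cols, trans_a, i, k) *
                       op_elem(B, b_cols, trans_b, k, j);
            C[i * r + j] = D[i * r + j] + sum;
        }
    }
}
```

Calling it with the same p, q, and r but different transpose flags and correspondingly stored operands produces the same C, which is the property the four tables describe.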
A matrix multiply routine will use one of three algorithms, depending on various factors. The three candidate algorithms are:
Examples showing S3L_mat_mult in use can be found in:
/opt/SUNWhpc/examples/s3l/dense_matrix_ops/matmult.c
/opt/SUNWhpc/examples/s3l/dense_matrix_ops-f/matmult.f
Sun S3L provides six matrix-vector multiplication routines, which compute one or more instances of a matrix-vector product. For each instance, these routines perform the operations listed in TABLE 10-6.
Note - In these descriptions, conj[A] denotes the conjugate of A.
TABLE 10-6
Routine                      Operation
S3L_mat_vec_mult             y = y + Ax
S3L_mat_vec_mult_noadd       y = Ax
S3L_mat_vec_mult_addto       y = v + Ax
S3L_mat_vec_mult_c1          y = y + conj[A]x
S3L_mat_vec_mult_c1_noadd    y = conj[A]x
S3L_mat_vec_mult_c1_addto    y = v + conj[A]x
In each matrix-vector routine, a Sun S3L array, represented by A, is multiplied by a vector, represented by x. Another Sun S3L array, represented by y, holds the results of the matrix-vector operation. Other aspects of the operation vary from routine to routine as follows:
The argument syntax for the matrix-vector routines is summarized below:
y, A, x, and v are Sun S3L array handles returned by earlier calls to S3L_declare or S3L_declare_detailed.
A and x represent the matrix and vector multiplication operands, respectively. y represents the array that stores the result of the matrix-vector operation.
v is used only in the _addto class of routines. Its contents are added to the product of A and x.
Note - The argument v can be identical to y in both routines that have _addto in their names.
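The _addto operation and its conjugated variant can be written as a short reference loop for a single instance. This is only a sketch: the function name, the row-major m x n storage of A, and the conj_a flag are assumptions made for illustration, and the real routines take Sun S3L array handles rather than raw pointers.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference routine: y = v + op(A) * x, where A is an
 * m x n matrix stored row-major and op(A) is A itself or, when conj_a
 * is nonzero, the elementwise conjugate conj[A]. */
void ref_mat_vec_mult_addto(double complex *y, const double complex *A,
                            const double complex *x, const double complex *v,
                            size_t m, size_t n, int conj_a)
{
    assert(y && A && x && v);

    for (size_t i = 0; i < m; i++) {
        double complex sum = 0.0;
        for (size_t j = 0; j < n; j++) {
            double complex a = A[i * n + j];
            sum += (conj_a ? conj(a) : a) * x[j];
        }
        /* v[i] is read before y[i] is written, so v may alias y. */
        y[i] = v[i] + sum;
    }
}
```

In this sketch, passing the same pointer for v and y accumulates the product into y, which mirrors the note that v can be identical to y.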
y, A, x, and v must have the following rank and size relationships:
y_vector_axis is a scalar integer that specifies the axis of y and v along which the elements of the embedded vectors lie.
row_axis is a scalar integer that specifies which axis of y, A, x, and v counts the rows of the embedded matrix or matrices. It must be nonnegative and less than the rank of A.
col_axis is a scalar integer that specifies which axis of y, A, x, and v counts the columns of the embedded matrix or matrices. It must be nonnegative and less than the rank of A.
x_vector_axis is a scalar integer that specifies the axis of x along which the elements of the embedded vectors lie.
If the call is made from a Fortran program, error status will be in ier.
For detailed descriptions of the Fortran and C bindings for the matrix-vector multiply routines, see the S3L_mat_vec_mult(3) man page or the corresponding descriptions in the Sun S3L Software Reference Manual.
Examples showing S3L_mat_vec_mult in use can be found in:
/opt/SUNWhpc/examples/s3l/dense_matrix_ops/mat_vec_mult.c
/opt/SUNWhpc/examples/s3l/dense_matrix_ops-f/matvec_mult.f
The multiple-instance 2-norm routine, S3L_2_norm, computes one or more instances of the 2-norm of a vector. The single-instance 2-norm routine, S3L_gbl_2_norm, computes the global 2-norm of a parallel array.
For each element z of the result array z, the multiple-instance 2-norm routine performs one of the operations shown in TABLE 10-7.
Upon successful completion, S3L_2_norm overwrites each element of z with the 2-norm of the corresponding vector in x.
The single-instance 2-norm routine performs the operations shown in TABLE 10-8.
Upon successful completion, S3L_gbl_2_norm overwrites a with the global 2-norm of x.
The argument syntax for the single- and multiple-instance 2-norm routines is summarized below:
S3L_gbl_2_norm(a, x, ier)
S3L_2_norm(z, x, x_vector_axis, ier)
x and z are Sun S3L array handles returned by earlier calls to S3L_declare or S3L_declare_detailed.
x represents a parallel array of rank 2 or greater that has at least one nonlocal instance axis. It contains one or more instances of the vector x whose 2-norm will be computed.
z represents a parallel array that will contain the results of the multiple-instance 2-norm operation. Its rank must be one less than that of x.
a is a pointer to a scalar variable, which is the destination for the results of the single-instance 2-norm operation.
x_vector_axis is a scalar integer that specifies the axis of x along which the vectors lie.
If the call is made from a Fortran program, error status will be in ier.
For detailed descriptions of the Fortran and C bindings for the 2-norm routines, see the S3L_2_norm(3) man page or the corresponding descriptions in the Sun S3L Software Reference Manual.
Examples showing S3L_2_norm in use can be found in:
/opt/SUNWhpc/examples/s3l/dense_matrix_ops/norm2.c
/opt/SUNWhpc/examples/s3l/dense_matrix_ops-f/norm2.f
Sun S3L provides six multiple-instance inner-product routines, all of which compute one or more instances of the inner product of two vectors embedded in two parallel arrays. It also provides six single-instance inner product routines, all of which compute the inner product over all the axes of two parallel arrays.
The two sets of inner-product routines are discussed separately below.
The operations performed by the inner product routines are listed in TABLE 10-9.
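TABLE 10-9
Routine                      Operation
S3L_inner_prod               z = z + xTy
S3L_inner_prod_noadd         z = xTy
S3L_inner_prod_addto         z = u + xTy
S3L_inner_prod_c1            z = z + xHy
S3L_inner_prod_c1_noadd      z = xHy
S3L_inner_prod_c1_addto      z = u + xHy
Note - In these descriptions, xT and xH denote x transpose and x Hermitian, respectively.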
For each multiple-instance inner-product routine, array x contains one or more instances of the first vector in each inner-product pair, x. Likewise, array y contains one or more instances of the second vector in each pair, y.
In each multiple-instance inner-product routine, the inner products are computed for vectors embedded in two Sun S3L arrays, represented by x and y. Another Sun S3L array, represented by z, holds the results of the inner-product operation. Other aspects of the operation vary from routine to routine as follows:
The argument syntax for the multiple-instance inner-product routines is summarized below:
S3L_inner_prod(z, x, y, x_vector_axis, y_vector_axis, ier)
S3L_inner_prod_noadd(z, x, y, x_vector_axis, y_vector_axis, ier)
S3L_inner_prod_addto(z, x, y, u, x_vector_axis, y_vector_axis, ier)
z, x, y, and u are Sun S3L array handles returned by earlier calls to S3L_declare or S3L_declare_detailed.
x and y represent the Sun S3L arrays that contain the vector pairs from which the inner products will be computed. z represents the array that stores the results of the multiple-instance inner-product operations.
For some multiple-instance inner-product operations, the inner-product results are added to the contents of z. In other operations, the inner-product results simply replace the contents of z.
u is used only in the _addto class of routines. Its contents are added to the inner-product results computed from x and y.
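A reference loop for the _addto form makes the per-instance arithmetic concrete. This is a sketch under stated assumptions, not the library's code: the function name is hypothetical, x and y are taken to be n_inst x len arrays stored row-major with the vectors along the second axis, and the conj_x flag assumes, by analogy with the _c1 matrix-vector routines, that the _c1 variants conjugate the first operand.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference routine: for each instance k,
 *   z[k] = u[k] + sum over e of op(x[k][e]) * y[k][e]
 * where op() is the identity or, when conj_x is nonzero, conjugation. */
void ref_inner_prod_addto(double complex *z, const double complex *x,
                          const double complex *y, const double complex *u,
                          size_t n_inst, size_t len, int conj_x)
{
    assert(z && x && y && u);

    for (size_t k = 0; k < n_inst; k++) {
        double complex sum = 0.0;
        for (size_t e = 0; e < len; e++) {
            double complex xe = x[k * len + e];
            sum += (conj_x ? conj(xe) : xe) * y[k * len + e];
        }
        z[k] = u[k] + sum;
    }
}
```

Each row of x is paired with the matching row of y, and z ends up with one scalar per pair.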
z, x, y, and u must have the following rank and size relationships:
x_vector_axis is a scalar integer that specifies the axis of x along which the elements of the embedded vectors lie.
y_vector_axis is a scalar integer that specifies the axis of y along which the elements of the embedded vectors lie.
If the call is made from a Fortran program, error status will be in ier.
For detailed descriptions of the Fortran and C bindings for the multiple-instance inner-product routines, see the S3L_inner_prod(3) man page or the corresponding descriptions in the Sun S3L Software Reference Manual.
Examples showing S3L_inner_prod in use can be found in:
/opt/SUNWhpc/examples/s3l/dense_matrix_ops/inner_prod.c
/opt/SUNWhpc/examples/s3l/dense_matrix_ops-f/inner_prod.f
The operations performed by the single-instance inner-product routines are listed in TABLE 10-10.
The argument syntax for the single-instance inner-product routines is summarized below:
S3L_gbl_inner_prod(a, x, y, ier)
S3L_gbl_inner_prod_noadd(a, x, y, ier)
S3L_gbl_inner_prod_addto(a, x, y, b, ier)
x and y are Sun S3L array handles returned by earlier calls to S3L_declare or S3L_declare_detailed. They represent the Sun S3L arrays from which the inner product will be computed.
a is a pointer to a scalar variable that is the destination for the results of the single-instance inner-product operations. For S3L_gbl_inner_prod and S3L_gbl_inner_prod_c1, a is also a source of values to be added to the inner products of x and y.
b is also a pointer to a scalar variable. It is used only in the _addto class of routines. Its contents are added to the inner-product results computed from x and y.
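The different roles of a and b can be seen in a real-valued sketch that treats both arrays as flat sequences of count elements. The function names are hypothetical and the code only illustrates the arithmetic described above. It does not reproduce the distributed reduction that the library performs.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of elementwise products over every element of x and y. */
static double dot_all(const double *x, const double *y, size_t count)
{
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hypothetical accumulate form: a is both a source and the destination. */
void ref_gbl_inner_prod(double *a, const double *x, const double *y,
                        size_t count)
{
    assert(a && x && y);
    *a += dot_all(x, y, count);
}

/* Hypothetical _addto form: b is added to the product, a is overwritten. */
void ref_gbl_inner_prod_addto(double *a, const double *x, const double *y,
                              const double *b, size_t count)
{
    assert(a && x && y && b);
    *a = *b + dot_all(x, y, count);
}
```

With x = (1, 2, 3, 4) and y = (5, 6, 7, 8) the inner product is 70, so the accumulate form turns a = 10 into 80, and the _addto form with b = 1 stores 71 in a.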
For detailed descriptions of the Fortran and C bindings for the single-instance inner-product routines, see the S3L_inner_prod(3) man page or the corresponding descriptions in the Sun S3L Software Reference Manual.
Examples showing S3L_inner_prod in use can be found in:
/opt/SUNWhpc/examples/s3l/dense_matrix_ops/inner_prod.c
/opt/SUNWhpc/examples/s3l/dense_matrix_ops-f/inner_prod.f
Sun S3L provides six outer-product routines that compute one or more instances of an outer product of two vectors. For each instance, the outer-product routines perform the operations listed in TABLE 10-11.
Note - In these descriptions, yT and yH denote y transpose and y Hermitian, respectively.
TABLE 10-11
Routine                      Operation
S3L_outer_prod               A = A + xyT
S3L_outer_prod_noadd         A = xyT
S3L_outer_prod_addto         A = B + xyT
S3L_outer_prod_c2            A = A + xyH
S3L_outer_prod_c2_noadd      A = xyH
S3L_outer_prod_c2_addto      A = B + xyH
In elementwise notation, for each instance S3L_outer_prod computes:
A(i,j) = A(i,j) + x(i) * y(j)
and S3L_outer_prod_c2 computes:
A(i,j) = A(i,j) + x(i) * conj[y(j)]
where conj[y(j)] denotes the conjugate of y(j).
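The elementwise formulas translate directly into a double loop. The sketch below covers one instance. Its function name, the row-major m x n storage of A, and the conj_y flag are assumptions made for illustration rather than part of the Sun S3L interface.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference routine:
 *   A(i,j) = A(i,j) + x(i) * y(j)          when conj_y is zero
 *   A(i,j) = A(i,j) + x(i) * conj[y(j)]    when conj_y is nonzero
 * A is m x n, x has m elements, y has n elements. */
void ref_outer_prod(double complex *A, const double complex *x,
                    const double complex *y, size_t m, size_t n, int conj_y)
{
    assert(A && x && y);

    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double complex yj = conj_y ? conj(y[j]) : y[j];
            A[i * n + j] += x[i] * yj;
        }
    }
}
```

The two forms differ only when y has a nonzero imaginary part. For real data they give the same result.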
The argument syntax for the outer-product routines is summarized below:
A, x, y, and B are Sun S3L array handles returned by earlier calls to S3L_declare or S3L_declare_detailed.
x and y represent the Sun S3L arrays that contain the vector pairs from which the outer products will be computed. A represents the array that stores the results of the outer-product operations.
x contains one or more instances of the first source vector, x, embedded along the axis specified by x_vector_axis (see below).
y contains one or more instances of the second source vector, y, embedded along the axis specified by y_vector_axis (see below).
B is used only in the _addto class of routines. Its contents are added to the outer products computed from x and y.
A, x, y, and B must conform to the following rank and size relationships:
row_axis is a scalar integer that specifies which axis of A and B counts the rows of the embedded matrix or matrices. It must be nonnegative and less than the rank of A.
col_axis is a scalar integer that specifies which axis of A and B counts the columns of the embedded matrix or matrices. It must be nonnegative and less than the rank of A.
x_vector_axis is a scalar integer that specifies the axis of x along which the elements of the embedded vectors lie.
y_vector_axis is a scalar integer that specifies the axis of y along which the elements of the embedded vectors lie.
If the call is made from a Fortran program, error status will be in ier.
For detailed descriptions of the Fortran and C bindings for the outer-product routines, see the S3L_outer_prod(3) man page or the corresponding descriptions in the Sun S3L Software Reference Manual.
Copyright © 2003, Sun Microsystems, Inc. All rights reserved.