rocSOLVER API¶
This section provides details of the rocSOLVER library API.
Types¶
rocSOLVER uses types and enumerations defined by the rocBLAS API. For more information, see the rocBLAS types.
Additional Types¶
These are types that extend the rocBLAS API.
rocblas_direct¶
rocblas_storev¶
rocblas_svect¶

enum
rocblas_svect
¶ Used to specify how the singular vectors are to be computed and stored.
Values:

enumerator
rocblas_svect_all
¶ The entire associated orthogonal/unitary matrix is computed.

enumerator
rocblas_svect_singular
¶ Only the singular vectors are computed and stored in output array.

enumerator
rocblas_svect_overwrite
¶ Only the singular vectors are computed and overwrite the input matrix.

enumerator
rocblas_svect_none
¶ No singular vectors are computed.

enumerator
rocblas_evect¶

enum
rocblas_evect
¶ Used to specify how the eigenvectors are to be computed.
Values:

enumerator
rocblas_evect_original
¶ Compute eigenvectors for the original symmetric/Hermitian matrix.

enumerator
rocblas_evect_tridiagonal
¶ Compute eigenvectors for the symmetric tridiagonal matrix.

enumerator
rocblas_evect_none
¶ No eigenvectors are computed.

enumerator
rocblas_workmode¶

enum
rocblas_workmode
¶ Used to enable the use of fast algorithms (with outofplace computations) in some of the routines.
Values:

enumerator
rocblas_outofplace
¶ Outofplace computations are allowed; this requires enough free memory.

enumerator
rocblas_inplace
¶ When not enough memory, this forces inplace computations

enumerator
LAPACK Auxiliary Functions¶
These are functions that support more advanced LAPACK routines.
Complex vector manipulations¶
rocsolver_<type>lacgv()¶

rocblas_status
rocsolver_zlacgv
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx)¶

rocblas_status
rocsolver_clacgv
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx)¶ LACGV conjugates the complex vector x.
It conjugates the n entries of a complex vector x with increment incx.
 Parameters
[in] handle
: rocblas_handle[in] n
: rocblas_int. n >= 0.The number of entries of the vector x.
[inout] x
: pointer to type. Array on the GPU of size at least n.On input it is the vector x, on output it is overwritten with vector conjg(x).
[in] incx
: rocblas_int. incx != 0.The increment between consecutive elements of x. If incx is negative, the elements of x are indexed in reverse order.
Matrix permutations and manipulations¶
rocsolver_<type>laswp()¶

rocblas_status
rocsolver_zlaswp
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶

rocblas_status
rocsolver_claswp
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶

rocblas_status
rocsolver_dlaswp
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶

rocblas_status
rocsolver_slaswp
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶ LASWP performs a series of row interchanges on the matrix A.
It interchanges row I with row IPIV[k1 + (I  k1) * abs(inx)], for each of rows K1 through K2 of A. k1 and k2 are 1based indices.
 Parameters
[in] handle
: rocblas_handle[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix of column dimension n to which the row interchanges will be applied. On exit, the permuted matrix.
[in] lda
: rocblas_int. lda > 0.The leading dimension of the array A.
[in] k1
: rocblas_int. k1 > 0.The first element of IPIV for which a row interchange will be done. This is a 1based index.
[in] k2
: rocblas_int. k2 > k1 > 0.(K2K1+1) is the number of elements of IPIV for which a row interchange will be done. This is a 1based index.
[in] ipiv
: pointer to rocblas_int. Array on the GPU of dimension at least k1 + (k2  k1) * abs(incx).The vector of pivot indices. Only the elements in positions k1 through (k1 + (k2  k1) * abs(incx)) of IPIV are accessed. Elements of ipiv are considered 1based.
[in] incx
: rocblas_int. incx != 0.The increment between successive values of IPIV. If IPIV is negative, the pivots are applied in reverse order.
Householder reflexions¶
rocsolver_<type>larfg()¶

rocblas_status
rocsolver_zlarfg
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *alpha, rocblas_double_complex *x, const rocblas_int incx, rocblas_double_complex *tau)¶

rocblas_status
rocsolver_clarfg
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *alpha, rocblas_float_complex *x, const rocblas_int incx, rocblas_float_complex *tau)¶

rocblas_status
rocsolver_dlarfg
(rocblas_handle handle, const rocblas_int n, double *alpha, double *x, const rocblas_int incx, double *tau)¶

rocblas_status
rocsolver_slarfg
(rocblas_handle handle, const rocblas_int n, float *alpha, float *x, const rocblas_int incx, float *tau)¶ LARFG generates an orthogonal Householder reflector H of order n.
Householder reflector H is such that
H * [alpha] = [beta] [ x ] [ 0 ]
where x is an n1 vector and alpha and beta are scalars. Matrix H can be generated as
H = I  tau * [1] * [1 v'] [v]
with v an n1 vector and tau a scalar.
 Parameters
[in] handle
: rocblas_handle[in] n
: rocblas_int. n >= 0.The order (size) of reflector H.
[inout] alpha
: pointer to type. A scalar on the GPU.On input the scalar alpha, on output it is overwritten with beta.
[inout] x
: pointer to type. Array on the GPU of size at least n1.On input it is the vector x, on output it is overwritten with vector v.
[in] incx
: rocblas_int. incx > 0.The increment between consecutive elements of x.
[out] tau
: pointer to type. A scalar on the GPU.The scalar tau.
rocsolver_<type>larft()¶

rocblas_status
rocsolver_zlarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *tau, rocblas_double_complex *T, const rocblas_int ldt)¶

rocblas_status
rocsolver_clarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *tau, rocblas_float_complex *T, const rocblas_int ldt)¶

rocblas_status
rocsolver_dlarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *tau, double *T, const rocblas_int ldt)¶

rocblas_status
rocsolver_slarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *tau, float *T, const rocblas_int ldt)¶ LARFT Generates the triangular factor T of a block reflector H of order n.
The block reflector H is defined as the product of k Householder matrices as
H = H(1) * H(2) * ... * H(k) (forward direction), or H = H(k) * ... * H(2) * H(1) (backward direction)
depending on the value of direct.
The triangular matrix T is upper triangular in forward direction and lower triangular in backward direction. If storev is columnwise, then
H = I  V * T * V'
where the ith column of matrix V contains the Householder vector associated to H(i). If storev is rowwise, then
H = I  V' * T * V
where the ith row of matrix V contains the Householder vector associated to H(i).
 Parameters
[in] handle
: rocblas_handle.[in] direct
: rocblas_direct.Specifies the direction in which the Householder matrices are applied.
[in] storev
: rocblas_storev.Specifies how the Householder vectors are stored in matrix V.
[in] n
: rocblas_int. n >= 0.The order (size) of the block reflector.
[in] k
: rocblas_int. k >= 1.The number of Householder matrices.
[in] V
: pointer to type. Array on the GPU of size ldv*k if columnwise, or ldv*n if rowwise.The matrix of Householder vectors.
[in] ldv
: rocblas_int. ldv >= n if columnwise, or ldv >= k if rowwise.Leading dimension of V.
[in] tau
: pointer to type. Array of k scalars on the GPU.The vector of all the scalars associated to the Householder matrices.
[out] T
: pointer to type. Array on the GPU of dimension ldt*k.The triangular factor. T is upper triangular is forward operation, otherwise it is lower triangular. The rest of the array is not used.
[in] ldt
: rocblas_int. ldt >= k.The leading dimension of T.
rocsolver_<type>larf()¶

rocblas_status
rocsolver_zlarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx, const rocblas_double_complex *alpha, rocblas_double_complex *A, const rocblas_int lda)¶

rocblas_status
rocsolver_clarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx, const rocblas_float_complex *alpha, rocblas_float_complex *A, const rocblas_int lda)¶

rocblas_status
rocsolver_dlarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, double *x, const rocblas_int incx, const double *alpha, double *A, const rocblas_int lda)¶

rocblas_status
rocsolver_slarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, float *x, const rocblas_int incx, const float *alpha, float *A, const rocblas_int lda)¶ LARF applies a Householder reflector H to a general matrix A.
The Householder reflector H, of order m (or n), is to be applied to a mbyn matrix A from the left (or the right). H is given by
H = I  alpha * x * x'
where alpha is a scalar and x a Householder vector. H is never actually computed.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.If side = rocblas_side_left, then compute H*A If side = rocblas_side_right, then compute A*H
[in] m
: rocblas_int. m >= 0.Number of rows of A.
[in] n
: rocblas_int. n >= 0.Number of columns of A.
[in] x
: pointer to type. Array on the GPU of size at least (1 + (m1)*abs(incx)) if left side, or at least (1 + (n1)*abs(incx)) if right side.The Householder vector x.
[in] incx
: rocblas_int. incx != 0.Increment between to consecutive elements of x. If incx < 0, the elements of x are used in reverse order.
[in] alpha
: pointer to type. A scalar on the GPU.If alpha = 0, then H = I (A will remain the same, x is never used)
[inout] A
: pointer to type. Array on the GPU of size lda*n.On input, the matrix A. On output it is overwritten with H*A (or A*H).
[in] lda
: rocblas_int. lda >= m.Leading dimension of A.
rocsolver_<type>larfb()¶

rocblas_status
rocsolver_zlarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *T, const rocblas_int ldt, rocblas_double_complex *A, const rocblas_int lda)¶

rocblas_status
rocsolver_clarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *T, const rocblas_int ldt, rocblas_float_complex *A, const rocblas_int lda)¶

rocblas_status
rocsolver_dlarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *T, const rocblas_int ldt, double *A, const rocblas_int lda)¶

rocblas_status
rocsolver_slarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *T, const rocblas_int ldt, float *A, const rocblas_int lda)¶ LARFB applies a block reflector H to a general mbyn matrix A.
The block reflector H is applied in one of the following forms, depending on the values of side and trans:
H * A (No transpose from the left) H' * A (Transpose or conjugate transpose from the left) A * H (No transpose from the right), and A * H' (Transpose or conjugate transpose from the right)
The block reflector H is defined as the product of k Householder matrices as
H = H(1) * H(2) * ... * H(k) (forward direction), or H = H(k) * ... * H(2) * H(1) (backward direction)
depending on the value of direct. H is never stored. It is calculated as
H = I  V * T * V'
where the ith column of matrix V contains the Householder vector associated with H(i), if storev is columnwise; or
H = I  V' * T * V
where the ith row of matrix V contains the Householder vector associated with H(i), if storev is rowwise. T is the associated triangular factor as computed by LARFT.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply H.
[in] trans
: rocblas_operation.Specifies whether the block reflector or its transpose/conjugate transpose is to be applied.
[in] direct
: rocblas_direct.Specifies the direction in which the Householder matrices were to be applied to generate H.
[in] storev
: rocblas_storev.Specifies how the Householder vectors are stored in matrix V.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix A.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix A.
[in] k
: rocblas_int. k >= 1.The number of Householder matrices.
[in] V
: pointer to type. Array on the GPU of size ldv*k if columnwise, ldv*n if rowwise and applying from the right, or ldv*m if rowwise and applying from the left.The matrix of Householder vectors.
[in] ldv
: rocblas_int. ldv >= k if rowwise, ldv >= m if columnwise and applying from the left, or ldv >= n if columnwise and applying from the right.Leading dimension of V.
[in] T
: pointer to type. Array on the GPU of dimension ldt*k.The triangular factor of the block reflector.
[in] ldt
: rocblas_int. ldt >= k.The leading dimension of T.
[inout] A
: pointer to type. Array on the GPU of size lda*n.On input, the matrix A. On output it is overwritten with H*A, A*H, H’*A, or A*H’.
[in] lda
: rocblas_int. lda >= m.Leading dimension of A.
Bidiagonal forms¶
rocsolver_<type>labrd()¶

rocblas_status
rocsolver_zlabrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup, rocblas_double_complex *X, const rocblas_int ldx, rocblas_double_complex *Y, const rocblas_int ldy)¶

rocblas_status
rocsolver_clabrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup, rocblas_float_complex *X, const rocblas_int ldx, rocblas_float_complex *Y, const rocblas_int ldy)¶

rocblas_status
rocsolver_dlabrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup, double *X, const rocblas_int ldx, double *Y, const rocblas_int ldy)¶

rocblas_status
rocsolver_slabrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup, float *X, const rocblas_int ldx, float *Y, const rocblas_int ldy)¶ LABRD computes the bidiagonal form of the first k rows and columns of a general mbyn matrix A, as well as the matrices X and Y needed to reduce the remaining part of A.
The bidiagonal form is given by:
B = Q' * A * P
where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(k) and P = G(1) * G(2) * ... * G(k1), if m >= n, or Q = H(1) * H(2) * ... * H(k1) and P = G(1) * G(2) * ... * G(k), if m < n
Each Householder matrix H(i) and G(i) is given by
H(i) = I  tauq[i1] * v(i) * v(i)', and G(i) = I  taup[i1] * u(i) * u(i)'
If m >= n, the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1; while the first i elements of the Householder vector u(i) are zero, and u(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1; while the first i1 elements of the Householder vector u(i) are zero, and u(i)[i] = 1.
The unreduced part of the matrix A can be updated using a block update:
A = A  V * Y'  X * U'
where V is an mbyk matrix and U is an nbyk formed using the vectors v and u.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[in] k
: rocblas_int. min(m,n) >= k >= 0.The number of leading rows and columns of the matrix A to be reduced.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the m  i elements of vector v(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.specifies the leading dimension of A.
[out] D
: pointer to real type. Array on the GPU of dimension k.The diagonal elements of B.
[out] E
: pointer to real type. Array on the GPU of dimension k.The offdiagonal elements of B.
[out] tauq
: pointer to type. Array on the GPU of dimension k.The scalar factors of the Householder matrices H(i).
[out] taup
: pointer to type. Array on the GPU of dimension k.The scalar factors of the Householder matrices G(i).
[out] X
: pointer to type. Array on the GPU of dimension ldx*k.The mbyk matrix needed to reduce the unreduced part of A.
[in] ldx
: rocblas_int. ldx >= m.specifies the leading dimension of X.
[out] Y
: pointer to type. Array on the GPU of dimension ldy*k.The nbyk matrix needed to reduce the unreduced part of A.
[in] ldy
: rocblas_int. ldy >= n.specifies the leading dimension of Y.
rocsolver_<type>bdsqr()¶

rocblas_status
rocsolver_zbdsqr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status
rocsolver_cbdsqr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status
rocsolver_dbdsqr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, double *V, const rocblas_int ldv, double *U, const rocblas_int ldu, double *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status
rocsolver_sbdsqr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, float *V, const rocblas_int ldv, float *U, const rocblas_int ldu, float *C, const rocblas_int ldc, rocblas_int *info)¶ BDSQR computes the singular value decomposition (SVD) of a nbyn bidiagonal matrix B.
The SVD of B has the form:
B = Ub * S * Vb'
where S is the nbyn diagonal matrix of singular values of B, the columns of Ub are the left singular vectors of B, and the columns of Vb are its right singular vectors.
The computation of the singular vectors is optional; this function accepts input matrices U (of size nubyn) and V (of size nbynv) that are overwritten with U*Ub and Vb’*V. If nu = 0 no left vectors are computed; if nv = 0 no right vectors are computed.
Optionally, this function can also compute Ub’*C for a given nbync input matrix C.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether B is upper or lower bidiagonal.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of matrix B.
[in] nv
: rocblas_int. nv >= 0.The number of columns of matrix V.
[in] nu
: rocblas_int. nu >= 0.The number of rows of matrix U.
[in] nc
: rocblas_int. nu >= 0.The number of columns of matrix C.
[inout] D
: pointer to real type. Array on the GPU of dimension n.On entry, the diagonal elements of B. On exit, if info = 0, the singular values of B in decreasing order; if info > 0, the diagonal elements of a bidiagonal matrix orthogonally equivalent to B.
[inout] E
: pointer to real type. Array on the GPU of dimension n1.On entry, the offdiagonal elements of B. On exit, if info > 0, the offdiagonal elements of a bidiagonal matrix orthogonally equivalent to B (if info = 0 this matrix converges to zero).
[inout] V
: pointer to type. Array on the GPU of dimension ldv*nv.On entry, the matrix V. On exit, it is overwritten with Vb’*V. (Not referenced if nv = 0).
[in] ldv
: rocblas_int. ldv >= n if nv > 0, or ldv >=1 if nv = 0.Specifies the leading dimension of V.
[inout] U
: pointer to type. Array on the GPU of dimension ldu*n.On entry, the matrix U. On exit, it is overwritten with U*Ub. (Not referenced if nu = 0).
[in] ldu
: rocblas_int. ldu >= nu.Specifies the leading dimension of U.
[inout] C
: pointer to type. Array on the GPU of dimension ldc*nc.On entry, the matrix C. On exit, it is overwritten with Ub’*C. (Not referenced if nc = 0).
[in] ldc
: rocblas_int. ldc >= n if nc > 0, or ldc >=1 if nc = 0.Specifies the leading dimension of C.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, i elements of E have not converged to zero.
Tridiagonal forms¶
rocsolver_<type>latrd()¶

rocblas_status
rocsolver_zlatrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *E, rocblas_double_complex *tau, rocblas_double_complex *W, const rocblas_int ldw)¶

rocblas_status
rocsolver_clatrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *E, rocblas_float_complex *tau, rocblas_float_complex *W, const rocblas_int ldw)¶

rocblas_status
rocsolver_dlatrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *E, double *tau, double *W, const rocblas_int ldw)¶

rocblas_status
rocsolver_slatrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *E, float *tau, float *W, const rocblas_int ldw)¶ LATRD computes the tridiagonal form of k rows and columns of a symmetric/hermitian matrix A, as well as the matrix W needed to update the remaining part of A.
The reduced form is given by:
T = Q' * A * Q
If uplo is lower, the first k rows and columns of T form a tridiagonal block, if uplo is upper, then the last k rows and columns of T form the tridiagonal block. Q is an orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(k) if uplo indicates lower, or Q = H(n1) * H(n2) * ... * H(nk) if uplo is upper.
Each Householder matrix H(i) is given by
H(i) = I  tau[i] * v(i) * v(i)'
where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1. If uplo is upper, the last ni elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
The unreduced part of the matrix A can be updated using a rank update of the form:
A = A  V * W'  W * V'
where V is an nbyk matrix formed by the vectors v(i).
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[in] k
: rocblas_int. 0 <= k <= n.The number of rows and columns of the matrix A to be reduced.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the nbyn matrix to be reduced. On exit, if uplo is lower, the first k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements below the diagonal contain the vectors v(i) stored as columns. If uplo is upper, the last k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements above the diagonal contain the vectors v(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] E
: pointer to real type. Array on the GPU of dimension n1.If upper (lower), the last (first) k elements of E are the offdiagonal elements of the computed tridiagonal block.
[out] tau
: pointer to type. Array on the GPU of dimension n1.If upper (lower), the last (first) k elements of tau are the scalar factors of the Householder matrices H(i).
[out] W
: pointer to type. Array on the GPU of dimension ldw*k.The nbyk matrix needed to update the unreduced part of A.
[in] ldw
: rocblas_int. ldw >= n.specifies the leading dimension of W.
Orthonormal matrices¶
rocsolver_<type>org2r()¶

rocblas_status
rocsolver_dorg2r
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorg2r
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORG2R generates a mbyn Matrix Q with orthonormal columns.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith column has Householder vector v(i), for i = 1,2,…,k as returned in the first k columns of matrix A of GEQRF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
rocsolver_<type>orgqr()¶

rocblas_status
rocsolver_dorgqr
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorgqr
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGQR generates a mbyn Matrix Q with orthonormal columns.
(This is the blocked version of the algorithm).
The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith column has Householder vector v(i), for i = 1,2,…,k as returned in the first k columns of matrix A of GEQRF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
rocsolver_<type>orgl2()¶

rocblas_status
rocsolver_dorgl2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorgl2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGL2 generates a mbyn Matrix Q with orthonormal rows.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. 0 <= m <= n.The number of rows of the matrix Q.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= m.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith row has Householder vector v(i), for i = 1,2,…,k as returned in the first k rows of matrix A of GELQF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
rocsolver_<type>orglq()¶

rocblas_status
rocsolver_dorglq
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorglq
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGLQ generates a mbyn Matrix Q with orthonormal rows.
(This is the blocked version of the algorithm).
The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. 0 <= m <= n.The number of rows of the matrix Q.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= m.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith row has Householder vector v(i), for i = 1,2,…,k as returned in the first k rows of matrix A of GELQF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
rocsolver_<type>org2l()¶

rocblas_status
rocsolver_dorg2l
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorg2l
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORG2L generates a mbyn Matrix Q with orthonormal columns.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m
Q = H(k) * H(k1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the (nk+i)th column has Householder vector v(i), for i = 1,2,…,k as returned in the last k columns of matrix A of GEQLF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
rocsolver_<type>orgql()¶

rocblas_status
rocsolver_dorgql
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorgql
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGQL generates a mbyn Matrix Q with orthonormal columns.
(This is the blocked version of the algorithm).
The matrix Q is defined as the last n column of the product of k Householder reflectors of order m
Q = H(k) * H(k1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the (nk+i)th column has Householder vector v(i), for i = 1,2,…,k as returned in the last k columns of matrix A of GEQLF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
rocsolver_<type>orgbr()¶

rocblas_status
rocsolver_dorgbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorgbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGBR generates a mbyn Matrix Q with orthonormal rows or columns.
If storev is columnwise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
If m < k, Q is defined as the product of Householder reflectors of order m
Q = H(1) * H(2) * ... * H(m1)
On the other hand, if storev is rowwise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k1) * ... * H(1)
If n <= k, Q is defined as the product of Householder reflectors of order n
Q = H(n1) * H(n2) * ... * H(1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by GEBRD in its arguments A and tauq or taup.
 Parameters
[in] handle
: rocblas_handle.[in] storev
: rocblas_storev.Specifies whether to work columnwise or rowwise.
[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q. If rowwise, then min(n,k) <= m <= n.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix Q. If columnwise, then min(m,k) <= n <= m.
[in] k
: rocblas_int. k >= 0.The number of columns (if storev is columwise) or rows (if rowwise) of the original matrix reduced by GEBRD.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith column (or row) has the Householder vector v(i) as returned by GEBRD. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension min(m,k) if columnwise, or min(n,k) if rowwise.The scalar factors of the Householder matrices H(i) as returned by GEBRD.
rocsolver_<type>orgtr()¶

rocblas_status
rocsolver_dorgtr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sorgtr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ ORGTR generates a nbyn orthogonal Matrix Q.
Q is defined as the product of n1 Householder reflectors of order n. If uplo indicates upper, then Q has the form
Q = H(n1) * H(n2) * ... * H(1)
On the other hand, if uplo indicates lower, then Q has the form
Q = H(1) * H(2) * ... * H(n1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by SYTRD in its arguments A and tau.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix Q.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the (i+1)th column (if uplo indicates upper) or ith column (if uplo indicates lower) has the Householder vector v(i) as returned by SYTRD. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension n1.The scalar factors of the Householder matrices H(i) as returned by SYTRD.
rocsolver_<type>orm2r()¶

rocblas_status
rocsolver_dorm2r
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sorm2r
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORM2R applies a matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(1) * H(2) * ... * H(k)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQRF in the first k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, or lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormqr()¶

rocblas_status
rocsolver_dormqr
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sormqr
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMQR applies a matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(1) * H(2) * ... * H(k)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQRF in the first k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, or lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>orml2()¶

rocblas_status
rocsolver_dorml2
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sorml2
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORML2 applies a matrix Q with orthonormal rows to a general mbyn matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k1) * ... * H(1)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.The ith row has the Householder vector v(i) associated with H(i) as returned by GELQF in the first k rows of its argument A.
[in] lda
: rocblas_int. lda >= k.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormlq()¶

rocblas_status
rocsolver_dormlq
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sormlq
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMLQ applies a matrix Q with orthonormal rows to a general mbyn matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k1) * ... * H(1)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.The ith row has the Householder vector v(i) associated with H(i) as returned by GELQF in the first k rows of its argument A.
[in] lda
: rocblas_int. lda >= k.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>orm2l()¶

rocblas_status
rocsolver_dorm2l
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sorm2l
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORM2L applies a matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k1) * ... * H(1)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQLF in the last k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormql()¶

rocblas_status
rocsolver_dormql
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sormql
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMQL applies a matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k1) * ... * H(1)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQLF in the last k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormbr()¶

rocblas_status
rocsolver_dormbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sormbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMBR applies a matrix Q with orthonormal rows or columns to a general mbyn matrix C.
If storev is columnwise, then the matrix Q has orthonormal columns. If storev is rowwise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
The order nq of orthogonal matrix Q is nq = m if applying from the left, or nq = n if applying from the right.
When storev is columnwise, if nq >= k, then Q is defined as the product of k Householder reflectors of order nq
Q = H(1) * H(2) * ... * H(k),
and if nq < k, then Q is defined as the product
Q = H(1) * H(2) * ... * H(nq1).
When storev is rowwise, if nq > k, then Q is defined as the product of k Householder reflectors of order nq
Q = H(1) * H(2) * ... * H(k),
and if n <= k, Q is defined as the product
Q = H(1) * H(2) * ... * H(nq1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by GEBRD in its arguments A and tauq or taup.
 Parameters
[in] handle
: rocblas_handle.[in] storev
: rocblas_storev.Specifies whether to work columnwise or rowwise.
[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0.The number of columns (if storev is columwise) or rows (if rowwise) of the original matrix reduced by GEBRD.
[in] A
: pointer to type. Array on the GPU of size lda*min(nq,k) if columnwise, or lda*nq if rowwise.The ith column (or row) has the Householder vector v(i) associated with H(i) as returned by GEBRD.
[in] lda
: rocblas_int. lda >= nq if columnwise, or lda >= min(nq,k) if rowwise.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least min(nq,k).The scalar factors of the Householder matrices H(i) as returned by GEBRD.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormtr()¶

rocblas_status
rocsolver_dormtr
(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_sormtr
(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMTR applies an orthogonal matrix Q to a general mbyn matrix C.
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
The order nq of orthogonal matrix Q is nq = m if applying from the left, or nq = n if applying from the right.
Q is defined as the product of nq1 Householder reflectors of order nq. If uplo indicates upper, then Q has the form
Q = H(nq1) * H(nq2) * ... * H(1).
On the other hand, if uplo indicates lower, then Q has the form
Q = H(1) * H(2) * ... * H(nq1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by SYTRD in its arguments A and tau.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] uplo
: rocblas_fill.Specifies whether the SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] A
: pointer to type. Array on the GPU of size lda*nq.On entry, the (i+1)th column (if uplo indicates upper) or ith column (if uplo indicates lower) has the Householder vector v(i) as returned by SYTRD.
[in] lda
: rocblas_int. lda >= nq.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least nq1.The scalar factors of the Householder matrices H(i) as returned by SYTRD.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
Unitary matrices¶
rocsolver_<type>ung2r()¶

rocblas_status
rocsolver_zung2r
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cung2r
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNG2R generates a mbyn complex Matrix Q with orthonormal columns.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith column has Householder vector v(i), for i = 1,2,…,k as returned in the first k columns of matrix A of GEQRF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
rocsolver_<type>ungqr()¶

rocblas_status
rocsolver_zungqr
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cungqr
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNGQR generates a mbyn complex Matrix Q with orthonormal columns.
(This is the blocked version of the algorithm).
The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith column has Householder vector v(i), for i = 1,2,…,k as returned in the first k columns of matrix A of GEQRF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
rocsolver_<type>ungl2()¶

rocblas_status
rocsolver_zungl2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cungl2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNGL2 generates a mbyn complex Matrix Q with orthonormal rows.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k)**H * H(k1)**H * ... * H(1)**H
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. 0 <= m <= n.The number of rows of the matrix Q.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= m.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith row has Householder vector v(i), for i = 1,2,…,k as returned in the first k rows of matrix A of GELQF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
rocsolver_<type>unglq()¶

rocblas_status
rocsolver_zunglq
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cunglq
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNGLQ generates a mbyn complex Matrix Q with orthonormal rows.
(This is the blocked version of the algorithm).
The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k)**H * H(k1)**H * ... * H(1)**H
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. 0 <= m <= n.The number of rows of the matrix Q.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= m.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith row has Householder vector v(i), for i = 1,2,…,k as returned in the first k rows of matrix A of GELQF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
rocsolver_<type>ung2l()¶

rocblas_status
rocsolver_zung2l
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cung2l
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNG2L generates a mbyn complex Matrix Q with orthonormal columns.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m
Q = H(k) * H(k1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the (nk+i)th column has Householder vector v(i), for i = 1,2,…,k as returned in the last k columns of matrix A of GEQLF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
rocsolver_<type>ungql()¶

rocblas_status
rocsolver_zungql
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cungql
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNGQL generates a mbyn complex Matrix Q with orthonormal columns.
(This is the blocked version of the algorithm).
The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m
Q = H(k) * H(k1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of columns of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the (nk+i)th column has Householder vector v(i), for i = 1,2,…,k as returned in the last k columns of matrix A of GEQLF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
rocsolver_<type>ungbr()¶

rocblas_status
rocsolver_zungbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cungbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNGBR generates a mbyn complex Matrix Q with orthonormal rows or columns.
If storev is columnwise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
If m < k, Q is defined as the product of Householder reflectors of order m
Q = H(1) * H(2) * ... * H(m1)
On the other hand, if storev is rowwise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k1) * ... * H(1)
If n <= k, Q is defined as the product of Householder reflectors of order n
Q = H(n1) * H(n2) * ... * H(1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by GEBRD in its arguments A and tauq or taup.
 Parameters
[in] handle
: rocblas_handle.[in] storev
: rocblas_storev.Specifies whether to work columnwise or rowwise.
[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q. If rowwise, then min(n,k) <= m <= n.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix Q. If columnwise, then min(m,k) <= n <= m.
[in] k
: rocblas_int. k >= 0.The number of columns (if storev is columwise) or rows (if rowwise) of the original matrix reduced by GEBRD.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the ith column (or row) has the Householder vector v(i) as returned by GEBRD. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension min(m,k) if columnwise, or min(n,k) if rowwise.The scalar factors of the Householder matrices H(i) as returned by GEBRD.
rocsolver_<type>ungtr()¶

rocblas_status
rocsolver_zungtr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cungtr
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶ UNGTR generates a nbyn unitary Matrix Q.
Q is defined as the product of n1 Householder reflectors of order n. If uplo indicates upper, then Q has the form
Q = H(n1) * H(n2) * ... * H(1)
On the other hand, if uplo indicates lower, then Q has the form
Q = H(1) * H(2) * ... * H(n1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by HETRD in its arguments A and tau.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix Q.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the (i+1)th column (if uplo indicates upper) or ith column (if uplo indicates lower) has the Householder vector v(i) as returned by HETRD. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension n1.The scalar factors of the Householder matrices H(i) as returned by HETRD.
rocsolver_<type>unm2r()¶

rocblas_status
rocsolver_zunm2r
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunm2r
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNM2R applies a complex matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
Q is a unitary matrix defined as the product of k Householder reflectors as
Q = H(1) * H(2) * ... * H(k)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQRF in the first k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, or lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unmqr()¶

rocblas_status
rocsolver_zunmqr
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunmqr
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNMQR applies a complex matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
Q is a unitary matrix defined as the product of k Householder reflectors as
Q = H(1) * H(2) * ... * H(k)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQRF in the first k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, or lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unml2()¶

rocblas_status
rocsolver_zunml2
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunml2
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNML2 applies a complex matrix Q with orthonormal rows to a general mbyn matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
Q is a unitary matrix defined as the product of k Householder reflectors as
Q = H(k)**H * H(k1)**H * ... * H(1)**H
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.The ith row has the Householder vector v(i) associated with H(i) as returned by GELQF in the first k rows of its argument A.
[in] lda
: rocblas_int. lda >= k.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unmlq()¶

rocblas_status
rocsolver_zunmlq
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunmlq
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNMLQ applies a complex matrix Q with orthonormal rows to a general mbyn matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
Q is a unitary matrix defined as the product of k Householder reflectors as
Q = H(k)**H * H(k1)**H * ... * H(1)**H
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.The ith row has the Householder vector v(i) associated with H(i) as returned by GELQF in the first k rows of its argument A.
[in] lda
: rocblas_int. lda >= k.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unm2l()¶

rocblas_status
rocsolver_zunm2l
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunm2l
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNM2L applies a complex matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
Q is a unitary matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k1) * ... * H(1)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQLF in the last k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unmql()¶

rocblas_status
rocsolver_zunmql
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunmql
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNMQL applies a complex matrix Q with orthonormal columns to a general mbyn matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
Q is a unitary matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k1) * ... * H(1)
of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The ith column has the Householder vector v(i) associated with H(i) as returned by GEQLF in the last k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQLF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unmbr()¶

rocblas_status
rocsolver_zunmbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunmbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNMBR applies a complex matrix Q with orthonormal rows or columns to a general mbyn matrix C.
If storev is columnwise, then the matrix Q has orthonormal columns. If storev is rowwise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
The order nq of unitary matrix Q is nq = m if applying from the left, or nq = n if applying from the right.
When storev is columnwise, if nq >= k, then Q is defined as the product of k Householder reflectors of order nq
Q = H(1) * H(2) * ... * H(k),
and if nq < k, then Q is defined as the product
Q = H(1) * H(2) * ... * H(nq1).
When storev is rowwise, if nq > k, then Q is defined as the product of k Householder reflectors of order nq
Q = H(1) * H(2) * ... * H(k),
and if n <= k, Q is defined as the product
Q = H(1) * H(2) * ... * H(nq1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by GEBRD in its arguments A and tauq or taup.
 Parameters
[in] handle
: rocblas_handle.[in] storev
: rocblas_storev.Specifies whether to work columnwise or rowwise.
[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocblas_int. k >= 0.The number of columns (if storev is columwise) or rows (if rowwise) of the original matrix reduced by GEBRD.
[in] A
: pointer to type. Array on the GPU of size lda*min(nq,k) if columnwise, or lda*nq if rowwise.The ith column (or row) has the Householder vector v(i) associated with H(i) as returned by GEBRD.
[in] lda
: rocblas_int. lda >= nq if columnwise, or lda >= min(nq,k) if rowwise.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least min(nq,k).The scalar factors of the Householder matrices H(i) as returned by GEBRD.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>unmtr()¶

rocblas_status
rocsolver_zunmtr
(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)¶

rocblas_status
rocsolver_cunmtr
(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)¶ UNMTR applies a unitary matrix Q to a general mbyn matrix C.
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Conjugate transpose from the left) C * Q (No transpose from the right), and C * Q' (Conjugate transpose from the right)
The order nq of unitary matrix Q is nq = m if applying from the left, or nq = n if applying from the right.
Q is defined as the product of nq1 Householder reflectors of order nq. If uplo indicates upper, then Q has the form
Q = H(nq1) * H(nq2) * ... * H(1).
On the other hand, if uplo indicates lower, then Q has the form
Q = H(1) * H(2) * ... * H(nq1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by HETRD in its arguments A and tau.
 Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] uplo
: rocblas_fill.Specifies whether the SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its conjugate transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] A
: pointer to type. Array on the GPU of size lda*nq.On entry, the (i+1)th column (if uplo indicates upper) or ith column (if uplo indicates lower) has the Householder vector v(i) as returned by HETRD.
[in] lda
: rocblas_int. lda >= nq.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least nq1.The scalar factors of the Householder matrices H(i) as returned by HETRD.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] ldc
: rocblas_int. ldc >= m.Leading dimension of C.
Eigensolvers¶
rocsolver_<type>sterf()¶

rocblas_status
rocsolver_dsterf
(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *info)¶

rocblas_status
rocsolver_ssterf
(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *info)¶ STERF computes the eigenvalues of a symmetric tridiagonal matrix.
The eigenvalues of the symmetric tridiagonal matrix are computed by the PalWalkerKahan variant of the QL/QR algorithm, and returned in increasing order.
The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric offdiagonal elements E as returned by, e.g., SYTRD or HETRD.
 Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of the tridiagonal matrix.
[inout] D
: pointer to real type. Array on the GPU of dimension n.On entry, the diagonal elements of the matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).
[inout] E
: pointer to real type. Array on the GPU of dimension n1.On entry, the offdiagonal elements of the matrix. On exit, if info = 0, this array converges to zero. If info > 0, the offdiagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, STERF did not converge. i elements of E did not converge to zero.
rocsolver_<type>steqr()¶

rocblas_status
rocsolver_zsteqr
(rocblas_handle handle, const rocblas_evect compC, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status
rocsolver_csteqr
(rocblas_handle handle, const rocblas_evect compC, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status
rocsolver_dsteqr
(rocblas_handle handle, const rocblas_evect compC, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status
rocsolver_ssteqr
(rocblas_handle handle, const rocblas_evect compC, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)¶ STEQR computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.
The eigenvalues of the symmetric tridiagonal matrix are computed by the implicit QL/QR algorithm, and returned in increasing order.
The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric offdiagonal elements E as returned by, e.g., SYTRD or HETRD. If the tridiagonal matrix is the reduced form of a full symmetric/Hermitian matrix as returned by, e.g., SYTRD or HETRD, then the eigenvectors of the original matrix can also be computed, depending on the value of compC.
 Parameters
[in] handle
: rocblas_handle.[in] compC
: rocblas_evect.Specifies how the eigenvectors are computed.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the tridiagonal matrix.
[inout] D
: pointer to real type. Array on the GPU of dimension n.On entry, the diagonal elements of the matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).
[inout] E
: pointer to real type. Array on the GPU of dimension n1.On entry, the offdiagonal elements of the matrix. On exit, if info = 0, this array converges to zero. If info > 0, the offdiagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).
[inout] C
: pointer to type. Array on the GPU of dimension ldc*n.On entry, if compC is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g., ORGTR or UNGTR. On exit, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if compC is original), or the eigenvectors of the tridiagonal matrix (if compC is tridiagonal). (Not referenced if compC is none).
[in] ldc
: rocblas_int. ldc >= n if compc is original or tridiagonal.Specifies the leading dimension of C. (Not referenced if compC is none).
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, STEQR did not converge. i elements of E did not converge to zero.
LAPACK Functions¶
LAPACK routines solve complex Numerical Linear Algebra problems.
Special Matrix Factorizations¶
rocsolver_<type>potf2()¶

rocblas_status
rocsolver_zpotf2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_cpotf2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_dpotf2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_spotf2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ POTF2 computes the Cholesky factorization of a real symmetric/complex Hermitian positive definite matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form:
A = U' * U, or A = L * L'
depending on the value of uplo. U is an upper triangular matrix and L is lower triangular.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The matrix dimensions.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
rocsolver_<type>potf2_batched()¶

rocblas_status
rocsolver_zpotf2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cpotf2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dpotf2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_spotf2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ POTF2_BATCHED computes the Cholesky factorization of a batch of real symmetric/complex Hermitian positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The ith factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>potf2_strided_batched()¶

rocblas_status
rocsolver_zpotf2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cpotf2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dpotf2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_spotf2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ POTF2_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric/complex Hermitian positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The ith factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>potrf()¶

rocblas_status
rocsolver_zpotrf
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_cpotrf
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_dpotrf
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_spotrf
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ POTRF computes the Cholesky factorization of a real symmetric/complex Hermitian positive definite matrix A.
(This is the blocked version of the algorithm).
The factorization has the form:
A = U' * U, or A = L * L'
depending on the value of uplo. U is an upper triangular matrix and L is lower triangular.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The matrix dimensions.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
rocsolver_<type>potrf_batched()¶

rocblas_status
rocsolver_zpotrf_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cpotrf_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dpotrf_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_spotrf_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ POTRF_BATCHED computes the Cholesky factorization of a batch of real symmetric/complex Hermitian positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The ith factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>potrf_strided_batched()¶

rocblas_status
rocsolver_zpotrf_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cpotrf_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dpotrf_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_spotrf_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ POTRF_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric/complex Hermitian positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The ith factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
General Matrix Factorizations¶
rocsolver_<type>getf2()¶

rocblas_status
rocsolver_zgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_cgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_dgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_sgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶ GETF2 computes the LU factorization of a general mbyn matrix A using partial pivoting with row interchanges.
(This is the unblocked Level2BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization has the form
A = P * L * U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to rocblas_int. Array on the GPU of dimension min(m,n).The vector of pivot indices. Elements of ipiv are 1based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U(i,i) is the first zero pivot.
rocsolver_<type>getf2_batched()¶

rocblas_status
rocsolver_zgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_BATCHED computes the LU factorization of a batch of general mbyn matrices using partial pivoting with row interchanges.
(This is the unblocked Level2BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivot indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i[j]. Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getf2_strided_batched()¶

rocblas_status
rocsolver_zgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_STRIDED_BATCHED computes the LU factorization of a batch of general mbyn matrices using partial pivoting with row interchanges.
(This is the unblocked Level2BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivots indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i[j]. Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf()¶

rocblas_status
rocsolver_zgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_cgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_dgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_sgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶ GETRF computes the LU factorization of a general mbyn matrix A using partial pivoting with row interchanges.
(This is the blocked Level3BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization has the form
A = P * L * U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to rocblas_int. Array on the GPU of dimension min(m,n).The vector of pivot indices. Elements of ipiv are 1based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U(i,i) is the first zero pivot.
rocsolver_<type>getrf_batched()¶

rocblas_status
rocsolver_zgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_BATCHED computes the LU factorization of a batch of general mbyn matrices using partial pivoting with row interchanges.
(This is the blocked Level3BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivot indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i(j). Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf_strided_batched()¶

rocblas_status
rocsolver_zgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_STRIDED_BATCHED computes the LU factorization of a batch of general mbyn matrices using partial pivoting with row interchanges.
(This is the blocked Level3BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivots indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i(j). Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqr2()¶

rocblas_status
rocsolver_zgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶

rocblas_status
rocsolver_dgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GEQR2 computes a QR factorization of a general mbyn matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
A = Q * [ R ] [ 0 ]
where R is upper triangular (upper trapezoidal if m < n), and Q is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(k), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I  ipiv[i1] * v(i) * v(i)'
where the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the m  i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
rocsolver_<type>geqr2_batched()¶

rocblas_status
rocsolver_zgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQR2_BATCHED computes the QR factorization of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the first i1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m  i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqr2_strided_batched()¶

rocblas_status
rocsolver_zgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQR2_STRIDED_BATCHED computes the QR factorization of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the first i1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m  i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqrf()¶

rocblas_status
rocsolver_zgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶

rocblas_status
rocsolver_dgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GEQRF computes a QR factorization of a general mbyn matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
A = Q * [ R ] [ 0 ]
where R is upper triangular (upper trapezoidal if m < n), and Q is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(k), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I  ipiv[i1] * v(i) * v(i)'
where the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the m  i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
rocsolver_<type>geqrf_batched()¶

rocblas_status
rocsolver_zgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQRF_BATCHED computes the QR factorization of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the first i1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m  i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqrf_strided_batched()¶

rocblas_status
rocsolver_zgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQRF_STRIDED_BATCHED computes the QR factorization of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the first i1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m  i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geql2()¶

rocblas_status
rocsolver_zgeql2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cgeql2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶

rocblas_status
rocsolver_dgeql2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sgeql2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GEQL2 computes a QL factorization of a general mbyn matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
A = Q * [ 0 ] [ L ]
where L is lower triangular (lower trapezoidal if m < n), and Q is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(k) * ... * H(2) * H(1), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I  ipiv[i1] * v(i) * v(i)'
where the last mi elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on and below the (mn)th subdiagonal (when m >= n) or the (nm)th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the i  1 elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
rocsolver_<type>geql2_batched()¶

rocblas_status
rocsolver_zgeql2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeql2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeql2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeql2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQL2_BATCHED computes the QL factorization of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ 0 ] [ L_j ]
where L_j is lower triangular (lower trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * ... * H_j(2) * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the last mi elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the (mn)th subdiagonal (when m >= n) or the (nm)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the i  1 elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geql2_strided_batched()¶

rocblas_status
rocsolver_zgeql2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeql2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeql2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeql2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQL2_STRIDED_BATCHED computes the QL factorization of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ 0 ] [ L_j ]
where L_j is lower triangular (lower trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * ... * H_j(2) * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the last mi elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the (mn)th subdiagonal (when m >= n) or the (nm)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the i  1 elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqlf()¶

rocblas_status
rocsolver_zgeqlf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cgeqlf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶

rocblas_status
rocsolver_dgeqlf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sgeqlf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GEQLF computes a QL factorization of a general mbyn matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
A = Q * [ 0 ] [ L ]
where L is lower triangular (lower trapezoidal if m < n), and Q is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(k) * ... * H(2) * H(1), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I  ipiv[i1] * v(i) * v(i)'
where the last mi elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on and below the (mn)th subdiagonal (when m >= n) or the (nm)th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the i  1 elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
rocsolver_<type>geqlf_batched()¶

rocblas_status
rocsolver_zgeqlf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeqlf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeqlf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeqlf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQLF_BATCHED computes the QL factorization of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ 0 ] [ L_j ]
where L_j is lower triangular (lower trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * ... * H_j(2) * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the last mi elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the (mn)th subdiagonal (when m >= n) or the (nm)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the i  1 elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqlf_strided_batched()¶

rocblas_status
rocsolver_zgeqlf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgeqlf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgeqlf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgeqlf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQLF_STRIDED_BATCHED computes the QL factorization of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ 0 ] [ L_j ]
where L_j is lower triangular (lower trapezoidal if m < n), and Q_j is a mbym orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * ... * H_j(2) * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i) * v_j(i)'
where the last mi elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the (mn)th subdiagonal (when m >= n) or the (nm)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the i  1 elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gelq2()¶

rocblas_status
rocsolver_zgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶

rocblas_status
rocsolver_dgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GELQ2 computes a LQ factorization of a general mbyn matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
A = [ L 0 ] * Q
where L is lower triangular (lower trapezoidal if m > n), and Q is a nbyn orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(k) * H(k1) * ... * H(1), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I  ipiv[i1] * v(i)' * v(i)
where the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on and delow the diagonal contain the factor L; the elements above the diagonal are the n  i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
rocsolver_<type>gelq2_batched()¶

rocblas_status
rocsolver_zgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQ2_BATCHED computes the LQ factorization of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a nbyn orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i)' * v_j(i)
where the first i1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n  i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gelq2_strided_batched()¶

rocblas_status
rocsolver_zgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQ2_STRIDED_BATCHED computes the LQ factorization of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a nbyn orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i)' * v_j(i)
where the first i1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n  i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gelqf()¶

rocblas_status
rocsolver_zgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶

rocblas_status
rocsolver_cgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶

rocblas_status
rocsolver_dgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶

rocblas_status
rocsolver_sgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GELQF computes a LQ factorization of a general mbyn matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
A = [ L 0 ] * Q
where L is lower triangular (lower trapezoidal if m > n), and Q is a nbyn orthogonal/unitary matrix represented as the product of Householder matrices
Q = H(k) * H(k1) * ... * H(1), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I  ipiv[i1] * v(i)' * v(i)
where the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the n  i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
rocsolver_<type>gelqf_batched()¶

rocblas_status
rocsolver_zgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQF_BATCHED computes the LQ factorization of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a nbyn orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i)' * v_j(i)
where the first i1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n  i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gelqf_strided_batched()¶

rocblas_status
rocsolver_zgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQF_STRIDED_BATCHED computes the LQ factorization of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a nbyn orthogonal/unitary matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I  ipiv_j[i1] * v_j(i)' * v_j(i)
where the first i1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n  i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gels()¶

rocblas_status
rocsolver_zgels
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb, rocblas_int *info)¶

rocblas_status
rocsolver_cgels
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb, rocblas_int *info)¶

rocblas_status
rocsolver_dgels
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, double *B, const rocblas_int ldb, rocblas_int *info)¶

rocblas_status
rocsolver_sgels
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, float *B, const rocblas_int ldb, rocblas_int *info)¶ GELS solves an overdetermined (or underdetermined) linear system defined by an mbyn matrix A, and a corresponding matrix B, using the QR factorization computed by GEQRF (or the LQ factorization computed by GELQF).
The problem solved by this function is either of the form
A * X = B (no transpose), or A' * X = B (transpose/conjugate transpose)
depending on the value of trans.
If m >= n (or n < m in the case of transpose/conjugate transpose), the system is overdetermined and a leastsquares solution approximating X is found minimizing
 B  A * X  (no transpose), or  B  A' * X  (transpose/conjugate transpose)
If n < m (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and a unique solution for X is chosen minimizing  X 
 Note
The current implementation only supports the overdetermined, no transpose case.
rocblas_status_not_implemented
will be returned if m < n, or if trans isrocblas_operation_transpose
orrocblas_operation_conjugate_transpose
. Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations.
[in] m
: rocblas_int. m >= 0.The number of rows of matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of matrix A.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of columns of matrices B and X; i.e., the columns on the right hand side.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix A. On exit, the QR (or LQ) factorization of A as returned by GEQRF (or GELQF).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrix A.
[inout] B
: pointer to type. Array on the GPU of dimension ldb*nrhs.On entry, the matrix B is mbynrhs if nontransposed, or nbynrhs if transposed. On exit, when info = 0, B is overwritten by the solution vectors stored as columns.
[in] ldb
: rocblas_int. ldb >= max(m,n).Specifies the leading dimension of matrix B.
[out] info
: pointer to rocblas_int on the GPU.If info = 0, successful exit. If info = j > 0, the solution could not be computed because input matrix A is singular; the jth diagonal element of its triangular factor is zero.
rocsolver_<type>gels_batched()¶

rocblas_status
rocsolver_zgels_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgels_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgels_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgels_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)¶ GELS_BATCHED solves batches of overdetermined (or underdetermined) linear systems defined by the array of mbyn matrices A, and an array of corresponding matrices B, using the QR factorizations computed by GEQRF (or the LQ factorizations computed by GELQF).
The problem solved by this function is either of the form
A_i * X_i = B_i (no transpose), or A_i' * X_i = B_i (transpose/conjugate transpose)
depending on the value of trans.
If m >= n (or n < m in the case of transpose/conjugate transpose), the systems are overdetermined and leastsquares solutions approximating X_i are found minimizing
 B_i  A_i * X_i  (no transpose), or  B_i  A_i' * X_i  (transpose/conjugate transpose)
If n < m (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and a unique solution for X_i is chosen minimizing  X_i 
 Note
The current implementation only supports the overdetermined, no transpose case.
rocblas_status_not_implemented
will be returned if m < n, or if trans isrocblas_operation_transpose
orrocblas_operation_conjugate_transpose
. Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations.
[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of columns of all matrices B_i and X_i in the batch; i.e., the columns on the right hand side.
[inout] A
: array of pointer to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_i. On exit, the QR (or LQ) factorizations of A_i as returned by GEQRF (or GELQF).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[inout] B
: array of pointer to type. Each pointer points to an array on the GPU of dimension ldb*nrhs.On entry, the matrices B_i are mbynrhs if nontransposed, or nbynrhs if transposed. On exit, when info = 0, each B_i is overwritten by the solution vectors stored as columns.
[in] ldb
: rocblas_int. ldb >= max(m,n).Specifies the leading dimension of matrices B_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for solution of A_i. If info_i = j > 0, the solution of A_i could not be computed because input matrix A_i is singular; the jth diagonal element of its triangular factor is zero.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gels_strided_batched()¶

rocblas_status
rocsolver_zgels_strided_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgels_strided_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgels_strided_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgels_strided_batched
(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)¶ GELS_STRIDED_BATCHED solves batches of overdetermined (or underdetermined) linear systems defined by the array of mbyn matrices A, and an array of corresponding matrices B, using the QR factorizations computed by GEQRF (or the LQ factorizations computed by GELQF).
The problem solved by this function is either of the form
A_i * X_i = B_i (no transpose), or A_i' * X_i = B_i (transpose/conjugate transpose)
depending on the value of trans.
If m >= n (or n < m in the case of transpose/conjugate transpose), the systems are overdetermined and leastsquares solutions approximating X_i are found minimizing
 B_i  A_i * X_i  (no transpose), or  B_i  A_i' * X_i  (transpose/conjugate transpose)
If n < m (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and a unique solution for X_i is chosen minimizing  X_i 
 Note
The current implementation only supports the overdetermined, no transpose case.
rocblas_status_not_implemented
will be returned if m < n, or if trans isrocblas_operation_transpose
orrocblas_operation_conjugate_transpose
. Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations.
[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of columns of all matrices B_i and X_i in the batch; i.e., the columns on the right hand side.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_i. On exit, the QR (or LQ) factorizations of A_i as returned by GEQRF (or GELQF).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[inout] B
: pointer to type. Array on the GPU (the size depends on the value of strideB).On entry, the matrices B_i are mbynrhs if nontransposed, or nbynrhs if transposed. On exit, when info = 0, each B_i is overwritten by the solution vectors stored as columns.
[in] ldb
: rocblas_int. ldb >= max(m,n).Specifies the leading dimension of matrices B_i.
[in] strideB
: rocblas_stride.Stride from the start of one matrix B_i and the next one B_(i+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for solution of A_i. If info_i = j > 0, the solution of A_i could not be computed because input matrix A_i is singular; the jth diagonal element of its triangular factor is zero.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
General Matrix Diagonalizations¶
rocsolver_<type>gebd2()¶

rocblas_status
rocsolver_zgebd2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)¶

rocblas_status
rocsolver_cgebd2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)¶

rocblas_status
rocsolver_dgebd2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)¶

rocblas_status
rocsolver_sgebd2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)¶ GEBD2 computes the bidiagonal form of a general mbyn matrix A.
(This is the unblocked version of the algorithm).
The bidiagonal form is given by:
B = Q' * A * P
where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(n) and P = G(1) * G(2) * ... * G(n1), if m >= n, or Q = H(1) * H(2) * ... * H(m1) and P = G(1) * G(2) * ... * G(m), if m < n
Each Householder matrix H(i) and G(i) is given by
H(i) = I  tauq[i1] * v(i) * v(i)', and G(i) = I  taup[i1] * u(i) * u(i)'
If m >= n, the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1; while the first i elements of the Householder vector u(i) are zero, and u(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1; while the first i1 elements of the Householder vector u(i) are zero, and u(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the m  i elements of vector v(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.specifies the leading dimension of A.
[out] D
: pointer to real type. Array on the GPU of dimension min(m,n).The diagonal elements of B.
[out] E
: pointer to real type. Array on the GPU of dimension min(m,n)1.The offdiagonal elements of B.
[out] tauq
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
[out] taup
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices G(i).
rocsolver_<type>gebd2_batched()¶

rocblas_status
rocsolver_zgebd2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgebd2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgebd2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgebd2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEBD2_BATCHED computes the bidiagonal form of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The bidiagonal form is given by:
B_j = Q_j' * A_j * P_j
where B_j is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q_j and P_j are orthogonal/unitary matrices represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n) and P_j = G_j(1) * G_j(2) * ... * G_j(n1), if m >= n, or Q_j = H_j(1) * H_j(2) * ... * H_j(m1) and P_j = G_j(1) * G_j(2) * ... * G_j(m), if m < n
Each Householder matrix H_j(i) and G_j(i), for j = 1,2,…,batch_count, is given by
H_j(i) = I  tauq_j[i1] * v_j(i) * v_j(i)', and G_j(i) = I  taup_j[i1] * u_j(i) * u_j(i)'
If m >= n, the first i1 elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1; while the first i elements of the Householder vector u_j(i) are zero, and u_j(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1; while the first i1 elements of the Householder vector u_j(i) are zero, and u_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the m  i elements of vector v_j(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u_j(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v_j(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u_j(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of B_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of B_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)1.
[out] tauq
: pointer to type. Array on the GPU (the size depends on the value of strideQ).Contains the vectors tauq_j of scalar factors of the Householder matrices H_j(i).
[in] strideQ
: rocblas_stride.Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
[out] taup
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors taup_j of scalar factors of the Householder matrices G_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gebd2_strided_batched()¶

rocblas_status
rocsolver_zgebd2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgebd2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgebd2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgebd2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEBD2_STRIDED_BATCHED computes the bidiagonal form of a batch of general mbyn matrices.
(This is the unblocked version of the algorithm).
The bidiagonal form is given by:
B_j = Q_j' * A_j * P_j
where B_j is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q_j and P_j are orthogonal/unitary matrices represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n) and P_j = G_j(1) * G_j(2) * ... * G_j(n1), if m >= n, or Q_j = H_j(1) * H_j(2) * ... * H_j(m1) and P_j = G_j(1) * G_j(2) * ... * G_j(m), if m < n
Each Householder matrix H_j(i) and G_j(i), for j = 1,2,…,batch_count, is given by
H_j(i) = I  tauq_j[i1] * v_j(i) * v_j(i)', and G_j(i) = I  taup_j[i1] * u_j(i) * u_j(i)'
If m >= n, the first i1 elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1; while the first i elements of the Householder vector u_j(i) are zero, and u_j(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1; while the first i1 elements of the Householder vector u_j(i) are zero, and u_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the m  i elements of vector v_j(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u_j(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v_j(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u_j(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of B_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of B_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)1.
[out] tauq
: pointer to type. Array on the GPU (the size depends on the value of strideQ).Contains the vectors tauq_j of scalar factors of the Householder matrices H_j(i).
[in] strideQ
: rocblas_stride.Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
[out] taup
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors taup_j of scalar factors of the Householder matrices G_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gebrd()¶

rocblas_status
rocsolver_zgebrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)¶

rocblas_status
rocsolver_cgebrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)¶

rocblas_status
rocsolver_dgebrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)¶

rocblas_status
rocsolver_sgebrd
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)¶ GEBRD computes the bidiagonal form of a general mbyn matrix A.
(This is the blocked version of the algorithm).
The bidiagonal form is given by:
B = Q' * A * P
where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(n) and P = G(1) * G(2) * ... * G(n1), if m >= n, or Q = H(1) * H(2) * ... * H(m1) and P = G(1) * G(2) * ... * G(m), if m < n
Each Householder matrix H(i) and G(i) is given by
H(i) = I  tauq[i1] * v(i) * v(i)', and G(i) = I  taup[i1] * u(i) * u(i)'
If m >= n, the first i1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1; while the first i elements of the Householder vector u(i) are zero, and u(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1; while the first i1 elements of the Householder vector u(i) are zero, and u(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the m  i elements of vector v(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.specifies the leading dimension of A.
[out] D
: pointer to real type. Array on the GPU of dimension min(m,n).The diagonal elements of B.
[out] E
: pointer to real type. Array on the GPU of dimension min(m,n)1.The offdiagonal elements of B.
[out] tauq
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
[out] taup
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices G(i).
rocsolver_<type>gebrd_batched()¶

rocblas_status
rocsolver_zgebrd_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgebrd_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgebrd_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgebrd_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEBRD_BATCHED computes the bidiagonal form of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The bidiagonal form is given by:
B_j = Q_j' * A_j * P_j
where B_j is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q_j and P_j are orthogonal/unitary matrices represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n) and P_j = G_j(1) * G_j(2) * ... * G_j(n1), if m >= n, or Q_j = H_j(1) * H_j(2) * ... * H_j(m1) and P_j = G_j(1) * G_j(2) * ... * G_j(m), if m < n
Each Householder matrix H_j(i) and G_j(i), for j = 1,2,…,batch_count, is given by
H_j(i) = I  tauq_j[i1] * v_j(i) * v_j(i)', and G_j(i) = I  taup_j[i1] * u_j(i) * u_j(i)'
If m >= n, the first i1 elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1; while the first i elements of the Householder vector u_j(i) are zero, and u_j(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1; while the first i1 elements of the Householder vector u_j(i) are zero, and u_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the m  i elements of vector v_j(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u_j(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v_j(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u_j(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of B_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of B_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)1.
[out] tauq
: pointer to type. Array on the GPU (the size depends on the value of strideQ).Contains the vectors tauq_j of scalar factors of the Householder matrices H_j(i).
[in] strideQ
: rocblas_stride.Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
[out] taup
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors taup_j of scalar factors of the Householder matrices G_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gebrd_strided_batched()¶

rocblas_status
rocsolver_zgebrd_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgebrd_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgebrd_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgebrd_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEBRD_STRIDED_BATCHED computes the bidiagonal form of a batch of general mbyn matrices.
(This is the blocked version of the algorithm).
The bidiagonal form is given by:
B_j = Q_j' * A_j * P_j
where B_j is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q_j and P_j are orthogonal/unitary matrices represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n) and P_j = G_j(1) * G_j(2) * ... * G_j(n1), if m >= n, or Q_j = H_j(1) * H_j(2) * ... * H_j(m1) and P_j = G_j(1) * G_j(2) * ... * G_j(m), if m < n
Each Householder matrix H_j(i) and G_j(i), for j = 1,2,…,batch_count, is given by
H_j(i) = I  tauq_j[i1] * v_j(i) * v_j(i)', and G_j(i) = I  taup_j[i1] * u_j(i) * u_j(i)'
If m >= n, the first i1 elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1; while the first i elements of the Householder vector u_j(i) are zero, and u_j(i)[i+1] = 1. If m < n, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1; while the first i1 elements of the Householder vector u_j(i) are zero, and u_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the m  i elements of vector v_j(i) for i = 1,2,…,n, and the elements above the superdiagonal are the n  i  1 elements of vector u_j(i) for i = 1,2,…,n1. If m < n, the elements below the subdiagonal are the m  i  1 elements of vector v_j(i) for i = 1,2,…,m1, and the elements above the diagonal are the n  i elements of vector u_j(i) for i = 1,2,…,m.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of B_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of B_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)1.
[out] tauq
: pointer to type. Array on the GPU (the size depends on the value of strideQ).Contains the vectors tauq_j of scalar factors of the Householder matrices H_j(i).
[in] strideQ
: rocblas_stride.Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
[out] taup
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors taup_j of scalar factors of the Householder matrices G_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
Symmetric Matrix Diagonalizations¶
rocsolver_<type>sytd2()¶

rocblas_status
rocsolver_dsytd2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)¶

rocblas_status
rocsolver_ssytd2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)¶ SYTD2 computes the tridiagonal form of a real symmetric matrix A.
(This is the unblocked version of the algorithm).
The tridiagonal form is given by:
T = Q' * A * Q
where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(n1) if uplo indicates lower, or Q = H(n1) * H(n2) * ... * H(1) if uplo indicates upper.
Each Householder matrix H(i) is given by
H(i) = I  tau[i] * v(i) * v(i)'
where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the i1 nonzero elements of vectors v(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] D
: pointer to type. Array on the GPU of dimension n.The diagonal elements of T.
[out] E
: pointer to type. Array on the GPU of dimension n1.The offdiagonal elements of T.
[out] tau
: pointer to type. Array on the GPU of dimension n1.The scalar factors of the Householder matrices H(i).
rocsolver_<type>sytd2_batched()¶

rocblas_status
rocsolver_dsytd2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_ssytd2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ SYTD2_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is symmetric tridiagonal and Q_j is an orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[out] D
: pointer to type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>sytd2_strided_batched()¶

rocblas_status
rocsolver_dsytd2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_ssytd2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ SYTD2_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is symmetric tridiagonal and Q_j is an orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] D
: pointer to type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>hetd2()¶

rocblas_status
rocsolver_zhetd2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)¶

rocblas_status
rocsolver_chetd2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)¶ HETD2 computes the tridiagonal form of a complex hermitian matrix A.
(This is the unblocked version of the algorithm).
The tridiagonal form is given by:
T = Q' * A * Q
where T is hermitian tridiagonal and Q is an unitary matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(n1) if uplo indicates lower, or Q = H(n1) * H(n2) * ... * H(1) if uplo indicates upper.
Each Householder matrix H(i) is given by
H(i) = I  tau[i] * v(i) * v(i)'
where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the i1 nonzero elements of vectors v(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] D
: pointer to real type. Array on the GPU of dimension n.The diagonal elements of T.
[out] E
: pointer to real type. Array on the GPU of dimension n1.The offdiagonal elements of T.
[out] tau
: pointer to type. Array on the GPU of dimension n1.The scalar factors of the Householder matrices H(i).
rocsolver_<type>hetd2_batched()¶

rocblas_status
rocsolver_zhetd2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_chetd2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ HETD2_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is hermitian tridiagonal and Q_j is a unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>hetd2_strided_batched()¶

rocblas_status
rocsolver_zhetd2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_chetd2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ HETD2_STRIDED_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is hermitian tridiagonal and Q_j is a unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>sytrd()¶

rocblas_status
rocsolver_dsytrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)¶

rocblas_status
rocsolver_ssytrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)¶ SYTRD computes the tridiagonal form of a real symmetric matrix A.
(This is the blocked version of the algorithm).
The tridiagonal form is given by:
T = Q' * A * Q
where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(n1) if uplo indicates lower, or Q = H(n1) * H(n2) * ... * H(1) if uplo indicates upper.
Each Householder matrix H(i) is given by
H(i) = I  tau[i] * v(i) * v(i)'
where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the i1 nonzero elements of vectors v(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] D
: pointer to type. Array on the GPU of dimension n.The diagonal elements of T.
[out] E
: pointer to type. Array on the GPU of dimension n1.The offdiagonal elements of T.
[out] tau
: pointer to type. Array on the GPU of dimension n1.The scalar factors of the Householder matrices H(i).
rocsolver_<type>sytrd_batched()¶

rocblas_status
rocsolver_dsytrd_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_ssytrd_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ SYTRD_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is symmetric tridiagonal and Q_j is an orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[out] D
: pointer to type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>sytrd_strided_batched()¶

rocblas_status
rocsolver_dsytrd_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_ssytrd_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ SYTRD_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is symmetric tridiagonal and Q_j is an orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] D
: pointer to type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>hetrd()¶

rocblas_status
rocsolver_zhetrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)¶

rocblas_status
rocsolver_chetrd
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)¶ HETRD computes the tridiagonal form of a complex hermitian matrix A.
(This is the blocked version of the algorithm).
The tridiagonal form is given by:
T = Q' * A * Q
where T is hermitian tridiagonal and Q is an unitary matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(n1) if uplo indicates lower, or Q = H(n1) * H(n2) * ... * H(1) if uplo indicates upper.
Each Householder matrix H(i) is given by
H(i) = I  tau[i] * v(i) * v(i)'
where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v(i) are zero, and v(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the i1 nonzero elements of vectors v(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] D
: pointer to real type. Array on the GPU of dimension n.The diagonal elements of T.
[out] E
: pointer to real type. Array on the GPU of dimension n1.The offdiagonal elements of T.
[out] tau
: pointer to type. Array on the GPU of dimension n1.The scalar factors of the Householder matrices H(i).
rocsolver_<type>hetrd_batched()¶

rocblas_status
rocsolver_zhetrd_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_chetrd_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ HETRD_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is hermitian tridiagonal and Q_j is a unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>hetrd_strided_batched()¶

rocblas_status
rocsolver_zhetrd_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶

rocblas_status
rocsolver_chetrd_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)¶ HETRD_STRIDED_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of A_j is given by:
T_j = Q_j' * A_j * Q_j, for j = 1,2,...,batch_count
where T_j is hermitian tridiagonal and Q_j is a unitary matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(n1) if uplo indicates lower, or Q_j = H_j(n1) * H_j(n2) * ... * H_j(1) if uplo indicates upper.
Each Householder matrix H_j(i) is given by
H_j(i) = I  tau_j[i] * v_j(i) * v_j(i)'
where tau_j[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector v_j(i) are zero, and v_j(i)[i+1] = 1. If uplo indicates upper, the last ni elements of the Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
 Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the upper or lower part of the hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrices A_j.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal of A_j contain the tridiagonal form T_j; the elements above the superdiagonal contain the i1 nonzero elements of vectors v_j(i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the ni1 nonzero elements of vectors v_j(i) stored as columns.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] D
: pointer to real type. Array on the GPU (the size depends on the value of strideD).The diagonal elements of T_j.
[in] strideD
: rocblas_stride.Stride from the start of one vector D_j and the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).The offdiagonal elements of T_j.
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j and the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n1.
[out] tau
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors tau_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n1.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
General Matrix Inversion¶
rocsolver_<type>getri()¶

rocblas_status
rocsolver_zgetri
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_cgetri
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_dgetri
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶

rocblas_status
rocsolver_sgetri
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶ GETRI inverts a general nbyn matrix A using the LU factorization computed by GETRF.
The inverse is computed by solving the linear system
inv(A) * L = inv(U)
where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.
 Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the factors L and U of the factorization A = P*L*U returned by GETRF. On exit, the inverse of A if info = 0; otherwise undefined.
[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of A.
[in] ipiv
: pointer to rocblas_int. Array on the GPU of dimension n.The pivot indices returned by GETRF.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U(i,i) is the first zero pivot.
rocsolver_<type>getri_batched()¶

rocblas_status
rocsolver_zgetri_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetri_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetri_batched
(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetri_batched
(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_BATCHED inverts a batch of general nbyn matrices using the LU factorization computed by GETRF_BATCHED.
The inverse is computed by solving the linear system
inv(A_j) * L_j = inv(U_j)
where L_j is the lower triangular factor of A_j with unit diagonal elements, and U_j is the upper triangular factor.
 Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the factors L_j and U_j of the factorization A = P_j*L_j*U_j returned by GETRF_BATCHED. On exit, the inverses of A_j if info_j = 0; otherwise undefined.
[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).The pivot indices returned by GETRF_BATCHED.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(i+j). There is no restriction for the value of strideP. Normal use case is strideP >= n.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_j = 0, successful exit for inversion of A_j. If info_j = i > 0, U_j is singular. U_j(i,i) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getri_strided_batched()¶

rocblas_status
rocsolver_zgetri_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetri_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetri_strided_batched
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetri_strided_batched
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_STRIDED_BATCHED inverts a batch of general nbyn matrices using the LU factorization computed by GETRF_STRIDED_BATCHED.
The inverse is computed by solving the linear system
inv(A_j) * L_j = inv(U_j)
where L_j is the lower triangular factor of A_j with unit diagonal elements, and U_j is the upper triangular factor.
 Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_STRIDED_BATCHED. On exit, the inverses of A_j if info_j = 0; otherwise undefined.
[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).The pivot indices returned by GETRF_STRIDED_BATCHED.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_j = 0, successful exit for inversion of A_j. If info_j = i > 0, U_j is singular. U_j(i,i) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
General Systems Solvers¶
rocsolver_<type>getrs()¶

rocblas_status
rocsolver_zgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int *ipiv, rocblas_double_complex *B, const rocblas_int ldb)¶

rocblas_status
rocsolver_cgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int *ipiv, rocblas_float_complex *B, const rocblas_int ldb)¶

rocblas_status
rocsolver_dgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_int *ipiv, double *B, const rocblas_int ldb)¶

rocblas_status
rocsolver_sgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_int *ipiv, float *B, const rocblas_int ldb)¶ GETRS solves a system of n linear equations on n variables using the LU factorization computed by GETRF.
It solves one of the following systems:
A * X = B (no transpose), A' * X = B (transpose), or A* * X = B (conjugate transpose)
depending on the value of trans.
 Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations.
[in] n
: rocblas_int. n >= 0.The order of the system, i.e. the number of columns and rows of A.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of right hand sides, i.e., the number of columns of the matrix B.
[in] A
: pointer to type. Array on the GPU of dimension lda*n.The factors L and U of the factorization A = P*L*U returned by GETRF.
[in] lda
: rocblas_int. lda >= n.The leading dimension of A.
[in] ipiv
: pointer to rocblas_int. Array on the GPU of dimension n.The pivot indices returned by GETRF.
[inout] B
: pointer to type. Array on the GPU of dimension ldb*nrhs.On entry, the right hand side matrix B. On exit, the solution matrix X.
[in] ldb
: rocblas_int. ldb >= n.The leading dimension of B.
rocsolver_<type>getrs_batched()¶

rocblas_status
rocsolver_zgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶ GETRS_BATCHED solves a batch of systems of n linear equations on n variables using the LU factorization computed by GETRF_BATCHED.
For each instance j in the batch, it solves one of the following systems:
A_j * X_j = B_j (no transpose), A_j' * X_j = B_j (transpose), or A_j* * X_j = B_j (conjugate transpose)
depending on the value of trans.
 Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations of each instance in the batch.
[in] n
: rocblas_int. n >= 0.The order of the system, i.e. the number of columns and rows of all A_j matrices.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of right hand sides, i.e., the number of columns of all the matrices B_j.
[in] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_BATCHED.
[in] lda
: rocblas_int. lda >= n.The leading dimension of matrices A_j.
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of pivot indices returned by GETRF_BATCHED.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[inout] B
: Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs.On entry, the right hand side matrices B_j. On exit, the solution matrix X_j of each system in the batch.
[in] ldb
: rocblas_int. ldb >= n.The leading dimension of matrices B_j.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of instances (systems) in the batch.
rocsolver_<type>getrs_strided_batched()¶

rocblas_status
rocsolver_zgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶ GETRS_STRIDED_BATCHED solves a batch of systems of n linear equations on n variables using the LU factorization computed by GETRF_STRIDED_BATCHED.
For each instance j in the batch, it solves one of the following systems:
A_j * X_j = B_j (no transpose), A_j' * X_j = B_j (transpose), or A_j* * X_j = B_j (conjugate transpose)
depending on the value of trans.
 Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations of each instance in the batch.
[in] n
: rocblas_int. n >= 0.The order of the system, i.e. the number of columns and rows of all A_j matrices.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of right hand sides, i.e., the number of columns of all the matrices B_j.
[in] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_STRIDED_BATCHED.
[in] lda
: rocblas_int. lda >= n.The leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of pivot indices returned by GETRF_STRIDED_BATCHED.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[inout] B
: pointer to type. Array on the GPU (size depends on the value of strideB).On entry, the right hand side matrices B_j. On exit, the solution matrix X_j of each system in the batch.
[in] ldb
: rocblas_int. ldb >= n.The leading dimension of matrices B_j.
[in] strideB
: rocblas_stride.Stride from the start of one matrix B_j and the next one B_(j+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of instances (systems) in the batch.
General Matrix Singular Value Decomposition¶
rocsolver_<type>gesvd()¶

rocblas_status
rocsolver_zgesvd
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *S, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *V, const rocblas_int ldv, double *E, const rocblas_workmode fast_alg, rocblas_int *info)¶

rocblas_status
rocsolver_cgesvd
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *S, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *V, const rocblas_int ldv, float *E, const rocblas_workmode fast_alg, rocblas_int *info)¶

rocblas_status
rocsolver_dgesvd
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *S, double *U, const rocblas_int ldu, double *V, const rocblas_int ldv, double *E, const rocblas_workmode fast_alg, rocblas_int *info)¶

rocblas_status
rocsolver_sgesvd
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *S, float *U, const rocblas_int ldu, float *V, const rocblas_int ldv, float *E, const rocblas_workmode fast_alg, rocblas_int *info)¶ GESVD computes the Singular Values and optionally the Singular Vectors of a general mbyn matrix A (Singular Value Decomposition).
The SVD of matrix A is given by:
A = U * S * V'
where the mbyn matrix S is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of A. U and V are orthogonal (unitary) matrices. The first min(m,n) columns of U and V are the left and right singular vectors of A, respectively.
The computation of the singular vectors is optional and it is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or transpose conjugate) of the right singular vectors, i.e. the rows of V’.
left_svect and right_svect are rocblas_svect enums that can take the following values:
rocblas_svect_all: the entire matrix U (or V’) is computed,
rocblas_svect_singular: only the singular vectors (first min(m,n) columns of U or rows of V’) are computed,
rocblas_svect_overwrite: the first columns (or rows) of A are overwritten with the singular vectors, or
rocblas_svect_none: no columns (or rows) of U (or V’) are computed, i.e. no singular vectors.
left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of A are destroyed by the time the function returns.
 Note
When m >> n (or n >> m) the algorithm could be sped up by compressing the matrix A via a QR (or LQ) factorization, and working with the triangular factor afterwards (thinSVD). If the singular vectors are also requested, its computation could be sped up as well via executing some intermediate operations outofplace, and relying more on matrix multiplications (GEMMs); this will require, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed or not. For more details see the sections “Tuning rocSOLVER performance” and “Memory model” on the User’s guide.
 Parameters
[in] handle
: rocblas_handle.[in] left_svect
: rocblas_svect.Specifies how the left singular vectors are computed.
[in] right_svect
: rocblas_svect.Specifies how the right singular vectors are computed.
[in] m
: rocblas_int. m >= 0.The number of rows of matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry the matrix A. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) contain the left (or right) singular vectors; otherwise, contents of A are destroyed.
[in] lda
: rocblas_int. lda >= m.The leading dimension of A.
[out] S
: pointer to real type. Array on the GPU of dimension min(m,n).The singular values of A in decreasing order.
[out] U
: pointer to type. Array on the GPU of dimension ldu*min(m,n) if left_svect is set to singular, or ldu*m when left_svect is equal to all.The matrix of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.
[in] ldu
: rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise.The leading dimension of U.
[out] V
: pointer to type. Array on the GPU of dimension ldv*n.The matrix of right singular vectors stored as rows (transposed / conjugatetransposed). Not referenced if right_svect is set to overwrite or none.
[in] ldv
: rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise.The leading dimension of V.
[out] E
: pointer to real type. Array on the GPU of dimension min(m,n)1.This array is used to work internally with the bidiagonal matrix B associated to A (using BDSQR). On exit, if info > 0, it contains the unconverged offdiagonal elements of B (or properly speaking, a bidiagonal matrix orthogonally equivalent to B). The diagonal elements of this matrix are in S; those that converged correspond to a subset of the singular values of A (not necessarily ordered).
[in] fast_alg
: rocblas_workmode.If set to rocblas_outofplace, the function will execute the fast thinSVD version of the algorithm when possible.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, BDSQR did not converge. i elements of E did not converge to zero.
rocsolver_<type>gesvd_batched()¶

rocblas_status
rocsolver_zgesvd_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *S, const rocblas_stride strideS, rocblas_double_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_double_complex *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgesvd_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *S, const rocblas_stride strideS, rocblas_float_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_float_complex *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgesvd_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *S, const rocblas_stride strideS, double *U, const rocblas_int ldu, const rocblas_stride strideU, double *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgesvd_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *S, const rocblas_stride strideS, float *U, const rocblas_int ldu, const rocblas_stride strideU, float *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶ GESVD_BATCHED computes the Singular Values and optionally the Singular Vectors of a batch of general mbyn matrix A (Singular Value Decomposition).
The SVD of matrix A_j is given by:
A_j = U_j * S_j * V_j'
where the mbyn matrix S_j is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of A_j. U_j and V_j are orthogonal (unitary) matrices. The first min(m,n) columns of U_j and V_j are the left and right singular vectors of A_j, respectively.
The computation of the singular vectors is optional and it is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or transpose conjugate) of the right singular vectors, i.e. the rows of V_j’.
left_svect and right_svect are rocblas_svect enums that can take the following values:
rocblas_svect_all: the entire matrix U_j (or V_j’) is computed,
rocblas_svect_singular: only the singular vectors (first min(m,n) columns of U_j or rows of V_j’) are computed,
rocblas_svect_overwrite: the first columns (or rows) of A_j are overwritten with the singular vectors, or
rocblas_svect_none: no columns (or rows) of U_j (or V_j’) are computed, i.e. no singular vectors.
left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of A_j are destroyed by the time the function returns.
 Note
When m >> n (or n >> m) the algorithm could be sped up by compressing the matrix A_j via a QR (or LQ) factorization, and working with the triangular factor afterwards (thinSVD). If the singular vectors are also requested, its computation could be sped up as well via executing some intermediate operations outofplace, and relying more on matrix multiplications (GEMMs); this will require, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed or not. For more details see the sections “Tuning rocSOLVER performance” and “Memory model” on the User’s guide.
 Parameters
[in] handle
: rocblas_handle.[in] left_svect
: rocblas_svect.Specifies how the left singular vectors are computed.
[in] right_svect
: rocblas_svect.Specifies how the right singular vectors are computed.
[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry the matrices A_j. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_j contain the left (or right) corresponding singular vectors; otherwise, contents of A_j are destroyed.
[in] lda
: rocblas_int. lda >= m.The leading dimension of A_j.
[out] S
: pointer to real type. Array on the GPU (the size depends on the value of strideS).The singular values of A_j in decreasing order.
[in] strideS
: rocblas_stride.Stride from the start of one vector S_j to the next one S_(j+1). There is no restriction for the value of strideS. Normal use case is strideS >= min(m,n).
[out] U
: pointer to type. Array on the GPU (the side depends on the value of strideU).The matrices U_j of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.
[in] ldu
: rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise.The leading dimension of U_j.
[in] strideU
: rocblas_stride.Stride from the start of one matrix U_j to the next one U_(j+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m when left_svect is equal to all.
[out] V
: pointer to type. Array on the GPU (the size depends on the value of strideV).The matrices V_j of right singular vectors stored as rows (transposed / conjugatetransposed). Not referenced if right_svect is set to overwrite or none.
[in] ldv
: rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise.The leading dimension of V.
[in] strideV
: rocblas_stride.Stride from the start of one matrix V_j to the next one V_(j+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).This array is used to work internally with the bidiagonal matrix B_j associated to A_j (using BDSQR). On exit, if info > 0, it contains the unconverged offdiagonal elements of B_j (or properly speaking, a bidiagonal matrix orthogonally equivalent to B_j). The diagonal elements of this matrix are in S_j; those that converged correspond to a subset of the singular values of A_j (not necessarily ordered).
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)1.
[in] fast_alg
: rocblas_workmode.If set to rocblas_outofplace, the function will execute the fast thinSVD version of the algorithm when possible.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, BDSQR did not converge. i elements of E did not converge to zero.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>gesvd_strided_batched()¶

rocblas_status
rocsolver_zgesvd_strided_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *S, const rocblas_stride strideS, rocblas_double_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_double_complex *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgesvd_strided_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *S, const rocblas_stride strideS, rocblas_float_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_float_complex *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgesvd_strided_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *S, const rocblas_stride strideS, double *U, const rocblas_int ldu, const rocblas_stride strideU, double *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgesvd_strided_batched
(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *S, const rocblas_stride strideS, float *U, const rocblas_int ldu, const rocblas_stride strideU, float *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)¶ GESVD_STRIDED_BATCHED computes the Singular Values and optionally the Singular Vectors of a batch of general mbyn matrix A (Singular Value Decomposition).
The SVD of matrix A_j is given by:
A_j = U_j * S_j * V_j'
where the mbyn matrix S_j is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of A_j. U_j and V_j are orthogonal (unitary) matrices. The first min(m,n) columns of U_j and V_j are the left and right singular vectors of A_j, respectively.
The computation of the singular vectors is optional and it is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or transpose conjugate) of the right singular vectors, i.e. the rows of V_j’.
left_svect and right_svect are rocblas_svect enums that can take the following values:
rocblas_svect_all: the entire matrix U_j (or V_j’) is computed,
rocblas_svect_singular: only the singular vectors (first min(m,n) columns of U_j or rows of V_j’) are computed,
rocblas_svect_overwrite: the first columns (or rows) of A_j are overwritten with the singular vectors, or
rocblas_svect_none: no columns (or rows) of U_j (or V_j’) are computed, i.e. no singular vectors.
left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of A_j are destroyed by the time the function returns.
 Note
When m >> n (or n >> m) the algorithm could be sped up by compressing the matrix A_j via a QR (or LQ) factorization, and working with the triangular factor afterwards (thinSVD). If the singular vectors are also requested, its computation could be sped up as well via executing some intermediate operations outofplace, and relying more on matrix multiplications (GEMMs); this will require, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed or not. For more details see the sections “Tuning rocSOLVER performance” and “Memory model” on the User’s guide.
 Parameters
[in] handle
: rocblas_handle.[in] left_svect
: rocblas_svect.Specifies how the left singular vectors are computed.
[in] right_svect
: rocblas_svect.Specifies how the right singular vectors are computed.
[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry the matrices A_j. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_j contain the left (or right) corresponding singular vectors; otherwise, contents of A_j are destroyed.
[in] lda
: rocblas_int. lda >= m.The leading dimension of A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] S
: pointer to real type. Array on the GPU (the size depends on the value of strideS).The singular values of A_j in decreasing order.
[in] strideS
: rocblas_stride.Stride from the start of one vector S_j to the next one S_(j+1). There is no restriction for the value of strideS. Normal use case is strideS >= min(m,n).
[out] U
: pointer to type. Array on the GPU (the side depends on the value of strideU).The matrices U_j of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.
[in] ldu
: rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise.The leading dimension of U_j.
[in] strideU
: rocblas_stride.Stride from the start of one matrix U_j to the next one U_(j+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m when left_svect is equal to all.
[out] V
: pointer to type. Array on the GPU (the size depends on the value of strideV).The matrices V_j of right singular vectors stored as rows (transposed / conjugatetransposed). Not referenced if right_svect is set to overwrite or none.
[in] ldv
: rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise.The leading dimension of V.
[in] strideV
: rocblas_stride.Stride from the start of one matrix V_j to the next one V_(j+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.
[out] E
: pointer to real type. Array on the GPU (the size depends on the value of strideE).This array is used to work internally with the bidiagonal matrix B_j associated to A_j (using BDSQR). On exit, if info > 0, it contains the unconverged offdiagonal elements of B_j (or properly speaking, a bidiagonal matrix orthogonally equivalent to B_j). The diagonal elements of this matrix are in S_j; those that converged correspond to a subset of the singular values of A_j (not necessarily ordered).
[in] strideE
: rocblas_stride.Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)1.
[in] fast_alg
: rocblas_workmode.If set to rocblas_outofplace, the function will execute the fast thinSVD version of the algorithm when possible.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, BDSQR did not converge. i elements of E did not converge to zero.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
Lapacklike Functions¶
Other Lapacklike routines provided by rocSOLVER.
General Matrix Factorizations¶
rocsolver_<type>getf2_npvt()¶

rocblas_status
rocsolver_zgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_cgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_dgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_sgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ GETF2_NPVT computes the LU factorization of a general mbyn matrix A without partial pivoting.
(This is the unblocked Level2BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization has the form
A = L * U
where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacyLAPACKlike API GETF2 routines instead.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U(i,i) is the first zero element in the diagonal. The factorization from this point might be incomplete.
rocsolver_<type>getf2_npvt_batched()¶

rocblas_status
rocsolver_zgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_NPVT_BATCHED computes the LU factorization of a batch of general mbyn matrices without partial pivoting.
(This is the unblocked Level2BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = L_i * U_i
where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacyLAPACKlike API GETF2 routines instead.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getf2_npvt_strided_batched()¶

rocblas_status
rocsolver_zgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general mbyn matrices without partial pivoting.
(This is the unblocked Level2BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = L_i * U_i
where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacyLAPACKlike API GETF2 routines instead.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf_npvt()¶

rocblas_status
rocsolver_zgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_cgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_dgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status
rocsolver_sgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ GETRF_NPVT computes the LU factorization of a general mbyn matrix A without partial pivoting.
(This is the blocked Level3BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization has the form
A = L * U
where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacyLAPACKlike API GETRF routines instead.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the mbyn matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U(i,i) is the first zero element in the diagonal. The factorization from this point might be incomplete.
rocsolver_<type>getrf_npvt_batched()¶

rocblas_status
rocsolver_zgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_NPVT_BATCHED computes the LU factorization of a batch of general mbyn matrices without partial pivoting.
(This is the blocked Level3BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = L_i * U_i
where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacyLAPACKlike API GETRF routines instead.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf_npvt_strided_batched()¶

rocblas_status
rocsolver_zgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_cgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_dgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status
rocsolver_sgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general mbyn matrices without partial pivoting.
(This is the blocked Level3BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with midsize matrices if optimizations are enabled (default option). For more details see the section “tuning rocSOLVER performance” on the User’s guide).
The factorization of matrix A_i in the batch has the form
A_i = L_i * U_i
where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacyLAPACKlike API GETRF routines instead.
 Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the mbyn matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, successful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
Deprecated¶
Types¶
See the rocBLAS types documentation for information on the suggested replacements.
rocsolver_int¶

typedef rocblas_int
rocsolver_int
¶  Deprecated:
Use
rocblas_int
.
Deprecated since version 3.5: Use rocblas_int
.
rocsolver_handle¶

typedef rocblas_handle
rocsolver_handle
¶  Deprecated:
Use
rocblas_handle
.
Deprecated since version 3.5: Use rocblas_handle
.
rocsolver_direction¶

typedef rocblas_direct
rocsolver_direction
¶  Deprecated:
Use
rocblas_direct
Deprecated since version 3.5: Use rocblas_direct
.
rocsolver_storev¶

typedef rocblas_storev
rocsolver_storev
¶  Deprecated:
Use
rocblas_storev
.
Deprecated since version 3.5: Use rocblas_storev
.
rocsolver_operation¶

typedef rocblas_operation
rocsolver_operation
¶  Deprecated:
Use
rocblas_operation
.
Deprecated since version 3.5: Use rocblas_operation
.
rocsolver_fill¶

typedef rocblas_fill
rocsolver_fill
¶  Deprecated:
Use
rocblas_fill
.
Deprecated since version 3.5: Use rocblas_fill
.
rocsolver_diagonal¶

typedef rocblas_diagonal
rocsolver_diagonal
¶  Deprecated:
Use
rocblas_diagonal
.
Deprecated since version 3.5: Use rocblas_diagonal
.
Auxiliary Functions¶
See the rocBLAS auxiliary functions documentation for information on the suggested replacements.
rocsolver_create_handle()¶

rocsolver_status
rocsolver_create_handle
(rocsolver_handle *handle)¶ Creates a handle and sets the pointer mode to
rocblas_pointer_mode_device
. Deprecated:
Use
rocblas_create_handle
.
Deprecated since version 3.5: Use rocblas_create_handle()
.
rocsolver_destroy_handle()¶

rocsolver_status
rocsolver_destroy_handle
(rocsolver_handle handle)¶  Deprecated:
Use
rocblas_destroy_handle
.
Deprecated since version 3.5: Use rocblas_destroy_handle()
.
rocsolver_set_stream()¶

rocsolver_status
rocsolver_set_stream
(rocsolver_handle handle, hipStream_t stream)¶  Deprecated:
Use
rocblas_set_stream
.
Deprecated since version 3.5: Use rocblas_set_stream()
.
rocsolver_get_stream()¶

rocsolver_status
rocsolver_get_stream
(rocsolver_handle handle, hipStream_t *stream)¶  Deprecated:
Use
rocblas_get_stream
.
Deprecated since version 3.5: Use rocblas_get_stream()
.
rocsolver_set_vector()¶

rocsolver_status
rocsolver_set_vector
(rocsolver_int n, rocsolver_int elem_size, const void *x, rocsolver_int incx, void *y, rocsolver_int incy)¶  Deprecated:
Use
rocblas_set_vector
.
Deprecated since version 3.5: Use rocblas_set_vector()
.
rocsolver_get_vector()¶

rocsolver_status
rocsolver_get_vector
(rocsolver_int n, rocsolver_int elem_size, const void *x, rocsolver_int incx, void *y, rocsolver_int incy)¶  Deprecated:
Use
rocblas_get_vector
.
Deprecated since version 3.5: Use rocblas_get_vector()
.
rocsolver_set_matrix()¶

rocsolver_status
rocsolver_set_matrix
(rocsolver_int rows, rocsolver_int cols, rocsolver_int elem_size, const void *a, rocsolver_int lda, void *b, rocsolver_int ldb)¶  Deprecated:
Use
rocblas_set_matrix
.
Deprecated since version 3.5: Use rocblas_set_matrix()
.
rocsolver_get_matrix()¶

rocsolver_status
rocsolver_get_matrix
(rocsolver_int rows, rocsolver_int cols, rocsolver_int elem_size, const void *a, rocsolver_int lda, void *b, rocsolver_int ldb)¶  Deprecated:
Use
rocblas_get_matrix
.
Deprecated since version 3.5: Use rocblas_get_matrix()
.