rocSOLVER API¶
This section provides details of the rocSOLVER library API as in last ROCm release.
Types¶
Most rocSOLVER types are aliases of rocBLAS types. See rocBLAS types here.
Definitions¶
rocsolver_int¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_int” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
Enums¶
rocsolver_handle¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_handle” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_operation¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_operation” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_fill¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_fill” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_diagonal¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_diagonal” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_side¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_side” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_direct¶
Warning
doxygenenum: Cannot find enum “rocsolver_direct” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_storev¶
Warning
doxygenenum: Cannot find enum “rocsolver_storev” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_status¶
Warning
doxygentypedef: Cannot find typedef “rocsolver_status” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
Lapack Auxiliary Functions¶
These are functions that support more advanced Lapack routines.
Matrix permutations and manipulations¶
rocsolver_<type>laswp()¶
-
rocblas_status
rocsolver_zlaswp
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶
-
rocblas_status
rocsolver_claswp
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶
-
rocblas_status
rocsolver_dlaswp
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶
-
rocblas_status
rocsolver_slaswp
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)¶ LASWP performs a series of row interchanges on the matrix A.
It interchanges row I with row IPIV[k1 + (I - k1) * abs(inx)], for each of rows K1 through K2 of A. k1 and k2 are 1-based indices.
- Parameters
[in] handle
: rocblas_handle[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix of column dimension n to which the row interchanges will be applied. On exit, the permuted matrix.
[in] lda
: rocblas_int. lda > 0.The leading dimension of the array A.
[in] k1
: rocblas_int. k1 > 0.The first element of IPIV for which a row interchange will be done. This is a 1-based index.
[in] k2
: rocblas_int. k2 > k1 > 0.(K2-K1+1) is the number of elements of IPIV for which a row interchange will be done. This is a 1-based index.
[in] ipiv
: pointer to rocblas_int. Array on the GPU of dimension at least k1 + (k2 - k1) * abs(incx).The vector of pivot indices. Only the elements in positions k1 through (k1 + (k2 - k1) * abs(incx)) of IPIV are accessed. Elements of ipiv are considered 1-based.
[in] incx
: rocblas_int. incx != 0.The increment between successive values of IPIV. If IPIV is negative, the pivots are applied in reverse order.
Householder reflexions¶
rocsolver_<type>larfg()¶
-
rocblas_status
rocsolver_dlarfg
(rocblas_handle handle, const rocblas_int n, double *alpha, double *x, const rocblas_int incx, double *tau)¶
-
rocblas_status
rocsolver_slarfg
(rocblas_handle handle, const rocblas_int n, float *alpha, float *x, const rocblas_int incx, float *tau)¶ LARFG generates an orthogonal Householder reflector H of order n.
Householder reflector H is such that
H * [alpha] = [beta] [ x ] [ 0 ]
where x is an n-1 vector and alpha and beta are scalars. Matrix H can be generated as
H = I - tau * [1] * [1 v'] [v]
with v an n-1 vector and tau a scalar.
- Parameters
[in] handle
: rocblas_handle[in] n
: rocblas_int. n >= 0.The order (size) of reflector H.
[inout] alpha
: pointer to type. A scalar on the GPU.On input the scalar alpha, on output it is overwritten with beta.
[inout] x
: pointer to type. Array on the GPU of size at least n-1.On input it is the vector x, on output it is overwritten with vector v.
[in] incx
: rocblas_int. incx > 0.The increment between consecutive elements of x.
[out] tau
: pointer to type. A scalar on the GPU.The scalar tau.
-
rocblas_status
rocsolver_clarfg
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *alpha, rocblas_float_complex *x, const rocblas_int incx, rocblas_float_complex *tau)¶
-
rocblas_status
rocsolver_zlarfg
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *alpha, rocblas_double_complex *x, const rocblas_int incx, rocblas_double_complex *tau)¶
rocsolver_<type>larft()¶
-
rocblas_status
rocsolver_dlarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *tau, double *T, const rocblas_int ldt)¶
-
rocblas_status
rocsolver_slarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *tau, float *T, const rocblas_int ldt)¶ LARFT Generates the triangular factor T of a block reflector H of order n.
The block reflector H is defined as the product of k Householder matrices as
H = H(1) * H(2) * ... * H(k) (forward direction), or H = H(k) * ... * H(2) * H(1) (backward direction)
depending on the value of direct.
The triangular matrix T is upper triangular in forward direction and lower triangular in backward direction. If storev is column-wise, then
H = I - V * T * V'
where the i-th column of matrix V contains the Householder vector associated to H(i). If storev is row-wise, then
H = I - V' * T * V
where the i-th row of matrix V contains the Householder vector associated to H(i).
- Parameters
[in] handle
: rocblas_handle.[in] direct
: rocblas_direct.Specifies the direction in which the Householder matrices are applied.
[in] storev
: rocblas_storev.Specifies how the Householder vectors are stored in matrix V.
[in] n
: rocblas_int. n >= 0.The order (size) of the block reflector.
[in] k
: rocsovler_int. k >= 1.The number of Householder matrices.
[in] V
: pointer to type. Array on the GPU of size ldv*k if column-wise, or ldv*n if row-wise.The matrix of Householder vectors.
[in] ldv
: rocblas_int. ldv >= n if column-wise, or ldv >= k if row-wise.Leading dimension of V.
[in] tau
: pointer to type. Array of k scalars on the GPU.The vector of all the scalars associated to the Householder matrices.
[out] T
: pointer to type. Array on the GPU of dimension ldt*k.The triangular factor. T is upper triangular is forward operation, otherwise it is lower triangular. The rest of the array is not used.
[in] ldt
: rocblas_int. ldt >= k.The leading dimension of T.
-
rocblas_status
rocsolver_clarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *tau, rocblas_float_complex *T, const rocblas_int ldt)¶
-
rocblas_status
rocsolver_zlarft
(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *tau, rocblas_double_complex *T, const rocblas_int ldt)¶
rocsolver_<type>larf()¶
-
rocblas_status
rocsolver_dlarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, double *x, const rocblas_int incx, const double *alpha, double *A, const rocblas_int lda)¶
-
rocblas_status
rocsolver_slarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, float *x, const rocblas_int incx, const float *alpha, float *A, const rocblas_int lda)¶ LARF applies a Householder reflector H to a general matrix A.
The Householder reflector H, of order m (or n), is to be applied to a m-by-n matrix A from the left (or the right). H is given by
H = I - alpha * x * x'
where alpha is a scalar and x a Householder vector. H is never actually computed.
- Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.If side = rocblas_side_left, then compute H*A If side = rocblas_side_right, then compute A*H
[in] m
: rocblas_int. m >= 0.Number of rows of A.
[in] n
: rocblas_int. n >= 0.Number of columns of A.
[in] x
: pointer to type. Array on the GPU of size at least (1 + (m-1)*abs(incx)) if left side, or at least (1 + (n-1)*abs(incx)) if right side.The Householder vector x.
[in] incx
: rocblas_int. incx != 0.Increment between to consecutive elements of x. If incx < 0, the elements of x are used in reverse order.
[in] alpha
: pointer to type. A scalar on the GPU.If alpha = 0, then H = I (A will remain the same, x is never used)
[inout] A
: pointer to type. Array on the GPU of size lda*n.On input, the matrix A. On output it is overwritten with H*A (or A*H).
[in] lda
: rocblas_int. lda >= m.Leading dimension of A.
-
rocblas_status
rocsolver_clarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx, const rocblas_float_complex *alpha, rocblas_float_complex *A, const rocblas_int lda)¶
-
rocblas_status
rocsolver_zlarf
(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx, const rocblas_double_complex *alpha, rocblas_double_complex *A, const rocblas_int lda)¶
rocsolver_<type>larfb()¶
-
rocblas_status
rocsolver_dlarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *T, const rocblas_int ldt, double *A, const rocblas_int lda)¶
-
rocblas_status
rocsolver_slarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *T, const rocblas_int ldt, float *A, const rocblas_int lda)¶ LARFB applies a block reflector H to a general m-by-n matrix A.
The block reflector H is applied in one of the following forms, depending on the values of side and trans:
H * A (No transpose from the left) H' * A (Transpose or conjugate transpose from the left) A * H (No transpose from the right), and A * H' (Transpose or conjugate transpose from the right)
The block reflector H is defined as the product of k Householder matrices as
H = H(1) * H(2) * ... * H(k) (forward direction), or H = H(k) * ... * H(2) * H(1) (backward direction)
depending on the value of direct. H is never stored. It is calculated as
H = I - V * T * V'
where the i-th column of matrix V contains the Householder vector associated with H(i), if storev is column-wise; or
H = I - V' * T * V
where the i-th row of matrix V contains the Householder vector associated with H(i), if storev is row-wise. T is the associated triangular factor as computed by LARFT.
- Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply H.
[in] trans
: rocblas_operation.Specifies whether the block reflector or its transpose/conjugate transpose is to be applied.
[in] direct
: rocblas_direct.Specifies the direction in which the Householder matrices were to be applied to generate H.
[in] storev
: rocblas_storev.Specifies how the Householder vectors are stored in matrix V.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix A.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix A.
[in] k
: rocsovler_int. k >= 1.The number of Householder matrices.
[in] V
: pointer to type. Array on the GPU of size ldv*k if column-wise, ldv*n if row-wise and applying from the right, or ldv*m if row-wise and applying from the left.The matrix of Householder vectors.
[in] ldv
: rocblas_int. ldv >= k if row-wise, ldv >= m if column-wise and applying from the left, or ldv >= n if column-wise and applying from the right.Leading dimension of V.
[in] T
: pointer to type. Array on the GPU of dimension ldt*k.The triangular factor of the block reflector.
[in] ldt
: rocblas_int. ldt >= k.The leading dimension of T.
[inout] A
: pointer to type. Array on the GPU of size lda*n.On input, the matrix A. On output it is overwritten with H*A, A*H, H’*A, or A*H’.
[in] lda
: rocblas_int. lda >= m.Leading dimension of A.
-
rocblas_status
rocsolver_clarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *T, const rocblas_int ldt, rocblas_float_complex *A, const rocblas_int lda)¶
-
rocblas_status
rocsolver_zlarfb
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *T, const rocblas_int ldt, rocblas_double_complex *A, const rocblas_int lda)¶
Orthonormal matrices¶
rocsolver_<type>org2r()¶
-
rocblas_status
rocsolver_dorg2r
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sorg2r
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORG2R generates a m-by-n Matrix Q with orthonormal columns.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQRF.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of colums of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the i-th column has Householder vector v(i), for i = 1,2,…,k as returned in the first k columns of matrix A of GEQRF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
rocsolver_<type>orgqr()¶
-
rocblas_status
rocsolver_dorgqr
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sorgqr
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGQR generates a m-by-n Matrix Q with orthonormal columns.
(This is the blocked version of the algorithm).
The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GEQRF.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q.
[in] n
: rocblas_int. 0 <= n <= m.The number of colums of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= n.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the i-th column has Householder vector v(i), for i = 1,2,…,k as returned in the first k columns of matrix A of GEQRF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
rocsolver_<type>orgl2()¶
-
rocblas_status
rocsolver_dorgl2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sorgl2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGL2 generates a m-by-n Matrix Q with orthonormal rows.
(This is the unblocked version of the algorithm).
The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k-1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GELQF.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. 0 <= m <= n.The number of rows of the matrix Q.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= m.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the i-th row has Householder vector v(i), for i = 1,2,…,k as returned in the first k rows of matrix A of GELQF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
rocsolver_<type>orglq()¶
-
rocblas_status
rocsolver_dorglq
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sorglq
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGLQ generates a m-by-n Matrix Q with orthonormal rows.
(This is the blocked version of the algorithm).
The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k-1) * ... * H(1)
Householder matrices H(i) are never stored, they are computed from its corresponding Householder vector v(i) and scalar ipiv_i as returned by GELQF.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. 0 <= m <= n.The number of rows of the matrix Q.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix Q.
[in] k
: rocblas_int. 0 <= k <= m.The number of Householder reflectors.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the i-th row has Householder vector v(i), for i = 1,2,…,k as returned in the first k rows of matrix A of GELQF. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
rocsolver_<type>orgbr()¶
-
rocblas_status
rocsolver_dorgbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sorgbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)¶ ORGBR generates a m-by-n Matrix Q with orthonormal rows or columns.
If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m
Q = H(1) * H(2) * ... * H(k)
If m < k, Q is defined as the product of Householder reflectors of order m
Q = H(1) * H(2) * ... * H(m-1)
On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n
Q = H(k) * H(k-1) * ... * H(1)
If n <= k, Q is defined as the product of Householder reflectors of order n
Q = H(n-1) * H(n-2) * ... * H(1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by GEBRD in its arguments A and tauq or taup.
- Parameters
[in] handle
: rocblas_handle.[in] storev
: rocblas_storev.Specifies whether to work column-wise or row-wise.
[in] m
: rocblas_int. m >= 0.The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix Q. If column-wise, then min(m,k) <= n <= m.
[in] k
: rocblas_int. k >= 0.The number of columns (if storev is colum-wise) or rows (if row-wise) of the original matrix reduced by GEBRD.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the i-th column (or row) has the Householder vector v(i) as returned by GEBRD. On exit, the computed matrix Q.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise.The scalar factors of the Householder matrices H(i) as returned by GEBRD.
rocsolver_<type>orm2r()¶
-
rocblas_status
rocsolver_dorm2r
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶
-
rocblas_status
rocsolver_sorm2r
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORM2R applies a matrix Q with orthonormal columns to a general m-by-n matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(1) * H(2) * ... * H(k)
or order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.
- Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocsovler_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The i-th column has the Householder vector v(i) associated with H(i) as returned by GEQRF in the first k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, or lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] lda
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormqr()¶
-
rocblas_status
rocsolver_dormqr
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶
-
rocblas_status
rocsolver_sormqr
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMQR applies a matrix Q with orthonormal columns to a general m-by-n matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(1) * H(2) * ... * H(k)
or order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.
- Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocsovler_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*k.The i-th column has the Householder vector v(i) associated with H(i) as returned by GEQRF in the first k columns of its argument A.
[in] lda
: rocblas_int. lda >= m if side is left, or lda >= n if side is right.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GEQRF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] lda
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>orml2()¶
-
rocblas_status
rocsolver_dorml2
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶
-
rocblas_status
rocsolver_sorml2
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORML2 applies a matrix Q with orthonormal rows to a general m-by-n matrix C.
(This is the unblocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k-1) * ... * H(1)
or order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.
- Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocsovler_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.The i-th row has the Householder vector v(i) associated with H(i) as returned by GELQF in the first k rows of its argument A.
[in] lda
: rocblas_int. lda >= k.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] lda
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormlq()¶
-
rocblas_status
rocsolver_dormlq
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶
-
rocblas_status
rocsolver_sormlq
(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMLQ applies a matrix Q with orthonormal rows to a general m-by-n matrix C.
(This is the blocked version of the algorithm).
The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
Q is an orthogonal matrix defined as the product of k Householder reflectors as
Q = H(k) * H(k-1) * ... * H(1)
or order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.
- Parameters
[in] handle
: rocblas_handle.[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocsovler_int. k >= 0; k <= m if side is left, k <= n if side is right.The number of Householder reflectors that form Q.
[in] A
: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.The i-th row has the Householder vector v(i) associated with H(i) as returned by GELQF in the first k rows of its argument A.
[in] lda
: rocblas_int. lda >= k.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least k.The scalar factors of the Householder matrices H(i) as returned by GELQF.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] lda
: rocblas_int. ldc >= m.Leading dimension of C.
rocsolver_<type>ormbr()¶
-
rocblas_status
rocsolver_dormbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)¶
-
rocblas_status
rocsolver_sormbr
(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)¶ ORMBR applies a matrix Q with orthonormal rows or columns to a general m-by-n matrix C.
If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:
Q * C (No transpose from the left) Q' * C (Transpose from the left) C * Q (No transpose from the right), and C * Q' (Transpose from the right)
The order nq of orthogonal matrix Q is nq = m if applying from the left, or nq = n if applying from the right.
When storev is column-wise, if nq >= k, then Q is defined as the product of k Householder reflectors of order nq
Q = H(1) * H(2) * ... * H(k),
and if nq < k, then Q is defined as the product
Q = H(1) * H(2) * ... * H(nq-1).
When storev is row-wise, if nq > k, then Q is defined as the product of k Householder reflectors of order nq
Q = H(1) * H(2) * ... * H(k),
and if n <= k, Q is defined as the product
Q = H(1) * H(2) * ... * H(nq-1)
The Householder matrices H(i) are never stored, they are computed from its corresponding Householder vectors v(i) and scalars ipiv_i as returned by GEBRD in its arguments A and tauq or taup.
- Parameters
[in] handle
: rocblas_handle.[in] storev
: rocblas_storev.Specifies whether to work column-wise or row-wise.
[in] side
: rocblas_side.Specifies from which side to apply Q.
[in] trans
: rocblas_operation.Specifies whether the matrix Q or its transpose is to be applied.
[in] m
: rocblas_int. m >= 0.Number of rows of matrix C.
[in] n
: rocblas_int. n >= 0.Number of columns of matrix C.
[in] k
: rocsovler_int. k >= 0.The number of columns (if storev is colum-wise) or rows (if row-wise) of the original matrix reduced by GEBRD.
[in] A
: pointer to type. Array on the GPU of size lda*min(nq,k) if column-wise, or lda*nq if row-wise.The i-th column (or row) has the Householder vector v(i) associated with H(i) as returned by GEBRD.
[in] lda
: rocblas_int. lda >= nq if column-wise, or lda >= min(nq,k) if row-wise.Leading dimension of A.
[in] ipiv
: pointer to type. Array on the GPU of dimension at least min(nq,k).The scalar factors of the Householder matrices H(i) as returned by GEBRD.
[inout] C
: pointer to type. Array on the GPU of size ldc*n.On input, the matrix C. On output it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.
[in] lda
: rocblas_int. ldc >= m.Leading dimension of C.
Lapack Functions¶
Lapack routines solve complex Numerical Linear Algebra problems.
Special Matrix Factorizations¶
rocsolver_<type>potf2()¶
-
rocblas_status
rocsolver_dpotf2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_spotf2
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ POTF2 computes the Cholesky factorization of a real symmetric positive definite matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form:
A = U' * U, or A = L * L'
depending on the value of uplo. U is an upper triangular matrix and L is lower triangular.
- Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The matrix dimensions.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, succesful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
rocsolver_<type>potf2_batched()¶
-
rocblas_status
rocsolver_dpotf2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_spotf2_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ POTF2_BATCHED computes the Cholesky factorization of a batch of real symmetric positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
- Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The i-th factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>potf2_strided_batched()¶
-
rocblas_status
rocsolver_dpotf2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_spotf2_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ POTF2_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
- Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The i-th factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>potrf()¶
-
rocblas_status
rocsolver_dpotrf
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_spotrf
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ POTRF computes the Cholesky factorization of a real symmetric positive definite matrix A.
(This is the blocked version of the algorithm).
The factorization has the form:
A = U' * U, or A = L * L'
depending on the value of uplo. U is an upper triangular matrix and L is lower triangular.
- Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The matrix dimensions.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, succesful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
rocsolver_<type>potrf_batched()¶
-
rocblas_status
rocsolver_dpotrf_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_spotrf_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ POTRF_BATCHED computes the Cholesky factorization of a batch of real symmetric positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
- Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The i-th factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>potrf_strided_batched()¶
-
rocblas_status
rocsolver_dpotrf_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_spotrf_strided_batched
(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ POTRF_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_i in the batch has the form:
A_i = U_i' * U_i, or A_i = L_i * L_i'
depending on the value of uplo. U_i is an upper triangular matrix and L_i is lower triangular.
- Parameters
[in] handle
: rocblas_handle.[in] uplo
: rocblas_fill.Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
[in] n
: rocblas_int. n >= 0.The dimension of matrix A_i.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the matrices A_i to be factored. On exit, the upper or lower triangular factors.
[in] lda
: rocblas_int. lda >= n.specifies the leading dimension of A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful factorization of matrix A_i. If info_i = j > 0, the leading minor of order j of A_i is not positive definite. The i-th factorization stopped at this point.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
General Matrix Factorizations¶
rocsolver_<type>getf2()¶
-
rocblas_status
rocsolver_zgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetf2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶ GETF2 computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.
(This is the right-looking Level 2 BLAS version of the algorithm).
The factorization has the form
A = P * L * U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to rocblas_int. Array on the GPU of dimension min(m,n).The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, succesful exit. If info = i > 0, U is singular. U(i,i) is the first zero pivot.
rocsolver_<type>getf2_batched()¶
-
rocblas_status
rocsolver_zgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetf2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the right-looking Level 2 BLAS version of the algorithm).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all matrices A_i in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivot indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1-based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i[j]. Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getf2_strided_batched()¶
-
rocblas_status
rocsolver_zgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetf2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the right-looking Level 2 BLAS version of the algorithm).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, in contains the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivots indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1-based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i[j]. Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf()¶
-
rocblas_status
rocsolver_zgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)¶ GETRF computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.
(This is the right-looking Level 3 BLAS version of the algorithm).
The factorization has the form
A = P * L * U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to rocblas_int. Array on the GPU of dimension min(m,n).The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, succesful exit. If info = i > 0, U is singular. U(i,i) is the first zero pivot.
rocsolver_<type>getrf_batched()¶
-
rocblas_status
rocsolver_zgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the right-looking Level 3 BLAS version of the algorithm).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all matrices A_i in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivot indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1-based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i(j). Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf_strided_batched()¶
-
rocblas_status
rocsolver_zgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the right-looking Level 3 BLAS version of the algorithm).
The factorization of matrix A_i in the batch has the form
A_i = P_i * L_i * U_i
where P_i is a permutation matrix, L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_i in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all matrices A_i in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, in contains the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_i.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_i and the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors of pivots indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is min(m,n). Elements of ipiv_i are 1-based indices. For each instance A_i in the batch and for 1 <= j <= min(m,n), the row j of the matrix A_i was interchanged with row ipiv_i(j). Matrix P_i of the factorization can be derived from ipiv_i.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info_i = 0, succesful exit for factorization of A_i. If info_i = j > 0, U_i is singular. U_i(j,j) is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>geqr2()¶
-
rocblas_status
rocsolver_dgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GEQR2 computes a QR factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
A = Q * [ R ] [ 0 ]
where R is upper triangular (upper trapezoidal if m < n), and Q is a m-by-m orthogonal matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(k), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I - ipiv[i-1] * v(i) * v(i)'
where the first i-1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the m - i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
-
rocblas_status
rocsolver_cgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶
-
rocblas_status
rocsolver_zgeqr2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶
rocsolver_<type>geqr2_batched()¶
-
rocblas_status
rocsolver_dgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQR2_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a m-by-m orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i) * v_j(i)'
where the first i-1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m - i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgeqr2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>geqr2_strided_batched()¶
-
rocblas_status
rocsolver_dgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQR2_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a m-by-m orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i) * v_j(i)'
where the first i-1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m - i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgeqr2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>geqrf()¶
-
rocblas_status
rocsolver_dgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GEQRF computes a QR factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
A = Q * [ R ] [ 0 ]
where R is upper triangular (upper trapezoidal if m < n), and Q is a m-by-m orthogonal matrix represented as the product of Householder matrices
Q = H(1) * H(2) * ... * H(k), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I - ipiv[i-1] * v(i) * v(i)'
where the first i-1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the m - i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
-
rocblas_status
rocsolver_cgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶
-
rocblas_status
rocsolver_zgeqrf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶
rocsolver_<type>geqrf_batched()¶
-
rocblas_status
rocsolver_dgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQRF_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a m-by-m orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i) * v_j(i)'
where the first i-1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m - i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgeqrf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>geqrf_strided_batched()¶
-
rocblas_status
rocsolver_dgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GEQRF_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = Q_j * [ R_j ] [ 0 ]
where R_j is upper triangular (upper trapezoidal if m < n), and Q_j is a m-by-m orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(1) * H_j(2) * ... * H_j(k), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i) * v_j(i)'
where the first i-1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the m - i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgeqrf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>gelq2()¶
-
rocblas_status
rocsolver_dgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GELQ2 computes a LQ factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
A = [ L 0 ] * Q
where L is lower triangular (lower trapezoidal if m > n), and Q is a n-by-n orthogonal matrix represented as the product of Householder matrices
Q = H(k) * H(k-1) * ... * H(1), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I - ipiv[i-1] * v(i)' * v(i)
where the first i-1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix to be factored. On exit, the elements on and delow the diagonal contain the factor L; the elements above the diagonal are the n - i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
-
rocblas_status
rocsolver_cgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶
-
rocblas_status
rocsolver_zgelq2
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶
rocsolver_<type>gelq2_batched()¶
-
rocblas_status
rocsolver_dgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQ2_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a n-by-n orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k-1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i)' * v_j(i)
where the first i-1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n - i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgelq2_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>gelq2_strided_batched()¶
-
rocblas_status
rocsolver_dgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQ2_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a n-by-n orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k-1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i)' * v_j(i)
where the first i-1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n - i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgelq2_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>gelqf()¶
-
rocblas_status
rocsolver_dgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)¶
-
rocblas_status
rocsolver_sgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)¶ GELQF computes a LQ factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
A = [ L 0 ] * Q
where L is lower triangular (lower trapezoidal if m > n), and Q is a n-by-n orthogonal matrix represented as the product of Householder matrices
Q = H(k) * H(k-1) * ... * H(1), with k = min(m,n)
Each Householder matrix H(i), for i = 1,2,…,k, is given by
H(i) = I - ipiv[i-1] * v(i)' * v(i)
where the first i-1 elements of the Householder vector v(i) are zero, and v(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of colums of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix to be factored. On exit, the elements on and delow the diagonal contain the factor L; the elements above the diagonal are the n - i elements of vector v(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] ipiv
: pointer to type. Array on the GPU of dimension min(m,n).The scalar factors of the Householder matrices H(i).
-
rocblas_status
rocsolver_cgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)¶
-
rocblas_status
rocsolver_zgelqf
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)¶
rocsolver_<type>gelqf_batched()¶
-
rocblas_status
rocsolver_dgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQF_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a n-by-n orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k-1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i)' * v_j(i)
where the first i-1 elements of Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n - i elements of vector v_j(i) for i=1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgelqf_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
rocsolver_<type>gelqf_strided_batched()¶
-
rocblas_status
rocsolver_dgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶ GELQF_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix A_j in the batch has the form
A_j = [ L_j 0 ] * Q_j
where L_j is lower triangular (lower trapezoidal if m > n), and Q_j is a n-by-n orthogonal matrix represented as the product of Householder matrices
Q_j = H_j(k) * H_j(k-1) * ... * H_j(1), with k = min(m,n)
Each Householder matrices H_j(i), for j = 1,2,…,batch_count, and i = 1,2,…,k, is given by
H_j(i) = I - ipiv_j[i-1] * v_j(i)' * v_j(i)
where the first i-1 elements of vector Householder vector v_j(i) are zero, and v_j(i)[i] = 1.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all the matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of colums of all the matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the n - i elements of vector v_j(i) for i = 1,2,…,min(m,n).
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out] ipiv
: pointer to type. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of scalar factors of the Householder matrices H_j(i).
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
-
rocblas_status
rocsolver_cgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_zgelqf_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)¶
General systems solvers¶
rocsolver_<type>getrs()¶
-
rocblas_status
rocsolver_zgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int *ipiv, rocblas_double_complex *B, const rocblas_int ldb)¶
-
rocblas_status
rocsolver_cgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int *ipiv, rocblas_float_complex *B, const rocblas_int ldb)¶
-
rocblas_status
rocsolver_dgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_int *ipiv, double *B, const rocblas_int ldb)¶
-
rocblas_status
rocsolver_sgetrs
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_int *ipiv, float *B, const rocblas_int ldb)¶ GETRS solves a system of n linear equations on n variables using the LU factorization computed by GETRF.
It solves one of the following systems:
A * X = B (no transpose), A' * X = B (transpose), or A* * X = B (conjugate transpose)
depending on the value of trans.
- Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations.
[in] n
: rocblas_int. n >= 0.The order of the system, i.e. the number of columns and rows of A.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of right hand sides, i.e., the number of columns of the matrix B.
[in] A
: pointer to type. Array on the GPU of dimension lda*n.The factors L and U of the factorization A = P*L*U returned by GETRF.
[in] lda
: rocblas_int. lda >= n.The leading dimension of A.
[in] ipiv
: pointer to rocblas_int. Array on the GPU of dimension n.The pivot indices returned by GETRF.
[inout] B
: pointer to type. Array on the GPU of dimension ldb*nrhs.On entry, the right hand side matrix B. On exit, the solution matrix X.
[in] ldb
: rocblas_int. ldb >= n.The leading dimension of B.
rocsolver_<type>getrs_batched()¶
-
rocblas_status
rocsolver_zgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetrs_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)¶ GETRS_BATCHED solves a batch of systems of n linear equations on n variables using the LU factorization computed by GETRF_BATCHED.
For each instance j in the batch, it solves one of the following systems:
A_j * X_j = B_j (no transpose), A_j' * X_j = B_j (transpose), or A_j* * X_j = B_j (conjugate transpose)
depending on the value of trans.
- Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations of each instance in the batch.
[in] n
: rocblas_int. n >= 0.The order of the system, i.e. the number of columns and rows of all A_j matrices.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of right hand sides, i.e., the number of columns of all the matrices B_j.
[in] A
: Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_BATCHED.
[in] lda
: rocblas_int. lda >= n.The leading dimension of matrices A_j.
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of pivot indices returned by GETRF_BATCHED.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[inout] B
: Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs.On entry, the right hand side matrices B_j. On exit, the solution matrix X_j of each system in the batch.
[in] ldb
: rocblas_int. ldb >= n.The leading dimension of matrices B_j.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of instances (systems) in the batch.
rocsolver_<type>getrs_strided_batched()¶
-
rocblas_status
rocsolver_zgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetrs_strided_batched
(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)¶ GETRS_STRIDED_BATCHED solves a batch of systems of n linear equations on n variables using the LU factorization computed by GETRF_STRIDED_BATCHED.
For each instance j in the batch, it solves one of the following systems:
A_j * X_j = B_j (no transpose), A_j' * X_j = B_j (transpose), or A_j* * X_j = B_j (conjugate transpose)
depending on the value of trans.
- Parameters
[in] handle
: rocblas_handle.[in] trans
: rocblas_operation.Specifies the form of the system of equations of each instance in the batch.
[in] n
: rocblas_int. n >= 0.The order of the system, i.e. the number of columns and rows of all A_j matrices.
[in] nrhs
: rocblas_int. nrhs >= 0.The number of right hand sides, i.e., the number of columns of all the matrices B_j.
[in] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_STRIDED_BATCHED.
[in] lda
: rocblas_int. lda >= n.The leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j and the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).Contains the vectors ipiv_j of pivot indices returned by GETRF_STRIDED_BATCHED.
[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
[inout] B
: pointer to type. Array on the GPU (size depends on the value of strideB).On entry, the right hand side matrices B_j. On exit, the solution matrix X_j of each system in the batch.
[in] ldb
: rocblas_int. ldb >= n.The leading dimension of matrices B_j.
[in] strideB
: rocblas_stride.Stride from the start of one matrix B_j and the next one B_(j+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of instances (systems) in the batch.
Auxiliaries¶
rocSOLVER auxiliary functions are aliases of rocBLAS auxiliary functions. See rocBLAS auxiliary functions here.
rocSOLVER handle auxiliaries¶
rocsolver_create_handle()¶
Warning
doxygenfunction: Cannot find function “rocsolver_create_handle” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_destroy_handle()¶
Warning
doxygenfunction: Cannot find function “rocsolver_destroy_handle” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_add_stream()¶
Warning
doxygenfunction: Cannot find function “rocsolver_add_stream” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_set_stream()¶
Warning
doxygenfunction: Cannot find function “rocsolver_set_stream” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_get_stream()¶
Warning
doxygenfunction: Cannot find function “rocsolver_get_stream” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
Other auxiliaries¶
rocsolver_set_vector()¶
Warning
doxygenfunction: Cannot find function “rocsolver_set_vector” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_get_vector()¶
Warning
doxygenfunction: Cannot find function “rocsolver_get_vector” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_set_matrix()¶
Warning
doxygenfunction: Cannot find function “rocsolver_set_matrix” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml
rocsolver_get_matrix()¶
Warning
doxygenfunction: Cannot find function “rocsolver_get_matrix” in doxygen xml output for project “rocSOLVER” from directory: ../docBin/xml