3.3. LAPACK Functions
LAPACK routines solve complex Numerical Linear Algebra problems. These functions are organized in the following categories:
Triangular factorizations. Based on Gaussian elimination.
Orthogonal factorizations. Based on Householder reflections.
Problem and matrix reductions. Transformation of matrices and problems into equivalent forms.
Linear-systems solvers. Based on triangular factorizations.
Least-squares solvers. Based on orthogonal factorizations.
Symmetric eigensolvers. Eigenproblems for symmetric matrices.
Singular value decomposition. Singular values and related problems for general matrices.
Note
Throughout the APIs’ descriptions, we use the following notations:
x[i] stands for the ith element of vector x, while A[i,j] represents the element in the ith row and jth column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
x_i = \(x_i\); we sometimes use both notations: \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.
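To make the 1-based notation concrete, the following minimal sketch maps A[i,j] to an offset in a flat C array. It assumes the column-major storage with leading dimension lda that LAPACK-style libraries conventionally use; the helper names idx_colmajor and get_elem are hypothetical and not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 0-based offset of the 1-based element A[i,j] in a
   column-major array with leading dimension lda (assumed layout). */
static size_t idx_colmajor(int i, int j, int lda)
{
    assert(i >= 1 && j >= 1 && i <= lda);
    return (size_t)(i - 1) + (size_t)(j - 1) * (size_t)lda;
}

/* Read A[i,j], using the 1-based indices of the parameter descriptions. */
static double get_elem(const double *A, int lda, int i, int j)
{
    return A[idx_colmajor(i, j, lda)];
}
```

With lda = 3, element A[2,2] of a 3-by-2 matrix sits at offset 4 of the flat array.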
3.3.1. Triangular factorizations
3.3.1.1. rocsolver_<type>potf2()

rocblas_status rocsolver_zpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)

rocblas_status rocsolver_cpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)

rocblas_status rocsolver_dpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)

rocblas_status rocsolver_spotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)
POTF2 computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form:
\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]
U is an upper triangular matrix and L is lower triangular.
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
3.3.1.2. rocsolver_<type>potf2_batched()

rocblas_status rocsolver_zpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_spotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
POTF2_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]
\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of matrix A_j.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The jth factorization stopped at this point.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
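The array-of-pointers layout and the per-matrix info values can be sketched on the host as follows. The packing of all matrices into one contiguous buffer, the 0-based batch index j, and the helper names are assumptions made for illustration; in a real program both the matrices and the pointer array must be placed in GPU memory, which this sketch does not do.

```c
#include <assert.h>
#include <stddef.h>

/* Fill ptrs[j] so that each entry addresses its own lda*n block inside
   one contiguous buffer (hypothetical packing; any placement works as
   long as every pointer addresses lda*n elements). */
static void make_batch_pointers(double *buf, int lda, int n,
                                int batch_count, double **ptrs)
{
    assert(n >= 0 && lda >= n && batch_count >= 0);
    for (int j = 0; j < batch_count; ++j)
        ptrs[j] = buf + (size_t)j * (size_t)lda * (size_t)n;
}

/* Scan a host copy of info: return the index of the first matrix whose
   factorization stopped early (info[j] > 0), or -1 if all succeeded. */
static int first_failed(const int *info, int batch_count)
{
    for (int j = 0; j < batch_count; ++j)
        if (info[j] > 0)
            return j;
    return -1;
}
```

Because each info[j] is reported independently, one failing matrix does not say anything about the other matrices in the batch.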
3.3.1.3. rocsolver_<type>potf2_strided_batched()

rocblas_status rocsolver_zpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_spotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
POTF2_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]
\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of matrix A_j.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The jth factorization stopped at this point.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
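In the strided layout, matrix A_j starts strideA elements after A_(j-1). The sketch below shows the resulting address arithmetic on the host. It assumes column-major storage and a 0-based batch index j, and it uses ptrdiff_t as a stand-in for rocblas_stride; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Start of matrix A_j in a strided batch (j is a 0-based C index here,
   an assumption for illustration). */
static double *strided_matrix(double *A, ptrdiff_t strideA, int j)
{
    assert(j >= 0);
    return A + (ptrdiff_t)j * strideA;
}

/* Address of the 1-based element A_j[r,c], column-major with leading
   dimension lda. */
static double *strided_elem(double *A, int lda, ptrdiff_t strideA,
                            int j, int r, int c)
{
    return strided_matrix(A, strideA, j) + (r - 1) + (ptrdiff_t)(c - 1) * lda;
}
```

With lda = 2, n = 2 and strideA = 6, each matrix occupies lda*n = 4 elements and is followed by 2 elements of padding before the next matrix begins.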
3.3.1.4. rocsolver_<type>potrf()

rocblas_status rocsolver_zpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)

rocblas_status rocsolver_cpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)

rocblas_status rocsolver_dpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)

rocblas_status rocsolver_spotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)
POTRF computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.
(This is the blocked version of the algorithm).
The factorization has the form:
\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]
U is an upper triangular matrix and L is lower triangular.
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
3.3.1.5. rocsolver_<type>potrf_batched()

rocblas_status rocsolver_zpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_spotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
POTRF_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]
\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of matrix A_j.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The jth factorization stopped at this point.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.1.6. rocsolver_<type>potrf_strided_batched()

rocblas_status rocsolver_zpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_spotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
POTRF_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]
\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of matrix A_j.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The jth factorization stopped at this point.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.1.7. rocsolver_<type>getf2()

rocblas_status rocsolver_zgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

rocblas_status rocsolver_cgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

rocblas_status rocsolver_dgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

rocblas_status rocsolver_sgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
GETF2 computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = PLU \]
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to rocblas_int. Array on the GPU of dimension min(m,n).
The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
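The packed output, the 1-based pivot indices and the meaning of info can be illustrated with a CPU reference. The following is a sketch of a textbook unblocked LU with partial pivoting for real matrices in column-major storage; it is not the rocSOLVER implementation, and the name ref_getf2 is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* CPU reference sketch (not rocSOLVER code): unblocked LU with partial
   pivoting. Storage is 0-based and column-major; the values written to
   ipiv and the returned info are 1-based, as in the descriptions above. */
static int ref_getf2(int m, int n, double *A, int lda, int *ipiv)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    int info = 0;
    int mn = (m < n) ? m : n;
    for (int k = 0; k < mn; ++k) {
        /* pivot search in column k, rows k..m-1 */
        int p = k;
        for (int i = k + 1; i < m; ++i)
            if (fabs(A[i + k * lda]) > fabs(A[p + k * lda]))
                p = i;
        ipiv[k] = p + 1;
        if (A[p + k * lda] == 0.0) {
            if (info == 0)
                info = k + 1;           /* first zero pivot U[k+1,k+1] */
            continue;                   /* column is zero, nothing to eliminate */
        }
        if (p != k)                     /* interchange full rows k and p */
            for (int j = 0; j < n; ++j) {
                double t = A[k + j * lda];
                A[k + j * lda] = A[p + j * lda];
                A[p + j * lda] = t;
            }
        /* multipliers: column k of L below the (implicit) unit diagonal */
        for (int i = k + 1; i < m; ++i)
            A[i + k * lda] /= A[k + k * lda];
        /* rank-1 update of the trailing submatrix */
        for (int j = k + 1; j < n; ++j)
            for (int i = k + 1; i < m; ++i)
                A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
    }
    return info;
}
```

Note that a nonzero info does not abort the loop: the remaining columns are still processed, but U has a zero on its diagonal and cannot be used to solve a linear system.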
3.3.1.8. rocsolver_<type>getf2_batched()

rocblas_status rocsolver_zgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_sgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)
GETF2_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = P_jL_jU_j \]
where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
Contains the vectors of pivot indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.1.9. rocsolver_<type>getf2_strided_batched()

rocblas_status rocsolver_zgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_sgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)
GETF2_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = P_jL_jU_j \]
where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
Contains the vectors of pivot indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
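Since the sizes of A and ipiv depend on strideA and strideP, it can help to compute them explicitly before allocating. The sketch below derives the smallest element counts for the normal use case of non-negative strides; the formula is our own reading of the layout rather than something stated by the library, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest element count that holds batch_count items of item_len
   elements whose starts are stride elements apart (non-negative stride
   assumed): the last item starts at stride*(batch_count-1). */
static size_t strided_extent(size_t item_len, size_t stride, int batch_count)
{
    assert(batch_count >= 0);
    if (batch_count == 0 || item_len == 0)
        return 0;
    return stride * (size_t)(batch_count - 1) + item_len;
}

/* Element count for A: each matrix occupies lda*n elements. */
static size_t lu_matrix_elems(int n, int lda, size_t strideA, int batch_count)
{
    return strided_extent((size_t)lda * (size_t)n, strideA, batch_count);
}

/* Element count for ipiv: each pivot vector has min(m,n) entries. */
static size_t lu_pivot_elems(int m, int n, size_t strideP, int batch_count)
{
    size_t mn = (size_t)((m < n) ? m : n);
    return strided_extent(mn, strideP, batch_count);
}
```

For m = 3, n = 2, lda = 4, strideA = 10, strideP = 2 and batch_count = 3, this gives 28 elements for A and 6 integers for ipiv.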
3.3.1.10. rocsolver_<type>getrf()

rocblas_status rocsolver_zgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

rocblas_status rocsolver_cgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

rocblas_status rocsolver_dgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

rocblas_status rocsolver_sgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
GETRF computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = PLU \]
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to rocblas_int. Array on the GPU of dimension min(m,n).
The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
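One way to derive the permutation from ipiv is to replay the row interchanges on an identity ordering. The sketch below assumes that the interchanges are applied in the order i = 1, 2, ..., min(m,n), as in LAPACK; the function name and the 1-based output vector perm are hypothetical.

```c
#include <assert.h>

/* Build perm (length m, 1-based values) from the pivot vector ipiv of
   length npiv = min(m,n). After the call, row k+1 of the product L*U
   equals row perm[k] of the original matrix A. */
static void pivots_to_permutation(int m, int npiv, const int *ipiv, int *perm)
{
    assert(m >= 0 && npiv >= 0 && npiv <= m);
    for (int i = 0; i < m; ++i)
        perm[i] = i + 1;                /* identity ordering */
    for (int k = 0; k < npiv; ++k) {
        int p = ipiv[k] - 1;            /* 1-based -> 0-based */
        assert(p >= k && p < m);
        int t = perm[k];                /* replay interchange k <-> p */
        perm[k] = perm[p];
        perm[p] = t;
    }
}
```

For example, ipiv = (3, 3, 3) with m = 3 yields perm = (3, 1, 2): the first row of L*U is the third row of A, the second is the first, and the third is the second.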
3.3.1.11. rocsolver_<type>getrf_batched()

rocblas_status rocsolver_zgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_sgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)
GETRF_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = P_jL_jU_j \]
where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
Contains the vectors of pivot indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
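Because L_j and U_j share the storage of A_j on exit and the unit diagonal of L_j is implicit, a caller who needs the factors as separate matrices has to unpack them. The following host-side sketch does this for a single real matrix copied back from the batch, assuming column-major storage; the function name and the leading dimensions ldl and ldu are hypothetical.

```c
#include <assert.h>

/* Split the packed m-by-n output into L (m-by-mn, unit diagonal) and
   U (mn-by-n), where mn = min(m,n). All arrays are column-major; ldl >= m
   and ldu >= mn are leading dimensions chosen by the caller. */
static void unpack_lu(int m, int n, const double *A, int lda,
                      double *L, int ldl, double *U, int ldu)
{
    int mn = (m < n) ? m : n;
    assert(lda >= m && ldl >= m && ldu >= mn);
    for (int j = 0; j < mn; ++j)
        for (int i = 0; i < m; ++i) {
            if (i > j)
                L[i + j * ldl] = A[i + j * lda];   /* stored multipliers */
            else if (i == j)
                L[i + j * ldl] = 1.0;              /* implicit unit diagonal */
            else
                L[i + j * ldl] = 0.0;
        }
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < mn; ++i)
            U[i + j * ldu] = (i <= j) ? A[i + j * lda] : 0.0;
}
```

When m > n the resulting L is lower trapezoidal, and when m < n the resulting U is upper trapezoidal, matching the shapes given in the description above.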
3.3.1.12. rocsolver_<type>getrf_strided_batched()

rocblas_status rocsolver_zgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_cgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_dgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)

rocblas_status rocsolver_sgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)
GETRF_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = P_jL_jU_j \]
where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
Contains the vectors of pivot indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
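The relationship between ipiv_j and P_j, and the role of strideP, can be sketched on the host side as follows. This is only an illustrative sketch in plain C: both helper names are hypothetical (they are not part of rocSOLVER), the pivot array is assumed to have been copied from the GPU to host memory beforehand, and ptrdiff_t stands in for rocblas_stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): start of the pivot vector
 * ipiv_j inside a strided pivot array that already lives in host memory. */
const int *pivots_of_instance(const int *ipiv, ptrdiff_t strideP, int j)
{
    assert(ipiv != NULL && j >= 0);
    return ipiv + (ptrdiff_t)j * strideP;
}

/* Hypothetical helper: builds the row permutation implied by one pivot
 * vector. On return, perm[r] is the 0-based row of the original matrix A_j
 * that ends up in row r after all interchanges, so row r of L_j*U_j equals
 * row perm[r] of A_j. The entries of ipiv_j are 1-based. */
void ipiv_to_row_permutation(const int *ipiv_j, int m, int n, int *perm)
{
    assert(ipiv_j != NULL && perm != NULL && m >= 0 && n >= 0);
    int k = (m < n) ? m : n;
    for (int r = 0; r < m; ++r)
        perm[r] = r;
    /* Interchanges are replayed in the order i = 1, ..., min(m,n). */
    for (int i = 0; i < k; ++i) {
        int p = ipiv_j[i] - 1; /* convert to a 0-based row index */
        int tmp = perm[i];
        perm[i] = perm[p];
        perm[p] = tmp;
    }
}
```

For example, with m = n = 3 and ipiv_j = {3, 3, 3}, the interchanges leave original rows 3, 1, 2 (1-based) in positions 1, 2, 3.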
3.3.1.13. rocsolver_<type>sytf2()#

rocblas_status rocsolver_zsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

rocblas_status rocsolver_csytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

rocblas_status rocsolver_dsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

rocblas_status rocsolver_ssytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
SYTF2 computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D(k)\).
Specifically, \(U\) and \(L\) are computed as
\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D(k)\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D(k)\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D(k)\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D(k)\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D(k)\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
ipiv – [out]
pointer to rocblas_int. Array on the GPU of dimension n.
The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.
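The sign convention of ipiv determines where the 1-by-1 and 2-by-2 blocks of D sit. The following host-side sketch in plain C decodes that structure. The helper name is hypothetical (it is not part of rocSOLVER), and ipiv is assumed to have been copied to host memory. Because negative entries always occur in adjacent equal pairs, a single forward scan should work for both values of uplo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): for each 0-based position
 * k, block_order[k] receives 1 if a 1-by-1 block of D starts there, 2 if a
 * 2-by-2 block starts there, and 0 if position k is the second row/column
 * of a 2-by-2 block. Returns the number of 2-by-2 blocks. */
int bunch_kaufman_block_orders(const int *ipiv, int n, int *block_order)
{
    assert(ipiv != NULL && block_order != NULL && n >= 0);
    int two_by_two = 0;
    int k = 0;
    while (k < n) {
        if (ipiv[k] > 0) {
            /* positive entry: 1-by-1 block */
            block_order[k] = 1;
            k += 1;
        } else {
            /* negative entries come in equal adjacent pairs */
            assert(k + 1 < n && ipiv[k + 1] == ipiv[k]);
            block_order[k] = 2;
            block_order[k + 1] = 0;
            ++two_by_two;
            k += 2;
        }
    }
    return two_by_two;
}
```

For instance, the pivot vector {1, -2, -2, 4} describes a 1-by-1 block, then a 2-by-2 block covering positions 2 and 3 (1-based), then another 1-by-1 block.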
3.3.1.14. rocsolver_<type>sytf2_batched()#

rocblas_status rocsolver_zsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_csytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTF2_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).
Specifically, \(U_j\) and \(L_j\) are computed as
\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP); each vector ipiv_j has dimension n.
Contains the vectors ipiv_j of pivot indices (corresponding to A_j). Elements of ipiv_j are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
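Since info holds one status value per matrix, a caller typically needs to scan it after the batch completes. A minimal host-side sketch in plain C is shown here; the helper name is hypothetical (not part of rocSOLVER), and the info array is assumed to have been copied from the GPU to host memory first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): returns the 0-based index
 * j of the first batch instance whose info[j] is positive (D_j singular),
 * or -1 if every instance reports info[j] == 0. */
int first_singular_instance(const int *info, int batch_count)
{
    assert(batch_count >= 0 && (info != NULL || batch_count == 0));
    for (int j = 0; j < batch_count; ++j) {
        if (info[j] > 0)
            return j;
    }
    return -1;
}
```

When an index j is returned, info[j] is the 1-based position i of the first zero on the diagonal of D_j, as described for the info parameter.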
3.3.1.15. rocsolver_<type>sytf2_strided_batched()#

rocblas_status rocsolver_zsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_csytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTF2_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).
Specifically, \(U_j\) and \(L_j\) are computed as
\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP); each vector ipiv_j has dimension n.
Contains the vectors ipiv_j of pivot indices (corresponding to A_j). Elements of ipiv_j are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
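Because the sizes of A and ipiv depend on strideA and strideP, it can help to compute how many elements a strided batch spans before allocating. The sketch below is plain C with a hypothetical helper name (not part of rocSOLVER). It assumes the normal use case described above, where the stride is at least the length of one block, and it gives only the span of the data itself, not a statement of what the library requires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a rocSOLVER function): number of elements spanned
 * by batch_count blocks of block_len elements each, where consecutive blocks
 * start `stride` elements apart. Assumes stride >= block_len (the normal
 * use case), so blocks do not overlap. */
int64_t strided_batch_extent(int64_t stride, int64_t block_len, int64_t batch_count)
{
    assert(block_len >= 0 && batch_count >= 0 && stride >= block_len);
    if (batch_count == 0)
        return 0;
    /* The last block starts at stride*(batch_count-1) and needs block_len more. */
    return stride * (batch_count - 1) + block_len;
}
```

Under these assumptions, the matrix array would span strided_batch_extent(strideA, lda*n, batch_count) elements and the pivot array strided_batch_extent(strideP, n, batch_count) integers.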
3.3.1.16. rocsolver_<type>sytrf()#

rocblas_status rocsolver_zsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

rocblas_status rocsolver_csytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

rocblas_status rocsolver_dsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

rocblas_status rocsolver_ssytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
SYTRF computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D(k)\).
Specifically, \(U\) and \(L\) are computed as
\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D(k)\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D(k)\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D(k)\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D(k)\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D(k)\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
ipiv – [out]
pointer to rocblas_int. Array on the GPU of dimension n.
The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.
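One way to see how D is stored is to compute det(A) from the factored output: the unit triangular factors have determinant 1 and each permutation appears on both sides, so det(A) equals the product of the determinants of the blocks D(k). The sketch below is plain C for the real double-precision case with uplo lower. The helper name is hypothetical (not part of rocSOLVER), and A and ipiv are assumed to have been copied to host memory in column-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): determinant of the original
 * matrix from the output of a lower (uplo = lower) factorization, real case.
 * A is column-major with leading dimension lda; ipiv is the pivot vector. */
double det_from_ldlt_lower(const double *A, int lda, const int *ipiv, int n)
{
    assert(A != NULL && ipiv != NULL && n >= 0 && lda >= n);
    double det = 1.0;
    int k = 0; /* 0-based position of the current diagonal block */
    while (k < n) {
        double d11 = A[k + (size_t)k * lda];
        if (ipiv[k] > 0) {
            /* 1-by-1 block stored in A[k,k] */
            det *= d11;
            k += 1;
        } else {
            /* 2-by-2 block stored in A[k,k], A[k+1,k], A[k+1,k+1] */
            assert(k + 1 < n);
            double d21 = A[(k + 1) + (size_t)k * lda];
            double d22 = A[(k + 1) + (size_t)(k + 1) * lda];
            det *= d11 * d22 - d21 * d21;
            k += 2;
        }
    }
    return det;
}
```

The multipliers stored below the blocks are never read here, and overflow or underflow of the running product is not handled in this sketch.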
3.3.1.17. rocsolver_<type>sytrf_batched()#

rocblas_status rocsolver_zsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_csytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTRF_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).
Specifically, \(U_j\) and \(L_j\) are computed as
\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP); each vector ipiv_j has dimension n.
Contains the vectors ipiv_j of pivot indices (corresponding to A_j). Elements of ipiv_j are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.1.18. rocsolver_<type>sytrf_strided_batched()#

rocblas_status rocsolver_zsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_csytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTRF_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).
Specifically, \(U_j\) and \(L_j\) are computed as
\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP); each vector ipiv_j has dimension n.
Contains the vectors ipiv_j of pivot indices (corresponding to A_j). Elements of ipiv_j are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2. Orthogonal factorizations#
3.3.2.1. rocsolver_<type>geqr2()#

rocblas_status rocsolver_zgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQR2 computes a QR factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]where R is upper triangular (upper trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_1H_2\cdots H_k, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
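To illustrate how the compact output represents Q, the following host-side sketch in plain C applies Q' to a vector using only the stored Householder vectors and scalars. It covers the real double-precision case only. The helper names are hypothetical (not part of rocSOLVER), the array tau stands for the Householder scalars returned in ipiv, and A and tau are assumed to have been copied to host memory with A in column-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): overwrites b (length m)
 * with H_i * b, where H_i = I - tau * v_i * v_i^T. The vector v_i is read
 * from column i (0-based) of the factored A: its entries above row i are
 * zero, entry i is an implicit 1, and the rest sit below the diagonal. */
void apply_householder(const double *A, int lda, int m, int i, double tau, double *b)
{
    assert(A != NULL && b != NULL && 0 <= i && i < m && lda >= m);
    const double *col = A + (size_t)i * lda;
    double w = b[i]; /* contribution of the implicit v_i[i] = 1 */
    for (int r = i + 1; r < m; ++r)
        w += col[r] * b[r];
    w *= tau; /* w = tau * (v_i^T b) */
    b[i] -= w;
    for (int r = i + 1; r < m; ++r)
        b[r] -= w * col[r];
}

/* Hypothetical helper: overwrites b with Q^T * b. Since Q = H_1 H_2 ... H_k
 * and each real H_i is symmetric, Q^T b = H_k ( ... (H_1 b)). */
void apply_qt(const double *A, int lda, int m, int n, const double *tau, double *b)
{
    int k = (m < n) ? m : n;
    for (int i = 0; i < k; ++i)
        apply_householder(A, lda, m, i, tau[i], b);
}
```

As a small check, the reflector with v = (1, 0.5) and scalar 1.6 maps the vector (3, 4) to (-5, 0), which is the kind of column a QR step would leave behind.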
3.3.2.2. rocsolver_<type>geqr2_batched()#

rocblas_status rocsolver_zgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQR2_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
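The batched variant takes A as an array of pointers, one per matrix. When all matrices happen to live in one contiguous allocation, the pointer array can be filled with simple offset arithmetic, as in the plain C sketch below. The helper name is hypothetical (not part of rocSOLVER), and the sketch only shows the address computation; where the pointer array itself must reside, and how it is transferred, is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): fills ptrs[j] with the
 * start of matrix A_j, assuming the matrices are laid out back to back in
 * one buffer, `stride` elements apart (for example stride = lda * n). */
void fill_batch_pointers(double *base, size_t stride, int batch_count, double **ptrs)
{
    assert(base != NULL && ptrs != NULL && batch_count >= 0);
    for (int j = 0; j < batch_count; ++j)
        ptrs[j] = base + (size_t)j * stride;
}
```

With this layout the batched and strided batched interfaces describe the same data, so choosing between them is mostly a matter of how the matrices were allocated.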
3.3.2.3. rocsolver_<type>geqr2_strided_batched()#

rocblas_status rocsolver_zgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQR2_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
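The strided layout described by strideA and strideP can be sketched with plain offset arithmetic. The helper names below are hypothetical and not part of rocSOLVER. The sketch assumes column-major storage, non-negative strides, 1-based row, column and scalar indices as in the notation of this section, and a batch index j counted from 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not rocSOLVER routines).
   r, c and i are 1-based; the batch index j is 0-based. */

/* Offset of the element A_j[r,c] from the start of the buffer A. */
size_t strided_matrix_offset(size_t j, size_t r, size_t c,
                             size_t lda, size_t strideA)
{
    assert(r >= 1 && c >= 1 && r <= lda);
    return j * strideA + (c - 1) * lda + (r - 1);
}

/* Offset of the scalar ipiv_j[i] from the start of the buffer ipiv. */
size_t strided_scalar_offset(size_t j, size_t i, size_t strideP)
{
    assert(i >= 1);
    return j * strideP + (i - 1);
}
```

With lda = 4 and strideA = 8, for example, the element A_1[2,2] sits 8 + 4 + 1 = 13 elements past the start of the buffer.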
3.3.2.4. rocsolver_<type>geqrf()#

rocblas_status rocsolver_zgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQRF computes a QR factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]where R is upper triangular (upper trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_1H_2\cdots H_k, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
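The packed output described above (R on and above the diagonal, Householder vectors below it, scalars in ipiv) can be read back as in the following host-side C sketch, which multiplies Q by a vector. The function apply_q_from_geqrf is hypothetical and not part of rocSOLVER. It assumes that A and ipiv have already been copied to host memory, that the data is real and stored column-major, and it uses 0-based loop indices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER routine): overwrite x (length m)
   with Q*x, where Q = H_1 H_2 ... H_k is read from a host copy of the
   factored array A (leading dimension lda) and of the scalars tau. */
void apply_q_from_geqrf(size_t m, size_t n, const double *A, size_t lda,
                        const double *tau, double *x)
{
    assert(lda >= m);
    size_t k = m < n ? m : n;

    /* Q*x = H_1 (H_2 ( ... (H_k x))), so H_k is applied first. */
    for (size_t i = k; i-- > 0;) {
        const double *col = A + i * lda;

        /* v_i is zero above row i, equal to 1 at row i, and its remaining
           entries are stored below the diagonal of column i. */
        double dot = x[i];
        for (size_t r = i + 1; r < m; ++r)
            dot += col[r] * x[r];

        double w = tau[i] * dot;
        x[i] -= w;
        for (size_t r = i + 1; r < m; ++r)
            x[r] -= w * col[r];
    }
}
```

Applying this to the columns of the m-by-n matrix formed by R stacked over zeros reproduces the columns of the original A, which gives a simple way to check a factorization on small inputs.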
3.3.2.5. rocsolver_<type>geqrf_batched()#

rocblas_status rocsolver_zgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQRF_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
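The batched variants take A as an array of pointers, one per matrix. The sketch below shows one way such an array could be filled when all matrices live in a single contiguous buffer of batch_count blocks of lda*n elements. The function fill_batch_pointers is hypothetical and not part of rocSOLVER, and it runs on ordinary host memory purely to show the pointer arithmetic; in actual use each pointer would have to refer to an array on the GPU, as stated for the parameter A.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER routine): set ptrs[j] to the start
   of the j-th lda-by-n column-major block inside one contiguous buffer. */
void fill_batch_pointers(double *base, size_t lda, size_t n,
                         size_t batch_count, double **ptrs)
{
    assert(base != NULL && ptrs != NULL);
    for (size_t j = 0; j < batch_count; ++j)
        ptrs[j] = base + j * lda * n;
}
```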
3.3.2.6. rocsolver_<type>geqrf_strided_batched()#

rocblas_status rocsolver_zgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQRF_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.7. rocsolver_<type>gerq2()#

rocblas_status rocsolver_zgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GERQ2 computes an RQ factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]where R is upper triangular (upper trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_1'H_2' \cdots H_k', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
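The two cases in the description of A on exit (subdiagonal when m >= n, superdiagonal when n > m) reduce to a single test on the difference between the column and row index. The following C sketch, using a hypothetical helper that is not part of rocSOLVER, classifies a 1-based entry (r,c) of the output array.

```c
#include <assert.h>

/* Hypothetical helper (not a rocSOLVER routine): returns 1 if the 1-based
   entry (r,c) of the m-by-n output array holds part of the factor R, and
   0 if it holds Householder-vector data. */
int gerq2_entry_is_r(int m, int n, int r, int c)
{
    assert(r >= 1 && r <= m && c >= 1 && c <= n);
    /* On or above the (m-n)th subdiagonal means r - c <= m - n; on or
       above the (n-m)th superdiagonal means c - r >= n - m. Both are the
       same inequality. */
    return (c - r) >= (n - m);
}
```

For a 2-by-4 input, for instance, R occupies the entries (1,3), (1,4) and (2,4), which form the upper triangle of the last two columns.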
3.3.2.8. rocsolver_<type>gerq2_batched()#

rocblas_status rocsolver_zgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQ2_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.9. rocsolver_<type>gerq2_strided_batched()#

rocblas_status rocsolver_zgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQ2_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.10. rocsolver_<type>gerqf()#

rocblas_status rocsolver_zgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GERQF computes an RQ factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]where R is upper triangular (upper trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_1'H_2' \cdots H_k', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
3.3.2.11. rocsolver_<type>gerqf_batched()#

rocblas_status rocsolver_zgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQF_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.12. rocsolver_<type>gerqf_strided_batched()#

rocblas_status rocsolver_zgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQF_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.13. rocsolver_<type>geql2()#

rocblas_status rocsolver_zgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQL2 computes a QL factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]where L is lower triangular (lower trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_kH_{k-1}\cdots H_1, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
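Because the output array mixes the factor L with Householder-vector data, it is often convenient to separate the two after copying the result back to the host. The C sketch below copies the entries that belong to L and zeroes the rest. The function extract_l_from_geql2 is hypothetical and not part of rocSOLVER; it assumes real data, column-major storage, a host copy of A, and an output array with leading dimension m.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER routine): copy the m-by-n output
   array A (leading dimension lda) into L (leading dimension m), keeping
   the entries of the factor L and replacing Householder-vector entries
   with zero. Loop indices are 0-based. */
void extract_l_from_geql2(size_t m, size_t n, const double *A, size_t lda,
                          double *L)
{
    assert(lda >= m);
    for (size_t c = 0; c < n; ++c) {
        for (size_t r = 0; r < m; ++r) {
            /* An entry belongs to L when c - r <= n - m. The test is
               written as c + m <= n + r to stay in unsigned arithmetic. */
            int in_l = (c + m <= n + r);
            L[c * m + r] = in_l ? A[c * lda + r] : 0.0;
        }
    }
}
```

For a 3-by-2 input, the entries kept are (2,1), (3,1) and (3,2) in 1-based notation, so the result is a zero row stacked over a 2-by-2 lower triangular block.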
3.3.2.14. rocsolver_<type>geql2_batched()#

rocblas_status rocsolver_zgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQL2_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.15. rocsolver_<type>geql2_strided_batched()#

rocblas_status rocsolver_zgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQL2_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
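The strided-batched layout described by lda and strideA can be illustrated with a small host-side index computation. This is a sketch under stated assumptions: `strided_offset` is a hypothetical helper, matrices are stored column-major (as the leading-dimension constraint lda >= m suggests), row and column indices are 1-based as in the notation of this section, and the batch index j is counted from zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of rocSOLVER): zero-based offset of element
// A_j[row, col] inside a strided-batched buffer. Assumes column-major
// storage, 1-based row/col, and a zero-based batch index j.
std::size_t strided_offset(int row, int col, int j, int lda, std::int64_t strideA)
{
    assert(row >= 1 && col >= 1 && j >= 0 && lda >= row && strideA >= 0);
    return static_cast<std::size_t>(j) * static_cast<std::size_t>(strideA)
           + static_cast<std::size_t>(col - 1) * static_cast<std::size_t>(lda)
           + static_cast<std::size_t>(row - 1);
}
```

For example, with lda = 4 and strideA = 12, element A_1[3,2] sits at offset 12 + 4 + 2 = 18 from the start of the buffer.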
3.3.2.16. rocsolver_<type>geqlf()#

rocblas_status rocsolver_zgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQLF computes a QL factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]where L is lower triangular (lower trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_kH_{k-1}\cdots H_1, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
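To illustrate where the factor L ends up in A on exit, the following host-side sketch copies it out of a column-major copy of the factored matrix. It is an assumption-laden illustration, not library code: `extract_ql_factor` is a hypothetical helper, it handles real double data only, and it expects the caller to have already copied A from the GPU to the host.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of rocSOLVER): copies the factor L out of a
// host copy of A as left by a QL factorization. A is column-major with
// leading dimension lda. The result is column-major with min(m,n) rows,
// n columns, and leading dimension min(m,n).
std::vector<double> extract_ql_factor(const std::vector<double>& A, int m, int n, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    assert(A.size() >= static_cast<std::size_t>(lda) * static_cast<std::size_t>(n));
    const int k = std::min(m, n);
    std::vector<double> L(static_cast<std::size_t>(k) * static_cast<std::size_t>(n), 0.0);
    for (int c = 0; c < n; ++c)
    {
        for (int r = 0; r < k; ++r)
        {
            // L occupies the last k rows of A.
            const int ar = m - k + r;
            // Keep entries on and below the (m-n)th subdiagonal (m >= n)
            // or the (n-m)th superdiagonal (n > m); the rest stays zero.
            if (c - r <= n - k)
                L[static_cast<std::size_t>(c) * k + r] = A[static_cast<std::size_t>(c) * lda + ar];
        }
    }
    return L;
}
```

For a 3-by-2 input the result is the 2-by-2 lower triangle taken from the last two rows of A; for a 2-by-3 input it is a 2-by-3 lower trapezoid.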
3.3.2.17. rocsolver_<type>geqlf_batched()#

rocblas_status rocsolver_zgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQLF_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
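The array-of-pointers argument A used by the batched variants can be sketched as plain pointer arithmetic. The example below is hypothetical and host-only: `make_batch_pointers` is not a rocSOLVER function, and it assumes the matrices are stored back to back in a single allocation, each occupying lda*n elements. In actual use the pointers would refer to GPU memory, as the parameter description states; the sketch only shows how the pointer values relate to one another.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of rocSOLVER): builds the array of pointers
// A[0..batch_count-1] for matrices stored back to back in one allocation,
// each occupying lda * n elements.
std::vector<double*> make_batch_pointers(double* base, int lda, int n, int batch_count)
{
    assert(base != nullptr && lda >= 1 && n >= 0 && batch_count >= 0);
    std::vector<double*> A(static_cast<std::size_t>(batch_count));
    const std::size_t matrix_elems = static_cast<std::size_t>(lda) * static_cast<std::size_t>(n);
    for (int j = 0; j < batch_count; ++j)
        A[static_cast<std::size_t>(j)] = base + static_cast<std::size_t>(j) * matrix_elems;
    return A;
}
```

Nothing requires the matrices to be contiguous; the pointer array could equally reference separately allocated buffers, which is the main difference from the strided-batched variant.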
3.3.2.18. rocsolver_<type>geqlf_strided_batched()#

rocblas_status rocsolver_zgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQLF_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)th subdiagonal (when m >= n) or the (n-m)th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.19. rocsolver_<type>gelq2()#

rocblas_status rocsolver_zgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GELQ2 computes an LQ factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]where L is lower triangular (lower trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_k'H_{k-1}' \cdots H_1', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i' v_i \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
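The storage convention for the Householder vectors can be made concrete with a host-side sketch. Both functions below are hypothetical helpers, not rocSOLVER routines; they assume real double data, a column-major host copy of the factored A, and a 1-based reflector index i as in the text. The first rebuilds v_i from row i of A, and the second applies H = I - tau * v' v to a column vector, treating v as a row vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of rocSOLVER): rebuilds the Householder
// vector v_i (i is 1-based) from a column-major host copy of A after an
// LQ factorization. The first i-1 entries are zero, v_i[i] = 1, and the
// last n-i entries are read from row i of A, right of the diagonal.
std::vector<double> lq_householder_vector(const std::vector<double>& A, int n, int lda, int i)
{
    assert(i >= 1 && i <= n && lda >= i);
    assert(A.size() >= static_cast<std::size_t>(lda) * static_cast<std::size_t>(n));
    std::vector<double> v(static_cast<std::size_t>(n), 0.0);
    v[static_cast<std::size_t>(i - 1)] = 1.0; // implicit unit entry, not stored in A
    for (int c = i; c < n; ++c)               // zero-based columns i..n-1
        v[static_cast<std::size_t>(c)] = A[static_cast<std::size_t>(c) * lda + (i - 1)];
    return v;
}

// Applies H = I - tau * v' v to x (real case): H x = x - tau * (v . x) * v.
std::vector<double> apply_reflector(double tau, const std::vector<double>& v, std::vector<double> x)
{
    assert(v.size() == x.size());
    double dot = 0.0;
    for (std::size_t p = 0; p < v.size(); ++p)
        dot += v[p] * x[p];
    for (std::size_t p = 0; p < v.size(); ++p)
        x[p] -= tau * dot * v[p];
    return x;
}
```

In the sketch, tau plays the role of ipiv[i]. The reflector is applied without ever forming the n-by-n matrix H_i, which is the usual reason for keeping Q in this factored form.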
3.3.2.20. rocsolver_<type>gelq2_batched()#

rocblas_status rocsolver_zgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQ2_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} L_j & 0 \end{array}\right] Q_j \]where \(L_j\) is lower triangular (lower trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}'H_{j_{k-1}}' \cdots H_{j_1}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i}' v_{j_i} \]where the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the last n - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.21. rocsolver_<type>gelq2_strided_batched()#

rocblas_status rocsolver_zgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQ2_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} L_j & 0 \end{array}\right] Q_j \]where \(L_j\) is lower triangular (lower trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}'H_{j_{k-1}}' \cdots H_{j_1}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i}' v_{j_i} \]where the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the last n - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.22. rocsolver_<type>gelqf()#

rocblas_status rocsolver_zgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#

rocblas_status rocsolver_cgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

rocblas_status rocsolver_dgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#

rocblas_status rocsolver_sgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GELQF computes an LQ factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]where L is lower triangular (lower trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H_k'H_{k-1}' \cdots H_1', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{ipiv}[i] \cdot v_i' v_i \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
ipiv – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars.
3.3.2.23. rocsolver_<type>gelqf_batched()#

rocblas_status rocsolver_zgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQF_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} L_j & 0 \end{array}\right] Q_j \]where \(L_j\) is lower triangular (lower trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}'H_{j_{k-1}}' \cdots H_{j_1}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i}' v_{j_i} \]where the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the last n - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.2.24. rocsolver_<type>gelqf_strided_batched()#

rocblas_status rocsolver_zgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQF_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = \left[\begin{array}{cc} L_j & 0 \end{array}\right] Q_j \]where \(L_j\) is lower triangular (lower trapezoidal if m > n), and \(Q_j\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_j = H_{j_k}'H_{j_{k-1}}' \cdots H_{j_1}', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i}' v_{j_i} \]where the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the last n - i elements of Householder vector v_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors ipiv_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3. Problem and matrix reductions#
3.3.3.1. rocsolver_<type>gebd2()#

rocblas_status rocsolver_zgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)#

rocblas_status rocsolver_cgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)#

rocblas_status rocsolver_dgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)#

rocblas_status rocsolver_sgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)#
GEBD2 computes the bidiagonal form of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The bidiagonal form is given by:
\[ B = Q' A P \]where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_n\: \text{and} \: P = G_1G_2\cdots G_{n-1}, & \: \text{if}\: m \geq n, \:\text{or}\\ Q = H_1H_2\cdots H_{m-1}\: \text{and} \: P = G_1G_2\cdots G_{m}, & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_i\) and \(G_i\) is given by
\[\begin{split} \begin{array}{cl} H_i = I - \text{tauq}[i] \cdot v_i v_i', & \: \text{and}\\ G_i = I - \text{taup}[i] \cdot u_i' u_i. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_i, and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_i. If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_i, and the elements above the diagonal are the last n - i elements of Householder vector u_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
D – [out]
pointer to real type. Array on the GPU of dimension min(m,n).
The diagonal elements of B.
E – [out]
pointer to real type. Array on the GPU of dimension min(m,n)-1.
The off-diagonal elements of B.
tauq – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars associated with matrix Q.
taup – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars associated with matrix P.
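The position of the bidiagonal form B inside A on exit can be illustrated with a host-side sketch. The routine already returns these values in D and E, so the code below is purely illustrative: `bidiagonal_entries` is a hypothetical helper that assumes real double data and a column-major host copy of the factored A.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of rocSOLVER): reads the diagonal (D) and
// off-diagonal (E) entries of B from a column-major host copy of A after
// the reduction. B is upper bidiagonal if m >= n, lower bidiagonal if m < n.
std::pair<std::vector<double>, std::vector<double>>
bidiagonal_entries(const std::vector<double>& A, int m, int n, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    assert(A.size() >= static_cast<std::size_t>(lda) * static_cast<std::size_t>(n));
    const std::size_t k = static_cast<std::size_t>(std::min(m, n));
    const std::size_t ld = static_cast<std::size_t>(lda);
    std::vector<double> D(k);
    std::vector<double> E(k > 0 ? k - 1 : 0);
    for (std::size_t i = 0; i < k; ++i)
        D[i] = A[i * ld + i];                      // B[i,i]
    for (std::size_t i = 0; i + 1 < k; ++i)
        E[i] = (m >= n) ? A[(i + 1) * ld + i]      // superdiagonal B[i,i+1]
                        : A[i * ld + (i + 1)];     // subdiagonal   B[i+1,i]
    return {D, E};
}
```

The returned vectors have min(m,n) and min(m,n)-1 entries, matching the dimensions documented for D and E.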
3.3.3.2. rocsolver_<type>gebd2_batched()#

rocblas_status rocsolver_zgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBD2_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_j = Q_j' A_j P_j \]where \(B_j\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_j\) and \(P_j\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_n}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_{n-1}}, & \: \text{if}\: m \geq n, \:\text{or}\\ Q_j = H_{j_1}H_{j_2}\cdots H_{j_{m-1}}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_m}, & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) and \(G_{j_i}\) is given by
\[\begin{split} \begin{array}{cl} H_{j_i} = I - \text{tauq}_j[i] \cdot v_{j_i} v_{j_i}', & \: \text{and}\\ G_{j_i} = I - \text{taup}_j[i] \cdot u_{j_i}' u_{j_i}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\); while the first i elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(j_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(j_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(j_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of B_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of B_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out]
pointer to type. Array on the GPU (the size depends on the value of strideQ).
Contains the vectors tauq_j of Householder scalars associated with matrices Q_j.
strideQ – [in]
rocblas_stride.
Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors taup_j of Householder scalars associated with matrices P_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
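The stride parameters above determine how large the D, E, tauq and taup buffers have to be. The following is a minimal host-side sketch of that arithmetic, not part of the library: the helper names batched_vector_elems and min_dim are hypothetical, and the sketch assumes a non-negative stride even though the API itself places no restriction on it.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest element count that keeps batch_count vectors of length len,
   spaced stride elements apart, inside one buffer.
   Assumption of this sketch (not a library rule): stride >= 0. */
size_t batched_vector_elems(size_t len, long long stride, size_t batch_count)
{
    assert(stride >= 0);
    if (len == 0 || batch_count == 0)
        return 0;
    /* The last vector starts at (batch_count - 1) * stride and has len entries. */
    return (batch_count - 1) * (size_t)stride + len;
}

/* min(m, n): the length of each D_j, tauq_j and taup_j. */
size_t min_dim(size_t m, size_t n)
{
    return m < n ? m : n;
}
```

For m = 5, n = 3 and batch_count = 5 with all strides equal to 4, D needs batched_vector_elems(3, 4, 5) elements and E needs batched_vector_elems(2, 4, 5) elements.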
3.3.3.3. rocsolver_<type>gebd2_strided_batched()#

rocblas_status rocsolver_zgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBD2_STRIDED_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_j = Q_j' A_j P_j \]where \(B_j\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_j\) and \(P_j\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_n}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_{n-1}}, & \: \text{if}\: m >= n, \:\text{or}\\ Q_j = H_{j_1}H_{j_2}\cdots H_{j_{m-1}}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_m}, & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) and \(G_{j_i}\) is given by
\[\begin{split} \begin{array}{cl} H_{j_i} = I - \text{tauq}_j[i] \cdot v_{j_i} v_{j_i}', & \: \text{and}\\ G_{j_i} = I - \text{taup}_j[i] \cdot u_{j_i}' u_{j_i}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\); while the first i elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(j_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(j_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(j_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of B_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of B_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out]
pointer to type. Array on the GPU (the size depends on the value of strideQ).
Contains the vectors tauq_j of Householder scalars associated with matrices Q_j.
strideQ – [in]
rocblas_stride.
Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors taup_j of Householder scalars associated with matrices P_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
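As an illustration of how lda and strideA locate an entry in the strided-batched layout, the sketch below computes the linear offset of A_j[row,col]. It assumes column-major storage (which the lda >= m constraint suggests), 1-based row and col as in the notation of this section, and a 0-based batch index j; the function name strided_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of A_j[row, col] from the start of the buffer A.
   row and col are 1-based, j is 0-based.
   Assumption: column-major storage with leading dimension lda. */
size_t strided_offset(size_t j, size_t row, size_t col, size_t lda, size_t strideA)
{
    assert(row >= 1 && col >= 1);
    assert(row <= lda); /* a row index beyond lda would spill into the next column */
    return j * strideA + (col - 1) * lda + (row - 1);
}
```

With lda = 4 and strideA = 12, entry A_2[3,2] (third matrix in the batch when counting from j = 0) sits 2*12 + 1*4 + 2 elements past the start of A.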
3.3.3.4. rocsolver_<type>gebrd()#

rocblas_status rocsolver_zgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)#

rocblas_status rocsolver_cgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)#

rocblas_status rocsolver_dgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)#

rocblas_status rocsolver_sgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)#
GEBRD computes the bidiagonal form of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The bidiagonal form is given by:
\[ B = Q' A P \]where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_n\: \text{and} \: P = G_1G_2\cdots G_{n-1}, & \: \text{if}\: m >= n, \:\text{or}\\ Q = H_1H_2\cdots H_{m-1}\: \text{and} \: P = G_1G_2\cdots G_{m}, & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_i\) and \(G_i\) is given by
\[\begin{split} \begin{array}{cl} H_i = I - \text{tauq}[i] \cdot v_i v_i', & \: \text{and}\\ G_i = I - \text{taup}[i] \cdot u_i' u_i. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_i, and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_i. If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_i, and the elements above the diagonal are the last n - i elements of Householder vector u_i.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
D – [out]
pointer to real type. Array on the GPU of dimension min(m,n).
The diagonal elements of B.
E – [out]
pointer to real type. Array on the GPU of dimension min(m,n)-1.
The off-diagonal elements of B.
tauq – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars associated with matrix Q.
taup – [out]
pointer to type. Array on the GPU of dimension min(m,n).
The Householder scalars associated with matrix P.
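To make the roles of D and E concrete, the sketch below rebuilds a dense m-by-n matrix B from them on the host: D goes on the diagonal, and E goes on the superdiagonal when m >= n or on the subdiagonal when m < n. It assumes D and E have already been copied back from the GPU and that B is stored column-major; the function name assemble_bidiagonal and the parameter ldb are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fill the m-by-n column-major matrix B (leading dimension ldb) with the
   bidiagonal form described by D (min(m,n) entries) and E (min(m,n)-1 entries).
   Assumption: D and E are host copies of the GEBRD outputs. */
void assemble_bidiagonal(int m, int n, const double *D, const double *E,
                         double *B, int ldb)
{
    assert(m >= 0 && n >= 0 && ldb >= m);
    int k = m < n ? m : n;

    /* Start from an all-zero matrix. */
    for (int c = 0; c < n; ++c)
        for (int r = 0; r < m; ++r)
            B[r + (size_t)c * ldb] = 0.0;

    /* Diagonal entries. */
    for (int i = 0; i < k; ++i)
        B[i + (size_t)i * ldb] = D[i];

    /* Off-diagonal entries: upper bidiagonal if m >= n, lower otherwise. */
    for (int i = 0; i + 1 < k; ++i) {
        if (m >= n)
            B[i + (size_t)(i + 1) * ldb] = E[i];
        else
            B[(i + 1) + (size_t)i * ldb] = E[i];
    }
}
```

For complex inputs the same routine applies unchanged, since D and E are documented as pointers to the real type.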
3.3.3.5. rocsolver_<type>gebrd_batched()#

rocblas_status rocsolver_zgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBRD_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_j = Q_j' A_j P_j \]where \(B_j\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_j\) and \(P_j\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_n}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_{n-1}}, & \: \text{if}\: m >= n, \:\text{or}\\ Q_j = H_{j_1}H_{j_2}\cdots H_{j_{m-1}}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_m}, & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) and \(G_{j_i}\) is given by
\[\begin{split} \begin{array}{cl} H_{j_i} = I - \text{tauq}_j[i] \cdot v_{j_i} v_{j_i}', & \: \text{and}\\ G_{j_i} = I - \text{taup}_j[i] \cdot u_{j_i}' u_{j_i}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\); while the first i elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(j_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(j_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(j_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of B_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of B_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out]
pointer to type. Array on the GPU (the size depends on the value of strideQ).
Contains the vectors tauq_j of Householder scalars associated with matrices Q_j.
strideQ – [in]
rocblas_stride.
Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors taup_j of Householder scalars associated with matrices P_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
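The batched variants take A as an array of pointers, one per matrix. A common way to populate such an array is to carve equally spaced matrices out of one allocation, as in the host-only sketch below. The name build_pointer_array is hypothetical, the sketch uses plain host memory for illustration, and it does not address where the pointer array itself has to reside when it is handed to the library.

```c
#include <assert.h>
#include <stddef.h>

/* Fill ptrs[0..batch_count-1] so that ptrs[j] points at the j-th matrix
   inside one contiguous allocation starting at base.
   Assumption: consecutive matrices are stride elements apart. */
void build_pointer_array(double *base, size_t stride, int batch_count,
                         double **ptrs)
{
    assert(batch_count >= 0);
    for (int j = 0; j < batch_count; ++j)
        ptrs[j] = base + (size_t)j * stride;
}
```

Choosing stride >= lda*n keeps the matrices from overlapping, matching the per-pointer dimension lda*n stated for A.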
3.3.3.6. rocsolver_<type>gebrd_strided_batched()#

rocblas_status rocsolver_zgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_cgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_dgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_sgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBRD_STRIDED_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_j = Q_j' A_j P_j \]where \(B_j\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_j\) and \(P_j\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_n}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_{n-1}}, & \: \text{if}\: m >= n, \:\text{or}\\ Q_j = H_{j_1}H_{j_2}\cdots H_{j_{m-1}}\: \text{and} \: P_j = G_{j_1}G_{j_2}\cdots G_{j_m}, & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) and \(G_{j_i}\) is given by
\[\begin{split} \begin{array}{cl} H_{j_i} = I - \text{tauq}_j[i] \cdot v_{j_i} v_{j_i}', & \: \text{and}\\ G_{j_i} = I - \text{taup}_j[i] \cdot u_{j_i}' u_{j_i}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\); while the first i elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{j_i}\) are zero, and \(u_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all the matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all the matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_j. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(j_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(j_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(j_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(j_i).
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of B_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of B_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out]
pointer to type. Array on the GPU (the size depends on the value of strideQ).
Contains the vectors tauq_j of Householder scalars associated with matrices Q_j.
strideQ – [in]
rocblas_stride.
Stride from the start of one vector tauq_j to the next one tauq_(j+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors taup_j of Householder scalars associated with matrices P_j.
strideP – [in]
rocblas_stride.
Stride from the start of one vector taup_j to the next one taup_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
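The description of A on exit says where the non-trivial part of each Householder vector v_(j_i) is stored. The sketch below expands one such vector to its full length m on the host, following the zero and unit-entry pattern given above for both m >= n and m < n. It assumes a column-major host copy of a single factored matrix (for batch entry j of a strided batch, pass A + j*strideA); the function name extract_v is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Expand Householder vector v_i (i is 1-based) into v[0..m-1].
   A is a column-major host copy of one factored m-by-n matrix.
   If m >= n: v_i has i-1 leading zeros and v_i[i]   = 1.
   If m <  n: v_i has i   leading zeros and v_i[i+1] = 1. */
void extract_v(int m, int n, const double *A, int lda, int i, double *v)
{
    int k = m < n ? m : n;
    assert(i >= 1 && i <= k && lda >= m);

    /* 0-based row that holds the implicit unit entry. */
    int unit = (m >= n) ? i - 1 : i;
    assert(unit < m); /* for m < n only H_1 .. H_(m-1) exist */

    for (int r = 0; r < unit; ++r)
        v[r] = 0.0;
    v[unit] = 1.0;
    /* The remaining entries are stored in column i, below the unit entry. */
    for (int r = unit + 1; r < m; ++r)
        v[r] = A[r + (size_t)(i - 1) * lda];
}
```

The vectors u_(j_i) follow the mirrored pattern along rows of A and could be expanded with an analogous loop.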
3.3.3.7. rocsolver_<type>sytd2()#

rocblas_status rocsolver_dsytd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)#

rocblas_status rocsolver_ssytd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)#
SYTD2 computes the tridiagonal form of a real symmetric matrix A.
(This is the unblocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_{n-1} & \: \text{if uplo indicates lower, or}\\ Q = H_{n-1}H_{n-2}\cdots H_1 & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{tau}[i] \cdot v_i v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A.
D – [out]
pointer to type. Array on the GPU of dimension n.
The diagonal elements of T.
E – [out]
pointer to type. Array on the GPU of dimension n-1.
The off-diagonal elements of T.
tau – [out]
pointer to type. Array on the GPU of dimension n-1.
The Householder scalars.
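Each factor H_i = I - tau[i] * v_i * v_i' never needs to be formed as a dense matrix: applying it to a vector x only requires the dot product v_i' x. The sketch below shows this for the real case on the host. The function name apply_householder is hypothetical, and v is assumed to be the fully expanded Householder vector (zeros and unit entry included).

```c
#include <assert.h>

/* Overwrite x with H x, where H = I - tau * v * v' (real case).
   v and x both have n entries. */
void apply_householder(int n, double tau, const double *v, double *x)
{
    assert(n >= 0);

    /* dot = v' x */
    double dot = 0.0;
    for (int k = 0; k < n; ++k)
        dot += v[k] * x[k];

    /* x = x - tau * dot * v */
    for (int k = 0; k < n; ++k)
        x[k] -= tau * dot * v[k];
}
```

Applying the factors one after another in the order given for uplo reproduces the action of Q on a vector without ever storing Q.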
3.3.3.8. rocsolver_<type>sytd2_batched()#

rocblas_status rocsolver_dsytd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTD2_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is symmetric tridiagonal and \(Q_j\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
D – [out]
pointer to type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
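To show how strideD and strideE select the outputs of one batch instance, the sketch below rebuilds the dense symmetric tridiagonal matrix T_j from D and E on the host. It assumes D and E are host copies of the GPU arrays, a 0-based batch index j, and column-major storage for T; the function name assemble_tridiagonal and the parameter ldt are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Build the n-by-n symmetric tridiagonal T_j (column-major, leading
   dimension ldt) from the j-th diagonal D_j and off-diagonal E_j.
   Assumption: D and E are host copies; j is 0-based. */
void assemble_tridiagonal(int n, const double *D, long long strideD,
                          const double *E, long long strideE,
                          int j, double *T, int ldt)
{
    assert(n >= 0 && j >= 0 && ldt >= n);
    const double *Dj = D + j * strideD; /* n entries   */
    const double *Ej = E + j * strideE; /* n-1 entries */

    for (int c = 0; c < n; ++c)
        for (int r = 0; r < n; ++r)
            T[r + (size_t)c * ldt] = 0.0;

    for (int i = 0; i < n; ++i)
        T[i + (size_t)i * ldt] = Dj[i];

    /* Symmetry: the same E_j[i] goes below and above the diagonal. */
    for (int i = 0; i + 1 < n; ++i) {
        T[(i + 1) + (size_t)i * ldt] = Ej[i];
        T[i + (size_t)(i + 1) * ldt] = Ej[i];
    }
}
```

With n = 3, strideD = 3 and strideE = 2, the call for j = 1 reads D[3..5] and E[2..3].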
3.3.3.9. rocsolver_<type>sytd2_strided_batched()#

rocblas_status rocsolver_dsytd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTD2_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is symmetric tridiagonal and \(Q_j\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out]
pointer to type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
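The stride parameters above decide where each batch instance starts inside its array. The following host-side sketch illustrates the arithmetic only; the helper names batch_offset and strided_extent are hypothetical (they are not part of rocSOLVER), size_t is used for simplicity instead of rocblas_stride, and the extent formula assumes the non-overlapping "normal use case" where the stride is at least the length of one instance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of rocSOLVER): element offset of batch
   instance j (0-based) inside a strided array. */
size_t batch_offset(size_t stride, size_t j)
{
    return stride * j;
}

/* Hypothetical helper: number of elements spanned by batch_count instances,
   each len elements long and placed stride elements apart. Assumes the
   non-overlapping case stride >= len. */
size_t strided_extent(size_t stride, size_t len, size_t batch_count)
{
    assert(stride >= len);
    return batch_count == 0 ? 0 : stride * (batch_count - 1) + len;
}
```

For example, with n = 3, lda = 4 and batch_count = 3, choosing strideA = lda*n = 12 places matrix A_2 (0-based) at offset 24, and choosing strideE = n-1 = 2 makes the E array span 6 elements.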
3.3.3.10. rocsolver_<type>hetd2()#

rocblas_status rocsolver_zhetd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)#

rocblas_status rocsolver_chetd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)#
HETD2 computes the tridiagonal form of a complex Hermitian matrix A.
(This is the unblocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is Hermitian tridiagonal and Q is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_{n-1} & \: \text{if uplo indicates lower, or}\\ Q = H_{n-1}H_{n-2}\cdots H_1 & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{tau}[i] \cdot v_i v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A.
D – [out]
pointer to real type. Array on the GPU of dimension n.
The diagonal elements of T.
E – [out]
pointer to real type. Array on the GPU of dimension n-1.
The off-diagonal elements of T.
tau – [out]
pointer to type. Array on the GPU of dimension n-1.
The Householder scalars.
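Because D and E are returned as compact real arrays, it can help to see how they map back to the dense tridiagonal matrix T. The sketch below is a host-side illustration only: the helper name assemble_tridiagonal is hypothetical (not part of rocSOLVER), it assumes D and E have already been copied from the GPU to host memory, and it fills both the subdiagonal and the superdiagonal from E on the assumption that T is real symmetric, as the real-typed E suggests.

```c
#include <assert.h>

/* Hypothetical helper (not part of rocSOLVER): expand host copies of D
   (n entries) and E (n-1 entries) into a dense column-major n-by-n matrix T. */
void assemble_tridiagonal(int n, const double *D, const double *E, double *T)
{
    assert(n >= 0);
    for (int k = 0; k < n * n; ++k)
        T[k] = 0.0;
    for (int i = 0; i < n; ++i) {
        T[i + i * n] = D[i];           /* diagonal */
        if (i + 1 < n) {
            T[(i + 1) + i * n] = E[i]; /* subdiagonal */
            T[i + (i + 1) * n] = E[i]; /* superdiagonal */
        }
    }
}
```

With n = 3, D = {1, 2, 3} and E = {4, 5}, the call produces the matrix with rows (1 4 0), (4 2 5) and (0 5 3).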
3.3.3.11. rocsolver_<type>hetd2_batched()#

rocblas_status rocsolver_zhetd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_chetd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETD2_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is Hermitian tridiagonal and \(Q_j\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the Hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.12. rocsolver_<type>hetd2_strided_batched()#

rocblas_status rocsolver_zhetd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_chetd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETD2_STRIDED_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_j.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is Hermitian tridiagonal and \(Q_j\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the Hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.13. rocsolver_<type>sytrd()#

rocblas_status rocsolver_dsytrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)#

rocblas_status rocsolver_ssytrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)#
SYTRD computes the tridiagonal form of a real symmetric matrix A.
(This is the blocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_{n-1} & \: \text{if uplo indicates lower, or}\\ Q = H_{n-1}H_{n-2}\cdots H_1 & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{tau}[i] \cdot v_i v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A.
D – [out]
pointer to type. Array on the GPU of dimension n.
The diagonal elements of T.
E – [out]
pointer to type. Array on the GPU of dimension n-1.
The off-diagonal elements of T.
tau – [out]
pointer to type. Array on the GPU of dimension n-1.
The Householder scalars.
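The structure of a single reflector, the identity minus tau[i] times the outer product of v_i with itself, can be checked numerically on the host in the real case. The sketch below is an illustration, not rocSOLVER code: the names householder_matrix and orthogonality_error are hypothetical, the vector and scalar are assumed to be available in host memory, and the sample values use tau = 2/(v'v), the choice for which such a matrix is orthogonal.

```c
#include <assert.h>

/* Fill the column-major n-by-n matrix H with I - tau * v * v^T (real case). */
void householder_matrix(int n, double tau, const double *v, double *H)
{
    assert(n >= 0);
    for (int c = 0; c < n; ++c)
        for (int r = 0; r < n; ++r)
            H[r + c * n] = (r == c ? 1.0 : 0.0) - tau * v[r] * v[c];
}

/* Largest absolute entry of H^T * H - I; near zero when H is orthogonal. */
double orthogonality_error(int n, const double *H)
{
    double err = 0.0;
    for (int c = 0; c < n; ++c)
        for (int r = 0; r < n; ++r) {
            double s = (r == c) ? -1.0 : 0.0;
            for (int k = 0; k < n; ++k)
                s += H[k + r * n] * H[k + c * n];
            if (s < 0.0)
                s = -s;
            if (s > err)
                err = s;
        }
    return err;
}
```

As a usage example, v = (0, 1, 0.5) matches the lower case with i = 1 (the first i elements are zero and the next one equals 1); with tau = 2/1.25 = 1.6 the reported orthogonality error stays at rounding level.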
3.3.3.14. rocsolver_<type>sytrd_batched()#

rocblas_status rocsolver_dsytrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTRD_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is symmetric tridiagonal and \(Q_j\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
D – [out]
pointer to type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.15. rocsolver_<type>sytrd_strided_batched()#

rocblas_status rocsolver_dsytrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_ssytrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTRD_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is symmetric tridiagonal and \(Q_j\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the symmetric matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out]
pointer to type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.16. rocsolver_<type>hetrd()#

rocblas_status rocsolver_zhetrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)#

rocblas_status rocsolver_chetrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)#
HETRD computes the tridiagonal form of a complex Hermitian matrix A.
(This is the blocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is Hermitian tridiagonal and Q is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_{n-1} & \: \text{if uplo indicates lower, or}\\ Q = H_{n-1}H_{n-2}\cdots H_1 & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_i\) is given by
\[ H_i = I - \text{tau}[i] \cdot v_i v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A.
D – [out]
pointer to real type. Array on the GPU of dimension n.
The diagonal elements of T.
E – [out]
pointer to real type. Array on the GPU of dimension n-1.
The off-diagonal elements of T.
tau – [out]
pointer to type. Array on the GPU of dimension n-1.
The Householder scalars.
3.3.3.17. rocsolver_<type>hetrd_batched()#

rocblas_status rocsolver_zhetrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_chetrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETRD_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is Hermitian tridiagonal and \(Q_j\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the Hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.18. rocsolver_<type>hetrd_strided_batched()#

rocblas_status rocsolver_zhetrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

rocblas_status rocsolver_chetrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETRD_STRIDED_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_j.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_j\) is given by:
\[ T_j = Q_j' A_j Q_j \]where \(T_j\) is Hermitian tridiagonal and \(Q_j\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_j = H_{j_1}H_{j_2}\cdots H_{j_{n-1}} & \: \text{if uplo indicates lower, or}\\ Q_j = H_{j_{n-1}}H_{j_{n-2}}\cdots H_{j_1} & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_{j_i}\) is given by
\[ H_{j_i} = I - \text{tau}_j[i] \cdot v_{j_i} v_{j_i}' \]where \(\text{tau}_j[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).
 Parameters:
handle – [in] rocblas_handle.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the Hermitian matrix A_j is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrices A_j.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_j; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(j_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_j; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(j_i) stored as columns.
lda – [in]
rocblas_int. lda >= n.
The leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideD).
The diagonal elements of T_j.
strideD – [in]
rocblas_stride.
Stride from the start of one vector D_j to the next one D_(j+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out]
pointer to real type. Array on the GPU (the size depends on the value of strideE).
The off-diagonal elements of T_j.
strideE – [in]
rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out]
pointer to type. Array on the GPU (the size depends on the value of strideP).
Contains the vectors tau_j of corresponding Householder scalars.
strideP – [in]
rocblas_stride.
Stride from the start of one vector tau_j to the next one tau_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.19. rocsolver_<type>sygs2()#

rocblas_status rocsolver_dsygs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *B, const rocblas_int ldb)#

rocblas_status rocsolver_ssygs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *B, const rocblas_int ldb)#
SYGS2 reduces a real symmetric-definite generalized eigenproblem to standard form.
(This is the unblocked version of the algorithm).
The problem solved by this function is either of the form
\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]depending on the value of itype.
If the problem is of the 1st form, then A is overwritten with
\[\begin{split} \begin{array}{cl} U^{-T} A U^{-1}, & \: \text{or}\\ L^{-1} A L^{-T}, \end{array} \end{split}\]where the symmetric-definite matrix B has been factorized as either \(U^T U\) or \(L L^T\) as returned by POTRF, depending on the value of uplo.
If the problem is of the 2nd or 3rd form, then A is overwritten with
\[\begin{split} \begin{array}{cl} U A U^T, & \: \text{or}\\ L^T A L, \end{array} \end{split}\]also depending on the value of uplo.
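Why the 1st-form transformation yields a standard eigenproblem can be seen in one step. As an illustration, take the lower case, where \(B = L L^T\), and introduce the substitution \(Y = L^T X\) (the symbol Y is used here only for this explanation):
\[ A X = \lambda L L^T X \;\Longrightarrow\; L^{-1} A L^{-T} \left(L^T X\right) = \lambda \left(L^T X\right) \;\Longrightarrow\; \left(L^{-1} A L^{-T}\right) Y = \lambda Y \]so the matrix \(L^{-1} A L^{-T}\) has the same eigenvalues \(\lambda\) as the generalized problem, and the original eigenvectors follow from \(X = L^{-T} Y\). The upper case is analogous with \(B = U^T U\) and \(Y = U X\).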
 Parameters:
handle – [in] rocblas_handle.
itype – [in]
rocblas_eform.
Specifies the form of the generalized eigenproblem.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the matrix A is stored, and whether the factorization applied to B was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.
n – [in]
rocblas_int. n >= 0.
The matrix dimensions.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix A. On exit, the transformed matrix associated with the equivalent standard eigenvalue problem.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
B – [out]
pointer to type. Array on the GPU of dimension ldb*n.
The triangular factor of the matrix B, as returned by POTRF.
ldb – [in]
rocblas_int. ldb >= n.
Specifies the leading dimension of B.
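As a way to check results on small inputs, the 1st-form, lower-triangular reduction \(L^{-1} A L^{-T}\) can be reproduced on the CPU with two triangular solves. This is a minimal sketch, not how rocSOLVER implements SYGS2: the function names are hypothetical, the matrices are host arrays in column-major order, and, unlike the GPU routine, both triangles of A are read and written.

```c
#include <stddef.h>

/* Column-major access: element (i, j) of a matrix with leading dimension ld. */
#define IDX(i, j, ld) ((size_t)(j) * (size_t)(ld) + (size_t)(i))

/* In place: M <- L^{-1} M, with L lower triangular (forward substitution per column). */
static void solve_lower_left(int n, const double *L, int ldl, double *M, int ldm)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            double s = M[IDX(i, j, ldm)];
            for (int k = 0; k < i; ++k)
                s -= L[IDX(i, k, ldl)] * M[IDX(k, j, ldm)];
            M[IDX(i, j, ldm)] = s / L[IDX(i, i, ldl)];
        }
    }
}

/* In place: M <- M L^{-T}, i.e. solve X L^T = M row by row. */
static void solve_lower_right_trans(int n, const double *L, int ldl, double *M, int ldm)
{
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double s = M[IDX(i, j, ldm)];
            for (int k = 0; k < j; ++k)
                s -= M[IDX(i, k, ldm)] * L[IDX(j, k, ldl)];
            M[IDX(i, j, ldm)] = s / L[IDX(j, j, ldl)];
        }
    }
}

/* Hypothetical CPU reference: overwrite the full n-by-n matrix A with L^{-1} A L^{-T}. */
void ref_sygs2_form1_lower(int n, double *A, int lda, const double *L, int ldb)
{
    solve_lower_left(n, L, ldb, A, lda);
    solve_lower_right_trans(n, L, ldb, A, lda);
}
```

Comparing only the lower triangle of this result against the output of the GPU routine (called with the lower fill mode) avoids depending on the part of A that the routine does not use.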
3.3.3.20. rocsolver_<type>sygs2_batched()#

rocblas_status rocsolver_dsygs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

rocblas_status rocsolver_ssygs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
SYGS2_BATCHED reduces a batch of real symmetric-definite generalized eigenproblems to standard form.
(This is the unblocked version of the algorithm).
For each instance in the batch, the problem solved by this function is either of the form
\[\begin{split} \begin{array}{cl} A_j X_j = \lambda B_j X_j & \: \text{1st form,}\\ A_j B_j X_j = \lambda X_j & \: \text{2nd form, or}\\ B_j A_j X_j = \lambda X_j & \: \text{3rd form,} \end{array} \end{split}\]depending on the value of itype.
If the problem is of the 1st form, then \(A_j\) is overwritten with
\[\begin{split} \begin{array}{cl} U_j^{-T} A_j U_j^{-1}, & \: \text{or}\\ L_j^{-1} A_j L_j^{-T}, \end{array} \end{split}\]where the symmetric-definite matrix \(B_j\) has been factorized as either \(U_j^T U_j\) or \(L_j L_j^T\) as returned by POTRF, depending on the value of uplo.
If the problem is of the 2nd or 3rd form, then \(A_j\) is overwritten with
\[\begin{split} \begin{array}{cl} U_j A_j U_j^T, & \: \text{or}\\ L_j^T A_j L_j, \end{array} \end{split}\]also depending on the value of uplo.
 Parameters:
handle – [in] rocblas_handle.
itype – [in]
rocblas_eform.
Specifies the form of the generalized eigenproblems.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored, and whether the factorization applied to B_j was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_j and B_j are not used.
n – [in]
rocblas_int. n >= 0.
The matrix dimensions.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
B – [out]
array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n.
The triangular factors of the matrices B_j, as returned by POTRF_BATCHED.
ldb – [in]
rocblas_int. ldb >= n.
Specifies the leading dimension of B_j.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
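The batched variant takes an array of pointers, one per matrix. A minimal sketch of how such an array might be filled when the matrices are packed back to back in one allocation is shown below. The helper name is hypothetical, and the sketch works on plain host pointers: obtaining the GPU allocation and placing the pointer array where the library expects it are outside its scope.

```c
#include <stddef.h>

/* Hypothetical helper: set ptrs[j] to the start of the j-th lda-by-n
 * column-major matrix, assuming the matrices are stored contiguously
 * (lda*n elements each) starting at base. */
void fill_batch_pointers(double *base, int lda, int n, int batch_count, double **ptrs)
{
    const size_t elems_per_matrix = (size_t)lda * (size_t)n;
    for (int j = 0; j < batch_count; ++j)
        ptrs[j] = base + (size_t)j * elems_per_matrix;
}
```

The same helper could be used for both A and B, with ldb in place of lda for the latter.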
3.3.3.21. rocsolver_<type>sygs2_strided_batched()#

rocblas_status rocsolver_dsygs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

rocblas_status rocsolver_ssygs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
SYGS2_STRIDED_BATCHED reduces a batch of real symmetric-definite generalized eigenproblems to standard form.
(This is the unblocked version of the algorithm).
For each instance in the batch, the problem solved by this function is either of the form
\[\begin{split} \begin{array}{cl} A_j X_j = \lambda B_j X_j & \: \text{1st form,}\\ A_j B_j X_j = \lambda X_j & \: \text{2nd form, or}\\ B_j A_j X_j = \lambda X_j & \: \text{3rd form,} \end{array} \end{split}\]depending on the value of itype.
If the problem is of the 1st form, then \(A_j\) is overwritten with
\[\begin{split} \begin{array}{cl} U_j^{-T} A_j U_j^{-1}, & \: \text{or}\\ L_j^{-1} A_j L_j^{-T}, \end{array} \end{split}\]where the symmetric-definite matrix \(B_j\) has been factorized as either \(U_j^T U_j\) or \(L_j L_j^T\) as returned by POTRF, depending on the value of uplo.
If the problem is of the 2nd or 3rd form, then \(A_j\) is overwritten with
\[\begin{split} \begin{array}{cl} U_j A_j U_j^T, & \: \text{or}\\ L_j^T A_j L_j, \end{array} \end{split}\]also depending on the value of uplo.
 Parameters:
handle – [in] rocblas_handle.
itype – [in]
rocblas_eform.
Specifies the form of the generalized eigenproblems.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored, and whether the factorization applied to B_j was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_j and B_j are not used.
n – [in]
rocblas_int. n >= 0.
The matrix dimensions.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
B – [out]
pointer to type. Array on the GPU (the size depends on the value of strideB).
The triangular factors of the matrices B_j, as returned by POTRF_STRIDED_BATCHED.
ldb – [in]
rocblas_int. ldb >= n.
Specifies the leading dimension of B_j.
strideB – [in]
rocblas_stride.
Stride from the start of one matrix B_j to the next one B_(j+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*n.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
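The strided layout described above can be made concrete with two small index helpers. This is a sketch under stated assumptions: the function names are hypothetical, indices are 0-based as usual in C, the stride is modeled as a non-negative 64-bit integer, and matrices are column-major.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset (in elements) of entry (i, j) of the b-th matrix in a strided batch. */
size_t strided_index(int i, int j, int lda, int64_t strideA, int b)
{
    return (size_t)b * (size_t)strideA + (size_t)j * (size_t)lda + (size_t)i;
}

/* Smallest element count that holds batch_count matrices of lda*n elements
 * spaced strideA apart, for the normal use case strideA >= lda*n. */
size_t strided_min_elems(int n, int lda, int64_t strideA, int batch_count)
{
    if (batch_count <= 0 || n <= 0)
        return 0;
    return (size_t)(batch_count - 1) * (size_t)strideA + (size_t)lda * (size_t)n;
}
```

With strideA equal to lda*n the matrices are packed with no gap; a larger stride leaves unused elements between consecutive matrices.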
3.3.3.22. rocsolver_<type>hegs2()#

rocblas_status rocsolver_zhegs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb)#

rocblas_status rocsolver_chegs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb)#
HEGS2 reduces a Hermitian-definite generalized eigenproblem to standard form.
(This is the unblocked version of the algorithm).
The problem solved by this function is either of the form
\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]depending on the value of itype.
If the problem is of the 1st form, then A is overwritten with
\[\begin{split} \begin{array}{cl} U^{-H} A U^{-1}, & \: \text{or}\\ L^{-1} A L^{-H}, \end{array} \end{split}\]where the Hermitian-definite matrix B has been factorized as either \(U^H U\) or \(L L^H\) as returned by POTRF, depending on the value of uplo.
If the problem is of the 2nd or 3rd form, then A is overwritten with
\[\begin{split} \begin{array}{cl} U A U^H, & \: \text{or}\\ L^H A L, \end{array} \end{split}\]also depending on the value of uplo.
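For the 2nd and 3rd forms, the equivalence with a standard problem can be sketched as follows for the lower-triangular case \(B = L L^H\). The vector Y is notation introduced only for this sketch.

```latex
\begin{aligned}
\text{2nd form:}\quad A L L^{H} X = \lambda X
  &\;\Longrightarrow\; \left(L^{H} A L\right)\left(L^{H} X\right) = \lambda \left(L^{H} X\right),
  & Y &= L^{H} X \\
\text{3rd form:}\quad L L^{H} A X = \lambda X
  &\;\Longrightarrow\; \left(L^{H} A L\right) Y = \lambda Y,
  & X &= L Y
\end{aligned}
```

The 2nd form follows from multiplying on the left by \(L^H\); the 3rd from substituting \(X = L Y\) and multiplying on the left by \(L^{-1}\). With \(B = U^H U\) the same steps give \(U A U^H\), with \(Y = U X\) for the 2nd form and \(X = U^H Y\) for the 3rd.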
 Parameters:
handle – [in] rocblas_handle.
itype – [in]
rocblas_eform.
Specifies the form of the generalized eigenproblem.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of the matrix A is stored, and whether the factorization applied to B was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.
n – [in]
rocblas_int. n >= 0.
The matrix dimensions.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix A. On exit, the transformed matrix associated with the equivalent standard eigenvalue problem.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
B – [out]
pointer to type. Array on the GPU of dimension ldb*n.
The triangular factor of the matrix B, as returned by POTRF.
ldb – [in]
rocblas_int. ldb >= n.
Specifies the leading dimension of B.
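For small test cases, the 2nd/3rd-form result \(L^H A L\) can be computed directly on the CPU from its definition. The sketch below is only a reference for checking values, not the algorithm used by HEGS2: the function name is hypothetical, it uses C99 complex host arrays in column-major order, it reads both triangles of A, it writes to a separate output C instead of overwriting A, and it costs O(n^4) operations.

```c
#include <complex.h>
#include <stddef.h>

/* Column-major access: element (i, j) of a matrix with leading dimension ld. */
#define IDX(i, j, ld) ((size_t)(j) * (size_t)(ld) + (size_t)(i))

/* Hypothetical CPU reference: C <- L^H A L, with L lower triangular.
 * Entry (i, j) is sum over p >= i, q >= j of conj(L[p,i]) * A[p,q] * L[q,j],
 * since L^H[i,p] = conj(L[p,i]) vanishes for p < i and L[q,j] for q < j. */
void ref_hegs2_form23_lower(int n, const double complex *A, int lda,
                            const double complex *L, int ldb,
                            double complex *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            double complex sum = 0.0;
            for (int p = i; p < n; ++p)
                for (int q = j; q < n; ++q)
                    sum += conj(L[IDX(p, i, ldb)]) * A[IDX(p, q, lda)] * L[IDX(q, j, ldb)];
            C[IDX(i, j, ldc)] = sum;
        }
    }
}
```

Because A is Hermitian, the result C is Hermitian as well, so checking its lower triangle against the GPU output is sufficient when the lower fill mode was used.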
3.3.3.23. rocsolver_<type>hegs2_batched()#

rocblas_status rocsolver_zhegs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

rocblas_status rocsolver_chegs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
HEGS2_BATCHED reduces a batch of Hermitian-definite generalized eigenproblems to standard form.
(This is the unblocked version of the algorithm).
For each instance in the batch, the problem solved by this function is either of the form
\[\begin{split} \begin{array}{cl} A_j X_j = \lambda B_j X_j & \: \text{1st form,}\\ A_j B_j X_j = \lambda X_j & \: \text{2nd form, or}\\ B_j A_j X_j = \lambda X_j & \: \text{3rd form,} \end{array} \end{split}\]depending on the value of itype.
If the problem is of the 1st form, then \(A_j\) is overwritten with
\[\begin{split} \begin{array}{cl} U_j^{-H} A_j U_j^{-1}, & \: \text{or}\\ L_j^{-1} A_j L_j^{-H}, \end{array} \end{split}\]where the Hermitian-definite matrix \(B_j\) has been factorized as either \(U_j^H U_j\) or \(L_j L_j^H\) as returned by POTRF, depending on the value of uplo.
If the problem is of the 2nd or 3rd form, then \(A_j\) is overwritten with
\[\begin{split} \begin{array}{cl} U_j A_j U_j^H, & \: \text{or}\\ L_j^H A_j L_j, \end{array} \end{split}\]also depending on the value of uplo.
 Parameters:
handle – [in] rocblas_handle.
itype – [in]
rocblas_eform.
Specifies the form of the generalized eigenproblems.
uplo – [in]
rocblas_fill.
Specifies whether the upper or lower part of each matrix A_j is stored, and whether the factorization applied to B_j was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_j and B_j are not used.
n – [in]
rocblas_int. n >= 0.
The matrix dimensions.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A_j.
B – [out]
array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n.
The triangular factors of the matrices B_j, as returned by POTRF_BATCHED.
ldb – [in]
rocblas_int. ldb >= n.
Specifies the leading dimension of B_j.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
3.3.3.24. rocsolver_<type>hegs2_strided_batched()#
 rocblas_status rocsolver_zhegs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_in