3.4. Lapack-like Functions¶
Other Lapack-like routines provided by rocSOLVER. These are divided into the following subcategories:
Triangular factorizations. Based on Gaussian elimination.
Linear-systems solvers. Based on triangular factorizations.
Note
Throughout the APIs’ descriptions, we use the following notations:
x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.
3.4.1. Triangular factorizations¶
List of Lapack-like triangular factorizations
rocsolver_<type>getf2_npvt()¶
-
rocblas_status
rocsolver_zgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetf2_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ GETF2_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = LU \]where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2 routines instead.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
rocsolver_<type>getf2_npvt_batched()¶
-
rocblas_status
rocsolver_zgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetf2_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_BATCHED routines instead.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_j in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getf2_npvt_strided_batched()¶
-
rocblas_status
rocsolver_zgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetf2_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_STRIDED_BATCHED routines instead.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf_npvt()¶
-
rocblas_status
rocsolver_zgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetrf_npvt
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ GETRF_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = LU \]where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF routines instead.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of the matrix A.
[in] n
: rocblas_int. n >= 0.The number of columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
rocsolver_<type>getrf_npvt_batched()¶
-
rocblas_status
rocsolver_zgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetrf_npvt_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_BATCHED routines instead.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_j in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getrf_npvt_strided_batched()¶
-
rocblas_status
rocsolver_zgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetrf_npvt_strided_batched
(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_STRIDED_BATCHED routines instead.
- Parameters
[in] handle
: rocblas_handle.[in] m
: rocblas_int. m >= 0.The number of rows of all matrices A_j in the batch.
[in] n
: rocblas_int. n >= 0.The number of columns of all matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
[in] lda
: rocblas_int. lda >= m.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
3.4.2. Linear-systems solvers¶
List of Lapack-like linear solvers
rocsolver_<type>getri_npvt()¶
-
rocblas_status
rocsolver_zgetri_npvt
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetri_npvt
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetri_npvt
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetri_npvt
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶ GETRI_NPVT inverts a general n-by-n matrix A using the LU factorization computed by GETRF_NPVT.
The inverse is computed by solving the linear system
\[ A^{-1}L = U^{-1} \]where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[inout] A
: pointer to type. Array on the GPU of dimension lda*n.On entry, the factors L and U of the factorization A = L*U returned by
GETRF_NPVT. On exit, the inverse of A if info = 0; otherwise undefined.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of A.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
rocsolver_<type>getri_npvt_batched()¶
-
rocblas_status
rocsolver_zgetri_npvt_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetri_npvt_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetri_npvt_batched
(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetri_npvt_batched
(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_NPVT_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_BATCHED.
The inverse of matrix \(A_j\) in the batch is computed by solving the linear system
\[ A_j^{-1} L_j = U_j^{-1} \]where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[inout] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.On entry, the factors L_j and U_j of the factorization A = L_j*U_j returned by
GETRF_NPVT_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getri_npvt_strided_batched()¶
-
rocblas_status
rocsolver_zgetri_npvt_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetri_npvt_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetri_npvt_strided_batched
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetri_npvt_strided_batched
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_NPVT_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_STRIDED_BATCHED.
The inverse of matrix \(A_j\) in the batch is computed by solving the linear system
\[ A_j^{-1} L_j = U_j^{-1} \]where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[inout] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_STRIDED_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getri_outofplace()¶
-
rocblas_status
rocsolver_zgetri_outofplace
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetri_outofplace
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetri_outofplace
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, double *C, const rocblas_int ldc, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetri_outofplace
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, float *C, const rocblas_int ldc, rocblas_int *info)¶ GETRI_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A.
The inverse is computed by solving the linear system
\[ AC = I \]where I is the identity matrix, and A is factorized as \(A = PLU\) as given by GETRF.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[in] A
: pointer to type. Array on the GPU of dimension lda*n.The factors L and U of the factorization A = P*L*U returned by
GETRF.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of A.
[in] ipiv
: pointer to rocblas_int. Array on the GPU of dimension n.The pivot indices returned by
GETRF.[out] C
: pointer to type. Array on the GPU of dimension ldc*n.If info = 0, the inverse of A. Otherwise, undefined.
[in] ldc
: rocblas_int. ldc >= n.Specifies the leading dimension of C.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
rocsolver_<type>getri_outofplace_batched()¶
-
rocblas_status
rocsolver_zgetri_outofplace_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetri_outofplace_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetri_outofplace_batched
(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetri_outofplace_batched
(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).
The inverse is computed by solving the linear system
\[ A_j C_j = I \]where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_BATCHED.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[in] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by
GETRF_BATCHED.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).The pivot indices returned by
GETRF_BATCHED.[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(i+j). There is no restriction for the value of strideP. Normal use case is strideP >= n.
[out] C
: array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc
: rocblas_int. ldc >= n.Specifies the leading dimension of C_j.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getri_outofplace_strided_batched()¶
-
rocblas_status
rocsolver_zgetri_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetri_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetri_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetri_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).
The inverse is computed by solving the linear system
\[ A_j C_j = I \]where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_STRIDED_BATCHED.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[in] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by
GETRF_STRIDED_BATCHED.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[in] ipiv
: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).The pivot indices returned by
GETRF_STRIDED_BATCHED.[in] strideP
: rocblas_stride.Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
[out] C
: pointer to type. Array on the GPU (the size depends on the value of strideC).If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc
: rocblas_int. ldc >= n.Specifies the leading dimension of C_j.
[in] strideC
: rocblas_stride.Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getri_npvt_outofplace()¶
-
rocblas_status
rocsolver_zgetri_npvt_outofplace
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)¶
-
rocblas_status
rocsolver_cgetri_npvt_outofplace
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)¶
-
rocblas_status
rocsolver_dgetri_npvt_outofplace
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, double *C, const rocblas_int ldc, rocblas_int *info)¶
-
rocblas_status
rocsolver_sgetri_npvt_outofplace
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, float *C, const rocblas_int ldc, rocblas_int *info)¶ GETRI_NPVT_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A without partial pivoting.
The inverse is computed by solving the linear system
\[ AC = I \]where I is the identity matrix, and A is factorized as \(A = LU\) as given by GETRF_NPVT.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of the matrix A.
[in] A
: pointer to type. Array on the GPU of dimension lda*n.The factors L and U of the factorization A = L*U returned by
GETRF_NPVT.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of A.
[out] C
: pointer to type. Array on the GPU of dimension ldc*n.If info = 0, the inverse of A. Otherwise, undefined.
[in] ldc
: rocblas_int. ldc >= n.Specifies the leading dimension of C.
[out] info
: pointer to a rocblas_int on the GPU.If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
rocsolver_<type>getri_npvt_outofplace_batched()¶
-
rocblas_status
rocsolver_zgetri_npvt_outofplace_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetri_npvt_outofplace_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetri_npvt_outofplace_batched
(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetri_npvt_outofplace_batched
(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_NPVT_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.
The inverse is computed by solving the linear system
\[ A_j C_j = I \]where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_BATCHED.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[in] A
: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.The factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_BATCHED.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[out] C
: array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc
: rocblas_int. ldc >= n.Specifies the leading dimension of C_j.
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.
rocsolver_<type>getri_npvt_outofplace_strided_batched()¶
-
rocblas_status
rocsolver_zgetri_npvt_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_cgetri_npvt_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_dgetri_npvt_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶
-
rocblas_status
rocsolver_sgetri_npvt_outofplace_strided_batched
(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶ GETRI_NPVT_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.
The inverse is computed by solving the linear system
\[ A_j C_j = I \]where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_STRIDED_BATCHED.
- Parameters
[in] handle
: rocblas_handle.[in] n
: rocblas_int. n >= 0.The number of rows and columns of all matrices A_j in the batch.
[in] A
: pointer to type. Array on the GPU (the size depends on the value of strideA).The factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_STRIDED_BATCHED.[in] lda
: rocblas_int. lda >= n.Specifies the leading dimension of matrices A_j.
[in] strideA
: rocblas_stride.Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] C
: pointer to type. Array on the GPU (the size depends on the value of strideC).If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc
: rocblas_int. ldc >= n.Specifies the leading dimension of C_j.
[in] strideC
: rocblas_stride.Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n
[out] info
: pointer to rocblas_int. Array of batch_count integers on the GPU.If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count
: rocblas_int. batch_count >= 0.Number of matrices in the batch.