# 3.2. LAPACK Auxiliary Functions¶

These are functions that support more advanced LAPACK routines. The auxiliary functions are divided into the following categories:

Note

Throughout the APIs’ descriptions, we use the following notations:

• x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x is the first element of x.

• If X is a real vector or matrix, $$X^T$$ indicates its transpose; if X is complex, then $$X^H$$ represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

• x_i $$=x_i$$; we sometimes use both notations, $$x_i$$ when displaying mathematical equations, and x_i in the text describing the function parameters.

## 3.2.1. Vector and Matrix manipulations¶

List of vector and matrix manipulations

### rocsolver_<type>lacgv()¶

rocblas_status rocsolver_zlacgv(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx)
rocblas_status rocsolver_clacgv(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx)

LACGV conjugates the complex vector x.

It conjugates the n entries of a complex vector x with increment incx.

Parameters
• [in] handle: rocblas_handle.

• [in] n: rocblas_int. n >= 0.

The dimension of vector x.

• [inout] x: pointer to type. Array on the GPU of size at least n (size depends on the value of incx).

On entry, the vector x. On exit, each entry is overwritten with its conjugate value.

• [in] incx: rocblas_int. incx != 0.

The distance between two consecutive elements of x. If incx is negative, the elements of x are indexed in reverse order.

### rocsolver_<type>laswp()¶

rocblas_status rocsolver_zlaswp(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)
rocblas_status rocsolver_claswp(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)
rocblas_status rocsolver_dlaswp(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)
rocblas_status rocsolver_slaswp(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)

LASWP performs a series of row interchanges on the matrix A.

Row interchanges are done one by one. If $$\text{ipiv}[k_1 + (j - k_1) \cdot \text{abs}(\text{incx})] = r$$, then the j-th row of A will be interchanged with the r-th row of A, for $$j = k_1,k_1+1,\dots,k_2$$. Indices $$k_1$$ and $$k_2$$ are 1-based indices.

Parameters
• [in] handle: rocblas_handle.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix A.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix to which the row interchanges will be applied. On exit, the resulting permuted matrix.

• [in] lda: rocblas_int. lda > 0.

The leading dimension of the array A.

• [in] k1: rocblas_int. k1 > 0.

The k_1 index. It is the first element of ipiv for which a row interchange will be done. This is a 1-based index.

• [in] k2: rocblas_int. k2 > k1 > 0.

The k_2 index. k_2 - k_1 + 1 is the number of elements of ipiv for which a row interchange will be done. This is a 1-based index.

• [in] ipiv: pointer to rocblas_int. Array on the GPU of dimension at least k_1 + (k_2 - k_1)*abs(incx).

The vector of pivot indices. Only the elements in positions k_1 through k_1 + (k_2 - k_1)*abs(incx) of this vector are accessed. Elements of ipiv are considered 1-based.

• [in] incx: rocblas_int. incx != 0.

The distance between successive values of ipiv. If incx is negative, the pivots are applied in reverse order.

## 3.2.2. Householder reflections¶

List of Householder functions

### rocsolver_<type>larfg()¶

rocblas_status rocsolver_zlarfg(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *alpha, rocblas_double_complex *x, const rocblas_int incx, rocblas_double_complex *tau)
rocblas_status rocsolver_clarfg(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *alpha, rocblas_float_complex *x, const rocblas_int incx, rocblas_float_complex *tau)
rocblas_status rocsolver_dlarfg(rocblas_handle handle, const rocblas_int n, double *alpha, double *x, const rocblas_int incx, double *tau)
rocblas_status rocsolver_slarfg(rocblas_handle handle, const rocblas_int n, float *alpha, float *x, const rocblas_int incx, float *tau)

LARFG generates a Householder reflector H of order n.

The reflector H is such that

$\begin{split} H'\left[\begin{array}{c} \text{alpha}\\ x \end{array}\right]=\left[\begin{array}{c} \text{beta}\\ 0 \end{array}\right] \end{split}$

where x is an n-1 vector, and alpha and beta are scalars. Matrix H can be generated as

$\begin{split} H = I - \text{tau}\left[\begin{array}{c} 1\\ v \end{array}\right]\left[\begin{array}{cc} 1 & v' \end{array}\right] \end{split}$

where v is an n-1 vector, and tau is a scalar known as the Householder scalar. The vector

$\begin{split} \bar{v}=\left[\begin{array}{c} 1\\ v \end{array}\right] \end{split}$

is the Householder vector associated with the reflection.

Note

The matrix H is orthogonal/unitary (i.e. $$H'H=HH'=I$$). It is symmetric when real (i.e. $$H^T=H$$), but not Hermitian when complex (i.e. $$H^H\neq H$$ in general).

Parameters
• [in] handle: rocblas_handle.

• [in] n: rocblas_int. n >= 0.

The order (size) of reflector H.

• [inout] alpha: pointer to type. A scalar on the GPU.

On entry, the scalar alpha. On exit, it is overwritten with beta.

• [inout] x: pointer to type. Array on the GPU of size at least n-1 (size depends on the value of incx).

On entry, the vector x, On exit, it is overwritten with vector v.

• [in] incx: rocblas_int. incx > 0.

The distance between two consecutive elements of x.

• [out] tau: pointer to type. A scalar on the GPU.

The Householder scalar tau.

### rocsolver_<type>larft()¶

rocblas_status rocsolver_zlarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *tau, rocblas_double_complex *T, const rocblas_int ldt)
rocblas_status rocsolver_clarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *tau, rocblas_float_complex *T, const rocblas_int ldt)
rocblas_status rocsolver_dlarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *tau, double *T, const rocblas_int ldt)
rocblas_status rocsolver_slarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *tau, float *T, const rocblas_int ldt)

LARFT generates the triangular factor T of a block reflector H of order n.

The block reflector H is defined as the product of k Householder matrices

$\begin{split} \begin{array}{cl} H = H_1H_2\cdots H_k & \: \text{if direct indicates forward direction, or} \\ H = H_k\cdots H_2H_1 & \: \text{if direct indicates backward direction} \end{array} \end{split}$

The triangular factor T is upper triangular in the forward direction and lower triangular in the backward direction. If storev is column-wise, then

$H = I - VTV'$

where the i-th column of matrix V contains the Householder vector associated with $$H_i$$. If storev is row-wise, then

$H = I - V'TV$

where the i-th row of matrix V contains the Householder vector associated with $$H_i$$.

Parameters
• [in] handle: rocblas_handle.

• [in] direct: rocblas_direct.

Specifies the direction in which the Householder matrices are applied.

• [in] storev: rocblas_storev.

Specifies how the Householder vectors are stored in matrix V.

• [in] n: rocblas_int. n >= 0.

The order (size) of the block reflector.

• [in] k: rocblas_int. k >= 1.

The number of Householder matrices forming H.

• [in] V: pointer to type. Array on the GPU of size ldv*k if column-wise, or ldv*n if row-wise.

The matrix of Householder vectors.

• [in] ldv: rocblas_int. ldv >= n if column-wise, or ldv >= k if row-wise.

• [in] tau: pointer to type. Array of k scalars on the GPU.

The vector of all the Householder scalars.

• [out] T: pointer to type. Array on the GPU of dimension ldt*k.

The triangular factor. T is upper triangular if direct indicates forward direction, otherwise it is lower triangular. The rest of the array is not used.

• [in] ldt: rocblas_int. ldt >= k.

### rocsolver_<type>larf()¶

rocblas_status rocsolver_zlarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx, const rocblas_double_complex *alpha, rocblas_double_complex *A, const rocblas_int lda)
rocblas_status rocsolver_clarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx, const rocblas_float_complex *alpha, rocblas_float_complex *A, const rocblas_int lda)
rocblas_status rocsolver_dlarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, double *x, const rocblas_int incx, const double *alpha, double *A, const rocblas_int lda)
rocblas_status rocsolver_slarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, float *x, const rocblas_int incx, const float *alpha, float *A, const rocblas_int lda)

LARF applies a Householder reflector H to a general matrix A.

The Householder reflector H, of order m or n, is to be applied to an m-by-n matrix A from the left or the right, depending on the value of side. H is given by

$H = I - \text{alpha}\cdot xx'$

where alpha is the Householder scalar and x is a Householder vector. H is never actually computed.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Determines whether H is applied from the left or the right.

• [in] m: rocblas_int. m >= 0.

Number of rows of A.

• [in] n: rocblas_int. n >= 0.

Number of columns of A.

• [in] x: pointer to type. Array on the GPU of size at least 1 + (m-1)*abs(incx) if left side, or at least 1 + (n-1)*abs(incx) if right side.

The Householder vector x.

• [in] incx: rocblas_int. incx != 0.

Distance between two consecutive elements of x. If incx < 0, the elements of x are indexed in reverse order.

• [in] alpha: pointer to type. A scalar on the GPU.

The Householder scalar. If alpha = 0, then H = I (A will remain the same; x is never used)

• [inout] A: pointer to type. Array on the GPU of size lda*n.

On entry, the matrix A. On exit, it is overwritten with H*A (or A*H).

• [in] lda: rocblas_int. lda >= m.

### rocsolver_<type>larfb()¶

rocblas_status rocsolver_zlarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *T, const rocblas_int ldt, rocblas_double_complex *A, const rocblas_int lda)
rocblas_status rocsolver_clarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *T, const rocblas_int ldt, rocblas_float_complex *A, const rocblas_int lda)
rocblas_status rocsolver_dlarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *T, const rocblas_int ldt, double *A, const rocblas_int lda)
rocblas_status rocsolver_slarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *T, const rocblas_int ldt, float *A, const rocblas_int lda)

LARFB applies a block reflector H to a general m-by-n matrix A.

The block reflector H is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} HA & \: \text{(No transpose from the left),}\\ H'A & \: \text{(Transpose or conjugate transpose from the left),}\\ AH & \: \text{(No transpose from the right), or}\\ AH' & \: \text{(Transpose or conjugate transpose from the right).} \end{array} \end{split}$

The block reflector H is defined as the product of k Householder matrices as

$\begin{split} \begin{array}{cl} H = H_1H_2\cdots H_k & \: \text{if direct indicates forward direction, or} \\ H = H_k\cdots H_2H_1 & \: \text{if direct indicates backward direction} \end{array} \end{split}$

H is never stored. It is calculated as

$H = I - VTV'$

where the i-th column of matrix V contains the Householder vector associated with $$H_i$$, if storev is column-wise; or

$H = I - V'TV$

where the i-th row of matrix V contains the Householder vector associated with $$H_i$$, if storev is row-wise. T is the associated triangular factor as computed by LARFT.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply H.

• [in] trans: rocblas_operation.

Specifies whether the block reflector or its transpose/conjugate transpose is to be applied.

• [in] direct: rocblas_direct.

Specifies the direction in which the Householder matrices are to be applied to generate H.

• [in] storev: rocblas_storev.

Specifies how the Householder vectors are stored in matrix V.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix A.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix A.

• [in] k: rocblas_int. k >= 1.

The number of Householder matrices.

• [in] V: pointer to type. Array on the GPU of size ldv*k if column-wise, ldv*n if row-wise and applying from the right, or ldv*m if row-wise and applying from the left.

The matrix of Householder vectors.

• [in] ldv: rocblas_int. ldv >= k if row-wise, ldv >= m if column-wise and applying from the left, or ldv >= n if column-wise and applying from the right.

• [in] T: pointer to type. Array on the GPU of dimension ldt*k.

The triangular factor of the block reflector.

• [in] ldt: rocblas_int. ldt >= k.

• [inout] A: pointer to type. Array on the GPU of size lda*n.

On entry, the matrix A. On exit, it is overwritten with H*A, A*H, H’*A, or A*H’.

• [in] lda: rocblas_int. lda >= m.

## 3.2.3. Bidiagonal forms¶

List of functions for bidiagonal forms

### rocsolver_<type>labrd()¶

rocblas_status rocsolver_zlabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup, rocblas_double_complex *X, const rocblas_int ldx, rocblas_double_complex *Y, const rocblas_int ldy)
rocblas_status rocsolver_clabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup, rocblas_float_complex *X, const rocblas_int ldx, rocblas_float_complex *Y, const rocblas_int ldy)
rocblas_status rocsolver_dlabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup, double *X, const rocblas_int ldx, double *Y, const rocblas_int ldy)
rocblas_status rocsolver_slabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup, float *X, const rocblas_int ldx, float *Y, const rocblas_int ldy)

LABRD computes the bidiagonal form of the first k rows and columns of a general m-by-n matrix A, as well as the matrices X and Y needed to reduce the remaining part of A.

The reduced form is given by:

$B = Q'AP$

where the leading k-by-k block of B is upper bidiagonal if m >= n, or lower bidiagonal if m < n. Q and P are orthogonal/unitary matrices represented as the product of Householder matrices

$\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_k, & \text{and} \\ P = G_1G_2\cdots G_k. \end{array} \end{split}$

Each Householder matrix $$H_i$$ and $$G_i$$ is given by

$\begin{split} \begin{array}{cl} H_i = I - \text{tauq}[i]\cdot v_iv_i', & \text{and} \\ G_i = I - \text{taup}[i]\cdot u_iu_i'. \end{array} \end{split}$

If m >= n, the first i-1 elements of the Householder vector $$v_i$$ are zero, and $$v_i[i]=1$$; while the first i elements of the Householder vector $$u_i$$ are zero, and $$u_i[i+1]=1$$. If m < n, the first i elements of the Householder vector $$v_i$$ are zero, and $$v_i[i+1]=1$$; while the first i-1 elements of the Householder vector $$u_i$$ are zero, and $$u_i[i]=1$$.

The unreduced part of the matrix A can be updated using the block update

$A = A - VY' - XU'$

where V and U are the m-by-k and n-by-k matrices formed with the vectors $$v_i$$ and $$u_i$$, respectively.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix A.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix A.

• [in] k: rocblas_int. min(m,n) >= k >= 0.

The number of leading rows and columns of matrix A that will be reduced.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the m-by-n matrix to be reduced. On exit, the first k elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n), contain the bidiagonal form B. If m >= n, the elements below the diagonal of the first k columns are the possibly non-zero elements of the Householder vectors associated with Q, while the elements above the superdiagonal of the first k rows are the n - i - 1 possibly non-zero elements of the Householder vectors related to P. If m < n, the elements below the subdiagonal of the first k columns are the m - i - 1 possibly non-zero elements of the Householder vectors related to Q, while the elements above the diagonal of the first k rows are the n - i possibly non-zero elements of the vectors associated with P.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [out] D: pointer to real type. Array on the GPU of dimension k.

The diagonal elements of B.

• [out] E: pointer to real type. Array on the GPU of dimension k.

The off-diagonal elements of B.

• [out] tauq: pointer to type. Array on the GPU of dimension k.

The Householder scalars associated with matrix Q.

• [out] taup: pointer to type. Array on the GPU of dimension k.

The Householder scalars associated with matrix P.

• [out] X: pointer to type. Array on the GPU of dimension ldx*k.

The m-by-k matrix needed to update the unreduced part of A.

• [in] ldx: rocblas_int. ldx >= m.

• [out] Y: pointer to type. Array on the GPU of dimension ldy*k.

The n-by-k matrix needed to update the unreduced part of A.

• [in] ldy: rocblas_int. ldy >= n.

### rocsolver_<type>bdsqr()¶

rocblas_status rocsolver_zbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_cbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, double *V, const rocblas_int ldv, double *U, const rocblas_int ldu, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_sbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, float *V, const rocblas_int ldv, float *U, const rocblas_int ldu, float *C, const rocblas_int ldc, rocblas_int *info)

BDSQR computes the singular value decomposition (SVD) of an n-by-n bidiagonal matrix B, using the implicit QR algorithm.

The SVD of B has the form:

$B = QSP'$

where S is the n-by-n diagonal matrix of singular values of B, the columns of Q are the left singular vectors of B, and the columns of P are its right singular vectors.

The computation of the singular vectors is optional; this function accepts input matrices U (of size nu-by-n) and V (of size n-by-nv) that are overwritten with $$UQ$$ and $$P'V$$. If nu = 0 no left vectors are computed; if nv = 0 no right vectors are computed.

Optionally, this function can also compute $$Q'C$$ for a given n-by-nc input matrix C.

Parameters
• [in] handle: rocblas_handle.

• [in] uplo: rocblas_fill.

Specifies whether B is upper or lower bidiagonal.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of matrix B.

• [in] nv: rocblas_int. nv >= 0.

The number of columns of matrix V.

• [in] nu: rocblas_int. nu >= 0.

The number of rows of matrix U.

• [in] nc: rocblas_int. nu >= 0.

The number of columns of matrix C.

• [inout] D: pointer to real type. Array on the GPU of dimension n.

On entry, the diagonal elements of B. On exit, if info = 0, the singular values of B in decreasing order; if info > 0, the diagonal elements of a bidiagonal matrix orthogonally equivalent to B.

• [inout] E: pointer to real type. Array on the GPU of dimension n-1.

On entry, the off-diagonal elements of B. On exit, if info > 0, the off-diagonal elements of a bidiagonal matrix orthogonally equivalent to B (if info = 0 this matrix converges to zero).

• [inout] V: pointer to type. Array on the GPU of dimension ldv*nv.

On entry, the matrix V. On exit, it is overwritten with P’*V. (Not referenced if nv = 0).

• [in] ldv: rocblas_int. ldv >= n if nv > 0, or ldv >=1 if nv = 0.

• [inout] U: pointer to type. Array on the GPU of dimension ldu*n.

On entry, the matrix U. On exit, it is overwritten with U*Q. (Not referenced if nu = 0).

• [in] ldu: rocblas_int. ldu >= nu.

• [inout] C: pointer to type. Array on the GPU of dimension ldc*nc.

On entry, the matrix C. On exit, it is overwritten with Q’*C. (Not referenced if nc = 0).

• [in] ldc: rocblas_int. ldc >= n if nc > 0, or ldc >=1 if nc = 0.

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, i elements of E have not converged to zero.

### rocsolver_<type>bdsvdx()¶

rocblas_status rocsolver_dbdsvdx(rocblas_handle handle, const rocblas_fill uplo, const rocblas_svect svect, const rocblas_srange srange, const rocblas_int n, double *D, double *E, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, double *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)
rocblas_status rocsolver_sbdsvdx(rocblas_handle handle, const rocblas_fill uplo, const rocblas_svect svect, const rocblas_srange srange, const rocblas_int n, float *D, float *E, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, float *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)

BDSVDX computes a set of singular values of a bidiagonal matrix B.

This function computes all the singular values of B, all the singular values in the half-open interval $$[vl, vu)$$, or the il-th through iu-th singular values, depending on the value of srange.

Depending on the value of svect, the corresponding singular vectors will be computed and stored as blocks in the output matrix Z. That is,

$\begin{split} Z = \left[\begin{array}{c} U\\ V \end{array}\right] \end{split}$

where U contains the corresponding left singular vectors of B, and V contains the corresponding right singular vectors.

Parameters
• [in] handle: rocblas_handle.

• [in] uplo: rocblas_fill.

Specifies whether B is upper or lower bidiagonal.

• [in] svect: rocblas_svect.

Specifies how the singular vectors are computed. Only rocblas_svect_none and rocblas_svect_singular are accepted.

• [in] srange: rocblas_srange.

Specifies the type of range or interval of the singular values to be computed.

• [in] n: rocblas_int. n >= 0.

The order of the bidiagonal matrix B.

• [in] D: pointer to real type. Array on the GPU of dimension n.

The diagonal elements of the bidiagonal matrix.

• [in] E: pointer to real type. Array on the GPU of dimension n-1.

The off-diagonal elements of the bidiagonal matrix.

• [in] vl: real type. 0 <= vl < vu.

The lower bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of B or the singular values within a set of indices.

• [in] vu: real type. 0 <= vl < vu.

The upper bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of B or the singular values within a set of indices.

• [in] il: rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise.

The index of the largest singular value to be computed. Ignored if srange indicates to look for all the singular values of B or the singular values in a half-open interval.

• [in] iu: rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise.

The index of the smallest singular value to be computed. Ignored if srange indicates to look for all the singular values of B or the singular values in a half-open interval.

• [out] nsv: pointer to a rocblas_int on the GPU.

The total number of singular values found. If srange is rocblas_srange_all, nsv = n. If srange is rocblas_srange_index, nsv = iu - il + 1. Otherwise, 0 <= nsv <= n.

• [out] S: pointer to real type. Array on the GPU of dimension n.

The first nsv elements contain the computed singular values in descending order.

• [out] Z: pointer to real type. Array on the GPU of dimension ldz*nsv.

If info = 0, the first nsv columns contain the computed singular vectors corresponding to the singular values in S. The first n rows of Z contain the matrix U, and the next n rows contain the matrix V. Not referenced if svect is rocblas_svect_none. Note: If srange is rocblas_srange_value, then the values of nsv are not known in advance. The user should ensure that Z is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

• [in] ldz: rocblas_int. ldz >= 2*n if svect is rocblas_svect_singular; ldz >= 1 otherwise.

Specifies the leading dimension of Z.

• [out] ifail: pointer to rocblas_int. Array on the GPU of dimension n.

If info = 0, the first nsv elements of ifail are zero. Otherwise, contains the indices of those eigenvectors that failed to converge, as returned by

STEIN. Not referenced if svect is rocblas_svect_none.

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, i eigenvectors did not converge in

STEIN; their indices are stored in ifail.

## 3.2.4. Tridiagonal forms¶

List of functions for tridiagonal forms

### rocsolver_<type>latrd()¶

rocblas_status rocsolver_zlatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *E, rocblas_double_complex *tau, rocblas_double_complex *W, const rocblas_int ldw)
rocblas_status rocsolver_clatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *E, rocblas_float_complex *tau, rocblas_float_complex *W, const rocblas_int ldw)
rocblas_status rocsolver_dlatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *E, double *tau, double *W, const rocblas_int ldw)
rocblas_status rocsolver_slatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *E, float *tau, float *W, const rocblas_int ldw)

LATRD computes the tridiagonal form of k rows and columns of a symmetric/hermitian matrix A, as well as the matrix W needed to update the remaining part of A.

The reduced form is given by:

$T = Q'AQ$

If uplo is lower, the first k rows and columns of T form the tridiagonal block. If uplo is upper, then the last k rows and columns of T form the tridiagonal block. Q is an orthogonal/unitary matrix represented as the product of Householder matrices

$\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_k & \text{if uplo indicates lower, or}\\ Q = H_nH_{n-1}\cdots H_{n-k+1} & \text{if uplo is upper}. \end{array} \end{split}$

Each Householder matrix $$H_i$$ is given by

$H_i = I - \text{tau}[i]\cdot v_iv_i'$

where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector $$v_i$$ are zero, and $$v_i[i+1] = 1$$. If uplo is upper, the last n-i elements of the Householder vector $$v_i$$ are zero, and $$v_i[i] = 1$$.

The unreduced part of the matrix A can be updated using a rank update of the form:

$A = A - VW' - WV'$

where V is the n-by-k matrix formed by the vectors $$v_i$$.

Parameters
• [in] handle: rocblas_handle.

• [in] uplo: rocblas_fill.

Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the matrix A.

• [in] k: rocblas_int. 0 <= k <= n.

The number of rows and columns of the matrix A to be reduced.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the n-by-n matrix to be reduced. On exit, if uplo is lower, the first k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements below the diagonal contain the possibly non-zero entries of the Householder vectors associated with Q, stored as columns. If uplo is upper, the last k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements above the diagonal contain the possibly non-zero entries of the Householder vectors associated with Q, stored as columns.

• [in] lda: rocblas_int. lda >= n.

• [out] E: pointer to real type. Array on the GPU of dimension n-1.

If upper (lower), the last (first) k elements of E are the off-diagonal elements of the computed tridiagonal block.

• [out] tau: pointer to type. Array on the GPU of dimension n-1.

If upper (lower), the last (first) k elements of tau are the Householder scalars related to Q.

• [out] W: pointer to type. Array on the GPU of dimension ldw*k.

The n-by-k matrix needed to update the unreduced part of A.

• [in] ldw: rocblas_int. ldw >= n.

### rocsolver_<type>sterf()¶

rocblas_status rocsolver_dsterf(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *info)
rocblas_status rocsolver_ssterf(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *info)

STERF computes the eigenvalues of a symmetric tridiagonal matrix.

The eigenvalues of the symmetric tridiagonal matrix are computed by the Pal-Walker-Kahan variant of the QL/QR algorithm, and returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E.

Parameters
• [in] handle: rocblas_handle.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the tridiagonal matrix.

• [inout] D: pointer to real type. Array on the GPU of dimension n.

On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

• [inout] E: pointer to real type. Array on the GPU of dimension n-1.

On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, this array converges to zero. If info > 0, the off-diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, STERF did not converge. i elements of E did not converge to zero.

### rocsolver_<type>stebz()¶

rocblas_status rocsolver_dstebz(rocblas_handle handle, const rocblas_erange erange, const rocblas_eorder eorder, const rocblas_int n, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, double *D, double *E, rocblas_int *nev, rocblas_int *nsplit, double *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_int *info)
rocblas_status rocsolver_sstebz(rocblas_handle handle, const rocblas_erange erange, const rocblas_eorder eorder, const rocblas_int n, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, float *D, float *E, rocblas_int *nev, rocblas_int *nsplit, float *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_int *info)

STEBZ computes a set of eigenvalues of a symmetric tridiagonal matrix T.

This function computes all the eigenvalues of T, all the eigenvalues in the half-open interval $$(vl, vu]$$, or the il-th through iu-th eigenvalues, depending on the value of erange.

The eigenvalues are returned in increasing order either for the entire matrix, or grouped by independent diagonal blocks (if they exist), depending on the value of eorder.

Parameters
• [in] handle: rocblas_handle.

• [in] erange: rocblas_erange.

Specifies the type of range or interval of the eigenvalues to be computed.

• [in] eorder: rocblas_eorder.

Specifies whether the computed eigenvalues will be ordered by their position in the entire spectrum, or grouped by independent diagonal (split off) blocks.

• [in] n: rocblas_int. n >= 0.

The order of the tridiagonal matrix T.

• [in] vl: real type. vl < vu.

The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues within a set of indices.

• [in] vu: real type. vl < vu.

The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues within a set of indices.

• [in] il: rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise.

The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues in a half-open interval.

• [in] iu: rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise.

The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues in a half-open interval.

• [in] abstol: real type.

The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of T will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

• [in] D: pointer to real type. Array on the GPU of dimension n.

The diagonal elements of the tridiagonal matrix.

• [in] E: pointer to real type. Array on the GPU of dimension n-1.

The off-diagonal elements of the tridiagonal matrix.

• [out] nev: pointer to a rocblas_int on the GPU.

The total number of eigenvalues found.

• [out] nsplit: pointer to a rocblas_int on the GPU.

The number of split off blocks in the matrix.

• [out] W: pointer to real type. Array on the GPU of dimension n.

The first nev elements contain the computed eigenvalues. (The remaining elements may be used as workspace for internal computations).

• [out] iblock: pointer to rocblas_int. Array on the GPU of dimension n.

The block indices corresponding to each eigenvalue. When matrix T has split off blocks (nsplit > 1), then if iblock[i] = k, the eigenvalue W[i] belongs to the k-th diagonal block from the top.

• [out] isplit: pointer to rocblas_int. Array on the GPU of dimension n.

The splitting indices that divide the tridiagonal matrix into diagonal blocks. The k-th block stretches from the end of the (k-1)-th block (or the top left corner of the tridiagonal matrix, in the case of the 1st block) to the isplit[k]-th row/column.

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = 1, the bisection did not converge for some eigenvalues, i.e. the returned values are not as accurate as the given tolerance. The non-converged eigenvalues are flagged by negative entries in iblock.

### rocsolver_<type>steqr()¶

rocblas_status rocsolver_zsteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_csteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dsteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_ssteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)

STEQR computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.

The eigenvalues of the symmetric tridiagonal matrix are computed by the implicit QL/QR algorithm, and returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. When D and E correspond to the tridiagonal form of a full symmetric/Hermitian matrix, as returned by, e.g., SYTRD or HETRD, the eigenvectors of the original matrix can also be computed, depending on the value of evect.

Parameters
• [in] handle: rocblas_handle.

• [in] evect: rocblas_evect.

Specifies how the eigenvectors are computed.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the tridiagonal matrix.

• [inout] D: pointer to real type. Array on the GPU of dimension n.

On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

• [inout] E: pointer to real type. Array on the GPU of dimension n-1.

On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, this array converges to zero. If info > 0, the off-diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

• [inout] C: pointer to type. Array on the GPU of dimension ldc*n.

On entry, if evect is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g.,

ORGTR or UNGTR. On exit, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if evect is original), or the eigenvectors of the tridiagonal matrix (if evect is tridiagonal). (Not referenced if evect is none).

• [in] ldc: rocblas_int. ldc >= n if evect is original or tridiagonal.

Specifies the leading dimension of C. (Not referenced if evect is none).

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, STEQR did not converge. i elements of E did not converge to zero.

### rocsolver_<type>stedc()¶

rocblas_status rocsolver_zstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_cstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_sstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)

STEDC computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.

This function uses the divide and conquer method to compute the eigenvectors. The eigenvalues are returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. When D and E correspond to the tridiagonal form of a full symmetric/Hermitian matrix, as returned by, e.g., SYTRD or HETRD, the eigenvectors of the original matrix can also be computed, depending on the value of evect.

Parameters
• [in] handle: rocblas_handle.

• [in] evect: rocblas_evect.

Specifies how the eigenvectors are computed.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the tridiagonal matrix.

• [inout] D: pointer to real type. Array on the GPU of dimension n.

On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order.

• [inout] E: pointer to real type. Array on the GPU of dimension n-1.

On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, the values of this array are destroyed.

• [inout] C: pointer to type. Array on the GPU of dimension ldc*n.

On entry, if evect is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g.,

ORGTR or UNGTR. On exit, if info = 0, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if evect is original), or the eigenvectors of the tridiagonal matrix (if evect is tridiagonal). (Not referenced if evect is none).

• [in] ldc: rocblas_int. ldc >= n if evect is original or tridiagonal.

Specifies the leading dimension of C. (Not referenced if evect is none).

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, STEDC failed to compute an eigenvalue on the sub-matrix formed by the rows and columns info/(n+1) through mod(info,n+1).

### rocsolver_<type>stein()¶

rocblas_status rocsolver_zstein(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *nev, double *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_double_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)
rocblas_status rocsolver_cstein(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *nev, float *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_float_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)
rocblas_status rocsolver_dstein(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *nev, double *W, rocblas_int *iblock, rocblas_int *isplit, double *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)
rocblas_status rocsolver_sstein(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *nev, float *W, rocblas_int *iblock, rocblas_int *isplit, float *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)

STEIN computes the eigenvectors associated with a set of provided eigenvalues of a symmetric tridiagonal matrix.

The eigenvectors of the symmetric tridiagonal matrix are computed using inverse iteration.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. The eigenvalues must be provided in the array W, as returned by STEBZ.

Parameters
• [in] handle: rocblas_handle.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the tridiagonal matrix.

• [in] D: pointer to real type. Array on the GPU of dimension n.

The diagonal elements of the tridiagonal matrix.

• [in] E: pointer to real type. Array on the GPU of dimension n-1.

The off-diagonal elements of the tridiagonal matrix.

• [in] nev: pointer to a rocblas_int on the GPU. 0 <= nev <= n.

The number of provided eigenvalues, and the number of eigenvectors to be computed.

• [in] W: pointer to real type. Array on the GPU of dimension >= nev.

A subset of nev eigenvalues of the tridiagonal matrix, as returned by

STEBZ.

• [in] iblock: pointer to rocblas_int. Array on the GPU of dimension n.

The block indices corresponding to each eigenvalue, as returned by

STEBZ. If iblock[i] = k, then eigenvalue W[i] belongs to the k-th block from the top.

• [in] isplit: pointer to rocblas_int. Array on the GPU of dimension n.

The splitting indices that divide the tridiagonal matrix into diagonal blocks, as returned by

STEBZ. The k-th block stretches from the end of the (k-1)-th block (or the top left corner of the tridiagonal matrix, in the case of the 1st block) to the isplit[k]-th row/column.

• [out] Z: pointer to type. Array on the GPU of dimension ldz*nev.

On exit, contains the eigenvectors of the tridiagonal matrix corresponding to the provided eigenvalues, stored by columns.

• [in] ldz: rocblas_int. ldz >= n.

Specifies the leading dimension of Z.

• [out] ifail: pointer to rocblas_int. Array on the GPU of dimension n.

If info = 0, the first nev elements of ifail are zero. Otherwise, contains the indices of those eigenvectors that failed to converge.

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, i eigenvectors did not converge; their indices are stored in ifail.

## 3.2.5. Symmetric matrices¶

List of functions for symmetric matrices

### rocsolver_<type>lasyf()¶

rocblas_status rocsolver_zlasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
rocblas_status rocsolver_clasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
rocblas_status rocsolver_dlasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
rocblas_status rocsolver_slasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

LASYF computes a partial factorization of a symmetric matrix $$A$$ using Bunch-Kaufman diagonal pivoting.

The partial factorization has the form

$\begin{split} A = \left[ \begin{array}{cc} I & U_{12} \\ 0 & U_{22} \end{array} \right] \left[ \begin{array}{cc} A_{11} & 0 \\ 0 & D \end{array} \right] \left[ \begin{array}{cc} I & 0 \\ U_{12}^T & U_{22}^T \end{array} \right] \end{split}$

or

$\begin{split} A = \left[ \begin{array}{cc} L_{11} & 0 \\ L_{21} & I \end{array} \right] \left[ \begin{array}{cc} D & 0 \\ 0 & A_{22} \end{array} \right] \left[ \begin{array}{cc} L_{11}^T & L_{21}^T \\ 0 & I \end{array} \right] \end{split}$

depending on the value of uplo. The order of the block diagonal matrix $$D$$ is either $$nb$$ or $$nb-1$$, and is returned in the argument $$kb$$.

Parameters
• [in] handle: rocblas_handle.

• [in] uplo: rocblas_fill.

Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the matrix A.

• [in] nb: rocblas_int. 2 <= nb <= n.

The number of columns of A to be factored.

• [out] kb: pointer to a rocblas_int on the GPU.

The number of columns of A that were actually factored (either nb or nb-1).

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the symmetric matrix A to be factored. On exit, the partially factored matrix.

• [in] lda: rocblas_int. lda >= n.

Specifies the leading dimension of A.

• [out] ipiv: pointer to rocblas_int. Array on the GPU of dimension n.

The vector of pivot indices. Elements of ipiv are 1-based indices. If uplo is upper, then only the last kb elements of ipiv will be set. For n - kb < k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0, then rows and columns k-1 and -ipiv[k] were interchanged and D[k-1,k-1] to D[k,k] is a 2-by-2 diagonal block. If uplo is lower, then only the first kb elements of ipiv will be set. For 1 <= k <= kb, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k+1] < 0, then rows and columns k+1 and -ipiv[k] were interchanged and D[k,k] to D[k+1,k+1] is a 2-by-2 diagonal block.

• [out] info: pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.

## 3.2.6. Orthonormal matrices¶

### rocsolver_<type>org2r()¶

rocblas_status rocsolver_dorg2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorg2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORG2R generates an m-by-n Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

$Q = H_1H_2\cdots H_k.$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

### rocsolver_<type>orgqr()¶

rocblas_status rocsolver_dorgqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGQR generates an m-by-n Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

$Q = H_1H_2\cdots H_k$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

### rocsolver_<type>orgl2()¶

rocblas_status rocsolver_dorgl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGL2 generates an m-by-n Matrix Q with orthonormal rows.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

$Q = H_kH_{k-1}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. 0 <= m <= n.

The number of rows of the matrix Q.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= m.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

### rocsolver_<type>orglq()¶

rocblas_status rocsolver_dorglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGLQ generates an m-by-n Matrix Q with orthonormal rows.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

$Q = H_kH_{k-1}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. 0 <= m <= n.

The number of rows of the matrix Q.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= m.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

### rocsolver_<type>org2l()¶

rocblas_status rocsolver_dorg2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorg2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORG2L generates an m-by-n Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

$Q = H_kH_{k-1}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

### rocsolver_<type>orgql()¶

rocblas_status rocsolver_dorgql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGQL generates an m-by-n Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the last n column of the product of k Householder reflectors of order m

$Q = H_kH_{k-1}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

### rocsolver_<type>orgbr()¶

rocblas_status rocsolver_dorgbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGBR generates an m-by-n Matrix Q with orthonormal rows or columns.

If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m

$Q = H_1H_2\cdots H_k$

If m < k, Q is defined as the product of Householder reflectors of order m

$Q = H_1H_2\cdots H_{m-1}$

On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n

$Q = H_kH_{k-1}\cdots H_1$

If n <= k, Q is defined as the product of Householder reflectors of order n

$Q = H_{n-1}H_{n-2}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEBRD in its arguments A and tauq or taup.

Parameters
• [in] handle: rocblas_handle.

• [in] storev: rocblas_storev.

Specifies whether to work column-wise or row-wise.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix Q. If column-wise, then min(m,k) <= n <= m.

• [in] k: rocblas_int. k >= 0.

The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

GEBRD.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the Householder vectors as returned by

GEBRD. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise.

The Householder scalars as returned by

GEBRD.

### rocsolver_<type>orgtr()¶

rocblas_status rocsolver_dorgtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)

ORGTR generates an n-by-n orthogonal Matrix Q.

Q is defined as the product of n-1 Householder reflectors of order n. If uplo indicates upper, then Q has the form

$Q = H_{n-1}H_{n-2}\cdots H_1$

On the other hand, if uplo indicates lower, then Q has the form

$Q = H_1H_2\cdots H_{n-1}$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by SYTRD in its arguments A and tau.

Parameters
• [in] handle: rocblas_handle.

• [in] uplo: rocblas_fill.

Specifies whether the

SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the matrix Q.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the Householder vectors as returned by

SYTRD. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= n.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension n-1.

The Householder scalars as returned by

SYTRD.

### rocsolver_<type>orm2r()¶

rocblas_status rocsolver_dorm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sorm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORM2R multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_1H_2 \cdots H_k$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQRF in the first k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>ormqr()¶

rocblas_status rocsolver_dormqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMQR multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQRF in the first k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>orml2()¶

rocblas_status rocsolver_dorml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sorml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORML2 multiplies a matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_kH_{k-1}\cdots H_1$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

The Householder vectors as returned by

GELQF in the first k rows of its argument A.

• [in] lda: rocblas_int. lda >= k.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>ormlq()¶

rocblas_status rocsolver_dormlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMLQ multiplies a matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_kH_{k-1}\cdots H_1$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

The Householder vectors as returned by

GELQF in the first k rows of its argument A.

• [in] lda: rocblas_int. lda >= k.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>orm2l()¶

rocblas_status rocsolver_dorm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sorm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORM2L multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_kH_{k-1}\cdots H_1$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQLF in the last k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>ormql()¶

rocblas_status rocsolver_dormql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMQL multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_kH_{k-1}\cdots H_1$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQLF in the last k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>ormbr()¶

rocblas_status rocsolver_dormbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMBR multiplies a matrix Q with orthonormal rows or columns by a general m-by-n matrix C.

If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

The order q of the orthogonal matrix Q is q = m if applying from the left, or q = n if applying from the right.

When storev is column-wise, if q >= k, then Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k,$

and if q < k, then Q is defined as the product

$Q = H_1H_2\cdots H_{q-1}.$

When storev is row-wise, if q > k, then Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k,$

and if q <= k, Q is defined as the product

$Q = H_1H_2\cdots H_{q-1}.$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors and scalars as returned by GEBRD in its arguments A and tauq or taup.

Parameters
• [in] handle: rocblas_handle.

• [in] storev: rocblas_storev.

Specifies whether to work column-wise or row-wise.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0.

The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

GEBRD.

• [in] A: pointer to type. Array on the GPU of size lda*min(q,k) if column-wise, or lda*q if row-wise.

The Householder vectors as returned by

GEBRD.

• [in] lda: rocblas_int. lda >= q if column-wise, or lda >= min(q,k) if row-wise.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least min(q,k).

The Householder scalars as returned by

GEBRD.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>ormtr()¶

rocblas_status rocsolver_dormtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMTR multiplies an orthogonal matrix Q by a general m-by-n matrix C.

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}$

The order q of the orthogonal matrix Q is q = m if applying from the left, or q = n if applying from the right.

Q is defined as a product of q-1 Householder reflectors. If uplo indicates upper, then Q has the form

$Q = H_{q-1}H_{q-2}\cdots H_1.$

On the other hand, if uplo indicates lower, then Q has the form

$Q = H_1H_2\cdots H_{q-1}$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors and scalars as returned by SYTRD in its arguments A and tau.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] uplo: rocblas_fill.

Specifies whether the

SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] A: pointer to type. Array on the GPU of size lda*q.

On entry, the Householder vectors as returned by

SYTRD.

• [in] lda: rocblas_int. lda >= q.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least q-1.

The Householder scalars as returned by

SYTRD.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

## 3.2.7. Unitary matrices¶

### rocsolver_<type>ung2r()¶

rocblas_status rocsolver_zung2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cung2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNG2R generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

$Q = H_1H_2\cdots H_k$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

### rocsolver_<type>ungqr()¶

rocblas_status rocsolver_zungqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGQR generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

$Q = H_1H_2\cdots H_k$

Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

### rocsolver_<type>ungl2()¶

rocblas_status rocsolver_zungl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGL2 generates an m-by-n complex Matrix Q with orthonormal rows.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

$Q = H_k^HH_{k-1}^H\cdots H_1^H$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. 0 <= m <= n.

The number of rows of the matrix Q.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= m.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

### rocsolver_<type>unglq()¶

rocblas_status rocsolver_zunglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cunglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGLQ generates an m-by-n complex Matrix Q with orthonormal rows.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

$Q = H_k^HH_{k-1}^H\cdots H_1^H$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. 0 <= m <= n.

The number of rows of the matrix Q.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= m.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

### rocsolver_<type>ung2l()¶

rocblas_status rocsolver_zung2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cung2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNG2L generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

$Q = H_kH_{k-1}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

### rocsolver_<type>ungql()¶

rocblas_status rocsolver_zungql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGQL generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

$Q = H_kH_{k-1}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q.

• [in] n: rocblas_int. 0 <= n <= m.

The number of columns of the matrix Q.

• [in] k: rocblas_int. 0 <= k <= n.

The number of Householder reflectors.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the matrix A as returned by

GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

### rocsolver_<type>ungbr()¶

rocblas_status rocsolver_zungbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGBR generates an m-by-n complex Matrix Q with orthonormal rows or columns.

If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m

$Q = H_1H_2\cdots H_k$

If m < k, Q is defined as the product of Householder reflectors of order m

$Q = H_1H_2\cdots H_{m-1}$

On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n

$Q = H_kH_{k-1}\cdots H_1$

If n <= k, Q is defined as the product of Householder reflectors of order n

$Q = H_{n-1}H_{n-2}\cdots H_1$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by GEBRD in its arguments A and tauq or taup.

Parameters
• [in] handle: rocblas_handle.

• [in] storev: rocblas_storev.

Specifies whether to work column-wise or row-wise.

• [in] m: rocblas_int. m >= 0.

The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.

• [in] n: rocblas_int. n >= 0.

The number of columns of the matrix Q. If column-wise, then min(m,k) <= n <= m.

• [in] k: rocblas_int. k >= 0.

The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

GEBRD.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the Householder vectors as returned by

GEBRD. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise.

The Householder scalars as returned by

GEBRD.

### rocsolver_<type>ungtr()¶

rocblas_status rocsolver_zungtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGTR generates an n-by-n unitary Matrix Q.

Q is defined as the product of n-1 Householder reflectors of order n. If uplo indicates upper, then Q has the form

$Q = H_{n-1}H_{n-2}\cdots H_1$

On the other hand, if uplo indicates lower, then Q has the form

$Q = H_1H_2\cdots H_{n-1}$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors $$v_i$$ and scalars $$\text{ipiv}[i]$$, as returned by HETRD in its arguments A and tau.

Parameters
• [in] handle: rocblas_handle.

• [in] uplo: rocblas_fill.

Specifies whether the

HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

• [in] n: rocblas_int. n >= 0.

The number of rows and columns of the matrix Q.

• [inout] A: pointer to type. Array on the GPU of dimension lda*n.

On entry, the Householder vectors as returned by

HETRD. On exit, the computed matrix Q.

• [in] lda: rocblas_int. lda >= m.

Specifies the leading dimension of A.

• [in] ipiv: pointer to type. Array on the GPU of dimension n-1.

The Householder scalars as returned by

HETRD.

### rocsolver_<type>unm2r()¶

rocblas_status rocsolver_zunm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNM2R multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQRF in the first k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unmqr()¶

rocblas_status rocsolver_zunmqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMQR multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQRF in the first k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQRF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unml2()¶

rocblas_status rocsolver_zunml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNML2 multiplies a complex matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_k^HH_{k-1}^H\cdots H_1^H$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

The Householder vectors as returned by

GELQF in the first k rows of its argument A.

• [in] lda: rocblas_int. lda >= k.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unmlq()¶

rocblas_status rocsolver_zunmlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMLQ multiplies a complex matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_k^HH_{k-1}^H\cdots H_1^H$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

The Householder vectors as returned by

GELQF in the first k rows of its argument A.

• [in] lda: rocblas_int. lda >= k.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GELQF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unm2l()¶

rocblas_status rocsolver_zunm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNM2L multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_kH_{k-1}\cdots H_1$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQLF in the last k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unmql()¶

rocblas_status rocsolver_zunmql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMQL multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

Q is defined as the product of k Householder reflectors

$Q = H_kH_{k-1}\cdots H_1$

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

The number of Householder reflectors that form Q.

• [in] A: pointer to type. Array on the GPU of size lda*k.

The Householder vectors as returned by

GEQLF in the last k columns of its argument A.

• [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

The Householder scalars as returned by

GEQLF.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unmbr()¶

rocblas_status rocsolver_zunmbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMBR multiplies a complex matrix Q with orthonormal rows or columns by a general m-by-n matrix C.

If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

The order q of the unitary matrix Q is q = m if applying from the left, or q = n if applying from the right.

When storev is column-wise, if q >= k, then Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k,$

and if q < k, then Q is defined as the product

$Q = H_1H_2\cdots H_{q-1}.$

When storev is row-wise, if q > k, then Q is defined as the product of k Householder reflectors

$Q = H_1H_2\cdots H_k,$

and if q <= k, Q is defined as the product

$Q = H_1H_2\cdots H_{q-1}.$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors and scalars as returned by GEBRD in its arguments A and tauq or taup.

Parameters
• [in] handle: rocblas_handle.

• [in] storev: rocblas_storev.

Specifies whether to work column-wise or row-wise.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] k: rocblas_int. k >= 0.

The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

GEBRD.

• [in] A: pointer to type. Array on the GPU of size lda*min(q,k) if column-wise, or lda*q if row-wise.

The Householder vectors as returned by

GEBRD.

• [in] lda: rocblas_int. lda >= q if column-wise, or lda >= min(q,k) if row-wise.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least min(q,k).

The Householder scalars as returned by

GEBRD.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.

### rocsolver_<type>unmtr()¶

rocblas_status rocsolver_zunmtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMTR multiplies a unitary matrix Q by a general m-by-n matrix C.

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

$\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}$

The order q of the unitary matrix Q is q = m if applying from the left, or q = n if applying from the right.

Q is defined as a product of q-1 Householder reflectors. If uplo indicates upper, then Q has the form

$Q = H_{q-1}H_{q-2}\cdots H_1.$

On the other hand, if uplo indicates lower, then Q has the form

$Q = H_1H_2\cdots H_{q-1}$

The Householder matrices $$H_i$$ are never stored, they are computed from its corresponding Householder vectors and scalars as returned by HETRD in its arguments A and tau.

Parameters
• [in] handle: rocblas_handle.

• [in] side: rocblas_side.

Specifies from which side to apply Q.

• [in] uplo: rocblas_fill.

Specifies whether the

HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

• [in] trans: rocblas_operation.

Specifies whether the matrix Q or its conjugate transpose is to be applied.

• [in] m: rocblas_int. m >= 0.

Number of rows of matrix C.

• [in] n: rocblas_int. n >= 0.

Number of columns of matrix C.

• [in] A: pointer to type. Array on the GPU of size lda*q.

On entry, the Householder vectors as returned by

HETRD.

• [in] lda: rocblas_int. lda >= q.

• [in] ipiv: pointer to type. Array on the GPU of dimension at least q-1.

The Householder scalars as returned by

HETRD.

• [inout] C: pointer to type. Array on the GPU of size ldc*n.

On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

• [in] ldc: rocblas_int. ldc >= m.