rocSOLVER
rocm-5.4.0
2.2. Batched rocSOLVER
More to come later…