rocSOLVER
rocm-5.1.3
  • 1. rocSOLVER User Guide
    • 1.1. Introduction
      • 1.1.1. Library overview
      • 1.1.2. Currently implemented functionality
        • LAPACK auxiliary functions
        • LAPACK main functions
        • LAPACK-like functions
    • 1.2. Building and Installation
      • 1.2.1. Prerequisites
      • 1.2.2. Installing from pre-built packages
      • 1.2.3. Building & installing from source
        • Using the install.sh script
        • Manual building and installation
    • 1.3. Using rocSOLVER
      • 1.3.1. QR factorization of a single matrix
      • 1.3.2. QR factorization of a batch of matrices
        • Strided_batched version
        • Batched version
    • 1.4. Memory Model
      • 1.4.1. Automatic workspace
      • 1.4.2. User-managed workspace
        • Minimum required size
        • Using an environment variable
        • Using helper functions
      • 1.4.3. User-owned workspace
    • 1.5. Multi-level Logging
      • 1.5.1. Logging modes
        • Trace logging
        • Bench logging
        • Profile logging
      • 1.5.2. Initialization and set-up
      • 1.5.3. Example code
      • 1.5.4. Kernel logging
      • 1.5.5. Multiple host threads
    • 1.6. Clients
      • 1.6.1. Testing rocSOLVER
      • 1.6.2. Benchmarking rocSOLVER
      • 1.6.3. rocSOLVER sample code
  • 2. rocSOLVER Library Design Guide
    • 2.1. Introduction
    • 2.2. Batched rocSOLVER
    • 2.3. Tuning rocSOLVER Performance
      • 2.3.1. geqr2/geqrf and geql2/geqlf functions
        • GEQxF_BLOCKSIZE
        • GEQxF_GEQx2_SWITCHSIZE
      • 2.3.2. gerq2/gerqf and gelq2/gelqf functions
        • GExQF_BLOCKSIZE
        • GExQF_GExQ2_SWITCHSIZE
      • 2.3.3. org2r/orgqr, org2l/orgql, ung2r/ungqr and ung2l/ungql functions
        • xxGQx_BLOCKSIZE
        • xxGQx_xxGQx2_SWITCHSIZE
      • 2.3.4. orgr2/orgrq, orgl2/orglq, ungr2/ungrq and ungl2/unglq functions
        • xxGxQ_BLOCKSIZE
        • xxGxQ_xxGxQ2_SWITCHSIZE
      • 2.3.5. orm2r/ormqr, orm2l/ormql, unm2r/unmqr and unm2l/unmql functions
        • xxMQx_BLOCKSIZE
      • 2.3.6. ormr2/ormrq, orml2/ormlq, unmr2/unmrq and unml2/unmlq functions
        • xxMxQ_BLOCKSIZE
      • 2.3.7. gebd2/gebrd and labrd functions
        • GEBRD_BLOCKSIZE
        • GEBRD_GEBD2_SWITCHSIZE
      • 2.3.8. gesvd function
        • THIN_SVD_SWITCH
      • 2.3.9. sytd2/sytrd, hetd2/hetrd and latrd functions
        • xxTRD_BLOCKSIZE
        • xxTRD_xxTD2_SWITCHSIZE
      • 2.3.10. sygs2/sygst and hegs2/hegst functions
        • xxGST_BLOCKSIZE
      • 2.3.11. syevd, heevd and stedc functions
        • STEDC_MIN_DC_SIZE
      • 2.3.12. potf2/potrf functions
        • POTRF_BLOCKSIZE
        • POTRF_POTF2_SWITCHSIZE
      • 2.3.13. sytf2/sytrf and lasyf functions
        • SYTRF_BLOCKSIZE
        • SYTRF_SYTF2_SWITCHSIZE
      • 2.3.14. getf2/getrf functions
        • GETF2_MAX_COLS
        • GETF2_MAX_THDS
        • GETF2_OPTIM_NGRP
        • GETRF_NUM_INTERVALS
        • GETRF_INTERVALS
        • GETRF_BLKSIZES
        • GETRF_BATCH_NUM_INTERVALS
        • GETRF_BATCH_INTERVALS
        • GETRF_BATCH_BLKSIZES
        • GETRF_NPVT_NUM_INTERVALS
        • GETRF_NPVT_INTERVALS
        • GETRF_NPVT_BLKSIZES
        • GETRF_NPVT_BATCH_NUM_INTERVALS
        • GETRF_NPVT_BATCH_INTERVALS
        • GETRF_NPVT_BATCH_BLKSIZES
      • 2.3.15. getri function
        • GETRI_MAX_COLS
        • GETRI_TINY_SIZE
        • GETRI_NUM_INTERVALS
        • GETRI_INTERVALS
        • GETRI_BLKSIZES
        • GETRI_BATCH_TINY_SIZE
        • GETRI_BATCH_NUM_INTERVALS
        • GETRI_BATCH_INTERVALS
        • GETRI_BATCH_BLKSIZES
      • 2.3.16. trtri function
        • TRTRI_MAX_COLS
        • TRTRI_NUM_INTERVALS
        • TRTRI_INTERVALS
        • TRTRI_BLKSIZES
        • TRTRI_BATCH_NUM_INTERVALS
        • TRTRI_BATCH_INTERVALS
        • TRTRI_BATCH_BLKSIZES
    • 2.4. Contributing Guidelines
  • 3. rocSOLVER API
    • 3.1. Types
      • 3.1.1. Additional types
        • rocblas_direct
        • rocblas_storev
        • rocblas_svect
        • rocblas_evect
        • rocblas_workmode
        • rocblas_eform
    • 3.2. LAPACK Auxiliary Functions
      • 3.2.1. Vector and Matrix manipulations
        • rocsolver_<type>lacgv()
        • rocsolver_<type>laswp()
      • 3.2.2. Householder reflections
        • rocsolver_<type>larfg()
        • rocsolver_<type>larft()
        • rocsolver_<type>larf()
        • rocsolver_<type>larfb()
      • 3.2.3. Bidiagonal forms
        • rocsolver_<type>labrd()
        • rocsolver_<type>bdsqr()
      • 3.2.4. Tridiagonal forms
        • rocsolver_<type>latrd()
        • rocsolver_<type>sterf()
        • rocsolver_<type>steqr()
        • rocsolver_<type>stedc()
      • 3.2.5. Symmetric matrices
        • rocsolver_<type>lasyf()
      • 3.2.6. Orthonormal matrices
        • rocsolver_<type>org2r()
        • rocsolver_<type>orgqr()
        • rocsolver_<type>orgl2()
        • rocsolver_<type>orglq()
        • rocsolver_<type>org2l()
        • rocsolver_<type>orgql()
        • rocsolver_<type>orgbr()
        • rocsolver_<type>orgtr()
        • rocsolver_<type>orm2r()
        • rocsolver_<type>ormqr()
        • rocsolver_<type>orml2()
        • rocsolver_<type>ormlq()
        • rocsolver_<type>orm2l()
        • rocsolver_<type>ormql()
        • rocsolver_<type>ormbr()
        • rocsolver_<type>ormtr()
      • 3.2.7. Unitary matrices
        • rocsolver_<type>ung2r()
        • rocsolver_<type>ungqr()
        • rocsolver_<type>ungl2()
        • rocsolver_<type>unglq()
        • rocsolver_<type>ung2l()
        • rocsolver_<type>ungql()
        • rocsolver_<type>ungbr()
        • rocsolver_<type>ungtr()
        • rocsolver_<type>unm2r()
        • rocsolver_<type>unmqr()
        • rocsolver_<type>unml2()
        • rocsolver_<type>unmlq()
        • rocsolver_<type>unm2l()
        • rocsolver_<type>unmql()
        • rocsolver_<type>unmbr()
        • rocsolver_<type>unmtr()
    • 3.3. LAPACK Functions
      • 3.3.1. Triangular factorizations
        • rocsolver_<type>potf2()
        • rocsolver_<type>potf2_batched()
        • rocsolver_<type>potf2_strided_batched()
        • rocsolver_<type>potrf()
        • rocsolver_<type>potrf_batched()
        • rocsolver_<type>potrf_strided_batched()
        • rocsolver_<type>getf2()
        • rocsolver_<type>getf2_batched()
        • rocsolver_<type>getf2_strided_batched()
        • rocsolver_<type>getrf()
        • rocsolver_<type>getrf_batched()
        • rocsolver_<type>getrf_strided_batched()
        • rocsolver_<type>sytf2()
        • rocsolver_<type>sytf2_batched()
        • rocsolver_<type>sytf2_strided_batched()
        • rocsolver_<type>sytrf()
        • rocsolver_<type>sytrf_batched()
        • rocsolver_<type>sytrf_strided_batched()
      • 3.3.2. Orthogonal factorizations
        • rocsolver_<type>geqr2()
        • rocsolver_<type>geqr2_batched()
        • rocsolver_<type>geqr2_strided_batched()
        • rocsolver_<type>geqrf()
        • rocsolver_<type>geqrf_batched()
        • rocsolver_<type>geqrf_strided_batched()
        • rocsolver_<type>gerq2()
        • rocsolver_<type>gerq2_batched()
        • rocsolver_<type>gerq2_strided_batched()
        • rocsolver_<type>gerqf()
        • rocsolver_<type>gerqf_batched()
        • rocsolver_<type>gerqf_strided_batched()
        • rocsolver_<type>geql2()
        • rocsolver_<type>geql2_batched()
        • rocsolver_<type>geql2_strided_batched()
        • rocsolver_<type>geqlf()
        • rocsolver_<type>geqlf_batched()
        • rocsolver_<type>geqlf_strided_batched()
        • rocsolver_<type>gelq2()
        • rocsolver_<type>gelq2_batched()
        • rocsolver_<type>gelq2_strided_batched()
        • rocsolver_<type>gelqf()
        • rocsolver_<type>gelqf_batched()
        • rocsolver_<type>gelqf_strided_batched()
      • 3.3.3. Problem and matrix reductions
        • rocsolver_<type>gebd2()
        • rocsolver_<type>gebd2_batched()
        • rocsolver_<type>gebd2_strided_batched()
        • rocsolver_<type>gebrd()
        • rocsolver_<type>gebrd_batched()
        • rocsolver_<type>gebrd_strided_batched()
        • rocsolver_<type>sytd2()
        • rocsolver_<type>sytd2_batched()
        • rocsolver_<type>sytd2_strided_batched()
        • rocsolver_<type>hetd2()
        • rocsolver_<type>hetd2_batched()
        • rocsolver_<type>hetd2_strided_batched()
        • rocsolver_<type>sytrd()
        • rocsolver_<type>sytrd_batched()
        • rocsolver_<type>sytrd_strided_batched()
        • rocsolver_<type>hetrd()
        • rocsolver_<type>hetrd_batched()
        • rocsolver_<type>hetrd_strided_batched()
        • rocsolver_<type>sygs2()
        • rocsolver_<type>sygs2_batched()
        • rocsolver_<type>sygs2_strided_batched()
        • rocsolver_<type>hegs2()
        • rocsolver_<type>hegs2_batched()
        • rocsolver_<type>hegs2_strided_batched()
        • rocsolver_<type>sygst()
        • rocsolver_<type>sygst_batched()
        • rocsolver_<type>sygst_strided_batched()
        • rocsolver_<type>hegst()
        • rocsolver_<type>hegst_batched()
        • rocsolver_<type>hegst_strided_batched()
      • 3.3.4. Linear-systems solvers
        • rocsolver_<type>trtri()
        • rocsolver_<type>trtri_batched()
        • rocsolver_<type>trtri_strided_batched()
        • rocsolver_<type>getri()
        • rocsolver_<type>getri_batched()
        • rocsolver_<type>getri_strided_batched()
        • rocsolver_<type>getrs()
        • rocsolver_<type>getrs_batched()
        • rocsolver_<type>getrs_strided_batched()
        • rocsolver_<type>gesv()
        • rocsolver_<type>gesv_batched()
        • rocsolver_<type>gesv_strided_batched()
        • rocsolver_<type>potri()
        • rocsolver_<type>potri_batched()
        • rocsolver_<type>potri_strided_batched()
        • rocsolver_<type>potrs()
        • rocsolver_<type>potrs_batched()
        • rocsolver_<type>potrs_strided_batched()
        • rocsolver_<type>posv()
        • rocsolver_<type>posv_batched()
        • rocsolver_<type>posv_strided_batched()
      • 3.3.5. Least-squares solvers
        • rocsolver_<type>gels()
        • rocsolver_<type>gels_batched()
        • rocsolver_<type>gels_strided_batched()
      • 3.3.6. Symmetric eigensolvers
        • rocsolver_<type>syev()
        • rocsolver_<type>syev_batched()
        • rocsolver_<type>syev_strided_batched()
        • rocsolver_<type>heev()
        • rocsolver_<type>heev_batched()
        • rocsolver_<type>heev_strided_batched()
        • rocsolver_<type>syevd()
        • rocsolver_<type>syevd_batched()
        • rocsolver_<type>syevd_strided_batched()
        • rocsolver_<type>heevd()
        • rocsolver_<type>heevd_batched()
        • rocsolver_<type>heevd_strided_batched()
        • rocsolver_<type>sygv()
        • rocsolver_<type>sygv_batched()
        • rocsolver_<type>sygv_strided_batched()
        • rocsolver_<type>hegv()
        • rocsolver_<type>hegv_batched()
        • rocsolver_<type>hegv_strided_batched()
        • rocsolver_<type>sygvd()
        • rocsolver_<type>sygvd_batched()
        • rocsolver_<type>sygvd_strided_batched()
        • rocsolver_<type>hegvd()
        • rocsolver_<type>hegvd_batched()
        • rocsolver_<type>hegvd_strided_batched()
      • 3.3.7. Singular value decomposition
        • rocsolver_<type>gesvd()
        • rocsolver_<type>gesvd_batched()
        • rocsolver_<type>gesvd_strided_batched()
    • 3.4. LAPACK-like Functions
      • 3.4.1. Triangular factorizations
        • rocsolver_<type>getf2_npvt()
        • rocsolver_<type>getf2_npvt_batched()
        • rocsolver_<type>getf2_npvt_strided_batched()
        • rocsolver_<type>getrf_npvt()
        • rocsolver_<type>getrf_npvt_batched()
        • rocsolver_<type>getrf_npvt_strided_batched()
      • 3.4.2. Linear-systems solvers
        • rocsolver_<type>getri_npvt()
        • rocsolver_<type>getri_npvt_batched()
        • rocsolver_<type>getri_npvt_strided_batched()
        • rocsolver_<type>getri_outofplace()
        • rocsolver_<type>getri_outofplace_batched()
        • rocsolver_<type>getri_outofplace_strided_batched()
        • rocsolver_<type>getri_npvt_outofplace()
        • rocsolver_<type>getri_npvt_outofplace_batched()
        • rocsolver_<type>getri_npvt_outofplace_strided_batched()
    • 3.5. Logging Functions and Library Information
      • 3.5.1. Logging functions
        • rocsolver_log_begin()
        • rocsolver_log_end()
        • rocsolver_log_set_layer_mode()
        • rocsolver_log_set_max_levels()
        • rocsolver_log_restore_defaults()
        • rocsolver_log_write_profile()
        • rocsolver_log_flush_profile()
      • 3.5.2. Library information
        • rocsolver_get_version_string()
        • rocsolver_get_version_string_size()
    • 3.6. Deprecated
      • 3.6.1. Types
        • rocsolver_int
        • rocsolver_handle
        • rocsolver_direction
        • rocsolver_storev
        • rocsolver_operation
        • rocsolver_fill
        • rocsolver_diagonal
        • rocsolver_side
        • rocsolver_status
      • 3.6.2. Auxiliary functions
        • rocsolver_create_handle()
        • rocsolver_destroy_handle()
        • rocsolver_set_stream()
        • rocsolver_get_stream()
        • rocsolver_set_vector()
        • rocsolver_get_vector()
        • rocsolver_set_matrix()
        • rocsolver_get_matrix()
  • 4. License & Attributions
© Copyright 2020, Advanced Micro Devices. Revision 0e9e2c9c.
