Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / ROCmSoftwarePlatform/Tensile issues and pull requests

#1837 - adding xf32 datatype to rocblas-bench input creator

Pull Request - State: open - Opened by babakpst 10 months ago
Labels: NoCI

#1836 - another vcpkg version package name fix

Pull Request - State: open - Opened by TorreZuk 10 months ago

#1835 - Mode to dynamically adjust number of CUs used

Pull Request - State: open - Opened by AlexBrownAMD 10 months ago

#1834 - CI: Increase timeout for extended test

Pull Request - State: closed - Opened by nakajee 10 months ago - 1 comment

#1832 - kernel.cpp without assembly kernel implement

Issue - State: open - Opened by DoubleClark 10 months ago - 1 comment

#1831 - Add new parameters to specify global load width for A and B separately

Pull Request - State: open - Opened by nakajee 10 months ago - 3 comments

#1830 - Disable HW monitor for aquvavanjaram941

Pull Request - State: closed - Opened by aferoz21 10 months ago

#1829 - Optimization for ShadowLimit

Pull Request - State: closed - Opened by nakajee 10 months ago - 2 comments

#1828 - Clean up old unused code, mostly related to old client

Pull Request - State: closed - Opened by AlexBrownAMD 10 months ago - 1 comment

#1827 - fix for newer windows vcpkg msgpack

Pull Request - State: closed - Opened by TorreZuk 10 months ago

#1826 - Pre-apply offsets for strided batch kernels

Pull Request - State: closed - Opened by AlexBrownAMD 10 months ago - 7 comments

#1825 - test reduced parallelism

Pull Request - State: open - Opened by TorreZuk 10 months ago

#1824 - tensile build with 16 threads

Pull Request - State: closed - Opened by TorreZuk 10 months ago - 1 comment

#1823 - reverse MFMA order in inner loop for odd outer iteration

Pull Request - State: closed - Opened by nakajee 10 months ago - 2 comments

#1822 - Optimized waitcnt lgkmcnt for 1LDSBuffer + PGR>1

Pull Request - State: closed - Opened by nakajee 10 months ago - 1 comment

#1821 - Enhance maximum value of DepthU to 1024

Pull Request - State: closed - Opened by nakajee 10 months ago

#1820 - Update CHANGELOG.md for ROCm 6.1.0

Pull Request - State: closed - Opened by babakpst 11 months ago

#1819 - ROCm 6.1 merge master into develop

Pull Request - State: closed - Opened by babakpst 11 months ago - 1 comment

#1818 - ROCm 6.1 merge staging into master

Pull Request - State: closed - Opened by babakpst 11 months ago

#1817 - CI: limit compile to 16 threads

Pull Request - State: closed - Opened by eidenyoshida 11 months ago - 3 comments

#1816 - Updating changelog for 4.40.0

Pull Request - State: closed - Opened by babakpst 11 months ago
Labels: NoCI, Documentation

#1815 - Two-Tile Stream-K

Pull Request - State: closed - Opened by AlexBrownAMD 11 months ago - 2 comments

#1813 - Small fix for LdsPad optimization (LdsElement calculation)

Pull Request - State: closed - Opened by nakajee 11 months ago

#1811 - PCIID Changes from 32bit to 64bit for ROCm SMI HW monitor (#1802)

Pull Request - State: closed - Opened by aferoz21 11 months ago - 1 comment

#1809 - Predicate for APU libs

Pull Request - State: closed - Opened by AlexBrownAMD 11 months ago

#1808 - Fix documentation typo from "creaet" to "create".

Pull Request - State: closed - Opened by AndySu12 11 months ago - 1 comment
Labels: Documentation

#1807 - CI: Restore hipcc compile append flag parallel-jobs=4

Pull Request - State: closed - Opened by eidenyoshida 11 months ago

#1805 - Add code to allow testing stream-k grid multipliers.

Pull Request - State: closed - Opened by AlexBrownAMD 11 months ago - 1 comment

#1804 - Added reject condition for FractionalLoad + DepthU!=power of 2

Pull Request - State: closed - Opened by nakajee 11 months ago - 1 comment

#1803 - LdsPad optimization + new parameters for local read related optimizations

Pull Request - State: closed - Opened by nakajee 11 months ago - 2 comments

#1802 - PCIID Changes from 32bit to 64bit for ROCm SMI HW monitor

Pull Request - State: closed - Opened by aferoz21 12 months ago

#1801 - Disable HW monitor for gfx1101 and gfx1102 products

Pull Request - State: closed - Opened by aferoz21 12 months ago - 6 comments

#1800 - Fix lint warnings

Pull Request - State: closed - Opened by AlexBrownAMD 12 months ago - 1 comment

#1798 - modify search order for function which on Linux

Pull Request - State: closed - Opened by amcamd 12 months ago

#1796 - postGSU code optimization

Pull Request - State: closed - Opened by nakajee 12 months ago - 6 comments

#1795 - Enable DirectToVgpr + MI4x4, plus skinny MacroTile support

Pull Request - State: closed - Opened by nakajee 12 months ago

#1794 - Stream-K kernel generation

Pull Request - State: closed - Opened by AlexBrownAMD 12 months ago - 3 comments

#1793 - Fix documentation typo from "creaet" to "create".

Pull Request - State: closed - Opened by AndySu12 12 months ago - 2 comments

#1792 - enable MFMA + LocalSplitU=4 for MT16x16

Pull Request - State: closed - Opened by nakajee 12 months ago - 2 comments

#1791 - Clear hipErrorNotFound error code since it is an expected part of the search

Pull Request - State: closed - Opened by AlexBrownAMD about 1 year ago - 4 comments

#1790 - Update CHANGELOG.md for ROCm 6.0

Pull Request - State: closed - Opened by babakpst about 1 year ago

#1789 - ROCm 6.0 merge master into develop

Pull Request - State: closed - Opened by babakpst about 1 year ago - 2 comments

#1788 - ROCm 6.0 merge staging into master

Pull Request - State: closed - Opened by babakpst about 1 year ago - 1 comment

#1787 - Updating the change log file for ROCm 6.0

Pull Request - State: closed - Opened by babakpst about 1 year ago
Labels: NoCI, Documentation

#1786 - Disable the hardware monitor for aquavanjaram 942

Pull Request - State: closed - Opened by aferoz21 about 1 year ago

#1785 - Enable input conversion from f8 to f16

Pull Request - State: closed - Opened by nakajee about 1 year ago - 5 comments

#1784 - Fix gfx11 CI test fail

Pull Request - State: closed - Opened by nakajee about 1 year ago - 2 comments

#1783 - merging back bug fix of forcestoresc1 arch selection

Pull Request - State: closed - Opened by yoichiyoshida about 1 year ago

#1782 - Enable wider local read + pack with v_perm for 8or16bit + UnrollMajorLDS=false

Pull Request - State: closed - Opened by nakajee about 1 year ago - 1 comment

#1781 - cleaner build on ubuntu22 with boost link fix

Pull Request - State: closed - Opened by TorreZuk about 1 year ago - 1 comment

#1780 - fix bug in forcestoresc1 arch selection

Pull Request - State: closed - Opened by yoichiyoshida about 1 year ago

#1779 - fixing bug in forcestoresc1 arch selection

Pull Request - State: closed - Opened by yoichiyoshida about 1 year ago

#1778 - fix bug in sc1 arch selection

Pull Request - State: closed - Opened by yoichiyoshida about 1 year ago - 1 comment

#1777 - Removed unused CustomKernels and ReplacementKernels.

Pull Request - State: closed - Opened by pkamd about 1 year ago - 2 comments

#1776 - Adjust miIssueLatency for gfx940

Pull Request - State: closed - Opened by nakajee about 1 year ago

#1775 - Add a new parameter ExtraMiLatencyLeft to improve local read scheduling

Pull Request - State: closed - Opened by nakajee about 1 year ago

#1774 - Small improvement for previous miLatency issue fix

Pull Request - State: closed - Opened by nakajee about 1 year ago - 2 comments

#1773 - Enable dedicated vgpr allocation for local read + pack

Pull Request - State: closed - Opened by nakajee about 1 year ago - 5 comments

#1772 - Suppress assertion for MemoryBuffer.cpp

Pull Request - State: closed - Opened by nakajee about 1 year ago - 4 comments

#1771 - To support multi-gpu ( different architectures) in lazy library loading

Pull Request - State: closed - Opened by rkamd about 1 year ago - 1 comment

#1770 - Refactor allowLRVWBforTLUandMI and enable DGEMM TLUB + LRVW=2 for odd N

Pull Request - State: closed - Opened by nakajee about 1 year ago - 2 comments

#1769 - Kernels source code are not generated

Issue - State: open - Opened by mabdallah89 about 1 year ago

#1768 - init code optimization

Pull Request - State: closed - Opened by nakajee about 1 year ago - 3 comments

#1767 - sgpr allocation optimization

Pull Request - State: closed - Opened by nakajee about 1 year ago

#1766 - Re-enable miLatency opt for MI16x16 and instruction scheduling fixes

Pull Request - State: closed - Opened by nakajee about 1 year ago - 3 comments

#1765 - Enable batch

Pull Request - State: closed - Opened by wbgilmartin about 1 year ago - 1 comment

#1764 - ROCm 6.0 merge master into develop

Pull Request - State: closed - Opened by babakpst about 1 year ago

#1763 - ROCm 6.0 merge staging into master

Pull Request - State: closed - Opened by babakpst about 1 year ago

#1762 - updating changelog for 4.39.0/fc6.0

Pull Request - State: closed - Opened by babakpst about 1 year ago
Labels: NoCI

#1761 - Hotfix: Fix override when lazy loading (#1756)

Pull Request - State: closed - Opened by nielenventer about 1 year ago - 8 comments

#1760 - Disable miLatency opt for MI16x16

Pull Request - State: closed - Opened by nakajee about 1 year ago - 1 comment

#1759 - Fix compiler directive for gfx941 and gfx942

Pull Request - State: closed - Opened by msujon-AMD about 1 year ago - 1 comment

#1758 - Add missing includes

Pull Request - State: closed - Opened by AlexBrownAMD about 1 year ago - 1 comment

#1756 - Fix override when lazy loading

Pull Request - State: closed - Opened by nielenventer about 1 year ago - 2 comments

#1755 - Changes to Enable ROCm SMI for gfx940

Pull Request - State: closed - Opened by aferoz21 about 1 year ago

#1754 - Skip gfx940 tests on gfx11xx

Pull Request - State: closed - Opened by AlexBrownAMD about 1 year ago

#1753 - Remove custom kernels for Aquavanjaram

Pull Request - State: closed - Opened by nakajee about 1 year ago - 1 comment

#1752 - Use 'gcnArchName' device property to identify the GCN architecture

Pull Request - State: closed - Opened by rkamd about 1 year ago

#1751 - DirectToLds issue fixes

Pull Request - State: closed - Opened by nakajee about 1 year ago

#1750 - Apply InitAccVgprOpt for more cases + bug fix for InitAccVgprOpt + GSU>1

Pull Request - State: closed - Opened by nakajee about 1 year ago - 2 comments

#1749 - Log power and temperature information of winning kernel in CSV file

Pull Request - State: closed - Opened by aferoz21 about 1 year ago

#1748 - updating the script according to the changes made after the merge

Pull Request - State: closed - Opened by babakpst about 1 year ago

#1747 - don't derive from deprecated std::iterator

Pull Request - State: closed - Opened by TorreZuk about 1 year ago - 4 comments

#1746 - Fix spelling error in debug log

Pull Request - State: closed - Opened by cgmb about 1 year ago
Labels: NoCI

#1745 - Log frequency information in CSV file

Pull Request - State: closed - Opened by aferoz21 about 1 year ago - 3 comments

#1744 - DirectToLds support for larger data types with 32bit global load

Pull Request - State: closed - Opened by nakajee about 1 year ago - 1 comment

#1743 - Fix merge error affecting i8 with wmma

Pull Request - State: closed - Opened by yoichiyoshida about 1 year ago

#1742 - Fix merge error affecting i8 with wmma

Pull Request - State: closed - Opened by AlexBrownAMD about 1 year ago - 3 comments

#1741 - adding library logic convertor script to tuning folder

Pull Request - State: closed - Opened by babakpst about 1 year ago

#1740 - adding tuning script: rocblas-bench input creator from lib logic

Pull Request - State: closed - Opened by babakpst about 1 year ago - 1 comment

#1739 - Improve missing clang-offload-bundler message

Pull Request - State: closed - Opened by cgmb about 1 year ago - 1 comment

#1738 - Adjust miLatency for gfx940 + MFMA for specific data types

Pull Request - State: closed - Opened by nakajee about 1 year ago
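
Each entry above follows the same plain-text shape: a "#number - title" line, a metadata line giving the kind, state, author, age, and an optional comment count, and sometimes a "Labels:" line. The following is a minimal sketch of how that shape could be parsed into records, assuming the layout is exactly as it appears in this listing. The names `Entry`, `parse_listing`, and `SAMPLE` are hypothetical and are not part of the Ecosyste.ms service or its API.

```python
import re
from dataclasses import dataclass

# Patterns inferred from the text layout of this listing (an assumption,
# not a documented format).
TITLE_RE = re.compile(r"^#(?P<number>\d+) - (?P<title>.+)$")
META_RE = re.compile(
    r"^(?P<kind>Pull Request|Issue) - State: (?P<state>\w+)"
    r" - Opened by (?P<author>\S+) (?P<age>.+? ago)"
    r"(?: - (?P<comments>\d+) comments?)?$"
)
LABELS_RE = re.compile(r"^Labels: (?P<labels>.+)$")


@dataclass
class Entry:
    """One issue or pull request from the listing (hypothetical record type)."""
    number: int
    title: str
    kind: str = ""
    state: str = ""
    author: str = ""
    age: str = ""
    comments: int = 0
    labels: tuple = ()


def parse_listing(text):
    """Parse listing text into Entry records, skipping unrecognized lines."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        title = TITLE_RE.match(line)
        if title:
            # A "#number - title" line starts a new entry.
            current = Entry(int(title["number"]), title["title"])
            entries.append(current)
            continue
        if current is None:
            continue  # page header lines before the first entry
        meta = META_RE.match(line)
        if meta:
            current.kind = meta["kind"]
            current.state = meta["state"]
            current.author = meta["author"]
            current.age = meta["age"]
            # The comment count is omitted in the listing when it is zero.
            current.comments = int(meta["comments"] or 0)
            continue
        labels = LABELS_RE.match(line)
        if labels:
            current.labels = tuple(s.strip() for s in labels["labels"].split(","))
    return entries


# Three entries copied from the listing above.
SAMPLE = """\
#1837 - adding xf32 datatype to rocblas-bench input creator

Pull Request - State: open - Opened by babakpst 10 months ago
Labels: NoCI

#1832 - kernel.cpp without assembly kernel implement

Issue - State: open - Opened by DoubleClark 10 months ago - 1 comment

#1791 - Clear hipErrorNotFound error code since it is an expected part of the search

Pull Request - State: closed - Opened by AlexBrownAMD about 1 year ago - 4 comments
"""

entries = parse_listing(SAMPLE)
open_issues = [e.number for e in entries if e.kind == "Issue" and e.state == "open"]
```

With records in this form, filtering is a one-line comprehension, as `open_issues` shows. The relative ages ("10 months ago") are kept as strings because the listing gives no absolute dates.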