GitHub / rocm/composable_kernel issues and pull requests
#2221 - [VLLM V1] Add chunked prefill for FA to pass seq with small seqlen_q
Pull Request -
State: open - Opened by Zzz9990 2 months ago
#2220 - [Issue]: how to use ck as lib in other project
Issue -
State: open - Opened by ZJLi2013 2 months ago
#2219 - [CK_TILE] Multiple-D GEMM example
Pull Request -
State: open - Opened by mozga-amd 2 months ago
#2218 - Remove not needed bwd wei merged groups instances
Pull Request -
State: open - Opened by bartekxk 2 months ago
#2217 - Add IO module for profiler
Issue -
State: open - Opened by smedegaard 2 months ago
Labels: enhancement
#2216 - Adds gemm_universal running example in ckProfiler readme
Pull Request -
State: closed - Opened by AviralGoelAMD 2 months ago
#2215 - fixed typos in blockwise_gemm_v2
Pull Request -
State: closed - Opened by zjing14 2 months ago
- 2 comments
#2214 - Refactor tile_window.hpp, tile_window_linear.hpp into a CK Tile Hierarchy
Pull Request -
State: closed - Opened by AviralGoelAMD 2 months ago
- 1 comment
#2213 - [CK_TILE] Remove extra if from CMakeLists.txt of gemm tests
Pull Request -
State: open - Opened by samremes 2 months ago
#2212 - Add the instances for small sized GEMM in preshuffle and improve CMake Flag
Pull Request -
State: closed - Opened by ThomasNing 2 months ago
#2211 - improvements to timing and profiling
Pull Request -
State: open - Opened by lfmeadow 2 months ago
#2210 - [CMake] Disable newly added compiler warning -Wnrvo
Pull Request -
State: closed - Opened by jplehr 2 months ago
- 2 comments
#2209 - [CK_TILE] For FMHA forward kernels, assign block indices reversely if using mask
Pull Request -
State: open - Opened by poyenc 2 months ago
- 1 comment
#2208 - add int8 support in ck-tile/03_gemm
Pull Request -
State: open - Opened by ZJLi2013 2 months ago
- 2 comments
#2207 - Fix 11_add_rmsnorm2d_rdquant
Pull Request -
State: closed - Opened by SamiAario-AMD 2 months ago
#2206 - Ck tile tutorial examples
Pull Request -
State: open - Opened by ClementLinCF 2 months ago
#2205 - [Issue]: int8 gemm pipeline build error
Issue -
State: closed - Opened by ZJLi2013 2 months ago
- 5 comments
#2204 - Removes debug print statements from CMakeLists files
Pull Request -
State: closed - Opened by AviralGoelAMD 3 months ago
#2203 - Fix example_grouped_gemm_multiple_d_xdl_fp16 on gfx950
Pull Request -
State: closed - Opened by jefyang1 3 months ago
#2202 - Use new mfma instructions for FP8 on gfx950
Pull Request -
State: closed - Opened by jefyang1 3 months ago
#2201 - Restore oddc instances
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2200 - [CK_tile] Add rotating buffer feature for universal gemm
Pull Request -
State: open - Opened by amd-khushbu 3 months ago
#2199 - MX GEMM - Expand MX MFMA Testing to BF8, FP6, and BF6 Data Types
Pull Request -
State: closed - Opened by andriy-ca 3 months ago
- 1 comment
#2198 - [CK_TILE] fMHA batch_prefill block index & logits soft-capping optimizations
Pull Request -
State: closed - Opened by poyenc 3 months ago
#2197 - Grouped conv bwd wei add for larger filter and Merge Groups optimization
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2196 - Build and store CK library deb package for all targets daily.
Pull Request -
State: closed - Opened by illsilin 3 months ago
#2195 - [CK_TILE] Fix fMHA forward batch_prefill kernel codegen errors
Pull Request -
State: closed - Opened by poyenc 3 months ago
#2194 - Narrowing error fix for codegen compilation
Pull Request -
State: closed - Opened by arai713 3 months ago
- 1 comment
#2193 - [Tile Engine] Add benchmark for tile engine gemm.
Pull Request -
State: open - Opened by Yanxing-Shi 3 months ago
- 3 comments
#2192 - Update the buffer load/store intrinsic names for clang>=20.
Pull Request -
State: closed - Opened by illsilin 3 months ago
#2191 - [CK_TILE] Tile loop persistent gemm kernel
Pull Request -
State: open - Opened by samremes 3 months ago
#2190 - fix moe sorting build fail
Pull Request -
State: closed - Opened by solinzby1 3 months ago
Labels: CI - Pass
#2189 - Adding validation for tile sizes in Tile Engine
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
#2188 - [CK TILE] Grouped Convolution Forward Kernel
Pull Request -
State: open - Opened by bartekxk 3 months ago
#2187 - Extend 64x64 with 4 waves instances for grouped conv bwd wei
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2186 - Disable conv for Filter1x1Stride1Pad0 when K or C is even
Pull Request -
State: closed - Opened by mozga-amd 3 months ago
- 1 comment
#2185 - add CShuffleM/NXdlPerWavePerShuffle in cshuffle_epilogue
Pull Request -
State: open - Opened by joyeamd 3 months ago
- 3 comments
#2184 - Disable SMFMA gfx90a
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
#2183 - [CK_TILE] Blockwise GEMM pipeline v5
Pull Request -
State: open - Opened by aledudek 3 months ago
#2182 - Disable SMFMA for gfx90a
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
#2181 - Switch to v2 pipeline for grouped conv bwd data
Pull Request -
State: closed - Opened by bartekxk 3 months ago
- 2 comments
#2180 - Generate ckProfiler package for gfx942 only.
Pull Request -
State: closed - Opened by illsilin 3 months ago
#2179 - Add grouped conv fwd bias relu instances
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2178 - Add grouped conv fwd bias relu instances
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2177 - transpose load api development
Pull Request -
State: open - Opened by joyeamd 3 months ago
- 1 comment
#2176 - enable blockwise gemm v3 pipeline in moe
Pull Request -
State: open - Opened by lalala-sh 3 months ago
#2175 - Revert "Disable the SMFMA instruction for gfx90a."
Pull Request -
State: closed - Opened by ThomasNing 3 months ago
#2174 - Disable the SMFMA instruction for gfx90a.
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
#2173 - Fix grouped conv bwd data tests on gfx950
Pull Request -
State: closed - Opened by jefyang1 3 months ago
#2172 - Support for swizzle and transpose for MFMA_16x16x32_F16/BF16
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
#2171 - Ensure MX GEMM Instances can be Cross-Compiled for Multiple Architectures
Pull Request -
State: closed - Opened by andriy-ca 3 months ago
#2170 - [DO NOT MERGE] Add a repro example
Pull Request -
State: open - Opened by geyyer 3 months ago
#2169 - Revert "Integrate universal gemm with conv bwd data and add SplitK"
Pull Request -
State: closed - Opened by aosewski 3 months ago
Labels: urgency_high
#2168 - Flatmm merge
Pull Request -
State: closed - Opened by solinzby1 3 months ago
#2167 - [CK_TILE] fix for default epilogue
Pull Request -
State: closed - Opened by jakpiase 3 months ago
#2166 - Improve the general performance of the Preshuffled GEMM V3 & delete the unnecessary instances
Pull Request -
State: closed - Opened by ThomasNing 3 months ago
#2165 - Move 16x16 grouped conv fwd instances from comp header
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2164 - [LWPCK-2957] fix moe gemm2 for gfx950
Pull Request -
State: closed - Opened by mtgu0705 3 months ago
#2163 - [CK_TILE] Add logits soft-capping & customization support to the FMHA forward kernel/pipelines
Pull Request -
State: closed - Opened by poyenc 3 months ago
- 2 comments
#2162 - Support fp8 in example tile_example_flatmm_basic
Pull Request -
State: closed - Opened by linqun 3 months ago
#2160 - Add Doxygen Documentation for HostTensor, HostTensorDescriptor, DeviceMem, FillUniformDistribution
Pull Request -
State: closed - Opened by AviralGoelAMD 3 months ago
Labels: documentation
#2159 - [CK_Tile] Simplified Mem pipeline
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
- 1 comment
#2158 - [CK_TILE] Add type traits to detect tile window types at compile time
Pull Request -
State: closed - Opened by AviralGoelAMD 3 months ago
#2157 - Restrict MX GEMM Instantiation to GFX950 Architecture
Pull Request -
State: closed - Opened by andriy-ca 3 months ago
#2156 - Simple copy kernel, which can be a tool to experiment with CK_Tile API with minimal code.
Pull Request -
State: closed - Opened by kylasa 3 months ago
- 1 comment
#2153 - [CK_TILE] optimize moe sorting kernel, boost large context case up to 20x
Pull Request -
State: closed - Opened by carlushuang 3 months ago
#2151 - Add FP4 MX MFMA tests
Pull Request -
State: closed - Opened by geyyer 3 months ago
#2146 - [CK_TILE] Grouped GEMM tile loop
Pull Request -
State: closed - Opened by samremes 3 months ago
- 1 comment
#2145 - Stream-K Reduction option as Runtime parameter and Compilation Error Fix (SK- Reduction)
Pull Request -
State: open - Opened by ozturkosu 3 months ago
- 1 comment
#2144 - [DO NOT MERGE] WIP Bgp v5
Pull Request -
State: closed - Opened by aledudek 3 months ago
#2141 - Run CI jobs as user jenkins
Pull Request -
State: closed - Opened by illsilin 3 months ago
#2140 - Add grouped conv fwd 16x16 mfma instruction instances
Pull Request -
State: closed - Opened by bartekxk 3 months ago
#2139 - fix moe i4 example bug
Pull Request -
State: closed - Opened by lalala-sh 3 months ago
#2138 - [CK Tile] Fix the numerical issue for MFMA 16x16x128 for FP8
Pull Request -
State: closed - Opened by DDEle 3 months ago
#2137 - enable hd128 swa
Pull Request -
State: closed - Opened by slippedJim 3 months ago
#2136 - Add Matrix A and Matrix B Swizzle for LDS in Computev4 policy
Pull Request -
State: closed - Opened by AviralGoelAMD 3 months ago
#2135 - [Tile Engine] Add benchmark for tile engine
Pull Request -
State: closed - Opened by Yanxing-Shi 3 months ago
- 3 comments
#2134 - [Tile Engine] Improved README.md
Pull Request -
State: closed - Opened by AviralGoelAMD 3 months ago
- 1 comment
Labels: documentation
#2131 - Vectorized Transpose for Batched Transpose CK Tile Operator
Pull Request -
State: closed - Opened by ThomasNing 3 months ago
- 2 comments
#2125 - Support for MFMA_16x16x128 for fp8/bf8
Pull Request -
State: closed - Opened by amd-khushbu 3 months ago
#2117 - Fix failure in test_batched_gemm_softmax_gemm_permute for lower resource devices
Pull Request -
State: closed - Opened by ozturkosu 3 months ago
#2114 - [CK_TILE] FMHA Support hdim_v as a Multiple of 32
Pull Request -
State: open - Opened by DDEle 3 months ago
#2113 - Update Code Ownership
Pull Request -
State: closed - Opened by ThomasNing 3 months ago
#2112 - Temporarily disable MX FP4 device tests
Pull Request -
State: closed - Opened by geyyer 3 months ago
#2108 - [CK_TILE] Allow specifying logits soft-capping through fMHA forward APIs
Pull Request -
State: closed - Opened by poyenc 3 months ago
#2105 - fix pk_i4_v3 tests failures in Ubuntu env.
Pull Request -
State: closed - Opened by mtgu0705 3 months ago
- 1 comment
#2104 - fix CI build fail
Pull Request -
State: closed - Opened by solinzby1 3 months ago
#2103 - MFMA 16x16x32fp8
Pull Request -
State: closed - Opened by ThomasNing 3 months ago
#2102 - GEMM Multiply Multiply Fix
Pull Request -
State: closed - Opened by ThomasNing 3 months ago
#2101 - Floating Point Exception Error fix for Streamk GEMM Kernel
Pull Request -
State: closed - Opened by ozturkosu 3 months ago
#2098 - fix for basic gemm UseStructuredSparsity
Pull Request -
State: closed - Opened by jakpiase 3 months ago
- 1 comment
#2097 - [Issue]: Building CK fails due to -fuse-ld=path deprecation
Issue -
State: open - Opened by esna-echo 3 months ago
- 2 comments
Labels: Under Investigation
#2087 - Added a much simpler example and commented 01_add
Pull Request -
State: closed - Opened by AviralGoelAMD 4 months ago
- 2 comments
#2081 - [Performance] AMD wmma redundant load issue
Issue -
State: open - Opened by DoubleClark 4 months ago
#2080 - [Draft] For multi instance generation for CkTileEngine
Pull Request -
State: open - Opened by amd-khushbu 4 months ago
#2079 - Solve the Static Encoding Pattern compile error when the tile size is too small
Pull Request -
State: closed - Opened by ThomasNing 4 months ago
#2078 - added documentation for ck_tile::array<T,N>
Pull Request -
State: open - Opened by AviralGoelAMD 4 months ago
Labels: documentation
#2077 - Fix build issues for multiple targets.
Pull Request -
State: closed - Opened by illsilin 4 months ago
#2076 - [Issue]: Building the CK library fails to complete
Issue -
State: open - Opened by stefan0re 4 months ago
- 1 comment
Labels: Under Investigation
#2075 - CK pk_i4_t test failures fix
Pull Request -
State: closed - Opened by mtgu0705 4 months ago