Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / triton-lang/triton issues and pull requests

#5144 - [AMD] Refactoring Instruction Scheduling

Pull Request - State: open - Opened by ravil-mobile 2 days ago - 1 comment

#5143 - feat: Dev Container for consistent dev setup

Pull Request - State: open - Opened by maryamtahhan 2 days ago

#5139 - [AMD] Use warp shuffle for MFMA to Dot operand layout conversion (FP8)

Pull Request - State: open - Opened by ilia-cher 3 days ago - 1 comment

#5138 - triton GEMM with size < 16

Issue - State: open - Opened by March-H 3 days ago
Labels: performance

#5137 - [AMD] Get rid of flat load/store instructions

Pull Request - State: closed - Opened by joviliast 3 days ago - 1 comment

#5136 - Warp memory alignment error when manually launching compiled PTX

Issue - State: closed - Opened by noctrog 3 days ago - 2 comments
Labels: bug

#5133 - unsupported shared memory layout for MMAv3

Issue - State: open - Opened by sleepwalker2017 3 days ago
Labels: bug

#5132 - [BACKEND] Support scalar fp_to_fp

Pull Request - State: closed - Opened by ThomasRaoux 4 days ago

#5131 - [AMD] NFC: change to func walk in ReorderInstructions

Pull Request - State: closed - Opened by antiagainst 4 days ago

#5130 - [AMD] Fixed instruction reorder

Pull Request - State: open - Opened by ravil-mobile 4 days ago - 1 comment

#5129 - [CI] Remove a workaround for yapf issue

Pull Request - State: closed - Opened by pbchekin 4 days ago

#5127 - [Draft][Backend] Add Address Sanitizer Pass

Pull Request - State: open - Opened by CRobeck 4 days ago

#5125 - CUDA_ERROR_ILLEGAL_ADDRESS on certain small tile sizes

Issue - State: open - Opened by Moerafaat 4 days ago - 2 comments

#5124 - [TEST] Skip poison when checking llvm ir for xpu

Pull Request - State: closed - Opened by Retribution98 4 days ago

#5123 - Error in attention kernel on H100

Issue - State: closed - Opened by calebthomas259 4 days ago - 1 comment

#5118 - Fix coverity issues

Pull Request - State: open - Opened by anmyachev 5 days ago

#5117 - [Frontend] Print out config when autotune fails

Pull Request - State: closed - Opened by htyu 5 days ago - 4 comments

#5114 - Fix barrier insertion after `assert` op

Pull Request - State: closed - Opened by anmyachev 5 days ago - 1 comment

#5112 - [AMD] Enable B scale for scaled_dot

Pull Request - State: closed - Opened by antiagainst 5 days ago

#5111 - Fix llvm-build for almalinux

Pull Request - State: closed - Opened by pbchekin 5 days ago

#5110 - [DRAFT][PROTON] Add `proton.state` utility

Pull Request - State: open - Opened by Jokeren 6 days ago

#5109 - [PROTON][NFC] Clean up code

Pull Request - State: closed - Opened by Jokeren 6 days ago

#5108 - possible bug for "TritonGPURemoveLayoutConversionsPass"

Issue - State: closed - Opened by Shaquille-Wu 6 days ago - 1 comment

#5107 - Support scaled_dot with rhs scale

Pull Request - State: closed - Opened by ThomasRaoux 8 days ago - 1 comment

#5106 - CUDA cudaMemcpy & Triton Kernel

Issue - State: closed - Opened by deepak-vij 8 days ago - 1 comment

#5105 - [BACKEND][NVIDIA] Remove NvidiaMma::getTotalElems...ForOperand

Pull Request - State: closed - Opened by ggengnv 8 days ago - 2 comments

#5102 - warp_group_dot lowering crashes for specific instruction shape

Issue - State: closed - Opened by gflegar 8 days ago - 3 comments

#5101 - how to understand "multiRootGetSlice"?

Issue - State: open - Opened by Shaquille-Wu 8 days ago

#5099 - [BACKEND] Drop all volta related code

Pull Request - State: closed - Opened by Jokeren 9 days ago - 1 comment

#5098 - Remove unknown distribution option: `test_suite`

Pull Request - State: open - Opened by anmyachev 9 days ago

#5092 - [RUNTIME] Add flags for detecting user-defined Autotuner hooks

Pull Request - State: closed - Opened by aakhundov 9 days ago - 7 comments

#5091 - about llvm.struct and vector

Issue - State: open - Opened by Shaquille-Wu 10 days ago - 1 comment

#5088 - Prevent cache base64 dir names from starting with a hyphen

Pull Request - State: closed - Opened by fulvius31 10 days ago

#5087 - [BACKEND] Fix mmav2 for fp8

Pull Request - State: closed - Opened by Jokeren 10 days ago

#5085 - Use getOrder instead of getThreadOrder in AxisInfo.cpp

Pull Request - State: closed - Opened by oplavsic 10 days ago - 7 comments

#5084 - [AMD] Fix issue with rank=1 in tryFitCvtIntoLDS

Pull Request - State: closed - Opened by SamGinzburg 10 days ago - 1 comment

#5083 - [RUNTIME] Pass full kwargs to Autotuner hooks instead of positional args

Pull Request - State: closed - Opened by aakhundov 10 days ago - 3 comments

#5081 - [FRONTEND] Fix handling of `from m import x as y` in CodeGenerator

Pull Request - State: closed - Opened by davidberard98 11 days ago - 2 comments

#5080 - [BACKEND] Fix reduce with slice layout inputs

Pull Request - State: closed - Opened by ThomasRaoux 11 days ago

#5079 - [BACKEND] Make ExternElementwise op implement ConditionallySpeculatable

Pull Request - State: closed - Opened by davidberard98 11 days ago - 5 comments

#5077 - [AMD] Update HIP headers to 6.2.2

Pull Request - State: closed - Opened by antiagainst 11 days ago

#5073 - [CI] remove unused inductor workflows

Pull Request - State: open - Opened by leseb 11 days ago - 3 comments

#5072 - [AMD] Add atomicRMW dpp logic

Pull Request - State: closed - Opened by joviliast 11 days ago - 2 comments

#5070 - [Triton][Allocation] Enable `getScratchValueSize` specialization

Pull Request - State: open - Opened by victor-eds 11 days ago - 9 comments

#5059 - [AMD] Improve instruction scheduling hints for more targets

Pull Request - State: closed - Opened by ravil-mobile 12 days ago - 3 comments

#5054 - Inline PTX asm with mixed ptr and value argument types

Issue - State: open - Opened by mlazos 12 days ago - 2 comments

#5044 - [BACKEND] Get rid of unpack/pack I32

Pull Request - State: closed - Opened by Jokeren 15 days ago - 1 comment

#5034 - [AMD] Add support for scaled_dot(mxfp4, -)

Pull Request - State: open - Opened by antiagainst 15 days ago

#5032 - python triton vs native cuda performance

Issue - State: open - Opened by helloworldstone 15 days ago

#5031 - [BACKEND] Fix the combineSelectAndIf when the user of select in ifOp.

Pull Request - State: open - Opened by tfruan2000 15 days ago - 2 comments

#5030 - [BACKEND] Minor Bugfixes for SharedToDotOperand MMAv3

Pull Request - State: open - Opened by ggengnv 16 days ago

#5029 - [AMD] Enable scaled_dot(-, bf16)

Pull Request - State: closed - Opened by antiagainst 16 days ago

#5028 - [DRAFT][AMD] Introduce OptimizeAtomicLayouts pass

Pull Request - State: open - Opened by joviliast 16 days ago - 2 comments

#5027 - [LoopUnroll] Do not pipeline epilog loops generated by loop unrolling

Pull Request - State: open - Opened by htyu 16 days ago - 4 comments

#5024 - [Triton] Add canonicalization patterns for `tt.reduce`

Pull Request - State: open - Opened by victor-eds 16 days ago

#5021 - [NIT][BACKEND] Clean up Allocation.cpp

Pull Request - State: closed - Opened by Jokeren 17 days ago

#5020 - Fix formatting in docs for triton.language.dot

Pull Request - State: closed - Opened by saagarjha 17 days ago

#5019 - [AMD] Support Cross-Lane Reduction With DPP

Pull Request - State: closed - Opened by knwng 17 days ago - 1 comment

#5018 - [AMD] Add tritonamdgpu-block-pingpong pass

Pull Request - State: open - Opened by jungpark-mlir 17 days ago

#5017 - Ignore autotune runs failed with PTXAS error

Pull Request - State: closed - Opened by htyu 17 days ago - 2 comments

#5016 - Update version to 3.2.0

Pull Request - State: open - Opened by bertmaher 17 days ago

#5015 - Allow windows cuda files to be used in `setup.py`

Pull Request - State: closed - Opened by anmyachev 17 days ago