Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / triton-lang/triton issues and pull requests
#5144 - [AMD] Refactoring Instruction Scheduling
Pull Request -
State: open - Opened by ravil-mobile 2 days ago
- 1 comment
#5143 - feat: Dev Container for consistent dev setup
Pull Request -
State: open - Opened by maryamtahhan 2 days ago
#5142 - RuntimeError: Cannot call @triton.jit'd outside of the scope of a kernel
Issue -
State: open - Opened by yang9936 2 days ago
Labels: bug
#5141 - Has anyone run into accuracy degradation after converting the yolov8n.pt model to torchscripts and onnx and running inference on triton server or Deepytorch Inference?
Issue -
State: open - Opened by JackonLiu 3 days ago
Labels: bug
#5140 - [PROTON] Fix proton's support for multiple profiling sessions
Pull Request -
State: open - Opened by Jokeren 3 days ago
#5139 - [AMD] Use warp shuffle for MFMA to Dot operand layout conversion (FP8)
Pull Request -
State: open - Opened by ilia-cher 3 days ago
- 1 comment
#5138 - triton GEMM with size < 16
Issue -
State: open - Opened by March-H 3 days ago
Labels: performance
#5137 - [AMD] Get rid of flat load/store instructions
Pull Request -
State: closed - Opened by joviliast 3 days ago
- 1 comment
#5136 - Warp memory alignment error when manually launching compiled PTX
Issue -
State: closed - Opened by noctrog 3 days ago
- 2 comments
Labels: bug
#5135 - [BACKEND][NVIDIA][NFC] Remove `BarrierOpConversion` conversion as dead code
Pull Request -
State: closed - Opened by anmyachev 3 days ago
#5134 - How to export the triton code to ptx with dynamic metadata like `num_warps` and `BLOCK_M`?
Issue -
State: open - Opened by sleepwalker2017 3 days ago
#5133 - unsupported shared memory layout for MMAv3
Issue -
State: open - Opened by sleepwalker2017 3 days ago
Labels: bug
#5132 - [BACKEND] Support scalar fp_to_fp
Pull Request -
State: closed - Opened by ThomasRaoux 4 days ago
#5131 - [AMD] NFC: change to func walk in ReorderInstructions
Pull Request -
State: closed - Opened by antiagainst 4 days ago
#5130 - [AMD] Fixed instruction reorder
Pull Request -
State: open - Opened by ravil-mobile 4 days ago
- 1 comment
#5129 - [CI] Remove a workaround for yapf issue
Pull Request -
State: closed - Opened by pbchekin 4 days ago
#5128 - Ensure Consistent Use of List Type for block_type Shape Argument in semantic.py and interpreter.py
Pull Request -
State: closed - Opened by simonidaa 4 days ago
#5127 - [Draft][Backend] Add Address Sanitizer Pass
Pull Request -
State: open - Opened by CRobeck 4 days ago
#5126 - [AMD] Implement RepOrder for AMD MMA layouts and change kMajor notation to kMinor
Pull Request -
State: open - Opened by oplavsic 4 days ago
- 3 comments
#5125 - CUDA_ERROR_ILLEGAL_ADDRESS on certain small tile sizes
Issue -
State: open - Opened by Moerafaat 4 days ago
- 2 comments
#5124 - [TEST] Skip poison when checking llvm ir for xpu
Pull Request -
State: closed - Opened by Retribution98 4 days ago
#5123 - Error in attention kernel on H100
Issue -
State: closed - Opened by calebthomas259 4 days ago
- 1 comment
#5122 - Compilation crashes on TTGIR coalescing pass after LLVM update to b5cc222d (#4927)
Issue -
State: open - Opened by aakhundov 4 days ago
- 1 comment
#5121 - [BACKEND] Fix a special case where elements along the k dimension are repeated within each thread
Pull Request -
State: closed - Opened by Jokeren 4 days ago
#5120 - Source to Destination Tensor copy in accordance to block_mapping list structure
Issue -
State: closed - Opened by deepak-vij 4 days ago
- 1 comment
#5119 - [PROTON] Introduce the Proton dialect as a third-party plugin for intra-kernel perf tooling
Pull Request -
State: open - Opened by fywkevin 5 days ago
#5118 - Fix coverity issues
Pull Request -
State: open - Opened by anmyachev 5 days ago
#5117 - [Frontend] Print out config when autotune fails
Pull Request -
State: closed - Opened by htyu 5 days ago
- 4 comments
#5116 - [BACKEND] Update LLVM version to https://github.com/llvm/llvm-project/commit/fb4f426c81d7e87dbb30df7abeba15ffc2f9f41a
Pull Request -
State: closed - Opened by vwbaker 5 days ago
#5115 - [GitHub] Add issue templates and slightly refine PR template
Pull Request -
State: closed - Opened by peterbell10 5 days ago
#5114 - Fix barrier insertion after `assert` op
Pull Request -
State: closed - Opened by anmyachev 5 days ago
- 1 comment
#5113 - Update `.gitignore` file to include windows specific file extensions
Pull Request -
State: closed - Opened by anmyachev 5 days ago
#5112 - [AMD] Enable B scale for scaled_dot
Pull Request -
State: closed - Opened by antiagainst 5 days ago
#5111 - Fix llvm-build for almalinux
Pull Request -
State: closed - Opened by pbchekin 5 days ago
#5110 - [DRAFT][PROTON] Add `proton.state` utility
Pull Request -
State: open - Opened by Jokeren 6 days ago
#5109 - [PROTON][NFC] Clean up code
Pull Request -
State: closed - Opened by Jokeren 6 days ago
#5108 - possible bug for "TritonGPURemoveLayoutConversionsPass"
Issue -
State: closed - Opened by Shaquille-Wu 6 days ago
- 1 comment
#5107 - Support scaled_dot with rhs scale
Pull Request -
State: closed - Opened by ThomasRaoux 8 days ago
- 1 comment
#5106 - CUDA cudaMemcpy & Triton Kernel
Issue -
State: closed - Opened by deepak-vij 8 days ago
- 1 comment
#5105 - [BACKEND][NVIDIA] Remove NvidiaMma::getTotalElems...ForOperand
Pull Request -
State: closed - Opened by ggengnv 8 days ago
- 2 comments
#5104 - [BACKEND]Refactor convertLayoutOp by expanding dot operands of 2d tensors to 3d
Pull Request -
State: open - Opened by yiqian1 8 days ago
#5103 - [BACKEND] Implement generic code to allow for dot_scaled(mmav3) and warp choices
Pull Request -
State: closed - Opened by lezcano 8 days ago
#5102 - warp_group_dot lowering crashes for specific instruction shape
Issue -
State: closed - Opened by gflegar 8 days ago
- 3 comments
#5101 - how to understand "multiRootGetSlice"?
Issue -
State: open - Opened by Shaquille-Wu 8 days ago
#5099 - [BACKEND] Drop all volta related code
Pull Request -
State: closed - Opened by Jokeren 9 days ago
- 1 comment
#5098 - Remove unknown distribution option: `test_suite`
Pull Request -
State: open - Opened by anmyachev 9 days ago
#5097 - [PROTON] Remove `#include "Driver/GPU/CudaApi.h"` as unused from `Proton.cpp`
Pull Request -
State: closed - Opened by anmyachev 9 days ago
#5096 - Artifact `llvm-fa57c7a6-almalinux-x64` contains `install` folder, which at least doubles its size.
Issue -
State: closed - Opened by anmyachev 9 days ago
- 2 comments
#5095 - Error during the use of the bit-level shifting operation
Issue -
State: open - Opened by wenhaoli-xmu 9 days ago
#5094 - Triton for windows support is needed as pytorch2 using triton in TorchInductor and TorchDynamo
Issue -
State: open - Opened by Ritanlisa 9 days ago
- 3 comments
#5093 - AssertionError assigning a value to a local array by index (triton 3.1.0)
Issue -
State: closed - Opened by alexjc 9 days ago
- 2 comments
#5092 - [RUNTIME] Add flags for detecting user-defined Autotuner hooks
Pull Request -
State: closed - Opened by aakhundov 9 days ago
- 7 comments
#5091 - about llvm.struct and vector
Issue -
State: open - Opened by Shaquille-Wu 10 days ago
- 1 comment
#5090 - Should `setuptools` be specified as a dependency in the pyproject.toml?
Issue -
State: open - Opened by vancoykendall 10 days ago
#5089 - Consolidate `getOrder` as "element order" and implement `getRepOrder` for general and NVIDIA layouts
Pull Request -
State: closed - Opened by lezcano 10 days ago
- 2 comments
#5088 - Prevent cache base64 dir names from starting with a hyphen
Pull Request -
State: closed - Opened by fulvius31 10 days ago
#5087 - [BACKEND] Fix mmav2 for fp8
Pull Request -
State: closed - Opened by Jokeren 10 days ago
#5086 - Fix descriptor type being lost in the frontend after control flow
Pull Request -
State: closed - Opened by peterbell10 10 days ago
#5085 - Use getOrder instead of getThreadOrder in AxisInfo.cpp
Pull Request -
State: closed - Opened by oplavsic 10 days ago
- 7 comments
#5084 - [AMD] Fix issue with rank=1 in tryFitCvtIntoLDS
Pull Request -
State: closed - Opened by SamGinzburg 10 days ago
- 1 comment
#5083 - [RUNTIME] Pass full kwargs to Autotuner hooks instead of positional args
Pull Request -
State: closed - Opened by aakhundov 10 days ago
- 3 comments
#5082 - pre_/post_hook in triton.autotune break with kwargs passed to the kernel
Issue -
State: closed - Opened by aakhundov 10 days ago
#5081 - [FRONTEND] Fix handling of `from m import x as y` in CodeGenerator
Pull Request -
State: closed - Opened by davidberard98 11 days ago
- 2 comments
#5080 - [BACKEND] Fix reduce with slice layout inputs
Pull Request -
State: closed - Opened by ThomasRaoux 11 days ago
#5079 - [BACKEND] Make ExternElementwise op implement ConditionallySpeculatable
Pull Request -
State: closed - Opened by davidberard98 11 days ago
- 5 comments
#5077 - [AMD] Update HIP headers to 6.2.2
Pull Request -
State: closed - Opened by antiagainst 11 days ago
#5074 - [AMD] Enable all existing scaled_dot data type tests on MI300
Pull Request -
State: closed - Opened by antiagainst 11 days ago
#5073 - [CI] remove unused inductor workflows
Pull Request -
State: open - Opened by leseb 11 days ago
- 3 comments
#5072 - [AMD] Add atomicRMW dpp logic
Pull Request -
State: closed - Opened by joviliast 11 days ago
- 2 comments
#5070 - [Triton][Allocation] Enable `getScratchValueSize` specialization
Pull Request -
State: open - Opened by victor-eds 11 days ago
- 9 comments
#5067 - [Instrumentation][Proton] Add MLIR/LLVM level compiler instrumentation pass support in Proton
Pull Request -
State: closed - Opened by CRobeck 12 days ago
- 4 comments
#5059 - [AMD] Improve instruction scheduling hints for more targets
Pull Request -
State: closed - Opened by ravil-mobile 12 days ago
- 3 comments
#5054 - Inline PTX asm with mixed ptr and value argument types
Issue -
State: open - Opened by mlazos 12 days ago
- 2 comments
#5044 - [BACKEND] Get rid of unpack/pack I32
Pull Request -
State: closed - Opened by Jokeren 15 days ago
- 1 comment
#5035 - [BACKEND] Add barrier after assert op to avoid race condition
Pull Request -
State: open - Opened by ThomasRaoux 15 days ago
#5034 - [AMD] Add support for scaled_dot(mxfp4, -)
Pull Request -
State: open - Opened by antiagainst 15 days ago
#5033 - [Backend] Fix predicates for device assert inside reduction/scan region
Pull Request -
State: open - Opened by davidberard98 15 days ago
#5032 - python triton vs native cuda performance
Issue -
State: open - Opened by helloworldstone 15 days ago
#5031 - [BACKEND] Fix the combineSelectAndIf when the user of select in ifOp.
Pull Request -
State: open - Opened by tfruan2000 15 days ago
- 2 comments
#5030 - [BACKEND] Minor Bugfixes for SharedToDotOperand MMAv3
Pull Request -
State: open - Opened by ggengnv 16 days ago
#5029 - [AMD] Enable scaled_dot(-, bf16)
Pull Request -
State: closed - Opened by antiagainst 16 days ago
#5028 - [DRAFT][AMD] Introduce OptimizeAtomicLayouts pass
Pull Request -
State: open - Opened by joviliast 16 days ago
- 2 comments
#5027 - [LoopUnroll] Do not pipeline epilog loops generated by loop unrolling
Pull Request -
State: open - Opened by htyu 16 days ago
- 4 comments
#5026 - triton.backends.compiler.AttrsDescriptor is not a Dataclass, causing torch.compile() to break when building from source
Issue -
State: closed - Opened by j93hahn 16 days ago
- 3 comments
#5025 - [WIP][Instrumentation] Add instrumentation pass for cloning kernels and augmenting kernel args region
Pull Request -
State: closed - Opened by CRobeck 16 days ago
- 1 comment
#5024 - [Triton] Add canonicalization patterns for `tt.reduce`
Pull Request -
State: open - Opened by victor-eds 16 days ago
#5023 - The triton-nightly seems to have not updated for three months
Issue -
State: open - Opened by WANDY666 16 days ago
#5022 - Memory access fault when running a copy kernel for large matrices
Issue -
State: closed - Opened by wenchenvincent 16 days ago
- 1 comment
#5021 - [NIT][BACKEND] Clean up Allocation.cpp
Pull Request -
State: closed - Opened by Jokeren 17 days ago
#5020 - Fix formatting in docs for triton.language.dot
Pull Request -
State: closed - Opened by saagarjha 17 days ago
#5019 - [AMD] Support Cross-Lane Reduction With DPP
Pull Request -
State: closed - Opened by knwng 17 days ago
- 1 comment
#5018 - [AMD] Add tritonamdgpu-block-pingpong pass
Pull Request -
State: open - Opened by jungpark-mlir 17 days ago
#5017 - Ignore autotune runs failed with PTXAS error
Pull Request -
State: closed - Opened by htyu 17 days ago
- 2 comments
#5016 - Update version to 3.2.0
Pull Request -
State: open - Opened by bertmaher 17 days ago
#5015 - Allow windows cuda files to be used in `setup.py`
Pull Request -
State: closed - Opened by anmyachev 17 days ago
#5014 - Don't specify `-A x64` option and reuse cmake build type config for Windows
Pull Request -
State: closed - Opened by anmyachev 17 days ago
#5013 - [AMD] [ROCm] from `triton.runtime.cache` import `get_cache_manager` Behaves differently between pypi whl and source built whl
Issue -
State: open - Opened by tjtanaa 17 days ago
#5012 - differences between function and JITFunction implementations in triton.language.
Issue -
State: closed - Opened by jianlingl 17 days ago
- 1 comment
#5011 - fp16 Performance Slower than fp32 in Simple vector-addition operation on A30
Issue -
State: open - Opened by fnusid 18 days ago
#5010 - [INTERPRETER] Make sure interpreter works with float16 by reusing NumPy HALF-related code
Pull Request -
State: closed - Opened by anmyachev 18 days ago
- 2 comments