Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / triton-lang/triton issues and pull requests
#5009 - [BACKEND][NVIDIA] Add kWidth support in MMAv3
Pull Request -
State: closed - Opened by ggengnv 18 days ago
- 5 comments
#5008 - [AMD] Skip scaled_dot tests for gfx11 and gfx12
Pull Request -
State: closed - Opened by AlexAUT 18 days ago
#5007 - Fold fp_to_fp op with zero constant input
Pull Request -
State: open - Opened by oplavsic 18 days ago
#5006 - [FRONTEND] Fix transpose with tuple dims
Pull Request -
State: closed - Opened by peterbell10 18 days ago
#5005 - [AMD][BACKEND] Switch to code object v5
Pull Request -
State: open - Opened by AlexAUT 18 days ago
#5003 - [BACKEND][NVIDIA] Add DotOp Hoisting Pass for WGMMA and Add Lowering for SMEM-to-MMAv3 DotOp Copy
Pull Request -
State: open - Opened by ggengnv 19 days ago
- 33 comments
#5002 - [AMD] remove redundant LDS bypass checks
Pull Request -
State: closed - Opened by binarman 19 days ago
- 2 comments
#5000 - different between do_not_specialize and tl.constexpr
Issue -
State: closed - Opened by zhanglei1172 20 days ago
- 1 comment
#4999 - Hopper support for mma -> mma layout conversion?
Issue -
State: closed - Opened by hypnopump 20 days ago
- 1 comment
#4998 - [AMD] Restructure ReorderInstructions pass
Pull Request -
State: closed - Opened by antiagainst 21 days ago
- 1 comment
#4997 - [CI] bump yapf version from be72557 to 7e21823
Pull Request -
State: closed - Opened by XuehaiPan 21 days ago
#4996 - [Frontend][Backend] Implement support for scale_dot(-, bf16)
Pull Request -
State: closed - Opened by lezcano 22 days ago
#4995 - libtriton.so undefined symbol: Pyxxx
Issue -
State: closed - Opened by helloworldstone 22 days ago
#4994 - [AMD] Add initial support for scaled_dot(mxfp8, fp8)
Pull Request -
State: closed - Opened by antiagainst 22 days ago
- 1 comment
#4993 - [AMD] getBackwardSlice variant with handling for op regions
Pull Request -
State: open - Opened by karthik-man 23 days ago
- 1 comment
#4992 - Interpreter fails with float16 atomic_add
Issue -
State: closed - Opened by davidberard98 23 days ago
- 5 comments
Labels: enhancement
#4991 - [BACKEND] Improve detection of register to register conversion
Pull Request -
State: closed - Opened by Jokeren 23 days ago
#4990 - Pytest on Ampere platform has errors, see the log below
Issue -
State: closed - Opened by bingyizh233 23 days ago
- 1 comment
#4989 - about DCE for ModuleAxisInfoAnalysis
Issue -
State: open - Opened by Shaquille-Wu 23 days ago
- 2 comments
#4987 - [Proton] Adding Sorting of Kernels
Pull Request -
State: closed - Opened by CRobeck 24 days ago
- 4 comments
#4984 - [AMD][prototype] Transpose between global load and local store for non-TN layouts
Pull Request -
State: open - Opened by jtang10 24 days ago
#4983 - [AMD] Enable shared->MFMA dot operand conversion through LinearLayout
Pull Request -
State: closed - Opened by binarman 24 days ago
- 1 comment
#4982 - [BACKEND][NVIDIA] pass ptx-version to ttgir->llir conversion pass and use it for vectorized atomics
Pull Request -
State: closed - Opened by davidberard98 24 days ago
- 9 comments
#4981 - [TEST] float16 test for test_tensor_atomic_rmw
Pull Request -
State: closed - Opened by davidberard98 24 days ago
- 3 comments
#4980 - [Backend] Fix when trying to convert an mma<!tt.ptr<f32>> into blocked
Pull Request -
State: closed - Opened by lezcano 24 days ago
- 6 comments
#4978 - Pypi build for aarch64
Issue -
State: open - Opened by surak 24 days ago
- 2 comments
#4976 - [INTERPRETER] Replace GCC `__ATOMIC*` built-ins with std::atomic for `interpreter.cc`
Pull Request -
State: closed - Opened by anmyachev 24 days ago
- 7 comments
#4975 - module 'triton.language.extra.cuda' has no attribute 'experimental_device_tensormap_create2d'
Issue -
State: closed - Opened by Daming-wang 24 days ago
- 1 comment
#4973 - [DO NOT MERGE] Reintroduce "[SWP] attempt to remove a workaround for a triton llvm codegen bug (#4873)"
Pull Request -
State: closed - Opened by pawelszczerbuk 25 days ago
#4970 - Where can I find a triton 2.0.0 dev version?
Issue -
State: open - Opened by hulihan-start 25 days ago
- 1 comment
#4968 - [Backend] Add LL::quotient and remove uses of divideRight and sublayoutIsIdentity
Pull Request -
State: closed - Opened by lezcano 25 days ago
- 2 comments
#4966 - [AMD] Add pass to convert tt.load/tt.store to buffer operations
Pull Request -
State: closed - Opened by giuseros 25 days ago
- 3 comments
#4962 - Document that tl.reduce assumes associativity/commutativity
Issue -
State: open - Opened by bertmaher 26 days ago
- 1 comment
#4958 - [NFC] Use `get_config_var('EXT_SUFFIX')` instead of using `so` directly
Pull Request -
State: closed - Opened by anmyachev 26 days ago
#4955 - [Frontend] [BC breaking] Always follow C semantics on %
Pull Request -
State: closed - Opened by brod4910 28 days ago
#4951 - [BACKEND] Replace `isMmaToDotShortcut` with linear layout based logic
Pull Request -
State: closed - Opened by Jokeren 29 days ago
- 2 comments
#4950 - [Backend] Pipeline scale_dot
Pull Request -
State: closed - Opened by lezcano 29 days ago
#4940 - [AMD] Reland instruction scheduling hint changes
Pull Request -
State: closed - Opened by ravil-mobile about 1 month ago
- 2 comments
#4937 - [AMD] Add fast_expf to libdevice
Pull Request -
State: open - Opened by knwng about 1 month ago
#4936 - [BUILD] Add instrumentation libs to `.gitignore`
Pull Request -
State: closed - Opened by Jokeren about 1 month ago
#4935 - [AMD] Reland sinking the 2nd tt.load after local_load's
Pull Request -
State: closed - Opened by zhanglx13 about 1 month ago
- 1 comment
#4934 - Fix 3xTF32 precision issues
Pull Request -
State: closed - Opened by alexsamardzic about 1 month ago
#4933 - Fix 3xTF32 precision issues
Pull Request -
State: closed - Opened by ThomasRaoux about 1 month ago
- 1 comment
#4932 - Added an option to dump IR to files
Pull Request -
State: open - Opened by arakhmati-openai about 1 month ago
#4931 - Add the predicate to the instrRepr before returning it when onlyAttachMLIRArgs=true
Pull Request -
State: closed - Opened by arakhmati-openai about 1 month ago
#4930 - [backend] Update to llvm/llvm-project@b5cc222d7429
Pull Request -
State: closed - Opened by ravil-mobile about 1 month ago
#4929 - Accelerating custom sparsity patterns on GPU with triton
Issue -
State: open - Opened by abhishektyaagi about 1 month ago
- 2 comments
#4928 - Update GCC aarch64 toolchain to 14.2.0-1
Pull Request -
State: closed - Opened by antiagainst about 1 month ago
#4927 - [BACKEND] Update to llvm/llvm-project@b5cc222d7429
Pull Request -
State: closed - Opened by ravil-mobile about 1 month ago
#4926 - Fix 3xTF32 precision issues
Pull Request -
State: closed - Opened by alexsamardzic about 1 month ago
- 12 comments
#4925 - [AMD] Emit vectorized 16-bit float LLVM atomic ops
Pull Request -
State: closed - Opened by joviliast about 1 month ago
- 9 comments
#4924 - [frontend] Remove Complex Regex for MLIR Parsing
Pull Request -
State: open - Opened by SamGinzburg about 1 month ago
#4923 - [BACKEND] Update LLVM to llvm/llvm-project@1d40fefb08e9b11
Pull Request -
State: closed - Opened by ravil-mobile about 1 month ago
- 1 comment
#4922 - LLVM ERROR: mma16816 data type not supported
Issue -
State: closed - Opened by mobicham about 1 month ago
- 8 comments
#4921 - Return `num_warmups`, `num_reps` and `use_cuda_graph` fields of `Autotuner`
Pull Request -
State: open - Opened by anmyachev about 1 month ago
- 1 comment
#4920 - [AMD] unrevert #4901; revert #4823
Pull Request -
State: closed - Opened by ptillet about 1 month ago
#4919 - [AMD] revert optimizations
Pull Request -
State: closed - Opened by ptillet about 1 month ago
#4918 - log1p
Issue -
State: closed - Opened by nimz about 1 month ago
- 1 comment
#4917 - [CI] Run CI on all PRs
Pull Request -
State: closed - Opened by peterbell10 about 1 month ago
- 3 comments
#4916 - Add tensor descriptor API backed by device-side TMA creation
Pull Request -
State: closed - Opened by peterbell10 about 1 month ago
#4915 - [Frontend] Factor out block shape validation function
Pull Request -
State: closed - Opened by peterbell10 about 1 month ago
#4914 - Fix order
Pull Request -
State: open - Opened by rawnhenry about 1 month ago
- 8 comments
#4913 - Update pre-commit config
Pull Request -
State: closed - Opened by anmyachev about 1 month ago
#4912 - [AMD] Fix gfx12 warp size and fix wmma in maybeDeduplicate
Pull Request -
State: closed - Opened by AlexAUT about 1 month ago
#4911 - Windows fix
Pull Request -
State: closed - Opened by wkpark about 1 month ago
#4910 - Add a tt.pointer_range_32 specialization for AMD backend
Pull Request -
State: open - Opened by giuseros about 1 month ago
- 2 comments
#4909 - Will tl.load faster without boundary_check?
Issue -
State: open - Opened by MARD1NO about 1 month ago
#4908 - `triton-nightly` package not updated since July 17
Issue -
State: open - Opened by 152334H about 1 month ago
#4907 - [WIP] [AMD] Apply basic optimizations to vecAdd on MI300X
Pull Request -
State: closed - Opened by zhanglx13 about 1 month ago
#4906 - Poor performance on Ampere vs. Ada with bitpacked weights
Issue -
State: open - Opened by mobicham about 1 month ago
- 18 comments
#4905 - [Frontend] Re-enable NumPy 2.0 semantics for add, sub, mul.
Pull Request -
State: closed - Opened by lezcano about 1 month ago
#4904 - [Backend] Implement `scaled_dot(mxfp4, fp8)`
Pull Request -
State: closed - Opened by lezcano about 1 month ago
#4903 - Introduce amdgpu.buffer_load and amdgpu.buffer_store
Pull Request -
State: open - Opened by giuseros about 1 month ago
- 1 comment
#4902 - Can not understand canFoldIntoConversion function in layout propagate process
Issue -
State: open - Opened by yanzixu about 1 month ago
#4901 - [AMD] Fix "keep Q tensor in VGPRS" optimization
Pull Request -
State: closed - Opened by oplavsic about 1 month ago
- 2 comments
#4900 - chore: update core.py
Pull Request -
State: closed - Opened by eltociear about 1 month ago
- 1 comment
#4899 - [Instrumentation] Move instrumentation lib test to stand alone lib directory and update name
Pull Request -
State: closed - Opened by CRobeck about 1 month ago
#4898 - only cpu
Issue -
State: closed - Opened by JocelynPanPan about 1 month ago
- 1 comment
#4897 - Fix typo: Correct 'piepling' to 'pipelining' in kernel comments for clarity in software optimization.
Pull Request -
State: closed - Opened by yuWeiCute about 1 month ago
#4896 - [IR] Add poison value to triton IR and use in frontend in place of undef
Pull Request -
State: closed - Opened by peterbell10 about 1 month ago
- 1 comment
#4895 - [BACKEND] Small fixes for dot operand properties
Pull Request -
State: closed - Opened by Jokeren about 1 month ago
- 2 comments
#4894 - Fix type hint in setup.py
Pull Request -
State: closed - Opened by kbumsik about 1 month ago
#4893 - [Backend] Update scf.if result uses in RewriteTensorPointer pass
Pull Request -
State: closed - Opened by yiqian1 about 1 month ago
#4892 - [BACKEND] Avoid undefined behavior in `std::clamp` when `shapePerCTA[i] < sizePerThread[i]`
Pull Request -
State: closed - Opened by Jokeren about 1 month ago
#4891 - [Linear Layouts] Implement LL conversion for DotOperand(version=2)
Pull Request -
State: closed - Opened by lezcano about 1 month ago
- 2 comments
#4890 - [Pipeliner] Fix epilogue peeling for num_stages=3+
Pull Request -
State: closed - Opened by sjw36 about 1 month ago
- 1 comment
#4889 - Auto import backends in `triton.language.extra`
Pull Request -
State: closed - Opened by kbumsik about 1 month ago
#4888 - Add string representation for AttrsDescriptor
Pull Request -
State: closed - Opened by alexbaden about 1 month ago
- 4 comments
#4887 - [SWP] Fix a bug in SWP that did not correctly compute the number of loop iterations
Pull Request -
State: open - Opened by sfzhu93 about 1 month ago
#4886 - [AMD] Skip incompatible layouts in RemoveLayoutConversions pass
Pull Request -
State: open - Opened by binarman about 1 month ago
#4885 - [Frontend] Fix codegen when top level control flow occurs after an unconditional return
Pull Request -
State: closed - Opened by peterbell10 about 1 month ago
#4884 - Unpack the int8 incorrectly under irregular dimensions
Issue -
State: closed - Opened by kuviki about 1 month ago
- 1 comment
#4883 - LLVM ERROR: operation destroyed but still has uses
Issue -
State: closed - Opened by jansel about 1 month ago
Labels: bug
#4882 - [frontend] Pretty-print `ptxas` command on failure
Pull Request -
State: closed - Opened by bertmaher about 1 month ago
#4881 - [AMD][Pipeliner] Improve clustering and add prefetch
Pull Request -
State: closed - Opened by sjw36 about 1 month ago
- 1 comment
#4880 - [SWP][Tests] Add one test that triggers SWP error and improve test logic
Pull Request -
State: closed - Opened by sfzhu93 about 1 month ago
- 1 comment
#4879 - Weird tl.trans dims error
Issue -
State: closed - Opened by Edenzzzz about 1 month ago
#4878 - How can I add some dialects for optimization?
Issue -
State: open - Opened by zhananran about 1 month ago
- 4 comments
#4877 - [BACKEND] Update LLVM version to https://github.com/llvm/llvm-project/commit/82f5acfbec65e1a645d902f746253eeaf0bd2d70
Pull Request -
State: closed - Opened by khasanovaa about 1 month ago
#4876 - different gemm performance behaviors on A100 and RTX 4060
Issue -
State: open - Opened by brisker about 1 month ago