Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / NVIDIA/cub issues and pull requests
#745 - BlockReduce<int, ...> causes "an illegal instruction was encountered"
Issue -
State: open - Opened by intractabilis about 1 year ago
- 3 comments
#744 - How do I reduce partially filled 2D blocks?
Issue -
State: closed - Opened by intractabilis about 1 year ago
- 2 comments
#743 - Possible bug in variable naming
Issue -
State: closed - Opened by akshit-sharma about 1 year ago
- 2 comments
#741 - nvcc fatal : Unsupported gpu architecture 'compute_80'
Issue -
State: open - Opened by tyq996 about 1 year ago
- 1 comment
#739 - BlockLoad never attempts to vectorize
Issue -
State: closed - Opened by iclementine about 1 year ago
- 5 comments
#738 - 64-bit indexing for DeviceSegmentedReduce
Pull Request -
State: closed - Opened by jecs about 1 year ago
- 3 comments
#737 - select_if kernel needs grid boundary or reprogramming tile_idx
Issue -
State: closed - Opened by zhaolianshuizls about 1 year ago
- 1 comment
#732 - what's the purpose of CUB_SUBSCRIPTION_FACTOR
Issue -
State: closed - Opened by zhaolianshuizls about 1 year ago
#731 - Misleading documentation for DeviceSegmentedRadixSort (or I'm using it wrong)
Issue -
State: closed - Opened by HapeMask about 1 year ago
- 1 comment
#730 - Segmented sorting does not preserve data in-between segments.
Issue -
State: closed - Opened by isovic about 1 year ago
- 6 comments
#729 - What is the correct compile command in Linux platform to compile a function citing cuh?
Issue -
State: closed - Opened by fengwang about 1 year ago
- 1 comment
#728 - Fix instances of 'scan' copy-pasted into reduction documentation
Pull Request -
State: closed - Opened by milesvant about 1 year ago
- 2 comments
#727 - Fix multiset erase call
Pull Request -
State: closed - Opened by rongou about 1 year ago
- 1 comment
#726 - Illegal memory access on trying to use `DeviceReduce::Sum()` to count number of non-zeros
Issue -
State: closed - Opened by alexsamardzic over 1 year ago
- 2 comments
#725 - Why are inplace overloads for BlockExchange not documented
Issue -
State: open - Opened by pauleonix over 1 year ago
- 1 comment
Labels: good first issue
#724 - Fix BlockAdjacentDifference documentation
Pull Request -
State: closed - Opened by pauleonix over 1 year ago
- 3 comments
#723 - Ensure that any CMake re-rooting doesn't break our find_file.
Pull Request -
State: closed - Opened by robertmaynard over 1 year ago
Labels: monorepo blocker
#722 - Improve docs for warp-wide primitives
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed, monorepo blocker
#721 - Tune RLE for SM90
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed, monorepo blocker
#720 - Quickselect algorithm to select kth smallest element
Issue -
State: open - Opened by mfbalin over 1 year ago
- 4 comments
#719 - Can't get correct result when use cub in CUDA12.0
Issue -
State: closed - Opened by YuanRisheng over 1 year ago
- 24 comments
#718 - Tune select and partition for SM90
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#717 - Compile error: cannot find cuda_bf16.h
Issue -
State: open - Opened by peizhang-cn over 1 year ago
- 3 comments
#716 - Use `NV_IF_ELSE_TARGET`
Issue -
State: open - Opened by senior-zero over 1 year ago
#715 - Fix reduce by key tile state for Pascal
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#714 - Workaround three-way partition compilation issue
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#713 - [EPIC] Design a scheme allowing CUB to process user-defined types of any size
Issue -
State: open - Opened by jrhemstad over 1 year ago
- 1 comment
#712 - Introduce SM90 tuning policy into scan
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
- 1 comment
Labels: testing: gpuCI passed
#711 - Tune Decoupled Look-back based Algorithms for H100
Issue -
State: closed - Opened by senior-zero over 1 year ago
- 1 comment
Labels: cub
#710 - Introduce delay policy to decoupled look-back based algorithms
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#709 - Enable test coverage for int128 in DeviceHistogram::Even
Issue -
State: open - Opened by elstehle over 1 year ago
Labels: area: tests
#708 - Fix CDP test wrapper for CTK 11.5
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#706 - Fix radix sort / MSVC 2017
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#705 - Segfault in CachingDeviceAllocator when out of memory
Issue -
State: closed - Opened by orjgre over 1 year ago
- 4 comments
#704 - Initial port to new monorepo build system.
Pull Request -
State: open - Opened by allisonvacanti over 1 year ago
- 1 comment
#703 - Use of <array> and <atomic> breaks some CUB algorithms with Jitify
Issue -
State: open - Opened by maddyscientist over 1 year ago
- 2 comments
Labels: helps: quda, compiler: nvrtc
#702 - Fix dependent template in radix sort
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#701 - Allow analysis script to process multiple dbs
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
#700 - Decoupled look-back example
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#699 - Write example for decoupled look-back API
Issue -
State: closed - Opened by senior-zero over 1 year ago
Labels: cub
#698 - Implement scripts for detection of performance regressions across CUB versions
Issue -
State: open - Opened by senior-zero over 1 year ago
Labels: cub
#697 - Implement VSMem abstraction
Issue -
State: open - Opened by senior-zero over 1 year ago
#696 - Implement tuning db merger
Issue -
State: closed - Opened by senior-zero over 1 year ago
- 3 comments
Labels: cub
#695 - Make decoupled look-back delay part of tuning
Issue -
State: closed - Opened by senior-zero over 1 year ago
- 2 comments
Labels: cub
#694 - Add policy parameter to allow tuning
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#693 - Static assert arithmetic types in basic radix sort API
Issue -
State: open - Opened by senior-zero over 1 year ago
Labels: cub
#692 - Unresolved extern function 'cudaLaunchDevice' error while using NVCC 11.x and cub 2.10 with -G
Issue -
State: closed - Opened by lilohuang over 1 year ago
- 3 comments
#691 - Provide Run-Length Decode API
Issue -
State: open - Opened by senior-zero over 1 year ago
#690 - Add block-wide set operations like merge and intersection
Issue -
State: open - Opened by fkallen over 1 year ago
#689 - Add policy parameter to allow tuning
Issue -
State: closed - Opened by senior-zero over 1 year ago
#688 - CUB Tuning Infrastructure
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#687 - Fixes narrowing conversion warning in tests
Pull Request -
State: closed - Opened by elstehle over 1 year ago
Labels: only: tests, area: tests
#686 - Re-enable extended test coverage for WarpReduce tests
Issue -
State: open - Opened by elstehle over 1 year ago
Labels: only: tests, area: tests, cub
#685 - Remove extra fence in radix sort
Issue -
State: open - Opened by senior-zero over 1 year ago
#684 - Removes unused variables in DeviceCopy::Batched
Pull Request -
State: closed - Opened by elstehle over 1 year ago
Labels: only: tests, area: tests
#683 - Limits tests for generic WarpReduce tests to builtin types
Pull Request -
State: closed - Opened by elstehle over 1 year ago
Labels: cub
#682 - Remove add_to_project action as it is no longer needed.
Pull Request -
State: closed - Opened by jrhemstad over 1 year ago
#681 - Backport fixes for reordering in CUB member initializer lists into 2.0 branch
Pull Request -
State: closed - Opened by ericniebler over 1 year ago
Labels: testing: gpuCI in progress
#678 - Fixes non_void_value_t for ptr-to-void iterator types
Pull Request -
State: closed - Opened by elstehle over 1 year ago
#677 - Documentation of warp-wide collectives refers to `__syncthreads` instead of `__syncwarp`
Issue -
State: closed - Opened by fkallen over 1 year ago
- 1 comment
#675 - Initial implementation for fancy devicememcpybatch
Pull Request -
State: closed - Opened by mfbalin over 1 year ago
- 8 comments
#674 - Specialize DeviceMemcpy::Batched to also support iterators
Issue -
State: closed - Opened by elstehle over 1 year ago
- 1 comment
Labels: cub
#672 - DeviceMemcpy::Batched supports only memory buffers
Issue -
State: closed - Opened by mfbalin over 1 year ago
- 4 comments
#671 - Support user-defined types in radix sort and introduce initial sphinx docs
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#670 - Add WarpExchangeRegister class
Pull Request -
State: closed - Opened by pb-dseifert over 1 year ago
- 7 comments
#668 - Backport a few fixes from main
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#667 - Register-only based `WarpExchange`
Issue -
State: closed - Opened by pb-dseifert over 1 year ago
- 4 comments
#666 - Can cub::DeviceSegmentedReduce::Reduce support self-defined functor for struct variable instead of just integer?
Issue -
State: closed - Opened by zlwu92 over 1 year ago
- 2 comments
#665 - Fix uninitialized copy in block scan raking
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#664 - Optimize merge sort
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed, area: performance
#663 - Performance of small sums could be improved
Issue -
State: open - Opened by seberg over 1 year ago
- 1 comment
#662 - Apply `ArgMin`/`ArgMax` fix for `infinity` input to `Min`/`Max`
Issue -
State: closed - Opened by nolmoonen over 1 year ago
- 2 comments
#661 - Reduce RAM usage by radix sort tests
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI in progress, only: tests
#660 - Migrates remaining warp-scope tests to Catch2
Pull Request -
State: closed - Opened by elstehle over 1 year ago
Labels: testing: gpuCI passed
#659 - Fix `cub::DeviceSpmv` for empty matrices
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
- 2 comments
Labels: testing: gpuCI in progress
#658 - Deprecate cub::mutex and fix remaining narrowing conversions in tests
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed, release: breaking change
#657 - Fix conversion issues in block histogram test
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed, only: tests
#655 - Is there any way that I want to remove duplicates in a array but maintaining the original relative order in array using cuda library? That means not sorting.
Issue -
State: closed - Opened by zlwu92 over 1 year ago
#654 - CDP abstraction for CUB tests
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed, only: tests, cub
#653 - Deprecate `cub::Mutex`
Issue -
State: closed - Opened by senior-zero over 1 year ago
- 1 comment
#652 - Fix `cub::DeviceReduce::Arg{Max,Min}` for `inf` values
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
- 3 comments
Labels: testing: gpuCI passed
#651 - Fix reduce to match the documentation and use numeric limits
Issue -
State: open - Opened by senior-zero over 1 year ago
#650 - Force reuse of CUDA arches from thrust.
Pull Request -
State: closed - Opened by allisonvacanti over 1 year ago
#649 - Assert max tile size for radix sort
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: testing: gpuCI passed
#648 - CI: set minimal permissions on GitHub Workflow
Issue -
State: open - Opened by diogoteles08 over 1 year ago
- 3 comments
#647 - Link `CubDebug` with `CUB_DEBUG_LOG`
Issue -
State: open - Opened by senior-zero over 1 year ago
Labels: cub
#646 - 2.1.x changelog update
Pull Request -
State: closed - Opened by allisonvacanti over 1 year ago
#644 - how to use dynamic shared memory in cub block radix sort
Issue -
State: closed - Opened by zlwu92 over 1 year ago
- 12 comments
#643 - Is it possible to radix sort a struct?
Issue -
State: closed - Opened by alibillalhammoud over 1 year ago
- 1 comment
#642 - `cub::DeviceReduce::ArgMin` returns wrong value for INF input
Issue -
State: closed - Opened by asi1024 over 1 year ago
- 1 comment
Labels: type: bug: functional
#640 - why some code use ptx?
Issue -
State: closed - Opened by MonroeD over 1 year ago
#639 - Fix clang / nvcc CI build
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
#638 - Cub can't be compiled on CLANG with CUDA 11.8
Issue -
State: closed - Opened by ShuaiShao93 over 1 year ago
- 3 comments
#636 - Question regarding block launch order in CUDA
Issue -
State: closed - Opened by Snektron over 1 year ago
#632 - Fixes overflows for [Multi]HistogramEven over integral types
Pull Request -
State: closed - Opened by elstehle over 1 year ago
Labels: testing: gpuCI passed
#631 - Initial CUB Tuning Infrastructure
Issue -
State: closed - Opened by senior-zero over 1 year ago
Labels: P0: must have, area: performance, cub
#630 - Cleanup CTK version checks
Pull Request -
State: closed - Opened by senior-zero over 1 year ago
Labels: type: bug: functional, testing: gpuCI passed
#629 - Compiler error with CUDA <11.5 due to int128
Issue -
State: closed - Opened by jrhemstad over 1 year ago
- 1 comment
Labels: type: bug: functional, cub
#628 - Add CUDA CC 87
Pull Request -
State: closed - Opened by jrhemstad over 1 year ago
- 1 comment
#625 - Rewrite remaining tests to use Catch2
Issue -
State: closed - Opened by senior-zero over 1 year ago
- 2 comments
Labels: cub