Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / NVIDIA/nccl-tests issues and pull requests
#124 - ArchLinux test Failed
Issue -
State: open - Opened by jacklu333333 almost 2 years ago
- 3 comments
#123 - Understanding the latency of NCCL
Issue -
State: open - Opened by ConnollyLeon almost 2 years ago
- 2 comments
#122 - Update getHostHash() to avoid hash conflict
Pull Request -
State: closed - Opened by dong0321 almost 2 years ago
- 3 comments
#121 - Multi-Node Launch
Issue -
State: open - Opened by apoorvemohan almost 2 years ago
- 1 comment
#120 - Option to output results in csv and json format
Issue -
State: open - Opened by avolkov1 almost 2 years ago
- 11 comments
#119 - Evaluation of NCCL test result
Issue -
State: closed - Opened by Yujaeseo almost 2 years ago
- 2 comments
#118 - nccl-test result (error field)
Issue -
State: open - Opened by susol-hjkim almost 2 years ago
- 1 comment
#117 - NCCL all_reduce_perf test hangs with multiple RTX 4090 GPUs, works fine when I swap in 2080tis
Issue -
State: closed - Opened by RCS1 almost 2 years ago
- 47 comments
#116 - Update README.md
Pull Request -
State: closed - Opened by BlueCloudDev almost 2 years ago
- 4 comments
#115 - The multi-gpu tests always hang and NCCL cannot find CUDA
Issue -
State: open - Opened by SusuXu about 2 years ago
- 5 comments
#114 - Does not compile with NVHPC 22.7
Issue -
State: closed - Opened by zyndagj about 2 years ago
- 2 comments
#113 - Support setting CUDA_VISIBLE_DEVICES env variable
Pull Request -
State: open - Opened by ryanamazon about 2 years ago
- 7 comments
#112 - nccl test only gets ~65% of the link bandwidth
Issue -
State: closed - Opened by sandyhouse about 2 years ago
- 10 comments
#111 - nccl test failed with mpirun for two machines
Issue -
State: closed - Opened by sandyhouse about 2 years ago
- 6 comments
#110 - The size of grid and block seems mismatch
Issue -
State: open - Opened by ihchoi12 over 2 years ago
- 2 comments
#109 - Do ranks on multiple nodes participate in ops or is the test standalone?
Issue -
State: closed - Opened by MrAta over 2 years ago
- 1 comment
#108 - Test failure when NCCL_MIN_NCHANNELS is set to a value other than 2
Issue -
State: open - Opened by ihchoi12 over 2 years ago
- 2 comments
#107 - How to understand the result?
Issue -
State: open - Opened by ihchoi12 over 2 years ago
- 8 comments
#106 - Inconsistent all_reduce busbw between 2 nodes
Issue -
State: open - Opened by zhengwy888 over 2 years ago
- 9 comments
#105 - Support setting CUDA_VISIBLE_DEVICES env variable
Pull Request -
State: closed - Opened by nzmsv over 2 years ago
- 4 comments
#104 - Got different results on same devices and same tests
Issue -
State: closed - Opened by HaoKang-Timmy over 2 years ago
- 2 comments
#103 - where is mpi.h
Issue -
State: open - Opened by ShivanshuPurohit over 2 years ago
#102 - Multiple node NCCL tests hang
Issue -
State: open - Opened by aamcintosh over 2 years ago
- 3 comments
#101 - Profiling all_reduce_perf with Nsight hangs
Issue -
State: open - Opened by caogao almost 3 years ago
- 1 comment
#100 - HGX A100 can not reach peak bandwidth on 2nd Gen NVSwitch ?
Issue -
State: closed - Opened by ShrimpLau almost 3 years ago
- 9 comments
#99 - slow ranks search improvements
Pull Request -
State: open - Opened by dmonakhov almost 3 years ago
#98 - testing nccl in 2 nodes and 16 gpus and something wrong
Issue -
State: closed - Opened by ngc7292 almost 3 years ago
- 9 comments
#97 - Why is the bandwidth utilization of 7 GPUs much lower than that of 6, 8 GPUs for tasks with a specific total parameter amount?
Issue -
State: open - Opened by Youhe-Jiang almost 3 years ago
- 2 comments
#96 - Add option to statically link cudart
Pull Request -
State: closed - Opened by AddyLaddy almost 3 years ago
- 1 comment
#95 - Tests do not build/run with nvhpc -- missing link to CUDA Runtime
Issue -
State: open - Opened by ronnieChatt almost 3 years ago
- 1 comment
#94 - /bin/ld: cannot find -lmpi
Issue -
State: closed - Opened by ShreyasKudari almost 3 years ago
- 2 comments
#93 - Performance degrades drastically between two docker in one host
Issue -
State: closed - Opened by cookie-YL about 3 years ago
- 11 comments
#92 - Multiple node running nccl tests failed.
Issue -
State: open - Opened by JunjieChen-2020 about 3 years ago
- 7 comments
#91 - Failure with NCCL 2.10 + CUDA 11.4
Issue -
State: closed - Opened by xkszltl about 3 years ago
- 8 comments
#90 - common.o fails build with cudaStreamCaptureModeThreadLocal undefined - CUDA 10
Issue -
State: closed - Opened by amrragab8080 about 3 years ago
- 2 comments
#89 - how to make all reduce use ring
Issue -
State: closed - Opened by Mellonta over 3 years ago
- 5 comments
#88 - Cleanup argument error handling and messages
Pull Request -
State: closed - Opened by nzmsv over 3 years ago
#87 - Compilation failure with GCC 10.3.0
Issue -
State: open - Opened by cponder over 3 years ago
- 5 comments
#86 - Topology XML file
Issue -
State: closed - Opened by guntrogu over 3 years ago
- 2 comments
#85 - ./build/all_reduce_perf: error while loading shared libraries: libnccl.so.2: cannot open shared object file: No such file or directory
Issue -
State: closed - Opened by ShreyasKudari over 3 years ago
- 3 comments
#83 - NCCL unable to use full TCP bandwidth in Azure
Issue -
State: closed - Opened by rhl-bthr over 3 years ago
- 6 comments
#82 - NCCL Broadcast bus bandwidth higher than network bandwidth
Issue -
State: closed - Opened by rhl-bthr over 3 years ago
- 6 comments
#81 - NCCL_HOME set, and still nccl.h: No such file or directory
Issue -
State: closed - Opened by WurmD over 3 years ago
- 5 comments
#80 - Add support for new datatype: bfloat16
Pull Request -
State: closed - Opened by AddyLaddy over 3 years ago
#79 - Test CUDA failure common.cu:253 'no kernel image is available for execution on the device'
Issue -
State: closed - Opened by ratovarius over 3 years ago
- 2 comments
#78 - NCCL alltoall tests failing at 256 GPUs
Issue -
State: open - Opened by awan-10 over 3 years ago
- 10 comments
#72 - Question about why measurement of time for performance benchmarks includes time taken for MPI_Barrier to complete.
Issue -
State: closed - Opened by nithintsk over 3 years ago
- 4 comments
#66 - Test CUDA failure common.cu:730 'unknown error'
Issue -
State: closed - Opened by kumareshr over 3 years ago
- 2 comments
#65 - when run test in default, How can I determine what nccl-algorithm is used
Issue -
State: closed - Opened by huyutuo over 3 years ago
- 2 comments
#64 - Add boot_id to the hostname hash due to collisions on Azure
Pull Request -
State: closed - Opened by AddyLaddy over 3 years ago
- 1 comment
#62 - Got very low performance of nccl-tests on A100 with NVLink over 200Gb RoCE network
Issue -
State: open - Opened by weberxie almost 4 years ago
- 7 comments
#58 - Test NCCL failure common.cu:752 'internal error'
Issue -
State: closed - Opened by ghost almost 4 years ago
- 1 comment
#57 - GPU affinity sets different CPU masks when using the same NCCL_TOPO_FILE
Issue -
State: closed - Opened by rexcsn almost 4 years ago
- 3 comments
#55 - common.cu:375
Issue -
State: closed - Opened by Hamidreza-Ramezani almost 4 years ago
- 2 comments
#54 - Run with MPI on 40 processes test failed.
Issue -
State: closed - Opened by TimJZ almost 4 years ago
- 5 comments
#51 - show me the nccl.h is not found. who can help me ?? ^_^
Issue -
State: closed - Opened by harrycrq about 4 years ago
- 2 comments
#50 - undefine reference to ncclTestEngine at compile time
Issue -
State: closed - Opened by AmericanEnglish about 4 years ago
- 3 comments
#47 - ncclSend and ncclRecv undefined
Issue -
State: closed - Opened by joehandzik over 4 years ago
- 5 comments
#44 - System hangs running nccl-tests, with 2 2080ti and NVlink bridge.
Issue -
State: closed - Opened by AlexWang1900 over 4 years ago
- 2 comments
#42 - Feature request: write results to file
Issue -
State: closed - Opened by christopherhesse over 4 years ago
- 3 comments
#41 - Internal error
Issue -
State: closed - Opened by hpadhuka over 4 years ago
- 6 comments
#39 - MPI_HOME alone is not able to build since mpi.h does not reside in /usr/include/mpi.h
Issue -
State: closed - Opened by 372046933 over 4 years ago
- 3 comments
#38 - Is PGI a suported compiler?
Issue -
State: closed - Opened by dkokron over 4 years ago
- 4 comments
#36 - error during all_reduce_perf with openmpi running on Azure Standard_NC24rs_v3 Infiniband.
Issue -
State: closed - Opened by tohaowu over 4 years ago
- 12 comments
#35 - make error `error: missing binary operator before token "("`
Issue -
State: closed - Opened by zsef123 over 4 years ago
- 4 comments
#34 - /usr/bin/ld:cannot find -lmpi
Issue -
State: closed - Opened by Dongguage over 4 years ago
- 2 comments
#32 - mpi.h: No such file or directory
Issue -
State: closed - Opened by scottzockoll over 4 years ago
- 6 comments
#30 - Test CUDA failure common.cu:730 'no CUDA-capable device is detected'
Issue -
State: closed - Opened by shuxiaobo almost 5 years ago
- 3 comments
#27 - Running NCCL test on multiple nodes
Issue -
State: closed - Opened by leeQT almost 5 years ago
- 6 comments
#26 - nccl-test with mpi hangs
Issue -
State: closed - Opened by eric-haibin-lin about 5 years ago
- 6 comments
#25 - Add bit redop test
Pull Request -
State: open - Opened by wangxicoding about 5 years ago
- 3 comments
#19 - Why is all_reduce_perf result not consistent with that of reduce_scatter_perf and all_gather_perf?
Issue -
State: closed - Opened by EdwardZhang88 over 5 years ago
- 6 comments
#18 - Stuck when running MPI test
Issue -
State: closed - Opened by kyoungrok0517 over 5 years ago
- 13 comments
#15 - NCCL failure common.cu:916
Issue -
State: closed - Opened by gmyofustc about 6 years ago
- 14 comments
#13 - NCCL failure all_reduce.cu:95 'unhandled cuda error'
Issue -
State: closed - Opened by leonf88 about 6 years ago
- 2 comments
#12 - Cuda failure common.cu:891
Issue -
State: closed - Opened by Ujjalbuet over 6 years ago
- 2 comments
#10 - Running in container got failed: misc/nvmlwrap.cu:170 WARN nvmlDeviceSetCpuAffinity() failed: Unknown Error
Issue -
State: closed - Opened by fxrcode over 6 years ago
- 1 comment
#9 - mpi run on multi nodes does not work
Issue -
State: closed - Opened by gmyofustc over 6 years ago
- 3 comments
#8 - nv_peer_mem NCCL2 nccl-tests fails with: Out of bounds values : 24 FAILED
Issue -
State: closed - Opened by shijieheping over 6 years ago
- 5 comments
#7 - Out of bounds values : 248 FAILED
Issue -
State: closed - Opened by wm10240 over 6 years ago
- 4 comments
#6 - NCCL failure common.cu:908 'unhandled cuda error'
Issue -
State: closed - Opened by galphag over 6 years ago
- 2 comments
#4 - Is NCCL suitable for calculating "sum = a1 + a2 + .. +an;"?
Issue -
State: closed - Opened by NanXiao almost 7 years ago
- 2 comments
#2 - Multinode NCCL 2.0 MPI Test code failure
Issue -
State: closed - Opened by mpatwary almost 7 years ago
- 8 comments