Ecosyste.ms: Issues

GitHub / NVIDIA/nccl-tests issues and pull requests

#124 - ArchLinux test Failed

Issue - State: open - Opened by jacklu333333 almost 2 years ago - 3 comments

#123 - Understanding the latency of NCCL

Issue - State: open - Opened by ConnollyLeon almost 2 years ago - 2 comments

#122 - Update getHostHash() to avoid hash conflict

Pull Request - State: closed - Opened by dong0321 almost 2 years ago - 3 comments

#121 - Multi-Node Launch

Issue - State: open - Opened by apoorvemohan almost 2 years ago - 1 comment

#120 - Option to output results in csv and json format

Issue - State: open - Opened by avolkov1 almost 2 years ago - 11 comments

#119 - Evaluation of NCCL test result

Issue - State: closed - Opened by Yujaeseo almost 2 years ago - 2 comments

#118 - nccl-test result (error field)

Issue - State: open - Opened by susol-hjkim almost 2 years ago - 1 comment

#116 - Update README.md

Pull Request - State: closed - Opened by BlueCloudDev almost 2 years ago - 4 comments

#115 - The multi-gpu tests always hang and NCCL cannot find CUDA

Issue - State: open - Opened by SusuXu about 2 years ago - 5 comments

#114 - Does not compile with NVHPC 22.7

Issue - State: closed - Opened by zyndagj about 2 years ago - 2 comments

#113 - Support setting CUDA_VISIBLE_DEVICES env variable

Pull Request - State: open - Opened by ryanamazon about 2 years ago - 7 comments

#112 - nccl test only gets ~65% of the link bandwidth

Issue - State: closed - Opened by sandyhouse about 2 years ago - 10 comments

#111 - nccl test failed with mpirun for two machines

Issue - State: closed - Opened by sandyhouse about 2 years ago - 6 comments

#110 - The size of grid and block seems mismatch

Issue - State: open - Opened by ihchoi12 over 2 years ago - 2 comments

#109 - Do ranks on multiple nodes participate in ops or is the test standalone?

Issue - State: closed - Opened by MrAta over 2 years ago - 1 comment

#108 - Test failure when NCCL_MIN_NCHANNELS is set to a value other than 2

Issue - State: open - Opened by ihchoi12 over 2 years ago - 2 comments

#107 - How to understand the result?

Issue - State: open - Opened by ihchoi12 over 2 years ago - 8 comments

#106 - Inconsistent all_reduce busbw between 2 nodes

Issue - State: open - Opened by zhengwy888 over 2 years ago - 9 comments

#105 - Support setting CUDA_VISIBLE_DEVICES env variable

Pull Request - State: closed - Opened by nzmsv over 2 years ago - 4 comments

#104 - Got different results on same devices and same tests

Issue - State: closed - Opened by HaoKang-Timmy over 2 years ago - 2 comments

#103 - where is mpi.h

Issue - State: open - Opened by ShivanshuPurohit over 2 years ago

#102 - Multiple node NCCL tests hang

Issue - State: open - Opened by aamcintosh over 2 years ago - 3 comments

#101 - Profiling all_reduce_perf with Nsight hangs

Issue - State: open - Opened by caogao almost 3 years ago - 1 comment

#100 - HGX A100 can not reach peak bandwidth on 2nd Gen NVSwitch ?

Issue - State: closed - Opened by ShrimpLau almost 3 years ago - 9 comments

#99 - slow ranks search improvements

Pull Request - State: open - Opened by dmonakhov almost 3 years ago

#98 - testing nccl in 2 nodes and 16 gpus and something wrong

Issue - State: closed - Opened by ngc7292 almost 3 years ago - 9 comments

#96 - Add option to statically link cudart

Pull Request - State: closed - Opened by AddyLaddy almost 3 years ago - 1 comment

#95 - Tests do not build/run with nvhpc -- missing link to CUDA Runtime

Issue - State: open - Opened by ronnieChatt almost 3 years ago - 1 comment

#94 - /bin/ld: cannot find -lmpi

Issue - State: closed - Opened by ShreyasKudari almost 3 years ago - 2 comments

#93 - Performance degrades drastically between two docker in one host

Issue - State: closed - Opened by cookie-YL about 3 years ago - 11 comments

#92 - Multiple node running nccl tests failed.

Issue - State: open - Opened by JunjieChen-2020 about 3 years ago - 7 comments

#91 - Failure with NCCL 2.10 + CUDA 11.4

Issue - State: closed - Opened by xkszltl about 3 years ago - 8 comments

#90 - common.o fails build with cudaStreamCaptureModeThreadLocal undefined - CUDA 10

Issue - State: closed - Opened by amrragab8080 about 3 years ago - 2 comments

#89 - how to make all reduce use ring

Issue - State: closed - Opened by Mellonta over 3 years ago - 5 comments

#88 - Cleanup argument error handling and messages

Pull Request - State: closed - Opened by nzmsv over 3 years ago

#87 - Compilation failure with GCC 10.3.0

Issue - State: open - Opened by cponder over 3 years ago - 5 comments

#86 - Topology XML file

Issue - State: closed - Opened by guntrogu over 3 years ago - 2 comments

#83 - NCCL unable to use full TCP bandwidth in Azure

Issue - State: closed - Opened by rhl-bthr over 3 years ago - 6 comments

#82 - NCCL Broadcast bus bandwidth higher than network bandwidth

Issue - State: closed - Opened by rhl-bthr over 3 years ago - 6 comments

#81 - NCCL_HOME set, and still nccl.h: No such file or directory

Issue - State: closed - Opened by WurmD over 3 years ago - 5 comments

#80 - Add support for new datatype: bfloat16

Pull Request - State: closed - Opened by AddyLaddy over 3 years ago

#78 - NCCL alltoall tests failing at 256 GPUs

Issue - State: open - Opened by awan-10 over 3 years ago - 10 comments

#66 - Test CUDA failure common.cu:730 'unknown error'

Issue - State: closed - Opened by kumareshr over 3 years ago - 2 comments

#65 - when run test in default, How can I determine what nccl-algorithm is used

Issue - State: closed - Opened by huyutuo over 3 years ago - 2 comments

#64 - Add boot_id to the hostname hash due to collisions on Azure

Pull Request - State: closed - Opened by AddyLaddy over 3 years ago - 1 comment

#58 - Test NCCL failure common.cu:752 'internal error'

Issue - State: closed - Opened by ghost almost 4 years ago - 1 comment

#57 - GPU affinity sets different CPU masks when using the same NCCL_TOPO_FILE

Issue - State: closed - Opened by rexcsn almost 4 years ago - 3 comments

#55 - common.cu:375

Issue - State: closed - Opened by Hamidreza-Ramezani almost 4 years ago - 2 comments

#54 - Run with MPI on 40 processes test failed.

Issue - State: closed - Opened by TimJZ almost 4 years ago - 5 comments

#51 - show me the nccl.h is not found. who can help me ?? ^_^

Issue - State: closed - Opened by harrycrq about 4 years ago - 2 comments

#50 - undefine reference to ncclTestEngine at compile time

Issue - State: closed - Opened by AmericanEnglish about 4 years ago - 3 comments

#47 - ncclSend and ncclRecv undefined

Issue - State: closed - Opened by joehandzik over 4 years ago - 5 comments

#44 - System hangs running nccl-tests, with 2 2080ti and NVlink bridge.

Issue - State: closed - Opened by AlexWang1900 over 4 years ago - 2 comments

#42 - Feature request: write results to file

Issue - State: closed - Opened by christopherhesse over 4 years ago - 3 comments

#41 - Internal error

Issue - State: closed - Opened by hpadhuka over 4 years ago - 6 comments

#38 - Is PGI a suported compiler?

Issue - State: closed - Opened by dkokron over 4 years ago - 4 comments

#35 - make error `error: missing binary operator before token "("`

Issue - State: closed - Opened by zsef123 over 4 years ago - 4 comments

#34 - /usr/bin/ld:cannot find -lmpi

Issue - State: closed - Opened by Dongguage over 4 years ago - 2 comments

#32 - mpi.h: No such file or directory

Issue - State: closed - Opened by scottzockoll over 4 years ago - 6 comments

#30 - Test CUDA failure common.cu:730 'no CUDA-capable device is detected'

Issue - State: closed - Opened by shuxiaobo almost 5 years ago - 3 comments

#27 - Running NCCL test on multiple nodes

Issue - State: closed - Opened by leeQT almost 5 years ago - 6 comments

#26 - nccl-test with mpi hangs

Issue - State: closed - Opened by eric-haibin-lin about 5 years ago - 6 comments

#25 - Add bit redop test

Pull Request - State: open - Opened by wangxicoding about 5 years ago - 3 comments

#18 - Stuck when running MPI test

Issue - State: closed - Opened by kyoungrok0517 over 5 years ago - 13 comments

#15 - NCCL failure common.cu:916

Issue - State: closed - Opened by gmyofustc about 6 years ago - 14 comments

#13 - NCCL failure all_reduce.cu:95 'unhandled cuda error'

Issue - State: closed - Opened by leonf88 about 6 years ago - 2 comments

#12 - Cuda failure common.cu:891

Issue - State: closed - Opened by Ujjalbuet over 6 years ago - 2 comments

#9 - mpi run on multi nodes does not work

Issue - State: closed - Opened by gmyofustc over 6 years ago - 3 comments

#8 - nv_peer_mem NCCL2 nccl-tests fails with: Out of bounds values : 24 FAILED

Issue - State: closed - Opened by shijieheping over 6 years ago - 5 comments

#7 - Out of bounds values : 248 FAILED

Issue - State: closed - Opened by wm10240 over 6 years ago - 4 comments

#6 - NCCL failure common.cu:908 'unhandled cuda error'

Issue - State: closed - Opened by galphag over 6 years ago - 2 comments

#4 - Is NCCL suitable for calculating "sum = a1 + a2 + .. +an;"?

Issue - State: closed - Opened by NanXiao almost 7 years ago - 2 comments

#2 - Multinode NCCL 2.0 MPI Test code failure

Issue - State: closed - Opened by mpatwary almost 7 years ago - 8 comments