Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / microsoft/mscclpp issues and pull requests

#457 - Update nccl-api-test.yaml

Pull Request - State: open - Opened by Guilherme-silva-teixeira 8 days ago

#456 - update for ncclCommSplit

Pull Request - State: closed - Opened by Binyang2014 11 days ago

#455 - Add multi-nodes example & update doc

Pull Request - State: open - Opened by Binyang2014 11 days ago

#453 - Fix PR #449

Pull Request - State: closed - Opened by chhwang 18 days ago

#452 - Manage runtime environments

Pull Request - State: closed - Opened by chhwang 22 days ago - 6 comments

#451 - Resolve cuMemMap error

Pull Request - State: closed - Opened by Binyang2014 23 days ago - 2 comments

#450 - Auto-update version numbers in CMakeLists.txt

Pull Request - State: closed - Opened by chhwang 23 days ago

#449 - Lazily create streams for CudaIpcConnection

Pull Request - State: closed - Opened by chhwang 23 days ago - 6 comments

#448 - Update includes in header files

Pull Request - State: open - Opened by chhwang 24 days ago

#447 - update broadcast algo

Pull Request - State: closed - Opened by Binyang2014 24 days ago

#446 - Add support for CPX mode on MI300X

Pull Request - State: closed - Opened by nusislam 25 days ago - 1 comment

#445 - A two-stage copy design with scratch buffer

Pull Request - State: open - Opened by SreevatsaAnantharamu 25 days ago

#444 - Fix Python binding of exceptions

Pull Request - State: closed - Opened by chhwang 25 days ago - 4 comments

#443 - Fix CMake build messages

Pull Request - State: closed - Opened by chhwang 26 days ago

#442 - Merge mscclpp-lang to mscclpp project

Pull Request - State: closed - Opened by Binyang2014 26 days ago - 3 comments

#441 - Adding Read Put Packet operation at Executor

Pull Request - State: open - Opened by caiomcbr 27 days ago - 1 comment

#440 - [Bug] Allgather with proxy channel hangs at H2D cudaMemcpyAsync

Issue - State: closed - Opened by cubele 28 days ago - 5 comments

#440 - [Bug] Allgather with proxy channel hangs at H2D cudaMemcpyAsync

Issue - State: open - Opened by cubele 28 days ago - 2 comments

#439 - [Bug] Memory leak in sm and proxy channels on AMD in python

Issue - State: closed - Opened by liangyuRain 29 days ago - 4 comments

#438 - [Feature] Any plan for IBGDA?

Issue - State: open - Opened by FC-Li 29 days ago - 1 comment

#438 - [Feature] Any plan for IBGDA?

Issue - State: closed - Opened by FC-Li 29 days ago - 1 comment

#437 - Fix azure pipeline

Pull Request - State: closed - Opened by Binyang2014 29 days ago - 4 comments

#436 - Renaming channels

Pull Request - State: closed - Opened by chhwang 30 days ago - 14 comments

#435 - [Cherry-pick] Update version number (#433)

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago

#434 - Enhance the nccl error message handling

Pull Request - State: closed - Opened by seagater about 1 month ago

#433 - Update version number

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago - 1 comment

#432 - [Bug] Hang when using ProxyChan and one GPU is sending zeros bytes.

Issue - State: open - Opened by FC-Li about 1 month ago - 3 comments

#432 - [Bug] Hang when using ProxyChan and one GPU is sending zeros bytes.

Issue - State: closed - Opened by FC-Li about 1 month ago - 3 comments

#430 - Add comm related nccl APIs

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago

#429 - [Cherry-pick] Fix nccl-test failure issue (#421)

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago

#428 - Fix CI trigger issue

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago

#427 - [Cherry-pick] trigger ci for release branches (#426)

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago - 2 comments

#426 - trigger ci for release branches

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago

#425 - [Cherry-pick] NVLS support for NCCL API (#410)

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago - 2 comments

#424 - [Cherry-pick] Disable CuMemMap check for ROCm (#411)

Pull Request - State: closed - Opened by Binyang2014 about 1 month ago

#423 - Add `GpuBuffer` class

Pull Request - State: closed - Opened by chhwang about 1 month ago - 32 comments

#422 - Tackle build warnings

Pull Request - State: closed - Opened by chhwang about 1 month ago

#421 - Fix nccl-test failure issue

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago - 2 comments

#420 - Fix typos in the pipeline

Pull Request - State: closed - Opened by chhwang about 2 months ago

#419 - Add ncclBcast / ncclBroadcast support

Pull Request - State: closed - Opened by SreevatsaAnantharamu about 2 months ago - 4 comments

#418 - [Bug] Proxy channel over CudaIPC on AMD GPUs

Issue - State: closed - Opened by liangyuRain about 2 months ago - 3 comments

#417 - Scratch buffer copy-based implementation of ncclBcast / ncclBroadcast

Pull Request - State: closed - Opened by SreevatsaAnantharamu about 2 months ago - 1 comment

#416 - [Cherry-pick] Move pipeline to official org (#406)

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago

#415 - Flushing Proxy Channels at CPU side upon reaching the Inflight Request Limit

Pull Request - State: closed - Opened by caiomcbr about 2 months ago - 2 comments

#414 - Update README

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago - 2 comments

#413 - [Cherry-pick] Move pipeline to official org (#406)

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago - 2 comments

#412 - Supporting multi-node executors in NCCL API

Pull Request - State: closed - Opened by caiomcbr about 2 months ago - 4 comments

#411 - Disable CuMemMap check for ROCm

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago - 4 comments

#410 - NVLS support for NCCL API

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago - 4 comments

#409 - Expose nccl bcast api, uses broadcast

Pull Request - State: closed - Opened by pash-msft about 2 months ago

#408 - [Feature] Using the AMD Infinity Fabric in SMChannels

Issue - State: open - Opened by ThomasNing about 2 months ago - 3 comments

#408 - [Feature] Using the AMD Infinity Fabric in SMChannels

Issue - State: closed - Opened by ThomasNing about 2 months ago - 4 comments

#407 - Fix synchronization in allreduce8 kernel

Pull Request - State: open - Opened by dsidler about 2 months ago - 1 comment

#407 - Fix synchronization in allreduce8 kernel

Pull Request - State: closed - Opened by dsidler about 2 months ago - 6 comments

#406 - Move pipeline to official org

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago

#406 - Move pipeline to official org

Pull Request - State: open - Opened by Binyang2014 about 2 months ago

#405 - Exception Max Number Operation per Tb

Pull Request - State: closed - Opened by caiomcbr about 2 months ago

#404 - Support nccl-test with mscclpp-nccl on H100 GPUs

Pull Request - State: closed - Opened by seagater about 2 months ago - 1 comment

#403 - Reduce memory usage for scratch buffer

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago

#402 - Initial broadcast support

Pull Request - State: closed - Opened by SreevatsaAnantharamu about 2 months ago

#402 - Initial broadcast support

Pull Request - State: open - Opened by SreevatsaAnantharamu about 2 months ago - 1 comment

#401 - Setup pipeline for mscclpp over nccl

Pull Request - State: closed - Opened by Binyang2014 about 2 months ago

#400 - Revised ProxyChannel interfaces

Pull Request - State: closed - Opened by chhwang about 2 months ago

#399 - [NPKIT] Adding the NPKIT support for kernel allreduce7 in mscclpp-nccl

Pull Request - State: closed - Opened by PedramAlizadeh 2 months ago - 1 comment

#399 - [NPKIT] Adding the NPKIT support for kernel allreduce7 in mscclpp

Pull Request - State: open - Opened by PedramAlizadeh 2 months ago - 1 comment

#398 - [Feature] Cross-node communication without InfiniBand

Issue - State: closed - Opened by chenhongyu2048 2 months ago - 4 comments

#396 - Select algo according to json config

Pull Request - State: closed - Opened by Binyang2014 2 months ago

#395 - Question about DeviceSyncer and Ensuring Synchronization

Issue - State: closed - Opened by hidva 2 months ago - 1 comment

#394 - [Bug] Proxy chan hang at cudaMemcpyAsync

Issue - State: closed - Opened by FC-Li 2 months ago - 1 comment

#393 - AllGather Executor Support in NCCL Interface

Pull Request - State: closed - Opened by caiomcbr 2 months ago

#392 - Fix mscclpp_benchmark

Pull Request - State: closed - Opened by Binyang2014 2 months ago

#391 - Fixing Message Boundary AllReduce Fallback Code

Pull Request - State: closed - Opened by caiomcbr 2 months ago

#390 - Providing reduce-scatter test support

Pull Request - State: closed - Opened by caiomcbr 2 months ago

#389 - Fix typo

Pull Request - State: closed - Opened by Binyang2014 2 months ago

#388 - [Bug] Hang when one GPU get a new BasePtr while others not.

Issue - State: closed - Opened by FC-Li 2 months ago - 1 comment

#387 - [Bug] run nccl_api_test failed

Issue - State: open - Opened by yizhang2077 3 months ago - 1 comment

#386 - Add connection events for NPKit

Pull Request - State: closed - Opened by yzygitzh 3 months ago

#385 - Fix missing packet parameter for executor

Pull Request - State: closed - Opened by yzygitzh 3 months ago

#384 - Small Adjust in Test Data AllGather at Executor Test

Pull Request - State: closed - Opened by caiomcbr 3 months ago

#383 - Add cross threadblock barrier

Pull Request - State: closed - Opened by Binyang2014 3 months ago

#382 - [WIP] Use the default stream for CudaIpcConnection

Pull Request - State: closed - Opened by chhwang 3 months ago

#381 - Lazily create the context stream

Pull Request - State: closed - Opened by chhwang 3 months ago

#380 - Fixing Bug Const Offset in Execution Plan

Pull Request - State: closed - Opened by caiomcbr 3 months ago

#379 - Fix light load bug

Pull Request - State: closed - Opened by Binyang2014 3 months ago

#378 - Add kernel-based verification for executor_test

Pull Request - State: closed - Opened by yzygitzh 3 months ago

#377 - [Bug] flush() hang bug.

Issue - State: closed - Opened by TonyWu199 3 months ago