Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
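A listing like the one below can be fetched and summarized programmatically. The sketch here builds a request URL in an ecosyste.ms-style layout and tallies records by state and kind; the endpoint path, and the `state`/`pull_request` field names in the sample payload, are assumptions for illustration, not the confirmed API schema.

```python
import json

# Hypothetical ecosyste.ms-style endpoint layout; verify the real path
# against the live API documentation before use.
BASE = "https://issues.ecosyste.ms/api/v1"

def issues_url(host: str, repo: str) -> str:
    """Build a listing URL for one repository's issues and pull requests."""
    return f"{BASE}/hosts/{host}/repositories/{repo}/issues"

def summarize(records):
    """Tally records by state and by kind (issue vs. pull request)."""
    tally = {"open": 0, "closed": 0, "pull_request": 0, "issue": 0}
    for r in records:
        tally[r["state"]] += 1
        tally["pull_request" if r["pull_request"] else "issue"] += 1
    return tally

# A sample payload shaped like two entries from this listing (field names assumed).
sample = json.loads("""
[
  {"number": 96, "state": "open", "pull_request": false,
   "title": "Error:Exception: MoE JIT is designed to work on sample size = 800, while receiving sample size = 1600 (> 800)"},
  {"number": 97, "state": "closed", "pull_request": true,
   "title": "simplify all different usages into top-k usage"}
]
""")

print(issues_url("GitHub", "microsoft/tutel"))
print(summarize(sample))
```

The summary counts mirror the "State:" and "Pull Request"/"Issue" fields shown in each record below.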
GitHub / microsoft/tutel issues and pull requests
#102 - Add performance figures
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#101 - Add performance figures
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#100 - Merge A2A FFN overlapping and 2DH A2A
Pull Request - State: closed - Opened by yzygitzh over 2 years ago
#99 - handle occupancy compat for rocm4.2
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#98 - why Deepspeed MoE Top-2 Gate dosen't integrate Tutel acceleration
Issue - State: closed - Opened by Satan012 over 2 years ago - 1 comment - Labels: invalid
#97 - simplify all different usages into top-k usage
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#96 - Error:Exception: MoE JIT is designed to work on sample size = 800, while receiving sample size = 1600 (> 800)
Issue - State: open - Opened by Satan012 over 2 years ago - 2 comments - Labels: question
#95 - support TopKGate properties: is_postnorm
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#94 - add ffn_allreduce_range_size for data parallel
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#93 - Determine kernel max occupancy in JIT
Pull Request - State: closed - Opened by abuccts over 2 years ago
#92 - split jit_activate out of jit_execute
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#91 - Fix Issue #90: cast constant size to int
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#90 - fast_cumsum_sub_one fails when the module is wrapped by ORTModule
Issue - State: closed - Opened by foreveronehundred over 2 years ago - 7 comments
#89 - Can DistributedDataParallel be added into helloworld_deepspeed.py ?
Issue - State: closed - Opened by Satan012 over 2 years ago - 2 comments - Labels: invalid
#88 - add save_load_checkpoint option in helloworld
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#87 - how to save checkpoint when use data parallel and moe expert
Issue - State: open - Opened by Satan012 over 2 years ago - 7 comments - Labels: question
#86 - add api for group creation
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#85 - fix typos
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#84 - Fix Bug - Fix various bugs in all-to-all FFN overlapping
Pull Request - State: closed - Opened by yzygitzh over 2 years ago
#83 - Enable JIT compilation to support torch.distributed.pipeline environment
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#82 - Problem from applying pipeline parallel with Tutel's cumsum
Issue - State: closed - Opened by foreveronehundred over 2 years ago - 1 comment
#81 - Add 2D Hierarchical AlltoAll Algorithm
Pull Request - State: closed - Opened by abuccts over 2 years ago
#80 - Add cpu support
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#79 - add a fp64 test case
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#78 - reset seeding in distributed synthetic data
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#77 - change args of custom kernels' compiler
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#76 - INTERNAL ASSERT FAILED at custom_kernel.cpp
Issue - State: closed - Opened by foreveronehundred over 2 years ago - 1 comment
#75 - fix type of capacity
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#74 - add helloworld_amp
Pull Request - State: closed - Opened by EricWangCN over 2 years ago
#73 - not using CUDA_VISIBLE_DEVICES
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#72 - add fp64 option in examples; enhance launcher compat;
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#71 - support handling multi-gate options
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#70 - Question about multi-gate refer to multi-task learning
Issue - State: open - Opened by Tokkiu over 2 years ago - 5 comments - Labels: question
#69 - add fast launch usage for openmpi
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#68 - Add Feature - Support overlapping all-to-all with FFN computation in MoE layer
Pull Request - State: closed - Opened by yzygitzh over 2 years ago
#67 - enhance logging & TUTEL_CUDA_SANDBOX option
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#66 - simplify example codes
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#65 - using init_data_model_parallel() to initialize proc
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#64 - fix nvrtc compatibility in some environments
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#63 - fix nvrtc compatibilty in some envs
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#62 - support using mpiexec for distributed launch
Pull Request - State: closed - Opened by ghostplant over 2 years ago
#61 - Error met when using multi nodes
Issue - State: closed - Opened by Lechatelia over 2 years ago - 5 comments
#60 - Upgrade docker image for UT.
Pull Request - State: closed - Opened by guoshzhao over 2 years ago
#59 - add a new test case
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#58 - Add --gpus=all option for test pipeline.
Pull Request - State: closed - Opened by guoshzhao almost 3 years ago
#57 - add unit test
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#56 - change the initialization of input of helloworld
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#55 - Setup unit-test pipeline.
Pull Request - State: closed - Opened by guoshzhao almost 3 years ago
#54 - add initialization interface for data-model parallel
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#53 - support 1 expert sharded on multi-gpu if |local_experts| < 0
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#52 - question about how to set data Parallelism
Issue - State: closed - Opened by Lechatelia almost 3 years ago - 13 comments
#51 - Using execl instead of system
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#50 - fix typos
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#49 - use childprocess to call command
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#48 - Add setup without nccl
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#47 - set seeds for megatron & ones gate result comparison
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#46 - avoid file permission issue while nvcc compiling
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#45 - Add usage for Tutel-boosted Deepspeed MoE (top-1)
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#42 - add a fastmoe example and throughput results
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#41 - Add throughput comparision
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago
#40 - fix parallel methods
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#39 - add deepspeed & megatron examples for comparison
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#38 - add new gate_type: megatron
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#37 - set NO_NVRTC = 1 by default
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#36 - support handling indice values with mask
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#35 - change print to logging
Pull Request - State: closed - Opened by EricWangCN almost 3 years ago - 1 comment
#34 - moving examples into distribution
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#33 - add example of moe for attention
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#32 - add attention type for built-in experts
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#31 - make tutel compatible with bfloat16 dtype
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#30 - make tutel compatible with bfloat16 dtype
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#29 - using compatible interface to change linear dtype
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#28 - moving flag `fp32_gate` into gate_type argument
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#27 - support option for batch_prioritized_routing
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#26 - Fix a bug for fast_cumsum_sub_one of Tutel
Pull Request - State: closed - Opened by foreveronehundred almost 3 years ago - 2 comments
#25 - update example in README.md
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#24 - allow dict-type specific gate_type description
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#23 - add seeding for post-moe params
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#22 - cast fp16 to ROCm-supported dtype in amdgpu
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#21 - add implicit dropout layer if p > 0
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#20 - synchronize distributed launch method from new pytorch
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#19 - add affinity interface in examples
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#18 - open interface for Top3Gate & Top4Gate
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#17 - merge top1gate & top2gate
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#16 - add auto_numa when initializing alltoall
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#15 - add all_to_all extension for pytorch without builtin dist.all_to_all
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#14 - switch to nvcc when nvrtc version in torch is low
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#13 - add a2a timing option
Pull Request - State: closed - Opened by ghostplant almost 3 years ago
#12 - Change JIT key for fast_cumsum_sub_one from just samples to (samples,…
Pull Request - State: closed - Opened by jspark1105 almost 3 years ago - 1 comment
#11 - Remove --restrict from NVRTC option
Pull Request - State: closed - Opened by jspark1105 almost 3 years ago - 1 comment
#10 - Change JIT key for fast_cumsum_sub_one from just samples to (samples,…
Pull Request - State: closed - Opened by jspark1105 almost 3 years ago - 1 comment
#9 - allow updating capacity in fast_dispatch
Pull Request - State: closed - Opened by ghostplant about 3 years ago
#8 - avoid custom experts output having different dtype
Pull Request - State: closed - Opened by ghostplant about 3 years ago
#7 - add option to choose JIT compiling type: nvrtc/nvcc
Pull Request - State: closed - Opened by ghostplant about 3 years ago
#6 - fix for fairseq integration
Pull Request - State: closed - Opened by ngoyal2707 about 3 years ago - 1 comment
#5 - add --l_aux_wt flag in helloworld.py
Pull Request - State: closed - Opened by ghostplant about 3 years ago
#4 - update explainations of using built-in experts
Pull Request - State: closed - Opened by ghostplant about 3 years ago
#3 - Example: Add Example - Add tutel-moe example for pytorch DistributedDataParallel.
Pull Request - State: closed - Opened by guoshzhao about 3 years ago
#2 - Setup: Revision - Remove ninja requirement
Pull Request - State: closed - Opened by guoshzhao about 3 years ago - Labels: setup
#1 - Setup: Revision - Revise setup for package release
Pull Request - State: closed - Opened by guoshzhao about 3 years ago - Labels: setup