Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / microsoft/tutel issues and pull requests

#102 - Add performance figures

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#101 - Add performance figures

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#100 - Merge A2A FFN overlapping and 2DH A2A

Pull Request - State: closed - Opened by yzygitzh over 2 years ago

#99 - handle occupancy compat for rocm4.2

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#98 - why Deepspeed MoE Top-2 Gate doesn't integrate Tutel acceleration

Issue - State: closed - Opened by Satan012 over 2 years ago - 1 comment
Labels: invalid

#97 - simplify all different usages into top-k usage

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#95 - support TopKGate properties: is_postnorm

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#94 - add ffn_allreduce_range_size for data parallel

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#93 - Determine kernel max occupancy in JIT

Pull Request - State: closed - Opened by abuccts over 2 years ago

#92 - split jit_activate out of jit_execute

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#91 - Fix Issue #90: cast constant size to int

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#90 - fast_cumsum_sub_one fails when the module is wrapped by ORTModule

Issue - State: closed - Opened by foreveronehundred over 2 years ago - 7 comments

#89 - Can DistributedDataParallel be added into helloworld_deepspeed.py ?

Issue - State: closed - Opened by Satan012 over 2 years ago - 2 comments
Labels: invalid

#88 - add save_load_checkpoint option in helloworld

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#87 - how to save checkpoint when use data parallel and moe expert

Issue - State: open - Opened by Satan012 over 2 years ago - 7 comments
Labels: question

#86 - add api for group creation

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#85 - fix typos

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#84 - Fix Bug - Fix various bugs in all-to-all FFN overlapping

Pull Request - State: closed - Opened by yzygitzh over 2 years ago

#82 - Problem from applying pipeline parallel with Tutel's cumsum

Issue - State: closed - Opened by foreveronehundred over 2 years ago - 1 comment

#81 - Add 2D Hierarchical AlltoAll Algorithm

Pull Request - State: closed - Opened by abuccts over 2 years ago

#80 - Add cpu support

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#79 - add a fp64 test case

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#78 - reset seeding in distributed synthetic data

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#77 - change args of custom kernels' compiler

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#76 - INTERNAL ASSERT FAILED at custom_kernel.cpp

Issue - State: closed - Opened by foreveronehundred over 2 years ago - 1 comment

#75 - fix type of capacity

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#74 - add helloworld_amp

Pull Request - State: closed - Opened by EricWangCN over 2 years ago

#73 - not using CUDA_VISIBLE_DEVICES

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#72 - add fp64 option in examples; enhance launcher compat;

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#71 - support handling multi-gate options

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#70 - Question about multi-gate refer to multi-task learning

Issue - State: open - Opened by Tokkiu over 2 years ago - 5 comments
Labels: question

#69 - add fast launch usage for openmpi

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#67 - enhance logging & TUTEL_CUDA_SANDBOX option

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#66 - simplify example codes

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#65 - using init_data_model_parallel() to initialize proc

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#64 - fix nvrtc compatibility in some environments

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#63 - fix nvrtc compatibility in some envs

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#62 - support using mpiexec for distributed launch

Pull Request - State: closed - Opened by ghostplant over 2 years ago

#61 - Error met when using multi nodes

Issue - State: closed - Opened by Lechatelia over 2 years ago - 5 comments

#60 - Upgrade docker image for UT.

Pull Request - State: closed - Opened by guoshzhao over 2 years ago

#59 - add a new test case

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#58 - Add --gpus=all option for test pipeline.

Pull Request - State: closed - Opened by guoshzhao almost 3 years ago

#57 - add unit test

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#56 - change the initialization of input of helloworld

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#55 - Setup unit-test pipeline.

Pull Request - State: closed - Opened by guoshzhao almost 3 years ago

#54 - add initialization interface for data-model parallel

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#53 - support 1 expert sharded on multi-gpu if |local_experts| < 0

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#52 - question about how to set data Parallelism

Issue - State: closed - Opened by Lechatelia almost 3 years ago - 13 comments

#51 - Using execl instead of system

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#50 - fix typos

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#49 - use childprocess to call command

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#48 - Add setup without nccl

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#47 - set seeds for megatron & ones gate result comparison

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#46 - avoid file permission issue while nvcc compiling

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#45 - Add usage for Tutel-boosted Deepspeed MoE (top-1)

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#42 - add a fastmoe example and throughput results

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#41 - Add throughput comparison

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago

#40 - fix parallel methods

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#39 - add deepspeed & megatron examples for comparison

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#38 - add new gate_type: megatron

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#37 - set NO_NVRTC = 1 by default

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#36 - support handling indice values with mask

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#35 - change print to logging

Pull Request - State: closed - Opened by EricWangCN almost 3 years ago - 1 comment

#34 - moving examples into distribution

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#33 - add example of moe for attention

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#32 - add attention type for built-in experts

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#31 - make tutel compatible with bfloat16 dtype

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#30 - make tutel compatible with bfloat16 dtype

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#29 - using compatible interface to change linear dtype

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#28 - moving flag `fp32_gate` into gate_type argument

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#27 - support option for batch_prioritized_routing

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#26 - Fix a bug for fast_cumsum_sub_one of Tutel

Pull Request - State: closed - Opened by foreveronehundred almost 3 years ago - 2 comments

#25 - update example in README.md

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#24 - allow dict-type specific gate_type description

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#23 - add seeding for post-moe params

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#22 - cast fp16 to ROCm-supported dtype in amdgpu

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#21 - add implicit dropout layer if p > 0

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#20 - synchronize distributed launch method from new pytorch

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#19 - add affinity interface in examples

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#18 - open interface for Top3Gate & Top4Gate

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#17 - merge top1gate & top2gate

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#16 - add auto_numa when initializing alltoall

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#15 - add all_to_all extension for pytorch without builtin dist.all_to_all

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#14 - switch to nvcc when nvrtc version in torch is low

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#13 - add a2a timing option

Pull Request - State: closed - Opened by ghostplant almost 3 years ago

#12 - Change JIT key for fast_cumsum_sub_one from just samples to (samples,…

Pull Request - State: closed - Opened by jspark1105 almost 3 years ago - 1 comment

#11 - Remove --restrict from NVRTC option

Pull Request - State: closed - Opened by jspark1105 almost 3 years ago - 1 comment

#10 - Change JIT key for fast_cumsum_sub_one from just samples to (samples,…

Pull Request - State: closed - Opened by jspark1105 almost 3 years ago - 1 comment

#9 - allow updating capacity in fast_dispatch

Pull Request - State: closed - Opened by ghostplant about 3 years ago

#8 - avoid custom experts output having different dtype

Pull Request - State: closed - Opened by ghostplant about 3 years ago

#7 - add option to choose JIT compiling type: nvrtc/nvcc

Pull Request - State: closed - Opened by ghostplant about 3 years ago

#6 - fix for fairseq integration

Pull Request - State: closed - Opened by ngoyal2707 about 3 years ago - 1 comment

#5 - add --l_aux_wt flag in helloworld.py

Pull Request - State: closed - Opened by ghostplant about 3 years ago

#4 - update explanations of using built-in experts

Pull Request - State: closed - Opened by ghostplant about 3 years ago

#2 - Setup: Revision - Remove ninja requirement

Pull Request - State: closed - Opened by guoshzhao about 3 years ago
Labels: setup

#1 - Setup: Revision - Revise setup for package release

Pull Request - State: closed - Opened by guoshzhao about 3 years ago
Labels: setup