Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / stanford-futuredata/megablocks: issues and pull requests
#97 - support amd/rocm
Issue - State: open - Opened by ehartford 8 months ago - 2 comments
Labels: enhancement, help wanted
#96 - Remove turbo
Pull Request - State: closed - Opened by dblalock 9 months ago
#95 - AMP + BF16 failing
Issue - State: open - Opened by jramapuram 10 months ago - 2 comments
#94 - Unsharding scripts for megablocks models
Issue - State: open - Opened by mayank31398 10 months ago
#93 - the wrong loss func was chosen at evaluation
Issue - State: open - Opened by peterjc123 10 months ago - 2 comments
#92 - Seeking a good multi-node training config
Issue - State: open - Opened by rpand002 11 months ago - 3 comments
#91 - selective router precision
Issue - State: open - Opened by 152334H 11 months ago - 1 comment
Labels: question
#90 - Does this framework support SFT?
Issue - State: open - Opened by banksy23 11 months ago - 1 comment
Labels: question
#89 - Updt triton pin
Pull Request - State: closed - Opened by vchiley 11 months ago - 1 comment
#88 - RuntimeError: Triton Error [CUDA]: invalid argument
Issue - State: open - Opened by noob-ctrl 11 months ago - 12 comments
Labels: question
#87 - Fix `moe_normalize_expert_weights` when `top_k=1`
Pull Request - State: closed - Opened by 152334H 11 months ago - 3 comments
#86 - Gradient scale size for expert gradient
Issue - State: closed - Opened by fanshiqing 11 months ago - 4 comments
#85 - different load_balancing_loss with different pipeline_parallel_size
Issue - State: open - Opened by bozheng-hit 11 months ago - 8 comments
Labels: question
#84 - How to integrate to transformers-based mixtral
Issue - State: open - Opened by nxphi47 11 months ago - 1 comment
Labels: question
#83 - ParallelDroplessMLP initialises self.mlp twice
Issue - State: open - Opened by 152334H 11 months ago - 6 comments
Labels: enhancement, help wanted
#82 - save loading_balancing_loss properly
Issue - State: closed - Opened by gouchangjiang 11 months ago - 2 comments
Labels: question
#81 - Why the second matrix of the mlp layer has the same shape of the first one?
Issue - State: open - Opened by gouchangjiang 11 months ago - 1 comment
Labels: question
#80 - [BUG] Optimizer Weights Not Reloaded When Training with bf16 Pretrained Weights
Issue - State: open - Opened by RookieHong 11 months ago - 1 comment
Labels: bug
#79 - fix the abnormal ‘CAPACITY_FACTOR’ value
Pull Request - State: open - Opened by jordgedu 11 months ago - 3 comments
#78 - Error from pip about missing torch module
Issue - State: closed - Opened by michaelwhitford 11 months ago - 4 comments
Labels: help wanted
#77 - Efficiency of torch mlp
Issue - State: closed - Opened by imoneoi 11 months ago - 2 comments
#76 - Fix default to be sparse
Pull Request - State: closed - Opened by mvpatel2000 11 months ago
#75 - Add dmlp registry args
Pull Request - State: closed - Opened by j316chuck 11 months ago
#74 - Refactor dtesnor
Pull Request - State: closed - Opened by mvpatel2000 11 months ago
#73 - Dtensor to all paths
Pull Request - State: closed - Opened by mvpatel2000 11 months ago
#72 - Mem opt glu bkwd
Pull Request - State: closed - Opened by mvpatel2000 11 months ago
#71 - Add cast to tensor for DTensor inputs for groupedmlp
Pull Request - State: closed - Opened by eracah 11 months ago
#70 - Change router weight norm from in-place
Pull Request - State: closed - Opened by sashaDoubov 11 months ago
#69 - Skip updating load balancing loss on eval
Pull Request - State: closed - Opened by sedrick-keh-tri 11 months ago - 2 comments
#68 - Script for Full Fine-Tuning of Mixtral
Issue - State: open - Opened by alpayariyak 11 months ago - 1 comment
Labels: question
#67 - Docker issues with PyPI installation
Issue - State: open - Opened by sedrick-keh-tri 11 months ago - 3 comments
#66 - add mem optimized grouped glu
Pull Request - State: closed - Opened by vchiley 11 months ago
#65 - enable custom activation functions
Pull Request - State: closed - Opened by vchiley 11 months ago - 4 comments
#64 - How do you use routing balancing loss under pipeline parallelism
Issue - State: closed - Opened by szhengac 12 months ago - 5 comments
#63 - Update README.md
Pull Request - State: closed - Opened by eltociear 12 months ago - 1 comment
#62 - Has anyone encountered this CUDA error?
Issue - State: closed - Opened by bozheng-hit 12 months ago - 15 comments
#61 - Question on offsets in figures 5
Issue - State: closed - Opened by DaehanKim 12 months ago - 1 comment
#60 - More customizable norm for expert weights
Pull Request - State: closed - Opened by snarayan21 12 months ago
#59 - About the Multi-node Script
Issue - State: closed - Opened by XingyuXie 12 months ago - 4 comments
#58 - enable arg enabled normalization of routing weights
Pull Request - State: closed - Opened by vchiley 12 months ago
#57 - [integrating megablocks with open_lm] Question about megablocks + FSDP
Issue - State: closed - Opened by kernelmachine 12 months ago - 8 comments
#56 - Update setup.py to support multiple device capabilities
Pull Request - State: closed - Opened by simon-mo 12 months ago - 6 comments
#55 - Update Megatron-LM scripts and integration for latest Docker container.
Pull Request - State: closed - Opened by tgale96 12 months ago
#54 - Remove errant "*" in README
Pull Request - State: closed - Opened by tgale96 12 months ago
#53 - Fix * in README
Pull Request - State: closed - Opened by tgale96 12 months ago
#52 - Update dependencies and package organization.
Pull Request - State: closed - Opened by tgale96 12 months ago
#51 - Installation fails due to missing mosaicml-turbo
Issue - State: closed - Opened by AlpinDale 12 months ago - 2 comments
#50 - Latest GitHub release version higher than main branch setup.py
Issue - State: closed - Opened by nateraw 12 months ago - 4 comments
#49 - Comparison against top-2 routing?
Issue - State: open - Opened by sunnyszy 12 months ago - 4 comments
Labels: question
#48 - Inference code
Issue - State: closed - Opened by AlpinDale 12 months ago - 5 comments
#47 - Fix bug in topology kernel for ffn_hidden_size>4096.
Pull Request - State: closed - Opened by tgale96 12 months ago - 2 comments
#46 - Wrong outputs for hidden dim 14336
Issue - State: closed - Opened by pierrestock 12 months ago - 3 comments
#45 - Support new model
Pull Request - State: closed - Opened by pierrestock 12 months ago - 4 comments
#44 - Add expert dropout
Pull Request - State: closed - Opened by samhavens 12 months ago
#43 - Removing an extra size call
Pull Request - State: closed - Opened by bcui19 12 months ago
#42 - Torch Moe
Pull Request - State: closed - Opened by j316chuck 12 months ago - 2 comments
#41 - Enable generic dimentionality for input
Pull Request - State: closed - Opened by vchiley 12 months ago
#40 - Why not support tensor model parallel?
Issue - State: closed - Opened by Richie-yan about 1 year ago - 7 comments
#39 - Have megablocks rely on torch default precision
Pull Request - State: closed - Opened by mvpatel2000 about 1 year ago
#38 - Add GLU support
Pull Request - State: closed - Opened by sashaDoubov about 1 year ago - 4 comments
#37 - Avoid duplicate `.cpu()` call
Pull Request - State: closed - Opened by mvpatel2000 about 1 year ago - 3 comments
#36 - Update version
Pull Request - State: closed - Opened by mvpatel2000 about 1 year ago
#35 - How to add support for swiglu in Megablocks?
Issue - State: closed - Opened by fanshiqing about 1 year ago - 14 comments
#34 - Refactoring class hierarchy for FSDP wrapping
Pull Request - State: closed - Opened by tgale96 about 1 year ago - 2 comments
#32 - How to pip install the latest megablocks?
Issue - State: closed - Opened by fanshiqing about 1 year ago - 2 comments
#31 - Enable running MegaBlocks MoE without bias
Pull Request - State: closed - Opened by vchiley about 1 year ago
#30 - Fix activation quantization
Pull Request - State: closed - Opened by dblalock about 1 year ago - 4 comments
#29 - Remove unusued import
Pull Request - State: closed - Opened by mvpatel2000 about 1 year ago - 1 comment
#28 - Fix grouped GEMM API
Pull Request - State: closed - Opened by tgale96 about 1 year ago
#27 - Small optimizations for EP/TP
Pull Request - State: closed - Opened by tgale96 about 1 year ago
#26 - Support memory_optimized_mlp with grouped_mlp.
Pull Request - State: closed - Opened by tgale96 about 1 year ago
#25 - Gate grouped gemm install
Pull Request - State: closed - Opened by mvpatel2000 about 1 year ago - 2 comments
#24 - Make MegaBlocks go vroom on Hopper.
Pull Request - State: closed - Opened by tgale96 about 1 year ago - 1 comment
#23 - Add optional activation quantization
Pull Request - State: closed - Opened by dblalock about 1 year ago - 7 comments
#22 - update Megatron-LM submodule and update a test script
Pull Request - State: closed - Opened by feifeibear about 1 year ago
#21 - Does megablocks support the true expert parallelism?
Issue - State: closed - Opened by feifeibear about 1 year ago - 2 comments
#20 - Fix weight gradients with expert model parallelism.
Pull Request - State: closed - Opened by tgale96 about 1 year ago
#19 - Enable FSDP sharding for bias
Pull Request - State: closed - Opened by b-chu about 1 year ago - 1 comment
#18 - multi-node problem
Issue - State: closed - Opened by sudahui about 1 year ago - 5 comments
#17 - Activation memory optimization
Pull Request - State: closed - Opened by tgale96 over 1 year ago - 2 comments
#16 - Update citation in README to MLSys
Pull Request - State: closed - Opened by deepakn94 over 1 year ago
#15 - Adding support for tensor model parallelism when expert_parallel_world_size > num_experts.
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#14 - Use builtin decorators for AMP.
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#13 - Update out-of-date README.
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#12 - Minor cleanup
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#11 - Add support for fully-sharded data parallelism.
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#10 - Add flag to force uniform assignment to experts for load balancing.
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#9 - updt setup.py; fix tokens_per_expert casting
Pull Request - State: closed - Opened by vchiley over 1 year ago
#8 - add guangnian webtext2 training scripts
Pull Request - State: closed - Opened by feifeibear over 1 year ago
#7 - add guangnian webtext2 training scripts
Pull Request - State: closed - Opened by feifeibear over 1 year ago
#6 - Optimizations for top_k > 1
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#5 - Switch dMoE models to use bfloat16
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#4 - Add support for bfloat16 and AdaFactor
Pull Request - State: closed - Opened by tgale96 over 1 year ago
#3 - Current installation instructions don't quite work
Issue - State: closed - Opened by deepakn94 almost 2 years ago - 1 comment
#2 - Re-factoring for Composer integration.
Pull Request - State: closed - Opened by tgale96 almost 2 years ago
#1 - Remove Megatron dependency from core layers and tests.
Pull Request - State: closed - Opened by tgale96 almost 2 years ago