Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / jzhang38/easycontext issues and pull requests
#54 - Inquiry Regarding Zero3 and Sequence Parallelism Compatibility
Issue -
State: open - Opened by SihengLi99 about 1 month ago
#53 - dependency conflict
Issue -
State: open - Opened by SihengLi99 about 1 month ago
#52 - saving intermediate checkpoints
Issue -
State: open - Opened by 1190303125 about 1 month ago
#51 - Cannot run the example script successfully.
Issue -
State: open - Opened by feifeibear about 2 months ago
#50 - feat: usp (unified sequence parallelism)
Pull Request -
State: closed - Opened by feifeibear about 2 months ago
#49 - unified sequence parallel
Pull Request -
State: closed - Opened by feifeibear about 2 months ago
#48 - add usp (Unified Sequence Parallelism)
Pull Request -
State: closed - Opened by feifeibear about 2 months ago
#47 - Size mismatch inside zigzag_ringattention backward
Issue -
State: open - Opened by jinghan23 about 2 months ago
#46 - RuntimeError: CUDA error: an illegal memory access was encountered
Issue -
State: open - Opened by uditsharma7 2 months ago
- 1 comment
#45 - Is this SFT method or PT method?
Issue -
State: open - Opened by 233function 2 months ago
- 1 comment
#44 - When will the model code support the Qwen series models?
Issue -
State: open - Opened by 233function 3 months ago
- 2 comments
#43 - TypeError: _flash_attn_forward() missing 1 required positional argument: 'softcap'
Issue -
State: open - Opened by Ziyang412 3 months ago
- 2 comments
#42 - How to estimate the maximum context length this repo can support for larger models?
Issue -
State: open - Opened by JingyangDeng 3 months ago
#41 - What technique is used to extend the context length?
Issue -
State: open - Opened by zzhdbw 3 months ago
- 2 comments
#40 - Does this repo work with FSDP or Zero?
Issue -
State: closed - Opened by LorrinWWW 4 months ago
- 1 comment
#39 - Logits shift in loss computation
Issue -
State: open - Opened by shivamag125 4 months ago
- 1 comment
#38 - Does it support SFT training?
Issue -
State: open - Opened by Lomax314 4 months ago
#37 - comparison of different sequence parallel methods
Issue -
State: open - Opened by sunying2018 4 months ago
- 1 comment
#36 - Dataset length question
Issue -
State: open - Opened by 5taku 4 months ago
- 2 comments
#35 - Will EasyContext support Qwen series model?
Issue -
State: open - Opened by WeixuanXiong 5 months ago
#34 - May I see your wandb report while training?
Issue -
State: open - Opened by fahadh4ilyas 5 months ago
#33 - How to auto-regression generate?
Issue -
State: open - Opened by yileld 5 months ago
#32 - about seq parallel global batch size
Issue -
State: closed - Opened by Liu-yuliang 5 months ago
- 2 comments
#31 - Rotary embedding size mismatch
Issue -
State: closed - Opened by Toan-Do 6 months ago
- 4 comments
#30 - Can we just use the sloth gradient checkpointing by uncommenting this line?
Issue -
State: open - Opened by vkaul11 6 months ago
- 4 comments
#29 - Can it train CodeLlama?
Issue -
State: closed - Opened by 5taku 6 months ago
- 2 comments
#28 - Support ulysses flash attn
Pull Request -
State: closed - Opened by Kwen-Chen 6 months ago
- 1 comment
#27 - how to infer the model?
Issue -
State: open - Opened by laoda513 6 months ago
#26 - Bug: Evals might be broken in pinned HF transformers version `cache=False`
Issue -
State: closed - Opened by michaelfeil 6 months ago
- 2 comments
#25 - shuffle bug?
Issue -
State: closed - Opened by fmmoret 6 months ago
- 3 comments
#24 - how to acquire the real whole batch sequence training loss (reduction_mode=mean)?
Issue -
State: open - Opened by littttttlebird 6 months ago
- 2 comments
#23 - attention_mask
Issue -
State: open - Opened by Nianqitongs 6 months ago
#22 - Need a running script for ‘dist_flash_attn’
Issue -
State: open - Opened by LzhinFdu 6 months ago
- 5 comments
#21 - Model stopped updating after 300-400 steps.
Issue -
State: closed - Opened by Bostoncake 6 months ago
- 9 comments
#20 - integrate it into the Transformers Trainer?
Issue -
State: open - Opened by jkl375 7 months ago
- 1 comment
#19 - Appending answer_ids to prompt in `eval_needle.py`
Issue -
State: closed - Opened by shan18 7 months ago
- 2 comments
#18 - Llama-2 models do not support `sliding_window` parameter
Issue -
State: closed - Opened by Bostoncake 7 months ago
- 3 comments
#17 - Confused by the train scripts
Issue -
State: closed - Opened by Bostoncake 7 months ago
- 3 comments
#16 - LongBench/InfiniteBench
Issue -
State: closed - Opened by sunying2018 7 months ago
#15 - Danube2 and Unsloth offloaded gradient ck
Pull Request -
State: closed - Opened by jzhang38 7 months ago
#14 - Error when the model vocabulary is larger than 120k
Issue -
State: closed - Opened by microhu 7 months ago
- 10 comments
#13 - error when finetuning yi-34b
Issue -
State: open - Opened by puppet101 7 months ago
- 2 comments
#12 - Data parallel + zigzag_ring_attn support
Issue -
State: open - Opened by WallE-Chang 7 months ago
- 2 comments
#11 - OOM when seq-length=700k
Issue -
State: open - Opened by jkl375 7 months ago
- 4 comments
#10 - Requirements for input length
Issue -
State: open - Opened by LzhinFdu 7 months ago
- 2 comments
#9 - train speed is too slow
Issue -
State: open - Opened by jkl375 7 months ago
- 2 comments
#8 - Not the real auto-regressive decoding mode ?
Issue -
State: open - Opened by microhu 7 months ago
- 1 comment
#7 - dataset description
Issue -
State: closed - Opened by sunying2018 7 months ago
- 3 comments
#6 - Which image is used for this job?
Issue -
State: open - Opened by AatroxZZ 7 months ago
- 9 comments
#5 - Modify interface
Pull Request -
State: closed - Opened by jzhang38 7 months ago
- 1 comment
#4 - Lightseq
Pull Request -
State: closed - Opened by jzhang38 7 months ago
- 5 comments
#3 - Does the input sharding match exact optimization of long sequence?
Issue -
State: closed - Opened by guanzhchen 7 months ago
- 2 comments
#2 - Switching to monkey patch
Pull Request -
State: closed - Opened by jzhang38 7 months ago
#1 - LICENSE
Issue -
State: closed - Opened by fmmoret 7 months ago
- 1 comment