Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / princeton-nlp/MeZO issues and pull requests
#44 - About gradient accumulation implementation
Issue -
State: open - Opened by xvyaward 11 days ago
#43 - zeroshot results of roberta-large
Issue -
State: open - Opened by pickpppcc 26 days ago
#42 - loss turns to 0 after several steps for llama2
Issue -
State: open - Opened by liuxiaozhu01 2 months ago
- 5 comments
#41 - Question about checkpointing
Issue -
State: open - Opened by zhaoaustin 2 months ago
- 1 comment
#40 - Full finetuning with Roberta-Large
Issue -
State: open - Opened by aparna-aketi 4 months ago
- 5 comments
#39 - Cannot reproduce the results for Roberta-large on SNLI with MeZO(LORA)
Issue -
State: open - Opened by Liu-M-H 4 months ago
- 3 comments
#38 - question for Linear Probing
Issue -
State: open - Opened by zhaoaustin 5 months ago
- 2 comments
#37 - question about MeZO-adam
Issue -
State: open - Opened by zhaoaustin 6 months ago
- 1 comment
#36 - Can you share the dataset class of SST-5, SNLI, TREC datasets?
Issue -
State: open - Opened by Ziiiirem 6 months ago
- 5 comments
#35 - roberta-large zero shot
Issue -
State: open - Opened by itongggg 7 months ago
#34 - can not reproduce the the result of roberta large on dataste sst-2
Issue -
State: open - Opened by itongggg 7 months ago
- 2 comments
#33 - Maybe need a requirement.txt file to facilitate environment preparation?
Issue -
State: open - Opened by lepangdan 9 months ago
- 1 comment
#32 - In which file is the code implemented by the algorithm?
Issue -
State: open - Opened by 1llss 11 months ago
- 1 comment
#31 - Zero Order implementation does not converge in CIFAR-10 dataset.
Issue -
State: open - Opened by amritansh6 12 months ago
- 1 comment
#30 - Standard FT does not work
Issue -
State: open - Opened by YaNgZhAnG-V5 about 1 year ago
- 4 comments
#29 - max_seq_length and max_seq_len confusion
Issue -
State: open - Opened by davidqqq about 1 year ago
- 1 comment
#28 - Cannot reproduce some results of OPT
Issue -
State: closed - Opened by WangFei-2019 about 1 year ago
- 3 comments
#27 - How to use MeZO in training a simple CIFAR-10 model
Issue -
State: open - Opened by Cascol-Chen about 1 year ago
- 3 comments
#26 - Add a pip-installable, simple implementation of MeZO (along with a distributed impl. and some tests)
Pull Request -
State: open - Opened by lebrice about 1 year ago
- 3 comments
#25 - Results of Trec dataset on Roberta-large(K=512) with MeZO(LoRA)
Issue -
State: open - Opened by Yanjun-Zhao about 1 year ago
- 8 comments
#24 - Inconsistent results of MEZO for RoBERTa-large on SST-2
Issue -
State: open - Opened by han678 over 1 year ago
#23 - MeZO on ChatGLM6B
Issue -
State: closed - Opened by CharonsPluto over 1 year ago
- 2 comments
#22 - LoRA & p-tuning with multi-GPU
Issue -
State: open - Opened by haozhouamzn over 1 year ago
- 3 comments
#21 - Cannot reproduce the results for RoBERTa on SST-2
Issue -
State: open - Opened by TrueNobility303 over 1 year ago
- 1 comment
#20 - llama2 problem
Issue -
State: open - Opened by ghost over 1 year ago
- 1 comment
#19 - ValueError: The model did not return a loss from the inputs, only the following keys: logits,past_key_values. For reference, the inputs it received are input_ids,attention_mask.
Issue -
State: closed - Opened by thistleknot over 1 year ago
- 2 comments
#18 - AttributeError: 'TrainingArguments' object has no attribute 'linear_probing'
Issue -
State: closed - Opened by thistleknot over 1 year ago
- 4 comments
#17 - Nanogpt implementation
Issue -
State: open - Opened by thistleknot over 1 year ago
- 3 comments
#16 - Cannot reproduce the results of OPT on SST2
Issue -
State: closed - Opened by sglucas over 1 year ago
- 15 comments
#15 - Results on WSC and WIC datasets cannot be reproduced on OPT-13B with MeZO
Issue -
State: open - Opened by MathIsAll over 1 year ago
- 5 comments
#14 - About experimentical setting of 1000 examples
Issue -
State: closed - Opened by sglucas over 1 year ago
- 2 comments
#13 - MeZO on continue pre-training
Issue -
State: open - Opened by shan23chen over 1 year ago
- 1 comment
#12 - deepspeed reference on colab
Issue -
State: closed - Opened by huu4ontocord over 1 year ago
- 2 comments
#11 - Getting a RuntimeError after training with mezo
Issue -
State: open - Opened by sowmaster over 1 year ago
- 6 comments
#10 - Which trainer to use
Issue -
State: open - Opened by HaniItani over 1 year ago
- 7 comments
#9 - MeZO running script for roberta-large is not working
Issue -
State: closed - Opened by sanyalsunny111 over 1 year ago
- 1 comment
#8 - gpt_neo not supported
Issue -
State: closed - Opened by thistleknot over 1 year ago
- 8 comments
#7 - Best parameters found for datasets
Issue -
State: open - Opened by vvvm23 over 1 year ago
- 3 comments
#6 - Not convergent in custom dataset.
Issue -
State: open - Opened by jcao-ai over 1 year ago
- 9 comments
#5 - Can you provide more details about how to run the code?
Issue -
State: closed - Opened by kiseliu over 1 year ago
- 1 comment
#4 - MeZo can be used in NLG tasks?
Issue -
State: open - Opened by anonNo2 over 1 year ago
- 5 comments
#3 - Fix typo in run.py
Pull Request -
State: closed - Opened by eltociear over 1 year ago
#2 - Impact of Dropout?
Issue -
State: closed - Opened by helpmefindaname over 1 year ago
- 1 comment
#1 - Any benchmark on (MeZO) v.s. (ZeRO + CpuOffload + Grad checkpointing) ?
Issue -
State: closed - Opened by xingchensong over 1 year ago
- 2 comments