Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / CarperAI/trlx issues and pull requests
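Since ecosyste.ms exposes this metadata through an open HTTP API, a listing like the one below can be reproduced programmatically. The sketch assumes a JSON endpoint shaped roughly like `hosts/{host}/repositories/{owner}/{repo}/issues` and fields named `state` and `labels`; the exact paths and field names are assumptions, so consult the ecosyste.ms API documentation before relying on them.

```python
import json
import urllib.request

# Base URL of the ecosyste.ms issues service; the path layout below is an assumption.
API = "https://issues.ecosyste.ms/api/v1"

def fetch_issues(owner, repo, host="GitHub"):
    """Fetch one page of issue/PR metadata for a repository (network required)."""
    url = f"{API}/hosts/{host}/repositories/{owner}/{repo}/issues"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def tally_by_label(issues):
    """Count open issues per label, e.g. {'bug': 12, 'feature request': 4}."""
    counts = {}
    for issue in issues:
        if issue.get("state") != "open":
            continue
        for label in issue.get("labels") or []:
            counts[label] = counts.get(label, 0) + 1
    return counts
```

For example, `tally_by_label(fetch_issues("CarperAI", "trlx"))` would summarize one page of the listing below by label, without scraping the HTML page itself.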
#601 - OOM error with PEFT LoRA on Llama2-7B
Issue -
State: open - Opened by arpaiva 2 months ago
- 1 comment
Labels: bug
#600 - Loading the checkpoint fails
Issue -
State: open - Opened by AfraAmini 3 months ago
Labels: bug
#599 - cannot import name 'flatten_dataclass' from 'trlx.data.ilql_types'
Issue -
State: open - Opened by AfraAmini 4 months ago
Labels: bug
#598 - possible bug in the order of prepare & load
Issue -
State: open - Opened by daiwk 4 months ago
- 1 comment
Labels: bug
#597 - Error when running Ray Tune to launch hyperparameter sweep
Issue -
State: open - Opened by Jing-L97 4 months ago
- 1 comment
Labels: bug
#596 - Crash when using save_state with deepspeed: `model.state_dict` functions incompatible with new deepspeed.
Issue -
State: open - Opened by JohannesAck 5 months ago
Labels: bug
#595 - Data Loader Bug when running t5_summarization_daily_cnn.py
Issue -
State: open - Opened by yunanyan 6 months ago
Labels: bug
#594 - Why is the train dataloader not prepared by Accelerator?
Issue -
State: open - Opened by Jiaxin-Wen 7 months ago
Labels: bug
#593 - TRLX Environment customization
Issue -
State: open - Opened by heraldiclily 7 months ago
#591 - Issue of tensors sharing memory
Issue -
State: open - Opened by heraldiclily 8 months ago
- 2 comments
Labels: bug
#590 - [New Feature Request] Add KTO
Issue -
State: open - Opened by 1485840691-eng 10 months ago
Labels: feature request
#589 - RLHF text summarization diverges
Issue -
State: open - Opened by AlisonWen 11 months ago
Labels: bug
#588 - Integration of Self-Play Fine-Tuning (SPIN) Method for Enhancing Large Language Models
Issue -
State: open - Opened by SeungyounShin 11 months ago
Labels: feature request
#587 - Runtime error when running examples (ilql_sentiments_t5.py)
Issue -
State: open - Opened by youxiho1 11 months ago
- 2 comments
Labels: bug
#586 - Add citation info from the EMNLP paper
Pull Request -
State: closed - Opened by StellaAthena 11 months ago
#585 - MPT is not working
Issue -
State: open - Opened by ouhenio 12 months ago
Labels: bug
#584 - when using the trlx PPO trainer to train a Llama 13B model, the saved Hugging Face model has strange keys; inference produces no output and no error, as if the result disappears
Issue -
State: open - Opened by ldh127 12 months ago
- 1 comment
Labels: bug
#583 - Faster & memory-efficient logprobs calculation
Pull Request -
State: open - Opened by li-plus 12 months ago
- 1 comment
#582 - Attention mask when calculating log ratio for PPO
Issue -
State: open - Opened by kmy17518 about 1 year ago
#581 - Multi-GPU training errors with peft
Issue -
State: open - Opened by AliengirlLiv about 1 year ago
- 1 comment
Labels: bug
#580 - Issue since most recent transformers update
Issue -
State: open - Opened by siddharthverma314 about 1 year ago
- 1 comment
Labels: bug
#579 - update(requirements.txt): to the latest `transformers` & `deepspeed`
Pull Request -
State: open - Opened by maxreciprocate about 1 year ago
- 1 comment
#578 - fix(modeling_base): partial loading of a sharded checkpoint
Pull Request -
State: closed - Opened by maxreciprocate about 1 year ago
#577 - resume_from_checkpoint doesn't work
Issue -
State: closed - Opened by andrewsiah about 1 year ago
- 1 comment
Labels: bug
#576 - fix model state_dict retrieving in zero3
Pull Request -
State: closed - Opened by Jingru about 1 year ago
#575 - support parallel reward function
Pull Request -
State: open - Opened by Jingru about 1 year ago
- 16 comments
#574 - Support parallel reward_fn in PPO training
Issue -
State: closed - Opened by Jingru about 1 year ago
Labels: feature request
#573 - support customized run_name in tracker
Pull Request -
State: closed - Opened by Jingru about 1 year ago
- 1 comment
#572 - Support customized run name
Pull Request -
State: closed - Opened by Jingru about 1 year ago
#571 - multi-GPU support for the summarization PPO example
Issue -
State: open - Opened by sayan1101 about 1 year ago
- 3 comments
Labels: bug
#570 - fix(examples/t5_summarize_cnn): move labels into `reward_fn` kwargs
Pull Request -
State: closed - Opened by maxreciprocate about 1 year ago
#569 - TypeError: reward_fn() got an unexpected keyword argument 'tokenizer'
Issue -
State: closed - Opened by sayan1101 about 1 year ago
- 1 comment
Labels: bug
#568 - support extra model and tokenizer configs during loading by from_pretrained in accelerate trainer
Pull Request -
State: closed - Opened by Jingru about 1 year ago
- 1 comment
#567 - Problem with LLama training with LoRA
Issue -
State: open - Opened by freQuensy23-coder about 1 year ago
- 3 comments
Labels: bug
#566 - fix(modeling_base): re-order `model.forward_kwargs` initialization
Pull Request -
State: closed - Opened by maxreciprocate about 1 year ago
- 1 comment
#565 - Question about saving peft checkpoint
Issue -
State: open - Opened by nhanph about 1 year ago
- 2 comments
Labels: bug
#564 - `position_ids` error in accelerate PPO trainer
Issue -
State: closed - Opened by pbarragan about 1 year ago
- 3 comments
Labels: bug
#563 - [Fix] Add default config LLaMa 2 converter Nemo
Pull Request -
State: closed - Opened by PhungVanDuy about 1 year ago
#562 - Add default config LLaMa 2 converter Nemo
Pull Request -
State: closed - Opened by PhungVanDuy about 1 year ago
#561 - How to generate reward-labeled dataset
Issue -
State: open - Opened by mikkelmedm about 1 year ago
Labels: feature request
#560 - feat: Add text environment examples
Pull Request -
State: open - Opened by PhungVanDuy about 1 year ago
#559 - How to train LLaMA2 on the summarize_rlhf example?
Issue -
State: open - Opened by missflash about 1 year ago
#557 - docs: update documentation
Pull Request -
State: closed - Opened by maxreciprocate about 1 year ago
- 1 comment
#556 - feat: Add support for DPO
Pull Request -
State: open - Opened by sandeepchittilla about 1 year ago
- 12 comments
#555 - Inference pipeline
Pull Request -
State: open - Opened by Dahoas about 1 year ago
- 1 comment
#554 - feat: add rejection finetuning trainer
Pull Request -
State: closed - Opened by maxreciprocate about 1 year ago
- 1 comment
#553 - Increasing max new tokens in generation arguments leads to errors
Issue -
State: open - Opened by wise-east about 1 year ago
- 3 comments
Labels: bug
#552 - fix(examples/hh): old gpt-j checkpoint loading
Pull Request -
State: closed - Opened by maxreciprocate about 1 year ago
#551 - revert(ppo_trainer): keep `save_pretrained` only over the base model
Pull Request -
State: closed - Opened by maxreciprocate over 1 year ago
#550 - Add trlX cite
Pull Request -
State: closed - Opened by Dahoas over 1 year ago
#549 - Unable to load and run inference on finetuned Alpaca model
Issue -
State: closed - Opened by doyled-it over 1 year ago
- 7 comments
Labels: bug
#548 - Memory occupancy with multi-GPU training
Issue -
State: open - Opened by yuanyaaa over 1 year ago
- 1 comment
#547 - chore(requirements.txt): update everything to the latest
Pull Request -
State: closed - Opened by maxreciprocate over 1 year ago
- 1 comment
#546 - mosaicml/mpt support
Pull Request -
State: closed - Opened by 50m-regent over 1 year ago
#545 - Unable to load the trained model to do the inference
Issue -
State: closed - Opened by CSerxy over 1 year ago
- 9 comments
#544 - RuntimeError: module must have its parameters and buffers on device
Issue -
State: closed - Opened by Adaickalavan over 1 year ago
- 4 comments
#543 - Freeze "output" embedding when using tied embeddings.
Pull Request -
State: closed - Opened by cat-state over 1 year ago
- 1 comment
#542 - Llama NeMo support
Pull Request -
State: closed - Opened by cat-state over 1 year ago
- 2 comments
#541 - Fix reward model state dict loading
Pull Request -
State: closed - Opened by maxjeblick over 1 year ago
#540 - ILQL training batch2 tensor dimensions error
Issue -
State: open - Opened by GenVr over 1 year ago
- 2 comments
#539 - Fix LLaMA example (LLaMA 2)
Pull Request -
State: closed - Opened by PhungVanDuy over 1 year ago
- 1 comment
#538 - Add DS-Chat comparison
Pull Request -
State: closed - Opened by cat-state over 1 year ago
- 2 comments
#536 - Caught signal 7 (Bus error: nonexistent physical address)
Issue -
State: closed - Opened by Adaickalavan over 1 year ago
- 5 comments
#535 - Model does not load in the expected dtype
Issue -
State: closed - Opened by AugustasMacijauskas over 1 year ago
- 5 comments
Labels: bug
#533 - Add support for LLaMA2
Issue -
State: closed - Opened by cvetanovskaa over 1 year ago
- 1 comment
Labels: feature request
#532 - Add support for Falcon 7B/40B
Issue -
State: open - Opened by cvetanovskaa over 1 year ago
- 1 comment
Labels: feature request
#530 - Value branch
Pull Request -
State: closed - Opened by Dahoas over 1 year ago
- 7 comments
#528 - Implement BoN for training and eval
Pull Request -
State: open - Opened by Dahoas over 1 year ago
- 5 comments
#526 - Fix logging
Pull Request -
State: closed - Opened by Dahoas over 1 year ago
#522 - Fix ordering of ppo epoch iteration
Pull Request -
State: closed - Opened by RobertKirk over 1 year ago
- 5 comments
#521 - Reward model negative numbers meaning
Issue -
State: closed - Opened by GenVr over 1 year ago
- 2 comments
#517 - Sanity check: SFT Model should be frozen (PPO)
Issue -
State: closed - Opened by Apsod over 1 year ago
- 2 comments
Labels: bug
#513 - 8-bit inference (#512)
Pull Request -
State: open - Opened by glerzing over 1 year ago
- 13 comments
#504 - Direct Policy Optimization
Issue -
State: open - Opened by Reichenbachian over 1 year ago
- 4 comments
Labels: feature request
#501 - strange design
Issue -
State: closed - Opened by efengx over 1 year ago
- 1 comment
Labels: bug
#498 - feat: support add tokens to tokenizer.
Pull Request -
State: open - Opened by congchan over 1 year ago
#497 - Add llama opendelta, float layer freezing, and optional ref model + zero3
Pull Request -
State: closed - Opened by Dahoas over 1 year ago
- 2 comments
#489 - fix(modeling_ppo): load reference head under zero3
Pull Request -
State: closed - Opened by maxreciprocate over 1 year ago
- 2 comments
#485 - Training stuck generating rollouts
Issue -
State: closed - Opened by javirandor over 1 year ago
- 6 comments
Labels: bug
#483 - RuntimeError using Accelerate + Zero-3 to launch `ppo_sentiments_llama.py` (uninitialized LayerNorm weight in Hydra head?)
Issue -
State: closed - Opened by mbalesni over 1 year ago
- 7 comments
Labels: bug
#482 - fix(modeling): deepspeed checkpoint loading
Pull Request -
State: closed - Opened by maxreciprocate over 1 year ago
- 3 comments
#481 - RuntimeError: Error(s) in loading state_dict for GPTRewardModel
Issue -
State: closed - Opened by maxjeblick over 1 year ago
- 9 comments
Labels: bug
#480 - How to use checkpoint?
Issue -
State: closed - Opened by mshtelma over 1 year ago
- 3 comments
Labels: bug
#479 - how to use hydra to train a PPO model?
Issue -
State: closed - Opened by akk-123 over 1 year ago
- 1 comment
Labels: documentation
#476 - LLaMA sentiment example doesn't work
Issue -
State: closed - Opened by mbalesni over 1 year ago
- 3 comments
Labels: bug
#474 - About gpt_reward_test
Issue -
State: closed - Opened by ItGirls over 1 year ago
- 4 comments
Labels: bug
#466 - tokenizer of the summarization rlhf example
Issue -
State: closed - Opened by DanqingZ over 1 year ago
- 1 comment
Labels: bug
#461 - RuntimeError using Accelerate + ZeRO Stage 3 to launch ppo_sentiments.py
Issue -
State: closed - Opened by alex-athanassakos over 1 year ago
- 4 comments
Labels: bug
#437 - When set the tracker to tensorboard, the following error happened.
Issue -
State: closed - Opened by cdxzyc over 1 year ago
- 3 comments
Labels: bug
#410 - !deepspeed examples/summarize_rlhf/sft/train_gptj_summarize.py is failing
Issue -
State: open - Opened by MyBruso over 1 year ago
- 8 comments
#408 - glm-10b: size mismatch error when training PPO using ZeRO-3
Issue -
State: closed - Opened by YaguangGong over 1 year ago
- 2 comments
Labels: bug
#385 - ppo trained model and checkpoints are not accessible
Issue -
State: closed - Opened by arpitg1991 over 1 year ago
- 2 comments
Labels: bug
#375 - [feat] Add LLaMa Model support for PPO
Pull Request -
State: closed - Opened by PhungVanDuy over 1 year ago
- 6 comments
#372 - Cuda OOM with PPO on GPT2-medium
Issue -
State: closed - Opened by OleksandrKorovii over 1 year ago
- 4 comments
Labels: bug
#367 - Questions about model size and num_processes in summarize-rlhf
Issue -
State: closed - Opened by agave233 over 1 year ago
- 4 comments
#301 - Pass extra information for the reward function with every sample.
Issue -
State: closed - Opened by JulesGM almost 2 years ago
- 1 comment
Labels: feature request
#283 - NeMo QOL improvements
Issue -
State: closed - Opened by cat-state almost 2 years ago
- 1 comment
Labels: feature request
#251 - empty `old_values` and `old_rewards` in `accelerate_ppo_trainer.loss()`
Issue -
State: closed - Opened by JustinAWei almost 2 years ago
- 3 comments
Labels: bug
#168 - Text generation
Issue -
State: closed - Opened by imrankh46 almost 2 years ago
- 7 comments
Labels: documentation
#104 - Ray Tune sweep does not support multi GPU
Issue -
State: closed - Opened by LouisCastricato about 2 years ago
- 20 comments
Labels: bug