Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / CarperAI/trlx issues and pull requests

#601 - OOM error with PEFT LoRA on Llama2-7B

Issue - State: open - Opened by arpaiva 2 months ago - 1 comment
Labels: bug

#600 - Load the checkpoint fails

Issue - State: open - Opened by AfraAmini 3 months ago
Labels: bug

#599 - cannot import name 'flatten_dataclass' from 'trlx.data.ilql_types'

Issue - State: open - Opened by AfraAmini 4 months ago
Labels: bug

#598 - maybe bug in prepare & load's order

Issue - State: open - Opened by daiwk 4 months ago - 1 comment
Labels: bug

#597 - Error when running Ray Tune to launch hyperparameter sweep

Issue - State: open - Opened by Jing-L97 4 months ago - 1 comment
Labels: bug

#595 - Data Loader Bug when running t5_summarization_daily_cnn.py

Issue - State: open - Opened by yunanyan 6 months ago
Labels: bug

#594 - Why train dataloader is not prepared by Accelerator

Issue - State: open - Opened by Jiaxin-Wen 7 months ago
Labels: bug

#593 - TRLX Environment customization

Issue - State: open - Opened by heraldiclily 7 months ago

#591 - Issue of tensors share memory

Issue - State: open - Opened by heraldiclily 8 months ago - 2 comments
Labels: bug

#590 - [New Feature Request] Add KTO

Issue - State: open - Opened by 1485840691-eng 10 months ago
Labels: feature request

#589 - RLHF text summarization diverges

Issue - State: open - Opened by AlisonWen 11 months ago
Labels: bug

#587 - Runtime error when running examples (ilql_sentiments_t5.py)

Issue - State: open - Opened by youxiho1 11 months ago - 2 comments
Labels: bug

#586 - Add citation info from the EMNLP paper

Pull Request - State: closed - Opened by StellaAthena 11 months ago

#585 - MPT is not working

Issue - State: open - Opened by ouhenio 12 months ago
Labels: bug

#583 - Faster & memory-efficient logprobs calculation

Pull Request - State: open - Opened by li-plus 12 months ago - 1 comment

#582 - Attention mask when calculating log ratio for PPO

Issue - State: open - Opened by kmy17518 about 1 year ago

#581 - Multi-GPU training errors with peft

Issue - State: open - Opened by AliengirlLiv about 1 year ago - 1 comment
Labels: bug

#580 - Issue since most recent transformers update

Issue - State: open - Opened by siddharthverma314 about 1 year ago - 1 comment
Labels: bug

#579 - update(requirements.txt): to the latest `transformers` & `deepspeed`

Pull Request - State: open - Opened by maxreciprocate about 1 year ago - 1 comment

#578 - fix(modeling_base): partial loading of a sharded checkpoint

Pull Request - State: closed - Opened by maxreciprocate about 1 year ago

#577 - resume_from_checkpoint doesn't work

Issue - State: closed - Opened by andrewsiah about 1 year ago - 1 comment
Labels: bug

#576 - fix model state_dict retrieving in zero3

Pull Request - State: closed - Opened by Jingru about 1 year ago

#575 - support parallel reward function

Pull Request - State: open - Opened by Jingru about 1 year ago - 16 comments

#574 - Support parallel reward_fn in PPO training

Issue - State: closed - Opened by Jingru about 1 year ago
Labels: feature request

#573 - support customized run_name in tracker

Pull Request - State: closed - Opened by Jingru about 1 year ago - 1 comment

#572 - Support customized run name

Pull Request - State: closed - Opened by Jingru about 1 year ago

#571 - multigpu support for summarization ppo example

Issue - State: open - Opened by sayan1101 about 1 year ago - 3 comments
Labels: bug

#569 - TypeError: reward_fn() got an unexpected keyword argument 'tokenizer'

Issue - State: closed - Opened by sayan1101 about 1 year ago - 1 comment
Labels: bug

#567 - Problem with LLama training with LoRA

Issue - State: open - Opened by freQuensy23-coder about 1 year ago - 3 comments
Labels: bug

#566 - fix(modeling_base): re-order `model.forward_kwargs` initialization

Pull Request - State: closed - Opened by maxreciprocate about 1 year ago - 1 comment

#565 - Question about saving peft checkpoint

Issue - State: open - Opened by nhanph about 1 year ago - 2 comments
Labels: bug

#564 - `position_ids` error in accelerate PPO trainer

Issue - State: closed - Opened by pbarragan about 1 year ago - 3 comments
Labels: bug

#563 - [Fix] Add default config LLaMa 2 converter Nemo

Pull Request - State: closed - Opened by PhungVanDuy about 1 year ago

#562 - Add default config LLaMa 2 converter Nemo

Pull Request - State: closed - Opened by PhungVanDuy about 1 year ago

#561 - How to generate reward-labeled dataset

Issue - State: open - Opened by mikkelmedm about 1 year ago
Labels: feature request

#560 - feats: Add text enviroment examples

Pull Request - State: open - Opened by PhungVanDuy about 1 year ago

#559 - How to train LLaMA2 on the summarize_rlhf example?

Issue - State: open - Opened by missflash about 1 year ago

#557 - docs: update documentation

Pull Request - State: closed - Opened by maxreciprocate about 1 year ago - 1 comment

#556 - feat: Add support for DPO

Pull Request - State: open - Opened by sandeepchittilla about 1 year ago - 12 comments

#555 - Inference pipeline

Pull Request - State: open - Opened by Dahoas about 1 year ago - 1 comment

#554 - feat: add rejection finetuning trainer

Pull Request - State: closed - Opened by maxreciprocate about 1 year ago - 1 comment

#553 - Increasing max new tokens for generation arguments lead to errors

Issue - State: open - Opened by wise-east about 1 year ago - 3 comments
Labels: bug

#552 - fix(examples/hh): old gpt-j checkpoint loading

Pull Request - State: closed - Opened by maxreciprocate about 1 year ago

#550 - Add trlX cite

Pull Request - State: closed - Opened by Dahoas over 1 year ago

#549 - Unable to load and run inference on finetuned Alpaca model

Issue - State: closed - Opened by doyled-it over 1 year ago - 7 comments
Labels: bug

#548 - Memory occupy with multi GPUs Training

Issue - State: open - Opened by yuanyaaa over 1 year ago - 1 comment

#547 - chore(requirements.txt): update everything to the latest

Pull Request - State: closed - Opened by maxreciprocate over 1 year ago - 1 comment

#546 - mosaicml/mpt support

Pull Request - State: closed - Opened by 50m-regent over 1 year ago

#545 - Unable to load the trained model to do the inference

Issue - State: closed - Opened by CSerxy over 1 year ago - 9 comments

#544 - RuntimeError: module must have its parameters and buffers on device

Issue - State: closed - Opened by Adaickalavan over 1 year ago - 4 comments

#543 - Freeze "output" embedding when using tied embeddings.

Pull Request - State: closed - Opened by cat-state over 1 year ago - 1 comment

#542 - Llama NeMo support

Pull Request - State: closed - Opened by cat-state over 1 year ago - 2 comments

#541 - Fix reward model state dict loading

Pull Request - State: closed - Opened by maxjeblick over 1 year ago

#540 - ILQL training batch2 tensor dimensions error

Issue - State: open - Opened by GenVr over 1 year ago - 2 comments

#539 - Fix LLaMA example (LLaMA 2)

Pull Request - State: closed - Opened by PhungVanDuy over 1 year ago - 1 comment

#538 - Add DS-Chat comparison

Pull Request - State: closed - Opened by cat-state over 1 year ago - 2 comments

#536 - Caught signal 7 (Bus error: nonexistent physical address)

Issue - State: closed - Opened by Adaickalavan over 1 year ago - 5 comments

#535 - Model does not load in the expected dtype

Issue - State: closed - Opened by AugustasMacijauskas over 1 year ago - 5 comments
Labels: bug

#533 - Add support for LLaMA2

Issue - State: closed - Opened by cvetanovskaa over 1 year ago - 1 comment
Labels: feature request

#532 - Add support for Falcon 7B/40B

Issue - State: open - Opened by cvetanovskaa over 1 year ago - 1 comment
Labels: feature request

#530 - Value branch

Pull Request - State: closed - Opened by Dahoas over 1 year ago - 7 comments

#528 - Implement BoN for training and eval

Pull Request - State: open - Opened by Dahoas over 1 year ago - 5 comments

#526 - Fix logging

Pull Request - State: closed - Opened by Dahoas over 1 year ago

#522 - Fix ordering of ppo epoch iteration

Pull Request - State: closed - Opened by RobertKirk over 1 year ago - 5 comments

#521 - Reward model negative numbers meaning

Issue - State: closed - Opened by GenVr over 1 year ago - 2 comments

#517 - Sanity check: SFT Model should be frozen (PPO)

Issue - State: closed - Opened by Apsod over 1 year ago - 2 comments
Labels: bug

#513 - 8-bit inference (#512)

Pull Request - State: open - Opened by glerzing over 1 year ago - 13 comments

#504 - Direct Policy Optimization

Issue - State: open - Opened by Reichenbachian over 1 year ago - 4 comments
Labels: feature request

#501 - strange design

Issue - State: closed - Opened by efengx over 1 year ago - 1 comment
Labels: bug

#498 - feat: support add tokens to tokenizer.

Pull Request - State: open - Opened by congchan over 1 year ago

#497 - Add llama opendelta, float layer freezing, and optional ref model + zero3

Pull Request - State: closed - Opened by Dahoas over 1 year ago - 2 comments

#489 - fix(modeling_ppo): load reference head under zero3

Pull Request - State: closed - Opened by maxreciprocate over 1 year ago - 2 comments

#485 - Training stuck generating rollouts

Issue - State: closed - Opened by javirandor over 1 year ago - 6 comments
Labels: bug

#482 - fix(modeling): deepspeed checkpoint loading

Pull Request - State: closed - Opened by maxreciprocate over 1 year ago - 3 comments

#481 - RuntimeError: Error(s) in loading state_dict for GPTRewardModel

Issue - State: closed - Opened by maxjeblick over 1 year ago - 9 comments
Labels: bug

#480 - How to use checkpoint?

Issue - State: closed - Opened by mshtelma over 1 year ago - 3 comments
Labels: bug

#479 - how to use hydra train ppo model?

Issue - State: closed - Opened by akk-123 over 1 year ago - 1 comment
Labels: documentation

#476 - LLaMA sentiment example doesn't work

Issue - State: closed - Opened by mbalesni over 1 year ago - 3 comments
Labels: bug

#474 - About gpt_reward_test

Issue - State: closed - Opened by ItGirls over 1 year ago - 4 comments
Labels: bug

#466 - tokenizer of the summarization rlhf example

Issue - State: closed - Opened by DanqingZ over 1 year ago - 1 comment
Labels: bug

#461 - RunTimeError using Accelerate + Zero Stage 3 to launch ppo_sentiments.py

Issue - State: closed - Opened by alex-athanassakos over 1 year ago - 4 comments
Labels: bug

#437 - When set the tracker to tensorboard, the following error happened.

Issue - State: closed - Opened by cdxzyc over 1 year ago - 3 comments
Labels: bug

#410 - !deepspeed examples/summarize_rlhf/sft/train_gptj_summarize.py is failing

Issue - State: open - Opened by MyBruso over 1 year ago - 8 comments

#408 - glm-10b, got size mismatch error when training ppo using zero3

Issue - State: closed - Opened by YaguangGong over 1 year ago - 2 comments
Labels: bug

#385 - ppo trained model and checkpoints are not accesible

Issue - State: closed - Opened by arpitg1991 over 1 year ago - 2 comments
Labels: bug

#375 - [feat] Add LLaMa Model support for PPO

Pull Request - State: closed - Opened by PhungVanDuy over 1 year ago - 6 comments

#372 - Cuda OOM with PPO on GPT2-medium

Issue - State: closed - Opened by OleksandrKorovii over 1 year ago - 4 comments
Labels: bug

#367 - Questions about model size and num_processes in summarize-rlhf

Issue - State: closed - Opened by agave233 over 1 year ago - 4 comments

#301 - Pass extra information for the reward function with every sample.

Issue - State: closed - Opened by JulesGM almost 2 years ago - 1 comment
Labels: feature request

#283 - NeMo QOL improvements

Issue - State: closed - Opened by cat-state almost 2 years ago - 1 comment
Labels: feature request

#251 - empty `old_values` and `old_rewards` in `accelerate_ppo_trainer.loss()`

Issue - State: closed - Opened by JustinAWei almost 2 years ago - 3 comments
Labels: bug

#168 - Text generation

Issue - State: closed - Opened by imrankh46 almost 2 years ago - 7 comments
Labels: documentation

#104 - Ray Tune sweep does not support multi GPU

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 20 comments
Labels: bug