Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / lvwerra/trl issues and pull requests
#386 - Lora weights are not merged properly in merge_peft_adapter
Issue -
State: open - Opened by vvasily over 1 year ago
#385 - "RuntimeError: where expected condition to be a boolean tensor, but got a tensor with dtype Half" when running rf_training.py — what causes this error?
Issue -
State: open - Opened by xuyingjie521 over 1 year ago
#384 - Is there a code example for this? Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU
Issue -
State: open - Opened by shan23chen over 1 year ago
#383 - [`core`] Add 4bit QLora
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#382 - the stack_llama example got stuck when hit the save_freq of 1000
Issue -
State: open - Opened by ljdavns over 1 year ago
- 1 comment
#381 - Problem with finetuning reward model
Issue -
State: open - Opened by GooDRomka over 1 year ago
- 1 comment
#380 - from_pretrain with peft adapter on the hub (#379)
Pull Request -
State: open - Opened by glerzing over 1 year ago
- 1 comment
#379 - from_pretrain with peft adapter on the hub
Issue -
State: open - Opened by glerzing over 1 year ago
- 3 comments
#378 - Unable to save model while using Deepspeed Zero stage 3
Issue -
State: open - Opened by sparshgupta3 over 1 year ago
#377 - [`core`] Fix warning issue
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#376 - The issue of reward_model output length
Issue -
State: open - Opened by lurenlym over 1 year ago
- 4 comments
#375 - best-of-n sampler class
Pull Request -
State: open - Opened by metric-space over 1 year ago
- 2 comments
#374 - Unable to pass Dataset because of warning issue
Issue -
State: closed - Opened by Myashka over 1 year ago
- 1 comment
#373 - 0 abstraction RL - a single model for RM & Value Head
Pull Request -
State: open - Opened by younesbelkada over 1 year ago
- 1 comment
#372 - Why does `ppo_trainer.generate` report the following error? Please help.
Issue -
State: open - Opened by xuyingjie521 over 1 year ago
- 4 comments
#371 - Delete test_training.py
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#370 - in merge_peft_adapter.py, why not merge lora model's weights into the base model?
Issue -
State: closed - Opened by aling1472 over 1 year ago
- 2 comments
#369 - Suboptimal text generated in comparison to the Spaces demo
Issue -
State: open - Opened by prasad4fun over 1 year ago
- 2 comments
#368 - Llama Reward Model is incorrectly merged
Issue -
State: open - Opened by mnoukhov over 1 year ago
- 11 comments
#367 - [`docs`] fix SFT doc
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#366 - Remove obsolete layer_norm_names parameter and add peft>=0.3.0 to requirements
Pull Request -
State: closed - Opened by teticio over 1 year ago
- 1 comment
#365 - Is ref_model None when using LoRA?
Issue -
State: closed - Opened by akk-123 over 1 year ago
- 1 comment
#364 - ValueError: weight is on the meta device, we need a 'value' to put in on 0
Issue -
State: closed - Opened by xuyingjie521 over 1 year ago
- 2 comments
#363 - add is_trainable in kwargs
Pull Request -
State: closed - Opened by Opdoop over 1 year ago
- 1 comment
#362 - ValueError: num_samples should be a positive integer value, but got num_samples=0
Issue -
State: closed - Opened by naity over 1 year ago
- 3 comments
#361 - RuntimeError: 'weight' must be 2-D
Issue -
State: open - Opened by yashpreets over 1 year ago
- 1 comment
#360 - RuntimeError: size mismatch when train with llama 7b/30b (deepspeed zero3)
Issue -
State: open - Opened by kebijuelun over 1 year ago
- 1 comment
#359 - stack_llama: add parameter to control max_length (to mitigate OOM errors)
Pull Request -
State: closed - Opened by teticio over 1 year ago
- 1 comment
#358 - stack_llama: update instructions in README, fix broken _get_submodules and save tokenizer
Pull Request -
State: closed - Opened by teticio over 1 year ago
- 5 comments
#357 - Getting `trainer.generate()` to generate output using the reference model
Issue -
State: open - Opened by philharmonikerzzy over 1 year ago
- 2 comments
#356 - [StackLLaMA] Problems running reward_modeling.py using gpt2 as base for reward model
Issue -
State: open - Opened by samuelhoglund over 1 year ago
- 2 comments
#355 - Negative KL and generation args
Issue -
State: open - Opened by zwhe99 over 1 year ago
- 4 comments
#354 - Potential typo when calculating reference log probabilities?
Issue -
State: closed - Opened by rmill040 over 1 year ago
- 2 comments
#353 - rl_training.py got an unexpected keyword argument 'early_stopping'
Issue -
State: closed - Opened by prasad4fun over 1 year ago
- 2 comments
#352 - [`PPO`] Relax negative KL constraint
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#351 - UserWarning: KL divergence is starting to become negative: -0.18
Issue -
State: open - Opened by kebijuelun over 1 year ago
- 2 comments
#350 - [Feature] support IterableDataset in ppo training
Issue -
State: open - Opened by jiangwy99 over 1 year ago
#349 - BUG: Deepspeed doesn't work with PEFT integration
Issue -
State: open - Opened by avacaondata over 1 year ago
- 1 comment
#348 - StackLlama, which special tokens to use with tokenizer
Issue -
State: closed - Opened by mnoukhov over 1 year ago
- 2 comments
#347 - Merged RL model performed badly
Issue -
State: closed - Opened by vpegasus over 1 year ago
- 9 comments
#346 - [discussion] Why does SFT's train dataset use the packing technique?
Issue -
State: closed - Opened by echoht over 1 year ago
- 4 comments
#345 - [stack llama] dataload and tokenizer error!
Issue -
State: open - Opened by echoht over 1 year ago
- 1 comment
#344 - Is the mask correct for batch generation in the trl.Trainer.generate implementation?
Issue -
State: open - Opened by knowledgehacker over 1 year ago
- 1 comment
#343 - TypeError: LlamaForCausalLM.__init__() got an unexpected keyword argument 'layer_norm_names'
Issue -
State: closed - Opened by kebijuelun over 1 year ago
- 1 comment
#342 - Fix bug when loading local peft model
Pull Request -
State: closed - Opened by Opdoop over 1 year ago
- 9 comments
#341 - [BUG] Load local peft model while base model weight and lora weight are saved in different locations
Issue -
State: closed - Opened by Opdoop over 1 year ago
#340 - AttributeError: 'DistributedDataParallel' object has no attribute 'generate'
Issue -
State: open - Opened by haoemo over 1 year ago
- 3 comments
#339 - Fix argument's description
Pull Request -
State: closed - Opened by vinhkhuc over 1 year ago
- 1 comment
#338 - CPU/CUDA device error with `supervised_finetuning.py`
Issue -
State: closed - Opened by kl2004 over 1 year ago
- 2 comments
#337 - [StackLLaMA] 0 trainable params when loading LLaMA-7B in 8bit
Issue -
State: open - Opened by samuelhoglund over 1 year ago
- 7 comments
#336 - [`SFT`] Fix sft issues
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#335 - [`SFT`] SFT fixes
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#334 - Potential bug in PPOTrainer code.
Issue -
State: open - Opened by sverneka over 1 year ago
- 3 comments
#333 - Generation not working properly
Issue -
State: open - Opened by TheMrguiller over 1 year ago
- 13 comments
#332 - trl with sentence-transformers?
Issue -
State: open - Opened by leobaumgardt over 1 year ago
- 2 comments
#331 - TRL Currently breaks with bfloat16. Fixes `mini_batch_data` to support BFloat16
Pull Request -
State: open - Opened by JulesGM over 1 year ago
- 3 comments
#330 - [`PPOTrainer`] Fix tensorboard issue
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 2 comments
#329 - ValueError: can only convert an array of size 1 to a Python scalar
Issue -
State: closed - Opened by vpegasus over 1 year ago
- 5 comments
#328 - RuntimeError: expected scalar type Half but found Float when finetuning the frozen 8-bit Low Rank Adapter (clm_finetune_peft_imdb.py)
Issue -
State: open - Opened by leclem over 1 year ago
- 2 comments
#327 - RLHF pipeline for StackLLama with pretrained models doesn't work
Issue -
State: open - Opened by vvasily over 1 year ago
- 2 comments
#326 - 140/best n sampling
Pull Request -
State: closed - Opened by metric-space over 1 year ago
- 8 comments
#325 - Stack-llama rl_training script: CUDA Index error
Issue -
State: closed - Opened by jeromeku over 1 year ago
- 8 comments
#324 - added doc for using torch.distributed.launch/run
Pull Request -
State: closed - Opened by oroojlooy over 1 year ago
- 3 comments
#323 - [`core`] officially support SFT (Supervised Finetuning)
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#322 - Env Rewards increasing but Model Output Quality going down
Issue -
State: open - Opened by bhavnicksm over 1 year ago
- 1 comment
#321 - Why is the backward step in ppo_trainer not handled by accelerate's accumulate?
Issue -
State: open - Opened by sandeepchittilla over 1 year ago
- 1 comment
#320 - [`Docs`] Add details on multi-GPU / multi-node
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#319 - Problem when using seq2seq model
Issue -
State: open - Opened by eleluong over 1 year ago
- 3 comments
#318 - [`CI`] Fix broken tests
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 1 comment
#317 - Update README
Issue -
State: closed - Opened by oroojlooy over 1 year ago
- 4 comments
#316 - Tensor mismatch RuntimeError when using accelerate with gpt2-sentiment.py
Issue -
State: open - Opened by oroojlooy over 1 year ago
- 2 comments
#315 - Give a key to the wandb PPOConfig config entry
Pull Request -
State: closed - Opened by JulesGM over 1 year ago
- 9 comments
#314 - Best Way To Include Multiple Rewards?
Issue -
State: open - Opened by JosephGatto over 1 year ago
- 1 comment
#313 - Issues running stack llama example
Issue -
State: closed - Opened by vshah505 over 1 year ago
- 3 comments
#312 - fixed typo in error message
Pull Request -
State: closed - Opened by soerenarlt over 1 year ago
- 1 comment
#311 - will LoRA decrease the quality of generated reward model?
Issue -
State: closed - Opened by yzxyzh over 1 year ago
#310 - why can't I download the stack llama rm dataset from huggingface?
Issue -
State: closed - Opened by yzxyzh over 1 year ago
- 4 comments
#309 - fix DS for peft ref_model in ppo trainer
Pull Request -
State: closed - Opened by halfrot over 1 year ago
- 5 comments
#308 - runtime error in sentiment.py
Issue -
State: closed - Opened by amarazad over 1 year ago
- 2 comments
#307 - StackLLaMA - ValueError: Please specify `target_modules` in `peft_config`
Issue -
State: closed - Opened by thinh-huynh-re over 1 year ago
- 2 comments
#306 - Supporting Custom HuggingFace model in TRL
Issue -
State: open - Opened by srikar2097 over 1 year ago
#305 - Question about generation_kwargs in Stack_LLama rl_training
Issue -
State: closed - Opened by Kororinpas over 1 year ago
- 1 comment
#304 - When using accelerate with deepspeed and peft, any accelerator_kwargs passed to PPOConfig( ) are not used?
Issue -
State: open - Opened by sandeepchittilla over 1 year ago
- 3 comments
#303 - [`core`] Officially Support Reward Modeling
Pull Request -
State: closed - Opened by younesbelkada over 1 year ago
- 2 comments
#302 - The size of trained data with ConstantLengthDataset ?
Issue -
State: open - Opened by AIchenkai over 1 year ago
- 1 comment
#301 - Support for Codegen gpt2 model
Issue -
State: closed - Opened by amarazad over 1 year ago
- 3 comments
#300 - Does clm_finetune_peft_imdb.py need 40GB?
Issue -
State: open - Opened by vpegasus over 1 year ago
- 18 comments
#299 - Is value head gradient cut off?
Issue -
State: closed - Opened by likenneth over 1 year ago
- 3 comments
#298 - Fix arguments description
Pull Request -
State: closed - Opened by lvzii over 1 year ago
- 1 comment
#296 - Bug in the `PPOTrainer.log_stats` code
Issue -
State: closed - Opened by JulesGM over 1 year ago
- 2 comments
#295 - Log Token distribution of Query / Response
Pull Request -
State: closed - Opened by natolambert over 1 year ago
- 2 comments
#294 - clean examples folder
Pull Request -
State: closed - Opened by natolambert over 1 year ago
- 1 comment
#292 - Request for Adapter Script to run StackLLaMA locally
Issue -
State: closed - Opened by SupreethRao99 over 1 year ago
- 1 comment
#290 - Fine tuning stack_llama
Issue -
State: open - Opened by imrankh46 over 1 year ago
- 9 comments
#289 - Update requirements.txt
Pull Request -
State: open - Opened by cyrilzakka over 1 year ago
- 5 comments
#288 - What does `model_ref` do
Issue -
State: open - Opened by zyzhang1130 over 1 year ago
- 7 comments
#287 - Cannot run stackllama example
Issue -
State: open - Opened by fecet over 1 year ago
- 2 comments
#285 - RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Issue -
State: open - Opened by bingjie3216 over 1 year ago
- 5 comments
#282 - Questions about llama examples.
Issue -
State: closed - Opened by DespairL over 1 year ago
#281 - Loss spike check
Pull Request -
State: open - Opened by edbeeching over 1 year ago
- 2 comments