CarperAI/trlx issues and pull requests

#103 - Add support for more `CausalLM`s

Pull Request - State: closed - Opened by jon-tow about 2 years ago - 6 comments

#102 - Restructure sweeps for reuse

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 2 comments

#101 - Change to Ray Tune sweeps

Pull Request - State: closed - Opened by LouisCastricato about 2 years ago - 2 comments

#100 - Force class registry via imports

Pull Request - State: closed - Opened by jon-tow about 2 years ago

#99 - Fix github build

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 1 comment

#98 - Add optional normalization (cont.)

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 6 comments

#97 - Update README to align nomenclature correctness

Pull Request - State: closed - Opened by ayulockin about 2 years ago

#96 - NeMo-Megatron Integration

Issue - State: closed - Opened by cat-state about 2 years ago - 1 comment
Labels: feature request

#95 - Add optional reward scaling

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 4 comments

#94 - NeoX ILQL (WIP)

Pull Request - State: closed - Opened by cat-state about 2 years ago

#93 - Update readme instructions

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago

#92 - rerun https://github.com/CarperAI/trlx/pull/89

Pull Request - State: closed - Opened by cat-state about 2 years ago

#91 - Fix slow ilql eval

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago

#90 - Increase default eval interval

Pull Request - State: closed - Opened by cat-state about 2 years ago - 2 comments

#89 - ILQL loss refactor

Pull Request - State: closed - Opened by cat-state about 2 years ago - 2 comments

#88 - Refactor PPO objective function

Pull Request - State: closed - Opened by jon-tow about 2 years ago - 1 comment

#87 - Fix pipeline's context overflow

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago

#84 - Add examples tip to contribution guide

Pull Request - State: closed - Opened by jon-tow about 2 years ago

#83 - Best practices on repeatedly generating experience and training on it?

Issue - State: closed - Opened by paulbricman about 2 years ago - 2 comments

#82 - Update `TrainConfig` optimizer hyperparameters

Pull Request - State: closed - Opened by jon-tow about 2 years ago - 5 comments

#81 - EXAMPLE : Interpreter grounded Neural Program Synthesis [WIP]

Pull Request - State: closed - Opened by reshinthadithyan about 2 years ago - 5 comments

#80 - Add LORA support to TRLX

Issue - State: closed - Opened by ethankim00 about 2 years ago - 2 comments

#79 - How to attribute different rewards to parts of the same rollout with PPO?

Issue - State: open - Opened by paulbricman about 2 years ago - 5 comments
Labels: feature request

#78 - Add `entity` name config for `wandb` logging

Pull Request - State: closed - Opened by jon-tow about 2 years ago

#77 - [fix] Remove stale options from `ppo_gptj.yml`

Pull Request - State: closed - Opened by jon-tow about 2 years ago - 1 comment

#76 - Hyperparameter Optimization with Ray Tune and Weights and Biases

Pull Request - State: closed - Opened by ayulockin about 2 years ago - 2 comments

#75 - NeoX Integrate WIP

Pull Request - State: closed - Opened by cat-state about 2 years ago

#74 - Example/Test Model Benchmarks (Canonical WandB runs)

Issue - State: closed - Opened by cat-state about 2 years ago - 6 comments
Labels: feature request

#73 - change version in package to match lib

Pull Request - State: closed - Opened by cat-state about 2 years ago

#72 - Amos optimizer support

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 4 comments

#71 - Docs

Pull Request - State: closed - Opened by shahbuland about 2 years ago - 4 comments

#70 - Add ckpt/ to gitignore

Pull Request - State: closed - Opened by ayulockin about 2 years ago

#69 - How to attribute reward to multiple model runs in the same trajectory with PPO

Issue - State: open - Opened by dpaleka about 2 years ago - 7 comments
Labels: feature request

#68 - Support for RLHF tuned Seq2Seq models

Issue - State: closed - Opened by VenkateshDas about 2 years ago - 5 comments

#67 - Python 3.7 support (for Colab)

Issue - State: closed - Opened by honglu2875 about 2 years ago - 1 comment
Labels: bug

#66 - Stale configs for `ppo_gptj.yml`

Issue - State: closed - Opened by jon-tow about 2 years ago - 1 comment
Labels: bug

#65 - How we can improve the documentation for beginners

Issue - State: open - Opened by simoninithomas about 2 years ago - 6 comments
Labels: documentation

#64 - Update documentation (first review)

Pull Request - State: closed - Opened by simoninithomas about 2 years ago - 3 comments

#63 - Hyperparameter Optimization with Weights and Biases

Pull Request - State: closed - Opened by ayulockin about 2 years ago

#62 - Download simulacra

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago

#61 - Conceptual explanation of hydra models

Issue - State: closed - Opened by boblee22 about 2 years ago - 1 comment
Labels: documentation, feature request

#60 - Examples/simulacra.py doesn't work

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 1 comment
Labels: bug

#59 - What deepspeed config was this tested on?

Issue - State: closed - Opened by Breakend about 2 years ago - 6 comments
Labels: documentation

#58 - trlx has no attribute 'train'

Issue - State: closed - Opened by alat-rights about 2 years ago - 9 comments
Labels: bug

#57 - Fix grammar (tense)

Pull Request - State: closed - Opened by mrm8488 about 2 years ago - 1 comment

#54 - Add Jax support

Issue - State: open - Opened by Dahoas about 2 years ago - 3 comments
Labels: feature request

#53 - PPO Implementation Details - Checklist

Issue - State: closed - Opened by herbiebradley about 2 years ago - 2 comments

#52 - [docs] Add `CONTRIBUTING.md`

Pull Request - State: closed - Opened by jon-tow about 2 years ago

#51 - DDP and hydra model

Issue - State: closed - Opened by maxreciprocate about 2 years ago - 6 comments
Labels: bug

#50 - AttributeError: 'DistributedDataParallel' object has no attribute 'generate'

Issue - State: closed - Opened by boblee22 about 2 years ago - 10 comments
Labels: bug

#49 - Getting Started As a Domain Expert

Issue - State: closed - Opened by fredzannarbor about 2 years ago - 4 comments

#48 - Ppo reward normalization

Pull Request - State: closed - Opened by Dahoas about 2 years ago - 6 comments

#47 - Create CODEOWNERS

Pull Request - State: closed - Opened by cat-state about 2 years ago

#46 - FasterTransformer reward model support

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 9 comments

#45 - Add initial issue templates

Pull Request - State: closed - Opened by jon-tow about 2 years ago

#44 - Some readme improvements

Pull Request - State: closed - Opened by thedch about 2 years ago

#43 - Add initial GitHub workflows

Pull Request - State: closed - Opened by jon-tow about 2 years ago - 7 comments

#42 - [update] Improve package setup

Pull Request - State: closed - Opened by jon-tow about 2 years ago - 1 comment

#41 - Installation error due to multiple top-level packages

Issue - State: closed - Opened by jon-tow about 2 years ago - 1 comment

#40 - fix type errors and add mypy

Issue - State: closed - Opened by cat-state about 2 years ago - 1 comment
Labels: feature request

#39 - Add isort flake8

Pull Request - State: closed - Opened by cat-state about 2 years ago - 5 comments

#38 - Add flake8 and isort to pre-commit

Issue - State: closed - Opened by cat-state about 2 years ago - 2 comments

#37 - Add isort and black

Pull Request - State: closed - Opened by cat-state about 2 years ago

#36 - Add pre-commit with `black`

Pull Request - State: closed - Opened by cat-state about 2 years ago

#35 - Autoformat/Code Style

Issue - State: closed - Opened by cat-state about 2 years ago - 1 comment

#34 - Deadlock (nothing happening) in multi-GPU setting

Issue - State: closed - Opened by ayulockin about 2 years ago - 9 comments

#33 - Implemented hydra heads + adaptive kl

Pull Request - State: closed - Opened by Dahoas about 2 years ago

#32 - Support for Soft Prompts in PPO Model

Pull Request - State: closed - Opened by daia99 about 2 years ago - 2 comments

#31 - Docs

Pull Request - State: closed - Opened by shahbuland about 2 years ago

#30 - Large reward model issue.

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 10 comments

#29 - [bug] Support for Soft Prompts in PPO Model

Issue - State: closed - Opened by daia99 about 2 years ago - 2 comments

#28 - Self Play

Issue - State: open - Opened by cat-state about 2 years ago - 3 comments
Labels: feature request

#27 - Gradient metrics

Issue - State: open - Opened by cat-state about 2 years ago - 1 comment
Labels: feature request

#26 - RLHF with HH Anthropic data

Issue - State: closed - Opened by cat-state about 2 years ago - 2 comments

#25 - Learnt Reward Modelling example

Issue - State: closed - Opened by cat-state about 2 years ago - 4 comments

#24 - Simplify api

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 7 comments

#23 - Save strategy

Pull Request - State: closed - Opened by odellus about 2 years ago - 4 comments

#22 - Make ilql respect the config & remove sin

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 3 comments

#21 - Better colab notebook example.

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 2 comments

#20 - Stable diffusion support

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 3 comments
Labels: feature request

#19 - Integrate hydra ppo

Issue - State: closed - Opened by Dahoas about 2 years ago - 1 comment

#18 - Support direct loading into rollout storage for reward labeled datasets

Issue - State: closed - Opened by Dahoas about 2 years ago - 1 comment

#17 - Implement model saving/loading

Issue - State: closed - Opened by Dahoas about 2 years ago - 6 comments

#16 - Implement A2C

Issue - State: closed - Opened by Dahoas about 2 years ago - 14 comments
Labels: feature request

#15 - Remove boiler plate between ILQL and PPO

Issue - State: closed - Opened by LouisCastricato about 2 years ago

#14 - NeoX support

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 5 comments

#13 - Benchmark suite

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 5 comments
Labels: feature request

#12 - Hyper parameter sweeping

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 7 comments

#11 - Update ppo value head + print logs

Pull Request - State: closed - Opened by Dahoas about 2 years ago - 3 comments

#10 - remove sin

Pull Request - State: closed - Opened by ShivanshuPurohit about 2 years ago - 2 comments

#9 - Example in read me is now wrong

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 1 comment

#8 - Adds style file and reward function capabilities to ppo orchestrator

Pull Request - State: closed - Opened by LouisCastricato about 2 years ago - 1 comment

#7 - Create a unified API between PPO and ILQL

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 4 comments

#6 - stage ilql

Pull Request - State: closed - Opened by maxreciprocate about 2 years ago - 2 comments

#5 - QOL fixes

Pull Request - State: closed - Opened by LouisCastricato about 2 years ago - 1 comment

#4 - order of ops error

Pull Request - State: closed - Opened by MichaelEinhorn about 2 years ago - 3 comments

#3 - Create LICENSE

Pull Request - State: closed - Opened by LouisCastricato about 2 years ago

#2 - Fix typo

Pull Request - State: closed - Opened by mrm8488 about 2 years ago

#1 - ReadTheDocs

Issue - State: closed - Opened by LouisCastricato about 2 years ago - 2 comments
