allenai/open-instruct issues and pull requests

#525 - Olmo2 and olmoe support

Pull Request - State: closed - Opened by vwxyzjn 7 days ago

#524 - RLVR arguments clarification

Issue - State: open - Opened by hank0316 9 days ago - 1 comment

#523 - Optionally save value model + GRPO

Pull Request - State: closed - Opened by hamishivi 14 days ago

#522 - Add dataset cache / mixing support

Pull Request - State: open - Opened by vwxyzjn 14 days ago - 1 comment

#521 - Fix vLLM worker for new version

Pull Request - State: closed - Opened by hamishivi 15 days ago

#520 - Add a note on deepspeed's gradient accumulation

Pull Request - State: closed - Opened by vwxyzjn 15 days ago

#519 - Fix auto save logic

Pull Request - State: closed - Opened by vwxyzjn 15 days ago

#518 - Add weka setup

Pull Request - State: closed - Opened by vwxyzjn 15 days ago

#517 - NCCL_CUMEM_ENABLE fix.

Pull Request - State: closed - Opened by vwxyzjn 16 days ago

#516 - Quick fix

Pull Request - State: closed - Opened by vwxyzjn 17 days ago

#515 - quick fix on oe-eval

Pull Request - State: closed - Opened by vwxyzjn 17 days ago

#514 - Push evaluation results into the datalake

Pull Request - State: closed - Opened by vwxyzjn 17 days ago

#513 - Allow using tokenizer chat template

Pull Request - State: closed - Opened by hamishivi 17 days ago

#512 - Different vocabulary for policy and reward model

Issue - State: closed - Opened by ashish230897 17 days ago - 1 comment

#511 - Fix synth_pref functions

Pull Request - State: closed - Opened by ljvmiranda921 17 days ago - 1 comment

#510 - merge tokenization logic, and allow for pre-tokenization caching.

Issue - State: open - Opened by vwxyzjn 20 days ago

#509 - Downgrade deepspeed

Pull Request - State: closed - Opened by vwxyzjn 20 days ago

#508 - Silly bug

Pull Request - State: closed - Opened by vwxyzjn 20 days ago

#507 - Apply accelerate change to dpo cache script

Pull Request - State: closed - Opened by hamishivi 20 days ago

#506 - Unable to generate Synth_Pref dataset

Issue - State: open - Opened by ranarag 20 days ago - 2 comments
Labels: bug

#505 - Add `--try_auto_save_to_beaker` arg

Pull Request - State: closed - Opened by vwxyzjn 22 days ago

#504 - Add documentation on caching models.

Pull Request - State: closed - Opened by vwxyzjn 22 days ago - 1 comment

#503 - Winter cleaning

Issue - State: open - Opened by vwxyzjn 22 days ago

#502 - Use the latest OLMo2 image

Pull Request - State: closed - Opened by vwxyzjn 22 days ago

#501 - PPO codebase

Issue - State: closed - Opened by ashish230897 22 days ago - 3 comments

#500 - Unable to Reproduce Safety-Eval Results for TULU-3

Issue - State: closed - Opened by ranarag 24 days ago - 2 comments

#499 - Add Enforce Eager Flag

Pull Request - State: closed - Opened by hamishivi 24 days ago - 2 comments

#498 - How to finetune Qwen-1.5/DeepSeek-1.5B parameter models

Issue - State: closed - Opened by Adefioye 24 days ago - 1 comment

#497 - Question about bos token in alpaca_farm/run_eval.py

Issue - State: closed - Opened by ZeguanXiao 29 days ago - 1 comment

#496 - How to eval Super_ni

Issue - State: closed - Opened by Trae1ounG about 1 month ago - 1 comment

#495 - Potential bug in gradient accumulation

Issue - State: closed - Opened by yxchng about 1 month ago - 3 comments

#494 - Is there any easy way to add full eval data evaluation every n iterations to RLVR?

Issue - State: closed - Opened by yxchng about 1 month ago - 1 comment

#493 - Question about the releas

Issue - State: closed - Opened by PINE4PPLE about 1 month ago

#492 - uv2

Pull Request - State: closed - Opened by vwxyzjn about 1 month ago - 3 comments

#491 - tulu3 preference data pipeline which in report is inconsistent with this code-repo

Issue - State: open - Opened by scattw about 1 month ago - 1 comment

#490 - Update oe-eval.sh

Pull Request - State: closed - Opened by natolambert about 1 month ago

#489 - initial persona data gen 2 commit

Pull Request - State: closed - Opened by fabrahman about 1 month ago - 4 comments

#488 - Is resuming from last checkpoint not supported in ppo_vllm_thread_ray_gtrl.py?

Issue - State: closed - Opened by yxchng about 1 month ago - 1 comment

#487 - Recommendations for multi-node training of a 7B model with RL

Issue - State: closed - Opened by zhudefa about 1 month ago - 1 comment

#486 - Request for Code of Synthesizing for Target Skills

Issue - State: closed - Opened by DeepLSUN about 2 months ago - 3 comments

#485 - [Question] About the training time of RLVR

Issue - State: closed - Opened by chchch0109 about 2 months ago - 1 comment

#484 - 72B Model PPO Training Time

Issue - State: closed - Opened by KAKSIS about 2 months ago - 1 comment

#483 - [Question] Code for DPO Loss is not length normalised?

Issue - State: closed - Opened by carlos-gemmell about 2 months ago - 1 comment

#482 - SFT Loss unable to decrease on MATH data

Issue - State: closed - Opened by yxchng about 2 months ago - 1 comment

#481 - How load checkpoint to generate samples?

Issue - State: closed - Opened by zhudefa about 2 months ago - 1 comment

#480 - Request for Access to answer_extraction_model

Issue - State: closed - Opened by iseesaw about 2 months ago - 2 comments

#479 - Update README.md for @luca

Pull Request - State: closed - Opened by natolambert about 2 months ago

#478 - Update README.md for citation

Pull Request - State: closed - Opened by natolambert about 2 months ago

#477 - Questions about hyperparameters in Llama-3.1-Tulu-3-8B Reproduction

Issue - State: closed - Opened by wgimperial about 2 months ago - 4 comments

#476 - Suggestions for Training a SFT Model with Extremely Long Contexts (8k–64k Tokens)

Issue - State: closed - Opened by zhudefa about 2 months ago - 1 comment

#475 - Issue of using DeepSpeed with ZeRO Stage 3 optimization

Issue - State: closed - Opened by notoookay about 2 months ago - 1 comment

#474 - How to evaluate in local environment?

Issue - State: closed - Opened by zhudefa about 2 months ago - 1 comment

#473 - Use the latest image for olmo

Pull Request - State: closed - Opened by vwxyzjn about 2 months ago

#472 - use the latest oe-eval-image

Pull Request - State: closed - Opened by vwxyzjn about 2 months ago

#471 - Update README.md

Pull Request - State: closed - Opened by natolambert about 2 months ago

#470 - How to fine-tune Phi-3-small-128k-instruct for RLVR?

Issue - State: closed - Opened by yxchng about 2 months ago - 3 comments

#469 - Errors running tulu3_dpo_8b.yaml

Issue - State: closed - Opened by rghilduta 2 months ago - 3 comments

#468 - Fix bug. The parameter --keep_last_n_checkpoints -1 doesn't work.

Pull Request - State: open - Opened by shizhengLi 2 months ago - 2 comments

#467 - Will you support fine-tuning from olmo2?

Issue - State: open - Opened by zhudefa 2 months ago - 1 comment

#466 - how to train with own data?

Issue - State: closed - Opened by yxchng 2 months ago - 1 comment

#465 - TULU3 MATH eval

Issue - State: closed - Opened by ypwang61 2 months ago - 2 comments

#464 - effective batch size

Issue - State: closed - Opened by DachengLi1 2 months ago - 1 comment

#463 - What exactly have you done with MergeKit?

Issue - State: closed - Opened by sunshicheng1 2 months ago - 3 comments

#462 - Docker build faild caused by copy oe-eval-internal file not exist

Issue - State: closed - Opened by wgimperial 2 months ago - 2 comments

#461 - Bypass reward model usage when `reward_model_multiplier` is 0

Pull Request - State: closed - Opened by SumanthRH 2 months ago - 1 comment

#460 - Online DPO on 72B models

Issue - State: closed - Opened by KAKSIS 2 months ago - 3 comments

#458 - DPO/PPO/etc should default to HF chat template

Issue - State: open - Opened by hamishivi 2 months ago - 1 comment

#457 - store configs, minor improvements to pref data mixer

Pull Request - State: closed - Opened by natolambert 2 months ago

#456 - Where to find RLVR phase training?

Issue - State: closed - Opened by Wonder1905 2 months ago - 1 comment

#455 - Is there a bigcodebench evaluation in open source code?

Issue - State: closed - Opened by mst272 2 months ago - 1 comment

#454 - Cache utility

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#453 - Revert hf cache in weka

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#452 - Use standard weka path

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#451 - Add Acknowledgements

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#450 - File does not exists

Issue - State: closed - Opened by YathishPoojary98 2 months ago - 1 comment

#449 - Use hf cache in juptier, save tons of time downloading dataset / models

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#448 - Fix oe-eval gpu count

Pull Request - State: closed - Opened by hamishivi 2 months ago

#447 - Allow eval to different repo

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#446 - About cite/ack to OpenRLHF in README.md or paper.

Issue - State: closed - Opened by hijkzzz 2 months ago - 3 comments

#445 - Add requirements for decontamination code

Pull Request - State: closed - Opened by pdasigi 2 months ago - 1 comment

#444 - Fix broken image paths

Pull Request - State: closed - Opened by ljvmiranda921 2 months ago

#443 - Push preview

Pull Request - State: closed - Opened by vwxyzjn 2 months ago

#442 - Add code for synthetic preference pipeline

Pull Request - State: closed - Opened by ljvmiranda921 2 months ago - 1 comment

#441 - faster mmlu oe-eval

Pull Request - State: closed - Opened by hamishivi 2 months ago
