allenai/open-instruct issues and pull requests

#397 - IF constraints functions

Pull Request - State: closed - Opened by ValentinaPy 4 months ago

#396 - all 25 functions to evaluate constraints from IF taxonomy

Pull Request - State: closed - Opened by ValentinaPy 4 months ago - 1 comment

#395 - Faster, less memory by caching dpo logprobs

Pull Request - State: closed - Opened by vwxyzjn 4 months ago - 3 comments

#394 - "Eval Script Naming Issue: 'trutufulqa.sh' should be 'truthfulqa.sh'"

Issue - State: closed - Opened by pendulum445 4 months ago - 2 comments

#393 - GPU Requirement for PPO - CUDA Out of Memory Error During PPO Training

Issue - State: closed - Opened by RoozbehNahavandi 4 months ago - 2 comments

#392 - Decontamination scripts

Pull Request - State: closed - Opened by pdasigi 4 months ago

#391 - Remove experiment name

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#390 - Prototype ppo + ray

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#389 - MMLU synthetic data

Pull Request - State: closed - Opened by nouhadziri 4 months ago

#388 - Add ds configs, better analysis

Pull Request - State: closed - Opened by natolambert 4 months ago - 1 comment

#387 - More systematic and reproducible conversion of SFT datasets

Pull Request - State: closed - Opened by yizhongw 4 months ago

#386 - GPT model synthetic data generation

Pull Request - State: open - Opened by VictoriaGraf 4 months ago

#385 - Make the oi safety eval beaker name shorter

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#384 - [qol] making leaderboard exps easier to search

Pull Request - State: closed - Opened by vwxyzjn 4 months ago - 1 comment

#383 - Update requirements.txt

Pull Request - State: closed - Opened by jacob-morrison 4 months ago

#382 - Redo dataset distribution plot from tulu 2

Pull Request - State: closed - Opened by natolambert 4 months ago

#381 - Faeze configs

Pull Request - State: closed - Opened by fabrahman 4 months ago

#380 - Which PPO model was trained using allenai/llama-3-tulu-2-8b-uf-mean-rm?

Issue - State: closed - Opened by arunasank 4 months ago - 4 comments

#379 - Add configs for DPO

Pull Request - State: closed - Opened by natolambert 4 months ago

#378 - config files for MATH and IF added

Pull Request - State: closed - Opened by fabrahman 4 months ago

#377 - Ground-Truth RL

Pull Request - State: closed - Opened by hamishivi 4 months ago

#376 - Remove newline at the end of tulu template.

Pull Request - State: closed - Opened by yizhongw 4 months ago

#375 - fix single node auto eval

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#374 - Fix simpo metrics

Pull Request - State: closed - Opened by vwxyzjn 4 months ago - 1 comment

#373 - Fix multi-node-eval

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#372 - Minor fix to DPO

Pull Request - State: closed - Opened by vwxyzjn 4 months ago - 1 comment

#371 - Add TPS metric

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#370 - Support multi-node with online DPO / PPO

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#368 - Onlinedpo Support rm with different vocab size

Pull Request - State: closed - Opened by vwxyzjn 4 months ago

#367 - Create synthetic MMLU via GPT-4

Pull Request - State: open - Opened by nouhadziri 5 months ago

#366 - files for multinode dpo

Pull Request - State: closed - Opened by jacob-morrison 5 months ago

#365 - Add gsm8k sft / generation dataset

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#364 - Online DPO optimization / support for RM of different vocab size

Pull Request - State: open - Opened by vwxyzjn 5 months ago

#363 - Fix memory leak in reward modeling

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#362 - Process reward modeling support

Pull Request - State: open - Opened by fabrahman 5 months ago

#361 - Update online_dpo.md

Pull Request - State: closed - Opened by ValentinaPy 5 months ago

#360 - Let reward modeling use ai2 entity

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#359 - What kind of rewarding mechanism does tulu-v2.5-13b-uf-rm use?

Issue - State: closed - Opened by arunasank 5 months ago - 1 comment

#358 - Fix auto eval parsing

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#357 - Hotfix for get beaker dataset id

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#356 - PPO / Online DPO docs cleanup

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#355 - Resize embedding for reference model in DPO

Pull Request - State: closed - Opened by yizhongw 5 months ago

#354 - Log more stuff with DPO

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#353 - Fix auto eval with preemptible jobs

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#352 - Some errors when doing evaluation with hf models

Issue - State: closed - Opened by JiaQiSJTU 5 months ago - 1 comment

#351 - Only add IB env vars for multinode jobs

Pull Request - State: closed - Opened by jacob-morrison 5 months ago - 1 comment

#350 - Clean the encoding function for messages and support using different chat templates

Pull Request - State: open - Opened by yizhongw 5 months ago - 3 comments

#349 - Allowing the RM to resize embedding to better utilize tensorcore

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#348 - Support finetune / dpo with hf revision

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#347 - Remove the unintended breakpoint in reward_modeling.py

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#346 - Fix auto-eval 2

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#345 - Fix auto-eval with preemption

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#344 - Save model type as dpo in dpo script

Pull Request - State: closed - Opened by jacob-morrison 5 months ago

#343 - Add corresponding lora finetuning with config scripts

Pull Request - State: closed - Opened by notoookay 5 months ago - 2 comments

#342 - Allow custom leaderboard name

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#341 - Add meta data post hoc

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#340 - Update dpo_utils.py

Pull Request - State: closed - Opened by ValentinaPy 5 months ago - 1 comment

#339 - Update requirements.txt

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#338 - Pin `dataset` dependency to fix image error

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#337 - More preference datasets

Pull Request - State: closed - Opened by natolambert 5 months ago

#336 - Update default for metadata upload

Pull Request - State: closed - Opened by hamishivi 5 months ago - 2 comments

#335 - Fix and improvements rejection sampling generation

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#334 - add ultrafeedback versions

Pull Request - State: closed - Opened by natolambert 5 months ago

#333 - Support hf revision when generating

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#332 - Synthetic preference dataset

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 3 comments

#331 - Adding support for latest OLMo architectures

Pull Request - State: closed - Opened by natolambert 5 months ago - 2 comments

#330 - Problems about resuming from checkpoint for finetune_with_lora

Issue - State: open - Opened by ypwang61 5 months ago - 1 comment

#329 - Add retry method to eval utils

Pull Request - State: closed - Opened by hamishivi 5 months ago

#328 - Update submit_eval_jobs.py

Pull Request - State: closed - Opened by jacob-morrison 5 months ago

#327 - multinode training

Issue - State: closed - Opened by clu2gt 5 months ago - 2 comments

#326 - Support mixing dataset for reward modeling

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#325 - Update reward_modeling.py

Pull Request - State: closed - Opened by ValentinaPy 5 months ago

#324 - Adding support for OLMoE SFT + DPO w/ and w/o load balancing loss

Pull Request - State: closed - Opened by jacob-morrison 5 months ago

#323 - OLMoE SFT

Pull Request - State: closed - Opened by natolambert 5 months ago

#322 - Changing default model dtype

Issue - State: closed - Opened by notoookay 5 months ago

#321 - Fix hf revision pass through

Pull Request - State: closed - Opened by hamishivi 5 months ago

#320 - Upload metadata along with model weights

Pull Request - State: closed - Opened by hamishivi 5 months ago

#319 - Add new DPO config for data mixing

Pull Request - State: closed - Opened by ValentinaPy 5 months ago

#318 - Adding configurations for SFT mixtures we tried.

Pull Request - State: closed - Opened by yizhongw 5 months ago

#317 - Win rate plot experiment stuff

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#316 - Merge Dockerfiles

Pull Request - State: closed - Opened by hamishivi 5 months ago - 1 comment

#315 - Small fix to merge lora script

Pull Request - State: closed - Opened by hamishivi 5 months ago

#314 - Update eval suite

Pull Request - State: closed - Opened by hamishivi 5 months ago

#313 - Scripts for building preference datasets

Pull Request - State: closed - Opened by natolambert 5 months ago - 1 comment

#312 - Update dpo_tune.py

Pull Request - State: closed - Opened by ValentinaPy 5 months ago

#311 - Faster build using docker cache

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#310 - Auto eval actually works

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#309 - add dataset mixing

Pull Request - State: closed - Opened by ValentinaPy 5 months ago

#308 - DPO Data Mixing

Pull Request - State: closed - Opened by natolambert 5 months ago

#307 - Use beaker dataset to submit autoeval

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#306 - Push rejection sampling dataset to beaker dataset as well.

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#305 - only upload the hf_metadata with beaker job

Pull Request - State: closed - Opened by vwxyzjn 5 months ago

#304 - Olmoe sft auxloss

Pull Request - State: closed - Opened by Muennighoff 5 months ago

#303 - Will PPO fine-tuning be added?

Issue - State: closed - Opened by notoookay 5 months ago - 2 comments
