Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / rlhflow/rlhf-reward-modeling issues and pull requests
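The listing below is served by the ecosyste.ms Issues API. A minimal sketch of how a client might build a query for this repository and filter the decoded response — the endpoint path, the `page` parameter, and the response fields (`number`, `title`, `state`) are assumptions based on typical REST conventions for this service, not confirmed documentation:

```python
# Hypothetical base URL and path layout (assumption, not confirmed docs).
BASE = "https://issues.ecosyste.ms/api/v1"
HOST = "GitHub"
REPO = "rlhflow/rlhf-reward-modeling"

def issues_url(host: str, repo: str, page: int = 1) -> str:
    """Build the (assumed) paginated issues endpoint URL for a repository."""
    return f"{BASE}/hosts/{host}/repositories/{repo}/issues?page={page}"

def open_issues(records: list[dict]) -> list[dict]:
    """Keep only records whose state is 'open'."""
    return [r for r in records if r.get("state") == "open"]

# Sample records mirroring two entries from the listing below; a real
# client would fetch issues_url(...) with e.g. urllib.request and decode JSON.
sample = [
    {"number": 47, "title": "Update gemma_two_head.py", "state": "closed"},
    {"number": 46, "title": "Missing code for ODIN", "state": "open"},
]

print(issues_url(HOST, REPO))
print([r["number"] for r in open_issues(sample)])  # only #46 is open
```

The filtering step is plain client-side logic; whether the service also supports a server-side `state` query parameter is not confirmed here.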
#47 - Update gemma_two_head.py
Pull Request - State: closed - Opened by Lichang-Chen 2 months ago
#46 - Missing code for ODIN
Issue - State: open - Opened by maoliyuan 2 months ago - 1 comment
#45 - Update README.md
Pull Request - State: closed - Opened by Chenluye99 3 months ago
#44 - Update deepseek Top-1 acc on MATH
Pull Request - State: closed - Opened by hanningzhang 3 months ago
#43 - Update README.md of Deepseek Pass 1 acc
Pull Request - State: closed - Opened by hanningzhang 3 months ago
#42 - Rlhflow math
Pull Request - State: closed - Opened by WeiXiongUST 3 months ago
#41 - add experiment setup and results for the math prm
Pull Request - State: closed - Opened by hanningzhang 3 months ago
#40 - Rlhflow math: evaluation code and evaluation description in readme
Pull Request - State: closed - Opened by hanningzhang 3 months ago
#39 - Pixi package management; notebooks folders; quarto paper setup.
Pull Request - State: closed - Opened by professorwug 3 months ago
#38 - ODIN
Pull Request - State: closed - Opened by Lichang-Chen 3 months ago - 1 comment
#37 - Question regarding ARMO stage2-train code
Issue - State: open - Opened by RayWang-iat 4 months ago
#36 - stage1-train:RuntimeError: torch.cat(): expected a non-empty list of Tensors
Issue - State: closed - Opened by RayWang-iat 4 months ago
#35 - Armo-rm env set-up and data processing
Issue - State: open - Opened by MaxwellJryao 5 months ago - 1 comment
#34 - Add RRM augmentation
Pull Request - State: closed - Opened by TerenceLiu4444 5 months ago
#33 - Clarification on Reward Usage in DPO Training
Issue - State: open - Opened by vincezh2000 5 months ago - 1 comment
#32 - ArmoRM-Llama3-8B-v0.1's tokenizer is different from Meta-Llama-3-8B-Instruct's
Issue - State: closed - Opened by efsotr 5 months ago - 7 comments
#31 - Semi-Supervised Reward Modeling (SSRM)
Pull Request - State: closed - Opened by yifei-he 5 months ago
#30 - reproduce ArmoRM
Issue - State: closed - Opened by richhh520 5 months ago - 3 comments
#29 - preference dataset 404 not found
Issue - State: closed - Opened by wty500 6 months ago - 2 comments
#28 - Code to reproduce ArmoRM
Issue - State: closed - Opened by halfrot 6 months ago - 5 comments
#27 - Can I inquire about some training details about armo-rm?
Issue - State: closed - Opened by xiaotian917 6 months ago - 7 comments
#26 - Regarding the Gemma2 Reward Model Structure
Issue - State: open - Opened by Loong435 6 months ago - 2 comments
#25 - How to batch inference?
Issue - State: closed - Opened by AIR-hl 7 months ago
#24 - "Token pattern not found in the list" error
Issue - State: open - Opened by nshen7 7 months ago - 3 comments
#23 - How to finetune ARMO model with custom dataset?
Issue - State: closed - Opened by Helen-Cheung 7 months ago - 4 comments
#22 - Bradley-Terry model removes lm head while saving
Issue - State: open - Opened by Arnav0400 7 months ago - 1 comment
#21 - Training and evaluating for pair_pm model.
Issue - State: open - Opened by t-sifanwu 7 months ago - 5 comments
#20 - How do you implement SLic on pair_pm model?
Issue - State: open - Opened by t-sifanwu 8 months ago - 1 comment
#19 - preference_700K dataset's details?
Issue - State: closed - Opened by yechenzhi 8 months ago - 4 comments
#18 - environment set up issue
Issue - State: open - Opened by WayXG 8 months ago - 1 comment
#17 - tutorial to reproduce ArmoRM
Issue - State: closed - Opened by pluiez 8 months ago - 1 comment
#16 - question of chat templates
Issue - State: open - Opened by trueRosun 8 months ago - 6 comments
#15 - Code for Armo on Reward Bench
Issue - State: closed - Opened by philschmid 8 months ago - 4 comments
#14 - How to calculate the avg score of reward bench?
Issue - State: closed - Opened by eyuansu62 8 months ago - 2 comments
#13 - Low Safety Score for RM-Gemma-2B Model
Issue - State: closed - Opened by loss4Wang 9 months ago - 2 comments
#12 - can we say PM is better than BT?
Issue - State: closed - Opened by yechenzhi 9 months ago - 2 comments
#11 - quesion about the output
Issue - State: closed - Opened by yechenzhi 9 months ago - 1 comment
#10 - How to construct new pairs for adding to the dataset
Issue - State: closed - Opened by wlhgtc 9 months ago - 1 comment
#9 - Does pair-pm supports multi-turn conversation?
Issue - State: closed - Opened by heyzude 9 months ago - 2 comments
#8 - Cannot understant the code at README.md of pair-pm
Issue - State: closed - Opened by heyzude 9 months ago - 4 comments
#7 - Pairwise preference model dev
Pull Request - State: closed - Opened by WeiXiongUST 9 months ago
#6 - KeyError: 'input_ids_j' in training
Issue - State: closed - Opened by iseesaw 9 months ago - 2 comments
#5 - re-organize code
Pull Request - State: closed - Opened by WeiXiongUST 10 months ago
#4 - Update eval_bench_mark.py
Pull Request - State: closed - Opened by ZizhengYang 10 months ago - 2 comments
#3 - Update eval_bench_mark.py allow use bf16 or f32
Pull Request - State: closed - Opened by ZizhengYang 10 months ago
#2 - Cannot run the training script
Issue - State: closed - Opened by peter-peng-w 10 months ago - 1 comment
#1 - how to serve this model?
Issue - State: closed - Opened by jxgu1016 11 months ago - 1 comment