huggingface/open-r1 issues and pull requests

#92 - Improve reproduction of r1 reported score

Pull Request - State: open - Opened by hynky1999 10 days ago

#91 - Change conda for uv

Pull Request - State: open - Opened by andimarafioti 10 days ago

#89 - evaluate code for livecodebench, GPQA and Codeforces

Issue - State: open - Opened by liyunsheng13 10 days ago

#88 - As the inference on local device is unaffordable for most people.

Issue - State: open - Opened by PoTaTo-Mika 10 days ago - 2 comments

#87 - Solution for Potential Inflation of Reward Metrics for Unparseable Go…

Pull Request - State: open - Opened by agulati18 10 days ago

#86 - Issue: Potential Inflation of Reward Metrics for Unparseable Gold Solutions

Issue - State: open - Opened by agulati18 10 days ago

#85 - tokenizer.chat_template is not set and no template argument was passed

Issue - State: open - Opened by keeeeenw 10 days ago

#83 - Why the average power usage of GRPO training is at 50%?

Issue - State: open - Opened by saidineshpola 10 days ago

#82 - deepseek-hack-map:: v3&r1

Issue - State: open - Opened by ziwang-com 10 days ago

#81 - pip install -e '.[dev]' hangs indefinitely (hours so far) on lighteval

Issue - State: closed - Opened by axelmagn 10 days ago - 1 comment

#78 - feat: Added reward model according to paper.

Pull Request - State: open - Opened by ahmeterdempmk 10 days ago

#77 - Add recipe configs to optimize scripts (#73)

Pull Request - State: open - Opened by LoserCheems 11 days ago

#76 - Bump actions/checkout from 2 to 4

Pull Request - State: closed - Opened by dependabot[bot] 11 days ago
Labels: dependencies, github_actions

#75 - Bump actions/setup-python from 2 to 5

Pull Request - State: closed - Opened by dependabot[bot] 11 days ago
Labels: dependencies, github_actions

#74 - Create new.py

Pull Request - State: closed - Opened by Shouvik703 11 days ago - 1 comment

#73 - Add recipe configs to optimize SFT scripts

Issue - State: open - Opened by LoserCheems 11 days ago

#72 - Contributions in README.md

Pull Request - State: closed - Opened by qgallouedec 11 days ago

#71 - Add `--input-batch-size`, `--client-replicas` args and download Ray logs

Pull Request - State: closed - Opened by gabrielmbmb 11 days ago

#70 - Added dependabot integration for Python and GitHub Actions

Pull Request - State: closed - Opened by ygdrax 11 days ago - 2 comments

#69 - ValueError: please provide at least one prompt

Issue - State: open - Opened by fe1ixxu 11 days ago - 5 comments

#68 - FP8 training

Issue - State: open - Opened by showgood163 11 days ago

#67 - No GRPOTrainer Group-Based Advantage Normalisation (Paper Eq. 3)

Issue - State: closed - Opened by agulati18 11 days ago

#66 - Adding GRPOTrainer Group-Based Advantage Normalisation (Paper Eq. 3)

Pull Request - State: closed - Opened by agulati18 11 days ago - 2 comments

#65 - We Need a Better Training Pipeline: GRPO Trainer Struggles with Long Completion Lengths on H100x8

Issue - State: open - Opened by SeungyounShin 11 days ago - 2 comments

#64 - Could you provide the model trained by GRPO?

Issue - State: open - Opened by Ethereal-sakura 11 days ago

#62 - docs: fix grammar and phrasing issues (1, 2, 3)

Pull Request - State: closed - Opened by CharlesCNorton 11 days ago

#61 - Datasets for fine arts

Issue - State: open - Opened by erkinalp 11 days ago - 1 comment

#60 - Datasets for linguistics

Issue - State: open - Opened by erkinalp 11 days ago - 1 comment

#59 - What are expected training speed in GRPO

Issue - State: open - Opened by whitead 11 days ago - 3 comments

#56 - How to supervise non-math data?

Issue - State: open - Opened by Luodian 11 days ago - 4 comments

#55 - Reward verification and evaluation fixes

Pull Request - State: closed - Opened by hynky1999 12 days ago - 1 comment

#54 - docs: update README.md

Pull Request - State: closed - Opened by eltociear 12 days ago

#53 - Plan of attack uses R1 Zero for supervised fine-tuning in step 3.

Issue - State: open - Opened by keskival 12 days ago - 1 comment

#52 - Add Environment Test Script

Pull Request - State: open - Opened by sambhavnoobcoder 12 days ago - 4 comments

#51 - Make HuggingFaceH4/aime_2024 public

Issue - State: closed - Opened by mlabonne 12 days ago - 3 comments

#50 - Datasets for law

Issue - State: open - Opened by erkinalp 12 days ago - 3 comments

#48 - [Willing to Contribute] Integrate SGLang into open-r1

Issue - State: open - Opened by zhaochenyang20 12 days ago

#47 - Crazy VRAM usage with longer prompts

Issue - State: open - Opened by andyl98 12 days ago - 13 comments

#46 - how to train on MultiNode MultiGPU

Issue - State: open - Opened by yuepengs 12 days ago

#45 - Wondering why no format reward?

Issue - State: open - Opened by Luodian 12 days ago - 1 comment

#44 - Can NPU be supported?

Issue - State: closed - Opened by laozhuang727 12 days ago - 2 comments

#43 - vllm speed tweaks

Pull Request - State: closed - Opened by anton-l 12 days ago

#41 - Implement make evaluate command

Pull Request - State: closed - Opened by mariagrandury 12 days ago - 1 comment

#40 - Fix typos

Pull Request - State: open - Opened by mariagrandury 12 days ago - 1 comment

#35 - what about open-v3?

Issue - State: open - Opened by ehartford 12 days ago - 1 comment

#34 - Function calling at thinking time

Issue - State: open - Opened by Kreijstal 12 days ago - 4 comments

#33 - Add devcontainer configuration for VS Code

Pull Request - State: open - Opened by bhack 12 days ago

#32 - let's create a discussion board on github to avoid issues for feature requests or sprawl?

Issue - State: closed - Opened by russellballestrini 12 days ago - 2 comments

#31 - Datasets for Medicine

Issue - State: open - Opened by cyrilzakka 12 days ago - 4 comments

#30 - chore: update trl to grpo_vllm branch, move lighteval to uv

Pull Request - State: open - Opened by gerred 12 days ago - 4 comments

#29 - Add example of generating data with deepseek r1 and distilled models

Pull Request - State: closed - Opened by plaguss 12 days ago
