tatsu-lab/alpaca_eval issues and pull requests

#441 - Allow specifying model revision

Issue - State: open - Opened by tomtseng 10 days ago

#440 - Can you realize deepseek as a judge?

Issue - State: open - Opened by NuoJohnChen 10 days ago

#439 - Don't set root logger to INFO to avoid turning on all packages' logs

Pull Request - State: open - Opened by srowen 11 days ago

#438 - Add Llama-3.1-8B-Base-AFT to AlpacaEval

Pull Request - State: open - Opened by Linzwcs 13 days ago

#437 - Add Amara-o1-7B-Qwen Amara-o2-7B-Qwen to AlpacaEval

Pull Request - State: open - Opened by Minami-su 22 days ago - 3 comments

#436 - openai.BadRequestError

Issue - State: open - Opened by EnlightenedAI 25 days ago

#435 - Question about best_of when use_beam_search=True

Issue - State: open - Opened by ZeguanXiao about 1 month ago

#434 - TypeError: 'NoneType' object is not iterable

Issue - State: open - Opened by junming-yang about 1 month ago

#433 - alpaca eval config file

Issue - State: open - Opened by zwRuan about 1 month ago

#432 - Add test-7B-o1-Qwen to AlpacaEval

Pull Request - State: closed - Opened by Minami-su about 2 months ago

#431 - can't match the win rate posted on the leaderboard

Issue - State: open - Opened by lindseyfeng about 2 months ago

#430 - The response was filtered due to the prompt triggering Azure OpenAI's content management policy

Issue - State: open - Opened by stamina1121 about 2 months ago - 2 comments

#429 - [BUG] tool_calls

Pull Request - State: closed - Opened by YannDubs about 2 months ago

#428 - Add TOA to AlpacaEval

Pull Request - State: closed - Opened by oceanypt about 2 months ago - 1 comment

#427 - How is the win rate (not LC) calculated?

Issue - State: closed - Opened by chanchimin 2 months ago - 1 comment

#426 - Add FuseChat-3.0 models to AlpacaEval

Pull Request - State: closed - Opened by yangzy39 2 months ago - 2 comments

#425 - How to overwrite the leaderboard and reannotate the missing question?

Issue - State: closed - Opened by chanchimin 2 months ago - 2 comments

#424 - Add FuseChat-Llama-3.1-8B-Instruct, FuseChat-Gemma-2-9B-Instruct and …

Pull Request - State: closed - Opened by yangzy39 2 months ago

#423 - why judge has a default temperature of 1？

Issue - State: closed - Opened by schrieffer-z 2 months ago - 1 comment

#422 - Can we have a self-hosted evaluator for avoiding loading a large model every evaluating?

Issue - State: closed - Opened by glorgao 3 months ago - 1 comment

#421 - ValueError: BuilderConfig BuilderConfig(name='alpaca_eval_gpt4_baseline', version=1.0.0, data_dir=None, data_files=None, description='Official AlpacaEval 2.0 evaluation set.') doesn't have a 'trust_remote_code' key.

Issue - State: closed - Opened by ojasraundale 3 months ago - 1 comment

#420 - TypeError: LogisticRegressionCV.fit() got an unexpected keyword argument 'groups'

Issue - State: closed - Opened by ambarion 3 months ago - 2 comments

#419 - Possibility of Re-running Only Failed Queries After Rate Limit Reached

Issue - State: closed - Opened by hank0316 3 months ago - 2 comments

#418 - ValueError: Trailing data

Issue - State: closed - Opened by multimodalpragmatic 4 months ago - 2 comments

#417 - annotation by `alpaca_eval_llama3_70b_fn` causes `IndexError`

Issue - State: closed - Opened by reihig-ut 4 months ago - 3 comments

#416 - Add Llama-3-Instruct-8B-RainbowPO to AlpacaEval

Pull Request - State: closed - Opened by hanyang1999 4 months ago - 2 comments

#415 - AttributeError: module 'alpaca_eval.metrics' has no attribute 'get_length_controlled_winrate'

Issue - State: closed - Opened by sunjie279 4 months ago - 1 comment

#414 - Add NullModel to AlpacaEval

Pull Request - State: closed - Opened by xszheng2020 4 months ago - 7 comments

#413 - Add GPO-Llama-3-8B-Instruct-GPM-2B and SPPO-Llama-3-8B-Instruct-GPM-2…

Pull Request - State: closed - Opened by xukp20 4 months ago - 3 comments

#412 - add Self-taught-llama3.1-70B-dpo as a evaluator

Pull Request - State: closed - Opened by tianlu-wang 5 months ago - 1 comment

#411 - Add SelfMoA_gemma-2-9b-it-SimPO, SelfMoA_gemma-2-9b-it-WPO-HB to AlpacaEval

Pull Request - State: closed - Opened by wenzhe-li 5 months ago

#410 - Will the annotator alpaca_eval_gpt4_turbo_fn also change?

Issue - State: closed - Opened by hsqmlzno1 5 months ago - 1 comment

#409 - Updated HF Link in model_configs for Llama-3-8B-Instruct-SkillMix

Pull Request - State: closed - Opened by parksimon0808 5 months ago

#408 - Question about Leaderboard

Issue - State: closed - Opened by MaoXinn 5 months ago - 1 comment

#407 - I can't duplicate your results locally

Issue - State: closed - Opened by suanflower 5 months ago - 2 comments

#406 - 'message': "Invalid function definition for 'make_partial_leaderboard': unexpected parameter 'strict' parameter supplied

Issue - State: closed - Opened by suanflower 5 months ago - 1 comment

#405 - Add Llama-3-8B-Instruct-SkillMix to AlpacaEval

Pull Request - State: closed - Opened by parksimon0808 5 months ago - 1 comment

#404 - add example for Llama3 vllm server

Pull Request - State: closed - Opened by cameron-chen 5 months ago - 4 comments

#403 - Add REBEL-Llama-3-8B-Instruct-Armo to AlpacaEval

Pull Request - State: closed - Opened by ZhaolinGao 6 months ago - 1 comment

#402 - [ENH] add metadata to completion: date, version,...

Pull Request - State: closed - Opened by YannDubs 6 months ago

#401 - Add evaluator weighted_alpaca_eval_gpt-4o-mini-2024-07-18

Pull Request - State: closed - Opened by tongyx361 6 months ago - 2 comments

#400 - How to compute winrate using a file of 'annotations.json'

Issue - State: closed - Opened by JinYujie99 6 months ago - 1 comment

#399 - Add blendaxai-gm-l6-vo31 to AlpacaEval

Pull Request - State: closed - Opened by ym-blendax-ai 6 months ago - 1 comment

#398 - Add Shopee-SlimMoA-v1 to AlpacaEval

Pull Request - State: closed - Opened by LLM-Alignment-sh 6 months ago - 1 comment

#397 - Add blendaxai-gm-l6-vo14 to AlpacaEval

Pull Request - State: closed - Opened by ym-blendax-ai 6 months ago

#396 - After I finished evaluating Meta-Llama3-8B-Instruct

Issue - State: closed - Opened by iGangao 6 months ago - 2 comments

#395 - Added Llama3-PBM-Nova-70B model

Pull Request - State: closed - Opened by PKU-Baichuan 6 months ago

#394 - [ENH] add strict decoding OAI

Pull Request - State: closed - Opened by YannDubs 6 months ago

#393 - [ENH] add mistral v0.3, Qwen2 70b, gtp4 mini

Pull Request - State: closed - Opened by YannDubs 6 months ago

#392 - [ENH] enable base_dir to be a list

Pull Request - State: closed - Opened by YannDubs 6 months ago

#391 - [ENH] OpenAI use tools instead of functions

Pull Request - State: closed - Opened by YannDubs 6 months ago

#390 - all the following results are about AlpacaEval 1.0 and have not been updated since?

Issue - State: closed - Opened by zhimin-z 6 months ago - 1 comment

#389 - Add blendaxai-gm-l3-v35 to AlpacaEval

Pull Request - State: closed - Opened by ym-blendax-ai 6 months ago - 2 comments

#388 - Added Blendax.AI-gm-l3-v1 model results

Pull Request - State: closed - Opened by ym-blendax-ai 6 months ago - 1 comment

#387 - Change the name of the Infinity-Instruct-7M-0729-Models to Infinity-Instruct-7M-Gen-Models

Pull Request - State: closed - Opened by cszhengyh 6 months ago

#386 - Change the name of the Infinity-Instruct-7M-0729-Models to Infinity-Instruct-7M-Gen-Models

Pull Request - State: closed - Opened by cszhengyh 6 months ago

#385 - Add link to gemma-2-9b-it-WPO-HB

Pull Request - State: closed - Opened by wzhouad 6 months ago - 3 comments

#384 - Add gemma-2-9b-it-WPO-HB to AlpacaEval

Pull Request - State: closed - Opened by wzhouad 6 months ago

#383 - Add Infinity-Instruct-7M-0729-Llama3_1-70B, Infinity-Instruct-7M-0729-Llama3_1-8B, Infinity-Instruct-7M-0729-mistral-7B to AlpacaEval

Pull Request - State: closed - Opened by cszhengyh 6 months ago - 1 comment

#382 - Non-reproducible results

Issue - State: closed - Opened by felipemaiapolo 7 months ago - 4 comments

#381 - [ENH] add example for LLama 3 vllm

Pull Request - State: closed - Opened by YannDubs 7 months ago

#380 - Using local models as evaluators

Issue - State: closed - Opened by luciolcv 7 months ago - 2 comments

#379 - error with default json - ValueError: cannot reindex on an axis with duplicate labels

Issue - State: closed - Opened by ishapuri 7 months ago - 2 comments

#378 - [ENH] add llama 3.1

Pull Request - State: closed - Opened by YannDubs 7 months ago

#377 - Add Llama-3-Instruct-8B-WPO-HB-v2 to AlpacaEval

Pull Request - State: closed - Opened by wzhouad 7 months ago - 1 comment

#376 - Questions about the 'use_beam_search' configuration.

Issue - State: closed - Opened by ChangyuChen347 7 months ago - 1 comment

#375 - Extra '\n' in the prompt of Llama-3-Instruct-8B-SimPO

Issue - State: closed - Opened by gohsyi 7 months ago - 3 comments

#374 - Why do we set do_sample to true during evaluation?

Issue - State: closed - Opened by gohsyi 7 months ago - 1 comment

#373 - [BUG] backward compatibility vllm do_sample -> use_beam_search

Pull Request - State: closed - Opened by YannDubs 7 months ago

#372 - [ENH] adding simplified glm

Pull Request - State: closed - Opened by YannDubs 7 months ago

#371 - [ENH] add the code to compute instruction_following

Pull Request - State: closed - Opened by YannDubs 7 months ago

#370 - update model links

Pull Request - State: closed - Opened by xiamengzhou 7 months ago

#369 - [ENH] add CI test for unwanted files

Pull Request - State: closed - Opened by YannDubs 7 months ago

#368 - Add gemma-2-9b-it-SimPO and gemma-2-9b-it-DPO to AlpacaEval

Pull Request - State: closed - Opened by xiamengzhou 7 months ago - 4 comments

#367 - Add Higgs Llama3-70B V2 Results

Pull Request - State: closed - Opened by sxjscience 7 months ago - 2 comments

#366 - Added Ghost 8B Beta (d0x5) model

Pull Request - State: closed - Opened by lh0x00 7 months ago - 4 comments

#365 - assert output_path is not None and name in leaderboard

Issue - State: closed - Opened by AIR-hl 7 months ago - 1 comment

#364 - Add Infinity-Instruct-3M-0625-Models to AlpacaEval

Pull Request - State: closed - Opened by cszhengyh 7 months ago - 1 comment

#363 - Argument `do_sample` is removed from `vllm_local_completions`

Issue - State: closed - Opened by xiamengzhou 7 months ago - 1 comment

#362 - .

Issue - State: closed - Opened by BeastyZ 8 months ago - 2 comments

#361 - Negative Correlation while using `alpaca_eval analyze_evaluators`

Issue - State: closed - Opened by EganGu 8 months ago - 1 comment

#360 - Encountering Error about cannot reindex on an axis with duplicate labels

Issue - State: closed - Opened by URRealHero 8 months ago - 2 comments

#359 - Add SPPO-Gemma-2-9B-It-PairRM to AlpacaEval

Pull Request - State: closed - Opened by angelahzyuan 8 months ago

#358 - Add Infinity-Instruct-3M-0613-Llama3-70B to AlpacaEval

Pull Request - State: closed - Opened by cszhengyh 8 months ago

#357 - Add Infinity-Instruct-3M-0613-Llama3-70B to AlpacaEval

Pull Request - State: closed - Opened by cszhengyh 8 months ago

#356 - Add Infinity-Instruct-3M-0613-Llama3-70B to AlpacaEval

Pull Request - State: closed - Opened by cszhengyh 8 months ago - 3 comments

#355 - AssertionError

Issue - State: closed - Opened by BBaekdabang 8 months ago

#354 - Add SPPO-Llama-3-Instruct-8B-PairRM to AlpacaEval

Pull Request - State: closed - Opened by Edward-Sun 8 months ago - 1 comment

#353 - ValueError: Trailing data

Issue - State: closed - Opened by BBaekdabang 8 months ago - 4 comments

#352 - Discrepancy between alpaca leaderboard and Chatbot arena ELO

Issue - State: closed - Opened by Varun221 8 months ago - 1 comment

#351 - Add Infinity-Instruct-3M-0613-Mistral-7B to AlpacaEval

Pull Request - State: closed - Opened by cszhengyh 8 months ago - 1 comment

#350 - cannot reindex

Issue - State: closed - Opened by Chirobocea 8 months ago - 3 comments
