tatsu-lab/alpaca_eval issues and pull requests

#349 - [BUG] trust repo alpaca_eval

Pull Request - State: closed - Opened by YannDubs 8 months ago

#348 - Add claude-3-5-sonnet-20240620 to AlpacaEval

Pull Request - State: closed - Opened by MarjovanLier 8 months ago - 1 comment

#347 - Add OpenPipe Mixture of Agents model to Alpaca Eval

Pull Request - State: closed - Opened by saum7800 8 months ago

#346 - Details on Training GLM for Length-Controlled Winrate

Issue - State: closed - Opened by yix8 8 months ago - 3 comments

#345 - Add Nanbeige2-16B-Chat to AlpacaEval

Pull Request - State: closed - Opened by yuani114 8 months ago - 1 comment

#344 - Add Storm-7B, Storm-7B (best-of-64) to AlpacaEval

Pull Request - State: closed - Opened by yifan123 8 months ago - 3 comments

#343 - Evaluating a self hosted LLM through an API.

Issue - State: closed - Opened by CHRISTINEMUTHEE 8 months ago - 1 comment

#342 - Add Together-MoA, Together-MoA-Lite to AlpacaEval

Pull Request - State: closed - Opened by IsThatYou 8 months ago - 1 comment

#341 - ERROR:root:Error while parsing completion:

Issue - State: closed - Opened by AGTSAAA 9 months ago - 1 comment

#340 - How to solve the problem of null appearing in the evaluation results? Thank you very much

Issue - State: closed - Opened by AGTSAAA 9 months ago - 3 comments

#339 - tensor_parallel_size can not work

Issue - State: closed - Opened by AGTSAAA 9 months ago - 1 comment

#338 - [BUG] fix bs in VLLM and add chatml

Pull Request - State: closed - Opened by YannDubs 9 months ago

#337 - Why is max_num_seqs allowed here?

Issue - State: closed - Opened by RAY2L 9 months ago - 1 comment

#336 - Preference doesn't match log_probs in `annotations.json`

Issue - State: closed - Opened by YJWon99 9 months ago - 1 comment

#335 - confused about openai API

Issue - State: closed - Opened by junkangwu 9 months ago - 1 comment

#334 - Add merlinite-7B-AOT to AlpacaEval

Pull Request - State: closed - Opened by imelnyk 9 months ago - 1 comment

#333 - The code for computing instruction difficulty

Issue - State: closed - Opened by calvinh99 9 months ago - 1 comment

#332 - fix model link

Pull Request - State: closed - Opened by chujiezheng 9 months ago

#331 - Add ExPO + `Llama-3-Instruct-8B-SimPO` results

Pull Request - State: closed - Opened by chujiezheng 9 months ago - 1 comment

#330 - [ENH&BUG] improve VLLM

Pull Request - State: closed - Opened by YannDubs 9 months ago

#329 - Trouble with custom model hosted on OpenAI compatible endpoint

Issue - State: closed - Opened by tastycode 9 months ago - 1 comment

#328 - Unexpected low judge preference for some prompts

Issue - State: closed - Opened by geoalgo 9 months ago - 1 comment

#327 - Question on assumption of `model_identity` as a factor for preference on generated outputs.

Issue - State: closed - Opened by fgenie 9 months ago - 3 comments

#326 - Add REBEL-Llama-3-8B-Instruct to AlpacaEval

Pull Request - State: closed - Opened by ZhaolinGao 9 months ago - 1 comment

#325 - Huge performance gap when using annotator weighted_alpaca_eval_gpt4_turbo and alpaca_eval_gpt4_turbo_fn

Issue - State: closed - Opened by ZeroYuHuang 9 months ago - 4 comments

#324 - Add Aligner 2B+GPT-4 Turbo (04/09) Results

Pull Request - State: closed - Opened by AlignInc 9 months ago - 1 comment

#323 - Add Aligner 2B+GPT-4 Turbo (04/09) to AlpacaEval

Pull Request - State: closed - Opened by AlignInc 9 months ago

#322 - Add Phi 3 models

Issue - State: closed - Opened by EwoutH 9 months ago - 3 comments

#321 - [ENH] Use multi threading instead of processing

Pull Request - State: closed - Opened by YannDubs 9 months ago

#320 - Add Llama-3-Instruct-8B-SimPO to AlpacaEval

Pull Request - State: closed - Opened by xiamengzhou 9 months ago - 1 comment

#319 - [ENH] vicuna 1.5

Pull Request - State: closed - Opened by YannDubs 9 months ago

#318 - [CLEAN] move evaluators lb llama3

Pull Request - State: closed - Opened by YannDubs 9 months ago

#317 - [ENH] add LC SEM

Pull Request - State: closed - Opened by YannDubs 9 months ago

#316 - Alpaca Evaluation Instruction Difficulty used also for Custom Evaluation Dataset

Issue - State: closed - Opened by fanconic 9 months ago - 3 comments

#315 - Update README.md

Pull Request - State: closed - Opened by zhuang-li 9 months ago - 1 comment

#314 - llama3 evaluator

Pull Request - State: closed - Opened by zhuang-li 9 months ago - 2 comments

#313 - possibility of adding llama3-70b as the evaluator?

Issue - State: closed - Opened by zhuang-li 9 months ago - 3 comments

#312 - How to use AE1 to evaluate model

Issue - State: closed - Opened by matenglearn 9 months ago - 2 comments

#311 - [ADD] GPT4-o

Pull Request - State: closed - Opened by YannDubs 9 months ago

#310 - Overly High Win Rate for Alpaca v2 on mistral 7b orpo

Issue - State: closed - Opened by qingquansong 9 months ago - 12 comments

#309 - [verified] Yi-large

Pull Request - State: closed - Opened by YannDubs 9 months ago

#308 - The n_total of n_total result is not 805

Issue - State: closed - Opened by matenglearn 10 months ago - 6 comments

#307 - "Add Mistral-7B+RAHF-DUAL+LoRA to AlpacaEval"

Pull Request - State: closed - Opened by LiuAmber 10 months ago - 1 comment

#306 - Add <Mistral-7B+RAHF-DUAL+LoRA> to AlpacaEval

Pull Request - State: closed - Opened by LiuAmber 10 months ago

#305 - How to change cache path when evaluating multi-models

Issue - State: closed - Opened by bittersweet1999 10 months ago - 1 comment

#304 - Add Yi-Large Preview to AlpacaEval

Pull Request - State: closed - Opened by HyperdriveHustle 10 months ago - 2 comments

#303 - How to get the LC Win Rate in AlpacaEval 1.0 version?

Issue - State: closed - Opened by RZFan525 10 months ago - 4 comments

#302 - Fix typo in README.md

Pull Request - State: closed - Opened by tongyx361 10 months ago

#301 - What to do if the log prob is not returned?

Issue - State: closed - Opened by e0397123 10 months ago - 7 comments

#300 - How to solve the problem of null appearing in the evaluation results

Issue - State: closed - Opened by LiuAmber 10 months ago - 6 comments

#299 - Add ExPO results to AlpacaEval

Pull Request - State: closed - Opened by chujiezheng 10 months ago

#298 - Add SPPO-Mistral7B-PairRM to AlpacaEval

Pull Request - State: closed - Opened by Edward-Sun 10 months ago - 1 comment

#297 - Use verified by default

Pull Request - State: closed - Opened by YannDubs 10 months ago

#296 - Missing result file in notebook

Issue - State: closed - Opened by geoalgo 10 months ago - 1 comment

#295 - Missing item in results/llama-2-70b-chat-hf

Issue - State: closed - Opened by chchenhui 10 months ago - 1 comment

#294 - Add Storm-7B to AlpacaEval

Pull Request - State: closed - Opened by yifan123 10 months ago - 3 comments

#293 - Enable analyzing evaluators/annotators on data without multiple generator models

Pull Request - State: closed - Opened by rdnfn 10 months ago - 1 comment

#292 - [ENH] verifying all the qwens

Pull Request - State: closed - Opened by YannDubs 10 months ago

#291 - add Qwen1.5-110B-Chat self-report results

Pull Request - State: closed - Opened by Lukeming-tsinghua 10 months ago - 1 comment

#290 - Unable to reproduce results

Issue - State: closed - Opened by felipemaiapolo 10 months ago - 1 comment

#289 - Add link for FsfairX-Zephyr-Chat-v0.1

Pull Request - State: closed - Opened by hendrydong 10 months ago

#288 - Add Ghost 7B Alpha to AlpacaEval

Pull Request - State: closed - Opened by lh0x00 10 months ago

#287 - Llama-3-Instruct not using official prompt template?

Issue - State: closed - Opened by ZHZisZZ 10 months ago - 1 comment

#286 - Add the evaluation result for our latest model

Pull Request - State: closed - Opened by hendrydong 10 months ago - 2 comments

#285 - [ENH] llama3

Pull Request - State: closed - Opened by YannDubs 10 months ago

#284 - How instruction_difficulty feature is obtained

Issue - State: closed - Opened by stepyndriyy 10 months ago - 1 comment

#283 - [BUG] revert to GPT4 preview 1106

Pull Request - State: closed - Opened by YannDubs 10 months ago

#282 - Confusion in Model Evaluation Results Due to GPT Updates

Issue - State: closed - Opened by yifan123 10 months ago - 2 comments

#281 - Add support for analyzing evaluators with custom cross-annotations

Pull Request - State: closed - Opened by rdnfn 10 months ago - 1 comment

#280 - Update README.md

Pull Request - State: closed - Opened by Dominic789654 10 months ago

#279 - Add Nanbeige-Plus-Chat-v0.1 to AlpacaEval

Pull Request - State: closed - Opened by yuani114 10 months ago - 2 comments

#278 - [BUG] backward compatibility with AF

Pull Request - State: closed - Opened by YannDubs 10 months ago - 1 comment

#277 - Fix KeyError at line 17; annotations['preference'] -> annotation['preferences']

Pull Request - State: closed - Opened by wjdghks950 10 months ago - 4 comments

#276 - openai_configs.yaml when using Azure only

Issue - State: closed - Opened by Yuancheng-Xu 10 months ago - 7 comments

#275 - [ENH] adding drbx and gpt4 turbo

Pull Request - State: closed - Opened by YannDubs 10 months ago

#274 - Add Nanbeige2-8B-Chat to AlpacaEval

Pull Request - State: closed - Opened by yuani114 10 months ago - 1 comment

#273 - Question about the GPT-4 API

Issue - State: closed - Opened by HypherX 11 months ago - 11 comments

#272 - With unstable GPT-4 API, I encountered a tricky problem

Issue - State: closed - Opened by njupopsicle 11 months ago - 2 comments

#271 - With unstable GPT-4 API, I encountered a tricky problem

Issue - State: closed - Opened by njupopsicle 11 months ago - 1 comment

#270 - Question on Using Character-Level Length

Issue - State: closed - Opened by Leymore 11 months ago - 1 comment

#269 - Logistic regression for length-controlled winrate

Issue - State: closed - Opened by normster 11 months ago - 2 comments

#268 - Updating link to a super fast demo!

Pull Request - State: closed - Opened by kyleliang919 11 months ago - 1 comment

#267 - Add Conifer-7B-DPO to AlpacaEval

Pull Request - State: closed - Opened by liulixin29 11 months ago - 1 comment

#266 - "Add Mistral-7B-LoRA-RAHF-DUAL to AlpacaEval"

Pull Request - State: closed - Opened by LiuAmber 11 months ago - 2 comments

#265 - Add <Mistral-7B-LoRA-RAHF-DUAL> to AlpacaEval

Pull Request - State: closed - Opened by LiuAmber 11 months ago - 1 comment

#264 - Add TempNet-LLaMA2-Chat to AlpacaEval

Pull Request - State: closed - Opened by xumao-nju 11 months ago - 2 comments

#263 - It is possible to use existing evaluation files to complete the evaluation?

Issue - State: closed - Opened by bittersweet1999 11 months ago - 12 comments

#262 - Add Ein-70B-v0.1 to AlpacaEval

Pull Request - State: closed - Opened by bin-bi 11 months ago - 1 comment

#261 - Supplement for Aligner

Pull Request - State: closed - Opened by AlignInc 11 months ago

#260 - Latest LC-AlpacaEval update broken?

Issue - State: closed - Opened by rraju1 11 months ago - 4 comments

#259 - Add Aligner-2B+Qwen1.5-72B-Chat & Aligner-2B+Claude3 Opus to AlpacaEval

Pull Request - State: closed - Opened by AlignInc 11 months ago - 5 comments

#258 - Yann/length correction

Pull Request - State: open - Opened by YannDubs 11 months ago

#257 - Add Mistral-ORPO-Beta to AlpacaEval

Pull Request - State: closed - Opened by jiwooya1000 11 months ago - 1 comment

#256 - Add Samba-CoE-v0.2-best-of-16 to AlpacaEval

Pull Request - State: closed - Opened by kyleliang919 11 months ago - 1 comment

#255 - Reproducing numbers for evaluator human-agreement leaderboard.

Issue - State: closed - Opened by Varun221 12 months ago - 1 comment

#254 - Remove Deprecated model.to_bettertransformer() Call for Compatibility with Latest Transformers and Torch

Issue - State: closed - Opened by zxia545 12 months ago - 1 comment

#253 - Add Samba-CoE-v0.2 to AlpacaEval

Pull Request - State: closed - Opened by kyleliang919 12 months ago - 1 comment

#252 - A bug in `weighted_alpaca_eval_gpt4_turbo`

Issue - State: closed - Opened by RZFan525 12 months ago - 3 comments

#251 - [ENH] add mistral large

Pull Request - State: closed - Opened by YannDubs 12 months ago
