Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / EleutherAI/lm-evaluation-harness issues and pull requests

#2365 - Extracting vLLM metrics

Issue - State: open - Opened by vsmolyakov 1 day ago

#2364 - Add Unitxt Multimodality Support

Pull Request - State: open - Opened by elronbandel 1 day ago

#2363 - Unitxt Multi Modality Support

Pull Request - State: closed - Opened by elronbandel 1 day ago

#2361 - boolq trust remote code

Issue - State: open - Opened by IvanSedykh 4 days ago

#2360 - [multimodal] llava-1.5-7b-hf doesn't work on `mmmu_val`

Issue - State: open - Opened by BabyChouSr 5 days ago - 4 comments
Labels: bug

#2359 - fix `cost_estimate` script

Pull Request - State: open - Opened by baberabb 5 days ago

#2357 - Add metabench task to LM Evaluation Harness

Pull Request - State: open - Opened by kozzy97 5 days ago - 1 comment

#2355 - --tasks mmlu

Issue - State: closed - Opened by belle9217 5 days ago - 1 comment
Labels: asking questions

#2352 - Setting limit_mm_per_prompt for vllm_vlm fails argument parser

Issue - State: open - Opened by mgoin 6 days ago
Labels: bug

#2351 - squad v2: load metric with `evaluate`

Pull Request - State: closed - Opened by baberabb 6 days ago

#2350 - fix writeout script

Pull Request - State: closed - Opened by baberabb 6 days ago

#2349 - Support pipeline parallel with OpenVINO models

Pull Request - State: open - Opened by sstrehlk 6 days ago - 1 comment

#2346 - Unexpected space character

Issue - State: open - Opened by eldarkurtic 6 days ago - 2 comments

#2345 - tasks RACE only high not "middle"

Issue - State: open - Opened by Choi-jun9803 6 days ago

#2344 - Reproduce QWen 2.5-14B-Instruct and LLaMa-3.1-8B-Instruct Results

Issue - State: open - Opened by ruleGreen 6 days ago - 1 comment

#2343 - gpt2 evaluation

Issue - State: open - Opened by sorobedio 6 days ago

#2341 - Merge New Tasks

Pull Request - State: closed - Opened by ToluClassics 7 days ago

#2340 - Added metric aggregation for leaderboard tasks.

Pull Request - State: closed - Opened by Am1n3e 7 days ago - 4 comments

#2339 - Fixed dummy model

Pull Request - State: closed - Opened by Am1n3e 7 days ago - 1 comment

#2338 - Locally reproducible HF-Leaderboard evals

Issue - State: open - Opened by eldarkurtic 7 days ago - 2 comments
Labels: asking questions

#2337 - Robustness Task

Pull Request - State: closed - Opened by rimashahbazyan 7 days ago - 1 comment

#2336 - Add a note for missing dependencies

Pull Request - State: closed - Opened by eldarkurtic 7 days ago - 1 comment

#2335 - Dynamical prompt with extremely promising results #RIPrompt

Issue - State: open - Opened by anthonyrisinger 7 days ago - 1 comment

#2334 - mmlu-pro: add newlines to task descriptions (not leaderboard)

Pull Request - State: closed - Opened by baberabb 8 days ago

#2333 - add newlines to mmlu_pro task descriptions (not leaderboard)

Pull Request - State: closed - Opened by baberabb 8 days ago

#2332 - change glianorex to test split

Pull Request - State: closed - Opened by baberabb 8 days ago - 1 comment

#2331 - Confusion over the model outputs

Issue - State: open - Opened by tranlm 8 days ago

#2330 - Failed to add a new metric

Issue - State: open - Opened by Ofir408 8 days ago

#2329 - `glianorex_en` task does not work

Issue - State: closed - Opened by casper-hansen 8 days ago - 1 comment
Labels: bug

#2328 - Hashing error when setting random seed for vllm model

Issue - State: open - Opened by yizhongw 9 days ago - 1 comment
Labels: asking questions

#2327 - openai: better error messages; fix greedy matching

Pull Request - State: closed - Opened by baberabb 11 days ago - 1 comment

#2326 - Support for Using Multiple Choice Datasets with GPT-4o Model via OpenAI API

Issue - State: closed - Opened by Laplace888 11 days ago - 3 comments
Labels: asking questions

#2325 - Fix float limit override

Pull Request - State: open - Opened by cjluo-omniml 11 days ago - 3 comments

#2324 - Bug in the float limit handling

Issue - State: open - Opened by cjluo-omniml 11 days ago - 6 comments
Labels: feature request

#2323 - Error for AGIEval when using fewshot

Issue - State: open - Opened by BaohaoLiao 12 days ago - 1 comment
Labels: bug, validation

#2322 - Which version to use

Issue - State: open - Opened by sorobedio 12 days ago - 9 comments
Labels: validation

#2321 - Mathvista

Pull Request - State: open - Opened by baberabb 12 days ago

#2320 - change group to tags in task `eus_exams` task configs

Pull Request - State: closed - Opened by baberabb 12 days ago

#2319 - how to get lm_eval version 4.2

Issue - State: closed - Opened by sorobedio 13 days ago - 1 comment

#2318 - Evaluation of MMLU tasks using the OpenAI API

Issue - State: closed - Opened by Laplace888 13 days ago - 3 comments
Labels: asking questions

#2317 - Multiple generations (sequential) per question

Issue - State: open - Opened by IntrepidEnki 13 days ago - 1 comment
Labels: feature request, asking questions

#2316 - GSM8K Problem On Colab With Finetuned Phi3.5 mini model

Issue - State: closed - Opened by SongTonyLi 14 days ago - 3 comments
Labels: asking questions

#2315 - remove comma

Pull Request - State: closed - Opened by baberabb 14 days ago

#2314 - Update neuron backend

Pull Request - State: closed - Opened by dacorvo 14 days ago - 4 comments

#2313 - Comma breaks __repr__ for write-out

Issue - State: closed - Opened by giuliolovisotto 14 days ago - 1 comment
Labels: bug

#2312 - mmlu translated professionally by OpenAI

Pull Request - State: open - Opened by giuliolovisotto 14 days ago - 1 comment

#2311 - add batch_size to `get_sample_size`

Pull Request - State: closed - Opened by baberabb 14 days ago

#2309 - Scrolls branch

Pull Request - State: open - Opened by blitzionic 14 days ago - 2 comments

#2308 - Chat templates

Issue - State: closed - Opened by IsraelAbebe 15 days ago

#2307 - avoid timeout errors with high concurrency in api_model

Pull Request - State: open - Opened by dtrawins 15 days ago - 3 comments

#2306 - Running multiple processes on a shared outlines cache database

Issue - State: open - Opened by e-tornike 15 days ago - 2 comments

#2305 - New Task: `openai_mmmlu` professionaly translated by OpenAI as part of o1 release

Issue - State: open - Opened by giuliolovisotto 15 days ago - 1 comment
Labels: feature request

#2304 - Fix missing key in custom task loading.

Pull Request - State: open - Opened by giuliolovisotto 15 days ago

#2303 - Missing key in dictionary when loading tasks.

Issue - State: open - Opened by giuliolovisotto 15 days ago
Labels: bug

#2302 - Configuring Azure OPENAI

Issue - State: open - Opened by sudhanshu-myl 15 days ago - 3 comments
Labels: asking questions

#2301 - Fail to reproduce the perplexity of Llama-2 7B on wikitext

Issue - State: open - Opened by Yonghao-Tan 16 days ago - 10 comments

#2300 - add new truncation strategy

Pull Request - State: open - Opened by artemorloff 16 days ago - 3 comments

#2299 - fix some bugs of mmlu

Pull Request - State: closed - Opened by eyuansu62 17 days ago - 3 comments

#2298 - fix some bugs of mmlu (flan_cot_fewshot and flan_n_shot)

Pull Request - State: closed - Opened by eyuansu62 17 days ago

#2297 - Update README.md

Pull Request - State: closed - Opened by SYusupov 17 days ago - 3 comments

#2296 - Low GPU Utilization During Multi-GPU evaluation - Efficiency Optimization

Issue - State: open - Opened by yang3121099 17 days ago - 1 comment
Labels: asking questions

#2295 - the log is end,the gpu is not calculate,but is storing,the result is not getting,is it normal?

Issue - State: closed - Opened by belle9217 18 days ago - 1 comment
Labels: asking questions

#2294 - Worse evaluation performance with PEFT adaptors

Issue - State: open - Opened by YananLi18 18 days ago - 1 comment

#2293 - RuntimeError: CUDA error: device-side assert triggered

Issue - State: open - Opened by milliemaoo 19 days ago - 2 comments

#2292 - Using multi-GPU with accelerate is not working

Issue - State: closed - Opened by commmet-ahn 20 days ago - 2 comments

#2290 - Couldn't parse .yaml file for configuration

Issue - State: open - Opened by ArchitJain1201 20 days ago

#2289 - A little typing issue

Issue - State: open - Opened by yuti01 20 days ago

#2288 - Treat tags in python tasks the same as yaml tasks

Pull Request - State: closed - Opened by giuliolovisotto 21 days ago - 1 comment

#2287 - Issue with openai completions API - related to logprobs

Issue - State: closed - Opened by dmakhervaks 21 days ago - 3 comments
Labels: bug

#2286 - What's going on with swde or squadv2 tasks ?

Issue - State: closed - Opened by ahatamiz 22 days ago - 2 comments

#2285 - Can we connect to Vertex AI model

Issue - State: open - Opened by patilpriyadarshini 22 days ago

#2284 - External API - same results different models

Issue - State: closed - Opened by deema-A 24 days ago - 2 comments

#2283 - Added TurkishMMLU to LM Evaluation Harness

Pull Request - State: closed - Opened by ArdaYueksel 25 days ago - 2 comments

#2282 - add mmlu readme

Pull Request - State: closed - Opened by baberabb 25 days ago

#2281 - Multi-node MMLU support ?

Issue - State: closed - Opened by ahatamiz 25 days ago - 2 comments
Labels: asking questions

#2280 - Bump version to v0.4.4 ; Fixes to TMMLUplus

Pull Request - State: closed - Opened by haileyschoelkopf 26 days ago

#2279 - zero accuracy on `mmlu_generative`

Issue - State: open - Opened by Luodian 26 days ago - 5 comments
Labels: bug

#2278 - May be parse LAST numbers in GSM8K "flexible-extract" filter?

Issue - State: closed - Opened by Pupy101 27 days ago - 2 comments
Labels: asking questions

#2277 - Free space allocated for LM in the memory after evaluation finishes

Pull Request - State: closed - Opened by ahmedamrelhefnawy 28 days ago - 3 comments

#2275 - Teleia group task

Pull Request - State: closed - Opened by gonz-mart 28 days ago - 1 comment