Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / EleutherAI/lm-evaluation-harness issues and pull requests
#2365 - Extracting vLLM metrics
Issue -
State: open - Opened by vsmolyakov 1 day ago
#2364 - Add Unitxt Multimodality Support
Pull Request -
State: open - Opened by elronbandel 1 day ago
#2363 - Unitxt Multi Modality Support
Pull Request -
State: closed - Opened by elronbandel 1 day ago
#2362 - Which filter value should be used among the accuracy test results?
Issue -
State: open - Opened by KKwanhee 4 days ago
#2361 - boolq trust remote code
Issue -
State: open - Opened by IvanSedykh 4 days ago
#2360 - [multimodal] llava-1.5-7b-hf doesn't work on `mmmu_val`
Issue -
State: open - Opened by BabyChouSr 5 days ago
- 4 comments
Labels: bug
#2359 - fix `cost_estimate` script
Pull Request -
State: open - Opened by baberabb 5 days ago
#2358 - Improve `docs/model_guide.md` with skeleton template code + description of utils like `Collator` and `Reorderer`
Issue -
State: open - Opened by haileyschoelkopf 5 days ago
Labels: documentation, feature request
#2357 - Add metabench task to LM Evaluation Harness
Pull Request -
State: open - Opened by kozzy97 5 days ago
- 1 comment
#2356 - Add a test for `scripts/write_out.py` and other `scripts/` utils
Issue -
State: open - Opened by haileyschoelkopf 5 days ago
#2355 - --tasks mmlu
Issue -
State: closed - Opened by belle9217 5 days ago
- 1 comment
Labels: asking questions
#2354 - Evaluation of MMLU tasks using a fined tuned Gemma 2 model
Issue -
State: open - Opened by chamath-eka 5 days ago
#2353 - HF: switch conditional checks to `self.backend` from `AUTO_MODEL_CLASS`
Pull Request -
State: open - Opened by baberabb 6 days ago
#2352 - Setting limit_mm_per_prompt for vllm_vlm fails argument parser
Issue -
State: open - Opened by mgoin 6 days ago
Labels: bug
#2351 - squad v2: load metric with `evaluate`
Pull Request -
State: closed - Opened by baberabb 6 days ago
#2350 - fix writeout script
Pull Request -
State: closed - Opened by baberabb 6 days ago
#2349 - Support pipeline parallel with OpenVINO models
Pull Request -
State: open - Opened by sstrehlk 6 days ago
- 1 comment
#2348 - squadv2 task occurred "AttributeError: module 'datasets' has no attribute 'load_metric'"
Issue -
State: closed - Opened by chengpong1127 6 days ago
Labels: bug
#2347 - The base model and chat model have no difference when using generate_until, loglikelihood, loglikelihood_rolling,right?
Issue -
State: open - Opened by belle9217 6 days ago
- 1 comment
Labels: asking questions
#2346 - Unexpected space character
Issue -
State: open - Opened by eldarkurtic 6 days ago
- 2 comments
#2345 - tasks RACE only high not "middle"
Issue -
State: open - Opened by Choi-jun9803 6 days ago
#2344 - Reproduce QWen 2.5-14B-Instruct and LLaMa-3.1-8B-Instruct Results
Issue -
State: open - Opened by ruleGreen 6 days ago
- 1 comment
#2343 - gpt2 evaluation
Issue -
State: open - Opened by sorobedio 6 days ago
#2342 - AttributeError: 'dict' object has no attribute 'has_test_docs'
Issue -
State: closed - Opened by Sshubam 7 days ago
#2341 - Merge New Tasks
Pull Request -
State: closed - Opened by ToluClassics 7 days ago
#2340 - Added metric aggregation for leaderboard tasks.
Pull Request -
State: closed - Opened by Am1n3e 7 days ago
- 4 comments
#2339 - Fixed dummy model
Pull Request -
State: closed - Opened by Am1n3e 7 days ago
- 1 comment
#2338 - Locally reproducible HF-Leaderboard evals
Issue -
State: open - Opened by eldarkurtic 7 days ago
- 2 comments
Labels: asking questions
#2337 - Robustness Task
Pull Request -
State: closed - Opened by rimashahbazyan 7 days ago
- 1 comment
#2336 - Add a note for missing dependencies
Pull Request -
State: closed - Opened by eldarkurtic 7 days ago
- 1 comment
#2335 - Dynamical prompt with extremely promising results #RIPrompt
Issue -
State: open - Opened by anthonyrisinger 7 days ago
- 1 comment
#2334 - mmlu-pro: add newlines to task descriptions (not leaderboard)
Pull Request -
State: closed - Opened by baberabb 8 days ago
#2333 - add newlines to mmlu_pro task descriptions (not leaderboard)
Pull Request -
State: closed - Opened by baberabb 8 days ago
#2332 - change glianorex to test split
Pull Request -
State: closed - Opened by baberabb 8 days ago
- 1 comment
#2331 - Confusion over the model outputs
Issue -
State: open - Opened by tranlm 8 days ago
#2330 - Failed to add a new metric
Issue -
State: open - Opened by Ofir408 8 days ago
#2329 - `glianorex_en` task does not work
Issue -
State: closed - Opened by casper-hansen 8 days ago
- 1 comment
Labels: bug
#2328 - Hashing error when setting random seed for vllm model
Issue -
State: open - Opened by yizhongw 9 days ago
- 1 comment
Labels: asking questions
#2327 - openai: better error messages; fix greedy matching
Pull Request -
State: closed - Opened by baberabb 11 days ago
- 1 comment
#2326 - Support for Using Multiple Choice Datasets with GPT-4o Model via OpenAI API
Issue -
State: closed - Opened by Laplace888 11 days ago
- 3 comments
Labels: asking questions
#2325 - Fix float limit override
Pull Request -
State: open - Opened by cjluo-omniml 11 days ago
- 3 comments
#2324 - Bug in the float limit handling
Issue -
State: open - Opened by cjluo-omniml 11 days ago
- 6 comments
Labels: feature request
#2323 - Error for AGIEval when using fewshot
Issue -
State: open - Opened by BaohaoLiao 12 days ago
- 1 comment
Labels: bug, validation
#2322 - Which version to use
Issue -
State: open - Opened by sorobedio 12 days ago
- 9 comments
Labels: validation
#2321 - Mathvista
Pull Request -
State: open - Opened by baberabb 12 days ago
#2320 - change group to tags in task `eus_exams` task configs
Pull Request -
State: closed - Opened by baberabb 12 days ago
#2319 - how to get lm_eval version 4.2
Issue -
State: closed - Opened by sorobedio 13 days ago
- 1 comment
#2318 - Evaluation of MMLU tasks using the OpenAI API
Issue -
State: closed - Opened by Laplace888 13 days ago
- 3 comments
Labels: asking questions
#2317 - Multiple generations (sequential) per question
Issue -
State: open - Opened by IntrepidEnki 13 days ago
- 1 comment
Labels: feature request, asking questions
#2316 - GSM8K Problem On Colab With Finetuned Phi3.5 mini model
Issue -
State: closed - Opened by SongTonyLi 14 days ago
- 3 comments
Labels: asking questions
#2315 - remove comma
Pull Request -
State: closed - Opened by baberabb 14 days ago
#2314 - Update neuron backend
Pull Request -
State: closed - Opened by dacorvo 14 days ago
- 4 comments
#2313 - Comma breaks __repr__ for write-out
Issue -
State: closed - Opened by giuliolovisotto 14 days ago
- 1 comment
Labels: bug
#2312 - mmlu translated professionally by OpenAI
Pull Request -
State: open - Opened by giuliolovisotto 14 days ago
- 1 comment
#2311 - add batch_size to `get_sample_size`
Pull Request -
State: closed - Opened by baberabb 14 days ago
#2310 - AttributeError: 'GPT2TokenizerFast' object has no attribute 'default_chat_template'. Did you mean: 'get_chat_template'?
Issue -
State: closed - Opened by IsraelAbebe 14 days ago
- 3 comments
#2309 - Scrolls branch
Pull Request -
State: open - Opened by blitzionic 14 days ago
- 2 comments
#2308 - Chat templates
Issue -
State: closed - Opened by IsraelAbebe 15 days ago
#2307 - avoid timeout errors with high concurrency in api_model
Pull Request -
State: open - Opened by dtrawins 15 days ago
- 3 comments
#2306 - Running multiple processes on a shared outlines cache database
Issue -
State: open - Opened by e-tornike 15 days ago
- 2 comments
#2305 - New Task: `openai_mmmlu` professionaly translated by OpenAI as part of o1 release
Issue -
State: open - Opened by giuliolovisotto 15 days ago
- 1 comment
Labels: feature request
#2304 - Fix missing key in custom task loading.
Pull Request -
State: open - Opened by giuliolovisotto 15 days ago
#2303 - Missing key in dictionary when loading tasks.
Issue -
State: open - Opened by giuliolovisotto 15 days ago
Labels: bug
#2302 - Configuring Azure OPENAI
Issue -
State: open - Opened by sudhanshu-myl 15 days ago
- 3 comments
Labels: asking questions
#2301 - Fail to reproduce the perplexity of Llama-2 7B on wikitext
Issue -
State: open - Opened by Yonghao-Tan 16 days ago
- 10 comments
#2300 - add new truncation strategy
Pull Request -
State: open - Opened by artemorloff 16 days ago
- 3 comments
#2299 - fix some bugs of mmlu
Pull Request -
State: closed - Opened by eyuansu62 17 days ago
- 3 comments
#2298 - fix some bugs of mmlu (flan_cot_fewshot and flan_n_shot)
Pull Request -
State: closed - Opened by eyuansu62 17 days ago
#2297 - Update README.md
Pull Request -
State: closed - Opened by SYusupov 17 days ago
- 3 comments
#2296 - Low GPU Utilization During Multi-GPU evaluation - Efficiency Optimization
Issue -
State: open - Opened by yang3121099 17 days ago
- 1 comment
Labels: asking questions
#2295 - the log is end,the gpu is not calculate,but is storing,the result is not getting,is it normal?
Issue -
State: closed - Opened by belle9217 18 days ago
- 1 comment
Labels: asking questions
#2294 - Worse evaluation performance with PEFT adaptors
Issue -
State: open - Opened by YananLi18 18 days ago
- 1 comment
#2293 - RuntimeError: CUDA error: device-side assert triggered
Issue -
State: open - Opened by milliemaoo 19 days ago
- 2 comments
#2292 - Using multi-GPU with accelerate is not working
Issue -
State: closed - Opened by commmet-ahn 20 days ago
- 2 comments
#2291 - Infer time by use library's external api is much longer than script
Issue -
State: open - Opened by lonleyodd 20 days ago
Labels: bug
#2290 - Couldn't parse .yaml file for configuration
Issue -
State: open - Opened by ArchitJain1201 20 days ago
#2289 - A little typing issue
Issue -
State: open - Opened by yuti01 20 days ago
#2288 - Treat tags in python tasks the same as yaml tasks
Pull Request -
State: closed - Opened by giuliolovisotto 21 days ago
- 1 comment
#2287 - Issue with openai completions API - related to logprobs
Issue -
State: closed - Opened by dmakhervaks 21 days ago
- 3 comments
Labels: bug
#2286 - What's going on with swde or squadv2 tasks ?
Issue -
State: closed - Opened by ahatamiz 22 days ago
- 2 comments
#2285 - Can we connect to Vertex AI model
Issue -
State: open - Opened by patilpriyadarshini 22 days ago
#2284 - External API - same results different models
Issue -
State: closed - Opened by deema-A 24 days ago
- 2 comments
#2283 - Added TurkishMMLU to LM Evaluation Harness
Pull Request -
State: closed - Opened by ArdaYueksel 25 days ago
- 2 comments
#2282 - add mmlu readme
Pull Request -
State: closed - Opened by baberabb 25 days ago
#2281 - Multi-node MMLU support ?
Issue -
State: closed - Opened by ahatamiz 25 days ago
- 2 comments
Labels: asking questions
#2280 - Bump version to v0.4.4 ; Fixes to TMMLUplus
Pull Request -
State: closed - Opened by haileyschoelkopf 26 days ago
#2279 - zero accuracy on `mmlu_generative`
Issue -
State: open - Opened by Luodian 26 days ago
- 5 comments
Labels: bug
#2278 - May be parse LAST numbers in GSM8K "flexible-extract" filter?
Issue -
State: closed - Opened by Pupy101 27 days ago
- 2 comments
Labels: asking questions
#2277 - Free space allocated for LM in the memory after evaluation finishes
Pull Request -
State: closed - Opened by ahmedamrelhefnawy 28 days ago
- 3 comments
#2276 - Do the version of CMMLU and MMLU make any differences?
Issue -
State: closed - Opened by yaolu-zjut 28 days ago
#2275 - Teleia group task
Pull Request -
State: closed - Opened by gonz-mart 28 days ago
- 1 comment