Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / EleutherAI/lm-evaluation-harness issues and pull requests
#2028 - vllm backend faild
Issue -
State: closed - Opened by chunniunai220ml 1 day ago
- 3 comments
#2027 - --log_samples not saving all inference output
Issue -
State: open - Opened by zitgit 1 day ago
#2026 - Test Open LLM Leaderboard 2
Issue -
State: open - Opened by matouk98 1 day ago
- 2 comments
#2025 - Duplicate `sample` entries
Issue -
State: open - Opened by baberabb 1 day ago
#2024 - Fix `trust_remote_code`-related test failures
Pull Request -
State: closed - Opened by haileyschoelkopf 2 days ago
#2023 - adds leaderboard tasks
Pull Request -
State: closed - Opened by NathanHB 2 days ago
#2022 - [add] multiple-choice-question versions of fld benchmark
Pull Request -
State: open - Opened by MorishT 2 days ago
- 1 comment
#2021 - YAML config was updated, but the project still remains the same as before
Issue -
State: closed - Opened by 2018211801 2 days ago
- 3 comments
#2020 - Add Redlite tasks for safety benchmarking
Pull Request -
State: open - Opened by inno-simon 3 days ago
- 1 comment
#2019 - Add MMLU-ru based on MERA
Pull Request -
State: closed - Opened by SpirinEgor 3 days ago
- 1 comment
#2018 - Does it support Triton server?
Issue -
State: closed - Opened by AndyZZt 3 days ago
- 1 comment
Labels: asking questions
#2017 - [Not For Merge] Enable chat-template for vLLM
Pull Request -
State: open - Opened by akjindal53244 4 days ago
- 1 comment
#2016 - Running on custom model, getting 'TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
Issue -
State: closed - Opened by Fchaubard 4 days ago
#2015 - Hotfix breaking import
Pull Request -
State: closed - Opened by StellaAthena 4 days ago
#2014 - Supporting Multimodality
Issue -
State: open - Opened by lintangsutawika 4 days ago
#2013 - Fix regexp parsing for bbh_cot_fewshot
Pull Request -
State: open - Opened by arkapal3 4 days ago
- 1 comment
#2012 - Compatibility with Models from PyReft Library
Issue -
State: open - Opened by crux82 5 days ago
- 6 comments
#2011 - Remove `LM` dependency from `build_all_requests`
Pull Request -
State: closed - Opened by baberabb 6 days ago
#2010 - Added MedConceptsQA Benchmark
Pull Request -
State: open - Opened by Ofir408 6 days ago
- 1 comment
#2009 - Remove `LM` dependency from `build_all_requests`
Pull Request -
State: closed - Opened by baberabb 6 days ago
#2008 - Refactor API models
Pull Request -
State: open - Opened by baberabb 6 days ago
#2007 - Wrong calculation of score when there are ties?
Issue -
State: open - Opened by apohllo 7 days ago
- 2 comments
#2006 - Error Correction: Eliminate undefined parameter in function call
Pull Request -
State: open - Opened by zhabuye 8 days ago
- 2 comments
#2005 - mmlu evaluation fail
Issue -
State: closed - Opened by jxiw 8 days ago
- 2 comments
#2004 - make pytorch an optional dependency
Pull Request -
State: open - Opened by dlwh 8 days ago
- 2 comments
#2003 - Fixes scrolls task bug with few_shot examples
Pull Request -
State: open - Opened by xksteven 8 days ago
- 7 comments
#2002 - Implementing lessons from OLMES
Issue -
State: open - Opened by lintangsutawika 8 days ago
#2001 - Add HuggingFace Text-Generation-Interface Support
Issue -
State: open - Opened by taoari 9 days ago
#2000 - Incorrect Multilingual arc implementation
Issue -
State: open - Opened by hynky1999 9 days ago
#1999 - Handle Empty openai response
Pull Request -
State: open - Opened by ciaranby 9 days ago
#1998 - Fix Datasets `--trust_remote_code`
Pull Request -
State: closed - Opened by haileyschoelkopf 9 days ago
- 2 comments
#1997 - Fix partial caching of openai models
Pull Request -
State: open - Opened by ciaranby 9 days ago
- 1 comment
#1996 - Add Gigachat model
Pull Request -
State: open - Opened by seldereyy 9 days ago
#1995 - Log `fewshot_as_multiturn` in results files
Pull Request -
State: closed - Opened by haileyschoelkopf 9 days ago
#1994 - Fix naming error; 'include: _paloma_template' -> 'include: paloma.yaml'
Pull Request -
State: closed - Opened by LucWeber 9 days ago
- 2 comments
#1993 - Fix Paloma Template yaml
Pull Request -
State: closed - Opened by haileyschoelkopf 9 days ago
- 1 comment
#1992 - Add HumanEval
Pull Request -
State: open - Opened by hjlee1371 9 days ago
- 1 comment
#1991 - added yaml and util file
Pull Request -
State: closed - Opened by satyamshukl 9 days ago
- 3 comments
#1990 - Fix self assignment in neuron_optimum.py
Pull Request -
State: closed - Opened by LSinev 10 days ago
- 2 comments
#1989 - [Fix] Replace generic exception classes with a more specific ones
Pull Request -
State: open - Opened by LSinev 10 days ago
- 1 comment
#1988 - main
Pull Request -
State: open - Opened by msamwelmollel 10 days ago
- 4 comments
#1987 - Added ArabicMMLU
Pull Request -
State: closed - Opened by Yazeed7 10 days ago
- 5 comments
#1986 - Added ArabicMMLU
Pull Request -
State: closed - Opened by Yazeed7 10 days ago
- 1 comment
#1985 - `piqa` task need add trust_remote_code true in piqa.yml
Issue -
State: closed - Opened by changwangss 10 days ago
#1984 - Long time testing Qwen2-72B
Issue -
State: open - Opened by djstrong 10 days ago
- 1 comment
Labels: bug
#1983 - add trust_remote_code for piqa
Pull Request -
State: closed - Opened by changwangss 10 days ago
- 1 comment
#1982 - Update interface.md
Pull Request -
State: closed - Opened by johnwee1 10 days ago
#1981 - Add Task: CBT
Pull Request -
State: closed - Opened by ookkeeeee 11 days ago
- 2 comments
#1980 - How to enable trust_remote_code when encountered programmatically via get_task_dict?
Issue -
State: closed - Opened by Jack-Khuu 11 days ago
- 3 comments
#1979 - add persianmmlu benchmark for assessing Persian Language understanding
Pull Request -
State: open - Opened by MrzEsma 11 days ago
- 2 comments
#1978 - Add a way to instantiate from HF.AutoModel (again)
Issue -
State: closed - Opened by dmitrii-palisaderesearch 11 days ago
- 2 comments
#1977 - add persianmmlu benchmark for assessing Persian Language understanding
Pull Request -
State: closed - Opened by MrzEsma 11 days ago
- 1 comment
#1976 - What is the output_type in the metric for?
Issue -
State: open - Opened by dennisrall 11 days ago
- 1 comment
#1975 - Fix local completion huggingface tokenizer
Pull Request -
State: open - Opened by okdshin 11 days ago
- 1 comment
#1974 - added bias and stereotype classification tasks
Pull Request -
State: closed - Opened by aditya20t 11 days ago
- 1 comment
#1973 - Add GigaChat API
Pull Request -
State: closed - Opened by seldereyy 11 days ago
- 1 comment
#1972 - incomplete task list
Issue -
State: closed - Opened by hlzhang109 12 days ago
- 2 comments
#1971 - Ubelievable long time when host the gguf mode ?
Issue -
State: open - Opened by hzgdeerHo 12 days ago
- 2 comments
#1970 - mela
Pull Request -
State: open - Opened by Geralt-Targaryen 12 days ago
- 2 comments
#1969 - Fix OpenAI API discrepancies
Pull Request -
State: open - Opened by chimezie 14 days ago
#1968 - Updates to fix OpenAI API compliance
Pull Request -
State: closed - Opened by chimezie 14 days ago
#1967 - OpenAI completions model not using OpenAI Completion API properly to extract LogProbs
Issue -
State: open - Opened by chimezie 14 days ago
- 2 comments
#1966 - TemplateLM#_encode_pair() only works for HF transformers auto-models
Issue -
State: closed - Opened by Birch-san 14 days ago
- 1 comment
#1965 - Error while installing
Issue -
State: closed - Opened by surya-narayanan 14 days ago
- 1 comment
#1964 - Add BertaQA dataset tasks
Pull Request -
State: closed - Opened by juletx 15 days ago
- 1 comment
#1963 - How to use a vllm hosted model?
Issue -
State: open - Opened by darsh-essential 15 days ago
- 1 comment
#1962 - Error when chat template is not a string
Issue -
State: open - Opened by djstrong 15 days ago
#1961 - Mmlu Pro
Pull Request -
State: open - Opened by ysjprojects 15 days ago
- 6 comments
#1960 - Multi-gpu evaluation with external library usage.
Issue -
State: closed - Opened by xinghaow99 15 days ago
- 1 comment
#1959 - Making torch dep optional?
Issue -
State: open - Opened by dlwh 16 days ago
- 4 comments
#1958 - Wandb logger can't handle groups with heterogenous metrics
Issue -
State: open - Opened by dmitrii-palisaderesearch 16 days ago
- 11 comments
#1957 - Cannot load model 'local-chat-completions' and 'local-completions'
Issue -
State: closed - Opened by awesom112 16 days ago
#1956 - fix: add directory filter to os.walk to ignore 'ipynb_checkpoints'
Pull Request -
State: closed - Opened by johnwee1 16 days ago
- 11 comments
#1955 - Fix a tiny typo in `docs/interface.md`
Pull Request -
State: closed - Opened by sadra-barikbin 16 days ago
#1954 - Fix task.py and evaluator.py
Pull Request -
State: closed - Opened by zhabuye 16 days ago
- 1 comment
#1953 - Keep getting error: 'VLLM' object has no attribute 'AUTO_MODEL_CLASS'
Issue -
State: closed - Opened by andrew0411 17 days ago
- 6 comments
#1952 - .ipynb_checkpoints causes eval harness to fail
Issue -
State: closed - Opened by johnwee1 17 days ago
#1951 - Plans for a new release?
Issue -
State: open - Opened by nathan-weinberg 17 days ago
- 4 comments
#1950 - LMJudge
Pull Request -
State: open - Opened by baberabb 17 days ago
- 4 comments
#1949 - Check compatibility of `local-completions` with VLLM (returns logits) for `multiple_choice` tasks
Issue -
State: open - Opened by haileyschoelkopf 17 days ago
Labels: bug
#1948 - Remove AMMLU Due to Translation
Pull Request -
State: closed - Opened by haileyschoelkopf 17 days ago
- 2 comments
#1947 - Add MMLU-Pro Dataset
Issue -
State: open - Opened by haileyschoelkopf 17 days ago
Labels: help wanted, feature request, good first issue
#1946 - Alghafa benchmark
Pull Request -
State: open - Opened by khalil-Hennara 17 days ago
- 7 comments
#1945 - The output of ceval is not as the same format at the official version?
Issue -
State: open - Opened by ChuanhongLi 17 days ago
- 1 comment
#1944 - Results is weird for Qwen2-1.5B
Issue -
State: closed - Opened by SefaZeng 17 days ago
- 6 comments
#1943 - Allow running hugging face models with both data parallelism and model parallelism at once
Pull Request -
State: closed - Opened by clefourrier 18 days ago
#1942 - Fixed the [issue #1757](https://github.com/EleutherAI/lm-evaluation-harness/issues/1757) by editing the `yaml` files.
Pull Request -
State: closed - Opened by sci-m-wang 18 days ago
- 2 comments
#1941 - Save `fewshot_as_multiturn` argument in `results.json`
Issue -
State: closed - Opened by djstrong 18 days ago
- 1 comment
#1940 - Add the Arabic version with refactor to Arabic pica to be in alghafa …
Pull Request -
State: closed - Opened by khalil-Hennara 18 days ago
#1939 - Fix a tiny typo in `__main__.py`
Pull Request -
State: closed - Opened by sadra-barikbin 19 days ago
- 1 comment
#1938 - Regarding decontamination
Issue -
State: open - Opened by dsdanielpark 19 days ago
#1937 - Format of Personal Defined Dataset for Evaluation
Issue -
State: closed - Opened by OscarC9912 20 days ago
- 1 comment
#1936 - High Number of Tokens for openai-completions Models
Issue -
State: open - Opened by selinaxiao 21 days ago
#1935 - Parallel GPU evaluation using simple_evaluate /evaluate functions? #1934
Issue -
State: closed - Opened by PalaashAgrawal 21 days ago
- 1 comment
#1934 - Parallel GPU evaluation using simple_evaluate /evaluate functions?
Issue -
State: closed - Opened by Naitik1502 21 days ago
#1933 - Easier unitxt tasks loading and removal of unitxt library dependancy
Pull Request -
State: open - Opened by elronbandel 22 days ago
- 5 comments
#1932 - --trust_remote_code does it actually do anything?
Issue -
State: closed - Opened by devzzzero 22 days ago
- 8 comments
Labels: bug
#1931 - [add] fld logical formula task
Pull Request -
State: closed - Opened by MorishT 22 days ago
- 1 comment