Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / EleutherAI/lm-evaluation-harness issues and pull requests
#1076 - Adding HaluEval to the list of tasks
Pull Request -
State: closed - Opened by pminervini 10 months ago
- 4 comments
#1075 - BBH, gsm8k benchmark accuracy mismatch with paper
Issue -
State: closed - Opened by hills-code 10 months ago
- 9 comments
#1074 - Update _cot_fewshot_template_yaml
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
#1073 - .
Issue -
State: closed - Opened by DrewGalbraith 10 months ago
#1072 - Is there a current way to run lm-eval against a self-hosted inference server?
Issue -
State: closed - Opened by sfriedowitz 10 months ago
- 3 comments
Labels: help wanted, feature request
#1071 - FileNotFoundError: Couldn't find a module script at exact_match.py. Module 'exact_match' doesn't exist on the Hugging Face Hub either.
Issue -
State: closed - Opened by xinghuang2050 10 months ago
- 18 comments
Labels: bug
#1070 - Evaluation on Scrolls Tasks Error
Issue -
State: closed - Opened by AdityaKulshrestha 10 months ago
- 2 comments
Labels: bug
#1069 - Updates to `hf` model type modeling code
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
- 2 comments
#1068 - Support for model instance in `HFLM.pretrained` argument
Issue -
State: closed - Opened by gugarosa 10 months ago
- 4 comments
Labels: bug
#1067 - Eval Harness Refactor Help
Issue -
State: closed - Opened by StellaAthena 10 months ago
- 4 comments
#1066 - Updating docs hyperlinks
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1065 - Confirming links in docs work (WIP)
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1064 - Set actual version to v0.4.0
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1063 - Fiddling with READMEs, Reenable CI tests on `main`
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1062 - remove commented planned samplers in `lm_eval/api/samplers.py`
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1061 - Announce v0.4.0 in README
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1060 - [Refactor] Fix fewshot cot mmlu descriptions
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
#1059 - Indexing Bugfix in huggingface.py
Pull Request -
State: closed - Opened by roy-sc 10 months ago
- 4 comments
#1058 - AttributeError: can't set attribute 'pad_token'
Issue -
State: closed - Opened by APiaoG 10 months ago
- 1 comment
#1057 - Does lm-evaluation-harness support AWQ quantized model testing?
Issue -
State: closed - Opened by Enjia 10 months ago
- 3 comments
#1056 - [New Feature] Addressing Data Contamination in Evaluation Benchmarks
Issue -
State: closed - Opened by liyucheng09 10 months ago
- 2 comments
#1055 - `mmlu_flan_cot_fewshot` is not properly formatted?
Issue -
State: closed - Opened by pengzhenghao 10 months ago
- 2 comments
#1054 - How to implement zero-shot cot (calling model twice?)
Issue -
State: closed - Opened by pengzhenghao 10 months ago
- 3 comments
#1053 - assert len(continuation_enc) error in _loglikelihood_tokens for certain (but not all) tasks?
Issue -
State: open - Opened by lhl 10 months ago
- 8 comments
#1052 - Added no-softmax entries to MODEL_REGISTRY
Pull Request -
State: open - Opened by denizyuret 10 months ago
#1051 - [Refactor] Update docs ToC
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1050 - A new DROP benchmark is needed
Issue -
State: open - Opened by StellaAthena 10 months ago
- 18 comments
Labels: opinions wanted
#1049 - Update README.md
Pull Request -
State: closed - Opened by StellaAthena 10 months ago
#1048 - [Refactor] Additions to example notebook
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1047 - Miscellaneous documentation updates
Pull Request -
State: closed - Opened by StellaAthena 10 months ago
- 1 comment
#1046 - [Refactor] Update README.md
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
#1045 - Avoid creating model_cache for OVModelForCausalLM
Pull Request -
State: closed - Opened by andreyanufr 10 months ago
- 3 comments
#1044 - How to specify evaluation times using different seeds?
Issue -
State: closed - Opened by MarshtompCS 10 months ago
- 9 comments
#1043 - using "A:" replace "A: "
Issue -
State: closed - Opened by milliemaoo 10 months ago
- 1 comment
#1042 - Warning on gsm8k
Issue -
State: closed - Opened by liranringel 10 months ago
- 6 comments
#1041 - Adding nq_open to task_table.md
Pull Request -
State: closed - Opened by pminervini 10 months ago
#1040 - style(README): alert markdown GooseAI link
Pull Request -
State: closed - Opened by guspan-tanadi 10 months ago
- 1 comment
#1039 - Added the --no_softmax option
Pull Request -
State: closed - Opened by denizyuret 10 months ago
- 7 comments
#1038 - fixes for sampler
Pull Request -
State: closed - Opened by baberabb 10 months ago
#1037 - [refactor] mps requirement
Pull Request -
State: closed - Opened by baberabb 10 months ago
- 1 comment
#1036 - [Refactor] Fixes to sampler
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
#1035 - [Refactor] vllm data parallel
Pull Request -
State: closed - Opened by baberabb 10 months ago
- 7 comments
#1034 - scrolls pyrouge import error
Issue -
State: open - Opened by sshleifer 10 months ago
- 3 comments
#1033 - [Refactor] Urgent fix
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
#1032 - Rename bigbench.yml to default.yml
Pull Request -
State: closed - Opened by StellaAthena 10 months ago
#1031 - [Refactor] Versioning
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
- 2 comments
#1030 - Social iqa
Pull Request -
State: closed - Opened by StellaAthena 10 months ago
#1029 - [Refactor] BBH fixup
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
- 7 comments
#1028 - [big-refactor] Adding Flash Attention 2 to HF Model
Issue -
State: closed - Opened by orendar 10 months ago
- 2 comments
Labels: feature request
#1027 - [New Task] SIQA
Issue -
State: closed - Opened by haileyschoelkopf 10 months ago
Labels: help wanted, feature request, good first issue
#1026 - [New Task] CommonsenseQA
Issue -
State: closed - Opened by haileyschoelkopf 10 months ago
- 4 comments
Labels: help wanted, feature request, good first issue
#1025 - [Refactor] add notebook for overview
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
- 3 comments
#1024 - [Refactor] Use correct HF model type for MBart-like models
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
- 1 comment
#1023 - Some questions on the DROP evaluations
Issue -
State: closed - Opened by lpc-eol 10 months ago
- 3 comments
#1022 - [big-refactor] Wrong AutoModel Assignment for MBart
Issue -
State: closed - Opened by mcemilg 10 months ago
- 6 comments
Labels: bug
#1021 - Error occur when evaluating local model: transformer sentencepiece piece id out of range
Issue -
State: closed - Opened by AnqiZhou226 10 months ago
- 3 comments
#1020 - [Refactor] Update README
Pull Request -
State: closed - Opened by baberabb 10 months ago
#1019 - Is there a way to check if each sample in the dataset is correct or incorrect?
Issue -
State: closed - Opened by sean0042 10 months ago
- 2 comments
#1018 - [Refactor] Remove `examples/` folder
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1017 - The tokenizer add_special_tokens parameter for t5 model lambada task
Issue -
State: open - Opened by daisyden 10 months ago
- 11 comments
#1016 - can we pass specific configs for specific tasks while running multiple benchmarks?
Issue -
State: closed - Opened by sayan1101 10 months ago
- 2 comments
Labels: feature request
#1015 - Issue "No module named 'lm_eval'"
Issue -
State: closed - Opened by DreF174 10 months ago
- 4 comments
Labels: bug
#1014 - Add DeepSparseLM
Pull Request -
State: closed - Opened by mgoin 10 months ago
- 2 comments
#1013 - [New Task] COLLIE
Issue -
State: open - Opened by haileyschoelkopf 10 months ago
Labels: help wanted, feature request, good first issue
#1012 - [New Task Request] IFEval / Instruction-Following Eval
Issue -
State: closed - Opened by haileyschoelkopf 10 months ago
Labels: help wanted, feature request, good first issue
#1011 - [Refactor] vllm support
Pull Request -
State: closed - Opened by baberabb 10 months ago
- 7 comments
#1010 - [New Task] Implement GPQA dataset
Issue -
State: closed - Opened by haileyschoelkopf 10 months ago
- 1 comment
Labels: help wanted, feature request, good first issue
#1009 - [Refactor] Improve Handling of Stop-Sequences for HF Batched Generation
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1008 - [Refactor] Openai completions
Pull Request -
State: closed - Opened by lintangsutawika 10 months ago
#1007 - Update main-branch README
Pull Request -
State: closed - Opened by haileyschoelkopf 10 months ago
#1006 - Stability Upstream translated task
Issue -
State: open - Opened by StellaAthena 11 months ago
- 1 comment
Labels: feature request
#1005 - Fix indent in lm_eval/tasks/bigbench.py
Pull Request -
State: closed - Opened by Andrei-Aksionov 11 months ago
- 1 comment
#1004 - Adds Python 3.8 Compatibility
Pull Request -
State: closed - Opened by StellaAthena 11 months ago
#1003 - [big-refactor] Accelerate launch FSDP Runtime Error
Issue -
State: closed - Opened by fengzi258 11 months ago
- 1 comment
Labels: bug
#1002 - [Refactor] Bugfixes
Pull Request -
State: closed - Opened by haileyschoelkopf 11 months ago
#1001 - [Refactor] will check if group_name is None
Pull Request -
State: closed - Opened by lintangsutawika 11 months ago
#1000 - How to interpret TruthfulQA_mc write out file
Issue -
State: closed - Opened by Luckyluuuc 11 months ago
#999 - [Refactor] Squad misc
Pull Request -
State: closed - Opened by lintangsutawika 11 months ago
#998 - [Refactor] group name error with MMLU
Issue -
State: closed - Opened by tmabraham 11 months ago
- 1 comment
#997 - [Refactor] Fix CI tests
Pull Request -
State: closed - Opened by haileyschoelkopf 11 months ago
#996 - [Refactor] Minor cleanup on base `Task` subclasses
Pull Request -
State: closed - Opened by haileyschoelkopf 11 months ago
#995 - Model is not a local folder and is not a valid identifier.
Issue -
State: closed - Opened by Abhista414 11 months ago
- 1 comment
#994 - Added support of OpenVINO inference
Pull Request -
State: closed - Opened by AlexKoff88 11 months ago
- 6 comments
#993 - How to interpret generated results for truthful_qa test
Issue -
State: open - Opened by Joetib 11 months ago
- 3 comments
#992 - TypeError: HFLM.__init__() got an unexpected keyword argument 'use_accelerate'
Issue -
State: closed - Opened by shaunstoltz 11 months ago
- 6 comments
#991 - it seems like a bug in winogrande.py
Issue -
State: closed - Opened by gaoteng-git 11 months ago
- 1 comment
#990 - feat: add option to upload results to Zeno
Pull Request -
State: closed - Opened by Sparkier 11 months ago
- 6 comments
Labels: feature request
#989 - llama 2 70b gptq use too much cpu memory
Issue -
State: closed - Opened by fancyerii 11 months ago
- 3 comments
Labels: bug
#988 - [Refactor] BigBench
Issue -
State: closed - Opened by orendar 11 months ago
- 9 comments
Labels: bug
#987 - [Refactor] Alias fix
Pull Request -
State: closed - Opened by lintangsutawika 11 months ago
#986 - WIP: Add MedMcqa Task to lm-evaluation
Pull Request -
State: closed - Opened by issamYahiaoui 11 months ago
- 1 comment
#985 - [Refactor] Num_fewshot process
Pull Request -
State: closed - Opened by lintangsutawika 11 months ago
- 2 comments
#984 - evaluate model from local machine
Issue -
State: closed - Opened by umarbeknasimov 11 months ago
- 5 comments
#983 - How to see intermediate output?
Issue -
State: closed - Opened by Ezra-Yu 11 months ago
- 1 comment
Labels: documentation
#982 - SquadV2 results are not reproducible for Llama2-7B
Issue -
State: closed - Opened by gupta-abhay 11 months ago
- 11 comments
#981 - [Refactor] fixes for alternative MMLU tasks.
Pull Request -
State: closed - Opened by lintangsutawika 11 months ago
#980 - Average score metric isn't normalized whatsoever
Issue -
State: closed - Opened by kalomaze 11 months ago
- 1 comment
#979 - add description on task/group alias
Pull Request -
State: closed - Opened by lintangsutawika 11 months ago
#978 - Some questions on the DROP and WinoGrande Harness implementations
Issue -
State: closed - Opened by clefourrier 11 months ago
- 9 comments
Labels: help wanted, good first issue, validation
#977 - Fix unnatural tokenizations if possible
Pull Request -
State: closed - Opened by KlaudiaTH 11 months ago
- 1 comment