Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / EleutherAI/lm-evaluation-harness issues and pull requests
#1974 - added bias and stereotype classification tasks
Pull Request -
State: closed - Opened by aditya20t 4 months ago
- 1 comment
#1973 - Add GigaChat API
Pull Request -
State: closed - Opened by seldereyy 4 months ago
- 3 comments
#1972 - incomplete task list
Issue -
State: closed - Opened by hlzhang109 4 months ago
- 2 comments
#1971 - Ubelievable long time when host the gguf mode ?
Issue -
State: open - Opened by hzgdeerHo 4 months ago
- 2 comments
#1970 - mela
Pull Request -
State: closed - Opened by Geralt-Targaryen 4 months ago
- 5 comments
#1969 - Fix OpenAI API discrepancies
Pull Request -
State: closed - Opened by chimezie 4 months ago
#1968 - Updates to fix OpenAI API compliance
Pull Request -
State: closed - Opened by chimezie 4 months ago
#1967 - OpenAI completions model not using OpenAI Completion API properly to extract LogProbs
Issue -
State: open - Opened by chimezie 4 months ago
- 2 comments
#1966 - TemplateLM#_encode_pair() only works for HF transformers auto-models
Issue -
State: closed - Opened by Birch-san 4 months ago
- 1 comment
#1965 - Error while installing
Issue -
State: closed - Opened by surya-narayanan 4 months ago
- 1 comment
#1964 - Add BertaQA dataset tasks
Pull Request -
State: closed - Opened by juletx 4 months ago
- 1 comment
#1963 - How to use a vllm hosted model?
Issue -
State: open - Opened by darsh-essential 4 months ago
- 1 comment
#1962 - Error when chat template is not a string
Issue -
State: open - Opened by djstrong 4 months ago
- 1 comment
#1961 - Mmlu Pro
Pull Request -
State: closed - Opened by ysjprojects 4 months ago
- 14 comments
#1960 - Multi-gpu evaluation with external library usage.
Issue -
State: closed - Opened by xinghaow99 4 months ago
- 1 comment
#1959 - Making torch dep optional?
Issue -
State: open - Opened by dlwh 4 months ago
- 4 comments
#1958 - Wandb logger can't handle groups with heterogenous metrics
Issue -
State: open - Opened by dmitrii-palisaderesearch 4 months ago
- 11 comments
#1957 - Cannot load model 'local-chat-completions' and 'local-completions'
Issue -
State: closed - Opened by awesom112 4 months ago
#1956 - fix: add directory filter to os.walk to ignore 'ipynb_checkpoints'
Pull Request -
State: closed - Opened by johnwee1 4 months ago
- 11 comments
#1955 - Fix a tiny typo in `docs/interface.md`
Pull Request -
State: closed - Opened by sadra-barikbin 4 months ago
#1954 - Fix task.py and evaluator.py
Pull Request -
State: closed - Opened by zhabuye 4 months ago
- 1 comment
#1953 - Keep getting error: 'VLLM' object has no attribute 'AUTO_MODEL_CLASS'
Issue -
State: closed - Opened by andrew0411 4 months ago
- 8 comments
#1952 - .ipynb_checkpoints causes eval harness to fail
Issue -
State: closed - Opened by johnwee1 4 months ago
#1951 - Plans for a new release?
Issue -
State: closed - Opened by nathan-weinberg 4 months ago
- 5 comments
#1950 - LMJudge
Pull Request -
State: closed - Opened by baberabb 4 months ago
- 5 comments
#1949 - Check compatibility of `local-completions` with VLLM (returns logits) for `multiple_choice` tasks
Issue -
State: open - Opened by haileyschoelkopf 4 months ago
Labels: bug
#1948 - Remove AMMLU Due to Translation
Pull Request -
State: closed - Opened by haileyschoelkopf 4 months ago
- 2 comments
#1947 - Add MMLU-Pro Dataset
Issue -
State: open - Opened by haileyschoelkopf 4 months ago
Labels: help wanted, feature request, good first issue
#1946 - Alghafa benchmark
Pull Request -
State: open - Opened by khalil-Hennara 4 months ago
- 7 comments
#1945 - The output of ceval is not as the same format at the official version?
Issue -
State: open - Opened by ChuanhongLi 4 months ago
- 1 comment
#1944 - Results is weird for Qwen2-1.5B
Issue -
State: closed - Opened by SefaZeng 4 months ago
- 6 comments
#1943 - Allow running hugging face models with both data parallelism and model parallelism at once
Pull Request -
State: closed - Opened by clefourrier 4 months ago
#1942 - Fixed the [issue #1757](https://github.com/EleutherAI/lm-evaluation-harness/issues/1757) by editing the `yaml` files.
Pull Request -
State: closed - Opened by sci-m-wang 4 months ago
- 2 comments
#1941 - Save `fewshot_as_multiturn` argument in `results.json`
Issue -
State: closed - Opened by djstrong 4 months ago
- 1 comment
#1940 - Add the Arabic version with refactor to Arabic pica to be in alghafa …
Pull Request -
State: closed - Opened by khalil-Hennara 4 months ago
#1939 - Fix a tiny typo in `__main__.py`
Pull Request -
State: closed - Opened by sadra-barikbin 4 months ago
- 1 comment
#1938 - Regarding decontamination
Issue -
State: open - Opened by dsdanielpark 4 months ago
#1937 - Format of Personal Defined Dataset for Evaluation
Issue -
State: closed - Opened by OscarC9912 4 months ago
- 1 comment
#1936 - High Number of Tokens for openai-completions Models
Issue -
State: open - Opened by selinaxiao 4 months ago
#1935 - Parallel GPU evaluation using simple_evaluate /evaluate functions? #1934
Issue -
State: closed - Opened by PalaashAgrawal 4 months ago
- 1 comment
#1934 - Parallel GPU evaluation using simple_evaluate /evaluate functions?
Issue -
State: closed - Opened by Naitik1502 4 months ago
#1933 - Easier unitxt tasks loading and removal of unitxt library dependancy
Pull Request -
State: closed - Opened by elronbandel 4 months ago
- 8 comments
#1932 - --trust_remote_code does it actually do anything?
Issue -
State: closed - Opened by devzzzero 4 months ago
- 8 comments
Labels: bug
#1931 - [add] fld logical formula task
Pull Request -
State: closed - Opened by MorishT 4 months ago
- 1 comment
#1930 - `samples` is newline delimited
Pull Request -
State: closed - Opened by baberabb 4 months ago
#1929 - Prettify lm_eval --tasks list
Pull Request -
State: closed - Opened by anthony-dipofi 4 months ago
- 2 comments
#1928 - [New Task] Add Paloma benchmark
Pull Request -
State: closed - Opened by zafstojano 4 months ago
- 5 comments
#1927 - Modify pre-commit hook to check merge conflicts accidentally committed
Pull Request -
State: closed - Opened by LSinev 4 months ago
#1926 - Results filenames handling fix
Pull Request -
State: closed - Opened by KonradSzafer 4 months ago
- 3 comments
#1925 - --hf_hub_log_args causes IndexError
Issue -
State: closed - Opened by johnwee1 4 months ago
- 2 comments
#1924 - Update brier_score to be bounded [1,0]
Pull Request -
State: closed - Opened by xksteven 4 months ago
- 2 comments
#1923 - OOM Issue
Issue -
State: closed - Opened by zhentingqi 4 months ago
- 5 comments
#1922 - Multiprompt
Pull Request -
State: open - Opened by lintangsutawika 4 months ago
#1921 - Confusion matrix metric
Pull Request -
State: open - Opened by minaremeli 4 months ago
- 10 comments
#1920 - build commit_id=b281b09, I cannot find lm-eval command.
Issue -
State: closed - Opened by jieheroli 4 months ago
- 1 comment
#1919 - change openai completions params to fit API documentation
Pull Request -
State: open - Opened by artemorloff 4 months ago
#1918 - output_path may break postprocessing
Issue -
State: open - Opened by artemorloff 4 months ago
- 3 comments
#1917 - Add The Arabic version of the PICA benchmark
Pull Request -
State: closed - Opened by khalil-Hennara 4 months ago
#1916 - Test output table layout consistency
Pull Request -
State: closed - Opened by zafstojano 4 months ago
- 1 comment
#1915 - Add New Benchmark
Issue -
State: closed - Opened by khalil-Hennara 4 months ago
- 2 comments
#1914 - Fix fewshot seed only set when overriding num_fewshot
Pull Request -
State: closed - Opened by LSinev 4 months ago
#1913 - Update basque-glue
Pull Request -
State: closed - Opened by zhabuye 4 months ago
#1912 - Implement NoticIA
Pull Request -
State: closed - Opened by ikergarcia1996 4 months ago
#1911 - accuracy precision
Issue -
State: closed - Opened by lernerjenny 4 months ago
- 3 comments
#1910 - Add TensorRT-LLM support
Issue -
State: open - Opened by taewan2002 4 months ago
- 1 comment
Labels: feature request
#1909 - Fix social_iqa answer choices
Pull Request -
State: closed - Opened by haileyschoelkopf 4 months ago
#1908 - social_iqa choices do not use actual answers
Issue -
State: closed - Opened by ozgurcelik 4 months ago
- 2 comments
#1907 - Evaluation for MegatronT5 Model
Issue -
State: closed - Opened by wangyanbao666 4 months ago
- 4 comments
#1906 - Fewshot seed only set when overriding num_fewshot
Issue -
State: closed - Opened by stoical07 4 months ago
- 1 comment
Labels: bug
#1905 - Try to make existing tests run little bit faster
Pull Request -
State: closed - Opened by LSinev 4 months ago
- 1 comment
#1904 - Load sentencepiece tokenizer for evaluation
Issue -
State: closed - Opened by ayushsml 4 months ago
- 2 comments
#1903 - OpenaiCompletionsLM invokes the completions API with max_tokens set to 0
Issue -
State: open - Opened by chimezie 4 months ago
- 1 comment
#1902 - mlx Model (loglikelihood & generate_until)
Pull Request -
State: open - Opened by chimezie 4 months ago
- 8 comments
#1901 - Complete task list from pr 1727
Pull Request -
State: closed - Opened by anthony-dipofi 4 months ago
- 5 comments
#1900 - add arc_challenge_mt
Pull Request -
State: closed - Opened by jonabur 4 months ago
- 5 comments
#1899 - model_comparator.py broken
Issue -
State: open - Opened by johnwee1 4 months ago
#1898 - Add dataset card when pushing to HF hub
Pull Request -
State: closed - Opened by KonradSzafer 4 months ago
- 3 comments
#1897 - Add new Lambada translations
Pull Request -
State: closed - Opened by zafstojano 4 months ago
- 6 comments
#1896 - llama3-base gsm8k score
Issue -
State: closed - Opened by rangehow 4 months ago
- 2 comments
#1895 - Making hardcoded few shots compatible with the chat template mechanism
Pull Request -
State: closed - Opened by clefourrier 4 months ago
- 5 comments
#1894 - vLLM causing GPU memory leak with data_parallel_size=3
Issue -
State: closed - Opened by johnwee1 4 months ago
- 2 comments
#1893 - `higher_is_better` tickers in output table
Pull Request -
State: closed - Opened by zafstojano 4 months ago
- 1 comment
#1892 - GPU memory very high and unbalanced when testing Gemma
Issue -
State: closed - Opened by smartliuhw 4 months ago
- 2 comments
#1891 - TypeError of scrolls_narrativeqa
Issue -
State: open - Opened by hicleo 4 months ago
#1890 - Updated vllm imports in vllm_causallms.py
Pull Request -
State: closed - Opened by mgoin 4 months ago
#1889 - ImportError: cannot import name 'HfApi' from 'huggingface_hub'
Issue -
State: closed - Opened by baberabb 4 months ago
- 2 comments
#1888 - Aligning Prompts and choices of LogiQA task
Pull Request -
State: closed - Opened by abzb1 4 months ago
- 1 comment
#1887 - Mismatch Between Prompt Format and Expected Choices in LogiQA Dataset
Issue -
State: closed - Opened by abzb1 4 months ago
- 3 comments
#1886 - [HFLM]Add support for Ascend NPU
Pull Request -
State: closed - Opened by statelesshz 4 months ago
- 4 comments
#1885 - Multiple issues Encountered During Tasks Verification
Issue -
State: open - Opened by zhabuye 4 months ago
- 21 comments
#1884 - can we add C4 and PTB tasks for PpL?
Issue -
State: open - Opened by 123wujiao 4 months ago
- 1 comment
Labels: feature request
#1883 - Add Regression Testing
Issue -
State: open - Opened by haileyschoelkopf 4 months ago
- 4 comments
Labels: help wanted, feature request, good first issue
#1882 - eval with Alpaca template
Issue -
State: closed - Opened by oneonlee 4 months ago
- 1 comment
#1881 - test_docs of scrolls dataset
Issue -
State: closed - Opened by huweim 4 months ago
- 1 comment
#1880 - [HFLM]Use Accelerate's API to reduce hard-coded CUDA code
Pull Request -
State: closed - Opened by statelesshz 4 months ago
- 2 comments
#1879 - Release schedule - more regular PyPI releases?
Issue -
State: closed - Opened by stoical07 4 months ago
- 1 comment
#1878 - Add LegalBench tasks
Pull Request -
State: open - Opened by zafstojano 4 months ago
- 7 comments
#1877 - Use local data sets
Issue -
State: closed - Opened by claire360 4 months ago
- 2 comments