Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / hendrycks/test issues and pull requests
#26 - Incorrect answers
Issue -
State: open - Opened by cinjon about 2 months ago
#25 - What are expected to submit for the leaderboard integration?
Issue -
State: open - Opened by zhimin-z 8 months ago
#24 - Suggestion: appropriate use of medical word: catheter
Issue -
State: open - Opened by hiroyaiizuka 9 months ago
#23 - medical word typo in CSV file.
Issue -
State: closed - Opened by hiroyaiizuka 9 months ago
- 1 comment
#22 - Why is this repository called `test`?
Issue -
State: open - Opened by nikhilweee 10 months ago
#21 - Duplicate Answers in Validation Set
Issue -
State: open - Opened by riedgar-ms 10 months ago
#20 - can not download
Issue -
State: open - Opened by fourfireM 10 months ago
- 1 comment
#19 - Hrvatski
Issue -
State: open - Opened by Deni7s 11 months ago
#18 - Incorrect answer for Q5 in high_school_computer_science_dev.csv
Issue -
State: open - Opened by TenType 11 months ago
#17 - About mmlu subset "all" from hugging face
Issue -
State: closed - Opened by githubhyz about 1 year ago
#16 - Issues with the moral scenarios task
Issue -
State: closed - Opened by c1505 about 1 year ago
- 3 comments
#15 - why ["top_logprobs"][-1]
Issue -
State: closed - Opened by don-tpanic about 1 year ago
- 1 comment
#14 - Seems that setting logprobs=100 is not useful now.
Issue -
State: open - Opened by KL4805 about 1 year ago
- 1 comment
#13 - Evaluation script for Huggingface Causal models
Pull Request -
State: open - Opened by ollmer over 1 year ago
- 15 comments
#12 - Update crop.py
Pull Request -
State: closed - Opened by sbmaruf over 1 year ago
#11 - Answers A, B, C, D are not all equally likely - is it really accurate to use random baseline as comparison?
Issue -
State: open - Opened by bmosaicml over 1 year ago
#10 - Dismatch dataset categories
Issue -
State: open - Opened by zhichengg over 1 year ago
#9 - Fetching encoder json and bpe does not work (fixed by removing a typo)
Issue -
State: open - Opened by kovacgrgur over 1 year ago
#8 - Flan-T5 Benchmarking Code
Pull Request -
State: closed - Opened by Helw150 almost 2 years ago
#7 - Dataset size mismatched with paper
Issue -
State: closed - Opened by todpole3 about 2 years ago
- 2 comments
#6 - Unintended (?) repetition in moral_scenarios_val.csv
Issue -
State: closed - Opened by sleepinyourhat about 2 years ago
- 2 comments
#5 - Odd-Looking Samples in business_ethics_val.csv
Issue -
State: closed - Opened by henighan over 2 years ago
- 2 comments
#4 - Update README.md with papers with code mirror
Pull Request -
State: open - Opened by RJT1990 over 2 years ago
#3 - Human level performance?
Issue -
State: closed - Opened by rodrigonogueira4 about 4 years ago
- 5 comments
#2 - Allows multiple engine selection from command line
Pull Request -
State: closed - Opened by xksteven about 4 years ago
#1 - Adds downloading to make code more portable
Pull Request -
State: closed - Opened by xksteven about 4 years ago