Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / CLUEbenchmark/SuperCLUE issues and pull requests
#41 - GPT4-Turbo is missing from the general leaderboard
Issue -
State: closed - Opened by zhimin-z 6 months ago
- 1 comment
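#40 - Question: according to the evaluation report, SuperCLUE uses automated objective evaluation. Could you provide runnable Python sample code for automated evaluation of a specific model (via API calls or web)?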
Issue -
State: open - Opened by Romanzhang2024 6 months ago
#39 - Does it indicate using 5 shots for evaluation?
Issue -
State: closed - Opened by zhimin-z 7 months ago
- 1 comment
#38 - Where to download the benchmark dataset?
Issue -
State: open - Opened by zhimin-z 7 months ago
#37 - How to calculate the metrics from the table in the paper to the leaderboard?
Issue -
State: open - Opened by zhimin-z 7 months ago
- 1 comment
#36 - How large models are upgraded
Issue -
State: open - Opened by lukeup 7 months ago
#35 - How is the role-playing benchmark conducted?
Issue -
State: closed - Opened by xealml 8 months ago
#34 - Where to locate the SuperCLUE-LYB leaderboards?
Issue -
State: open - Opened by zhimin-z 8 months ago
#33 - Could you add an evaluation ranking for translation?
Issue -
State: open - Opened by lx0126z 9 months ago
#32 - What are the evaluation criteria for task planning and tool use?
Issue -
State: open - Opened by heibaidaolx123 10 months ago
- 1 comment
#31 - c-eval is really absurd; hoping superclue can update a bit faster, e.g. once every 1-2 weeks
Issue -
State: open - Opened by iammeizu 10 months ago
- 2 comments
#30 - anthropic is misspelled
Issue -
State: open - Opened by JerryJiang12923 11 months ago
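#29 - What exactly does "logic and reasoning" cover? For example, given "Guo Degang could read newspapers at age 2, xxxx", can Guo Degang read books at age 3? Does this count as reasoning or semantic understanding?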
Issue -
State: open - Opened by ArtificialZeng 11 months ago
#28 - Could the vicuna-33B model be added to the evaluation?
Issue -
State: open - Opened by Mr-wang2016 11 months ago
#27 - How are responses matched against the reference answers during evaluation?
Issue -
State: open - Opened by Starry-Hu about 1 year ago
#26 - Is the dataset open source? Where can it be downloaded?
Issue -
State: open - Opened by vanshaw2017 about 1 year ago
- 3 comments
#25 - Question about prompt design
Issue -
State: open - Opened by lrs1353281004 about 1 year ago
- 1 comment
#24 - What is the reason for the ranking changes?
Issue -
State: open - Opened by zhaojiawen-coding about 1 year ago
- 1 comment
#23 - Please test the BAAI (智源) large model
Issue -
State: open - Opened by forkyguo about 1 year ago
- 3 comments
#22 - Is Alibaba's Tongyi Qianwen (通义千问) not included?
Issue -
State: closed - Opened by Pancat009 about 1 year ago
- 2 comments
#21 - Shouldn't "idea-jiangzhiya" here be "idea-jiangziya"?
Issue -
State: open - Opened by ilongshan about 1 year ago
- 1 comment
#20 - Is ERNIE Bot (文心一言) not included?
Issue -
State: closed - Opened by p81sunshine about 1 year ago
- 1 comment
#19 - Clarify which "Claude" is benchmarked?
Issue -
State: open - Opened by jekbradbury about 1 year ago
- 1 comment
#18 - Can I test my own model on superclue?
Issue -
State: open - Opened by guozhiyao about 1 year ago
- 2 comments
#17 - When will the test dataset be made public?
Issue -
State: open - Opened by wangrui6 about 1 year ago
- 1 comment
#16 - Suggestion: fill in the missing human "professional ability" data
Issue -
State: open - Opened by Triang-jyed-driung about 1 year ago
- 1 comment
#15 - How were the human scores obtained?
Issue -
State: closed - Opened by So0ni about 1 year ago
- 3 comments
#14 - Confidence
Issue -
State: closed - Opened by littlepan0413 about 1 year ago
- 1 comment
#13 - Publish the evaluation set and evaluation criteria
Issue -
State: closed - Opened by plmsmile about 1 year ago
- 1 comment
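#12 - Starting the phone-benchmark-ranking routine now? GPT4 plays Apple, and the domestic large models play Huawei, Xiaomi, OPPO, and vivo (华米OV)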
Issue -
State: open - Opened by ZhuGeRoastedFish about 1 year ago
- 3 comments
#11 - Why are the evaluation results all integers?
Issue -
State: open - Opened by ltz0120 about 1 year ago
- 1 comment
#10 - The reference value of this evaluation
Issue -
State: closed - Opened by liuyajun52 about 1 year ago
- 2 comments
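#9 - As an evaluation leaderboard, consider following Chinese-LLaMA-Alpaca in providing a reasonable level of evaluation documentation and disclosure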
Issue -
State: open - Opened by shm007g about 1 year ago
- 1 comment
#8 - Objective and fair evaluation data is important
Issue -
State: open - Opened by shichengustc about 1 year ago
- 3 comments
#7 - How many questions are there for each individual ability?
Issue -
State: open - Opened by leonall about 1 year ago
- 2 comments
#6 - Does superCLUE include evaluation of toxicity, bias, and similar aspects?
Issue -
State: open - Opened by devinbai about 1 year ago
- 3 comments
#5 - How are generation and creative writing tested in multiple-choice form?
Issue -
State: closed - Opened by Howardqlz about 1 year ago
- 4 comments
#4 - How should we cite your work?
Issue -
State: closed - Opened by MikeGu721 about 1 year ago
- 1 comment
#3 - My personal impression after using them: the Spark (星火) large model really is not as good as ERNIE Bot (文心一言).
Issue -
State: open - Opened by MysteryMulberry about 1 year ago
- 8 comments
#2 - The group has passed 200 members; please add me to the group
Issue -
State: closed - Opened by dinglei8908 about 1 year ago
- 1 comment
#1 - Thanks to Xu Liang's team for their work. A few questions about the evaluation details
Issue -
State: open - Opened by lrs1353281004 about 1 year ago
- 5 comments