An open API service that provides issue and pull request metadata for open source projects.

GitHub / meta-llama/llama-stack-evals issues and pull requests

#42 - feat: Add visualization dashboard for llama stack evals/func tests

Pull Request - State: open - Opened by ttharuntej about 2 months ago - 1 comment
Labels: cla signed

#40 - fix: sanitize some tool data when running evals

Pull Request - State: open - Opened by jtau about 2 months ago - 5 comments
Labels: cla signed

#39 - fix: drop responses from tool calls in open-ai calls

Pull Request - State: closed - Opened by hardikjshah about 2 months ago
Labels: cla signed

#38 - Add quantization data to the model card

Issue - State: open - Opened by reluctantfuturist about 2 months ago

#37 - docs: readme typo fix

Pull Request - State: closed - Opened by ktdreyer 2 months ago
Labels: cla signed

#36 - chore(docs): fix readme instructions for run all benchmarks

Pull Request - State: closed - Opened by reluctantfuturist 2 months ago
Labels: cla signed

#35 - Added GCP to Provider Conf

Pull Request - State: closed - Opened by terencezhang1997 2 months ago
Labels: cla signed

#34 - Add facebookresearch/CRAG as a RAG benchmark

Issue - State: open - Opened by jwm4 2 months ago - 1 comment

#33 - feat(cli): add run-full-suite command to execute all tests and evals

Pull Request - State: open - Opened by reluctantfuturist 2 months ago
Labels: cla signed

#32 - Fix flaky ifeval benchmark

Issue - State: open - Opened by reluctantfuturist 2 months ago

#31 - Adding Openrouter support to LSE

Pull Request - State: open - Opened by heyjustinai 2 months ago
Labels: cla signed

#30 - chore: Allow specifying --openai-compat-endpoint in run-benchmarks

Pull Request - State: closed - Opened by raghotham 2 months ago - 3 comments
Labels: cla signed

#29 - chore(docs): readme updates

Pull Request - State: closed - Opened by reluctantfuturist 2 months ago
Labels: cla signed

#28 - fix: typo in integration test CI job

Pull Request - State: closed - Opened by nathan-weinberg 2 months ago
Labels: cla signed

#27 - feat: add ramalama-stack to tested providers

Pull Request - State: open - Opened by nathan-weinberg 2 months ago
Labels: cla signed

#26 - feat: Support Lambda inference API

Pull Request - State: open - Opened by chuanli11 3 months ago - 5 comments
Labels: cla signed

#23 - docs(readme): more context on evals

Pull Request - State: closed - Opened by reluctantfuturist 3 months ago
Labels: cla signed

#22 - chore: add uv instructions

Pull Request - State: closed - Opened by ashwinb 3 months ago
Labels: cla signed

#21 - docs: Update README.md

Pull Request - State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed

#20 - chore: refactor provider configs slightly so it is a bit clearer

Pull Request - State: closed - Opened by ashwinb 3 months ago
Labels: cla signed

#19 - feat(scripts): run all benchmarks

Pull Request - State: closed - Opened by reluctantfuturist 3 months ago - 1 comment
Labels: cla signed

#18 - fix: Update integration-tests.yml

Pull Request - State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed

#17 - fix(readme): repo url fix

Pull Request - State: closed - Opened by reluctantfuturist 3 months ago
Labels: cla signed

#16 - Implement Multilingual MMLU

Issue - State: closed - Opened by reluctantfuturist 3 months ago
Labels: enhancement

#15 - Implement LiveCodeBench

Issue - State: open - Opened by reluctantfuturist 3 months ago
Labels: enhancement

#14 - feat: Add benchmark descriptions

Pull Request - State: closed - Opened by hardikjshah 3 months ago - 1 comment
Labels: cla signed

#13 - docs(readme): update with new verified evals

Pull Request - State: closed - Opened by reluctantfuturist 3 months ago
Labels: cla signed

#12 - feat: Bfcl v3 with API and Tools

Pull Request - State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed

#11 - fix: apply extra_body on each test

Pull Request - State: open - Opened by louisgv 3 months ago
Labels: cla signed

#10 - Latency/throughput test

Issue - State: open - Opened by reluctantfuturist 3 months ago

#9 - Functional test for context window limit

Issue - State: open - Opened by reluctantfuturist 3 months ago

#8 - chore: kill llama-stack references we dont need them

Pull Request - State: closed - Opened by ashwinb 3 months ago
Labels: cla signed

#7 - feat: add list-providers, update list-models and run-benchmarks

Pull Request - State: closed - Opened by ashwinb 3 months ago - 2 comments
Labels: cla signed

#6 - test: regex_parser_multiple_choice_grader

Pull Request - State: closed - Opened by ehhuang 3 months ago
Labels: cla signed

#5 - fix: update BFCL implementation

Pull Request - State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed

#4 - Add CI for llama-stack-evals

Issue - State: closed - Opened by reluctantfuturist 3 months ago - 3 comments

#3 - test: add some integration tests

Pull Request - State: closed - Opened by ehhuang 3 months ago
Labels: cla signed

#2 - chore: update python version

Pull Request - State: closed - Opened by ehhuang 3 months ago
Labels: cla signed

#1 - chore: enable linting

Pull Request - State: closed - Opened by ashwinb 3 months ago
Labels: cla signed