GitHub / meta-llama/llama-stack-evals issues and pull requests
#42 - feat: Add visualization dashboard for llama stack evals/func tests
Pull Request -
State: open - Opened by ttharuntej about 2 months ago
- 1 comment
Labels: cla signed
#40 - fix: sanitize some tool data when running evals
Pull Request -
State: open - Opened by jtau about 2 months ago
- 5 comments
Labels: cla signed
#39 - fix: drop responses from tool calls in open-ai calls
Pull Request -
State: closed - Opened by hardikjshah about 2 months ago
Labels: cla signed
#38 - Add quantization data to the model card
Issue -
State: open - Opened by reluctantfuturist about 2 months ago
#37 - docs: readme typo fix
Pull Request -
State: closed - Opened by ktdreyer 2 months ago
Labels: cla signed
#36 - chore(docs): fix readme instructions for run all benchmarks
Pull Request -
State: closed - Opened by reluctantfuturist 2 months ago
Labels: cla signed
#35 - Added GCP to Provider Conf
Pull Request -
State: closed - Opened by terencezhang1997 2 months ago
Labels: cla signed
#34 - Add facebookresearch/CRAG as a RAG benchmark
Issue -
State: open - Opened by jwm4 2 months ago
- 1 comment
#33 - feat(cli): add run-full-suite command to execute all tests and evals
Pull Request -
State: open - Opened by reluctantfuturist 2 months ago
Labels: cla signed
#32 - Fix flaky ifeval benchmark
Issue -
State: open - Opened by reluctantfuturist 2 months ago
#31 - Adding Openrouter support to LSE
Pull Request -
State: open - Opened by heyjustinai 2 months ago
Labels: cla signed
#30 - chore: Allow specifying --openai-compat-endpoint in run-benchmarks
Pull Request -
State: closed - Opened by raghotham 2 months ago
- 3 comments
Labels: cla signed
#29 - chore(docs): readme updates
Pull Request -
State: closed - Opened by reluctantfuturist 2 months ago
Labels: cla signed
#28 - fix: typo in integration test CI job
Pull Request -
State: closed - Opened by nathan-weinberg 2 months ago
Labels: cla signed
#27 - feat: add ramalama-stack to tested providers
Pull Request -
State: open - Opened by nathan-weinberg 2 months ago
Labels: cla signed
#26 - feat: Support Lambda inference API
Pull Request -
State: open - Opened by chuanli11 3 months ago
- 5 comments
Labels: cla signed
#25 - bfclv3-api sending invalid "responses" key in tool requests
Issue -
State: closed - Opened by bbrowning 3 months ago
#24 - bfclv3-api sending improper tools requests for input rows with `missing_functions`
Issue -
State: open - Opened by bbrowning 3 months ago
#23 - docs(readme): more context on evals
Pull Request -
State: closed - Opened by reluctantfuturist 3 months ago
Labels: cla signed
#22 - chore: add uv instructions
Pull Request -
State: closed - Opened by ashwinb 3 months ago
Labels: cla signed
#21 - docs: Update README.md
Pull Request -
State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed
#20 - chore: refactor provider configs slightly so it is a bit clearer
Pull Request -
State: closed - Opened by ashwinb 3 months ago
Labels: cla signed
#19 - feat(scripts): run all benchmarks
Pull Request -
State: closed - Opened by reluctantfuturist 3 months ago
- 1 comment
Labels: cla signed
#18 - fix: Update integration-tests.yml
Pull Request -
State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed
#17 - fix(readme): repo url fix
Pull Request -
State: closed - Opened by reluctantfuturist 3 months ago
Labels: cla signed
#16 - Implement Multilingual MMLU
Issue -
State: closed - Opened by reluctantfuturist 3 months ago
Labels: enhancement
#15 - Implement LiveCodeBench
Issue -
State: open - Opened by reluctantfuturist 3 months ago
Labels: enhancement
#14 - feat: Add benchmark descriptions
Pull Request -
State: closed - Opened by hardikjshah 3 months ago
- 1 comment
Labels: cla signed
#13 - docs(readme): update with new verified evals
Pull Request -
State: closed - Opened by reluctantfuturist 3 months ago
Labels: cla signed
#12 - feat: Bfcl v3 with API and Tools
Pull Request -
State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed
#11 - fix: apply extra_body on each test
Pull Request -
State: open - Opened by louisgv 3 months ago
Labels: cla signed
#10 - Latency/throughput test
Issue -
State: open - Opened by reluctantfuturist 3 months ago
#9 - Functional test for context window limit
Issue -
State: open - Opened by reluctantfuturist 3 months ago
#8 - chore: kill llama-stack references, we don't need them
Pull Request -
State: closed - Opened by ashwinb 3 months ago
Labels: cla signed
#7 - feat: add list-providers, update list-models and run-benchmarks
Pull Request -
State: closed - Opened by ashwinb 3 months ago
- 2 comments
Labels: cla signed
#6 - test: regex_parser_multiple_choice_grader
Pull Request -
State: closed - Opened by ehhuang 3 months ago
Labels: cla signed
#5 - fix: update BFCL implementation
Pull Request -
State: closed - Opened by hardikjshah 3 months ago
Labels: cla signed
#4 - Add CI for llama-stack-evals
Issue -
State: closed - Opened by reluctantfuturist 3 months ago
- 3 comments
#3 - test: add some integration tests
Pull Request -
State: closed - Opened by ehhuang 3 months ago
Labels: cla signed
#2 - chore: update python version
Pull Request -
State: closed - Opened by ehhuang 3 months ago
Labels: cla signed
#1 - chore: enable linting
Pull Request -
State: closed - Opened by ashwinb 3 months ago
Labels: cla signed