GitHub / anaconda/llm-eval issues and pull requests
#92 - Refactor sik llms
Pull Request -
State: closed - Opened by shane-kercheval about 2 months ago
#91 - Updated license to BSD-3
Pull Request -
State: closed - Opened by shane-kercheval about 2 months ago
#90 - [Snyk] Fix for 3 vulnerabilities
Pull Request -
State: open - Opened by anaconda-security-bot about 2 months ago
#89 - Added gpt 4.1 models and pricing to openai.py
Pull Request -
State: closed - Opened by shane-kercheval 4 months ago
#88 - Added LLMCheck, DelayedSemaphore, and Retry/Delay for errors during evaluation
Pull Request -
State: closed - Opened by shane-kercheval 4 months ago
#87 - Added support for AWS Bedrock & Claude
Pull Request -
State: closed - Opened by shane-kercheval 6 months ago
#86 - Update dependencies to fix env conflicts
Pull Request -
State: closed - Opened by m-navarro93 8 months ago
#85 - updated linting based on latest version of ruff
Pull Request -
State: closed - Opened by shane-kercheval 8 months ago
#84 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval 8 months ago
#83 - Updated examples and docstrings based on the prior refactor
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#82 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#81 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#80 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#79 - Eval refactor
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#78 - Kur 215/nvidia f1
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#77 - added additional metrics (f1 score, precision/recall, max f1 from list of possible answers); added example with nvidia rag datasets
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#76 - [Snyk] Fix for 4 vulnerabilities
Pull Request -
State: closed - Opened by nwankwon 9 months ago
#75 - Kur 215/nvidia f1
Pull Request -
State: closed - Opened by shane-kercheval 9 months ago
#74 - [Snyk] Fix for 4 vulnerabilities
Pull Request -
State: closed - Opened by rmyers 9 months ago
#73 - [Snyk] Fix for 4 vulnerabilities
Pull Request -
State: closed - Opened by elize-anaconda 9 months ago
#72 - Add Anthropic
Pull Request -
State: closed - Opened by ahuang11 9 months ago
- 1 comment
#71 - Feature/kur 215
Pull Request -
State: closed - Opened by Prasidh 9 months ago
#70 - [Snyk] Security upgrade python from 3.11 to 3.14-rc-slim
Pull Request -
State: closed - Opened by nwankwon 9 months ago
#69 - Feature/kur 215/Eval dataset generation + metric calculation
Pull Request -
State: closed - Opened by Prasidh 9 months ago
#68 - Update test_mistralai.py
Pull Request -
State: closed - Opened by ahuang11 10 months ago
#67 - Support MistralAI (service) calls
Pull Request -
State: closed - Opened by ahuang11 10 months ago
#66 - [Snyk] Security upgrade python from 3.11 to 3.13.0rc1-slim
Pull Request -
State: closed - Opened by nwankwon 10 months ago
- 1 comment
#65 - Fix not string types with allow_regex
Pull Request -
State: closed - Opened by ahuang11 11 months ago
#64 - gpt-4o-2024-08-06 openai pricing
Pull Request -
State: closed - Opened by shane-kercheval 11 months ago
#63 - Prompt refactor
Pull Request -
State: closed - Opened by shane-kercheval 12 months ago
#62 - A util to intercept messages and do evals statically
Issue -
State: open - Opened by ahuang11 12 months ago
- 2 comments
#61 - Do not require an assistant/user pair for previous_messages
Issue -
State: closed - Opened by ahuang11 12 months ago
- 1 comment
#60 - fixed async issue where we were not handling creating/joining the event loop
Pull Request -
State: closed - Opened by shane-kercheval 12 months ago
#59 - updated EvalHarness so that it works with async candidates
Pull Request -
State: closed - Opened by shane-kercheval 12 months ago
#58 - Update Installation Instructions
Pull Request -
State: closed - Opened by m-navarro93 about 1 year ago
#57 - updated pydantic from 2.5 to 2.8
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#56 - added `gpt-4o-mini-2024-07-18` model and pricing
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#55 - [Snyk] Security upgrade python from 3.11 to 3.13.0b2-slim
Pull Request -
State: closed - Opened by nwankwon about 1 year ago
#54 - Example evaluations supported on README
Pull Request -
State: closed - Opened by sadasant about 1 year ago
#53 - Update requirements.txt
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#52 - Fixed linting
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#51 - Fix issue #49 - Readme feedback
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#50 - Update documentation to explain and show yaml options for Candidates and Evals
Issue -
State: open - Opened by shane-kercheval about 1 year ago
#49 - Readme feedback
Issue -
State: closed - Opened by kathatherine about 1 year ago
- 1 comment
#48 - Fixed dependency changes after previous PR
Pull Request -
State: closed - Opened by sadasant about 1 year ago
#47 - Updated Checks so that if value_extractor fails, the Check fails (but doesn't cause an exception) and the error will be captured in the Check metadata
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#46 - Updating OpenAI and Tiktoken dependencies
Pull Request -
State: closed - Opened by sadasant about 1 year ago
#45 - Updated value_extractor in Check object to convert numeric strings to integers to support list indexing
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#44 - Updated response for LambdaCheck to include check_type and check_metadata in response metadata
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#43 - added LambdaCheck for dynamic testing
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#42 - chore: Add OpenAIServerCandidate
Pull Request -
State: closed - Opened by creeves-anaconda about 1 year ago
#41 - added value_extractor to Check
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#40 - updated examples
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#39 - fix EvalHarness so that it doesn't try to use multiprocessing for a single candidate (nothing to parallelize with current implementation)
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#38 - PythonCodeBlockTests check now allows users to pass env_namespace where they can define variables/functions that are used to run the llm-generated code and corresponding tests
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#37 - allow callable checks and callable candidates when running Eval/EvalHarness; refactored Check/Eval so that all checks are passed a RequestData object
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#36 - added negate option to MatchCheck, ContainsCheck, and RegexCheck
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#35 - added support for non-registered checks/candidates and for non-string evals inputs/outputs
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#34 - added gpt-4o support and cost per tokens; added unit test to test all models that appear in the cost dictionary
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#33 - Fix f-string
Pull Request -
State: closed - Opened by ahuang11 about 1 year ago
#32 - Support tool calls
Pull Request -
State: closed - Opened by ahuang11 about 1 year ago
- 7 comments
#31 - Documentation
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#30 - Documentation
Pull Request -
State: closed - Opened by shane-kercheval about 1 year ago
#29 - Setup Project for Packaging
Pull Request -
State: closed - Opened by m-navarro93 over 1 year ago
- 1 comment
#28 - Multi-Evals (create multiple evals from combinations of system messages, previous messages, and prompts)
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#27 - added error_callback to EvalHarness; updated OpenAI models and pricing
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#26 - now ensuring that values passed into Eval are converted to strings
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#25 - bug fixes and linting
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#24 - Idea for possibly fixing context limit reached error
Pull Request -
State: closed - Opened by sadasant over 1 year ago
#23 - Fix to eval-specific system_message and previous_messages
Pull Request -
State: closed - Opened by sadasant over 1 year ago
#22 - Add an llm-eval version to the yaml specifications
Issue -
State: open - Opened by sadasant over 1 year ago
#21 - changes to support setting system_message and previous_messages in eval
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#20 - fixed hugging face issue and updated message formatting logic
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#19 - added helper functions to get code block/test results on evalresult
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#18 - support for num_samples in EvalHarness (i.e. running evals `num_samples` number of times)
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#17 - Initial soft metrics via LLMCheck
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#16 - When code tests have bad syntax, the error does not surface
Issue -
State: open - Opened by sadasant over 1 year ago
#15 - Allowing evals to modify system parameters
Pull Request -
State: closed - Opened by sadasant over 1 year ago
#14 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#13 - added EvalResult to_json, from_json
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#12 - added code_block_timeout and code_test_timeout to checks.PythonCodeBlockTests
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#11 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#10 - various renaming for consistency
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#9 - added timestamp to EvalResult
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#8 - added property to EvalResult to return the code-block-run CheckResult
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#7 - added filter functions for list of EvalResults
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#6 - code block errors now return dictionary of error name and message rather than actual exception (for serialization)
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#5 - Dev shane
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#4 - for PythonCodeBlocksRun, added the custom tests to the result/metadata; renamed `code_block_checks` to `code_tests`
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#3 - update test_openai.py to adjust prompt to fix randomly failing test
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#2 - Updated default openai model to latest 3.5 version; updated readme with slight changes
Pull Request -
State: closed - Opened by shane-kercheval over 1 year ago
#1 - chore: Configure Renovate
Pull Request -
State: closed - Opened by anaconda-renovate[bot] over 1 year ago
- 1 comment
Labels: renovate