anaconda/llm-eval issues and pull requests

#92 - Refactor sik llms

Pull Request - State: closed - Opened by shane-kercheval about 2 months ago

#91 - Updated license to BSD-3

Pull Request - State: closed - Opened by shane-kercheval about 2 months ago

#90 - [Snyk] Fix for 3 vulnerabilities

Pull Request - State: open - Opened by anaconda-security-bot about 2 months ago

#89 - Added gpt 4.1 models and pricing to openai.py

Pull Request - State: closed - Opened by shane-kercheval 4 months ago

#88 - Added LLMCheck, DelayedSemaphore, and Retry/Delay for errors during evaluation

Pull Request - State: closed - Opened by shane-kercheval 4 months ago

#87 - Added support for AWS Bedrock & Claude

Pull Request - State: closed - Opened by shane-kercheval 6 months ago

#86 - Update dependencies to fix env conflicts

Pull Request - State: closed - Opened by m-navarro93 8 months ago

#85 - updated linting based on latest version of ruff

Pull Request - State: closed - Opened by shane-kercheval 8 months ago

#84 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval 8 months ago

#83 - Updated examples and docstrings based on the prior refactor

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#82 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#81 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#80 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#79 - Eval refactor

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#78 - Kur 215/nvidia f1

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#77 - added additional metrics (f1 score, precision/recall, max f1 from list of possible answers); added example with nvidia rag datasets

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#76 - [Snyk] Fix for 4 vulnerabilities

Pull Request - State: closed - Opened by nwankwon 9 months ago

#75 - Kur 215/nvidia f1

Pull Request - State: closed - Opened by shane-kercheval 9 months ago

#74 - [Snyk] Fix for 4 vulnerabilities

Pull Request - State: closed - Opened by rmyers 9 months ago

#73 - [Snyk] Fix for 4 vulnerabilities

Pull Request - State: closed - Opened by elize-anaconda 9 months ago

#72 - Add Anthropic

Pull Request - State: closed - Opened by ahuang11 9 months ago - 1 comment

#71 - Feature/kur 215

Pull Request - State: closed - Opened by Prasidh 9 months ago

#70 - [Snyk] Security upgrade python from 3.11 to 3.14-rc-slim

Pull Request - State: closed - Opened by nwankwon 9 months ago

#69 - Feature/kur 215/Eval dataset generation + metric calculation

Pull Request - State: closed - Opened by Prasidh 9 months ago

#68 - Update test_mistralai.py

Pull Request - State: closed - Opened by ahuang11 10 months ago

#67 - Support MistralAI (service) calls

Pull Request - State: closed - Opened by ahuang11 10 months ago

#66 - [Snyk] Security upgrade python from 3.11 to 3.13.0rc1-slim

Pull Request - State: closed - Opened by nwankwon 10 months ago - 1 comment

#65 - Fix not string types with allow_regex

Pull Request - State: closed - Opened by ahuang11 11 months ago

#64 - gpt-4o-2024-08-06 openai pricing

Pull Request - State: closed - Opened by shane-kercheval 11 months ago

#63 - Prompt refactor

Pull Request - State: closed - Opened by shane-kercheval 12 months ago

#62 - A util to intercept messages and do evals statically

Issue - State: open - Opened by ahuang11 12 months ago - 2 comments

#61 - Do not require an assistant/user pair for previous_messages

Issue - State: closed - Opened by ahuang11 12 months ago - 1 comment

#60 - fixed async issue where we were not handling creating/joining the event loop

Pull Request - State: closed - Opened by shane-kercheval 12 months ago

#59 - updated EvalHarness so that it works with async candidates

Pull Request - State: closed - Opened by shane-kercheval 12 months ago

#58 - Update Installation Instructions

Pull Request - State: closed - Opened by m-navarro93 about 1 year ago

#57 - updated pydantic from 2.5 to 2.8

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#56 - added `gpt-4o-mini-2024-07-18` model and pricing

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#55 - [Snyk] Security upgrade python from 3.11 to 3.13.0b2-slim

Pull Request - State: closed - Opened by nwankwon about 1 year ago

#54 - Example evaluations supported on README

Pull Request - State: closed - Opened by sadasant about 1 year ago

#53 - Update requirements.txt

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#52 - Fixed linting

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#51 - Fix issue #49 - Readme feedback

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#50 - Update documentation to explain and show yaml options for Candidates and Evals

Issue - State: open - Opened by shane-kercheval about 1 year ago

#49 - Readme feedback

Issue - State: closed - Opened by kathatherine about 1 year ago - 1 comment

#48 - Fixed dependency changes after previous PR

Pull Request - State: closed - Opened by sadasant about 1 year ago

#47 - Updated Checks so that if value_extractor fails, the Check fails (but doesn't cause an exception) and the error will be captured in the Check metadata

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#46 - Updating OpenAI and Tiktoken dependencies

Pull Request - State: closed - Opened by sadasant about 1 year ago

#45 - Updated value_extractor in Check object to convert numeric strings to integers to support list indexing

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#44 - Updated response for LambdaCheck to include check_type and check_metadata in response metadata

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#43 - added LambdaCheck for dynamic testing

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#42 - chore: Add OpenAIServerCandidate

Pull Request - State: closed - Opened by creeves-anaconda about 1 year ago

#41 - added value_extractor to Check

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#40 - updated examples

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#39 - fix EvalHarness so that it doesn't try to use multiprocessing for a single candidate (nothing to parallelize with current implementation)

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#38 - PythonCodeBlockTests check now allows users to pass env_namespace where they can define variables/functions that are used to run the llm-generated code and corresponding tests

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#37 - allow callable checks and callable candidates when running Eval/EvalHarness; refactored Check/Eval so that all checks are passed a RequestData object

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#36 - added negate option to MatchCheck, ContainsCheck, and RegexCheck

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#35 - added support for non-registered checks/candidates and for non-string evals inputs/outputs

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#34 - added gpt-4o support and cost per tokens; added unit test to test all models that appear in the cost dictionary

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#33 - Fix f-string

Pull Request - State: closed - Opened by ahuang11 about 1 year ago

#32 - Support tool calls

Pull Request - State: closed - Opened by ahuang11 about 1 year ago - 7 comments

#31 - Documentation

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#30 - Documentation

Pull Request - State: closed - Opened by shane-kercheval about 1 year ago

#29 - Setup Project for Packaging

Pull Request - State: closed - Opened by m-navarro93 over 1 year ago - 1 comment

#28 - Multi-Evals (create multiple evals from combinations of system messages, previous messages, and prompts)

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#27 - added error_callback to EvalHarness; updated OpenAI models and pricing

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#26 - now ensuring that values passed into Eval are converted to strings

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#25 - bug fixes and linting

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#24 - Idea for possibly fixing context limit reached error

Pull Request - State: closed - Opened by sadasant over 1 year ago

#23 - Fix to eval-specific system_message and previous_messages

Pull Request - State: closed - Opened by sadasant over 1 year ago

#22 - Add an llm-eval version to the yaml specifications

Issue - State: open - Opened by sadasant over 1 year ago

#21 - changes to support setting system_message and previous_messages in eval

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#20 - fixed hugging face issue and updated message formatting logic

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#19 - added helper functions to get code block/test results on evalresult

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#18 - support for num_samples in EvalHarness (i.e. running evals `num_samples` number of times)

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#17 - Initial soft metrics via LLMCheck

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#16 - When code tests have bad syntax, the error does not surface

Issue - State: open - Opened by sadasant over 1 year ago

#15 - Allowing evals to modify system parameters

Pull Request - State: closed - Opened by sadasant over 1 year ago

#14 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#13 - added EvalResult to_json, from_json

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#12 - added code_block_timeout and code_test_timeout to checks.PythonCodeBlockTests

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#11 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#10 - various renaming for consistency

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#9 - added timestamp to EvalResult

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#8 - added property to EvalResult to return the code-block-run CheckResult

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#7 - added filter functions for list of EvalResults

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#6 - code block errors now return dictionary of error name and message rather than actual exception (for serialization)

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#5 - Dev shane

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#4 - for PythonCodeBlocksRun, added the custom tests to the result/metadata; renamed `code_block_checks` to `code_tests`

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#3 - update test_openai.py to adjust prompt to fix randomly failing test

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#2 - Updated default openai model to latest 3.5 version; updated readme with slight changes

Pull Request - State: closed - Opened by shane-kercheval over 1 year ago

#1 - chore: Configure Renovate

Pull Request - State: closed - Opened by anaconda-renovate[bot] over 1 year ago - 1 comment
Labels: renovate
