Parse judgments with structured output prompting, one response model, one judge model at a time. eb4ec23 justinxzhao committed on Oct 3, 2024
Add token usage tracking for OpenAI and fix token usage tracking for Anthropic. 1afb9ca justinxzhao committed on Oct 1, 2024
Factor out LLM chat rendering so that it persists even when the submit button isn't active. a0dca54 justinxzhao committed on Oct 1, 2024
Factor out judge results code so that it persists when the submit button is inactive. 279a804 justinxzhao committed on Oct 1, 2024
Added general rendering of chats so that they don't disappear during app saving. 6fae7e2 justinxzhao committed on Oct 1, 2024
Some refactoring, judging responses for direct assessment. 577870e justinxzhao committed on Sep 29, 2024