* Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences
* https://arxiv.org/abs/2404.12272
* https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/
* Use human-corrected data as few-shot examples in the evaluation prompt to align the judge
* https://x.com/eugeneyan/status/1817664535186317341 - [[LangSmith Evaluations 影片系列]] (video series), last two episodes
* https://eugeneyan.com/writing/aligneval/
* A semi-automated evaluation app
* https://x.com/eugeneyan/status/1851654159692616020 (2024/10/30)
* https://aligneval.com/
* Introducing Align Evals: Streamlining LLM Application Evaluation (2025/7/29)
* https://blog.langchain.com/introducing-align-evals/
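
A rough sketch of the few-shot alignment idea noted above: human-corrected judgments are placed into the evaluator prompt as examples, so the LLM judge imitates the human labels. This is not the LangSmith or AlignEval implementation. `Correction`, `build_judge_prompt`, and `agreement_rate` are hypothetical names, the 0/1 scoring scheme is an assumption, and the call to an actual LLM is left out.

```python
from dataclasses import dataclass


@dataclass
class Correction:
    """One human-reviewed judgment (hypothetical structure, not a library type)."""
    output: str       # the LLM output that was graded
    judge_score: int  # what the LLM judge originally said (assumed 0 or 1)
    human_score: int  # what the human reviewer said (assumed 0 or 1)
    reason: str       # the reviewer's short explanation


def agreement_rate(corrections):
    """Fraction of examples where the judge matched the human label."""
    if not corrections:
        return 0.0
    hits = sum(c.judge_score == c.human_score for c in corrections)
    return hits / len(corrections)


def build_judge_prompt(criteria, corrections, candidate, max_examples=5):
    """Build an evaluator prompt with human-corrected examples as few-shot demos."""
    # Disagreements sort first (False < True); they carry the most
    # alignment signal. sorted() is stable, so ties keep their order.
    ranked = sorted(corrections, key=lambda c: c.judge_score == c.human_score)
    lines = [
        f"Grade the output against this criterion: {criteria}",
        "Answer with a single digit: 1 (pass) or 0 (fail).",
        "",
    ]
    for c in ranked[:max_examples]:
        # Always show the human label, never the judge's original score.
        lines += [
            f"Output: {c.output}",
            f"Score: {c.human_score}",
            f"Reason: {c.reason}",
            "",
        ]
    lines += [f"Output: {candidate}", "Score:"]
    return "\n".join(lines)


corrections = [
    Correction("Paris is the capital of France.", 1, 1, "Correct and concise."),
    Correction("I think it might be Lyon?", 1, 0, "Wrong answer, hedged."),
]
prompt = build_judge_prompt(
    "Factually correct answer", corrections, "The capital is Paris."
)
print(prompt)
print(f"judge/human agreement: {agreement_rate(corrections):.0%}")
```

Tracking `agreement_rate` before and after adding the examples gives a simple check on whether the judge is actually moving toward the human labels.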