* Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences
* https://arxiv.org/abs/2404.12272
* https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/
* Uses human-corrected labels as few-shot examples in the evaluation prompt to align the LLM judge (see the sketch after this list)
* https://x.com/eugeneyan/status/1817664535186317341
- The last two episodes of [[LangSmith Evaluations 影片系列]]
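* A minimal sketch of the few-shot alignment idea above, assuming the OpenAI Python SDK; the record names (`human_corrections`), prompt wording, and model choice are illustrative assumptions, not the blog post's or LangSmith's actual code.

```python
# Sketch (not LangChain's implementation): align an LLM judge by folding
# human-corrected verdicts back into the judge prompt as few-shot examples.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical human-corrected records: cases where a human reviewer
# overrode the judge's earlier verdict, kept with the corrected label.
human_corrections = [
    {"input": "Summarize the refund policy.",
     "output": "Refunds are issued within 30 days of purchase.",
     "label": "PASS",
     "reason": "Faithful to the source document."},
    {"input": "Summarize the refund policy.",
     "output": "All purchases are final.",
     "label": "FAIL",
     "reason": "Contradicts the source document."},
]

def build_judge_prompt(question: str, answer: str) -> str:
    """Insert human-corrected labels into the judge prompt as few-shot examples."""
    shots = "\n\n".join(
        f"Input: {ex['input']}\nOutput: {ex['output']}\n"
        f"Verdict: {ex['label']} ({ex['reason']})"
        for ex in human_corrections
    )
    return (
        "You grade LLM outputs as PASS or FAIL.\n"
        "Here are examples graded by human reviewers:\n\n"
        f"{shots}\n\n"
        f"Input: {question}\nOutput: {answer}\nVerdict:"
    )

def judge(question: str, answer: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption
        messages=[{"role": "user",
                   "content": build_judge_prompt(question, answer)}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

if __name__ == "__main__":
    print(judge("Summarize the refund policy.",
                "Refunds take 30 days and require a receipt."))
```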
* https://eugeneyan.com/writing/aligneval/
* A semi-automated evaluation app
* https://x.com/eugeneyan/status/1851654159692616020 (2024/10/30)