> [[Agent 開發知識庫]]
* Examples of agent evaluation metrics: Metrics for Evaluating AI Agents https://www.galileo.ai/blog/metrics-for-evaluating-ai-agents
* https://v2docs.galileo.ai/concepts/metrics/agentic/agentic-overview
- https://docs.smith.langchain.com/evaluation/tutorials/agents
- LangChain AgentEvals https://github.com/langchain-ai/agentevals
- https://x.com/LangChainAI/status/1906759734516228514 (2025/4/1)
- Evaluators for assessing agent trajectories
* AI agent testing framework https://github.com/plurai-ai/intellagent
* https://x.com/NirDiamantAI/status/1882080786024628279
* Arize course [[Arize Evaluating AI Agents]]
* https://www.deeplearning.ai/short-courses/evaluating-ai-agents/
* The AI Agent Evaluation Blueprint (2025/5/8)
* https://galileo.ai/blog/ai-agent-evaluation-blueprint-part-1
* How to evaluate AI agents with Braintrust (2025/6/11)
* https://www.youtube.com/watch?v=tKInkwOwk8M
* paper: Survey on Evaluation of LLM-based Agents
* https://arxiv.org/abs/2503.16416
* https://x.com/omarsar0/status/1939691782477902313 (2025/6/30)
* How to Setup Evals For Agents w/ Harrison Chase (2025/7/10)
- https://maven.com/p/a58f3f/how-to-setup-evals-for-agents
- https://claude.ai/public/artifacts/e6360a6e-1288-4d9d-83d1-d5b35c4a049d
* AI Agent Evaluation | Pratik Bhavsar, Galileo (2025/7/23)
* https://www.youtube.com/watch?v=c5wyHzPU4yE
* https://x.com/omarsar0/status/1947738722755027266
* Transcript: https://claude.ai/public/artifacts/40a13373-855f-4b8b-af17-e1b5d0b3a32e
* Anthropic: Writing effective tools for agents — with agents (2025/9/11)
* https://www.anthropic.com/engineering/writing-tools-for-agents
* LangChain & LangSmith write-ups (2026/1)
* https://blog.langchain.com/evaluating-deep-agents-our-learnings/
* https://blog.aihao.tw/2026/02/17/traces-new-source-of-truth/
* https://blog.aihao.tw/2026/02/17/langsmith-insights-agent-deep-dive/
* https://www.langchain.com/conceptual-guides/agent-observability-powers-agent-evaluation
* Summary: https://blog.aihao.tw/2026/02/17/traces-new-source-of-truth/
* Anthropic: Demystifying evals for AI agents (2026/1/9)
* https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents
* Summary: https://blog.aihao.tw/2026/02/17/demystifying-evals-for-ai-agents/