> [[Agent 開發知識庫]]

* Examples of agent evaluation metrics: Metrics for Evaluating AI Agents https://www.galileo.ai/blog/metrics-for-evaluating-ai-agents
  * https://v2docs.galileo.ai/concepts/metrics/agentic/agentic-overview
- https://docs.smith.langchain.com/evaluation/tutorials/agents
- LangChain AgentEvals https://github.com/langchain-ai/agentevals
  - https://x.com/LangChainAI/status/1906759734516228514 (2025/4/1)
  - Evaluators for assessing agent trajectories
* AI agent testing framework https://github.com/plurai-ai/intellagent
  * https://x.com/NirDiamantAI/status/1882080786024628279
* Arize course [[Arize Evaluating AI Agents]]
  * https://www.deeplearning.ai/short-courses/evaluating-ai-agents/
* The AI Agent Evaluation Blueprint (2025/5/8)
  * https://galileo.ai/blog/ai-agent-evaluation-blueprint-part-1
* How to evaluate AI agents with Braintrust (2025/6/11)
  * https://www.youtube.com/watch?v=tKInkwOwk8M
* Paper: Survey on Evaluation of LLM-based Agents
  * https://arxiv.org/abs/2503.16416
  * https://x.com/omarsar0/status/1939691782477902313 (2025/6/30)
* How to Setup Evals For Agents w/ Harrison Chase (2025/7/10)
  - https://maven.com/p/a58f3f/how-to-setup-evals-for-agents
  - https://claude.ai/public/artifacts/e6360a6e-1288-4d9d-83d1-d5b35c4a049d
* AI Agent Evaluation | Pratik Bhavsar, Galileo (2025/7/23)
  * https://www.youtube.com/watch?v=c5wyHzPU4yE
  * https://x.com/omarsar0/status/1947738722755027266
  * Transcript: https://claude.ai/public/artifacts/40a13373-855f-4b8b-af17-e1b5d0b3a32e
* Anthropic: Writing effective tools for agents — with agents (2025/9/11)
  * https://www.anthropic.com/engineering/writing-tools-for-agents
* LangChain & LangSmith write-ups (2026/1)
  * https://blog.langchain.com/evaluating-deep-agents-our-learnings/
  * https://blog.aihao.tw/2026/02/17/traces-new-source-of-truth/
  * https://blog.aihao.tw/2026/02/17/langsmith-insights-agent-deep-dive/
  * https://www.langchain.com/conceptual-guides/agent-observability-powers-agent-evaluation
  * Summary: https://blog.aihao.tw/2026/02/17/traces-new-source-of-truth/
* Anthropic: Demystifying evals for AI agents (2026/1/9)
  * https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents
  * Summary: https://blog.aihao.tw/2026/02/17/demystifying-evals-for-ai-agents/