* Twitter: Why Large Language Models Hallucinate and How to Reduce it (2023/9/8) https://twitter.com/bindureddy/status/1699975747786641855
* Fixing Hallucinations in LLMs https://medium.com/better-programming/fixing-hallucinations-in-llms-9ff0fd438e33 (2023/8)
* Why large language models hallucinate and how to reduce it (tweets) https://twitter.com/abacusai/status/1701249387727376871 (2023/9/11)
* Accessible explainer and analysis https://www.tidio.com/blog/ai-hallucinations/ (2023/9)
* Accessible introduction to why hallucinations happen https://queue.acm.org/detail.cfm?id=3688007 (2024/9/9)
* paper: https://twitter.com/omarsar0/status/1722985251129966705
* Evaluation issues https://twitter.com/DrJimFan/status/1724464105371939301
* Hallucination is not a bug, it is LLM's greatest feature. https://twitter.com/karpathy/status/1733299213503787018
	* https://x.com/lateinteraction/status/1733300626921251249
* Mitigating Hallucination in LLMs paper
	* https://arxiv.org/abs/2401.01313
	* https://twitter.com/omarsar0/status/1742633831234994189
* Mitigation Techniques Against Hallucinations - The Biggest Issue With Real World LLM Usage
	* https://twitter.com/bindureddy/status/1742753571744231599
* Extrinsic Hallucinations in LLMs (2024/7/7)
	* https://lilianweng.github.io/posts/2024-07-07-hallucination/
	* Why hallucinations happen
	* Hallucination detection
	* Anti-hallucination methods
* https://arxiv.org/abs/2311.05232
	* Includes a taxonomy of hallucinations
* AI Hallucinations: Why Large Language Models Make Things Up (And How to Fix It)
	* https://www.kapa.ai/blog/ai-hallucination (2024/11/25)
* Leaderboard https://github.com//hallucination-leaderboard
	* https://www.xiaohu.ai/c/xiaohu-ai/hallucination-leaderboard
* paper: A comprehensive taxonomy of hallucinations in Large Language Models (2025/8/3)
	* https://arxiv.org/abs/2508.01781
	* https://x.com/omarsar0/status/1952731083465994347 (2025/8/5) ![[video.mp4]]
* How You Catch Production Hallucinations in Real Time (2025/8)
	* https://maven.com/p/285276/how-you-catch-production-hallucinations-in-real-time
	* https://blog.quotientai.co/how-to-detect-hallucinations-in-retrieval-augmented-systems-a-primer/
	* Key insight: 80-90% of hallucinations in production today are extrinsic! It is not that the model isn't smart enough, but that it fails to correctly understand or use the provided context - which means reference-free evaluation is possible
* OpenAI: Why language models hallucinate (2025/9/5)
	* https://openai.com/index/why-language-models-hallucinate/
	* https://galileo.ai/blog/why-language-models-hallucinate

## Detection methods

* SelfCheckGPT detection method
	* Sample multiple outputs for the same prompt, then check whether the outputs are consistent or contradict each other (see the sketch after this list)
	* https://towardsdatascience.com/real-time-llm-hallucination-detection-9a68bb292698
* Cross-Examination method
	* https://arxiv.org/abs/2305.13281
	* Seen at https://docs.parea.ai/blog/eval-metrics-for-llm-apps-in-prod#cross-examination-for-hallucination-detection
* ASPIRE https://blog.research.google/2024/01/introducing-aspire-for-selective.html
* Benchmarking Hallucination Detection Methods in RAG (2024/9/30)
	* Comparison of detection methods
	* https://cleanlab.ai/blog/rag-tlm-hallucination-benchmarking/
	* https://cleanlab.ai/blog/prevent-hallucinated-responses/
* awesome-hallucination-detection
	* https://github.com/EdinburghNLP/awesome-hallucination-detection
* https://github.com/cvs-health/uqlm
* https://github.com/KRLabsOrg/LettuceDetect
	* https://x.com/karminski3/status/1913753480919175589 (2025/4/20)
* https://www.quotientai.co/
	* A SaaS hallucination-detection vendor
* papers from [[Generative AI Design Patterns]]
	* https://arxiv.org/abs/2303.08896
	* https://arxiv.org/abs/2405.19648v1
	* https://arxiv.org/abs/2407.21424v1
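A minimal sketch of the SelfCheckGPT idea described above, not the paper's implementation: sample several answers to the same prompt and flag sentences in the main answer that the samples fail to support. The `generate` callable is an assumed stand-in for your LLM API, and the token-overlap scorer is a crude placeholder for the stronger consistency scorers the method actually uses (NLI models, BERTScore, or an LLM judge).

```python
from typing import Callable, List, Tuple


def support_score(sentence: str, sample: str) -> float:
    """Crude consistency proxy: fraction of sentence tokens present in sample."""
    tokens = set(sentence.lower().split())
    if not tokens:
        return 1.0
    return len(tokens & set(sample.lower().split())) / len(tokens)


def selfcheck(
    prompt: str,
    generate: Callable[[str, float], str],  # assumed: (prompt, temperature) -> text
    n_samples: int = 5,
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return sentences of the main answer whose hallucination score exceeds threshold.

    Intuition: independently sampled answers rarely agree on fabricated
    details, so low average support across samples signals a likely
    hallucination.
    """
    main_answer = generate(prompt, 0.0)  # low-temperature "main" answer
    samples = [generate(prompt, 1.0) for _ in range(n_samples)]  # diverse samples
    flagged = []
    for sentence in main_answer.split(". "):
        avg_support = sum(support_score(sentence, s) for s in samples) / n_samples
        score = 1.0 - avg_support  # high score = inconsistent with the samples
        if score > threshold:
            flagged.append((sentence, score))
    return flagged
```

The appeal of this family of methods is that it needs no external reference: the model's own sampling variance is the signal, which is why it works as a black-box, reference-free check.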
## Multimodal

* Hallucination of Multimodal Large Language Models: A Survey https://arxiv.org/abs/2404.18930 (2024/4)

## OpenAI news

https://fortune.com/2023/08/01/can-ai-chatgpt-hallucinations-be-fixed-experts-doubt-altman-openai/

OpenAI quietly shut down their detector for ChatGPT-generated content: https://techcrunch.com/2023/07/25/openai-scuttles-ai-written-text-detector-over-low-rate-of-accuracy/

Teach your LLM to always answer with facts instead of fiction (from AIGC Weekly #32): vector databases that support a structured query language can store many kinds of data, improving the accuracy and efficiency of vector search queries. Hallucination is the phenomenon of LLMs being inaccurate on unfamiliar topics; adding facts and external knowledge reduces it. Using vector SQL enables fine-grained vector search and improves the performance of an LLM system (a sketch of this pattern appears at the end of this note). https://blog.myscale.com/2023/07/17/teach-your-llm-vector-sql/

## "Catching up on the weird world of LLMs" - Simon Willison (North Bay Python 2023)

Figure out what kinds of things cause hallucinations, then avoid them.
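A hedged sketch of the vector-SQL grounding pattern from the MyScale note above: run a SQL vector search that mixes structured filters with embedding similarity, then let the LLM answer only from the retrieved facts. The table and column names, the `embed()` helper, and the `distance()` function are illustrative assumptions in a MyScale-style dialect; check your own database's docs for the actual syntax.

```python
from typing import List


def embed(text: str) -> List[float]:
    """Hypothetical embedding call; replace with your embedding model."""
    raise NotImplementedError


def build_vector_sql(question: str, topic: str, k: int = 5) -> str:
    """Combine a structured WHERE filter with vector-distance ranking."""
    query_vec = embed(question)
    # Use parameterized queries in production; f-strings keep the sketch short.
    return (
        "SELECT doc_id, chunk_text FROM knowledge_base "
        f"WHERE topic = '{topic}' "
        f"ORDER BY distance(embedding, {query_vec}) "
        f"LIMIT {k}"
    )


def grounded_prompt(question: str, facts: List[str]) -> str:
    """Constrain the LLM to retrieved facts - the anti-hallucination step."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Answer using ONLY the facts below; say 'I don't know' if they are "
        f"insufficient.\n\nFacts:\n{context}\n\nQuestion: {question}"
    )
```

The structured filter is the point of using SQL here: narrowing the candidate set by metadata before the distance ranking is what makes the retrieved context, and therefore the grounded answer, more precise.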