Late Chunking in Long-Context Embedding Models

* https://jina.ai/news/late-chunking-in-long-context-embedding-models/ (2024/8/22)
* https://x.com/JinaAI_/status/1826649439324254291 (2024/8/22)
* https://x.com/JinaAI_/status/1841903098174046494 (2024/10/4)
* https://jina.ai/news/what-late-chunking-really-is-and-what-its-not-part-ii/ (2024/10/4)
* Harnessing the power of long-context embedding models to embed short chunks
* https://weaviate.io/blog/late-chunking (2024/9/5)
* https://x.com/JinaAI_/status/1854538252922802412 (2024/11/7)
* https://colab.research.google.com/drive/1iz3ACFs5aLV2O_uZEjiR1aHGlXqj0HY7?usp=sharing
* Comparison with [[Contextual Retrieval]]: https://medium.com/kx-systems/late-chunking-vs-contextual-retrieval-the-math-behind-rags-context-problem-d5a26b9bbd38

> **Closed APIs (such as OpenAI's embedding API)** cannot support Late Chunking, because they only return chunk-level embeddings directly and do not expose the transformer's intermediate token embeddings.
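The core idea behind Late Chunking is to run one forward pass over the whole document first, then pool the already-contextualized token embeddings into per-chunk vectors. A minimal sketch of that pooling step, using NumPy and hypothetical names (`late_chunk_pool`, synthetic random "token embeddings" standing in for a real model's output):

```python
import numpy as np

def late_chunk_pool(token_embeddings: np.ndarray,
                    boundaries: list[tuple[int, int]]) -> np.ndarray:
    """Mean-pool contextualized token embeddings into per-chunk vectors.

    token_embeddings: (num_tokens, dim) array from ONE forward pass over the
    whole document, so every token already carries full-document context.
    boundaries: [(start, end), ...] token index ranges, one per chunk.
    """
    return np.stack([token_embeddings[s:e].mean(axis=0) for s, e in boundaries])

# Toy example: 10 "tokens" with 4-dim embeddings, split into two chunks.
# In real Late Chunking these would come from a long-context embedding model.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(10, 4))
chunks = late_chunk_pool(tokens, [(0, 6), (6, 10)])
print(chunks.shape)  # (2, 4)
```

This is why a closed embedding API cannot do Late Chunking: the pooling needs access to the per-token embeddings, which such APIs never return.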