> Also related to [[ColBERT]]

[ColPali](https://huggingface.co/blog/manu/colpali) is a multimodal retriever that removes the need for heavy, brittle document-processing pipelines. It works on images natively, processing and encoding image patches so that they are compatible with text, which removes the need for optical character recognition (OCR) or image captioning.

- https://github.com/AnswerDotAI/byaldi/
- https://github.com/illuin-tech/colpali
- https://blog.vespa.ai/retrieval-with-vision-language-models-colpali/ (2024/7/15)
- re-ranker https://x.com/llama_index/status/1856388716354515279

Each PDF page is converted to an image for retrieval, and the retrieved page images are then passed directly to the LLM as reference images.

* https://baoyu.io/translations/rag/retrieval-with-vision-language-models-colpali
* https://x.com/dotey/status/1813429905910067671
* https://x.com/jerryjliu0/status/1815904500491972663 (2024/7/24)
* https://x.com/mervenoyann/status/1831409380040044762 (2024/9/5)
  * Index the images and retrieve images directly, then give the retrieved image to a VLM as context
* https://x.com/mervenoyann/status/1831737088468791711 (2024/9/6)
  * https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
* https://x.com/bclavie/status/1832090968691978341 (2024/9/7)
* https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html (fine-tuning)
* https://blog.vespa.ai/the-rise-of-vision-driven-document-retrieval-for-rag/ (2024/8/19)
  * Limited multilingual support: although ColPali shows promise on non-English languages (the TabQuAD dataset is French), it was trained mainly on English data, so its performance on other languages may be less consistent...
* https://huggingface.co/blog/manu/colpali
* https://github.com/tonywu71/colpali-cookbooks

> 2024/10: In preliminary testing, Chinese does not seem to work well. The ColPali model was trained on English data only.

* ColQwen2 supports Chinese!!
  * https://pyvespa.readthedocs.io/en/latest/examples/pdf-retrieval-with-ColQwen2-vlm_Vespa-cloud.html
  * Multimodal RAG over PDFs using ColQwen2, Qwen2.5, and Weaviate
    * https://github.com/weaviate/recipes/blob/main/weaviate-features/multi-vector/multi-vector-colipali-rag.ipynb
    * https://x.com/helloiamleonie/status/1962482840810975527 (2025/9/1)
* https://github.com/tjmlabs/ColiVara
* Elasticsearch (2025/3)
  * https://www.elastic.co/search-labs/blog/series/colpali-model-elasticsearch

## VARAG: Vision Augmented Retrieval and Generation

https://github.com/adithya-s-k/VARAG

https://x.com/tuturetom/status/1840370031538159622 (2024/9/29)

https://x.com/adithya_s_k/status/1840028869195112534 (2024/9/28)

* OCR
* Vision RAG using JinaCLIP
* ColPali RAG
* Hybrid ColPali RAG

## ColiVara (SaaS)

https://github.com/tjmlabs/ColiVara

https://x.com/_avichawla/status/1889562591363838236

https://x.com/akshay_pachaar/status/1886396550089511119
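
## Sketch: ColBERT-style late-interaction scoring

The notes above tie ColPali to ColBERT and describe retrieving page images directly. As a rough illustration of how that kind of multi-vector retrieval ranks pages, here is a minimal pure-Python sketch of MaxSim (late-interaction) scoring. It is not the ColPali or Byaldi API: the function names `dot`, `maxsim_score`, and `rank_pages` are made up for this note, and the 2-d vectors are hand-written toy values standing in for the many high-dimensional query-token and image-patch embeddings a real model would emit.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: List[Vector], page_patches: List[Vector]) -> float:
    """Late-interaction score: for each query-token vector, take its best
    match among the page's patch vectors, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_patches) for q in query_tokens)


def rank_pages(query_tokens: List[Vector], pages: Dict[str, List[Vector]]) -> List[str]:
    """Return page ids sorted by descending MaxSim score."""
    scores = {pid: maxsim_score(query_tokens, patches) for pid, patches in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (hypothetical values, 2 dimensions for readability).
query = [[1.0, 0.0], [0.0, 1.0]]            # two query-token vectors
pages = {
    "page_1": [[0.9, 0.1], [0.2, 0.8]],     # patch vectors for page 1
    "page_2": [[0.5, 0.5], [0.4, 0.3]],     # patch vectors for page 2
}

ranking = rank_pages(query, pages)
print(ranking)
```

In a full pipeline, the image of the top-ranked page would then be handed to a VLM as context, as the notes describe. Production engines typically implement this scoring over stored multi-vector fields rather than in a Python loop.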