> 也跟 [[ColBERT]] 相關
[ColPali](https://huggingface.co/blog/manu/colpali) 是一個多模態檢索器,消除了對繁重且脆弱的文件處理器的需求。它原生處理圖像,並處理和編碼圖像區塊以與文本兼容,從而消除了進行光學字符識別(OCR)或圖像標題的需求。
- https://github.com/AnswerDotAI/byaldi/
- https://github.com/illuin-tech/colpali
- https://blog.vespa.ai/retrieval-with-vision-language-models-colpali/ (2024/7/15)
- re-ranker https://x.com/llama_index/status/185638871635451 5279
針對 PDF 每頁轉成 image 做檢索,然後 LLM 直接放 image 進去當作參考圖片
* https://baoyu.io/translations/rag/retrieval-with-vision-language-models-colpali
* https://x.com/dotey/status/1813429905910067671
* https://x.com/jerryjliu0/status/1815904500491972663 (2024/7/24)
* https://x.com/mervenoyann/status/1831409380040044762 (2024/9/5)
* 索引圖像並直接檢索圖像,然後將這個圖像作為上下文提供給 VLM
* https://x.com/mervenoyann/status/1831737088468791711 (2024/9/6)
* https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
* https://x.com/bclavie/status/1832090968691978341 (2024/9/7)
* https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html 微調
* https://blog.vespa.ai/the-rise-of-vision-driven-document-retrieval-for-rag/ (2024/8/19)
* 有限的多語言支持:雖然 ColPali 在處理非英語語言(TabQuAD 數據集是法語)方面顯示出潛力,但它主要是基於英語數據進行訓練,因此在其他語言上的表現可能不太一致...
* https://huggingface.co/blog/manu/colpali
* https://github.com/tonywu71/colpali-cookbooks
> 2024/10 初步測試,中文應該是不太行。ColPali 模型訓練只有用英文訓練。
* ColQWen2 支援中文!!
* https://pyvespa.readthedocs.io/en/latest/examples/pdf-retrieval-with-ColQwen2-vlm_Vespa-cloud.html
* Multimodal RAG over PDFs using ColQwen2, Qwen2.5, and Weaviate
* https://github.com/weaviate/recipes/blob/main/weaviate-features/multi-vector/multi-vector-colipali-rag.ipynb
* https://x.com/helloiamleonie/status/1962482840810975527 (2025/9/1)
* https://github.com/tjmlabs/ColiVara
* ElasticSearch (2025/3)
* https://www.elastic.co/search-labs/blog/series/colpali-model-elasticsearch
## VARAG: Vision Augmented Retrieval and Generation
https://github.com/adithya-s-k/VARAG
https://x.com/tuturetom/status/1840370031538159622 (2024/9/29)
https://x.com/adithya_s_k/status/1840028869195112534 (2024/9/28)
* OCR
* Vision RAG 用 JinaCLIP
* ColPali RAG
* Hybrid ColPali RAG
## ColiVara (SaaS)
https://github.com/tjmlabs/ColiVara
https://x.com/_avichawla/status/1889562591363838236
https://x.com/akshay_pachaar/status/1886396550089511119