* langchain use cases: https://python.langchain.com/docs/use_cases/qa_structured/sql
* https://blog.langchain.dev/query-construction/
* use RAG to fetch few-shot examples for less flaky text-to-SQL
    * https://twitter.com/jerryjliu0/status/1747402661404831852
* Scale's fine-tuning case study https://scale.com/blog/text2sql-fine-tuning
* Pinterest's write-up of their experience https://medium.com/pinterest-engineering/how-we-built-text-to-sql-at-pinterest-30bad30dabff (2024/4/3)
* LLMs Meet SQL: Revolutionizing Data Querying with Natural Language Processing (2024/3/6)
    * https://levelup.gitconnected.com/llms-meet-sql-revolutionizing-data-querying-with-natural-language-processing-52487337f043
    * A fairly detailed overview; unfortunately the examples use langchain
    * Also points to many papers, for example https://arxiv.org/abs/2305.11853
        * How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings
* DSPy example https://twitter.com/jjovalle99/status/1777766617444663519
* https://www.ai2sql.io/
* Mistral example https://github.com/mistralai/cookbook/blob/main/third_party/Neon/neon_text_to_sql.ipynb
* paper: A Survey on Employing Large Language Models for Text-to-SQL Tasks (2024/8)
    * https://arxiv.org/abs/2407.15186
* Uber's write-up of their experience https://www.uber.com/en-TW/blog/query-gpt/
* Awesome list https://github.com/eosphoros-ai/Awesome-Text2SQL
* How I build AI Query Wizard for Enterprise-Scale with 500+ Tables (2024/8/19)
    * https://levelup.gitconnected.com/sql-generator-how-i-build-ai-query-wizard-for-enterprise-scale-with-500-tables-fc290692632a
* waii https://www.waii.ai/
    * Appears to be a paid service
    * https://blog.waii.ai/complex-sql-joins-with-langgraph-and-waii-9e3b093b2942
* LinkedIn's experience https://www.linkedin.com/blog/engineering/ai/practical-text-to-sql-for-data-analytics
* Enhancing Text-to-SQL With Synthetic Summaries: A Few-Shot Learning Approach
    * https://www.timescale.com/blog/enhancing-text-to-sql-with-synthetic-summaries
    * https://chatgpt.com/share/679cedb1-1d90-8008-95b0-d9dea3c9195f
    * Start with a knowledge base of SQL queries and enrich it with summaries so that retrieval is more accurate
    * At text-to-SQL generation time, retrieve the reference SQL together with its summary

## SQL Agent

Can be hooked up to a BI database to run queries.

* https://towardsdatascience.com/can-llms-replace-data-analysts-getting-answers-using-sql-8cf7da132259
* https://twitter.com/xiaohuggg/status/1747226873195794817
* https://github.com/vanna-ai/vanna

## Instructor + SQLModel

* https://jxnl.github.io/instructor/examples/sqlmodel/
* https://python.useinstructor.com/examples/sqlmodel/

> The author of SQLModel is also the author of FastAPI.

## Vanna

* https://github.com/vanna-ai/vanna
* https://twitter.com/xiaohuggg/status/1747226873195794817
* https://twitter.com/llama_index/status/1750196064660127848
* https://twitter.com/jerryjliu0/status/1750296862144507954

## WrenAI

* https://www.getwren.ai/
* https://github.com/canner/wrenai
* Experience write-up: https://blog.getwren.ai/4-key-technical-challenges-using-rag-with-llms-to-query-database-text-to-sql-and-how-to-solve-it-5d5a3d6682e5

## Querypls: Prompt to SQL

* https://github.com/samadpls/Querypls/

## dataherald

* https://github.com/Dataherald/dataherald
* https://blog.langchain.dev/dataherald/
    * A tool introduced on the langchain blog (2024/2/14)

## Langchain

* https://python.langchain.com/docs/use_cases/sql/
* https://python.langchain.com/docs/integrations/toolkits/sql_database
* https://python.langchain.com/v0.2/docs/tutorials/sql_qa/#agents
* SQL agent https://medium.com/@lucnguyen_61589/understanding-the-magic-deconstructing-langchains-sql-agent-667881b9e209
* Building a Chat Application with LangChain, LLMs, and Streamlit for Complex SQL Database Interaction https://towardsdatascience.com/building-a-chat-app-with-langchain-llms-and-streamlit-for-complex-sql-database-interaction-7433245079f3
* Case study: LangChain SQL Agent for Massive Documents Interaction (2024/3/7) https://pub.towardsai.net/langchain-sql-agent-for-massive-documents-interaction-510fc4bc65a4
* langgraph version https://langchain-ai.github.io/langgraph/tutorials/sql-agent/

## Llamaindex

* https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo.html
* Combining Text-to-SQL with Semantic Search for Retrieval Augmented Generation (2023/5/28)
    * https://blog.llamaindex.ai/combining-text-to-sql-with-semantic-search-for-retrieval-augmented-generation-c60af30ec3b
    * https://gpt-index.readthedocs.io/en/latest/examples/query_engine/SQLAutoVectorQueryEngine.html
* [LlamaIndex: Harnessing the Power of Text2SQL and RAG to Analyze Product Reviews](https://blog.llamaindex.ai/llamaindex-harnessing-the-power-of-text2sql-and-rag-to-analyze-product-reviews-204feabdf25b)
* https://docs.llamaindex.ai/en/stable/examples/pipeline/query_pipeline_sql.html
* https://twitter.com/jerryjliu0/status/1756486047356399940 (2024/2/11)
    * Three-levels of Text-to-SQL combined with RAG
* ClickHouse case study (SQL + semantic search) (2024/2/29)
    * https://clickhouse.com/blog/building-hackernews-stackoverflow-chatbot-with-llamaindex-and-clickhouse

## Dataherald

* https://www.dataherald.com/

## Hybrid applications

* Combining Text-to-SQL with Semantic Search for Retrieval Augmented Generation
    * https://medium.com/llamaindex-blog/combining-text-to-sql-with-semantic-search-for-retrieval-augmented-generation-c60af30ec3b
* SQL Auto Vector Query Engine
    * https://docs.llamaindex.ai/en/latest/examples/query_engine/SQLAutoVectorQueryEngine.html#
    * Decides whether to query SQL first, then combines the SQL result with vector search for the final output
    * llamaindex appears to have several other SQL engines as well
* Combining with pgvector to generate SQL + vector search
    * https://github.com/langchain-ai/langchain/tree/master/templates/sql-pgvector
    * https://github.com/langchain-ai/langchain/blob/master/cookbook/retrieval_in_sql.ipynb
* databricks case study: also appears to be SQL + vector search (hotel search: price range filter plus vector search on user preferences)
    * https://www.databricks.com/blog/improve-your-rag-application-response-quality-real-time-structured-data
    * https://docs.databricks.com/en/_extras/notebooks/source/machine-learning/structured-data-for-rag.html

## SaaS

* https://askyourdatabase.com/

## Specialized models

* https://twitter.com/rishdotblog/status/1752329471867371659

## Benchmark

* https://bird-bench.github.io/
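
## Example: summary-based few-shot retrieval

The note on the Timescale article (keep a knowledge base of SQL queries, enrich each with a summary, then retrieve the reference SQL plus its summary at generation time) can be sketched roughly as below. This is only an illustration under stated assumptions, not the article's implementation: `SqlExample`, `retrieve`, `build_prompt`, the sample queries, and the schema string are all hypothetical, and a bag-of-words cosine similarity stands in for the embedding search a real system would likely use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class SqlExample:
    sql: str
    summary: str  # hypothetical: a hand-written or LLM-generated description


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, knowledge_base: list, k: int = 2) -> list:
    """Rank stored examples by similarity between the question and each summary."""
    q = tokenize(question)
    ranked = sorted(
        knowledge_base,
        key=lambda ex: cosine(q, tokenize(ex.summary)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, examples: list, schema: str) -> str:
    """Assemble a few-shot prompt: schema, retrieved (summary, SQL) pairs, question."""
    parts = ["Schema:", schema, "", "Reference queries:"]
    for ex in examples:
        parts.append(f"-- {ex.summary}")
        parts.append(ex.sql)
    parts += ["", f"Question: {question}", "SQL:"]
    return "\n".join(parts)


# Hypothetical knowledge base; the tables and queries are made up for illustration.
KNOWLEDGE_BASE = [
    SqlExample(
        sql="SELECT customer_id, SUM(amount) AS total FROM orders "
            "GROUP BY customer_id ORDER BY total DESC LIMIT 10;",
        summary="Top 10 customers by total order amount",
    ),
    SqlExample(
        sql="SELECT DATE(created_at) AS day, COUNT(*) FROM signups GROUP BY day;",
        summary="Number of signups per day",
    ),
    SqlExample(
        sql="SELECT product_id, AVG(rating) FROM reviews GROUP BY product_id;",
        summary="Average review rating for each product",
    ),
]

if __name__ == "__main__":
    question = "Which customers have the highest total order amount?"
    hits = retrieve(question, KNOWLEDGE_BASE, k=1)
    print(build_prompt(question, hits, schema="orders(customer_id, amount)"))
```

The resulting prompt would then be sent to whichever LLM generates the final SQL. Matching against summaries rather than raw SQL is the point of the approach: a natural-language question tends to share more vocabulary with a description than with the query text itself.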