OpenAI Prompt engineering - ihower's Notes

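> Subscribe to my [AI Engineer newsletter](https://aihao.eo.page/6tcs9) and browse the [[Generative AI Engineer 知識庫]]

Document URL: https://platform.openai.com/docs/guides/prompt-engineering

A pretty good summary of prompt engineering. Compared with [[ChatGPT Prompt Engineering for Developers]], it leans more toward engineering concerns.

It does not cover how to actually write the code, but it lays out the overall strategies and tactics.

Some of these tactics have to be solved with engineering; end users cannot apply them on their own inside ChatGPT.

I recommend reading it alongside the [[Building Systems with the ChatGPT API]] course to understand how to actually build with these ideas.

Note: some examples only work with GPT-4.

### 1. Write clear instructions

The less GPT has to guess what you want, the better the result. Spell out things like the output length, the difficulty level of the content, and the format you want.

- [Include details in your query to get more relevant answers](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-include-details-in-your-query-to-get-more-relevant-answers)
    - Write the request in more detail
- [Ask the model to adopt a persona](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-ask-the-model-to-adopt-a-persona)
    - Describe the persona's traits in the system message
- [Use delimiters to clearly indicate distinct parts of the input](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-use-delimiters-to-clearly-indicate-distinct-parts-of-the-input)
    - Use delimiters (`"""`, XML tags, or section titles) to separate the parts of the input
- [Specify the steps required to complete a task](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-specify-the-steps-required-to-complete-a-task)
    - Write out the steps: Step 1, Step 2, Step 3, ...
- [Provide examples](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-provide-examples)
    - Provide examples
- [Specify the desired length of the output](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-specify-the-desired-length-of-the-output)
    - Specify the reply length. Word counts will not be very accurate; asking for a number of paragraphs or bullet points is more reliable.

### 2. Provide reference text

GPT makes things up, especially about internal information or citations. Providing reference material reduces fabrication.

- [Instruct the model to answer using a reference text](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-instruct-the-model-to-answer-using-a-reference-text)
    - Instruct the model to use the reference material
    - Finding the relevant material dynamically, however, requires embeddings for knowledge retrieval. See the other tactic, "Use embeddings-based search to implement efficient knowledge retrieval"
- [Instruct the model to answer with citations from a reference text](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-instruct-the-model-to-answer-with-citations-from-a-reference-text)
    - Instruct the model to use only the provided material and to cite the passages it relied on; otherwise it should answer that there is insufficient information

> Finding relevant material dynamically also takes an engineer.

### 3. Split complex tasks into simpler subtasks

Break a complex task into a workflow of simpler tasks, so that the output of earlier tasks becomes the input of later ones.

- [Use intent classification to identify the most relevant instructions for a user query](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-use-intent-classification-to-identify-the-most-relevant-instructions-for-a-user-query)
    - Suited to conversational scenarios where different cases each have their own independent instructions
    - Build a classification tree over user intent, output which category the user's query falls into, then supply the prompt for that category
    - The process can recursively break a task into sequential stages
    - This lowers cost, because each query only needs to include the prompt for the next stage of the task
    - It can also lower the error rate
    - You can instruct the model to emit special strings that drive a state machine, which tracks the current stage of the conversation
- [For dialogue applications that require very long conversations, summarize or filter previous dialogue](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-for-dialogue-applications-that-require-very-long-conversations-summarize-or-filter-previous-dialogue)
    - The GPT context window is limited. In a long conversation, once a preset threshold is reached, trigger a prompt that summarizes the earlier dialogue, or summarize asynchronously
    - Another approach is to dynamically pull out the most relevant parts of the conversation. See the other tactic, "Use embeddings-based search to implement efficient knowledge retrieval"
- [Summarize long documents piecewise and construct a full summary recursively](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-summarize-long-documents-piecewise-and-construct-a-full-summary-recursively)
    - The GPT context window is limited, so summarizing a long text means splitting it into sections, processing each, and concatenating the results
    - The process recurses until the whole text has been covered
    - If processing a section depends on earlier sections, providing a summary of those earlier sections helps
    - [ ] OpenAI has published research on summarizing a book with GPT-3: https://openai.com/research/summarizing-books

> All three of these tactics take an engineer.

### 4. Give GPTs time to "think"

Like a person, GPT needs time to reason, so asking it to show its reasoning gives more reliable answers.

- [Instruct the model to work out its own solution before rushing to a conclusion](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-instruct-the-model-to-work-out-its-own-solution-before-rushing-to-a-conclusion)
    - Instruct the model to derive its own solution first, then compare it against the user's answer
    - Do this rather than asking the model to judge the user's answer directly
- [Use inner monologue or a sequence of queries to hide the model's reasoning process](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-use-inner-monologue-or-a-sequence-of-queries-to-hide-the-model-s-reasoning-process)
    - If you do not want the user to see the reasoning, hide it with an inner monologue
    - Instruct the model to wrap the parts you want hidden in a structured format, for example `"""`, so you can parse them out and keep them from the user
    - The example is a tutoring system
- [Ask the model if it missed anything on previous passes](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-ask-the-model-if-it-missed-anything-on-previous-passes)
    - Follow up by asking the model whether it missed anything
    - For example, when extracting excerpts from a long document, the model may leave some out

> But this example is split into two prompts, so the follow-up costs one more query... :(
> I think a single prompt could achieve the same effect: just ask the model, at the end, to check its own work once it has finished the list. The model should be capable of judging whether what it wrote is correct.

### 5. Use external tools

Other tools can compensate for GPT's weaknesses, for example a text retrieval system to fetch relevant information, or a code execution engine to do math and run programs. If a tool handles a task better than GPT does, combine the two.

- [Use embeddings-based search to implement efficient knowledge retrieval](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-use-embeddings-based-search-to-implement-efficient-knowledge-retrieval)
    - Supply external information as part of the input
    - Embeddings are used to implement dynamic semantic search
    - [ ] Cookbook example: https://github.com/openai/openai-cookbook/blob/main/examples/vector_databases/Using_vector_databases_for_embeddings_search.ipynb
- [Use code execution to perform more accurate calculations or call external APIs](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-use-code-execution-to-perform-more-accurate-calculations-or-call-external-apis)
    - GPT is not good at math or long calculations, so instruct the model to write code that performs the calculation instead
    - Instruct the model to put the code in a designated format, then extract it and run it
    - If needed, feed the result back in as input to the next prompt
    - Another use of code execution is calling external APIs; you only need to tell the model how to use the API
    - Note: you need a sandbox environment to run the code, because model-generated code is not guaranteed to be safe

### 6. Test changes systematically

Improvement is easier when you can measure it. With a small sample, a prompt change may help a handful of cases while making overall performance worse. So you need a comprehensive test suite to evaluate changes (evals, evaluation procedures).

A good evaluation is:

* Representative of real-world cases, or at least diverse
* Made up of many test cases, for statistical power
* Easy to automate or repeat

Evaluation can be automated, manual, or a mix of both.

Beyond objective criteria, model outputs can also be evaluated by other model queries.

OpenAI's open-source automated evaluation tool: https://github.com/openai/evals

Whether model-based evals work well, and whether human evaluation is still needed, takes experimentation to find out.

- [Evaluate model outputs with reference to gold-standard answers](https://platform.openai.com/docs/guides/gpt-best-practices/tactic-evaluate-model-outputs-with-reference-to-gold-standard-answers)
    - Suppose the question has a known correct answer, along with the source facts it should reference
    - Then we can use model queries to count how many of the source facts the answer contains, and use that count as the evaluation
    - A variant: compare the user's answer with the gold-standard answer and classify it as disjoint, subset, superset, equal, or contradiction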
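
The gold-standard tactic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how OpenAI's evals tool works: every function name here is hypothetical, and `keyword_judge` is a crude word-matching stand-in for the model query the notes describe. In practice you would pass in a `judge` callable that asks a model whether the answer states the fact. Contradiction cannot be detected by set comparison alone, so `classify_overlap` covers only the other four labels, plus an extra "overlap" label (my addition) for partial matches.

```python
import re
from typing import Callable, Iterable, Set

# Hypothetical judge signature: (fact, answer) -> True if the answer states the fact.
Judge = Callable[[str, str], bool]


def _words(text: str) -> Set[str]:
    """Lowercased set of word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_judge(fact: str, answer: str) -> bool:
    """Stand-in for a model query: every word of the fact must appear in the answer."""
    return _words(fact) <= _words(answer)


def count_facts(answer: str, facts: Iterable[str], judge: Judge = keyword_judge) -> int:
    """Count how many gold-standard facts the answer contains, according to the judge."""
    return sum(1 for fact in facts if judge(fact, answer))


def classify_overlap(submitted: Set[str], expert: Set[str]) -> str:
    """Compare the facts found in a submitted answer with those in the expert answer."""
    if submitted == expert:
        return "equal"
    if not (submitted & expert):
        return "disjoint"
    if submitted < expert:
        return "subset"
    if submitted > expert:
        return "superset"
    return "overlap"  # partial match; not one of the labels listed in the notes


if __name__ == "__main__":
    gold_facts = ["Neil Armstrong", "July 1969"]
    answer = "Neil Armstrong walked on the Moon in July 1969."
    print(count_facts(answer, gold_facts))  # prints 2
```

Keeping the judge as a parameter means the counting logic can be unit-tested cheaply with the keyword stand-in, while a model-backed judge is swapped in for real evaluation runs.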