Original course: https://www.deeplearning.ai/short-courses/finetuning-large-language-models/
Simplified Chinese translation: https://www.youtube.com/playlist?list=PLiuLMb-dLdWKtPM1YahmDHOjKN_a2Uiev
Course impressions: the slides are quite good, but the code goes into a bit too much detail
* Makes it clear why fine-tuning is needed: Prompting vs. Finetuning
* Makes the pre-training and fine-tuning workflows easy to follow
* The detailed training code doesn't really matter in practice, since high-level libraries wrap it all up anyway, e.g. the Lamini library, or simply calling the GPT fine-tuning API (a minimal sketch follows this list)
* PEFT is only briefly mentioned at the very end; the fine-tuning demonstrated in the course appears to be full fine-tuning (?)
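For reference, a minimal sketch of the "just call the API" route, assuming the OpenAI Python SDK (v1+) and an `OPENAI_API_KEY` in the environment; `train.jsonl` and the base model name are placeholders, and this is not code from the course:

```python
# Minimal sketch: launch a fine-tuning job through the OpenAI API.
# Assumes the openai SDK v1+ and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Upload the training data (JSONL, one chat-formatted example per line).
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Kick off the fine-tuning job on a base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```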
## Introduction
* Fine-tune a model on your own private data
* Compared to prompting, fine-tuning gives better control over the LLM's tone and speaking style, and makes its output more consistent
## Why finetune
![[Pasted image 20230829181813.png]]
![[Pasted image 20230919151348.png]]
![[Pasted image 20230919151426.png]]
* Learn new knowledge
* More consistent, reliable output and behavior
* Reduce hallucinations
* Customize the model for a specific use case
![[Pasted image 20230919151650.png]]
![[Pasted image 20230919151715.png]]
![[Pasted image 20230919152751.png]]
![[Pasted image 20230919153058.png]]
![[Pasted image 20230919153130.png]]
![[Pasted image 20230919153141.png]]
## Where fine-tuning fits in
![[Pasted image 20230919153652.png]]
![[Pasted image 20230919153736.png]]
* EleutherAI has released a public pretraining dataset: The Pile
![[Pasted image 20230919153938.png]]
![[Pasted image 20230919154038.png]]
* Here we fine-tune all of the LLM's weights, not just a subset (a minimal sketch follows)
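To make "all of the weights" concrete, a minimal sketch with Hugging Face Transformers: in full fine-tuning every parameter stays trainable, in contrast to PEFT, where most of them would be frozen. The model name is only an example:

```python
# Sketch: full fine-tuning keeps every parameter of the base model trainable.
# The model name is only an example; any causal LM from the Hub works the same way.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

# Full fine-tuning: leave every weight trainable (this is already the default).
for param in model.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```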
![[Pasted image 20230919155807.png]]
* Behavior change: e.g. the LLM clearly knows it is now in chat mode
![[Pasted image 20230919160039.png]]
![[Pasted image 20230919160118.png]]
* ~1k examples is the ideal dataset size to aim for
![[Pasted image 20230919160420.png]]
![[Pasted image 20230919160632.png]]
![[Pasted image 20230919160847.png]]
![[Pasted image 20230919160855.png]]
![[Pasted image 20230919160904.png]]
## Instruction finetuning
![[Pasted image 20230919161103.png]]
![[Pasted image 20230919161818.png]]
![[Pasted image 20230919161832.png]]
![[Pasted image 20230919162003.png]]
![[Pasted image 20230919162114.png]]
![[Pasted image 20230919162122.png]]
* Data prep is the key step for each type of fine-tuning
![[Pasted image 20230919162312.png]]
Alpaca provides two prompt templates (reproduced as a sketch after the slides below):
![[Pasted image 20230919162432.png]]
![[Pasted image 20230919162525.png]]
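As far as I recall from the Stanford Alpaca repo, the two templates look roughly like this; worth double-checking against the original source:

```python
# The two Alpaca-style prompt templates: one for examples that carry an extra
# input field, one for instruction-only examples. Wording reproduced from
# memory of the Stanford Alpaca repo; double-check against the original.
PROMPT_WITH_INPUT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

PROMPT_NO_INPUT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
"""

# Formatting one training example with the "with input" template.
prompt = PROMPT_WITH_INPUT.format(
    instruction="Summarize the text below in one sentence.",
    input="Fine-tuning adapts a pre-trained model to your own data and task.",
)
```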
## Data preparation
![[Pasted image 20230919163508.png]]
* Real data is better, especially for writing tasks, because LLM-generated data carries its own characteristic patterns, so the model can't learn new patterns or new ways of composing from it
* Data quality matters more than quantity, since the model has already picked up a lot of knowledge during pre-training.
![[Pasted image 20230919163517.png]]
![[Pasted image 20230919163923.png]]
![[Pasted image 20230919165111.png]]
For the model, every example in a batch needs the same number of tokens, so padding is required (fixed-size tensors).
Likewise, truncation is also needed (it can be applied on the left or the right).
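A minimal sketch with a Hugging Face tokenizer (the tokenizer name is only an example, not necessarily the one used in the course):

```python
# Sketch: batching needs equal-length token sequences, so the tokenizer
# pads short examples and truncates long ones to a fixed size.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
tokenizer.pad_token = tokenizer.eos_token  # some tokenizers ship without a pad token
# tokenizer.truncation_side = "left"       # truncation can also happen on the left

texts = [
    "A short example.",
    "A much longer example that will be cut off if it runs past max_length.",
]

batch = tokenizer(
    texts,
    padding=True,          # pad up to the longest example in the batch
    truncation=True,       # drop anything beyond max_length
    max_length=32,
    return_tensors="pt",   # fixed-size PyTorch tensors
)
print(batch["input_ids"].shape)
```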
![[Pasted image 20230919165425.png]]
![[Pasted image 20230919165623.png]]
![[Pasted image 20230919165703.png]]
![[Pasted image 20230919165730.png]]
![[Pasted image 20230919165745.png]]
![[Pasted image 20230919165826.png]]
## Training process
![[Pasted image 20230919170817.png]]
![[Pasted image 20230919170832.png]]
PyTorch code:
![[Pasted image 20230919170959.png]]
![[Pasted image 20230919171145.png]]
![[Pasted image 20230919171229.png]]
(A long stretch of training-code screenshots omitted here; a condensed sketch of that kind of loop follows.)
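Roughly, the loop looks like this. This is a sketch, not the course's exact code: model name, toy data, and hyperparameters are placeholders.

```python
# Condensed sketch of a full fine-tuning loop in plain PyTorch + Transformers.
# Model name, data, and hyperparameters are placeholders, not the course's values.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-70m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy training texts standing in for the formatted instruction/response pairs.
texts = [
    "### Instruction:\nSay hello.\n\n### Response:\nHello!",
    "### Instruction:\nName a color.\n\n### Response:\nBlue.",
]
encodings = tokenizer(texts, padding=True, truncation=True,
                      max_length=64, return_tensors="pt")
dataset = list(zip(encodings["input_ids"], encodings["attention_mask"]))
loader = DataLoader(dataset, batch_size=2)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
for epoch in range(3):
    for input_ids, attention_mask in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)

        # Causal-LM objective: labels are the input ids themselves.
        # (In practice you would also mask padding positions with -100.)
        outputs = model(input_ids=input_ids,
                        attention_mask=attention_mask,
                        labels=input_ids)
        loss = outputs.loss

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```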
![[Pasted image 20230919172736.png]]
* Fine-tuning can also be used for moderation: teach the LLM not to answer off-topic questions (a small data sketch follows the slide below)
![[Pasted image 20230919172815.png]]
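What that training data might look like, as a hypothetical sketch; the refusal wording is my own example, not the course's:

```python
# Hypothetical moderation-style fine-tuning pairs: off-topic questions are
# mapped to one fixed, polite refusal so the model learns to stay on topic.
# The wording below is illustrative only, not from the course.
moderation_examples = [
    {
        "question": "What do you think about the latest election?",
        "answer": "Let's keep the discussion relevant to this product.",
    },
    {
        "question": "Can you write me a poem about cats?",
        "answer": "Let's keep the discussion relevant to this product.",
    },
]
```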
## Evaluation and iteration
![[Pasted image 20230919173100.png]]
![[Pasted image 20230919173143.png]]
Another framework for analysis and evaluation:
![[Pasted image 20230919173253.png]]
![[Pasted image 20230919173555.png]]
![[Pasted image 20230919173638.png]]
![[Pasted image 20230919173707.png]]
![[Pasted image 20230919173813.png]]
![[Pasted image 20230919173950.png]]
ARC is a benchmark of grade-school questions. For fine-tuning, you should instead focus on evaluation for your own use case.
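A minimal sketch of that kind of use-case evaluation: run the fine-tuned model over a small held-out set and score it with a simple metric such as exact match. `generate_answer` and the example data are placeholders:

```python
# Sketch: score a fine-tuned model on your own held-out examples with a simple
# exact-match metric. `generate_answer` is a placeholder for whatever inference
# call you use (model.generate, a pipeline, or a hosted API).
def exact_match(prediction: str, reference: str) -> bool:
    return prediction.strip().lower() == reference.strip().lower()

eval_set = [
    {"question": "Which dataset did EleutherAI release for pretraining?",
     "answer": "The Pile"},
    # ... more examples from your own use case
]

def evaluate(generate_answer, eval_set):
    hits = sum(
        exact_match(generate_answer(ex["question"]), ex["answer"])
        for ex in eval_set
    )
    return hits / len(eval_set)
```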
## Consideration on getting started now
![[Pasted image 20230919174326.png]]
![[Pasted image 20230919174443.png]]
![[Pasted image 20230919174541.png]]
![[Pasted image 20230919174609.png]]
![[Pasted image 20230919174813.png]]
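Since PEFT / LoRA only gets a brief mention at the end of the course, a minimal sketch of what it looks like with the Hugging Face peft library; the model name, target modules, and LoRA hyperparameters are examples only, not the course's settings:

```python
# Sketch: LoRA via the Hugging Face peft library, the parameter-efficient
# alternative to the full fine-tuning shown in the course.
# Model name, target modules, and hyperparameters are examples only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],   # attention projection in GPT-NeoX / Pythia
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable
```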