| Capability | Model | Training method | OpenAI API |
| --- | --- | --- | --- |
| **Before GPT-3** | | | |
| Pretrain + finetune, like BERT | GPT-1 | Language modeling + task finetuning | - |
| Generation + zero-shot task transfer | GPT-2 | Language modeling | - |
| **GPT-3 Series** | | | |
| Generation + world knowledge + in-context learning | GPT-3 initial | Language modeling | Davinci |
| + Follow human instructions + generalize to unseen tasks | InstructGPT initial | Instruction tuning | Davinci-Instruct-Beta |
| + Code understanding + code generation | Codex initial | Training on code | Code-Cushman-001 |
| **GPT-3.5 Series** | | | |
| ++ Code understanding ++ code generation ++ complex reasoning / chain of thought (why?) + long-term dependency (probably) | Current Codex; strongest model in the GPT-3.5 series | Training on text + code, then tuning on instructions | Code-Davinci-002 (currently free; "current" = Dec. 2022) |
| ++ Follow human instructions -- in-context learning -- reasoning ++ zero-shot generation | Supervised InstructGPT; trades in-context learning for zero-shot generation | Supervised instruction tuning | Text-Davinci-002 |
| + Follow human values + more detailed generation + in-context learning + zero-shot generation | InstructGPT RLHF; more aligned than 002, with less performance loss | Instruction tuning with RLHF | Text-Davinci-003 |
| ++ Follow human values ++ more detailed generation ++ reject questions beyond its knowledge (why?) ++ model dialog context -- in-context learning | ChatGPT; trades in-context learning for dialog-history modeling | Tuning on dialog with RLHF | - |
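The "OpenAI API" column corresponds to model names accepted by the (2022-era) OpenAI Completions endpoint. A minimal sketch of how these names map onto the lineage above; the `GPT3_SERIES` dict and `completion_request` helper are illustrative constructs, not part of any official SDK, and the capability summaries are paraphrased from the table:

```python
# Illustrative mapping (assumption, paraphrased from the table above):
# 2022-era OpenAI Completions API model names -> lineage / training method.
GPT3_SERIES = {
    "davinci": "GPT-3 initial (language modeling)",
    "davinci-instruct-beta": "InstructGPT initial (instruction tuning)",
    "code-cushman-001": "Codex initial (training on code)",
    "code-davinci-002": "Strongest GPT-3.5 Codex (text + code, instruction tuning)",
    "text-davinci-002": "Supervised InstructGPT (supervised instruction tuning)",
    "text-davinci-003": "InstructGPT RLHF (instruction tuning with RLHF)",
}

def completion_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build a request body for the legacy /v1/completions endpoint.

    Hypothetical helper: with the 2022-era `openai` Python SDK, the dict
    returned here could be sent via openai.Completion.create(**body).
    """
    if model not in GPT3_SERIES:
        raise ValueError(f"not a GPT-3-series completion model: {model}")
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}
```

For example, `completion_request("text-davinci-003", "Explain chain-of-thought prompting.")` yields the request body for the RLHF-tuned InstructGPT model, while ChatGPT (last row) has no Completions-endpoint name, matching the `-` in the table.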