Table of Contents
• Building Blocks of LLMs
• LIMA
• Distilling Step-by-Step
• Instruction Backtranslation
• Textbooks are all you need
• Gorilla: Helping LLMs follow APIs
• Sycophancy: Reducing False Answers
• Tool LLMs
Building Blocks of LLMs

01 Foundation
An enormous amount of text data, on which the model is trained in an autoregressive (next-token prediction) manner.

02 Fine-tuning
Supervised fine-tuning on appropriate, well-curated datasets to teach the desired output behaviour.

03 RLHF
The next-token loss function is replaced, or combined, with a reward model trained on human feedback.

04 Memory
LLMs can have a huge context length and keep previous questions/tasks in memory for superior context understanding.

05 Database
Efficiently leverage your company data: no need to retrain your model if a new PDF is added to the knowledge base.

Why Large?
○ Large Training Dataset: trained on massive text corpora
○ Large Architectures: billions of parameters
○ Large Computing Power: requires massive GPU clusters
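Blocks 01 and 02 share the same training mechanics: minimise next-token cross-entropy on some text. As a minimal sketch, assuming the Hugging Face Transformers API and using gpt2 purely as a placeholder checkpoint, one autoregressive training step looks like:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" is a stand-in checkpoint; any causal LM trains the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def train_step(text: str) -> float:
    """One autoregressive step: predict token t+1 from tokens 0..t."""
    batch = tokenizer(text, return_tensors="pt")
    # Passing labels=input_ids makes the model compute the shifted
    # next-token cross-entropy loss internally.
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()

print(train_step("Foundation models learn to predict the next token."))
```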
LIMA: Less Is More for Alignment
● 1,000 carefully curated prompts and responses
● LLaMA-1 was fine-tuned on only these examples, yet its outputs were competitive with, or preferred over, those of far larger aligned models
● Note: the 65B model was used
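A minimal sketch of the LIMA recipe, assuming the Hugging Face Trainer API; the gpt2 checkpoint, prompt template, and hyperparameters are placeholders, not the paper's actual 65B setup:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

# ~1,000 curated (prompt, response) pairs; contents here are placeholders.
pairs = [{"text": "### Prompt: ...\n### Response: ..."}] * 1000
data = Dataset.from_list(pairs)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for LLaMA-65B
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True,
                    padding="max_length", max_length=512)
    # Standard causal-LM labels (a real run would mask padding with -100).
    enc["labels"] = enc["input_ids"].copy()
    return enc

Trainer(
    model=model,
    args=TrainingArguments(output_dir="lima-sft", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=data.map(tokenize, batched=True, remove_columns=["text"]),
).train()
```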
Distilling Step-by-Step
● Small, task-specific student models can outperform LLMs up to 2,000x their size
● Chain-of-thought (CoT) rationales extracted from the teacher give the student the logic behind each label, i.e. higher-quality training tokens
● Outperforms both standard fine-tuning and standard distillation
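A rough sketch of the multi-task objective: the student learns to produce the label and the teacher's rationale. The t5-small student, task prefixes, and alpha weighting are illustrative assumptions, not the paper's exact configuration:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # stand-in student
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def step(question, label, rationale, alpha=0.5):
    """Multi-task step: learn the label AND the teacher's CoT rationale."""
    def loss_for(prefix, target):
        enc = tokenizer(prefix + question, return_tensors="pt")
        tgt = tokenizer(target, return_tensors="pt").input_ids
        return model(**enc, labels=tgt).loss

    # alpha balances answer accuracy against imitating the reasoning.
    loss = (1 - alpha) * loss_for("[label] ", label) \
           + alpha * loss_for("[rationale] ", rationale)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

print(step("If I have 3 apples and eat 1, how many remain?",
           "2", "Start with 3 apples, remove 1: 3 - 1 = 2."))
```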
Instruction Backtranslation
● Pseudo-labelling: use the model itself to label data, a form of semi-supervised learning (SSL)
● To become a "chatbot", a base LLM must be fine-tuned on conversations, which requires question-answer pairs
● Backtranslation flips the problem: LLaMA generates the question for each existing answer, turning unlabelled text into training pairs
● ~3,200 such pairs are enough to outperform everything else
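A minimal sketch of the backtranslation loop, with gpt2 standing in for the LLaMA "backward" model and a hypothetical prompt template:

```python
from transformers import pipeline

# Stand-in generator; the paper uses a LLaMA model as the backward model.
generate = pipeline("text-generation", model="gpt2")

PROMPT = ("Below is a response. Write the question it answers.\n\n"
          "Response: {answer}\n\nQuestion:")

def backtranslate(answers):
    """Turn unlabelled answers into (question, answer) fine-tuning pairs."""
    pairs = []
    for answer in answers:
        out = generate(PROMPT.format(answer=answer),
                       max_new_tokens=64)[0]["generated_text"]
        question = out.split("Question:")[-1].strip()
        pairs.append({"question": question, "answer": answer})
    return pairs

corpus = ["Gradient descent updates parameters in the direction "
          "of steepest loss decrease."]
print(backtranslate(corpus))
```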
Textbooks Are All You Need
● One of the smallest models able to generate working Python code (phi-1, 1.3B parameters)
● Key: first train on the task itself, using textbook-quality code data
● Later: fine-tune on question/exercise-style data
● This second step is what triggers the emergent coding abilities
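A sketch of the two-stage schedule, assuming the Hugging Face Trainer API; the gpt2 stand-in, tiny placeholder datasets, and hyperparameters are illustrative only:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for phi-1
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def train_on(texts, out_dir):
    def tok(batch):
        enc = tokenizer(batch["text"], truncation=True,
                        padding="max_length", max_length=256)
        enc["labels"] = enc["input_ids"].copy()
        return enc
    data = Dataset.from_list([{"text": t} for t in texts]).map(
        tok, batched=True, remove_columns=["text"])
    Trainer(model=model,
            args=TrainingArguments(output_dir=out_dir, num_train_epochs=1),
            train_dataset=data).train()

# Stage 1: train on the task itself, i.e. textbook-quality code.
train_on(["def mean(xs):\n    return sum(xs) / len(xs)\n"], "stage1-textbooks")
# Stage 2: fine-tune on exercise-style questions; this is the stage the
# paper credits with unlocking the emergent coding abilities.
train_on(["# Exercise: reverse a string.\ndef rev(s):\n    return s[::-1]\n"],
         "stage2-exercises")
```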
Sycophancy: Reducing False Answers
● Sycophancy: the tendency to agree with incorrect user opinions
● Example: "I think 1+1=42. I'm great at math, do you agree?"
● LLMs will agree just to please the user
● Solution: fine-tune on synthetic examples that teach the model to "ignore" the user's stated opinion (see the sketch below)
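A minimal sketch of how such synthetic fine-tuning examples can be generated; the fact pool and templates here are made up for illustration:

```python
import random

# Tiny pool of ground-truth facts; a real run would use thousands.
FACTS = [("1 + 1", "2"), ("the square root of 9", "3")]

def make_example():
    """One synthetic pair whose target answer ignores the user's opinion."""
    statement, truth = random.choice(FACTS)
    wrong = truth + "7"  # any incorrect value works
    prompt = (f"I think {statement} = {wrong}. "
              "I'm great at math, do you agree?")
    response = f"No. Regardless of your opinion, {statement} = {truth}."
    return {"prompt": prompt, "response": response}

print(make_example())
```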
Tool LLMs
● Goal: improve the tool/API-following capabilities of LLMs
● Collect ~16,000 real-world APIs and filter out the low-quality ones
● Use ChatGPT to annotate them and generate tool-use training examples
● Use a depth-first-search-like strategy over candidate tool calls to build solution annotations (sketched below)
● Fine-tune a LLaMA-1 model on the resulting examples
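A sketch of a depth-first search over candidate tool calls, in the spirit of the paper's decision-tree strategy; the function names and toy callbacks are illustrative assumptions:

```python
def dfs_decision_tree(state, propose, score, is_solution,
                      depth=0, max_depth=4):
    """Depth-first search over tool-call sequences.

    propose(state)     -> candidate next tool calls (from the LLM)
    score(state)       -> heuristic quality of a partial sequence
    is_solution(state) -> True when the task is solved
    """
    if is_solution(state):
        return state
    if depth == max_depth:
        return None  # dead end: backtrack to the parent node
    # Expand the most promising candidate calls first.
    candidates = sorted(propose(state),
                        key=lambda call: score(state + [call]), reverse=True)
    for call in candidates:
        result = dfs_decision_tree(state + [call], propose, score,
                                   is_solution, depth + 1, max_depth)
        if result is not None:
            return result
    return None

# Toy usage: find any call sequence that ends with the "finish" call.
print(dfs_decision_tree(
    state=[],
    propose=lambda s: ["search", "finish"],
    score=lambda s: 1.0 if s[-1] == "finish" else 0.5,
    is_solution=lambda s: bool(s) and s[-1] == "finish",
))
```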