trlX framework reviews
Sang-Kil Park
Software Engineer, Tech Lead at Kakao Corp
Apr. 6, 2023
A recap of notes taken while studying the training structure of the trlX framework and the PPO algorithm.
trlX framework reviews
1. trlX framework review, Apr 4, 2023, Sang-Kil Park
2. Architecture
3. Model
4. Algorithm
https://github.com/openai/lm-human-preferences/blob/master/lm_human_preferences/train_policy.py
Adaptive KL Controller as described in Ziegler et al., "Fine-Tuning Language Models from Human Preferences".
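The adaptive KL controller named on this slide can be sketched as follows. This is a minimal illustration written from the description in Ziegler et al., not code copied from the slides: the class and parameter names, the clipping range of 0.2, and the example numbers are all assumptions.

```python
class AdaptiveKLController:
    """Sketch of an adaptive KL coefficient (names and constants are assumptions)."""

    def __init__(self, init_kl_coef: float, target: float, horizon: int):
        self.value = init_kl_coef  # current KL penalty coefficient
        self.target = target       # KL value the controller steers toward
        self.horizon = horizon     # number of samples over which to adapt

    def update(self, current_kl: float, n_steps: int) -> None:
        # Proportional error, clipped so a single update stays small.
        error = max(-0.2, min(0.2, current_kl / self.target - 1.0))
        self.value *= 1.0 + error * n_steps / self.horizon


# Hypothetical numbers: the measured KL is twice the target, so the
# coefficient is nudged upward.
ctl = AdaptiveKLController(init_kl_coef=0.05, target=6.0, horizon=10000)
ctl.update(current_kl=12.0, n_steps=256)
```

When the measured KL sits above the target the coefficient grows, which penalizes drift from the reference model more strongly on later rollouts; below the target it shrinks.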
6. rollout : rewards
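A common way to build per-token rewards during an RLHF rollout is to penalize each token by its log-probability gap to the reference model and add the scalar reward-model score on the final token. The sketch below assumes that scheme. The slide text does not spell it out, and the function name and sample values are hypothetical.

```python
def per_token_rewards(logprobs, ref_logprobs, score, kl_coef):
    """Return one reward per generated token (sketch, assumed scheme)."""
    # KL penalty per token: -kl_coef * (log pi(token) - log pi_ref(token))
    rewards = [-kl_coef * (lp - ref) for lp, ref in zip(logprobs, ref_logprobs)]
    # The sequence-level score is credited to the last token only.
    rewards[-1] += score
    return rewards


# Hypothetical three-token response.
rewards = per_token_rewards(
    logprobs=[-1.0, -0.5, -2.0],
    ref_logprobs=[-1.5, -0.5, -1.0],
    score=1.0,
    kl_coef=0.1,
)
```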
7. mean_kl
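One simple estimator for a batch-level `mean_kl` sums the per-token log-probability differences within each sequence and averages over the batch. The slides do not show which estimator is used, so treat the function below as an assumption for illustration only.

```python
def mean_kl(batch_logprobs, batch_ref_logprobs):
    """Average over sequences of the summed per-token log-ratio (sketch)."""
    per_sequence = [
        sum(lp - ref for lp, ref in zip(logprobs, ref_logprobs))
        for logprobs, ref_logprobs in zip(batch_logprobs, batch_ref_logprobs)
    ]
    return sum(per_sequence) / len(per_sequence)


# Hypothetical batch of two sequences with different lengths.
kl = mean_kl(
    batch_logprobs=[[-1.0, -0.5, -2.0], [-1.0, -1.0]],
    batch_ref_logprobs=[[-1.5, -0.5, -1.0], [-2.0, -1.5]],
)
```

A value like this is what an adaptive KL controller would compare against its target after each rollout.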
8. learn : advantages, returns
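Advantages and returns in PPO are usually computed with Generalized Advantage Estimation (GAE). The sketch below assumes GAE over a single response, with the value after the last token taken as zero. The function name and the default `gamma` and `lam` values are assumptions, not settings taken from the slides.

```python
def gae_advantages_and_returns(rewards, values, gamma=1.0, lam=0.95):
    """Compute GAE advantages and returns for one sequence (sketch)."""
    advantages = [0.0] * len(rewards)
    last_adv = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        last_adv = delta + gamma * lam * last_adv
        advantages[t] = last_adv
    # Returns are the regression targets for the value head.
    returns = [adv + val for adv, val in zip(advantages, values)]
    return advantages, returns


# Hypothetical sequence: reward arrives only on the last token.
advantages, returns = gae_advantages_and_returns(
    rewards=[0.0, 0.0, 1.0], values=[0.5, 0.5, 0.5], gamma=1.0, lam=1.0
)
```

With `gamma` and `lam` both set to 1, each return reduces to the plain sum of future rewards, which makes the example easy to check by hand.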
9. learn : loss
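The PPO loss typically combines a clipped policy-gradient term with a clipped value-function term. The following is a minimal pure-Python sketch of that standard formulation. The function name, the clip ranges, and `vf_coef` are assumptions, and real implementations also mask padding tokens, which is omitted here.

```python
import math


def ppo_loss(logprobs, old_logprobs, values, old_values, advantages, returns,
             cliprange=0.2, cliprange_value=0.2, vf_coef=1.0):
    """Return (total_loss, policy_loss, value_loss) averaged over tokens."""
    n = len(logprobs)
    pg_total, vf_total = 0.0, 0.0
    for lp, old_lp, v, old_v, adv, ret in zip(
        logprobs, old_logprobs, values, old_values, advantages, returns
    ):
        # Probability ratio between the current and the rollout policy.
        ratio = math.exp(lp - old_lp)
        clipped_ratio = max(1.0 - cliprange, min(1.0 + cliprange, ratio))
        # Pessimistic (larger) of the unclipped and clipped policy losses.
        pg_total += max(-adv * ratio, -adv * clipped_ratio)
        # Keep the new value prediction close to the rollout-time value.
        v_clipped = max(old_v - cliprange_value, min(old_v + cliprange_value, v))
        vf_total += 0.5 * max((v - ret) ** 2, (v_clipped - ret) ** 2)
    pg_loss = pg_total / n
    vf_loss = vf_total / n
    return pg_loss + vf_coef * vf_loss, pg_loss, vf_loss


# Hypothetical two-token example where the policy has not moved yet.
total, pg, vf = ppo_loss(
    logprobs=[-1.0, -1.0], old_logprobs=[-1.0, -1.0],
    values=[0.5, 0.5], old_values=[0.5, 0.5],
    advantages=[1.0, -1.0], returns=[1.0, 0.0],
)
```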
12. stats
13. https://wandb.ai/likejazz/trlx/runs/lu4roilu