Direct Preference Optimization - Search Videos

Direct Preference Optimization (DPO) explained

A Simpler Way to Fine-Tune Language Models than with RLHF

100 views · Dec 27, 2024

Direct Preference Optimization Tutorial

Paper introduction: Direct Preference Optimization: Your Language Model is Secretly a Reward Model

speakerdeck.com

21. Direct Preference Optimization (DPO) (Rafailov et al., 2023)

YouTube · LOADING_

14 views · 3 months ago

DeepLearning.AI on Instagram: "Our course recommendation of the day is “Post-training of LLMs, ” where you’ll learn how to customize pre-trained language models using Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL). You'll learn when to use each method, how to curate training data, and implement them in code to shape model behavior effectively. Enroll at the link in bio or comment "LLM" to receive the link in your inbox."

Instagram · deeplearningai

8.1K views · 4 months ago

Top videos

Direct Preference Optimization (DPO) explained + OpenAI Fine-tuning example

YouTube · Simeon Emanuilov

786 views · Dec 26, 2024

Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math

YouTube · Umar Jamil

34.1K views · Apr 14, 2024

Aligning LLMs with Direct Preference Optimization

YouTube · DeepLearningAI

34.1K views · Feb 8, 2024

Direct Preference Optimization Applications

DPT: Dynamic Preference Transfer for Cross-Domain Sequential Recommendation | Proceedings of the 34th ACM International Conference on Information and Knowledge Management

Model Predictive Control

YouTube · Steve Brunton

334K views · Jun 11, 2018

Intro to Linear Programming

YouTube · Dr. Trefor Bazett

296.3K views · Apr 6, 2021

Direct Preference Optimization (DPO) Explained: AI Alignment

7 views · 3 months ago

YouTube · VLR Software Training

Direct Preference Optimization (DPO) - How to fine-tune LLMs directly without reinforcement learning

30.7K views · Jun 21, 2024

YouTube · Serrano.Academy

Direct Preference Optimization (DPO) | Paper Explained

1.4K views · 2 months ago

Direct Nash Optimization: Teaching language models to self-improve with general preferences

Direct Preference Optimization: Your Language Model is Secretly …

39.1K views · Dec 22, 2023

YouTube · AI Coffee Break with Letitia

DPO Coding | Direct Preference Optimization (DPO) Code impleme…

384 views · 11 months ago

YouTube · AILinkDeepTech

LLMs | Alignment of Language Models: Contrastive Learning | Le…

1.6K views · Sep 26, 2024

4 Ways to Align LLMs: RLHF, DPO, KTO, and ORPO

4.1K views · Jul 10, 2024

YouTube · Snorkel AI

Direct Preference Optimization (DPO): Your Language Model is S…

19.2K views · Aug 10, 2023

YouTube · Gabriel Mongaras

[Paper Review] Direct preference optimization(DPO) : Your languag…

8 views · 5 months ago

YouTube · LOADING_

DPO - Part1 - Direct Preference Optimization Paper Explanation | …

2K views · Aug 12, 2023

YouTube · Neural Hacks with Vasanth

Reinforcement Learning, RLHF, & DPO Explained

16.2K views · Jun 12, 2024

YouTube · Mark Hennings

On DPO (Direct Preference Optimization): NotebookL…

2 views · 3 months ago

YouTube · Ai情報Note

Hands-on 10: Large Language Model Alignment with Direct Prefe…

3.7K views · 7 months ago

YouTube · BrainOmega

DPO: The RLHF Alternative Revolutionizing AI Alignment

26 views · 3 months ago

YouTube · Deep Learner, One Step at a Time

DPO (Direct Preference Optimization) Algorithm Explained

50.6K views · Mar 3, 2024

bilibili · RethinkFun

LLM Alignment Methods - DPO vs IPO vs KTO vs PCL

1.6K views · Jan 27, 2024

YouTube · Fahd Mirza

W12L53: Direct Preference Optimization (DPO)

1.1K views · 6 months ago

YouTube · IIT Madras - B.S. Degree Programme

[Artificial Intelligence, Machine Learning, Deep Learning] (Advanced) Direct preference optimization (DP…

2.7K views · Mar 18, 2024

YouTube · 컴달인 - 컴퓨터 달인

Diffusion Model Alignment Using Direct Preference Optimization

1.5K views · Nov 24, 2023

bilibili · PaperWeekly

UMass CS685 S24 (Advanced NLP) #12: Direct preference optimizatio…

3.1K views · Mar 13, 2024

YouTube · Mohit Iyyer

Direct Preference Optimization (DPO)

7.3K views · Nov 13, 2023

YouTube · Trelis Research

RLHF, PPO and DPO for Large language models

3.6K views · Feb 18, 2024

YouTube · Arvind N

Fast Fine Tuning and DPO Training of LLMs using Unsloth

5.9K views · Mar 25, 2024

YouTube · AI Anytime
