DINOv2 from Meta AI - Finally a Foundational Model in Computer Vision?
Apr 25, 2023 Ā· 20,802 views Ā· 676 likes
In April 2023, Meta AI released DINOv2, a foundational computer vision model pretrained with self-supervised learning on a large curated dataset, which can be used without finetuning! In this video we examine the most interesting topics from this exciting release. We start by explaining what a foundational model is and why DINOv2 qualifies as one. Next we see how you can use DINOv2 in your own code (a minimal usage sketch follows below this description). DINOv2 was released in several versions of different model sizes, and we explain how Meta AI created the smaller models using model distillation. We then review parts of the process used to build the large curated dataset that DINOv2 was trained on. We finish by discussing how self-supervised learning helped DINOv2 reach an impressive pixel-level understanding of images, compared to the commonly used text-guided models.

Blog post - https://aipapersacademy.com/dinov2-fr...
GitHub repo - https://github.com/facebookresearch/d...
arXiv paper - https://arxiv.org/abs/2304.07193
To understand Vision Transformer, the backbone architecture of DINOv2 - https://aipapersacademy.com/vision-tr...

šŸ‘ Please like & subscribe if you enjoy this content
----------------------------------------------------------------------------------
Support us - https://paypal.me/aipapersacademy
----------------------------------------------------------------------------------
Chapters:
0:00 Introduction
0:52 Foundational Model
2:45 Using DINOv2
3:10 Model Distillation
4:41 SSL with Curated Data
6:21 Pixel level learning
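As a companion to the "Using DINOv2" chapter, here is a minimal sketch of loading a pretrained DINOv2 backbone through torch.hub and extracting image features without any finetuning. The model names follow the facebookresearch/dinov2 repository linked above; the file name example.jpg and the preprocessing choices are illustrative assumptions rather than part of the release.

```python
# Minimal sketch: load a pretrained DINOv2 backbone and extract image features.
import torch
from torchvision import transforms
from PIL import Image

# Load the small ViT-S/14 variant; larger variants (vitb14, vitl14, vitg14) are also published.
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

# Standard ImageNet-style preprocessing; crop size should be a multiple of the 14-pixel patch size.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # hypothetical input image

with torch.no_grad():
    features = model(image)  # global image embedding (384-dim for ViT-S/14)

print(features.shape)  # torch.Size([1, 384])
```

The resulting embedding can be fed directly into a simple classifier or nearest-neighbor search, which is what "usable without finetuning" means in practice.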
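The "Model Distillation" chapter describes how the smaller DINOv2 models were obtained from the largest one. The snippet below is a simplified, hypothetical sketch of that idea: a frozen large teacher and a small student whose embeddings are pulled toward the teacher's. DINOv2's actual recipe distills with its self-supervised objective rather than the plain cosine loss used here, and the linear projection `proj` is an illustrative assumption to bridge the different embedding sizes.

```python
# Simplified sketch of feature distillation (not the exact DINOv2 recipe).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14").eval()  # frozen large teacher
student = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")         # small student being trained
for p in teacher.parameters():
    p.requires_grad = False

# Hypothetical projection so the 384-dim student features can be compared to the 1024-dim teacher features.
proj = nn.Linear(384, 1024)
optimizer = torch.optim.AdamW(list(student.parameters()) + list(proj.parameters()), lr=1e-4)

def distillation_step(images: torch.Tensor) -> torch.Tensor:
    """One training step: pull the student's embedding toward the teacher's."""
    with torch.no_grad():
        target = teacher(images)          # teacher embedding, no gradients
    pred = proj(student(images))          # student embedding projected to the teacher's size
    loss = 1 - F.cosine_similarity(pred, target).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss
```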


AI Papers Academy Ā· 29.3K subscribers