Yejin Kim

Study it and apply all!
Like to see error code because it is much better than warrning!

[Paper] A Simple Framework for Contrastive Learning of Visual Representations

최대 1 분 소요

“A Simple Framework for Contrastive Learning of Visual Representations”이란 논문에 대한 리뷰입니다.

원문은 링크에서 확인할 수 있습니다.

Background

GAN type model is computationally expensive
Contrastive learning in the latent space

Key

Composition of multiple data augmentation does matter
여러 augmentation을 사용해야 하며 특히 unsupervised contrastive learning에선 color augmentation이 더 빛을 발휘한다.
Contrastive cross entropy loss with temperature parameter
Contrastive loss
Compute loss across all positive pairs (I,j) & (j,i)
Gradient에서 보면 결국 각 sample pair별로 크기가 달라져 weight 역할이 된다.
특히 cosine similarity와 dot product간 차이를 보면 L2 norm을 안쓰는 dot product은 task accuracy는 더 좋지만 linear activation을 쓰는 representation는 좋지 못하다.

Pre-process

Generate a couple of augmented data

#Config

standard SGD/Momentum with linear learning rate scaling(scheduling 인듯)

Insight

Fine tune할 때 activation사용하지 않은 representation
Activation은 non-linear가 확실히 좋다.
단 latent representation으론 activation 안쓰고 encoder결과를 사용한다.
Benefits from larger batch size& deeper and wider
The larger batch size is, the more negative samples are
단 training이 많아지면(epoch이 커지면) 차이는 줄어든다.
Unsupervised learning benefits more from bigger models than supervised one

공유하기

Twitter Facebook LinkedIn

댓글남기기

참고

[Survey] Recent technical reports

최대 1 분 소요

This is a collection of recent technical reports from several vendors including Google DeepMind, x.AI, AllenAI, AI21Labs, Databricks and HyperCLOVA.

[Survey] Recent approaches about Super alignments

1 분 소요

This is a collection of recent approaches and papers about super-alignment and relevant topics.

[Survey] Recent approaches about Efficient ML

최대 1 분 소요

This is a collection of recent approaches and papers about Efficient ML including Parameter Efficient Fine Tuning(PEFT), qunatization, pruning and other topi...

[Paper] Using Sliced Mutual Information to Study Memorization and Generalization

최대 1 분 소요

Using Sliced Mutual Information to Study Memorization and Generalization PDF Link