Make Embeddings Great Again with Self-supervised Learning

Bohua Peng 

4 May 2021

 

Contents

  • Preliminary 🤔
  • Generative Methods 💚
  • Handcrafted Pretext Training 💡
  • Contrastive Learning  🕯️
    • Metric learning
    • Loss functions
    • Experiments 💻
  • Future directions🌟
    • Hard negative mining
    • Decorrelating / multiple ontological representations

Preliminary

Plato’s allegory of the cave

A lower-dimensional representation

Preliminary

Due to information loss ...

The same goes for deep representation learning

Embedding zoo

word embeddings

wave embeddings

face embeddings

Can we learn representations without human annotations?

Handcrafted pretext tasks

Learning by predicting a part of the data from the rest

These methods are very flexible. However, they are not good enough ...

These heuristics are quite fragile: they need carefully pre-defined task configurations that are difficult, but not too difficult, otherwise ...

Handcrafted pretext tasks

[1] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016
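To make the idea concrete, here is a minimal sketch of a jigsaw-style pretext task (assuming PyTorch; the 3×3 grid, the tiny permutation set, and the function names are illustrative only — the paper uses 1,000 fixed permutations and a shared-weight tile network):

```python
import torch

# Toy permutation set; the network must classify which permutation was applied.
PERMUTATIONS = [torch.randperm(9) for _ in range(10)]

def make_jigsaw_example(img):
    """img: (C, H, W) with H and W divisible by 3. Returns shuffled tiles and
    the index of the permutation used, which serves as the pretext label."""
    c, h, w = img.shape
    tiles = img.unfold(1, h // 3, h // 3).unfold(2, w // 3, w // 3)  # (C, 3, 3, h/3, w/3)
    tiles = tiles.reshape(c, 9, h // 3, w // 3)                      # 9 tiles, row-major
    label = torch.randint(len(PERMUTATIONS), (1,)).item()
    shuffled = tiles[:, PERMUTATIONS[label]]                         # permute the tiles
    return shuffled, label
```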

Generative methods

VAE

Learning by reconstructing

Pros:

  • disentangled representations
  • variational inference backbone

Limitations:

  • Posterior collapse
    • constant representations
    • noisy representations
  • A hard trade-off between generality and fidelity
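For context, the VAE objective is the evidence lower bound (ELBO): a reconstruction term minus a KL regulariser. Posterior collapse corresponds to the KL term being driven to zero, so the encoder output stops depending on the input:

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$$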

Generative methods

Learning by generating - GAN

Problems:

 

  • mode collapse
  • catastrophic forgetting in the discriminator

Self-Supervised GANs via Auxiliary Rotation Loss CVPR 2019

Adding pretext tasks to GANs
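As a rough sketch of the auxiliary rotation idea (assuming PyTorch; `rotation_head` and the function names are hypothetical helpers, not the paper's code): each image is rotated by 0/90/180/270 degrees and the discriminator additionally predicts the rotation.

```python
import torch
import torch.nn.functional as F

def rotate_batch(x):
    """Build 4 rotated copies (0/90/180/270 deg) of each image in the batch
    together with the rotation class labels."""
    rotations = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4, device=x.device).repeat_interleave(x.size(0))
    return torch.cat(rotations, dim=0), labels

def auxiliary_rotation_loss(discriminator_features, rotation_head, labels):
    """Cross-entropy on the predicted rotation; added to the usual GAN loss
    as a self-supervised auxiliary term."""
    logits = rotation_head(discriminator_features)   # shape (4N, 4)
    return F.cross_entropy(logits, labels)
```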


Is there a more general proxy task to learn from than reconstruction, e.g., relational classification?

Can we avoid trading off generality against interpretability?

Pulling semantically similar objects close to each other with a contrastive loss

Metric learning

Learning by comparing

Let Y = 0, if anchor X1 and input image X2 are from the same person.

[2] Learning a Similarity Metric Discriminatively, with Application to Face Verification, CVPR 2005

contrastive loss:
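The loss itself appeared as an image on the slide; for reference, the widely used quadratic-margin form of the pairwise contrastive loss is

$$\mathcal{L}(Y, X_1, X_2) \;=\; (1 - Y)\,\tfrac{1}{2}\,D_W^2 \;+\; Y\,\tfrac{1}{2}\,\big[\max(0,\; m - D_W)\big]^2, \qquad D_W = \lVert G_W(X_1) - G_W(X_2) \rVert_2,$$

with Y = 0 for a positive pair, Y = 1 for a negative pair, margin m, and shared embedding network G_W.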

Pushing semantically dissimilar objects away from each other with a contrastive loss

Metric learning

Let Y = 1, if anchor X1 and input image X2 are from different people.

https://keras.io/examples/vision/siamese_network/
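The linked Keras tutorial implements a siamese setup in TensorFlow; below is a minimal PyTorch sketch of the same pairwise contrastive loss (variable names are mine):

```python
import torch
import torch.nn.functional as F

def pairwise_contrastive_loss(z1, z2, y, margin=1.0):
    """Pairwise (siamese) contrastive loss.
    z1, z2: embeddings of the two inputs, shape (N, D)
    y:      shape (N,), 0.0 for same-identity pairs, 1.0 for different-identity pairs
    """
    d = F.pairwise_distance(z1, z2)                       # Euclidean distance D_W
    positive_term = (1 - y) * 0.5 * d.pow(2)              # pull positives together
    negative_term = y * 0.5 * F.relu(margin - d).pow(2)   # push negatives beyond the margin
    return (positive_term + negative_term).mean()
```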

Unsupervised contrastive learning

Treat views of the same image as positive keys and views of other images as negative keys

Instance-level discrimination

🤔

However, it is prone to mode collapse

Modern contrastive learning

The success story of SimCLR

  • strong data augmentation
  • a large number of negative samples
  • nonlinear projection head (MLP)
  • InfoNCE loss (contrastive cross-entropy)

SimCLR 2020
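A compact sketch of SimCLR's NT-Xent / InfoNCE loss for one batch of two augmented views (assuming PyTorch; `z1`, `z2` are the projection-head outputs):

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent / InfoNCE loss over a batch of two augmented views.
    z1, z2: projection-head outputs for the two views, shape (N, D)."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)     # (2N, D), unit-norm
    sim = z @ z.t() / temperature                           # temperature-scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))                       # mask self-similarity
    # positives: view i pairs with view i+N and vice versa (swapped version)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```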

Loss function

Normalized temperature-scaled cosine similarity

Swapped-version loss to avoid mode collapse

InfoNCE

(contrastive cross-entropy)
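In symbols, with $\mathrm{sim}(u,v)=u^\top v / (\lVert u\rVert\,\lVert v\rVert)$ and temperature $\tau$, the loss for a positive pair $(i,j)$ among the $2N$ augmented views is

$$\ell_{i,j} \;=\; -\log \frac{\exp\!\big(\mathrm{sim}(z_i, z_j)/\tau\big)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]}\, \exp\!\big(\mathrm{sim}(z_i, z_k)/\tau\big)},$$

averaged over all positive pairs in both orders $(i,j)$ and $(j,i)$.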

Experiments

Evaluating four contrastive methods pretrained on CIFAR-10

t-SNE visualization of the learned representations for CIFAR-10

Same data augmentation:

  • random resized crop
  • color jittering

Linear evaluation protocol:

Augmentation and regularisation are not allowed when training the linear classification head
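A minimal sketch of this protocol under the stated constraints (assuming PyTorch; the loop structure and names are illustrative): the pretrained encoder is frozen and only a linear head is trained, without augmentation.

```python
import torch

def linear_probe(encoder, linear, loader, epochs=10, lr=1e-3):
    """Linear evaluation: frozen backbone, train only the linear classifier."""
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad = False                      # freeze the pretrained encoder
    opt = torch.optim.Adam(linear.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():
                h = encoder(x)                       # frozen features
            loss = loss_fn(linear(h), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```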

Experiments

MoCo pretrained on a histopathological dataset

t-SNE visualization of the learned representations

Learned feature maps

Results

benign

malignant

Future directions🌟

"Hard" Negative Mining

Related to the InfoMax principle

(Figure: easy pair vs. hard pair)
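As an illustration only (PyTorch assumed; real hard-negative methods also need to guard against false negatives, i.e. same-class samples treated as negatives), one could rank candidate negatives by their similarity to the anchor:

```python
import torch
import torch.nn.functional as F

def hardest_negatives(anchor, candidates, k=16):
    """Pick the k candidates most similar to the anchor as 'hard' negatives.
    anchor: (D,), candidates: (M, D)."""
    sims = F.normalize(anchor, dim=0) @ F.normalize(candidates, dim=1).t()  # (M,) cosine sims
    hard_idx = sims.topk(k).indices
    return candidates[hard_idx]
```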

Future directions🌟

 

Negative samples might not be necessary

  • Decorrelating information in the representations
  • Reducing redundant information between distorted views

Barlow Twins
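A minimal PyTorch sketch of the Barlow Twins objective, which needs no negative samples: the cross-correlation matrix between the two views' standardised embeddings is pushed towards the identity, so matching dimensions agree (invariance) and different dimensions decorrelate (redundancy reduction).

```python
import torch

def barlow_twins_loss(z1, z2, lam=5e-3):
    """Redundancy-reduction loss for two distorted views.
    z1, z2: embeddings of the two views, shape (N, D)."""
    n, d = z1.shape
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)      # standardise each dimension
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.t() @ z2) / n                            # (D, D) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()   # invariance: diagonal -> 1
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # decorrelation: off-diagonal -> 0
    return on_diag + lam * off_diag
```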