Bohua Peng
4 May 2021
Plato’s allegory of the cave
A lower-dimensional representation
Due to information loss, the shadows cannot fully convey the objects that cast them ...
The same goes for deep representation learning
Embedding zoo
word embeddings
wave embeddings
face embeddings
Can we learn representations without human annotations?
Handcrafted pretext tasks
Learning by predicting one part of the data from the rest
These methods are very flexible. However, the representations they learn are often not good enough for downstream tasks ...
These heuristics are quite fragile: they need a set of carefully pre-defined configurations that make the task difficult, but not too difficult, otherwise little useful representation is learned ...
Handcrafted pretext tasks
[1] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016
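A minimal sketch of a jigsaw-style pretext task in the spirit of [1]. This is a hypothetical simplification, not the paper's setup: a 2x2 grid and all 24 patch permutations instead of the 3x3 grid with a curated permutation set.

```python
# Jigsaw-style pretext task (simplified sketch). Assumes square images with even height/width.
import itertools
import torch

PERMS = torch.tensor(list(itertools.permutations(range(4))))  # all 24 orderings of 4 patches

def make_jigsaw(batch):
    """Split each image into 2x2 patches, shuffle them, return (shuffled patches, labels)."""
    b, c, h, w = batch.shape
    patches = batch.unfold(2, h // 2, h // 2).unfold(3, w // 2, w // 2)
    patches = patches.reshape(b, c, 4, h // 2, w // 2)          # 4 patches per image
    labels = torch.randint(len(PERMS), (b,))                    # which permutation is applied
    shuffled = torch.stack([patches[i][:, PERMS[labels[i]]] for i in range(b)])
    return shuffled, labels

# A patch encoder plus a classification head is then trained to predict `labels`,
# i.e. which permutation produced the shuffled input.
```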
Generative methods
VAE
Learning by reconstructing
Limitations:
Pros:
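As a concrete sketch of learning by reconstructing, a minimal VAE objective in PyTorch. The architecture, input dimension and Gaussian prior/posterior below are illustrative placeholders, not the slides' setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, dim_in=784, dim_z=32):
        super().__init__()
        self.enc = nn.Linear(dim_in, 2 * dim_z)   # outputs mean and log-variance
        self.dec = nn.Linear(dim_z, dim_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()            # reparameterisation trick
        x_hat = self.dec(z)
        recon = F.mse_loss(x_hat, x, reduction="mean")                   # reconstruction term
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())    # KL to N(0, I)
        return recon + kl                                                # learning by reconstructing
```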
Generative methods
Learning by generating - GAN
Problems:
Self-Supervised GANs via Auxiliary Rotation Loss CVPR 2019
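A sketch of the auxiliary rotation loss idea from the CVPR 2019 paper above: the discriminator gets an extra 4-way head that predicts which multiple of 90 degrees an image was rotated by. Function names and the loss weight are illustrative, not the paper's configuration.

```python
import torch
import torch.nn.functional as F

def rotate_batch(x):
    """Rotate each (square) image by a random multiple of 90 degrees; return images and labels."""
    labels = torch.randint(4, (x.size(0),), device=x.device)
    rotated = torch.stack([torch.rot90(img, int(k), dims=(1, 2)) for img, k in zip(x, labels)])
    return rotated, labels

def rotation_aux_loss(rotation_logits, labels, weight=1.0):
    # Cross-entropy on the 4-way rotation prediction, added on top of the usual GAN losses.
    return weight * F.cross_entropy(rotation_logits, labels)
```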
Is there a more general proxy task to learn from than reconstruction, e.g., relational classification?
Can we avoid trading off generality against interpretability?
Pulling semantically similar objects close to each other with a contrastive loss
Learning by comparing
Let Y = 0 if the anchor X1 and the input image X2 are from the same person.
[2] Learning a Similarity Metric Discriminatively, with Application to Face Verification, CVPR 2005
contrastive loss:
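Shown below is the widely used margin-based form of the pairwise contrastive loss, following the convention of [2] that Y = 0 for similar pairs and Y = 1 for dissimilar pairs (the squared-hinge penalty shown here is the common later variant rather than [2]'s original energy functions):

$$
L(W, Y, X_1, X_2) \;=\; (1 - Y)\,\tfrac{1}{2}\,D_W^2 \;+\; Y\,\tfrac{1}{2}\,\big[\max(0,\, m - D_W)\big]^2,
\qquad D_W = \lVert G_W(X_1) - G_W(X_2) \rVert_2
$$

where G_W is the shared (siamese) embedding network and m > 0 is a margin.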
Pushing semantically dissimilar objects away from each other with a contrastive loss
Let Y = 1 if the anchor X1 and the input image X2 are from different people.
https://keras.io/examples/vision/siamese_network/
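Independently of the linked Keras example, a minimal PyTorch sketch of the same pairwise loss (only the loss, not the full siamese network):

```python
import torch
import torch.nn.functional as F

def pairwise_contrastive_loss(z1, z2, y, margin=1.0):
    """z1, z2: embeddings of the two inputs; y: float tensor, 0 = same identity, 1 = different."""
    d = F.pairwise_distance(z1, z2)            # Euclidean distance D_W between embeddings
    pull = (1 - y) * d.pow(2)                  # attract similar pairs
    push = y * F.relu(margin - d).pow(2)       # repel dissimilar pairs up to the margin
    return 0.5 * (pull + push).mean()
```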
Consider views of the same image as positive keys and views of other images as negative keys
Instance level discrimination
🤔
However, it suffers from mode collapse
The success story of SimCLR
SimCLR 2020
Normalized temperature-scaled cosine similarity
Swapped-version loss to avoid mode collapse
InfoNCE
(contrastive cross entropy)
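A minimal sketch of the InfoNCE / NT-Xent objective used by SimCLR, simplified to a single direction (the full loss symmetrises over both views and uses all 2N - 1 other samples as negatives):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """z1, z2: (batch, dim) projections of two augmented views of the same batch of images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)   # cosine similarity via dot product
    logits = z1 @ z2.t() / temperature                        # temperature-scaled similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)      # positive key is the diagonal entry
    return F.cross_entropy(logits, targets)                   # "contrastive cross entropy"
```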
Verify 4 contrastive methods pretrained on CIFAR10
t-SNE visualization of learned representations for CIFAR10
Same data augmentation:
random resized crop
color jittering
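A sketch of this shared augmentation pipeline with torchvision; the crop size and jitter strengths are assumed values, not the experiments' exact settings.

```python
from torchvision import transforms

pretrain_augment = transforms.Compose([
    transforms.RandomResizedCrop(32),               # random resized crop (CIFAR10-sized images)
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),     # colour jittering
    transforms.ToTensor(),
])
```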
Linear evaluation protocol:
Augmentation and regularisation are not allowed when fine-tuning the linear head
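A sketch of this linear evaluation setup: the pretrained encoder is frozen and only a linear classifier is trained, with no augmentation or extra regularisation. `encoder` and `feature_dim` are placeholders for the model being evaluated.

```python
import torch
import torch.nn as nn

def linear_probe(encoder, feature_dim, num_classes=10):
    for p in encoder.parameters():
        p.requires_grad = False                                 # freeze the pretrained backbone
    head = nn.Linear(feature_dim, num_classes)                  # only this layer is trained
    optimizer = torch.optim.SGD(head.parameters(), lr=0.1)      # no weight decay / regularisation
    return head, optimizer
```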
MoCo pretrained on histopathological dataset
t-SNE visualization of learned representations for the histopathological dataset
Learned feature maps
Results
benign
malignant
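For reference, a sketch of the two mechanisms that distinguish MoCo: a momentum (EMA) key encoder and a queue of past keys used as negatives. Hyperparameter values below are illustrative, not the paper's.

```python
import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    """Update the key encoder as an exponential moving average of the query encoder."""
    for pq, pk in zip(encoder_q.parameters(), encoder_k.parameters()):
        pk.data = m * pk.data + (1.0 - m) * pq.data

@torch.no_grad()
def enqueue(queue, new_keys, max_size=65536):
    """Append the newest keys and drop the oldest so the negative queue stays fixed-size."""
    queue = torch.cat([queue, new_keys], dim=0)
    return queue[-max_size:]
```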
"Hard" Negative Mining
Related to InfoMax
easy pair
hard pair
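A sketch of hard negative mining: rank candidate negatives by their similarity to the anchor and keep the most similar (hard) ones rather than random easy pairs. The value of `k` and the use of cosine similarity are assumptions.

```python
import torch
import torch.nn.functional as F

def hardest_negatives(anchor, negatives, k=16):
    """anchor: (dim,) embedding; negatives: (n, dim) candidate negatives, with n >= k."""
    sims = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=1)   # higher = harder negative
    hard_idx = sims.topk(k).indices
    return negatives[hard_idx]
```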
Negative samples might not be necessary
Barlow Twins
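A sketch of the Barlow Twins objective, which avoids negative samples by pushing the cross-correlation matrix of the two views' (batch-normalised) embeddings towards the identity. The value of `lambd` is illustrative.

```python
import torch

def barlow_twins_loss(z1, z2, lambd=5e-3):
    """z1, z2: (batch, dim) embeddings of two views; assumes batch statistics are well-defined."""
    n, d = z1.shape
    z1 = (z1 - z1.mean(0)) / z1.std(0)                # normalise each feature over the batch
    z2 = (z2 - z2.mean(0)) / z2.std(0)
    c = (z1.t() @ z2) / n                             # (d, d) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()    # invariance term: diagonal -> 1
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()   # redundancy reduction: off-diagonal -> 0
    return on_diag + lambd * off_diag
```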