The Blog



ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision