Multimodal AI - Part 1: CLIP

Multimodal AI has always fascinated me, because I believe our understanding of the world comes from more than just language - it’s also vision and audio (and touch and smell, of course, but those might be outside the scope of what I’m capable of exploring in my own time). Since working at Adobe, I’ve gained an even deeper appreciation for how hard it is to teach machines to reason about the visual world, and beyond that, the physical world. ...

March 18, 2026 · Nhi