A slightly improved DVD-JEPA demo [P]

Hey!

I came across this post, which I found quite neat as a minimal demonstration of JEPA. However, as the comments pointed out, there was some room for improvement. So I added a few things such as environment noise and a fair* comparison to a pixel-space baseline.

I think including environment noise is pretty important, since LeCun himself has often and clearly stated that one of the key motivations for JEPA is its ability to disregard unpredictable and irrelevant environment details.
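
To make that point concrete, here is a toy sketch of the noise-floor argument. It is not code from the fork, and everything in it is an assumption made for illustration: the 2-D rotating "signal", the two extra pure-noise dimensions, the hypothetical function name `noise_floor_demo`, and above all the hand-picked encoder that simply drops the noise dimensions. A real JEPA has to learn its encoder and needs something to prevent collapse. The sketch only shows that a predictor scored in observation space cannot get below the noise variance, while one scored in a noise-free latent space can.

```python
import numpy as np


def noise_floor_demo(n_steps=2000, noise_std=1.0, seed=0):
    """Compare prediction error in observation space vs. a hand-picked latent space."""
    rng = np.random.default_rng(seed)

    # Predictable part: a 2-D point rotated by a fixed angle every step.
    theta = 0.1
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    signal = np.empty((n_steps, 2))
    signal[0] = [1.0, 0.0]
    for t in range(1, n_steps):
        signal[t] = rot @ signal[t - 1]

    # Unpredictable part: fresh Gaussian noise in two extra dimensions each step.
    noise = rng.normal(0.0, noise_std, size=(n_steps, 2))
    obs = np.hstack([signal, noise])  # shape (n_steps, 4)

    x, y = obs[:-1], obs[1:]

    # Observation-space baseline: least-squares linear map from x_t to x_{t+1}.
    w_obs, *_ = np.linalg.lstsq(x, y, rcond=None)
    obs_mse = float(np.mean((x @ w_obs - y) ** 2))

    # Latent-space prediction with an ASSUMED encoder that keeps only the
    # first two (predictable) coordinates. A real JEPA would learn this.
    def encode(o):
        return o[:, :2]

    zx, zy = encode(x), encode(y)
    w_lat, *_ = np.linalg.lstsq(zx, zy, rcond=None)
    latent_mse = float(np.mean((zx @ w_lat - zy) ** 2))

    return obs_mse, latent_mse


if __name__ == "__main__":
    obs_mse, latent_mse = noise_floor_demo()
    print(f"observation-space MSE: {obs_mse:.4f}")
    print(f"latent-space MSE:      {latent_mse:.2e}")
```

With the default settings, the observation-space error stays stuck near half the noise variance (two of the four dimensions are pure noise), while the latent-space error is numerically zero. The demo in the post is about whether a learned encoder ends up somewhere similar.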

Anyway, here's the result, which I think speaks for itself:

https://i.redd.it/kadcsrx4nn8h1.gif

I think my version paints a much clearer picture of JEPA's promise. I did remove the web demo and the anomaly detection bit, as I felt they weren't that important to the core demonstration of JEPA as an idea.

Linking my fork for those interested. Note: since this was a very quick afternoon project, I did use AI to make most of the changes, though I did try to do so thoughtfully. Hate that if you must.

*fair as in: roughly the same parameter count and compute budget. I treated the compute spent on the linear probe and decoder as separate from core model training.

submitted by /u/Kirne