[P] Built a portable GPU ISA after reading too many architecture manuals
I’ve been reading GPU architecture docs in my free time. NVIDIA PTX, AMD ISA reference guides, Intel Xe, reverse-engineered Apple GPU stuff. Over 5,000 pages across 16 microarchitectures.
After a while you notice all four vendors are doing the same 11 things with different names. So I wrote a spec that covers all of them and built a toolchain around it. It’s called WAVE. You write a kernel once, it compiles to a portable binary, then thin backends translate it to Metal, PTX, HIP, or SYCL.
The same binary has been verified on an Apple M4 Pro, an NVIDIA T4, and an AMD MI300X. My co-author Onyinye built the PyTorch integration and got identical training results across all backends.
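To make the "same things, different names" point concrete, here is a toy Python sketch. It is not WAVE's code or API: the portable op names (`workgroup_barrier`, `local_thread_id_x`) and the `lower` function are made up for illustration. Only the target-side spellings are real. The idea is that a thin backend can be little more than a lookup from a portable concept to what each target calls it.

```python
# Toy illustration only: the op names and lower() are hypothetical,
# not WAVE's actual instruction set or API.
# Each table maps one portable concept to the spelling a given target uses.
BACKENDS = {
    "metal": {
        "workgroup_barrier": "threadgroup_barrier(mem_flags::mem_threadgroup)",
        # Metal exposes the local thread index as a kernel-argument attribute.
        "local_thread_id_x": "[[thread_position_in_threadgroup]]",
    },
    "ptx": {
        "workgroup_barrier": "bar.sync 0",
        "local_thread_id_x": "%tid.x",
    },
    "hip": {
        "workgroup_barrier": "__syncthreads()",
        "local_thread_id_x": "threadIdx.x",
    },
    "sycl": {
        "workgroup_barrier": "sycl::group_barrier(item.get_group())",
        "local_thread_id_x": "item.get_local_id(0)",
    },
}


def lower(program, target):
    """Translate a list of portable op names into the target's spelling."""
    table = BACKENDS[target]
    return [table[op] for op in program]


# One portable "program", four target-specific renderings.
program = ["local_thread_id_x", "workgroup_barrier"]
for target in BACKENDS:
    print(target, lower(program, target))
```

A real backend obviously has to handle registers, memory spaces, and control flow too. The sketch only shows why the translation layer can stay thin when the underlying concepts line up.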
Please star on GitHub: https://github.com/Oabraham1/wave
Preprint: https://arxiv.org/abs/2603.28793
Read full docs and how I built everything: https://wave.ojima.me
pip install wave-gpu