
EMA-Gated Temporal Sequence Compression in Vision Transformers [P]

Vision Transformers waste 90% of their compute recalculating stationary asphalt. NeuroFlow tracks semantic surprise in embedding space, physically eliminating background tokens before the encoder.

Result: 55.8× wall-clock speedup for ViTs on high-res video (1792p) with 97% fidelity. No fine-tuning required.

NeuroFlow is a dynamic routing framework for Vision Transformer video inference. It exploits temporal redundancy by tracking per-patch semantic surprise via an Exponential Moving Average (EMA) of patch-level embeddings, addressing the mismatch between O(N²) self-attention and highly redundant natural video streams.
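To make the gating idea concrete, here is a minimal NumPy sketch of an EMA-based surprise gate. It is not taken from the NeuroFlow code: the function name `ema_gate`, the cosine-distance surprise measure, the decay `alpha`, and the threshold `tau` are all assumptions chosen for illustration, and the repo may define surprise differently.

```python
import numpy as np

def ema_gate(patches, ema, alpha=0.9, tau=0.1):
    """Flag patches whose embedding drifted away from its running average.

    patches, ema: arrays of shape (N, D), one row per patch.
    alpha (EMA decay) and tau (surprise threshold) are illustrative values.
    Returns (active_mask, updated_ema).
    """
    eps = 1e-8
    dot = np.sum(patches * ema, axis=1)
    norms = np.linalg.norm(patches, axis=1) * np.linalg.norm(ema, axis=1) + eps
    surprise = 1.0 - dot / norms          # assumed metric: cosine distance to the EMA
    active = surprise > tau               # True = patch must be recomputed
    new_ema = alpha * ema + (1.0 - alpha) * patches
    return active, new_ema

# Toy frames: 16 patches with 8-dim embeddings; only three patches change.
rng = np.random.default_rng(0)
frame0 = rng.normal(size=(16, 8))
frame1 = frame0.copy()
frame1[:3] = -frame0[:3]                  # sharp change in the first three patches

active, ema = ema_gate(frame1, ema=frame0)
print(int(active.sum()), "of", active.size, "patches flagged as surprising")
```

In this toy setup the 13 unchanged patches fall below the threshold and would be skipped, which is the temporal redundancy the post describes.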

Key Contributions

  • Architecture C (Dual-Memory Reconstruction): A completely training-free inference engine that combines a Layer 0 Retinal Gate with a Layer 12 Cortical Cache. It achieves 71.55% zero-shot top-1 accuracy at 84.0% token sparsity on SigLIP, retaining 92.4% of dense accuracy without modifying any weights.
  • Architecture B (Extreme Wall-Clock Speedup): Physically eliminates stationary tokens before the encoder. With sparse manifold distillation, it reduces 1792p SigLIP 2 inference from 678 ms to 11.9 ms, a 55.80× wall-clock speedup at 97.37% embedding fidelity.
  • LLM Ablation: Characterises the architectural boundaries of applying similarity-gated bypass to autoregressive language models (Phi-3-mini), demonstrating 0% token drift in syntactically constrained generation.
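
As a rough illustration of why dropping tokens before the encoder pays off, the sketch below runs a stand-in encoder on the active tokens only and reuses cached outputs for the rest. The names `sparse_encode` and `attention_pair_fraction`, the placeholder encoder, and the cache-merging step are hypothetical; the post does not say how NeuroFlow merges cached and fresh features.

```python
import numpy as np

def sparse_encode(tokens, active, cache, encoder):
    """Encode only the active tokens; keep cached outputs for the others.

    tokens: (N, D) inputs, active: (N,) boolean mask, cache: (N, D) outputs
    from earlier frames. `encoder` is any callable mapping (K, D) -> (K, D).
    """
    out = cache.copy()
    if active.any():
        out[active] = encoder(tokens[active])   # the encoder sees K <= N tokens
    return out

def attention_pair_fraction(sparsity):
    """Fraction of pairwise attention scores left after pruning.

    Covers only the quadratic attention term, not the linear MLP cost.
    """
    keep = 1.0 - sparsity
    return keep * keep

tokens = np.arange(24, dtype=float).reshape(6, 4)
active = np.array([True, False, False, True, False, False])
cache = np.zeros_like(tokens)

out = sparse_encode(tokens, active, cache, encoder=lambda x: 2.0 * x)  # toy encoder
frac = attention_pair_fraction(0.84)   # 84.0% sparsity, as quoted for Architecture C
print(f"attention pairs remaining: {frac:.4f}")
```

At 84.0% sparsity only 16% of tokens remain, so the pairwise attention term shrinks to roughly 0.16² ≈ 2.6% of the dense cost. Measured wall-clock gains also depend on the non-attention layers and on gather/scatter overhead.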

Code and paper: https://github.com/ynnk-research/-NeuroFlow

submitted by /u/Bobby-Ly

