Looking for arXiv endorsement (cs.CV) to post my ViT positional embeddings paper [R]

Hi everyone,

I'm looking for someone to endorse me for an arXiv submission in cs.CV (computer vision) or cs.LG (machine learning). The paper is complete and I'd like to upload it as a preprint.

About the paper:

Title: Positional Encodings in Vision Transformers: A Geometric Account of Spatial Organization and Robustness

Summary: This paper investigates how different positional encoding schemes (learned absolute, sinusoidal, and rotary) shape the internal representations of Vision Transformers. We introduce a metric called Spatial Similarity Distance Correlation (SSDC) to quantify spatial structure in token representations. Using controlled interventions (random permutation at inference, random permutation training, and positional magnitude scaling), we show that:

  1. ViTs develop non‑trivial spatial structure even without positional embeddings, but this structure is content‑driven and collapses under token permutation.

  2. All positional encodings shift models toward index‑anchored spatial organization that persists under content disruption.

  3. Robustness to distributional shifts (JPEG compression, Gaussian blur) is primarily associated with the presence of a stable positional reference frame and correlates directly with SSDC as measured under intervention.
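
To make the idea of a similarity/distance correlation concrete, here is a minimal sketch of one plausible reading of such a metric: the Pearson correlation between pairwise cosine similarity of tokens and their Euclidean distance on the patch grid. This is an assumption for illustration, not the paper's definition of SSDC; the function names, the toy 4x4 grid, and the synthetic token features are all hypothetical, and the paper's sign convention may differ. It also mimics the permutation intervention by shuffling tokens while still measuring distance by index.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def similarity_distance_correlation(tokens, grid_w):
    """Hypothetical SSDC-like score (assumption, not the paper's formula).

    Token index i is mapped to grid position (i // grid_w, i % grid_w).
    Returns the correlation between pairwise cosine similarity and
    pairwise grid distance; strongly negative means nearby tokens are
    more similar than distant ones.
    """
    sims, dists = [], []
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens)):
            row_i, col_i = divmod(i, grid_w)
            row_j, col_j = divmod(j, grid_w)
            dists.append(math.hypot(row_i - row_j, col_i - col_j))
            sims.append(cosine(tokens[i], tokens[j]))
    return pearson(sims, dists)


# Toy tokens whose features vary smoothly with grid position (synthetic data).
GRID = 4
tokens = [
    [math.cos(0.5 * r), math.sin(0.5 * r), math.cos(0.5 * c), math.sin(0.5 * c)]
    for r in range(GRID)
    for c in range(GRID)
]

structured = similarity_distance_correlation(tokens, GRID)

# Permutation intervention: shuffle token order, keep index-based distances.
shuffled = tokens[:]
random.Random(0).shuffle(shuffled)
permuted = similarity_distance_correlation(shuffled, GRID)

print(f"structured: {structured:.3f}, permuted: {permuted:.3f}")
```

In this toy setup the structure lives entirely in the token content, so shuffling the tokens weakens the correlation with index-based distance. That is the same kind of collapse point 1 describes for models without positional embeddings.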

The paper includes experiments on ImageNet‑100 with ViT‑S models, multiple random seeds, and full statistical reporting.

PDF available at: https://github.com/mahmoud-mannes/neurips-geometry-paper/blob/main/paper/main.pdf

submitted by /u/Octacinth
