June 21, 2026•1 min read•from Machine Learning

EMA on LoRA? [R]

Hi guys

Does anyone know of papers where EMA on LoRA adapters has been used successfully?

I'm interested in cases where the EMA adapter acts as a self-teacher, generating soft labels for the trainable adapter.

On-policy self-distillation [1] uses an EMA teacher. However, they seem to fully fine-tune the model. Are there any empirical results showing that the idea works on LoRA/PEFT models?
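For concreteness, the setup being asked about can be sketched roughly as follows. This is a minimal illustration in plain Python, not the method from [1]: the function names (`lora_logits`, `ema_update`, `soft_label_loss`), the decay and temperature values, and the KL(teacher || student) direction are all assumptions made for the example. The base weights `W` stay frozen and are shared, so only the small adapter matrices are kept twice (student and EMA teacher).

```python
import copy
import math

def lora_logits(W, adapter, x, scale=1.0):
    """Frozen linear layer W plus a low-rank update: (W + scale * B @ A) @ x."""
    A, B = adapter["A"], adapter["B"]  # A: r x d_in, B: d_out x r
    h = [sum(a * xi for a, xi in zip(row, x)) for row in A]  # A @ x
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + scale * sum(b * hj for b, hj in zip(b_row, h))
        for w_row, b_row in zip(W, B)
    ]

def ema_update(teacher, student, decay=0.99):
    """In place: teacher <- decay * teacher + (1 - decay) * student (adapter weights only)."""
    for name, s_matrix in student.items():
        for t_row, s_row in zip(teacher[name], s_matrix):
            for j, s in enumerate(s_row):
                t_row[j] = decay * t_row[j] + (1.0 - decay) * s

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions (assumed direction)."""
    p = softmax(teacher_logits, temperature)  # soft labels from the EMA teacher
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)

# Toy example: frozen base weights, shared by student and teacher.
W = [[1.0, 0.0], [0.0, 1.0]]
student = {"A": [[0.5, -0.5]], "B": [[0.0], [0.0]]}  # B starts at zero, as is usual for LoRA
teacher = copy.deepcopy(student)                     # EMA teacher starts as a copy

student["B"] = [[0.2], [-0.1]]                       # stand-in for one optimizer step
ema_update(teacher, student, decay=0.9)              # teacher lags behind the student

x = [1.0, 2.0]
loss = soft_label_loss(lora_logits(W, student, x), lora_logits(W, teacher, x))
```

In a real training loop the teacher forward pass would run without gradients, and `ema_update` would be called after each student optimizer step.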

[1] https://arxiv.org/abs/2601.19897

submitted by /u/South-Conference-395