•1 min read•from Machine Learning
EMA on LoRA? [R]
Hi guys
Does anyone know of papers where EMA on LoRA adapters has been used successfully?
I'm interested in cases where the EMA adapter acts as a self-teacher, generating soft labels for the trainable adapter.
On-policy self-distillation [1] uses an EMA of the weights as the teacher. However, they seem to fully fine-tune the model. Are there any empirical results showing that the idea works with LoRA/PEFT models?
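To make the setup concrete, here is a rough sketch of what I have in mind, in plain Python with the adapter weights flattened into lists. The function names, the `decay` and `temperature` values, and the dict layout are all placeholders I made up for illustration, not anything taken from [1]. One thing I'm unsure about: averaging the LoRA factors A and B separately is not the same as averaging the product BA, so the EMA teacher may behave differently than in the full fine-tuning case.

```python
import math

def ema_update(teacher, student, decay=0.999):
    """In-place EMA step: teacher <- decay * teacher + (1 - decay) * student.

    Both arguments map adapter parameter names to flat lists of floats
    (a hypothetical layout, chosen only to keep the sketch dependency-free).
    """
    for name, s_vals in student.items():
        t_vals = teacher[name]
        for i, s in enumerate(s_vals):
            t_vals[i] = decay * t_vals[i] + (1.0 - decay) * s

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_kl(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between softened distributions.

    The teacher distribution plays the role of the soft labels; in a real
    training loop no gradient would flow through the teacher.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Toy usage: one EMA step on a two-element "adapter", then a distillation loss.
teacher = {"lora_A": [0.0, 0.0]}
student = {"lora_A": [1.0, 2.0]}
ema_update(teacher, student, decay=0.9)
loss = soft_label_kl([2.0, 0.5, -1.0], [1.0, 1.0, 0.0], temperature=2.0)
```

In an actual run, the student adapter would be updated by the optimizer on `loss` (plus any task loss), and `ema_update` would be called after each optimizer step on the adapter parameters only, with the base model frozen and shared.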