from Machine Learning

I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R]

RPS (Regressive Plasticity Schedule) is inspired by neuroscience. Humans learn basic skills as kids, when neuroplasticity is high, and advanced skills as teens and adults, when neuroplasticity is lower. RPS trains a model in 2 stages. In stage 1, the model is trained on easy data with a high learning rate. In stage 2, the model is trained on hard data with 10% of the stage 1 learning rate. RPS is basically a combination of existing ideas: curriculum learning + learning rate decay.
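Here is a minimal sketch of the two-stage schedule described above, not the code from the linked repo. The names `Stage`, `build_two_stage_plan`, and `iter_steps`, the placeholder examples, and the base learning rate of 1e-4 are all assumptions made for illustration. The only details taken from the post are the easy-then-hard ordering and the 10% stage 2 learning rate. Setting the ratio to 1.0 gives the EPS baseline (equal learning rate in both stages).

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    examples: Sequence[str]  # placeholders standing in for real training examples
    learning_rate: float


def build_two_stage_plan(
    easy: Sequence[str],
    hard: Sequence[str],
    base_lr: float,
    stage2_lr_ratio: float = 0.1,  # 10% of the stage 1 rate, as in the post
) -> List[Stage]:
    """Return two stages: easy data at base_lr, then hard data at a scaled rate."""
    return [
        Stage("stage1_easy", easy, base_lr),
        Stage("stage2_hard", hard, base_lr * stage2_lr_ratio),
    ]


def iter_steps(plan: Sequence[Stage]) -> Iterator[Tuple[str, str, float]]:
    """Yield (stage name, example, learning rate) in training order."""
    for stage in plan:
        for example in stage.examples:
            yield stage.name, example, stage.learning_rate


# base_lr=1e-4 is an assumed value; the post does not state the actual rates.
rps_plan = build_two_stage_plan(["easy_a", "easy_b"], ["hard_a"], base_lr=1e-4)
eps_plan = build_two_stage_plan(
    ["easy_a", "easy_b"], ["hard_a"], base_lr=1e-4, stage2_lr_ratio=1.0
)

for stage_name, example, lr in iter_steps(rps_plan):
    print(f"{stage_name}: train on {example} with lr={lr:.1e}")
```

In a real training loop, each yielded learning rate would be handed to the optimizer before the corresponding update step.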

ARC-AGI 1 public eval scores:

base model: Qwen3-8b

RPS: 4%

EPS (equal learning rate in both stages): 2.4%

Program Synthesis Stats:

Program executions without error:

RPS: 1145/1200 (95.4%)

EPS: 870/1200 (72.5%)
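The same counts can be read as failures instead of successes. This is only arithmetic on the numbers above; the names `runs` and `failures` are made up for the example.

```python
# Successful executions out of total attempts, copied from the stats above.
runs = {"RPS": (1145, 1200), "EPS": (870, 1200)}

# Failed executions per method: total minus successful.
failures = {name: total - ok for name, (ok, total) in runs.items()}

for name, failed in failures.items():
    total = runs[name][1]
    print(f"{name}: {failed} failed executions out of {total}")

# How many times more often EPS-trained programs failed than RPS-trained ones.
ratio = failures["EPS"] / failures["RPS"]
print(f"EPS/RPS failure ratio: {ratio:.1f}")
```

RPS produced 55 failing programs against 330 for EPS, so EPS failed 6 times as often in this run.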

https://iamjasonfeng.blogspot.com/2026/05/regressive-plasticity-schedule.html

https://github.com/iamjasonfeng/RPS

submitted by /u/iamjasonfeng