From Machine Learning

Derivative-Free Neural Network Optimization: MNIST Case [R]

A direct optimization test was conducted on a neural network for MNIST image classification. The network features a 784-32-10 architecture with a total of 25,450 continuous parameters (weights and biases). Instead of employing backpropagation or gradient information, the parameters were optimized using MDP, a Derivative-Free Optimization method.
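
As a quick check on the parameter count, here is a minimal sketch that counts the weights and biases of a 784-32-10 fully connected network and splits a flat vector into per-layer arrays. The helper names (`count_parameters`, `unpack`) and the (W1, b1, W2, b2) ordering are assumptions for illustration, not taken from the project repository.

```python
import numpy as np

# Layer sizes from the post: 784 inputs, 32 hidden units, 10 output classes.
LAYER_SIZES = (784, 32, 10)


def count_parameters(sizes):
    """Count weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


def unpack(theta, sizes=LAYER_SIZES):
    """Split a flat parameter vector into per-layer (W, b) pairs.

    The ordering (W1, b1, W2, b2) is an assumption made for this sketch.
    """
    layers, offset = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = theta[offset:offset + n_in * n_out].reshape(n_in, n_out)
        offset += n_in * n_out
        b = theta[offset:offset + n_out]
        offset += n_out
        layers.append((W, b))
    return layers


n_params = count_parameters(LAYER_SIZES)
print(n_params)  # 784*32 + 32 + 32*10 + 10 = 25450

layers = unpack(np.zeros(n_params))
print([W.shape for W, _ in layers])  # [(784, 32), (32, 10)]
```

Treating all parameters as one flat vector is what lets a general-purpose optimizer see the network as an ordinary 25,450-dimensional function.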

The objective was to directly minimize the Cross-Entropy Loss on a subset of 5,000 training images. Final evaluations were performed on independent validation and test sets.
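
An objective of this kind can be written as a plain function of the flat parameter vector, which is all a derivative-free optimizer needs. The sketch below is one possible form, not the repository's code: the tanh hidden activation, the parameter ordering, and the function name `cross_entropy_loss` are assumptions, since the post does not state them. The data here is random with MNIST-like shapes, not real MNIST.

```python
import numpy as np


def cross_entropy_loss(theta, X, y, n_in=784, n_hidden=32, n_out=10):
    """Mean cross-entropy of a 784-32-10 network with flat parameters theta.

    Assumptions: tanh hidden activation and (W1, b1, W2, b2) ordering.
    """
    i = n_in * n_hidden
    W1 = theta[:i].reshape(n_in, n_hidden)
    b1 = theta[i:i + n_hidden]
    j = i + n_hidden
    k = j + n_hidden * n_out
    W2 = theta[j:k].reshape(n_hidden, n_out)
    b2 = theta[k:k + n_out]

    hidden = np.tanh(X @ W1 + b1)
    logits = hidden @ W2 + b2
    # Numerically stable log-softmax: subtract the row maximum first.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability of the correct class.
    return float(-log_probs[np.arange(len(y)), y].mean())


# Random stand-in data with MNIST shapes (not real MNIST).
rng = np.random.default_rng(0)
X = rng.random((8, 784))
y = rng.integers(0, 10, size=8)

theta = np.zeros(25450)
loss = cross_entropy_loss(theta, X, y)
print(loss)  # all-zero parameters give uniform predictions, so loss = ln(10)
```

Each call to this function on the 5,000-image subset would count as one function evaluation.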

In the best run, MDP achieved an objective loss of 0.0004083, a validation accuracy of 93.7%, and a test accuracy of 93.4%. For comparison, an Adam baseline on the same network architecture reached a final loss of 0.002945, a validation accuracy of 91.8%, and a test accuracy of 91.7%, so the best MDP run came out ahead on all three measures.

Notably, the optimization ran over a 25,450-dimensional search space and converged over the course of 1,000,000 function evaluations, without relying on gradients or population-based methods.
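
To make "no gradients, no population" concrete, here is a minimal sketch of a generic single-point random search with a fixed evaluation budget. This is a stand-in technique, not the MDP method, whose details are not described in the post. The function name `random_local_search`, the step-size factors, and the toy `sphere` objective are all assumptions for illustration.

```python
import numpy as np


def random_local_search(objective, theta0, budget=2000, step=0.5, seed=0):
    """Single-point random search with a simple step-size adaptation.

    This is a generic stand-in, NOT the MDP method from the post.
    It keeps one current point and only ever calls objective(theta).
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    best = objective(theta)
    evaluations = 1
    while evaluations < budget:
        candidate = theta + step * rng.standard_normal(theta.shape)
        value = objective(candidate)
        evaluations += 1
        if value < best:
            theta, best = candidate, value
            step *= 1.5  # expand the step after a success
        else:
            step *= 0.9  # shrink the step after a failure
    return theta, best, evaluations


def sphere(x):
    """Toy objective used only to exercise the optimizer."""
    return float(np.sum(x ** 2))


start = np.full(10, 3.0)
theta, best, used = random_local_search(sphere, start)
print(best, used)
```

A naive search like this is not expected to scale to 25,450 dimensions the way the post reports for MDP. It only shows the interface: the optimizer sees a black-box function and a budget of evaluations, nothing else.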

The code for this test, along with other Python implementation examples, is available in the examples folder of the official project repository:

https://github.com/misa-hdez/sgo-lab

submitted by /u/Mis4318

Tagged with

#Derivative-Free Optimization
#Neural Network
#MNIST
#Image Classification
#Optimization
#MDP
#Backpropagation
#Gradient
#Cross-Entropy Loss
#Validation Accuracy
#Test Accuracy
#Adam
#Architecture