model-agnostic sensitivity approximator [P]
(to preface, i'm 16 and this is the first package i've ever built. any feedback would be appreciated!)
what i've noticed is that most industry-standard xai tools (think shap/lime) focus on feature attribution (why did the model made this prediction), but it doesn't do anything further.
i wanted to go a step beyond that, so i built a tool that approximates ∂[prediction]/∂[feature], basically how sensitive the model prediction is to each feature of a given instance, allowing for effective risk management in areas where knowing how to change a prediction is more important than understanding the prediction itself.
it's meant to be used for continuous and nondifferentiable black box models, especially ones like random forest or xgb.
it uses a perturbation-based approach (heavily inspired by LIME, i really like that tool), where it pertubs each feature within a given window of the instance (window size controlled by feature distribution), and then computes secant slopes ( (f(perturbation) - f(original)) / (perturbation-original) ) for each perturbation and uses a linear regression (x=perturbation, y=secant slope) to estimate slope at original instance. secant slopes are gaussian weighted based on the perturbation's distance from original value.
to be honest, the results were a little underwhelming. i compared my tool to simply using centered finite differences ( (f(x+h)-f(x-h)) / 2h where h is small ), and found that its performance was marginal on a pytorch nn (using autograd for ground truth). however, on a random forest model where gradients couldn't be analytically found, my tool's sensitivties remained much more stable compared to CFD, whose sensitivities depended heavily on size of the epsilon (the h-value).
if you wanted to try it out it's pip install sage-explainer. more info on my github repo yashkher-123/sage.
[link] [comments]
Want to read more?
Check out the full article on the original site