Practical stability of Shapley values in the interpretation of AI models

For N features, we need to go through 2^N feature combinations to fully evaluate the contribution of each one. In practice this is not done – approximation is used instead.

This raises questions:

  • Is such a rough calculation enough for us (especially when we have an image that we would like to evaluate in as much detail as possible)?

  • How much of a Shapley value is random, and how much is actually driven by the data?

In this post we will deal with them!

My name is Sabrina, and I love the field of explainable AI from head to toe!

Over time, I have managed to get involved in giving seminars in the field and writing a course, and now I am writing a thesis, which (I hope) will grow into a PhD dissertation and beyond (unless I decide to give it all up and move to the forest).

I also run DataBlog, where I try to popularize the area as best I can, because I think it is important and beautiful. I will be glad to meet like-minded people and am open to discussion!

Roughness in the calculations

Why?

First, let's figure out what values we need to calculate and why we need to approximate at all. Let us turn to the full formula for Shapley values. For a feature i in a model f, the Shapley value is:

[1]    \phi_i(f) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!(|N| - |S| - 1)!}{|N|!} \left( f(S \cup \{i\}) - f(S) \right)

Where:

N – the set of all features,

S – a subset of features that excludes i,

f(S \cup \{i\}) - f(S) – the contribution of feature i when it is added to the subset S,

f – the function that reflects the payoff in the game. This could be a change in a quality metric, the model's prediction, or the loss. In the shap library the payoff is the model's predicted value, and \phi_i(f) is the resulting Shapley value of feature i.

Intuitive meaning:

Shapley values compute the "average" contribution of feature i over all possible combinations of the other features, which gives a fair distribution of the contributions. And everything would be fine if it weren't for the dreadful factorials (the "!" signs) and the number of subsets we need to examine. With N features we have to consider 2^N subsets, which becomes unpleasant starting from roughly N = 15. And on top of that we also need to run model inference for every subset... This is where approximation helps us.
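To make the combinatorial blow-up tangible, here is a minimal brute-force sketch of formula [1] for a toy cooperative game. The value function and feature names are made up purely for illustration; the nested loop over all subsets is exactly the 2^N enumeration we want to avoid.

```python
# A brute-force sketch of formula [1] for a toy payoff function.
from itertools import combinations
from math import factorial

def exact_shapley(value, features):
    """Exact Shapley values; runtime grows as 2^N, hence the need for approximation."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            for S in combinations(others, size):
                S = frozenset(S)
                # weight |S|! (N - |S| - 1)! / N! from formula [1]
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (value(S | {i}) - value(S))
        phi[i] = total
    return phi

# Toy payoff: the coalition's value is just the sum of "feature strengths".
strengths = {"a": 1.0, "b": 2.0, "c": 3.0}
value = lambda S: sum(strengths[f] for f in S)
print(exact_shapley(value, list(strengths)))  # {'a': 1.0, 'b': 2.0, 'c': 3.0}
```

For an additive payoff like this the Shapley value of each feature is simply its own strength, which is a handy sanity check.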

How?

Approximation is a method of replacing some objects with others that are close to the originals in some sense but simpler [Wiki]. In our context, it is a way to simplify heavy computations by using approximate methods instead of exact Shapley values.

The original article describes several approaches to approximation. They fall into two categories: model-agnostic approximations (those that can be used with any model) and model-specific ones (those that rely on a particular model structure).

The following are proposed as model-agnostic approximations:

  • Shapley regression values – we calculate feature importance using formula [1], with linear models used as f. This approach handles multicollinearity well: if there are at least two features that can replace each other, adding the second to a coalition that already contains the first will not change the prediction significantly (if at all).

  • Shapley sampling values – based on random sampling of feature subsets. That is, instead of all 2^N subsets we take only some of them and average the results. The downside is that the accuracy still depends on the number of samples analyzed (a minimal sketch of this idea follows after this list).

  • Quantitative input influence – I couldn't find reliable information about it; if anyone knows more, you're welcome to share!
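To illustrate the sampling bullet above, here is a minimal sketch of the Monte Carlo idea: instead of enumerating all 2^N subsets, we average the marginal contribution of a feature over random permutations. The toy value function here is an assumption made purely for the example.

```python
# Shapley sampling sketch: average the marginal contribution of feature i
# over random feature permutations instead of enumerating all 2^N subsets.
import numpy as np

def sampled_shapley(value, n_features, i, n_samples=1000, seed=0):
    """Monte Carlo estimate of phi_i for a value function defined on index sets."""
    rng = np.random.default_rng(seed)
    contributions = []
    for _ in range(n_samples):
        perm = rng.permutation(n_features)
        pos = int(np.where(perm == i)[0][0])
        S = frozenset(perm[:pos])                 # features preceding i in the permutation
        contributions.append(value(S | {i}) - value(S))
    return float(np.mean(contributions))          # accuracy depends on n_samples

# Toy check against the additive payoff from the earlier sketch.
strengths = np.array([1.0, 2.0, 3.0])
value = lambda S: float(sum(strengths[j] for j in S))
print(sampled_shapley(value, 3, i=2))  # close to 3.0
```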

As model-specific approximations:

  • Linear SHAP – a method for linear models. Its beauty lies in using information that already exists: the coefficients of the linear model. The contribution is computed as the weight multiplied by the difference between the feature's value at the given point and its expected value (the observed average). The formula looks like this:

\phi_i(f, x) = w_i(x_i - E[x_i]),

where
w_i – the coefficient of feature i,
x_i – the value of the feature at the point in question,
E[x_i] = \frac{1}{n}\sum_{j=1}^{n} x_i^{(j)} – the expected value of the feature (the sample mean of feature i over the n data points).

Interpreting the formula once more: we look at how the deviation of a feature from its expectation affects the prediction (a small sketch follows after this list).

  • Deep SHAP – as the name suggests, a method for deep neural networks. Under the hood it uses DeepLIFT, a backpropagation-based attribution method.
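Returning to the Linear SHAP formula above, here is a small sketch that computes \phi_i = w_i(x_i - E[x_i]) by hand on a synthetic regression problem. The data and the sklearn model are illustrative assumptions, not anything from the original article.

```python
# A minimal check of the Linear SHAP formula phi_i = w_i * (x_i - E[x_i]).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X, y)
x = X[0]                                   # the point we want to explain
phi = model.coef_ * (x - X.mean(axis=0))   # per-feature contributions

# Local accuracy for the linear case: contributions plus the base value
# (the prediction at the feature means) recover the model's prediction exactly.
base = model.predict(X.mean(axis=0, keepdims=True))[0]
print(phi, phi.sum() + base, model.predict(x[None, :])[0])
```

Each \phi_i here literally measures how far the feature sits from its average, scaled by the model's weight for that feature.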

The list above is not exhaustive: as demand for SHAP grew, other ways to speed up the computation appeared. Instead of dwelling on each one, let's record the key fact: approximations exist, and they depart somewhat from the strict original theory of Shapley values.

Is it correct to use approximations?

So, we have established that Shapley values are computed not exactly but via a simplification. Can we still get accurate estimates?

Yes!

The approximations are introduced in such a way that they are required to match (or be sufficiently close to) the theoretical Shapley values. The main requirements are:

  1. Local accuracy – the sum of all Shapley values \phi_i for a specific prediction f(x) must exactly match the predicted value. That is:

     f(x) = \sum_{i=1}^{M} \phi_i + \phi_0, where \phi_0 is the base value of the model, i.e. the prediction when no features influence it (a quick check of this property follows after this list).

  2. Missingness (an analogue of the dummy-player axiom) – if a feature's value is missing, its contribution must be zero:
    I[i] = 0 \Rightarrow \phi_i = 0, where I[i] is the indicator of the presence of feature i.

  3. Consistency – if the prediction of one model depends on adding feature i more strongly than the prediction of another, then the Shapley value of this feature in the first model must be no lower than in the second:
    f'(x) - f'(x \setminus i) \ge f(x) - f(x \setminus i) \Rightarrow \phi_i(f', x) \ge \phi_i(f, x), where f' and f are the two models under consideration.
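As a quick sanity check of property 1 (local accuracy), here is a sketch using the shap library's KernelExplainer on a synthetic model. The data, the model, and the specific parameters are my assumptions, and the equality holds up to numerical tolerance.

```python
# Checking local accuracy: sum of Shapley values + base value = prediction.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

background = shap.sample(X, 50)                      # background for "missing" features
explainer = shap.KernelExplainer(model.predict, background)

x = X[0]
phi = explainer.shap_values(x, nsamples=500)         # approximate Shapley values
reconstruction = phi.sum() + explainer.expected_value
print(reconstruction, model.predict(x[None, :])[0])  # should match up to tolerance
```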

And the cherry on top: there is a proven theorem:

There is exactly one possible explanation model g that is an additive feature attribution method and satisfies requirements 1, 2 and 3! And this is exactly what the shap framework implements.

Let's take a look at this in practice

Theory is always beautiful and good, but its benefits are felt more clearly in practice. I wondered: would the explanations be stable for identical models trained from different starting points? And how random are the features that the Shapley method highlights for us?

This is how the experiment was conducted.

Consider 5 equivalent models Net(X). By equivalent we mean models that:

  1. have the same architecture

  2. are trained on the same data set

  3. have comparable quality (differing only by a negligible value \varepsilon)

Consider the test set and the Shapley values for a specific example of a specific class class_k (I took MNIST, so I looked at 10 classes, k \in [0, 9]).

Algorithm:

For each class k,

  • consider the distribution of Shapley values for each of the five models

  • for each pair of distributions (for one class we get C^2_{5} = 10 pairs):

    – test the "sameness" (homogeneity) of the distributions with the chi-square test at significance level 0.025
    – analyze the results

Note:

Without going too deep into statistics: the chi-square test checks whether the distribution of feature importances is the same across models. The conclusion is based on the test statistic and the p-value – the probability, assuming the null hypothesis is true, of obtaining results at least as extreme as the observed ones. If this probability falls below the critical level (for me, 0.025), we say that the distributions differ.
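For reference, here is a sketch of how such a pairwise comparison could look: the Shapley values of two models are binned into a shared histogram and the counts are compared with a chi-square test. The simulated values, the number of bins, and the +1 smoothing are illustrative assumptions, not the author's exact setup.

```python
# Pairwise chi-square homogeneity check over binned Shapley-value distributions.
from itertools import combinations
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
# Stand-in for the real data: per-pixel Shapley values of one class from 5 models.
shap_per_model = [rng.normal(loc=0.0, scale=1.0, size=784) for _ in range(5)]

ALPHA = 0.025
bins = np.histogram_bin_edges(np.concatenate(shap_per_model), bins=20)

different = 0
for a, b in combinations(range(5), 2):              # C(5, 2) = 10 pairs per class
    counts_a, _ = np.histogram(shap_per_model[a], bins=bins)
    counts_b, _ = np.histogram(shap_per_model[b], bins=bins)
    table = np.vstack([counts_a, counts_b]) + 1     # +1 avoids empty-cell issues
    _, p_value, _, _ = chi2_contingency(table)
    if p_value < ALPHA:                              # reject homogeneity
        different += 1

print(f"{different} of 10 pairs look statistically different at alpha={ALPHA}")
```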

For 100 pairs, I found that the distributions were statistically significantly different in only 5% of cases. Stable? Quite.

3 out of 100 analyzed samples

Summing up

So, we have considered the question of the stability of Shapley values and the role of approximations in computing them. Everything seems fine: Shapley values are theoretically grounded, practically applicable and simply beautiful. But let's set the admiration aside and dryly consolidate the conclusions:

Approximation is a reasonable and correct way to compute Shapley values. However, it is important to note that this holds under the assumption that the features are independent.

The stability of Shapley values in the experiment shows that the method can demonstrate meaningful stability, and that Shapley values can indeed highlight important features based on the properties of the data.

BUT: stability in one experiment may be violated in others. As the author of this post, I have two goals here: to show that the tool is useful in practice and to spark interest in new experiments.

As you can see, there are limitations and nuances. And that is why Explainable AI is an amazing field, full of questions and contradictions. I hope this article was useful to you and gave you knowledge that will come in handy in practice: at a minimum, when you reason about specific explainers with shap under the hood.

Thanks for reading!

I wish you reliable and understandable models,

Your Data-author! 🙂
