XAI methodologies are broadly categorized across three primary dimensions: Intrinsic vs. Post-Hoc, Model-Agnostic vs. Model-Specific, and Local vs. Global.
1. Intrinsic vs. Post-Hoc Interpretability
- Intrinsic Interpretability refers to designing a model from the ground up to be self-explanatory. This involves constraining the architecture so that its internal mechanisms can be inspected directly. Examples include generalized additive models (GAMs), where each feature's contribution can be read off separately, and attention-based mechanisms, where the attention weights can be mapped back to the input tokens (though how faithfully attention weights reflect the model's reasoning is debated).
- Post-Hoc Interpretability accepts the model as a fixed black box. The explanation method is applied after the model has been trained. It attempts to reverse-engineer or approximate the model’s behavior by analyzing how its outputs change as its inputs are varied.
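
The contrast between the two can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the shape functions, the `black_box` formula, the feature names, and the applicant values are all hypothetical. The additive model returns its own explanation (per-feature terms that sum to the prediction), while the opaque model can only be probed from outside, here with a simple central-difference sensitivity estimate.

```python
import math

# Intrinsic: a hypothetical additive model. Each feature has its own term,
# so the per-feature contributions ARE the explanation.
SHAPE_FUNCTIONS = {
    "income": lambda v: 0.5 * math.log1p(v / 1000.0),
    "debt_ratio": lambda v: -2.0 * v,
}
INTERCEPT = 0.1

def additive_predict(x):
    """Return the prediction and the per-feature contributions that sum to it."""
    contributions = {name: f(x[name]) for name, f in SHAPE_FUNCTIONS.items()}
    return INTERCEPT + sum(contributions.values()), contributions

# Post-hoc: a hypothetical opaque model that we may call but not inspect.
def black_box(x):
    return math.tanh(0.00002 * x["income"] - 1.5 * x["debt_ratio"] ** 2)

def probe_sensitivity(model, x, feature, rel_step=0.01):
    """Estimate how the output moves when one input is nudged (central difference)."""
    step = rel_step * (abs(x[feature]) or 1.0)
    up, down = dict(x), dict(x)
    up[feature] = x[feature] + step
    down[feature] = x[feature] - step
    return (model(up) - model(down)) / (2 * step)

applicant = {"income": 48000.0, "debt_ratio": 0.35}

prediction, parts = additive_predict(applicant)
slopes = {name: probe_sensitivity(black_box, applicant, name) for name in applicant}

print("intrinsic contributions:", parts)
print("post-hoc sensitivities:", slopes)
```

Note that the post-hoc probe never looks inside `black_box`; it only observes input/output pairs, which is why its result is an approximation of behavior rather than a readout of the model's structure.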
2. Model-Agnostic vs. Model-Specific
- Model-Agnostic tools can be applied to any machine learning model, regardless of its architecture. These tools treat the model as a pure function mapping input $X$ to output $Y$. Whether the underlying model is a support vector machine, a random forest, or a 50-layer convolutional neural network, the XAI tool is applied in exactly the same way.
- Model-Specific tools are engineered for a particular family of architectures. For example, methods that backpropagate gradients through a neural network’s layers to estimate pixel importance are model-specific: they require access to the network’s internals and a differentiable path from the output back to the input.
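
As a rough sketch of this distinction, the Python below pairs a model-agnostic method (permutation importance, which only needs a `predict` callable) with a model-specific one (an analytic input gradient that needs the weights of a logistic model). The two toy models, the synthetic data, and all function names are assumptions made for illustration; permutation importance is used here as one common model-agnostic technique, not as the only option.

```python
import math
import random

def permutation_importance(predict, rows, targets, feature_index, n_repeats=20, seed=0):
    """Model-agnostic: mean increase in squared error when one feature column is shuffled."""
    rng = random.Random(seed)

    def mse(data):
        return sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(data)

    baseline = mse(rows)
    increases = []
    for _ in range(n_repeats):
        column = [r[feature_index] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:feature_index] + [v] + r[feature_index + 1:]
                    for r, v in zip(rows, column)]
        increases.append(mse(shuffled) - baseline)
    return sum(increases) / n_repeats

def logistic_input_gradient(weights, bias, row):
    """Model-specific: gradient of sigmoid(w.x + b) w.r.t. each input; needs the weights."""
    z = sum(w * v for w, v in zip(weights, row)) + bias
    s = 1.0 / (1.0 + math.exp(-z))
    return [s * (1.0 - s) * w for w in weights]

# Two hypothetical models with different internals; both ignore feature 1.
def linear_model(row):
    return 3.0 * row[0] + 0.0 * row[1]

def stump_model(row):
    return 3.0 if row[0] > 0.5 else 0.0

data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
targets = [3.0 * r[0] for r in rows]

# The same tool runs unchanged on both models.
for model in (linear_model, stump_model):
    scores = [permutation_importance(model, rows, targets, i) for i in range(2)]
    print(model.__name__, [round(s, 3) for s in scores])

# The gradient method cannot be called on stump_model: it has no weights to differentiate.
print(logistic_input_gradient([2.0, 0.0], -1.0, [0.4, 0.9]))
```

The design trade-off is visible in the signatures: `permutation_importance` accepts any callable, whereas `logistic_input_gradient` is tied to one model form and would have to be rewritten for a different architecture.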
3. Local vs. Global Explanations
- Global Explanations describe the overall behavior of the model across the entire dataset. They answer questions like: “What are the top three features this credit-scoring model weights most heavily across all historical loan applications?”
- Local Explanations account for a single, specific prediction. They set aside the model’s broad tendencies and focus entirely on one instance. For example: “Why did the model reject John Doe’s loan application specifically, given his exact income, debt ratio, and credit history?”
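
One way to make the local/global distinction concrete is shown below, assuming a hypothetical `credit_model`, four made-up applicants, and a simple occlusion-style attribution (replace one feature with its dataset mean and record the change in score). The local explanation is the attribution for one applicant; the global explanation is obtained here by averaging the absolute local attributions over the whole dataset, which is only one of several possible aggregation choices.

```python
# Hypothetical scoring function standing in for a trained credit model.
def credit_model(x):
    return (0.00001 * x["income"]
            - 1.2 * x["debt_ratio"]
            + 0.002 * x["credit_history_months"])

def local_attribution(model, x, baseline):
    """Local: change in score when each feature is replaced by its baseline value."""
    full = model(x)
    result = {}
    for name in x:
        occluded = dict(x)
        occluded[name] = baseline[name]
        result[name] = full - model(occluded)
    return result

def global_importance(model, dataset, baseline):
    """Global: mean absolute local attribution per feature across the dataset."""
    totals = {name: 0.0 for name in baseline}
    for x in dataset:
        for name, value in local_attribution(model, x, baseline).items():
            totals[name] += abs(value)
    return {name: total / len(dataset) for name, total in totals.items()}

# Made-up applicants; the first one plays the role of John Doe.
applicants = [
    {"income": 30000.0, "debt_ratio": 0.60, "credit_history_months": 12.0},
    {"income": 85000.0, "debt_ratio": 0.20, "credit_history_months": 96.0},
    {"income": 52000.0, "debt_ratio": 0.35, "credit_history_months": 48.0},
    {"income": 61000.0, "debt_ratio": 0.25, "credit_history_months": 60.0},
]
feature_means = {
    name: sum(a[name] for a in applicants) / len(applicants)
    for name in applicants[0]
}

john_doe = applicants[0]
local = local_attribution(credit_model, john_doe, feature_means)
overall = global_importance(credit_model, applicants, feature_means)

print("local (John Doe):", local)
print("global ranking:", sorted(overall, key=overall.get, reverse=True))
```

A negative local value means that feature pulled this applicant's score below what the dataset-average value would have produced; the global ranking says nothing about any individual case.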