
AttentionRollout

pnpxai.explainers.attention_rollout

AttentionRolloutBase

Bases: ZennitExplainer

Base class for AttentionRollout and TransformerAttribution explainers.

Supported Modules: Attention

Parameters:

Name Type Description Default
model Module

The PyTorch model for which attribution is to be computed.

required
interpolate_mode Optional[str]

The interpolation mode used by the explainer. Available modes are: "bilinear" and "bicubic"

'bilinear'
head_fusion_method Literal['min', 'max', 'mean']

The method used to fuse attention heads. Available methods are: "min", "max", "mean"

'min'
discard_ratio float

The ratio of the lowest attention values to discard.

0.9
forward_arg_extractor Optional[ForwardArgumentExtractor]

A function that extracts, from the input batch, the forward arguments to which the attribution scores are assigned.

None
additional_forward_arg_extractor Optional[ForwardArgumentExtractor]

A secondary function that extracts additional forward arguments from the input batch.

None
n_classes Optional[int]

Number of classes.

None
**kwargs

Keyword arguments that are forwarded to the base implementation of the Explainer

required
Reference

Samira Abnar, Willem Zuidema. Quantifying Attention Flow in Transformers.
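For orientation, the sketch below illustrates the computation this base class organizes, following Abnar and Zuidema: per-layer attention maps are collected (collect_attention_map), heads are fused with head_fusion_method, the lowest discard_ratio fraction of values is dropped, an identity term is added for residual connections, and the per-layer maps are multiplied together (rollout). This is a minimal, self-contained illustration of the idea, not the pnpxai implementation itself; the tensor shapes and the residual weighting are assumptions.

```python
import torch

def rollout_sketch(attn_maps, head_fusion="min", discard_ratio=0.9):
    # attn_maps: list of per-layer attention tensors shaped (batch, heads, tokens, tokens).
    # Returns a (batch, tokens, tokens) map propagating attention down to the input tokens.
    result = None
    for attn in attn_maps:
        # 1) fuse heads: reduce over the head dimension with min/max/mean
        if head_fusion == "mean":
            fused = attn.mean(dim=1)
        elif head_fusion == "max":
            fused = attn.max(dim=1).values
        else:
            fused = attn.min(dim=1).values
        # 2) discard the lowest `discard_ratio` fraction of attention values
        flat = fused.flatten(1)
        k = int(flat.size(-1) * discard_ratio)
        if k > 0:
            lowest = flat.topk(k, dim=-1, largest=False).indices
            flat.scatter_(-1, lowest, 0.0)
        # 3) add identity for residual connections and renormalize rows
        eye = torch.eye(fused.size(-1), device=fused.device)
        fused = 0.5 * (fused + eye)
        fused = fused / fused.sum(dim=-1, keepdim=True)
        # 4) multiply layer by layer (the "rollout")
        result = fused if result is None else fused @ result
    return result

# Dummy attention maps for two layers: batch of 1, 4 heads, 10 tokens
maps = [torch.rand(1, 4, 10, 10).softmax(dim=-1) for _ in range(2)]
print(rollout_sketch(maps).shape)  # torch.Size([1, 10, 10])
```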

SUPPORTED_MODULES = [Attention] class-attribute instance-attribute
interpolate_mode = interpolate_mode instance-attribute
head_fusion_method = head_fusion_method instance-attribute
discard_ratio = discard_ratio instance-attribute
head_fusion_function property
EXPLANATION_TYPE: ExplanationType = 'attribution' class-attribute instance-attribute
TUNABLES = {} class-attribute instance-attribute
model = model.eval() instance-attribute
forward_arg_extractor = forward_arg_extractor instance-attribute
additional_forward_arg_extractor = additional_forward_arg_extractor instance-attribute
device property
n_classes = n_classes instance-attribute
__init__(model: Module, interpolate_mode: Literal['bilinear'] = 'bilinear', head_fusion_method: Literal['min', 'max', 'mean'] = 'min', discard_ratio: float = 0.9, forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, additional_forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, n_classes: Optional[int] = None) -> None
collect_attention_map(inputs, targets) abstractmethod
rollout(*args) abstractmethod
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]

Computes attributions for the given inputs and targets.

Parameters:

Name Type Description Default
inputs Tensor

The input data.

required
targets Tensor

The target labels for the inputs.

required

Returns:

Type Description
Union[Tensor, Tuple[Tensor]]

torch.Tensor: The computed attribution scores.

get_tunables()

Provides tunable parameters for the optimizer (see the search-space sketch at the end of this class reference).

Tunable parameters

interpolate_mode (str): Value can be selected from "bilinear" and "bicubic"

head_fusion_method (str): Value can be selected from "min", "max", and "mean"

discard_ratio (float): Value can be selected in the range 0 to 0.95 with a step of 0.05

__repr__()
copy()
set_kwargs(**kwargs)
__init_subclass__() -> None
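The search-space sketch referenced above: a hedged illustration of how the documented tunables could be enumerated for hyperparameter search. The plain dictionary below is an assumption; the actual return format of get_tunables() and the optimizer's API are not specified in this reference.

```python
from itertools import product

# Hypothetical search space mirroring the documented tunables.
search_space = {
    "interpolate_mode": ["bilinear", "bicubic"],
    "head_fusion_method": ["min", "max", "mean"],
    "discard_ratio": [round(0.05 * i, 2) for i in range(20)],  # 0.0, 0.05, ..., 0.95
}

# Enumerate every combination; an optimizer would configure an explainer with
# each candidate (e.g. via copy() and set_kwargs(**params)) and keep the best
# one according to an evaluation metric.
for params in (dict(zip(search_space, values))
               for values in product(*search_space.values())):
    pass  # evaluate an explainer configured with `params` here
```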
AttentionRollout

Bases: AttentionRolloutBase

Implementation of the AttentionRollout explainer (a usage sketch follows this class reference).

Supported Modules: Attention

Parameters:

Name Type Description Default
model Module

The PyTorch model for which attribution is to be computed.

required
interpolate_mode Optional[str]

The interpolation mode used by the explainer. Available modes are: "bilinear" and "bicubic"

'bilinear'
head_fusion_method Literal['min', 'max', 'mean']

The method used to fuse attention heads. Available methods are: "min", "max", "mean"

'min'
discard_ratio float

The ratio of the lowest attention values to discard.

0.9
forward_arg_extractor Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]

Optional function to extract forward arguments from inputs.

None
additional_forward_arg_extractor Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]

Optional function to extract additional forward arguments.

None
n_classes Optional[int]

Number of classes.

None
**kwargs

Keyword arguments that are forwarded to the base implementation of the Explainer

required
Reference

Samira Abnar, Willem Zuidema. Quantifying Attention Flow in Transformers.

EXPLANATION_TYPE: ExplanationType = 'attribution' class-attribute instance-attribute
SUPPORTED_MODULES = [Attention] class-attribute instance-attribute
TUNABLES = {} class-attribute instance-attribute
model = model.eval() instance-attribute
forward_arg_extractor = forward_arg_extractor instance-attribute
additional_forward_arg_extractor = additional_forward_arg_extractor instance-attribute
device property
n_classes = n_classes instance-attribute
interpolate_mode = interpolate_mode instance-attribute
head_fusion_method = head_fusion_method instance-attribute
discard_ratio = discard_ratio instance-attribute
head_fusion_function property
__init__(model: Module, interpolate_mode: Literal['bilinear'] = 'bilinear', head_fusion_method: Literal['min', 'max', 'mean'] = 'min', discard_ratio: float = 0.9, forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, additional_forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, n_classes: Optional[int] = None) -> None
collect_attention_map(inputs, targets)
rollout(weights_all)
__repr__()
copy()
set_kwargs(**kwargs)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]

Computes attributions for the given inputs and targets.

Parameters:

Name Type Description Default
inputs Tensor

The input data.

required
targets Tensor

The target labels for the inputs.

required

Returns:

Type Description
Union[Tensor, Tuple[Tensor]]

torch.Tensor: The computed attribution scores.

get_tunables()

Provides tunable parameters for the optimizer.

Tunable parameters

interpolate_mode (str): Value can be selected from "bilinear" and "bicubic"

head_fusion_method (str): Value can be selected from "min", "max", and "mean"

discard_ratio (float): Value can be selected in the range 0 to 0.95 with a step of 0.05

__init_subclass__() -> None
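A minimal usage sketch for AttentionRollout. The torchvision ViT, the random input batch, and the target index below are placeholders, and it is assumed here that the model's attention modules are among those the explainer supports; substitute your own model and data.

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from pnpxai.explainers.attention_rollout import AttentionRollout

# Placeholder model and data.
model = vit_b_16(weights=ViT_B_16_Weights.DEFAULT).eval()
inputs = torch.randn(1, 3, 224, 224)   # dummy image batch
targets = torch.tensor([0])            # dummy target class indices

explainer = AttentionRollout(
    model,
    interpolate_mode="bilinear",
    head_fusion_method="min",
    discard_ratio=0.9,
)
attributions = explainer.attribute(inputs, targets)
print(attributions.shape)
```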
TransformerAttribution

Bases: AttentionRolloutBase

Implementation of the TransformerAttribution explainer (a usage sketch follows this class reference).

Supported Modules: Attention

Parameters:

Name Type Description Default
model Module

The PyTorch model for which attribution is to be computed.

required
interpolate_mode Optional[str]

The interpolation mode used by the explainer. Available modes are: "bilinear" and "bicubic"

'bilinear'
head_fusion_method Literal['min', 'max', 'mean']

The method used to fuse attention heads. Available methods are: "min", "max", "mean"

'mean'
discard_ratio float

The ratio of the lowest attention values to discard.

0.9
forward_arg_extractor Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]

Optional function to extract forward arguments from inputs.

None
additional_forward_arg_extractor Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]

Optional function to extract additional forward arguments.

None
n_classes Optional[int]

Number of classes.

None
**kwargs

Keyword arguments that are forwarded to the base implementation of the Explainer

required
Reference

Chefer H., Gur S., and Wolf L. Transformer Interpretability Beyond Attention Visualization.

SUPPORTED_MODULES = [Attention] class-attribute instance-attribute
alpha = alpha instance-attribute
beta = beta instance-attribute
stabilizer = stabilizer instance-attribute
zennit_canonizers = zennit_canonizers or [] instance-attribute
layer = layer instance-attribute
zennit_composite property
attributor property
EXPLANATION_TYPE: ExplanationType = 'attribution' class-attribute instance-attribute
TUNABLES = {} class-attribute instance-attribute
model = model.eval() instance-attribute
forward_arg_extractor = forward_arg_extractor instance-attribute
additional_forward_arg_extractor = additional_forward_arg_extractor instance-attribute
device property
n_classes = n_classes instance-attribute
interpolate_mode = interpolate_mode instance-attribute
head_fusion_method = head_fusion_method instance-attribute
discard_ratio = discard_ratio instance-attribute
head_fusion_function property
__init__(model: Module, interpolate_mode: Literal['bilinear'] = 'bilinear', head_fusion_method: Literal['min', 'max', 'mean'] = 'mean', discard_ratio: float = 0.9, alpha: float = 2.0, beta: float = 1.0, stabilizer: float = 1e-06, zennit_canonizers: Optional[List[Canonizer]] = None, layer: Optional[Union[Module, Sequence[Module]]] = None, forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, additional_forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, n_classes: Optional[int] = None) -> None
default_head_fusion_fn(attns) staticmethod
collect_attention_map(inputs, targets)
rollout(grads, rels)
__repr__()
copy()
set_kwargs(**kwargs)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]

Computes attributions for the given inputs and targets.

Parameters:

Name Type Description Default
inputs Tensor

The input data.

required
targets Tensor

The target labels for the inputs.

required

Returns:

Type Description
Union[Tensor, Tuple[Tensor]]

torch.Tensor: The computed attribution scores.

get_tunables()

Provides tunable parameters for the optimizer.

Tunable parameters

interpolate_mode (str): Value can be selected from "bilinear" and "bicubic"

head_fusion_method (str): Value can be selected from "min", "max", and "mean"

discard_ratio (float): Value can be selected in the range 0 to 0.95 with a step of 0.05

__init_subclass__() -> None
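A corresponding usage sketch for TransformerAttribution, again with a placeholder model and data; the alpha, beta, and stabilizer values shown are simply the documented defaults, and model compatibility is assumed as above.

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from pnpxai.explainers.attention_rollout import TransformerAttribution

model = vit_b_16(weights=ViT_B_16_Weights.DEFAULT).eval()
inputs = torch.randn(1, 3, 224, 224)
targets = torch.tensor([0])

explainer = TransformerAttribution(
    model,
    head_fusion_method="mean",  # note: default differs from AttentionRollout's "min"
    discard_ratio=0.9,
    alpha=2.0,
    beta=1.0,
    stabilizer=1e-6,
)
attributions = explainer.attribute(inputs, targets)
print(attributions.shape)
```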
GenericAttention

Bases: AttentionRolloutBase

EXPLANATION_TYPE: ExplanationType = 'attribution' class-attribute instance-attribute
SUPPORTED_MODULES = [Attention] class-attribute instance-attribute
TUNABLES = {} class-attribute instance-attribute
model = model.eval() instance-attribute
forward_arg_extractor = forward_arg_extractor instance-attribute
additional_forward_arg_extractor = additional_forward_arg_extractor instance-attribute
device property
n_classes = n_classes instance-attribute
interpolate_mode = interpolate_mode instance-attribute
head_fusion_method = head_fusion_method instance-attribute
discard_ratio = discard_ratio instance-attribute
head_fusion_function property
__init__(model: Module, alpha: float = 2.0, beta: float = 1.0, stabilizer: float = 1e-06, head_fusion_function: Optional[Callable[[Tensor], Tensor]] = None, n_classes: Optional[int] = None) -> None
__repr__()
copy()
set_kwargs(**kwargs)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]

Computes attributions for the given inputs and targets.

Parameters:

Name Type Description Default
inputs Tensor

The input data.

required
targets Tensor

The target labels for the inputs.

required

Returns:

Type Description
Union[Tensor, Tuple[Tensor]]

torch.Tensor: The computed attribution scores.

get_tunables()

Provides tunable parameters for the optimizer.

Tunable parameters

interpolate_mode (str): Value can be selected from "bilinear" and "bicubic"

head_fusion_method (str): Value can be selected from "min", "max", and "mean"

discard_ratio (float): Value can be selected in the range 0 to 0.95 with a step of 0.05

__init_subclass__() -> None
collect_attention_map(inputs, targets) abstractmethod
rollout(*args) abstractmethod
rollout_min_head_fusion_function(attn_weights)
rollout_max_head_fusion_function(attn_weights)
rollout_mean_head_fusion_function(attn_weights)
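The three head-fusion helpers listed above reduce the per-head attention maps to a single map per layer. Below is a minimal sketch of what min, max, and mean fusion over the head dimension look like; the (batch, heads, tokens, tokens) layout is an assumption.

```python
import torch

attn_weights = torch.rand(2, 12, 197, 197)  # dummy (batch, heads, tokens, tokens)

fused_min = attn_weights.min(dim=1).values   # analogue of rollout_min_head_fusion_function
fused_max = attn_weights.max(dim=1).values   # analogue of rollout_max_head_fusion_function
fused_mean = attn_weights.mean(dim=1)        # analogue of rollout_mean_head_fusion_function

print(fused_min.shape, fused_max.shape, fused_mean.shape)  # each torch.Size([2, 197, 197])
```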