AttentionRollout
pnpxai.explainers.attention_rollout
AttentionRolloutBase
Bases: ZennitExplainer
Base class for the AttentionRollout and TransformerAttribution explainers.
Supported Modules: Attention
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | The PyTorch model for which attribution is to be computed. | required |
| `interpolate_mode` | `Optional[str]` | The interpolation mode used by the explainer. Available modes are `"bilinear"` and `"bicubic"`. | `'bilinear'` |
| `head_fusion_method` | `Literal['min', 'max', 'mean']` | Method used to fuse attention heads. Available methods are `"min"`, `"max"`, and `"mean"`. | `'min'` |
| `discard_ratio` | `float` | Ratio of attention values to discard. | `0.9` |
| `forward_arg_extractor` | `Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]` | Optional function that extracts, from the input batch(es), the forward arguments to which attribution scores are assigned. | `None` |
| `additional_forward_arg_extractor` | `Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]` | Optional function that extracts additional forward arguments from the input batch(es). | `None` |
| `n_classes` | `Optional[int]` | Number of classes. | `None` |
| `**kwargs` | | Keyword arguments forwarded to the base `Explainer` implementation. | required |
Reference
Samira Abnar, Willem Zuidema. Quantifying Attention Flow in Transformers.
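The abstract hooks collect_attention_map() and rollout() (listed below) split the method into two steps: gathering per-layer attention maps during a forward pass, then combining them with the rollout recursion of Abnar & Zuidema. Below is a minimal, self-contained sketch of that recursion on random attention maps; it is an illustration of the idea, not the pnpxai implementation, and real implementations typically protect the class-token entries when discarding low attention values.

```python
import torch

def attention_rollout(weights_all, head_fusion="min", discard_ratio=0.9):
    """Roll per-layer attention maps out across layers (Abnar & Zuidema, 2020)."""
    tokens = weights_all[0].size(-1)
    result = torch.eye(tokens).unsqueeze(0)              # [1, tokens, tokens]
    for attn in weights_all:                              # each attn: [batch, heads, tokens, tokens]
        # 1) Fuse the heads with the chosen reduction.
        if head_fusion == "min":
            fused = attn.min(dim=1).values
        elif head_fusion == "max":
            fused = attn.max(dim=1).values
        else:
            fused = attn.mean(dim=1)
        # 2) Zero out the lowest attention values (class-token protection omitted here).
        flat = fused.flatten(start_dim=1)
        k = int(flat.size(-1) * discard_ratio)
        low = flat.topk(k, dim=-1, largest=False).indices
        fused = flat.scatter(-1, low, 0.0).view_as(fused)
        # 3) Account for residual connections and renormalize the rows.
        fused = 0.5 * fused + 0.5 * torch.eye(tokens)
        fused = fused / fused.sum(dim=-1, keepdim=True)
        # 4) Multiply into the running rollout.
        result = fused @ result
    return result

maps = [torch.rand(1, 12, 197, 197).softmax(dim=-1) for _ in range(12)]
print(attention_rollout(maps).shape)   # torch.Size([1, 197, 197])
```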
`SUPPORTED_MODULES = [Attention]` (class-attribute, instance-attribute)
`interpolate_mode = interpolate_mode` (instance-attribute)
`head_fusion_method = head_fusion_method` (instance-attribute)
`discard_ratio = discard_ratio` (instance-attribute)
`head_fusion_function` (property)
`EXPLANATION_TYPE: ExplanationType = 'attribution'` (class-attribute, instance-attribute)
`TUNABLES = {}` (class-attribute, instance-attribute)
`model = model.eval()` (instance-attribute)
`forward_arg_extractor = forward_arg_extractor` (instance-attribute)
`additional_forward_arg_extractor = additional_forward_arg_extractor` (instance-attribute)
`device` (property)
`n_classes = n_classes` (instance-attribute)
__init__(model: Module, interpolate_mode: Literal['bilinear'] = 'bilinear', head_fusion_method: Literal['min', 'max', 'mean'] = 'min', discard_ratio: float = 0.9, forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, additional_forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, n_classes: Optional[int] = None) -> None
`collect_attention_map(inputs, targets)` (abstractmethod)
`rollout(*args)` (abstractmethod)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]
Computes attributions for the given inputs and targets.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Tensor` | The input data. | required |
| `targets` | `Tensor` | The target labels for the inputs. | required |

Returns:

| Type | Description |
|---|---|
| `Union[Tensor, Tuple[Tensor]]` | The result of the explanation. |
get_tunables()
Provides tunable parameters for the optimizer.

Tunable parameters:

- `interpolate_mode` (str): one of `"bilinear"` or `"bicubic"`
- `head_fusion_method` (str): one of `"min"`, `"max"`, or `"mean"`
- `discard_ratio` (float): chosen from 0 to 0.95 in steps of 0.05
__repr__()
copy()
set_kwargs(**kwargs)
__init_subclass__() -> None
AttentionRollout
Bases: AttentionRolloutBase
Implementation of the AttentionRollout explainer.
Supported Modules: Attention
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | The PyTorch model for which attribution is to be computed. | required |
| `interpolate_mode` | `Optional[str]` | The interpolation mode used by the explainer. Available modes are `"bilinear"` and `"bicubic"`. | `'bilinear'` |
| `head_fusion_method` | `Literal['min', 'max', 'mean']` | Method used to fuse attention heads. Available methods are `"min"`, `"max"`, and `"mean"`. | `'min'` |
| `discard_ratio` | `float` | Ratio of attention values to discard. | `0.9` |
| `forward_arg_extractor` | `Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]` | Optional function that extracts, from the input batch(es), the forward arguments to which attribution scores are assigned. | `None` |
| `additional_forward_arg_extractor` | `Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]` | Optional function that extracts additional forward arguments from the input batch(es). | `None` |
| `n_classes` | `Optional[int]` | Number of classes. | `None` |
| `**kwargs` | | Keyword arguments forwarded to the base `Explainer` implementation. | required |
Reference
Samira Abnar, Willem Zuidema. Quantifying Attention Flow in Transformers.
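A hedged usage sketch assembled from the signatures on this page. The torchvision ViT backbone and its compatibility with pnpxai's Attention-module detection are assumptions made for illustration only.

```python
import torch
from torchvision.models import vit_b_16
from pnpxai.explainers.attention_rollout import AttentionRollout

# Assumed backbone: any transformer whose attention modules pnpxai can detect.
model = vit_b_16(weights=None).eval()

explainer = AttentionRollout(
    model,
    head_fusion_method="min",   # fuse heads with the element-wise minimum
    discard_ratio=0.9,          # drop the lowest 90% of attention values
)

inputs = torch.randn(2, 3, 224, 224)          # dummy image batch
targets = torch.tensor([1, 7])                # class indices to explain
attrs = explainer.attribute(inputs, targets)  # one attribution map per input
```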
`EXPLANATION_TYPE: ExplanationType = 'attribution'` (class-attribute, instance-attribute)
`SUPPORTED_MODULES = [Attention]` (class-attribute, instance-attribute)
`TUNABLES = {}` (class-attribute, instance-attribute)
`model = model.eval()` (instance-attribute)
`forward_arg_extractor = forward_arg_extractor` (instance-attribute)
`additional_forward_arg_extractor = additional_forward_arg_extractor` (instance-attribute)
`device` (property)
`n_classes = n_classes` (instance-attribute)
`interpolate_mode = interpolate_mode` (instance-attribute)
`head_fusion_method = head_fusion_method` (instance-attribute)
`discard_ratio = discard_ratio` (instance-attribute)
`head_fusion_function` (property)
__init__(model: Module, interpolate_mode: Literal['bilinear'] = 'bilinear', head_fusion_method: Literal['min', 'max', 'mean'] = 'min', discard_ratio: float = 0.9, forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, additional_forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, n_classes: Optional[int] = None) -> None
collect_attention_map(inputs, targets)
rollout(weights_all)
__repr__()
copy()
set_kwargs(**kwargs)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]
Computes attributions for the given inputs and targets.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Tensor` | The input data. | required |
| `targets` | `Tensor` | The target labels for the inputs. | required |

Returns:

| Type | Description |
|---|---|
| `Union[Tensor, Tuple[Tensor]]` | The result of the explanation. |
get_tunables()
Provides tunable parameters for the optimizer.

Tunable parameters:

- `interpolate_mode` (str): one of `"bilinear"` or `"bicubic"`
- `head_fusion_method` (str): one of `"min"`, `"max"`, or `"mean"`
- `discard_ratio` (float): chosen from 0 to 0.95 in steps of 0.05

A manual sweep over these values is sketched below.
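Continuing the usage sketch above, a hedged example of sweeping these tunables by hand with copy() and set_kwargs(); the exact object returned by get_tunables() is not documented here, so the value grid is written out explicitly, and the copy/set_kwargs semantics are assumed from their names.

```python
# explainer, inputs and targets come from the AttentionRollout usage sketch above.
results = {}
for method in ("min", "max", "mean"):
    for ratio in (0.0, 0.45, 0.9):            # a coarse subset of range(0, 0.95, 0.05)
        candidate = explainer.copy()          # assumed to return an independent copy
        candidate.set_kwargs(head_fusion_method=method, discard_ratio=ratio)
        results[(method, ratio)] = candidate.attribute(inputs, targets)
```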
__init_subclass__() -> None
TransformerAttribution
Bases: AttentionRolloutBase
Implementation of the TransformerAttribution explainer.
Supported Modules: Attention
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | The PyTorch model for which attribution is to be computed. | required |
| `interpolate_mode` | `Optional[str]` | The interpolation mode used by the explainer. Available modes are `"bilinear"` and `"bicubic"`. | `'bilinear'` |
| `head_fusion_method` | `Literal['min', 'max', 'mean']` | Method used to fuse attention heads. Available methods are `"min"`, `"max"`, and `"mean"`. | `'mean'` |
| `discard_ratio` | `float` | Ratio of attention values to discard. | `0.9` |
| `forward_arg_extractor` | `Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]` | Optional function that extracts, from the input batch(es), the forward arguments to which attribution scores are assigned. | `None` |
| `additional_forward_arg_extractor` | `Optional[Callable[[Tuple[Tensor]], Union[Tensor, Tuple[Tensor]]]]` | Optional function that extracts additional forward arguments from the input batch(es). | `None` |
| `n_classes` | `Optional[int]` | Number of classes. | `None` |
| `**kwargs` | | Keyword arguments forwarded to the base `Explainer` implementation. | required |
Reference
Hila Chefer, Shir Gur, Lior Wolf. Transformer Interpretability Beyond Attention Visualization.
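A hedged sketch of what rollout(grads, rels) computes for this explainer, in the spirit of Chefer et al.: per layer, each attention map's relevance is weighted by its gradient, the positive part is averaged over the heads, and the result is accumulated across layers. It is an illustration on random tensors, not the pnpxai code; presumably the relevance tensors come from the zennit-based attributor configured by alpha, beta, and stabilizer.

```python
import torch

def transformer_attribution_rollout(grads, rels):
    """Gradient-weighted relevance rollout in the spirit of Chefer et al."""
    tokens = grads[0].size(-1)
    result = torch.eye(tokens).unsqueeze(0)               # [1, tokens, tokens]
    for grad, rel in zip(grads, rels):                     # per layer: [batch, heads, tokens, tokens]
        # Keep the positive part of gradient * relevance and average over heads.
        cam = (grad * rel).clamp(min=0).mean(dim=1)
        # Accumulate across layers: R <- R + cam @ R
        result = result + cam @ result
    return result

grads = [torch.randn(2, 12, 197, 197) for _ in range(12)]
rels = [torch.rand(2, 12, 197, 197) for _ in range(12)]
print(transformer_attribution_rollout(grads, rels).shape)   # torch.Size([2, 197, 197])
```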
`SUPPORTED_MODULES = [Attention]` (class-attribute, instance-attribute)
`alpha = alpha` (instance-attribute)
`beta = beta` (instance-attribute)
`stabilizer = stabilizer` (instance-attribute)
`zennit_canonizers = zennit_canonizers or []` (instance-attribute)
`layer = layer` (instance-attribute)
`zennit_composite` (property)
`attributor` (property)
`EXPLANATION_TYPE: ExplanationType = 'attribution'` (class-attribute, instance-attribute)
`TUNABLES = {}` (class-attribute, instance-attribute)
`model = model.eval()` (instance-attribute)
`forward_arg_extractor = forward_arg_extractor` (instance-attribute)
`additional_forward_arg_extractor = additional_forward_arg_extractor` (instance-attribute)
`device` (property)
`n_classes = n_classes` (instance-attribute)
`interpolate_mode = interpolate_mode` (instance-attribute)
`head_fusion_method = head_fusion_method` (instance-attribute)
`discard_ratio = discard_ratio` (instance-attribute)
`head_fusion_function` (property)
__init__(model: Module, interpolate_mode: Literal['bilinear'] = 'bilinear', head_fusion_method: Literal['min', 'max', 'mean'] = 'mean', discard_ratio: float = 0.9, alpha: float = 2.0, beta: float = 1.0, stabilizer: float = 1e-06, zennit_canonizers: Optional[List[Canonizer]] = None, layer: Optional[Union[Module, Sequence[Module]]] = None, forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, additional_forward_arg_extractor: Optional[ForwardArgumentExtractor] = None, n_classes: Optional[int] = None) -> None
`default_head_fusion_fn(attns)` (staticmethod)
collect_attention_map(inputs, targets)
rollout(grads, rels)
__repr__()
copy()
set_kwargs(**kwargs)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]
Computes attributions for the given inputs and targets.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Tensor` | The input data. | required |
| `targets` | `Tensor` | The target labels for the inputs. | required |

Returns:

| Type | Description |
|---|---|
| `Union[Tensor, Tuple[Tensor]]` | The result of the explanation. |
get_tunables()
Provides tunable parameters for the optimizer.

Tunable parameters:

- `interpolate_mode` (str): one of `"bilinear"` or `"bicubic"`
- `head_fusion_method` (str): one of `"min"`, `"max"`, or `"mean"`
- `discard_ratio` (float): chosen from 0 to 0.95 in steps of 0.05
__init_subclass__() -> None
GenericAttention
Bases: AttentionRolloutBase
`EXPLANATION_TYPE: ExplanationType = 'attribution'` (class-attribute, instance-attribute)
`SUPPORTED_MODULES = [Attention]` (class-attribute, instance-attribute)
`TUNABLES = {}` (class-attribute, instance-attribute)
`model = model.eval()` (instance-attribute)
`forward_arg_extractor = forward_arg_extractor` (instance-attribute)
`additional_forward_arg_extractor = additional_forward_arg_extractor` (instance-attribute)
`device` (property)
`n_classes = n_classes` (instance-attribute)
`interpolate_mode = interpolate_mode` (instance-attribute)
`head_fusion_method = head_fusion_method` (instance-attribute)
`discard_ratio = discard_ratio` (instance-attribute)
`head_fusion_function` (property)
__init__(model: Module, alpha: float = 2.0, beta: float = 1.0, stabilizer: float = 1e-06, head_fusion_function: Optional[Callable[[Tensor], Tensor]] = None, n_classes: Optional[int] = None) -> None
__repr__()
copy()
set_kwargs(**kwargs)
attribute(inputs: Union[Tensor, Tuple[Tensor]], targets: Tensor) -> Union[Tensor, Tuple[Tensor]]
Computes attributions for the given inputs and targets.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Tensor` | The input data. | required |
| `targets` | `Tensor` | The target labels for the inputs. | required |

Returns:

| Type | Description |
|---|---|
| `Union[Tensor, Tuple[Tensor]]` | The result of the explanation. |
get_tunables()
Provides tunable parameters for the optimizer.

Tunable parameters:

- `interpolate_mode` (str): one of `"bilinear"` or `"bicubic"`
- `head_fusion_method` (str): one of `"min"`, `"max"`, or `"mean"`
- `discard_ratio` (float): chosen from 0 to 0.95 in steps of 0.05