Interval Bound Propagation 101

The purpose of this post is to explain the basics of the Interval Bound Propagation (IBP) method. It is a powerful tool that provides guarantees about the output ranges of a neural network. I mostly follow notation from deep reinforcement learning (RL), so the typical neural network for which I want to compute bounds is a neural policy.

Setup

Suppose we have:

  • A feed-forward NN with monotonic activations:
\[\pi_{\theta} : \mathbb{R}^{N} \to \mathbb{R}^{M} \quad \text{(policy that maps states into action probabilities)}\]

where $N$ is the dimension of the state input vector (i.e. how many features represent the agent’s state) and $M$ is the number of actions available to the agent. The parameters are $\theta = \{(W_i, b_i)\}_{i=1}^{K}$, and the neural network (policy) $\pi_{\theta}$ is composed of $K$ layers.

Each layer performs the following operations:

\[\hat{z}_k = W_k z_{k-1} + b_k, \quad z_k = \sigma(\hat{z}_k)\]

where $z_0 = s$, $\pi_{\theta}(s) = \hat{z}_K$, and $\sigma$ is the activation function, which we assume to be monotonic (e.g. ReLU).
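
As a concrete reference point, here is a minimal NumPy sketch of such a network (function and variable names are my own, not from any particular library; the last layer returns the raw pre-activation $\hat{z}_K$ as the logits, as in the setup above):

```python
import numpy as np

def relu(x):
    """Monotonic activation (ReLU)."""
    return np.maximum(x, 0.0)

def forward(s, params):
    """Run the K-layer policy network on state s.

    params is a list of (W_k, b_k) tuples.  The activation is applied
    after every layer except the last, whose pre-activation is the
    output pi_theta(s) = z_hat_K.
    """
    z = np.asarray(s, dtype=float)
    for k, (W, b) in enumerate(params):
        z_hat = W @ z + b
        z = relu(z_hat) if k < len(params) - 1 else z_hat
    return z
```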

Interval Arithmetic

We can define interval representations for matrices, and their corresponding arithmetic operations. We represent each matrix $A$ by its element-wise interval bounds $[A^L, A^U]$.

Given interval matrices $[A^L, A^U]$ and $[B^L, B^U]$, we can define the arithmetic operations:

Interval Matrix Addition

\([A^L, A^U] \oplus [B^L, B^U] = [A^L + B^L, \; A^U + B^U]\)
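
In code, interval addition is just elementwise addition of the corresponding endpoints; a small sketch (names are mine):

```python
import numpy as np

def interval_add(A_L, A_U, B_L, B_U):
    """Elementwise interval addition: [A^L, A^U] + [B^L, B^U]."""
    return A_L + B_L, A_U + B_U
```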

Interval Matrix Multiplication

Using centre matrices:

  • $A_\mu = (A^L + A^U) / 2$,
  • $B_\mu = (B^L + B^U) / 2$,

and radius matrices:

  • $A_r = (A^U - A^L) / 2$,
  • $B_r = (B^U - B^L) / 2$,

Rump’s algorithm gives \([A^L, A^U] \otimes [B^L, B^U] = [C^L, C^U]\), where:

  • $C^L = A_\mu B_\mu - |A_\mu| B_r - A_r |B_\mu| - A_r B_r$
  • $C^U = A_\mu B_\mu + |A_\mu| B_r + A_r |B_\mu| + A_r B_r$

Here $|\cdot|$ denotes the elementwise absolute value; it is needed because entries of the centre matrices may be negative, whereas the radius matrices are elementwise non-negative by construction.

The operations defined this way ensure that, for all $A' \in [A^L, A^U]$ and all $B' \in [B^L, B^U]$:

  • $A' + B' \in [A^L, A^U] \oplus [B^L, B^U]$
  • $A' B' \in [A^L, A^U] \otimes [B^L, B^U]$
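
Rump’s multiplication fits in a few lines of NumPy. A minimal sketch (the function name is my own; note the elementwise absolute values on the centre matrices, which keep the bounds valid when centre entries are negative):

```python
import numpy as np

def interval_matmul(A_L, A_U, B_L, B_U):
    """Rump's algorithm for interval matrix multiplication.

    Returns (C_L, C_U) such that A' @ B' lies in [C_L, C_U]
    elementwise for any A' in [A_L, A_U] and B' in [B_L, B_U].
    """
    A_mu, A_r = (A_L + A_U) / 2, (A_U - A_L) / 2  # centre / radius of A
    B_mu, B_r = (B_L + B_U) / 2, (B_U - B_L) / 2  # centre / radius of B
    center = A_mu @ B_mu
    radius = np.abs(A_mu) @ B_r + A_r @ np.abs(B_mu) + A_r @ B_r
    return center - radius, center + radius
```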

Bounding the Forward Pass

Using the interval operations defined above, we can propagate bounds through the neural network.

For a fixed input $s$ and any parameters $W_k \in [W_k^L, W_k^U]$ and $b_k \in [b_k^L, b_k^U]$, for all $k \in \{1, \ldots, K\}$, we compute interval bounds:

\[[\hat{z}_k^L, \hat{z}_k^U] = [W_k^L, W_k^U] \otimes [z_{k-1}^L, z_{k-1}^U] \oplus [b_k^L, b_k^U]\]

\[[z_k^L, z_k^U] = [\sigma(\hat{z}_k^L), \; \sigma(\hat{z}_k^U)]\]

starting from the degenerate input interval $[z_0^L, z_0^U] = [s, s]$. Monotonicity of $\sigma$ is exactly what allows us to apply it directly to the interval endpoints.

This procedure provides the guarantee that:

\[\pi_{\theta}(s) \in [\hat{z}_K^L, \hat{z}_K^U]\]

for all parameter combinations within their respective intervals. In words, for a given state input $s$, the output logits of the network are guaranteed to lie within the interval $[\hat{z}_K^L, \hat{z}_K^U]$ for any choice of neural policy’s parameters within their specified bounds.
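
The whole procedure can be sketched in NumPy. This is a minimal illustration, assuming ReLU activations and raw logits at the output; all function and variable names are my own:

```python
import numpy as np

def ibp_forward(s, param_bounds):
    """Propagate interval bounds through a K-layer ReLU policy network.

    param_bounds is a list of (W_L, W_U, b_L, b_U) tuples holding
    elementwise bounds on each layer's parameters.  The state s is a
    fixed input, i.e. the degenerate interval [s, s].
    Returns (z_L, z_U): bounds on the output logits z_hat_K.
    """
    z_L = z_U = np.asarray(s, dtype=float)
    K = len(param_bounds)
    for k, (W_L, W_U, b_L, b_U) in enumerate(param_bounds):
        # Rump's product [W_L, W_U] (x) [z_L, z_U], then (+) the bias bounds.
        W_mu, W_r = (W_L + W_U) / 2, (W_U - W_L) / 2
        z_mu, z_r = (z_L + z_U) / 2, (z_U - z_L) / 2
        center = W_mu @ z_mu
        radius = np.abs(W_mu) @ z_r + W_r @ np.abs(z_mu) + W_r @ z_r
        zh_L, zh_U = center - radius + b_L, center + radius + b_U
        if k < K - 1:
            # Monotonic activation: apply it to the endpoints directly.
            z_L, z_U = np.maximum(zh_L, 0.0), np.maximum(zh_U, 0.0)
        else:
            # The policy's output is the last pre-activation (the logits).
            z_L, z_U = zh_L, zh_U
    return z_L, z_U
```

Any network whose parameters are sampled from inside the given intervals produces logits inside the returned bounds.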

Bounding the Specification $\varphi$

Assume we have the interval bounds on the output logits of the network, $[\hat{z}_K^L, \hat{z}_K^U]$, computed using the procedure above.

Standard performance metrics, such as model accuracy, are monotonically increasing in the logit of the correct class, and monotonically decreasing in the logits of incorrect classes.

The worst-case output logits within the interval $[\hat{z}_K^L, \hat{z}_K^U]$ are given by:

\[\hat{z}_K^{\text{worst}} = \hat{z}_K^L \cdot e_y + \hat{z}_K^U \cdot (1 - e_y)\]

where $e_y \in \mathbb{R}^M$ is the one-hot vector corresponding to the true class (here, the target action) $y$: the true class gets its lowest possible logit, every other class its highest.
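
The worst-case selection is a one-liner in NumPy (a small sketch with names of my own):

```python
import numpy as np

def worst_case_logits(z_L, z_U, y):
    """Lower-bound the true-class logit, upper-bound all others."""
    e_y = np.zeros_like(z_L)
    e_y[y] = 1.0
    return z_L * e_y + z_U * (1.0 - e_y)
```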

To be continued…