
# Introduction to constraint-based formal verification of neural networks

Artificial intelligence (AI) has already proven useful in many fields and has even become indispensable in some, such as computer vision. In safety-critical domains, however, there is much to consider when using AI. To give a first impression, the following images show some hard-to-see traffic signs from an ordinary German street scene. In slightly varied versions of these scenes, it becomes almost impossible to recognize some of the traffic signs. We want to check whether such undesired behavior, such as failing to recognize traffic signs correctly, can be ruled out within a reasonable range of lighting conditions.

As digitization continues to advance in our everyday lives, we increasingly rely on software to perform flawlessly even in the most difficult tasks. Ideally, this reliance is backed up by evidence. For classical software, there are many established approaches for proving the intended and correct behavior of algorithms. Such proofs are useful in diverse use cases and even mandatory for safety-critical applications, such as autonomous driving. For many use cases, however, the state-of-the-art approaches are based on machine learning (ML)^{1}.

For example, in image processing today, object recognition is typically approached with ML. Such ML systems learn to solve a task using training data. While the function is defined by humans, the functionality is learned by the ML algorithm itself. The problem with ML-based applications, however, is that they currently lack established methods to formally verify their intended behavior. Therefore, it is currently not possible to derive the same guarantees as in classical software development. Closing this gap is being pursued intensively, especially by industries that want to use ML components in safety-critical applications. An obvious example at this point is the automotive industry, which is working on automated driving based on ML components.

This blog series is dedicated to the formal verification of ML models. For this purpose, we first introduce the topic in general and then present a concrete approach for neural networks. Someone who has not yet dealt with this topic may ask themselves questions like:

- What do we want to verify?
- What is formal verification?
- Why do we need formal verification?
- How to verify neural networks using constraints?

If such questions have arisen, then this is the right place for some answers! So without further ado, let's start with the most important questions ...

## What do we want to verify?

In short, the correct functionality of ML systems such as neural networks. More specifically, we need a formal description of the intended behavior of the ML system, including definitions for malfunctions and the input domain in which the system should function properly. To illustrate, let's already consider the example that will be the final goal of this blog series: a formally verified stop sign detector.

Thus, let us assume a neural network for stop sign detection trained on a subset of the German Traffic Sign Detection Benchmark (GTSDB) dataset containing only images with exactly one stop sign. We then expect this network to correctly detect stop signs in images like the following:

The emphasis here is on "images like the following". This subset of the GTSDB is only a sample of all scenarios encountered in road traffic in Germany. Ideally, this sample is also representative of the data we have not seen. A dataset should therefore represent the general concept of the relevant scenarios. In ML terms, this means that the data represent their generating process and are thus representative of the data distribution encountered in the real world.

However, if data from a different distribution is used as input to an already trained ML model, it would be dangerous to expect the model to still work properly. As an example of this, below are images that are not conceptually represented by the GTSDB dataset:

Ensuring the correct behavior of ML algorithms under *domain shifts*, i.e., correctly processing data from outside the training data distribution, is addressed by the field of domain adaptation, which is itself an extensive topic. However, we do not intend to perform domain adaptation. This brief digression is merely meant to illustrate that we need to think about our domain and, in particular, our expectations about the behavior of the neural network within that domain.

In practice, we have to deal with partially obscured road signs, difficult weather conditions, versatile lighting situations and many other difficult situations. For example, here is an image from the GTSDB dataset with a stop sign that is quite difficult to see even for the human eye:

This example is not even the most difficult case to expect and consider. Unfavorable circumstances can make it even more difficult to see stop signs like this one. Such circumstances arise, for example, from poor weather conditions such as fog or difficult lighting conditions at sunrise and sunset. Therefore, we are faced with the question of how good an ML system needs to be for its intended task. Especially when ML systems are compared to humans, some difficulties arise here. This question has to be answered individually for each ML function depending on its purpose and the associated risks. We do not want to define safety requirements for our stop sign detector at this point. However, it should be kept in mind that ML systems have their limitations and we have to specify exactly the domain for which we expect correct behavior.

This brings us back to the initial question "What should be verified?". We want to prove that neural networks function as intended within the defined domain. For our example, this means that the system should be insensitive to changes in lighting conditions up to certain thresholds. In practice, such thresholds are usually derived from safety requirements.
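To make "up to certain thresholds" a bit more tangible: a simple global brightness change can be written as adding the same offset \beta to every pixel of the input image x, with the offset bounded by a threshold \varepsilon (the symbols \beta and \varepsilon are illustrative here, not notation used later in the series):

x' = x + \beta \cdot \mathbf{1}, \qquad \beta \in [-\varepsilon, \varepsilon]

Verification then has to show correct behavior for every x' in this set, not just for a few sampled values of \beta.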

## What is formal verification?

Formal verification is the process of proving the correctness of properties of a system. In our context, we want to guarantee the correct behavior of a neural network in certain situations. To do this, we need a formally describable hypothesis and, in addition, a formal approach to prove or refute this hypothesis. The concept we use here is "proof by contradiction". This type of proof assumes that the negation of a hypothesis is true and attempts to prove that negation. If an example exists that proves this counter-thesis, it is called a counterexample to the original thesis. Let us illustrate this with a simple example in natural language with the thesis: "All birds can fly". The corresponding counter-thesis is: "There is at least one bird that cannot fly". "Penguins" or "ostriches" would thus be counterexamples to the original thesis and would refute it.

For a more mathematical example, the following block can be unfolded; it already gives a small preview of how the approach presented in Part 2 works.

For a proof by contradiction, a counter-thesis is established that mathematically formulates the opposite of the original thesis. Proofs of this counter-thesis are at the same time counterexamples to the original thesis. This may seem unnecessarily complicated at first glance, but in practice this approach allows a simple construction of the counterexamples we are looking for, as the following example illustrates. The advantages and operation of this approach will be explained in more detail in Part 2:

Thesis:

For the given neural network model

f(x)=
\begin{pmatrix}
f_1(x) \\
f_2(x)
\end{pmatrix}=
\begin{pmatrix}
x_1 * \omega_{1,1} + x_2 * \omega_{1,2} + b_1 \\
x_1 * \omega_{2,1} + x_2 * \omega_{2,2} + b_2
\end{pmatrix}

with the weights

\begin{pmatrix}
\omega_{1,1} & \omega_{1,2} \\
\omega_{2,1} & \omega_{2,2}
\end{pmatrix} =
\begin{pmatrix}
1 & 4 \\
3 & -3
\end{pmatrix}

and bias

\begin{pmatrix}
b_1 \\
b_2
\end{pmatrix} =
\begin{pmatrix}
5 \\
1
\end{pmatrix}

there exists no input

x =
\begin{pmatrix}
x_1 \\
x_2
\end{pmatrix}, \quad x_1 \in [0, 6], \quad x_2 \in [1, 2],

such that f_1(x) \leq f_2(x).

Counter-thesis: There exists at least one input

x =
\begin{pmatrix}
x_1 \\
x_2
\end{pmatrix}, \quad x_1 \in [0, 6], \quad x_2 \in [1, 2],

such that f_1(x) \leq f_2(x).

Counterexample: x =
\begin{pmatrix}
6 \\
1
\end{pmatrix}

## Why do we need formal verification?

For safety-critical applications, such as autonomous driving, standard ML metrics such as high accuracy on a test dataset are not sufficient. Neural networks are highly complex functions that are also known to be highly volatile, especially if not properly designed and trained. This shortcoming is exploited by adversarial attacks and consequently raises legitimate concerns. Adversarial attacks are methods designed to produce inputs that force unintended behavior from neural networks. Here is a brief overview of the points about adversarial attacks most relevant to this series:

- There is a wide variety of approaches
- Most of the defense strategies specifically target certain attacks and do not necessarily improve the quality / robustness of networks
- Attacks are designed to fool networks as easily as possible, but typically not with realistic input
- Pretty much any neural network can be reliably attacked and thus deceived

The discrepancy between a high success rate of adversarial attacks and the fact that they are not necessarily designed to produce realistic inputs leaves a large gap. However, when we look at ML systems that are part of safety-critical applications, we need accurate information about the domain of correct behavior. Although approaches to formal verification are in some ways also special types of adversarial attacks, their focus on realistic transformations that map specific input domains makes them more meaningful. Such transformations are not only more intuitively meaningful, but also minimize domain shifts when properly designed, resulting in an overall more promising verification process.

So why have we made this technical digression on the subject of adversarial attacks? There are many laws, standards and industry-specific release processes that require proof of the intended function of software components. Without this, even a perfectly functioning software function cannot be released, because it cannot be guaranteed that it will actually function perfectly. When it comes to software that contains ML components, such proofs can be very difficult and even impossible without proper methods. Thus, insofar as we consider software that uses neural networks, a suitable verification procedure must take into account the intended task of the software and the functionality of the neural network.

## How to verify neural networks using constraints?

The approach we use is based on the concept of proof by contradiction. Since we want to show that a neural network behaves correctly, we consider the counter-hypothesis that it does not behave correctly within the search space. Consequently, we try to find a counterexample to correct behavior. If no such counterexample exists, the original hypothesis is proven, i.e., the neural network behaves safely with respect to the tested property. In the presented approach, the search space is represented by a single sample or a data set and at least one transformation, such as a contrast or brightness change. The transformations map specific properties and are representative of a defined expected behavior. We now need all this information in a formally verifiable form.

For this purpose, we use Mixed Integer Linear Programming (MILP). MILP encodings are essentially a system of linear (in)equations and variable bounds that provides a suitable representation for optimization and constraint satisfaction problems. The advantage of such MILP encodings is that both integer and continuous (floating point) variables can be used, which allows a representation of neural networks with linear constraints. Thus, if we want to verify that a neural network is robust, for example against obscuration to a certain degree, we need to translate the following components into an MILP encoding:

- The search space based on test data and the transformations
- All operations of the neural network
- The definition of correct behavior / counterexamples of correct behavior.
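For the toy model from the earlier example, this search for a counterexample can be sketched with an off-the-shelf solver. The toy network is purely affine, so a plain LP without integer variables suffices here; it is the nonlinear activations of real networks that make integer variables necessary. A minimal sketch using SciPy (`scipy.optimize.linprog` is chosen for illustration and is not necessarily the solver used in this series):

```python
from scipy.optimize import linprog

# Counter-thesis as a feasibility problem: find x with
#   x1 in [0, 6], x2 in [1, 2]  and  f1(x) <= f2(x).
# f1(x) - f2(x) = (x1 + 4*x2 + 5) - (3*x1 - 3*x2 + 1) = -2*x1 + 7*x2 + 4,
# so we minimize f1 - f2 over the box and check whether the minimum is <= 0.
res = linprog(c=[-2, 7], bounds=[(0, 6), (1, 2)])

gap = res.fun + 4                    # minimal value of f1(x) - f2(x)
if gap <= 0:
    print("counterexample:", res.x)  # the thesis is refuted
else:
    print("no counterexample exists within the bounds")
```

The solver returns the vertex x = (6, 1), exactly the counterexample given in the example above.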

Thus, we have now introduced in detail the task of "translating human-understandable safety criteria into a sound and complete mathematical problem suitable as evidence for releasing software with neural networks as components". In the next part we will look in detail at an approach that tackles this challenging task ...

^{1} Typically, AI refers to systems that behave "intelligently" with respect to a specific task, and ML refers to the approaches/algorithms that learn concepts or behavior from data, i.e., ML is seen as part of AI. We want to avoid confusion about such definitions, so from here on we will only use the term ML.

List of figures:

Figures 1-8, 11-13: Houben, S., Schlipsing, M., Stallkamp, J., Salmen, J., & Igel, C. (2013). Proposal for IJCNN 2013 competition: The German Traffic Sign Detection Benchmark.

# English Version

## Part 1: Introduction to Constraint-based Formal Verification of Neural Networks

Nowadays, Artificial Intelligence (AI) has proven to be useful in many domains and is even indispensable in some areas like computer vision. However, there are many things to consider if AI is used in safety-critical areas. To give a first impression, the following images show some difficult-to-detect traffic signs from an ordinary German street scene. Additionally, there are slight variations of those scenes where it is even more difficult to detect some of the traffic signs. We want to verify that unwanted behavior, like not detecting street signs properly, can be ruled out within a reasonable domain of illumination conditions like the ones in the images shown below:

While our everyday life is becoming more and more digital, we also increasingly rely on software to work properly even on the hardest tasks. Ideally, this trust is backed by evidence. For software, there are many established approaches to prove the intended and correct behavior of algorithms. This is necessary for diverse application areas and even mandatory for safety-critical applications like autonomous driving. For many use cases, machine learning (ML)^{1} based algorithms are the state-of-the-art approach.

For example, in image processing, object detection is typically approached with ML nowadays. Such ML systems learn to solve a task from training data. While the function is defined by humans, the functionality is learned by the ML algorithm itself. However, the problem with ML-based applications is that they currently lack established methods for the formal verification of their intended behavior. Consequently, with state-of-the-art methods we cannot derive the same guarantees as in classical software development. Closing this gap is being pushed forward with high pressure, especially by industries that want to deploy ML components in safety-critical applications. An obvious example at this point is the automotive industry, which is working on automated driving based on ML components.

This blog series is dedicated to formal verification of ML models. We introduce the topic in general and additionally present a concrete approach designed for neural networks. If you have never dealt with this topic before, the title may already raise some questions like:

- What do we want to verify?
- What is formal verification?
- Why do we need formal verification?
- How do we formally verify neural networks using constraints?

If such questions come to your mind, then you are at the right place. For now, let us start with the most important questions ...

## What do we want to verify?

In short, the correct functionality of ML systems like neural networks. More precisely, we need a formal description of the intended behavior of the ML system, including definitions for malfunctions and the input domain where the system should work properly. To illustrate this, we look at the example that will be the final goal of this series: a formally verified stop sign detector.

A neural network has been trained to detect stop signs on a subset of the German Traffic Sign Detection Benchmark (GTSDB) data set in which each picture contains exactly one stop sign. Therefore, we expect this network to detect stop signs correctly in images like the ones below:

The emphasis lies on "images like the ones below". This set of training data from the GTSDB is just a sample of all scenarios that can be encountered in traffic in Germany. Ideally, this sample is also representative of the data that we have not seen. Hence, a dataset should stand for the general concept of the relevant scenarios. In ML terms, this means that the data should sufficiently depict their generating process and therefore be representative of the data distribution encountered in the real world. However, if data from another distribution is taken as input to an already trained ML model, it would be dangerous to expect it to still work properly. The following pictures show some examples of stop signs which are not sufficiently represented by the GTSDB:

Enabling ML algorithms to perform under domain shifts, i.e., handle data from outside the training data distribution, is addressed by the field of domain adaptation, which itself is an extensive topic. However, we are not dealing with domain adaptation in this blog, but this short digression should make clear that we must think about our domain and especially about our expectations about the behavior of the neural network within this domain.

In practice, we must deal with partially occluded traffic signs, difficult weather conditions, various lighting situations and many other tricky situations. For example, here is an image from the GTSDB dataset with a stop sign that is pretty hard to detect, even for the human eye:

This example is not even the worst case we have to expect and consider. Unfavorable circumstances can make it even harder to detect stop signs like this one. Such circumstances arise for example from bad weather conditions like fog or difficult lighting conditions encountered at sunrise and sunset. Therefore, we are facing the question of how good an ML system has to be in its intended task, especially in comparison to humans. This difficult question must be answered individually for each ML function depending on its purpose. The important part to keep in mind is that ML systems have their limitations, and we must precisely specify the domain for which we expect correct behavior.

This brings us back to the initial question of "what should be verified?". We want to prove that neural networks work as intended within the defined search space. For our example, this means that the system should be resilient against influences of different lighting conditions up to certain thresholds. In practice, such threshold values are usually derived from safety requirements.

## What is formal verification?

Formal verification is the process of proving the correctness of properties of a system. In our context, we want to guarantee correct behavior of a neural network in certain situations. For that, we need a formally describable hypothesis and additionally a formal approach to prove or refute this hypothesis. The concept we are utilizing here is called "proof by contradiction". This type of proof assumes that the opposite of a hypothesis is true and tries to prove this counter-thesis. If an example proving the counter-thesis exists, we call this example a counterexample in regard to the original hypothesis.

Let us illustrate this with a simple example based on natural language: the thesis should be "all birds can fly". Then the counter-thesis would be "there exists at least one bird that cannot fly", and "penguins" or "ostriches" would be counterexamples for the original thesis, thus refuting it.

Unfold the block below for a more mathematical example, which is a little spoiler for the functionality of the approach presented in Part 2.

\textsf{\small {Thesis:}} \\ \textsf{\small {For\ the\ given\ model}}\ f(x)= \begin{pmatrix} f_1(x) \\ f_2(x) \end{pmatrix}= \begin{pmatrix} x_1 * \omega_{1,1} + x_2 * \omega_{1,2} + b_1 \\ x_1 * \omega_{2,1} + x_2 * \omega_{2,2} + b_2 \end{pmatrix}\\ \textsf{\small {with\ the\ weights}}\ \begin{pmatrix} \omega_{1,1} & \omega_{1,2} \\ \omega_{2,1} & \omega_{2,2} \end{pmatrix} = \begin{pmatrix} 1 & 4 \\ 3 & -3 \end{pmatrix} \textsf{\small {and\ bias}}\ \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 5 \\ 1 \end{pmatrix} \\ \textsf{\small {no\ input}}\ x = \begin{matrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} & \begin{matrix} x_1 \in [0, 6]\\ x_2 \in [1, 2] \end{matrix} \end{matrix} \textsf{\small {, \ exists\ such\ that}}\ f_1(x) \leq f_2(x). \\ \textsf{\small {Counter-Thesis: \ There\ exists\ at\ least\ one\ input}}\ x = \begin{matrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} & \begin{matrix} x_1 \in [0, 6]\\ x_2 \in [1, 2] \end{matrix} \end{matrix} , \\ \textsf{\small {such\ that}}\ f_1(x) \leq f_2(x). \\ \textsf{\small {Counterexample:}}\ x = \begin{pmatrix} 6 \\ 1 \end{pmatrix}

## Why do we need formal verification?

For some safety-critical scenarios, like autonomous driving, standard ML metrics like a good test set accuracy are simply not enough. Neural networks are highly complex functions, known to also be highly volatile if not designed and trained properly. This flaw is exploited by adversarial attacks and consequently raises legitimate concerns. Adversarial attacks are methods designed to create inputs that force unintended behavior of neural networks. At this point, we want to give a brief overview of the points about adversarial attacks most relevant to this blog:

- There is a high diversity of approaches
- Most defense strategies address specific attacks and do not necessarily improve the overall performance / robustness of networks
- Attacks are designed to fool the network as easily as possible, but typically not necessarily with a focus on realism
- Pretty much any neural network can be attacked and fooled quite reliably

The discrepancy between the high success rate of adversarial attacks and the fact that they are not necessarily meant to produce realistic inputs leaves a big gap. However, when we look at ML systems that are part of safety-critical applications, we need precise information about the domain of correct behavior. Although approaches for formal verification are also special kinds of adversarial attacks, their focus on realistic input transformations, which represent specified domains of relevant inputs, makes them more meaningful. Such transformations are not only intuitively more significant, but also minimize domain shift if designed properly, resulting in an altogether more promising verification process.

So, why did we do this technical digression? There are lots of laws, norms and industry specific release processes, requiring proofs that software components work as intended. Without them, even a perfectly working software function cannot be released, because we cannot assure that it is indeed working perfectly. When we are talking about software that includes ML components, such proofs can be exceedingly difficult and without suitable verification methods even impossible to give. Hence, when we look at software that uses neural networks, a suitable verification method must consider the intended task of the software and the functionality of the neural network.

## How do we formally verify neural networks using constraints?

The approach we are using is based on the concept of proof by contradiction. Since we want to show that a neural network behaves correctly, we look at the counter-hypothesis stating that it does not behave correctly within the search space. Consequently, we try to find a counterexample for correct behavior. If no such counterexample exists, then the original thesis holds true, i.e., the neural network behaves safely with regard to the checked property. With the presented approach, the search space is defined by a sample or a data set and at least one transformation, like a change of contrast or brightness. The robustness property that is checked is the invariance of the neural network's output on the data under the given transformations; for example, classification results stay the same when the luminosity or contrast of the input is changed.

Now we need all this information in a formally verifiable form. For this, we are using mixed integer linear programming (MILP). MILP encodings are essentially a system of linear (in)equations and variable bounds, which is a suitable representation for optimization and constraint satisfaction problems. The advantage of such MILP encodings is that both integer and continuous variables can be used, allowing a representation of neural networks with linear constraints.
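To make "representing a neural network with linear constraints" concrete: the layers themselves are affine, and the only nonlinear part of a typical feed-forward network is the activation function. As background (Part 2 presents the encoding actually used in this series), a ReLU y = \max(0, x) with known input bounds x \in [L, U], L < 0 < U, can be encoded with one binary variable \delta in the standard big-M style:

\begin{aligned}
y &\geq 0, \quad y \geq x, \\
y &\leq x - L(1 - \delta), \\
y &\leq U\delta, \quad \delta \in \{0, 1\}.
\end{aligned}

For \delta = 1 the constraints collapse to y = x (the active case), and for \delta = 0 to y = 0 (the inactive case); such binary variables are exactly why integer variables are needed in the encoding.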

Thus, if we want to verify that a neural network is robust, for example against darkening up to a certain degree, then we have to translate the following parts into an MILP encoding:

- The search space, which is based on test data and the transformations
- All operations of the neural network
- The definition of correct behavior / counterexamples for correct behavior

Up to this point we have extensively introduced the task of "translating human-understandable safety criteria into a sound and complete mathematical problem which is suitable as evidence for the release of software with neural networks as components". In the next article we are going to look in detail at an approach that performs this challenging task ...

^{1} Typically, AI refers to systems that behave "intelligently" in relation to a specific task, and ML means approaches/algorithms that learn concepts or behavior from data. This means ML is commonly seen as part of AI. We want to avoid confusion about such definitions and therefore only use the term ML from this point on.

Table of Figures:

Figures 1-8, 11-13: Houben, S., Schlipsing, M., Stallkamp, J., Salmen, J., & Igel, C. (2013). Proposal for IJCNN 2013 competition: The German Traffic Sign Detection Benchmark.