2.3 The Qualifier Lattice

Next: 2.4 Type Inference Up: 2 Background Previous: 2.2 Type Qualifiers and

2.3 The Qualifier Lattice

The cqual tool needs to know not only how integer types with qualifiers relate but also how qualifiers affect pointer types, pointer-to-pointer types, function types, and so on. Fortunately, standard results on subtyping tell us how to extend the subtyping on integers to other data types [1,28].

We supply cqual with a configuration file placing the qualifiers (in this case, tainted and untainted) in a lattice [14]. A lattice is a partial order where for each pair of elements and , the least upper bound and greatest lower bound of and both always exist. Using a lattice makes the implementation slightly easier. For finding format string bugs, we specify in the lattice configuration file that $\texttt{untainted}{} < \texttt{tainted}{}$ .

Given this configuration file, cqual extends the supplied lattice on qualifiers to a subtyping relation on qualified C types. We have already seen one of the subtyping rules:

$\displaystyle \inference{Q_1 \leq Q_2} {Q_1\;\texttt{int}{} \leq Q_2\;\texttt{int}{}}$

This is a natural-deduction style inference rule. In general, an inference rule says that if the statements above the line are true, then the statements below the line are also true. This particular inference rule is read as follows: If $Q_1 \leq Q_2$ in the lattice (

and

are qualifiers), then $Q_1\;\texttt{int}{}$ is a subtype of $Q_2\;\texttt{int}{}$ (note the overloading of $\leq$ ). For our example, it means that $\texttt{untainted}{}\;\texttt{int}{} \leq \texttt{tainted}{}\;\texttt{int}{}$ . The same kind of rule applies to any primitive type (char, double, etc.).

For pointer types, we need to be a little careful. Naively, we might expect to use the following rule for pointers:

$\displaystyle \inference[(Wrong)]{Q_1 \leq Q_2 & T_1 \leq T_2} {Q_1\;\textit{ptr}(T_1) \leq Q_2\;\textit{ptr}(T_2)}$

Here the type $Q_1\;\textit{ptr}(T_1)$ is a pointer to type

, and the pointer is qualified with

. Note that

represents an extended C type, and thus may itself be decorated with tainted/untainted qualifiers. In C, the type $Q_1\;\textit{ptr}(T_1)$ might be written

typedef T1 *ptr_to_t1;
typedef Q1 ptr_to_t1 q1_ptr_to_t1;

The rule (Wrong) says that if $Q_1\leq Q_2$ in the lattice and

is a subtype of

, then we can conclude that $Q_1\;\textit{ptr}(T_1)$ is a subtype of $Q_2\;\textit{ptr}(T_2)$ .

Unfortunately, this turns out to be unsound, as illustrated by the following code fragment:

  tainted char *t;
  untainted char *u;

  t = u;    /* Allowed by (Wrong) */
  *t = <tainted data>;
    /* Oops! This writes tainted data
       into untainted buffer *u */

According to (Wrong), the first assignment t = u typechecks, because $\textit{ptr}(\texttt{untainted}\;\texttt{char})$ is a subtype of $\textit{ptr}(\texttt{tainted}\;\texttt{char})$ . Then *t becomes an alias of *u, yet they have different types. Therefore we can store tainted data into *u by going through *t, even though *u is supposed to be untainted.

This is a well-known problem, and the standard solution, which is followed by cqual, is to use the following rule:

$\displaystyle \inference{Q_1 \leq Q_2 & \tau_1 = \tau_2} {Q_1\;\textit{ptr}(\tau_1) \leq Q_2\;\textit{ptr}(\tau_2)}$

The key restriction here is that $\tau_1 = \tau_2$ . Intuitively, this restriction says that any two objects that may be aliased must be given exactly the same type.²In particular, if $\tau_1$ and $\tau_2$ are decorated with qualifiers, the qualifiers must themselves match exactly, too.

**Figure 3:** An example of constraint generation. The left column is a code fragment; the right column gives the inferred constraints on the qualifier variables.
$\begin{figure}\small\begin{tabular}{ll} \texttt{tainted char getenv(const char... ...\ &$\textit{t\_p} \leq \textit{printf\_arg0\_p}$\\ \end{tabular}\end{figure*}$

Next: 2.4 Type Inference Up: 2 Background Previous: 2.2 Type Qualifiers and

Umesh Shankar 2001-05-16