The reference manual for the KQuery language
The KQuery language is the textual representation of constraint expressions and queries which is used as input to the Kleaver constraint solver.
Currently the language is capable of representing quantifier free formulas over bitvectors and arrays, with direct support for all standard operations on bitvectors. The language has been designed to be compact and easy to read and write.
The KQuery language is closely related to the C++ API for Exprs, see also the doxygen Expr documentation.
In this document, syntax is given in Extended Backus-Naur Form and appears as:
Unless noted, the rules are described in terms of tokens not characters, and tokens can be separate by white space and comments.
In some case, a production like
child-expression is used as an alias for the
expression production, when subsequent text needs to differentiate the expression.
Examples are shown using:
(Eq w32 a b)
A KQuery source file consists of a sequence of declarations.
Currently, the language supports two kinds of declarations:
- Array Declarations: Use to declare an array of bitvectors for use in subsequent expressions.
- Query Commands: Used to define queries which should be executed by the constraint solver. A query consists of a set of constraints (assumptions), a query expression, and optionally expressions and arrays to compute values for if the query expression is invalid.
Comments begin with “#” and continue until the end of line. For example:
(Add w32 1 1) # Two, hopefully
Expression and Version Labels
Expressions are frequently shared among constraints and query expressions. In order to keep the output succinct and readable, expression labels can be used to introduce a lexical binding which can be used in subsequent expressions. Expression labels are globally scoped through the entire source file, and a definition must preceed any use in the source file.
Likewise, versions are frequently shared among reads and can be labelled in the same fashion.
(Add w32 N0:(Add w32 1 1) N0) # Four
array const_array : w32 -> w8 = [5,6]
(Read w8 0 U0:[0=255] @ const_array) # U0 now refers to an array [255,6]
(Read w8 1 U0) # Read from byte offset 1 of [255,6]
Identifiers are used for specifying array names and for expression labels.
Note that in order to keep open the possibility to introduce explicit integral and floating-point types, the following identifiers are treated as reserved keywords:
floating-point-type = "fp[0-9]+([.].*)?"
integer-type = "i[0-9]+"
Numeric constants can be specified as follows.
0b1000_0001 # 129
Non-decimal constants can be signed. The “_” character is ignored when evaluating constants, but is available for use as a separator.
Types are explicit operands to most expressions, and indicate the bit-width of the type.
The numeric portion of the token is taken to be a decimal integer specifying the bit-width of the type.
Arrays are the basic type for defining symbolic variables (the language does not currently support simple variables).
Arrays can be initialized to be either symbolic, or to have a given list of constant values. For constant arrays, the initializer list must exactly match the size of the array (if the size was unspecified, it will be the number of constant values).
array foo : w32 -> w8 = symbolic # A ten element symbolic array
array foo : w8 -> w1 = [ true, false, false, true ] # A constant array of
Query declarations describe the queries that the constraint solver should run, along with optional additional arguments to specify expressions and arrays for which counterexamples should be provided.
(query  false)
(query [(Eq w8 (Read w8 0 mem) 10)] false  [ mem ])
A query command consists a query, consisting of a constraint list and a query expression, and two optional lists for use when a counterexample is desired.
constraint-list is a list of expressions (with boolean type) which are assumed to hold. Although not required in the language, many solvers require that this set of constraints be consistent. The
query-expression is the expression to determine the validity of.
If a counterexample is desired for invalid queries,
eval-expr-list is a list of expressions for which a possible value should be constructed, and
eval- array-list is a list of arrays for which values for the entire array should be provided. All counterexamples results must be simultaneously feasible.
Versions are used to refer to an array with an ordered sequence of writes to it.
array small_array : w32 -> w8 = symbolic # The array we will read from
(Read w8 0 small_array) # No Updates to small_array
(Read w8 1 [1=0xff] @ small_array) # Read from small_array at byte offset 1
with update where byte 1 set to decimal 255
A version can be specified either by an identifier, which can refer to an array or a labelled version, or by an explicit list of writes which are to be concatenated to another version (the most recent writes are first).
Expressions are strongly typed, and have the following general form:
EXPR_NAME is the expression name,
EXPR_TYPE is the expression type (which may be optional), followed by any additional arguments.
An expression reference can be used to refer to a previously labelled expression.
Expression and version labels are in separate namespaces, it is the users responsibility to use separate labels to preserve readability.
Constants are specified by a numeric token or a type and numeric token.
When a constant is specified without a type, the resulting expression is only well-formed if its type can be inferred from the enclosing context. The
false constants always have type
(Add w32 10 20) # The type for 10 and 20 is inferred to be w32.
Add, Sub, Mul, UDiv, SDiv, URem, SRem
Arithmetic operations are always binary and the types of the left- and right- hand side expressions must match the expression type.
Truncated unsigned division. Undefined if divisor is 0.
Unsigned remainder. Undefined if divisor is 0.
Signed division. Undefined if divisor is 0.
Signed remainder. Undefined if divisor is 0. Sign of the remainder is the same as that of the dividend.
Bitwise negation. The result is the bitwise negation (one’s complement) of the input expression. If the type is specified, it must match the expression type.
And, Or, Xor, Shl, LShr, AShr
These bitwise operations are always binary and the types of the left- and right-hand side expressions must match the expression type.
Logical shift left. Moves each bit of
X to the left by
Y positions. The
Y right-most bits of
X are replaced with zero, and the left-most bits discarded.
Logical shift right. Moves each bit of
X to the right by
Y positions. The
Y left-most bits of
X are replaced with zero, and the right-most bits discarded.
Arithmetic shift right. Behaves as
LShr except that the left-most bits of
X copy the initial left-most bit (the sign bit) of
Eq, Ne, Ult, Ule, Ugt, Uge, Slt, Sle, Sgt, Sge
Comparison operations are always binary and the types of the left- and right- hand side expression must match. If the type is specified, it must be
Concat evaluates to a
type bits formed by concatenating
Extract evaluates to
type bits from
child-expression taken from
offset-number is the index of the least-significant bit in
child-expression which should be extracted.
ZExt evaluates to the lowest
type bits of
child-expression, with undefined bits set to zero.
SExt evaluates to the lowest
type bits of
child-expression, with undefined bits set to the most-significant bit of
The Read expression evaluates to the first write in
version for which
index-expression is equivalent to the index in the write. The type of the expression must match the range of the root array in
version, and the type of
index-expression must match the domain.
The Select expression evalues to
true-expression if the condition evaluates to true, and to
false-expression if the condition evaluates to false. The
cond-expression must have type
Both the true and false expressions must be well-formed, regardless of the condition expression. In particular, it is not legal for one of the expressions to cause a division-by-zero during evaluation, even if the Select expression will never evaluate to that expression.
Several common expressions are not implemented directly in the Expr library, but can be expressed in terms of other operations. A number of these are implemented as “macros”. The pretty printer recognizes and prints the appropriate Expr forms as the macro, and the parser recognizes them and turns them into the underlying representation.
This macro form can be used to generate a Sub from zero.
ReadLSB and ReadMSB can be used to simplify contiguous array accesses. The type of the expression must be a multiple
N of the array range type. The expression expands to a concatenation of
N read expressions, where each read is done at a subsequent offset from the
index-expression. For ReadLSB (ReadMSB), the concatenation is done such that the read at
index-expression forms the least- (most-) significant bits.