Summary Constraints over the Reals (c) 2004-02-22 Fabian M. Suchanek This is a summary of the course "Solving Constraints over the Real Numbers" held by Stefan Ratschan in the WS 2003 at Saarland University. By reading the following text, you accept that the author does not accept any responsibility for the correctness or completeness of this text. If you have any corrections or remarks (especially concerning th statements marked with a "(?)"), please send me a mail. This is the only way to make the publication of this summary useful for me, too. My e-mail address is f.m.suchanek@zweb.de, but the letter 'z' has to be removed from the address. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Prerequisites ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cf. also "Analysis" (analysis.txt, in German) Set: An unordered collection of different elements. S = {s1,...sn} S = { a | p(x) } is the set of all those x for which p(x) holds Size, cardinality of a set: The number of its elements. Notation: |{s1,s2,...sn}| = n Intersection of two sets: The set of those elements which occur in both of them. Notation: X /\ Y Union of two sets: The set of all of their elements. Notation: X \/ Y Difference of two sets: The set of those elements which occur in the first one, but not in the second one. Notation: X \ Y Bit: The set {0,1}. Boolean: The set {true, false}. Sequence, Tuple: An ordered collection of elements. Notation: t=(t[1],t[2],...t[n]) Example: t=(1,'A',abc) is a tupel Index of an element t in a tuple T: The number of elements preceding t in T. Length, Dimension of a tupel: The number of its elements. Concatenation of two tuples t1 and t2: The tuple t1 + t2 := (t1[1],t1[2],...t1[n1], t2[1],t2[2],...t2[n2]). Pair: A tupel of dimension 2. Subset of a set S: A set, the elements of which are all contained in S. Notation: S' =< S Superset of a set S: A set, which contains all elements of S. Notation: S' >= S Powerset of a set S: The set of all subsets of S. Notation: P(S) Partition of a set S: A set of subsets S[1],...S[n] of S, such that each element of S is in exactly one of them. Usually, it should be S[i]!={} and S[i]!=S. Cartesian Product of sets A[1],A[2],...A[n]: The set of all possible tuples, the first element of which is in A[1], the second element of which is in A[2] etc. Notation: A x B x C = D Example: {1,2} x {a,b,c} = { (1,a), (2,a), (1,b), (2,b), (1,c), (2,c)} n-th Power of a set S: The set S^n := S x S x S x ... S where 'S' occurs n times. Relation over the sets A[1],...A[n]: A subset of the cartesian product of A[1],...A[n]. Binary relation: A relation over two sets. Notation for a relation R < A x B: * (a,b) is an element of the relation R * r(a,b) * a r b * (a,b) is in r Example: The relation ">" is the set of all pairs of real numbers, in which the first number is greater than the second. It holds: ">" = { (1,0), (27.3,12.5), (17,-12), ... } One writes: >(3,2) or 3 > 2 or (3,2) is in ">". Relation on a set S: A relation R < S x S. Reflexive pair: A pair (x,x). Reflexive relation: A relation on a set S, which contains all reflexive pairs of S. Strict, Irreflexive relation: A binary relation, which does not include any reflexive pair. Symetric relation: A binary relation R, which for every (a,b) in R also contains (b,a) for a!=b. Anti-symetric relation: A relation on a set A, which contains either the pair (a,b) or the pair (b,a), but not both. Asymetric relation: An anti-symetric irreflexive relation. Transitive relation: A relation on a set A, which for every (a,b) and (b,c) it contains, also contains (a,c). Function: A relation f over two sets A and B, such that for every a in A there is either none or exactly one element b in B with a f b. Notation for f: f:A->B Notation for "a f b": f(a)=b. Any notion, which depends on some other notion may be considered a function. In particular, all definitions in this summary of the form "x of y and z: something" can be interpreted as a function x(y,z) := something. Arity of a function f:A->B: The number of sets, the cartesian product of which makes up A. Result of a function f:A->B for an argument x: The element of B such that f(x)=b. Result of a function f:A->B for a set S: The set of the results of f for all s in S. Notation: f(S) := {f(s) | s in S} Composition of two functions f:A->B and g:B->C: The function (f o g):A->C, where (f o g)(x) = g(f(x)). Operation: A function of arity 2. Notation: a f b := f(a,b) Associative operation: An operation f, for which a f (b f c) = (a f b) f c Commutative operation: An operation f, for which a f b = b f a Predicate, Boolean function: A function returning a boolean value. Each set (and thus any relation and any function) can be seen as a boolean function, returning 'true' for elements of the set and 'false' else. Boolean functions of binary relations are mostly written like operators. Variable: A small letter from the end of the alphabet. Expression, term: * a variable OR * a value OR * a function symbol, followed by '(', a term and ')' OR * a term, followed by an operator symbol, followed by a term Interpretation: A function mapping variables to values. Evaluation: A function, which takes an interpretation and a term, replaces the variables by the values given by the interpretation and applies the functions to their arguments as given by the term. Usually, this process is implicit and one simply writes for example x+4=6 for x=2 for saying that the evaluation of "x+4" with the interpretation mapping x to 2 is 6. // For a very detailed definition, see "Human-oriented // Theorem-Proving" (htp.txt) Equivalent terms: Two terms, which always evaluate to the same result under the same interpretation. Function term of a function f:A x B x ... ->Z: A term with variables a,b,c,.., which is equivalent to the result of f(a,b,c...). Mostly, functions are defined by their function term and relations are defined by the function term of their boolean function. Commutative Ring: A set K with the two operations * Multiplication: *:K x K->K * Addition: +:K x K->K such that it holds (K1) + Associativity: x + (y + z) = (x + y) + z (K2) + commutativity: x + y = y + x (K3) Existence of Zero: Ex 0 in K: x + 0 = 0 + x = x. (K4) Existence of the Negative: All x Ex -x: x + -x = 0 (K5) * Associativity: x * (y * z) = (x * y) * z (K6) * Commutativity: x * y = y * x (K7) Existence of One: Ex 1 in K: x * 1 = 1 * x = x. (K8) Distributivity: (x + y) * z = x * z + y * z Subtraction in a commutative ring K: The operation -:K x K -> K with x-y := x+(-y) Direct predecessor of a commutative ring element i: The element i-1. Predecessor of a commutative ring element i: The direct predecessor or a predecessor of the direct predecessor. Direct successor of a commutative ring element i: The element i+1. Successor of a commutative ring element i: The direct successor or a successor of the direct successor. Integer: The commutative ring consisting of 0 and all its successors and predecessors. The elements of this ring are usually noted {0,1,-1,2,-2,...}. Natural: The subset of the elements of Integer, which are successors of 0. Field: A commutative ring where it holds additionally (K9) Existence of the inverse: All x!=0 Ex x^-1: x * x^-1 = x^-1 * x = 1 // The German word for "Field" is "Koerper" Division in a field K: The operation /:K x K -> K with x/y := x*(y^-1) Rational: The field {a/b | a is Integer, b is Integer}. With the notion a/1 = a, the set of Integers is a subset of the set of Rationals. Ordering on a field K: A subset K+ of K such that (O1) All x: (x E K+) XOR (x = 0) XOR (-x E K+) (O2) x E K+, y E K+ => x+y E K+ (O3) x E K+, y E K+ => x*y E K+ One defines the following relations: x > 0 := x E K+ x > y := x-y E K+ x>= y := (x-y E K+) or (x=y) x < y := y > x x =< y := y >= x Ordered field: A field for which there exists an ordering. Each ordered field is a superset of the rational numbers, if named appropriately. Archimedic field: An ordered field (K,K+), such that for each natural number n, there is an x in K, such that n>x. // There is no type mismatch in this definition, because // Natural < Integer < Rational < OrderedField, // so that each natural number can be considered an element // of K, so that '>' is well-defined. Real: The unique set, which is * a field * ordered * archimedic * complete cf. "Analysis" (analysis.txt, in German) for a definition With the usual naming, Real is a superset of Rational. Infinity: A symbol INF, by which one extends the Real number. One defines: INF o x := INF with o being +, *, /, -, x > 0 -INF o x := -INF with o being +, *, /, -, x > 0 INF o -1 := -INF with o being *,/ INF > x for all x -INF < x for all x Real+: The set of all non-negative real numbers. Notation: Real+ := {r | r in Real, r >= 0} pi: The real number 3.1415926535897932348... Comparison: One of the relations '<', '>', '=<', '>=', '=' on real numbers. Equidistant set: A set S of reals, such that there is a distance r, such that S={r*i | i is Integer}. Finite set: A set with a finite size. Countable set: A set S, for which there is a function f:S->Natural, such that every s in S is mapped to a different natural number. Each finite set is countable. Example: The Integers are countable, because there exists f(i)=i<0?-i*2+1:i*2. Dense set: An ordered field, such that for each x1 and x2, there is always an x with x1 < x < x2. Discrete set: A set which is not dense. Uniform relative distance set: A discrete set S, such that for every x in S and the smallest element x' in S with xReal, such that a ^ 0 = 1 a ^ 1 = a a ^(b+c) = a^b * a^c Logarithm to base b: The function log[b]:Real->Real such that log[b](x)=y <=> b^y=x. Binary logarithm: The logarithm to base 2, noted log2. i-th Root: The function root[i]:Real->Real, such that root[i](x)=y <=> y^i=x. Square root: The function 2nd root, called sqrt. ceil: The function ceil:R->Integer, so that ceil(x) is the smallest integer, which is not smaller than x. floor: The function floor:R->Integer, so that floor(x) is the greatest integer, which is not greater than x. modulo: The operation mod:Integer x Integer -> Integer, such that x mod y := x-floor(x/y)*y. Arithmetic if: The function '?': Boolean x A x A -> A for a set A, which is defined as true ? a : b := a false ? a: b := b This function evaluates the first argument and only one of the two others. Maximum: The function max:Real x Real -> Real, such that max(a,b):=a>b?a:b. Minimum: The function min:Real x Real -> Real, such that min(a,b):=aReal+ such that |x|:=x<0?-x:x. Limit of a function f for k towards v: The value lim k->v f(k) := r such that for any eps>0, there is a delta>0, such that |f(v-delta)-r|[-1;1], which is sin(x) := SUM { (-1)^k*x^(2*k+1)/(2*k+1)! | k=0..INF } cosine: The function cos:Real->[-1;1], which is cos(x) := SUM { (-1)^k*x^(2*k)/(2*k)! | k=0..INF } tangent: The function tan:Real->Real, which is tan(x) := sin(x)/cos(x) Identity: The function id, which maps its argument to itself. Vector: An element of Real^n. As a result, a vector of dimension 1 is a real number. Sum of two vectors x and y of length n: The vector x+y := (x[1]+y[1], ..., x[n]+y[n]). r-Multiple of a vector x of length n: The vector r*x := (r*x[1], ..., r*x[n]). Negative of a vector x of length n: The vector -x := (-x[1],...-x[n]). Dot-product of two vectors x and y of lngth n: The real value x * y := SUM { x[i]*y[i] | i=1..n } The dot-product measures the "similarity" of the two vectors. It is 0 if the vectors are orthogonal. // For details, cf "Analyse de Donnees" (maths.txt, in French) m*n-Matrix: A tuple M of m vectors of length n. The i-th element of the j-th vector is noted by M[j][i]. Product of m*n-matrix A and n*l-matrix B: The m*l-matrix C with C[i][j] = SUM { A[i][k]*B[k][j] | k=1..n } Since each vector can be seen as a 1*n-matrix, this definition also applies to the product of vectors and matrices. Square matrix: An n*n-matrix. n*n Identity matrix: The n*n-matrix "In" with In[i][j] = i==j?1:0 Inverse of a n*n-matrix M: The n*n-matrix M^-1, such that M*M^-1=In Singular matrix: A matrix, for which there exists no inverse matrix. Transposed matrix of a square matrix M: The matrix t(M) with t(M)[j][i] = M[i][j] Determinant of a n*n matrix M: The value det(M), such that * det(M)=1, if M is the identity matrix * det(M)=0, if there are two equal lines in M * det(M') = x*det(M), if M' results from multiplying one line in M with x // A bunch of interesting properties result from these definitions, // see "Algebra" (algebra.txt, in German) or "Analyse de donnees" // (maths.txt, in French) for details. In particular: It holds: det(A*B) = det(B*A) det(t(A)) = det(A) det(A^-1) = det(A)^-1 if A is not singulary det(D) = D[1][1]*D[2][2]*...D[n][n] if D[i][j]=0 for i>j det(A)=0 <=> A singulary <=> there are two lines in A or two columns in A, one of which is a multiple of the other det((v[1], ...,r*v[k], ...v[n])) = r*det((v[1],...v[n])) det((v[1], ...,a+v[k], ...v[n])) = det((v[1],...v[k],...,v[n])) + det((v[1],...a, ...,v[n])) det((v[1],...,v[i],...,v[j],...v[n])) = - det((v[1],...,v[j],...,v[i],...v[n])) det((v[1],...,v[i] ,...,v[j],...v[n])) = det((v[1],...,v[i]+r*v[j],...,v[j],...v[n])) It follows that det(M)=det(M'), if M' results from M by adding a multiple of a line of M to another line of M. This yields the following algorithm for the calculation of the determinant of M: Transform M to the identity matrix, by adding multiple of lines to other line. You may also multiply lines by real values. Keep track of all these values, call them r[i]. Then det(M)=1/r[1]/r[2]/.../r[k]. Determinant of a 2*2 matrix M: The value det(A) := a*d-c*b, if a b M = c d That is: Sum up the products of all \-Diagonals, subtract the products of all /-Diagonals. Determinant of a 3*3 matrix M: The value det(A) := a*e*i + b*f*g + c*d*h - c*e*g - b*d*i - a*f*h if a b c M = d e f g h i That is: Sum up the products of all \-Diagonals, subtract the products of all /-Diagonals. Eigenvalue of a n*n matrix M: A non-zero real value lambda, such that there is a vector x!=(0,0,...0), such that M*x = lambda*x It holds M*x = lambda*x <=> M*x - lambda*x = 0 <=> (M - In*lambda)*x = (0,0,...0) <=> there are two columns in (M - In*lambda), one of which is a multiple of the other (since x!=(0,0,0,...)) <=> det(M - In*lambda) = 0 Thus, lambda can be calculated by help of the determinant. // For a detailed treatment of this stuff, see "Analyse de Donnees" // (maths.txt, in French) Eigenvector of a square matrix M: A vector x!=(0,0,0,...), such that for an eigenvalue lambda of M, M*x = lambda*x If x is an eigenvector, then r*x is also an eigenvector for any real number r!=0. If the eigenvalue is known, then the eigenvector can be computed by the above equation. Connective, logical opeator, boolean operator: A function f:Boolean x Boolean -> Boolean. Conjunction, Logical and: The logical operator &&: Boolean x Boolean -> Boolean a && b := (a==true) ? b : false Disjunction, Logical or: The logical operator ||: Boolean x Boolean -> Boolean a || b := (a==true) ? true : b Logical negation: The function ~:Boolean->Boolean ~ a := (a==true)?false:true Logical implication, material implication: The logical operator a => b := ~a \/ b Logical equivalence: The logical operator a <=> b := (a => b) /\ (b=>a) Universal quantifier: The function "All", which maps a variable and a term to 'true', if the term evaluates to 'true' for all possible values for the variable. Notation: All x : term Existential quantifier: The function "Ex", which maps a variable and a term to 'true', if the term evaluates to 'true' for at least one possible value for the variable. Notation: Ex x : term Binary representation of a natural i: The sequence binrep(i) of length ceil(log2(i)) with binrep(i)[j] := i div 2^(ceil(log2(i))-j-1) mod 2. Example for notation: 6d = 110b Binary representation of an integer i: If i>0, then the binary representation of the natural i, else the binary representation of the natural 2^ceil(log2(-i)+1)+i. Decimal of a binary representation b: The number decimal(b) := SUM {b[i]*2^(length(b)-i-1) | i=0..length(b)-1} Order of a function f:Natural->Real+: A function g:Natural->Real, such that there is a real x0 and a real c, such that f(x)x0. Class of a function g:Natural->Real: A set of functions having the same order as g. Notation for "g:N->Real is an element of O(f)": g in O(f) Algorithm: A sequence of instructions. Many functions can be calculated by algorithms. Termination of an algorithm: Its property that it cannot be executed forever. Partial correctness of an algorithm: Its property that it gives the right result, if it terminates. Partial correctness does not yet guarantee that the algorithm terminates. Total correctness of an algorithm: Its property that it is partially correct and terminates. In this case, the algorithm does really what it is intended to do. Time of an algorithm for an input size n: The order of the function which calculates the time needed to execute the algorithm, in dependence upon the input size n. One says "The algorithm is in time O(g)". ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Number Representations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cycle: The function cycle:Real x Real x Real-> Real with cycle(x,a,b) := (x-a) mod (b-a) + a. This function "wraps" an x to the interval [a,b]. Compatibility of an operation f:A x A->A with the Reals: Its property that a f b = real(a) f real(b), where A is a subset of the Reals. Unsigned Machine Integer of bit size k: The set {0,...2^k-1} with the operations a o b := cycle(real(a) o real(b),0,2^k-1) where o is +, -, *, /. Signed Machine Integer of bit size k: The set {-2^(k-1),...2^(k-1)-1} with the operations a o b := cycle(real(a) o real(b),-2^(k-1),2^(k-1)-1) where o is +, -, *, /. Overflow: The fact that the sum/multiplication/subtraction of two machine integers is not compatible with the Reals, because no number greater than the greatest machine integer ever results. Properties of machine integers of a given bit size: * Cardinality: finite, countable Counting function: countMInt(i)=i<0?-i*2+1:i*2. * Distribution: equidistant * Their operations are not compatible with the reals due to overflows and the restriction to integer numbers with division * They can be represented on a computer by their binary representation by a fixed number of bits * Operations on them can be done on a computer in time O(1). Properties of Integers: * Cardinality: infinite, countable Counting function: countInt(i)=i<0?-i*2+1:i*2. * Distribution: equidistant * Their addition, subtraction and multiplication is compatible with the Reals, whereas their division is not * Their representation on a computer may take infinitely much space * Their addition on a computer takes time O(k) and their naive multiplication O(k^2), where k is the binary logarithm of the maximum of the two operation arguments Properties of Rationals: * Cardinality: Infinite, countable Counting function: countRat(a/b) = a==0 ? 0: b<0 ? countRat(-a/-b) : a<0 ? countRat(-a/b)+1 : a==1 ? countRat((b-1)/1)+2: countRat((a-1)/(b+1))+2 * Distribution: dense * Their operations are compatible with the Reals, except for some operations like roots * Their representation on a computer may take infinitely much space * Their operations on a computer may take time O(k^2), where k is the binary logarithm of the maximum of the operation arguments Real algebraic: The set of all reals x, for which there exist Integers i[0],...i[n], such that i[0]*x^0 + i[1]*x^1 + ... i[n]*x^n = 0 It holds that Rational < RealAlgebraic < Real. Properties of Real Algebraics: * Cardinality: Infinite, countable Counting function: decimal(<0>+binrep(MAX {log2(i[j])})+binrep(n) +binrep(i[0])+ ... binrep(i[n])) where the binary representations are all assumed to be of length MAX {log2(i[j])}. * Distribution: Dense * Their operations, including roots, are compatible with the Reals, except for some operations ("transcendental operations") * Their representation on a computer may take infinitely much space * Their operations on a computer may take a very long time. Floating point of mantissa size m and exponent size e: The set { mant*2^(expo-m+1) | -2^(m-1) =< mant =< 2^(m-1)-1, -2^(e-1) =< expo=< 2^(e-1)-1} This is the set of all numbers, which can be expressed by a * 2 ^ (b-m+1) where a and b are signed machine integers with sizes m and e respectively. Rounding: Returning a result, which is close to the correct result. usually, rounding is done when the correct result cannot be represented. Properties of floating points: * Cardinality: finite, countable Counting function: countFP(a*2^b) = countInt(a)*2^e+countInt(b) where e is the exponent size * Distribution: dense, uniform relative distance * Their operations are not compatible with the reals, because rounding errors may occur. * They can be represented on a computer by their binary representation by a fixed number of bits * Operations on them can be done on a computer in time O(1). Multiprecision Floating Point: A floating point number with an arbitrary mantissa size. Properties of multiprecision floating points: * Cardinality: infinite, countable // The script says they have a finite cardinality (?) Counting function: countMPFP(a*2^b) = countInt(a)*2^e+countInt(b) where e is the exponent size * Distribution: dense, uniform relative distance * Their operations are not compatible with the reals, because rounding errors may occur. * They can be represented on a computer by their binary representation // The script says "by a fixed number of bits" (?) * Operations on them can be done on a computer in time O(1) for small mantissa sizes. NaN, NotANumber: The result of a real operation applied to arguments for which it is non defined. Example: 0/0=NaN IEEE 754 single: The set of floating points with mantissa size 24 and exponent size 8, stored as a sequence of bits in the following manner: sign exponent mantissa 1 bit 8 bits 23 bits Where: * the sign bit is 0 for positive numbers and 1 for negative ones * the exponent ranges from -127 to 128. One adds an offset of 127 to represent it conventiently in 8 bits 0..255 * the mantissa implicitly has a leading '1' This representation has the advantage that the decimal values corresponding to the bit sequences (i.e. decimal()) have the same order as the values represented by the bit sequences (i.e. mantissa*2^exponent). The mantissa is also called significant. The combinations of mantissa and exponent represent the following values: sign exponent mantissa value ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ All components to zero: zero 0 -127 +127 <0,0,........> 0 There is also a "negative" zero 1 -127 +127 <0,0,........> -0 Maximal positive exponent and mantissa zero: INF 0 128 +127 <0,0,.......0> INF 1 128 +127 <0,0,.......0> -INF Maximal positive exponent and mantissa:NaN 0 128 +127 NaN Maximal negative exponent drops the leading one in the mantissa ("denormalized representation") 0 -127 +127 decimal()* 2^(-23-126) 1 -127 +127 -decimal()* 2^(-23-126) Other cases: A one is added in front of the mantissa 0 e +127 decimal(<1,b1,...b23>)* 2^(e-23) 1 e +127 -decimal(<1,b1,...b23>)* 2^(e-23) Examples: The number 0.1d: 0.1d = 0.00011001100110011...b ~= 1.10011001100110011001100b * 2d^-4d mantissa: 10011001100110011001100b sign: 0b exponent: -4d + 127d = 123d = 1111011b The smallest number: mantissa: 1111111111111111111111b sign: 1 exponent: 127d + 127d = 254d = 11111110b in decimal: -(2-2^-23)*2^127 = -340282346638528859811704183484516925440 // Calculated with vrcalc (c) 1997 F.M.Suchanek :-) The largest number: mantissa: 1111111111111111111111b sign: 0 exponent: 127d + 127d = 254d = 11111110b in decimal: (2-2^-23)*2^127 = +340282346638528859811704183484516925440 The smallest positive number mantissa: 0000000000000000000001b sign: 0 exponent: -127d + 127d = 0d = 00000000b in decimal: 2^-23*2^-126 = 1,4012984643248170709237295832899e-45 The largest negative number: mantissa: 0000000000000000000001b sign: 1 exponent: -127d + 127d = 0d = 00000000b in decimal: -2^-23*2^-126 = -1,4012984643248170709237295832899e-45 IEEE 754 double: The set of floating points with mantissa size 52 and exponent size 11, stored as a sequence of bits in the following manner: sign exponent mantissa 1 bit 11 bits 52 bits Where: * the sign bit is 0 for positive numbers and 1 for negative ones * the exponent ranges from -1023 to 1024. One adds an offset of 1023 to represent it conventiently in 11 bits 0..2047 * the mantissa implicitly has a leading '1' The mantissa is also called significant. The combinations of mantissa and exponent represent the following values: sign exponent mantissa value ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 0 -1023 +1023 <0,0,........> 0 1 -1023 +1023 <0,0,........> -0 0 1024 +1023 <0,0,.......0> INF 1 1024 +1023 <0,0,.......0> -INF 0 1024 +1023 NaN 0 -1023 +1023 decimal()* 2^(-52-1022) 1 -1023 +1023 -decimal()* 2^(-52-1022) 0 e +1023 decimal(<1,b1,...b52>)* 2^(e-52) 1 e +1023 -decimal(<1,b1,...b52>)* 2^(e-52) Rounding mode: The method used for rounding floating point numbers. The most common ones are * Round towards 0 * Round towards INF * Round towards -INF * Round towards the nearest floating point Table maker's dilemma: The fact that, in order to round a decimal number to the n-th digit, one may have to calculate much more than n digits. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Intervals ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // See also "Intelligence artificielle" (ia.txt, in French) IR: The set of all intervals. Notation: IR:={[a,b] | a, b in Real \/ {-INF,INF}} Center, midpoint of an interval [a,b]: The value (a+b)/2. Notation: mid([a,b]) := (a+b)/2 Width of an interval [a,b]: The value (b-a). Notation: w([a,b]) := b-a Singular interval: An interval with a width of 0. Notation: Instead of [x,x], one also writes x. Real is a subset of IR. Supremum of a real set: The smallest real number, which is greaterequal all elements of the set. Notation: sup(S) Infimum of a real set: The greatest real number, which is lessequal all elements of the set. Notation: inf(S) Interval hull of a real set: The most narrow interval, which is still a (not necessarily strict) super-set of the set. Notation: [A] := [inf(A),sup(A)] Inclusion function of a function f:X->Y: A function [f]:P(X) -> P(Y), such that [f](X) >= f(X). That is: The function receives a set of elements of X instead of a single element of X and it returns a set of elements of Y, which is a super-set of the results of the original function for all x in X. Inclusion function of a function f:Real^n -> Real: An inclusion function for f. Mostly, the term refers to a function [f]:IR^n -> IR such that [f](I1, I2, ...In) >= {f(x1,x2,...xn) | x1 in I1, ... xn in In and f is defined on x1,x2,...xn } That is: [f] receives n argument intervals. It returns an interval, which includes all results that f has for any arguments in the corresponding argument intervals. Thin inclusion function: An inclusion function, which returns a singular interval if it is given singular intervals. [f] thin := [f]([a,a],[b,b],...) = [z,z] Convergent inclusion function: An inclusion function [f]:IR^n->IR with the following property: Take an infinite sequence of n-tuples of intervals, such that the maximum width within the tuples decreases towards 0. Now calculate [f] for each of the tuples. Then the width of its result also decreases towards 0. [f] convergent := ( lim k->INF max({w(I[k][i])|i=1..n}) = 0 => lim k->INF w([f](I[k])) = 0 ) Inclusion monotonic inclusion function: An inclusion function [f]:IR^n->IR, which returns a more narrow interval for more narrow input intervals. [f] inclusion monotonic := ( I[1] [f](I[1],I[2],...I[n]) =< [f](J[1],,...J[n])) Optimal inclusion function: An inclusion function [f]:IR^n->IR, which returns the smallest possible interval, which still includes the function values of f. [f] optimal := [f](I1,I2,...In) = [{f(x1,...xn) | x1 in I1,...xn in In}] If an inclusion function is optimal, it is thin, convergent and inclusion monotonic. Optimal inclusion function for addition: The function [a,b] [+] [c,d] := [a+b,c+d] Its holds: * 0 [+] [a,b] = [a,b] Proof of optimality: We show I1 [+] I2 = [ {x1+x2 | x1 in I1, x2 in I2} ] <=> (I1 [+] I2).ubound = [ {x1+x2 | x1 in I1, x2 in I2} ].ubound && (I1 [+] I2).lbound = [ {x1+x2 | x1 in I1, x2 in I2} ].lbound <=> (I1 [+] I2).ubound = sup({x1+x2 | x1 in I1, x2 in I2}) && (I1 [+] I2).lbound = inf({x1+x2 | x1 in I1, x2 in I2}) <=> (I1 [+] I2).ubound = sup({x1 | x1 in I1})+sup({x2 | x2 in I2}) && (I1 [+] I2).lbound = inf({x1 | x1 in I1})+inf({x2 | x2 in I2}) <=> (I1 [+] I2).ubound = I1.ubound + I2.ubound && (I1 [+] I2).lbound = I1.lbound + I2.lbound <=> true Optimal inclusion function for multiplication: The function [a,b] [*] [c,d] := [min({a*c, a*d, b*c, b*d}), max({a*c, a*d, b*c, b*d})]. It holds: * r [*] [a,b] = [r*a,r*b] for r > 0 * r [*] [a,b] = [r*b,r*a] for r < 0 * 2 [*] [a,b] = [a,b] + [a,b] * 1 [*] [a,b] = [a,b] * A [*] (B + C) != A*B + A*C Optimal inclusion function for division: The function [a,b] [/] [c,d] := [a,b] [*] [1/c,1/d] This result is only defined, if 0 is not in [c,d]. Inclusion connective of a logical operator o:Boolean^2->Boolean: An inclusion function for o. That is: A function [o]:P(Boolean) x P(Boolean) -> P(Boolean), such that A [o] B >= { a o b | a in A, b in B} Inclusion-connective for conjunction: The operator A [&&] B := (A=={false} || B=={false}) ? {false} : A \/ B Inclusion-connective for disjunction: The operator A [||] B := (A=={true} || B=={true}) ? {true} : A \/ B Optimal Inclusion function for sin: The function [sin]([a;b]) := (|a-b|>=2*pi) ? [-1;1] : (cos(a)>=0 && cos(b)>0) ? [sin(a);sin(b)] : (cos(a)>=0 && cos(b)=<0) ? [min(sin(a),sin(b));1] : (cos(a)<0 && cos(b)=<0) ? [sin(b);sin(a)] : [-1;max(sin(a),sin(b))] This function splits the possible positions of the sine curve into 4 different cases: a and b are on the ascending side, a and b are on the descending side or a and b are to both sides of a maximum. Optimal Inclusion function for id:Real->Real: The function [id](x) := [x;x] Floating point interval: An interval, the bounds of which are stored as floating point numbers. When such values have to be rounded, the upper bound is rounded towards INF and the lower bound is rounded towards -INF. As a result, floating point inclusion functions are not thin. Arithmetic operator: One of the real functions +,-,*,/. Arithmetic function: A function f:Real^n->Real, which is the composition of arithmetic operators and identity. Fundamental theorem of interval arithmetic: The fact that any arithmetic function has an inclusion function, which is the composition of the corresponding optimal inclusion functions. Example: f(a,b,c) =a+b*c with a,b,c in Real [f](A,B,C)=A[+]B[*]C with A,B,C in IR Proof by structural induction over the function term t involving intervals: For t being an interval, we use the identity function and clearly get an inclusion function. Now suppose the terms t1 and t2 have an inclusion function. Be their results enclosed by the intervals [t1] and [t2], respectively. Now take any arithmetic operator o. The results of o applied to all possible value pairs in [t1] x [t2] lie in [t1] [o] [t2]. Hence, "t1 o t2" has the inclusion function [t1] [o] [t2]. Since by this construction, we can create any composition of arithmetic operators, all arithmetic functions have corresponding inclusion functions. Optimality theorem of interval arithmetic: The fact that the inclusion function of arithmetic function f:R^n->Real is optimal, if the function term of f involves each of its argument variables not more than once. Example: f(a,b) = a+b -> optimal [f] f(a,b) = a+a*b -> non-optimal [f] Dependency problem: The fact that the inclusion function of a function f:Real^n->Real is not optimal if the function term of f involves one of its argument variables more than once. Example: f(a,b) = a-a [f]([-1,1],[-100,100]) = [-1,1]-[-1,1] = [-2,2] > [0,0] // Overview: Inclusion functions can do the following: // Given a function f:A x B x C ... ->Z // Given subsets A', B', C' ... // Calculate a subset Z' including {f(a,b,...}|a in A, b in B,...} // // An inclusion function returns a set constaining at least all // f(a,b,...), but maybe more (superfluous) elements of Z. The // inclusion function overestimates. // // We focused on real functions. Real inclusion functions can do this: // Given a function f:Real^n->Y // Given n intervals I1,...In // Calculate an interval including {f(x1,...xn}|x1 in I1,...xn in In} // // An optimal inclusion function contains the smallest interval, which // still contains all f(x1,...xn). ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Branch-and-Bound-Solving ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Desired value of a function f:X->Y: A certain y in Y. For functions f:Real^n->Real^m, the desired value is mostly (0,0....). For functions f:Real^n->Boolean, the desired value is mostly 'true'. Zero-point of a function f:Real^n->Real^m: A vector x, such that f(x)=(0,0,...0). True-point of a function f:Real^n->Boolean: A vector x, such that f(x)=true. Solution of a function f:X->Y: An x in X, such that f(x) is the desired value. In the case of a function f:Real^n->Real^m, this is a zero-point. In the case of a function f:Real^n->Boolean, this is a true-point. Pseudo-solution-Set of a function f:X->Y: A subset X' of X, such that [f](X') contains the desired value of f. As inclusion functions may overestimate, this does not mean that X' really contains a solution. One says: X' contains a pseudo-solution. // This notion is critical, since a "pseudo-solution" is not a // value contained in X'. The word "pseudo-solution" is meaningless. Equation: A term of the form A=B. Equation system: A pair of tuple of n functions f[i]: Real^n->Real and a tuple of n real values v[i]. An equation system is mostly written by the function terms in the form: f[1](x[1],x[2],...x[n]) = v[1] f[2](x[1],x[2],...x[n]) = v[2] .... f[n](x[1],x[2],...x[n]) = v[n] System function of an equation system (f,v): The function F:Real^n->Real^n with F(x[1],x[2],...x[n]) := ( f[1](x[1],x[2],...x[n]) - v[1], f[2](x[1],x[2],...x[n]) - v[2], .... f[n](x[1],x[2],...x[n]) - v[n] ) The system function comprises all equation system functions into one big function. The desired value of a system function is (0,...0). Solution of an equation system (f,v): A zero-point of its system function. That is: A vector x, such that f[i](x) = v[i] for all i. Projection of a tuple to it i-th component: Its i-th component. Notation: proj[i](x[1],...x[n]) := x[i] n-dimensional box: An element of IR^n. That is: It is the cartesian product of n intervals. For n=1 ,we have an interval. For n=2, we have something like a rectangle For n=3, we have something like a cube Center of an an-dimensional box: The vector of the centers of its intervals. mid(B) := (mid(B[1]),mid(B[2]),...mid(B[n])) Width of an n-dimensional box: The maximum of the widths of its intervals. w((I[1],I[2],...I[n])) := max {w(I[i]) | i=1..n } Size of an n-dimensional box: The sum of the widths of its intervals. Interval hull of a set S < R^n: The smallest n-dimensional box containing S. Notation: [S] Inclusion function of a function f:Real^n->Real^n: A function [f]:IR^n -> IR^n, such that for each i=1..n [f] o proj[i] is an inclusion function for f o proj[i] That is: Given an interval for each argument, [f] computes a box, in which the results of f will lie. System of equations problem: The problem of, given a system of equations F:Real^n->Real^n, finding all zero-points of F. Formally: Be F:Real^n->Real^n the system function Be [F]:IR^n->IR^n the inclusion function of F Be eps a positive floating point number Be B an n-dimensional box Compute n-dimensional boxes B[1],...B[k], such that * The union of all B[i] contains all vectors x in B, for which F(x)=(0,0,...0) * all boxes have a width smaller than eps w(B[i])=Real^n, B:Real^n, eps:Real) : Set of IR^n IF NOT (0,0,...0) in [F](B) THEN RETURN {} IF w(B) < eps THEN RETURN {B} Partition B into B1 and B2 RETURN solve(F,B1,eps) \/ solve(F,B2,eps) This function always splits the overall box B into two parts and analyzes each of them recursively. As soon as (0,0,...0) is not in the inclusion function result for a sub-box, there is no use in exploiting this box further. Proof of correctness: We assume any finite number representation (e.g. floating point numbers instead of reals). Then the algorithm terminates: In each recursive call, B = B1 \/ B2. Since B1 < B and B2 < B, the problem size decreases. And since we have a finite number of numbers, the problem size cannot converge asymptotically against eps, but has to step over eps after a finite number of steps. As soon as this happens, the recursion terminates in any case. We still have to prove partial correctness, i.e. the fact that if there is a solution x in the initial interval B, then x will be contained in the result of . We prove that x is either in the B of some recursive call or in the result: Induction start: x is in B in the beginning by pre-condition. Induction step: Assume that x is in B. By the definition of the inclusion function [F], [F](B) has to contain (0,0,...0) then. Thus, the first if-condition fails. If w(B) < eps, then B is joined to the solution (q.e.d.). Otherwise, B is split into two parts: Since B=B1\/B2 and x in B, it follows that x in B1 or x in B2. Since B1 and B2 are handed to a new instance of , x will in any case be in the B of a new recursive call (q.e.d). Correctness follows from termination and partial correctness. // Overview: We used Inclusion-functions for finding the solutions // (i.e. zero-points) of real functions: // Take an initial box // Check by the inclusion function, whether the box can contain // a solution // If so, partition the box into two smaller boxes // Check recursively for each of the boxes. // // Thus, binary search together with inclusion functions can find // solutions for functions. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Constraints ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Constraint relation: A relation on Real^n. An element of the relation is said to "fulfill" the constraint. Constraint of a constraint relation R: Its boolean function. That is: A function c:Real^n->Boolean, which returns 'true' for vectors fulfilling the constraint and 'false' else. Constraints are mostly given by their function term, i.e. a term containing * variables * real constants * functions * boolean functions for the comparisons * logical functions The relation between R and c is the following: c(x1,...xn)=true <=> (x1,...xn) in R R = { (x1,...xn) | c(x1,...xn)=true } The desired value of a constraint is 'true'. Solution of a constraint c:Real^n->Boolean: A true-point of c. That is: A vector (x1,...xn) fulfilling the constraint, i.e. an interpretation such that the constraint term evaluates to 'true'. Inclusion test of a constraint c:Real^n->Boolean: An inclusion function for c. That is: A function [c]:IR^n -> P(Boolean) such that [c](A[1],...A[n]) >= { c(a[1],...a[n]) | a[1] in A[1], ... a[n] in A[n] } That is: Given n intervals, the inclusion test returns the set of all those values, which c might return for arguments inside the intervals. That is: If [c] returns {true}, then all values in the intervals fulfill the constraint. If [c] returns {false}, then none of the values in the intervals fulfills the constraint. If [c] returns {true,false}, then some values fulfill it, whereas others do not. But it might also be the case that all fulfill it or none fulfills it. Inclusion-test for '=': The operator defined by [a,b] [=] [c,d] := (b < c || d < a) ? {false} : (a==b && b==c && c=d) ? {true} : {true, false} Inclusion-test for '=<': The operator defined by [a,b] [=<] [c,d] := (d < a) ? {false} : (b =< c) ? {true} : {true, false} Umbrella constraint of a set of constraints {c[1],...c[n]}: The conjunction of these constraints: c(x) := c[1](x) && c[2](x) && ... c[n](x) A vector fulfills this constraint only if it fulfills all constraints c[i]. Thus, multiple constraints can always be interpreted as a single constraint. To do so, one interprets each variable involved in the constraints as a dimension. Thereby, variables inteded to be identical take the same dimension. If there are n dimensions, all constraints are expanded to Real^n (they just ignore the dimensions, which they do not have a variable for). Then the constraints may be assembled to the umbrella constraint. Restricted branch-and-bound-algorithm: A branch-and-bound algorithm, which returns only a finite number of solution boxes with a width smaller than eps. The other boxes may be larger. The union of all boxes still covers all solutions. Restricted branch-and-bound- algorithms are useful if an equation system has infinitely many solutions. Instead of computing all of them, the solver describes some of them precisely and covers the others by large boxes. The number of precise boxes can be fixed in advance. Constraint Branch-and-Bound-Algorithm: A branch-and-bound-algorithm, which works on an equation system, but picks only those solutions, which fulfill an additional given constraint. Restricted BFS Constraint Branch-and-Bound-algorithm: The following algorithm: FUNCTION solve(c:Real^n->{true,false} , B:Real^n , eps:Real , N : Natural) : Set of IR^n // Queue boxes still to be examined // Invariant: TRUE in [c](b) for all b in q VAR q : Queue of R^n := // Collect boxes with result // Invariant: TRUE in [c](r) and w(r)Boolean an umbrella constraint Be [S]:IR^n->P(Boolean) the inclusion test of S Be eps a positive floating point number Be B an n-dimensional box Compute n-dimensional boxes B[1],...B[k], such that * The union of all B[i] contains all solutions of S in B * all boxes have a width smaller than ep w(B[i])= = , ... = ; Variables in [,], ... in [,]; Constraints , ... ; where is one of =,>=,=<. // It takes incredibly long to extract this simple information out of // the manual... As output, it gives contracted boxes for all variables. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fighting the dependency problem ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Metric: A function m:S x S -> Real+ for any set S, such that * m(a,b)=0 => a=b * m(a,b) = m(b,a) * m(a,c) < m(a,b) + m(b,c) This is an axiomatization of functions which measure distances between objects. Interval Distance: The function q:IR x IR -> Real with q([a;b],[c;d]) := max(|a-c|,|b-d|) q is a metric: * q([a;b],[c;d])=0 <=> max(|a-c|,|b-d|)=0 <=> |a-c|=0 && |b-d|=0 <=> a=c && b=d <=> [a;b]=[c;d] * q([a;b],[c;d]) = max(|a-c|,|b-d|) = max(|b-d|,|a-c|) = q([c;d],[a;b]) * q([a1;a2],[c1;c2]) = max(|a1-c1|,|a2-c2|) =< max(|a1-b1|,|a2-b2|) + max(|b1-c1|,|b2-c2|) = q([a1;a2],[b1;b2]) + q([b1;b2],[c1;c2]) Convergence rate of an inclusion function [f] in a box B: The real value alpha, such that there is a real value K such that for all boxes B'= 1, then the inclusion function is very tight for small boxes, but not tight at all for large boxes. Often, one combines inclusion functions with different convergence rates: Low convergence rates for big boxes and high conversion rates for small boxes. Cluster-effect: The fact that an inclusion function with a convergence rate < 1 is imprecise for small boxes. That is: If the boundary box is very small, then the inclusion function often says that a solution might be in there, although this is not the case. Hence, the branch-and-bound-algorithm will further analyze this box (uselessly). It will split the box, which leads to an even smaller box with a greater chance of mis-estimation. This results in clusters of useless small boxes. Tightness optimization: Restructuring a function so that its inclusion function becomes tighter. The more duplicate occurences of argument variables in the function term can be eliminated, the smaller is the dependency problem and the tighter is the resulting inclusion function. Sub-distributivity of interval arithmetic: The fact that A[*](B [+] C) =< A[*]B [+] A[*]C Sub-distributivity results from the dependency problem: The information that A=A is lost on the right-hand-side. This is why it is a tightness optimization to have a function compute a*(b+c) rather than a*b+a*c. Polynomial: A real term of the form SUM {x^i*a[i] | i=0...n} where a[i] are real numbers and x is a real variable. Horner-form of a polynomial SUM {x^i*a[i] |i=0..n}: The equivalent real term a[0] + x*(a[1] + x*(a[2] + x*(a[3] + ... x*a[n])...) The Horner form eliminates duplicate occurences of argument variables and thus optimizes the tightness of the inclusion function. Multi-variate polynomial: A real term of the form: SUM {x[j]^i*a[j][i] | i=0...n, j=1..k} where a[j][i] are real numbers and the x[j] are real variables. Bivariate polynomial: A multi-variate polynomial with 2 variables. Outer term of a term f(g(a,b),c): The term c. Horner-form of a multi-variate polynomial: An equivalent real term, which has a maximum number of variable occurences in outer terms. Example: 3*x^2*y^2 + 4*x*y^2 + 5*y^2+6*x^2*y+7*x*y + 8*y + 9*x^2 + 10*x + 11 = x*(3*x*y^2 + 4*y^2 + 6*x*y + 7*y + 9*x + 10) + y*(5*y+8) + 11 = x*(x*(3*y^2+6*y+9) + y*(4*y + 7) + 10) + y*(5*y+8) + 11 = x*(x*(y*(3*y+6) + 9) + y*(4*y+7) + 10) + y*(5*y+8) + 11 Derivative of a function f:Real->Real: The function f'(x)= (lim h->0 (f(x+h)-f(x))/h) That is: The function tells for each x the slope of the graph of f. Differentiable function: A function, for which the derivative can be calculated. In the following, it is assumed that all functions, the derivative of which we use are differentiable. Continuous function: A function f:Real^n->Real^n, such that for each vector x, lim x'->x f(x')=f(x). That is: There are no jumps in the function graph. Mean value theorem: The fact that for a continuous function f:Real->Real and for two real values x0 < x1, there is always a x in {x0;x1], such that f(x1) = f(x0) + f'(x)*(x1-x0) Mean value form of a function f:Real->Real: The inclusion function [f](I) := f(mid(I)) [+] [f'](I) [*] (I [-] mid(I)) This inclusion function bases on the fact that each function value in I can be predicted by the slope of the function in the interval. Therefore, the inclusion function of the derivative needs to be known. The mean value form has a convergence rate of 2. Thus, it can be used to tighten the boxes, if the input boxes have already been tightened by inclusion functions with smaller convergence rate. Derivative of a continuous function f:Real^n->Real for a component i: The function d[i]f(x) := (lim h->0 (f(x[0],...x[i]+h,x[i+1],...x[n])-f(x))/h) That is: The function tells for each x the slope of the graph of f in the direction of x[i]. Notation for the derivative of f:real->Real: f' Common derivatives: * cos'(x) = -sin(x) * sin'(x) = cos(x) * abs'(x) = (x>0)?1:(x<0)?-1:NaN * (a*f + b*g)'(x) = a*f'(x)+b*g'(x) * (f*g)'(x) = f'(x)*g(x) + g'(x)*f(x) * (f/g)'(x) = (f'(x)*g(x) - f(x)*g'(x)) / g(x)^2 * (x^z)' = z*x^(z-1) for all z in Real\{0} * tan'(x) = 1+tan^2 * tan'(x) = 1/cos(x)^2 * (f^-1)'(x) = 1/f'(f^-1(x)) with f^-1(f(x))=x * log(a*x)' = 1/x * f(g(x))' = g'(x)*f'(g(x)) Multivariate Mean value theorem: The fact that for a continuous function f:Real^n->Real, an n-dimensional box B and any vector x in B, there are vectors y[1],...y[n] in B, such that f(x) = f(mid( B )) + SUM { d[i]f(y[i])*(x[i]-mid( B[i] )) | i=1...n} Types: IR^n R^n R IR --R^n-- ---R------- R ---R------- -----R----- ---R------- -------R-------- -----R----- --------------R-------------- -----------------R------------------------------- Multivariate mean value form of a function f:Real^n->Real: The inclusion function [f](B)= f(mid(B)) [+] SUM { [d[i]f]( B ) [*] (B[i] [-] mid(B[i])) } IR^n IR^n IR IR --R^n- ---IR------ IR ----R---- ----R--- ---IR------ IR ----IR--- ----IR-- ---IR------ -------IR-------- ----IR-- ----------------IR------------------- -----------------------------IR------------------------- This version can be optimized as follows: [f](B) = f(mid(B)) [+] SUM { [d[i]f](B[1],...B[i],mid(B[i+1]),...mid(B[n])) IR IR IR IR IR IR ----R----- -----R----- IR IR ----IR---- -----IR---- ------------IR^n-------------------- ------------------IR------------------------ [*] (B[i] [-] mid(B[i])) | i=1...n} // Overview: We got to know // * constraints: Functions c:Real^n->Boolean // * their inclusion functions: // Inclusion-tests [c]:IR^n->P(Boolean) // * connectives: Functions k:Boolean^2->Boolean // * their inclusion functions: // Inclusion-connectives [c]:P(Boolean)^2->P(Boolean) // // The desired value of these functions is 'true'. Hence, a solution // of a constraint or a connective is a true-point. We can use // the branch-and-bound-algorithm for finding the solutions of // constraints and connectives as we did it for real numbers. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Contractors ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Sub-box of a box B: A box B' of the same dimension with B'=Boolean: A function C:IR^n->IR^n, which, given a box B, returns a sub-box, which still contains all those vectors whithin B, which fulfill c. In terms of the constraint relation R, this means {x | x in B && c(x)=true} = B /\ R =< C(B) =< B Optimal contractor for a constraint c:Real^n->Boolean: A contractor C:IR^n->IR^n, such that C(B) = [B /\ R] = [ {x | x in B && c(x)=true} ] where R is the constraint relation of c. That is: Take all vectors within B. Kick out those which do not fulfill the constraint (this is B /\ R). Draw the smallest possible box around these (this is [B /\ R]). This should also be the result of C(B). Projection of a relation to the i-th component: The set of the projections of all elements of R to the i-th component. proj[i](R) = { proj[i](x) | x in R } // This does not have to be a definition, as proj[i] is a // function, which, when applied to a set, returns the set of its // results anyway. Projection lemma: The fact that for a relation R < Real^n, [R] = [proj[1](R)] x [proj[2](R)] x ... [proj[n](R)] That is: The smallest box comprising a set of vectors R can be built by multiplying the smallest intervals for each component. Addition contractor: The contractor of the constraint corresponding to addition: CADD(I1,I2,I3) = [ sum /\ (I1 x I2 x I3) ] = [proj[1](sum /\ I1 x I2 x I3)] x [proj[2](sum /\ I1 x I2 x I3)] x [proj[3](sum /\ I1 x I2 x I3)] = (I1 /\ (I3 [-] I2)) x (I2 /\ (I3 [-] I1)) x (I3 /\ (I1 [+] I2)) where 'sum' is the relation corresponding to addition. Proof for proj[y](sum /\ Ix x Iy x Iz) = Iy /\ (Iz - Ix): proj[y](sum /\ Ix x Iy x Iz) = Iy /\ (Iz - Ix) <=> proj[y](sum /\ Ix x Iy x Iz) = Iy /\ (Iz - Ix) <=> proj[y](sum /\ Ix x Iy x Iz) =< Iy /\ (Iz - Ix) && proj[y](sum /\ Ix x Iy x Iz) >= Iy /\ (Iz - Ix) <=> proj[y](sum /\ Ix x Iy x Iz) =< Iy (1) && proj[y](sum /\ Ix x Iy x Iz) =< Iz - Ix (2) && proj[y](sum /\ Ix x Iy x Iz) >= Iy /\ (Iz - Ix) (3) (1) Is obvious. (2) Consider any y in proj[y](sum /\ Ix x Iy x Iz). According to the definition of 'sum', there will be a x and a z such that x+y=z. Now assume that x is not in Ix. Then (x,y,z) is not in Ix x Iy x Iz. Then (x,y,z) is not in (sum /\ Ix x Iy x Iz). Then y cannot be in proj[y](sum /\ Ix x Iy x Iz), contradiction. Hence x in Ix, and, analogously, z in Iz. We have y=z-x. The interval difference Iz-Ix contains *any* element, that might result from a subtraction of a value of Iz and a value from Ix. In particular, Iz-Ix will also contain y. (3) Consider any y in Iy /\ (Iz-Ix). The y is in Iy and y is in (Iz-Ix). Since the interval subtraction '-' is an optimal inclusion function, there will be a z in Iz and a x in Ix, such that y=z-x. This means x+y=z, hence (x,y,z) is in 'sum' and since x in Ix, y in Iy and z in Iz, it follows that (x,y,z) in sum /\ Ix x Iy x Iz. Hence y in proj[y](sum /\ Ix x Iy x Iz). Multiplication contractor: The contractor of the constraint corresponding to multiplication: CPROD(I1,I2,I3) = [ prod /\ I1 x I2 x I3 ] = [proj[1](prod /\ I1 x I2 x I3)] x [proj[2](prod /\ I1 x I2 x I3)] x [proj[3](prod /\ I1 x I2 x I3)] = (I1 /\ (I3 [/] I2)) x (I2 /\ (I3 [/] I1)) x (I3 /\ (I1 [*] I2)) where 'prod' is the relation corresponding to multiplication. Subset-contractor: The contractor of the subset constraint: CSUB(I1,I2) := I1 /\ I2 Fixpoint of a function f:S->S: An element s of S, such that f(s)=s. Fixpoint contractor: The function FUNCTION contract(S : Set of constraints c:Real^n->Boolean, B : IR^n) : IR^n WHILE Ex c in S: C(B)!=B DO B := C(B) RETURN B It implements a contractor for the umbrella constraint of S. It reduces the initial box B by the contractor of any constraint. Then it contracts this result again by another contractor and so on, until no contractor can shrink the box any more. This function can be used to solve a general constraint problem. It will always have the same result, independently of the choice of the contractors in the WHILE statement. Nevertheless, the order of the application of the contractors may have an influence on the speed of the algorithm. The algorithm is not optimal: For the constraints x - y = 0 x + y = 0 it will not be able to reduce the initial box, although the only solution is x=0, y=0. Furthermore, it suffers seriously from the dependency problem: For x+x=0 and an initial interval of [-1;1] for x, it will not contract anything! Constraint propagation: Using the contractor for a set of constraints to solve a general constraint problem. It is crucial for efficiency to hand the new box to a constraint which shrinks it the most. Handing the box from constraints to other constraints is called "propagation". Conjunctive contractor: The function C&:IR x IR -> IR with C&(I1, I2) = [I1 /\ I2] (?). The conjunctive contractor is not optimal. Simple constraint: A constraint corresponding to an arithmetic operator. Example: x+y=z Decomposition of a complex constraint K: A set of simple constraints c[i], such that the solution set of K is the same as the solution set of the conjunction of the constraints c[i]. That is: One introduces additional variables and models the complex constraint by a conjunction of simple constraints. Example: x + 3 = y + 2 ~~~~> x + 3 = t /\ y + 2 = t If in the complex constraint, every variable just occurs once, then the contractor on the decomposed constraint is optimal. In any case, the decomposed contractor will always return the empty box, if interval evaluation shows that there is no solution to the complex constraint. Redundant constraint in a constraint set S: A constraint, the solution set of which is a (not necessarily strict) super-set of the intersection of the solution sets of the other constraints in S. That is: The constraint does not add necessary information. Nevertheless, redundant constraints are often added in order to speed up the search for a solution. Mean value contractor of a constraint for f(x)=0: The redundant constraint 0 = f(a)+f'(x')*(x-a) for any initial box I for x real value a in I variable x' in I f must be a function from the real numbers to the real numbers. This constraint has been derived from the mean-value-theorem. Derivative of a continuous function f:Real^n->Real^n for a component i: The function d[i]f(x) := (lim h->0 (f(x[0],...x[i]+h,x[i+1],...x[n])-f(x))/h) The result of this derivation is a vector. Multivariate Mean value contractor of a constraint for f(x)=(0,...0): The additional constraints x' = a + t*(x-a) (0,...,0) = f(a) + (d[1]f(x'), d[2]f(x'), ...) * (x-a) -R^n- ------------R^n*R^n-------- -R^n- -R^n- ----------------------R^n---------- ------------------R^n--------------------- for any box B in IR^n vector a in B with the initial boxes B for x [0;1] for t f must be a function f:Real^n->Real^n. These constraints have been derived from the mean-value-theorem. They are redundant. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Improvements ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Shaving: Analyzing the outer part of a box separatly and dropping it if it does not contain a solution to a constraint. Linear function. A function f:Real^n->Real, which returns the dot-product of its input vector and a vector of fixed constants. Linear equation system: An equation system, the functions of which are all linear. This equation system can be written as A*x = b where * A is the matrix with the vectors of th linear functions * x is a vector variable * b is a vector Gaussian elimination: The following algorithm for the solution of a linear equation system: Write all equations one below the other, sorted by their variables x[1]*a[1][1] + x[2]*a[1][2] + ... x[n]*a[1][n] = v[1] .... x[1]*a[n][1] + x[2]*a[n][2] + ... x[n]*a[n][n] = v[n] FOR i:=1 TO n DO FOR j:=i+1 TO n DO d := a[j][i] / a[i][i] line j := (line j) - (line i)*d // Now a[j][i] is 0 FOR i:=n TO 1 DO x[i] := (v[i] - SUM { x[j]*a[i][j] | j=i+1..n }) / a[i][i] Point-matrix: A (normal) matrix (of real numbers). n*m Interval matrix: A tuple of n tuples of m intervals. The operations defined on normal matrices can also be applied to interval matrices with the appropriate inclusion functions. Interval-vector: A tuple of intervals. Linear interval equation system: A linear equation system, where * the matrix is an n*m interval matrix * the vector is an interval vector * one has interval bounds for all variables. Interval Gauss contraction: The following algorithm for the contraction of a linear interval equation system: Consider all terms, which are not variables to the power of 1 as constants. Thus, any system becomes linear. Apply Gaussian elminination to the equation system Using the initial boxes for the variables, apply contractors to all lines, starting from the last one. This algorithm results in smaller boxes than a simple contraction of each equation. Mid-point-matrix of an interval matrix A: The (normal) matrix mid(A) with mid(A)[i][j]=mid(A[i][j]). Mid-point-inverse of an interval matrix: The inverse matrix of its mid- point matrix. Preconditioning: Multiplying the matrix A of a linear interval equation system by its mid-point inverse A' and also multiplying the interval vector by A'. A solution of this preconditioned system is also a solution of the original system. By preconditioning, the system has large intervals on the diagal, but small intervals elsewhere. Thus, contraction can be applied very efficiently: Consider a line a*x + eps*y = z with a small eps. Then contracting for x involves x = (z-eps*y)/a which yields a small intervall for x for large a. Gauss-Seidel-Iteration: The following algorithm for solving a linear interval equation system: Take the first line, replace all variables by their bounds exept for the first and contract the first. Proceed analogously with the other lines, using the new intervals for the variables. If all lines have been processed, restart from the beginning. Gauss-Seidel-Iteration is very efficient, if preconditioning is used and the i-th line is contracted for the i-th variable. Buchenberg: (?) Groebner-Basis: (?) Brouwer's fixpoint theorem: The fact that a function f:Real^n->Real^n has a fixpoint in a given box, if f(B)=Real^n in a given box. This problem can be solved by introducing G:Real^n->Real^n G(x) := x - f(x) If G has a fixpoint in B, then f has a zero-point in B. By Brouwer's fixpoint theorem, this is the case if G(B)=Real subject to constraint c: A vector x of R^n with c(f(x))=true /\ ~ Ex x': (c(f(x'))=true /\ f(x')Real and a constraint c:Real^n->Boolean, findig the global minimum of f subject to c. Formally: Be f:Real^n->Real the function to be optimized Be [f]:IR^n->IR an inclusion function Be c:Real^n->Boolean an umbrella constraint Be [c]:IR^n->P(Boolean) an inclusion test for c Be eps a positive floating point number Be B an n-dimensional box Compute n-dimensional boxes B[1],...B[k], such that * The union of all B[i] contains all vectors x in B, for which c(x)=true and x is an optimum among the results of f for the solution set of c. * all boxes have a width smaller than eps w(B[i])= [f](B[j]).ubound This problem is a special form of the general constraint problem. It can be solved like a general constraint problem with the additional constraint f(x) < u, where u is the minimum among the upper bounds of [f](B[i]). If this upper bound is uknown (like in Realpaver), use any other value for u, which is known to be an upper bound of the minimum. Monotonocity test: The following redundant constraint, which must be fulfilled by a global optimum x of a function f:Real^n->Real subject to a constraint c: d[i]f(x)=0 for i=1..n That is: x must hav slope 0, because left of x, the slope is negative and right of x, the slope is positive. Convexity test: The following redundant constraint, which must be fulfilled by a global optimum x of a function f:Real^n->Real subject to a constraint c: d[i](d[i]f(x)) >= 0 for i=1..n That is: The slope of the slope in x must be positive, because the slope increases from left to right around x. Optimality conditions: The union of the monotonocity test and the convexity test. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Quantified constraints ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Quantified constraint: A constraint expressed by a term containing * variables * real constants * quantifiers * functions * boolean functions for the comparisons * logical functions With these constraints, one can find values, which fulfill a constraint under all circumstances ("All x") and for which there exist specific other values ("Ex x"): Free variable: A variable which does not occur as an argument to a quantifier. Sentence, Closed constraint: A quantified constraint without free variables. Solution of a closed constraint: The evaluation of the constraint term, i.e. a boolean value. Quantifier elimination: The reformulation of a quantified constraint by an equivalent constraint function term without quantifiers. Although there is some theoretically correct method by Tarski for simple constraints, there is no efficient way of eliminating quantifiers. You mostly calculate the solution of the constraint and invent a quantifier-free equivalent one. Example: All y: x^2+y^2>=1 <=> x=<-1 || x>=1 Tarski's Quantifier theorem: The fact that quantifier elimination is possible for all constraints involving only comparisons, the real functions '+', '*' and rational constants. But quantifier elimination with Tarski's method is slow and complicated. Difference of two real values x and y: The value |x-y|. Constants distance of two constraints: The maximum of the differences of the corresponding constants in the two constraint terms. The constants distance is only defined if the two constraint terms have the same structure, i.e. they differ only in their constants. Example: Distance of 3+x=7 and 4+x=10 is max(|4-3|,|10-7|)=3 Stable constraint: A closed constraint, such that there is a real value d>0, such that all constraints with a constants distance smaller than d are equivalent. Examples: * 0=0 is not stable, because for every real delta, there are eps1 and eps2 with |0-eps1|= 0 is not stable, because for every real delta, there is an eps with |0-eps|= eps is not true, although the original constraint is true. * Ex x: (x^2+y^2=<1 && (x-2)^2+(y-2)^2=<1) is sbale, because one may vary the constants within a range of e.g. 0.1 without the constraint becoming true. Bounded quantifier: A quantifier over a real variable, the value of which is bounded by an interval. Notation: All x in S: term := All x: (x in S => term) Ex x in S: term := Ex x: (x in S && term) If a bounded quantifier appears in a constraint problem, then the interval can be taken as an initial box for the variable and does not need to appear in the term anymore. Approximating equality: Replacing a constraint of the form "x = 0" by the constraint "x==eps" for eps being a small real value. Quantified constraint problem: A constraint problem involving quantified constraints. As solving constraints of this kind is complicated, one often restricts oneself to bounded quantifiers, closed constraints and approximated equalities. Interval interpretation: A function mapping variables to boxes. Interval evaluation: A function mapping a term and an interval interpretation to a set. This is done by replacing all variables by their intervals, replacing all functions by their inclusion functions and applying the functions to the values as indicated by the new term. The result is a super-set of the set of standard evaluation results with standard interpretations ranging in the given intervals. It may be a strict super-set and thus interval evaluation serves as a quick-and-dirty check for possible variable values. Interval evaluation of "All/Ex x in S: term": The interval evaluation of the "term", where the interpretation is extended by the mapping of x to S. Branch-and-check-algorithm: An algorithm for solving a quantified constraint problem. All quantifiers must be bounded. Equalities have to be approximated. Only closed constraints are allowed. The algorithm works as follows: FUNCTION branchAndCheck(C : Constraint) : Boolean // Can we get a definite result by interval evaluation? IF evaluate(C)=={true} THEN RETURN true IF evaluate(C)=={false} THEN RETURN false // If not, examine the term IF C=="Ex x in S: term" THEN Partition S into S1 and S2 RETURN branchAndCheck(Ex x in S1: term) || branchAndCheck(Ex x in S2: term) IF C=="All x in S: term" THEN Partition S into S1 and S2 RETURN branchAndCheck(All x in S1: term) && branchAndCheck(All x in S2: term) IF C=="term1 logoperator term2" THEN RETURN evaluate(term1) logoperator branchAndCheck(term2) This algorithm first does a quick-and-dirty check of the constraint: If the interval evaluation can already tell the definite result of the constraint (by returning a singleton set), we're done. Else, we split up existential constraints by halving the bounding interval and postulating existence in only one of them. Similarly, we halve the bounding interval of universal quantifiers and postulate its validity in both halves. If the constraint term starts with an ordinary sub-term, we evaluate it and combine the result with the result of a recursive call for the remaining sub-term. If intervals are partitioned in subsets of equal size, then the algorithm terminates for stable inputs. The algorithm may be extended so that it also works with constraints with free variables. This is done by partitioning intervals also for free variables. This algorithm proves the fact that approximating and restricted algorithms may be more useful than theoretically correct but complex algorithms (like Tarski's). ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Differential Equations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Time function: A function f:Real+->Real^n. State of a time function f at time t: f(t). Initial state of a time function f: f(0). Functional variable: A variable, which is mapped to a function by the interpretation. Consequently, it may appear in a term in places where function symbols appear. Differential equation: An equation containing a function variable for a function f and a function variable for its derivative f'. Many processes in nature are described by differential equations. Solving a differential equation: Finding the function which the function variable has to be mapped to in order to have the equation to evaluate to 'true'. IVP, intermediate value problem: The problem of finding y such that All t in [a;b]: diffEquation && u(x0)=z && u(x1)=y Where * is a differential equation on the function variable usually termed with the argument * x0, z and x1 are real constants * y is real variable That is: Given an interval and a relation between u(t) and u'(t) and an initial value u(t0), determine u(x1). By this, we can mimick derivation, i.e. we can calculate the result of the derivative of a function f:Real->Real for a constant argument t0: All t in [a;b]: u(t)=f(t) && u'(t0)=x And we can mimick integration, i.e. we can calculate the result of a function f:Real->Real for a constant argument t0 when we know f': All t in [a;b]: u'(t)=f'(t) && u(t0)=x i-th derivative of a function f:Real^n->Real^n: If i=1, then its derivative, else the derivative of its (i-1)-th derivative. Notation: Here: f[i] Taylor expansion of a function f:Real->Real^n: The term u(x0) + SUM { (x-x0)^i/i!*u[i](x0) | i=1...k-1 } + (x-x0)^k/k!*u[k](s) where * x0 is a real constant * x is a real variable * u[i] denotes the i-th derivative of u * s is a real variable in the range [x;x0] * k is a natural constant There is a value for s, such that the taylor expansion is equivalent to f(x). For k=1, this equivalence yields the mean value theorem. In general, the taylor expansion allows calculating f(x), if f(x0) and the first k drivatives of f are known. Reduction of problem A to problem B: An algorithm, which translates an A-problem to a B-problem and the solution of a B-problem to a solution of an A-problem. Thus, an A-problem can be solved if we can solve a B-problem. Solving a IVP: Reducing it to a general constraint problem. This can be done as follows: Be the IVP All t in [a;b]: diffEquation && u(x0)=z && u(x1)=y where * x0, z and x1 are real constants * y is a real variable. Create the taylor expansion of u (with x0=x0) Add the taylor expansion as a constraint * once for u(x1)=taylorExpansion with intermediate variable s * once for u(t)=taylorExpansion with intermediate variable c Take the differential equation and calculate the first k derivatives it Add all of these derivative constraints and the diffEquation itself * once for t=x0 * once for t=s * once for t=c * once for t=t Possibly add new Taylor constraints Add the constraint u(x0)=z Replace all u[k](x) by a new variable ukx Add boxes for all variables (s.b. for method) Solve the resulting general constraint problem. The principle of this algorithm is * relaxing the new problem to the general constraint problem * adding redundant constraints to make the relaxation tight Additional constraints could be added: The set of possible values for u(x0) and u(s) is a subset of the set of possible values for u(t), t in [a;b]. But the subset-contractor is not implemented in Realpaver. Nevertheless, one can mimick this process by hand: When first running Realpaver, the differential equation for s and the taylor expansion for u(t) restrict u(t). Restrciting u(t) also restricts u'(t). Since u(s) and u(c) vary within the possible values of u(t) and since u'(c) and u'(s) vary within the posible values of u'(t), this gives us new bounds for all variables. If Realpaver is re-run, then all bounds get tighter. Picard-theorem: The following fact: Be u:Real^n->Real an (unknown) function Be t0, t1 real numbers, such that t0Real Be [f]:IR^(n+1)->IR an inclusion function of f Be phi(B) := D0 + [0;t1-t0]*[f]([t0;t1],B) Find an n-dimensionl box B, such that phi(B) =< B and B >= D0 Then u(t) is in phi(B) and u'(t) is in F([t0;t1],B) for t in [t0;t1]. This theorem can be used to determine initial boxes B for an IVP. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Hybrid systems ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Function class: A class of all functions from a set A to a set B, denoted A->B. (S,n)-Hybrid system: A set of functions h:Real+->(S x Real^n), where S is any finite set and n is a natural number. Discrete state of a (S,n)-Hybrid system: An element of S. Continuous state of a (S,n)-Hybrid system: An element of Real^n. State of a (S,n)-hybrid system: An element of S x Real^n. Run of a hybrid system H: An element of H. Each run models a possible curriculum vitae of a physical system: Certain parameters may take certain values (e.g. speed, heat etc.). These parameters are modeled by the continuous states of a run: For every time t, f(t) gives the values of the system parameters. Then, the physical system may be in different discrete states (e.g. different states for different gears). These states are modeled by the part s:Real+->S of a run: For each time t, s(t) is the discrete state which the system is in. Depending on the state, the real function of a run may behave differently. The real function may for instance be given by one function term (or differential equation) for each state. Reach-set of a hybrid system H: The set of all states, which are attained by any run of H at any time t. Notation: Reach(H) := {h(t) | h in H, t in Real+} State-set for a hybrid system H: A set of states of H. That is: A set of states, which may be reachable H or may not be reachable in H. Reachable state-set for a hybrid system H: A state-set R, which has some elements in common with the reach-set of H, i.e. R /\ Reach(H) !={}. Stable Hybrid system wrt a state-set R: A hybrid system H, such that for every run h in H, there is a real number t1, such that for all t>t1, h(t) is in R: H stable wrt R :<=> All h in H: Ex t1: All t>t1: h(t) in R That is: If you wait long enough, any run of H will end up in R. Asymptotically stable hybrid system wrt a state r: A hybrid system H, such that for every run h in H, lim t->INF h(t)=r That is: If you wait long enough, any run of H will end up in state r. Asymptotic stability implies stability. See also: Linear system stability theorem Hybrid system parametrization: A function, which assigns to a vector a hybrid system. Differential hybrid system: A hybrid system, the runs of which are given by differential equations of the form dx[i]/dt = f[i](x[1],...x[n]). Init-set of a hybrid system H: The set of all start states of H, i.e. Init(H) := {h(0) | h in H} It holds that init(H) =< reach(H) because for very x in init(H), there is a h in H, such that h(0)=x. Event-number of a run h in an interval [t;t']: The number of events between t and t', that prevent further inspection in the future of h. Such events can for instance be discrete state changes. Notation: event(h,[t;t'])>=0 // Script says ">0" (?) event(h,[t;t]) = 0 One assumes that only finitely many events happen in finite time. Post-set of a hybrid system H for a set of states Q: The set of those states, which are reached by any run in H when starting from any state in Q. Thereby, only one event may be bypassed. Post(Q) := { x | Ex h in H: Ex t in Real+: Ex t' in Real+: t'>=t, h(t) in Q, h(t')=x, event(h,[t;t'])=<1 } This set looks at all runs h which attain a state h(t), which is already in Q. From hereon, it looks into the future by t'>t as far as it can. All states attained thereby are added to Q. It holds that R1 =< R2 => post(R1)=t, such that h(t) in R1 and h(t')=x. Since h(t) in R1, h(t) is also in R2. This means that there is a h, there is a t and there is a t', such that h(t) in R2 and h(t')=x. Hence, x is in post(R2). Furthermore, post(Q) =< reach(H) Proof: A state x is only in the post-set, if there exists a run h and there exists a time t', such that h(t')=x. This implies that x is reachable. // Drawing the runs helps! Fixpoint-Reach-algorithm: The following algorithm for computing the reach-set of a hybrid system: FUNCTION reach(H: (S,n)-HybridSystem) : Set of (S,n)-state Q:Set of (S,n)-state := Init(H) WHILE Post(Q)>Q DO Q := Q \/ Post(Q) RETURN Q This algorithm computes the reach set by adding little chunks of the runs of H. It starts off by the init-set of H. Then it looks into the future by computing the post-set. From this new set of states, it computes again the post-set and so on, until no more states are added. If the algorithm ever terminates, then post(Q)=Q Proof: We know that the post-set is monotonic, hence post(Q)>=Q. If the loop ever terminates, then post(Q)== reach(H) Proof by decente infinie: Suppose there is a reachable state x, which is not in Q. Then there is a run h and a time t, such that h(t)=x. If t=0, then h(t)=h(0) is in init(H). init(H) is in Q and hence h(t) is in Q, contradiction. Hence t!=0. Now find t'=reach(H). Furthermore, it holds at any time that Q =< reach(H) Proof by induction: We know that init(H) =< reach(H). Hence Q==0. If it can be shown that the ranking function decreases in each interation of a loop, then this entails that the loop must terminate, because the ranking function never returns the value 0. Discrete-Fixpoint-Reach-Algorithm: The Fixpoint-Reach-Algorithm, restricted to (S,0)-hybrid systems. This means that all runs map their argument t to an element of S. In this case, the fixpoint-Reachset-Algorithm boils down to FUNCTION Post(Q : Subset of S) : Subset of S RETURN { x | Ex h in H: Ex t in Real+: Ex t' in Real+: t'>=t, h(t) in Q, h(t')=x, event(h,[t;t'])=<1 } FUNCTION reach(H: (S,0)-HybridSystem) : Subset of S Q : Subset of S := Init(H) WHILE Post(Q)>Q DO Q := Q \/ Post(Q) RETURN Q It can be shown that the algorithm always terminates: We assume that S is finite and that only finitely many events happen in a finite time interval. As a ranking function, we choose r(Q) := |Reach(H)|-|Q|. Clearly, r(Q) is an integer number. We know that Q is a subset of Reach(H). Thus, r(Q) is always >=0. Now look at the algorithm start: Init(H) is a subset of Reach(H). This implies |Init(H)|=<|Reach(H)| and hence |Q|=<|Reach(H)| and r(Q)>=0 at the start Now assume r(Q)=0. Then this means Q=Reach(H). Since Post(Q) is a non-strict subset of Reach(H) and thus of Q in this case, Post(Q)>Q is false. The algorithm terminates. r(Q)=0 implies termination If Q is a true subset of Reach(H), then there exists an element of Reach(H), which is not in Q and which is reachable from an element of Q by any run by passing less than two events (else, Reach(H) would be equal to Q). Thus, Post(Q) will contain that element. As Q does not contain this element, it is added to Q by the line Q:=Q\/Post(Q). Thus, |Q| increases with every step. Thus, r(Q) decreases with every step. Thus, r(Q) reaches 0 (it cannot bypass it as said above). Thus, the algorithm terminates. Equivalence relation on a set S: A relation R < S x S, which is transitive, reflexive and symetric. Equivalence class by an equivalence relation R: A set S, such that there is an s, such that S = { x | R(s,x) } An equivalence class is a maximal set of elemnt, which are all equivalent among each other. Abstract system of a (S,n)-hybrid system: A (S#,0)-hybrid system, where S# = S x #. # is a finite set of equivalence classes, such that any x in Real^n is in an equivalence class. Abstract state: A state of an abstract system. Abstract run: A run in an abstract system. State-Abstraction: Mapping a state from a hybrid system to the corresponding state in its abstract system. To do so, one defines the function alpha: S x Real^n -> S# alpha(s,x) := (s,x#) such that x in x# Proof of existence of alpha(s,x): alpha(s,x) exists, if an equivalence class x# exists, which contains x. Be R the equivalence relation. Then we may choose x#={s | R(s,x)}. Then x# is an equivalence class and x in x#. Proof of uniqueness of alpha(s,x): alpha(s,x) is unique if there is exactly one equivalence class x# with x in x#. Suppose there are two distinct equivalence classes x1# and x2# containing x. Then R(x,x1) for all x1 in x1# R(x,x2) for all x2 in x2# Hence, by transitivity R(x1,x2) for all x1 in x1#, x2 in x2# As a result, x1#=x2#. Run-abstraction: Mapping a run from a hybrid system to the corresponding run in its abstract system. One defines the function alpha: (Real+->S x Real^n)->(Real+->S#) alpha(h) := h#, such that h#(t)=alpha(h(t)) for all t in Real+. State-Concretisation: Mapping a state from an abstract system to the set of corresponding states in its underlying (S,n)-hybrid system. One defines the function gamma: S# -> S x Real^n gamma(s,r#) := {(s,r) | r in r#} Run-Concretisation: Mapping a run from an abstract system to the set of corresponding runs in its underlying (S,n)-hybrid system. One defines the function gamma: (Real+->S#)->(Real+->S x Real^n) gamma(h#) := { h | h in H, All t in Real+: h(t) in gamma(h#(t))} Spurious counter example: An abstract run with no corresponding run in its underlying hybrid system. That is: A run h# in an abstract system, such that gamma(h#)= {}. Correct abstract system of a (S,n)-hybrid system H: An abstract system H#, such that each run in H has a corresponding run in H#. That is: For every run h in H, there is a run h# in H#, such that h in gamma(h#). We have the notion "abstract system" refer only to correct abstract systems. Transition in an abstract system: A link from one abstract state x to another one y, which reflects the fact that there is a run in the underlying hybrid system, which goes from a state in x to a state in y. The idea is as follows: The abstract system gathers the states of the underlying hybrid system in abstract states. These abstract states can be seen as the nodes of a graph. The transitions can be seen as the edges of this graph. The goal is to find all those links, which are necessary and to avoid those links, which have no pendant in the underlying system. The runs in the abstract system will all follow the transitions. An abstract system is correct, if for every run in the underlying hybrid system, there exist the corresponding transitions in the abstract system. By these transitions, it is possible to say which states are not reachable or which states cannot be left. Refinement of an abstract system: The splitting of its equivalence classes into smaller equivalence classes. If it is not possible to generalize a statement for all concrete states of an abstract state, then the abstract state can be refined. Eventually, the statement then becomes true for all members of the refined abstract state. Abstract reachability theorem: The fact that, if an abstract state alpha(r) is not reachable in an abstract system H#, then r is not reachable in the underlying system H of H#. Proof: Take a state r, which is reachable in H. Then there is a run h in H and a time t, such that h(t)=r. Since H# is correct, there is a run h# in H#, such that h(t) in h#(t). It is h#(t)=alpha(r) and so alpha(r) is reachable in H#. If this is the case, one can * prove that r is reachable in H by finding such a run. * prove that r is not reachable in H by refining the equivalence classes in H#, such that alpha(r) is no longer reachable in H#. Predicate abstraction: The definition of the equivalence classes of an abstract system by a set P of predicates. There are 2^|P| equivalence classes, each of which is defined by "switching on" a number of predicates: { x | ~p[1](x) && ~p[2](x) && ~p[3](x) && ... } { x | p[1](x) && ~p[2](x) && ~p[3](x) && ... } { x | ~p[1](x) && p[2](x) && ~p[3](x) && ... } { x | p[1](x) && p[2](x) && ~p[3](x) && ... } ... There may be empty equivalence classes, if the predicates which are switched on contradict themselves. Init-set of an abstract hybrid system H#: The set of all start states of H#, i.e. Init(H#) := {h#(0) | h# in H#} = {alpha(h)(0) | h in H} = {alpha(h(0)) | h in H} if H is the underlying hybrid system. Post-set of an abstract hybrid system for a set of abstract states: The set of those abstract states, which are reached by any abstract run when starting from any of the given abstract states. Thereby, only one event may be bypassed. Let H denote a hybrid system, let H# be its abstract system. Post#(Q#) := { x# | Ex h# in H#: Ex t in Real+: Ex t' in Real+:t'>=t, h#(t) in Q#, h#(t')=x#, event#(h#,[t;t'])=<1 } // Take an abstract run h#, take a time t, such that h#(t) in Q#, // take another time point t'>t, such that event#(h#,[t;t'])=<1. // Then the state h#(t') is in the post-set. = { x# | Ex h in H: Ex t in Real+: Ex t' in Real+:t'>=t, alpha(h(t)) in Q#, alpha(h(t'))=x#, event#(h#,[t;t'])=<1} = { x# | Ex q# in Q#, Ex h in H: Ex t in Real+: Ex t' in Real+:t'>=t, h(t) in q#, h(t') in x#, event#(h#,[t;t'])=<1 } // Take an abstract state q# of Q#, take a run h in H, take a time // t, such that h(t) in q#, take another time point t'>t, such that // event#(h#,[t;t'])=<1. Then the abstract state of h(t') is in the // post-set. // To calculate post#(Q#), run through all abstract stats x#. // See if you can find a h with h(t) in q and q in Q#, such that // h(t') in x#. If you find uch a h, the add x# to Post#(Q#). // Drawing the runs helps! Sign abstraction: The definition of the equivalence classes of an abstract system by the sign of the differential equation functions. That is: We have a (S,n)-hybrid system , the reachable continuous states of which are given by differential equations of the form dx[i]/dt = f[i](x[1],...x[n]) i=1..n We divide the continuous state space into the following equivalence classes: {x | f[1](x) = 0 && f[2](x) = 0 && f[3](x) = 0 && ... } {x | f[1](x) > 0 && f[2](x) = 0 && f[3](x) = 0 && ... } {x | f[1](x) < 0 && f[2](x) = 0 && f[3](x) = 0 && ... } {x | f[1](x) = 0 && f[2](x) > 0 && f[3](x) = 0 && ... } ... This gives us 3^n equivalence classes. It is already clear that there may not be a direct transition between a state with f[i](x)<0 to a state with f[i](x)>0 without passing by a state with f[i](x)=0. Further refinements can be done: The differential equations can be derived again, resulting in functions f[i]'. Their signs can be used to split up the equivalence classes again. This kicks out further transitions: If f[i](x)=0 and f[i]'(x)<0, then a follow-up state with f[i](x)>0 is impossible. Sign abstraction can also be done by using arbitrary f[i] for the equivalence classes. Rectangular grid: An abstract system, the equivalence classes of which are boxes of equal size. If the underlying hybrid system is given by differential equations of the form dx[i]/dt = f[1](x[1],...x[n]) i=1..n then the borders of the boxes can be analyzed. If it can be found out by interval arithmetic, that for all x at the border dx[i]/dt = f[i](x[1],...x[n]) > 0 then this means that there is a transition from thi box to the corresponding neighboring box, but no transition back. By refining the boxes appropriately, it can be possible to establish such statements for many of them. i-th Leading principle minor of a n*n matrix M: The value det(M'), where M' is the upper left part of M of size i*i. Hurwitz-Matrix of a polyomial: The matrix a[k-1] a[k-2] ... 0 a[k] a[k-1] ... 0 0 a[k] ... 0 0 0 ... 0 . . . . . . . . a[0] if the polynomial is a[k]*x^k+a[k-1]*x^(k-1)+ ... a[0]*x^0. Routh-Hurwitz-Test: The following algorithm for determining whether all zero-points of a polyomial are negative: Create the Hurwitz-Matrix of the polynomial. Iff all leading principle minors of this matrix are positive, then all zero-points of the polynomial are negative. Linear system: A hybrid system, the runs of which are given by dx/dt = A*x, where x is the vector variable for the continuous state and A is a matrix. The function term of the runs of this system have the form SUM { c[i]*v[i]*e^(lambda[i]*t) | i=1..n } where * n is the number of eigenvalues of A * c[i] are real constants * lambda[i] are the eigenvalues of A * v[i] are the corresponding eigenvectors of A Linear system stability theorem: The fact that a linear system dx/dt = A*x is asymptotically stable iff all eigenvalues of A are negative. // Strictly speaking: If they have a negative real part. But since // this summary does not treat complex numbers, this is ignored here. // See "Analyse de donnees" (maths.txt, in French) for a detailed // treatment of this. This condition can be checked by finding all eigenvalues of A and seeing whether they are negative. It can also be checked by creating the polynomial of det(lambda*In-A) and applying the Routh-Hurwitz-Test. See also: Asymptotically stable hybrid system