- if P and Q are formulas, then ¬P is a formula, P ∧ Q is a formula, P ∨ Q is a formula, P ⇒ Q is a formula, and P ⇔ Q is a formula.
- if P is a formula and x a variable, then ∃x: P and ∀x: P are formulas.

**Theorem:** It is undecidable whether an arithmetical formula is true.

**Proof:** Suppose there is an algorithm A that, given an
arithmetical formula, decides whether that formula is true.
I will use that to give a decision algorithm for the language
AP_{TM} = {(M,w) | M is the description of a Turing machine
that accepts the string w}. As the latter problem is undecidable, this
will show that A cannot exist.

Given a Turing machine M, a configuration of M is given by

- a state of M,
- the string on M's tape to the left of the position of the head of M (call it the left-string),
- and the string on M's tape including the square where the head is, and extending to the right of it (the right-string).

Each of these three components is encoded as a natural number:

- The states of M will be numbered by an initial segment of the natural numbers. The initial state will get number 0 and the accepting state number 1.
- Suppose the tape alphabet of a given Turing machine M has size n. The left-string on M's tape will be interpreted as the n-adic representation of a natural number. The blank symbol will have value n and the other symbols values 1 to n-1. The n-adic representation yields a perfect 1-to-1 correspondence between strings and numbers.
- The right-string on M's tape ends in an infinite sequence of blanks that somehow "don't count". Such a string can handily be represented as the n-ary representation of a natural number in reverse. This time the blank symbol has the value 0 and the other symbols (again) values 1 to n-1. In the n-ary representation of natural numbers leading 0's don't count either, so again the correspondence is perfect.
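The two encodings in the list above can be tried out with a small Python sketch. The function names are made up for illustration, and symbols are assumed to be given directly as their digit values (so the blank is n in a left-string and 0 in a right-string).

```python
def left_string_to_number(digits, n):
    """n-adic (bijective base-n) value of a left-string.

    `digits` lists the symbol values, leftmost symbol first; every value
    lies in 1..n, with the blank counted as n. The empty string maps to 0.
    """
    value = 0
    for d in digits:
        if not 1 <= d <= n:
            raise ValueError("n-adic digits must lie in 1..n")
        value = n * value + d
    return value


def number_to_left_string(value, n):
    """Inverse of left_string_to_number."""
    digits = []
    while value > 0:
        d = value % n
        if d == 0:          # remainder 0 stands for the digit n
            d = n
        digits.append(d)
        value = (value - d) // n
    return digits[::-1]


def right_string_to_number(digits, n):
    """Reversed n-ary value of a right-string.

    `digits` lists the symbol values starting at the head position; the
    blank is 0, so the symbol under the head is the least significant digit
    and trailing blanks do not change the value.
    """
    value = 0
    for d in reversed(digits):
        if not 0 <= d < n:
            raise ValueError("n-ary digits must lie in 0..n-1")
        value = n * value + d
    return value


def number_to_right_string(value, n):
    """Inverse of right_string_to_number, without the trailing blanks."""
    digits = []
    while value > 0:
        digits.append(value % n)
        value //= n
    return digits


print(left_string_to_number([1, 3], 3))         # 1*3 + 3 = 6
print(right_string_to_number([2, 1, 0, 0], 3))  # 2 + 1*3 = 5
```

Because the symbol next to the head is the least significant digit in both encodings, moving the head only involves multiplying or dividing by n.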

An arbitrary configuration can be represented by a triple (Q,X,Y) of variables, whose values represent the state, left-string and right-string of that configuration, respectively. Now the formula

Q = q ∧ R = r ∧ U = nX+b ∧ Y = nV+a

expresses the property that a configuration given by (Q,X,Y) evolves into one given by (R,U,V) by taking the transition from q to r labelled with a→b,R. Thanks to the use of n-ary representation for the right-string, there is no need to treat the case that Y represents a string of mere blanks separately (as I did with first order logic).

In case of a left-moving transition a→b,L from q to r, the left-string loses a digit, say c, that is appended to the right-string after the last digit of the right-string is changed from a to b. The exception is when the left-string doesn't have any digits, i.e. is empty; in that case the last digit of Y merely changes from a to b. This gives rise to the formula

Q = q ∧ R = r ∧ ( ∃c: (0 < c < n ∧ X = nU+c ∧ ∃Z: (Y = nZ+a ∧ V = n(nZ+b)+c)) ∨ (X = nU+n ∧ ∃Z: (Y = nZ+a ∧ V = n(nZ+b))) ∨ (X = U = 0 ∧ ∃Z: (Y = nZ+a ∧ V = nZ+b)) ).

Note that the case where c is the blank symbol is treated separately (through the middle disjunct). This is because in the left-string n-adic representation is used, and the value of c is n, whereas in the right-string n-ary representation is used, and the value of c is 0. The right-most disjunct deals with the case that the head of M was already in the left-most square, so that it couldn't move further to the left, and thus stays where it is. Arithmetic doesn't have the symbol <. However, c < n can be rewritten as ∃k: c+1+k = n.

For every transition in M there is a formula of one of the two forms above telling how a configuration (Q,X,Y) is related to a configuration (R,U,V), reached from (Q,X,Y) by taking that transition. Let T(Q,X,Y,R,U,V) be the disjunction of all those formulas. As M has finitely many transitions, this disjunction is finite as well, and thus a formula of arithmetic. T(Q,X,Y,R,U,V) has free variables Q,X,Y and R,U,V. It says, in the language of arithmetic, that there is a transition transforming (Q,X,Y) into (R,U,V). That is, the formula T(Q,X,Y,R,U,V) is true exactly when there is such a transition.
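As a sanity check, the two transition formulas above can be evaluated on concrete numbers. The Python sketch below follows them term by term; the function names are hypothetical. One assumption is made explicit: in the right-moving case, b is the digit value of the written symbol in the left-string encoding (so n if it is the blank), whereas a is always a right-string digit in 0..n-1.

```python
def less_than(c, n):
    """c < n, written the way the text suggests: there is a k with c+1+k = n.

    The witness k can never exceed n, so a bounded search is enough.
    """
    return any(c + 1 + k == n for k in range(n + 1))


def right_move(Q, X, Y, R, U, V, q, r, a, b, n):
    """Q = q and R = r and U = nX+b and Y = nV+a."""
    return Q == q and R == r and U == n * X + b and Y == n * V + a


def left_move(Q, X, Y, R, U, V, q, r, a, b, n):
    """The three-way disjunction for a left-moving transition a -> b, L."""
    if Q != q or R != r or Y % n != a:
        return False            # no Z with Y = nZ + a exists
    Z = Y // n                  # the unique witness for Z
    # Left-string ends in a non-blank digit c, which moves to the right-string.
    nonblank = any(
        less_than(0, c) and less_than(c, n)
        and X == n * U + c and V == n * (n * Z + b) + c
        for c in range(n + 1)
    )
    # Left-string ends in a blank: value n on the left, value 0 on the right.
    blank = X == n * U + n and V == n * (n * Z + b)
    # Empty left-string: the head stays in the left-most square.
    at_edge = X == 0 and U == 0 and V == n * Z + b
    return nonblank or blank or at_edge


n = 3
# Left-string digits [1, 2] give X = 5; right-string digits [2, 1] give Y = 5.
print(right_move(0, 5, 5, 2, 16, 1, q=0, r=2, a=2, b=1, n=n))  # True
print(left_move(0, 5, 5, 2, 1, 14, q=0, r=2, a=2, b=1, n=n))   # True
```

T(Q,X,Y,R,U,V) then corresponds to taking `any(...)` of such checks over the finitely many transitions of M.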

Using the formula T(Q,X,Y,R,U,V) it is possible to build another formula T*(Q,X,Y,R,U,V) in the language of arithmetic, with free variables Q,X,Y and R,U,V, that says that it is possible to proceed from configuration (Q,X,Y) to configuration (R,U,V) by following zero or more transitions. T*(Q,X,Y,R,U,V) can be regarded as the reflexive and transitive closure of T(Q,X,Y,R,U,V). Building such a transitive closure within the language of arithmetic is a bit tricky and therefore skipped in class. I will provide the details upon request.

Now the formula

∃U ∃V: T*(0,0,w,1,U,V)

says that it is possible to proceed from the initial configuration, with the word w on the tape and the head of M in its left-most position, to a configuration involving the accept state. This formula is true exactly when the Turing machine M accepts the word w.

Thus, in order to decide whether or not M accepts w, it suffices to check whether or not the formula above is true in arithmetic. This constitutes a reduction of the acceptance problem for Turing machines to the problem of determining truth in arithmetic. As the former problem is undecidable, so must be the latter.
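To make the meaning of the formula concrete, the following Python sketch searches for a witness (U,V) by following transitions on numeric configurations, starting from (0,0,w). It is only an illustration under assumptions: the transition format `(q, a, b, move, r)` and the function names are invented here, symbols are given as right-string digit values (blank 0), and the search is cut off after `max_steps` steps. A `True` answer shows that the formula holds; a `False` answer only says that no witness was found within the bound, which is why this does not decide acceptance.

```python
def successors(config, transitions, n):
    """Configurations reachable from (Q, X, Y) by one transition."""
    Q, X, Y = config
    for (q, a, b, move, r) in transitions:
        if Q != q or Y % n != a:        # head must read the symbol a
            continue
        Z = Y // n                       # right-string without the head symbol
        if move == "R":
            b_left = n if b == 0 else b  # the blank is worth n in the left-string
            yield (r, n * X + b_left, Z)
        elif X == 0:                     # left move at the left-most square
            yield (r, 0, n * Z + b)
        else:
            c = X % n                    # 0 here stands for the n-adic digit n
            U = (X - (c if c else n)) // n
            yield (r, U, n * (n * Z + b) + c)


def accepts_within(transitions, n, w, max_steps):
    """Bounded search for U, V such that (0, 0, w) reaches (1, U, V)."""
    frontier = {(0, 0, w)}
    seen = set(frontier)
    for _ in range(max_steps + 1):
        if any(config[0] == 1 for config in frontier):
            return True
        frontier = {s for c in frontier
                    for s in successors(c, transitions, n)} - seen
        seen |= frontier
    return False


# Hypothetical machine over digits {0 (blank), 1, 2}: skip 2's, accept on a 1.
scan = [(0, 2, 2, "R", 0), (0, 1, 1, "R", 1)]
print(accepts_within(scan, 3, 17, 3))   # w = digits [2, 2, 1] -> True
print(accepts_within(scan, 3, 17, 2))   # bound too small      -> False
```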

**Theorem:** The language of true arithmetical formulas is not even
recognizable.

**Proof:** Suppose B were a Turing machine recognizing true
arithmetical formulas. For any formula P in arithmetic, either P
itself or its negation ¬P is true. Thus the
truth of P can be decided by running B on P and B on
¬P in parallel. Within a finite amount of
time either P or ¬P will be accepted, which
settles the question of whether P is true or not. Thus truth in
arithmetic would be decidable, contradicting the previous theorem.
Hence B does not exist and arithmetical truth is not recognizable.
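The "in parallel" step of this proof can be pictured as alternating single steps of two runs. The Python sketch below models a run of B as a generator that yields `False` after each step and `True` once it accepts. Everything in it is a stand-in: `toy_recognizer` only handles strings such as `2+2=4` and their negations written with a leading `~`, and by the theorem no real B can exist.

```python
def decide_by_dovetailing(recognizer, formula, negate):
    """Alternate steps of B on P and B on not-P until one of them accepts."""
    run_on_p = recognizer(formula)
    run_on_not_p = recognizer(negate(formula))
    while True:
        if next(run_on_p):
            return True      # P was accepted, so P is true
        if next(run_on_not_p):
            return False     # not-P was accepted, so P is false


def toy_recognizer(formula):
    """Hypothetical recognizer for sums like '2+2=4' or '~2+2=5'.

    It accepts true inputs after a few steps and runs forever on false ones,
    which is all that recognizability promises.
    """
    negated = formula.startswith("~")
    lhs, rhs = formula.lstrip("~").split("=")
    holds = sum(int(t) for t in lhs.split("+")) == int(rhs)
    is_true = holds != negated
    for _ in range(3):       # pretend to do some work first
        yield False
    while not is_true:       # never accept a false formula
        yield False
    yield True


def negate(formula):
    return "~" + formula


print(decide_by_dovetailing(toy_recognizer, "2+2=4", negate))  # True
print(decide_by_dovetailing(toy_recognizer, "2+2=5", negate))  # False
```

Alternating steps matters: running B on P to completion first could loop forever when P is false, and the run on ¬P would never get its turn.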

**Theorem:** For every reasonable method of provability, the
language of provable arithmetical formulas is enumerable (and thus
recognizable).

**Proof:** The first requirement of a reasonable method of
provability is that it should be possible to determine whether a given
piece of text is a proof or not. Hence it is possible to enumerate all
proofs, namely by enumerating all finite pieces of text, and deleting
those that aren't proofs. The second requirement of a reasonable
method of provability is that it should be possible to determine,
given a proof, what formula it is that it proves. This enables the
enumeration of all proofs to be converted into an enumeration of all
provable formulas.
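The enumeration described in this proof can be sketched in Python as a filter over all finite texts. The two requirements appear as the parameters `is_proof` and `conclusion`; both are placeholders for whatever the proof system provides. The toy system used to exercise the code (unary additions such as `1+1=11`, each counting as its own proof) is invented purely for illustration.

```python
from itertools import count, islice, product


def all_texts(alphabet):
    """Every finite string over `alphabet`, shortest first."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)


def provable_formulas(alphabet, is_proof, conclusion):
    """Enumerate all texts, drop the non-proofs, report what each proof proves.

    A formula with several proofs is listed several times, which is harmless
    for an enumeration.
    """
    for text in all_texts(alphabet):
        if is_proof(text):
            yield conclusion(text)


def toy_is_proof(text):
    """Hypothetical checker: text must look like 1..1+1..1=1..1 and add up."""
    if text.count("+") != 1 or text.count("=") != 1:
        return False
    left, right = text.split("=")
    if "+" not in left:
        return False
    a, b = left.split("+")
    return bool(a) and bool(b) and len(a) + len(b) == len(right)


def toy_conclusion(text):
    return text              # in the toy system a proof is the formula itself


first_three = list(islice(provable_formulas("1+=", toy_is_proof, toy_conclusion), 3))
print(first_three)
```

A recognizer for provable formulas follows directly: given a formula, run the enumeration and accept as soon as the formula shows up.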

**Goedel's incompleteness theorem:**
If a proof system for arithmetic is sound (meaning that only true
formulas are provable) then there must be a true formula that is not
provable.

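**Proof:** The set of provable formulas is enumerable, and the set
of true formulas isn't. Therefore the two sets must differ. As the
proof system is sound, every provable formula is true, so the
difference must consist of a true formula that is not provable. QED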

**Remark:** The proof of Goedel's incompleteness theorem given
here rests heavily on Church's thesis, which is not a mathematical
theorem. Goedel's own proof bypasses Church's thesis (in fact it
predates it by several years) and therefore is much more complicated.
The undecidability proof of truth goes through also in the absence of
Church's thesis: truth is then not recursive. However, showing that
provability is recursively enumerable is a lot of work, and requires
slightly stronger assumptions regarding the notion of a reasonable
method of provability. It is possible to bypass the use of
decidability and recursive enumerability by showing that provability
is arithmetical (see below), whereas truth is not. Alternatively it is
possible to construct an actual formula that is true but not provable;
this is what Goedel did.

A language is called arithmetical if it consists of the set of strings that are the n-adic representations of an arithmetical set of natural numbers. (In fact this definition is invariant under a change of the algorithm coupling strings and numbers.)

**Theorem:** Any recursively enumerable language is arithmetical.

**Proof:** Suppose L is the set of strings w accepted by a Turing
machine M. Then L = {w | ∃U ∃V: T*(0,0,w,1,U,V)}, where T* is the
formula presented earlier.
Thus P(W) is the formula ∃U ∃V: T*(0,0,W,1,U,V).

The class of arithmetical languages is much larger than the class of recursively enumerable ones. In fact, all languages that have been shown to be undecidable or unrecognizable in the book turn out to be arithmetical. The only example of a nonarithmetical language encountered so far is the set of true formulas in arithmetic.

Rob van Glabbeek | rvg@cs.stanford.edu |