For the readers of Hopcroft & Ullman (HU): We have covered Chapters 1 and 2 now, while skipping 2.6 - 2.8. (The applications in Section 2.8 may be good to read though.) Wednesday Chapter 3.

For the readers of Hopcroft, Motwani & Ullman (HMU): We have covered Chapters 1, 2 and 3, while skipping Sections 2.4, 3.3 and 3.4. (The applications in Sections 2.4 and 3.3 may be good to read though.) Wednesday Chapter 4.

**Covered material on regular languages**: Sipser Chapter 1 (all of
it) and minimization of DFAs
(not in Sipser). Also the easy algorithms below.

This corresponds to HMU Chapters 2 (except 2.4), 3 (except 3.3-4) and
4 (except 4.2.3-4),

and to HU Chapter 2 (except 2.7-8) and 3 (except Theorems 3.9-10).

Important topics:

- DFAs and NFAs recognize, and regular expressions generate, the same class of languages, called the *regular languages*.
- These languages are closed under intersection, union, concatenation, star, complement and reversal.
- The pumping lemma and the closure properties are the main tools for showing languages to be non-regular.
- There are easy algorithms for
  - converting between DFAs, NFAs and regular expressions,
  - checking whether a given DFA or NFA accepts a given string,
  - checking whether the language accepted by a given DFA or NFA is empty,
  - and checking whether two DFAs or NFAs accept the same language.
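Two of these algorithms can be sketched in a few lines of Python (a toy encoding of my own, not from the lectures: the transition function is a dict, and all names are illustrative):

```python
# A toy DFA: states are strings, delta maps (state, symbol) -> state.
def dfa_accepts(delta, start, accepting, w):
    """Membership test: run the DFA on w and see where it ends up."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accepting

def dfa_is_empty(delta, start, accepting):
    """Emptiness test: the language is empty iff no accepting state
    is reachable from the start state."""
    reachable, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        for (p, _symbol), r in delta.items():
            if p == q and r not in reachable:
                reachable.add(r)
                frontier.append(r)
    return reachable.isdisjoint(accepting)

# Example: a DFA over {0,1} accepting the strings that end in 1.
delta = {('A', '0'): 'A', ('A', '1'): 'B',
         ('B', '0'): 'A', ('B', '1'): 'B'}
```

Equivalence of two DFAs can then be reduced to emptiness: build the product automaton for the symmetric difference of the two languages and test that for emptiness.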

I also showed how any CFG can be converted into Chomsky normal form.
This yields an algorithm to decide whether a given string is generated
by a given context-free grammar. And I introduced the concept of an
*inherently ambiguous context-free language*.
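The membership algorithm obtained this way is essentially the CYK algorithm. A sketch in Python, assuming the grammar is already in Chomsky normal form without epsilon-rules (the encoding of rules is my own, for illustration):

```python
def cyk(rules, start, w):
    """CYK membership test for a CFG in Chomsky normal form.
    rules maps each variable to a list of bodies: either a pair
    (A, B) of variables or a 1-tuple (a,) holding a terminal.
    Assumes no epsilon-rules, so the empty string is rejected."""
    n = len(w)
    if n == 0:
        return False
    # T[i][j] collects the variables that generate the substring w[i..j].
    T = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(w):
        for A, bodies in rules.items():
            if (a,) in bodies:
                T[i][i].add(A)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):          # split point of w[i..j]
                for A, bodies in rules.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in T[i][k]
                                and body[1] in T[k + 1][j]):
                            T[i][j].add(A)
    return start in T[0][n - 1]
```

For example, with a CNF grammar for { a^n b^n | n >= 1 }, namely `rules = {'S': [('A','X'), ('A','B')], 'X': [('S','B')], 'A': [('a',)], 'B': [('b',)]}`, `cyk(rules, 'S', 'aabb')` is True.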

Aaron produced a handout that shows a conversion from a PDA to a CFG.

**Covered material on context-free languages**: Sipser Chapter 2 (all of
it) and the concept of useless variables
(not in Sipser). Also the easy algorithms below.
Furthermore *deterministic* pushdown automata, and the notion of
acceptance by empty stack.

This corresponds to HMU Chapters 5, 6 and 7, except Sections 5.3, 6.4.4,
7.3.1, 7.3.2, 7.3.5 and 7.4 (but the very beginning of 7.4.4 and
7.4.5 is treated),

and to HU Chapters 4, 5 and 6, except Sections 4.6-7,
Theorems 6.2-3 and the end of Section 6.3.

Important topics:

- Context-free grammars generate, and pushdown automata recognize, the same class of languages, called the *context-free languages*.
- These languages are closed under union, concatenation, star and reversal, as well as intersection with a regular language. They are not closed under complementation and intersection.
- The pumping lemma and the closure properties are the main tools for showing languages to be non-context-free.
- Parsing, ambiguity, inherently ambiguous languages.
- There are easy algorithms for
  - eliminating epsilon-productions and unit rules, and converting a CFG into Chomsky normal form,
  - eliminating useless variables from a CFG,
  - checking whether a pushdown automaton or CFG accepts a given string,
  - and checking whether the language accepted by a given pushdown automaton or CFG is empty.

- On the other hand, there is no algorithm at all for
  - checking whether two CFGs generate the same language,
  - checking whether a given CFG is ambiguous,
  - checking whether the intersection of two context-free languages (given by CFGs or PDAs) is empty,
  - and checking whether the language accepted by a given pushdown automaton or CFG consists of all strings over the given alphabet (i.e. whether its complement is empty).
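The emptiness check from the list of easy algorithms above can be sketched as follows, by iteratively marking "generating" variables (the encoding is illustrative, not from the lecture):

```python
def cfg_is_empty(rules, start):
    """Emptiness test for a CFG: iteratively mark the variables that
    generate at least one terminal string; the language is empty iff
    the start variable is never marked.  rules maps each variable to
    a list of bodies (tuples whose entries are variables or terminals;
    the variables are exactly the keys of rules)."""
    variables = set(rules)
    generating = set()
    changed = True
    while changed:
        changed = False
        for A, bodies in rules.items():
            if A in generating:
                continue
            for body in bodies:
                # A body generates a terminal string as soon as every
                # variable occurring in it is known to be generating.
                if all(s in generating or s not in variables for s in body):
                    generating.add(A)
                    changed = True
                    break
    return start not in generating
```

The same marking technique, run on the variables reachable from the start symbol, also finds the useless variables mentioned above.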

The midterm exam covers all the material on regular and context-free languages mentioned above.

I showed that one-tape Turing machines are equivalent to multitape Turing machines, and mentioned that there are many variations of Turing machines of this nature.

I also introduced unrestricted grammars, which
are generalizations of context-free grammars, not covered in Sipser.
A language is generated by such a grammar if and only if it is recursively
enumerable.

**Chomsky hierarchy**

| Regular languages | Context-free languages | Context-sensitive languages | Recursively enumerable languages |
| --- | --- | --- | --- |
| regular expressions | context-free grammars | context-sensitive grammars | unrestricted grammars |
| NFA | PDA | LBA | nondeterministic Turing machines |
| DFA | (DPDA are weaker) | (whether DLBA are weaker is open) | Turing machines |

I presented the concept of an enumerator and showed that a language is enumerable if and only if it is semi-decidable (see last week's handout).

I defined the notion of *cardinality* and showed that there
are just as many rational numbers (fractions) as natural numbers
(non-negative integers): the rational numbers are *countable*.
However, there are more real numbers than natural numbers: the reals
are uncountable. This I showed with the method of *diagonalization*.
The same method also yields that there are countably many algorithms,
but uncountably many languages. Thus there must be languages whose
membership cannot be semi-decided by means of any algorithm.
In fact one such language was constructed last week already, in the
handout on decidability.
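The heart of the diagonalization argument fits in one line of Python, with infinite binary sequences represented as functions from positions to digits (a finite list stands in for the enumeration, purely for illustration):

```python
def diagonal(seqs):
    """Cantor's diagonal: given an (alleged) enumeration of infinite
    binary sequences, each a function from positions to {0, 1},
    build a sequence that differs from the i-th one at position i --
    so it cannot occur anywhere in the enumeration."""
    return lambda i: 1 - seqs[i](i)
```

The argument itself of course works against any enumeration indexed by the natural numbers, not just a finite list.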

I touched upon the notion of reducing one problem to another, and used it to prove that the acceptance problem is undecidable. In terms of general algorithms this is treated in my decidability handout; in terms of Turing machines in the textbooks. Chapter 4 of Sipser is now considered completely covered, as well as the very beginning of Chapter 5.

AP_{TM} = { (<M>, *w*) | Turing machine M accepts input string *w* }

SELF-REJECT_{TM} = { <M> | Turing machine M does not accept input string <M> }

SELF-ACCEPT_{TM} = { <M> | Turing machine M accepts input string <M> }

It is important to apply reductions in the right direction (see above)! I showed that it is undecidable whether a given Turing machine accepts the empty language, by reducing the acceptance problem (which is known to be undecidable) to it. This is a special case of Rice's theorem, which says that any nontrivial property of the language recognized by a Turing machine is undecidable. I proved Rice's theorem, which is not covered in Sipser, using a handout.
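The direction of such a reduction can be illustrated with a toy Python model in which "machines" are Python predicates (this glosses over non-halting behaviour, and the names are mine, not from the lecture):

```python
def reduce_acceptance_to_emptiness(M, w):
    """Map an instance (M, w) of the acceptance problem to a machine
    M1 whose language is nonempty iff M accepts w.  'Machines' are
    modelled as Python predicates here, so termination issues are
    not visible in this sketch."""
    def M1(x):
        # M1 ignores its own input and simulates M on the fixed w.
        return M(w)
    return M1
```

A decider for emptiness, applied to M1, would thus decide whether M accepts w; since the acceptance problem is undecidable, no such decider can exist.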


Subsequently I treated the undecidability of the empty intersection problem for context-free grammars (not covered in Sipser). Consult this handout for the undecidability of the ambiguity problem for CFGs.

During this course I also plan to cover a few more advanced topics in computability (or decidability) theory; however, these will be postponed until after the treatment of complexity theory. This is because it is very important that you get the homework on complexity theory back well before the final exam.

**Covered material on computability theory**:
Sipser Chapters 3, 4, 5 (except for the proof of Theorem 5.10; its
statement is covered) and Section 6.3, as well as the handouts linked below.

This corresponds to HMU Chapters 8 and 9, except Sections 8.5.3, 8.5.4
and 9.5.3, plus the handouts.

Important topics:

- Decidable, recognizable and enumerable languages; recursive, Turing-recognizable and recursively enumerable languages; the relationships between these classes; Church's thesis.
- The decidable languages are closed under union, intersection and complementation; r.e. languages under union and intersection, but not complementation.
- The existence of unrecognizable languages, and of recognizable but undecidable languages.
- Diagonalization. Countability.
- Turing machines; configurations; multitape TMs; the equivalence of deterministic and nondeterministic TMs.
- Unrestricted and context-sensitive grammars. Linear bounded automata.
- Computable (or recursive) and partial recursive functions.
- The method of reduction. Mapping reducibility and Turing reducibility.
- Undecidability of the acceptance problem for TMs, the halting problem, whether an algorithm halts on all inputs, the emptiness problem for TMs, the emptiness problem for LBAs, the Post Correspondence Problem, ambiguity of CFGs, and empty intersection of CFGs.
- Rice's theorem.

- n-adic and n-ary representation of natural numbers.
- The undecidability of provability in first order logic and truth in arithmetic.
- Goedel's incompleteness theorem and the class of arithmetical languages.

I explained that the class NP can be seen as the class of problems
with polynomial time verifiers, and showed that the problem
*CLIQUE* is in NP. Other examples of problems in NP can (and
should) be found in the book.
It is a famous open problem whether P = NP. Most computer scientists
think this is not the case.
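The verifier view of NP can be made concrete for *CLIQUE* (a sketch with my own toy encoding: a graph is a set of vertex pairs, and the certificate is the candidate clique itself):

```python
from itertools import combinations

def verify_clique(edges, candidate, k):
    """Polynomial time verifier for CLIQUE: given a certificate
    (a candidate list of k vertices), check its size and that every
    pair of its vertices is joined by an (undirected) edge."""
    if len(set(candidate)) != k:
        return False
    return all((u, v) in edges or (v, u) in edges
               for u, v in combinations(candidate, 2))
```

The verifier clearly runs in time polynomial in the size of the graph, which is exactly what membership in NP requires.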

I introduced the concept of a polynomial time reduction.
Then I explained the notion of NP-completeness and mentioned that
*SAT* and even *3SAT* are NP-complete problems (*SAT* being the first
one ever found, by Cook). Using this, I showed that *CLIQUE*
is also NP-complete, by providing a polynomial time reduction from
*3SAT* to *CLIQUE*.
In the book there are many more NP-complete problems, which are
assigned reading (although you do not have to know hard proofs like the
NP-completeness of *HAMILTONIAN PATH*). Showing NP-completeness
is a skill that is a vital part of this course, and you should
practice on your own to get the necessary experience.
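The standard *3SAT*-to-*CLIQUE* construction can be sketched in Python (a toy encoding of literals as (variable, polarity) pairs; this is an illustration of the well-known construction, not the lecture's exact presentation):

```python
from itertools import combinations

def threesat_to_clique(clauses):
    """Reduction from 3SAT to CLIQUE: one vertex per literal
    occurrence (a literal is a (variable, polarity) pair), and an
    edge between two literals iff they sit in different clauses and
    are not negations of each other.  The formula is satisfiable iff
    the resulting graph has a clique of size k = number of clauses."""
    vertices = [(i, lit) for i, clause in enumerate(clauses)
                for lit in clause]
    edges = set()
    for (i, (x, px)), (j, (y, py)) in combinations(vertices, 2):
        if i != j and not (x == y and px != py):
            edges.add(((i, (x, px)), (j, (y, py))))
    return vertices, edges, len(clauses)
```

The construction is clearly polynomial: it only inspects quadratically many pairs of literal occurrences.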

Then I proved Savitch's theorem, saying that
if f(n) ≥ log(n) then NSPACE(f(n)) ⊆ SPACE(f(n)^{2}).
I also recalled that if a language is in NTIME(f(n)), then there is a
constant d such that it is in TIME(d^{f(n)}).
Thus, whereas converting nondeterministic machines into deterministic
ones costs us an exponential in time, it costs only a square in space.
It follows that the classes PSPACE and NPSPACE coincide.

I discussed the hierarchy of complexity classes

L ⊆ NL ⊆ P ⊆ NP ⊆ PSPACE = NPSPACE ⊆ EXPTIME ⊆ EXPSPACE.

The concept of a polynomial-time reduction can also be used to define PSPACE-completeness, EXPSPACE-completeness, etc., and likewise NP-hardness, PSPACE-hardness, etc. (A problem is X-complete iff it is X-hard and in class X.)
For any class X there is a class co-X of complements of problems in X.
When X is defined in terms of deterministic Turing machines, trivially
co-X = X. It is an open problem whether NP = co-NP.
Surprisingly, co-NL = NL. This follows because the complement of *PATH* is in NL
(the proof of which has been skipped).
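For comparison, *PATH* itself is also in P, as witnessed by plain breadth-first search (a sketch with a toy graph encoding; the NL algorithm is different: it guesses the path one node at a time, storing only the current node and a step counter in logarithmic space):

```python
from collections import deque

def path(edges, s, t):
    """PATH by breadth-first search: is there a directed path from
    s to t?  Runs in time polynomial in the size of the graph."""
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                queue.append(b)
    return False
```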

Subsequently I presented, without proof, the time and space hierarchy theorems that say that for time constructible functions such that one is larger than the other by more than a logarithm, the corresponding time complexity classes are different. Likewise, for space constructible functions such that one is larger than the other by more than a constant factor, the corresponding space complexity classes are different.

Finally, I argued that through a suitable encoding of natural numbers as strings, the concepts of decidability and enumerability apply not only to languages, but also to sets of natural numbers. It doesn't matter which encoding is chosen, because there must be an algorithm to convert one encoding into another. When the encodings are reasonable, such as the standard n-ary or n-adic representation of numbers as strings, one encoding can be retrieved from another in polynomial time (or better), so even most complexity classes are well-defined for sets of natural numbers.

I explained that the n-adic representation yields a bijective correspondence between strings over an n-letter alphabet and the natural numbers, and pointed out the difference between n-ary and n-adic representations.
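A sketch of the 2-adic (bijective base-2) case in Python, illustrating the bijection (my own encoding, with 0 mapped to the empty string):

```python
def to_dyadic(n):
    """2-adic (bijective base-2) representation over the digit
    alphabet {1, 2}: every natural number gets exactly one string,
    with 0 mapped to the empty string."""
    digits = ""
    while n > 0:
        n, r = divmod(n - 1, 2)   # bijective-base digit extraction
        digits = "12"[r] + digits
    return digits

def from_dyadic(s):
    """Inverse: read a string over {1, 2} back as a number."""
    n = 0
    for c in s:
        n = 2 * n + int(c)
    return n
```

By contrast, ordinary binary (2-ary) notation is not bijective: 1, 01 and 001 all denote the same number.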

**Covered material on complexity theory**: Sipser Chapters 7
(except for Theorem 7.13 and the proof of Theorem 7.35), 8 (except for
the proofs of Theorems 8.8 and 8.22 (so the statements of these
theorems are covered)) and 9.1 (except for the proofs of Theorems 9.3,
9.10, and 9.15), as well as the handout
on complexity classes.

This corresponds to HMU Chapters 10 and 11.1-3, plus the handout.
The material on the classes L and NL is not in HMU; students who do
not have a copy of Sipser should obtain a copy of those parts.

Important topics:

- The complexity classes TIME(f(n)), NTIME(f(n)), SPACE(f(n)) and NSPACE(f(n)), and their relationships.
- The classes L, NL, coNL, P, NP, coNP, PSPACE, NPSPACE, EXPTIME and EXPSPACE and their relationships.
- The relationship between the classes of regular, context-free, and context-sensitive languages and the classes above.
- Polynomial time reductions and the notions of NP-completeness, PSPACE-completeness and EXPSPACE-completeness.
- Log-space reductions and the notion of NL-completeness.
- The problems SAT, 3SAT, Clique, directed and undirected Hamiltonian path, subset-sum, and node-cover (or vertex-cover) are NP-complete; PATH is NL-complete and in P. TQBF (even in prenex conjunctive normal form) and the geography game are PSPACE-complete.
- The time and space hierarchy theorems.

Sadly, the above plan is bogus, because
the concept of definability is ill-defined. The best one can do is
to define particular tools for defining languages, and consider classes
of languages that are definable with those tools. An extremely fruitful
tool is first-order arithmetic. One defines the *arithmetical
languages* as those that can be defined by arithmetical means.
This turns out to be a very large class that surely contains every
language that has been contemplated in this course so far.

I proved that the class of arithmetical languages contains at least all recursively enumerable languages, by showing that the behavior of any Turing machine can be encoded in terms of first-order arithmetic. This also yielded the undecidability of arithmetic, and Goedel's incompleteness theorem.

Rob van Glabbeek | rvg@cs.stanford.edu |