Entropy measures the complexity of mappings. For shifts, it also measures their “information capacity,” or ability to transmit messages. The entropy of a shift is an important number, for it is invariant under conjugacy, can be computed for a wide class of shifts, and behaves well under standard operations like factor codes and products. In this chapter we first introduce entropy and develop its basic properties. In order to compute entropy for irreducible shifts of finite type and sofic shifts in §4.3, we describe the Perron–Frobenius theory of nonnegative matrices in §4.2. In §4.4 we show how general shifts of finite type can be decomposed into irreducible pieces and compute entropy for general shifts of finite type and sofic shifts. In §4.5 we describe the structure of the irreducible pieces in terms of cyclically moving states.
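As a rough illustration of entropy as "information capacity," the sketch below assumes the usual block-counting definition, h(X) = lim (1/n) log2 |B_n(X)|, where B_n(X) is the set of allowed n-blocks; that definition is not spelled out in this summary. The example shift (binary sequences with no two consecutive 1s, often called the golden mean shift) and the function names are chosen here for illustration only.

```python
import math
from itertools import product

def count_blocks(n):
    """Count binary words of length n that contain no two consecutive 1s."""
    return sum(1 for w in product("01", repeat=n) if "11" not in "".join(w))

def entropy_estimate(n):
    """Finite-n approximation (1/n) * log2 |B_n| of the entropy."""
    return math.log2(count_blocks(n)) / n

# Value the estimates should approach: log2 of the golden ratio.
golden = math.log2((1 + math.sqrt(5)) / 2)

for n in (4, 8, 16):
    print(n, count_blocks(n), entropy_estimate(n), golden)
```

The block counts are Fibonacci numbers, so the estimates decrease slowly toward log2 of the golden ratio; brute-force enumeration is only practical for small n.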
Definition and Basic Properties
Before we get under way, we review some terminology and notation from linear algebra.
Recall that the characteristic polynomial of a matrix A is defined to be χA(t) = det(tId – A), where Id is the identity matrix. The eigenvalues of A are the roots of χA(t). An eigenvector of A corresponding to eigenvalue λ is a vector v, not identically 0, such that Av = λv.
We say that a (possibly rectangular) matrix A is (strictly) positive if each of its entries is positive. In this case we write A > 0.
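The linear algebra reviewed above can be checked by hand on a small case. The following minimal sketch handles only 2×2 matrices with real eigenvalues; the helper names and the example matrix are assumptions made for illustration, not notation from the text.

```python
import math

def char_poly_2x2(A):
    """Coefficients (1, -tr A, det A) of det(t*Id - A) = t^2 - tr(A) t + det(A)."""
    (a, b), (c, d) = A
    return (1, -(a + d), a * d - b * c)

def real_eigenvalues_2x2(A):
    """Roots of the characteristic polynomial, assuming they are real."""
    _, p, q = char_poly_2x2(A)
    disc = p * p - 4 * q
    if disc < 0:
        raise ValueError("eigenvalues are not real")
    r = math.sqrt(disc)
    return ((-p + r) / 2, (-p - r) / 2)

def is_positive(A):
    """True if every entry is strictly positive, i.e. A > 0."""
    return all(x > 0 for row in A for x in row)

A = [[1, 1], [1, 0]]
lam, mu = real_eigenvalues_2x2(A)

# v = (lam, 1) is an eigenvector for lam, because lam^2 = lam + 1.
v = (lam, 1.0)
Av = (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

print(char_poly_2x2(A), lam, mu, is_positive(A))
```

Here A itself is not positive because of its zero entry, although its square [[2, 1], [1, 1]] is.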
In this chapter we shall discuss the complexity of Frege systems without any restrictions on the depth. There is some nontrivial information, in particular nontrivial upper bounds, but no nontrivial lower bounds are known at present (only bounds from Lemma 4.4.12).
Counting in Frege systems
Theorems 9.1.5 and 9.1.6 are useful sufficient conditions guaranteeing the existence of polynomial size EF-proofs and of quasipolynomial size F-proofs, respectively. For example, U^1_1 proves the pigeonhole principle PHP(R) and hence there are quasipolynomial size F-proofs of PHP_n. A subtheory of corresponding to the polynomial size F-proofs, based on a version of inductive definitions, was considered by Arai (1991); see Section 9.6. Its axiomatization, however, stresses a logical construction, whereas we would like a theory based on a more combinatorial principle.
The most important property of a Frege system relevant for the upper bounds is that it can count. We shall make this precise by showing that F simulates an extension of IΔ0(α) by counting functions, and that F p-simulates the cutting planes propositional proof system.
Definition 13.1.1.
(a) Let L0 be the language of the second order bounded arithmetic but without the symbol #.
Although we have seen many aspects of symbolic dynamics, there are still many more that we have not mentioned. This final chapter serves as a guide to the reader for some of the more advanced topics. Our treatment of each topic only sketches some of its most important features, and we have not included some important topics. For each topic we have tried to give sufficient references to research papers so that the reader may learn more. In many places we refer to papers for precise proofs and sometimes even for precise definitions. The survey paper of Boyle [Boy5] contains descriptions of some additional topics.
More on Shifts of Finite Type and Sofic Shifts
THE CORE MATRIX
Any shift of finite type X can be recoded to an edge shift XG, and we can associate the matrix AG to X. This matrix is not unique, but any two such matrices are shift equivalent, and in particular they must have the same Jordan form away from zero. This gives us a way of associating to X a particular Jordan form, or, equivalently, a particular similarity class of matrices. By Theorem 7.4.6, this similarity class is an invariant of conjugacy, and, by Proposition 12.2.3, it gives a constraint on finite-to-one factors between irreducible shifts of finite type.
Ten years ago I had the wonderful opportunity to attend a series of lectures given by Jeff Paris in Prague on his and Alec Wilkie's work on bounded arithmetic and its relations to complexity theory. Their work produced fundamental information about the strength and properties of these weak systems, and they developed a variety of basic methods and extracted inspiring problems.
At that time Pavel Pudlak studied sequential theories and proved interesting results about the finitistic consistency statements and interpretability (Pudlak 1985, 1986, 1987). A couple of years later Sam Buss's Ph.D. thesis (Buss 1986) came out with an elegant proof-theoretic characterization of the polynomial time computations. Then I learned about Cook (1975), predating the above developments and containing fundamental ideas about the relation of weak systems of arithmetic, propositional logic, and feasible computations. These ideas were developed already in the late 70s by some of his students but unfortunately remained, to a large extent, unavailable to a general audience. New connections and opportunities opened up with Miki Ajtai's entrance with powerful combinatorics applied earlier in Boolean complexity (Ajtai 1988).
The work of these people attracted other researchers and allowed, quite recently, further fundamental results.
It appears to me that, with a growing interest in the field, a text surveying some basic knowledge could be helpful. The following is an outline of the book.
Recall from Chapter 6 that we regard two dynamical systems as being “the same” if they are conjugate and otherwise “different.” In Chapter 7 we concentrated on conjugacy for edge shifts, and found that two edge shifts are conjugate if and only if the adjacency matrices of their defining graphs are strong shift equivalent.
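Strong shift equivalence is not defined in this summary. Assuming the standard definition, in which an elementary equivalence between nonnegative integer matrices A and B is a pair (R, S) with A = RS and B = SR, and strong shift equivalence is a finite chain of such steps, the sketch below verifies one elementary equivalence. The matrices and helper names are chosen here for illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace_of_power(M, n):
    """Trace of M^n for a square matrix M and n >= 1."""
    P = M
    for _ in range(n - 1):
        P = matmul(P, M)
    return sum(P[i][i] for i in range(len(P)))

R = [[1, 1]]      # 1 x 2
S = [[1], [1]]    # 2 x 1

A = matmul(R, S)  # 1 x 1 matrix
B = matmul(S, R)  # 2 x 2 matrix

# Traces of powers agree, as they must for any pair RS, SR.
traces = [(trace_of_power(A, n), trace_of_power(B, n)) for n in range(1, 6)]
print(A, B, traces)
```

Agreement of the traces of all powers is only a necessary condition; the difficulty mentioned in the text lies in deciding whether a chain of elementary equivalences exists at all.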
We also saw that it can be extremely difficult to decide whether two given matrices are strong shift equivalent. Thus it makes sense to ask if there are ways in which two shifts of finite type can be considered “nearly the same,” and which can be more easily decided. This chapter investigates one such way, called finite equivalence, for which entropy is a complete invariant. Another, stronger way, called almost conjugacy, is treated in the next chapter, where we show that entropy together with period form a complete set of invariants.
In §8.1 we introduce finite-to-one codes, which are codes used to describe “nearly the same.” Right-resolving codes are basic examples of finite-to-one codes, and in §8.2 we describe a matrix formulation for a 1-block code from one edge shift to another to be finite-to-one. In §8.3 we introduce the notion of finite equivalence between sofic shifts, and prove that entropy is a complete invariant. A stronger version of finite equivalence is discussed and characterized in §8.4.
Finite-to-One Codes
We begin this chapter by introducing finite-to-one sliding block codes.
Bounded arithmetic was proposed in Parikh (1971), in connection with length-of-proofs questions. He called his system PB, presumably as the alphabetical successor to PA, but we shall stay with the established name I Δ0 (for “induction for Δ0 formulas”). This theory and its extensions by axioms saying that some particular recursive function is total were studied and developed in the fundamental work of J. Paris and A. Wilkie, and their students C. Dimitracopoulos, R. Kaye, and A. Woods.
They studied this theory both from the logical point of view, in connection with models of arithmetic, and in connection with computational complexity theory, mostly reflected by the definability of various complexity classes by subclasses of bounded formulas. They also investigated the relevance of Gödel's theorem to these weak subtheories of PA and closely related interpretability questions.
Further impetus to the development of bounded arithmetic came with Buss (1986), who formulated a bounded arithmetic system S2, a conservative extension of the system I Δ0 + Ω1 investigated earlier by J. Paris and A. Wilkie, and its various subsystems and second order extensions. The particular choice of the language and the definition of suitable subtheories of S2 allowed him to formulate a very precise relation between the quantifier complexity of a bounded formula and the complexity of the relation it defines, measured in terms of the levels of the polynomial time hierarchy PH.
This chapter considers various witnessing theorems, which are theorems characterizing functions definable in various systems of arithmetic in terms of their computational complexity. A prototype of such a theorem (and its proof) is the characterization of primitive recursive functions as the provably total recursive functions in a fragment of PA (cf. Parsons 1970, Takeuti 1975, and Mints 1976).
There are other approaches to proving witnessing theorems, for example, skolemizing the given theory by Skolem functions from a particular class and then applying Herbrand's theorem. There are also intricate model-theoretic constructions. I shall mention these methods too, but my opinion is that one really has to know in advance which class of functions one targets before formulating an argument, while the methods based on cut-elimination (Section 7.1) and generalizing Theorem 7.2.3 help to discover the right class. This certainly was the case for all witnessing theorems discussed in this chapter.
Cut-elimination for bounded arithmetic
We first extend the sequent predicate calculus by rules allowing the introduction of bounded quantifiers and by the induction rules, and then we prove cut-elimination for such a system.
The predicate calculus LK extends the propositional LK from Section 4.3 by four rules for introducing quantifiers to a sequent as in Definition 4.6.2:
From Section 10.4 we know that all theories (R) and (R) are distinct. In this chapter we examine specific, more direct independence proofs for theories (R), (R), and (R), and we strengthen Corollary 10.4.3.
Herbrandization of induction axioms
In this section we shall examine the following idea for independence proofs: Take an induction axiom for a (α)-formula. It has the complexity (α). Introduce a new function symbol to obtain a Herbrand form of the axiom, as at the beginning of Section 7.3. But this time we reduce the axiom to an existential formula. This allows us to use a simpler witnessing theorem (Theorem 7.2.3) than the original form of the axiom would require.
Consider first the simplest case (which will turn out to be the only one for which the idea works). Let α(x, y) be a binary predicate. Then the herbrandization of the induction axiom for the formula A(a) ≔ ∃u ≥ a, α(u, a)
is the formula
Denote this formula INDH(A(a)).
Theorem 11.1.1. The formula INDH(A(a)) is provable in (α, f) but not in (α, f). Hence (α, f) is not (α, f)-conservative over (α, f).
Symbolic dynamics is part of a larger theory of dynamical systems. This chapter gives a brief introduction to this theory and shows where symbolic dynamics fits in. Certain basic dynamical notions, such as continuity, compactness, and topological conjugacy, have already appeared in a more concrete form. So our development thus far serves to motivate more abstract dynamical ideas. On the other hand, applying basic concepts such as continuity to the symbolic context shows why the objects we have been studying, such as sliding block codes, are natural and inevitable.
We begin by introducing in §6.1 a class of spaces for which there is a way to measure the distance between any pair of points. These are called metric spaces. Many spaces that you are familiar with, such as subsets of 3-dimensional space, are metric spaces, using the usual Euclidean formula to measure distance. But our main focus will be on shift spaces as metric spaces. We then discuss concepts from analysis such as convergence, continuity, and compactness in the setting of metric spaces. In §6.2 we define a dynamical system to be a compact metric space together with a continuous map from the space to itself. The shift map on a shift space is the main example for us. When confronted with two dynamical systems, it is natural to wonder whether they are “the same,” i.e., different views of the same underlying process.
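To make "shift spaces as metric spaces" concrete, here is a minimal sketch under a stated assumption: the distance between two bi-infinite sequences x and y is taken to be 2^(-k), where k is the largest radius with x and y agreeing on the window [-k, k] (with distance 2 when they already differ at coordinate 0). This summary does not state the metric, so treat that formula as an assumption. Sequences are modelled as Python functions from integers to symbols, and `max_k` is a hypothetical search cutoff beyond which two points are treated as equal.

```python
def shift_distance(x, y, max_k=64):
    """Assumed metric: 2^-k for the largest k with x[-k..k] == y[-k..k]."""
    if x(0) != y(0):
        return 2.0  # convention k = -1 when the central symbols differ
    k = 0
    while k < max_k and x(k + 1) == y(k + 1) and x(-(k + 1)) == y(-(k + 1)):
        k += 1
    # Agreement out to the cutoff is treated as equality (an approximation).
    return 0.0 if k == max_k else 2.0 ** -k

def sigma(x):
    """Shift map: move every coordinate one place to the left."""
    return lambda i: x(i + 1)

x = lambda i: 0                    # the all-zeros point
y = lambda i: 1 if i == 3 else 0   # differs from x only at coordinate 3

d = shift_distance(x, y)
d_shifted = shift_distance(sigma(x), sigma(y))
print(d, d_shifted)
```

In this example shifting doubles the distance, which is consistent with the shift map being continuous: points that agree on a large central window still agree on a slightly smaller one after shifting.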