Language and information

Lecture 1. A formal theory of syntax

1.7. Properties of the base


Now, we have seen four constraints on word combination: the partial order of word dependence, which creates sentence structure; the likelihood inequalities, which fit word meanings; the reduction of high-likelihood word occurrences; and the linearizations. Each acts on the resultant of its predecessor.

The four constraints partition the set of sentences into two major sets. Without the third—without the reductions—the other three create a base set from which all other sentences are derived. What is important here is that neither the base set nor the other set—the derived set of the partition—is merely a residue of the other. It isn't that the structure of the base set is just a description of all those sentences which could not be derived from something, nor are the derivations just any change needed to obtain the remaining sentences from the base set. Rather, the base set and the derivations each have simple and understandable structures on their own terms, and it is a non-trivial result that the whole set of sentences is characterized by just these two structures. 

Now, a few properties of the structures as they result from the constraints, first of the base, and then of the set of reduced sentences.

In the base, in virtually all sentences, words are simple, not composite, since affixes are generally reductions of next words. This of course makes the formulation of the base incomparably easier. Few words have more than one defining dependence. If you have a word like "expect", with two different types of objects, "I expect John to come" or "to go away", and "I expect John" (one is a sentence object, the other is a noun), you wish you could find a way of deriving one from the other. The dependence defines just a few large word classes: the ones requiring null; the ones requiring one word requiring null; the ones requiring two words requiring null ... you could say that "put" requires three words requiring null ("I put a book on the table"; you don't just say "I put a book"); and then there's "said", requiring [a word requiring] null and ... a word requiring [a word requiring] something, non-null; and so forth. These are the only classes that we find in the base.
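As a rough illustration of how such dependence classes could be written down, here is a minimal sketch in Python. It assumes that each word is listed with the requirement it places on its arguments. The lexicon entries, the class labels such as "Onnn", and the function names are all hypothetical and chosen only for this example; they are not taken from the lecture.

```python
# Each word is listed with the requirement it places on its arguments:
#   "n" = an argument that itself requires nothing (null requirement)
#   "o" = an argument that itself requires something (non-null)
# The entries below are illustrative assumptions, not a lexicon from the lecture.
LEXICON = {
    "I": (),
    "John": (),
    "book": (),
    "table": (),
    "sleep": ("n",),          # assumed: requires one word requiring null
    "put": ("n", "n", "n"),   # requires three words requiring null
    "say": ("n", "o"),        # requires a null-requiring word and a non-null one
}


def level(word):
    """Return "n" if the word requires nothing, otherwise "o"."""
    return "n" if LEXICON[word] == () else "o"


def class_label(word):
    """Name the word's class by its requirement (hypothetical shorthand)."""
    requirement = LEXICON[word]
    return "N" if not requirement else "O" + "".join(requirement)


def satisfies(operator, arguments):
    """Check that the given words meet the operator's requirement."""
    requirement = LEXICON[operator]
    return len(arguments) == len(requirement) and all(
        level(arg) == needed for arg, needed in zip(arguments, requirement)
    )
```

Under these assumptions, `satisfies("put", ["I", "book", "table"])` holds while `satisfies("put", ["I", "book"])` does not, which mirrors the remark that you don't just say "I put a book".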

The composition, in terms of partial ordering, of these sentences is transparent in the linear [flow], because the operator is always between its two arguments, and so it's built up through the whole thing. Furthermore, as will be seen in a moment, the base suffices for all the information carried by language. So, it's certainly of considerable importance for us.
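The point about the linear order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a sentence is represented as a nested pair of an operator and its arguments, words are left uninflected, and the function name `linearize` and the tree encoding are hypothetical.

```python
def linearize(node):
    """Flatten a dependence tree into a word sequence.

    A node is either a plain word (str) or a pair (operator, [arguments]).
    The operator is written right after its first argument, so with two
    arguments it sits between them.
    """
    if isinstance(node, str):
        return [node]
    operator, arguments = node
    words = linearize(arguments[0]) + [operator]
    for argument in arguments[1:]:
        words += linearize(argument)
    return words


# Assumed tree for an uninflected version of "I expect John to come".
tree = ("expect", ["I", ("come", ["John"])])
print(" ".join(linearize(tree)))
```

Because each operator is placed next to the arguments it depends on, the nesting of the tree can be read back off the word sequence, which is the sense in which the composition is transparent.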