Language and information

Lecture 4: The Nature of Language

4.1. The structural properties

Today, I want to sum up, very briefly, what we have seen of the structure of language, and take a very brief look at the type of mathematical properties that it has, and at what is universal about language. From that, I want to continue, to see how the structure shows something about developmental stages in language, possibly even current stages. I then want to discuss how these stages of development can be understood as institutionalizations of the use of language, of the use of word combinations—and there's a reason why this seems to be the way to look at it—and, in all, to see that all of this gives a picture of language as an evolving and even a self-organizing system.

Now, first, a very short reminder of what we have seen to be sufficient to describe the structure of language. We deal first of all with simple, non-composite words, whose phonemes are arbitrary with respect to the meanings, arbitrary with respect to the grammar, and whose meaning is not sufficient to explain all of their behavior in the grammar. So we have the words, but that is a far cry from being sufficient for the language. We then saw that there was a dependence of the occurrence of words in sentences on certain other occurrences of words in sentences, and that this dependence was a dependence on the dependence properties of the other words, not on a list of words. This dependence—this dependence on dependence—with certain adjustments partitions the set of sentences in a language, and it partitions the words into what could be called operators and arguments, in a sense that is reminiscent of, though not fully identical with, the use of functors in categorial grammar. We also saw that there were differences in the likelihood of operators in respect to their arguments, and vice versa, and that the low-likelihood and low-information word occurrences, or rather fuzzy domains of these, were reducible in the language, reducible in their phonemes, even to zero, without losing the words involved, or the role that their meanings play in the sentence.

It was also noted—though of course none of this could really be demonstrated—that all events, the entry of words and the imposition of grammatical structurings, meaning these orderings on the words, and their likelihoods, are ordered in the making of a sentence, with the result that constructions are contiguous: the parts of a construction are contiguous to each other, and constructions as wholes are contiguous to other constructions, either as being next to them, or as being nested within them. So very much is known about the structure of sentences from this material.
It was also noted that these constraints suffice to analyze the sentences of the language to any detail desired, and that each step in the making of a sentence, each satisfaction of one of these constraints, makes a fixed contribution, substantive or not, to the information of the sentence as a whole.
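The dependence-on-dependence relation can be sketched in a few lines of code. The word inventory and requirement classes below are illustrative assumptions, a tiny hypothetical lexicon, not a claim about any actual language; the point of the sketch is only that an operator's requirement is stated in terms of what its arguments themselves require, never as a list of particular words.

```python
# A tiny hypothetical lexicon: each word is listed with the dependence
# properties it requires of its arguments. "zero" means an argument that
# itself requires nothing; "op" means an argument that is itself an operator.
REQUIRES = {
    "John": (),                  # zero-level word: requires nothing
    "book": (),                  # zero-level word
    "sleep": ("zero",),          # operator on one zero-level word
    "read": ("zero", "zero"),    # operator on two zero-level words
    "probable": ("op",),         # operator whose argument must be an operator
}

def arg_type(word):
    """A word's own dependence property, as seen by an operator above it."""
    return "zero" if not REQUIRES[word] else "op"

def satisfies(operator, args):
    """Dependence on dependence: the operator is satisfied by the
    dependence properties of its arguments, not by their identities."""
    need = REQUIRES[operator]
    return len(args) == len(need) and all(
        arg_type(a) == t for a, t in zip(args, need))

print(satisfies("read", ["John", "book"]))   # True
print(satisfies("probable", ["sleep"]))      # True: "sleep" is an operator
print(satisfies("probable", ["John"]))       # False: "John" requires nothing
```

Replacing "John" by any other zero-level word leaves every satisfaction unchanged, which is the sense in which the dependence is on dependence properties rather than on word lists.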

Now, this structure has certain mathematical properties. It must first be understood that there are two kinds of applied mathematics. There's the usual kind, which is calculational, as in linear transformations, or the special functions of physics. There is also the finding of mathematical objects in real-world situations. Such a finding is of interest if it is not only a naming of things, but if there is some utility and some coverage to what the mathematical object does. In the case of language, there is very little use of calculation. What there is, is the finding of a mathematical object, the chief one, the basic one, being one that was mentioned I think in the first lecture: that the occurrences of words, or word-sequences if you wish, form a set of arbitrary objects—because it doesn't matter if they are words or not, in a way that I think I said something about—a set which is closed under the dependence-on-dependence relation, with each satisfaction of this relation being a sentence of the language.
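A minimal sketch of this object, under an illustrative four-word lexicon with one-place operators only (all names and classes here are hypothetical): the sentences are exactly the satisfactions of the dependence-on-dependence relation, and the set is closed under further application of the relation. Since the full set is unbounded, the sketch enumerates only up to a given nesting depth.

```python
# Hypothetical lexicon: () marks zero-level words; ("zero",) an operator on
# a zero-level word; ("op",) an operator whose argument is itself an operator.
REQUIRES = {
    "John": (), "Mary": (),
    "sleep": ("zero",),
    "probable": ("op",),
}

def sentence_set(depth):
    """Enumerate satisfactions of the dependence-on-dependence relation
    up to a bounded nesting depth; the actual set is unbounded."""
    zero = [w for w, need in REQUIRES.items() if need == ()]
    on_zero = [w for w, need in REQUIRES.items() if need == ("zero",)]
    on_op = [w for w, need in REQUIRES.items() if need == ("op",)]
    level = [(op, z) for op in on_zero for z in zero]  # elementary sentences
    sentences = list(level)
    for _ in range(depth - 1):      # close under operators on operators
        level = [(op, s) for op in on_op for s in level]
        sentences.extend(level)
    return sentences

for s in sentence_set(2):
    print(s)
# ('sleep', 'John'), ('sleep', 'Mary'),
# ('probable', ('sleep', 'John')), ('probable', ('sleep', 'Mary'))
```

Raising the depth adds further satisfactions such as nested "probable" over "probable", which is the closure property in miniature.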

Starting with this, we can define very many other sets; in fact, virtually all of the sets which are of importance for grammar: the elementary sentences, the unary non-elementary, the binary non-elementary sentences, the base sentences, the reduced sentences—all of these are defined by various mappings, starting from the set which I first mentioned. Also such things as all component sentences of a sentence, or all unambiguous readings of an ambiguous sentence, or, for example, a grammar common to any arbitrary two languages, and things of this sort.
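One such mapping can be made concrete. Representing a sentence as a nested (operator, arguments...) tuple, a format assumed here purely for the sketch, the component sentences of a sentence are just its operator-argument subtrees, each of which is itself a satisfaction of the dependence relation:

```python
def components(sentence):
    """All component sentences of a sentence: the sentence itself plus
    every nested operator-argument structure inside it, each a
    satisfaction of the dependence relation in its own right."""
    found = [sentence]
    for arg in sentence[1:]:
        if isinstance(arg, tuple):   # a nested sentence
            found.extend(components(arg))
    return found

# Roughly: "That John reads the book is probable."
s = ("probable", ("read", "John", "book"))
print(components(s))
# [('probable', ('read', 'John', 'book')), ('read', 'John', 'book')]
```

The other sets mentioned, base sentences, reduced sentences, and so on, would be further mappings of this kind, defined starting from the same closed set.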

Now, I want to say something briefly about what is universal and what is not, because all of this will feed slightly into what we want to use at the end. The universal things in the structure of languages are: the having of phonemic distinctions, not the actual ones but the having of them; the having of words, as phoneme sequences; the having of a linear order on them, and then of the partial ordering, the dependence-on-dependence relation; also, the having of likelihood differences among words, the fact that words do not all have random, or identical, or merely fluctuating likelihoods with respect to the operators on them; and, also, the having of reductions, the fact that reductions exist. These are universal, but all of them are only propensities, things that language can do. Then there are things in which languages are similar to each other to a very large extent. The phonetic types, the types of sounds, are pretty similar, for obvious reasons perhaps; there are very unusual sounds in certain languages, but by and large certain sounds are very common. The approximate number of phonemes does not differ much among languages. The rough meanings of some words are very widely shared. The main classes of words with respect to the dependence-on-dependence relation are pretty much the same among languages. And very many reduction types, actual forms of reductions, are very similar.

The things that are different in languages are the amount of complexity, the amount of grammatical complexity, that is fashioned out of these constraints, which means how the constraints impinge on each other; and where the complexities primarily lie, which also differs heavily. Above all, of course, what phoneme sequences are used for what words, that is, for what meanings; this is extremely different. Also such things as the availability of morphology: some languages have it, some don't. And, also, just what the reductions are, what the amount of reducibility is in a language, and what kinds of categories are created in the grammar, including even meaningful categories like tense, and so forth. These differ heavily.