ZELLIG S. HARRIS

(23 October 1909-22 May 1992)

[Proceedings of the American Philosophical Society, Vol. 138, No. 4, 1994]

Zellig Sabbetai Harris, who died in his sleep on 22 May 1992, after a day of work as usual, was an outstanding linguist of our century. He was born in 1909 in Balta, Ukraine, about 180 miles northwest of Odessa, on the Kodyma River. He was brought to Philadelphia when he was four. He studied and taught at the University of Pennsylvania until his retirement in 1979. After that, he was associated with Columbia University. He was a member of the kibbutz Mishmar Haemek and for years commuted between Israel and Philadelphia. His political opinions were close to socialism and anarchism. They are stated in a book he wrote at the end of his life, which is as yet unpublished. [Harris 1997]

Harris started out as a Semitist. Partly in association with his teacher, James A. Montgomery, but soon very much on his own, he became a prominent contributor to the decipherment and reading of the Ugaritic texts. His 1934 book on Phoenician grammar made a great impression and, incidentally, attracted the methodological (and philological) attention of Edward Sapir. In 1939, the systematic and original study on the development of Canaanite dialects followed. Already manifest in all these works is the peculiarly precise, yet flexible, scholarly style that was to remain his for the rest of his life.

Harris was influenced by Sapir and by Leonard Bloomfield, another great American linguist, although he did not formally study with either of them. Both strove to understand the phenomenon of meaning in language, just as Harris always did. However, they knew that at present there are no scientific ways to examine meaning in all of its social manifestations. But, as Harris repeated after Bloomfield, "it frequently happens that when we do not rest with the explanation that something is due to meaning, we discover that it has a formal regularity or 'explanation.' It may still be 'due to meaning' in one sense, but it accords with a distributional regularity." For Bloomfield, Sapir, and Harris, the primary datum for the scientific study of language is the relative position of the segments of speech utterances.

In 1951, Harris's Methods in Structural Linguistics appeared, a book that made him famous, often for the wrong reasons. It was considered a unified and systematic presentation of the achievements of the Bloomfieldian school to which Harris himself had contributed several papers. The book establishes both a new standard of rigorousness for structural linguistics and a very broad scope for linguistic study. The linguist's job is to study the entire structure of a language as a system. But the results of Harris's book are contrary to some tenets of many followers of Bloomfield. Formal procedures to establish phonemes or morphemes do not lead to a unique result. There may be several ways to assign phonemes to a language, depending for instance on whether we want to minimize their number or minimize their distributional anomalies. Twentieth-century physicists had similarly introduced operational definitions that led to the splitting of traditional concepts according to the method of measurement. Furthermore, Harris argued that the rigidity of linguistic levels, phonology first, then morphology, then syntax, is not defensible; instead of composing morphemes from phonemes one can begin with morphology and then decompose morphemes into sequences of phonemes. A novelty of consequence is long components. In Latin, in the environment fili- bon-, the two positions are both filled by a or both by us. In English we have either She is a stewardess or He is a steward. In the Latin example, a and us are long components; in the English example, she ... ess and he ... Ø are; or, in both languages, the gender suffixes are long components. Generally, long components are the expression of a distribution in which the number of choices is less than the number of overt morphemes. Long components help in the description of government, agreement, and so on.

Reviewers of the book differed on Harris's treatment of meaning. Some criticized him for his de facto use of meaning, others for not using meaning at all. In reality, Harris correlated his formal results with independently known meanings all the time. His theory rigorously limits semantics to a single, controllable set of data that do not say what the meaning of an utterance is, but only whether one utterance is semantically the same as another. He accepts the judgment of native informants as to whether two short utterances are repetitions of one another. Repetition, unlike imitation, allows for the latitude of physical differences that the language ignores. This echoes Sapir's writings about the psychological reality of phonemes. That life contrasts with rife can be learned only from an English-speaking person. Harris was careful in the art of eliciting in linguistics and wrote about it.

Harris wanted to extend distributional methods to the analysis of sentences, sequences of sentences, and kinds of discourse. But there were serious obstacles. The variety of morpheme sequences in a sentence is considerably greater, or less restricted, than that of successions of phonemes in a morpheme. An essential step toward the analysis of sentences was to analyze jointly two sentences with similar but not identical words. If they differ only by morphemes and words that occur all over the language, in every kind of discourse, like suffixes, auxiliary verbs, and articles, and by the order of the remaining words, then one can describe regularities. The man boarded the train is related to The man did not board the train and to Will the man board the train in such a way that when you replace man by electricity, number, speed etc., board by offend, run, divide etc., and train by page, engine, student, then practically any time one of the three utterances is a sentence, so are the remaining two. Words come together unpredictably, but once they do come together they stay so through many changes of the structure of the sentence. Changes that preserve the co-occurrence of words are transformations. To this various conditions are occasionally added.

Transformations, under different names, or without a name, had been used in typical Latin grammars at least from the Renaissance on, e.g., as exercises in changing indicative sentences into interrogative or imperative ones. In Harris's transformational grammar it is the process of deriving a sentence from what he calls the "kernel" (or, in another of his theories, from the "base") that characterizes the sentence. The idea of a transformational grammar appears for the first time near the end of the 1951 book. A presentation of the language structure can use "moving-parts models such as machines or historical sciences. In using such models, the linguistic presentation would speak of base forms, of derived forms, or processes that yield one form out of another. The relation of an element to sequences that contain it becomes the history of the element as it is subjected to various processes and extensions. A relation between two elements is essentially the difference between two historical or otherwise derivational paths." This "presentation" of language structure was to come to the fore in subsequent decades and not only in America.

Harris was always ready to use any explicit technique. Thus, he devised a method to segment an utterance into words (or perhaps into morphemes only) using the sequences of phonemes exclusively. The main idea is simple, based on the ups and downs of the freedom of choice in their sequence. The first phoneme of a word can be nearly any phoneme. The possibility of the second phoneme is limited by the first, e.g., in English, no word starts by repeating the same phoneme, nor does any word start [with] rl. The possibility of the third is further limited by the first two. Where we come to the maximal freedom there is the beginning of a new word. With a few refinements of this procedure, we can obtain the concept of a word without recourse to morphology. When transformations are applied most of the composite words are broken up into simple words, affixes are expanded to free word combinations, suffixes and prepositions are changed and therefore are shown to be the instruments of the transformations. The co-occurrence of words is mostly independent of their morphology. In Harris's 1981 book morphemes are hardly mentioned.
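The segmentation idea described above can be illustrated with a minimal sketch. It rests on several assumptions that are not Harris's: letters stand in for phonemes, the seven-utterance corpus is invented, and a boundary is placed wherever the number of possible continuations is a strict local peak. The refinements mentioned in the text are omitted.

```python
from collections import defaultdict

# Toy corpus (hypothetical); letters stand in for phonemes.
CORPUS = [
    "thedogran", "thedogsat", "thecatran", "thecatsat",
    "thecowate", "adogran", "acatsat",
]

def successor_counts(corpus, utterance):
    """For each non-empty prefix of `utterance`, count the distinct symbols
    that follow that prefix somewhere in the corpus."""
    followers = defaultdict(set)
    for u in corpus:
        for i in range(len(u)):
            followers[u[:i]].add(u[i])
    return [len(followers[utterance[:i]]) for i in range(1, len(utterance) + 1)]

def segment(corpus, utterance):
    """Cut the utterance after every prefix whose successor count is a
    strict local peak (an assumed, simplified boundary criterion)."""
    counts = successor_counts(corpus, utterance)
    cuts = [
        i + 1
        for i in range(1, len(counts) - 1)
        if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]
    ]
    pieces, start = [], 0
    for cut in cuts + [len(utterance)]:
        pieces.append(utterance[start:cut])
        start = cut
    return pieces

print(segment(CORPUS, "thedogran"))  # ['the', 'dog', 'ran']
```

In this toy corpus the freedom of choice rises after the and after dog, because several different continuations are attested there, and falls inside each word.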

Concrete work with grammatical transformations, and the term itself, were introduced by Harris in 1952, satisfying the needs of discourse analysis. When we analyze a text, we consider two elements equivalent if they occur in the same environment within their respective sentences. Using only environments in which elements appear, we miss a lot of information about the text that would be obtained if we changed the form of the sentence and used what can be learned about the word from other fragments of the discourse. For instance, the information in the sentence Peter is a self-taught violinist can be conveyed by simpler, standard sentences. In musical texts often violinist and plays violin occur in the same contexts, just like cellist and plays cello. Generally, if N1 and N2 are nouns and V is an appropriate verb (in our case play), then N1 + ist = V + N1. Now, N2 + A + N1 + ist = N2 + V + N1; N2 is A in V + N1. Put A = self-taught; this adjective is in turn derived from taught himself. Therefore you obtain Peter plays violin; he taught himself to play violin, which is a form easier to compare with other standard sentences.
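The violinist example can be traced step by step in a small sketch. The lexicon IST_NOUNS, the function name expand, and the tuple representation are all hypothetical conveniences; the sketch handles only sentences of the shape "N2 is a (A) N1-ist" and returns structured forms rather than finished English, so that no morphology has to be invented.

```python
# Hypothetical toy lexicon: each -ist noun is paired with (V, N1).
IST_NOUNS = {
    "violinist": ("play", "violin"),
    "cellist": ("play", "cello"),
}

def expand(sentence):
    """Rewrite 'N2 is a [A] N1-ist' as the standard forms
    (N2, V, N1) and, if an adjective A is present, (N2, is, A, in, V, N1).
    Returns None when the sentence does not have the expected shape."""
    words = sentence.rstrip(".").split()
    if len(words) not in (4, 5):
        return None
    if words[1:3] != ["is", "a"] or words[-1] not in IST_NOUNS:
        return None
    subject = words[0]
    verb, noun = IST_NOUNS[words[-1]]
    forms = [(subject, verb, noun)]
    if len(words) == 5:
        adjective = words[3]
        forms.append((subject, "is", adjective, "in", verb, noun))
    return forms

print(expand("Peter is a self-taught violinist"))
```

The further step from self-taught to taught himself would need one more lexical rule of the same kind and is left out here.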

Harris applied discourse analysis to the languages of science. For many years he and his coworkers studied the sublanguage of immunology. It turned out that a change had occurred in the structure of the language of immunological publications within a few years and this change reflected the advancement of knowledge gained in this period. The results appeared in the 590-page book published in 1989.

In 1955 Harris presented a paper at the meeting of the Linguistic Society of America in Chicago. An enlarged version of it was published in Language in 1957. Here transformations are treated as relations that organize the entire language. Transformations are now understood as holding between constructions. A construction is a sequence of classes of morphemes that can itself serve as a member of a major grammatical class. In a proper context AN (adjective, noun) is an N. Grammar often treats a young man as a noun. As another example note that in We expect that Emily will arrive at noon the segment Emily will arrive at noon is S (sentence) and in We expect Emily's arrival at noon the segment Emily's arrival at noon is N. The transformation between these two sentences can be stated in all generality for two constructions. Emily will arrive in a book and Emily's arrival in a book are parts of the same constructions and subject to the same transformation, even if they are bizarre, with fantastic co-occurrences. For it is the relation between sequences of classes of words, between constructions, that is the subject of the debate now. If a sentence has a strange co-occurrence, its transforms will inherit the strangeness. Both Proportionality dissolves liquids and Liquids are dissolved by proportionality are nonsensical. The judgment by native speakers about the normality or strangeness of sentences replaces the attempts to elicit the utterance. This is methodologically more satisfying: first, because eliciting opinion about normality or strangeness is much easier than eliciting the use of a strange sentence, and second, the new criterion broadens the study of our linguistic sensitivity.

There are cases of two constructions where nearly any sentence of the first construction can be transformed into a sentence of the second, but many sentences of the second do not have their corresponding sentences in the first. For instance, consider The glass was broken by carelessness (Carelessness broke the glass is much less usual), or the short passive The surface of the Earth is curved. To deal with such cases as well, Harris accepts non-symmetric transformations. Sometimes there are negative sentences but no affirmative: There was not a stitch on her.

There are sentences that are not a result of applying transformations to any sentence except the identity transformation. They form the kernel of the language. (The conceptual apparatus and terminology are borrowed from abstract algebra.) All other sentences are generated by successive applications of transformations to the sentences of the kernel. The kernel is finite and the number of transformations is finite. The transformations are defined as applying to the constructions of the kernel. Because the results of transformations are constructions that may be sequences of constructions and each construction belongs to a grammatical class, we can iterate transformations. Because transformations do not substantially change the co-occurrence of words, for any sentence the co-occurrence of words is decided by the finite kernel. Only the unlimited applicability of transformations is the source of the infinity of the set of sentences. Note also that in this theory the judgment of sentencehood is replaced by the judgment of relative acceptability. And it may be supposed that there are rather few judgments of comparative acceptability of sentences: better, much better, slightly better, about the same, equally good, and perhaps a few more. The only kinds of evidence from informants that Harris was then using were contrast, as in the 1951 book, and relative acceptability.
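How a finite kernel and a finite set of transformations yield an unbounded set of sentences can be shown in a short sketch. The two kernel sentences are taken from examples in this text; the two embedding functions are hypothetical stand-ins chosen only because they can be iterated, and are not offered as Harris's own inventory of transformations.

```python
# Finite kernel (sentences borrowed from the examples in the text).
KERNEL = ["the man boarded the train", "Emily arrived"]

# Two toy, iterable transformations (assumed for illustration only).
def expect(sentence):
    return "we expect that " + sentence

def said(sentence):
    return "it is said that " + sentence

TRANSFORMS = [expect, said]

def generate(kernel, transforms, depth):
    """Return every sentence reachable from the kernel by at most
    `depth` successive applications of the transformations."""
    sentences = set(kernel)
    frontier = set(kernel)
    for _ in range(depth):
        frontier = {t(s) for s in frontier for t in transforms} - sentences
        sentences |= frontier
    return sentences

for d in range(4):
    print(d, len(generate(KERNEL, TRANSFORMS, d)))
```

The count grows without limit as the depth increases, while every generated sentence still contains one of the kernel sentences unchanged, so the co-occurrence of its words is fixed by the kernel.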

This transformational theory was developed in more detail in the book Mathematical Structures of Language, 1968. There, Harris presents an important alternative to the judgment of comparative acceptability. If we feel uncomfortable with a sentence, if we think that it is strange, but still English, we can usually find a context that is normal, in which this sentence occurs naturally. If the sentence Emily arrived in a book is preceded by A rowing club introduced experimentally three new types of boats. They named the types 'a library,' 'a shelf', and 'a book,' reflecting the sizes of the boats, then it is quite normal. If it were objected that this is an ad hoc enlargement of our language, it can be said that such enlargements are a daily practice. We can also devise another frame: For a long time the public did not know much about Emily. To the public attention Emily arrived in a book of 1901 by Tom Thomas. Therefore, instead of the grading of acceptability, we may say that a given sentence is "out of context" and the informant can provide a suitable context for it. There are borderline cases where it is not clear whether a proper context can be found; the set of sentences is fuzzy. Although meaning does not play a role in establishing a transformation, the effect is that many transforms are informationally paraphrases of the original sentences, or else, the given transformation changes the information in a manner typical of it, like an assertion and a question. In this theory a transformation works on a construction (or on two constructions in cases of conjunction).

In 1969 Harris changed his course; he added some techniques, thereby deepening the analysis of language. It must be understood that he was always prepared to use new techniques for the same purpose: to find formal properties of language, in particular properties that correspond to meaning or information. The new theory does not invalidate the previous ones; it only presents another aspect of the problem. The study of the co-occurrence of words in a sentence is reasonable only if some structure has already been assigned to the sentence. Several structures have been proposed for this purpose: constituent structure, phrase structure, string structure. All of these may seem arbitrary. In the new framework Harris uses the operator-argument relation which is much less so. Some words require other words within the same sentence. The word arrive requires, first, something like Peter, crowd, or storm and, second, something like here, station, conclusion. The class of ordered pairs of words that are required by arrive is large but specific to the operator arrive. We may have The book (Peter, crowd, court) arrived at the opinion (conclusion, verdict), but not The storm arrived at the verdict nor The conclusion arrived at the storm. The members of the pairs are arguments of arrive and arrive is the operator on them. Harris's earlier theory divides the set of all sentences into the kernel, and the rest. His new theory divides a language differently, into reports that constitute the base of the language and the sentences that are reductions of combinations of sentences of the base. All that we say can be said by sentences of the base and unambiguously, though perhaps not naturally. Harris had always been preoccupied with the informational content of speech. Discourse analysis was one way of dealing with it. Now, he localized the information of any sentence in its projection into the base. 
The projection may contain one or more sentences, usually connected by some references between them. For instance The room has indirect western light is derivable from the base sentences The room has something; said something is light; said light is western; said western light is indirect. In the last step of this derivation we used western light, which is a transformational reduction of the preceding light is western. The word said indicates that the light spoken about now is the same as in the previous sentence, that the two occurrences of the word light refer to the same light. We speak here about sentences and occurrences of words. Logicians, like Tarski, insisted that about segments of a language one must speak in a metalanguage. Both Tarski and Harris knew that a natural language contains its own metalanguage; about English we can speak in English. Tarski thought that this leads to an antinomy of the liar type. Harris found the source of such antinomies in an application of a referential to a segment of which it is a part, the result of which is simply not English. Both Tarski and Harris knew that the concept of a sentence in a natural language is not sharp. But Harris pointed out that most scientific concepts are unsharp.

Any sentence of the base is a report. It is a statement in the indicative mood. Harris took the word report from Bloomfield, who used it as a translation of the German Protokollsatz, a term used by philosophers of the Vienna Circle, mainly Carnap and Neurath, for descriptions of supposed sensory perceptions by the person who has experienced them. It is not observed and not testable by others. Neurath argued that it is difficult to decide which statements are phenomenally primitive, and he questioned the supposed non-testability of some sentences. Bloomfield changed it to a sociological statement. Everybody who understands the report statement will react to it in approximately the same way. If people react in two different ways, the statement is ambiguous. The report is about the receivers, not about the speaker of the report, in harmony with the study of the listener as the main preoccupation of structural linguists. In Harris, however, only the form remains. A report is a statement of the form I say that x where x is the result of successive application of operators to their arguments without any reductions. The limitation on the co-occurrence of words holds only within the operator-argument relation in a report. What matters is only what arguments an operator takes and what operates on it in turn. To show the operators and their respective arguments it is necessary to use wording like that in the above example with western light and, often, more complicated cross references. To use I say at the beginning helps to interpret the tenses and persons in the rest of the statement. After I say there will appear a number of sentences, linked by anaphoras referring to specific occurrences of words. I say is a performative. In place of it some other performative may occur: I deny, I command, I wonder whether or not, giving rise, after proper reduction, to negation, imperative, and interrogative. 
Harris makes an effort to expose all reductions that are habitual and pass unnoticed and to return the content of whatever we are saying to the pure report form. Of course, stylistic color is lost and the statement becomes unwieldy. The informational content remains, however. Transformations are reductions from the report statements to more natural speech. They are mainly eliminations of repetitive material, often by pronouns (i.e., pronouning), omissions of words that are easy to guess, changes of word order, replacement of affixes by corresponding free forms. The rules of applying transformations are restricted. In this theory they are all, so to speak, one-way paraphrases. They say what the report said, but they often add ambiguity of the type Felice met Ann; the last mentioned is nice → Felice met Ann. She is nice, where the sharpness of the anaphora is lost. Thus, the role of transformations is different in this theory from what it is in Harris's previous theory. Reductions do not create new information. But they introduce and thereby explain the ambiguities present in language.

One of the pronounings is particularly important and leads to the formation of relative clauses. Any sentence can be interrupted by another sentence, by an exclamation, etc. Emily—I have to catch a train in 20 minutes; my story will be short—went to Paris. Here the interruption does not lead to any specific reduction. The interruption intonation, which is hinted at by the hyphens, is not zeroed in English except when the first word of the intruder sentence is the same as the last word before the interruption in the host sentence. In that case the first word of the intruder is changed to who, which, when, where etc., whichever is appropriate, and the interruption intonation is dropped. Emily—Emily you met yesterday—went to Paris —> Emily whom you met yesterday went to Paris.
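The reduction that forms relative clauses can be written out as a small rule. This is only a sketch under stated assumptions: the function name relativize is hypothetical, the caller supplies the appropriate wh-word, sentences are plain strings, and a double hyphen stands for the interruption intonation.

```python
def relativize(before, intruder, after, wh="who"):
    """Combine a host sentence (split into `before` and `after`) with an
    interrupting sentence. If the intruder's first word repeats the last
    word before the interruption, replace it by `wh` and drop the
    interruption marks; otherwise keep the interruption as it is."""
    host_words = before.split()
    intruder_words = intruder.split()
    if host_words and intruder_words and host_words[-1] == intruder_words[0]:
        return " ".join(host_words + [wh] + intruder_words[1:] + after.split())
    return f"{before} -- {intruder} -- {after}"

print(relativize("Emily", "Emily you met yesterday", "went to Paris", wh="whom"))
# Emily whom you met yesterday went to Paris
print(relativize("Emily", "my story will be short", "went to Paris"))
# Emily -- my story will be short -- went to Paris
```

Choosing among who, which, when, and where is left to the caller, since the text says only that the appropriate word is selected.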

Harris's grammar based on the operator-argument relation combines the distributional trend with the tradition of theories of semantic categories started by Leśniewski, a contemporary of Sapir, and continued today by logicians and linguists working in categorial grammars. Harris was conservative in accepting the main grammatical categories like sentence, verb, noun, adjective, adverb. He considered them well tested for Indo-European and Semitic languages by the history of grammatical research of many centuries. But his subdivisions of the categories are new and in harmony with his theory. Only some nouns may appear in the base as arguments, namely those that do not themselves call for arguments. Some may occur in the base without an article, like proper names, something, somebody, silk, fish, here. Some others must take the article a (a boy). Still others can appear with a or without it. (Beside Mary visited a college is My son goes to college.) The forms with the article the do not appear in the base; they are all derived by transformations. In the base the requirement of any noun for other words is null. Operators are predicates in a broad sense and therefore are sentence-forming. Some are of one nominal argument, On (e.g., sleeps). Some are of one operator argument, Oo (e.g., yesterday; an operator that occurs as an argument is taken together with all its own arguments). Others are of more arguments, which in turn are o or n. Transitive verbs are Onn. Teach is Onno, where the last o, in our previous example, is same plays violin. By transformations, same plays violin must be changed to a form of an argument; the subject and -s must be changed to zeros and to must be added as an indicator showing the shortened sentence (play violin) to be an argument. Not more than four arguments occur under English operators. This parsimony in kinds of operators obtains for the base, i.e., for the report sublanguage. 
In the rest of the language there are other methods of imposing and changing the grammatical structure. For example, Emily arrived can be nominalized into Emily's arrival. This non-contiguous -'s..-al is an argument indicator. It changes a segment of one category into a segment of another category. Person and tense suffixes (which are, like -'s..-al, long components from Harris's 1951 book) constitute an operator indicator and it also changes the category of segments. The majority of suffixations have similar effects. This split of semantic categories into operators and arguments on the one hand and their indicators on the other shows the generality and abstractness of the base. By exhibiting the categories of operators and the categories of their arguments in a canonical order we give enough hints for a reconstruction of the informational content of the message. Some languages will be more alike to each other in the base than on the transformed level. When reductive transformations break the rigid structure of a report, the indicators help us to see the derived grammatical relations. Indicators do not have informational content. They are organizational only.
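The argument requirements of base operators can be checked mechanically, as the following sketch suggests. The categories are written here as N, On, Oo, Onn, and Onno; the tiny lexicon, the nested-tuple representation, and the function name kind are assumptions made for illustration and cover only a handful of the words discussed in this text.

```python
# Hypothetical toy lexicon: N needs no arguments; the letters after "O"
# list the required arguments (n = noun, o = operator with its arguments).
LEXICON = {
    "Peter": "N",
    "violin": "N",
    "sleeps": "On",
    "plays": "Onn",
    "yesterday": "Oo",
    "teach": "Onno",
}

def kind(expr):
    """Return 'n' for a bare noun, 'o' for a well-formed operator
    application, and None if the argument requirement is violated.
    An expression is a word or a tuple (operator, arg1, arg2, ...)."""
    if isinstance(expr, str):
        return "n" if LEXICON.get(expr) == "N" else None
    operator, *args = expr
    category = LEXICON.get(operator, "")
    if not category.startswith("O") or len(category) - 1 != len(args):
        return None
    if all(kind(a) == want for a, want in zip(args, category[1:])):
        return "o"
    return None

print(kind(("yesterday", ("plays", "Peter", "violin"))))                # o
print(kind(("teach", "Peter", "Peter", ("plays", "Peter", "violin"))))  # o
print(kind(("sleeps", ("sleeps", "Peter"))))                            # None
```

The second call corresponds to the unreduced base form behind Peter taught himself to play violin; the reductions that produce the natural sentence are outside this sketch.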

Harris published his capolavoro, A Grammar of English on Mathematical Principles, in 1982. This comprehensive and subtle study of the syntax and weak semantics of English uses the report-paraphrase and operator-argument approach. At the same time it is the most detailed presentation of this theory. As always, Harris tries to write a grammar of the entire language, not of an aspect or fragment of it. Does he succeed? Of course, not quite. He was aware that the book fails to take into account that the indicators form what are called "paradigms"; also, that it does not sufficiently deal with intonation.

In his last years, Harris published two books, one small (1988), one extensive (1991), of reflections on the variety of approaches he had been taking toward language. These books are written in a clear, simple manner, almost without formulas. They must be read with the previous writings in hand, and our understanding of the previous writings should be enriched by these two books. Here the problem of meaning looms large. On the one hand Bloomfieldian social meaning is intuitively known to everybody. On the other hand, the meaning of a word is defined by Harris as its redundancy on redundancy in the information theoretical sense. By this Harris means the limitations of the class of admissible sequences of the word's arguments and further the limitations of the operators that take this word as their argument in turn. These two books also contain Harris's views on the development of language. Some are speculations, while others are sharp observations like "discreteness and repeatability of the elements (which reduces error compounding in transmission), and the lack of grammatical devices for direct expression of feeling, suggests that language developed primarily in the transmission of information within a public, rather than for personal or interpersonal use."

Harris never entered the raging debates about the goals of linguistic inquiry or about the nature of language capabilities. Not that he did not have opinions about those matters ("None of the mental capacities are solely linguistic"), but he favored plurality of possible approaches. He once wrote, "The pitting of one linguistic tool against another has in it something of the absolutist postwar temper of social institutions, but it is not required by the character and range of these tools of analysis." Once, it was said in his presence that a linguist had written something that he would not like. He cut it short: "Let us do our work."

A comprehensive bibliography of Harris's writings was compiled and annotated by E. F. Konrad Koerner (Historiographia Linguistica 20, 1993). In writing this text I received substantial help from Henry Hoenigswald, for which I am deeply grateful.

ELECTED 1962

HENRY HIŻ
Professor of Linguistics and Philosophy Emeritus
University of Pennsylvania