Language Evolution: Word of the Month: Proto-Indo-European ‘Four’

17 September 2014

Word of the Month: Proto-Indo-European ‘Four’

As promised in a comment to my previous blog post, I’m going to discuss an etymological question: the origin and structure of the numeral ‘4’ in the Indo-European languages.

The Proto-Indo-European numeral ‘four’ had several intriguing properties. It was the largest non-complex cardinal number that agreed grammatically with a noun it modified. Consequently, it was inflected for gender and case, like any ordinary adjective. It shared that property with the words for ‘one’, ‘two’ and ‘three’. For obvious semantic reasons, their declension was defective: ‘one’ was normally singular, ‘two’ was declined only in the dual number, and ‘three’ and ‘four’ only in the plural.

The fourth is for luck.

The basic forms of the numeral ‘4’ (as reconstructed in handbooks) were the animate “count plural” *kʷetwores and the inanimate (neuter) “collective plural” *kʷetwōr (from earlier *kʷetwor-h₂). There is some uncertainty about the accentuation of these forms: some reconstruct them with PIE stress on the first syllable, others on the second (the comparative evidence is not unambiguous).

Proto-Indo-European probably had no feminine gender as a formal category, but it had ways to express femininity in derivatives. Curiously, the numerals ‘three’ and ‘four’ seem to have had feminine forms, preserved only in Celtic and Indo-Iranian. They are reconstructed as *tisres ‘3’ and *kʷetesres ‘4’. The final *-es is the familiar nom.pl. ending of animate stems ending in a consonant, but the rest looks baffling. The suffix *-sr-, known also from the Anatolian languages, where it forms nouns denoting human females, probably reflects an archaic, almost completely abandoned word for ‘woman’ (*ser-), although the zero grade (absence of a vowel) in the nom.pl. is aberrant; the initial part (*ti-, *kʷete-) looks in either case like the badly mangled residue of an actual numeral stem. Given the normal rules of IE word-formation, we would expect something like *trí-sor-es and *kʷétwr̥-sor-es. The characteristic “defects” of the attested forms are nevertheless shared between Celtic and Indo-Iranian; they must therefore go back at least to their most recent common ancestor. Such distortions are not quite unexpected in compound words, which commonly lose their transparency through irregular simplification.

Let’s ask a stupid question: what is *kʷetwores/*kʷetwōr the plural of? I mean, if it’s really an adjective, perhaps it had an older “etymological” meaning before it became part of the numeral system? If we strip off the inflections, what remains is the stem *kʷetwor-/*kʷetwr- (the second vowel is lost in so-called “weak” case-forms like loc.pl. *kʷetwr̥sú). This “bare” stem also occurs as a compositional variant of ‘four’, sometimes with the final segments reversed (*kʷetwr̥- ~ *kʷetru-).

An Indo-European stem with four consonants and two vowel slots must have been morphologically complex at some point. The most likely division into morphemes would be *kʷet-w(o)r-. The *-w(o)r- part looks familiar. A suffix of this form is found in a number of Indo-European nouns, typically inanimate abstracts derived from verb roots. We also find it e.g. in the PIE word for ‘fire’, *páh₂wr̥, which is not obviously deverbal (though a connection with *pah₂- ‘guard’ is conceivable). We also have at least one evidently archaic example of an adjective built in the same manner. Beside the inanimate noun *p(e)iH-wr̥ ‘fat’ (Greek pĩar) we find an adjective meaning ‘fat, fertile’ whose masculine form was *p(e)iH-won-; its neuter must have been originally identical with the noun, and a suffixed feminine *piH-wer-ih₂ was added to the paradigm as the IE gender system developed a three-way contrast (I use the cover symbol *H here for a laryngeal whose “index” is hard to determine). Note the consonant alternation in the suffix: it’s characteristic of an entire class of neuters, so-called r/n-stems. They show *-r in the nom./acc. singular and collective (e.g. *páh₂-wōr, the collective of the ‘fire’ word), but *-n- in the remaining cases (like the gen.sg. *ph₂-wén-s). The variant *-n- is also expected in related animate forms, with the strange exception of *-r- occurring before the femininising suffix *-ih₂, as illustrated by the preserved forms of the adjective ‘fat’. The striking agreement between Greek píōn (m.), píeira (f.) and Vedic pī́van- (m.), pī́varī (f.) shows that this unusual alternation is inherited.

To continue our Gedankenexperiment: so far we haven’t identified the underlying root *kʷet-. Still, if we tentatively assume that it was indeed a verb root, some predictions can be made: beside the hypothetical abstract noun *kʷét-wr̥, possible derivatives include an adjective of exactly the same form in the inanimate gender. Its expected animate form would be *kʷét-won- (nom.sg. *kʷétwō, nom.pl. *kʷétwones). The neuter noun/adjective would form the collective plural *kʷétwōr. Of these forms, two can be regarded as attested: *kʷétwōr is a possible reconstruction of the neuter numeral, and *kʷétwr̥ is its uninflected compositional variant. Conspicuous by their absence are any forms with *n instead of *r. Why, for example, is the animate (masculine) plural *kʷetwores rather than *kʷetwones? The most natural explanation is that this particular plural isn’t old enough to participate in the *-n/r- alternation.
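
For readers who like to see the bookkeeping spelled out, the predictions above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name predicted_forms and the dictionary labels are hypothetical, the forms are built by plain string concatenation, and nothing here models accent shifts or ablaut in the root.

```python
# Illustrative sketch only: a hypothetical helper, not a model of PIE morphology.
# It attaches the r/n-alternating suffix variants described in the post to a root.

def predicted_forms(root: str) -> dict[str, str]:
    """Return the forms expected if `root` took the *-wr̥ / *-won- suffix."""
    return {
        "neuter noun/adjective": f"*{root}-wr̥",
        "neuter collective plural": f"*{root}-wōr",
        "animate stem": f"*{root}-won-",
        "animate nom.sg.": f"*{root}-wō",
        "animate nom.pl.": f"*{root}-won-es",
    }

# The root *kʷet- is the hypothetical verb root assumed in the text.
for label, form in predicted_forms("kʷét").items():
    print(f"{label}: {form}")
```

Only the first two entries correspond to attested shapes of ‘four’ (the compositional variant and the neuter numeral); the three forms with *n are exactly the ones that are missing.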

Let’s imagine that *kʷétwr̥ was originally a neuter noun (without an accompanying adjective). Whatever its etymological meaning (let’s symbolise it ‘X’), the collective plural *kʷétwōr (meaning ‘a set of instances of X’) came to be employed as a cardinal number, at first uninflected (like ‘five’, ‘six’, etc.), but eventually attracted into the adjective system, presumably on the analogy of the already adjectival numerals ‘two’ and ‘three’. In the early history of Indo-European the accent was often shifted to the second syllable in such collectives; hence the by-form *kʷ(e)twṓr, in which the first vowel could be phonetically reduced (*kʷətwṓr) or lost altogether. Non-initial stress is reflected in Germanic (cf. Gothic fidwor, displaying the voicing effect of Verner’s Law), and vowel reduction accounts for Latin quattuor (with Lat. /a/ from *ə).

When *kʷétwōr ~ *kʷ(e)twṓr came to be interpreted (and declined) as a neuter plural adjective, an animate counterpart was analogically supplied by adding appropriate inflectional endings to the stem *kʷétwor- or *kʷ(e)twór-. Since its origin as an n/r-noun had been forgotten by that time, PIE-speakers had no reason to make their life more difficult by reviving an ancient alternation. The only case-forms requiring distinctly animate inflections (different from neuter ones) were the nom.pl. (*-es) and acc.pl. (*-n̥s from earlier *-m̥-s). The unsettled stress pattern (*kʷétwores ~ *kʷ(e)twóres) may well be an old feature of the numeral ‘four’.

Some details require more attention, but first I would like to address the question left unanswered above: what exactly was *kʷet-, the root supposedly underlying the derivation of the numeral ‘four’? I will try to suggest an answer in the next post (later this week, I hope), so please stay tuned.


44 comments:

David Marjanović, 17 September 2014 at 03:22
"(Greek pĩar, Vedic"

I think you made a typo in an <i> tag...

"and vowel reduction accounts for Latin quattuor (with Lat. /a/ from *ə)."

Intriguing; Latin has a lot of /a/ that seems to come out of nowhere, to the point that it's been called "unreliable" for drawing conclusions about PIE vowels. Can the vowel reduction, which presumably came with some shortening, account for the otherwise unexpected /tː/ by compensatory lengthening?
Unknown, 19 September 2014 at 06:24
Even if reduced grade were acceptable, there would be no basis to expect it in Latin 'four'. Oscan _petora_ 'four' (Fest.) and _petiru/o-pert_ 'four times' (Tab. Bant. 14/15), with Umbrian _peturpursus_ 'for quadrupeds' (Tab. Iguv. 6B:11), show that P-Italic inherited /e/-grade in the first syllable, and it makes sense that Q-Italic did also.

The Latin /a/-vocalism must thus be due to contamination. The combining form _quadri/u-_ 'four-' also has -dr- which cannot be derived by verifiable soundlaws from any reasonable IE form of 'four-', so this cluster must also come from the contaminant. The adjective _quadrus_ 'square' could have been extracted from an obsolete noun *quadrum 'whetstone' which was interpreted as a substantivized adjective 'square (stone)', neuter after _saxum_, and later replaced by _co:s_.

An IE root *k^weh1d- 'to wear down, abrade, sharpen by abrasion' accounts for this in addition to a group of Germanic words. Normal grade appears in ON _hváta_ 'to break through', /o/-grade in Go. _hwo:ta_ 'threat', and zero grade in OE _hwæt_ 'swift, brave', _hwæss_ 'sharp', _hwettan_ 'to sharpen, whet, incite', as well as Lat. *quadrum < *k^wh1d-róm. (Pokorny, IEW 636, lists the Gmc. words but wrongly includes Lat. _triquetrus_ instead of _quadrus_.)

The inroad for contamination was probably 'forty'. The inherited tongue-twister *quetvora:ginta: was one syllable longer than _quadra:ginta:_, this originally a colloquial substitute 'square decades', i.e. 'decades on the points of a square'. Once this was established as 'four decades', a new combining form _quadri/u-_ was able to oust inherited *quetur- (= Umb. _petur-_, Skt. _catur-_), whose form was peculiar. New compounds along with _quadrus_ and _quadra:ginta:_ acquired enough collective strength to impose _qua_-anlaut on *queturs 'four times' (hence *quatur(s) > _quater_), *quetvo(:)r 'four', and even *quo:rtus 'fourth' (cf. Praenestine QVORTA, woman's name), turning it into _qua:rtus_.

Sihler's explanation of -tt- in _quattuor_ (NCG §185.4) is laced with misinformation. It obviously has nothing to do with the gemination in _acqua_, condemned by Probus, which did not spread beyond northern Italy. Spanish _agua_ has no underlying geminate, but _cuatro_ does, since Sp. _piedra_ continues Lat. _petra_. And Lat. _mortuus_ (along with _perspicuus_ and the like) shows generalization of the post-heavy Sievers variant *-uwo- for simple *-wo-; it continues OL *mortuvos, not *mortvos (cf. Venetic _murtuvoi_ 'to the dead (man)'). (Likewise disyllabic Lat. _-ius_ continues post-heavy *-ijo- for simple *-jo-.) Lat. _bat(t)vere_ (incorrectly written _bat(t)uere_) is not comparable to _quattuor_, for Friulian has _bataye_ 'battle' from _battva:lia_, but _kutuardis_ 'fourteen' from _quattuordecim_. That is, pretonic -ttv- and -ttu- were kept distinct.

Although Romance requires -tt-, QVATVOR does occur in inscriptions and manuscripts, as well as Medieval Latin (e.g. _quatuor socij_ in the Cuckoo Song instructions). This suggests an external source for the -tt-, such as crossing with Oscan _pettiur_, whose meaning has been disputed. Oscan -iu- for earlier post-dental *-u- is found in _tiurrí_, _eítiuvam_, etc., apparently the rising diphthong [ju] (Buck, OUG §56). And gemination occurred before [j], as in _Dekkieis_, gen. of _Dekis_ 'Decius' (ib. §162). Thus Osc. _pettiur_ could regularly continue earlier *petur, hypostatized from the combining form (Umb. _petur-_, Skt. _catur-_). Native speakers of Oscan (or closely related Sabine) who learned Latin might have had a hard time losing the geminate in 'four', thus making _quattuor_ out of _quatuor_. This is no more outlandish than the attested replacement of _poplicus_ by _publicus_, the latter with Sabine (or Oscan) phonetics, which left _populus_ unaffected.
Piotr Gąsiorowski, 19 September 2014 at 08:23
"Even if reduced grade were acceptable, there would be no basis to expect it in Latin 'four'. Oscan _petora_ 'four' (Fest.) and _petiru/o-pert_ 'four times' (Tab. Bant. 14/15), with Umbrian _peturpursus_ 'for quadrupeds' (Tab. Iguv. 6B:11), show that P-Italic inherited /e/-grade in the first syllable, and it makes sense that Q-Italic did also."

We do not know which forms survived into Proto-Italic, but there's no reason why they should have had the same grade in the first syllable. If the numeral was still declinable, it may have had reflexes of such forms as the "mainstream" *kʷétwores with *e beside a restressed collective *kʷtwṓr; and either of them could have had weak cases of the amphikinetic type like *kʷtwr̥-bʰís, ending up with Italic *a as a remodelled zero-grade. Any levelling in favour of either vowel would have taken place independently in Latin and Sabellic.

"Sihler's explanation of -tt- in _quattuor_ (NCG §185.4) is laced with misinformation. It obviously has nothing to do with the gemination in _acqua_, condemned by Probus, which did not spread beyond northern Italy. Spanish _agua_ has no underlying geminate, but _cuatro_ does"

Sihler nowhere claims that they represent the same change (or that acqua underlies the modern Romance forms, or that the gemination in that word was regular or widespread). He merely adduces inscriptional examples of a tendency to lengthen obstruents in a similar context (which operated also independently in West Germanic and Sanskrit, among others).

"Although Romance requires -tt-, QVATVOR does occur in inscriptions and manuscripts, as well as Medieval Latin (e.g. _quatuor socij_ in the Cuckoo Song instructions). This suggests an external source for the -tt-..."

Whether we are dealing with an imperfect sound change leaving behind synchronic variation in Latin, or with external influence, the /tt/ is not particularly mysterious or isolated, which is the whole point here.
David Marjanović, 19 September 2014 at 13:53
Very interesting! :-) So there's a chance that modern Italian (pubblico, acqua...) continues an Oscan sound change?
Patrick, 4 February 2015 at 17:36
"The most natural explanation is that this particular plural isn’t old enough to participate in the *-n/r- alternation."

Could this be evidence that the lemma replaced an older *h3ekt- "four", now only preserved in the dual *h3ekt-eh3/1 "eight" and Avestan asti "length of four fingers" (< PIE *h3ekt- + Indo-Iranian length stem *-ti)?
David Marjanović, 3 April 2015 at 17:34
Oh, sorry, not "five", but "fist"! And only (North)East Caucasian, it hasn't been found in the West. Here's the Starling entry. "Notes: Reconstructed for the PEC level. Correspondences are regular (one of the roots with the relatively rare phoneme *f)."

Judging from a paper where I found it, it's entry number 428 in Starostin & Nikolayev's (1994) North Caucasian Etymological Dictionary, which I don't have (except for the preface which doesn't mention it). Starostin (1988) cited it on p. 119 as Proto-East Caucasian *X̄wink'wV (for which he listed plausible reflexes in 6 languages "and others"), where "X" probably means [χ].

"North Caucasian languages are some of the least likely languages to loan material into PIE"

Why do you think so? There are plenty of similarities that have been considered evidence for contact (or even, by a few people, the idea that IE and NWC are sister-groups). Geographically it makes sense; that there are similarities which require some explanation other than chance is not controversial as far as I know.
OsoDanes, 2 September 2019 at 13:43
The Semitic numerals are unlikely to have transferred directly from Proto-Semitic into Proto-Indo-European per se. I tentatively posit an intermediary language (or more!) in the Balkans, recognizing that much of early PIE contact with the Neolithic occurred north of the Black Sea. Alternative roads to the Semitic Middle East are the linguistically packed Caucasus region (but the numerals are similar in Kartvelian!) and the linguistic void that we ultimately know as BMAC east of the Caspian Sea. But considering the importance of agriculture in the Balkan region and the sustained border between PIE and the Cucuteni-Tripolye culture in modern Ukraine, my bets are on the western route. I'm still at a loss as to whether the words reached PIE in a linguistic vehicle related to Semitic, or through an unrelated intermediary.
Z4chst3r, 16 January 2020 at 14:36
One needs to look further back than PIE. There is an affinity between PIE *kʷetwor- and Afro-Asiatic, for example Shawiya d'rbu, and even with Pre-Dravidian languages, e.g. Meyu, which has yerrabula, going back to *kutyarra-pula. In the latter, *kutyarra actually means two and -pula is a dual ending: 4 is 2 doubled/folded. It's possible that the PIE and Afro-Asiatic words for 4 go back to something similar; a possible survival of a word like -pula, meaning folding or double, is seen in the -bu part of d'rbu and, say, -ba in Hebrew arba. If 4 was originally something like *kʷetworepola, then *kʷetwore- would similarly have meant 2 originally, say derived from a conjunction *kʷe ("and") + comparative *twor- ("more"), and the word for 4 would have meant 2 doubled. But then the word for 2 was shortened to just *twor-, which allowed the word for 4 to lose the *-pola ending, with the result that just *kʷetwore- came to mean 4.
