In this post, I shall lay out my thinking about the origins of the
Indo-European family.
Most Indo-Europeanists adhere to some form of the steppe hypothesis
about the origin of IE; so do I. About a century ago, the Dutch linguist
C. C. Uhlenbeck conjectured that Proto-Indo-European (PIE) was a
language related to Uralic influenced by a substratum related to
Caucasian languages. Recent genetic studies have come up with data that
agree well with that. Apparently, the Yamnaya people who probably spoke
PIE emerged from a mixture of two populations, the "Eastern European
Hunter-Gatherers" (EHG) and the "Caucasus Hunter-Gatherers" (CHG)
originating from south of the Caucasus, about 5000 BC (though at that
time, the CHG probably already were farmers). The Proto-Uralic speakers
probably also were EHG, while the people of the Maikop culture in the
northern foothills of the Caucasus, the southern neighbours of the
Yamnaya, apparently were unmixed CHG. While some CHG moved north across (or
around) the Caucasus, other CHG moved west into Anatolia.
However, some geneticists (not Indo-Europeanists!) have interpreted
these results such that "Proto-Indo-Anatolian" was spoken by the CHG
before these migrations, and Anatolian was carried into Anatolia by the
westward-moving CHG, and PIE proper emerged on the steppe from the
language of the CHG moving north. I think, though, that this does not
agree well with the linguistic facts. There are undeniable
morphological matches between IE and Uralic, which would be hard to
explain if the two language families originated in such different
regions, and these morphemes are also found in Anatolian, of course.
Also, while Anatolian appears to have branched off early (or rather,
emerged from an archaic, probably geographically marginal, dialect of
PIE), it seems unlikely that it branched off 2,000 or more years before
the dissolution of "PIE proper", as there do not seem to have been many
phonological changes between pre-Anatolian and post-Anatolian PIE.
So I would say that the CHG brought in a language related to Northwest
Caucasian (NWC); the Maikop culture may have spoken Proto-NWC. Pre-PIE
was thus a Para-Uralic language on a Para-NWC substratum, which accounts
for the typological differences between PIE and Proto-Uralic; in many of
these points, PIE is quite like NWC. The Russian linguist Viacheslav
Chirikba reconstructs Proto-NWC with 100 consonant phonemes, which is
quite a lot, but most of them are just palatalized or labialized (and
also palatalized AND labialized) variants of other phonemes. These
secondary articulations cannot be separated from the impoverished vowel
inventory, which consists of only two vowel phonemes - a close one and
an open one. What apparently happened was that vowel features such as
[+front] and [+round] were transferred from the vowels to the
consonants. If one removes these secondary articulations from the
inventory, one gets a still large but not huge inventory of 39
consonant phonemes:
*p *t *ts *tɬ *tʃ *k *q
*b *d *dz *dɮ *dʒ *g *ɢ
*p' *t' *ts' *tɬ' *tʃ' *k' *q'
*f *s *ɬ *ʃ *x *χ *ħ
*z *ʒ *ɣ *ʁ *ʕ
*m *n
*w *r *l *j
This may have been the consonant inventory of the Pre-Proto-NWC language
spoken by the CHG prior to their northward migration. Similar
inventories are found in the (unrelated) Kartvelian languages, as well
as in the NEC languages which are considered related to NWC by some, but
that is very uncertain. The language carried west by CHG into Anatolia
would have been an ancestor of Hattic, a poorly attested non-IE language
of Anatolia, which some linguists assume to be related to NWC. The
Hattic phonology is not fully understood, as the cuneiform spelling does
not indicate distinctions alien to Hittite, the language of the scribes.
However, there are vacillations that point to phonemes unknown to
Hittite, such as frequent vacillations between <p> and <w>, which point
to a phoneme */f/, and less commonly between <t> and <l> (as in the
royal title _Tabarna/Labarna_) which can be interpreted as a phoneme
*/tɬ/. If you remove the major distinctions that are foreign to Hittite,
such as the distinction between voiceless, voiced and ejective
consonants, between alveolar and postalveolar sibilants, or between
velars and uvulars, the Pre-Proto-NWC inventory given above shrinks to:
*p *t *ts *tɬ *k
*f *s *ɬ *x
*m *n
*w *r *l *j
which is more or less what the spellings of Hattic words suggest.
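For those who like to check such things mechanically, here is a minimal Python sketch of that reduction. It starts from the 39-phoneme inventory given above and collapses the distinctions that cuneiform does not write. The names INVENTORY, DEVOICE, PLACE and collapse are just my labels for illustration. One thing is an assumption of mine rather than something stated above: the pharyngeals *ħ and *ʕ are folded into *x, since the reduced table has no separate slot for them.

```python
# Pre-Proto-NWC inventory without secondary articulations (39 phonemes).
INVENTORY = """
p t ts tɬ tʃ k q
b d dz dɮ dʒ g ɢ
p' t' ts' tɬ' tʃ' k' q'
f s ɬ ʃ x χ ħ
z ʒ ɣ ʁ ʕ
m n
w r l j
""".split()

# Voiced obstruents fall together with their voiceless counterparts.
DEVOICE = {
    "b": "p", "d": "t", "dz": "ts", "dɮ": "tɬ", "dʒ": "tʃ",
    "g": "k", "ɢ": "q", "z": "s", "ʒ": "ʃ", "ɣ": "x", "ʁ": "χ", "ʕ": "ħ",
}

# Place distinctions foreign to Hittite are merged.
PLACE = {
    "tʃ": "ts", "ʃ": "s",   # postalveolar -> alveolar
    "q": "k", "χ": "x",     # uvular -> velar
    "ħ": "x",               # ASSUMPTION: pharyngeal -> velar (not stated in the post)
}

def collapse(phoneme):
    """Map one phoneme to what a Hittite-trained scribe could have written."""
    p = phoneme.rstrip("'")   # ejective -> plain
    p = DEVOICE.get(p, p)     # voiced -> voiceless
    p = PLACE.get(p, p)       # merge places of articulation
    return p

reduced = sorted(set(collapse(p) for p in INVENTORY))
print(len(INVENTORY), "->", len(reduced))   # 39 -> 15
print(" ".join("*" + p for p in reduced))
```

The order of the three steps matters: a voiced uvular such as *ɢ first has to become *q before the uvular-to-velar merger can turn it into *k.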
Now back to PIE! What happened on the way from Proto-Indo-Uralic (PIU)
to PIE? It seems likely that PIU was typologically much closer to
Proto-Uralic than to PIE, as Proto-Uralic appears to be a much more
typically "Mitian" language than PIE - to such an extent that
19th-century linguists classified Uralic together with Turkic, Mongolic
and Tungusic as "Ural-Altaic"; also, Uralic is quite similar to
Eskimo-Aleut despite the great geographical and probably also
historio-linguistic distance between the two - the resemblance is
certainly due to conservatism.
So PIU would have had a single set of stops, unmarked for voicing,
aspiration or glottalization, accompanied by sets of voiced spirants and
nasals at the same places of articulation. The labial spirant *β merged
with *w at some point. Now, these voiced spirants were alien to the
Para-NWC language. Thus, the PIU stops were rendered as voiceless stops
which were phonetically aspirated, and the voiced spirants as voiced
stops. The Para-NWC ejective stops were not used to render anything in
Pre-PIE, so it did not have such sounds (pace the proponents of the
glottalic theory). So we'd get, using the dentals as example, *t > *tʰ
and *ð > *d. In the next step, the aspirated stops were voiced in some
morphemes (probably by some kind of prosodic feature), but as they were
aspirated, these did not merge with the old voiced stops but formed a
third type: *tʰ > *dʰ. From there, it is only one step - loss of
aspiration in *tʰ - that leads to the "Classic PIE" system. The gap at
*b is explained by the old *β > *w merger, which also explains the
somewhat stop-like behaviour (as in the initial clusters *wl- and *wr-)
of *w. The exclusion of roots with two voiced unaspirated stops in one
morpheme would have been due to some kind of dissimilation rule that was
already in operation before the voiced spirants hardened to stops; this
may have turned one of two spirants in one morpheme into an approximant
or whatever.
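The chain of changes in this paragraph can be written down as a small ordered rule set. This is only a rough Python sketch under several assumptions of mine: the function name derive is made up, the unspecified prosodic feature is modelled as a plain boolean flag, *ɣ is my notation for the velar spirant, and the labial and velar series are assumed to behave like the dentals. The dissimilation rule and the three-way split of the velars are left out.

```python
ASPIRATE = {"p": "pʰ", "t": "tʰ", "k": "kʰ"}    # PIU stops -> voiceless aspirated stops
HARDEN = {"ð": "d", "ɣ": "g"}                   # voiced spirants -> voiced stops
VOICE = {"pʰ": "bʰ", "tʰ": "dʰ", "kʰ": "gʰ"}    # prosodically conditioned voicing

def derive(segment, voicing_prosody=False):
    """Trace one PIU obstruent to its 'Classic PIE' reflex.

    voicing_prosody is a HYPOTHETICAL stand-in for the unknown prosodic
    feature that voiced the aspirated stops in some morphemes.
    """
    # Stage 0: *β merges with *w before the spirants harden.
    if segment == "β":
        return "w"
    # Stage 1: rendering in the Para-NWC sound system.
    s = ASPIRATE.get(segment, HARDEN.get(segment, segment))
    # Stage 2: aspirated stops are voiced where the prosodic feature applies.
    if voicing_prosody:
        s = VOICE.get(s, s)
    # Stage 3: the remaining voiceless aspirates lose their aspiration.
    if s in VOICE:
        s = s[:-1]
    return s

print(derive("t"))         # t
print(derive("t", True))   # dʰ
print(derive("ð"))         # d
print(derive("β"))         # w
```

Running every input through derive yields *p *t *k, *d *g and *bʰ *dʰ *gʰ, but never *b, which is the gap described above.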
The velar stop series split into three at the time when the PIU vowel
system was reduced as vowel features became secondary articulations of
the velars in the same way as in NWC. The palatalized velars may also
have absorbed an Indo-Uralic palatal stop series, contributing to their
high frequency. A similar effect affected the uvular fricative *χ, but
here no palatalized variant occurred because this sound had caused
backing of adjacent front vowels (try to say [χe] without either
fronting the [χ] or backing the [e], and you get the picture) - *h1 was
no palatalized laryngeal, but simply */h/. It is misleading to think of
this laryngeal as "e-colouring"; rather, it is *non*-colouring, like all
the non-laryngeal consonants.
So, if the CHG did not speak Proto-Indo-Anatolian, and the CHG moving
into Anatolia spoke Hattic rather than Anatolian, where did Anatolian
come from? From the steppe, like all IE languages. But which way around
the Black Sea? I think the western route is more likely. Apparently, the
Hittites had a tradition of once living in a region where the Sun rises
from the sea, i.e. on the western shore of a sea; IMHO this is more
likely to have been the Black Sea than the Caspian Sea. Also, the most
divergent Anatolian language appears to have been Lydian, which was the
northwesternmost Anatolian language. I would say that Anatolian
originated in the southwestern outlier of the Yamnaya culture on the
Lower Danube, which may have spoken an archaic dialect not yet affected
by the morphosyntactic innovations (feminine gender, tripartite verb
aspect system) of the PIE heartland. A similar language may have been
spoken by the Bell Beaker people, who appear to have spread from there
into Western Europe. The Italo-Celtic languages do not descend from
this, but spread across Western Europe only later. The diversity within
both Italic and Celtic is too small for these branches to have spread
that early.
About a year ago, I suggested that the "Caucasian" substratum may
actually have been a language related to Semitic, which would explain
the Semitic-like words in IE. I no longer think so; NWC is a better
candidate. The Semitic-like words may have been contributed by the
Para-NWC language which may in turn have borrowed them from Semitic as
Neolithic Wanderwörter.
Finally, let me say a word about haplogroups. I think these are
overrated, as the percentages of these lineages can be altered
substantially by genetic drift, founder effects and similar things. This
may explain why both Western Europe (where a subclade of Y-DNA
haplogroup R1b dominates) and Eastern Europe (where a subclade of R1a is
most common) show different profiles than the Yamnaya people themselves
(where a subclade of R1b *different* from the Western European one
seemed to dominate). Also, the sample sizes of the archaeogenetic
studies are so small that minor haplogroups may be missed entirely. At
any rate, it is not advisable to connect Y-DNA haplogroups to language
families (as I myself used to do for some time before I realized
that this was fallacious).
OK, this has become quite long, and I'll shut up now. It is now open to
discussion.