Overly thorough minimalism
As you may have divined from the phonology of Pchekeho, I like small phoneme inventories; the smaller the better. This got me wondering, what's the smallest I could go? In natlangs we seem to have a lower bound of ten phonemes, or just possibly nine. But is that a hard limit? It's very difficult to say. In this post I'm doing a fairly thorough analysis of phonological universals to hopefully produce the true holy grail of minimalism. Since I'm not bound by worries about correct citation or academic integrity, I can throw out previously-proposed universals much more easily than a real linguist, and I shall do exactly that (similarly unprofessionally, on the off chance that it irritates some passing academics, I'll be referring to all my sources on a first-name basis). Once I've produced for myself a set of absolute phonological universals, I will try to use them to produce a working, naturalistic language. In the spirit of this thing (whatever it is), I'll also give it the least amount of morphology possible, and probably the fewest grammatical categories possible, if that's sensible. In other words I'm making an anti-kitchen sink language. A toilet cistern language, if you will.
It's notoriously difficult to propose phonological universals, and it's only getting more so as new Papuan and South American (and occasionally African) languages are described. For this thought experiment I want to try and determine what the smallest possible phonemic inventory is that doesn't violate any phonological universal of natural languages. I'll be using mostly descriptive universals rather than "analytic" ones – if I say for instance, "all languages contrast consonants for place of articulation", I'm merely describing a feature of all known languages, not claiming that it's impossible for a language without that feature to ever arise. However, to qualify they have to be absolute universals; it's no good saying that "all languages have coronal phonemes" since Northwest Mekeo doesn't have any, even though it's true for 99.99% of languages. I'll also make the caveat that I can't refer to any numbers other than "multiple" – it's no good saying "all languages have at least 9 phonemes" because that would defeat the whole point of this, but it is acceptable to say that "all languages have multiple consonant phonemes". (NB I end up ignoring this rule.)
For my starting place I'll be using Larry Hyman's 2008 article "Universals in Phonology", which is still pretty definitive, even though languages like the Mekeos, Ontena Gadsup and Biritai have disproven some of its universals since. To supplement this, I'll be taking all uncontested phonemic inventories I know of in good faith – for instance I'll accept Jones' /p k g m ŋ w j β/ inventory for Northwest Mekeo even though it hasn't been independently verified. On the other hand, I won't accept Kabardian's zero-vowel analysis since there are newer analyses which are preferable. It's also worth noting that I'm only talking about natural spoken languages, even though most conlangs conform to these universals so they wouldn't have a great deal of effect anyway. Sign languages presumably have their own set of phonological universals too, but I don't know anywhere near enough about them (I don't know if anyone does to be honest).
It's obviously impossible to prove a universal, either descriptive or analytic; the former because we don't know all the languages which have ever existed, and the latter because it's impossible to prove anything in science. For a "universal" to qualify I will arbitrarily say that it must have no counter-examples in any known languages or dialects. I'll also give myself artistic license in vetoing universals which I think are stupid, with or without explanation.
1. General remarks
Firstly I'll discuss some very general, non-specific universals which are trivial but worth pointing out.
1a. All spoken languages have phonemes
This is worth mentioning at least. All languages draw from groups of sounds (allophones) that can be defined as finite, mostly discrete sets (phonemes). I say "mostly discrete" here because there can be acoustic overlap, especially in the vowel space. Also note that I mention only spoken languages, even though I believe sign languages still have something parallel to phonemes. It's hard to imagine what a language which didn't have phonemes would be like. Presumably there would be no limitations on word form or even manner of production whatsoever. I'm sure it would be possible to construct a language without any phonemes that uses only suprasegmentals to express lexical and grammatical ideas, but it certainly isn't reasonable for human languages.
An interesting one which I don't think has been discussed before is:
1b. There are never fewer phones than phonemes in a phonological system
I'll define a phone as a segment which occurs frequently (i.e. in a conditioned environment or in non-nonce free variation) which is marked by a phonological difference that could form the basis of a phonemic distinction in another language. This means that English /t/ → [t tʰ] counts as two phones, because /t tʰ/ are different enough to be separate phonemes in e.g. Hindi, but /i/ → [i ɪ̝̟] still counts as one phone. It isn't impossible to produce a phonological system where the underlying form has an additional contrast, for instance:
/p t ʈ k/
/m n/
/s/
/i u a/
/ʈ/ → [p] / _u
/ʈ/ → [t] / _[a,#]
/ʈ/ → [k] / _i
However I'm certain that this never occurs in a language to the exclusion of any other allophony. I could go on and define probably dozens more inane universals like this but they won't provide very much interest so I won't.
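As a rough sketch of the toy system above, here it is in Python. The function names and the set of test environments are purely illustrative assumptions; the rules themselves are the three given for /ʈ/, and since they say nothing about /ʈ/ before a consonant, the sketch leaves it unchanged there.

```python
# Toy inventory from the example above: ten phonemes, with /ʈ/ never
# surfacing as itself in the environments the three rules cover.
PHONEMES = ["p", "t", "ʈ", "k", "m", "n", "s", "i", "u", "a"]

def realise(word):
    """Apply the three /ʈ/ rules to a list of phonemes; return surface phones."""
    phones = []
    for i, seg in enumerate(word):
        # "#" stands for the word boundary, as in the rule notation.
        nxt = word[i + 1] if i + 1 < len(word) else "#"
        if seg == "ʈ":
            if nxt == "u":
                seg = "p"
            elif nxt in ("a", "#"):
                seg = "t"
            elif nxt == "i":
                seg = "k"
            # Before a consonant the rules are silent, so /ʈ/ is left alone.
        phones.append(seg)
    return phones

def surface_phones():
    """Collect every phone seen before each vowel and word-finally (assumed test set)."""
    seen = set()
    for seg in PHONEMES:
        for nxt in ("i", "u", "a", None):
            word = [seg] if nxt is None else [seg, nxt]
            seen.update(realise(word))
    return seen

if __name__ == "__main__":
    print(len(PHONEMES), "phonemes,", len(surface_phones()), "phones")
```

In the environments tested this gives ten phonemes but only nine phones, which is exactly the situation that 1b says never arises without other allophony to make up the difference.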
2. Consonants
First off:
2a. All spoken languages have multiple consonant phonemes
While some languages have been analysed without vowels, no language has ever been analysed without consonants. Most languages have between eight and a hundred consonants; it's obviously not going to be worthwhile looking at languages with enormous consonant inventories since they will inevitably have more complexity than small consonant inventories and won't provide possible counter-universal examples, so I'll start with the smallest ones. A reasonable number of languages have seven consonants; some of these include Pirahã (under Everett's slightly dubious analysis), Buin, Puinave, Sikaritai and West Mekeo. Six consonants is also attested, famously with Rotokas, but also for numerous Lakes Plain languages such as Iau, Kirikiri and Obokuitai, and even more obscurely for North Mekeo (and possibly also for East Mekeo which is in the process of losing /ʔ/). Five consonants has been proposed for proto-Lakes Plain which most likely had */p t k b d/; but this is a reconstructed protolanguage so it must be taken with a grain of salt. There are basically two questions as to the validity of this inventory; firstly whether */ɾ/ was a phoneme, and secondly whether */s/ was a phoneme; in both cases it seems likely that they were allophones (of */d/ and */t/ respectively), but it's really impossible to say without a whole lot more research. Finally there's the most obscure of all, which is the 5-consonant system proposed for Biritai in a talk by Mark Donohue. This would be a normal Lakes Plain inventory save for the lack of */k/, and complete unconditional loss of /k/ is attested, so I'll accept this inventory as true even though there is no thorough analysis of it anywhere. There's also a chance that some other Lakes Plain language lacks, for instance, a phonemic sibilant, and has an inventory of /b t d k ɸ/, so I don't think excluding five-consonant inventories is wise.
It would technically be possible for me to propose that ?"all languages have at least five consonant phonemes", but to begin with, that defeats the whole point of this exercise, and furthermore I'm not entirely sure that there hasn't ever been a language with four consonant phonemes. Consonant inventory size seems to be pretty normally distributed, and there must have been at least 100,000 languages spoken in human history, so the chances that one of them once ended up having four consonants may not be infinitesimal.
Pirahã Buin Puinave Sikaritai West Mekeo
Isolate, SAm S.Boug., PNG Isolate, SAm L.Plain, PNG Austronesian, PNG
p t ʔ p t k p t k t k p k
b d g b d b g
s h s h ɸ s
m n m n m ŋ
r w l
Rotokas Kirikiri North Mekeo (East Mekeo)
N.Boug., PNG L.Plain, PNG Austron., PNG Austronesian, PNG
p t k t k k p k (ʔ)
b d g b d b
ɸ s β f
m ŋ m ŋ
ɫ l
*Proto-Lakes Plain ?Biritai
Lakes Plain, PNG Lakes Plain, PNG
p t k t
b d b d
ɸ s
Larry Hyman came up with four universals surrounding consonant systems, three of which have been disproved in the past 16 years:
Larry wrote:
Consonant Universal #1: Every phonological system has oral stops.
Consonant Universal #2: Every phonological system contrasts phonemes that are [–cont] (= stops) with phonemes that are specified with a different feature.
Consonant Universal #3: Every phonological system contrasts phonemes for place of articulation.
Consonant Universal #4: Every phonological system has coronal phonemes.
#1 is disproven by Ontena Gadsup, which has the following absolutely nuts inventory:
Ontena Gadsup: Akuna Gadsup:
ʔ p t k ʔ
ɾ d
m n m n
ɸ s x
β j β j
The phonemes /ɸ s ɾ x/ surface as [p t d k] word-internally following /ʔ/, and /ns/ surfaces as [nt]. This can be contrasted with Akuna Gadsup (also shown above) in which lenition is incomplete and thus stops can be treated as underlying. No other series gets as close to being universal as stops do; sonorants are absent from numerous languages and nasals from a fair handful, while fricatives are only really an areal feature of Afro-Eurasia and the Americas.
#2 is automatically countered as well, although it can be reformatted. Subjectively speaking, all consonant inventories have at least two lines:
Rotokas: Maxakalí:
p t k p t tɕ k ʔ
b d g b d dʑ g
h
Note that for Rotokas I'm taking /b d g/ as archi-phonemes; traditionally they're given as /β ɾ g/ but for our purposes it really doesn't matter. And Maxakalí has one element of one line offset slightly, so you could say that it needs three lines.
What this actually means is that all languages have some kind of "MOA" (manner/mode of articulation) contrast. This can almost be described as "all languages have an MOA contrast at multiple POAs", save for Obokuitai, which has /t k b d s h/.
Here MOA is only contrasted at the coronal POA (unless you get into "central" vs. "peripheral" which IMO is too abstract an approach for this purpose). This means that systems like ?/p t k ʔ s n l/ or even ?/p t k s/ can't be ruled out at this stage, although I don't know of any languages other than Obokuitai (and probably other East Tariku languages) which have this restriction. As such we arrive at our first proper universal:
2b. All consonant inventories have multiple degrees of sonority including multiple obstruents
This is very similar to Larry's #2, but it allows for Ontena Gadsup – for instance /ɾ/ is more sonorous than /s/. To my knowledge no language has only a single obstruent. Note that I'm working on a standard sonority hierarchy of "voiceless plosive < voiced plosive < voiceless fricative < voiced fricative < nasal < lateral < flap < approximant < high/close vowel < mid vowel < low/open vowel"; it's important to distinguish voicing in plosives since some languages limit their sonority differentiation to this. In fact, we can produce another universal stemming from #2b, this time with a bit more specification:
2c. There are always multiple consonant phonemes which are more sonorous than the least sonorous series of phonemes
This covers all languages with sonority contrasts at multiple POAs, but also covers languages like Obokuitai, since all of /b d s h/ are more sonorous than /t k/. The same goes for Ontena Gadsup; /β j m n ɾ/ are all more sonorous than /ɸ s x/ (I'm not sure where /ʔ/ falls on the sonority hierarchy but it doesn't make a difference wherever it goes). #2b is very much a descriptive universal rather than a bottom-up one, but I feel like it will inevitably be true – I'm fairly sure we can rule out inventories like /p t ʈ c k kʷ q qʷ ʔ s/.
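To make 2b and 2c a bit more concrete, here is a minimal Python sketch that ranks consonants on the hierarchy given above and tests an inventory against both statements. The MANNER table and function names are my own illustrative assumptions; in particular, /ʔ/ is placed with the voiceless plosives and /h/ with the voiceless fricatives, which is a guess rather than a settled matter.

```python
# Index into the sonority hierarchy quoted above (consonant part only):
# 0 voiceless plosive < 1 voiced plosive < 2 voiceless fricative
# < 3 voiced fricative < 4 nasal < 5 lateral < 6 flap < 7 approximant
MANNER = {
    "p": 0, "t": 0, "ʈ": 0, "c": 0, "k": 0, "kʷ": 0, "q": 0, "qʷ": 0,
    "ʔ": 0,  # assumption: glottal stop counted as least sonorous
    "b": 1, "d": 1, "g": 1,
    "ɸ": 2, "s": 2, "x": 2,
    "h": 2,  # assumption: /h/ treated as a voiceless fricative
    "β": 3,
    "m": 4, "n": 4, "ŋ": 4,
    "l": 5,
    "ɾ": 6,
    "w": 7, "j": 7,
}
OBSTRUENT_MAX = 3  # ranks 0-3 are obstruents

def check_2b(inventory):
    """Multiple degrees of sonority, including multiple obstruents."""
    ranks = {MANNER[c] for c in inventory}
    obstruents = [c for c in inventory if MANNER[c] <= OBSTRUENT_MAX]
    return len(ranks) > 1 and len(obstruents) > 1

def check_2c(inventory):
    """Multiple consonants more sonorous than the least sonorous series."""
    lowest = min(MANNER[c] for c in inventory)
    return sum(MANNER[c] > lowest for c in inventory) > 1

OBOKUITAI = ["t", "k", "b", "d", "s", "h"]
ONTENA_GADSUP = ["ʔ", "ɾ", "m", "n", "ɸ", "s", "x", "β", "j"]
STOP_HEAVY = ["p", "t", "ʈ", "c", "k", "kʷ", "q", "qʷ", "ʔ", "s"]

if __name__ == "__main__":
    for name, inv in [("Obokuitai", OBOKUITAI),
                      ("Ontena Gadsup", ONTENA_GADSUP),
                      ("stop-heavy", STOP_HEAVY)]:
        print(name, check_2b(inv), check_2c(inv))
```

Under these assumptions Obokuitai and Ontena Gadsup pass both checks, while the stop-heavy inventory passes 2b as coded here and fails 2c, because /s/ is the only consonant above the plosive series.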
Larry's universal #3 still holds up pretty well, although I'll word it slightly differently:
2d. There are always multiple contrastive places of articulation in a consonant inventory (?)
This means that we can rule out an inventory like /k b ɸ m l j/, but we can't automatically rule out /p k s m ʁ/. It might be possible to say that there are always multiple contrastive places of articulation at multiple levels of sonority, since for instance Obokuitai has /t : k/, /b : d/, /s : h/. However, a traditional analysis of Rotokas does stump this, as does North Mekeo described previously:
Rotokas (traditional analysis):
p t k
g
β
ɾ
Regardless of whether this is the best analysis, I can't really reject it for any reason other than "it's not very neat", so I will leave #2d as it is. There's also a fair few near-misses like Karajá:
Karajá (1) Karajá (2)
k k
b d b d
ɗ ɗ
θ h θ
h
l l
ɾ w ɾ
w
/h/ is not technically a fricative, so it could be separated from /θ/'s line, and /w/ is a semivowel rather than a tap, so you could tabulate it as (2), in which case it would only contrast POA at the sonority level of "voiced stop". Note how Karajá shows that POA does not always contrast at the least sonorous MOA (although it does tend to have the most degrees of contrast there) – we can't exclude ?/k m n ŋ/ on this basis. Take also North Mekeo or Ontena Gadsup (if /ʔ/ is taken to be least sonorous, which it probably is) above; in the former POA only contrasts in nasals.
I'm not entirely certain it's impossible for a consonant inventory to be entirely vertical. Say Karajá had /b t kʰ/ for its stops; you'd end up with just /kʰ t b ɗ θ h l ɾ w/ under a very strict analysis, which would violate this universal without looking overly bizarre. Likewise, a Mekeo dialect with /k b β m l w/ wouldn't seem wholly out of place. In this case the best you could do is say that "there will be multiple places of articulation in a consonant inventory which contrast either in the same sonority rank, or one differing only by phonation", which is far less pithy but possibly more absolute. In the end I don't think this has a great deal of influence on the form of an inventory.
Finally, Larry's #4 is blocked by Northwest Mekeo. Interestingly, it's blocked only by NW Mekeo; to my knowledge there isn't a single other coronal-free consonant inventory. If it weren't for this one Austronesian dialect, coronal phonemes could indeed be considered universal. In fact the reason they're missing from this language might be because of the basically arbitrary stigma against coronals as being infantile; Mekeo people replacing /k ŋ/ with [t n] will be told to stop acting like children (except for /ŋ/ → [n] / i_, i_).
Labials are fairly frequently missing, especially from North American languages, while a lack of velars is not unheard of. However, there aren't any languages which lack any two of these series, so we can modify this universal to state that:
2e. Consonant phonemes will always occur at at least two out of labial, coronal and velar POAs.
Note that as per the discussion about Karajá and POAs above, I don't mention a contrast here. I don't think it would be impossible for a language to have an inventory like /b k ʔ n ɸ h l w j/, but there are always either {labial and coronal} or {labial and velar} or {coronal and velar} phonemes, so an inventory like /t tɕ ʔ s ɕ h n ɲ l j/ can be excluded. I don't think that this is just a random conflation of probabilities either, but rather a combination of a) a natural tendency for phonemes to spread out to fill the phonemic space and b) the lack of common sound changes which can remove these POAs without moving them into another POA. I guess you could have something like */p t k/ → */k t tɕ/ → /ʔ t tɕ/ but it seems vanishingly unlikely, more so than just say the 1% chance of not having bilabials combined with the 0.1% chance of not having velars. Even then we would only expect one language in human history to have lacked both series, so I am confident following 2e.
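Here is a small Python sketch of 2e, together with the back-of-the-envelope arithmetic in the paragraph above. The PLACE table is an illustrative assumption: segments such as /ʔ h j w/ and the palatals are simply left out of all three places, which doesn't change the result for the inventories tested. The two percentages and the 100,000-language figure are the rough guesses used in this post, not measured values, and multiplying them assumes the two gaps are independent.

```python
# Assumed place-of-articulation table; only the three places named in 2e.
PLACE = {
    "labial": {"p", "b", "m", "ɸ", "β", "f"},
    "coronal": {"t", "d", "n", "s", "l", "ɾ"},
    "velar": {"k", "g", "ŋ", "x"},
}

def places_used(inventory):
    """Return which of labial/coronal/velar have at least one phoneme."""
    segs = set(inventory)
    return {place for place, members in PLACE.items() if members & segs}

def check_2e(inventory):
    """At least two of labial, coronal and velar are represented."""
    return len(places_used(inventory)) >= 2

def expected_gap_languages(p_no_labial=0.01, p_no_velar=0.001, n_languages=100_000):
    """Expected number of languages lacking both series, if the gaps were independent."""
    return p_no_labial * p_no_velar * n_languages

NW_MEKEO = ["p", "k", "g", "m", "ŋ", "w", "j", "β"]
PERMITTED = ["b", "k", "ʔ", "n", "ɸ", "h", "l", "w", "j"]
EXCLUDED = ["t", "tɕ", "ʔ", "s", "ɕ", "h", "n", "ɲ", "l", "j"]

if __name__ == "__main__":
    for name, inv in [("NW Mekeo", NW_MEKEO), ("permitted", PERMITTED),
                      ("excluded", EXCLUDED)]:
        print(name, sorted(places_used(inv)), check_2e(inv))
    print("expected languages lacking both:", expected_gap_languages())
```

With those guessed figures the expectation comes out at roughly one language in all of human history, which is the number quoted above.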
(Cue me being proven wrong by South-central Klastafak from the Mamberamo river basin in a few years' time)
I do not believe anything more can be said universally about consonants, in part because of languages like Rotokas and Biritai which have so few phonemes. Any more universals would end up being rephrasings of the previous, or just uselessly conditional. You could technically say "all languages have either oral stops or two labial fricatives that contrast in voicing alone" because of Ontena Gadsup, but that's just petty and wouldn't add anything to our understanding of phonetics.
3. Vowels, or perhaps Nuclei
Vowels are at the same time a very fruitful field for phonological universals, yet also extremely frustrating. I'll be limiting myself to universals surrounding the minimal end of vowel inventories to save time and money (well, mostly just time). Vowel systems are fraught with competing analyses, with some languages being analysed as having one vowel by one linguist, and seven by another. Vowel systems also interplay a lot with consonant systems; it's naturalistic to have a vowel system of /ɨ a/, but not when combined with a consonant inventory like /p t k b d g/.
I'll start off by making a not wholly uncontroversial claim:
3a. All languages have at least one vowel phoneme.
Several languages have been analysed to have zero vowels. Most famous of these is Kabardian, which Aert Kuipers claimed had no vowels whatsoever. Traditionally Kabardian can be thought of as having three vowels /ɨ ə a/ like some other Northwest Caucasian languages. However, the vowel [a] has some unusual distribution and stress properties, such that it can be non-trivially analysed as /hə/ (and [aː] as /əh/). Following this, /ɨ/ can be analysed as an epenthetic vowel, since (surface) CV syllables predominate. At this point the analysis is not untenable. However, it falters at the next point of Aert's analysis. For no particular reason, he then ascribes the height distinction between [ɨ] and [ə] onto consonants. This means that Kabardian would have to be analysed as having 97 consonants, all of which fell into [±high] pairs (other than /h/ I believe). For this reason I won't accept a zero vowel analysis, although I might accept one vowel (I do for other languages anyway). Zero vowels has also been claimed for a few other languages like pre-proto-Indo-European or even Mandarin. However, in both these cases there are syllabic resonants "/j̩ w̩/" which I do not believe are sensibly different from /i u/. I could rephrase the claim as "all languages have at least one vowel or syllabic resonant phoneme", but functionally speaking they are roughly the same.
Following on from this, Larry lists six vowel system universals:
Larry wrote:
Vocalic Universal #1: Every phonological system contrasts at least two degrees of aperture.
Vocalic Universal #2: Every phonological system has at least one front vowel or the palatal glide /y/. [i.e. IPA /j/]
Vocalic Universal #3: Every phonological system has at least one unrounded vowel
Vocalic Universal #4: Every phonological system has at least one back vowel.
Vocalic Universal #5: A vowel system may be contrastive only for aperture only if its vowels acquire vowel color from neighboring consonants.
Vocalic Universal #6: A vowel system can be contrastive for nasality only if there are output nasal consonants.
I will have to reject his first universal on the basis of languages such as Kabardian in which a one-vowel analysis is acceptable. Another language which has been reasonably analysed as having only one vowel is the Chadic language Moloko. The only phonemic vowel is /a/; /ə/ is produced from epenthesis, while there is word-level prosody (notated as ᵒ and ᵉ) which determines the rounding or fronting of all vowels in a word; hence
/mnzar/ → [mənzar] "see! (sg.)"
/mnzar-amᵒ/ → [mʊnzɔrɔm] "see! (pl.)"
/mnzar-aᵉ/ → [mɪmɪnʒɛrɛ] "seeing"
The problem with this "universal" is that occasionally vowel features are transferred onto consonants or entire words (the reverse can happen too, but not to this extent). It is possible to produce a fairly irritating but nevertheless valid universal:
3b. All languages have multiple vowel phonemes unless consonants or words have markedness for F2.
Larry's second universal on the other hand does to my knowledge hold true, although I will be a bit more conservative in producing my 3c:
3c. All languages have at least one [+front] vowel, or a [+front] consonant such as /j/.
His third universal is somewhat problematic to me. I don't exactly believe that a vertical vowel system of /ɨ ə a/ can be considered markedly [–round] (or [+unrounded]); they're simply unspecified for roundedness. His fourth universal falls down for the same reason; backness is equally unspecified. The fifth universal seems more promising, and is roughly equivalent to my 3b, but there's a problem when you run into Eastern Arrernte, which has /ə a/ where /ə/ is [ɪ~ə~ʊ] in pretty much free variation. Consonants have no [+front] feature, and while there is [+round] marking on numerous consonants, it doesn't actually affect the realisation of /ə/ except in very limited environments (specifically #_Cʷ). While this does technically conform to his universal, I'll stick with my wording of 3b above (which doesn't affect Eastern Arrernte anyway). I'll ignore his #6 because, as Larry himself notes, not all languages have nasality as a feature at all.
Worthy of mention is a fourth universal:
3d. No language distinguishes frontness (F2) without also distinguishing height (F1).
This is needed to prevent systems like /i ɨ/ which would otherwise be permitted. Like Larry, I'm not treating /i u a/ systems as horizontal because I think that's stupid. Only one horizontal vowel system has been seriously proposed, which is proto-Indo-European's */e o/. First off, this is a protolanguage so immediately a poor counter-example. Additionally, there are several instances of */a/ which aren't satisfactorily explained through */h₂/. And finally, I believe, as do many other Indo-Europeanists, that */o/ was lower than */e/ and possibly longer also. In the end I'm conducting this train and I choose to ignore PIE, so there.
There's one other universal I can consider including, which would be something along the lines of "for a language to have fewer than three vowels it must have a large number of consonants". On the one hand, I believe this is true, and would produce a more naturalistic inventory, but on the other hand, it isn't worded very absolutely and uses numbers which I said I wouldn't. However, in the interests of producing something naturalistic, I will begrudgingly permit this one to remain:
3e. No language has fewer than three vowel phonemes which does not also have more than ten consonant phonemes.
Ten is probably a very low limit here; twenty would probably work just as well, but I'm covering my bases here. It's the universal I'm least happy with but I can't say it's not true.
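For what it's worth, 3d and 3e are easy to state as checks. The following Python sketch uses a deliberately crude vowel feature table, which is an assumption for illustration only, and the consonant counts in the usage lines are made-up numbers rather than figures for any real language.

```python
# Crude (height, backness) table; the values are illustrative assumptions.
VOWELS = {
    "i": ("high", "front"), "ɨ": ("high", "central"), "u": ("high", "back"),
    "e": ("mid", "front"), "ə": ("mid", "central"), "o": ("mid", "back"),
    "a": ("low", "central"),
}

def check_3d(vowels):
    """Frontness (F2) is only distinguished if height (F1) is too."""
    heights = {VOWELS[v][0] for v in vowels}
    backness = {VOWELS[v][1] for v in vowels}
    return len(backness) == 1 or len(heights) > 1

def check_3e(n_vowels, n_consonants):
    """Fewer than three vowels requires more than ten consonants."""
    return n_vowels >= 3 or n_consonants > 10

if __name__ == "__main__":
    print(check_3d(["i", "u", "a"]))   # triangular system
    print(check_3d(["ɨ", "ə", "a"]))   # vertical system
    print(check_3d(["i", "ɨ"]))        # purely horizontal system
    print(check_3e(2, 6), check_3e(2, 26))  # hypothetical consonant counts
```

The triangular and vertical systems pass 3d while /i ɨ/ fails, and a two-vowel system only passes 3e once it is paired with more than ten consonants.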
4. Anything else
Larry discusses a few more universal proposals. Some of them are too theoretical to be of much use - "No vowel can be [+high, +low]" is true, but it tells us more about the features [high] and [low] than it does about phonemes. He also acknowledges that there is little use or validity to the claim that "all languages have accent".
The idea of the syllable is of interest. There are three syllable universals that Larry discusses, as well as one he mentions in another paper, which I paraphrase below:
Paraphrasing Larry, I wrote:
#1: All languages have syllables
#2: All languages have CV syllables
#3: Consonants and vowels always belong to a syllable
#4: Syllabification is always predictable
Conveniently, all of these rest on the validity of #1. If a language can be found with no syllables, then it obviously won't have CV syllables, and none of its consonants or vowels will belong to a syllable. On the other hand, a language without syllables would have predictable syllabification in that all of its phonemes would be not syllabified, but that universal is fairly tangential to my aim anyway.
To determine if a language has syllables, we first have to work out what a syllable actually is and what a language without them wouldn't have. Larry actually comes back with another paper to discuss this very thing: "Does Gokana really have no syllables? Or: What's so great about being universal?". He posits four features which indicate the existence of syllables:
- Phonological rules conditioned by syllable structure
- Morphological rules or allomorphy conditioned by syllable structure
- Prosodies or word-stress targeting the syllable as a feature-bearing unit
- Prosodic grouping of syllables into higher order constituents, e.g. feet
These don't all need to co-occur; some languages use moras rather than syllables for prosody determination, and feet don't exist in monosyllabic languages. In fact the first two don't have to exist in a language even if it did have syllables. An isolating obligatory-onset CV language wouldn't follow the first two, and wouldn't necessarily have to follow the third or the fourth (in fact in such a case I'd have to wonder if there was any reason to posit sub-syllabic phonemes other than just convention).
The case of Gokana does go against all of these tendencies, to the point that syllables are fairly useless. The fundamental problem with this universal is that syllables don't really *exist* in the way phonemes exist. You can hear phonemes in a language, but you can't hear "." (hence why you can't distinguish [ˈsɪt.ɪŋ] from [ˈsɪ.tɪŋ] without introducing some acoustic feature like aspiration, glottalisation, vowel length, pause, etc.). Syllables are an abstract concept that basically just exist as a tool to explain the four things Larry describes above (and probably a few more things too). The universal "consonants and vowels always belong to a syllable" is really entirely nonsensical, since you can divide anything up into syllables.
Returning to the point at hand, the best we can say is that syllables aren't helpful in a given language. This is the case in Gokana. There's no point in trying to syllabify the sequence /kɛ̃̄ɛ̃̀ɛ̃̀ɛ̃̀ɛ̃̄ɛ̃́/ (wake-CAUS-LOG-3SG-FOC). You can do it (/kɛ̃̄ɛ̃̀ɛ̃̀ɛ̃̀ɛ̃̄ɛ̃́/ or /kɛ̃̄ɛ̃̀ɛ̃̀.ɛ̃̀ɛ̃̄ɛ̃́/ or /kɛ̃̄ɛ̃̀.ɛ̃̀ɛ̃̀.ɛ̃̄ɛ̃́/ or /kɛ̃̄.ɛ̃̀.ɛ̃̀.ɛ̃̀.ɛ̃̄.ɛ̃́/) but it doesn't add anything to the understanding of Gokana.
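To illustrate how arbitrary the choice is, here is a short Python sketch that enumerates every way of placing syllable breaks in that Gokana sequence. Treating each vowel as one indivisible unit (with the initial consonant attached to the first) is my own simplifying assumption, as are the names used.

```python
from itertools import product

# The Gokana form split into six units, one per vowel (assumed segmentation).
UNITS = ["kɛ̃̄", "ɛ̃̀", "ɛ̃̀", "ɛ̃̀", "ɛ̃̄", "ɛ̃́"]

def syllabifications(units):
    """Return every string obtained by inserting or omitting '.' between units."""
    results = []
    # Each gap between adjacent units either has a break or doesn't.
    for breaks in product([False, True], repeat=len(units) - 1):
        word = units[0]
        for has_break, unit in zip(breaks, units[1:]):
            word += ("." if has_break else "") + unit
        results.append(word)
    return results

if __name__ == "__main__":
    options = syllabifications(UNITS)
    print(len(options), "possible syllabifications")
    for option in options[:4]:
        print(option)
```

Six units have five gaps, so there are 2^5 = 32 candidate syllabifications, of which the four listed above are just a sample; nothing in the form itself picks one out.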
Gokana does however have moras and systematic root shapes. This suggests that you could say "all languages form phonemes into strings on the basis of sonority", but then you encounter difficulties with languages like Nuxálk and various Berber languages where obstruent sequences can occur practically ad infinitum.
Ultimately I don't think there's much more that can usefully be added. In the next post I'll provide the maximally minimal inventory and start working on grammar. Before I go, here's a list of the 12 universals I've come up with, with the actually useful ones in bold:
1a. All spoken languages have phonemes
1b. There are never fewer phones than phonemes in a phonological system
2a. All spoken languages have multiple consonant phonemes
2b. All consonant inventories have multiple degrees of sonority including multiple obstruents
2c. There are always multiple consonant phonemes which are more sonorous than the least sonorous series of phonemes
2d. There are always multiple contrastive places of articulation in a consonant inventory
2e. Consonant phonemes will always occur at at least two out of labial, coronal and velar POAs.
3a. All languages have at least one vowel phoneme.
3b. All languages have multiple vowel phonemes unless consonants or words have markedness for F2.
3c. All languages have at least one [+front] vowel, or a [+front] consonant such as /j/.
3d. No language distinguishes frontness (F2) without also distinguishing height (F1).
3e. No language has fewer than three vowel phonemes which does not also have more than ten consonant phonemes.