Zompist Bboard Again

Posted: **Fri Jun 05, 2020 6:07 am**

Moose-tache wrote: ↑Fri Jun 05, 2020 4:33 am
Definition. The complexity of a language on a given sentence (let’s denote it as χ) is defined by finding each proposition which could be associated with that sentence, getting the quantum number for each such proposition, and then taking the difference between the highest and lowest calculated quantum numbers.
I think this is reasonable, but we can use cross-linguistic analysis to help us define our "minimal" number of quanta, as in my examples below. This whole discussion started as a question of whether it's actually more complex to learn Cherokee than Spanish, or if it merely feels that way. So if we compare Spanish, Cherokee, and many other languages, in large passages, we should be able to quantify how many units of information must be included in Spanish or Cherokee, above the number of pieces of information that are common to all or nearly all control languages. These quanta of information would include mandatory morphological marking, marked syntactic structures, suppletion, slang, and anything else that would require its own discrete lexical construction in the mind of the speaker in order to be used correctly. …

That’s pretty much what my definition of ‘complexity’ was trying to capture. The ‘lowest calculated quantum number’ in the definition would be the bare minimal number of quanta the sentence could be construed to have; similarly for the ‘highest calculated quantum number’. The difference between them should count the number of things which have mandatory marking. On the other hand, this definition isn’t too good at accounting for stuff like irregularity and slang, which is why I mentioned that it could be better.

Let's take three examples:
Korean: gwisin i mwusewe [ghost subj frightening] (This is the most natural translation of "I am afraid of ghosts," and does not necessarily mean "Ghosts are scary in general.")
Choctaw: shilop i~ mahlatalih [ghost 3rd-afraid-1st]
Mandarin: wo3 pa4 gui3 [1st afraid ghost]

Both Korean and Choctaw have obligatory tense, seen here more or less as null suffixes, but this is absent in Mandarin (though of course all languages have the option to elaborate on time). Mandarin and Choctaw require the first person be overtly marked, but Korean lets it be implied (the assumption that emotion verbs with no overt subject marking refer to the speaker is common cross linguistically, and even shows up in English). Korean requires that “ghost” take argument marking, while in Choctaw it only needs a third person agreement prefix on the verb, and in Mandarin it shows up with no overt marking beyond its syntactic location to the right of the verb. In all three of these languages the only commonality is that “ghost” and “scared” appear, along with whatever syntactic alignment is the default. (Once again I am treating word-choice as one quantum. In a language with fifty words for fear, each with its own default syntactic alignment, we may need to unpack the semantic aspect of complexity further. But for now I'm focusing on morpho-syntax).

So does “scared of ghosts” inherently require person marking for the subject? Does it inherently require overt tense? Does it inherently require an adposition to specify the syntactic role of “ghost?” I would suggest that, at a fundamental level, it does not. The upshot of this is that person marking, argument marking, and tense all get thrown into one of those two buckets marked “redundancy” and “irregularity.” Also important: note that the three examples each have relatively similar amounts of redundancy/irregularity, despite having radically different amounts of “morphology” in most typological analyses.

Alright, let’s test my definition above on these:

	χ_min	χ_max	χ
Korean	2	3	1
Choctaw	3	4	1
Mandarin	3	4	1

So, indeed, these all have similar χ-values, coinciding with your intuition that each example has ‘relatively similar amounts of redundancy/irregularity, despite having radically different amounts of “morphology” in most typological analyses’.
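For anyone who wants to check the arithmetic, here is a minimal Python sketch of the definition (χ is the highest calculated quantum number minus the lowest). The function name `chi` and the `SENTENCES` dictionary are made up for illustration; the number pairs are the (χ_min, χ_max) values for the three example sentences.

```python
# Sketch only: chi = chi_max - chi_min, following the definition quoted above.
# The names below are illustrative, not part of any established notation.

def chi(chi_min: int, chi_max: int) -> int:
    """Return the spread between the highest and lowest quantum numbers."""
    if chi_max < chi_min:
        raise ValueError("chi_max must not be smaller than chi_min")
    return chi_max - chi_min

# (chi_min, chi_max) for each of the three example sentences.
SENTENCES = {
    "Korean": (2, 3),
    "Choctaw": (3, 4),
    "Mandarin": (3, 4),
}

CHI = {lang: chi(lo, hi) for lang, (lo, hi) in SENTENCES.items()}

for lang, value in CHI.items():
    print(f"{lang}: chi = {value}")
```

With these inputs every sentence comes out at the same value, which is the "similar amounts of redundancy" observation in numeric form.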

zompist wrote: ↑Fri Jun 05, 2020 4:58 am In your sentence, it seems odd to me that you don't include "I" (in some abstracted form) as necessary. "I'm afraid of ghosts" and "you're afraid of ghosts" seem to me to differ in as important a way as "I'm afraid of ghosts" vs "I'm afraid of vampires".

That seems pretty odd to me as well.

Maybe you're thinking that the Korean sentence doesn't require it. But that just makes me think about null marking. Take the Mandarin phrase, for instance: it has no overt aspect marker, but it contrasts with sentences that do have it (corresponding to "I used to be afraid of ghosts", "I'm afraid of ghosts so...", "I have been afraid of ghosts", etc). I'm comfortable with saying that the Mandarin sentence doesn't mark tense, but I think it does signal aspect.

Indeed, that worries me as well. If you use my definitions, the ‘quantum number’ of the proposition SCARES(ghosts, me) is ҁ=3 — but the ‘minimal complexity’ of the Korean sentence is χ_min=2. It seems like a reasonable constraint to require χ_min≥ҁ, but that assumption seems to be breaking here. So either Moose-tache’s Korean sentence isn’t actually a translation, or something is very wrong here.

… You could even say that the "fundamental level" is something like SCARES(ghosts, me). Such approaches can run into problems because the proposed primitives often aren't. But the ones in your example seem fine to me.

I was thinking along the same lines (note my use of ‘proposition’ earlier). Possibly another way to define the quantum number ҁ of a proposition is to simply count the number of identifiers involved in a propositional-calculus representation of that proposition — in which case the ‘quanta’ in question simply refer to the identifiers used in the expression.
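As a rough sketch of that identifier-counting idea, the following Python counts the identifier tokens in a proposition written as a string. The function names are hypothetical, and counting every occurrence (rather than distinct identifiers only) is an assumption on my part; for SCARES(ghosts, me) the two readings agree.

```python
import re

# Sketch only: the quantum number of a proposition is taken to be the number
# of identifier tokens (predicate and argument names) in its written form.
# Counting every occurrence rather than distinct names is an assumption.

def quantum_number(proposition: str) -> int:
    """Count identifier tokens such as SCARES, ghosts, me."""
    return len(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", proposition))

def satisfies_constraint(chi_min: int, proposition: str) -> bool:
    """Check whether a sentence's minimal quantum count covers the proposition."""
    return chi_min >= quantum_number(proposition)

PROPOSITION = "SCARES(ghosts, me)"

print(quantum_number(PROPOSITION))           # 3
print(satisfies_constraint(2, PROPOSITION))  # False: the Korean minimum of 2 falls short
```

The `False` on the last line is exactly the worry raised above: the Korean sentence's minimum of 2 is below the proposition's count of 3.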

Posted: **Fri Jun 05, 2020 7:01 am**

Raphael wrote: ↑Fri Jun 05, 2020 6:06 am
zompist wrote: ↑Fri Jun 05, 2020 4:58 am To put it another way, are ghosts like dogs (pretty universal) or like gods (tied to a shitload of cultural baggage)?
Is there any concept, including "dog", that's not tied to a shitload of cultural baggage?

Sure. Dog is more of a natural class than gods.

As the article says, the concept is full of disagreements, like everything in philosophy. But as it also says, though you can take the position that there's no such thing, that appears to be a minority view and a bit of a bizarre one.

Posted: **Fri Jun 05, 2020 9:40 am**

zompist wrote: ↑Thu Jun 04, 2020 11:50 pm All of the above is why I prefer Dixon's way of putting it: that it seems pretty evident that languages don't vary by an order of magnitude in complexity. We don't have a world where Primitivese is mastered by toddlers at 4, while Complexish takes people till 40.

On the other hand, I remember claims that George W. Bush never mastered English!

Posted: **Fri Jun 05, 2020 9:48 am**

Moose-tache wrote: ↑Fri Jun 05, 2020 3:09 am Then there's irregularity. If a speaker has to keep track of multiple forms of "frightened" depending on technical details like whether it's used with an object or not, etc., then that's more information the speaker must account for. Unlike redundancy, this doesn't really add anything to the sentence, other than the opportunity to disambiguate similar words, but it's necessary to be properly understood. In English this means remembering both the past tense marker -ed, and the combining rule that it reduces to -t or -0 after verbs ending in a coronal plosive.

And it's interesting to know why it matters. It seems to be that regularisation confounds the listener's predictions, and thus degrades his comprehension.

Moose-tache wrote: ↑Fri Jun 05, 2020 3:09 am ...much as English native speakers know that the past tense of "sping" is "spang" but the past tense of "spuck" is "spucked."

Are you suggesting that actually it's "spung" as with "fling"?

Posted: **Fri Jun 05, 2020 10:15 am**

Near the beginning of the thread we talked about how fusional seems to apply mostly to IE languages, but that there are other ones out there that just arent as famous because they arent widely spoken.

My favorite example might be the mysterious Yurok language, which i keep confusing with the unrelated Karok .... at https://en.wikipedia.org/wiki/Yurok_lan ... ifications you can see a wonderful paradigm showing the different words for "big" depending on what noun class the referent is in. Sometimes it's peloy, sometimes plohkeloy, sometimes plep .... then there's pleyteloy, plrʔry, and ploks. Im sure the native speakers see a pattern there, but it's opaque to me, and seeing a second paradigm for "red" right beside it with an altogether different pattern leads me to believe that it is quite complicated indeed.

Another example is the little-known Angoram of Papua, which I suspect is just one of many languages of a similar type. I know it only from a brief mention in an Encyclopedia Britannica article, and Wikipedia's article on it is not very detailed. (Maybe I should fill it in with the Britannica info someday.) I posted about this long ago.

Besides having surprisingly long words such as /ˈnankʰəřařenantʰikʰa/ for "kill" (maybe this is a society of pacifists?), Angoram has the peculiarity that when a noun of a particular noun class is introduced into a clause, "the entire sentence, rather than a single word, is changed". (credit to Stephen Wurm for the wording.) So for example, in a sentence "i killed the frog" the verb for kill would take an inflection to show that the object is in the "frog" noun class (or maybe just "animal" ... the article was light on the details). So would any adjectives that go with it, including numbers.

The Asturian Wikipedia has much more detailed information, including three sample sentences showing just how radically unlike European languages it is. The first two words seem to be the subject and object, but I cant tell which of the remaining words is the verb:

ame akwum kuvambakwum sumupar amenakwum salikəmba "I saw my two big women"

ame pwanggli kəpanggli klupar amenakanggli salikənggliya "I saw my two big arrows"

ame konggəmbər kəvambər pələpar amenkəmbər salikəmbəra "I saw my two big gardens"

Yeah the sentences are a bit clumsy, but i guess Wurm wanted to show that by changing just one word, nearly everything else in the sentence also changes.

What gets me about Angoram is the implication ... although it never explicitly says so .... that these fusional modifications of the words in the sentence are not dependent on noun class, but on the exact form of the word being used as the object, which means that *every single word* in the language would have a unique pattern of modifications to all of those other words, making it far more complicated than even Yurok.

Posted: **Fri Jun 05, 2020 12:57 pm**

bradrn wrote: ↑Fri Jun 05, 2020 3:32 am
aporaporimos wrote: ↑Fri Jun 05, 2020 3:26 am FWIW, my intuitive impression of Classical Greek: despite the morphological complexity, the syntax isn't exactly simple. It's just that instead of syntax rules about word order, you have syntax rules about when and how to use all those cases and tenses and moods.
Minor nitpick: wouldn’t this count as morphosyntax, rather than as either of morphology or syntax?

Oh, maybe; I thought that morphosyntax was a subset of syntax (or maybe a partly overlapping set). My terminology may be coming from traditional grammar. Anyways, my point is that, in English you have a rule "the subject precedes the verb and the object follows it," and in Greek you have a rule "the subject is in the nominative and the object is in the accusative," and these rules seem equivalently complex. (This is what I meant by syntax.) But in Greek, in order to apply the rule, you need to know how the accusative case is formed (add /-n/ to the stem, basically, but the exact form varies by paradigm and between singular and plural as a result of sound change, and neuter nouns work differently). This is what I meant by morphology. You can see this limits the extent to which morphological complexity takes the place of syntactic complexity in Greek: the rules aren't more complicated in the abstract, but there's more you need to know to apply them.

Posted: **Fri Jun 05, 2020 2:55 pm**

Nortaneous wrote: ↑Fri Jun 05, 2020 12:19 am I've definitely heard claims that constructs in [difficult language] aren't learned until [slightly later than expected age], but I'm not sure if this has been studied.

zompist wrote: ↑Fri Jun 05, 2020 2:52 am Yes, that's it. I like his comment because it avoids at least some of the straw men. If we somehow had an absolute metric and found out that Latin is 1.25 times as hard as English-- or the reverse-- it would not invalidate the Ling 101 prof's statement. If it was 6.4 times, that'd be surprising.

I would say that using an (imaginary, magical) absolute metric and finding some language is 1.25 times as hard as English would definitely say something interesting about complexity. I mean, while I do lean towards the idea that some languages are more complex than others, I also think that the differences aren't that great anyway, and 1.25 is more or less the difference of interest I'd expect to see, maybe as much as 1.5. I'm not looking for an order of magnitude here where a 4 year old achieves a stage in one language that only a 40 year old achieves in another.

This and the comments about LING101 apologia are also tied to my complaint about linguists devaluing written language, even though I do understand it's a reaction against the general public overvaluing it. A lot of the so-called artificial things about loftier written styles in literate languages have, presumably, comparable parallels in oral languages spoken by illiterate speakers. Sanskrit and Homeric Greek reflect very conservative oral traditions that retained obsolete morphology and syntax after all. It's not all that different from a Spanish speaker learning about the literary anterior preterite in that language (hube cantado), or French speakers learning the imperfect subjunctive (j'aimasse), both of which are absent in the ordinary spoken language a 10 year old would've acquired. I imagine that a lot of the complexity of illiterate languages is actually of this sort, namely dialectal/uncommon/obscure/old morphology, syntactic constructions or vocabulary, which speakers use something else for until they happen to learn about these other ways of saying something too.

That said, I also know this can be used the other way, to argue that languages are equally complex. Again, my opinion is just my impression, and ultimately the lack of any good metric or good non-impressionistic argumentation means I can't put much confidence in it anyway.

One thing that doesn't come up much in these discussions is, I think it's not so unreasonable to assume that adults in all languages, if they actually use them, constantly learn new things about them. Especially vocabulary, but also styles that involve a certain morphology or syntax or even pronunciation. If this is the case, we could also say that languages are so expansive depending on how people live, how separately they live, how politically or religiously they're divided, how many domains it is used in, etc., that basically a lot of human languages that are in use are too large for most single humans to ever learn exhaustively. Maybe it is possible in a tiny tribal language, but then if you're in a tiny tribe you also find yourself learning the (very divergent) dialects and languages of a lot of your neighbours anyway. This would mean that it's not so much that languages are equally "finitely complex" so that (e.g.) a "10 (or 20) year old" has already acquired "all the basics" in any given language, but that languages, oral-only or written, are just too hugely complex to even ask the question...

I mean, my parents are not all that amazingly well-read in Spanish, but whenever I talk to them for an extended period of time I end up learning new words, because there is just that much spoken Salvadoran vocabulary I just don't know. Just this morning I learned /koˈhojo/ 'bud', which, when I looked it up in the DRAE, turned out to be spelled cohollo (with /h/, cf. albahaca which is /albaˈhaka/ in El Salvador), for which the RAE prefers a form with /g/ (cogollo). Sometimes I learn new meanings of words I know, or morphology, or syntax, or even about sociolectal accents...

As an aside, Frislander's mention of complexity in the form of wild lexical homonymy is particularly interesting. It's not something I had ever considered...

There is also probably something to the idea that syntax, inflectional morphology and the lexicon (derivational morphology, homonymy) balance each other. For many polysynthetic languages with complex polypersonal agreement, it's pointless to talk about S, O and V order, not only because they tend to have pragmatic word order, but because they very rarely have S and O in the same sentence at all. So Ojibwe can be described as being SV or VS, and OV or VO. This may arguably reflect a simplicity in syntax that is not usual in e.g. IE, Turkic or Sinitic languages. I notice that papers on Iroquoian/Algonquian """syntax""" tend to actually be about morphology(!), that is, morphosyntax about the interpretation/selection/scope/orders of morphemes within words (and then at large within sentences). You get weird things (from a European view) in languages with lots of morphemes per word.

bradrn wrote: ↑Wed Jun 03, 2020 7:47 pm Yes, I agree that many languages have at least a couple of fusional morphemes. But as the term is usually used, few languages are as thoroughly fusional throughout as IE (and Semitic) are. To my understanding, the vast majority of Inuktitut inflectional morphemes have only one meaning and are unfused (polypersonal agreement being a prominent exception in many languages), whereas the vast majority of IE inflectional morphemes are highly fused. I don’t think it really makes sense to describe a language such as Inuktitut as ‘fusional’ on the basis of a couple of morphemes alone.

Those "couple of morphemes alone" were just illustrative, to show you that Inuktitut has fusion in its endings of polypersonal agreement.

It's a bummer you discount verbal personal agreement, since that's a natural target of morphological fusion, besides plural marking. Maybe a better example would have been Nortaneous' all-time favourite language, Iau from the Lakes Plain family in Papua, namely its beautiful tonal inflections.

aporaporimos wrote: ↑Fri Jun 05, 2020 12:57 pm
bradrn wrote: ↑Fri Jun 05, 2020 3:32 am Minor nitpick: wouldn’t this count as morphosyntax, rather than as either of morphology or syntax?
Oh, maybe; I thought that morphosyntax was a subset of syntax (or maybe a partly overlapping set). My terminology may be coming from traditional grammar. Anyways, my point is that, in English you have a rule "the subject precedes the verb and the object follows it," and in Greek you have a rule "the subject is in the nominative and the object is in the accusative," and these rules seem equivalently complex. (This is what I meant by syntax.)

Chomsky talks at length about the syntax involved in case selection while calling it "syntax", so I'd say "syntax" does cover anything in morphosyntax, but I wonder about the opposite. My understanding of that term is that it is a union of anything in the morphology and syntax sets (morphology ∪ syntax, or a "full outer join" in SQL terms), not an intersection of the parts where both are involved (morphology ∩ syntax, or an "inner join" in SQL terms), so I'd say your (aporaporimos) use was correct. But maybe some or many linguists insist on the narrower definition (with an intersection).
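To make the two readings concrete, here is a toy Python sketch using sets. The topic labels are invented purely for illustration and are not a claim about where any real phenomenon belongs; the point is only the difference between the union and the intersection.

```python
# Toy illustration: "morphosyntax" as a union versus an intersection.
# The topic labels below are made-up placeholders, not a real classification.

morphology = {"paradigm formation", "case selection", "agreement"}
syntax = {"word order", "case selection", "agreement"}

# Broad reading: anything in either set (morphology ∪ syntax).
broad_morphosyntax = morphology | syntax

# Narrow reading: only what involves both (morphology ∩ syntax).
narrow_morphosyntax = morphology & syntax

print(sorted(broad_morphosyntax))
print(sorted(narrow_morphosyntax))

# The narrow reading is always contained in the broad one.
print(narrow_morphosyntax <= broad_morphosyntax)
```

Under the broad reading "word order" counts as morphosyntax; under the narrow one it does not, which is the whole disagreement in miniature.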

Posted: **Fri Jun 05, 2020 4:35 pm**

zompist wrote: ↑Fri Jun 05, 2020 4:58 am In your sentence, it seems odd to me that you don't include "I" (in some abstracted form) as necessary. "I'm afraid of ghosts" and "you're afraid of ghosts" seem to me to differ in as important a way as "I'm afraid of ghosts" vs "I'm afraid of vampires".

Thank you for pointing this out. But I'm not suggesting that "I" has no meaning in the sentence. I am simply looking at the minimal translations of the concept. If a language allows us to imply a first person subject without mentioning it, then it does not count as part of that language's morpho-syntactic complexity (although any rules relating to pronominal implicature and omission might be, maybe). Obviously you could say "You are afraid of ghosts," and these two sentences would contrast in any language. But the fact that two sentences contrast does not mean that the parts that contrast are essential to the individual sentences.

Here's another example.
"I have an appointment."
"I have an appointment on Tuesday."
Clearly these sentences contrast. In fact, we can add any number of additional dimensions, like "unfortunately" or "before lunch." But none of this implies that "I have an appointment" means that the appointment is on Monday, is fortunate, or takes place after lunch. This also gets to your point about Mandarin aspect. I hinted at this above, but the ability to add aspectual information is not the same as saying that the default sentence is secretly marked for every thing that it isn't. This is the sort of madness that leads students to draw sentence trees by starting with a half dozen empty layers before reaching any information actually present in the sentence or the mind of the speaker. Similarly, if Korean allows us to treat first person subjects as a given, and then allows us to specify other subjects by adding information, that does not mean that the first person subject was overtly there the whole time, as a "null quantum" of information, any more than "not before lunch" is hiding in the sentence "I have an appointment."

I have a reason for all this. I am expressly using the grammatical rules of these languages because that's what we're comparing: morpho-syntactic complexity. When I talk about the minimal units of information required by the concept, I do not mean to suggest that I've discovered some Platonic ideal of thought. I simply mean the concept that is actually conveyed by the subject language. And in the subject language, i.e. Korean, "I am afraid of ghosts" does not need "I." What fourth dimensional mathematics is going on in a Korean speaker's mind, and whether they begin their thought process with X-bar or something, is none of my concern. If a Korean is scared of ghosts and needs to employ their language's tools for expressing that idea, this is what we get.

Well, I'm not sure about the ghosts. Informally, I expect it's a pretty common concept. Are the Choctaw and Korean concepts the same concept though? To put it another way, are ghosts like dogs (pretty universal) or like gods (tied to a shitload of cultural baggage)?

Choctaw ghosts are decidedly different from Korean ghosts. As I've said before, I am being deliberately sneaky when it comes to semantics in all of my examples, but you could probably improve them by using less culturally-dependent terms.

Another caveat, just to get it out in the open before someone inevitably mentions it: I am also treating each example as if it is in a vacuum. Pragmatics will complicate things by allowing different amounts of omission in different languages. If someone asks you "You're trembling! What's the matter with you?" It is acceptable in English to say "Afraid of ghosts," even though that kind of pro-drop would make a sentence nonsensical in isolation. My hope is that either these things will cancel out across large samples, or they can be addressed by an additional layer of analysis of quantifying units of information.

Posted: **Fri Jun 05, 2020 6:15 pm**

Moose-tache wrote: ↑Fri Jun 05, 2020 4:35 pm If a language allows us to imply a first person subject without mentioning it, then it does not count as part of that language's morpho-syntactic complexity (although any rules relating to pronominal implicature and omission might be, maybe).

Fair enough. (Except for the final "maybe". Pragmatics is part of language too.)

Here's another example.
"I have an appointment."
"I have an appointment on Tuesday."
Clearly these sentences contrast. In fact, we can add any number of additional dimensions, like "unfortunately" or "before lunch." But none of this implies that "I have an appointment" means that the appointment is on Monday, is fortunate, or takes place after lunch. This also gets to your point about Mandarin aspect. I hinted at this above, but the ability to add aspectual information is not the same as saying that the default sentence is secretly marked for every thing that it isn't.

This isn't a very good analogy. Sure, you can always add extra information. But that doesn't mean that all information is an oozy homogenous mass of equal importance. The whole thing about morphosyntax is that the language insists that certain things are important, and certain things are extraneous. So e.g. we say that evidence is essential in a Quechua sentence but not an English one. We can add information in either case: either speaker can explicitly say "I know this because Ramón told me." But only Quechua makes you explicitly mark evidentiality in almost every sentence.

This is the sort of madness that leads students to draw sentence trees by starting with a half dozen empty layers before reaching any information actually present in the sentence or the mind of the speaker.

No it isn't. I'm happy to argue both for and against Chomskyan syntax in another thread. I'll just point out though that you might like Jackendoff & Culicover's Simpler Syntax. They have syntax trees which closely match surface structure, no empty nodes allowed, and a separate semantic layer that's not far from what you're talking about.

You may not like null markings, but a) they're very useful, and b) it's not always neatly obvious where to divide morphemes anyway.

E.g., I assume you'd allow that Russian morphology requires marking case, gender, and number? Except when it doesn't: Akademiya Nauk "Academy of Sciences" has a bare root nauk which cannot appear (IIRC) in any other context. So by your standards it's "madness" to say that it's a genitive plural? Then what's (say) nauka— do you gloss it "science.gen.pl-nom.s"? Or have two different roots "nauk", one meaning "science" and one meaning "science.gen.pl"? You do you, but it's simple and natural to just say that the genitive plural has a null marking.

Examples of the latter problem: is the root for French finir fin- or fini-? Is Biblical Hebrew hammayim 'the waters' ha-mmayim or ham-mayim or what? (If it's not clear: the article is ha- and causes gemination of a following non-guttural consonant.)

A little more difficult: is "je parle" marked for mood? Morphologically, it happens to merge indicative and subjunctive. But plenty of other verbs do make the distinction.

When a language doesn't require a distinction at all, I'm happy with your no-null-nodes thing. That's why I said that Mandarin doesn't indicate tense. But when a distinction is made regularly and obligatorily, it's part of the language's morphosyntax even if in some sample sentences it happens to be null.

To put it another way, you can forget the word for "unfortunately" and still be said to speak Mandarin pretty well. If you don't know aspect, however, you can't.

Posted: **Fri Jun 05, 2020 11:48 pm**

Pabappa wrote: ↑Fri Jun 05, 2020 10:15 am Besides having surprisingly long words such as /ˈnankʰəřařenantʰikʰa/ for "kill" (maybe this is a society of pacifists?), Angoram has the peculiarity that when a noun of a particular noun class is introduced into a clause, "the entire sentence, rather than a single word, is changed". (credit to Stephen Wurm for the wording.) So for example, in a sentence "i killed the frog" the verb for kill would take an inflection to show that the object is in the "frog" noun class (or maybe just "animal" ... the article was light on the details). So would any adjectives that go with it, including numbers.

The Asturian Wikipedia has much more detailed information, including three sample sentences showing just how radically unlike European languages it is. The first two words seem to be the subject and object, but I cant tell which of the remaining words is the verb:

ame akwum kuvambakwum sumupar amenakwum salikəmba "I saw my two big women"

ame pwanggli kəpanggli klupar amenakanggli salikənggliya "I saw my two big arrows"

ame konggəmbər kəvambər pələpar amenkəmbər salikəmbəra "I saw my two big gardens"

Yeah the sentences are a bit clumsy, but i guess Wurm wanted to show that by changing just one word, nearly everything else in the sentence also changes.

What gets me about Angoram is the implication ... although it never explicitly says so .... that these fusional modifications of the words in the sentence are not dependent on noun class, but on the exact form of the word being used as the object, which means that *every single word* in the language would have a unique pattern of modifications to all of those other words, making it far more complicated than even Yurok.

From The Languages of the Sepik-Ramu Basin and Environs, some Angoram glosses:

imbarŋgar ta ami-na-klea sum-erəm kup-le
pig.I.PL these.I.PL 1SG-POSS-I.PL I.PL-three big-I.PL
these three big pigs of mine

paruŋgli kle ami-na-ŋglea kl-erəm kup-aŋglea
betelnut.III.PL these.III.PL 1SG-POSS-III.PL III.PL-three big-III.PL
these three big betelnuts of mine

səmur wura ami-na-kura wa-rəm kup-ura
cane.VIII.PL these.VIII.PL 1SG-POSS-VIII.PL VIII.PL-three big-VIII.PL

And Foley writes:

As in Bantu languages all modifiers of nouns, like possessors, deictics, numerals, adjective and relative clauses must agree in class and number with their head noun. The basis of noun classification in Lower Sepik languages is a mix of semantic and phonological criteria. There is always an animate class (split in Yimas into three, for male humans, for female humans and for higher animals, essentially mammals and the cassowary), one for useful plants, and a number of classes defined by the final phoneme of the root, e. g. –ŋg, -mb, -i, -aw, etc.

So it's not the case that every single word has its own noun class. One difficult thing, however, is that unlike Bantu (and other Papuan languages like Tehit*), there are many different concord forms for the same class.

* Tehit has concord for person (and gender in the third-person singular) on just about anything that isn't an alienable noun (Hesse: "a wide variety of word classes, including verbs, adjectives, possessives, quantifiers, inalienable nouns (partitives and kin terms), prepositions, relativizers, and even conjunctions"), primarily in the form of uniconsonantal prefixes:

Wa-wet o-u w-lok oli m-aka w-hitung oli-m.
3SM-child DET-3SM 3SM-pick_up again 3SF-come 3SM-count again-3SF
His child again picked up another (grub) and again distributed (it).

Tet t-to t-adien Srer la mam m-kain se oko ha.
I 1SG-say 1SG-with Srer DU 1EXC 1EXC-own river DEM but
I say that I (my clan) along with the Srer (clan), we both own this river, however.

W-thok w-ak aidi m-fot wa-wet o-u w-dik.
3SM-split_out 3SM-at down_there 3SF-finish 3SM-child DET-3SM 3SM-put
When he finished chopping out a grub, his child would set it down.

W-lok oli m-an o m-ahin bet-alit m-aka fo w-to m-an ko m-an t-ate-u.
3SM-pick_up again 3SF-REL DEM 3SF-from mud-wallow 3SF-come then 3SM-say 3SF-REL DEM 3SF-for 1SG-grandparent-3SM
He picked another one up from the boar wallow and said: This one is for grandpa.

Posted: **Sat Jun 06, 2020 6:51 am**

I’ve been thinking about this a bit, and one conclusion I’ve come to is that we may be conflating different types of complexity. Let me try to do my best to disentangle them.

Firstly, there’s complexity in terms of how many elements are obligatory in a sentence. Mandarin has obligatory aspect; English has obligatory tense and aspect; Quechua has obligatory evidentiality (plus I’m sure a few more things). I think my χ measurements earlier capture this fairly well, but it should be pretty obvious anyway without a formal definition.

Next, there’s complexity in terms of number of morphological paradigms and irregularity (I’m merging these two because they seem related to me). Turkish has practically no irregularity or complex paradigms; English has a little bit of irregularity; Latin and Hebrew have lots of morphological paradigms, and as far as I’m aware (which is not far) they also have quite a bit of irregularity. This measure seems roughly correlated with how much grammatical stuff you have to rote-learn when learning a new language.

Finally, there’s complexity in terms of some nebulous whole-language property, which is distributed between syntax, morphology, the lexicon, semantics etc. This seems to be very hard to define, which is a pity, as this is the sort of complexity we’re talking about when we say stuff like ‘all languages are roughly equally complex’. (This can’t be referring to either of the other two measurements, as those obviously differ between languages.) I think that finding a good definition for this (as well as disambiguating a bit more between these three measurements) would solve a lot of the confusion we’ve been having.

Posted: **Sat Jun 06, 2020 9:30 am**

Am I missing something? Because, coming to this late, it occurs to me that all languages have the same problem, which is how to express the complexity of the thoughts that we as humans have. Since the complexity of the thoughts that need putting into speech is not really simpler if you're Russian, Brazilian, American, Hawaiian, Egyptian or whatever, why would we expect Russian, Portuguese, English, Hawaiian or Arabic to be significantly different in complexity? We might not have a definite measurement of that (making talk of perfect counter-balancing rather silly), but I'd expect them, in different ways, to be of equivalent complexity.

Which also means this trope (which I get from my dad a lot) about languages getting less complex is wrong. Languages only get less complex if the ideas we want to express get less complex. The only language I can think of in that category is Orwell's Newspeak, specifically designed by the Party to make expressing non-approved ideas impossible and hence make the ideas impossible.

Posted: **Sat Jun 06, 2020 9:47 am**

evmdbm wrote: ↑Sat Jun 06, 2020 9:30 am Am I missing something? Because, coming to this late, it occurs to me that all languages have the same problem, which is how to express the complexity of the thoughts that we as humans have. Since the complexity of the thoughts that need putting into speech is not really simpler if you're Russian, Brazilian, American, Hawaiian, Egyptian or whatever, why would we expect Russian, Portuguese, English, Hawaiian or Arabic to be significantly different in complexity? We might not have a definite measurement of that (making talk of perfect counter-balancing rather silly), but I'd expect them, in different ways, to be of equivalent complexity.

This is (I think) an important point, and I did mention it in passing earlier:

bradrn wrote: ↑Fri Jun 05, 2020 12:14 am … Languages can all represent roughly the same things …

But you said it much better than I did.

Which also means this trope (which I get from my dad a lot) about languages getting less complex is wrong. Languages only get less complex if the ideas we want to express get less complex. The only language I can think of in that category is Orwell's Newspeak, specifically designed by the Party to make expressing non-approved ideas impossible and hence make the ideas impossible.

This is a common trope amongst non-linguists. Everyone going back to the Romans has complained about the younger generation being lazy in their pronunciation of the beautiful language of their ancestors, and if this goes on much longer we’ll soon be speaking in monosyllabic grunts. And everyone going back to the Romans was (and is) dead wrong about this.

(I really mean it about the Romans, by the way. A quote by Cicero, from Deutscher’s The Unfolding of Language: ‘practically everyone … in those days [last century] spoke correctly. But the lapse of time has certainly had a deteriorating effect in this respect’. And while I’m at it, a quote in 1974 by Hans Weigel, showing impressive chutzpah: ‘every age claims that its language is more endangered and threatened by decay than ever before. In our time, however, language really is endangered and threatened by decay as never before …’)

Posted: **Sat Jun 06, 2020 2:16 pm**

bradrn wrote: ↑Sat Jun 06, 2020 9:47 am Everyone going back to the Romans has complained about the younger generation being lazy in their pronunciation of the beautiful language of their ancestors, and if this goes on much longer we’ll soon be speaking in monosyllabic grunts. And everyone going back to the Romans was (and is) dead wrong about this.

the Romans were clearly right, at least about the French

Posted: **Sat Jun 06, 2020 2:18 pm**

bradrn wrote: ↑Sat Jun 06, 2020 6:51 am English has a little bit of irregularity

Curiously, I've seen a claim that it has, compared to other languages, a very high number of irregular verbs.

Posted: **Sat Jun 06, 2020 2:24 pm**

evmdbm wrote: ↑Sat Jun 06, 2020 9:30 am Am I missing something? Because, coming to this late, it occurs to me that all languages have the same problem, which is how to express the complexity of the thoughts that we as humans have. Since the complexity of the thoughts that need putting into speech is not really simpler if you're Russian, Brazilian, American, Hawaiian, Egyptian or whatever, why would we expect Russian, Portuguese, English, Hawaiian or Arabic to be significantly different in complexity? We might not have a definite measurement of that (making talk of perfect counter-balancing rather silly), but I'd expect them, in different ways, to be of equivalent complexity.

Agree that the complexity of human thought puts an important lower bound on how complex language can be. I don't think it imposes an upper bound, though.

As others have mentioned above, writing systems vary greatly in complexity, even though they are all means of expressing spoken language in graphical form. (Generally different spoken languages, sure, but you can also look at alternative scripts for a single language, such as Pinyin and Hanzi for Mandarin.) So it's not true as a general principle that the complexity of a means of expression is determined by the complexity of the underlying content being expressed.

Another example: I would argue that the Mandarin system of numerals is simpler than that of English. Mandarin has unique terms for numerals 1 - 10; then 11 - 19 are formed by concatenation "ten one," "ten two," etc; 20 - 29 are "two ten," "two ten one," and so on all the way up to 100, which has another unique term. English, on the other hand, has irregular terms "eleven" and "twelve," then 13 - 19 are formed somewhat irregularly with a suffix, and there's another suffix for multiples of ten 20 - 90, which sounds confusingly similar to the other suffix. French, I'm told, is worse. But these numeral systems all solve the same problem, namely expressing cardinal numbers. There's a difference in complexity of expression that doesn't correspond to a difference in complexity of what's being expressed.
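To make the comparison concrete, here is a rough sketch of both systems for 1 to 99. It uses English glosses ("ten one", "two ten") in place of the actual Mandarin words, as in the description above. The function names and the "count the stored forms" measure are my own assumptions for illustration, not an established metric.

```python
# Hypothetical sketch: glosses stand in for the Mandarin words themselves.
MANDARIN_UNITS = ["one", "two", "three", "four", "five",
                  "six", "seven", "eight", "nine"]


def mandarin_gloss(n: int) -> str:
    """Gloss of the Mandarin numeral for 1 <= n <= 99, e.g. 21 -> 'two ten one'."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1..99")
    tens, units = divmod(n, 10)
    parts = []
    if tens >= 2:
        parts.append(MANDARIN_UNITS[tens - 1])  # multiplier goes before 'ten'
    if tens >= 1:
        parts.append("ten")
    if units:
        parts.append(MANDARIN_UNITS[units - 1])
    return " ".join(parts)


# English needs extra stored forms: 'eleven', 'twelve', irregular teen stems,
# and a separate word for each multiple of ten.
ENGLISH_SMALL = ["one", "two", "three", "four", "five", "six",
                 "seven", "eight", "nine", "ten", "eleven", "twelve"]
ENGLISH_IRREGULAR_TEENS = {13: "thirteen", 15: "fifteen", 18: "eighteen"}
ENGLISH_TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
                6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}


def english(n: int) -> str:
    """English numeral for 1 <= n <= 99."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1..99")
    if n <= 12:
        return ENGLISH_SMALL[n - 1]
    if n <= 19:
        # Regular teens are unit + 'teen'; the irregular ones are looked up.
        return ENGLISH_IRREGULAR_TEENS.get(n, ENGLISH_SMALL[n - 11] + "teen")
    tens, units = divmod(n, 10)
    word = ENGLISH_TENS[tens]
    if units:
        word += "-" + ENGLISH_SMALL[units - 1]
    return word


# Crude proxy for rote-learned material: how many forms each table stores.
mandarin_stored = len(MANDARIN_UNITS) + 1  # nine units plus 'ten'
english_stored = (len(ENGLISH_SMALL) + len(ENGLISH_IRREGULAR_TEENS)
                  + len(ENGLISH_TENS))
```

On this crude count the Mandarin table stores 10 forms against 23 for English, even though both functions cover exactly the same range of numbers. How you count (for instance, whether "-teen" and "-ty" are separate items) is a judgment call, so treat the numbers as illustrative only.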

Posted: **Sat Jun 06, 2020 2:39 pm**

aporaporimos wrote: ↑Sat Jun 06, 2020 2:24 pm
evmdbm wrote: ↑Sat Jun 06, 2020 9:30 am Am I missing something? Because, coming to this late, it occurs to me that all languages have the same problem, which is how to express the complexity of the thoughts that we as humans have. Since the complexity of the thoughts that need putting into speech is not really simpler if you're Russian, Brazilian, American, Hawaiian, Egyptian or whatever, why would we expect Russian, Portuguese, English, Hawaiian or Arabic to be significantly different in complexity? We might not have a definite measurement of that (making talk of perfect counter-balancing rather silly), but I'd expect them, in different ways, to be of equivalent complexity.
Agree that the complexity of human thought puts an important lower bound on how complex language can be. I don't think it imposes an upper bound, though.

I think that was zompist's point about languages being good enough for their speakers all around the world, but not as clearly stated. Regardless of how much more or less complex a language is than others, there's still a whole reality to deal with. The moment you start dealing with plants, collecting fruit and seeds and roots, grinding them for their chemical properties, growing them, using them for decoration, buying and selling them, etc., and talking about these things to yourself and other people, you need a whole bunch of vocabulary and expressions, and the lexical knowledge of the grammatical constructions they're used with.

I mean, I am constantly making up new vocabulary for everything I do, some of which there's standard vocabulary for, which I just don't know. I remember I used to say "relativization on [a noun phrase]" (e.g. relativization on subjects), which I had made up for want of a better term, until Salmoneus corrected me, telling me that the standard expression is "to be relativized" (said of a noun phrase), which as a noun therefore becomes "relativization of [a noun phrase]".

Posted: **Sat Jun 06, 2020 4:24 pm**

bradrn wrote: ↑Sat Jun 06, 2020 6:51 am Finally, there’s complexity in terms of some nebulous whole-language property, which is distributed between syntax, morphology, the lexicon, semantics etc. This seems to be very hard to define, which is a pity, as this is the sort of complexity we’re talking about when we say stuff like ‘all languages are roughly equally complex’. (This can’t be referring to either of the other two measurements, as those obviously differ between languages.) I think that finding a good definition for this (as well as disambiguating a bit more between these three measurements) would solve a lot of the confusion we’ve been having.

Most of this is nebulous because it's less studied, or incompletely studied. You mentioned Construction Grammar earlier, so we can put it this way: how many constructions are there in English? I don't think anyone knows. For comparison, John Ross has a list of 200 transformations in English, with no claim that we've found them all. And constructions are a broader category.

But this is why I talked earlier about how long it takes children to learn the language. That amasses everything about the language. Plus, it does put an upper limit on complexity. You can't really have a language that takes longer to learn than half the average lifespan.

(With the same caveats as above: not talking about writing systems, separate literary languages, etc. Societies can definitely keep adding to education requirements, as those of you in grad school know.)

Posted: **Sat Jun 06, 2020 4:50 pm**

aporaporimos wrote: ↑Sat Jun 06, 2020 2:24 pm Agree that the complexity of human thought puts an important lower bound on how complex language can be. I don't think it imposes an upper bound, though.

But why be loads more complicated than necessary? I mean, there's room for manoeuvre, so expecting perfect counterbalancing is silly, but at some point it's "why bother" territory...

Morphological complexity
