Morphological complexity

bradrn · Post by **bradrn** » Fri Jun 05, 2020 12:14 am

zompist wrote: ↑Thu Jun 04, 2020 11:50 pm
aporaporimos wrote: ↑Thu Jun 04, 2020 2:05 pm The claim that all languages are equally complex has always seemed implausible to me on its face. If it were actually true that 1) complexity of languages can be measured and quantified
This seems to me to fundamentally misunderstand why Linguistics 101 books say things like this. It is not because there is some sort of claim that there is a complexity metric. It's because non-linguists are obsessed with which languages are better than others, and complexity is part of that. …

But that’s not what we’ve been discussing. As you note, we’re all past the Ling 101 level; hopefully no-one here would seriously believe that one language is ‘better’ than another in this way, and no-one has mentioned that at all. What we’ve been discussing is whether it makes sense at any higher level to define the ‘complexity’ of a language, and if so whether all languages will end up roughly as complex as each other or not.

the much stronger claim that complexity in one part of a language will always be perfectly counterweighed by simplicity in another part.
There is no such claim.

I dunno, I actually think there could be reason to make such a claim, at least in a weaker form. Languages can all represent roughly the same things, but they all do it using different combinations of morphology, syntax and lexical choice (and probably a few more I’ve missed). So, for instance, Abkhaz and Tiwi have hugely elaborate morphologies, but not much syntax to speak of — you can barely define what a constituent is. On the other hand, English, Standard Chinese and Yoruba don’t have much morphology, but have hugely elaborate syntax. Another example: languages can represent causation using either morphology, syntax or lexical choice, and most languages only have one or two of these mechanisms dominant. Any lack of one causative construction is compensated by the presence of another.

All of the above is why I prefer Dixon's way of putting it: that it seems pretty evident that languages don't vary by an order of magnitude in complexity. We don't have a world where Primitivese is mastered by toddlers at 4, while Complexish takes people till 40.

I think this is the single strongest argument for ‘all languages are equally complex’ that I’ve seen in this thread so far.

Nortaneous · Post by **Nortaneous** » Fri Jun 05, 2020 12:19 am

zompist wrote: ↑Thu Jun 04, 2020 11:50 pm All of the above is why I prefer Dixon's way of putting it: that it seems pretty evident that languages don't vary by an order of magnitude in complexity. We don't have a world where Primitivese is mastered by toddlers at 4, while Complexish takes people till 40.

Would it need to be an order of magnitude? I've definitely heard claims that constructs in [difficult language] aren't learned until [slightly later than expected age], but I'm not sure if this has been studied. I'd be surprised if L1 acquisition takes the exact same amount of time in Tok Pisin and Inuktitut.

missals · Post by **missals** » Fri Jun 05, 2020 12:55 am

zompist wrote: ↑Thu Jun 04, 2020 11:50 pm
the much stronger claim that complexity in one part of a language will always be perfectly counterweighed by simplicity in another part.
There is no such claim.

Yes, there absolutely is, and it surprises me that you would claim there isn't. It forms a regular part of the conventional spiel given to non-linguists about why no language is better or more complex than any other - two separate claims that are conflated by non-linguists, a notion that is unfortunately and falsely accepted by the typical linguist response. We all know how it goes:

"Well, some languages, like Mohawk, have very complex morphology, meaning they use lots of complicated devices to build words with very complex structures. Many of these languages have rather free word order. But what about English? Our morphology is much less complex, isn't it? Well, English has much stricter word order than Mohawk, meaning it's more syntactically complex. So we can't really say one language is more complex than another."

Or, in response to "Why do languages get less complex over time?"

Well, first off, that's just a trend within Indo-European. But if we look at the changes, we can see that the languages aren't really getting less complex over time - Old English had more nominal cases than Modern English, but it had much freer word order. Nowadays, Modern English has almost entirely lost nominal case, but we have much stricter word order - so as one area of the grammar simplified, another got more complicated.

These stories form a regular part of linguist "apologia" to non-linguists and language enthusiasts, and I would be surprised if any long-time participant in linguistic discussion boards had never seen it, let alone engaged in it themselves.

And yes, it features in the scholarly literature too, both implicitly and explicitly.

From Trudgill's Sociolinguistic Typology:

Of course, the variable complexity claim would not be true if simplification in one part of a language were automatically compensated for by complexification elsewhere, as seems to be suggested by the Hockett quotation above. I do not consider syntax at all in this book, but Dahl (2009) demonstrates convincingly that Hockett’s argument about the relationship between morphological and syntactic complexity does not hold. He is supported in this argument by Sampson (2009), and in particular by the work of Shosted (2006), who does not succeed in demonstrating the validity of this negative correlation hypothesis, as he calls it: it is not the case that if one component of language, e.g. morphology, is simplified, then another, e.g. syntax, must necessarily be elaborated to compensate in terms of the overall level of complexity.

The Hockett quote in question:

the total grammatical complexity of any language, counting both morphology and syntax, is about the same as any other.

So yes, scholars have made this claim before, and it has been argued against by the likes of Trudgill, Dahl, McWhorter, etc., who propose that some languages actually are more complex than others, and that complexity is inversely correlated with certain social phenomena - namely mass adult acquisition - and positively correlated with small, tight-knit social networks.

Yalensky · Post by **Yalensky** » Fri Jun 05, 2020 1:04 am

Nortaneous wrote: ↑Fri Jun 05, 2020 12:19 am
zompist wrote: ↑Thu Jun 04, 2020 11:50 pm All of the above is why I prefer Dixon's way of putting it: that it seems pretty evident that languages don't vary by an order of magnitude in complexity. We don't have a world where Primitivese is mastered by toddlers at 4, while Complexish takes people till 40.
Would it need to be an order of magnitude? I've definitely heard claims that constructs in [difficult language] aren't learned until [slightly later than expected age], but I'm not sure if this has been studied. I'd be surprised if L1 acquisition takes the exact same amount of time in Tok Pisin and Inuktitut.

I think Zompist might be referring to Dixon's slim and polemical The Rise and Fall of Languages, so here's the footnote on p.75 that might shed light on what you raise:

R.M.W. Dixon wrote: There is a mistaken belief among some linguists that 'all languages are equal'. While it is the case that all languages are roughly equal (that is, no language is six times as complex as any other, and there are no primitive languages), it is by no means the case that they are exactly equal. (Slobin, 1982, shows how speakers of different languages master comparable parts of their grammars at a quite different rate.) I have done field work on languages in Australia, Oceania and Amazonia and they were certainly not all equally difficult to describe. There is no doubt that one language may have greater overall grammatical complexity and/or a communicative advantage in a certain sphere, over another. But this is a topic for a separate book.

Not being well read in Dixon I don't know if this separate book ever emerged (perhaps Zompist is citing that one instead?). So part of Dixon's judgment is indeed formed by different acquisitions of comparable constructs. The study Dixon cites is Slobin, Dan I. 1982. 'Universal and particular in the acquisition of language', pp. 128-70 of Language acquisition: The state of the art, edited by Eric Wanner and Lila R. Gleitman. Cambridge: Cambridge UP. (here, though some pages might not be visible)

FWIW it appears that Dixon agrees with what missals says that some linguists have indeed made the "all langs are equally complex" argument. Dixon doesn't name names, though.

mae · Post by **mae** » Fri Jun 05, 2020 1:25 am

bradrn · Post by **bradrn** » Fri Jun 05, 2020 1:36 am

mae wrote: ↑Fri Jun 05, 2020 1:25 am
Nortaneous wrote: ↑Fri Jun 05, 2020 12:19 am I'd be surprised if L1 acquisition takes the exact same amount of time in Tok Pisin and Inuktitut.
"Tok Pisin" as a language actually spoken and learned by people in PNG is drastically different from the 'literary' or 'standard' register that is characterized by an alleged degree of morphosyntactic simplicity (whatever that means); even in the case of the standard register the claims made online about TP that are supposed to be evidence of simplicity are often simply false. The natively learned/spoken varieties are different from the traditional standard to such a degree that I've seen reports that they're actually often not mutually intelligible. Since standard TP and native TP are so radically different I don't think it's plausible that we can make off-the-cuff judgments like this.

I’m surprised to hear this. Could you give an example?

Post by **zompist** » Fri Jun 05, 2020 2:30 am

bradrn wrote: ↑Fri Jun 05, 2020 12:14 am
the much stronger claim that complexity in one part of a language will always be perfectly counterweighed by simplicity in another part.
I dunno, I actually think there could be reason to make such a claim, at least in a weaker form. Languages can all represent roughly the same things, but they all do it using different combinations of morphology, syntax and lexical choice (and probably a few more I’ve missed). So, for instance, Abkhaz and Tiwi have hugely elaborate morphologies, but not much syntax to speak of — you can barely define what a constituent is. On the other hand, English, Standard Chinese and Yoruba don’t have much morphology, but have hugely elaborate syntax. Another example: languages can represent causation using either morphology, syntax or lexical choice, and most languages only have one or two of these mechanisms dominant. Any lack of one causative construction is compensated by the presence of another.

All this is reasonable, but it isn't what's stated above-- that complexity will "always be perfectly counterweighed".

As I said earlier, it seems reasonable that languages should balance out: as distinctions are lost in one area, speakers find a way to make them (i.e. complicate things) in another.

But you know, "it seems reasonable" isn't a "strong claim".

Post by **zompist** » Fri Jun 05, 2020 2:52 am

Yalensky wrote: ↑Fri Jun 05, 2020 1:04 am I think Zompist might be referring to Dixon's slim and polemical The Rise and Fall of Languages, so here's the footnote on p.75 that might shed light on what you raise:

Yes, that's it. I like his comment because it avoids at least some of the straw men. If we somehow had an absolute metric and found out that Latin is 1.25 times as hard as English-- or the reverse-- it would not invalidate the Ling 101 prof's statement. If it was 6.4 times, that'd be surprising.

Here's how David Crystal puts it; note the very careful wording that avoids the straw man that we have that absolute metric: "All languages meet the social and psychological needs of their speakers, are equally deserving of scientific study, and can provide us with valuable information about human nature and society."

Nortaneous wrote: I've definitely heard claims that constructs in [difficult language] aren't learned until [slightly later than expected age], but I'm not sure if this has been studied.

Yes, this sort of thing has been studied. So, e.g., agglutinative Turkish cases are learned earlier than the far less regular Serbo-Croatian ones. That shouldn't surprise anyone (except Chomsky, for other reasons).

bradrn · Post by **bradrn** » Fri Jun 05, 2020 3:01 am

zompist wrote: ↑Fri Jun 05, 2020 2:30 am
bradrn wrote: ↑Fri Jun 05, 2020 12:14 am
the much stronger claim that complexity in one part of a language will always be perfectly counterweighed by simplicity in another part.
I dunno, I actually think there could be reason to make such a claim, at least in a weaker form. Languages can all represent roughly the same things, but they all do it using different combinations of morphology, syntax and lexical choice (and probably a few more I’ve missed). So, for instance, Abkhaz and Tiwi have hugely elaborate morphologies, but not much syntax to speak of — you can barely define what a constituent is. On the other hand, English, Standard Chinese and Yoruba don’t have much morphology, but have hugely elaborate syntax. Another example: languages can represent causation using either morphology, syntax or lexical choice, and most languages only have one or two of these mechanisms dominant. Any lack of one causative construction is compensated by the presence of another.
All this is reasonable, but it isn't what's stated above-- that complexity will "always be perfectly counterweighed".

True, which is why I was careful to say that ‘there could be reason to make such a claim, at least in a weaker form’.

zompist wrote: ↑Fri Jun 05, 2020 2:52 am Here's how David Crystal puts it; note the very careful wording that avoids the straw man that we have that absolute metric:

Crystal wrote: All languages meet the social and psychological needs of their speakers, are equally deserving of scientific study, and can provide us with valuable information about human nature and society.

This statement seems to me to be so obviously true as to be irrelevant. Of course all languages ‘are equally deserving of scientific study’; if we didn’t believe that, we wouldn’t be doing linguistics! And not only that, this quote doesn’t seem to be all that relevant for our discussion here: it’s a long jump to go from ‘All languages meet the social and psychological needs of their speakers’ to ‘All languages are about as complex as each other’. I mean, I agree with you that all languages are about equally complex, but this quote feels too close for comfort to a motte-and-bailey tactic.

Moose-tache · Post by **Moose-tache** » Fri Jun 05, 2020 3:09 am

If we start from a perspective of what ideas a person can convey with a certain language, then we should expect relatively constant levels of complexity. You need to indicate that you are frightened; that's one quantum of information. Then you need to specify that you're frightened because of ghosts, that's another. Two quanta. Every language has to be able to deliver that, and no language can discover a third essential quantum unless the speaker actually meant to convey one. If we dispense with the notion of a universal syntax, we might also need a third quantum: a syntactic rule to explain the role of word 1 with word 2 in the absence of overt explanation.

But then there's redundancy. If every sentence requires an aspect marker, then you have to add that you're scared of ghosts continuously. If you require evidentiality or subject pronouns or whatever else then you must account for that. Native speakers will undoubtedly count these among the essential pieces of information, and they may do useful work of disambiguation (for example, if "ghost" sounds like "turtle," then a human agreement marker will help demonstrate that you're afraid of ghosts and not turtles), but we can argue over whether or not they're really necessary to the expression "scared of ghosts." In Choctaw this means making sure that every main verb has a tense. In English it means... well, we do that in English too.

Then there's irregularity. If a speaker has to keep track of multiple forms of "frightened" depending on technical details like whether it's used with an object or not, etc., then that's more information the speaker must account for. Unlike redundancy, this doesn't really add anything to the sentence, other than the opportunity to disambiguate similar words, but it's necessary to be properly understood. In English this means remembering both the past tense marker -ed, and the combining rule that it reduces to -t or -0 after verbs ending in a coronal plosive. In Cherokee it means knowing what stem ending each verb gets before adding tense suffixes. In our ghost example, we might decide that each predicate belongs to one of a number of alignment types, so sensory verbs take experiencers while action verbs take agents, or emotional states describe the speaker unless otherwise specified, and this must be memorized.

So everything beyond those two-three bare minimum quanta of information ("scare" + "ghost" + "default alignment") is the result of either redundancy or irregularity, and places some cognitive burden on the speaker and the listener. But crucially, neither of these things is tied to morphology.

More to the point, how would we know if it was? It's unusual for the complexity of a verb or noun paradigm to come solely in the form of charts of suffixes. Cherokee, for example, has complex stem changes based on what tense suffix follows, but these changes are not random (because of course they are the product of ancient regular sound changes), so native speakers memorize them as analogous sets, much as English native speakers know that the past tense of "sping" is "spang" but the past tense of "spuck" is "spucked." Are Cherokee speakers performing a feat of morphology, or lexicon? Similarly, any description of consonant mutation in Irish, by page count, is dominated by lists of exceptions. Clearly the majority of mental work Irish speakers must do when employing consonant mutation is related to syntax and semantics, but it's also a morphological change.

This is what I like about Construction Grammar. It decouples linguistic analysis from traditional attempts to use words or affixes as clumsy synonyms for the quanta I mentioned above. "Kick the bucket" is clearly one unit of information, and so is "-ed." As hard as it is to quantify the complexity of morphology, it may be a fool's errand to begin with.

That said, are there languages that simply have more or less redundancy and irregularity than others? Leaving aside creoles, do languages differ in how many additional quanta they require over the course of a conversation or a passage? I certainly have not encountered any natural language that gives any impression of doing so at first glance. Muskoki verbs require pronominal prefixes and tense suffixes, but a sentence in Creek doesn't have any more pronominal or tense information than a sentence in English. When I learn a new language, the number of times I think "But why do I even need to include an aspect marker here?" doesn't seem to vary significantly. This is hardly scientific, but all of us probably have developed this sort of intuition through our love of learning languages. So when people say "some languages are more complex," my first thought is always "what is this exotic language you've found that has radically more or fewer quanta of redundant or irregular information? Because I've never discovered it."

aporaporimos · Post by **aporaporimos** » Fri Jun 05, 2020 3:26 am

zompist wrote: ↑Thu Jun 04, 2020 11:50 pm
aporaporimos wrote: ↑Thu Jun 04, 2020 2:05 pm The claim that all languages are equally complex has always seemed implausible to me on its face. If it were actually true that 1) complexity of languages can be measured and quantified
This seems to me to fundamentally misunderstand why Linguistics 101 books say things like this. It is not because there is some sort of claim that there is a complexity metric.

It's because non-linguists are obsessed with which languages are better than others, and complexity is part of that. They want to hear that French is more logical, Italian is more beautiful, Arabic is God's language, Phrygian is the first language, etc, etc. They want to hear that the standard languages are better than dialects. They want to hear that primitive cultures speak primitive languages. (Think of the native characters in, oh, any old movie set in the West or in Asia.)

And it's not just Americans... when I was in Iquitos, my wife's uncle really wanted to hear that Spanish was the hardest language in the world. (I forget if I was nice enough to not tell him that US high school students think it's the easiest.) If I recall correctly, there's a mini-literature in Japan about how Japan is different (and of course better than) other countries, including in its language.

Everyone here is past the Ling 101 stuff, but linguistics professors all run into it and get tired of it and throw in some stuff to combat the myths.

the much stronger claim that complexity in one part of a language will always be perfectly counterweighed by simplicity in another part.
There is no such claim.

The annoying bit is that the people who criticize the Ling 101 tidbit, as I'm afraid you're doing, are the ones who seem to think there is a complexity metric.

What is it? If it's so important, what exactly is your measurement by which you know that "all languages are equally complex" is wrong?

All of the above is why I prefer Dixon's way of putting it: that it seems pretty evident that languages don't vary by an order of magnitude in complexity. We don't have a world where Primitivese is mastered by toddlers at 4, while Complexish takes people till 40.

As missals pointed out, people certainly do believe that all languages are equally complex in a strong sense (perhaps because they read it in a Ling 101 textbook). At any rate, if we here are all past the Ling 101 stuff, then we should be able to discuss the (extremely interesting!) question of whether and how languages vary in complexity without worrying that we're accidentally indulging Phrygian supremacism, or something.

FWIW, my intuitive impression of Classical Greek: despite the morphological complexity, the syntax isn't exactly simple. It's just that instead of syntax rules about word order, you have syntax rules about when and how to use all those cases and tenses and moods. Plenty of material for a couple hundred pages of a reference grammar. Where Greek strikes me as especially simple is that there are very few idiomatic phrases, and much of the derivational morphology, including the ubiquitous preposition + verb compounds, is semantically transparent or nearly so (as compared to, say, English phrasal verbs, which are analogous to Greek prep+verb compounds but often have completely opaque idiomatic meanings). So if I were to apply the balancing-out analysis to Greek, I would say that morphological complexity is balanced out by simplicity of semantics of phrases and derived words--which is something you might not pick up from a reference grammar.

bradrn · Post by **bradrn** » Fri Jun 05, 2020 3:28 am

Moose-tache wrote: ↑Fri Jun 05, 2020 3:09 am If we start from a perspective of what ideas a person can convey with a certain language, then we should expect relatively constant levels of complexity. You need to indicate that you are frightened; that's one quantum of information. Then you need to specify that you're frightened because of ghosts, that's another. Two quanta. Every language has to be able to deliver that, and no language can discover a third essential quantum unless the speaker actually meant to convey one. If we dispense with the notion of a universal syntax, we might also need a third quantum: a syntactic rule to explain the role of word 1 with word 2 in the absence of overt explanation.

But then there's redundancy. If every sentence requires an aspect marker, then you have to add that you're scared of ghosts continuously. If you require evidentiality or subject pronouns or whatever else then you must account for that. Native speakers will undoubtedly count these among the essential pieces of information, and they may do useful work of disambiguation (for example, if "ghost" sounds like "turtle," then a human agreement marker will help demonstrate that you're afraid of ghosts and not turtles), but we can argue over whether or not they're really necessary to the expression "scared of ghosts." In Choctaw this means making sure that every main verb has a tense. In English it means... well, we do that in English too.

I think this comes the closest to a definition of ‘linguistic complexity’ I’ve seen so far. But there’s one point that’s got me a little bit puzzled: what exactly counts as a quantum? For instance, is a quantum a ‘unit of information’? But you count ‘because of ghosts’ as +1ҁ; why not +2ҁ, one quantum for ‘because’ and +1ҁ for ‘ghosts’? Or +3ҁ, for each of ‘because’, ‘ghosts’, ‘∅ indefinite article’? Or +4ҁ, for ‘because’, ‘ghosts’, ‘∅ indefinite article’, and the syntactic rule involved here? My point is that we’ll need a proper definition of what exactly constitutes a ‘quantum’ before this definition becomes useful.

bradrn · Post by **bradrn** » Fri Jun 05, 2020 3:32 am

aporaporimos wrote: ↑Fri Jun 05, 2020 3:26 am … At any rate, if we here are all past the Ling 101 stuff, then we should be able to discuss the (extremely interesting!) question of whether and how languages vary in complexity without worrying that we're accidentally indulging Phrygian supremacism, or something.

I tried to say this earlier, but you put it much better than I did.

FWIW, my intuitive impression of Classical Greek: despite the morphological complexity, the syntax isn't exactly simple. It's just that instead of syntax rules about word order, you have syntax rules about when and how to use all those cases and tenses and moods.

Minor nitpick: wouldn’t this count as morphosyntax, rather than as either of morphology or syntax?

Plenty of material for a couple hundred pages of a reference grammar. Where Greek strikes me as especially simple is that there are very few idiomatic phrases, and much of the derivational morphology, including the ubiquitous preposition + verb compounds, is semantically transparent or nearly so (as compared to, say, English phrasal verbs, which are analogous to Greek prep+verb compounds but often have completely opaque idiomatic meanings). So if I were to apply the balancing-out analysis to Greek, I would say that morphological complexity is balanced out by simplicity of semantics of phrases and derived words--which is something you might not pick up from a reference grammar.

That’s pretty interesting. It seems to corroborate what I said earlier:

bradrn wrote: ↑Fri Jun 05, 2020 12:14 am … morphology, syntax and lexical choice (and probably a few more I’ve missed) …

So: English has lots of syntax, plenty of funny lexical irregularities and choices, but not a lot of morphology. Tiwi has practically no syntax, a hugely complex morphology, and I have no idea about its lexicon. And Classical Greek has lots of morphology, lots of syntax, but practically no complications in its lexicon.

Moose-tache · Post by **Moose-tache** » Fri Jun 05, 2020 3:38 am

bradrn wrote: ↑Fri Jun 05, 2020 3:28 am But you count ‘because of ghosts’ as +1ҁ; why not +2ҁ, one quantum for ‘because’ and +1ҁ for ‘ghosts’?

Because I am speaking English to you, and English requires a preposition to indicate the target of "scared." It is an example of redundancy, not an inherently important part of the sentence. The fact that you parsed it out on the same level of importance as "scared" and "ghost" illustrates the other point I made, that native speakers will not distinguish important from unimportant information. But when we are comparing "I am afraid of ghosts" between languages, we must be able to tease out the parts that are essential to the idea being conveyed, and which ones are simply required by the grammar of the language.

Syntacticians learned long ago that there is functionally no limit to how many pieces of information you can posit for a given sentence. Chomskyists would say that the sentence "I am afraid of ghosts" contains nine thousand discrete quanta: "not past progressive," "not performative," "not featuring an old, old wooden ship," etc. This is probably because each time you add a layer you can then claim to have "discovered" something about language processing and get tenure. I respectfully submit that this is nonsense, since there is no reason to assume that these additional pieces of information are common to all translations of "I am afraid of ghosts." The only units that we can say with confidence are present in the mind of every speaker are "scared" and "ghost," and possibly a default alignment rule for sensory predicates.

bradrn · Post by **bradrn** » Fri Jun 05, 2020 3:44 am

Moose-tache wrote: ↑Fri Jun 05, 2020 3:38 am
bradrn wrote: ↑Fri Jun 05, 2020 3:28 am But you count ‘because of ghosts’ as +1ҁ; why not +2ҁ, one quantum for ‘because’ and +1ҁ for ‘ghosts’?
Because I am speaking English to you, and English requires a preposition to indicate the target of "scared." It is an example of redundancy, not an inherently important part of the sentence. The fact that you parsed it out on the same level of importance as "scared" and "ghost" illustrates the other point I made, that native speakers will not distinguish important from unimportant information. But when we are comparing "I am afraid of ghosts" between languages, we must be able to tease out the parts that are essential to the idea being conveyed, and which ones are simply required by the grammar of the language.

It doesn’t seem terribly redundant to me: ‘I am scared because of ghosts’ has a different meaning to ‘I am scared for ghosts’, which in turn is different to ‘I am scared after ghosts’. So I’d say that ‘because of’ is ‘essential to the idea being conveyed’.

Moose-tache · Post by **Moose-tache** » Fri Jun 05, 2020 3:48 am

bradrn wrote: ↑Fri Jun 05, 2020 3:44 am It doesn’t seem terribly redundant to me: ‘I am scared because of ghosts’ has a different meaning to ‘I am scared for ghosts’, which in turn is different to ‘I am scared after ghosts’. So I’d say that ‘because of’ is ‘essential to the idea being conveyed’.

This is also something I covered in my original post. Redundancy can provide information, and this information can be useful, but that doesn't mean it is a fundamental part of every phrase in which it appears.

For example, obligatory aspect markers give you the opportunity to specify whether you are afraid of ghosts, in general, or afraid of ghosts, completely, etc. Speakers of a language that requires aspect markers will undoubtedly believe these details to be essential parts of every human thought. If we were speaking Yuchi to each other, we would undoubtedly be asking "but how can you say you're afraid of something without an evidentiality marker?" I am talking specifically about the basic concept of "scared of ghosts," not the substantiations of that concept that are permissible in English grammar, all of which require additional quanta to be grammatical. The test of this is the fact that numerous languages will translate "I am afraid of ghosts" with no need for a preposition, or any other overt indicator that "ghost" is the target of "scared."

(note: I am cheating slightly by assuming that speakers understand how "scared" functions in terms of syntactic alignment. If you have a language where every speaker walks into every conversation with no idea how predicates and arguments relate to one another, then yes, you would need an additional piece of information to explain that "ghost" is the target of "scared," and not, say, a location in which you are scared or something silly like that. But I think it's fair to say that it's implied by the semantics of "scared" that you can be scared of something, and that this is the default target of the predicate. Individual languages may complicate this by refusing to allow predicates to have any sort of semantically determined default alignment, or requiring suppletive roots for being scared for and being scared of, but I would count this as an example of redundancy or irregularity, respectively, not an inherent part of a concept. The construction "scared (of something)" in all the examples above is one quantum, even though English or Korean or Cherokee may require multiple quanta to use it in a sentence.)

bradrn · Post by **bradrn** » Fri Jun 05, 2020 4:10 am

Moose-tache wrote: ↑Fri Jun 05, 2020 3:48 am
bradrn wrote: ↑Fri Jun 05, 2020 3:44 am It doesn’t seem terribly redundant to me: ‘I am scared because of ghosts’ has a different meaning to ‘I am scared for ghosts’, which in turn is different to ‘I am scared after ghosts’. So I’d say that ‘because of’ is ‘essential to the idea being conveyed’.
This is also something I covered in my original post. Redundancy can provide information, and this information can be useful, but that doesn't mean it is a fundamental part of every phrase in which it appears.

For example, obligatory aspect markers give you the opportunity to specify whether you are afraid of ghosts, in general, or afraid of ghosts, completely, etc. Speakers of a language that requires aspect markers will undoubtedly believe these details to be essential parts of every human thought. If we were speaking Yuchi to each other, we would undoubtedly be asking "but how can you say you're afraid of something without an evidentiality marker?" I am talking specifically about the basic concept of "scared of ghosts," not the substantiations of that concept that are permissible in English grammar. The test of this is the fact that numerous languages will translate "I am afraid of ghosts" with no need for a preposition, or any other overt indicator that "ghost" is the target of "scared."

Alright, that makes sense. Let me phrase it as a formal definition, to check my understanding:

Definition. The quantum number of a proposition (let’s denote it as ҁ) is defined by finding, for each Earthly spoken language, the number of separate pieces of information required to denote that proposition, and then taking the minimum over all such values.

Is this a correct interpretation of what you’re saying? If so, then I think I see a way to define the complexity of a language, at least for any given sentence:

Definition. The complexity of a language on a given sentence (let’s denote it as χ) is defined by finding each proposition which could be associated with that sentence, getting the quantum number for each such proposition, and then taking the difference between the highest and lowest calculated quantum numbers.

For instance: in English, I am afraid because of ghosts could correspond (minimally) to the proposition with the three pieces of information ‘I’, ‘afraid’, ‘of ghosts’, or (maximally) to the proposition with the six pieces of information ‘I’, ‘afraid’, ‘of ghosts’, ‘present tense’, ‘perfect aspect’, ‘because of’. So the complexity of English on this sentence is χ=6-3=3.
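To pin down the arithmetic, here is a minimal Python sketch of the two definitions. The function names are my own, and one simplification is an assumption: as in the worked example, the quantum number of each reading is taken to be the number of pieces listed for it, rather than being computed from a real survey of languages.

```python
def quantum_number(counts_by_language):
    """ҁ: the smallest number of pieces of information that any language
    needs to express a proposition (maps language name -> count)."""
    return min(counts_by_language.values())


def complexity(quantum_numbers):
    """χ: spread between the largest and smallest quantum numbers of the
    propositions that one sentence could correspond to."""
    quantum_numbers = list(quantum_numbers)
    return max(quantum_numbers) - min(quantum_numbers)


# The two readings of 'I am afraid because of ghosts' from the example above.
minimal = {"I", "afraid", "of ghosts"}
maximal = minimal | {"present tense", "perfect aspect", "because of"}

# Assumption: each reading's quantum number is just the number of pieces listed.
chi = complexity([len(minimal), len(maximal)])
print(chi)  # 3
```

Note that `quantum_number` is only defined here, not exercised: filling in its input would need actual per-language counts, which I don't have.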

So, does this seem like a reasonable definition? (Probably not, and I’m sure other people can come up with a much better one, but I can’t think of any myself.)

Moose-tache · Post by **Moose-tache** » Fri Jun 05, 2020 4:33 am

bradrn wrote: ↑Fri Jun 05, 2020 4:10 am Definition. The complexity of a language on a given sentence (let’s denote it as χ) is defined by finding each proposition which could be associated with that sentence, getting the quantum number for each such proposition, and then taking the difference between the highest and lowest calculated quantum numbers.

I think this is reasonable, but we can use cross-linguistic analysis to help us define our "minimal" number of quanta, as in my examples below. This whole discussion started as a question of whether it's actually more complex to learn Cherokee than Spanish, or if it merely feels that way. So if we compare Spanish, Cherokee, and many other languages, in large passages, we should be able to quantify how many units of information must be included in Spanish or Cherokee, above the number of pieces of information that are common to all or nearly all control languages. These quanta of information would include mandatory morphological marking, marked syntactic structures, suppletion, slang, and anything else that would require its own discrete lexical construction in the mind of the speaker in order to be used correctly.

Let's take three examples:
Korean: gwisin i mwusewe [ghost subj frightening] (This is the most natural translation of "I am afraid of ghosts," and does not necessarily mean "Ghosts are scary in general.")
Choctaw: shilop i~ mahlatalih [ghost 3rd-afraid-1st]
Mandarin: wo3 pa4 gui3 [1st afraid ghost]

Both Korean and Choctaw have obligatory tense, seen here more or less as null suffixes, but this is absent in Mandarin (though of course all languages have the option to elaborate on time). Mandarin and Choctaw require that the first person be overtly marked, but Korean lets it be implied (the assumption that emotion verbs with no overt subject marking refer to the speaker is common cross-linguistically, and even shows up in English). Korean requires that “ghost” take argument marking, while in Choctaw it only needs a third person agreement prefix on the verb, and in Mandarin it shows up with no overt marking beyond its syntactic location to the right of the verb. Across these three languages, the only commonality is that “ghost” and “scared” appear, along with whatever syntactic alignment is the default. (Once again I am treating word-choice as one quantum. In a language with fifty words for fear, each with its own default syntactic alignment, we may need to unpack the semantic aspect of complexity further. But for now I'm focusing on morpho-syntax.)

So does “scared of ghosts” inherently require person marking for the subject? Does it inherently require overt tense? Does it inherently require an adposition to specify the syntactic role of “ghost”? I would suggest that, at a fundamental level, it does not. The upshot of this is that person marking, argument marking, and tense all get thrown into one of those two buckets marked “redundancy” and “irregularity.” Also important: note that the three examples each have relatively similar amounts of redundancy/irregularity, despite having radically different amounts of “morphology” in most typological analyses.
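The tally for the three examples can be sketched in a few lines of Python. The sets below are just my reading of the comparison above (one entry per obligatory piece of information, with default syntactic alignment left out since it is shared), and the labels are informal, so treat this as a rough illustration rather than a real analysis.

```python
# Obligatory pieces of information per language, read off the three examples.
obligatory = {
    "Korean": {"ghost", "scared", "tense", "argument marking on 'ghost'"},
    "Choctaw": {"ghost", "scared", "tense", "first person",
                "third-person agreement"},
    "Mandarin": {"ghost", "scared", "first person"},
}

# What every language in the sample has to express: the shared core.
common_core = set.intersection(*obligatory.values())

# Everything else lands in the redundancy/irregularity buckets.
excess = {lang: items - common_core for lang, items in obligatory.items()}

for lang, extra in excess.items():
    print(lang, len(extra))
# Korean 2
# Choctaw 3
# Mandarin 1
```

With only three languages the shared core comes out as just “ghost” and “scared”; a larger set of control languages would be needed before trusting the numbers.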

Post by **zompist** » Fri Jun 05, 2020 4:58 am

Interesting stuff.

In your sentence, it seems odd to me that you don't include "I" (in some abstracted form) as necessary. "I'm afraid of ghosts" and "you're afraid of ghosts" seem to me to differ in as important a way as "I'm afraid of ghosts" vs "I'm afraid of vampires".

Maybe you're thinking that the Korean sentence doesn't require it. But that just makes me think about null marking. Take the Mandarin phrase, for instance: it has no overt aspect marker, but it contrasts with sentences that do have it (corresponding to "I used to be afraid of ghosts", "I'm afraid of ghosts so...", "I have been afraid of ghosts", etc). I'm comfortable with saying that the Mandarin sentence doesn't mark tense, but I think it does signal aspect.

Your ideas remind me of cognitive grammar. You could even say that the "fundamental level" is something like SCARES(ghosts, me). Such approaches can run into problems because the proposed primitives often aren't. But the ones in your example seem fine to me.*

* Well, I'm not sure about the ghosts. Informally, I expect it's a pretty common concept. Are the Choctaw and Korean concepts the same concept though? To put it another way, are ghosts like dogs (pretty universal) or like gods (tied to a shitload of cultural baggage)?

Raphael · Post by **Raphael** » Fri Jun 05, 2020 6:06 am

zompist wrote: ↑Fri Jun 05, 2020 4:58 am To put it another way, are ghosts like dogs (pretty universal) or like gods (tied to a shitload of cultural baggage)?

Is there any concept, including "dog", that's not tied to a shitload of cultural baggage?
