Re: Syntax random
Posted: Mon Jun 29, 2020 8:34 pm
I had to wait a good few years to get mine! And even then I had to ask for them.
Haha, thanks for the traditional (and I guess archaic by now?) welcome!

Ser wrote: ↑Mon Jun 29, 2020 8:02 pm
Thank you for that!

priscianic wrote: ↑Mon Jun 29, 2020 12:14 am
There actually is somewhat of a literature within a broadly Chomskyan framework on this kind of phenomenon (you might see it called "matrix que", since it's a que appearing in a matrix/main clause); for instance, Etxepare (2007), Demonte and Fernández Soriano (2014), and Corr (2016), to name just a few. I suspect the Corr might be especially interesting to conlangers, since it looks at these kinds of "matrix complementizers" in a (micro)comparative perspective, looking at variation in different varieties of Ibero-Romance.
By the way, there is a long tradition here... dating to 15 years ago or so... of welcoming people with images of some pickles and tea... I guess that tradition is getting well-worn and tired now, seeing that no one has yet done it. (!)
Another by the way, priscianic: have you read the final two books of Priscian's grammar of Latin, which discuss syntax?
Really? Consider the following sequence of sentences:

I know that.
I know that you know that.
I know that you know that I know that.
I know that you know that I know that you know that.
I know that you know that I know that you know that I know that.
… ad infinitum

All these sentences are acceptable and grammatical, and there are an infinite number of them, so a theory to explain them must be able to generate infinitely many possible sentences.

Tropylium wrote: ↑Mon Jun 29, 2020 9:21 pm
(Plus, yes, I insist that languages are necessarily finite entities. They have a finite number of speakers who utter (or think, or parse) a finite number of words-per-minute with finite hearing and articulation fidelity over finite lifespans. Any theory that generates infinitely many "possible sentences" is infinitely wrong.)
I'm not sure I understand this argument. The whole idea behind ungrammaticality judgments is to find the limits on what is possible. If you exhaustively know what's possible, then you also exhaustively know what's impossible, and likewise if you exhaustively know what's impossible, then you also exhaustively know what's possible (assuming that you can do some sort of excluded middle inference for possibility, such that there are no in-between states that are neither possible nor impossible).

Tropylium wrote: ↑Mon Jun 29, 2020 9:21 pm
It appears to me that most arguments in theoretical syntax have the form "sentence X is bad/ungrammatical, therefore we shall explain this happening because…". Where does this come from? This seems to me to be unlike anything else in linguistics. Nobody thinks it needs a particular explanation if any given language such as English lacks any perhaps possible feature like a word fiffen, a locative case, a series of ejectives, a single lexeme expressing 'sea turtle', or a distinct grocery store clerk sociolect. Instead what we seek to explain / model are the features that do exist.
I think Tropylium’s issue is that we don’t seem to do this with any other part of language. If we see a corpus of a dead language, and the corpus lacks, say, the consonant /w/, we don’t feel any burning need to justify this. If a language seems to have no perfect aspect, then we don’t feel the need to exhaustively test every combination of words until we find it (or not). So why is it any different with syntax?

priscianic wrote: ↑Mon Jun 29, 2020 10:00 pm
I'm not sure I understand this argument. The whole idea about ungrammaticality judgments is to find the limits on what is possible. (…)
This issue comes up in trying to study dead languages. You might look at the attested corpora, and come up with a particular explanation for the data you see. Then you think about what kinds of predictions that theory makes, and you note that it predicts that a certain class of complicated sentences should be grammatical. However, you don't find any such sentences in the corpus, given their complexity, and thus you're at an impasse: you don't know for sure whether those sentences are grammatical or not, and thus whether your theory makes the right predictions or not. If you were working with speakers of a living language, then in this situation you would construct some sentences and contexts and elicit acceptability judgments.
And of course, the most interesting cases are those where one theory predicts that a certain set of sentences should be grammatical, but in actuality they aren't. And so you point out those cases, say "the correct theory needs to account for these judgments, and the old theory does not predict them", and then either revise the old theory or come up with a new one.
So I'm not sure I understand the problem with using negative data (i.e. judgments of unacceptability).
You won't ask questions like that until you have a theory that makes predictions, and you want to test those predictions. But once you have a theory that makes predictions, you might start asking seemingly mundane and uninteresting questions like that.

bradrn wrote: ↑Mon Jun 29, 2020 10:31 pm
I think Tropylium’s issue is that we don’t seem to do this with any other part of language. (…) So why is it any different with syntax?
(My gut feeling is that there’s something wrong with this argument, but I’m not entirely sure what it is…)
First, we do make negative arguments in other parts of linguistics. We have the entire idea of phonotactics, which precisely defines which phonemic sequences are and are not possible. We define things like deponent verbs, which have parts of their paradigm "missing." Historical linguistics absolutely talks about losing features; if a proto-language has ejectives and a daughter does not, we ask what happened.

Tropylium wrote: ↑Mon Jun 29, 2020 9:21 pm
It appears to me that most arguments in theoretical syntax have the form "sentence X is bad/ungrammatical, therefore we shall explain this happening because…". (…) Instead what we seek to explain / model are the features that do exist.
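The phonotactics point can be made concrete with a toy sketch (the syllable template below is invented for illustration, not taken from any real language): a single finite rule simultaneously licenses some strings and excludes others, which is exactly the kind of negative claim being described.

```python
import re

# Hypothetical toy phonotactics: a word is one or more (C)V(N) syllables,
# with C drawn from /p t k/, V from /a i u/, and an optional final nasal.
SYLLABLE = re.compile(r"^(?:[ptk]?[aiu][mn]?)+$")

def licit(word: str) -> bool:
    """True if the word fits the toy syllable template."""
    return bool(SYLLABLE.match(word))

print(licit("patan"))  # True: pa.tan is a possible word
print(licit("ptak"))   # False: initial cluster and final stop are excluded
```

The rule is finite, but it classifies an unbounded set of candidate strings into licit and illicit, with no list of "impossible words" ever written down.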
I agree with you that infinite capacity is not actually theoretically required. But your statement that it's actually wrong is just as dogmatic and unsupported as Chomsky's.

Tropylium wrote: ↑Mon Jun 29, 2020 9:21 pm
(Plus, yes, I insist that languages are necessarily finite entities. (…) Any theory that generates infinitely many "possible sentences" is infinitely wrong.)
How would you go about generating a grammar of a programming language, e.g. C?

Tropylium wrote: ↑Mon Jun 29, 2020 9:21 pm
Plus, yes, I insist that languages are necessarily finite entities. (…) Any theory that generates infinitely many "possible sentences" is infinitely wrong.
No, not all of these sentences are grammatical. I take "grammaticality" to mean that a real, living human will actually come to understand a sentence. Yet at minimum, if we repeat a sentence such as this for one thousand years — then no human can in principle even take in the entire sentence, much less process it and understand it to hold a particular meaning.

bradrn wrote: ↑Mon Jun 29, 2020 9:32 pm
Consider the following sequence of sentences:
I know that.
I know that you know that.
I know that you know that I know that.
I know that you know that I know that you know that.
I know that you know that I know that you know that I know that.
… ad infinitum
All these sentences are acceptable and grammatical, and there are an infinite number of them, so a theory to explain them must be able to generate infinitely many possible sentences.
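The finite-rule/unbounded-output point at stake in this exchange can be sketched in a few lines: one finite rule generates every sentence in the series, however deep it goes (the function name and the I/you alternation are just illustrative).

```python
def know_sentence(depth: int) -> str:
    """Build the depth-th sentence of the series from a single finite rule."""
    clauses = ["I know that" if i % 2 == 0 else "you know that"
               for i in range(depth)]
    return " ".join(clauses) + "."

print(know_sentence(1))  # I know that.
print(know_sentence(3))  # I know that you know that I know that.
```

The rule itself occupies a few bytes; what is infinite is only the set of outputs it characterizes, which is the distinction the two posters are arguing over.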
This is a non-problem: I wouldn't. Programming languages are not natural languages and they do not have any mental grammars. No human natively writes C.
Granted. The difference I'm pointing at is that syntax starts with enormously permissive theories and seeks to falsify parts of them. Lexicology, phonology, semantics etc. generally don't: they start out by assuming nothing unwarranted and only add items to the model when they find evidence that they exist. The latter seems to me like the proper amount of epistemic caution.

priscianic wrote: ↑Mon Jun 29, 2020 10:00 pm
If you exhaustively know what's possible, then you also exhaustively know what's impossible, and likewise if you exhaustively know what's impossible, then you also exhaustively know what's possible (assuming that …)
Why should we have any synchronic theories of particular individual languages at all? Why is it not good enough to document what actually exists?

priscianic wrote:
This issue comes up in trying to study dead languages (…) you don't know for sure whether those sentences are grammatical or not, and thus whether your theory makes the right predictions or not.
I do not grant the background assumptions here: I have never seen any reason to assume a phenomenon of "movement" to exist at all.

priscianic wrote:
But people are asking why you get island effects with movement/displacement, because, given that in general you can have unbounded long-distance (i.e. cross-clausal) movement, you might expect this to be true in all cases. But it isn't!
I was offering you a simpler task.
They quickly enter this territory when it comes to phonetic syllable or word structure. I can remember coming upon a phonological description of Thai syllables which explained why several words I knew to exist did not exist.

Tropylium wrote: ↑Wed Jul 01, 2020 3:37 pm
No phonological inventory makes any predictions, and no dictionary does either. We still have theories of phonology or semantics, but they do not make any a priori predictions about what features any one particular language should have — only about what features some languages could have.
It's interesting to learn that German, Swedish and Sanskrit don't exist. Mind you, there are some old claims around that Sanskrit is a 'priestly fraud'.
You're being deeply uncharitable here. This is not what Tropylium is saying. He's saying that a set of compounds do not all suddenly spring into being without being uttered; the first one is coined, and then another based on that model, and then another based on some subset of those two, and so on, potentially by different speakers each time. Words do not exist independently of people who use them.
Again, this is just as dogmatic a claim as Chomsky's. Also, just as unnecessary.
And so what? It's quite possible and unremarkable for humans to create an artifact, or a text, that no one human can understand fully. It's barely possible to read the entire Talmud and its commentaries, but it takes half a lifetime. The Tibetan Buddhist canon is far longer in number of volumes at least (I'm not sure how long each is). No one could read everything in Wikipedia. Outside of texts, no one person could (say) master the entire corpus of law in a single state, or the entire multi-million-line code base for a very large scale project, or everything that's part of the standard model in physics. (Feynman did his damndest, but that was 40 years ago.)

Tropylium wrote:
Yet at minimum, if we repeat a sentence such as this for one thousand years — then no human can in principle even take in the entire sentence, much less process it and understand it to hold a particular meaning.
I agree. This is a big complaint of mine about syntactic theories — they tend to start with an assumption and then seek to explain all language in terms of that assumption, when it really should be the other way around.

Tropylium wrote: ↑Wed Jul 01, 2020 3:37 pm
Granted. The difference I'm pointing at is that syntax starts with enormously permissive theories and seeks to falsify parts of them. (…) The latter seems to me like the proper amount of epistemic caution.
Yes, but it also removes all risk of explaining anything. There is an essential difference between word-formation and syntax: speakers of natural languages generally do not create new words as they talk (onomatopoeia and productive compounding excepted), but do continually produce novel sentences which have never been heard before. And listeners can clearly understand these novel sentences, so there must be some rule somewhere specifying how these sentences get interpreted. At the same time, some sentences — e.g. ‘on cat the sitting see mat I the’ — are clearly unintelligible, and get rejected as nonsense, so clearly there must be some rule which they are violating. The question syntax asks is: what, exactly, are these rules? And how precise can we make them? By contrast, if I am reading your post correctly, your approach seems to say ‘some sentences get produced, others don’t, no need to complicate the situation any further’ — an approach which is correct in its own way, but fails to have any predictive power.

Tropylium wrote:
I do not grant the background assumptions here: I have never seen any reason to assume a phenomenon of "movement" to exist at all.

priscianic wrote:
But people are asking why you get island effects with movement/displacement (…) But it isn't!
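The "some rule which they are violating" idea can be made concrete with a toy recognizer: a tiny context-free grammar (invented here purely for illustration) accepts an ordinary word order and rejects a scrambled one, without listing the bad orders anywhere.

```python
# A deliberately tiny context-free grammar. Uppercase keys are nonterminals;
# any symbol not in the table is treated as a terminal word.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["I"], ["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["cat"], ["mat"]],
    "V":   [["see"]],
}

def derive(sym, words, i):
    """Return the set of end positions j such that sym derives words[i:j]."""
    if sym not in GRAMMAR:                       # terminal word
        return {i + 1} if i < len(words) and words[i] == sym else set()
    ends = set()
    for rule in GRAMMAR[sym]:
        positions = {i}
        for part in rule:                        # thread positions through the rule
            positions = {j for p in positions for j in derive(part, words, p)}
        ends |= positions
    return ends

def grammatical(sentence: str) -> bool:
    words = sentence.split()
    return len(words) in derive("S", words, 0)

print(grammatical("I see the cat"))   # True
print(grammatical("the see cat I"))   # False: no rule sequence derives it
```

The scrambled string is rejected not because it appears on a blacklist but because no combination of the finite rules derives it, which is the sense in which a grammar makes negative predictions.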
In terms of purely ground-level observable facts, what is being claimed in cases like these seems to be that e.g. sentences A B C, A B D and E C A B exist, but E D A B does not. Such a fact seems to be as much of a non-problem to me as the fact that the English lexicon has the words cat, bat and copycat but not ˣcopybat. Compounds are/have been created independently one at a time, only when needed. At no point in history has there occurred a mass Cartesian product of some sets of nouns bringing n·m compounds into some language "immediately".
Adopting the same approach in syntax would immediately remove all risk of ever overgenerating anything.
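Tropylium's n·m point can be illustrated directly (the word lists are mine, chosen to echo the copycat/copybat example): a Cartesian product licenses every combination at once, most of them unattested.

```python
from itertools import product

nouns = ["copy", "cat", "bat"]

# The "mass Cartesian product": n*m candidate compounds in one stroke.
potential = {a + b for a, b in product(nouns, repeat=2)}
attested = {"copycat"}

print(len(potential))          # 9 candidate compounds
print("copybat" in potential)  # True: generated, but unattested
print(potential - attested)    # the overgenerated remainder
```

The gap between `potential` and `attested` is exactly the overgeneration being debated: the product rule predicts copybat on the same footing as copycat, whereas a list of attested coinages predicts nothing beyond itself.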
To be honest, I'm not quite seeing what you're seeing here. The kinds of complaints I usually see about formal/theoretical/generative syntax are things (strawmen) like: it analyzes everything "as if it's English", and that it can't account for the diversity of human language—in other words, that it's too restrictive, not that it's too permissive. Even bradrn, in their reply to this point, says "they tend to start with an assumption and then seek to explain all language in terms of that assumption, when it really should be the other way around" (which doesn't seem to be actually agreeing with your point here, unless I'm misreading things? It seems like bradrn is complaining about syntactic theories being too restrictive, rather than too permissive? Maybe I'm misunderstanding something).

Tropylium wrote: ↑Wed Jul 01, 2020 3:37 pm
Granted. The difference I'm pointing at is that syntax starts with enormously permissive theories and seeks to falsify parts of them. (…) The latter seems to me like the proper amount of epistemic caution.
Am I understanding you right here? Are you trying to make the normative claim that we shouldn't study dead languages with the tools of synchronic linguistics? One, you're gonna need to do a lot more convincing to get me to accept that, and two, I don't see how it's relevant to my point about how negative evidence is useful for theory-building.

Tropylium wrote: ↑Wed Jul 01, 2020 3:37 pm
Why should we have any synchronic theories of particular individual languages at all? Why is it not good enough to document what actually exists?
I don't know what kinds of phonology or semantics you're familiar with, but it's certainly not the kind of phonology or semantics I'm familiar with. Phonology is not just phoneme inventories (some phonologists don't even believe in the existence of phonemes, much less in the existence of phoneme inventories), and semantics is not just about dictionaries and lexical semantics (plenty of semanticists basically just gloss over lexical semantics entirely).

Tropylium wrote: ↑Wed Jul 01, 2020 3:37 pm
No phonological inventory makes any predictions, and no dictionary does either. (…)
The background assumptions aren't the point here—you can replace them with whatever floats your boat. The point is that sometimes you might expect certain things to be grammatical (for whatever reason), and they aren't—and that naturally leads you to wonder why. If you don't ever experience these kinds of expectations and questions, I'm not sure what I can say. To me, questions like "why is the world one way, and not the other?" are natural and interesting questions that curious people ask, the kinds of questions that drive scientific inquiry. I'm not sure I can personally grok how someone wouldn't have these same kinds of questions about human language, and I think that's a core part of why I'm so confused by your responses here, because they seem to be trying to deny that these kinds of questions are interesting and worth asking.

Tropylium wrote: ↑Wed Jul 01, 2020 3:37 pm
I do not grant the background assumptions here: I have never seen any reason to assume a phenomenon of "movement" to exist at all.
Good catch in pointing out that my complaint contradicts Tropylium’s complaint; I have no idea how I didn’t notice that when I wrote it.

priscianic wrote: ↑Thu Jul 02, 2020 3:10 am
To be honest, I'm not quite seeing what you're seeing here. (…) Maybe I'm misunderstanding something.
I don't see how what you are saying about words differs from sentences in these languages, except in degree. Also, where compound verbs are freely formed in highly inflected languages, you automatically get an immense number of derivative words. Such forms generally exist independently of whether someone has ever uttered them. Now, English is not so free in generating compound words from nouns and similar parts of speech as the three languages I mentioned. While 42-legged was English before I independently coined it this morning, it is debatable how many words it constitutes. When it comes to the long Sanskrit compounds that serve the same purpose as clauses in other languages, I don't think the term 'coining' is appropriate.

KathTheDragon wrote: ↑Wed Jul 01, 2020 7:50 pm
You're being deeply uncharitable here. This is not what Tropylium is saying. (…) Words do not exist independently of people who use them.
I do not think this follows. If "copybat" is rejected as not being a word, this doesn't mean that there has to exist a rule that it violates — it merely means that it is not an existing word. It very well could exist, if there was a referent for it.
Well, I do pare this back a little bit. I am aware that many novel sentences can be easily analyzed as having an already known structure that's simply dressed up with different choices of lexemes, and these analyses will predict many other similar, tamely novel ("lexically novel") sentences. "Nameless blue nodes sit stringently" and all that pop punk.

bradrn wrote:
By contrast, if I am reading your post correctly, your approach seems to say ‘some sentences get produced, others don’t, no need to complicate the situation any further’ — an approach which is correct in its own way, but fails to have any predictive power.
No, I don't mean that there's anything bad with this if you have already ended up with an expectation… The original observation I made is that this seems to be the only common workflow for bringing syntactic theories into better accord with languages as they exist in reality. And I mean not just the question, but also the answer, which invariably is something roughly like "here's a constraint that prevents this from being grammatical".

priscianic wrote:
The point is that sometimes you might expect certain things to be grammatical (for whatever reason), and they aren't—and that naturally leads you to wonder why.
"Why" is always a good question, but it is fundamentally a question about diachrony. "Everything is the way it is because it got that way."priscianic wrote:To me, questions like "why is the world one way, and not the other?" are natural and interesting questions that curious people ask, the kinds of questions that drive scientific inquiry.
Not for any sense of "exist" that is supposed to be on the same level as forms that are uttered.
Explaining novel sentences is one of the things generative grammars do. Another thing I see them doing, though, is trying to analyze sentences as being generated from other sentences or archetypes or operations (rather than simply taken from memory) even when they aren't novel at all.

Richard W wrote:
My feeling is that novel sentences are generated on the model of other sentences. Generative grammars are an approximation to this process.