Graphemic space

alice
Posts: 913
Joined: Mon Jul 09, 2018 11:15 am
Location: 'twixt Survival and Guilt

Graphemic space

Post by alice »

By analogy with the tendency of vowel qualities to spread out evenly across the possible vocalic space, is there anything comparable for graphemes ("letters of the alphabet")? This might explain why conlangers who try to create conscripts based on cursive handwriting have trouble coming up with graphemes which are satisfactorily different from those of the Roman or Cyrillic alphabets, for example. It does, however, necessitate a taxonomy of graphemes which probably doesn't exist yet; any thoughts?
Self-referential signatures are for people too boring to come up with more interesting alternatives.
bradrn
Posts: 5727
Joined: Fri Oct 19, 2018 1:25 am

Re: Graphemic space

Post by bradrn »

alice wrote: Tue Oct 19, 2021 3:13 am By analogy with the tendency of vowel qualities to spread out evenly across the possible vocalic space, is there anything comparable for graphemes ("letters of the alphabet")? This might explain why conlangers who try to create conscripts based on cursive handwriting have trouble coming up with graphemes which are satisfactorily different from those of the Roman or Cyrillic alphabets, for example. It does, however, necessitate a taxonomy of graphemes which probably doesn't exist yet; any thoughts?
I have wondered about this idea myself. I think zompist has also suggested something similar. But my understanding of real-world writing systems suggests otherwise: writing systems seem to have no problem with graphemes being excessively close to each other. Thai is an especially prominent example here, but writing systems as diverse as Kurrent, Geʼez, Tangut and Cherokee all show the same phenomenon. The most that could be said is that writing systems tend to evolve to make letters more distinguishable. (This article on Thai is especially interesting here.)
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices

(Why does phpBB not let me add >5 links here?)
Creyeditor
Posts: 238
Joined: Wed Jul 08, 2020 9:15 am

Re: Graphemic space

Post by Creyeditor »

I think it's similar for graphemes and vowel phonemes. There is a tendency for them to be as different as possible (by exploiting given parameters) but it's just a tendency. Vowel phoneme clouds overlap (even in languages with small vowel phoneme inventories) and graphemes are sometimes surprisingly similar. Yet, no language only has /a/ vs. /æ/ and no script looks like Tengwar.
bradrn
Posts: 5727
Joined: Fri Oct 19, 2018 1:25 am

Re: Graphemic space

Post by bradrn »

Creyeditor wrote: Tue Oct 19, 2021 4:42 am I think it's similar for graphemes and vowel phonemes. There is a tendency for them to be as different as possible (by exploiting given parameters) but it's just a tendency. Vowel phoneme clouds overlap (even in languages with small vowel phoneme inventories) and graphemes are sometimes surprisingly similar. Yet, no language only has /a/ vs. /æ/ and no script looks like Tengwar.
I don’t think they’re quite that comparable. For one thing, vowels tend to be fairly far apart, even in large systems: I know of no language which genuinely has both [a] and [æ], or any other distinction as small (with the sole exception of Kensiu). But writing systems with distinctions as small as those of Tengwar are not difficult to find, despite the fact that there are far fewer writing systems than languages: Lampung and Hangeul are probably most comparable to Tengwar, and then there’s Thai, Cherokee etc. as I mentioned above.
Pabappa
Posts: 1359
Joined: Sun Jul 08, 2018 11:36 am
Location: the Impossible Forest

Re: Graphemic space

Post by Pabappa »

I started with an alphabet that I acknowledged was Romanesque .... but as I worked with it over the years, I realized there were a lot of shapes that had just never occurred to me. We have only one Q, one G, etc .... but letter shapes with bumps and bolts in other places would still look Romanesque. However, I have no cursive form of the script and don't intend to create one.
Creyeditor
Posts: 238
Joined: Wed Jul 08, 2020 9:15 am

Re: Graphemic space

Post by Creyeditor »

bradrn wrote: Tue Oct 19, 2021 5:15 am
Creyeditor wrote: Tue Oct 19, 2021 4:42 am I think it's similar for graphemes and vowel phonemes. There is a tendency for them to be as different as possible (by exploiting given parameters) but it's just a tendency. Vowel phoneme clouds overlap (even in languages with small vowel phoneme inventories) and graphemes are sometimes surprisingly similar. Yet, no language only has /a/ vs. /æ/ and no script looks like Tengwar.
I don’t think they’re quite that comparable. For one thing, vowels tend to be fairly far apart, even in large systems: I know of no language which genuinely has both [a] and [æ], or any other distinction as small (with the sole exception of Kensiu). But writing systems with distinctions as small as those of Tengwar are not difficult to find, despite the fact that there are far fewer writing systems than languages: Lampung and Hangeul are probably most comparable to Tengwar, and then there’s Thai, Cherokee etc. as I mentioned above.
I think graphemes and vowel phonemes are similar in that they are not points in space but clouds. Each realization is a bit different. And vowel clouds frequently overlap even in languages with three vowel phonemes, e.g. Kabardian.
And grapheme clouds can overlap, too. In German contemporary handwriting some realizations of <u> look identical to some realizations of <n>.

Also, I was referring to an improbable vowel phoneme inventory only consisting of /a/ and /æ/ without any other vowels. Fine vowel quality distinctions in larger phoneme inventories are common, I think. German lax high front vowels and tense mid vowels mostly differ in length, which is arguably neutralized in some contexts. And Moro (Kordofanian) has two schwas.
Moose-tache
Posts: 1746
Joined: Fri Aug 24, 2018 2:12 am

Re: Graphemic space

Post by Moose-tache »

There are real context differences between speaking and reading that may affect this question. For a long time, reading and writing meant entering a very special mode, which for some people made up only a small fraction of their language use and for others was a highly ritualized career tool. Text is sometimes stylized to make letters deliberately more similar, in a way that rarely happens to vowels (you don't see people in legal courts switching to all centralized voiceless vowels, for example). The purpose, ambiguity tolerance, required shared education, and aesthetic considerations for writing are all very different than speaking.
I did it. I made the world's worst book review blog.
zompist
Site Admin
Posts: 2718
Joined: Sun Jul 08, 2018 5:46 am
Location: Right here, probably

Re: Graphemic space

Post by zompist »

FWIW, my statement in the LCK was modal-- it was about best practices, not about what natlang writing systems actually do. Actual writing systems can be terrible, with multiple letters merging (as in Arabic) or barely distinguishable (as in traditional Hebrew fonts). Medieval European calligraphy ("Gothic" letters) seems to aim for a forest of undifferentiated stalks and serifs. And if you do have elegant and mnemonic glyphs, users will distort them into unrecognizability in a few centuries.
bradrn
Posts: 5727
Joined: Fri Oct 19, 2018 1:25 am

Re: Graphemic space

Post by bradrn »

Creyeditor wrote: Tue Oct 19, 2021 12:49 pm I think graphemes and vowel phonemes are similar in that they are not points in space but clouds. Each realization is a bit different. And vowel clouds frequently overlap even in languages with three vowel phonemes, e.g. Kabardian.
Of course, but I wasn’t claiming that vowel clouds never overlap. Just that they tend to spread out as far as possible in vowel space.
Also, I was referring to an improbable vowel phoneme inventory only consisting of /a/ and /æ/ without any other vowels. Fine vowel quality distinctions in larger phoneme inventories are common, I think. German lax high front vowels and tense mid vowels mostly differ in length, which is arguably neutralized in some contexts. And Moro (Kordofanian) has two schwas.
I don’t think such fine distinctions are all that common when you look at vowel systems phonetically. Vanishingly few languages have two vowels separated by only length, or only one height level. There’s a reason why e.g. English has [iː] and [ɪ], rather than [i] and [ɪ], or [iː] and [i]. My dialect does actually have a genuine example of a length distinction in [e̞] vs [e̞ː] (DRESS vs SQUARE), but even then the latter tends to be slightly diphthongised as something like [e̞ˑə̆], and most other dialects separate them even more.

On the other hand, my point is that writing systems are far less resistant to such clashes. Almost any writing system you might think of will have at least two letters which are difficult to tell apart, and some reach levels of confusion which are unheard-of in vowel systems. (The standout example here is undoubtedly Book Pahlavi, which underwent such extreme mergers that it only had 13 distinguishable graphemes. Though admittedly it fell out of use quite quickly.)
Moose-tache wrote: Tue Oct 19, 2021 7:41 pm There are real context differences between speaking and reading that may affect this question. For a long time, reading and writing meant entering a very special mode, which for some people made up only a small fraction of their language use and for others was a highly ritualized career tool. Text is sometimes stylized to make letters deliberately more similar, in a way that rarely happens to vowels (you don't see people in legal courts switching to all centralized voiceless vowels, for example). The purpose, ambiguity tolerance, required shared education, and aesthetic considerations for writing are all very different than speaking.
I agree that this is a key point. An additional point underlying all of this is that writing is a far more conscious process than speaking: all scripts are, ultimately, conscripts, and people can consciously control their writing far more easily than they can their speaking.
Nortaneous
Posts: 1534
Joined: Sun Jul 15, 2018 3:29 am

Re: Graphemic space

Post by Nortaneous »

zompist wrote: Tue Oct 19, 2021 9:09 pm FWIW, my statement in the LCK was modal-- it was about best practices, not about what natlang writing systems actually do. Actual writing systems can be terrible, with multiple letters merging (as in Arabic) or barely distinguishable (as in traditional Hebrew fonts).
Tocharian monks decided to adapt a full Brahmic script to a language with one stop series and then merge the letters for /t/ and /n/, to the point where arguments from sound change were needed to revise the reading of the verbal ending -mntär from earlier -mttär (specifically parallelism with -mc- > -mñc-).

Then again I don't think I reliably distinguish <a e o u> in cursive.
Duaj teibohnggoe kyoe' quaqtoeq lucj lhaj k'yoejdej noeyn tucj.
K'yoejdaq fohm q'ujdoe duaj teibohnggoen dlehq lucj.
Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq.
Creyeditor
Posts: 238
Joined: Wed Jul 08, 2020 9:15 am

Re: Graphemic space

Post by Creyeditor »

bradrn wrote: Tue Oct 19, 2021 9:36 pm
Creyeditor wrote: Tue Oct 19, 2021 12:49 pm I think graphemes and vowel phonemes are similar in that they are not points in space but clouds. Each realization is a bit different. And vowel clouds frequently overlap even in languages with three vowel phonemes, e.g. Kabardian.
Of course, but I wasn’t claiming that vowel clouds never overlap. Just that they tend to spread out as far as possible in vowel space.
Also, I was referring to an improbable vowel phoneme inventory only consisting of /a/ and /æ/ without any other vowels. Fine vowel quality distinctions in larger phoneme inventories are common, I think. German lax high front vowels and tense mid vowels mostly differ in length, which is arguably neutralized in some contexts. And Moro (Kordofanian) has two schwas.
I don’t think such fine distinctions are all that common when you look at vowel systems phonetically. Vanishingly few languages have two vowels separated by only length, or only one height level. There’s a reason why e.g. English has [iː] and [ɪ], rather than [i] and [ɪ], or [iː] and [i]. My dialect does actually have a genuine example of a length distinction in [e̞] vs [e̞ː] (DRESS vs SQUARE), but even then the latter tends to be slightly diphthongised as something like [e̞ˑə̆], and most other dialects separate them even more.

On the other hand, my point is that writing systems are far less resistant to such clashes. Almost any writing system you might think of will have at least two letters which are difficult to tell apart, and some reach levels of confusion which are unheard-of in vowel systems. (The standout example here is undoubtedly Book Pahlavi, which underwent such extreme mergers that it only had 13 distinguishable graphemes. Though admittedly it fell out of use quite quickly.)
I agree that graphemes and vowel phonemes are different. There is more pressure for uniformity in writing. But I think we have to agree to disagree on fine phonetic vowel quality distinctions.
Travis B.
Posts: 6304
Joined: Sun Jul 15, 2018 8:52 pm

Re: Graphemic space

Post by Travis B. »

bradrn wrote: Tue Oct 19, 2021 9:36 pm I don’t think such fine distinctions are all that common when you look at vowel systems phonetically. Vanishingly few languages have two vowels separated by only length, or only one height level. There’s a reason why e.g. English has [iː] and [ɪ], rather than [i] and [ɪ], or [iː] and [i]. My dialect does actually have a genuine example of a length distinction in [e̞] vs [e̞ː] (DRESS vs SQUARE), but even then the latter tends to be slightly diphthongised as something like [e̞ˑə̆], and most other dialects separate them even more.
Just using another English example, though, the English here has separate vowel quality and vowel quantity systems, such that ladder and latter, and madder and matter, are distinguished solely by vowel length, and vowel quantity is derived from historical consonant quality and consonant elision (which tends to make vowels longer by merging or lengthening them whenever hiatus is not possible) while vowel quality is derived from historical vowel quality/quantity.
Yaaludinuya siima d'at yiseka ha wohadetafa gaare.
Ennadinut'a gaare d'ate ha eetatadi siiman.
T'awraa t'awraa t'awraa t'awraa t'awraa t'awraa t'awraa.
bradrn
Posts: 5727
Joined: Fri Oct 19, 2018 1:25 am

Re: Graphemic space

Post by bradrn »

Travis B. wrote: Wed Oct 20, 2021 1:37 pm
bradrn wrote: Tue Oct 19, 2021 9:36 pm I don’t think such fine distinctions are all that common when you look at vowel systems phonetically. Vanishingly few languages have two vowels separated by only length, or only one height level. There’s a reason why e.g. English has [iː] and [ɪ], rather than [i] and [ɪ], or [iː] and [i]. My dialect does actually have a genuine example of a length distinction in [e̞] vs [e̞ː] (DRESS vs SQUARE), but even then the latter tends to be slightly diphthongised as something like [e̞ˑə̆], and most other dialects separate them even more.
Just using another English example, though, the English here has separate vowel quality and vowel quantity systems, such that ladder and latter, and madder and matter, are distinguished solely by vowel length, and vowel quantity is derived from historical consonant quality and consonant elision (which tends to make vowels longer by merging or lengthening them whenever hiatus is not possible) while vowel quality is derived from historical vowel quality/quantity.
What strange dialect do you speak? I have the [æː] vs [æ] distinction, and it even seems to be a purely length-based distinction, but the difference between ladder/latter is voicing (and tapping in the former word) rather than any sort of length.
kodé
Posts: 113
Joined: Sun Sep 09, 2018 3:17 pm

Re: Graphemic space

Post by kodé »

A different difference, as it were, between vowel phonemes and graphemes, is that graphemes can have several different allographs, but often in different ways than phonemes have allophones. Allophony is contextual based on phonological context, i.e., other phonemes (or phonological structure). Some allography is based on graphic context, such as traditional typesetting of sequences like ‘fi’, or on structure, like with initial vs. medial vs. final vs. isolation forms of many Arabic letters. But other allography is based on non-graphical features, like non-italicized vs. italicized graphs, or lowercase vs. capital. These features can be syntactic (capitalization, sometimes), lexical (capitalization, other times), discourse-sensitive (CRUISE CONTROL FOR COOL), or sociolinguistic. I’m sure you could find allophony based on lexical or discourse factors, but I’m also pretty sure it isn’t systematic in the way allography is. You couuuuuld argue that font variation is similar to dialectal or register variation, but it bears a lot more thinking out.

As far as easily confusable graphemes, I’m a bit surprised that no one’s brought up Armenian (as an Armenian, I’m required to never shut up about Armenian). Even in clear script, Չ Ջ Ձ Զ are hard to distinguish—and have been hard for me for almost three decades. Certain printed fonts are pretty unreadable, and handwriting can be baaad.
Travis B.
Posts: 6304
Joined: Sun Jul 15, 2018 8:52 pm

Re: Graphemic space

Post by Travis B. »

bradrn wrote: Wed Oct 20, 2021 5:55 pm
Travis B. wrote: Wed Oct 20, 2021 1:37 pm
bradrn wrote: Tue Oct 19, 2021 9:36 pm I don’t think such fine distinctions are all that common when you look at vowel systems phonetically. Vanishingly few languages have two vowels separated by only length, or only one height level. There’s a reason why e.g. English has [iː] and [ɪ], rather than [i] and [ɪ], or [iː] and [i]. My dialect does actually have a genuine example of a length distinction in [e̞] vs [e̞ː] (DRESS vs SQUARE), but even then the latter tends to be slightly diphthongised as something like [e̞ˑə̆], and most other dialects separate them even more.
Just using another English example, though, the English here has separate vowel quality and vowel quantity systems, such that ladder and latter, and madder and matter, are distinguished solely by vowel length, and vowel quantity is derived from historical consonant quality and consonant elision (which tends to make vowels longer by merging or lengthening them whenever hiatus is not possible) while vowel quality is derived from historical vowel quality/quantity.
What strange dialect do you speak? I have the [æː] vs [æ] distinction, and it even seems to be a purely length-based distinction, but the difference between ladder/latter is voicing (and tapping in the former word) rather than any sort of length.
The strange dialect I speak is that of Milwaukee, WI. All in all, the diachronics are pretty simple. First, all phonemic vowel length was lost, reducing vowel distinctions to quality alone. Then, all vowels before fortis obstruents (with or without an intervening sonorant) turned short, and all other vowels turned long - simple vowel length allophony at this point. Then some voicing contrasts were lost, such as the distinction between intervocalic /t/ and /d/ and the voicing of /t/ versus /d/ before another plosive (in this case, though, the preglottalization distinction is still preserved). Additionally, many intervocalic consonants and even consonant clusters were lost; where hiatus was not possible, and in some cases where it was possible, the preceding and following vowels merged either into diphthongs or lengthened vowels; if the preceding vowel was short, the resulting diphthong or lengthened vowel is long, and if the preceding vowel was long, the resulting diphthong or lengthened vowel is overlong. Note that here vowel nasalization was preserved, as if either original vowel was nasalized, the resulting diphthong or lengthened vowel is also nasalized.
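For anyone who wants to see why the ordering matters, here is a minimal sketch of just two of those steps (length allophony, then the loss of the intervocalic /t/ vs /d/ contrast). Everything in it is a toy assumption made for illustration: the segment classes, the one-character-per-segment transcriptions, the use of ɾ for the merged consonant, and the function names are all hypothetical, and vowel loss, nasalization and overlong vowels are left out entirely.

```python
# Toy illustration only: the segment classes below and the tap symbol are
# assumptions for the sake of the example, not a description of the dialect.
FORTIS = set("ptkfsθʃ")      # hypothetical set of fortis obstruents
SONORANTS = set("mnlr")      # hypothetical set of sonorants
VOWELS = set("aeiouæɪʊɛəɔ")  # hypothetical vowel inventory


def is_vowel(seg):
    # A segment may carry a length mark, so only its first character is checked.
    return seg[0] in VOWELS


def assign_length(segments):
    """Keep a vowel short before a fortis obstruent (with or without one
    intervening sonorant); mark every other vowel long with ː."""
    out = []
    for i, seg in enumerate(segments):
        if not is_vowel(seg):
            out.append(seg)
            continue
        j = i + 1
        if j < len(segments) and segments[j] in SONORANTS:
            j += 1  # look past a single intervening sonorant
        before_fortis = j < len(segments) and segments[j] in FORTIS
        out.append(seg if before_fortis else seg + "ː")
    return out


def merge_intervocalic_td(segments):
    """Replace /t/ and /d/ between two vowels with the same segment."""
    out = list(segments)
    for i in range(1, len(segments) - 1):
        if (segments[i] in ("t", "d")
                and is_vowel(segments[i - 1])
                and is_vowel(segments[i + 1])):
            out[i] = "ɾ"
    return out


def derive(word):
    # Length is assigned first, and only afterwards is the consonant contrast lost.
    return merge_intervocalic_td(assign_length(list(word)))


print(" ".join(derive("lætər")))  # latter
print(" ".join(derive("lædər")))  # ladder
```

With length assigned first, the two outputs differ only in the length of the first vowel. Applying the consonant merger before the length rule would make the two words come out identical in this toy model, which is why the order of the changes is what leaves length as the only remaining cue.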
bradrn
Posts: 5727
Joined: Fri Oct 19, 2018 1:25 am

Re: Graphemic space

Post by bradrn »

kodé wrote: Wed Oct 20, 2021 7:40 pm A different difference, as it were, between vowel phonemes and graphemes, is that graphemes can have several different allographs, but often in different ways than phonemes have allophones. Allophony is contextual based on phonological context, i.e., other phonemes (or phonological structure). Some allography is based on graphic context, such as traditional typesetting of sequences like ‘fi’, or on structure, like with initial vs. medial vs. final vs. isolation forms of many Arabic letters. But other allography is based on non-graphical features, like non-italicized vs. italicized graphs, or lowercase vs. capital. These features can be syntactic (capitalization, sometimes), lexical (capitalization, other times), discourse-sensitive (CRUISE CONTROL FOR COOL), or sociolinguistic. I’m sure you could find allophony based on lexical or discourse factors, but I’m also pretty sure it isn’t systematic in the way allography is. You couuuuuld argue that font variation is similar to dialectal or register variation, but it bears a lot more thinking out.
I’d argue that there are four main motivations for allography:
  • Free variation: decision between allographs is purely at the whims of the writer: e.g. Latin ⟨a~ɑ⟩, serif vs sans-serif
  • Stylistic: decision between allographs affects only emphasis and tone: e.g. italicisation, full-caps
  • Contextual: decision between allographs depends on context: e.g. initial/final forms in Arabic/Hebrew/Greek, sentence-initial capitalisation, ligatures
  • Semantic: there are minimal pairs between allographs and these have different meanings: e.g. sentence-internal capitalisation
Perhaps more relevantly for this thread, I’d also argue that there are two entirely different forms of allography. As usual, I prefer to analyse it in terms of prototypes:
  • Polytypicality: a single grapheme with multiple prototypes: e.g. ⟨a⟩ vs ⟨ɑ⟩, or ⟨צ⟩ vs ⟨ץ⟩
  • Variation within the prototype: e.g. ⟨a⟩ vs ⟨a⟩, or ⟨ש⟩ vs ⟨ש⟩
I hypothesise that variation within the prototype is mostly associated with free variation, whereas polytypicality is mostly associated with other motivations for allography.
As far as easily confusable graphemes, I’m a bit surprised that no one’s brought up Armenian (as an Armenian, I’m required to never shut up about Armenian).
Only because I didn’t know about it! I’ll add it to my list of examples, thanks.