Sego

bradrn · Post by **bradrn** » Sun Nov 12, 2023 6:07 am

Richard W wrote: ↑Sun Nov 12, 2023 5:12 am
bradrn wrote: ↑Sun Nov 12, 2023 4:08 am (Also, may I point out that Emacs is a really bad choice of text editor for such fonts… though then again it now uses HarfBuzz, so I guess that makes it a lot easier. I don’t recall how well it worked, the last time I tried using an OpenType font in Emacs.)

[EDIT: yes, it’s as terrible as I thought it was. Even ligatures require defining them manually and iterating through the text to find them; something more advanced like Noto Nastaliq just completely doesn’t work at all.]
You may have to populate table composition-function-table, which might take a while to be taken notice of. With the Tai Tham script in Emacs 27, my heart fell when I saw <BA, MEDIAL RA, SIGN E> rendered with the SIGN E rendered on the right with a dotted circle before it, but within a few minutes it had been Indicly rearranged to correctly display the three glyphs in reverse order to the three characters. Last time I read the documentation, it seemed to undersell the complex script support.

Seeing that Emacs is my preferred text editor, this is good to know, thanks! Of course, it’s ridiculously baroque compared to how it should be handled, but then again that does describe much of Emacs…

Richard W wrote: ↑Sun Nov 12, 2023 5:47 am You don't need 280 different glyphs to swap 20 heads and 14 bodies. One could most smoothly do that with 14 uninked non-spacing marks, thus:

1) Contextual substitution: Replace head_x body_y by head_x body_y_blanked
2) For each y, a contextual substitution: Replace head_x body_y_blanked by body_y head_x body_y_blanked
The invoked substitution would replace head_x by body_y head_x (280 of these).

That's simple typing.

OK, you’ve lost me here… what’s body_y_blanked supposed to be? And how does it help to flip head_x body_y → body_y head_x (which appears to be the cumulative effect of these rules)?

I do something similar in my Da Lekh font, where I use a pre-defined macro insert_before, so the subtable for the invoked substitution is:
Code:
lookup spawn_ke
    insert_before mai_ke   movepre_letter
    insert_before mai_ke   movepre_inpra 
end lookup
(I do this so I can optionally render a transliteration.) The variables movepre_letter and movepre_inpra represent groups of glyphs.

What software is this code for? I’ve been specifying OpenType properties through FontForge’s dialogs, which is probably a suboptimal way of doing it.

Richard W · Post by **Richard W** » Sun Nov 12, 2023 6:58 am

Richard W wrote: ↑Sun Nov 12, 2023 5:12 am You may have to populate table composition-function-table, which might take a while to be taken notice of. With the Tai Tham script in Emacs 27, my heart fell when I saw <BA, MEDIAL RA, SIGN E> rendered with the SIGN E rendered on the right with a dotted circle before it, but within a few minutes it had been Indicly rearranged to correctly display the three glyphs in reverse order to the three characters. Last time I read the documentation, it seemed to undersell the complex script support.

It looks like a more complicated issue for Nastaliq - https://mail.gnu.org/archive/html/bug-g ... 00908.html.

Richard W · Post by **Richard W** » Sun Nov 12, 2023 7:51 am

bradrn wrote: ↑Sun Nov 12, 2023 6:07 am
Richard W wrote: ↑Sun Nov 12, 2023 5:47 am You don't need 280 different glyphs to swap 20 heads and 14 bodies. One could most smoothly do that with 14 uninked non-spacing marks, thus:

1) Contextual substitution: Replace head_x body_y by head_x body_y_blanked
2) For each y, a contextual substitution: Replace head_x body_y_blanked by body_y head_x body_y_blanked
The invoked substitution would replace head_x by body_y head_x (280 of these).

That's simple typing.
OK, you’ve lost me here… what’s body_y_blanked supposed to be? And how does it help to flip head_x body_y → body_y head_x (which appears to be the cumulative effect of these rules)?

body_y_blanked is one of the 14 uninked non-spacing marks - one for each distinct body - not needed for those with the length mark.

One makes head_x, at least when used as part of a CV combination, into a zero-width non-spacing mark. The horizontal metrics for a CV combination will then be taken from body_y.

bradrn wrote: ↑Sun Nov 12, 2023 6:07 am
Richard W wrote: ↑Sun Nov 12, 2023 5:47 am I do something similar in my Da Lekh font, where I use a pre-defined macro insert_before, so the subtable for the invoked substitution is:
Code:
lookup spawn_ke
    insert_before mai_ke   movepre_letter
    insert_before mai_ke   movepre_inpra 
end lookup
(I do this so I can optionally render a transliteration.) The variables movepre_letter and movepre_inpra represent groups of glyphs.
What software is this code for? I’ve been specifying OpenType properties through FontForge’s dialogs, which is probably a suboptimal way of doing it.

My font compiler, which was compulsorily published at https://wrdingham.co.uk/fonts/oft.html. (The compulsion came from the GNU GPL.) It is perhaps close to the metal, but not as close as TTX. In TTX, which lacks the 'macro', it is, approximately:

Code:

      <Lookup index="58">
        <LookupType value="7"/>
        <LookupFlag value="0"/>
        <!-- SubTableCount=1 -->
        <ExtensionSubst index="0" Format="1">
          <ExtensionLookupType value="2"/>
          <MultipleSubst Format="1">
            <Substitution in="bullet" out="uni1A6E,bullet"/>
            <Substitution in="emdash" out="uni1A6E,emdash"/>
            <Substitution in="endash" out="uni1A6E,endash"/>
            <Substitution in="high_tail" out="uni1A6E,high_tail"/>
            <Substitution in="hyphen" out="uni1A6E,hyphen"/>
            <Substitution in="left_low_tail" out="uni1A6E,left_low_tail"/>
            <Substitution in="multiply" out="uni1A6E,multiply"/>
            <Substitution in="space_nb" out="uni1A6E,space_nb"/>
            <Substitution in="u1A2C_u1A60_u1A2C" out="uni1A6E,u1A2C_u1A60_u1A2C"/>
            <Substitution in="u_body" out="uni1A6E,u_body"/>
            <Substitution in="uni1A20" out="uni1A6E,uni1A20"/>
            <Substitution in="uni1A21" out="uni1A6E,uni1A21"/>
            <Substitution in="uni1A22" out="uni1A6E,uni1A22"/>
            <Substitution in="uni1A23" out="uni1A6E,uni1A23"/>
            <Substitution in="uni1A24" out="uni1A6E,uni1A24"/>
            <Substitution in="uni1A25" out="uni1A6E,uni1A25"/>
            <Substitution in="uni1A26" out="uni1A6E,uni1A26"/>
            <Substitution in="uni1A27" out="uni1A6E,uni1A27"/>
            <Substitution in="uni1A28" out="uni1A6E,uni1A28"/>
...
            <Substitution in="uni25FC" out="uni1A6E,uni25FC"/>
            <Substitution in="uni25FD" out="uni1A6E,uni25FD"/>
            <Substitution in="uni25FE" out="uni1A6E,uni25FE"/>
          </MultipleSubst>
        </ExtensionSubst>
      </Lookup>

My point was partly that, yes, even the 280 lines of text get tedious. The name mai_ke is an alias for the PostScript name uni1A6E. The other point was that this glyph-efficient glyph swapping technique can be found in the source code for Da Lekh.
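If typing those lines by hand is the tedious part, they can be generated. The following is only a sketch in Python; it is not part of the font compiler or of TTX. The function name multiple_subst_lines and the short glyph list are made up for illustration, and the sort by glyph name is an assumption chosen to mimic the ordering in the listing above.

```python
from xml.sax.saxutils import quoteattr


def multiple_subst_lines(inserted, targets, indent=12):
    """Build one TTX-style <Substitution> line per target glyph.

    Each line maps a glyph to the pair "inserted,glyph", i.e. it inserts
    `inserted` before the glyph. Lines are sorted by glyph name (an assumption).
    """
    pad = " " * indent
    return [
        f"{pad}<Substitution in={quoteattr(g)} out={quoteattr(inserted + ',' + g)}/>"
        for g in sorted(targets)
    ]


# Hypothetical subset of the glyphs that need uni1A6E (mai_ke) inserted before them.
targets = ["uni1A20", "bullet", "emdash", "uni1A21"]
for line in multiple_subst_lines("uni1A6E", targets):
    print(line)
```

The generated lines would still have to be pasted into a MultipleSubst subtable by hand or by a larger script.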

bradrn · Post by **bradrn** » Sun Nov 12, 2023 8:15 am

Richard W wrote: ↑Sun Nov 12, 2023 7:51 am
bradrn wrote: ↑Sun Nov 12, 2023 6:07 am
Richard W wrote: ↑Sun Nov 12, 2023 5:47 am You don't need 280 different glyphs to swap 20 heads and 14 bodies. One could most smoothly do that with 14 uninked non-spacing marks, thus:

1) Contextual substitution: Replace head_x body_y by head_x body_y_blanked
2) For each y, a contextual substitution: Replace head_x body_y_blanked by body_y head_x body_y_blanked
The invoked substitution would replace head_x by body_y head_x (280 of these).

That's simple typing.
OK, you’ve lost me here… what’s body_y_blanked supposed to be? And how does it help to flip head_x body_y → body_y head_x (which appears to be the cumulative effect of these rules)?
body_y_blanked is one of the 14 uninked non-spacing marks - one for each distinct body - not needed for those with the length mark.

One makes head_x, at least when used as part of a CV combination, into a zero-width non-spacing mark. The horizontal metrics for a CV combination will then be taken from body_y.

Hmm, I think I’m starting to get it… let me see if I’ve got this straight:

⟨head_x⟩ and ⟨body_y⟩ are separate glyphs, all of which are spacing.
Each ⟨body_y⟩ has a corresponding glyph ⟨body_y_blanked⟩, which contains no ink and is not spacing.
A sequence of ⟨head_x body_y_blanked⟩ is rendered as a non-spacing diacritic, placed on top of the previous glyph
The head and body glyphs need to be flipped so that the head can be made a combining character, which modifies the previous character

In which case, there’s still a few details I’m confused about:

If all the ⟨body_y_blanked⟩ glyphs are identical (uninked and non-spacing), why do we need a separate one for each ⟨body_y⟩?
How is ⟨head_x body_y_blanked⟩ rendered as non-spacing? Through another GSUB?
Why not achieve substitutions 1) and 2) in your quoted post via a single substitution, which maps ⟨head_x body_y⟩ immediately to ⟨body_y head_x body_y_blanked⟩?

sasasha · Post by **sasasha** » Sun Nov 12, 2023 1:14 pm

bradrn wrote: ↑Sun Nov 12, 2023 2:04 am …and it’s alive!

This is really cool! Thank you so much for putting the time in to try it out!!

bradrn wrote: ↑Sun Nov 12, 2023 8:15 am
Richard W wrote: ↑Sun Nov 12, 2023 7:51 am
bradrn wrote: ↑Sun Nov 12, 2023 6:07 am OK, you’ve lost me here… what’s body_y_blanked supposed to be? And how does it help to flip head_x body_y → body_y head_x (which appears to be the cumulative effect of these rules)?
body_y_blanked is one of the 14 uninked non-spacing marks - one for each distinct body - not needed for those with the length mark.

One makes head_x, at least when used as part of a CV combination, into a zero-width non-spacing mark. The horizontal metrics for a CV combination will then be taken from body_y.
Hmm, I think I’m starting to get it… let me see if I’ve got this straight:

⟨head_x⟩ and ⟨body_y⟩ are separate glyphs, all of which are spacing.

Each ⟨body_y⟩ has a corresponding glyph ⟨body_y_blanked⟩, which contains no ink and is not spacing.

A sequence of ⟨head_x body_y_blanked⟩ is rendered as a non-spacing diacritic, placed on top of the previous glyph

The head and body glyphs need to be flipped so that the head can be made a combining character, which modifies the previous character

In which case, there’s still a few details I’m confused about:

If all the ⟨body_y_blanked⟩ glyphs are identical (uninked and non-spacing), why do we need a separate one for each ⟨body_y⟩?

How is ⟨head_x body_y_blanked⟩ rendered as non-spacing? Through another GSUB?

Why not achieve substitutions 1) and 2) in your quoted post via a single substitution, which maps ⟨head_x body_y⟩ immediately to ⟨body_y head_x body_y_blanked⟩?

So grateful to both of you for this discussion ‒ I'm infinitely more informed now than I was before!

Richard W · Post by **Richard W** » Sun Nov 12, 2023 5:51 pm

bradrn wrote: ↑Sun Nov 12, 2023 8:15 am
Richard W wrote: ↑Sun Nov 12, 2023 7:51 am
bradrn wrote: ↑Sun Nov 12, 2023 6:07 am

OK, you’ve lost me here… what’s body_y_blanked supposed to be? And how does it help to flip head_x body_y → body_y head_x (which appears to be the cumulative effect of these rules)?
body_y_blanked is one of the 14 uninked non-spacing marks - one for each distinct body - not needed for those with the length mark.

One makes head_x, at least when used as part of a CV combination, into a zero-width non-spacing mark. The horizontal metrics for a CV combination will then be taken from body_y.
Hmm, I think I’m starting to get it… let me see if I’ve got this straight:

⟨head_x⟩ and ⟨body_y⟩ are separate glyphs, all of which are spacing.

Each ⟨body_y⟩ has a corresponding glyph ⟨body_y_blanked⟩, which contains no ink and is not spacing.

A sequence of ⟨head_x body_y_blanked⟩ is rendered as a non-spacing diacritic, placed on top of the previous glyph

The head and body glyphs need to be flipped so that the head can be made a combining character, which modifies the previous character

In which case, there’s still a few details I’m confused about:

If all the ⟨body_y_blanked⟩ glyphs are identical (uninked and non-spacing), why do we need a separate one for each ⟨body_y⟩?

How is ⟨head_x body_y_blanked⟩ rendered as non-spacing? Through another GSUB?

Why not achieve substitutions 1) and 2) in your quoted post via a single substitution, which maps ⟨head_x body_y⟩ immediately to ⟨body_y head_x body_y_blanked⟩?

I didn't quite get it. I think I also need to change head_x to head_dia_x, which means my macro wouldn't work. The glyph head_x is spacing, but head_dia_x is not. Mapping the character represented by 'x' directly to a mark would be unnecessarily moving into unspecified territory. (I think in terms of character encoding and the OpenType specification, rather than being oriented towards many font editors.)

So, for the CV combinations without a length mark, the mapping goes:

1) Backing store has characters <x><y>.
2) The CMAP table converts this to the glyph sequence head_x body_y
3) One contextual lookup converts this to head_x body_y_blanked
4) The 14 contextual lookups then convert this to body_y head_dia_x body_y_blanked. I'm assuming that bodies without a head can also be implemented by glyph body_y.
5) With any luck, the cursive positional lookups will then work as intended.

We need the 14 separate body_y_blanked glyphs to provide the context for the 14 contextual lookups.
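Steps 1 to 4 can be walked through on paper with a toy model. The sketch below is plain Python, not font code: the head and body inventories are hypothetical three-member subsets, the one-letter character names are invented, and each function only imitates the effect of the corresponding lookup rather than how a shaper would apply it.

```python
HEADS = {"p", "t", "k"}   # hypothetical subset of the 20 heads
BODIES = {"a", "i", "u"}  # hypothetical subset of the 14 bodies


def cmap(chars):
    """Step 2: map backing-store characters to head_x / body_y glyph names."""
    glyphs = []
    for c in chars:
        if c in HEADS:
            glyphs.append("head_" + c)
        elif c in BODIES:
            glyphs.append("body_" + c)
        else:
            raise ValueError(f"unmapped character: {c!r}")
    return glyphs


def blank_bodies(glyphs):
    """Step 3: a body directly after a head becomes body_y_blanked."""
    out = list(glyphs)
    for i in range(1, len(out)):
        if out[i - 1].startswith("head_") and out[i].startswith("body_"):
            out[i] += "_blanked"
    return out


def spawn_bodies(glyphs):
    """Step 4: before body_y_blanked, head_x expands to body_y head_dia_x."""
    out = []
    for i, g in enumerate(glyphs):
        nxt = glyphs[i + 1] if i + 1 < len(glyphs) else ""
        if g.startswith("head_") and nxt.endswith("_blanked"):
            body = nxt[: -len("_blanked")]  # the blanked mark says which body to spawn
            out += [body, "head_dia_" + g[len("head_"):]]
        else:
            out.append(g)
    return out


def shape(chars):
    return spawn_bodies(blank_bodies(cmap(chars)))


print(shape("pa"))  # ['body_a', 'head_dia_p', 'body_a_blanked']
```

In this model the blanked glyph is the only thing that tells step 4 which body to insert, which is why each body needs its own one.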

head_dia_x and body_y_blanked are made non-spacing by specifying their advance widths as zero. I would also ultimately declare them as marks in the GDEF table - I'm not sure how one does that in the AFDKO notation. I would also put the body_y_blanked in the mark attachment class (OpenType specification terminology) for marks that almost don't interact. In my compiler, those glyph attributes can be assigned by 'category mark invisible', but I haven't worked out how to do it in AFDKO. (As a matter of policy, I assign all glyphs to one of the four categories base, ligature, component and mark, but some font editors will default these categories.) Confusingly, AFDKO uses the term attachment class for the partitioning of the attached marks in the mark2* lookups, and I haven't worked out how to do the partitioning of marks into mark attachment classes using the AFDKO.

bradrn wrote: ↑Sun Nov 12, 2023 8:15 am Why not achieve substitutions 1) and 2) in my quoted post via a single substitution, which maps ⟨head_x body_y⟩ immediately to ⟨body_y head_x body_y_blanked⟩?

Well, according to the OpenType specification, there is a way of doing it. In my compiler's notation:

Code:

GSUB
    lookup bigbang
        type context
        subtable bigbang
    end lookup

    lookup expand_a
        type multiple
        subtable expand_a
    end lookup
...
    lookup blank
        type single
    end lookup
end GSUB

classification for_bigbang
    empty bb_0 
    class bb_head is coverage head_p to head_r -- If we use an obvious glyph numbering
    class bb_a is body_a
...
    class bb_ua is body_ua
    class bb_i is body_i
    class bb_o is body_o
end classification

lookup bigbang -- This subtable of a context lookup uses a 'classification' to specify the lookup.  (AFDKO chooses mode itself.)
    context bb_head bb_a
    0 expand_a -- Lookup to apply to 1st glyph
    2 blank -- Lookup to apply to 2nd glyph - I had written index '1', but of course it should be '2' because the first glyph
            -- expands to two glyphs.
    ...
    context bb_head bb_ua
    0 expand_ua -- Lookup to apply to 1st glyph
    2 blank -- Lookup to apply to 2nd glyph

    context bb_head bb_i
    0 expand_i -- Lookup to apply to 1st glyph
    2 blank -- Lookup to apply to 2nd glyph
    ...
    context bb_head bb_o
    0 expand_o -- Lookup to apply to 1st glyph
    2 blank -- Lookup to apply to 2nd glyph
end lookup

lookup expand_a
    head_p > body_a head_dia_p
    ...
    head_r > body_a head_dia_r
end lookup

lookup blank
    bb_a > body_a_blanked
...
    bb_ua > body_ua_blanked
    bb_i  > body_i_blanked
    bb_o  > body_o_blanked
end lookup

This works with HarfBuzz.
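To see why the second record needs index 2 rather than 1, here is a toy model in Python. It assumes the reading used above, namely that each index is counted in the sequence as it stands after the earlier records have run. The function apply_records and the two small tables are invented for this sketch and cover only head_p, head_r and body_a.

```python
# Hypothetical miniature versions of the expand_a and blank lookups.
EXPAND_A = {
    "head_p": ["body_a", "head_dia_p"],
    "head_r": ["body_a", "head_dia_r"],
}
BLANK = {"body_a": ["body_a_blanked"]}


def apply_records(glyphs, start, records):
    """Apply (index, table) records in order to the match beginning at `start`.

    The index is resolved against the buffer as already modified by earlier
    records, so an expansion at index 0 shifts everything after it.
    """
    out = list(glyphs)
    for index, table in records:
        pos = start + index
        if pos < len(out) and out[pos] in table:
            out[pos:pos + 1] = table[out[pos]]
    return out


matched = ["head_p", "body_a"]

# Index 2: after head_p has become two glyphs, the body sits at position 2.
print(apply_records(matched, 0, [(0, EXPAND_A), (2, BLANK)]))
# ['body_a', 'head_dia_p', 'body_a_blanked']

# Index 1 now points at head_dia_p, which 'blank' does not cover.
print(apply_records(matched, 0, [(0, EXPAND_A), (1, BLANK)]))
# ['body_a', 'head_dia_p', 'body_a']
```

A renderer that counts indexes in the original, unexpanded sequence would behave differently, which fits the kind of disagreement between implementations described below.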

It fails on Windows using Uniscribe/DirectWrite. (OpenType is delicate. I suspect buffering problems.) And that's the killer for me. If I only wanted my fonts to work on Linux, I'd use (SIL) Graphite, not OpenType, and Windows users would have to use applications that support Graphite fonts, which has meant just LibreOffice and Firefox, and iPhone users could go hang.

As a matter of policy, OpenType does not do glyph swapping, insertion or deletion. These are supposed to be in the domain of script-specific shapers. The jack-booted USE has relaxed that a bit, but you're still handicapped without a standardised Unicode encoding.

PS: One might try substituting the second glyph first, as apparently allowed by the OpenType syntax specification, but some (all?) renderers, e.g. HarfBuzz and m17n, insist that the indexes be in strictly ascending order. AAT's OpenType implementation has difficulty with the concept of glyphs' indexes changing as the subsidiary substitutions progress, as did HarfBuzz's until I pointed out what the specification said.

bradrn · Post by **bradrn** » Sun Nov 12, 2023 11:38 pm

Richard W wrote: ↑Sun Nov 12, 2023 5:51 pm So, for the CV combinations without a length mark, the mapping goes:

1) Backing store has characters <x><y>.
2) The CMAP table converts this to the glyph sequence head_x body_y
3) One contextual lookup converts this to head_x body_y_blanked
4) The 14 contextual lookups then convert this to body_y head_dia_x body_y_blanked. I'm assuming that bodies without a head can also be implemented by glyph body_y.
5) With any luck, the cursive positional lookups will then work as intended.

We need the 14 separate body_y_blanked glyphs to provide the context for the 14 contextual lookups.

OK, this is starting to make more sense now. I hadn’t realised you were still using cursive positional lookups. Presumably, this method requires the cursive exit anchor to be placed at the top of the body, right?

(That being said, I’m still not entirely sure what body_y_blanked is for — which of the ‘contextual lookups’ are you referring to in that last sentence?)

If I only wanted my fonts to work on Linux, I'd use (SIL) Graphite, not OpenType, and Windows users would have to use applications that support Graphite fonts, which has meant just LibreOffice and Firefox, and iPhone users could go hang.

And don’t forget LuaTeX!

As a matter of fact, this may actually be a plausible option, depending on what sasasha wants to use this font for. I’ve never used Graphite myself, but it sounds much less painful than messing around with OpenType tables.

Richard W · Post by **Richard W** » Mon Nov 13, 2023 3:11 am

bradrn wrote: ↑Sun Nov 12, 2023 11:38 pm
Richard W wrote: ↑Sun Nov 12, 2023 5:51 pm So, for the CV combinations without a length mark, the mapping goes:

1) Backing store has characters <x><y>.
2) The CMAP table converts this to the glyph sequence head_x body_y
3) One contextual lookup converts this to head_x body_y_blanked
4) The 14 contextual lookups then convert this to body_y head_dia_x body_y_blanked. I'm assuming that bodies without a head can also be implemented by glyph body_y.
5) With any luck, the cursive positional lookups will then work as intended.

We need the 14 separate body_y_blanked glyphs to provide the context for the 14 contextual lookups.
OK, this is starting to make more sense now. I hadn’t realised you were still using cursive positional lookups. Presumably, this method requires the cursive exit anchor to be placed at the top of the body, right?

Yes. Incidentally, I think cursive connection requires the TrueType flavour of glyph definitions.

bradrn wrote: ↑Sun Nov 12, 2023 11:38 pm (That being said, I’m still not entirely sure what body_y_blanked is for — which of the ‘contextual lookups’ are you referring to in that last sentence?)

The 14 contextual lookups at Step 4 above. In earlier versions of Da Lekh, I cleaned up the equivalents (ghosts of shifted preposed vowels), but then I found they were pre-adaptions for transliteration and grammatical mark-up (as an aid to spell-checking).

sasasha · Post by **sasasha** » Mon Nov 13, 2023 6:24 am

Phonology

Consonants

Classical Seguwe-akhe has the following consonants:

	Labial	Dental	Alveolar	Post-Alveolar / Palatal	Velar
Unaspirated unvoiced stop	p		t		k
Unaspirated voiced stop	b		d		g
Aspirated stop	ph		th		kh
Nasal	m		n
Lateral affricate			tl
Unvoiced fricative			s	š
Voiced fricative		ð	z
Lateral approximant			l
Trill/tap			r
Approximant	w			y

The alveolar series is generally laminal, except ⟨s z⟩, which are typically more apical. The postalveolar fricative ⟨š⟩ tends towards a laminal pronunciation in lower-class speech, and apical in upper-class speech; though this varies by dialect. All the coronal consonants are generally more laminal and retracted in lower registers, with apical realisations a marker of prestige.

⟨ð⟩ is a relatively marginal phoneme occurring mainly in particles and as a separator at certain morpheme boundaries. However, it is found in several common lexical items. Its surface realisation varies from /ð/ to /ɣ/, or merges with ⟨y⟩ /j/ in some dialects.

Vowels

There are three vowels, with a two-way length distinction, and two diphthongs.

		u ū
e ē
	a ā

ae /æ͜ɪ/
au /ɑ͜ʊ/

The Classical vowel-system distinguishes vowel height and roundedness, but not front/back distinctions. (Modern varieties do not display this trait.) ⟨e ē⟩ are mid-high, and commonly fronted, whilst ⟨u ū⟩ are high, rounded, and tend towards backing. Surface realisations of the vowels vary according to sandhi and sociolinguistic factors.

Phonotactics

Syllables in Classical Seguwe-akhe may only take the form (C)V, where V may be any of the eight vowel segments.

Stress

Seguwe-akhe is agglutinating, with relatively long words common. A weak stress accent on the penult of roots is common, but is affected by a variety of phonological factors. Different compounding, derivational and grammatical morphological processes affect the realisation of the stress accent. In addition, stress patterns vary according to dialect and sociolinguistic factors.


bradrn · Post by **bradrn** » Mon Nov 13, 2023 8:23 am

Richard W wrote: ↑Mon Nov 13, 2023 3:11 am
bradrn wrote: ↑Sun Nov 12, 2023 11:38 pm (That being said, I’m still not entirely sure what body_y_blanked is for — which of the ‘contextual lookups’ are you referring to in that last sentence?)
The 14 contextual lookups at Step 4 above. In earlier versions of Da Lekh, I cleaned up the equivalents (ghosts of shifted preposed vowels), but then I found they were pre-adaptions for transliteration and grammatical mark-up (as an aid to spell-checking).

I think I’m not making my confusion entirely clear. The two mappings you listed are as follows:

head_x body_y → head_x body_y_blanked (1 rule)
head_x body_y_blanked → body_y head_dia_x body_y_blanked (14 rules)

But as far as I can see, this could just as easily be done with only one mapping, namely:

head_x body_y → body_y head_dia_x (14 rules)

On the basis that if it’s possible to write a rule to insert a body_y given a body_y_blanked, it should be just as possible to insert a body_y given a body_y.

So what am I missing here?

sasasha wrote: ↑Mon Nov 13, 2023 6:24 am (Question: am I overcomplicating this?! I don't think stress is exactly free, but it's quite difficult to find the patterns underlying my instincts.)

Yes, this seems somewhat ridiculously baroque to me. More to the point, it’s a reasonably strong linguistic universal that stress position cannot depend on the value of the syllable onset.

That being said, simple stress-assignment rules can still give surprisingly complex outcomes. I find it helpful to tinker around with rules for left-to-right and right-to-left assignment of iambs and trochees, especially when adding extrasyllabicity constraints into the mix. (Hayes’s Metrical Stress Theory is a nice book on the topic, albeit slightly too definite on some topics.)

sasasha · Post by **sasasha** » Mon Nov 13, 2023 9:09 am

bradrn wrote: ↑Mon Nov 13, 2023 8:23 am Yes, this seems somewhat ridiculously baroque to me. More to the point, it’s a reasonably strong linguistic universal that stress position cannot depend on the value of the syllable onset.

That being said, simple stress-assignment rules can still give surprisingly complex outcomes. I find it helpful to tinker around with rules for left-to-right and right-to-left assignment of iambs and trochees, especially when adding extrasyllabicity constraints into the mix. (Hayes’s Metrical Stress Theory is a nice book on the topic, albeit slightly too definite on some topics.)

Oh, fabulous, thank you! I will adjust another time.

Richard W · Post by **Richard W** » Mon Nov 13, 2023 1:15 pm

bradrn wrote: ↑Mon Nov 13, 2023 8:23 am Yes, this seems somewhat ridiculously baroque to me. More to the point, it’s a reasonably strong linguistic universal that stress position cannot depend on the value of the syllable onset.

Yea! Another oddity for English (Middle Class Southern English?), where the vowel after a cluster -Ch- has to be stressed. (If it weren't, the /h/ wouldn't be sounded.)

Richard W · Post by **Richard W** » Mon Nov 13, 2023 2:05 pm

bradrn wrote: ↑Mon Nov 13, 2023 8:23 am I think I’m not making my confusion entirely clear. The two mappings you listed are as follows:

head_x body_y → head_x body_y_blanked (1 rule)
head_x body_y_blanked → body_y head_dia_x body_y_blanked (14 rules)

But as far as I can see, this could just as easily be done with only one mapping, namely:

head_x body_y → body_y head_dia_x (14 rules)

On the basis that if it’s possible to write a rule to insert a body_y given a body_y_blanked, it should be just as possible to insert a body_y given a body_y.

So what am I missing here?

A command to replace one sequence of glyphs by another. Formally, there are 8 types of substitution command - single, multiple, alternate, ligature, context, chained, extension and reverse. Extension is just a way of having large gaps between parts of the encoding of a substitution, and reverse just allows a dependency to ripple through from the end of a run of glyphs. Alternate is like single, but allows the replacement to be chosen by some external mechanism. Context is a special case of chained. (CSS Level 3 Fonts Module font-feature-settings defines the most intelligible mechanism.) That leaves:

single (replace one glyph by another, depending on what's being replaced),
multiple (replace one glyph by 2 or more depending on what's being replaced, though I've not seen an implementation that doesn't allow one or more)
ligature
chained - match sequences ABC, with B non-null and process B as defined by other lookups, and repeat starting at C, or advance one non-ignored glyph and repeat. A and C may be null, and there may be multiple combinations ABC to process.

Basically, a chained substitution is your only way forward.
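To make the chained case concrete, here is a rough Python model of "match ABC, process B, resume at C". It is a sketch, not a shaper: the function chained and its argument layout are invented here, A, B and C are given as lists of glyph sets, and ignored glyphs (lookup flags) are left out entirely.

```python
def chained(glyphs, A, B, C, actions):
    """Toy chained-context substitution.

    A, B, C: lists of glyph sets for backtrack, input and lookahead.
    actions: (index within B, table) pairs; each table maps a glyph to its
    replacement sequence, like a single or multiple substitution.
    """
    out, i = list(glyphs), 0
    while i < len(out):
        lo, mid, hi = i - len(A), i + len(B), i + len(B) + len(C)
        matches = (
            lo >= 0
            and hi <= len(out)
            and all(g in s for g, s in zip(out[lo:i], A))
            and all(g in s for g, s in zip(out[i:mid], B))
            and all(g in s for g, s in zip(out[mid:hi], C))
        )
        if not matches:
            i += 1  # advance one glyph and try again
            continue
        piece = out[i:mid]
        for index, table in actions:
            if index < len(piece) and piece[index] in table:
                piece[index:index + 1] = table[piece[index]]
        out[i:mid] = piece
        i += len(piece)  # resume at C
    return out


# One rule per body: body_a_blanked as lookahead selects the 'insert body_a' table.
insert_body_a = {"head_p": ["body_a", "head_dia_p"]}
result = chained(
    ["head_p", "body_a_blanked"],
    A=[], B=[{"head_p"}], C=[{"body_a_blanked"}],
    actions=[(0, insert_body_a)],
)
print(result)  # ['body_a', 'head_dia_p', 'body_a_blanked']
```

Nothing in this model rewrites a whole sequence in one go; the context only selects which single or multiple substitution gets applied to a glyph in B.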

bradrn · Post by **bradrn** » Mon Nov 13, 2023 5:10 pm

Richard W wrote: ↑Mon Nov 13, 2023 2:05 pm
bradrn wrote: ↑Mon Nov 13, 2023 8:23 am I think I’m not making my confusion entirely clear. The two mappings you listed are as follows:

head_x body_y → head_x body_y_blanked (1 rule)
head_x body_y_blanked → body_y head_dia_x body_y_blanked (14 rules)

But as far as I can see, this could just as easily be done with only one mapping, namely:

head_x body_y → body_y head_dia_x (14 rules)

On the basis that if it’s possible to write a rule to insert a body_y given a body_y_blanked, it should be just as possible to insert a body_y given a body_y.

So what am I missing here?
A command to replace one sequence of glyphs by another.

Ah! It all makes sense now, thanks! I had thought chained substitutions were more flexible than they actually are.

sasasha · Post by **sasasha** » Sat Nov 18, 2023 5:59 am

Big info dump, to sort out:

Introduction to the Aretian language family
(From an in-world Felonian Era Felonian-centric perspective)

According to Old Urngese legend, the ancient peoples of Aretia who survived the end of the world were gathered by the great Vashari Aum onto the slopes of Mount Kyunaret, and were found to be of three broad tribes: the White (Hujar) tribe, the Green (Kiavik) tribe, and the Red (Jakka) tribe. To escape the desolation of the world they agreed to co-operate in building Jakuk-Niaunun (cf. Ainyn jøkuć ‘cave-home’ njynan ‘mountain-gate’, njynunei ‘mountain-town’), an underground palace of remarkable beauty within the mountain, which would house the god Kiau (cf. Kyunaret) and keep the people safe from the ravaging demons (aurunki) outside until the earth could be repopulated. When the danger subsided, each tribe went its separate way: the Hujar pushed north to hunt in the earth's slow-waxing memory of winter, the Kiavik west to settle the new, virgin forests, and the Jakka south to heal the soil.

White Aretians
The White Aretians / Snow Tribe left first, and travelled north, following the migrating herds and birds, where, with the cooperation of powerful Vashari, they founded Birdland (the legendary White Aretian homeland). Here they encountered the rival Glass Tribe (the ancient Toi·oi), against whom many wars were fought. The two sides eventually came to a confederation and co-founded the Kingdom of Urngas, though later conflict arose again and an army from Birdland led by (anti-)hero Kureselmi conquered Glass Island (Ajinuk), ravaging it in the process and destroying or exiling the Glass Tribe. (Kureselmi famously deceived a Vashari lord into giving him aid to cross the ocean, convincing the Vashari that the Glass Tribe were fashioning weapons designed to take Vashari out of the sky, and that the Vashari must help destroy the tribe’s glass houses to interrupt production of the weapon.)

The principal White Aretian deities were (in Kelsi) Yumta, the goddess of the quick northern summer, in which flowers carpet the earth; her consort Sennifín, the Lord of Mosquitoes; Moi, the summer-dormant elk-god whose giant waking body sheds snow; and his consort Šelm, the goddess of the sky, who lives in the full moon, and whose principal task is guarding the Bridge of Heaven at night. Kiu, the Lord of the Mountain, is said to reside inside Kyunaret (Urngese Kiau).

A White Aretian group migrated more easterly than the others, and established the nation of Iutarnum. Later this area was conquered by a Red Aretian (Nicufenkic) people, and became Jodarn.

(At the beginning of the Classical Era, the seafaring Róruans arrived and conquered first Ajinuk and then Urngas (and parts of Baugo), which was seen as the retribution of the Glass Tribe for the actions of Kureselmi; these areas were subject to the Empire of the East for a time, but after several hundred years the Róruan polity disintegrated, leaving relatively little trace; new kings of Urngas and Ajinuk were called from ‘Birdland’ (by then identified with Tumšaǧ, a large Kelsi polity), and both areas retained a distinct cultural identity throughout the Róruan period.)

Green Aretians
The Green Aretians / Leaf Tribe chose to live in the dense upland and mid-high-latitude coniferous- and temperate rain-forests of Aretia, and consequently spread west along the vast band of this terrain in Northern Aretia; hence they acquired the name Arethlanians (lan = west). They were skilled in both hunting and gathering, and garden agriculture. They frequently assimilated isolated tribes whom they encountered (many of whom spoke forms of Old East Orlogian, hence the significant OEO substrate in Arethlanian languages), adopting local agricultural practice and many customs, but usually imposing (with the aid of their relatively sophisticated material culture) their language and overall belief and political systems as they went. Eventually the Arethlanian languages split into many divergent branches, including Central Alpine, Central Silvine, Baugoic (inc. Old Beyvin; Middle Beyvin being formed by heavy influence from Rochûce), Baulanic, Yarrobamic, Felonian, Aeolic and Panaphonic.

Red Aretians
The Red Aretians / Soil Tribe based their migrations on the reclamation of the soils (see the Great Kudzu) and the cultivation of crops on a wide scale, especially wheat in the West (e.g. the Iozhi and Chiacoic areas), and redroot in the East (the Wengal and Nicufenki areas). They also became keen pastoralists (Wengal and Chiacoic - mainly pastoralists; Iozhi and Nicuphenkic - mainly agriculturalists). They encountered fewer survivalist tribes than the Arethlanians, since the lands which they reclaimed had been, in general, abandoned during the Thermal Crisis. Their languages all derived from a common source, Proto-Red-Aretian, which was spoken in the foothills and grasslands south of the Aretian Alps for several centuries before its speakers dispersed, using the Aretian pony as a means of exploration, exchange, colonisation and conquest.

On Dating the Thermal Crisis

Aretian scholarship of the late Felonian era held that the Thermal Crisis occurred roughly around 4200 to 4000 BSE:

Thermal Crisis ~BSE 4200‒4000
VE 5· BSE 4000‒3001
NE 4· BSE 3000‒2001
AE 3· BSE 2000‒1001
RE 2· BSE 1000‒1
SE 1· SE 0‒999
FE (0)· SE 1000‒1999
ME (1)· SE 2000‒
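
For anyone wanting to place a date in the traditional scheme, the table above can be read as a simple lookup. The following is only a rough sketch: the function name `era_of` and the choice to treat the era labels as opaque strings are my own assumptions, and the Thermal Crisis row is left out because its dates are given only approximately.

```python
from typing import Optional

# (label, epoch, first year, last year), copied from the table above.
# BSE years count down towards SE 0; the last era is open-ended (None).
ERAS = [
    ("VE 5", "BSE", 4000, 3001),
    ("NE 4", "BSE", 3000, 2001),
    ("AE 3", "BSE", 2000, 1001),
    ("RE 2", "BSE", 1000, 1),
    ("SE 1", "SE", 0, 999),
    ("FE (0)", "SE", 1000, 1999),
    ("ME (1)", "SE", 2000, None),
]


def era_of(epoch: str, year: int) -> Optional[str]:
    """Return the era label for a year in the given epoch ('BSE' or 'SE').

    Returns None when the year falls outside every tabulated era
    (hypothetical helper, not part of the original scheme).
    """
    for label, era_epoch, first, last in ERAS:
        if epoch != era_epoch:
            continue
        if last is None:
            # Open-ended era: everything from `first` onwards.
            if year >= first:
                return label
        elif min(first, last) <= year <= max(first, last):
            return label
    return None
```

For example, `era_of("BSE", 3500)` gives `"VE 5"`, and a year before BSE 4000 gives `None`, since the table does not assign the Crisis itself to a numbered era.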

In actuality, carbon dating has suggested that the Thermal Crisis occurred as much as two thousand years earlier, and that the Padin domes collapsed at least a thousand years before the Iozhi chronicles intimate (so we can no longer trust even the earliest Iozhi accounts of the Padin as anything beyond speculative fiction, informed, perhaps, by insightful myth). How to account for the missing millennia is anyone's guess.

Myth-makers may have used older stories of the Crisis to legitimate claims to hegemony; for instance, the Aretian myths regarding refuge in Kyunaret may preserve actual memories of the Crisis, but have recast the characters completely to fit their own world-view. Potentially, the Proto-Aretians were just a successful culture who came to dominate Aretia long after the land had been ‘reclaimed’.

Proto-Aretian

Phonology

Vowels

IPA
i u
æ ɑ

Orthography
i u
a ā

With length distinction

Consonants
(Orthography)

p t k k̆
f s x x̆ (H)
m n ŋ ŋ̆ (N)
l
v r j w

Phonotactics

(N,H)(C)V(V)(C2)(H)

C2 can be -p, -t, -k, -k̆, -m, -n, -ŋ, -ŋ̆, -r
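
The template above can be sketched as a regular expression, which is handy for checking candidate roots. This is a tentative reading rather than a definitive one: I am assuming that N is the nasal row, H the fricative row, C any consonant in the table, that forms are written in IPA (æ, ɑ) with ː marking length on either vowel, and that the breve is the combining character U+0306. The name `fits_template` is made up for illustration.

```python
import re
import unicodedata

BREVE = "\u0306"  # combining breve, as in k̆, x̆, ŋ̆ (assumed encoding)

STOPS = ["p", "t", "k", "k" + BREVE]
FRICATIVES = ["f", "s", "x", "x" + BREVE]  # class H
NASALS = ["m", "n", "ŋ", "ŋ" + BREVE]      # class N
OTHERS = ["l", "v", "r", "j", "w"]
CONSONANTS = STOPS + FRICATIVES + NASALS + OTHERS  # class C
VOWELS = ["i", "u", "æ", "ɑ"]
# C2 as listed: -p, -t, -k, -k̆, -m, -n, -ŋ, -ŋ̆, -r
CODAS = ["p", "t", "k", "k" + BREVE, "m", "n", "ŋ", "ŋ" + BREVE, "r"]


def alt(segments):
    """Build a non-capturing alternation, longest segments first."""
    ordered = sorted(segments, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(s) for s in ordered) + ")"


# A vowel optionally followed by the IPA length mark (an assumption).
VOWEL = alt(VOWELS) + "ː?"

# (N,H)(C)V(V)(C2)(H)
SYLLABLE = re.compile(
    alt(NASALS + FRICATIVES) + "?"
    + alt(CONSONANTS) + "?"
    + VOWEL
    + "(?:" + VOWEL + ")?"
    + alt(CODAS) + "?"
    + alt(FRICATIVES) + "?"
)


def fits_template(form: str) -> bool:
    """True if the whole form matches the syllable template."""
    normalised = unicodedata.normalize("NFD", form)
    return SYLLABLE.fullmatch(normalised) is not None
```

Under these assumptions `fits_template("iæːk")` and `fits_template("ɑur")` are both true, while something like `fits_template("ækl")` is false, because l is neither a permitted C2 nor a member of H.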

Some cognates

°iæːk- > Hujar °iakka
> OUr. jakka
> Ain. eka
> Kel. yahka
> Jakka °ioki
> Aiv. iyaš
> Ioz. iozhi
> MIo. zhôzhi
> Jod. yogi
> Nik. zhoukʉ
> Kya. kyōhi
> Wen. shök
> Kjavik °eaħ
> Eth. ei
> Bag. eaħ
> Bey ïàf
> Yar. ɛlɣ
> Bli. elagh

°iāk- > Huj. °iāka
> OUr. juka
> Ain. jøć
> Kel. yunka
> Jak. °ioka
> Aiv. iyâk
> Ioz. ioga
> MIo. zhɔ̂va
> Jod. yova
> Nik. zhāfa
> Kya. kyoka
> Wen. shok
> Kjavik °eāħ
> Eth. och
> Bag. eoħ
> Bey œ̈w
> Yar. yewɣ
> Bli. llevŭgh

°āur- > Huj. °aur
> OUr. aur
> Ain. or
> Kel. aor
> Jak. °our
> Ioz. or
> Jod. our
> Wen. ûz
> Kja. °āur
> Eth. ar
> Bag. awr
> Bey árh

Features:

Hujar Kjavik Jakka
Animate/inan ✓ ✓ ✓
Evidentiality ✓
Construct st. ✓
Ergative inan. ✓ ✓
…

Classical languages (from which Proto-Aretian may be reconstructed):

Jakka branch:
Old Aivan (literature)
Old Wengal (scattered inscriptions)
> Wengal
Middle Chiacoic (scattered inscriptions)
> Shawwa
> Kyakoprak
> Yau
Classical Iozhi (literature)
> Iozhi
Classical Kufi (literature)
> Nikufenki
> Jodarn

Kjavik branch:
Late Panaphonic (literature)
> Felonian
Bolanic (literature, much lost)
Old Bliken (scattered inscriptions)
Baugoic (scattered inscriptions)
Old Yarven (small body of literature)
> Yarrobam

An old (outdated) table of languages spoken in Aretia, to be updated: https://docs.google.com/document/d/19l8 ... p=drivesdk

A newer diagram showing the Proto-Aretian languages, to digitise: https://drive.google.com/file/d/1eyHdvh ... p=drivesdk

sasasha · Post by **sasasha** » Mon Dec 04, 2023 4:26 am

Padin and Iozhi society
The Gravedigger and the Midwife of a Burned Continent

[Extract from a late Steel Age scholarly text written in the Ethi language, published and printed by the University of Fondring. NB This text takes a moderate/hybrid position in the early modern Aretian Secularism debate, when archaeological evidence and divers global perspectives have been added to the mix: the ancients are acknowledged, but discussion of rings, of Sentinels, and of the Talent and the divine are all avoided.]

1.

Between the Red Days and the collapse of the Padin domes, large-scale urban civilisation in Aretia (and the world) came to a close for nearly a millennium. We know very little of the detail, but conflict over resource scarcity and constant streams of refugees fleeing flooded coastal lands, desiccated southerly climes and random outcroppings of war and pestilence shredded nations and dislodged them from their geographical homes, until nations ceased to exist at all as anything other than distant memories in the legends of a few ragged tribes. Food production and distribution spun out of control into chaos and, aided by the agencies of greed and climate catastrophe, left populations with either vast rotting reserves of food or little to eat besides each other. A layer of ash and bone archaeologists know as the Old World Cline testifies to a period of conflagration, starvation, widespread violence, cannibalism and mass extinction. What was left of the Genžië forests burned to the snow-line, which slowly retreated and then disappeared under the victorious sun; the plains burned and cracked, becoming vast graveyards to life of all kinds; and great storms washed the exhausted soils into the ocean. (It is sobering to note that the name of our continent derives from a Proto-Aretian word meaning ‘burned’, *āur.)

No doubt small groups of forgotten people survived in this shocking landscape, sheltering in the constructions of the Old Ones. Where built of durable materials, these must have remained on the land in many places as eerie imprints of silenced generations. But, with few exceptions, the archaeologists can say little about them: over the centuries that followed, the cities, roads and vehicles of the Old Ones were dissolved in a reactive solution of coarse time, unbridled nature, recalcitrant flame and desperate human ingenuity. To the archaeologists, durable Old World artefacts “pepper” the soil for two to three hundred years after the Old World Cline, filtered by increasingly sparing availability, until eventually they disappear: they were all either used into nothingness, or lost beneath the loose, ashy sands of the new era.

There are glaring exceptions to this trend. The awe-inspiring remains left in Padidoi, half-buried in the desert that came to gorge upon the once rich land of Nyangmar, stand to show how far the concentrated efforts of humankind can take us when prompted by extreme adversity ‒ and how much our species will fight to maintain our ways of life only to heighten our eventual undoing.

The Padin can be said to have experienced a golden age while the continent's (and practically the entire planet's) other peoples were engulfed in suffering and collapse. Previously a democratic confederation of alluvial city-states with a large, industrialised population, advanced agricultural techniques and a scientific outlook, the Padin had been unconsciously preparing their society for the storm since before it was even understood to be imminent. The story of their lonely success and ultimate collapse is chilling to this day: a tale of corporate political radicalism, psychotic xenophobia, stunningly ingenious construction, and deliberate murder on a scale perhaps never encountered in the world before or since.

The Padin are the earliest society about whom we have detailed knowledge. However, their own records were kept in a form so miraculous that it either left no trace, or remains invisible to us. Most of what we do know is written in the annals of their neighbours the Iozhi, who inherited some of the broken shards of their society and produced one of the brightest flames ‒ perhaps the very brightest ‒ of the continent's next wave of urban civilisation. (Other fragments come from the tales of the Toi'oi, far to the north, from the myths of the eastern Nikufenki and Syamomein, and from the heavy, ponderous songs of the Vashari.) The archaic Iozhi pictographic texts lay lost for centuries, well into our own era; their rediscovery and decipherment in recent times have yielded the challenging secrets of the Padin to us.

But for now, we will skip over the Padin story, which is the telling of the end of the world that went before, and turn to the story of the voice by which it was told: that of the Iozhi, the ancient resuscitators of south-west Aretia's breathless soils, the tellers of the start of the world that we call New.

🝮

According to Old Urngese legend, the ancient peoples of Aretia who survived the end of the world were gathered by the great Vashari Aum onto the slopes of Mount Kyunaret, and were found to be of three broad tribes: the White (Hujar) tribe, the Green (Kiavik) tribe, and the Red (Jakka) tribe. To escape the desolation of the world they agreed to co-operate in building Jakuk-Niaunun (c.f. Ainyn jøkuć ‘cave-home’, njynan ‘mountain-gate’, njynunei ‘mountain-town’), an underground palace of remarkable beauty within the mountain, which would house the god Kiau (c.f. Kyunaret) and keep the people safe from the ravaging demons (aurunki) outside until the earth could be repopulated. When the danger subsided, each tribe went its separate way: the Hujar pushed north to hunt in the earth's slow-waxing memory of winter, the Kiavik west to settle the new, virgin forests, and the Jakka south to heal the soil.

The word Iozhi is in fact cognate with the Old Urngese word jakka meaning ‘red’, and it is the ancient Iozhi themselves who further corroborate the legend by telling us that their ancestors came from the mountains of the north to resettle the empty lands of the south. Xion and ores are the Iozhi words for ‘north’ and ‘mountain’, as well as, in the case of Xion, the name of a principal deity of the Iozhi pantheon, corresponding curiously with Old Urngese Kiau (‘grace’) and auriai (‘burned surface’, c.f. Ainyn or ‘fire’ and ei ‘upper’).

Whilst there may be some truth to the legend, the ancestors of the Iozhi were by no means the only Aretian-speaking peoples who migrated southwards during the Reclamation, and nor should we make the mistake of thinking them somehow the principal among those peoples far back into antiquity. Linguists today do group the Aretian language family into three broad branches (White, Green and Red), but they assign to the Red branch all the languages of Nikufenk, the Wengals and the Kyakoprak alongside the Iozhi. The Reclamation was almost certainly a joint effort of all of these peoples and perhaps others, unremembered, as well as many Vashari; recent evidence points to an early homeland for a Red Aretian culture in the southern foothills of the Aretian Alps, characterised by use of the pony, cultivation of wheat and redroot, and increasingly skilled use of metallurgy.

We should furthermore talk of an Iozhic group of languages rather than projecting the unity of Classical Iozhi into the distant past. Early Iozhi annals speak of as many as fifteen separate kingdoms of the Iozhi, spread across a much wider area than that in which the modern Iozhic language can be found, and make several mentions of interpreters employed on diplomatic missions between them. In the Middle Ages Chiacoic-speakers conquered much of the land that once rang with Iozhic tongues, but obscure forms of the Iozhic language can be found in isolated pockets throughout modern-day Kyakoprak (/Chiaora).

The classical unity of Iozhi is, of course, itself an illusion, proceeding from a period of imperial language policy, standardised literary education and trading hegemony that produced a regularised form of the language which was exported widely as a lingua franca in the Silver Age. Iozhic languages were still spoken in isolated towns and cities across the Middle Sea into the modern era, remnants of colonies of the classical period. Elotöri, Ghlwati or Madrigaric speech is now generally ubiquitous in those places, but there are inscriptions in Iozhic for any traveller to see in practically all of the Middle Sea's ancient settlements. Indeed, throughout much of recorded history, if they ever dreamed of sailing the ocean or making their fortune, noble or mercantile-minded individuals from the entire Middle World first made Iozhic speech (and possibly writing too) their own.

Hence, it is merely the modern perspective, in which any form of Iozhi is perhaps not the first language a young Felonian merchant would be best advised to pick up, which produces the idea of one Iozhi language and one Iozhi people ‒ a fact with which any serious overview of the Iozhi phenomenon must grapple. One may wonder in the face of this mercurial history and the current political instabilities, do the Iozhi have (and have they ever had) an affinity as a group of speakers, and if so, from where does it come?

I will leave this and other such questions aside, and now describe the geographical situation (present and past) of the Iozhi heartlands, before painting an outline of the movements of the peoples whom we know have flowed across it.

🝮

From the southern foothills of the Aretian Alps, in the broad region known as Ululaia, tumble three fresh streams. Born of meltwater in winter, leaping with salmon in the spring, reducing almost to dust in the summer, and rejoicing in the flood-bearing rains of autumn, these three sisters grow closer as they age. Accepting many tributaries, they drain a significant basin of the middle portion of our continent into, first, three wide parallel valleys; then (as K̆umbas meets Olezai) two; then (as Olevas meets Iunka, where the hills are forgotten) one sweeping plain. The lower course, known as Iovaro, once meandered across a great swamp, leaking into the Middle Sea in a vast half-drowned landscape that was known in antiquity as Ioias. The entire region was known to the ancients as Thunilavazi or Pi-Thunilavazi ‒ the Land of the Three Rivers. [c.f. Ainyn tyngla ‘three’, ngwer ‘river’]

The spongy mouth of the Iovaro was known as Ambaz, and on an unlikely rocky island around which it cuts out to sea stands to this day the Acropolis of Li-Amba, perhaps the most famed of all the cities of the Iozhi.

South-east of Ambaz, the western coastline of our continent contains the dry land of Shuz, then hugs the lower reaches of the brief but majestic Elantine Mountains. The Straits of Elant separate this coast and its many isles from the large island of Fiuna.

West and north of Ambaz rise the Tumbune Mountains, the western extensions of the Aretian Alps which form the backbone of the Arms of Elotör. Short rugged valleys that fall into deep rias, sheer cliffs and tendrilous peninsulas characterise this metal-rich land, whose mercurial isle-studded southern coastline has long bristled with ships swapping ores and grains, gems and wines.

All of these regions spoke Iozhic languages in antiquity, and formed part of the Iozhi world. The colonies of the Iozhian seafarers expanded this world further, to locations along the northern shores of Selafika, to Nyangmar and Syamomein, to the eastern shores of Orlok, and even north to our own rain-fed region, the Panaphonic plain.


	ma	ke	mu	ra
A	1	1	1	1
b	0	0	0	0
c	1	1	1	0
d	0	1	1	0
e	0	0	0	0
F	0	0	1	0
g	0	0	0	0
h	1	1	1	0
i	0	1	1	0
j	0	0	0	0
K	0	0	1	0
l	0	0	1	0
m	0	0	0	0
N	0	0	0	0
o	0	0	0	0
p	0	0	0	0
q	0	0	0	0
R	0	0	0	0
s	0	0	0	0
t	0	0	0	0
u	0	0	0	0
Total	3	5	8	1

	a	khe
K	1	0
N	0	1
o	0	1
Total	1	2