SCA² question

Posted: Sat May 30, 2020 1:56 am
by Emily
How well does SCA² handle composite glyphs -- that is, characters that (should) appear as one glyph, but are actually made of two or more Unicode code points? Like, if I want to put in an ø with an acute accent (ǿ), that takes two Unicode code points: U+00F8 for the ø, and U+0301 for the combining accent. Does SCA² process this as one character or two?

Re: SCA² question

Posted: Sat May 30, 2020 2:23 am
by bradrn
GreenBowtie wrote: Sat May 30, 2020 1:56 am How well does SCA² handle composite glyphs -- that is, characters that (should) appear as one glyph, but are actually made of two or more Unicode code points? Like, if I want to put in an ø with an acute accent (ǿ), that takes two Unicode code points: U+00F8 for the ø, and U+0301 for the combining accent. Does SCA² process this as one character or two?
I believe it’s two characters: as far as I’m aware, SCA² works on individual code points rather than on graphemes. Of course, you can use rewrite rules to turn it into one character if you want. (Although in the specific case of ǿ, you can actually represent that as a single code point: the precomposed ǿ (U+01FF) is one character, while ø+◌́ (U+00F8 followed by U+0301) is two.)

EDIT: I tested it, and e.g. rules like ́/x/_ convert ǿ to øx. So SCA² definitely does treat the combining acute accent as its own character.
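
For anyone who wants to check which form their input is in before pasting it into SCA², here is a minimal Python sketch. It is not SCA² itself and makes no claim about how SCA² is implemented; it only uses Python's standard unicodedata module to show the difference between the two encodings, and the str.replace call is just a rough stand-in for the rule tested above. The variable names are made up for illustration.

```python
import unicodedata

# Two ways to write the same visible glyph.
decomposed = "\u00f8\u0301"   # ø (U+00F8) followed by combining acute (U+0301)
precomposed = "\u01ff"        # ǿ as the single code point U+01FF

# Python strings are sequences of code points, so len() counts code points.
print(len(decomposed))    # 2
print(len(precomposed))   # 1

# NFC normalization composes the pair into the single code point;
# NFD goes the other way.
composed = unicodedata.normalize("NFC", decomposed)
split_again = unicodedata.normalize("NFD", precomposed)

# Rough stand-in for a rule that replaces the combining acute with x:
# it only matches when the accent is present as its own code point.
replaced_decomposed = decomposed.replace("\u0301", "x")    # "øx"
replaced_precomposed = precomposed.replace("\u0301", "x")  # unchanged
```

So if you want the accented letter treated as a single unit, normalizing the lexicon to NFC beforehand handles the cases where Unicode has a precomposed form; combinations without one still need a rewrite rule.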

Re: SCA² question

Posted: Sat May 30, 2020 3:45 am
by zompist
Yes, it's quite simpleminded, so use single characters where you can. Rewrite rules are your friend, though!

Re: SCA² question

Posted: Sat May 30, 2020 4:45 am
by Emily
That's about what I figured, thank you!