alice wrote: ↑Mon Jun 17, 2024 2:49 pm
bradrn wrote: ↑Mon Jun 17, 2024 1:20 pm
So, if we take this problem as solved, the remaining problem is how to represent suprasegmentals. I’m starting to think that this problem is bound up with that of representing features more generally, which is an area where Brassica is less complete than I’d like. I’ll have to further ponder whether there’s any way of doing this which doesn’t require endless feature lists for every single phoneme.
This is pretty much where I ended up after much thinking about the problem, too.
Since we’re agreed on this point, I’ve been thinking a bit more about features in SCAs. And I should write down my thoughts…
(Note: this has ended up long and rambly, like so many of my posts. Skip to the end if you want to see my new proposal for Brassica.)
Probably the best place to start is by outlining Brassica’s current approach. It’s a very simple (perhaps simplistic) approach: everything is done with simple ordered lists of graphemes. Ad-hoc categories in sound changes are lists, and named categories are simply exactly the same thing, but given a name so they can be reused. Categories in the target and the replacement are matched up one-to-one, and their elements are matched up one-to-one as well. If the replacement has fewer elements than the target, any missing elements are mapped to � (U+FFFD).
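To make the one-to-one matching concrete, here is a minimal Python sketch. It is only an illustration of the behaviour described above, not Brassica’s actual code: the function name map_category is hypothetical, and it assumes every grapheme is a single string.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, used when the replacement list runs out


def map_category(grapheme, target, replacement):
    """Map a grapheme by its position in `target` to the same position in `replacement`.

    Returns None when the grapheme is not in the target category (no match).
    """
    if grapheme not in target:
        return None
    index = target.index(grapheme)
    if index < len(replacement):
        return replacement[index]
    # The replacement category is shorter than the target: pad with U+FFFD.
    return REPLACEMENT_CHAR


target = ["p", "t", "k"]
replacement = ["b", "d"]  # deliberately one element short
results = [map_category(g, target, replacement) for g in ["p", "t", "k", "s"]]
```

Here /k/ has no counterpart in the shorter replacement list, so it comes out as U+FFFD, while /s/ is simply not matched.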
On this foundation, Brassica simulates features by allowing set-theoretic operations on categories: unions, intersections and subtractions. Thus, one can write sound changes such as [Fric -Vcd] / [Fric +Vcd] / V _ V, which appear featural while not requiring a full feature system.
This approach has two major disadvantages:
1. It depends very strongly on the ordering of graphemes within a category. If Fric happens to have its voiceless consonants listed in a different order to the voiced ones, then that sound change will produce unexpected results in non-obvious ways.
2. The concept of a feature isn’t actually a part of the SCA. Thus, ‘feature transfer’ changes like assimilation or spreading are annoying to write: instead of saying ‘the next vowel takes the rounding of the previous one’, you must say ‘unrounded → unrounded, rounded → rounded’ as two separate steps.
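The ordering problem can be sketched in Python as follows. This is a hedged illustration, not Brassica’s implementation: it assumes that intersection and subtraction keep the order of the left-hand category, it ignores the V _ V environment for brevity, and all function names are hypothetical.

```python
def intersect(a, b):
    """Order-preserving intersection: elements of `a` that are in `b`, in `a`'s order."""
    return [x for x in a if x in b]


def subtract(a, b):
    """Order-preserving subtraction: elements of `a` that are not in `b`, in `a`'s order."""
    return [x for x in a if x not in b]


def apply_positional(word, target, replacement):
    """Replace each grapheme of `target` with the grapheme at the same index in `replacement`."""
    mapping = dict(zip(target, replacement))
    return "".join(mapping.get(ch, ch) for ch in word)


vcd = ["b", "d", "ɡ", "v", "z"]
fric_good = ["f", "s", "v", "z"]
fric_bad = ["f", "s", "z", "v"]  # voiced fricatives listed in a different order


def voicing_rule(word, fric):
    # Corresponds to [Fric -Vcd] / [Fric +Vcd], without the environment.
    return apply_positional(word, subtract(fric, vcd), intersect(fric, vcd))


good = voicing_rule("afasa", fric_good)
bad = voicing_rule("afasa", fric_bad)
```

With fric_good the word comes out as "avaza"; with fric_bad the same rule silently yields "azava", because f and s are now paired with z and v by position.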
What are the essential components of a feature system which solves these problems? Taking Lexurgy as a reference point, it lets you define features (which can be binary or multivalent), then assign each phoneme a set of feature values. Thus each feature value becomes, essentially, an unordered set of phonemes. The key point which makes this work is that every phoneme is uniquely defined by its feature values: thus, a feature matrix in the target and a feature matrix in the replacement can trivially match up their elements, without requiring the user to define an ordering.
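As a rough sketch of why uniqueness makes the matching trivial, consider the following Python. It is not Lexurgy syntax and not its implementation; the phoneme inventory, the feature names and the function change_features are all made up for illustration.

```python
# Hypothetical inventory: each phoneme gets a full set of feature values.
PHONEMES = {
    "p": {"place": "labial", "manner": "stop", "voiced": False},
    "b": {"place": "labial", "manner": "stop", "voiced": True},
    "f": {"place": "labial", "manner": "fricative", "voiced": False},
    "v": {"place": "labial", "manner": "fricative", "voiced": True},
    "s": {"place": "alveolar", "manner": "fricative", "voiced": False},
    "z": {"place": "alveolar", "manner": "fricative", "voiced": True},
}


def bundle_key(features):
    """Turn a feature dict into a hashable key that does not depend on insertion order."""
    return tuple(sorted(features.items()))


# Reverse lookup from feature bundle to phoneme.
BY_FEATURES = {bundle_key(f): p for p, f in PHONEMES.items()}

# The scheme only works if no two phonemes share a feature bundle.
assert len(BY_FEATURES) == len(PHONEMES)


def change_features(phoneme, **changes):
    """Overwrite some feature values and look up the phoneme with the resulting bundle."""
    new_bundle = {**PHONEMES[phoneme], **changes}
    return BY_FEATURES.get(bundle_key(new_bundle))  # None if no such phoneme exists


voiced_s = change_features("s", voiced=True)
voiced_f = change_features("f", voiced=True)
```

No ordering is involved anywhere: the result is found purely by looking up the modified bundle, which is also why every phoneme needs a distinct bundle.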
The biggest problem with this sort of feature system is, quite simply, that it’s really really annoying. Every single phoneme needs to be uniquely defined by its features, so you sometimes need to add useless features simply to keep phonemes apart. Out of necessity, the syntax is also horribly verbose, because every phoneme needs its own list of feature values.
A more subtle problem is that it gives two ways to do the same thing. Conceptually speaking, [+nasal] should be the same thing as [m n ŋ], usable in the same ways. But (at least in Lexurgy) the former is a feature matrix while the latter is a list of phonemes. And, to my understanding, the two can’t be mixed: if a rule acts on all stops plus /s/, you can’t write [+stop s] directly.
So: is there some way of solving the problems (1) and (2) without requiring a full feature system?
Taking (2) first, there actually is a clever way to work around this in current Brassica. One can write:
[{[e i a ɯ]} {[œ y o u]}] C [{[e œ]} {[i y]} {[a o]} {[ɯ u]}] / [{[e i a ɯ]} {[œ y o u]}] C @1 [{@4 [e i a ɯ]} {@4 [œ y o u]}]
This is, of course, a disgusting unreadable hack. The basic idea is to make nested categories, such that outer category @1 matches the roundedness of the first vowel, and outer category @4 matches position in the vowel space. Then the nested categories in the replacement just recombine those in a different way.
It can be made much more readable by defining named categories:
[{Unrounded} {Rounded}] C [{[e œ]} {[i y]} {[a o]} {[ɯ u]}] / [{Unrounded} {Rounded}] C @1 [{@4 Unrounded} {@4 Rounded}]
This makes it clear that the most ‘hackish’ part is abstracting away the roundedness of the second vowel. You can define categories Unrounded and Rounded, but you can’t talk about a single element across both Unrounded and Rounded.
The alternative is simply matching V, then mapping all its elements separately to the Unrounded or Rounded categories:
[{Unrounded} {Rounded}] C [Unrounded Rounded] / [{Unrounded} {Rounded}] C @1 [{@4 Unrounded Unrounded} {@4 Rounded Rounded}]
This is better, but note that I needed to decompose V into a union of [Unrounded Rounded] to ensure that all the ordered lists end up with the same order. (Or write V with an order which just happens to work, but that’s fragile.)
In fact, this reveals that the problem here is actually the same as (1). In both cases, the issue is setting up a correspondence between two category sets. In (1), I need to map +Vcd to -Vcd. Here, I need to map V to Rounded or Unrounded.
Perhaps the answer is to get away from trying to express everything in terms of order-preserving set operations. Instead, oppositions could be defined manually:
-Vcd = p t k f s
+Vcd = b d ɡ v z
-Round = e i a ɯ
+Round = œ y o u
Where, as a sanity check, Brassica would ensure that these lists have the same number of elements. Of course, this doesn’t eliminate ordering issues entirely, but it makes it really clear that the two lists should line up. Then a construction like [Fric +Vcd] can simply use the ordering of +Vcd.
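A minimal Python sketch of this proposal might look as follows. Everything here is an assumption made for illustration (the function names define_opposition and restrict, the error message, the representation as plain lists); it only shows the length check and the idea that a restricted category takes its order from the opposition list rather than from the category itself.

```python
def define_opposition(name, first, second):
    """Define two aligned lists of graphemes, refusing lists of unequal length."""
    first, second = first.split(), second.split()
    if len(first) != len(second):
        raise ValueError(
            f"opposition {name}: lists have {len(first)} and {len(second)} elements"
        )
    return first, second


def restrict(category, side):
    """Members of `category` that are in `side`, ordered as in `side`."""
    return [g for g in side if g in category]


voiceless, voiced = define_opposition("Vcd", "p t k f s", "b d ɡ v z")

fric = ["z", "f", "v", "s"]  # deliberately scrambled: its order no longer matters

voiceless_fric = restrict(fric, voiceless)
voiced_fric = restrict(fric, voiced)
voicing = dict(zip(voiceless_fric, voiced_fric))

# A mismatched definition is rejected up front instead of misaligning silently.
try:
    define_opposition("Round", "e i a ɯ", "œ y o")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Because both restricted lists inherit the aligned order of the opposition, f pairs with v and s with z even though the fricative category itself is scrambled.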
The disadvantage is one no longer gets the complement ‘for free’, simply by defining Vcd. The idea was nice, but clearly it doesn’t work in practice. And after all it’s not a bad idea for -Vcd to get its own independent existence, rather than being simply a subtraction from something else.
Note the key difference from a true feature system: this doesn’t require every grapheme to have unique feature assignments (indeed the opposite is expected). Additionally, these are all just regular categories: you can write [+Vcd s] if you want.
But, because Brassica would now know about these binary oppositions between categories, this opens up the opportunity of explicitly matching things by their feature value. This makes the above assimilation rule much easier to write:
[V $Round] C V / V C [V $Round]
(Where by default the features get matched up one-to-one between target and replacement, like categories.)
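One possible reading of this rule, sketched in Python: the second vowel keeps its position in the vowel space but takes the rounding of the first. This is only an interpretation of the proposed syntax, not an existing Brassica feature; the helper names are hypothetical and the sketch handles a single V C V sequence rather than scanning a whole word.

```python
# The two aligned lists of the Round opposition.
UNROUNDED = ["e", "i", "a", "ɯ"]
ROUNDED = ["œ", "y", "o", "u"]


def is_rounded(vowel):
    """Read off the Round value of a vowel."""
    return vowel in ROUNDED


def set_round(vowel, rounded):
    """Return the vowel at the same index in the list for the requested Round value."""
    source = ROUNDED if is_rounded(vowel) else UNROUNDED
    destination = ROUNDED if rounded else UNROUNDED
    return destination[source.index(vowel)]


def assimilate(v1, consonant, v2):
    """Sketch of [V $Round] C V / V C [V $Round]: v2 copies the rounding of v1."""
    return v1 + consonant + set_round(v2, is_rounded(v1))


examples = [
    assimilate("o", "k", "i"),
    assimilate("e", "t", "u"),
    assimilate("y", "n", "o"),
]
```

The rounding value is read from the first vowel and written onto the second in one step, instead of needing one rule for rounded vowels and another for unrounded ones.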
Overall… I think I really like this design, actually. I feel that my philosophy with Brassica is to avoid relying on ‘all-encompassing’ concepts which force things to be done in one specific way: like all-encompassing feature matrices, or all-encompassing syllabification rules. Requiring use of ordered set operations is a similar constraint, and I’m not sorry to get rid of it.
(Well, maybe not get rid of it totally… I may keep it in some capacity for those cases where it’s useful.)
But anyway, all this is just my opinion. What does everyone else here think?