Beyond the SCA: strategies for improving conlanger productivity

alice
Posts: 1072
Joined: Mon Jul 09, 2018 11:15 am
Location: 'twixt Survival and Guilt

Post by alice »

Imagine you have the ideal SCA, which takes as input (1) a set of rules representing sound changes and (2) a word, and outputs the result of processing the word through the rules. Now imagine you have a program which uses this SCA to automate as many of your conlanging tasks as possible. What tasks would these include?
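For concreteness, here is a minimal sketch of that interface in Python, assuming sound changes can be written as an ordered list of regex substitutions (a big simplification of what a real SCA supports). The function name `apply_rules`, the rules, and the test words are all invented for illustration.

```python
import re

def apply_rules(word, rules):
    """Run a word through (pattern, replacement) rules, in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

# Hypothetical sound changes, not taken from any real language.
RULES = [
    (r"p(?=[aeiou])", "f"),              # p > f before a vowel
    (r"(?<=[aeiou])t(?=[aeiou])", "d"),  # t > d between vowels
    (r"[aeiou]$", ""),                   # loss of word-final vowels
]

print(apply_rules("pata", RULES))  # fad
print(apply_rules("apta", RULES))  # apt
```

Everything else in the thread can be read as a layer on top of a function with this shape.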

Here are a few to start with; not all use the SCA directly, but they do form conceptually related parts of this putative Ideal Conlanger's Assistant.
  • An inflector, which has its own set of rules to model the inflection of a word
  • A reflex validator, which keeps your child languages' lexicons consistent with the sound changes
  • A thematic dictionary generator, which can do snazzy things like generating Swadesh lists
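The reflex validator is probably the easiest of the three to sketch. The version below assumes both lexicons are plain dictionaries keyed by gloss and that the sound changes are regex substitutions; the names `find_stale_reflexes`, `PROTO`, and `CHILD`, and all the data, are hypothetical.

```python
import re

def apply_rules(word, rules):
    """Run a word through (pattern, replacement) rules, in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

def find_stale_reflexes(proto_lexicon, child_lexicon, rules):
    """List (gloss, recorded, expected) where the recorded child form
    is no longer what the current rules produce."""
    stale = []
    for gloss, proto_form in proto_lexicon.items():
        expected = apply_rules(proto_form, rules)
        recorded = child_lexicon.get(gloss)
        if recorded is not None and recorded != expected:
            stale.append((gloss, recorded, expected))
    return stale

# Invented example data.
RULES = [(r"p", "f"), (r"[aeiou]$", "")]
PROTO = {"water": "pata", "stone": "kipu"}
CHILD = {"water": "fat", "stone": "kip"}

for gloss, recorded, expected in find_stale_reflexes(PROTO, CHILD, RULES):
    print(f"{gloss}: recorded {recorded}, rules give {expected}")
# stone: recorded kip, rules give kif
```

A real tool would also need a way to mark deliberate exceptions (borrowings, analogy, semantic drift) so that they are not reported on every run.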
*I* used to be a front high unrounded vowel. *You* are just an accidental diphthong.
Lērisama
Posts: 258
Joined: Fri Oct 18, 2024 9:51 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Lērisama »

alice wrote: Tue Jan 21, 2025 2:50 pm
  • A thematic dictionary generator, which can do snazzy things like generating Swadesh lists
This one seems impossible, unless you have to manually select a theme for each word, which I'd much rather not do (I have a backlog of over 30 words that need adding to my dictionary). It also seems a pain to handle alongside semantic change. I'd love the middle one, and Brassica does the first one already¹, and it's close enough to the mythical ideal SCA for me.

Now imagine you have a program which uses this SCA to automate as many of your conlanging tasks as possible. What tasks would these include?
The production of a document with your sound changes in a more human-readable form. Although I've memorised the LZ ones for all but the most painful words.


¹ With some limitations – stem change and infixes
LZ – Lēri Ziwi
PS – Proto Sāzlakuic (ancestor of LZ)
PRk – Proto Rākēwuic
XI – Xú Iạlan
VN – verbal noun
SUP – supine
DIRECT – verbal directional
My language stuff
Creyeditor
Posts: 314
Joined: Wed Jul 08, 2020 9:15 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Creyeditor »

AWKWORDOID: I would definitely need something for phonotactics similar to Awkwords. It would be great if the program could suggest licit roots/words so that I can just assign meanings. In a perfect world, this would not just be random words that follow my established phonotactics but also follow some sensible and editable probability distribution for segments and different syllable types/root types.
DERIVATIONALIZER: I would love a module that suggests combinations of roots and derivational affixes (or roots and roots in the case of compounds) based on roots and derivational affixes that I enter. It could also suggest polysemy patterns based on public online databases. In conjunction with the AWKWORDOID this could help me build my conlang's vocabulary. Of course, there would need to be a way to store the vocabulary in a dictionary-like format that could be converted into a nice pdf-file.
SYNTACXG: I would definitely need a way to store syntactic constructions and how they can be combined. This would help me to (a) see gaps and come up with new constructions and (b) do translations correctly. This might mean that you need to save snippets of example [sentences/phrases/clauses/...] related to these constructions.
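As a rough illustration of the AWKWORDOID idea, here is a sketch of a root generator with editable weights for segments and root length. It is not how Awkwords works internally; the (C)V(C) syllable template, the inventories, the weights, and the function names are all assumptions made up for this example.

```python
import random

# Editable weights: higher numbers mean more frequent. "" means "no segment".
ONSETS = {"p": 5, "t": 5, "k": 4, "s": 3, "m": 3, "": 2}
NUCLEI = {"a": 6, "i": 4, "u": 3, "e": 2}
CODAS = {"": 8, "n": 3, "s": 1}
SYLLABLE_COUNTS = {1: 2, 2: 5, 3: 2}  # root length in syllables

def pick(weights, rng):
    """Choose one key of `weights` with probability proportional to its value."""
    items = list(weights)
    return rng.choices(items, weights=[weights[i] for i in items], k=1)[0]

def make_root(rng):
    """Build one root out of (C)V(C) syllables."""
    count = pick(SYLLABLE_COUNTS, rng)
    return "".join(
        pick(ONSETS, rng) + pick(NUCLEI, rng) + pick(CODAS, rng)
        for _ in range(count)
    )

def suggest_roots(count, seed=None):
    """Return `count` distinct candidate roots, sorted alphabetically."""
    rng = random.Random(seed)
    roots = set()
    while len(roots) < count:
        roots.add(make_root(rng))
    return sorted(roots)

print(suggest_roots(10, seed=1))
```

Changing the numbers in the four tables is the "editable probability distribution" part; filtering out roots already in the dictionary would be the obvious next step.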

I still can't guarantee that I would use such a tool because I am slow to adapt to new technology in general. Some of these already exist or might exist even if I never had a look at them.
zompist
Site Admin
Posts: 3205
Joined: Sun Jul 08, 2018 5:46 am
Location: Right here, probably

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by zompist »

Besides the tools I've made available, I heavily use another one: a Javascript program to convert text to html. This allows me to create html documentation very quickly from my Word documents. Earlier I wrote an RTF-to-html converter, but Word's RTF is pretty horrible and produces a lot of nonsense. Rather than keeping the converter up to date, I wrote the Javascript converter to handle tables, bullet points, and sample sentences.
alice wrote: Tue Jan 21, 2025 2:50 pm
  • An inflector, which has its own set of rules to model the inflection of a word
I wrote one of these for Old Skourene, whose inflection is pretty insane.

I kind of feel that if your language can be easily handled by an inflection program, it's probably not very naturalistic. Natural languages are crammed with exceptions. Plus, I'd worry about nailing down the morphology too early. Our first notions of the axes of inflections may not be the best.
  • A thematic dictionary generator, which can do snazzy things like generating Swadesh lists
It's hard to picture this not producing much more work, as presumably you have to specify the semantic classification for each word you create. And again, I think the tendency would be to assign the same semantics to all your languages. I'm most interested in classifications that differ from English, and that requires some personal thought and serendipity.
bradrn
Posts: 6711
Joined: Fri Oct 19, 2018 1:25 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by bradrn »

zompist wrote: Tue Jan 21, 2025 4:52 pm Besides the tools I've made available, I heavily use another one: a Javascript program to convert text to html. This allows me to create html documentation very quickly from my Word documents. Earlier I wrote an RTF-to-html converter, but Word's RTF is pretty horrible and produces a lot of nonsense. Rather than keeping the converter up to date, I wrote the Javascript converter to handle tables, bullet points, and sample sentences.
I’m curious to know how this compares to more established tools like Pandoc.
zompist wrote: Tue Jan 21, 2025 4:52 pm
alice wrote: Tue Jan 21, 2025 2:50 pm
  • An inflector, which has its own set of rules to model the inflection of a word
I wrote one of these for Old Skourene, whose inflection is pretty insane.

I kind of feel that if your language can be easily handled by an inflection program, it's probably not very naturalistic. Natural languages are crammed with exceptions. Plus, I'd worry about nailing down the morphology too early. Our first notions of the axes of inflections may not be the best.
Personally I find an inflector most useful in combination with a SCA. If you already have morphology for a protolanguage, generating paradigms for a bunch of words and running it through some sound changes is a great way to work out the paradigm in the descendant. (This is how I worked out the Eŋes lexical aspect system, for instance — one paradigm in the ancestor split into seven in the descendant.)
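A toy version of that workflow, assuming purely suffixing morphology and regex-style sound changes; this is not Brassica's rule syntax, and the stem, suffixes, and rules are invented rather than taken from Eŋes.

```python
import re

def apply_rules(word, rules):
    """Run a word through (pattern, replacement) rules, in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

# Hypothetical protolanguage case suffixes.
SUFFIXES = {"NOM": "", "ACC": "m", "GEN": "si", "PL": "ti"}

# Hypothetical sound changes leading to the descendant.
RULES = [
    (r"(?<=[aeiou])s(?=[aeiou])", "h"),  # s > h between vowels
    (r"ti", "tsi"),                      # t > ts before i
    (r"m$", "n"),                        # final m > n
]

def paradigm(stem, suffixes):
    """Inflect a stem by plain suffixation."""
    return {label: stem + suffix for label, suffix in suffixes.items()}

def descend(forms, rules):
    """Push every cell of a paradigm through the sound changes."""
    return {label: apply_rules(form, rules) for label, form in forms.items()}

proto = paradigm("kasa", SUFFIXES)
for label, form in descend(proto, RULES).items():
    print(f"{label}: {proto[label]} > {form}")
# NOM: kasa > kaha
# ACC: kasam > kahan
# GEN: kasasi > kahahi
# PL: kasati > kahatsi
```

Running several stems with different final segments through the same function is what exposes the paradigm splits in the descendant.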
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices

(Why does phpBB not let me add >5 links here?)
zompist
Site Admin
Posts: 3205
Joined: Sun Jul 08, 2018 5:46 am
Location: Right here, probably

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by zompist »

bradrn wrote: Thu Jan 23, 2025 12:26 am
zompist wrote: Tue Jan 21, 2025 4:52 pm Besides the tools I've made available, I heavily use another one: a Javascript program to convert text to html. This allows me to create html documentation very quickly from my Word documents. Earlier I wrote an RTF-to-html converter, but Word's RTF is pretty horrible and produces a lot of nonsense. Rather than keeping the converter up to date, I wrote the Javascript converter to handle tables, bullet points, and sample sentences.
I’m curious to know how this compares to more established tools like Pandoc.
Since you've written your own SCA, I think you understand the power of writing your own tools to do exactly what you want.

It's a matter of deciding what kind of tedium you want to handle. A third-party tool may well reduce the overall tedium, but generate its own problems. And as I noted, Microsoft does some weird stuff in its RTF (or did, it was a long time ago).
bradrn
Posts: 6711
Joined: Fri Oct 19, 2018 1:25 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by bradrn »

zompist wrote: Thu Jan 23, 2025 3:17 pm
bradrn wrote: Thu Jan 23, 2025 12:26 am
zompist wrote: Tue Jan 21, 2025 4:52 pm Besides the tools I've made available, I heavily use another one: a Javascript program to convert text to html. This allows me to create html documentation very quickly from my Word documents. Earlier I wrote an RTF-to-html converter, but Word's RTF is pretty horrible and produces a lot of nonsense. Rather than keeping the converter up to date, I wrote the Javascript converter to handle tables, bullet points, and sample sentences.
I’m curious to know how this compares to more established tools like Pandoc.
Since you've written your own SCA, I think you understand the power of writing your own tools to do exactly what you want.

It's a matter of deciding what kind of tedium you want to handle. A third-party tool may well reduce the overall tedium, but generate its own problems. And as I noted, Microsoft does some weird stuff in its RTF (or did, it was a long time ago).
Normally I’d agree, except that Pandoc (and similar tools) are in my experience extremely flexible. So I’m curious to know what your tool does that makes it work so well for you. (There might be functionality which would be useful for other people too…)
zompist
Site Admin
Posts: 3205
Joined: Sun Jul 08, 2018 5:46 am
Location: Right here, probably

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by zompist »

bradrn wrote: Thu Jan 23, 2025 7:28 pm Normally I’d agree, except that Pandoc (and similar tools) are in my experience extremely flexible. So I’m curious to know what your tool does that makes it work so well for you. (There might be functionality which would be useful for other people too…)
Probably not... most people don't seem to like '90s web design. Which reminds me, I really need to add margins to some pages, to get them at least to the year 1999.

So, my requirements for an html page are that it's simple, human-readable and human-maintainable, without cruft output by the converter, or left over from Word, that gets in my way later.

I can format tables in several different ways, and output whole sets of sample sentences (with or without a native-script line) with one keystroke. I should really do something about headers and indexes though, those are still tedious.
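To give an idea of the sample-sentence case, here is a sketch in Python. It is not the converter described above (which is in Javascript and is not shown in this thread); the function `format_sample`, the table layout, and the example sentence are assumptions for illustration only.

```python
from html import escape

def format_sample(text, gloss, translation, native=None):
    """Format one sample sentence as a small, hand-editable HTML table:
    optional native-script line, word line, gloss line, translation."""
    words, glosses = text.split(), gloss.split()
    if len(words) != len(glosses):
        raise ValueError("text and gloss must have the same number of words")
    span = len(words)
    rows = []
    if native is not None:
        rows.append(f'<tr><td colspan="{span}">{escape(native)}</td></tr>')
    rows.append("<tr>" + "".join(f"<td><i>{escape(w)}</i></td>" for w in words) + "</tr>")
    rows.append("<tr>" + "".join(f"<td>{escape(g)}</td>" for g in glosses) + "</tr>")
    rows.append(f'<tr><td colspan="{span}">{escape(translation)}</td></tr>')
    return "<table>\n" + "\n".join(rows) + "\n</table>"

# Invented sentence, not from any published language.
HTML = format_sample("mi kata lun", "1SG see moon", "I see the moon.")
print(HTML)
```

The output has no classes, inline styles, or generator comments, which is the point of the "no cruft" requirement: the result can be corrected by hand later.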

But as I said, it's about where you want to put the tedium. I have much less patience these days for figuring out third-party products: it's not worth it to spend a week massaging a new tool into shape and then finding out it doesn't quite do what I want anyway. My current workflow is fine.
bradrn
Posts: 6711
Joined: Fri Oct 19, 2018 1:25 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by bradrn »

zompist wrote: Fri Jan 24, 2025 12:49 am I have much less patience these days for figuring out third-party products: it's not worth it to spend a week massaging a new tool into shape and then finding out it doesn't quite do what I want anyway. My current workflow is fine.
Addressing this first: I’m not trying to convince you to change anything! Like everyone else in this thread, my interests are purely selfish — if your tool has any juicy features which could be useful to me, I’d like to know about them…
Probably not... most people don't seem to like '90s web design.
You might be surprised. On Hacker News I regularly see people complaining about how there’s so few 90s-style webpages left. (I should link yours next time…)
Which reminds me, I really need to add margins to some pages, to get them at least to the year 1999. […] I should really do something about headers and indexes though, those are still tedious.
I agree, these two improvements would be nice. But other than those, I quite like the current aesthetic.

(Note: rather than ‘increasing margins’ I’d focus on ‘decreasing line width’, which is the more important measure.)
So, my requirements for an html page are that it's simple, human-readable and human-maintainable, without cruft output by the converter, or left over from Word, that gets in my way later.
In my experience, the modern conversion tools create very clean output. Pandoc in particular is designed for integration into larger webpages (e.g. I used it for my abortive attempt at a blog). Thankfully, we’ve come a long, long way since the horrors of RTF…

Though not long enough, apparently! One of my great annoyances with SIL Toolbox (which I use for dictionaries) and its MDF format is that it produces typeset output as RTF, which is an utter pain if you don’t have access to Microsoft Word. I’ve been thinking of writing a program to convert it to LaTeX instead, given that I reimplemented MDF for Brassica anyway. Or, for that matter, it could plug into Pandoc — it’s all Haskell so should be easy.

(Yes, I know most conlangers get by just fine with a simple spreadsheet. But my perfectionism will not let me accept mere one- or two-word glosses for words, in place of proper definitions, and spreadsheets are annoying for anything much longer than that.)
I can format tables in several different ways, and output whole sets of sample sentences (with or without a native-script line) with one keystroke.
Ooh, that latter one sounds interesting… more details please?
Vilike
Posts: 165
Joined: Thu Jul 12, 2018 2:10 am
Location: Elsàss

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Vilike »

bradrn wrote: Fri Jan 24, 2025 1:25 am One of my great annoyances with SIL Toolbox (which I use for dictionaries) and its MDF format is that it produces typeset output as RTF, which is an utter pain if you don’t have access to Microsoft Word. I’ve been thinking of writing a program to convert it to LaTeX instead, given that I reimplemented MDF for Brassica anyway. Or, for that matter, it could plug into Pandoc — it’s all Haskell so should be easy.
THAT would improve my productivity. At the moment I rely on a Perl script (by Kilu von Prince) found somewhere on a defunct webpage. I also tried thrice or fource to make use of Benjamin Galliot's Lexika, which would output semantically correct XML if I were successful.

Recently, I've been testing the Typst typesetting engine and hacked a template to extract data directly from MDF. It works, got my page headers and all, but it's a slow under-optimised regex mess; also I'd like to generate a reverse dictionary without having to fire up LexiquePro.
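For reference, a reverse index does not need much machinery. The sketch below reads MDF-style records using the standard `\lx` (lexeme), `\ps` (part of speech), and `\ge` (English gloss) markers. It assumes one field per line and semicolon-separated glosses, and it ignores continuation lines and subentries, so it is a starting point rather than a full MDF reader. The sample entries are invented.

```python
from collections import defaultdict

# Invented sample records in MDF-style standard format.
SAMPLE = r"""
\lx kata
\ps v
\ge see; look

\lx mun
\ps v
\ge look

\lx lun
\ps n
\ge moon
"""

def parse_mdf(text):
    """Split the text into entries; each \lx line starts a new entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # blank or continuation line: ignored in this sketch
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":
            entries.append({"lx": value.strip(), "fields": defaultdict(list)})
        elif entries:
            entries[-1]["fields"][marker].append(value.strip())
    return entries

def reverse_index(entries):
    """Map each English gloss to the sorted list of lexemes that carry it."""
    index = defaultdict(list)
    for entry in entries:
        for field in entry["fields"]["ge"]:
            for gloss in field.split(";"):
                if gloss.strip():
                    index[gloss.strip()].append(entry["lx"])
    return {gloss: sorted(lexemes) for gloss, lexemes in sorted(index.items())}

for gloss, lexemes in reverse_index(parse_mdf(SAMPLE)).items():
    print(f"{gloss}: {', '.join(lexemes)}")
# look: kata, mun
# moon: lun
# see: kata
```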
Yaa unák thual na !
bradrn
Posts: 6711
Joined: Fri Oct 19, 2018 1:25 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by bradrn »

Vilike wrote: Fri Jan 24, 2025 3:30 am
bradrn wrote: Fri Jan 24, 2025 1:25 am One of my great annoyances with SIL Toolbox (which I use for dictionaries) and its MDF format is that it produces typeset output as RTF, which is an utter pain if you don’t have access to Microsoft Word. I’ve been thinking of writing a program to convert it to LaTeX instead, given that I reimplemented MDF for Brassica anyway. Or, for that matter, it could plug into Pandoc — it’s all Haskell so should be easy.
THAT would improve my productivity. At the moment I rely on a Perl script (by Kilu von Prince) found somewhere on a defunct webpage. I also tried thrice or fource to make use of Benjamin Galliot's Lexika, which would output semantically correct XML if I were successful.
Well… I guess I’d better get on to it, then!

(Actually, I also have my own program to convert MDF to XML. Maybe I should publish it. At least if I can get it off my dead laptop…)
Recently, I've been testing the Typst typesetting engine and hacked a template to extract data directly from MDF. It works, got my page headers and all, but it's a slow under-optimised regex mess; also I'd like to generate a reverse dictionary without having to fire up LexiquePro.
I did notice that the dictionary you sent to me (for the conlang relay) was made in Typst. I really liked the typography! It single-handedly made me consider Typst as a serious contender.

Speaking of which: how does Typst compare to LaTeX, in your opinion? I’d consider switching, but I know LaTeX well enough by now that I can do almost anything in it, and that’s not something I’d like to give up.
Vilike
Posts: 165
Joined: Thu Jul 12, 2018 2:10 am
Location: Elsàss

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Vilike »

bradrn wrote: Fri Jan 24, 2025 3:36 am Speaking of which: how does Typst compare to LaTeX, in your opinion? I’d consider switching, but I know LaTeX well enough by now that I can do almost anything in it, and that’s not something I’d like to give up.
I like that there are no intermediary files. Otherwise, to a casual user like me, who forgets half of their LaTeX knowledge between projects, it looks like it's easier to customise on the fly (bar the strange decision to hardcode the styling of definition lists...). Way less boilerplate (but it's still an immature project). Data extraction from markup files looks more straightforward, at least compared to the horror that is ConTeXt. Already a good deal of community-made packages, including one for interlinear glossing.

But I still haven't refactored my resume with it, so I'd say: cautious endorsement.
Lērisama
Posts: 258
Joined: Fri Oct 18, 2024 9:51 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Lērisama »

bradrn wrote: Fri Jan 24, 2025 1:25 am (Yes, I know most conlangers get by just fine with a simple spreadsheet. But my perfectionism will not let me accept mere one- or two-word glosses for words, in place of proper definitions, and spreadsheets are annoying for anything much longer than that.)
I can't handle that either, so I eventually just wrote a database, and a python program to display it in a readable form and make searching easier. It's a pain to input entries (I make a csv file with the new words in and then run a different python program to add it to the database, then correct the inevitable errors), but better than a spreadsheet for long definitions.
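A sketch of what the import step of such a setup might look like. The post does not say which database or schema is used, so SQLite, the `entries` table, and the column names `headword`, `pos`, and `definition` are all assumptions. Rejected rows are returned rather than silently dropped, to make the "correct the inevitable errors" step easier.

```python
import csv
import io
import sqlite3

def import_entries(conn, csv_file):
    """Add CSV rows to the dictionary; return (number added, rejected rows)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "headword TEXT PRIMARY KEY, pos TEXT, definition TEXT NOT NULL)"
    )
    added, rejected = 0, []
    for row in csv.DictReader(csv_file):
        headword = (row.get("headword") or "").strip()
        pos = (row.get("pos") or "").strip()
        definition = (row.get("definition") or "").strip()
        if not headword or not definition:
            rejected.append(row)  # incomplete entry
            continue
        try:
            conn.execute(
                "INSERT INTO entries VALUES (?, ?, ?)",
                (headword, pos, definition),
            )
            added += 1
        except sqlite3.IntegrityError:
            rejected.append(row)  # headword already in the dictionary
    conn.commit()
    return added, rejected

# Invented data; an in-memory database and string stand in for real files.
NEW_WORDS = io.StringIO(
    "headword,pos,definition\n"
    'kata,v,"to see, to look at; by extension, to understand"\n'
    "kata,n,accidental duplicate\n"
    ",n,row with no headword\n"
)
conn = sqlite3.connect(":memory:")
added, rejected = import_entries(conn, NEW_WORDS)
print(added, len(rejected))  # 1 2
```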
zompist
Site Admin
Posts: 3205
Joined: Sun Jul 08, 2018 5:46 am
Location: Right here, probably

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by zompist »

bradrn wrote: Fri Jan 24, 2025 1:25 am (Yes, I know most conlangers get by just fine with a simple spreadsheet. But my perfectionism will not let me accept mere one- or two-word glosses for words, in place of proper definitions, and spreadsheets are annoying for anything much longer than that.)
No spreadsheets for me. But I have no nostalgia for my teenage days of conlanging without computers!

I can format tables in several different ways, and output whole sets of sample sentences (with or without a native-script line) with one keystroke.
Ooh, that latter one sounds interesting… more details please?
It just properly formats the straight text— it doesn't write them. :)

This discussion is making me want to upgrade the thing. Maybe next language I do...
Lērisama
Posts: 258
Joined: Fri Oct 18, 2024 9:51 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Lērisama »

zompist wrote: Fri Jan 24, 2025 2:47 pm
bradrn wrote: Fri Jan 24, 2025 1:25 am (Yes, I know most conlangers get by just fine with a simple spreadsheet. But my perfectionism will not let me accept mere one- or two-word glosses for words, in place of proper definitions, and spreadsheets are annoying for anything much longer than that.)
No spreadsheets for me. But I have no nostalgia for my teenage days of conlanging without computers!
What do you use? I'm not particularly happy with my current dictionary and very open to stealing ideas on that front.
zompist
Site Admin
Posts: 3205
Joined: Sun Jul 08, 2018 5:46 am
Location: Right here, probably

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by zompist »

Lērisama wrote: Fri Jan 24, 2025 3:17 pm
zompist wrote: Fri Jan 24, 2025 2:47 pm
bradrn wrote: Fri Jan 24, 2025 1:25 am (Yes, I know most conlangers get by just fine with a simple spreadsheet. But my perfectionism will not let me accept mere one- or two-word glosses for words, in place of proper definitions, and spreadsheets are annoying for anything much longer than that.)
No spreadsheets for me. But I have no nostalgia for my teenage days of conlanging without computers!
What do you use? I'm not particularly happy with my current dictionary and very open to stealing ideas on that front.
I start everything in Word. If you want to know what the document looks like, it's pretty much exactly like the pages on my site. :)

I keep the lexicon in a table in Word, which looks nice and allows me to sort. The definition area can grow to several lines. As Word has some strange ideas about characters with diacritics, its sorting may require some manual correction.
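Outside Word, one common way around the diacritic problem is to sort on a key with the combining marks stripped, and use the marked form only to break ties. A minimal Python sketch, assuming that accented letters should sort next to their base letters (some conlangs will want accented letters as separate letters of the alphabet, which needs a custom alphabet table instead); the word list is invented.

```python
import unicodedata

def sort_key(word):
    """Primary key: the word without diacritics. Tiebreak: the decomposed form."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base, decomposed)

WORDS = ["zeta", "ēra", "era", "ábo", "abu"]
SORTED = sorted(WORDS, key=sort_key)
print(SORTED)  # ['ábo', 'abu', 'era', 'ēra', 'zeta']
```

A plain `sorted(WORDS)` would instead order by code point and push the accented words to the end of the list.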

After the language is html-ized, the html becomes the master, which is why I need it to be human-readable so I can make corrections and add words. If I work on an existing language a lot (e.g. Verdurian, Kebreni) I convert the lexicon to a Javascript program, which gives lots of nice search options, and allows things like redisplaying the headword in the native script.
Lērisama
Posts: 258
Joined: Fri Oct 18, 2024 9:51 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by Lērisama »

zompist wrote: Fri Jan 24, 2025 3:40 pm I start everything in Word. If you want to know what the document looks like, it's pretty much exactly like the pages on my site. :)

I keep the lexicon in a table in Word, which looks nice and allows me to sort. The definition area can grow to several lines. As Word has some strange ideas about characters with diacritics, its sorting may require some manual correction.

After the language is html-ized, the html becomes the master, which is why I need it to be human-readable so I can make corrections and add words. If I work on an existing language a lot (e.g. Verdurian, Kebreni) I convert the lexicon to a Javascript program, which gives lots of nice search options, and allows things like redisplaying the headword in the native script.
Thank you, that's helpful.
bradrn
Posts: 6711
Joined: Fri Oct 19, 2018 1:25 am

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by bradrn »

zompist wrote: Fri Jan 24, 2025 2:47 pm

I can format tables in several different ways, and output whole sets of sample sentences (with or without a native-script line) with one keystroke.
Ooh, that latter one sounds interesting… more details please?
It just properly formats the straight text— it doesn't write them. :)
Oh… ‘whole sets of sample sentences’ made me wonder if it was more elaborate.

(I remember that once, long ago, I attempted to create a word processor which could integrate with a conlang dictionary and update example sentences in a grammar as the words are altered. It failed: partly because I was a terrible programmer at the time, partly because it was a stupid idea, and partly because the rich text component I was attempting to use was built on RTF.)
Lērisama wrote: Fri Jan 24, 2025 3:17 pm
zompist wrote: Fri Jan 24, 2025 2:47 pm
bradrn wrote: Fri Jan 24, 2025 1:25 am (Yes, I know most conlangers get by just fine with a simple spreadsheet. But my perfectionism will not let me accept mere one- or two-word glosses for words, in place of proper definitions, and spreadsheets are annoying for anything much longer than that.)
No spreadsheets for me. But I have no nostalgia for my teenage days of conlanging without computers!
What do you use? I'm not particularly happy with my current dictionary and very open to stealing ideas on that front.
For my part, as mentioned I use SIL Toolbox. The UI is a bit dated now, but I find it works well for my needs: the hierarchical structure of entries makes it possible to include as much or as little detail as I want.
TomHChappell
Posts: 145
Joined: Fri Jul 26, 2019 6:40 am
Location: SouthEast Michigan

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by TomHChappell »

If I had the budget to do so, I’d get one example of each type of tool mentioned in this thread!
But, I don’t.
It’s nice to at least read about them!
Last edited by TomHChappell on Sat Jan 25, 2025 11:57 am, edited 1 time in total.
alice
Posts: 1072
Joined: Mon Jul 09, 2018 11:15 am
Location: 'twixt Survival and Guilt

Re: Beyond the SCA: strategies for improving conlanger productivity

Post by alice »

To get back on topic: What (hypothetical) tools would you like to have or find useful?