Re: The Index Diachronica

Posted: Tue Aug 30, 2022 5:37 am
by bradrn
Ares Land wrote: Tue Aug 30, 2022 5:17 am The current software is IMO pretty serviceable; the lists of sound change themselves have problems, but that's to be expected from the work of amateur enthusiasts.
What I'd be interested in would be a way to edit the sound changes in a reasonably easy fashion. (I think I could improve the lists for a few languages here and there myself.)
Indeed, Index Diachronica’s current interface (assuming that’s what you’re talking about) is pretty nice. The list of sound changes is what really needs improvement: the current set is quite abominably bad, and of questionable reliability even in the best cases. Adding the ability to edit will help somewhat, but given the number of changes needing fixing, I suspect a more organised effort will be necessary.

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 6:42 am
by Kuchigakatai
bradrn wrote: Mon Aug 29, 2022 8:39 pm Citations and details, yes, and perhaps even examples if there are any, but I think argumentation and justification would have to be out of scope for a sound change database. (Anyway, how could you even justify an individual sound change sensibly without including masses of words?)
I was thinking that maybe argumentation could be included for the most interesting (weird) sound changes.

As I said this is just a dream database in my mind anyway.

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 8:31 am
by bradrn
Kuchigakatai wrote: Tue Aug 30, 2022 6:42 am
bradrn wrote: Mon Aug 29, 2022 8:39 pm Citations and details, yes, and perhaps even examples if there are any, but I think argumentation and justification would have to be out of scope for a sound change database. (Anyway, how could you even justify an individual sound change sensibly without including masses of words?)
I was thinking that maybe argumentation could be included for the most interesting (weird) sound changes.
That seems quite sensible to me — desirable, even; singling out the weirdest sound changes in this way would be a good choice.
As I said this is just a dream database in my mind anyway.
Who says that means it can’t be real one day?

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 10:17 am
by Man in Space
bradrn wrote: Tue Aug 30, 2022 8:31 am
Kuchigakatai wrote: Tue Aug 30, 2022 6:42 am As I said this is just a dream database in my mind anyway.
Who says that means it can’t be real one day?
I am still interested in helping achieve this—I need to overcome inertia.

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 1:07 pm
by Ares Land
It certainly would be interesting.

I have a caveat in that you'd get a lot of documentation, plenty of quotes and examples for Indo-European languages (not all branches though!) and then things get worse the further you get from Europe. Relatedly, the surprising sound changes tend to be found outside Europe...

(I have a paper somewhere on sound changes in Iroquoian; I could add these to ID if they're not already there, but providing examples is going to be tough!)

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 7:29 pm
by Moose-tache
I also have some books that could help contribute to this project. I agree that an IE bias is almost inevitable, but that's not the end of the world. And most of us are reasonably well-informed about at least one non-IE language. For example, I could probably supply sound changes, citations, and examples for most of the Tungusic languages.

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 9:42 pm
by Man in Space
Would it be worth it developing a dedicated web site? Like how the old KneeQuickie pages were, except beefier and cited and all…I realize this kind of defeats the purpose of having the master list as a PDF, but maybe it would be better that way. Maybe a group project webspace at the LCS, if they'd have it?

Re: The Index Diachronica

Posted: Tue Aug 30, 2022 10:02 pm
by bradrn
Moose-tache wrote: Tue Aug 30, 2022 7:29 pm I also have some books that could help contribute to this project. I agree that an IE bias is almost inevitable, but that's not the end of the world. And most of us are reasonably well-informed about at least one non-IE language. For example, I could probably supply sound changes, citations, and examples for most of the Tungusic languages.
I agree with this. I don’t think an IE bias would be a huge problem, any more than it’s been for the rest of historical linguistics — worst case, we’ll just have more information for IE than for non-IE, which is hardly the end of the world. And between us, we do all have a fair amount of linguistic knowledge. (And if I include inactive and former members, it becomes surprisingly comprehensive.)
Man in Space wrote: Tue Aug 30, 2022 9:42 pm Would it be worth it developing a dedicated web site? Like how the old KneeQuickie pages were, except beefier and cited and all…I realize this kind of defeats the purpose of having the master list as a PDF, but maybe it would be better that way. Maybe a group project webspace at the LCS, if they'd have it?
Yes, definitely! I’d even suggest developing a custom site, like what chridd made for the current Index Diachronica — it makes it so much easier to use.



In any case, it seems that we do have a fair number of interested people on the ZBB alone. What next?

As I see it, the creation of an updated Index Diachronica would need to proceed in two parts. Firstly, there is the actual gathering of reliable sound changes. This is somewhat laborious, but not too hard, and doesn’t need to be coordinated particularly tightly — we could start now, if we wanted.

Secondly, as I said above, we’d need to make a new website. The interface for the current version is pretty good, but also lacking in important places: just off the top of my head, it has no reliability information, no way to group sound changes, no way to repeat sound changes across different branches, poor referencing, and—most importantly—no way to edit anything. This all makes sense, given the original ID itself didn’t have any of this information to start with, but if we want to make a more reliable database, this is exactly the sort of information which we need to include.

If we’re serious about doing this, I’d suggest starting off by collecting comprehensive information about sound changes from a handful of small families, to get a better idea of what kind of information we’re dealing with. (I’d suggest starting with Anglic, Polynesian and Tungusic, but there may be better choices.) Then we can use those changes to start building our unified database of sound changes. Does this approach sound sensible to everyone?

(Oh, and by the way, if we really are serious, then we might want to find a better place for discussions on the project… either a new subforum, or some other place entirely.)

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 4:25 am
by dhok
I can try to throw this onto my list of projects, but 'by year's end' may be ambitious.

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 7:06 am
by bradrn
dhok wrote: Wed Aug 31, 2022 4:25 am I can try to throw this onto my list of projects, but 'by year's end' may be ambitious.
Indeed, given that statement was made last year…

(Less flippantly, I think we could reasonably hope to have a small working database by the end of this year; after that it would ‘just’ be a matter of collecting sound changes and possibly extending the software where needed. But it all depends on how interested and committed people are.)

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 9:22 am
by Kuchigakatai
bradrn wrote: Tue Aug 30, 2022 10:02 pm no way to group sound changes, no way to repeat sound changes across different branches, poor referencing,
What did you mean by this?
If we’re serious about doing this, I’d suggest starting off by collecting comprehensive information about sound changes from a handful of small families, to get a better idea of what kind of information we’re dealing with. (I’d suggest starting with Anglic, Polynesian and Tungusic, but there may be better choices.) Then we can use those changes to start building our unified database of sound changes. Does this approach sound sensible to everyone?
Okay. I'd suggest leaving Algic to dhok though. Oh wait- you wrote Anglic not Algic... Anglic, as in English/Scots/Yola?

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 10:32 am
by dhok
I'll take Algic or at least Algonquian.

Also, some measure of allowing for "original research" might not be the worst idea...the problem is that to really know the context for a historical phonology paper you do want to have some measure of familiarity with the language or its family.

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 7:31 pm
by Moose-tache
It might be a good idea to keep track of citations and have multiple contributors where possible. We could even have a tier system, in which languages with poorly documented sound changes get a special flair.

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 7:37 pm
by Man in Space
I’ve contacted zompist about opening up a new subforum; he’s amenable to the idea. I also registered indexdiachronica.com last night, so we could have a domain and web hosting.

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 8:58 pm
by Nortaneous
I'd contribute to this. I think compiling a searchable database from vetted papers (i.e. not Starostin, Ehret, etc. and not outlines with the same six obviously insufficient changes) would be a worthy goal. I had some Middle Welsh changes somewhere that never made it into the searchable version but that still should be good.

(Is "Index Diachronica" valid Latin? Cf. "Index Librorum Prohibitorum", etc. I talked to the Index Phonemica guy a few times and he was pretty unhappy about this.)

(Since this is my first post in almost a year, Pyysalo was underrated as a webdev guy, and even though he was wrong about laryngealism, ABVD and STEDT both suck as apps and I hope dude's making bank at FAANG like Stuart Robinson now. Someone should email him about this and see if he can be reasonable enough to contribute. I'm pretty mad about how the best online resource for IE is Wiktionary. Hi.)

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 9:25 pm
by fusijui
Nortaneous just took the pixels out from under my fingers. Of the handful of languages/families whose histories I feel even a little conversant with that appear in the current Index, the material is either extremely sketchy because there really aren't accepted reconstructions out there, extremely sketchy because the resources aren't ones that are available online, or sketchy in the sense of copying... 'low-grade sources', euphemistically.

I understand the urge to put up whatever is available, even if it's so incomplete as to be misleading, in the hope that it attracts something better -- but that's also sort of a cargo-cult approach. I'd really rather see blanks than a 'selection of some sound changes' with no ability to judge how complete or representative it is of the language/family's history.

Re: The Index Diachronica

Posted: Wed Aug 31, 2022 11:48 pm
by bradrn
Kuchigakatai wrote: Wed Aug 31, 2022 9:22 am
bradrn wrote: Tue Aug 30, 2022 10:02 pm no way to group sound changes, no way to repeat sound changes across different branches, poor referencing,
What did you mean by this?
‘no way to group sound changes’ — some sound changes are best expressed with more than one phonological rule (e.g. the Great Vowel Shift), but the current ID has no consistent way to indicate these rules should be considered as a single group.

‘no way to repeat sound changes across different branches’ — we know from the Wave Model that certain sound changes can spread across many different language groups, but the current ID needs to repeat these changes separately for each of these groups. (Although now that I double-check, this appears to be less of a problem than I thought it was.)

‘poor referencing’ — exactly what it sounds like; references for many sound changes are either poor or entirely absent.
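
To make the first two points concrete, here is a minimal sketch in Python of how a database entry might group several rules under one change and attach one change to several branches. Everything named here (`Rule`, `SoundChange`, `changes_for`, the field names) is hypothetical and not part of the current ID. The two Great Vowel Shift rules are a heavily simplified illustration, and the areal change with its branches is a pure placeholder.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """One phonological rule in 'target > result / environment' form."""
    target: str
    result: str
    environment: str = ""  # empty string means unconditioned


@dataclass
class SoundChange:
    """A named change made of one or more rules that belong together."""
    name: str
    rules: list[Rule]
    # Every branch the change applies to, so it is stored only once.
    branches: set[str] = field(default_factory=set)
    references: list[str] = field(default_factory=list)


def changes_for(branch: str, changes: list[SoundChange]) -> list[SoundChange]:
    """Return the changes recorded for a branch, in stored order."""
    return [c for c in changes if branch in c.branches]


# Simplified illustration: two of the rules usually grouped under this name.
gvs = SoundChange(
    name="Great Vowel Shift (simplified)",
    rules=[Rule("iː", "aɪ"), Rule("uː", "aʊ")],
    branches={"English"},
)

# Placeholder data: one change shared by two hypothetical branches.
areal = SoundChange(
    name="Hypothetical areal change",
    rules=[Rule("X", "Y")],
    branches={"Branch A", "Branch B"},
)

all_changes = [gvs, areal]
for change in changes_for("Branch B", all_changes):
    print(change.name, len(change.rules))
```

Storing the branches as a set on the change itself is only one option; a separate table linking changes to branches would work just as well and is probably what a real database would use.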
If we’re serious about doing this, I’d suggest starting off by collecting comprehensive information about sound changes from a handful of small families, to get a better idea of what kind of information we’re dealing with. (I’d suggest starting with Anglic, Polynesian and Tungusic, but there may be better choices.) Then we can use those changes to start building our unified database of sound changes. Does this approach sound sensible to everyone?
Okay. I'd suggest leaving Algic to dhok though. Oh wait- you wrote Anglic not Algic... Anglic, as in English/Scots/Yola?
Well, more as in Old/Middle/Modern English; we can probably leave Scots out of an initial version. (I’m not sure if there’s enough Yola materials to say much about its phonology with confidence…) My reasons for starting off with this particular grouping are as follows:
  1. Anglic is very well documented and researched, so we have a good chance at actually collecting all of the relevant sound changes with a high degree of confidence
  2. It includes some quite odd phonological rearrangements with difficult-to-describe conditioning factors, including many sporadic changes
  3. There’s probably several people here who are able to help with Anglic, at least to some extent
  4. I’ve already done some of the work, having collated sound changes from Middle to Modern English as an example for my SCA (though note that these give incorrect results on many inputs, so I’m sure I’ve missed some changes)
Though Algic would also be a perfectly fine choice, especially if dhok’s willing to do it. (We might also want to ask Whimemsz as well, if anyone is still in contact with him. At the very least, I know he contributed to the last version of the ID.)
dhok wrote: Wed Aug 31, 2022 10:32 am Also, some measure of allowing for "original research" might not be the worst idea...the problem is that to really know the context for a historical phonology paper you do want to have some measure of familiarity with the language or its family.
I’ve been thinking about this, and haven’t come to any firm conclusions yet… but I’m leaning towards forbidding original research, since that makes verification of reliability difficult. (It doesn’t mean that we can’t do original research; just that it needs to be published somewhere else as well, preferably with argumentation etc. I believe Wikipedia does this too.)
Moose-tache wrote: Wed Aug 31, 2022 7:31 pm It might be a good idea to keep track of citations and have multiple contributors where possible. We could even have a tier system, in which languages with poorly documented sound changes get a special flair.
I’ve been thinking of tiers too, as it happens. I’m considering the following system, from most reliable to least reliable:
  1. Fully documented sound changes from the ancestor to the descendent, with conditioning, ordering etc. fully worked out and agreed on
  2. Less well documented sound changes, mostly worked out but perhaps with minor inconsistencies or unknown points
  3. Mostly unordered sound correspondences, possibly with some details on conditioning but not much more
  4. Basic phoneme correspondences with no further details
  5. Obviously insufficient changes including only a subset of phoneme correspondences
It may be useful to have a tier system for protolanguages too, from ‘well-reconstructed’ to ‘no agreement that this exists’. But that’s a somewhat separate topic.
Man in Space wrote: Wed Aug 31, 2022 7:37 pm I’ve contacted zompist about opening up a new subforum; he’s amenable to the idea. I also registered indexdiachronica.com last night, so we could have a domain and web hosting.
Thank you!
Nortaneous wrote: Wed Aug 31, 2022 8:58 pm I'd contribute to this. I think compiling a searchable database from vetted papers (i.e. not Starostin, Ehret, etc. and not outlines with the same six obviously insufficient changes) would be a worthy goal.
Welcome back! I am thoroughly of your (and fusijui’s) mind in this regard: a searchable, reliable database of sound changes would be a great boon for conlangers, and quite probably for linguists more generally too.

However, I think we have some room to include less well-vetted sources too. This is why I like the idea of a ‘tiers’ system so much: it lets us include more data, while still maintaining reliability. Then, if the user only wants vetted sources, they can filter for ‘most reliable’ or whatever we end up calling it. As long as they’re not actively wrong, I see no reason to exclude sources just because they don’t comprehensively present a complete history.

(Of course, some things are actively wrong: Starostin and co. wouldn’t make the cut either way.)
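
For what it's worth, the five tiers and the 'filter for most reliable' idea could be sketched roughly as follows in Python. This is only an assumption about how it might look: the names `Tier`, `Entry` and `at_least` are invented for the example, the tier labels are shorthand for the five descriptions in the list, and the three entries are placeholders rather than real assessments.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Reliability tiers from the list; a lower number is more reliable."""
    FULLY_DOCUMENTED = 1
    MOSTLY_WORKED_OUT = 2
    UNORDERED_CORRESPONDENCES = 3
    BASIC_CORRESPONDENCES = 4
    INSUFFICIENT_SUBSET = 5


@dataclass
class Entry:
    language: str  # descendant language or family the changes lead to
    tier: Tier


def at_least(entries, worst_allowed):
    """Keep entries whose tier is as reliable as worst_allowed, or better."""
    return [e for e in entries if e.tier <= worst_allowed]


# Placeholder entries, not real reliability judgements.
entries = [
    Entry("Language A", Tier.FULLY_DOCUMENTED),
    Entry("Language B", Tier.UNORDERED_CORRESPONDENCES),
    Entry("Language C", Tier.INSUFFICIENT_SUBSET),
]

vetted_only = at_least(entries, Tier.FULLY_DOCUMENTED)
print([e.language for e in vetted_only])
```

Using an ordered enum means a user can ask for 'tier 3 or better' with one comparison, and no entry has to be thrown away to keep the strict view clean.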

Re: The Index Diachronica

Posted: Thu Sep 01, 2022 2:15 am
by Moose-tache
I’ve been thinking of tiers too, as it happens. I’m considering the following system, from most reliable to least reliable:
1) Fully documented sound changes from the ancestor to the descendent, with conditioning, ordering etc. fully worked out and agreed on
2) Less well documented sound changes, mostly worked out but perhaps with minor inconsistencies or unknown points
3) Mostly unordered sound correspondences, possibly with some details on conditioning but not much more
4) Basic phoneme correspondences with no further details
5) Obviously insufficient changes including only a subset of phoneme correspondences
Also, we need to have collaborative tiers. A hobbyist like me might find a random book on Bantu sound changes, but other people might be more likely to put their faith in it if four or five people have all edited the same section to coincide with their own reading.

I think these needs can be accomplished without a highly elaborate system. It could be as simple as following two rules:
1) no contributions or edits without citations, and
2) each section displays how many users have edited it.
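
A rough Python sketch of those two rules, just to show how little machinery they need. The `Section` class, its fields and the sample edits are all made up for illustration; nothing here reflects existing ID code.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    editors: set[str] = field(default_factory=set)
    # Each history item is (user, text, citation).
    history: list[tuple[str, str, str]] = field(default_factory=list)

    def edit(self, user, text, citation):
        """Rule 1: refuse any edit that arrives without a citation."""
        if not citation.strip():
            raise ValueError("every contribution needs a citation")
        self.history.append((user, text, citation))
        self.editors.add(user)

    @property
    def editor_count(self):
        """Rule 2: number of distinct users who have edited this section."""
        return len(self.editors)


# Placeholder users, rules and citations.
section = Section("Hypothetical family")
section.edit("user_a", "X > Y / #_", "Placeholder citation 1")
section.edit("user_b", "X > Y / #_V", "Placeholder citation 2")
section.edit("user_a", "Z > W", "Placeholder citation 3")
print(section.title, "edited by", section.editor_count, "users")
```

Counting distinct editors rather than total edits keeps one enthusiastic contributor from inflating the number.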

Re: The Index Diachronica

Posted: Thu Sep 01, 2022 4:56 am
by bradrn
Moose-tache wrote: Thu Sep 01, 2022 2:15 am
I’ve been thinking of tiers too, as it happens. I’m considering the following system, from most reliable to least reliable:
1) Fully documented sound changes from the ancestor to the descendent, with conditioning, ordering etc. fully worked out and agreed on
2) Less well documented sound changes, mostly worked out but perhaps with minor inconsistencies or unknown points
3) Mostly unordered sound correspondences, possibly with some details on conditioning but not much more
4) Basic phoneme correspondences with no further details
5) Obviously insufficient changes including only a subset of phoneme correspondences
Also, we need to have collaborative tiers. A hobbyist like me might find a random book on Bantu sound changes, but other people might be more likely to put their faith in it if four or five people have all edited the same section to coincide with their own reading.

I think these needs can be accomplished without a highly elaborate system. It could be as simple as following two rules:
1) no contributions or edits without citations, and
2) each section displays how many users have edited it.
I agree that this is needed, but the two systems seem complementary to me. It’s easy to imagine a situation where lots of people have edited a certain entry, but it still remains fairly superficial and unreliable due to lack of research — or, conversely, one where only a single person has edited an entry, but that entry is highly detailed and easily verifiable to be correct. (I imagine these cases might be exemplified by e.g. Algic and Anglic respectively.)

Re: The Index Diachronica

Posted: Thu Sep 01, 2022 10:09 am
by Moose-tache
What I mean is, even if it's cited, one single ZBBer might just have a hobby horse for, say, glottalic theory or something. I feel much more confident in the result if multiple contributors have looked at it, possibly made tweaks, and then said "Yup, that's good." I mean, if nobody is going to check my work, I could cite Edo Nyland for my Basque-to-Modern-Xhosa sound changes. Ideally we should have more than one person who can provide some review to each entry.