AI in conlanging - present and future
Re: AI in conlanging - present and future
such a quantity of words is irrelevant,
even if generative artificial "intelligences"
try to replace the qualitative by the quantitative,
but all they produce is barroom talk with an erudite dummy...
Re: AI in conlanging - present and future
zompist wrote: ↑Sat May 11, 2024 5:16 am
"I got ChatGPT to estimate the total number of words it has been exposed to in all of its training. [...] Its ultimate estimate, IIRC, was 5 quintillion."
There are some estimates of the total size of the Internet, but I don't think they're comparable... consider how much of the Internet is pictures, video, HTML codes, or other code.

Given that this is AI, does it count code and/or HTML as words?
-
- Site Admin
- Posts: 3276
- Joined: Sun Jul 08, 2018 5:46 am
- Location: Right here, probably
- Contact:
Re: AI in conlanging - present and future
keenir wrote: ↑Sat May 11, 2024 11:55 pm
zompist wrote: ↑Sat May 11, 2024 5:16 am
"I got ChatGPT to estimate the total number of words it has been exposed to in all of its training. [...] Its ultimate estimate, IIRC, was 5 quintillion."
There are some estimates of the total size of the Internet, but I don't think they're comparable... consider how much of the Internet is pictures, video, HTML codes, or other code.
Given that this is AI, does it count code and/or HTML as words?

Doing some Googling, it appears GPT-4 was trained on 13 trillion tokens. (Tokens are somewhere between words and morphemes.) At that level of analysis, I don't think they're separating out programs from human language.
That's nowhere near 5 quintillion, so I think ChatGPT got confused here.
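For a rough sense of the gap, here is a back-of-the-envelope check. It assumes tokens and words can be compared one-to-one, which is only approximately true, and it takes both figures at face value as quoted in this thread.

```python
# Figures as quoted in the thread; neither is independently verified here.
gpt4_training_tokens = 13 * 10**12   # 13 trillion tokens (reported for GPT-4)
chatgpt_claimed_words = 5 * 10**18   # 5 quintillion words (ChatGPT's own estimate)

# Assumption: one token is treated as roughly one word for this comparison.
ratio = chatgpt_claimed_words / gpt4_training_tokens
print(f"The claimed figure is about {ratio:,.0f} times the reported token count")
```

Under that assumption, the claimed total is several hundred thousand times larger than the reported training set.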

Re: AI in conlanging - present and future
You’re quite right. I assumed the figure came from training corpus plus input from other AIs plus interactions with users, but didn’t interrogate that particularly well.
Here’s ChatGPT’s explanation:
I didn’t save the chat where the very large estimate was given, so I can’t remember how it got there.
Re: AI in conlanging - present and future
as my ai-enabled search engine says,
“AI-generated response. Remember to check important information”...
believe me, but not too much...
-
- Posts: 566
- Joined: Tue Mar 16, 2021 4:09 pm
- Location: Poland
Re: AI in conlanging - present and future
AI models are getting better at conlanging. The following text is courtesy of Claude 3.5 Sonnet. Which language is this conlang descended from?
On heste tid, sceap tat hæfde næn ull sæh hesta on feld. An drog swær wagn, oter bær micele byrten, and tridde bær mann swiftlice.
Sceap cwæt to hestum: "Min heorte is sær, tonne ic seo mann ridan hesta."
Hestas cwædon: "Hlyste, sceap! Ure heortan sint sære tonne we seot: mann, hlaford, macet ull of sceapum to warm clatum for him sylf. And sceap hæft næn ull."
Ta sceap hærde tis, hit fleah into feld.
IPA transcript:
/ɔn heste tiːd ʃeːap tat hæfde næːn ul sæːh hesta ɔn feld an drɔg swæːr wagn ɔter bæːr mikele byrten and tride bæːr man swiftlike/
/ʃeːap kwæːt tɔ hestum miːn heorte is sæːr tɔne ik seːo man riːdan hesta/
/hestas kwæːdɔn hlyste ʃeːap uːre heortan sint sæːre tɔne weː seːot man hlaːfɔrd maket ul ɔf ʃeːapum tɔ warm klaːtum fɔr him sylf and ʃeːap hæːft næːn ul/
/ta ʃeːap hæːrde tis hit fleːah intɔ feld/
- WeepingElf
- Posts: 1698
- Joined: Sun Jul 15, 2018 12:39 pm
- Location: Braunschweig, Germany
- Contact:
Re: AI in conlanging - present and future
Otto Kretschmer wrote: ↑Mon Jun 24, 2024 9:12 am
AI models are getting better at conlanging. The following text is courtesy of Claude 3.5 Sonnet. Which language is this conlang descended from?
On heste tid, sceap tat hæfde næn ull sæh hesta on feld. An drog swær wagn, oter bær micele byrten, and tridde bær mann swiftlice.
Sceap cwæt to hestum: "Min heorte is sær, tonne ic seo mann ridan hesta."
Hestas cwædon: "Hlyste, sceap! Ure heortan sint sære tonne we seot: mann, hlaford, macet ull of sceapum to warm clatum for him sylf. And sceap hæft næn ull."
Ta sceap hærde tis, hit fleah into feld.
IPA transcript:
/ɔn heste tiːd ʃeːap tat hæfde næːn ul sæːh hesta ɔn feld an drɔg swæːr wagn ɔter bæːr mikele byrten and tride bæːr man swiftlike/
/ʃeːap kwæːt tɔ hestum miːn heorte is sæːr tɔne ik seːo man riːdan hesta/
/hestas kwæːdɔn hlyste ʃeːap uːre heortan sint sæːre tɔne weː seːot man hlaːfɔrd maket ul ɔf ʃeːapum tɔ warm klaːtum fɔr him sylf and ʃeːap hæːft næːn ul/
/ta ʃeːap hæːrde tis hit fleːah intɔ feld/

Old English.
Re: AI in conlanging - present and future
To clarify, is it outright OE text, and not descended from it?
/j/ <j>
Ɂaləɂahina asəkipaɂə ileku omkiroro salka.
Loɂ ɂerleku asəɂulŋusikraɂə seləɂahina əɂətlahɂun əiŋɂiɂŋa.
Hərlaɂ. Hərlaɂ. Hərlaɂ. Hərlaɂ. Hərlaɂ. Hərlaɂ. Hərlaɂ.
- WeepingElf
Re: AI in conlanging - present and future
I am not an expert on Old English, but to me it looks just like outright Old English, not a conlang descended from it.
Re: AI in conlanging - present and future
WeepingElf wrote: ↑Mon Jun 24, 2024 1:15 pm
I am not an expert on Old English, but to me it looks just like outright Old English, not a conlang descended from it.

I'd agree with this, or perhaps suggest that it looks like a particularly good sample of Markov-generated text from OE inputs.
*I* used to be a front high unrounded vowel. *You* are just an accidental diphthong.
- WeepingElf
Re: AI in conlanging - present and future
alice wrote: ↑Mon Jun 24, 2024 3:04 pm
WeepingElf wrote: ↑Mon Jun 24, 2024 1:15 pm
I am not an expert on Old English, but to me it looks just like outright Old English, not a conlang descended from it.
I'd agree with this, or perhaps suggest that it looks like a particularly good sample of Markov-generated text from OE inputs.

It is obviously Schleicher's Fable, in a Germanic language that looks like Old English - but it doesn't really seem to be that, rather it seems to show some North Germanic traits, such as hest 'horse' or ull 'wool'. My guess is that it is a bogolang, obtained by running Old Norse through the sound changes from Common (West) Germanic to Old English. This kind of thing doesn't really require AI, just a sound change applier.
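To illustrate what a sound change applier does, here is a minimal sketch in Python. The rule set is a made-up toy for illustration only: it is not the actual set of changes from Common (West) Germanic to Old English, and the names `TOY_RULES`, `apply_sound_changes` and `convert_text` are hypothetical. The only point is that ordered regex substitutions over a word list are enough to produce a bogolang of this kind.

```python
import re

# Toy, made-up rules for illustration only; they are NOT the actual
# Common (West) Germanic -> Old English sound changes.
# Each rule is (regex pattern, replacement), applied in order to every word.
TOY_RULES = [
    (r"k(?=[ei])", "ch"),  # k becomes ch before a front vowel
    (r"a(?=n)", "o"),      # a becomes o before n
    (r"z$", "r"),          # word-final z becomes r
]

def apply_sound_changes(word, rules=TOY_RULES):
    """Apply each rule in order; later rules see earlier rules' output."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word

def convert_text(text, rules=TOY_RULES):
    """Run every whitespace-separated word of a text through the rules."""
    return " ".join(apply_sound_changes(w, rules) for w in text.split())

if __name__ == "__main__":
    # Invented input words, chosen only to trigger each toy rule.
    print(convert_text("kinaz land kanz"))
```

Rule order matters, since each rule operates on the output of the previous one; real sound change appliers add features such as phoneme classes and environments, which this sketch leaves out.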

Re: AI in conlanging - present and future
100% AI made. Guess the language family:
Spēt fīðr tōkral tō fītsk nī ūmar ār, fōtomp tō tāsk kōt nī.
/speːt fiːðr toːkral toː fiːtsk niː uːmar aːr | foːtomp toː taːsk koːt niː/
(All people free and equal in are-born are, dignity and just-things in.)
Kālr sāðkrār ār kōtwar tō kōkr sēm.
/kaːlr saːðkraːr aːr koːtwar toː koːkr seːm/
(They endowed are reason and heart's-reproach [conscience].)
Tō tāgv nī īpkār kōkr nī kōd svēr.
/toː taːgv niː iːpkaːr koːkr niː koːd sveːr/
(And each-other to sibling-group heart in action do-should.)
Re: AI in conlanging - present and future
Japonic, if not Future Japanese itself. What was the model, the prompt? Did you ask only for a text, or are there additional explanations? Did you gloss it yourself?
Yaa unák thual na !
Re: AI in conlanging - present and future
Gemini 2.5 Pro in Google AI Studio.
Prompt: "Create a conlang descended from Proto Japonic with Old Norse phonology and phonotactics but no Old Norse words or morphemes"
On the first try it did throw in some elements of Old Norse, but after I pointed that out, it churned out a fine conlang. Then I asked for some translations.
More translated text. The model outputs the roots for every word but I'm pretty certain this forum's software is too clunky for copypasting all of that:
Kīðkāz tō Fī (The North-Wind and The Sun)
Kīðkāz tō Fī sēmk ār, īð mōt tīkr kā.
/ˈkiːðkaːz toː fiː seːmk aːr | iːð moːt tiːkr kaː/
(North-Wind and Sun contended were, which more strong Q.)
Toki, tǣvfīð kītrok, āttakk kōrm nī meikt.
/ˈtoki | ˈtæːvfiːð ˈkiːtrok | ˈaːttakː koːrm niː mɛikt/
(Time[-when], traveler arrived, warm cloak in wrapped.)
Kālr āfk: fǣt onk [tǣvfīða kōrm nȳkmok nī nāsk] mōt tīkr sīlr ār tō, kāt jōr.
/kaːlr aːfk | fæːt oŋk | tæːvˈfiːða koːrm nyːkˈmok niː naːsk | moːt tiːkr siːlr aːr toː | kaːt joːr/
(They agreed: first one [traveler-ACC cloak make-take-off in succeed-PRES] more strong known be will that, side[-other] than.)
Satō Kīðkāz fūɣk, ītak fōð ātfok.
/ˈsatoː ˈkiːðkaːz fuːɣk | ˈiːtak foːð ˈaːtfok/
(Then North-Wind blew, intense extent could[-PAST].)
Sikār Kīðkāz ītak fūɣk fōð, sōn fōð kātak tǣvfīð kōrm makk kār mēgr.
/ˈsikaːr ˈkiːðkaːz ˈiːtak fuːɣk foːð | soːn foːð ˈkaːtak ˈtæːvfiːð koːrm makː kaːr meːgr/
(However North-Wind intense blew extent, that extent tightly traveler cloak wrapped him around.)
Tō, tūθ nī Kīðkāz mākk kōkrm.
/toː | tuːθ niː ˈkiːðkaːz maːkː koːkrm/
(And, final in North-Wind yielded attempt.)
Satō Fī tērk āttakk nī.
/ˈsatoː fiː teːrk ˈaːttakː niː/
(Then Sun shone warm in.)
Tō tāðtin tǣvfīð nūɣk kōrm.
/toː ˈtaːðtin ˈtæːvfiːð nuːɣk koːrm/
(And immediately traveler took-off cloak.)
Karā Kīðkāz sīfrark Fī mōt tīkr ār tō mītman.
/ˈkaraː ˈkiːðkaːz ˈsiːfrark fiː moːt tiːkr aːr toː ˈmiːtman/
(Therefore North-Wind was-obliged Sun more strong be that admit[-INF].)
Re: AI in conlanging - present and future
As far as I can see, this just takes all the fun out of it.
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
Re: AI in conlanging - present and future
I'm unsatisfied. The brief calls for a language descended from Proto-Japonic, but this is just mangled Japanese. Examples:
- It's like fifty percent long vowels, even in small grammatical words, which aside from being ugly seems a little implausible.
- "North" is found in Old Japanese, but since it's always written 北 we can't reconstruct the phonetics.
- Tǣvfīð reflects **tabihito (compare fī "sun", Japanese hi). The Proto-Japonic *tapiputo (not **tapipito!) gives tabyūdo in modern Japanese, but the word has been repaired by analogy to give tabibito with rendaku.
Re: AI in conlanging - present and future
It took me like 5 minutes to create this lol. Dunno if any human would do it that fast.
Remember that publicly available generative AI has been around for only 2.5 years; the best available models are essentially the equivalent of the Commodore 64 in the world of personal computers.
Re: AI in conlanging - present and future
Otto Kretschmer wrote: ↑Wed May 14, 2025 2:33 am
It took me like 5 minutes to create this lol. Dunno if any human would do it that fast.

Anyone with a decent knowledge of Japonic could do it that fast. I don’t know Japonic myself, but I could easily do as good a job of making such a ‘conlang’ from English or French (those being the languages I know best). But I don’t do so, because the result would be boring, incoherent and lazy, as indeed the thing in your post is.
- WeepingElf
Re: AI in conlanging - present and future
Otto Kretschmer wrote: ↑Wed May 14, 2025 2:33 am
It took me like 5 minutes to create this lol. Dunno if any human would do it that fast.

There's an old saying: you want it fast or you want it good?