A decryption challenge

Moose-tache · Post by **Moose-tache** » Thu Aug 20, 2020 12:50 am

There are way too many one-character words, and the two-character words are quite numerous and varied compared to English, so there either has to be an abugida-style vowel marking going on or (more likely in my opinion) abbreviation of lots of common words. I'm curious: do you think the mirror images are a grade (akin to descender vs no descender), or unrelated letters?
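One way to put a number on the "too many short words" impression is to compare word-length shares between the transcribed text and a stretch of ordinary English. This is only a minimal sketch: `word_length_shares` and the sample sentence are made up for illustration. For the code text, each word would be a sequence of glyphs rather than a string of letters.

```python
from collections import Counter

def word_length_shares(text: str) -> dict[int, float]:
    """Return the share of word tokens at each length (in characters)."""
    words = text.split()
    if not words:
        return {}
    counts = Counter(len(w) for w in words)
    return {n: counts[n] / len(words) for n in sorted(counts)}

# Illustrative English sample, not taken from the challenge text.
sample = "a cat sat on a mat by the door"
shares = word_length_shares(sample)
for length, share in shares.items():
    print(f"{length}-character words: {share:.0%}")
```

Running the same function over a transcription of the code (with one symbol per glyph) would give the figures to set beside the English ones.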

Risla · Post by **Risla** » Thu Aug 20, 2020 4:06 am

quinterbeck wrote: ↑Wed Aug 19, 2020 6:23 pm ignoring the dot and grave accents, which seem pretty evenly distributed.

Not giving anything away yet since I'm seeing some good insights in this thread, but since this is just a dumb handwriting thing: there's no distinction here (they're all supposed to be dots).

quinterbeck · Post by **quinterbeck** » Thu Aug 20, 2020 9:00 am

Risla wrote: ↑Thu Aug 20, 2020 4:06 am
quinterbeck wrote: ↑Wed Aug 19, 2020 6:23 pm ignoring the dot and grave accents, which seem pretty evenly distributed.
Not giving anything away yet since I'm seeing some good insights in this thread, but since this is just a dumb handwriting thing: there's no distinction here (they're all supposed to be dots).

Aha! That explains why some of them are so ambiguous!

Moose-tache wrote: ↑Thu Aug 20, 2020 12:50 am There are way too many one-character words, and the two-character words are quite numerous and varied compared to English, so there either has to be an abugida-style vowel marking going on or (more likely in my opinion) abbreviation of lots of common words.

I reckon it's a combination of both.

Moose-tache wrote: ↑Thu Aug 20, 2020 12:50 am I'm curious: do you think the mirror images are a grade (akin to descender vs no descender), or unrelated letters?

Not a grade, as there's only two pairs like that. Different series, I'd say, but perhaps related by feature.

Qwynegold · Post by **Qwynegold** » Thu Aug 20, 2020 3:11 pm

quinterbeck wrote: ↑Wed Aug 19, 2020 6:23 pm Based on that, I think it's a sort of abugida with the series indicating a consonant (although it's 5 more than English has)

Do you mean consonant sounds or consonant phonemes?

quinterbeck wrote: ↑Wed Aug 19, 2020 6:23 pm All my tables and tallies are on paper, but I'm happy to scan some of it and add to the thread if anyone wants

Yes please. I'd like to take a look and see if I can figure anything out, and it would be wasteful for me to tally things if someone's already done that.

quinterbeck · Post by **quinterbeck** » Thu Aug 27, 2020 4:39 pm

I couldn't get my scanner to work, so I'm afraid it's all phone photos!

Here's what I think is the base glyph set, followed by the tallies of glyphs with extensions (ascenders and descenders). The alphanumeric labels are part of the encoding I've been using, based on the graphical relationships that I see. There's also a tally of all words, and on a fourth sheet, a word summary of words with a frequency of two or more, plus the six word collocations I could find.
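For anyone who would rather tally by machine than on paper, the word tally and the "frequency of two or more" summary can be sketched roughly as below. The transcription format is an assumption made up for this example (one string per line of text, words separated by spaces, glyphs inside a word joined with "+"), and the labels are placeholders rather than readings from the actual text. Recurring adjacent word pairs are used here as one simple reading of "collocations", and pairs are not counted across line breaks.

```python
from collections import Counter

# Hypothetical transcription, for illustration only.
lines = [
    "a1-l a2+d3 d3",
    "a4b a1-l a2+d3",
]

def tally_words(lines):
    """Count every word token across all lines."""
    return Counter(word for line in lines for word in line.split())

def repeated(tally, minimum=2):
    """Keep only the entries seen at least `minimum` times."""
    return [(item, n) for item, n in tally.most_common() if n >= minimum]

def tally_pairs(lines):
    """Count adjacent word pairs within each line."""
    pairs = Counter()
    for line in lines:
        words = line.split()
        pairs.update(zip(words, words[1:]))
    return pairs

word_tally = tally_words(lines)
print(repeated(word_tally))
print(repeated(tally_pairs(lines)))
```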

The photos are embedded under more, and here's a link to the imgur album: https://imgur.com/a/jioTF7g

More: show

Glyphs
Graphically, there's two main types: ascending (a) and descending (d) - a series only appears with one type of extension or the other. Then I've grouped them into 5 or 6 numbered groups each based on a common shape, as there are a lot of pairs with the only difference being a horizontal stroke (at x-height on a-types, and baseline-height on d-types). The exceptions being: d1#, grouped by reflection, and d4#, grouped by weaker graphical similarity of base+descender shape. And a1::a1b is possibly not a meaningful grouping. Also, a5c might have been better placed in the #b column... ah well.

I initially treated the dot and line diacritics the same, and tallied up ignoring them, but then noticed that the line diacritics only appear on certain ascending glyphs, which leads me to believe they are separate but related series. The dot diacritics are much more widely distributed. The overline (o) only appears on #b bases, and the underline (u) appears on primary and #b bases (I've encoded b+u as v). In fact, it occurs to me as I type that a5b might actually belong in the #u column... then I could scrap #c... hmm, maybe I will recode that, actually.

So on the series-grade tally sheet, the counts on the left hand side ignore line diacritics, then the revised counts of a-type including lined series are on the right. In the bottom right hand corner you can see my encoding for ascenders (l-t-f) and descenders (q-g-j), matching the table columns. (For example, a1-l is the most common glyph with an ascender.)

Then at the bottom, I've noted some caveats about the descender grades. Although I'm reasonably confident that right-hooked and left-hooked descenders are the correct correspondence, it's possible that it's inwards/outwards, so I've made some notes about which way the groups 'face'. Qwynegold's comment about reflected characters led me to realise I had assumed it, although I think the evidence is in favour of the initial assumption anyway. Especially as group d4# doesn't have a clear facing direction, and apart from d1b, the remaining attested series face 'left'.

For the sake of completeness I've been encoding the dot diacritics as i (overdot) and e (underdot), and I also designated all the punctuation marks for the purposes of transcribing the text into Excel, which I have completed (with errors, probably).
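The labelling scheme described above could be checked mechanically with something like the sketch below. It rests on assumptions: that a label is type (a or d), group number, optional variant letter (b, c, o, u, v), then an optional hyphen followed by modifier letters, and that extension and dot letters are simply written one after another when both occur. Only labels such as a1-l and a2b-i appear in the posts, so the combined form is a guess, and `parse_glyph` is a made-up helper name.

```python
import re
from typing import NamedTuple, Optional

# Assumed label shape: type, group, optional variant, optional "-modifiers".
GLYPH = re.compile(
    r"^(?P<kind>[ad])(?P<group>[1-6])(?P<variant>[bcouv]?)"
    r"(?:-(?P<mods>[ltfqgjie]+))?$"
)

class Glyph(NamedTuple):
    kind: str       # "a" (ascending series) or "d" (descending series)
    group: int      # numbered shape group
    variant: str    # "", "b", "c", "o", "u" or "v"
    extension: str  # l/t/f for ascenders, q/g/j for descenders
    dots: str       # "i" (overdot) and/or "e" (underdot)

def parse_glyph(label: str) -> Optional[Glyph]:
    """Split a label into its parts; return None if it breaks the scheme."""
    m = GLYPH.match(label)
    if m is None:
        return None
    mods = m.group("mods") or ""
    extension = "".join(c for c in mods if c in "ltfqgj")
    dots = "".join(c for c in mods if c in "ie")
    # A series only takes one type of extension: ascenders on a, descenders on d.
    allowed = "ltf" if m.group("kind") == "a" else "qgj"
    if any(c not in allowed for c in extension):
        return None
    return Glyph(m.group("kind"), int(m.group("group")),
                 m.group("variant"), extension, dots)

print(parse_glyph("a1-l"))
print(parse_glyph("a2b-i"))
print(parse_glyph("d1-l"))  # ascender on a descending series: rejected
```

Rejecting mismatched extensions is a cheap way to catch transcription slips when copying from paper into a spreadsheet.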

Words
The wordlist was put together before Risla revealed that dot and dash were not distinct, so there are some words counted separately that shouldn't be - at a glance I see a2-i, a2-e and a2b-i all have two variants listed.

I don't have any experience in corpus analysis, but I'm surprised so many words are unique.

(I need to go to bed - I'll come back and add some more thoughts to this tomorrow)

Risla · Post by **Risla** » Thu Aug 27, 2020 9:00 pm

Oh hey, nice work so far. It's cool to see my code written in someone else's handwriting!

I went through the word list looking for mistakes, and found a couple, at least one of which (the one in white, erasing the low bar) is definitely my own and is in the original text (I knew it as soon as I found it, it's one of the most common dumb mistakes I make when I'm writing).

More: show

I've added a couple dots in pink, circled in green one definite non-word, and made a couple edits in pink based on the fact that the following characters are actually distinct:

I figure this one could be thought of as just a handwriting thing—I could definitely make them more distinct when writing—so I thought I'd give you this one.

quinterbeck · Post by **quinterbeck** » Fri Aug 28, 2020 9:49 am

Thanks for the corrections!

Yes, I did assume that was a handwriting thing, as that character on the left is the only one in the text with a long base. Still, good to know it's distinct.

Your green-circled 'non-word' appears in the compound at the beginning of line 17 in the text. Just checked, and I've copied it correctly (I've assumed that the colon character is some kind of punctuation, that's why I describe it as a compound)

Risla · Post by **Risla** » Fri Aug 28, 2020 3:17 pm

Oh doh, it is a word! I guess I was primed by a different word (at the beginning of line eight) and read one character as another. Sorry for the brain fart!

It's interesting to me how bad I am at actually reading my own code, despite the fact that I can write it fluently. The only texts I have in it are ones written by myself, which means when I'm reading it I tend to be very good at predicting the next word. As soon as I saw it in the text proper I read it with no problem, but it's always harder to read anything out of context.

quinterbeck · Post by **quinterbeck** » Fri Aug 28, 2020 3:49 pm

Risla wrote: ↑Fri Aug 28, 2020 3:17 pm Oh doh, it is a word! I guess I was primed by a different word (at the beginning of line eight) and read one character as another. Sorry for the brain fart!

I actually misread it the first time and thought it was a repetition of line eight word one! That's why it's off to the side

So I don't blame you

quinterbeck · Post by **quinterbeck** » Sun Aug 30, 2020 12:44 pm

My working assumption is that the x-height characters indicate consonants (possibly with inherent vowel) and the combination of modifiers (ascenders, descenders, dots and possibly line diacritics) describe vowels, possibly underspecified.

The long-based character you corrected me on (I'll call it a6) seems to sit outside the pattern, so I'm ignoring it for now.

Didn't mention before that I'm confident of the following as punctuation: first character is a paragraph marker (p), end of the first line is a period (n), end of the second line is perhaps an exclamation or question mark (m), the tailed period is a comma (y) and the chevrons are brackets or quotations (k).

I've hit a bit of a brick wall, so I'm going to ask a question!

Question: With those assumptions above, aside from any quirky function words and our friend a6, and considering only the consonant aspect of the glyph, does a glyph represent at most a single consonant, or can it represent more than one in some cases? By glyph, I mean the base form and modifiers combined.

Also, Risla, what variety of English do you speak?

Risla · Post by **Risla** » Sun Aug 30, 2020 3:05 pm

I speak American English, and am notably cot/caught unmerged (they're very close though, and sometimes ambiguous).

And the answer to your question:

More: show

Risla · Post by **Risla** » Sat Sep 05, 2020 11:09 pm

It's been about a week—any progress?

I've got a bit of a hint, and a peculiarity. People have already observed that words in the code tend to have far fewer characters than English text. Here's the most extreme example of this that I've found:

This word is, in English, nine letters long…and three syllables!

It's a pretty rare word, but I thought I might mention it. If there's been no other progress, I can start dropping other hints too.

bradrn · Post by **bradrn** » Sat Sep 05, 2020 11:17 pm

Risla wrote: ↑Sat Sep 05, 2020 11:09 pm This word is, in English, nine letters long…and three syllables!

It's a pretty rare word, but I thought I might mention it. If there's been no other progress, I can start dropping other hints too.

May we ask how many phonemes it is? If said word ends in ‘-ough’, for instance, then that accounts for four letters (almost half the word) but only two phonemes.

Risla · Post by **Risla** » Sat Sep 05, 2020 11:18 pm

Seven.

bradrn · Post by **bradrn** » Sat Sep 05, 2020 11:23 pm

Risla wrote: ↑Sat Sep 05, 2020 11:18 pm Seven.

My, that was a quick reply! Though it doesn’t actually help me too much… it’s pretty easy to think up nine-letter seven-phoneme three-syllable words. e.g. arresting would qualify, but I’m pretty sure that’s not the correct word.
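To show how loose the hint is, here is a small sketch that checks a candidate against the three numbers given so far (nine letters, seven phonemes, three syllables). The phoneme lists are typed in by hand in ARPAbet-style symbols and are my own transcriptions, so treat them as assumptions. Syllables are counted as one per vowel phoneme, which is a simplification.

```python
# ARPAbet-style vowel symbols; each vowel phoneme counts as one syllable nucleus.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

def profile(word, phonemes):
    """Return (letters, phonemes, syllables) for a word and its transcription."""
    syllables = sum(1 for p in phonemes if p in VOWELS)
    return len(word), len(phonemes), syllables

def fits_hint(word, phonemes):
    """True if the word has nine letters, seven phonemes and three syllables."""
    return profile(word, phonemes) == (9, 7, 3)

# Hand-typed transcriptions (assumed, not looked up in a dictionary).
candidates = {
    "arresting": ["AH", "R", "EH", "S", "T", "IH", "NG"],
    "through": ["TH", "R", "UW"],
}

for word, phonemes in candidates.items():
    print(word, profile(word, phonemes), fits_hint(word, phonemes))
```

Fed with a pronouncing dictionary instead of two hand-typed entries, the same check would produce the long list of candidates.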

quinterbeck · Post by **quinterbeck** » Sun Sep 06, 2020 9:22 am

Is the script phonemic? Or near enough? I've been assuming it is, but probably should check. (EDIT: ignoring abbreviations, I should say)

Are there any sounds of English that don't appear in your text? If yes, what?

Risla · Post by **Risla** » Sun Sep 06, 2020 3:12 pm

It is indeed more or less phonemic, albeit with some quirks.

Every English phoneme can be represented in the script. In the example text given, there are no examples of /ʒ/. However, not all phonemes are represented in all contexts, particularly when they are recoverable based on knowledge of English phonology (which is indeed what's happening with that one word).

quinterbeck · Post by **quinterbeck** » Fri Sep 11, 2020 12:26 pm

Risla wrote: ↑Sun Sep 06, 2020 3:12 pm In the example text given, there are no examples of /ʒ/.

Interesting! I've been hypothesizing that pairs distinguished by an attached horizontal stroke are voiced-voiceless pairs, but I haven't found a way to make that fit the data yet. Ignoring a1::a1b, that leaves seven pairs present (a2 a3 a4 a5 d2 d3 d5), which would line up with the eight pairs in English - although I'm struggling to see what could be /ʃ/ with no counterpart /ʒ/ - perhaps the c-shape (d4b)...
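The pair arithmetic can be laid out as a quick sketch. The inventory below is the usual textbook list of eight English voiceless/voiced obstruent pairs, which I am assuming is the right set to compare against. The only absent phoneme is /ʒ/, as stated in the quoted post.

```python
# Voiceless/voiced obstruent pairs of English (standard textbook inventory, assumed).
PAIRS = [("p", "b"), ("t", "d"), ("k", "g"), ("f", "v"),
         ("θ", "ð"), ("s", "z"), ("ʃ", "ʒ"), ("tʃ", "dʒ")]

def split_pairs(absent):
    """Separate pairs fully attested in the text from those missing one member."""
    complete, unpaired = [], []
    for voiceless, voiced in PAIRS:
        present = [p for p in (voiceless, voiced) if p not in absent]
        if len(present) == 2:
            complete.append((voiceless, voiced))
        elif len(present) == 1:
            unpaired.append(present[0])
    return complete, unpaired

complete, unpaired = split_pairs({"ʒ"})
print(len(complete), "complete pairs; unpaired:", unpaired)
```

So seven stroke pairs plus one glyph lacking a partner is what the voicing hypothesis would predict for this particular text.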

Looking at my table below - are the gaps at a4o, a5v and d6b fillable? Or would glyphs at those places have no value?

More: show

Risla wrote: ↑Sun Sep 06, 2020 3:12 pm However, not all phonemes are represented in all contexts, particularly when they are recoverable based on knowledge of English phonology

Thanks! I had assumed that was the case for vowels - is it also true for consonants?

Risla · Post by **Risla** » Sat Sep 12, 2020 9:20 am

Answer mostly provided in code:

All consonants are marked, aside (again) from some idiosyncrasies surrounding a bare handful of function morphemes.

quinterbeck · Post by **quinterbeck** » Sun Sep 13, 2020 7:31 am

My word, there's quite a bit to get my teeth into there! Goodness gracious, there are things here I never dreamed of.

Zompist Bboard Again
