The Computer And General Tech Thread - Software, Hardware, Questions, etc.

bradrn
Posts: 6257
Joined: Fri Oct 19, 2018 1:25 am

Re: The Computer And General Tech Thread - Software, Hardware, Questions, etc.

Post by bradrn »

Richard W wrote: Sat Apr 23, 2022 5:48 pm I've been pondering a similar issue for fully designed Unicode-compliant regular expression engines. (The catch is that the set of strings recognised should be closed under canonical equivalence.) Strictly speaking, Kleene star isn't always a regular expression, though that can be made a non-issue by using a recursive NDFA (non-deterministic finite automaton) construction, which of course then isn't finite. (The simple example is deciding whether a string consists of an arbitrary number of repeats of U+0F73 TIBETAN VOWEL SIGN II. It's a fairly rare problem.) Once one throws Kleene stars and choices into the pattern, it's no longer even decidable in general whether the pattern is a regular expression.
I’m not sure I understand this. If you’re referring to ‘regular expressions’ in the formal, CS sense, the addition of a Kleene star will always give a valid regular expression, by definition. And PCRE-type regexps are a superset of formally defined regular expressions.
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices

(Why does phpBB not let me add >5 links here?)
zyxw59
Posts: 83
Joined: Wed Jul 11, 2018 12:07 am

Post by zyxw59 »

The complication here is with Unicode canonical equivalence. The example here is as follows:
A sequence of n copies of U+0F73 is canonically equivalent to a sequence of n copies of U+0F71 followed by n copies of U+0F72. Thus, with the requirement that the set of strings recognized is closed under canonical equivalence, /\u0F73*/ is not a regular expression (since it matches strings of the form a^n b^n for all n).

As far as I'm aware, this level of Unicode-awareness is not generally found in regex engines.
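
To see the equivalence concretely, here is a minimal Python sketch, assuming only the standard unicodedata module; the helper name nfd_of_repeats is made up for illustration. It normalises n copies of U+0F73 to NFD and prints the resulting code points, which come out as all the U+0F71s followed by all the U+0F72s.

```python
import unicodedata

def nfd_of_repeats(n: int) -> str:
    """Return the NFD form of n copies of U+0F73 TIBETAN VOWEL SIGN II."""
    return unicodedata.normalize("NFD", "\u0f73" * n)

# U+0F73 decomposes canonically to U+0F71 U+0F72. Both pieces are
# non-starters, so canonical reordering moves every U+0F71 (lower
# combining class) ahead of every U+0F72.
for n in range(4):
    print(n, [f"U+{ord(c):04X}" for c in nfd_of_repeats(n)])
```

So a matcher that only ever sees NFD input would have to accept exactly the strings with equal counts of the two marks, which is the a^n b^n shape.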
bradrn
Posts: 6257
Joined: Fri Oct 19, 2018 1:25 am

Post by bradrn »

zyxw59 wrote: Sat Apr 23, 2022 9:37 pm The complication here is with Unicode canonical equivalence. The example here is as follows:
A sequence of n copies of U+0F73 is canonically equivalent to a sequence of n copies of U+0F71 followed by n copies of U+0F72. Thus, with the requirement that the set of strings recognized is closed under canonical equivalence, /\u0F73*/ is not a regular expression (since it matches strings of the form a^n b^n for all n).

As far as I'm aware, this level of Unicode-awareness is not generally found in regex engines.
Ah; interesting problem. But I wouldn’t expect to see this sort of thing handled in a regex engine anyway — I’d want a regex engine to match sequences of character codes exactly, and it can just be passed canonicalised texts if I want it to treat equivalent sequences as equivalent.
Richard W
Posts: 1471
Joined: Sat Aug 11, 2018 12:53 pm

Post by Richard W »

bradrn wrote: Sat Apr 23, 2022 9:52 pm But I wouldn’t expect to see this sort of thing handled in a regex engine anyway — I’d want a regex engine to match sequences of character codes exactly, and it can just be passed canonicalised texts if I want it to treat equivalent sequences as equivalent.
Not easily, and I trust that Mark Davis knows that it isn't easy. There's actually a theorem (strictly speaking, a corollary to it), due I think to Edward Ochmański, that says that a finite automaton that can recognise a set of NFD strings can be converted to a finite automaton that will recognise the set of their canonical equivalents. The problem is to convert the naïve pattern to something intelligible. (Compare the problem of computing a regular expression for the difference of two regular sets.) To take a simple example, to determine whether a string contains the Vietnamese letter â, one can't just use the pattern ".*a\u0302.*" for comparison with the NFD form of the string; one has to use the pattern ".*a\p{ccc≠230}*\u0302.*" or something similar. (Anglice: the regular expression must allow any number of 'non-starters' with canonical combining class less than 230 between base letter and diacritic.) I've chosen the most succinct form of the pattern I could think of - it does not matter what non-NFD strings it would match. Any Vietnamese word containing the Vietnamese letter and tonemark combination "ậ" would satisfy the condition.

Where I found this a real problem was in matching strings defined in terms of Unicode properties.

And of course, your approach can't handle the challenge of matching "\u0f73*", as the set of NFD forms of that set is notoriously not a regular set.
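
For anyone who wants to try the â example: Python's built-in re module has no \p{ccc=...} property, so the rough sketch below does the scan by hand instead of with a regex. It assumes only the standard unicodedata module; the function name contains_a_circumflex is invented for illustration. Following the "Anglice" gloss above, it skips only non-starters with combining class strictly between 0 and 230, which is a little stricter than the literal ccc≠230 shorthand.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT, combining class 230

def contains_a_circumflex(text: str) -> bool:
    """Check for 'a' + circumflex up to canonical equivalence (sketch)."""
    nfd = unicodedata.normalize("NFD", text)
    for i, ch in enumerate(nfd):
        if ch != "a":
            continue
        j = i + 1
        # Skip non-starters that sort before the circumflex in NFD.
        while j < len(nfd) and 0 < unicodedata.combining(nfd[j]) < 230:
            j += 1
        if j < len(nfd) and nfd[j] == CIRCUMFLEX:
            return True
    return False

for word in ["c\u00e2y", "\u1ead", "\u00e1", "b\u0302"]:
    print(repr(word), contains_a_circumflex(word))
```

This only covers the single-letter case; it does nothing for the \u0f73* problem.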
bradrn
Posts: 6257
Joined: Fri Oct 19, 2018 1:25 am

Post by bradrn »

Richard W wrote: Sun Apr 24, 2022 6:07 pm
bradrn wrote: Sat Apr 23, 2022 9:52 pm But I wouldn’t expect to see this sort of thing handled in a regex engine anyway — I’d want a regex engine to match sequences of character codes exactly, and it can just be passed canonicalised texts if I want it to treat equivalent sequences as equivalent.
Not easily, and I trust that Mark Davis knows that it isn't easy. There's actually a theorem (strictly speaking, a corollary to it), due I think to Edward Ochmański, that says that a finite automaton that can recognise a set of NFD strings can be converted to a finite automaton that will recognise the set of their canonical equivalents. The problem is to convert the naïve pattern to something intelligible. (Compare the problem of computing a regular expression for the difference of two regular sets.) To take a simple example, to determine whether a string contains the Vietnamese letter â, one can't just use the pattern ".*a\u0302.*" for comparison with the NFD form of the string; one has to use the pattern ".*a\p{ccc≠230}*\u0302.*" or something similar. (Anglice: the regular expression must allow any number of 'non-starters' with canonical combining class less than 230 between base letter and diacritic.) I've chosen the most succinct form of the pattern I could think of - it does not matter what non-NFD strings it would match. Any Vietnamese word containing the Vietnamese letter and tonemark combination "ậ" would satisfy the condition.

Where I found this a real problem was in matching strings defined in terms of Unicode properties.

And of course, your approach can't handle the challenge of matching "\u0f73*", as the set of NFD forms of that set is notoriously not a regular set.
OK, this makes sense. I was wondering if NFC could work, but that would be even worse, I suspect.



Unrelatedly: a family friend was throwing out some old computer stuff, and I’ve managed to install Arch Linux on their desktop computer. Hopefully I’ll be able to self-host my own website soon!
Richard W
Posts: 1471
Joined: Sat Aug 11, 2018 12:53 pm

Post by Richard W »

bradrn wrote: Mon Apr 25, 2022 12:32 am I was wondering if NFC could work, but that would be even worse, I suspect.
Indeed. For the simple example, the regex would become something like:
.*([âấầẩẫậ]|[aąḁ]\p{ccc≠230}*\u0302).*
(The large size is to make the diacritics legible in likely fonts.)
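
A small Python check of why the NFC pattern needs both alternatives, as a sketch assuming only the standard unicodedata module (nfc_codepoints is a made-up helper name). Three inputs that all contain a + circumflex up to canonical equivalence end up with three different shapes in NFC: fully precomposed, a precomposed base still followed by a bare U+0302, and a precomposed â followed by a leftover mark.

```python
import unicodedata

def nfc_codepoints(s: str) -> list:
    """List the code points of the NFC form of s as 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", s)]

samples = {
    "a + dot below + circumflex": "a\u0323\u0302",
    "a + ogonek + circumflex": "a\u0328\u0302",
    "a + grave below + circumflex": "a\u0316\u0302",
}
for label, s in samples.items():
    print(label, "->", nfc_codepoints(s))
```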
Richard W
Posts: 1471
Joined: Sat Aug 11, 2018 12:53 pm

Post by Richard W »

bradrn wrote: Sat Apr 23, 2022 9:06 pm If you’re referring to ‘regular expressions’ in the formal, CS sense, the addition of a Kleene star will always give a valid regular expression, by definition.
I believe the formal definition of a regular expression is one defining a regular language, i.e. one that can be recognised by a finite state machine. It is Kleene's theorem that for a finitely generated free monoid, the rational expressions only define regular languages.
bradrn
Posts: 6257
Joined: Fri Oct 19, 2018 1:25 am

Post by bradrn »

Richard W wrote: Mon Apr 25, 2022 2:37 am
bradrn wrote: Sat Apr 23, 2022 9:06 pm If you’re referring to ‘regular expressions’ in the formal, CS sense, the addition of a Kleene star will always give a valid regular expression, by definition.
I believe the formal definition of a regular expression is one defining a regular language, i.e. one that can be recognised by a finite state machine.
That’s the one!
Travis B.
Posts: 6853
Joined: Sun Jul 15, 2018 8:52 pm

Post by Travis B. »

Well, I have yet another release of zeptoforth out, release 0.31.0, which re-adds deferred words and adds something called simple channels ("schannels" for short) to enable abstracted communication between tasks and interrupt service routines.
Yaaludinuya siima d'at yiseka wohadetafa gaare.
Ennadinut'a gaare d'ate eetatadi siiman.
T'awraa t'awraa t'awraa t'awraa t'awraa t'awraa t'awraa.
Richard W
Posts: 1471
Joined: Sat Aug 11, 2018 12:53 pm

Post by Richard W »

Travis B. wrote: Sat Apr 23, 2022 8:27 pm
Richard W wrote: Sat Apr 23, 2022 5:48 pm Decidability theory seems to distinguish between machines with sequential memory access and those with random access memory, so the whole concept fractures.
Are you sure?
It seems that I confused statements about the existence of programs with a given complexity with more general statements of existence.
Travis B.
Posts: 6853
Joined: Sun Jul 15, 2018 8:52 pm

Post by Travis B. »

I've ported my block support, including a block editor, for zeptoforth from the STM32F746 DISCOVERY platform to the Raspberry Pi Pico platform. There is less space available (1 MB rather than 8 MB, minus the space used in each Quad SPI flash sector for managing the blocks), but that still is a good amount of space all things considered. This should get into a new release soon.
Raholeun
Posts: 352
Joined: Wed Jul 11, 2018 9:09 am
Location: sub omnibus canonibus

Post by Raholeun »

When using Google Docs, would anybody know a practical way to insert nicely spaced glosses? Currently I create invisible tables, but this might be the least practical way to add a gloss.
bradrn
Posts: 6257
Joined: Fri Oct 19, 2018 1:25 am

Post by bradrn »

Raholeun wrote: Thu Apr 28, 2022 8:06 am When using Google Docs, would anybody know a practical way to insert nicely spaced glosses? Currently I create invisible tables, but this might be the least practical way to add a gloss.
The one time I tried to use Google Docs for linguistics, this is what I did also; it was so painful I swore off using the app for any more linguistic stuff. (It’s much easier in LaTeX.) But if you still want to torture yourself by using the thing, I suspect you might find it a bit easier to do glosses in a monospace font, like this:
Ŋapawelus naaqa waq fetlhalh thaŋ, sisasat thaŋ qi yusaye bengen qalit thaŋ waalhi.

[ Ŋa-pa-welus   naaqa  waq      fetlhalh       thaŋ   ], si-sasat  thaŋ    qi  yusaye    be-ngen  qalit  thaŋ    waalhi.
  that-DIM-man  who    do.IMPF  IDEO.horrible  DEF.SG    this-sun  DEF.SG  3s  come.PFV  1s-POSS  house  DEF.SG  go.PFV
[That awful little man], he stopped by my house yesterday.
It’s still quite painful, but less so than invisible tables, at least.

(Sentence taken from a recent scratchpad post. phpBB adds a funny border to monospaced text, but in Google Docs you should just be able to change the font.)
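
If lining the columns up by hand gets tedious, something like the following Python sketch can generate the padded lines to paste in. The function name align_gloss is hypothetical, and it assumes one character per code point, so words with combining diacritics or wide characters may still need a manual nudge.

```python
def align_gloss(words, glosses, gap=2):
    """Pad each word/gloss pair to a shared column width (sketch)."""
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    sep = " " * gap
    top = sep.join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    bottom = sep.join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return top + "\n" + bottom

print(align_gloss(
    ["si-sasat", "thaŋ", "qi", "yusaye"],
    ["this-sun", "DEF.SG", "3s", "come.PFV"],
))
```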
Travis B.
Posts: 6853
Joined: Sun Jul 15, 2018 8:52 pm

Post by Travis B. »

Travis B. wrote: Wed Apr 27, 2022 3:05 pm I've ported my block support, including a block editor, for zeptoforth from the STM32F746 DISCOVERY platform to the Raspberry Pi Pico platform. There is less space available (1 MB rather than 8 MB, minus the space used in each Quad SPI flash sector for managing the blocks), but that still is a good amount of space all things considered. This should get into a new release soon.
And release 0.32.0 is out, including these changes.
Raholeun
Posts: 352
Joined: Wed Jul 11, 2018 9:09 am
Location: sub omnibus canonibus

Post by Raholeun »

Thanks bradrn for the suggestion of using a monospace font. There are a couple that have IPA support. I think this indeed is the least bad option.
Travis B.
Posts: 6853
Joined: Sun Jul 15, 2018 8:52 pm

Post by Travis B. »

I wrote a Brainfuck compiler for zeptoforth (technically two of them, as one version compiles code that acts on bytes and the other version compiles code that acts on cells) which generates native code and can interoperate with Forth code.
rotting bones
Posts: 1408
Joined: Tue Dec 04, 2018 5:16 pm

Post by rotting bones »

Raphael wrote: Sat Apr 23, 2022 12:18 pm Something I've been thinking about a bit lately: how useful, really, is the concept of Turing completeness? My impression is that if I understand that concept correctly, it seems to imply that in theory, you should be able to play last year's best-selling high-end games on an ENIAC. But how would that ever work? And if, as seems likely, it can't work, what use is the concept then?
The Turing machine is useful for distinguishing which calculations a mechanical computer can perform given the known laws of nature. For example, we know that under either classical mechanics or quantum mechanics, no computer can decide in finite time whether an arbitrary input program will finish executing. However, there are some problems like factoring that a computer exploiting the laws of quantum physics can solve faster than any classical computer: https://en.wikipedia.org/wiki/Shor%27s_algorithm

There are several assumptions baked into these models of computation, like:

1. As was mentioned, they assume infinite memory. An old computer might be able to execute Crysis given enough memory, but you probably won't be able to actually play the game even so.

Memory is infinite because what we want to know is whether a calculation can be done even if memory were infinite. Not everything can.

2. Another definitional trick is that when I say a "computer can decide", I mean "an algorithm exists". In this context, an "algorithm" is defined as a set of instructions that finishes executing in finite time. By that definition, no web browser is an "algorithm" if it doesn't finish executing until you select an Exit option.

An "algorithm" is a mathematical procedure like multiplying two numbers, not an interactive computer program like a video game or a web browser.

3. When I say an algorithm is "faster", I mean it's faster as input size tends to infinity.

For "small" input sizes, "faster" algorithms may execute more slowly.
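
A toy illustration of point 3, as a Python sketch with made-up step counts (cost_linear and cost_quadratic are hypothetical, not measurements of any real algorithm): the asymptotically faster procedure has a large constant factor, so the "slower" one wins until the input gets big enough.

```python
def cost_linear(n: int) -> int:
    """Hypothetical step count: better growth rate, large constant."""
    return 1000 * n

def cost_quadratic(n: int) -> int:
    """Hypothetical step count: worse growth rate, tiny constant."""
    return n * n

def crossover() -> int:
    """Smallest input size at which the linear procedure is cheaper."""
    n = 1
    while cost_linear(n) >= cost_quadratic(n):
        n += 1
    return n

print("linear wins from n =", crossover())
```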

...

The point of these theoretical models is to distinguish what's possible from what's impossible. Knowing what's preventing us from solving certain problems lets us pinpoint the conditions under which those barriers are absent. This kind of thinking has led to theoretical breakthroughs like quantum computing. Consult a textbook if you want more details on the various complexity classes: https://www.mog.dog/files/SP2019/Sipser ... ion.3E.pdf

Science is important because ordinary human thinking is perversely resistant to common sense. Without theory, the only foundation for choosing first principles is advertising. The absorption of so much spiritual and otherwise irrational politics into QAnon suggests that the only political faction that benefits from replacing theory with advertising is fascism. No matter how spiritually moving your guru might make it seem, you wouldn't be able to write an algorithm that will solve world hunger if only you massacred minority X.
Raphael
Posts: 4557
Joined: Sun Jul 22, 2018 6:36 am

Post by Raphael »

rotting bones wrote: Sun May 01, 2022 12:43 pm
2. Another definitional trick is that when I say a "computer can decide", I mean "an algorithm exists". In this context, an "algorithm" is defined as a set of instructions that finishes executing in finite time. By that definition, no web browser is an "algorithm" if it doesn't finish executing until you select an Exit option.

An "algorithm" is a mathematical procedure like multiplying two numbers, not an interactive computer program like a video game or a web browser.
Ah, that explains a lot! Thank you!

Science is important because ordinary human thinking is perversely resistant to common sense. Without theory, the only foundation for choosing first principles is advertising. The absorption of so much spiritual and otherwise irrational politics into QAnon suggests that the only political faction that benefits from replacing theory with advertising is fascism. No matter how spiritually moving your guru might make it seem, you wouldn't be able to write an algorithm that will solve world hunger if only you massacred minority X.
I don't have a problem with theories in science.
Raphael
Posts: 4557
Joined: Sun Jul 22, 2018 6:36 am

Post by Raphael »

Completely unrelated: Am I the only one who thinks about PCs' power first and foremost in terms of RAM size? Tell me a PC has a CPU speed of 3.10GHz or something, and I have no idea what to make of that; tell me that it has 2GB RAM, and I have some idea of what it can do and where it fits into the history of computers. Thing is, I'm writing this on a PC whose RAM is a good deal bigger than the one I originally bought it with, while the CPU is still the original one, so in theory, I should know that RAM size can be misleading, but I somehow just can't help my "instincts" working the way they do.
Richard W
Posts: 1471
Joined: Sat Aug 11, 2018 12:53 pm

Post by Richard W »

But RAM can be significant for speed. I switched from 32-bit to 64-bit Ubuntu because mainstream 32-bit Ubuntu had been discontinued, and I've found that my PC is significantly slower. I strongly suspect that one solution would be to add more RAM, but I'm not sure if it's possible - I can't find a meaningful model number. The machine's of an age to have come with 64-bit Windows 7 installed. I frequently have the three memory gobblers Firefox, LibreOffice and Emacs running together.