On programming languages

alice · Post by **alice** » Sun Jan 26, 2025 2:22 pm

I've asked something similar before, but it's an interesting diversion, so here we go again.

Imagine an alternate timeline where the vast majority of programming languages were written not in English but in something much more highly inflected (ultimately, of course, Latin, Greek, or even Sanskrit).

What forms of words would be used? For example, in German, for "print" would you say "druck" or "drucken Sie", or even "druckt"? And would Identifiers which were Nouns all have Capital letters? Taking this a bit further, would you need to use the accusative case for arguments to functions ("calcula(rationem)")?

malloc · Post by **malloc** » Sun Jan 26, 2025 2:57 pm

I have always wondered this myself. Considering that my main conlang is polysynthetic, this issue presents some obvious challenges for creating a programming language in it.

Travis B. · Post by **Travis B.** » Sun Jan 26, 2025 3:13 pm

Note, however, that the main areas where English comes into play are keywords, which are normally English words, and identifiers, which are commonly written in English (as are the comments, commonly but not exclusively). Beyond that, programming languages are not very English-like to begin with, unless you are programming in <shudders> COBOL...

Man in Space · Post by **Man in Space** » Sun Jan 26, 2025 3:37 pm

Perligata is an implementation of Perl in Latin. It might be useful to see how they did it.

zompist · Post by **zompist** » Sun Jan 26, 2025 3:45 pm

alice wrote: ↑Sun Jan 26, 2025 2:22 pm Taking this a bit further, would you need to use the accusative case for arguments to functions ("calcula(rationem)")?

Maybe the accusative for pass by reference, and the ablative for pass by value...

Zju · Post by **Zju** » Sun Jan 26, 2025 4:26 pm

alice wrote: ↑Sun Jan 26, 2025 2:22 pm Imagine an alternate timeline where the vast majority of programming languages were written not in English but in something much more highly inflected (ultimately, of course, Latin, Greek, or even Sanskrit).

Inuktitut has left the chat.

What forms of words would be used?

For functions such as 'print', some kind of imperative or infinitive, I'd imagine. Or maybe 3SG (irrealis).

bradrn · Post by **bradrn** » Sun Jan 26, 2025 9:10 pm

My own view is that programming languages are just too different to human languages to be easily analogised to them. Our current languages are mostly dissimilar to English, and their syntax is more influenced by ease of implementation.

Take dot notation, for instance, as used for method-calling in most object-oriented languages. It’s very tempting to relate this to the SVO word order of English. Except — can the dispatching object really be called a ‘subject’? After all, the whole point of OOP is that its internal state is affected by calling methods, which makes it more like an object than a subject. (It’s even called an ‘object’ in OOP!) So that would make the word order… what, OVS? Except it isn’t really, because the parameters aren’t subjects either. It’s just a construction with no correspondent in human language at all.

On the other hand, it’s quite easy to justify this syntax by reference to its implementation. In most modern OOP languages, method dispatch is determined based on a single privileged argument. It thus makes sense to create a separate syntax for method calls which distinguishes this single argument. But it also makes sense to derive this syntax from that of ordinary function calls, which in most languages is f(x,y,z). (This in turn derives from mathematical notation, as indeed does most syntax in programming languages.) Thus we quite naturally get to obj.f(x,y,z).
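To make the "single privileged argument" point concrete, here is a minimal Python sketch. The class `Counter` and its method `add` are made up for illustration. In Python, a method call written with dot notation can also be written as an ordinary function call with the object passed as the first argument:

```python
class Counter:
    """Hypothetical example class, invented for this illustration."""

    def __init__(self):
        self.total = 0

    def add(self, x, y):
        # `self` is the single privileged argument that dispatch is based on.
        self.total += x + y
        return self.total


obj = Counter()

# Dot notation: the dispatching object is written before the method name.
first = obj.add(1, 2)

# The same kind of call in plain f(x, y, z) form: the object is just argument 0.
second = Counter.add(obj, 3, 4)

print(first, second)
```

Both spellings reach the same function; the dot form only moves the privileged argument to the front.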

It’s worth noting that even languages which don’t descend from ALGOL have converged to similar syntax — e.g. OCaml uses obj#f x y z. There are simply very strong functional pressures towards this syntax.

(Also worth noting: languages with multiple dispatch generally don’t use this syntax, because the same pressures aren’t there. Instead they tend to use the same syntax as ordinary function calls, like Common Lisp (method arg1 arg2 arg3).)
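For contrast, a toy sketch of multiple dispatch in Python. Everything here (`register`, `collide`, `Asteroid`, `Ship`) is a hypothetical name chosen for illustration, and the lookup matches exact types only, which is far cruder than what Common Lisp or Julia actually do. The point is just that the implementation is selected by the types of all arguments, so no argument earns a special position in the call syntax:

```python
# Toy dispatch table: implementations are keyed on the types of ALL arguments.
_registry = {}


def register(*types):
    """Record the decorated function as the implementation for these types."""
    def decorator(fn):
        _registry[types] = fn
        return fn
    return decorator


def collide(a, b):
    # Look up the implementation by the exact types of both arguments.
    return _registry[(type(a), type(b))](a, b)


class Asteroid:
    pass


class Ship:
    pass


@register(Asteroid, Ship)
def _asteroid_ship(a, b):
    return "asteroid hits ship"


@register(Ship, Asteroid)
def _ship_asteroid(a, b):
    return "ship hits asteroid"


result_1 = collide(Asteroid(), Ship())
result_2 = collide(Ship(), Asteroid())
print(result_1, "/", result_2)
```

The call is written `collide(a, b)`, the same shape as any other function call.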

There are of course some languages which deliberately reproduce English syntax. COBOL has been mentioned, but I think a better example is SQL, in which a SELECT statement is structured as an imperative with subclauses. The only reason this works is because it’s expected that the database converts the SQL query to something which can actually be run — otherwise, it would be unusably verbose.

Another very interesting example is APL. This is one programming language which can genuinely be described as being inspired by English syntax. It’s most obvious in the descendant language J, where the ‘Vocabulary’ is divided into ‘nouns’, ‘verbs’, ‘adverbs’ and ‘conjunctions’. The syntactic behaviour of each category is similar to the syntactic behaviour of the corresponding English categories. Though even here there are compromises to convenience: thus, while divalent verbs take one argument on either side, like English SVO word order, monovalent verbs take a following argument rather than a preceding one.

Also relevant: the process of learning APL (or J) is rather similar to the process of learning a human language. APL has ~75 different glyphs, each with a different meaning; the programmer is expected to learn all of them. (J increases the number to ~120.) As with natlang words, the meanings of these glyphs are nontransparent from their forms. On top of these, there are numerous ‘idiomatic’ combinations of glyphs, which are compositional but which the programmer is also expected to recognise as performing specific common tasks.

Unsurprisingly, APL is considered very hard to learn and has never become really widespread. It seems that designing a programming language like human languages makes it harder, rather than easier, to use.

Finally, I would be remiss if I didn’t mention Perl, designed by the linguistically-trained Larry Wall. He’s been known to describe Perl features in linguistic terms: thus, for instance, the auto-assigned variable $_ is a form of ‘topicalisation’, as are packages. A hash encodes ‘genitive or possessive’ relationships. Sigils are like ‘grammatical noun markers’, and are ‘singular’ or ‘plural’. (I’m pretty sure I’ve seen ‘agreement’ used in this context too, but I may be imagining it.) He also takes inspiration from general characteristics of human languages, such as letting a single word mean different things in different contexts. But that said, Perl doesn’t actually have very much resemblance to any specific human language, and is far more clearly inspired by previous programming languages like the Bourne shell and C.

All of which argues that probably speakers of other languages would, in general, design programming languages (and mathematical notation) similarly to us. At most, they might find Forth easier than we do.

TomHChappell · Post by **TomHChappell** » Sun Jan 26, 2025 11:01 pm

Most programs in most classical programming languages are nearly all imperatives.
In SQL and Oracle etc (IIANM), most programs are mostly interrogative.
We can count the data-type statements as declaratives; and count the error messages as exclamations.
Maybe we can count definitions of MACRO-instructions as declaratives, too, though that would be complex: statements within statements.
Even so, a program in whatever programming language hardly resembles the average thing that might be said in a discourse between two (or more) human beings!

bradrn · Post by **bradrn** » Sun Jan 26, 2025 11:34 pm

TomHChappell wrote: ↑Sun Jan 26, 2025 11:01 pm In SQL and Oracle etc (IIANM), most programs are mostly interrogative.

Really? I would call them imperatives also, addressed to the database. Grammatically, SQL certainly uses imperative sentences.

Ketsuban · Post by **Ketsuban** » Mon Jan 27, 2025 7:15 am

For your consideration: the قلب programming language.

bradrn · Post by **bradrn** » Mon Jan 27, 2025 8:50 am

Ketsuban wrote: ↑Mon Jan 27, 2025 7:15 am For your consideration: the قلب programming language.

But structurally it’s no different to any other Scheme.

Travis B. · Post by **Travis B.** » Mon Jan 27, 2025 11:49 am

bradrn wrote: ↑Sun Jan 26, 2025 9:10 pm All of which argues that probably speakers of other languages would, in general, design programming languages (and mathematical notation) similarly to us. At most, they might find Forth easier than we do.

One thing I should note is that traditionally in Forth the order is:

Code: Select all

arg0 arg1 arg2 ... target-arg operation

For instance, to set a variable foo to $DEADBEEF one does:

Code: Select all

$DEADBEEF foo !
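For readers who don't know Forth, the argument order above can be mimicked with a very small stack evaluator. This is only a Python sketch under simplifying assumptions (variables are looked up by name rather than by address, and only a handful of words exist); it is not how zeptoforth or any real Forth is implemented, and the function name `run` is made up:

```python
def run(source, variables):
    """Evaluate a whitespace-separated postfix program.

    Returns the final stack and the variable memory.
    """
    memory = {name: 0 for name in variables}
    stack = []
    for token in source.split():
        if token == "!":
            # Store: the target is on top of the stack, the value beneath it.
            target = stack.pop()
            value = stack.pop()
            memory[target] = value
        elif token == "@":
            # Fetch: replace the target on top of the stack with its contents.
            stack.append(memory[stack.pop()])
        elif token == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif token.startswith("$"):
            # A leading $ marks a hexadecimal literal, as in the example above.
            stack.append(int(token[1:], 16))
        elif token in memory:
            # A variable name pushes a reference to itself (here: its name).
            stack.append(token)
        else:
            stack.append(int(token))
    return stack, memory


stack, memory = run("$DEADBEEF foo ! foo @ 1 +", ["foo"])
print(stack, memory)
```

Every operation finds its arguments already on the stack, with the target argument last, which is why the operation itself comes at the end.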

In object-oriented code in my own Forth, zeptoforth, method calls follow a similar pattern:

Code: Select all

arg0 arg1 arg2 ... object-arg method

For instance, the code to send a frame to transmit to a WiFi interface in zeptoIP reads:

Code: Select all

self outgoing-buf bytes self out-frame-interface @ put-tx-frame

Torco · Post by **Torco** » Mon Jan 27, 2025 12:36 pm

i don't think this is determined (i.e. if inflectional language speakers come up with programming languages, they'll be more this whereas if agglutinative language speakers come up with them they'll look like that) but i do think there are some paths, conventions etcetera that English made somewhat less likely. for example.

functions could be like verbs: you could have "sumar(num, num) = blablabla" meaning define the function suma as blablabla, but "suma(1,2) = blablabla" mean that you're checking whether blablabla is equal to the sum of 1 and 2: this would save you the keyword def, and would help with avoiding = versus == (though that may not be a good idea).

animacy hierarchies or case could make it more intuitive to use them, instead of word order, to define what you mean by a = b. like, in real programming languages for the most part a = b means "let the variable a take the value b" instead of "let b take the value of a", but you could have a-NOM = b-ACC or something like that, to clarify that it is a that is taking the value of b.

maybe you could have some rule, following the way a language works, that uses plural markers so the compiler knows whether you mean an array or not. this would work in english, since things that end with s are often plural, so you can have "apple = 2.3" mean the float 2 point 3, whereas "apples = 2.3" might mean an array with the integers 2 and 3. this would help to avoid brackets.

overall, though, i'd expect alternate inventors of programming languages to mostly follow isolang logic: baking morphology into a compiler/interpreter sounds like making life more complicated than it has to be. then again this might be familiarity bias.

WeepingElf · Post by **WeepingElf** » Mon Jan 27, 2025 12:50 pm

This comes to my mind here.

bradrn · Post by **bradrn** » Mon Jan 27, 2025 6:57 pm

Torco wrote: ↑Mon Jan 27, 2025 12:36 pm functions could be like verbs: you could have "sumar(num, num) = blablabla" meaning define the function suma as blablabla, but "suma(1,2) = blablabla" mean that you're checking whether blablabla is equal to the sum of 1 and 2: this would save you the keyword def, and would help with avoiding = versus == (though that may not be a good idea).

If I understand correctly, this sounds like Prolog and other logic languages.

animacy hierarchies or case could make it more intuitive to use them, instead of word order, to define what you mean by a = b. like, in real programming languages for the most part a = b means "let the variable a take the value b" instead of "let b take the value of a", but you could have a-NOM = b-ACC or something like that, to clarify that it is a that is taking the value of b.

Perhaps, but it seems needlessly complicated compared to the alternative of simply having two operators. (As is the case in R, for instance: you can do var <- value or value -> var.)

maybe you could have some rule, following the way a language works, that uses plural markers so the compiler knows whether you mean an array or not. this would work in english, since things that end with s are often plural, so you can have "apple = 2.3" mean the float 2 point 3, whereas "apples = 2.3" might mean an array with the integers 2 and 3. this would help to avoid brackets.

Perl sigils (mentioned in my post) work this way. $apple is a scalar, @apple is a list. If you do $count = @apple, then $count is set to the number of elements in the list @apple, because in Perl that’s what it means to use a list in a scalar context.

Note again that this design is deliberately dissimilar to English even when it’s inspired by natural languages, because other factors are far more important. For instance $apple and @apple are nice and easy for a computer to parse. They’re also prominent to the human eye, an important point for features which are as easy to misuse as this one. They also lead directly to Perl features such as string interpolation, which depends on variables being easy to distinguish.

By contrast apple and apples are more difficult to parse, and very easy for a human programmer to confuse. And including the entire plural morphology of English in an interpreter feels like a recipe for disaster.
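A rough illustration of the parsing difference, as a Python sketch. The function names `classify_sigil` and `classify_suffix` are invented here, and the regex is a toy tokenizer, not how Perl's real parser works. With sigils, one character decides the kind of variable; with a naive English plural rule, ordinary singular words ending in -s are misclassified:

```python
import re

# A sigil ($ or @) followed by an identifier.
SIGIL_VAR = re.compile(r"([$@])([A-Za-z_]\w*)")


def classify_sigil(source):
    """Return (name, kind) pairs, reading the kind straight off the sigil."""
    kinds = {"$": "scalar", "@": "list"}
    return [(name, kinds[sigil]) for sigil, name in SIGIL_VAR.findall(source)]


def classify_suffix(name):
    """Naive plural rule: anything ending in 's' is treated as a list."""
    return "list" if name.endswith("s") else "scalar"


by_sigil = classify_sigil("$count = @apple")
by_suffix = {name: classify_suffix(name) for name in ["apple", "apples", "status"]}

print(by_sigil)
print(by_suffix)
```

Here `status` comes out as a list under the suffix rule even though it is singular, and irregular plurals such as `children` would be missed in the other direction.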

zompist · Post by **zompist** » Mon Jan 27, 2025 8:02 pm

bradrn wrote: ↑Mon Jan 27, 2025 6:57 pm Note again that this design is deliberately dissimilar to English even when it’s inspired by natural languages, because other factors are far more important. For instance $apple and @apple are nice and easy for a computer to parse.

I tend to agree that programming languages owe a lot more to mathematics than to natural language. Some of math does come from natlangs, e.g. why most languages don't use prefix or postfix operators. But something like f(x) doesn't really correspond to anything in a natlang, to my knowledge, and that directly influenced how functions are called in programming languages.

By contrast apple and apples are more difficult to parse, and very easy for a human programmer to confuse. And including the entire plural morphology of English in an interpreter feels like a recipe for disaster.

Kind of, maybe? English is so poor in inflections that this sort of thing wouldn't occur to English-speaking programmers. Would that be the case in, say, Quechua, where the plural is -kuna with no exceptions? That would be pretty easy to parse.

Programmers generally invent entirely new conventions that have to be learned, with no aids except perhaps knowing other languages, and those might teach you the wrong thing. (E.g. in BASIC, $ marks a string.) Maybe there's something to be said for conventions that users can figure out based on their general knowledge (not necessarily just language).

You could probably make a case that programming languages are still hamstrung by ASCII... almost all of them just reinterpret symbols that were defined on typewriters. APL goes beyond this, sometimes logically, sometimes perversely.

bradrn · Post by **bradrn** » Mon Jan 27, 2025 8:19 pm

zompist wrote: ↑Mon Jan 27, 2025 8:02 pm
By contrast apple and apples are more difficult to parse, and very easy for a human programmer to confuse. And including the entire plural morphology of English in an interpreter feels like a recipe for disaster.
Kind of, maybe? English is so poor in inflections that this sort of thing wouldn't occur to English-speaking programmers.

Not necessarily… it depends on the language. In Haskell this naming convention is very common: thus, for instance, a list is very often pattern-matched as x:xs, where x is the first element of the list and xs is the rest of the list.
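The same singular/plural naming habit carries over to other languages. As a rough Python analogue of the x:xs pattern (the function name `total` is just an example), sequence unpacking splits a list into its first element and the rest:

```python
def total(items):
    """Sum a list recursively, naming head and tail in the x / xs style."""
    if not items:
        return 0
    x, *xs = items  # x is the first element, xs is a list of the rest
    return x + total(xs)


result = total([1, 2, 3])
print(result)
```

As in Haskell, nothing in the language enforces the naming; `xs` is a list here only because the starred target in unpacking always produces one.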

Haskell does have a couple of factors in its favour here. Firstly, names tend to be very short, so the final -s is more prominent. Secondly, it has a very strong type system, so the programmer doesn’t have to rely on the naming convention. In Perl neither of those applies.

You could probably make a case that programming languages are still hamstrung by ASCII... almost all of them just reinterpret symbols that were defined on typewriters. APL goes beyond this, sometimes logically, sometimes perversely.

Non-ASCII symbols are becoming more common in recent languages. Julia and Lean in particular make very extensive use of them. There’s also the UnicodeSyntax language extension for Haskell, though most people find ASCII more convenient.

alice · Post by **alice** » Tue Jan 28, 2025 2:16 pm

bradrn wrote: ↑Sun Jan 26, 2025 9:10 pm Take dot notation, for instance, as used for method-calling in most object-oriented languages.

Of course, object.attribute could be reinterpreted as attribute-of-object, which is an obvious use for the genitive. Combined with this:

zompist wrote: ↑Sun Jan 26, 2025 3:45 pm Maybe the accusative for pass by reference, and the ablative for pass by value...

we're getting close to a Classical Latin or Greek programming language. (Or perhaps even German.)

Man in Space · Post by **Man in Space** » Tue Jan 28, 2025 4:32 pm

alice wrote: ↑Tue Jan 28, 2025 2:16 pm we're getting close to a Classical Latin or Greek programming language.

Man in Space wrote: ↑Sun Jan 26, 2025 3:37 pm Perligata is an implementation of Perl in Latin.

alice · Post by **alice** » Wed Jan 29, 2025 2:12 pm

Man in Space wrote: ↑Tue Jan 28, 2025 4:32 pm
alice wrote: ↑Tue Jan 28, 2025 2:16 pm we're getting close to a Classical Latin or Greek programming language.
Man in Space wrote: ↑Sun Jan 26, 2025 3:37 pm Perligata is an implementation of Perl in Latin.

I missed that!

Zompist Bboard Again
