xixixao/languages.md Secret

Last active March 19, 2025 13:08

From Languages to Language Sets

After working with a lot of languages and writing my own, this is currently what I consider the most useful classification of programming languages, into 4 levels:

4: Interpreted, dynamically typed: JavaScript, Python, PHP
3: Interpreted, statically typed: Hack, Flow, TypeScript, mypy
2: Compiled with automatic memory management (statically typed): Go, Java (Kotlin), C#, Haskell, Objective-C, Swift
1: Compiled with manual memory management (statically typed): Rust, C, C++

There is a 0th level, assembly, but it’s not a practical choice for most programmers today.

Now every language trades off “ease of use” with “performance”. On this hierarchy the higher numbered, “higher level”, languages are easier to use, while the lower numbered, “lower level”, languages are more performant.

I postulate that for most programming, the “business logic” kind of programming, we want to use a language that sits right in the middle of that hierarchy. Looking at the languages listed, that’s no revelation. One language could combine the 2nd and 3rd levels, though. A language that can be interpreted during development for fast iteration cycle, but compiled for better performance for deployment. There isn’t such a language popular today though.

Now let’s address level 4. Big players sit at this level, perhaps the most popular languages by headcount of their programmers. The problem with a lack of static typing is that it’s hard to work on such code in groups and at scale. Every successful business that started with those languages eventually rewrites its codebase in one of the “lower level” languages, because big codebases written by many people are hard to maintain and modify without the support of a static type-checker. They are still great languages for solo, small projects, especially if the code can easily be tested automatically.

Now for level 1. Rust has done an amazing job bringing level 1 to a wider audience of programmers. By being both modern and safe, it allows many more people to write code that requires the best possible performance and resource utilization. In such scenarios Rust should be a clear choice. But coding in Rust is not easy, not in the way coding in JavaScript or Python is. The same solution, though much more performant, might require many more lines of code.

And so we come to levels 2 and 3, where most professional programmers today spend their time. The tradeoff between them is clear: the interpreted languages have a faster development cycle because they don’t require the programmer to wait for a compilation step. But this comes at the cost of performance, as an interpreter in general cannot be as good at optimizing and executing the code as a compiler.

The interesting thing is that these languages are almost identical in their expressive power. The only gap between them is that interpreted languages can include “eval” and dynamic meta-programming (modification of program structure at runtime). These features are usually shied away from in production code though, and are more helpful during development, especially for testing.

The discussion here implies that companies need to use at least 3, often 4 different languages in their codebases. This means 4 different toolsets to maintain. Trainings to provide. Experts to hire. And usually disjoint sets of employee programmers who cannot easily jump from one language to the other.

Clearly there will never be a single language that all programmers use. We need to take advantage of the tradeoffs laid out in this hierarchy. But what we could do is to build a language set, which would smooth out the transition between these levels.

As the basis of this set I propose to use Rust. It is a solid low level foundation to build our language set on. It has modern, well thought out language tooling (including things like its syntax).

There will be 3 languages in this set: besides Rust, we want a level 2/3 hybrid and a level 4 language.

Let’s look at an example to make this concrete. First a program in Rust:

fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!("The area is {}.", area(&rect1));
}

struct Rectangle {
    width: u32,
    height: u32,
}

fn area(rectangle: &Rectangle) -> u32 {
    rectangle.width * rectangle.height
}

Now a program in RustGC, our level 2/3 hybrid:

fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!("The area is {}.", area(rect1));
}

struct Rectangle {
    width: int,
    height: int,
}

fn area(rectangle: Rectangle) {
    rectangle.width * rectangle.height
}

And now a program in RustScript, our level 4 language:

fn main() {
    let rect1 = { width: 30, height: 50 };
    println!("The area is {}.", area(rect1));
}

fn area(rectangle) {
    rectangle.width * rectangle.height
}

RustScript can be used for heavy prototyping, especially for complicated stateful programming (interactive UIs). RustGC is our workhorse, with great async support and decent performance thanks to a modern garbage collector, but without the mental overhead of fighting the borrow checker. Finally, we reach for Rust any time we need maximum performance and zero-cost abstractions.

RustGC comes with a VM that allows an instantaneous save -> execute dev cycle. For deployment it is compiled to a binary similar to the one Rust would produce, but with an accompanying GC runtime.
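
To give a feel for what RustGC would take off the programmer's hands, here is a minimal sketch in today's Rust that approximates a garbage-collected reference with `Rc<RefCell<T>>`. The `Gc` alias and the `widen` helper are hypothetical names made up for this illustration, and reference counting is only an analogy for a real collector (it does not reclaim cycles, for example).

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Rectangle {
    width: u32,
    height: u32,
}

// Hypothetical alias: a rough stand-in for what a RustGC reference could
// look like if spelled out in plain Rust. Rc is reference counting, not a
// tracing garbage collector, so this is only an approximation.
type Gc<T> = Rc<RefCell<T>>;

fn area(rectangle: &Gc<Rectangle>) -> u32 {
    let r = rectangle.borrow();
    r.width * r.height
}

// Mutates through a shared handle; the borrow rules are checked at runtime.
fn widen(rectangle: &Gc<Rectangle>, by: u32) {
    rectangle.borrow_mut().width += by;
}

fn main() {
    let rect1: Gc<Rectangle> = Rc::new(RefCell::new(Rectangle { width: 30, height: 50 }));
    // A second handle to the same value, with no lifetimes to annotate.
    let alias = Rc::clone(&rect1);
    widen(&alias, 10);
    println!("The area is {}.", area(&rect1)); // prints "The area is 2000."
}
```

All the `Rc`, `RefCell`, `borrow` and `clone` bookkeeping above is what the RustGC version of the program would leave implicit.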

The best part is that all three languages share pretty much the same syntax, and they are built so that calling from a higher-level variant into a lower-level one is effortless. This gives us the ability to use the rich Rust ecosystem from a level 2/3 or even level 4 language.
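
As a rough illustration of what such a cross-level call involves, the sketch below models a dynamically typed RustScript value in plain Rust and converts it into the statically typed `Rectangle` before calling `area`. The `Value` enum and the `field_u32` helper are assumptions made up for this example, not part of any existing design; the point is only that the type check moves to the boundary between the levels.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

// Hypothetical stand-in for a dynamically typed RustScript value.
enum Value {
    Int(i64),
    Record(HashMap<String, Value>),
}

// The statically typed side, as in the Rust example.
struct Rectangle {
    width: u32,
    height: u32,
}

fn area(rectangle: &Rectangle) -> u32 {
    rectangle.width * rectangle.height
}

// Reads one record field as a u32, or explains why it cannot.
fn field_u32(record: &HashMap<String, Value>, name: &str) -> Result<u32, String> {
    match record.get(name) {
        Some(Value::Int(n)) => {
            u32::try_from(*n).map_err(|_| format!("field `{name}` is out of range"))
        }
        Some(_) => Err(format!("field `{name}` is not an integer")),
        None => Err(format!("missing field `{name}`")),
    }
}

// The runtime check that a language set would generate at the boundary.
impl TryFrom<&Value> for Rectangle {
    type Error = String;

    fn try_from(value: &Value) -> Result<Self, Self::Error> {
        match value {
            Value::Record(fields) => Ok(Rectangle {
                width: field_u32(fields, "width")?,
                height: field_u32(fields, "height")?,
            }),
            _ => Err("expected a record".to_string()),
        }
    }
}

fn main() {
    // Roughly what `{ width: 30, height: 50 }` would be at runtime.
    let rect1 = Value::Record(HashMap::from([
        ("width".to_string(), Value::Int(30)),
        ("height".to_string(), Value::Int(50)),
    ]));
    match Rectangle::try_from(&rect1) {
        Ok(rectangle) => println!("The area is {}.", area(&rectangle)),
        Err(message) => println!("type error at the boundary: {message}"),
    }
}
```

In a real language set this conversion would be generated by the compiler rather than written by hand, which is what would make the call feel effortless.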

More examples. UI component in RustScript:

fn app() {
  let (state, setState) = useState({
    total: None,
    next: None,
    operation: None,
  });

  let handleClick = |buttonName| => {
    setState(|state| => calculate(state, buttonName));
  };
  
  let value = state.next.or(state.total).unwrap_or("0");
  <div className="component-app">
    <Display value={value} />
    <ButtonPanel clickHandler={handleClick} />
  </div>
}

Async example in RustGC:

async fn main() {
  let user_ids = vec![1, 2, 3];
  let user_names = user_ids.iter().map_async(
    async |id| => fetch_user_name(id).await,
  ).await;
  println!("{}", user_names.join(", "));
}

async fn fetch_user_name(_: int) -> Future<string> {
  // This could be a database request.
  ""
}

thor314 commented Apr 8, 2022

Reminds me of the homoiconicity arguments for Lisps. It would be great if this existed! The rub, as it seems likely to me, is that if the flagship language for a language set is at one level, languages at other levels would inevitably get less attention and be less well maintained than their same-level alternatives.

tiye commented Apr 8, 2022

While this is an interesting idea, I think it will definitely complicate the compiler/interpreters and even IDEs. I don't hold an opinion on this. But it also reminds me of some history in the CoffeeScript community of sharing the same syntax for multiple targets:

https://coffeescript.org/ which compiles to JavaScript
https://moonscript.org/ which compiles to Lua
https://github.com/runekaagaard/snowscript which compiles to PHP
https://nim-lang.org/ which compiles to C (also with less syntax from CoffeeScript)
or maybe https://github.com/tcr/coffeescript-to-java , https://github.com/DisownedWheat/gopherscript ...

kaisadilla commented Feb 23, 2025

Honestly, I don't know what to think. I've had this idea myself in the past: when you are looking at languages like C#, it's easy to think "if you remove the GC and introduce pointers, you basically have the raw power of C. If you add some extra features that don't care about performance, you have the flexibility and ease of TypeScript, all in one single language!".

Truth is, on one hand, I really don't care much that I'm switching between C, C# and TypeScript (aside from the fact that C is a prehistoric language with inexcusable features like import being a copy paste, but that's C's fault). On the other hand, it is true that it simply isn't that hard to build a low-level language and create two more by adding features to it, and that would allow devs to share a lot more knowledge on how to write good code among all their languages.

All of this said, my final choice is that I don't like the idea for one reason: it would significantly gatekeep new languages from gaining adoption, as they'd be expected to be 3 different languages. New languages coming in and replacing older ones is something good, as newer languages are built with new knowledge that wasn't there when older languages were designed, and can introduce new features that may increase the quality of the code developers write; so I think anything that reduces the number of new languages we get does more harm than good.

Also, I don't think dynamically typed languages are required in this, not even for prototyping. A good syntax that tries to eliminate type annotations when they are redundant is enough to greatly minimize the amount of time you need to spend writing types, and considering your tier 3 language should support dynamic types, good old "any" would suffice as a way to save on painful types when prototyping.

TJSomething commented Mar 17, 2025

This sounds a lot like some of the original goals of Perl, that intentionally made it so that there were multiple ways to do things based on context.

Alternately, I'm also reminded of Hedy, which has 17 different syntaxes, building lessons up to level 18, which is just Python.

I think the logical way to implement this is that the higher levels are implemented as sugar. RustScript implements everything as a RustObject (like how a PyObject works). RustGC implements traits and adds reference wrappers to handle the bookkeeping of garbage collection.

harismh commented Mar 17, 2025

A language that can be interpreted during development for fast iteration cycle, but compiled for better performance for deployment. There isn’t such a language popular today though.

For me, Common Lisp hits that middle-ground well. It offers fast prototyping via REPL explorations and inferred types, but gets JIT compiled later for respectable performance (1-2 orders of magnitude slower than C). But Lisp does bring some baggage when it comes to practicality, which is why I feel Clojure gained popularity by piggy-backing off a mainstream language, Java. Jank holds some promise here for the "ultimate" business-logic language by combining Lisp and C++. Another route, I guess, is writing exclusively in some declarative language with hints that can be passed through some logic-based LLM/tokenizer which will convert the intent of the program into efficient low-level code.

mpw commented Mar 19, 2025

Hi, interesting writeup!

The idea of a language that straddles multiple levels has been around for some time, though the definition of what exactly the levels are varies. And yes, Common LISP certainly ticks some of those boxes.

You also mis-classified one of the languages: Objective-C. As Objective-C is a true super-set of C, it includes all of C and thus fairly obviously also encompasses Level 1. And this isn't just a nit-pick, but I think it shows that what you want already exists, to some extent.

Anyway, in addition to being Level 1 and Level 2, the "Objective" part of Objective-C is dynamically typed, with optional static typing, so it also fits in the slightly different category of "dynamically typed, compiled".

And of course interpretation vs. compilation is really more of an implementation decision than an inherent language characteristic, though today we mostly seem to stick to one or the other (with some good reasons).

However, Objective-C actually had a fully interpreted dialect called WebScript. It was almost entirely syntax-compatible with Objective-C, and could have been made so without too much effort.

This unusual range largely explains why Objective-C is so popular with some developers, and also why it's not so popular with others. The range is great, the individual components are so-so. So if you really just want a level 2 language and only a level 2, you will be disappointed, and having the other levels available is either something you don't notice or, worse, something that gets in your way.

I talk about this a bit in The 4 stages of Objective-S. Objective-S is designed to go all the way from Level 1 to higher than Level 4, though not necessarily with all the levels in-between.

The fact that a lot, if not most of our code is just glue code is one of the main motivations for Objective-S: Glue: the Dark Matter of Software. The approach is to (a) give equal support to other common connector types (dataflow/streaming, data-access/in-process REST, events, in addition to function/method/procedure call) and (b) to easily support variations of these connectors.

Allowing easy calling between different language levels the way you describe here falls into category (b): it means supporting multiple variations of the "procedure/method/function call" connector.