Source

From the README

Elevator pitches (and anti-pitches)

noulith> 1 to 10 filter even map (3*)
[6, 12, 18, 24, 30]
noulith> f := \-> 2 + 5 * 3
noulith> f()
17
noulith> swap +, *
noulith> f() # (2 times 5) plus 3
13
noulith> swap +["precedence"], *["precedence"]
noulith> f() # 2 times (5 plus 3)
16
noulith> swap +, *
noulith> f() # (2 plus 5) times 3
21

Imagine all the operator parsing code you won't need to write. When you need like arbitrarily many levels of operator precedence, and are happy to eval inputs.
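To make the swap session above a bit more concrete, here is a minimal Python sketch of an evaluator that looks up each operator's function and precedence only when the expression runs. This is not how Noulith is implemented; the Op class, the env table, the evaluate function, and the numeric precedence values are all hypothetical names and numbers invented for this illustration.

```python
import operator


class Op:
    """A function bundled with a mutable precedence (higher binds tighter)."""

    def __init__(self, fn, precedence):
        self.fn = fn
        self.precedence = precedence


def evaluate(tokens, env):
    """Evaluate a flat token list such as [2, '+', 5, '*', 3].

    Operators are resolved through env during evaluation, so rebinding a
    name or changing a precedence changes the result of the same tokens.
    Operators are assumed to be left-associative (an assumption of this sketch).
    """

    def parse(pos, min_prec):
        lhs = tokens[pos]
        pos += 1
        while pos < len(tokens):
            op = env[tokens[pos]]
            if op.precedence < min_prec:
                break
            rhs, pos = parse(pos + 1, op.precedence + 1)
            lhs = op.fn(lhs, rhs)
        return lhs, pos

    return parse(0, 0)[0]


env = {"+": Op(operator.add, 1), "*": Op(operator.mul, 2)}
expr = [2, "+", 5, "*", 3]
results = [evaluate(expr, env)]                      # 2 + (5 * 3)

env["+"], env["*"] = env["*"], env["+"]              # swap +, *
results.append(evaluate(expr, env))                  # (2 times 5) plus 3

env["+"].precedence, env["*"].precedence = (         # swap the precedences
    env["*"].precedence,
    env["+"].precedence,
)
results.append(evaluate(expr, env))                  # 2 times (5 plus 3)

env["+"], env["*"] = env["*"], env["+"]              # swap +, * back
results.append(evaluate(expr, env))                  # (2 plus 5) times 3

print(results)  # [17, 13, 16, 21]
```

The four results match the REPL session because the precedence travels with the function object rather than with the name.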

How do you run this thing?

It's a standard Rust project, so, in brief:

Running it with cargo run will drop you into a REPL, or you can pass a filename to run it. If you just want to build an executable so you can alias it or add it to $PATH, run cargo build --release --features cli,request,crypto and look inside target/release.

None of the command-line options to cargo run or cargo build are required; they just give you better run-time performance and features for a slower compile time and larger binary size. (Without --release, stack frames are so large that one of the tests overflows the stack...)

Features (and anti-features) (and claims that will become false as I keep hacking on this)

Example

Somewhat imperative:

for (x <- 1 to 100) (
  o := '';
  for (f, s <- [[3, 'Fizz'], [5, 'Buzz']])
    if (x % f == 0)
      o $= s;
  print(if (o == '') x else o)
)

Somewhat functional:

for (x <- 1 to 100) print([[3, 'Fizz'], [5, 'Buzz']] map (\(f, s) -> if (x % f == 0) s else "") join "" or x)

More in-depth tour

NOTE: I will probably keep changing the language and may not keep all this totally up to date.

Numbers, arithmetic operators, and comparisons mostly work as you'd expect, including C-style bitwise operators, except that:

Tighter ^ << >>
        * / % &
        + - ~
        |
Looser  == != < > <= >=

We support arbitrary radixes up to 36 with syntax 36r1000 == 36^3, plus specifically the slightly weird base-64 64rBAAA == 64^3 (because in base-64 A is 0, B is 1, etc.)
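As a sanity check on the two identities above, here is a small Python sketch. Base 36 is handled by Python's built-in int. For base 64 the text only says that A is 0 and B is 1; the sketch assumes the remaining digits follow the standard Base64 alphabet, and parse_base64_number is a hypothetical helper written for this example.

```python
import string

# Python's int() accepts bases up to 36, matching the 36r1000 literal.
base36_value = int("1000", 36)

# Assumption: digits follow the standard Base64 alphabet (A-Z, a-z, 0-9, +, /).
BASE64_DIGITS = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def parse_base64_number(text):
    """Interpret text as a base-64 number where A is 0, B is 1, and so on."""
    value = 0
    for ch in text:
        value = value * 64 + BASE64_DIGITS.index(ch)
    return value


print(base36_value == 36 ** 3)                  # True
print(parse_base64_number("BAAA") == 64 ** 3)   # True
```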

Like in Python and mathematics, comparison operators can be chained like 1 < 2 < 3; we explain how this works later. We also have min, max, and the three-valued comparison operator <=> and its reverse >=<.

End-of-line comments: # (not immediately followed by (). Range comments: #( ... ). Those count parentheses so can be nested.

Strings: " or '. (What will we use the backtick for one day, I wonder.) Also like in Python, we don't have a separate character type; iterating over a string just gives single-character strings.

Data types:

Expressions

Everything is a global function and can be used as an operator! For example a + b is really just +(a, b); a max b is max(a, b). As a special case, a b (when fenced by other syntax that prevents treating either as a binary operator) is a(b) (this is mainly to allow unary minus), but four or more evenly-many identifiers and similar things in a row, like (a b c d), are illegal. (Also, beware that a[b] parses as indexing b into a, not a([b]) like you might sometimes hope if you start relying on this too much.) Also:

(Sort of considering removing some of the partial application stuff now that _s work... hmm...)

Operator precedence is determined at runtime! This is mainly to support chained comparisons: 1 < 2 < 3 works like in Python. Functions can decide at runtime when they chain (though there's no way for user-defined functions to do this yet), and we use this to make a few other functions nicer. For example, zip and ** (cartesian product) chain with themselves; a ** b ** c and a zip b zip c will give you a list of triplets, instead of a bunch of [[x, y], z]-shaped things.
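For readers coming from Python, the difference chaining makes can be sketched with Python's own built-ins. This is only an analogy: Python's zip is an ordinary n-ary function, not a chaining operator, and the variable names are made up for the example.

```python
a, b, c = [1, 2], ["x", "y"], [True, False]

# Without chaining, applying a binary zip twice nests the first pair.
nested = list(zip(zip(a, b), c))

# With chaining, all three operands reach a single n-ary call.
triplets = list(zip(a, b, c))

# Comparison chaining: 3 > 2 > 1 means (3 > 2) and (2 > 1) ...
chained = 3 > 2 > 1
# ... whereas forcing plain left-to-right evaluation compares True with 1.
unchained = (3 > 2) > 1

print(nested)     # [((1, 'x'), True), ((2, 'y'), False)]
print(triplets)   # [(1, 'x', True), (2, 'y', False)]
print(chained, unchained)  # True False
```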

Identifiers can consist of a letter or _ followed by any number of alphanumerics, ', or ?; or any run of consecutive valid operator symbols, including ?. (So e.g. a*-1 won't work because *- will be parsed as a single token. a* -1 won't work either, but for a different reason: it parses like it begins with calling * with a and - as arguments. a*(-1) or a* -(1) would work.) Compared to similar languages, note that : is not a legal character to use in operators, while $ is. In addition, a bunch of keywords are forbidden, as are all single uppercase letters and tokens beginning with a single uppercase letter immediately followed by a single quote (though these are just reserved and the language doesn't recognize all of them yet), and the tokens =, !, ..., <-, ->, and <<-. Also, with the exception of == != <= and >=, operators ending in = will be parsed as the operator followed by an =, so in general operators cannot end with =.

Almost all builtin functions' precedences are determined by this Scala-inspired rule: Look up each character in the function's name in this table, then take the loosest precedence of any individual character. But note that this isn't a rule in the syntax, it's just a strategy I decided to follow when selecting builtin functions' precedences. For example, +, ++, .+, and +. all have the same precedence. As of time of writing, the only exceptions to this rule are << and >>, which have precedence like ^.

Tighter . (every other symbol, mainly @ which I haven't allocated yet)
        !?
        ^
        */%&
        +-~
        |
        $
        =<>
Looser  (alphanumerics)
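The rule and table above can be sketched in Python roughly as follows. This is an illustration of the naming strategy only, not code from the interpreter; SYMBOL_ROWS, char_level, name_level, and the numeric levels (0 for tightest, larger for looser) are hypothetical names and encodings chosen for the example.

```python
# Rows of the table above between the two extremes, tightest first.
SYMBOL_ROWS = ["!?", "^", "*/%&", "+-~", "|", "$", "=<>"]
TIGHTEST = 0                      # "." and every other symbol
LOOSEST = len(SYMBOL_ROWS) + 1    # alphanumerics

# The two documented exceptions: << and >> bind like ^.
EXCEPTIONS = {"<<": "^", ">>": "^"}


def char_level(ch):
    """Return the precedence level of one character (larger is looser)."""
    if ch.isalnum():
        return LOOSEST
    for index, row in enumerate(SYMBOL_ROWS, start=1):
        if ch in row:
            return index
    return TIGHTEST


def name_level(name):
    """A name takes the loosest level found among its characters."""
    if name in EXCEPTIONS:
        return name_level(EXCEPTIONS[name])
    return max(char_level(ch) for ch in name)


for name in ["+", "++", ".+", "+.", "<<", "^", "<=>", "max"]:
    print(name, name_level(name))
```

Here +, ++, .+, and +. all land on the same level, << lands on the level of ^ because of the exception table, and any alphanumeric name such as max ends up loosest.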

. is not special syntax, it's actually just an operator that does tightly-binding reverse function application! a.b = b(a). then is loosely-binding reverse function application.

! is syntax that's spiritually sort of like what Haskell's $ lets you write. It's as tight as an opening parenthesis on its left, but performs a function call that lets you omit the closing one up to the next semicolon or so. f! a, b is f(a, b).

_ is special; assigning to it discards (but type checks still happen; see below). Some expressions produce Scala-style anonymous functions, e.g. 1 < _ < 3, [_, 2], _[3]. I might implement more later.

Types double as conversion functions: str(3) int(3) dict([[1, 2], [3, 4]]) etc. Bending internal consistency for pure syntax sweetness, to is overloaded to take a type as its second argument to call the same conversion. Test types explicitly with is: 3 is int, int is type. The type of null is nulltype. Strings are str and functions are func. The "top" type is anything.

We got eval, a dumb dynamic guy; vars for examining local variables; assert, which is currently a silly function and will probably become a keyword so it can inspect the expression being asserted.

freeze is a wonky keyword that goes through an expression and eagerly resolves each free variable to what it points to outside. It can slightly optimize some functions, surface some name errors earlier, and more elegantly(??) handle some binding acrobatics that you might have to write IIFEs for in other languages.

The import statement takes a filename and approximately just parses it and splices it in where written, sort of like how C/C++'s #include works. This is an awful hack and might be fixed one day.

Variables and assignments

Declare with :=, assign with =. (Statements must be separated by semicolons.)

x := 0; x = 1

Actually := is just a declaration with an empty type. You can declare typed variables like:

x : int = 3

Pythonically, sequences can be unpacked with commas, including a single trailing comma for a single-element unpack. Type annotations are looser than commas, so below, x and y are both ints. Prefix ... to pack/unpack multiple things, and likewise in function calls.

x, y : int

You can declare in an assignment with a parenthesized annotation.

a := 0
a, (c:) = 1, 2
a, (d:int) = 3, 4

These are checked at runtime! Assigning non-ints to x will throw an error. Hopefully. This is useful in other scenarios.

You can also do operator-assignments like you'd expect, with any operator! a f= b is basically just a = f(a, b). Note that the left side is parsed just like a call a(f), so the operator can even be parenthesized: after

x := [1, 2]; x (zip+)= [3, 4]

x is [4, 6]. In particular when you want to write a = f(a) you can just write a .= f because . is function application.

One corner case in the semantics here: While the operator is being called, the LHS variable will be null. That is, the following code will print null:

x := 0
f := \a -> (print x; a)
x .= f

This allows us to not have to keep an extra copy of the LHS variable in common cases where we "modify" it, so code like x append= y is actually efficient (see discussion of immutability below).

The weird keyword every lets you assign to or operate on multiple variables or elements of a slice at once. It doesn't work with operator-assignments, though it might in the future. The following initializes three variables to 1:

every a, b, c := 1

After the following, x == [0, 0, 1, 1, 0]:

x := [0] ** 5; every x[2:4] = 1

Important note about assignment: All data structures are immutable. When we mutate indexes, we make a fresh copy to mutate if anything else points to the same data structure. So for example, after

x := [1, 2, 3];
y := x;
x[0] = 4

y will still be [1, 2, 3]. You may wish to think of x[0] = 4 as syntax sugar for x = [4] ++ x[1:], although when nothing else refers to the same list, it's actually as fast as a mutation.
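The "syntax sugar" reading can be mimicked in Python with tuples, which are immutable. This is only a model of the observable behaviour described above, under the assumption that a tuple is a fair stand-in for an immutable list; it says nothing about the in-place optimization.

```python
x = (1, 2, 3)
y = x                 # y refers to the same immutable value

# Model of x[0] = 4: rebind x to a new value, like x = [4] ++ x[1:].
x = (4,) + x[1:]

print(x)  # (4, 2, 3)
print(y)  # (1, 2, 3)
```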

As a consequence, calling a function on a data structure cannot mutate it. There are a few special keywords that mutate whatever they're given. There's swap like swap x, y for swapping two values; there's pop and remove for mutating sequences; and the crudest instrument of all, consume gives you the value after replacing it in where it came from with null. After

x := [1, 2, 3, 4, 5];
y := pop x;
z := remove x[0]

y will be 5, z will be 1, and x will be [2, 3, 4]. There's no way to implement pop as a function yourself; the best you could do is take a list and separately return the last element and everything before it.
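The "best you could do" fallback mentioned above, a plain function that returns both the extracted element and the remaining sequence, might look like this in Python. pop_pure and remove_pure are hypothetical names for this sketch, and the caller has to rebind the variable itself.

```python
def pop_pure(xs):
    """Return (last element, everything before it) without mutating xs."""
    return xs[-1], xs[:-1]


def remove_pure(xs, index):
    """Return (element at index, xs without that element) without mutating xs."""
    return xs[index], xs[:index] + xs[index + 1:]


x = [1, 2, 3, 4, 5]
y, x = pop_pure(x)         # caller rebinds x explicitly
z, x = remove_pure(x, 0)

print(y, z, x)  # 5 1 [2, 3, 4]
```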

You can implement your own "mutable data cells" easily (?) with a closure:

make_cell := \init -> (x := init; [\ -> x, \y -> (x = y)])
get_a, set_a := make_cell(0)

Control Flow

As above: statements must be separated by semicolons.

Everything is an expression, so the "ternary expression" and if/else statement are one and the same: if (a) b else c. Loops: for (var <- list) body; while (cond) body. For loops can have many iteration clauses: for (a <- b; c <- d). Several other clauses are supported: for (p <<- m) iterates over index-value or key-value pairs, for (x := y) declares a variable in the middle, and for (if x) is a guard. Finally for loops can yield (only the entire body, not inside a more complicated expression) to turn into a list comprehension, like Scala: for (x <- xs) yield x + 1.

There are no "blocks"; just use more parentheses: if (a) (b; c; d).

We have short-circuiting, quite-low-precedence and and or. We also have coalesce, which is similar to or, but it only takes its RHS if its LHS is precisely null, not other falsy things. Note not is just a normal function.
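The difference between or and coalesce can be sketched in Python, with None standing in for null. The coalesce function here is a hypothetical helper for the illustration; unlike a short-circuiting operator, a plain Python function always evaluates both arguments.

```python
def coalesce(lhs, rhs):
    """Return rhs only when lhs is exactly None; other falsy values are kept."""
    return rhs if lhs is None else lhs


print(0 or 5)               # 5, because 0 is falsy
print(coalesce(0, 5))       # 0, because 0 is not None
print(coalesce(None, 5))    # 5
print(coalesce("", "dflt") == "")  # True
```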

Switch:

switch (x)
case 1 -> b
case 2 -> d

Run-time type checking does some work here:

switch (x)
case _: int -> print("it's an int")
case _ -> print("not sure")

Stupid complicated runtime types with satisfying:

switch (x)
case _: satisfying! 1 < _ < 9 -> print("it's between 1 and 9")
case _ -> print("not sure")

Don't do weird things in the argument to satisfying, it's illegal. (Also, you can actually just write the following instead, because the comparison operators < have yet another layer of magic: 1 < _ < 9 is not a lambda here; you could have actually replaced _ with a named variable to bind it.)

switch (x)
case 1 < _ < 9 -> print("it's between 1 and 9")
case _ -> print("not sure")

Try-catch: try a catch x -> y.

break continue return work.

Only lambdas exist; declare all functions this way: \a, b -> c. You can annotate parameters and otherwise pattern match in functions as you'd expect: \a: int, (b, c) -> d.

Structs

Super bare-bones product types right now. No methods or namespaces or anything. (Haskell survived without them for a few decades, we can procrastinate.) You can't even give fields types or default values.

struct Foo (bar, baz);

Then you can construct an all-null instance Foo() or all values with Foo(a, b). bar and baz are now member access functions, or if you have a foo of type Foo, you can access, assign, or modify the fields as foo[bar] and foo[baz]. To be clear, these names really are not namespaced at all; bar and baz are new variables holding functions in whatever scope you declared this struct in, and can be passed around as functions in their own right, assigned to variables, etc. (but won't work on any other struct).

Sequence operators

len for length.

Most operators for working with lists/dictionaries/other sequences are two characters, doubled or involving a . on the side of an individual element:

Some functions to make streams: repeat cycle permutations combinations subsequences

start iterate func swallows, plus you can cause weird borrow errors if the function is weird. Don't do this:

x := iterate! 0, \t -> x const t
x[0] = 0

Functional programming

All the usuals and some weird ones: each, map, flat_map, filter, reject, any, all, find/find?, locate/locate? (finds the index of something), count, take, drop, zip, sort, group. These take the function as the second argument / on the right! Also they're eager!

zip, group, window have overloads that don't take functions.

zip is n-ary and can take a function to zip with too (which gets all arguments); you can also use with. merge is similar but for like-keyed entries of dictionaries. ziplongest is like zip, but, well, the longest; and when there's a function it's used to reduce all the remaining elements, two at a time, instead of called with all of them at once.

fold/reduce (which are the same) require a nonempty sequence with two arguments, but also chain with an optional from starting value, e.g. x fold * from 1.
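Python's functools.reduce has the same shape, so it makes a reasonable analogy: without a starting value it needs a nonempty sequence, and the optional third argument plays the role of from. The variable names are made up for the example.

```python
from functools import reduce
import operator

xs = [2, 3, 4]

product = reduce(operator.mul, xs)           # like xs fold *
product_from = reduce(operator.mul, xs, 1)   # like xs fold * from 1
empty_from = reduce(operator.mul, [], 1)     # a start value makes empty input fine

try:
    reduce(operator.mul, [])                 # no start value, empty input
    empty_failed = False
except TypeError:
    empty_failed = True

print(product, product_from, empty_from, empty_failed)  # 24 24 1 True
```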

sort takes a three-valued comparator, which you can get by <=> on some key function. Or >=< for backwards. Sorry, no built-in Schwartzian transform yet.

[[1], [2, 3, 4], [5, 6]] sort_by (<=> on len)
\1: [[1], [5, 6], [2, 3, 4]]
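A rough Python equivalent of sorting with a three-valued comparator built from a key function is shown below. The spaceship and on helpers are hypothetical stand-ins for <=> and on, modeled on how Haskell's on combinator behaves; whether Noulith's on works exactly this way is an assumption.

```python
from functools import cmp_to_key


def spaceship(a, b):
    """Three-valued comparison: -1, 0, or 1."""
    return (a > b) - (a < b)


def on(compare, key):
    """Build a comparator that compares key(a) with key(b)."""
    return lambda a, b: compare(key(a), key(b))


data = [[1], [2, 3, 4], [5, 6]]

by_len = sorted(data, key=cmp_to_key(on(spaceship, len)))

# Reversed comparison, in the spirit of >=<.
by_len_desc = sorted(data, key=cmp_to_key(on(lambda a, b: spaceship(b, a), len)))

print(by_len)       # [[1], [5, 6], [2, 3, 4]]
print(by_len_desc)  # [[2, 3, 4], [5, 6], [1]]
```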

Other goodies: id, const (returns its second argument!), flip. Some Haskell-Arrow-esque operators exist: &&&, ***, >>>, <<<. The first two are n-ary like zip.

I/O and interfacing with the world

If compiled with request:


Tags: language   dynamic  

Last modified 16 December 2024