The Moses language

This document gives an overview of the Moses language: its features, how to run Moses programs, and the structure of its implementation.

The Moses Language

Moses is a simple functional language we have created for CS 131. It is similar to Haskell (it is called “Moses” after Moses Schönfinkel, inventor of schönfinkelization, much as Haskell is named after Haskell Curry, inventor of currying). It closely matches the syntax we have seen in examples in class. Like Haskell, we can see Moses as a more useable incarnation of the $\lambda$-calculus.

Features of Moses include:

  • An expressive functional language, with familiar features such as $\lambda$-expressions, let expressions, binary operators, etc.

  • A type system that supports subtyping, as well as tuples and lists.

Some of the most salient limitations of Moses compared to Haskell include:

  • Function arguments must be be annotated with their types. (Whereas Haskell infers the types of function arguments so we don’t have to state them explicitly.)

  • If we want to write recursive definitions, we have to use letrec instead of let.

  • Very limited pattern matching. We can break apart tuples in a let expression, but nested tuples will require nested lets. Similarly, case expressions are very restricted, working only on lists and only allowing you to write code to handle empty lists and lists that have a head and a tail, nothing more complex.

  • A quite different type system. Haskell allows type variables representing unknowns in types (e.g., id :: a -> a), whereas Moses does not.

  • Whitespace-independent syntax that is more restrictive than Haskell’s. You cannot give multiple definitions inside a let or letrec, instead you must nest them.

  • Moses uses different operators for list append and string concatenation. Moses uses @ for list append (compared to ++ in Haskell), in Moses ++ is always string concatenation. (Strings in Moses are not lists of characters.)

More examples of Moses code can be found in the homework directory.

Running Moses Programs

The easiest way to get familiar with Moses is to run it. On the HMC CS machines (e.g., knuth or the lab Macs), you can run:

  /cs/cs131/bin/Moses -i

to start up a REPL for Moses.

When you have the code (and when you’re working on the assignment), you should compile a command-line version of your implementation by running

  ghc -O Moses.hs

then start a Moses REPL:

  ./Moses -i

From here on out, we’ll assume that you’re running your own implementation of Moses (not the one in /cs/cs131/bin/).

There are a couple of other ways to run Moses that you might want to use:

  • Compile and run a file stored in a program:

      ./Moses «path to file»
    
  • Run an expression and see the result:

      ./Moses -e "42+3"
    

You can also run :load Moses.hs in ghci, then use the run or runFile functions to execute code.

Moses output

The Moses system prints a lot of information about the steps of execution. It first shows the parsed input (pretty printed), then it typechecks the code, then it converts all the nice constructs to vanilla lambda calculus, converts that to SK combinators, and then executes the program, and prints the result. Because it shows all the steps, you probably want to make sure you are using a terminal that lets you scroll back.

One interesting feature of the Moses system is that it uses the type of the expression to know how to print it out. It actually synthesizes printing code based on the type, thus an integer list (i.e., [INT]) is actually printed using a list printing higher-order function that is passed an int printer. Unfortunately, until you implement typechecking, your version of Moses won’t print any useful output on the command line for all but the very simplest programs. Instead, the typechecker gives almost everything the type ANY, which is always printed out as <??-who-knows-what-??>.1

If you run things in ghci, you’ll find that programs that produce simple values (such as integers) are visible in the returned value from the run or runfil functions, allowing us to “cheat” and see the results even without type checking. Thus, fact.mose will produce its answer. More complex types, such as lists, are not simple values and so you won’t be able to see their output printed until the type checker works.

If you want to see the full power of the printing code before the typechecker is complete, there are three special built-in values in Moses example1, example2, and example3, of type [REAL], ([INT], (REAL,BOOL), [[STRING]]), and [INT] respectively. The synthesized printing code for example2 is quite complex (it’s quite a complex type!), but is generated following fairly simple rules. Be aware that example3 is an infinite list of the natural numbers—it will never finish printing.

Structure of the Moses Language Implementation

Many languages have a rich high-level syntax that people write programs in, then the language translates (i.e., compiles) the program to a lower-level language that is actually executed (e.g., in C and C++, the low-level language is typically machine code). As you saw in the previous section, Moses follows this pattern: the higher-level syntax is a richly typed language, whereas the lower-level language is far more basic untyped language.

In our Moses implementation,

  • The untyped core is a simple combinator-based language. In addition to the combinators S, K and I, it does have a few built-in types and functions supporting integers, reals, and strings. Importantly, it does not and will not directly support booleans, lists, or pairs. (This kind of approach is a common one; for example, when C++ is compiled to X86 machine code, that code has no idea what C++ classes are.)

    Similarly, the core does not directly support recursion (it uses the Y combinator instead). It also has no “operators” and no ifthenelse, instead providing functions, such as _plus_ and _cons_ to serve the same purpose.2

  • The typed language provides lists, tuples, and a static type system, as well as niceties such as variables, let expressions, and so forth (the parser for the typed language also provides the familiar square-bracket list notation as syntactic sugar for colon notation as well as a few other niceties not actually present in the typed abstract syntax).

All programs in the typed language are translated into programs in the untyped core language. The only evaluator is the untyped-core-language evaluator.

The typed high-level language supports additional kinds of values beyond those directly supported by the low-level language—booleans, lists, and tuples are provided using the function-based (i.e., $\lambda$-calculus style) encodings we saw earlier in the semester. The untyped core language doesn’t know what these encodings are being used for, but the typed language does, because it has type information about the program.

Some of Moses is written in itself. For example, the @ operator is turned into a call to a function _append_ that appends two lists. There is nothing especially difficult about appending two lists, and so the code for _append_ is written in Moses. Other languages do similar things—much of Haskell is is written in Haskell (including the list append function), and similarly C and C++ have standard libraries written in C and C++. The file prelude.mosedefs includes definitions for both private internal functions and publicly available functions, all written in Moses code.

Phases of Execution

When Moses runs a program from a file (such as when you type runfile "tests/okay-value1.mose" or execute run "let x = 20 in x+x"), the following stages occur:

  1. The system first loads code from prelude.mosedefs to provide the necessary encodings of booleans, lists, and pairs, as well as a variety of other support functions that are written in Moses itself. Some of the functions (e.g., _cons_ for lists) are internal functions (in code, users write x:y not _cons_ x y), whereas others are intended to be visible and usable. For this latter group of functions, the file provides type declarations describing the functions. No actual type checking happens here at all, by design. Because we say what the types are, we can define a function, and then say “Don’t see this as a function, see it as a BOOL.” If course, this does assume that when we tell Moses what the types are, we’re saying something meaningful.

  2. The parser for the typed language reads in an expression from the desired file (such as okay-value1.mose). Any parse errors will abort with an error.

  1. The expression is typechecked, by typeofExpr from Typecheck.hs. If typechecking is successful, we know the type of the expression. Any type-checking errors will abort with an error.

  2. The expression is translated into an expression in the untyped lower-level core language by code in Transform.hs and SKRun.hs.

  3. The type of the expression is used to generate code to print a value of that type. This printing functionality is necessary because neither the core-language evaluator, nor Haskell know what our encodings mean. Something as simple as a pair of booleans will be encoded as a rather confusing-looking function (such as λ a.a (λ b.λ c.c) (λ d.λ e.d){language:moses}), but we know that in this case, these functions represent a pair of booleans, and thus how to make a function that will print it so that it looks like a pair.3

  4. The translated expression is evaluated by the core-language evaluator, resulting in a final value.

  5. An expression is created that applies the “printer” code to the value from the previous step. This expression is evaluated by the core-language evaluator, thereby printing program’s result in human-readable form. (The potentially non–human-readable SKAbsyn.Value that represents the program’s result is returned by run and runfile).

That’s the theory, anyway….

In practice, in the code that we give you, the typechecker is unfinished and so only typechecks very simple expressions. The biggest effect of this problem is the lack of a correct type for most kinds of expression, which means no printer for that kind of expression. The type checker operates correctly on okay-value1.mose through okay-value4.mose, and error-var1.mose. For the others, the lack of good typechecking results in late catching of errors and no good print function.4

  1. Because Moses is lazy, and because there is nothing useful that can be done with an ANY value, the command-line version of Moses will never do any computation on anything that types as ANY. This property isn’t always true when using run or runfile in ghci, because they return the value produced by evaluation. 

  2. The names have underscores to reduce the risk that a programmer might inadvertently reuse the name of a vital function. 

  3. When we know that this function is encoding a pair of booleans, we can determine that it holds the pair (false, true), but the same representation could also encode other values for other types, such as the pair of (false, [])—without type information, we have no hope of meaningfully printing data encoded inside an arbitrary function. 

  4. In some cases, you can cheat and read the result of evaluation from the untyped value that runfile returns to the Haskell interactive loop, but try that with okay6.mose