The Moses language
This document gives an overview of the Moses language: its features, how to run
Moses programs, and the structure of its implementation.
The Moses Language
Moses is a simple functional language we have created for
CS 131. It is similar to Haskell (it is called “Moses” after
Moses Schönfinkel, inventor of schönfinkelization, much as Haskell is
named after Haskell Curry, inventor of currying). It closely matches the
syntax we have seen in examples in class. Like Haskell, we can see
Moses as a more useable incarnation of the
$\lambda$-calculus.
Features of Moses include:
-
An expressive functional language, with familiar features such as $\lambda$-expressions,
letexpressions, binary operators, etc. -
A type system that supports subtyping, as well as tuples and lists.
Some of the most salient limitations of Moses compared to
Haskell include:
-
Function arguments must be be annotated with their types. (Whereas Haskell infers the types of function arguments so we don’t have to state them explicitly.)
-
If we want to write recursive definitions, we have to use
letrecinstead oflet. -
Very limited pattern matching. We can break apart tuples in a
letexpression, but nested tuples will require nestedlets. Similarly,caseexpressions are very restricted, working only on lists and only allowing you to write code to handle empty lists and lists that have a head and a tail, nothing more complex. -
A quite different type system. Haskell allows type variables representing unknowns in types (e.g.,
id :: a -> a), whereasMosesdoes not. -
Whitespace-independent syntax that is more restrictive than Haskell’s. You cannot give multiple definitions inside a
letorletrec, instead you must nest them. -
Mosesuses different operators for list append and string concatenation.Mosesuses@for list append (compared to++in Haskell), inMoses++is always string concatenation. (Strings inMosesare not lists of characters.)
More examples of Moses code can be found in the homework
directory.
Running Moses Programs
The easiest way to get familiar with Moses is to run it. On the HMC CS
machines (e.g., knuth or the lab Macs), you can run:
/cs/cs131/bin/Moses -i
to start up a REPL for Moses.
When you have the code (and when you’re working on the assignment), you should compile a command-line version of your implementation by running
ghc -O Moses.hs
then start a Moses REPL:
./Moses -i
From here on out, we’ll assume that you’re running your own implementation of
Moses (not the one in /cs/cs131/bin/).
There are a couple of other ways to run Moses that you might want to use:
-
Compile and run a file stored in a program:
./Moses «path to file» -
Run an expression and see the result:
./Moses -e "42+3"
You can also run :load Moses.hs in ghci, then use the run or runFile
functions to execute code.
Moses output
The Moses system prints a lot of information about the
steps of execution. It first shows the parsed input (pretty printed),
then it typechecks the code, then it converts all the nice constructs to vanilla lambda
calculus, converts that to SK combinators, and then executes the
program, and prints the result. Because it shows all the steps, you
probably want to make sure you are using a terminal that lets you scroll
back.
One interesting feature of the Moses system is that it
uses the type of the expression to know how to print it out. It actually
synthesizes printing code based on the type, thus an integer list (i.e.,
[INT]) is actually printed using a list printing higher-order function that
is passed an int printer. Unfortunately, until you implement typechecking, your
version of Moses won’t print any useful output on the command line for all but the very simplest
programs. Instead, the typechecker gives almost everything the type ANY, which is
always printed out as <??-who-knows-what-??>.1
If you run things in ghci, you’ll find that programs that produce simple
values (such as integers) are visible in the returned value from the run or
runfil functions, allowing us to “cheat” and see the results even without type
checking. Thus, fact.mose will produce its answer. More complex types, such as
lists, are not simple values and so you won’t be able to see their output
printed until the type checker works.
If you want to see the full power of the printing code before the
typechecker is complete, there are three special built-in values in
Moses example1, example2, and example3, of type [REAL], ([INT],
(REAL,BOOL), [[STRING]]), and [INT] respectively. The
synthesized printing code for example2 is quite complex (it’s quite a complex
type!), but is generated following fairly simple rules. Be
aware that example3 is an infinite list of the natural numbers—it will never
finish printing.
Structure of the Moses Language Implementation
Many languages have a rich high-level syntax that people write programs in, then
the language translates (i.e., compiles) the program to a lower-level language that is
actually executed (e.g., in C and C++, the low-level language is typically
machine code). As you saw in the previous section, Moses follows this pattern:
the higher-level syntax is a richly typed language, whereas the lower-level
language is far more basic untyped language.
In our Moses implementation,
-
The untyped core is a simple combinator-based language. In addition to the combinators
S,KandI, it does have a few built-in types and functions supporting integers, reals, and strings. Importantly, it does not and will not directly support booleans, lists, or pairs. (This kind of approach is a common one; for example, when C++ is compiled to X86 machine code, that code has no idea what C++ classes are.)Similarly, the core does not directly support recursion (it uses the
Ycombinator instead). It also has no “operators” and noif…then…else, instead providing functions, such as_plus_and_cons_to serve the same purpose.2 -
The typed language provides lists, tuples, and a static type system, as well as niceties such as variables,
letexpressions, and so forth (the parser for the typed language also provides the familiar square-bracket list notation as syntactic sugar for colon notation as well as a few other niceties not actually present in the typed abstract syntax).
All programs in the typed language are translated into programs in the untyped core language. The only evaluator is the untyped-core-language evaluator.
The typed high-level language supports additional kinds of values beyond those directly supported by the low-level language—booleans, lists, and tuples are provided using the function-based (i.e., $\lambda$-calculus style) encodings we saw earlier in the semester. The untyped core language doesn’t know what these encodings are being used for, but the typed language does, because it has type information about the program.
Some of Moses is written in itself. For example, the @
operator is turned into a call to a function _append_ that appends two lists.
There is nothing especially difficult about appending two lists, and so
the code for _append_ is written in Moses. Other languages do
similar things—much of Haskell is is written in Haskell (including the
list append function), and similarly C and C++ have standard libraries
written in C and C++. The file prelude.mosedefs includes definitions for both private
internal functions and publicly available functions, all written in
Moses code.
Phases of Execution
When Moses runs a program from a file (such as when you
type runfile "tests/okay-value1.mose" or execute run "let x = 20 in x+x"),
the following stages occur:
-
The system first loads code from
prelude.mosedefsto provide the necessary encodings of booleans, lists, and pairs, as well as a variety of other support functions that are written inMosesitself. Some of the functions (e.g.,_cons_for lists) are internal functions (in code, users writex:ynot_cons_ x y), whereas others are intended to be visible and usable. For this latter group of functions, the file provides type declarations describing the functions. No actual type checking happens here at all, by design. Because we say what the types are, we can define a function, and then say “Don’t see this as a function, see it as a BOOL.” If course, this does assume that when we tellMoseswhat the types are, we’re saying something meaningful. -
The parser for the typed language reads in an expression from the desired file (such as
okay-value1.mose). Any parse errors will abort with an error.
-
The expression is typechecked, by
typeofExprfromTypecheck.hs. If typechecking is successful, we know the type of the expression. Any type-checking errors will abort with an error. -
The expression is translated into an expression in the untyped lower-level core language by code in
Transform.hsandSKRun.hs. -
The type of the expression is used to generate code to print a value of that type. This printing functionality is necessary because neither the core-language evaluator, nor Haskell know what our encodings mean. Something as simple as a pair of booleans will be encoded as a rather confusing-looking function (such as
λ a.a (λ b.λ c.c) (λ d.λ e.d){language:moses}), but we know that in this case, these functions represent a pair of booleans, and thus how to make a function that will print it so that it looks like a pair.3 -
The translated expression is evaluated by the core-language evaluator, resulting in a final value.
-
An expression is created that applies the “printer” code to the value from the previous step. This expression is evaluated by the core-language evaluator, thereby printing program’s result in human-readable form. (The potentially non–human-readable
SKAbsyn.Valuethat represents the program’s result is returned byrunandrunfile).
That’s the theory, anyway….
In practice, in the code that we give you, the typechecker is unfinished and so
only typechecks very simple expressions. The biggest effect of this problem is
the lack of a correct type for most kinds of expression, which means no printer
for that kind of expression. The type checker operates correctly on
okay-value1.mose through okay-value4.mose, and error-var1.mose. For the
others, the lack of good typechecking results in late catching of errors and no
good print function.4
-
Because
Mosesis lazy, and because there is nothing useful that can be done with anANYvalue, the command-line version ofMoseswill never do any computation on anything that types asANY. This property isn’t always true when usingrunorrunfileinghci, because they return the value produced by evaluation. ↩ -
The names have underscores to reduce the risk that a programmer might inadvertently reuse the name of a vital function. ↩
-
When we know that this function is encoding a pair of booleans, we can determine that it holds the pair
(false, true), but the same representation could also encode other values for other types, such as the pair of(false, [])—without type information, we have no hope of meaningfully printing data encoded inside an arbitrary function. ↩ -
In some cases, you can cheat and read the result of evaluation from the untyped value that
runfilereturns to the Haskell interactive loop, but try that withokay6.mose. ↩