The Moses language
This document gives an overview of the Moses
language: its features, how to run
Moses
programs, and the structure of its implementation.
The Moses
Language
Moses
is a simple functional language we have created for
CS 131. It is similar to Haskell (it is called “Moses” after
Moses Schönfinkel, inventor of schönfinkelization, much as Haskell is
named after Haskell Curry, inventor of currying). It closely matches the
syntax we have seen in examples in class. Like Haskell, we can see
Moses
as a more useable incarnation of the
$\lambda$-calculus.
Features of Moses
include:
-
An expressive functional language, with familiar features such as $\lambda$-expressions,
let
expressions, binary operators, etc. -
A type system that supports subtyping, as well as tuples and lists.
Some of the most salient limitations of Moses
compared to
Haskell include:
-
Function arguments must be be annotated with their types. (Whereas Haskell infers the types of function arguments so we don’t have to state them explicitly.)
-
If we want to write recursive definitions, we have to use
letrec
instead oflet
. -
Very limited pattern matching. We can break apart tuples in a
let
expression, but nested tuples will require nestedlet
s. Similarly,case
expressions are very restricted, working only on lists and only allowing you to write code to handle empty lists and lists that have a head and a tail, nothing more complex. -
A quite different type system. Haskell allows type variables representing unknowns in types (e.g.,
id :: a -> a
), whereasMoses
does not. -
Whitespace-independent syntax that is more restrictive than Haskell’s. You cannot give multiple definitions inside a
let
orletrec
, instead you must nest them. -
Moses
uses different operators for list append and string concatenation.Moses
uses@
for list append (compared to++
in Haskell), inMoses
++
is always string concatenation. (Strings inMoses
are not lists of characters.)
More examples of Moses
code can be found in the homework
directory.
Running Moses
Programs
The easiest way to get familiar with Moses
is to run it. On the HMC CS
machines (e.g., knuth or the lab Macs), you can run:
/cs/cs131/bin/Moses -i
to start up a REPL for Moses
.
When you have the code (and when you’re working on the assignment), you should compile a command-line version of your implementation by running
ghc -O Moses.hs
then start a Moses
REPL:
./Moses -i
From here on out, we’ll assume that you’re running your own implementation of
Moses
(not the one in /cs/cs131/bin/
).
There are a couple of other ways to run Moses
that you might want to use:
-
Compile and run a file stored in a program:
./Moses «path to file»
-
Run an expression and see the result:
./Moses -e "42+3"
You can also run :load Moses.hs
in ghci
, then use the run
or runFile
functions to execute code.
Moses
output
The Moses
system prints a lot of information about the
steps of execution. It first shows the parsed input (pretty printed),
then it typechecks the code, then it converts all the nice constructs to vanilla lambda
calculus, converts that to SK combinators, and then executes the
program, and prints the result. Because it shows all the steps, you
probably want to make sure you are using a terminal that lets you scroll
back.
One interesting feature of the Moses
system is that it
uses the type of the expression to know how to print it out. It actually
synthesizes printing code based on the type, thus an integer list (i.e.,
[INT]
) is actually printed using a list printing higher-order function that
is passed an int printer. Unfortunately, until you implement typechecking, your
version of Moses
won’t print any useful output on the command line for all but the very simplest
programs. Instead, the typechecker gives almost everything the type ANY
, which is
always printed out as <??-who-knows-what-??>
.1
If you run things in ghci
, you’ll find that programs that produce simple
values (such as integers) are visible in the returned value from the run
or
runfil
functions, allowing us to “cheat” and see the results even without type
checking. Thus, fact.mose
will produce its answer. More complex types, such as
lists, are not simple values and so you won’t be able to see their output
printed until the type checker works.
If you want to see the full power of the printing code before the
typechecker is complete, there are three special built-in values in
Moses
example1
, example2
, and example3
, of type [REAL]
, ([INT],
(REAL,BOOL), [[STRING]])
, and [INT]
respectively. The
synthesized printing code for example2
is quite complex (it’s quite a complex
type!), but is generated following fairly simple rules. Be
aware that example3
is an infinite list of the natural numbers—it will never
finish printing.
Structure of the Moses
Language Implementation
Many languages have a rich high-level syntax that people write programs in, then
the language translates (i.e., compiles) the program to a lower-level language that is
actually executed (e.g., in C and C++, the low-level language is typically
machine code). As you saw in the previous section, Moses
follows this pattern:
the higher-level syntax is a richly typed language, whereas the lower-level
language is far more basic untyped language.
In our Moses
implementation,
-
The untyped core is a simple combinator-based language. In addition to the combinators
S
,K
andI
, it does have a few built-in types and functions supporting integers, reals, and strings. Importantly, it does not and will not directly support booleans, lists, or pairs. (This kind of approach is a common one; for example, when C++ is compiled to X86 machine code, that code has no idea what C++ classes are.)Similarly, the core does not directly support recursion (it uses the
Y
combinator instead). It also has no “operators” and noif
…then
…else
, instead providing functions, such as_plus_
and_cons_
to serve the same purpose.2 -
The typed language provides lists, tuples, and a static type system, as well as niceties such as variables,
let
expressions, and so forth (the parser for the typed language also provides the familiar square-bracket list notation as syntactic sugar for colon notation as well as a few other niceties not actually present in the typed abstract syntax).
All programs in the typed language are translated into programs in the untyped core language. The only evaluator is the untyped-core-language evaluator.
The typed high-level language supports additional kinds of values beyond those directly supported by the low-level language—booleans, lists, and tuples are provided using the function-based (i.e., $\lambda$-calculus style) encodings we saw earlier in the semester. The untyped core language doesn’t know what these encodings are being used for, but the typed language does, because it has type information about the program.
Some of Moses
is written in itself. For example, the @
operator is turned into a call to a function _append_
that appends two lists.
There is nothing especially difficult about appending two lists, and so
the code for _append_
is written in Moses
. Other languages do
similar things—much of Haskell is is written in Haskell (including the
list append function), and similarly C and C++ have standard libraries
written in C and C++. The file prelude.mosedefs
includes definitions for both private
internal functions and publicly available functions, all written in
Moses
code.
Phases of Execution
When Moses
runs a program from a file (such as when you
type runfile "tests/okay-value1.mose"
or execute run "let x = 20 in x+x"
),
the following stages occur:
-
The system first loads code from
prelude.mosedefs
to provide the necessary encodings of booleans, lists, and pairs, as well as a variety of other support functions that are written inMoses
itself. Some of the functions (e.g.,_cons_
for lists) are internal functions (in code, users writex:y
not_cons_ x y
), whereas others are intended to be visible and usable. For this latter group of functions, the file provides type declarations describing the functions. No actual type checking happens here at all, by design. Because we say what the types are, we can define a function, and then say “Don’t see this as a function, see it as a BOOL.” If course, this does assume that when we tellMoses
what the types are, we’re saying something meaningful. -
The parser for the typed language reads in an expression from the desired file (such as
okay-value1.mose
). Any parse errors will abort with an error.
-
The expression is typechecked, by
typeofExpr
fromTypecheck.hs
. If typechecking is successful, we know the type of the expression. Any type-checking errors will abort with an error. -
The expression is translated into an expression in the untyped lower-level core language by code in
Transform.hs
andSKRun.hs
. -
The type of the expression is used to generate code to print a value of that type. This printing functionality is necessary because neither the core-language evaluator, nor Haskell know what our encodings mean. Something as simple as a pair of booleans will be encoded as a rather confusing-looking function (such as
λ a.a (λ b.λ c.c) (λ d.λ e.d)
{language:moses}), but we know that in this case, these functions represent a pair of booleans, and thus how to make a function that will print it so that it looks like a pair.3 -
The translated expression is evaluated by the core-language evaluator, resulting in a final value.
-
An expression is created that applies the “printer” code to the value from the previous step. This expression is evaluated by the core-language evaluator, thereby printing program’s result in human-readable form. (The potentially non–human-readable
SKAbsyn.Value
that represents the program’s result is returned byrun
andrunfile
).
That’s the theory, anyway….
In practice, in the code that we give you, the typechecker is unfinished and so
only typechecks very simple expressions. The biggest effect of this problem is
the lack of a correct type for most kinds of expression, which means no printer
for that kind of expression. The type checker operates correctly on
okay-value1.mose
through okay-value4.mose
, and error-var1.mose
. For the
others, the lack of good typechecking results in late catching of errors and no
good print function.4
-
Because
Moses
is lazy, and because there is nothing useful that can be done with anANY
value, the command-line version ofMoses
will never do any computation on anything that types asANY
. This property isn’t always true when usingrun
orrunfile
inghci
, because they return the value produced by evaluation. ↩ -
The names have underscores to reduce the risk that a programmer might inadvertently reuse the name of a vital function. ↩
-
When we know that this function is encoding a pair of booleans, we can determine that it holds the pair
(false, true)
, but the same representation could also encode other values for other types, such as the pair of(false, [])
—without type information, we have no hope of meaningfully printing data encoded inside an arbitrary function. ↩ -
In some cases, you can cheat and read the result of evaluation from the untyped value that
runfile
returns to the Haskell interactive loop, but try that withokay6.mose
. ↩