Regular Expressions

Overview

Thus far, your task has been to understand how lazy data structures can provide backtracking-style search through a solution space. In this part, you will apply your knowledge to the problem of regular-expression matching.

Background: Regular expressions

Recall that a regular expression is a pattern that succinctly describes a set of strings. For example, the pattern a* describes the set of strings that have zero or more repetitions of the character a: {"", "a", "aa", "aaa", …}.

More formally, given a set of characters $\Sigma$ (also called an alphabet), we can describe regular expressions (abbreviated as regexps) like so:

  • The empty string ε is a regexp. It matches only the empty string.
  • Any single character in the alphabet is a regexp. It matches only that specific character.
  • The . symbol is a regexp. It matches any character in the alphabet.
  • If e1 and e2 are regexps, then e1 | e2 is a regexp. It represents alternatives, so it matches anything that matches either e1 or e2 (or both).
  • If e1 and e2 are regexps, then e1 · e2 is a regexp. It represents concatenation, so it matches any string that begins with a match for e1, immediately followed by a string that matches for e2.
  • If e is a regexp, then e* is a regexp. It represents repetition: zero or more concatenations of strings that match e.

Given a regular expression $e$ and an input string $s$, we will declare a pattern match to be successful if a (possibly empty) prefix of $s$ matches the regular-expression pattern. For example:

  • The pattern a matches the string aaab but not baaa.
  • There are multiple matches of a* for input string aaab: "", a, aa, aaa.

Lazy regular-expression matching [21%]

The directory regex contains a partial implementation of a regular-expression matcher, using laziness rather than DFA / NFA technology.

Be sure to look at all the files in the directory, including the file for the AST (which you won’t need to modify), the file for the evaluator (which you will modify), and the test cases (which provide examples of matching and which will help you test your code).

Your task: In RegexEvaluation.hs, complete the implementation for matching Letters, Alts, and Concats .

Double-check all your work…

Make sure you’ve done each of the three parts of the assignment and that you have submitted your repository and your image to Gradescope.