The chapter Advanced Flow Control ended with the promise of building control abstractions that have the same
status as built-in ones like try and yield
. We specifically mentioned
about a control structure that will enable us to do nondeterministic computing in Slogan.
The key idea here is that expressions in a nondeterministic language can have more than one possible value. Our nondeterministic program evaluator will work by automatically choosing a possible value and keeping track of the choice. If a subsequent requirement is not met, the evaluator will try a different choice, and it will keep trying new choices until the evaluation succeeds, or until we run out of choices. Nondeterministic computing has important applications in fields like artificial intelligence and relational programming.
But before undertaking the grand job of adding nondeterminism to the language, we must become familiar with the facilities that allow us to build new syntactic abstractions in Slogan. Two complementary methods are available for adding new syntax to Slogan. The first two sections of this chapter will discuss these in some depth and show some practical examples. Then we will add nondeterministic operators to Slogan by taking advantage of the knowledge we gained in this chapter and the chapter on flow control.1
The simplest way to add new syntax is by using the declare syntax
statement. It has the general form shown below:
declare syntax <keyword> <layout> => <expansion>
Keyword
is a name that uniquely identifies the new syntactic form. Layout
defines the structure of
the extension as it will appear in code. This can contain operators, identifiers and literals. All identifiers starting with
the dollar sign ($
) are treated specially by the compiler. They will be replaced by valid Slogan expressions.
The expansion
part will contain Slogan code that should be generated by the compiler upon successfully parsing the new syntax.
The best way to understand the declare syntax
statement, is to use it to implement
a new syntactic extension. We will add the do-while
construct from the C programming language to Slogan. Do-while
is
an imperative loop which executes its body at-least once, even if the condition in the while
clause is
false
. We will define the syntax of the do-while
loop as
do <expression_to_execute>
while (<condition_to_check>)
This syntax will expand to produce a code block that will execute expression_to_execute
once and then go into a named
let
to conditionally execute it in a loop, controlled by the value of condition_to_check
.
The implementation of this construct follows:
declare syntax
do $body while($condition)
=> { $body
letfn loop ()
when ($condition)
{ $body
loop() }}
Some examples of using the new do-while
loop follows:
let i = 0
do
{ showln(i)
i = i + 1
} while (i < 5)
//> 0
1
2
3
4
do
{ showln(i)
i = i + 1
} while (i < 5)
//> 5
The code generated by declare syntax
is
hygienic. That means,
free variables in the expansion code
will refer to bindings visible when the syntax was specified. This is demonstrated in the following program:
let x = 100
declare syntax
prnx => showln(x)
prnx
//> 100
let (x = 200) prnx
//> 100
Another facility for extending the compiler is the macro system. It gives full access to the lower-level compiler APIs
for generating code. These extensions are created using the declare macro
statement which has the general form:
declare macro <keyword> <layout> => <expander>
The keyword
and layout
parts are similar to that of
declare syntax
. Expander
is Slogan code that will be executed
by the compiler to generate the actual expansion of the macro. The module core.compiler
define
functions that can generate low-level Slogan code. The expander will consist mostly of calls to these functions.
The following example shows a kind of extension that can be defined only by using macros – one that introduces new variable bindings.
declare macro
bind_names ($names, $values) $body
=> let (bindings = map(^(n, v) [n, v], $names, $values))
compiler.let_(bindings, $body)
// Usage:
bind_names([a, b, c], [1, 2, 3]) a + b + c
// 6
It is possible to add custom code parsers and invoke them from macros. The parsers are functions called by the compiler.
So they must be defined using a declare function
statement. Let us look
at an example of this. The macro we are going to define is called str
. It expects an asterisk (*
)
symbol followed by an identifier. The macro converts the identifier to a string. In short, the str
macro defines a DSL
for converting symbols to strings. Not very useful, but good enough to illustrate compile-time parser functions!
declare function compile_ident(tokenizer)
let (token = tokenizer.next)
{ when (special_token_to_string(token) <> "*")
error("expected asterisk")
let (sym = tokenizer.next)
{ when (not(is_valid_identifier(sym)))
error("invalid identifier")
sym }}
declare macro
str $name
=> symbol_to_string($name),
[$name:compile_ident]
// Usage:
(str * hello) == "hello"
// true
Macros are un-hygienic in that free variables in the expander takes their definition from the context of macro expansion, not from where the macro was defined. This is demonstrated by the program below.
let x = 100
declare macro
prnx_impure => compiler.call('showln, ['x])
// Usage:
prnx_impure
// 100
let (x = 200) prnx_impure
// 200
The gensym
function can be used for generating symbolic names that are guaranteed not to conflict with
names of user-defined variables.
Nondeterministic computing2, like lazy sequence processing, is useful for "generate and test" applications. Consider the task of starting with two lists of positive integers and finding a pair of integers - one from the first list and one from the second list - whose sum is prime. In previous chapters we saw how to handle this with finite sequence operations and with infinite sequences. Our approach was to generate the sequence of all possible pairs and filter these to select the pairs whose sum is prime. Whether we actually generate the entire sequence of pairs first, or interleave the generating and filtering is immaterial to the essential image of how the computation is organized.
The nondeterministic approach evokes a different image. Imagine simply that we choose (in some way) a number from the first list and a number from the second list and require (using some mechanism) that their sum be prime. This is expressed by the following function:
function prime_sum_pair(xs, ys)
let (x = an_element_of(xs), y = an_element_of(ys))
{ require(is_prime(x + y));
[x, y] }
It might seem as if this function merely restates the problem, rather than specifying a way to solve it. Nevertheless, this is a legitimate nondeterministic program.
The key idea here is that expressions in a nondeterministic language can have more than one possible value. For instance,
an_element_of
might return any element of the given list. Our nondeterministic program will work by
automatically choosing a possible value and keeping track of the choice. If a subsequent requirement is not met, the program
will try a different choice, and it will keep trying new choices until the evaluation succeeds, or until we run out of choices.
Just as lazy expressions freed the programmer from the details of how values are delayed and forced, the nondeterministic program
will free the programmer from the details of how choices are made.
To enable Slogan to support nondeterminism, we introduce a new syntactic extension called amb
.
The expression amb [e1, e2, ..., en]
will return the value of one of
the n
expressions ei
ambiguously. For example, the expression
[amb [1, 2, 3], amb ['a, 'b]]
can have six possible values:
[1, a], [1, b], [2, a], [2, b], [3, a], [3, b]
Amb
with a single choice produces an ordinary (single) value.
Amb
with no choices - the expression amb []
- is an expression
with no acceptable values. Operationally, we can think of it as an expression that when evaluated causes the computation to "fail":
The computation aborts and no value is produced.
Amb
can be implemented with the help of continuations, which are used to keep track of the next
choice point the computation should return to. When it is first invoked, amb
simply returns the first
element from the list of values passed to it. It also establishes a failure continuation, which if invoked will restart
the selection process with the rest of the elements in the list. This search process is allowed to continue until the list is empty, at which point
the default failure condition is executed. In our implementation, this will raise an error.
function fail()
error("search tree exhausted")
The implementation of amb
will keep track of its next choice point by updating fail
with
a continuation to restart the search with the remaining values:
declare syntax
amb $values
=> callcc(^(k)
{ let failed_k, fail0 = false, fail
let xs = callcc(^(fk) { failed_k = fk; $values })
match (xs)
[] -> fail()
| [x] -> { fail = fail0; k(x) }
| y:ys -> { fail = ^() failed_k(ys); k(y) }})
We will also define a function that will repeat the search until a condition is met. This is the same require
function we used in the
prime_sum_pair
above:
function require(p)
when(not(p)) fail()
With the primitives required for adding nondeterminism to our programs in place, we are ready to define a working version
of the prime_sum_pair
function:
function prime_sum_pair(xs, ys)
let (x = amb xs, y = amb ys)
{ require(is_prime(x + y))
list(x, y) }
// Usage:
prime_sum_pair([11, 12, 4, 7], [10, 3, 9, 5])
// [12, 5]
prime_sum_pair([1,2,3,4,5], [120, 100, 300, 400])
// [1, 100]
Exercise 13.1. Having amb
as a syntactic extension clearly marks it out as doing something
very different from a regular function call. But it is not necessary that is should be a special form recognized by the compiler. How will you
define amb
as a normal function?
Exercise 13.2 As the argument to amb
is a regular list, all its elements are evaluated even
before their values may be actually required by the nondeterministic computation. How will you make sure that the expressions passed to
amb
will be evaluated only when they are explicitly required?
The following puzzle (taken from Dinesman 1968) is typical of a large class of simple logic puzzles:
Baker, Cooper, Fletcher, Miller, and Smith live on different floors of an apartment house that contains only five floors. Baker does not live on the top floor. Cooper does not live on the bottom floor. Fletcher does not live on either the top or the bottom floor. Miller lives on a higher floor than does Cooper. Smith does not live on a floor adjacent to Fletcher's. Fletcher does not live on a floor adjacent to Cooper's. Where does everyone live?
We can determine who lives on each floor in a straightforward way by enumerating all the possibilities and imposing the given restrictions:
function multiple_dwelling()
let (xs = [1, 2, 3, 4, 5],
baker = amb xs,
cooper = amb xs,
fletcher = amb xs,
miller = amb xs,
smith = amb xs)
{ require(is_distinct([baker, cooper, fletcher, miller, smith]))
require(baker <> 5)
require(cooper <> 1)
require(fletcher <> 5)
require(fletcher <> 1)
require(miller > cooper)
require(abs(smith - fletcher) <> 1)
require(abs(fletcher - cooper) <> 1);
['baker:baker, 'cooper:cooper,
'fletcher:fletcher, 'miller:miller,
'smith:smith] }
The function is_distinct
determines if all elements in a list are unique and can be defined as,
function is_distinct(xs)
if (is_empty(xs)) true
else if (is_empty(tail(xs))) true
else if (member(head(xs), tail(xs))) false
else is_distinct(tail(xs))
Calling the multiple_dwelling
function will solve the puzzle for us:
multiple_dwelling()
// [baker:3, cooper:2, fletcher:4, miller:5, smith:1]
Exercise 13.3 If you carefully read the specification of the puzzle, it will become clear that not all combinations of peoples and floors
needs to be generated and searched. For instance, the combination of baker
and floor 5
need not be checked as it
is explicitly stated that he does not live on that floor. Can you write a more efficient version of multiple_dwelling
by
eliminating all such redundant checks?
1In the current implementation of Slogan, syntactic abstractions are visible only in the scope they are defined. For instance, a macro can be used only in the closure, module or namespace that it's defined in. New syntax or macro should ideally be defined in the global scope.
2This section is based mostly on content from Chapter 4.3 of SICP.