Functional Abstractions

A powerful programming language will have the ability to build abstractions by assigning names to common patterns. This rises the level at which problems are solved, by working with abstractions directly. In Slogan, functions are the primary means of abstraction.

Sometimes the same programming pattern will be used with a number of different functions. If such patterns are identified and captured as functions, they can be applied in many different contexts. To be useful, the pattern-capturing function must be configurable for a specific context. This customization can be expressed as another function which becomes an argument to the pattern-capturing function. That means, a function must have the same properties of data objects like strings, numbers and lists. This includes a literal representation in code and an associated value that can be referenced through variables. In other words, functions should be treated as first-class values by the language.

Once functions have first-class status, they can become arguments or return values of other functions1, they can be stored and passed around in data structures and so on. First-class functions open a vast new world of possibilities before us. In this chapter, we will explore some powerful abstractions that first-class functions enable us to build.

6.1 Functions as Arguments

Imagine you are developing a payroll application. Your manager wants the application to report the names of all employees whose salaries are above a given threshold. Employee names and salaries are stored as pairs in a list:

let salaries = ['ann:1200, 'joe:1000, 'mark:2300, 'jane:1800]

You are planning to write a function that will walk this list and add all employees that satisfy the condition to a new result list. This result list will be returned when the salaries list runs out of entries. This plan can be realized by a recursive function that pattern match on the salaries list.

function filter_by_salary_above(threshold, salaries, result)
  match (salaries)
    [] -> result
  | (name:salary):rest where salary > threshold ->
      filter_by_salary_above(threshold, rest, name:result)
  | _:rest -> filter_by_salary_above(threshold, rest, result)

The salary filter function can be used as,

filter_by_salary_above(1500, salaries, [])
// [jane, mark]
filter_by_salary_above(1000, salaries, [])
// [jane, mark, ann]

Your manager is happy with the result, but now he also want a new function to report all employees with salaries below a given threshold! You set out to write a new filter function for that. If you attempt to implement this function, you will find that it will differ from the filter_by_salary_above function only at the condition check in the second clause of the match. The rest of the code follows the same pattern. In fact, after writing a few more such "list-filtering" functions you will want to write a generic function that can filter a list based on any condition. That condition will be captured as a functional argument. Slogan comes prepackaged with such a function that can filter out elements from a list by passing them through a predicate.

// filter all odd numbers in a list
filter(is_odd, [1, 2, 3, 4, 5])
// [1, 3, 5]

// filter all names that start with \a
function starts_with_a(name)
  char_is_eq(name[0], \a)

filter(starts_with_a, ["annie", "joe", "bob", "agatha", "mary"])
// ["annie", "agatha"]

You can use filter to define your salary reporting functions and avoid duplicating the filtering logic.

function salaries_above(threshold, salaries)
{ function check_above(entry) tail(entry) > threshold
  filter(check_above, salaries) }

function salaries_below(threshold, salaries)
{ function check_below(entry) tail(entry) < threshold
  filter(check_below, salaries) }

salaries_above(1500, salaries)
// [mark:2300, jane:1800]
salaries_below(1500, salaries)
// [ann:1200, joe:1000]

The new reporting functions are shorter and sweeter. One problem you may have noticed is that they now report both the employee name and salary, while the manager wants only the names. We will figure out an elegant and general method to process the filtered list further to extract only the information we want. But before that, let us look at ways to make the reporting functions even shorter.

6.1.1 Function Literals

The salaries_above and salaries_below define two predicates locally and pass them as arguments to filter. These are the check_above and check_below functions. To be forced to define and name short functions used only once is an inconvenience.

Fortunately, we don't have to live with this inconvenience because functions are first-class values in Slogan. They can be expressed as literals in code. It is quite easy to define a function literal – just omit the name. Here are some examples:

function (x, y) x + y
// <function>

(function (x, y) x + y)(10, 20)
// 30

let f = function (x, y) x + y
f(20, 30)
// 50

Function literals are so ubiquitous in Slogan programs that the ^ operator is provided to define them.2 So the preceding examples can be rewritten as,

^(x, y) x + y
// <function>

(^(x, y) x + y)(10, 20)
// 30

let f = ^(x, y) x + y
f(20, 30)
// 50

This leads to the following single-line definitions of the salary reporting functions:

function salaries_above(threshold, salaries)
  filter(^(e) tail(e) > threshold, salaries)

function salaries_below(threshold, salaries)
  filter(^(e)tail(e) < threshold, salaries)

6.1.2 Doing Something to Everything

Now let us tackle the next problem, i.e extracting only the required portion of the filtered salaries. For instance, the manager wants to see only the names of the employees and not their salaries. But the finance manager may need only the salaries, so that he can ask a query like "the average salary above a particular threshold."

Basically, this problem can be generalized as - "do something to every element in a sequence, and return a new sequence of the results". This pattern is captured by the map function which accepts a function and an arbitrary number of lists as arguments. The function represents the something that should be done to each element in the lists. It must take as many arguments as there are lists.

map(sqrt, [2, 10, 20])
// [1.4142135623730951, 3.1622776601683795, 4.47213595499958]
map(^(x, y) x + y, [1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
// [11, 22, 33, 44, 55]

Extracting parts from the employees list is easy. Just map the head and tail functions on this list.

let name = head
let salary = tail

map(name, salaries_above(1500, salaries))
// [mark, jane]

map(salary, salaries_above(1500, salaries))
// [2300, 1800]

6.1.3 Folding

As we saw earlier, an accountant may not be interested in the names but the salaries of the employees. Once he get the list of salaries that satisfy a condition, he may want to extract some statistical information from them, like finding their average. For this the salaries has to be first reduced to a single value by summing all of them. Then this sum has to be divided by the number of salaries. How will you sum the numbers in a list? One solution is to use the higher-order function fold which takes a two-argument function, an initial value and a list as arguments. Each element in the list is combined to the value to produce a final result.

let s = map(salary, salaries_above(1500, salaries))
// [2300, 1800]

fold(`+`, 0, s)
// 4100

// the average salary above 1500
fold(`+`, 0, s) / length(s)
// 2050

6.1.4 Applying

A very useful function that takes a function as an argument is apply. It takes a function and a list of arguments for it, and returns the result of applying the function to the arguments:

apply(add, [1, 2, 3, 4, 5])
// 15

It can be given any number of arguments, so long as the last is a list:

apply(add, 1, 2, [3, 4, 5])
// 15

Apply is useful when some or all of the arguments to be passed to a function are in a list, since it frees the programmer from explicitly destructuring the list.

function find_min_max(xs) apply(min, xs):apply(max, xs)

find_min_max([10, 20, 1, 4, -8, 100])
// -8:100

6.2 Local Definitions

We often need variables that are limited in scope, visible only to a particular part of the program. To understand the importance of local definitions, consider the following code snippet:

let x = 100
function f(y) x * y
// 2000

let x = 2
x  + 20
// 22

// with the original definition of `x` lost,
// `f` will return an unexpected result!
// 40

The second binding of x "overwrote" its original definition. Any code that depended on this definition is broken now. We need a way to introduce the second binding of x in such a way that it is visible only to the computation that need it. (i.e, the expression x + 20). We could specify an anonymous function for binding such "local" variables. Let us rewrite the example program using this idea:

let x = 100
function f(y) x * y
// 2000

(^(x) x + 20)(2)
// 22

// 2000

This pattern is so useful that Slogan provide the let expression as a concise way to introduce multiple local variables. The let expression has the following syntax:

let (variable1 = value1, ..., variableN = valueN) body

The bindings introduced by the let expression is visible only within the body expression.

Our example program can be written using the let expression as,

let x = 100
function f(y) x * y
// 2000

let (x = 2) x + 20
// 22

// 2000

The value expressions in the bindings list of a let can refer to variables defined earlier:

let (x = 10, y = x + 2) [y, x]
// [12, 10]

The let expression can also be used to destructure lists, arrays and hash tables.

function f() [1, 2, 3]
function g() #{'a:1, 'b:2}

let ([a, b, c] = f()) a + b + c
// 6
let (#{'a:x, 'b:y} = g()) [x,y]
// [1, 2]

// values that are not required can be ommitted by using the `_` wildcard variable:
let ([a, _, c] = f()) a + c
// 4

There is a variation of let known as letfn, where value expressions are not allowed to refer to preceding variables in the bindings list.

letfn (a = 10, b = a + 2) [a, b]
//> error: Unbound variable: a

let x = 100
letfn (x = 20, y = x + 10) [x, y]
// [20, 110]

let (x = 20, y = x + 10) [x, y]
// [20, 30]

A letfn expression can be assigned an optional name. This makes it easy to define local recursive functions that behave like looping constructs:

letfn loop (x = 0)
  when (x < 5)
  { showln(x)
    loop(x + 1) }

//> 0

For defining mutually recursive local functions, the letrec expression is used. The following program defines two local functions to check evenness and oddity of numbers. They call each other to get their work done:

letrec (even = ^(x) x == 0 || odd(x - 1),
        odd  = ^(x) x <> 0 && even(x - 1))
  [even(20), odd(20)]
// [true, false]

Unlike let, letfn and letrec cannot destructure data structures.

The basic letfn expression has the most optimized implementation in Slogan because it is translated to a simple function definition and invocation. A let expression may get translated to multiple letfn expressions. The letrec expression may do state mutations behind the scene.

6.3 Functions as Return Values

Earlier in this chapter we saw how the ability to pass functions as arguments significantly improves the expressiveness of the programs we could write. Even more expressive power is achieved once we have functions whose return values themselves are functions. Let us start with a simple example. The following function checks the type of its argument and return a function that can be used to combine two values of that type:

function combiner(x)
  if (is_number(x)) `+`
  else if (is_string(x)) string_append
  else if (is_list(x)) append
  else if (is_char(x)) string
  else error("type unsupported")

// Usage:
let char_comb = combiner(\x)  
char_comb(\x, \y)
// xy

let num_comb = combiner(2)
num_comb(3, 2)
// 5

combiner([1,2])([1,2], [3,4,5])
// [1, 2, 3, 4, 5] 

It's a bit awkward to pass an object twice in an expression that calls combiner - once to type-check it and then to combine it with another value. It would be convenient for callers if the function returned already knows about its first argument. This is accomplished by the next function, make_combiner:

function make_combiner(x)
  if (is_number(x)) ^(y) `+`(x, y)
  else if (is_string(x)) ^(y) string_append(x, y)
  else if (is_list(x)) ^(y) append(x, y)
  else if (is_char(x)) ^(y) string(x, y)
  else error("type unsupported")

// ab

let hello_comb = make_combiner("hello, ")
// hello, world
hello_comb("how are you?")
// hello, how are you?

Instead of explicitly creating a function object for each type, you may also use the built-in function partial, which returns a new version of a function partially applied to its first few arguments. This is how the next version of make_combiner is defined.

function make_combiner(x)
  let (f = if (is_number(x)) `+`
           else if (is_string(x)) string_append
           else if (is_list(x)) append
           else if (is_char(x)) string
           else error("type unsupported"))
    partial(f, x)

make_combiner(10)(2, 3)
// 15
make_combiner(\a)(\b, \c, \d)
// abcd

You may have noticed the function returned by make_combiner getting a permanent reference to make_combiner's argument, i.e, the variable x. This is because functions get a reference to the lexical environment they are defined in and can freely refer to the variables defined there. The combination of a function along with its lexical environment is known as a closure and can be used to model complex structures out of pure functions. This technique is discussed later in this chapter.

6.3.1 Function Builders

We saw how to use partial to create a partially applied version of a function. Slogan also have other "function-building functions". Some of these are illustrated here.

Let us consider a function builder known as compose. It takes one or more functions and returns a new function in which all of them are applied in succession. Each function, except the last, must take exactly one argument. The last function can take an arbitrary number of arguments.

compose(sqrt, `+`)(10, 20, 30)
// 7.745966692414834

// same as:
sqrt(10 + 20 + 30)
// 7.745966692414834

Another useful function builder is complement which takes a predicate and returns the opposite predicate.

map(complement(is_odd), [1, 2, 3, 4, 5])
// [false, true, false, true, false]

6.4 Functions as Objects

As closures can eternally refer to the environment in which they were defined, they can be used to emulate objects with internal state. By making the closure to return a function that accepts a symbolic constant, we can define objects that respond to messages. This style of program organization, where all computation takes place through message-passing is the central idea behind Object Oriented Programming.

The following program defines a simple object that represents a bank account. A bank account object is constructed with a positive initial balance. It can respond to two new messages – balance, which returns the current balance and withdraw, which returns a new object with the updated balance.3

function make_bank_account(balance)
  if (balance <= 0)
    error("invalid balance")
  { function withdraw(amount)
      if (amount >= balance)
        error("not enough balance")
        make_bank_account(balance - amount)

    | 'balance -> balance
    | 'withdraw -> withdraw }

Let us check if our bank account abstraction works as expected:

let b1 = make_bank_account(1000)
let b2 = make_bank_account(2300)
// 1000
// 2100
let b3 = make_bank_account(-100)
//> error: invalid balance
//> error: not enough balance

To work with such objects that respond to symbolic-messages, Slogan provides the dot-notation where you specify the message name after a period that follows the function name:

let b3 = make_bank_account(145)
// 145
// 45

6.4.1 Pre-Conditions and Post-Conditions

Make_bank_account and its internal withdraw function both expect their argument to meet certain specifications. This expectation is expressed in the if condition at the top of the function definitions. There is another way to clearly express a function's specifications as pre and post conditions. The pre-condition specifies the restrictions on the function's arguments and the post-condition must be satisfied by the function's return value. These conditions are specified outside the body of the function using the where clause. Thus it is possible to separate the function's specification from its implementation. This also plays easy with external tools that may have to extract a function's specification.

Let us rewrite the make_bank_account function by separating out the specifications.

function make_bank_account(balance)
  where is_number(balance) && balance > 0
        -> is_function(%)
{ function withdraw(amount)
    where amount < balance
          -> is_function(%)
    make_bank_account(balance - amount)

    | 'balance -> balance
    | 'withdraw -> withdraw }

The first part of the where clause is the specification for the parameters. The part that follows -> is the specification for the return value. The variable % is automatically bound to the return value. The specification for the return value is optional.

If you make a call that violates the contract for the parameters, for e.g: make_bank_account(0), a pre-condition failed error will be raised. If one of the functions return a value that don't meet the specification for its return value, a post-condition failed error will be raised.

You can call the disable_function_contracts() and enable_function_contracts() functions to dynamically control if the pre/post conditions are checked on function call.

6.4.1 Multiple Representations

Using closures that accept messages we can implement objects that conform to a specific protocol or interface. To continue with the above example, a bank may offer its customers various types of accounts but all support the basic operations to check the current balance and withdraw money. The account object we defined earlier do not allow withdrawal that will result in a zero or negative balance. But the bank may also offer a type of account that allows an overdraft. The limit of the overdraft will be specified at the time of creating the account. This new account type is defined as follows:

function make_bank_account_with_od(balance, od_limit)
  where is_number(balance) && balance > od_limit
        -> is_function(%)

{ function check_balance(wd_amount)
     let (b = balance - wd_amount)
       b > od_limit

  function withdraw(amount)
    where check_balance(amount)
          -> is_function(%)
    make_bank_account_with_od(balance - amount, od_limit)

   | 'balance -> balance
   | 'withdraw -> withdraw }

Now we have two implementations of the "bank account protocol" in our system. They can coexist in the same program and respond to the same set of messages. But their behavior is different. The user need not worry about the internal implementation details of each type of account. He interacts with them using the published message protocol and can expect the objects to respond in an appropriate manner.

let acc01 = make_bank_account(1000)
let acc02 = make_bank_account_with_od(1000, -500)

// 1000
// 1000
// 430
// 430
//> error: precondition_failed, withdraw, (lt-compare amount balance)
// -70
// -400
//> error: precondition_failed, withdraw, (check_balance amount)

6.5 Dispatching on Type

Slogan allows you to vary a functions behavior based on the type of its arguments. Consider the simple example of a function that returns a greeting message. If its argument is a string, it should return "hello, <string>". If the argument is an integer, it should return "hi, <integer_in_binary>". The first behavior is the default. It is quite trivial to implement this:

function greet(name)
  string_append("hello ", name)

Now we need to implement an version of greet "overloaded" to work with integers. This new definition is shown below:

function greet(n:integer)
  string_append("hi ", number_to_string(n, 2))

A function is overloaded by specifying a type for one or more of its parameters. A type is basically an identifier that is complemented by a predicate. In our example, the type integer is complemented by the predicate is_integer.

We are ready to see the greet function in action!

// hello ann
// hi 1111011

Let us look at one more example of typed-dispatch. This time we will dispatch on the type of multiple-arguments. Imagine a game that has, among its (user-visible) objects, spaceships and asteroids. When two objects collide, the program may need to do different things according to what has just hit what. This behavior is captured by the collide_with function which does the right thing for the different combinations of colliding objects.

// objects in our game, type-predicates are automatically
// generated by the `record` statement.
record asteroid()
record spaceship()

// the default implementation does nothing useful.
function collide_with(a, b) false

function collide_with(a:asteroid, b:asteroid) "asteroid**asteroid!!"
function collide_with(a:asteroid, b:spaceship) "asteroid**spaceship!!"
function collide_with(a:spaceship, b:spaceship) "spaceship**spaceship!!"
function collide_with(a:spaceship, b:asteroid) "spaceship**asteroid!!"

The game is now able to handle all collision scenarios with just a single function call:

collide_with(asteroid(), spaceship())
// asteroid**spaceship!!
collide_with(asteroid(), asteroid())
// asteroid**asteroid!!
collide_with(spaceship(), asteroid())
// spaceship**asteroid!!
collide_with(spaceship(), spaceship())
// spaceship**spaceship!!

6.5.1 Generic Message Dispatch

A function can be declared as generic so that it starts acting as a "message-dispatcher" for its first argument. The first argument must be a closure that can respond to a message with the same name as the generic function. Generics gives us a fast mechanism for dynamically dispatching on the type of only the first argument.

A good example for a generic function is one that concatenates two aggregate objects of the same type. We shall call this function concat. Its declaration is shown below:

function concat(a, b);
declare generic concat

Take care to end the function definition of concat with a semicolon (;) because it has no default implementation.

The declare generic statement specifically instructs the compiler to emit special dispatching code when it sees a call to concat.4 Right now, this function has no behavior. New object definitions can start adding their own implementations of concat. The following code listing shows two implementation of concat, one for concatenating lists and the other for concatenating strings. Note that the implementations must respond to the 'concat message with a function that knows how to concatenate two objects of a particular aggregate.

function listc(a)
  | 'concat -> ^(b) append(a, b)

function stringc(a)
  | 'concat -> ^(b) string_append(a, b)

With these implementations we can use a single interface to concatenate different aggregate types:

concat(listc([1, 2, 3]), [4, 5, 6])
// [1, 2, 3, 4, 5, 6]
concat(stringc("hello "), "world")
// hello world

Some of the basic operators in Slogan are built on top of generics. For example, the equals (==) and other comparison operators are implemented on top of the generics is_equal and compare. Objects that implement these generics automatically gets added to the list of types handled by these operators. Another useful generic is to_string which can be used to get a string representation of any object. Let us implement these generics for an object that represents a point in the Cartesian plane and see how to extend the behavior of the language.

function point(x, y)
{ function cmp(p)
    if (p.x == x && p.y == y) 0
    else if (x < p.x && y < p.y) -1
    else 1

  | 'x -> x
  | 'y -> y
  | 'is_equal -> ^(p) x == p.x && y == p.y
  | 'compare -> cmp
  | 'to_string -> ^() string_append(number_to_string(x), "@", number_to_string(y)) }

A bit of explanation is required for the implementation of compare. If two objects are deemed to have the same value, compare must return 0. If the first value is less than the second, return -1 and if it is greater return 1. For adding explicit support for the == and <> operators, overriding the is_equal generic is required.

Let's see how point objects integrate with the rest of the language:

let p1 = point(10, 20)
let p2 = point(30, 40)
p1 == p2
// false
p1 <> p2
// true
p1 == point(10, 20)
// true
p1 < p2
// true
p1 >= p2
// false
p1 >= point(10, 20)
// true
map(to_string, sort([p1, point(1, 2), p2]))
// [1@2, 10@20, 30@40]

Later in this book, we will encounter more generics as we start defining our own aggregate types.

6.6 Optional, Rest and Keyword Arguments

Some or all of the parameters of a function can be marked as optional. An optional parameter takes on false as its default value. The following function takes two arguments - the first one is required and the second one is optional:

function f(a, @optional b) [a, b]

// [1, false]
f(1, 10)
// [1, 10]

You can specify a default value for an optional parameter. This way you can also avoid explicitly using the @optional marker.

function f(a, b = 10) [a, b]

// [1, 10]
f(1, 12)
// [1, 12]

Keyword arguments are more flexible than optional arguments because they allow us to specify the arguments by name and not by position. We have already used functions with keyword arguments as constructors for records. These were auto-generated by the compiler. If you want a function to accept keyword arguments, this is how you would define it:

function make_point(@key x, y) x:y

make_point(x = 10, y = 20)
// 10:20
make_point(y = 20, x = 10)
// 10:20

Keyword parameters can have default values:

function make_point(@key x, y = 20) x:y

make_point(x = 100)
// 100:20
make_point(x = 100, y = 200)
// 100:200

A function can also be specified to accept an arbitrary number of arguments. This is done by marking the last parameter as rest. All the additional arguments passed to the function will be packaged into a list that will be bound to this parameter.

function g(a, b, @rest c) a:b:c

g(1, 2, 3)
// [1, 2, 3]
g(1, 2, 3, 4, 5)
// [1, 2, 3, 4, 5]

In an earlier section, we learned about function-building functions. As an example of functions that take an arbitrary number of arguments, let us define two generic function builders. These are the disjoin and conjoin functions. Both take one or more predicates as arguments: disjoin returns a predicate that returns true when any of the predicates return true, and conjoin returns a predicate that returns true when all of the predicates return true. The definitions of these functions follow:

function disjoin(fn, @rest fns)
  if (is_empty(fns)) fn
  else let (disj = apply(disjoin, fns))
    ^(@rest args) apply(fn, args) || apply(disj, args)

function conjoin(fn, @rest fns)
  if (is_empty(fns)) fn
  else let (disj = apply(disjoin, fns))
    ^(@rest args) apply(fn, args) && apply(disj, args)

We can use the disjoin/conjoin pair as shown below:

let d = disjoin(^(a, b) a < b,
                ^(a, b) a == 10 && b == 20)
d(10, 20)
// true
d(10, 30)
// true
d(100, 30)
// false

let c = conjoin(^(a, b) a < b,
                ^(a, b) a == 10 && b == 20)
c(10, 20)
// true
c(10, 30)
// false

1Functions that manipulate functions are known as higher-order functions. The name comes from the concept of order of a function. A function that take no function arguments is of first-order, a function that takes at-least one first-order function as argument is of second-order and so on. Higher-order programming simply means functions can be of any order.

2This notation is derived from Russell and Whitehead's Principia Mathematica, which used a caret over bound variables. Alonzo Church adopted this as the notation for functions, but he moved the caret in front: ^x(x+x). He later replaced the caret with the lowercase lambda character. John McCarthy who invented Lisp, adopted this lambda notation. As there were no Greek letters on the keypunches of that era, McCarthy used (lambda (x) (+ x x)). As there are no Greek letters on most modern keyboards as well, Slogan decided to stick with the original, shorter notation. (Courtesy: Paradigms of Artificial Intelligence Programming by Peter Norvig).

3We will introduce the idea of assigning new values to variables in the next chapter. Once we have understood the implications of assignment and state-mutations, we will write a new version of the back account object which can keep track of its own changing balance.

4The declare statement is used to give specific directives to the Slogan compiler. One of course, is to ask it treat a function as generic. You can also use this statement to add new syntactic forms to the language. Another use is for importing the prototypes of C functions that can be called from a Slogan program. These topics will be covered in later chapters.

Next | Previous | Contents