The world around us is populated by independent objects. If an object's behavior is influenced by its history, then it is said to have state. For example, a bank account has state if it reports a new balance after a successful withdrawal operation. The bank account we defined in the previous chapter do not have state, because the withdrawal operation returned a new bank account instead of updating the current balance. This is not how bank accounts in the real world works!
To create bank account object that behave exactly like its real world counterpart, we need an operation that can change the value associated with its local variable. Slogan provides the assignment operator for this purpose. Assignment is accomplished by the statement:
var = new_value
Using assignment, we will define a new bank account abstraction that support three fundamental operations – balance
for
checking the current balance, deposit
for adding money to the current balance and withdraw
for getting money out of
the account.
function make_bank_account(balance)
where balance > 0
{ function deposit(amount)
balance = balance + amount
function withdraw(amount)
where balance - amount > 0
balance = balance - amount
^(message)
| 'balance -> balance
| 'deposit -> deposit
| 'withdraw -> withdraw }
As it can now update its internal state, a single instance of bank account is able to keep track of the running balance over a series of transactions:
let account = make_bank_account(100)
account.deposit(10)
account.balance
// 110
account.withdraw(80)
account.balance
// 30
Assignment helps a lot in improving the modularity of a system. To understand this better, consider a function that
adds the number 2
to its argument. For the purpose of reporting some metrics, it should also log the number of times
it was called. Without assignment, this is how we would define and use this function:
function add2(x, n)
{ showln("add2: ", n+1)
x+2:n+1 }
let fn = 0
let r:fn2 = add2(10, fn)
//> add2: 1
r
// 12
let r:fn3 = add2(21, fn2)
//> add2: 2
r
// 23
The user of add2
only cares about the result of calling the function, i.e x
added with 2
.
But now he has to worry about the implementation details of add2
and define extra variables to correctly keep track of the
metrics. If the caller is careless about the second argument, the metrics log will become wrong and useless.
The implementation of add2
in not modular because it leaks its internals to other parts of the program.
Assignment allows us to easily fix the problem. We will maintain the count in a changing variable visible only to add2
:
let add2 =
let (call_count = 0)
function (x)
{ call_count = call_count + 1
showln("add2: ", call_count)
x + 2 }
add2(10)
//> add2: 1
// 12
add2(21)
//> add2: 2
// 23
As we saw in the preceding section, assignment helps to model objects in the real world in a modular fashion. But this convenience comes with a price. Once we introduce assignment, it becomes harder to reason about the functions we write. As long as we don't do assignments, two invocations of the same function with the same arguments will produce the same result. In other words, functions without assignment are computing mathematical functions with easily verifiable properties. This style of programming is known as functional programming.
Consider a simple function that decrements a number by a given amount. Let us first write a functional version and how see how simple reasoning techniques can be applied to verify its correctness.
function make_dec(balance)
^(amount) balance - amount
Does the function returned by make_dec
work as expected? Is it possible to verify its correctness without actually running it?
For instance, consider the function returned by the call make_dec(100)
:
let d = make_dec(100)
What will be the result of calling d(20)
followed by a call to d(45)
?
It will produce 80
and 55
. We can verify this by simply substituting the
variables balance
and amount
in the function body with their respective values. We can "expand" the function call
to the following steps:
d(20)
=> balance - amount
=> 100 - 20
=> 80
d(45)
=> balance - amount
=> 100 - 45
=> 55
We are able to substitute the variable names with their values because the variables are guaranteed not to change. In other words, they are said to be referentially transparent.
Now let us look at a new version of the decrementor function that depend on assignment:
function make_dec(balance)
^(amount)
{ balance = balance - amount
balance }
Will the simple reasoning based on value substitution work for this new implementation? Let us find out by manually evaluating the same sequence of calls as before:
d(20)
=> { balance = balance - amount; balance }
=> { balance = 100 - 20; 100 }
=> 100
d(45)
=> { balance = balance - amount; balance }
=> { balance = 100 - 20; 100 }
=> 100
Our reasoning is obviously wrong1. If we actually run the code, we will get 80
and 35
as results. Once assignment is brought in we loose the ability to do such simple reasoning about functions. We can no
longer assume that a variable is just a name for a value. It now reflects on the underlying computer architecture. The variable has become
a reference to a location in the computer's memory, which can change. The programming model that makes extensive use of assignment is known
as imperative programming.
Assignment also affects the way how two objects are compared for equality. Without assignment two objects that were equal when they were created continues to be so for the rest of their lives. This is because their internal state never changes. In fact, a smart compiler can even optimize on storage by putting both objects at the same location in memory and doing a fast equality check by just comparing memory addresses. But once we add assignment to the language, for any equality check to be reliable, the structure of both objects has to be inspected.
Assignment tends to be an expensive operation in modern multi-core chips with multiple levels of caches. In functional programming variables are initialized once and they never changes. This leads to efficient caching of frequently accessed values at the hardware level. When a new value is assigned, the cached values has to be flushed and written back to the main memory. This might make a computation with assignments costlier than a purely functional one.
Assignment is also more susceptible to bugs because the order of doing assignments is important. The situation becomes more complicated once we have multiple threads of execution in the same program. Slogan provide elegant solutions for managing state in concurrent programs. This will be the topic of the next two chapters.
Earlier in this book, we learned how to do iterations with recursive function calls.
The named letfn
construct is a generalization
on this to enable local iterations. The iterations we did so far were functional because they don't have to change state
to keep track of the loops boundary conditions. For example, consider the following loop. On each iteration, a new value is created and passed to
the recursive call to loop
. No mutation of existing state is involved.
letfn loop (i = 0)
when (i <= 2)
{ showln(i)
loop(i + 1) }
//> 0
1
2
If you have programmed in an imperative language like C or Java, you must be more familiar with writing this kind of iterations using the
for
or while
statements. Though both styles will work for simple loops like the one above,
some algorithms can be implemented
efficiently and cleanly only in an imperative style. Slogan provides the for
looping construct to meet such
a requirement. The preceding example can be rewritten using for
as shown below:
for (i = 0 to 2)
showln(i)
//> 0
1
2
Counting downwards is accomplished by,
for (i = 2 downto 0)
showln(i)
//> 2
1
0
You are allowed to have explicit control over the looping counter and condition:
for (i = 5; i < 15; i+3)
showln(i)
//> 5
8
11
14
The for
loop construct is an expression in Slogan. It returns the value of the last expression
evaluated in its body.
let s = for (j = 0, i = 5; i < 15; i+3)
{ j = j + i
j }
s
// 38
You can exit early from the loop by calling the break
expression. This expression returns
false
by default. Break
can be asked to return another value,
by calling it like a function and passing the required value as argument.
for (i = 0; i < 10; i + 1)
when (i == 5) break
// false
for (i = 0; i < 10; i + 1)
when (i == 5) break(i + 1)
// 6
The continue
expression allows control to return to the top of the loop. Just like
break
, an optional argument can be passed to continue
.
This will become the value of the loop if continue
happens to be the last expression
evaluated in its body.
for (j = 0, i = 0; i < 10; i + 1)
if (remainder(i, 3) == 0) continue
else { j = j + i; j }
// false
for (j = 0, i = 0; i < 10; i + 1)
if (remainder(i, 3) == 0) continue(j)
else { j = j + i; j }
// 27
for (j = 0, i = 0; i < 10; i + 1)
if (remainder(i, 2) == 0) continue
else { j = j + i; j }
// 25
If your style of programming demands more imperative control structures like the for
loop, you can
add them to the language! We will figure out how to do this and more in the last chapter of Part I.
Once we have assignment in our language, we also have mutable data structures. For instance, it is now possible to have functions that change the values in the head or tail of a pair:
Other data structures like lists, arrays and hash tables can also be mutated:
let xs = [10, 20, 30]
xs[1] = xs[0] + xs[2]
xs
// [10, 40, 30]
let ys = #[10, 20, 30]
ys[1] = ys[1] * ys[2]
ys
// #[10, 600, 30]
let zs = #{'name:'alice, 'age:3}
zs['age] = 2
zs;
// #{name: alice, age: 2}
// for faster updates, use type-specific mutators/accessors
// instead of the generic [] operator.
list_set(xs, 0, "hello")
xs
// [hello, 40, 30]
array_set(ys, 0, \a)
ys
// #[\a, 600, 30]
hashtable_set(zs, 'name, 'bob)
zs
// #{name: bob, age: 2}
In this section, we will see how to define new mutable data structures. We will define a queue, a sequence in which items are inserted at one end (called the rear of the queue) and deleted from the other end (the front). The queue data structure will support the following operations:
make_queue – construct and return an empty queue
front_queue – select and return the first item in the queue
insert_queue – insert a new item at the rear of the queue
delete_queue – remove the first item in the queue
is_empty_queue – check if the queue is empty
These operations will internally use the following functions to access and modify the pointers to front and rear of the queue. Two lists will represent the front and rear parts and they enable constant time lookup and insertion operations.
function front_ptr(queue) head(queue)
function rear_ptr(queue) tail(queue)
function set_front_ptr(queue, item) set_head(queue, item)
function set_rear_ptr(queue, item) set_tail(queue, item)
The queue constructor return a pair of empty lists.
function make_queue() []:[]
A queue is considered empty if its front part is the empty list.
function is_empty_queue(q) is_empty(front_ptr(q))
Insertion should check for two conditions. If the queue is empty a list is created with the new item and both the front an rear pointers point to this list. If the queue is not empty, the rear pointer is updated to point to this list. Then the rear pointer is "truncated" to make sure the next insertion need not traverse all the elements in the queue.
function insert_queue(q, item) let (new_entry = [item]) if (is_empty_queue(q)) { set_front_ptr(q, new_entry) set_rear_ptr(q, new_entry) } else { set_tail(rear_ptr(q), new_entry) set_rear_ptr(q, new_entry) }
To select the item at the front of the queue, we return the head of the pair indicated by the front pointer.
function front_queue(q)
if (is_empty_queue(q))
error("front called on an empty queue")
else
head(front_ptr(q))
To delete the item at the front of the queue, we merely modify the front pointer so that it now points at the second item in the queue.
function delete_queue(q)
if (is_empty_queue(q))
error("delete called on an empty queue")
else
set_front_ptr(q, tail(front_ptr(q)))
Now it's time to see the queue data structure in action!
let q = make_queue()
insert_queue(q, 1)
insert_queue(q, 2)
front_queue(q)
// 1
delete_queue(q)
insert_queue(q, 3)
front_queue(q)
// 2
delete_queue(q)
front_queue(q)
// 3
Each insert and delete operation mutates the internal state of the queue. An advantage of this is that, we are able to do all operations in constant time. On the downside, we lose a version of the queue with an insert or delete operation. There is no way to "go back" and see the contents of the queue at point past in time. We also cannot easily share this queue across simultaneously executing computations (threads or tasks) without adding code for data synchronization. For applications that require these properties, we need a queue that is immutable.
In the pure functional model that disallow assignments, the best we can do is to get amortized constant-time insert and delete operations.
That is, any sequence of n
insert and delete operations takes a total time that is proportional to some constant times n
.
Any individual operation might not be constant-time, however.
The functional queue is also represented as a pair of lists, one for the front and the other for the rear.
At any instant, the queue content is given by
append(front, reverse(rear))
.
An element is inserted by adding it to the head of rear
and deleted by removing it from
the head of front
. To make this representation work, each element in rear
has to
be moved to front
sooner or later. This should be done only occasionally, when front
becomes empty. Then rear
is reversed and moved to front
. The reverse takes time
proportional to the length of rear
, i.e., to the number of elements it reverses. Each
element that goes through the queue is passed exactly once from rear
to front
.
Allocating the execution time of reverse
to each element therefore gives a constant time per element.
This is why the queue is amortized.
Having described how our immutable queue works, let us move on to its implementation. The queue constructor needs to accept the
front
and rear
lists as optional arguments. This is because each insert and delete operation has to create
a new queue, instead of mutating the existing one. We will also define a helper function called
check_queue
whose purpose is to reverse and move rear
to front
if front
is empty.
function make_queue(f = [], r = []) f:r
function check_queue(q)
if (is_empty(front_ptr(q)))
make_queue(reverse(rear_ptr(q)))
else q
The insert_queue
function adds the new item to the head of rear and makes sure the queue is ready to return the next item
from its front
. Similarly, delete_queue
also ensures the required invariant after removing the
head of front
.
function insert_queue(q, item)
check_queue(make_queue(front_ptr(q), item:rear_ptr(q)))
function delete_queue(q)
check_queue(make_queue(tail(front_ptr(q)), rear_ptr(q)))
The program below show operations on the queue. As this queue is immutable, we need to explicitly keep track of instances to reach a desired final state.
let q = insert_queue(insert_queue(make_queue(), 1), 2)
front_queue(q)
// 1
front_queue(delete_queue(insert_queue(delete_queue(q), 3)))
//3
The functions that implemented the immutable queue has effectively over-written the imperative queue implementation. Is it possible to have both implementations in the same program, while maintaining a single "protocol" for operations on queues? The obvious solution is to implement the queues as two closures, one imperative and the other functional.
Slogan provide a convenient way to describe closure objects, without having to explicitly define and return a "message-handler" function.
This is accomplished by the module
expression. A module
definition
has the following syntax:
module <name> (<export_list>) <definitions>
The export_list
contain the names from definitions
that must be visible outside the module. These are essentially
the "messages" an instance of the module can respond to.
Let us define the imperative queue as a module:
module mutable_queue(make, is_empty:is_empty_queue,
front, insert, delete)
{ function front_ptr(queue) head(queue)
function rear_ptr(queue) tail(queue)
function set_front_ptr(queue, item) set_head(queue, item)
function set_rear_ptr(queue, item) set_tail(queue, item)
function make() []:[]
function is_empty_queue(q) is_empty(front_ptr(q))
function insert(q, item)
let (new_entry = [item])
if (is_empty_queue(q))
{ set_front_ptr(q, new_entry)
set_rear_ptr(q, new_entry) }
else
{ set_tail(rear_ptr(q), new_entry)
set_rear_ptr(q, new_entry) }
function front(q)
if (is_empty_queue(q))
error("front called on an empty queue")
else
head(front_ptr(q))
function delete(q)
if (is_empty_queue(q))
error("delete called on an empty queue")
else
set_front_ptr(q, tail(front_ptr(q))) }
The functions make
, is_empty_queue
, front
, insert
and delete
are exported and made visible to the consumers of the module. These names constitute the public interface
of the module. The functions front_ptr
, rear_ptr
,
set_front_ptr
and set_rear_ptr
are used in the implementation of the module and need not be published to
the outside world.
Note that for all public functions, except for is_empty_queue
we have omitted the _queue
suffix.
This is because the module name makes the extra suffix redundant. We retained the suffix for is_empty_queue
so that
it won't conflict with the is_empty
built-in function used in its implementation. But note that we export
is_empty_queue
as is_empty
. This achieved by the export specification – is_empty:is_empty_queue
.
The exported name will not conflict with the global is_empty
because it is scoped by the module name.
This is how we would use the new mutable_queue
module:
let mq = mutable_queue
let q = mq.make()
mq.insert(q, 1)
mq.insert(q, 2)
mq.front(q)
// 1
mq.delete(q)
mq.insert(q, 3)
mq.front(q)
// 2
mq.delete(q)
mq.front(q)
// 3
Modules
are first-class objects in Slogan. They can be assigned to variables, they can be passed as arguments to functions and
they can be dynamically created and returned from functions. This is why we could assign the shorter alias mq
to refer to
the mutable_queue
module.
We shall put the immutable queue into its own module as well:
module immutable_queue(make, is_empty:is_empty_queue,
front, insert, delete)
{ function front_ptr(queue) head(queue)
function rear_ptr(queue) tail(queue)
function make(f = [], r = []) f:r
function check(q)
if (is_empty(front_ptr(q)))
make(reverse(rear_ptr(q)))
else q
function is_empty_queue(q) is_empty(front_ptr(q))
function front(q)
if (is_empty_queue(q))
error("front called on an empty queue")
else head(front_ptr(q))
function insert(q, item)
check(make(front_ptr(q), item:rear_ptr(q)))
function delete(q)
check(make(tail(front_ptr(q)), rear_ptr(q))) }
This is how we would use the immutable_queue
module:
let imq = immutable_queue
let q = imq.insert(imq.insert(imq.make(), 1), 2)
imq.front(q)
// 1
imq.front(imq.delete(imq.insert(imq.delete(q), 3)))
// 3
Now we have two separate queue modules that share a single protocol. How can we package these two together so that users can choose between the mutable and immutable implementations as they wish? One way to do this is to put them into a "super" module as shown below:
module queue(mutable_queue, immutable_queue)
{ module mutable_queue(...) { ... }
module immutable_queue(...) { ... } }
Another option is to make use of the namespace
facility:
namespace queue
{ module mutable_queue(...) { ... }
module immutable_queue(...) { ... } }
All members in a namespace
are public and are accessible via the ::
operator.
let mq = queue::mutable_queue
let imq = queue::immutable_queue
The primary purpose of a namespace is to arrange modules and other namespaces into logical groups. Just like modules, namespaces are first-class objects in Slogan.
1We did not substitute balance
before the assignment operator
because the expression 100 = 100 - 20
does not make sense and cannot be evaluated.