# State and Modularity

## 7.1 The State of the World

The world around us is populated by independent objects. If an object's behavior is influenced by its history, then it is said to have state. For example, a bank account has state if it reports a new balance after a successful withdrawal operation. The bank account we defined in the previous chapter do not have state, because the withdrawal operation returned a new bank account instead of updating the current balance. This is not how bank accounts in the real world works!

To create bank account object that behave exactly like its real world counterpart, we need an operation that can change the value associated with its local variable. Slogan provides the assignment operator for this purpose. Assignment is accomplished by the statement:

``````
var = new_value
``````

Using assignment, we will define a new bank account abstraction that support three fundamental operations – `balance` for checking the current balance, `deposit` for adding money to the current balance and `withdraw` for getting money out of the account.

``````
function make_bank_account(balance)
where balance > 0
{ function deposit(amount)
balance = balance + amount

function withdraw(amount)
where balance - amount > 0
balance = balance - amount

^(message)
| 'balance -> balance
| 'deposit -> deposit
| 'withdraw -> withdraw }
``````

As it can now update its internal state, a single instance of bank account is able to keep track of the running balance over a series of transactions:

``````
let account = make_bank_account(100)
account.deposit(10)
account.balance
// 110
account.withdraw(80)
account.balance
// 30
``````

Assignment helps a lot in improving the modularity of a system. To understand this better, consider a function that adds the number `2` to its argument. For the purpose of reporting some metrics, it should also log the number of times it was called. Without assignment, this is how we would define and use this function:

``````
x+2:n+1 }

let fn = 0
r
// 12

r
// 23
``````

The user of `add2` only cares about the result of calling the function, i.e `x` added with `2`. But now he has to worry about the implementation details of `add2` and define extra variables to correctly keep track of the metrics. If the caller is careless about the second argument, the metrics log will become wrong and useless.

The implementation of `add2` in not modular because it leaks its internals to other parts of the program. Assignment allows us to easily fix the problem. We will maintain the count in a changing variable visible only to `add2`:

``````
let (call_count = 0)
function (x)
{ call_count = call_count + 1
x + 2 }

// 12

// 23
``````

### 7.1.1 The Costs of Assignment

As we saw in the preceding section, assignment helps to model objects in the real world in a modular fashion. But this convenience comes with a price. Once we introduce assignment, it becomes harder to reason about the functions we write. As long as we don't do assignments, two invocations of the same function with the same arguments will produce the same result. In other words, functions without assignment are computing mathematical functions with easily verifiable properties. This style of programming is known as functional programming.

Consider a simple function that decrements a number by a given amount. Let us first write a functional version and how see how simple reasoning techniques can be applied to verify its correctness.

``````
function make_dec(balance)
^(amount) balance - amount
``````

Does the function returned by `make_dec` work as expected? Is it possible to verify its correctness without actually running it? For instance, consider the function returned by the call `make_dec(100)`:

``````
let d = make_dec(100)
``````

What will be the result of calling `d(20)` followed by a call to `d(45)`? It will produce `80` and `55`. We can verify this by simply substituting the variables `balance` and `amount` in the function body with their respective values. We can "expand" the function call to the following steps:

``````
d(20)
=> balance - amount
=> 100 - 20
=> 80

d(45)
=> balance - amount
=> 100 - 45
=> 55
``````

We are able to substitute the variable names with their values because the variables are guaranteed not to change. In other words, they are said to be referentially transparent.

Now let us look at a new version of the decrementor function that depend on assignment:

``````
function make_dec(balance)
^(amount)
{ balance = balance - amount
balance }
``````

Will the simple reasoning based on value substitution work for this new implementation? Let us find out by manually evaluating the same sequence of calls as before:

``````
d(20)
=> { balance = balance - amount; balance }
=> { balance = 100 - 20; 100 }
=> 100

d(45)
=> { balance = balance - amount; balance }
=> { balance = 100 - 20; 100 }
=> 100
``````

Our reasoning is obviously wrong1. If we actually run the code, we will get `80` and `35` as results. Once assignment is brought in we loose the ability to do such simple reasoning about functions. We can no longer assume that a variable is just a name for a value. It now reflects on the underlying computer architecture. The variable has become a reference to a location in the computer's memory, which can change. The programming model that makes extensive use of assignment is known as imperative programming.

Assignment also affects the way how two objects are compared for equality. Without assignment two objects that were equal when they were created continues to be so for the rest of their lives. This is because their internal state never changes. In fact, a smart compiler can even optimize on storage by putting both objects at the same location in memory and doing a fast equality check by just comparing memory addresses. But once we add assignment to the language, for any equality check to be reliable, the structure of both objects has to be inspected.

Assignment tends to be an expensive operation in modern multi-core chips with multiple levels of caches. In functional programming variables are initialized once and they never changes. This leads to efficient caching of frequently accessed values at the hardware level. When a new value is assigned, the cached values has to be flushed and written back to the main memory. This might make a computation with assignments costlier than a purely functional one.

Assignment is also more susceptible to bugs because the order of doing assignments is important. The situation becomes more complicated once we have multiple threads of execution in the same program. Slogan provide elegant solutions for managing state in concurrent programs. This will be the topic of the next two chapters.

## 7.2 Imperative Loops

Earlier in this book, we learned how to do iterations with recursive function calls. The named `letfn` construct is a generalization on this to enable local iterations. The iterations we did so far were functional because they don't have to change state to keep track of the loops boundary conditions. For example, consider the following loop. On each iteration, a new value is created and passed to the recursive call to `loop`. No mutation of existing state is involved.

``````
letfn loop (i = 0)
when (i <= 2)
{ showln(i)
loop(i + 1) }

//> 0
1
2
``````

If you have programmed in an imperative language like C or Java, you must be more familiar with writing this kind of iterations using the `for` or `while` statements. Though both styles will work for simple loops like the one above, some algorithms can be implemented efficiently and cleanly only in an imperative style. Slogan provides the `for` looping construct to meet such a requirement. The preceding example can be rewritten using `for` as shown below:

``````
for (i = 0 to 2)
showln(i)

//> 0
1
2
``````

Counting downwards is accomplished by,

``````
for (i = 2 downto 0)
showln(i)

//> 2
1
0
``````

You are allowed to have explicit control over the looping counter and condition:

``````
for (i = 5; i < 15; i+3)
showln(i)

//> 5
8
11
14
``````

The `for` loop construct is an expression in Slogan. It returns the value of the last expression evaluated in its body.

``````
let s = for (j = 0, i = 5; i < 15; i+3)
{ j = j + i
j }

s
// 38
``````

You can exit early from the loop by calling the `break` expression. This expression returns `false` by default. `Break` can be asked to return another value, by calling it like a function and passing the required value as argument.

``````
for (i = 0; i < 10; i + 1)
when (i == 5) break
// false

for (i = 0; i < 10; i + 1)
when (i == 5) break(i + 1)
// 6
``````

The `continue` expression allows control to return to the top of the loop. Just like `break`, an optional argument can be passed to `continue`. This will become the value of the loop if `continue` happens to be the last expression evaluated in its body.

``````
for (j = 0, i = 0; i < 10; i + 1)
if (remainder(i, 3) == 0) continue
else { j = j + i; j }
// false

for (j = 0, i = 0; i < 10; i + 1)
if (remainder(i, 3) == 0) continue(j)
else { j = j + i; j }
// 27

for (j = 0, i = 0; i < 10; i + 1)
if (remainder(i, 2) == 0) continue
else { j = j + i; j }
// 25
``````

If your style of programming demands more imperative control structures like the `for` loop, you can add them to the language! We will figure out how to do this and more in the last chapter of Part I.

## 7.3 Variations on a Theme

Once we have assignment in our language, we also have mutable data structures. For instance, it is now possible to have functions that change the values in the head or tail of a pair:

``````
let p = 10:20
p
// 10:20

p
// 100:2000
``````

Other data structures like lists, arrays and hash tables can also be mutated:

``````
let xs = [10, 20, 30]
xs[1] = xs[0] + xs[2]
xs
// [10, 40, 30]

let ys = #[10, 20, 30]
ys[1] = ys[1] * ys[2]
ys
// #[10, 600, 30]

let zs = #{'name:'alice, 'age:3}
zs['age] = 2
zs;
// #{name: alice, age: 2}

// for faster updates, use type-specific mutators/accessors
// instead of the generic [] operator.
list_set(xs, 0, "hello")
xs
// [hello, 40, 30]

array_set(ys, 0, \a)
ys
// #[\a, 600, 30]

hashtable_set(zs, 'name, 'bob)
zs
// #{name: bob, age: 2}
``````

### 7.3.1 An Imperative Queue

In this section, we will see how to define new mutable data structures. We will define a queue, a sequence in which items are inserted at one end (called the rear of the queue) and deleted from the other end (the front). The queue data structure will support the following operations:

``````
make_queue      – construct and return an empty queue
front_queue     – select and return the first item in the queue
insert_queue    – insert a new item at the rear of the queue
delete_queue    – remove the first item in the queue
is_empty_queue  – check if the queue is empty
``````

These operations will internally use the following functions to access and modify the pointers to front and rear of the queue. Two lists will represent the front and rear parts and they enable constant time lookup and insertion operations.

``````
function rear_ptr(queue) tail(queue)
function set_rear_ptr(queue, item) set_tail(queue, item)
``````

The queue constructor return a pair of empty lists.

``````
function make_queue() []:[]
``````

A queue is considered empty if its front part is the empty list.

``````
function is_empty_queue(q) is_empty(front_ptr(q))
``````

Insertion should check for two conditions. If the queue is empty a list is created with the new item and both the front an rear pointers point to this list. If the queue is not empty, the rear pointer is updated to point to this list. Then the rear pointer is "truncated" to make sure the next insertion need not traverse all the elements in the queue.

``````
function insert_queue(q, item)
let (new_entry = [item])
if (is_empty_queue(q))
{ set_front_ptr(q, new_entry)
set_rear_ptr(q, new_entry) }
else
{ set_tail(rear_ptr(q), new_entry)
set_rear_ptr(q, new_entry) }
```    ```

To select the item at the front of the queue, we return the head of the pair indicated by the front pointer.

``````
function front_queue(q)
if (is_empty_queue(q))
error("front called on an empty queue")
else
``````

To delete the item at the front of the queue, we merely modify the front pointer so that it now points at the second item in the queue.

``````
function delete_queue(q)
if (is_empty_queue(q))
error("delete called on an empty queue")
else
set_front_ptr(q, tail(front_ptr(q)))
``````

Now it's time to see the queue data structure in action!

``````
let q = make_queue()
insert_queue(q, 1)
insert_queue(q, 2)
front_queue(q)
// 1
delete_queue(q)
insert_queue(q, 3)
front_queue(q)
// 2
delete_queue(q)
front_queue(q)
// 3
``````

Each insert and delete operation mutates the internal state of the queue. An advantage of this is that, we are able to do all operations in constant time. On the downside, we lose a version of the queue with an insert or delete operation. There is no way to "go back" and see the contents of the queue at point past in time. We also cannot easily share this queue across simultaneously executing computations (threads or tasks) without adding code for data synchronization. For applications that require these properties, we need a queue that is immutable.

### 7.3.2 A Functional Queue

In the pure functional model that disallow assignments, the best we can do is to get amortized constant-time insert and delete operations. That is, any sequence of `n` insert and delete operations takes a total time that is proportional to some constant times `n`. Any individual operation might not be constant-time, however.

The functional queue is also represented as a pair of lists, one for the front and the other for the rear. At any instant, the queue content is given by `append(front, reverse(rear))`. An element is inserted by adding it to the head of `rear` and deleted by removing it from the head of `front`. To make this representation work, each element in `rear` has to be moved to `front` sooner or later. This should be done only occasionally, when `front` becomes empty. Then `rear` is reversed and moved to `front`. The reverse takes time proportional to the length of `rear`, i.e., to the number of elements it reverses. Each element that goes through the queue is passed exactly once from `rear` to `front`. Allocating the execution time of `reverse` to each element therefore gives a constant time per element. This is why the queue is amortized.

Having described how our immutable queue works, let us move on to its implementation. The queue constructor needs to accept the `front` and `rear` lists as optional arguments. This is because each insert and delete operation has to create a new queue, instead of mutating the existing one. We will also define a helper function called `check_queue` whose purpose is to reverse and move `rear` to `front` if `front` is empty.

``````
function make_queue(f = [], r = []) f:r

function check_queue(q)
if (is_empty(front_ptr(q)))
make_queue(reverse(rear_ptr(q)))
else q
``````

The `insert_queue` function adds the new item to the head of rear and makes sure the queue is ready to return the next item from its `front`. Similarly, `delete_queue` also ensures the required invariant after removing the head of `front`.

``````
function insert_queue(q, item)
check_queue(make_queue(front_ptr(q), item:rear_ptr(q)))

function delete_queue(q)
check_queue(make_queue(tail(front_ptr(q)), rear_ptr(q)))
``````

The program below show operations on the queue. As this queue is immutable, we need to explicitly keep track of instances to reach a desired final state.

``````
let q = insert_queue(insert_queue(make_queue(), 1), 2)
front_queue(q)
// 1
front_queue(delete_queue(insert_queue(delete_queue(q), 3)))
//3
``````

## 7.4 Modules and Namespaces

The functions that implemented the immutable queue has effectively over-written the imperative queue implementation. Is it possible to have both implementations in the same program, while maintaining a single "protocol" for operations on queues? The obvious solution is to implement the queues as two closures, one imperative and the other functional.

Slogan provide a convenient way to describe closure objects, without having to explicitly define and return a "message-handler" function. This is accomplished by the `module` expression. A `module` definition has the following syntax:

``````
module <name> (<export_list>) <definitions>
``````

The `export_list` contain the names from `definitions` that must be visible outside the module. These are essentially the "messages" an instance of the module can respond to.

Let us define the imperative queue as a module:

``````
module mutable_queue(make, is_empty:is_empty_queue,
front, insert, delete)
function rear_ptr(queue) tail(queue)
function set_rear_ptr(queue, item) set_tail(queue, item)

function make() []:[]

function is_empty_queue(q) is_empty(front_ptr(q))

function insert(q, item)
let (new_entry = [item])
if (is_empty_queue(q))
{ set_front_ptr(q, new_entry)
set_rear_ptr(q, new_entry) }
else
{ set_tail(rear_ptr(q), new_entry)
set_rear_ptr(q, new_entry) }

function front(q)
if (is_empty_queue(q))
error("front called on an empty queue")
else

function delete(q)
if (is_empty_queue(q))
error("delete called on an empty queue")
else
set_front_ptr(q, tail(front_ptr(q))) }
``````

The functions `make`, `is_empty_queue`, `front`, `insert` and `delete` are exported and made visible to the consumers of the module. These names constitute the public interface of the module. The functions `front_ptr`, `rear_ptr`, `set_front_ptr` and `set_rear_ptr` are used in the implementation of the module and need not be published to the outside world.

Note that for all public functions, except for `is_empty_queue` we have omitted the `_queue` suffix. This is because the module name makes the extra suffix redundant. We retained the suffix for `is_empty_queue` so that it won't conflict with the `is_empty` built-in function used in its implementation. But note that we export `is_empty_queue` as `is_empty`. This achieved by the export specification – `is_empty:is_empty_queue`. The exported name will not conflict with the global `is_empty` because it is scoped by the module name.

This is how we would use the new `mutable_queue` module:

``````
let mq = mutable_queue
let q = mq.make()
mq.insert(q, 1)
mq.insert(q, 2)
mq.front(q)
// 1

mq.delete(q)
mq.insert(q, 3)
mq.front(q)
// 2

mq.delete(q)
mq.front(q)
// 3
``````

`Modules` are first-class objects in Slogan. They can be assigned to variables, they can be passed as arguments to functions and they can be dynamically created and returned from functions. This is why we could assign the shorter alias `mq` to refer to the `mutable_queue` module.

We shall put the immutable queue into its own module as well:

``````
module immutable_queue(make, is_empty:is_empty_queue,
front, insert, delete)
function rear_ptr(queue) tail(queue)

function make(f = [], r = []) f:r

function check(q)
if (is_empty(front_ptr(q)))
make(reverse(rear_ptr(q)))
else q

function is_empty_queue(q) is_empty(front_ptr(q))

function front(q)
if (is_empty_queue(q))
error("front called on an empty queue")

function insert(q, item)
check(make(front_ptr(q), item:rear_ptr(q)))

function delete(q)
check(make(tail(front_ptr(q)), rear_ptr(q))) }
``````

This is how we would use the `immutable_queue` module:

``````
let imq = immutable_queue
let q = imq.insert(imq.insert(imq.make(), 1), 2)
imq.front(q)
// 1
imq.front(imq.delete(imq.insert(imq.delete(q), 3)))
// 3
``````

Now we have two separate queue modules that share a single protocol. How can we package these two together so that users can choose between the mutable and immutable implementations as they wish? One way to do this is to put them into a "super" module as shown below:

``````
module queue(mutable_queue, immutable_queue)
{ module mutable_queue(...) { ... }
module immutable_queue(...) { ... } }
``````

Another option is to make use of the `namespace` facility:

``````
namespace queue
{ module mutable_queue(...) { ... }
module immutable_queue(...) { ... } }
``````

All members in a `namespace` are public and are accessible via the `::` operator.

``````
let mq = queue::mutable_queue
let imq = queue::immutable_queue
``````

The primary purpose of a namespace is to arrange modules and other namespaces into logical groups. Just like modules, namespaces are first-class objects in Slogan.

1We did not substitute `balance` before the assignment operator because the expression `100 = 100 - 20` does not make sense and cannot be evaluated.