2019-May-21

Random generation of events is important for modeling many statistical processes. In this post we will discuss how to simulate the tossing of an unbiased coin in Clojure. We will do this using pure functional techniques. We will also write functions to derive some interesting statistics from coin-toss experiments.

A coin toss can result in either a head or a tail. We shall assign the integer `1`

to head and `0`

to tail,
so that we can use the `rand-int`

function to simulate a single toss.

```
> (defn toss [] (rand-int 2))
> (toss)
;;=> will output 1 or 0 with equal chance. probability=0.5
```

How can we simulate `N`

tosses? With the help
of lazy sequences.
The following function will give us as many tosses as we want!

```
> (defn tosses []
(lazy-seq (cons (toss) (tosses))))
> (take 10 (tosses))
;;=> (1 1 0 1 0 1 0 0 0 1)
```

Now we would want to answer some questions about the data gathered from all those tosses, like how many heads were generated
by the experiment? The higher-order function `reduce`

can help us here:

```
> (reduce + 0 (take 10 (tosses)))
;;=> 5
```

**Note** You may get a different result, as the simulation includes random values.

It is well-known that the ratio of the number of heads (or tails) to the total number of tosses will approach `0.5`

as
the number of tosses increases. To test this out, we first need a version of `reduce`

that returns a sequence of cumulative
results. The definition of this function follows:

```
(defn cumulate [f init xs]
(when (seq xs)
(let [r (f init (first xs))]
(lazy-seq (cons r (cumulate f r (rest xs)))))))
```

`Cumulate`

applies the function `f`

to `init`

and the first element of `xs`

.
The result (`r`

) becomes the head of a lazy sequence. The tail of this sequence is constructed by
a recursive call to `cumulate`

with `r`

being the value of `init`

. This process continues
until `xs`

runs out of values. The returned sequence will contain all intermediate results of the reduction of `xs`

by `f`

.

Let's call `reduce`

and `cumulate`

on the same inputs and see how the outputs differ:

```
> (reduce + 0 [1 2 3 4 5])
;;=> 15
> (cumulate + 0 [1 2 3 4 5])
;;=> (1 3 6 10 15)
```

The value returned by `reduce`

and the final value in the sequence returned by `cumulate`

are the same.
In addition, `cumulate`

also captures the intermediate values of the reduction.

Cumulative summing of sequences can be expressed by the following partial application of `cumulate`

:

```
> (def sums (partial cumulate + 0))
```

We have all the tools necessary to experiment and find out if heads appear with a probability of `0.5`

if a coin is
tossed for a sufficiently large number of times. The following program will conduct this experiment for us:

```
> (def xs (take 3000 (tosses)))
> (def ss (sums xs))
> (def rs (map #(/ (float %1) %2) ss (range 1 3001)))
> (drop 2500 rs)
;;=> (0.5041983206717313 0.504396482813749 0.5041949660407511 ...)
```

First, `3000`

tosses are made. The number of heads seen until each toss is found by calling `sums`

.
The higher-order function `map`

walks down this sequence, dividing each entry by the corresponding toss number
to produce a sequence of ratios.

As we see in the final output, the ratio of heads indeed approach `0.5`

.

There are other interesting information that we can derive from the data returned by `tosses`

.
For instance, can you design a function to report the excess of heads over tails for a particular toss?

This problem can be solved by finding the difference between the cumulative number of heads and the cumulative number of tails
for the specified toss. We can find the cumulative of tails by first "inverting" the tosses and then applying `sums`

to
the result. Here is the invert function:

```
> (defn inv [xs]
(map #(if (zero? %) 1 0) xs))
> (def xs (take 10 (tosses)))
> xs
;;=> (0 0 0 1 1 1 0 0 1 0)
> (def ys (inv xs))
> ys
;;=> (1 1 1 0 0 0 1 1 0 1)
```

The function `excess`

defined below, finds the difference between a sequence and its inverse:

```
> (defn excess [xs n]
(let [hs (sums xs)
ts (sums (inv xs))]
(nth (map #(- %1 %2) hs ts) n)))
> (excess xs 4)
;;=> -1
> (excess xs 5)
;;=> 0
> (excess xs 6)
;;=> -1
```

**Exercise 1** Write a function to report the first `n`

excesses of tails over heads.

**Exercise 2** Use Incanter to visualize the various
information you gather about coin tosses. To start, you may plot the ratios obtained from the first 3000 tosses.

**Note** The inspiration for this post came from this lecture
by Prof. Keith Smillie.