Toss them, Coins!

2019-May-21

Random generation of events is important for modeling many statistical processes. In this post we will discuss how to simulate the tossing of an unbiased coin in Clojure. We will do this using pure functional techniques. We will also write functions to derive some interesting statistics from coin-toss experiments.

Simulating Coin Tosses

A coin toss can result in either a head or a tail. We shall assign the integer 1 to head and 0 to tail, so that we can use the rand-int function to simulate a single toss.


> (defn toss [] (rand-int 2))
> (toss)	  
;;=> will output 1 or 0 with equal chance. probability=0.5
      

How can we simulate N tosses? With the help of lazy sequences. The following function will give us as many tosses as we want!


> (defn tosses []
    (lazy-seq (cons (toss) (tosses))))

> (take 10 (tosses))
;;=> (1 1 0 1 0 1 0 0 0 1)
      

Functional Inquiry

Now we would want to answer some questions about the data gathered from all those tosses, like how many heads were generated by the experiment? The higher-order function reduce can help us here:


> (reduce + 0 (take 10 (tosses)))
;;=> 5
      

Note You may get a different result, as the simulation includes random values.

It is well-known that the ratio of the number of heads (or tails) to the total number of tosses will approach 0.5 as the number of tosses increases. To test this out, we first need a version of reduce that returns a sequence of cumulative results. The definition of this function follows:


(defn cumulate [f init xs]
  (when (seq xs)
    (let [r (f init (first xs))]
      (lazy-seq (cons r (cumulate f r (rest xs)))))))
      

Cumulate applies the function f to init and the first element of xs. The result (r) becomes the head of a lazy sequence. The tail of this sequence is constructed by a recursive call to cumulate with r being the value of init. This process continues until xs runs out of values. The returned sequence will contain all intermediate results of the reduction of xs by f.

Let's call reduce and cumulate on the same inputs and see how the outputs differ:


> (reduce + 0 [1 2 3 4 5])
;;=> 15
> (cumulate + 0 [1 2 3 4 5])
;;=> (1 3 6 10 15)
      

The value returned by reduce and the final value in the sequence returned by cumulate are the same. In addition, cumulate also captures the intermediate values of the reduction.

Cumulative summing of sequences can be expressed by the following partial application of cumulate:


> (def sums (partial cumulate + 0))
      

We have all the tools necessary to experiment and find out if heads appear with a probability of 0.5 if a coin is tossed for a sufficiently large number of times. The following program will conduct this experiment for us:


> (def xs (take 3000 (tosses)))
> (def ss (sums xs))
> (def rs (map #(/ (float %1) %2) ss (range 1 3001)))
> (drop 2500 rs)
;;=> (0.5041983206717313 0.504396482813749 0.5041949660407511 ...)
      

First, 3000 tosses are made. The number of heads seen until each toss is found by calling sums. The higher-order function map walks down this sequence, dividing each entry by the corresponding toss number to produce a sequence of ratios.

As we see in the final output, the ratio of heads indeed approach 0.5.

There are other interesting information that we can derive from the data returned by tosses. For instance, can you design a function to report the excess of heads over tails for a particular toss?

This problem can be solved by finding the difference between the cumulative number of heads and the cumulative number of tails for the specified toss. We can find the cumulative of tails by first "inverting" the tosses and then applying sums to the result. Here is the invert function:


> (defn inv [xs]
    (map #(if (zero? %) 1 0) xs))

> (def xs (take 10 (tosses)))
> xs
;;=> (0 0 0 1 1 1 0 0 1 0)

> (def ys (inv xs))
> ys
;;=> (1 1 1 0 0 0 1 1 0 1)

      

The function excess defined below, finds the difference between a sequence and its inverse:


> (defn excess [xs n]
    (let [hs (sums xs)
	  ts (sums (inv xs))]
      (nth (map #(- %1 %2) hs ts) n)))

> (excess xs 4)
;;=> -1
> (excess xs 5)
;;=> 0
> (excess xs 6)
;;=> -1
      

Exercise 1 Write a function to report the first n excesses of tails over heads.

Exercise 2 Use Incanter to visualize the various information you gather about coin tosses. To start, you may plot the ratios obtained from the first 3000 tosses.

Note The inspiration for this post came from this lecture by Prof. Keith Smillie.