Sunday, September 21, 2008

Caching recursive functions in F#

A classical example of a recursive definition is the Fibonacci series. We could implement it in F as:
let rec fib = function
              | 0 -> 0
              | 1 -> 1
              | n -> fib (n-1) + fib (n-2)

This works, but it is quite inefficient as many intermediary results are calculated over and over again. It would be great to have a caching version of this function.
One of the things I like about F is the ability to easily and generically create caching versions of functions. Below is a possible implementation:
let cached f=
  let cache=new Dictionary<_,_>()
  fun a -> let ok,b=cache.TryGetValue a
           if ok then b
           else let result=f a
                lock cache (fun ()->cache.[a]<-result)
                 result

However: this doesn't help us much, because applying this function caches only the final result and not any intermediary results. The only way to start caching intermediary results is to break up the fib function:
let splitFib sub n = match n with
                     | 0 -> 0
                     | 1 -> 1
                     | _ -> sub (n-1) + sub (n-2)

We have substituted the recursive call by a call to a function parameter. Conceptually we would like to define our caching version of Fibonacci as:
let rec cachedFib = splitFib cachedFib |> cached

However, the F compiler will not tolerate this, raising "The value 'cachedFib' will be evaluated as part of its own definition." Therefore we are forced to explicitly mention to the compiler that the evaluation of cachedFib actually occurs at runtime only:
let rec cachedFib = splitFib (fun n->cachedFib n) |> cached

So we have established a simple 2 step procedure to create caching versions of recursive functions:
1. Create a "split" function by using a function parameter to handle all recursive calls
2. Use this "split" function to create a caching version by supplying an indirect recursive argument

Wednesday, September 17, 2008

The beauty of F# programming without side-effects

My first digression in F# was to build a sequence generating primes. Not really an original task, but then again it's better to start conquering some common ground.
So let's start easy:
let rec primes = seq {
  yield 2
  yield! Seq.filter is_prime {3..2..System.Int32.MaxValue} }

Enumerations are called sequences (hence seq) in F# and the yield syntax of F# is very much like C#, except for the yield! statement, that simply yields everything in the sequence that follows it.
The intentions are clear enough: 2 must be a prime and it will be followed by all odd numbers that are prime. Obviously this will not compile yet, as we haven't defined the function is_prime.
and is_prime n = n>1 && n=first_factor n

We have defined a prime as a number larger than 1 that equals the smallest number dividing it.
and first_factor n =
  let factor = primes
               |> Seq.take_while (fun x -> n>=x*x)
               |> Seq.tryfind (fun x -> n%x=0)
  if factor=None then n else factor.Value

The forward pipeline operator |> is widely described as 'merely' syntactic sugar: x |> f is equivalent to f(x). But all who have suffered from brackets in Lisp will greet this operator with cheer.
The function first_factor attempts to find the first factor by trying all primes. There is no need to try primes with a square larger than the current number, so we truncate the stream of primes with this criterion and the filter is also useful to prevent us creating an endless recursion :-) If no factor has been found, the number itself must be the first factor.
This implementation of the prime generator is completely side-effect free! Moreover it has a very clean interface. No implementation dependant parameters - such as e.g. size of the sieving window - are exposed. The downside is that this implementation is not very fast - I will get to that in another post.
To print the first 10 primes, we can simply execute
primes |> Seq.truncate 10 |> Seq.iter (printfn "%d")

The clean interface allows us to simply extend the functionality to factorization:
let rec prime_factors n = seq {
  let p = first_factor n
  yield p
  if n>p then yield! prime_factors (n/p) }

This returns a sequence of all the prime factors of a number in order. As so many times, this F# function is terse and concise and hardly needs explanation.
For all the things (all people for that matter) you love, there is usually a defining moment when you get hooked. For me it was when I realized how to compose a function to determine the number of dividers:
let divider_count=
  prime_factors
  >> Seq.count_by id
  >> Seq.map snd
  >> Seq.map ((+) 1)
  >> Seq.reduce (*)

Wow! That is compositional extravaganza! The function composition operator >> chains together two functions. How does it work? The output of prime_factors is grouped by the values themselves (the function id simply returns its argument). This gives a sequence of tuples with the first value the prime factor and the second value its power. For the number of dividers it is the second value we need, so we compose the mapping the snd (second) function. Finally of the powers has te be increased by 1 before multiplying them all together.
So where are the parameters for this function? Well, they are implicitly inferred. This may seem very confusing at first, but the more I work with composition, the more natural and elegant it becomes. Not only are there no side effects is this construct, the steps in the computation are completely disjunct. Divide et impera!

Friday, September 12, 2008

Are C# foreach iteration variables immutable?

Are C# foreach iteration variables immutable?
Well, they are supposed to be and sure enough this code won't compile:
var words = new List { "monkey", "nut", "mice" };
foreach (string word in words)
{
  word = "sugar";
}

The C# compiler reports: Cannot assign to 'word' because it is a 'foreach iteration variable'.
But now consider the following code:
var words = new List { "monkey", "nut", "mice" };
List printers = new List();
foreach (var word in words)
{
  printers.Add(
    new Action(() => Console.WriteLine(word)));
}
printers.ForEach(printer => printer());

To my surprise this generates:
mice
mice
mice

This may be conform the language specifications, but to me it is highly unlogical, as it violates the illusion that word is immutable with scope local to a single iteration.
Obviously there are always work-arounds to get 3 different words, like this one (to rub it in):
var words = new List { "monkey", "nut", "mice" };
List printers = new List();
foreach (var word in words)
{
  string local=word;
  printers.Add(
    new Action(() => Console.WriteLine(local)));
}
printers.ForEach(printer => printer());

Or this one (more usable):
var words = new List { "monkey", "nut", "mice" };
var printers = new List(
  words.Select(word =>
    new Action(() => Console.WriteLine(word))));
printers.ForEach(printer => printer());

I have been checking out F# lately. It's really neat, I'll have more to say about it subsequent posts. But of course I wanted to be sure the situation in F# is more logical:
let printers = new ResizeArray<(unit->unit)>()
for word in [ "monkey"; "nut"; "mice" ] do
  (fun ()->printfn "%s" word) |> printers.Add
Seq.iter (fun action->action()) printers

No fooling around it gets us to the expected:
monkey
nut
mice

Maybe there is some thruth after all in the claim that functional programming makes your code more reliable :-)