Briefly, Laziness

Mon Apr 29, 2013

I've gotten some questions about what, exactly, laziness is good for, so I want to touch on it briefly.

Short answer: it saves you space and time.

Space

Really, this should be obvious, but if we're going through this, let's do it properly. Here are some lists:

(defparameter foo (list 1 2 3 4 5 6 7 8 9))
foo = [1, 2, 3, 4, 5, 6, 7, 8, 9] ## This one's from Python
foo = [1, 2, 3, 4, 5, 6, 7, 8, 9] -- This one's from Haskell
foo = [1, 2, 3, 4, 5, 6, 7, 8, 9] // This one's from Javascript

and here are some lazy lists:

(defun lazy-numbers (&optional (starting-with 1))
  (cons starting-with (lambda () (lazy-numbers (+ starting-with 1)))))

(defparameter foo (lazy-numbers))
def lazyNumbers(startingWith=1):
    i = startingWith
    while True:
        yield i
        i += 1

foo = lazyNumbers()
foo = [1..]
function lazyNumbers (startingWith) {
    if (startingWith === undefined) startingWith = 1;
    return { num: startingWith,
             next: function () {
                 this.num += 1;
                 return this.num
             }
           }
}

foo = lazyNumbers();

Granted, the second set looks more complicated (except for the Haskell line), but it has some advantages. Firstly, while the first bunch of lists is bounded, this second bunch is infinite, and you do sometimes want to express that. More to the point though, the reason these can be infinite is that they're lazy. They only compute as much of the remainder of the sequence as you actually ask for, which means they save you space in two ways: you never pay for the elements you don't ask for, and the elements you've already consumed can be thrown away rather than kept around in a fully materialized list.
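For instance, here's a quick sketch of pulling a finite prefix out of the Common Lisp version (take-lazy is just an illustrative helper, not part of the listings above):

(defun take-lazy (n lazy)
  ;; Force and collect the first N elements of a cons-and-thunk lazy
  ;; list; the rest of the (possibly infinite) sequence never gets computed.
  (loop repeat n
     for cell = lazy then (funcall (cdr cell))
     while cell
     collect (car cell)))

(take-lazy 5 (lazy-numbers)) ;; => (1 2 3 4 5)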

That's basic stuff; it should be pretty obvious. Less obvious, but more significant, is how this saves you time.

Time

By its lonesome, it actually doesn't.

(loop with lst = (list 1 2 3 4 5 6 7 8 9)
   repeat 5 for n in lst
   do (format t "~a~%" n))

takes effectively the same amount of run time as

(loop repeat 5
     for lst = (lazy-numbers 1) then (funcall (cdr lst))
     for n = (car lst)
     do (format t "~a~%" n))

But what about when we compose multiple operations on one sequence? You'll recall that back when I sat down to optimize Life for a couple of days, I made a big deal of reducing the number of iterations over our corpus. Specifically, recall that the Haskell version of Gridless Life started out like this:

import Data.List

neighbors :: (Int, Int) -> [(Int, Int)]
neighbors (x, y) = [(x+dx, y+dy) | dx <- [-1..1], dy <- [-1..1], (dx,dy) /= (0,0)]

lifeStep :: [(Int, Int)] -> [(Int, Int)]
lifeStep cells = [head g | g <- grouped cells, viable g]
  where grouped = group . sort . concat . map neighbors
        viable [_,_,_] = True
        viable [c,_] = c `elem` cells
        viable _ = False

and zoom in on one line in particular.

...
  where grouped = group . sort . concat . map neighbors
        ...

That right there is what laziness trivially enables. If you think about what's being done here, you'll realize that you'd never do this in an eager context. Because if you did, what you'd get is

  1. the argument would get evaluated
  2. it would get traversed once by map neighbors
  3. and then it would get traversed again by concat
  4. and then it would get traversed again by sort|1|
  5. and then it would get traversed again by group

which, counting the final pass where the list comprehension consumes the grouped result, means six total trips over the entire list. That's bad because every element you touch adds to your run time, and each element you have to touch again later in the computation is one that you can't throw out of memory. On the other hand, expressing a computation by composing smaller, easy computations is very straightforward|2|. This is a place where the code you want to run and the code you want to write are completely different.

What you want to run is a tight loop that walks over the entire corpus once and applies all of the chained functions to each element as it goes. What you want to write is the naive composition of those chained functions, because otherwise you've created an algorithm that will only ever be useful for your particular computation, and that computation will be burdened with the specifics of iteration, which will make it non-trivial at best and impenetrable at worst.
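To make that tension concrete with a much smaller, made-up example than Life (doubled-evens is an illustrative name, nothing more), the composed version and the hand-fused version of the same little pipeline look like

;; The code you want to write: each piece is reusable, but an eager
;; evaluator builds and walks an intermediate list at every stage.
(defun doubled-evens (numbers)
  (mapcar (lambda (n) (* 2 n))
          (remove-if-not #'evenp numbers)))

;; The code you want to run: one pass and no intermediate lists, but the
;; iteration details are now welded to this particular computation.
(defun doubled-evens-fused (numbers)
  (loop for n in numbers
     when (evenp n) collect (* 2 n)))

Both return the same thing for the same input; they just differ in how many times they walk it, and in how reusable the pieces are.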

Now, granted, Common Lisp has other ways of dealing with this|3|, but laziness is one general solution. If each of those functions were lazy (as, in fact, they are in Haskell), what you'd get instead is exactly what you want: one tight loop running over the entire corpus, applying only as much grouping, sorting, concat-ing and neighbors-ing as it needs to for the next part of the computation. That saves you a few trips over the input with no additional effort on your part. Talk about low-hanging fruit.
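As a rough sketch of what that looks like with the cons-and-thunk cells from earlier (lazy-map is another illustrative helper, in the same spirit as take-lazy above):

(defun lazy-map (fn lazy)
  ;; Map FN over a cons-and-thunk lazy list without forcing anything
  ;; past the current cell; the tail is only computed on demand.
  (when lazy
    (cons (funcall fn (car lazy))
          (lambda () (lazy-map fn (funcall (cdr lazy)))))))

;; Two mapped stages composed over an infinite list; only five cells
;; ever get computed, and the input is walked just once.
(take-lazy 5 (lazy-map #'1+ (lazy-map (lambda (n) (* n n)) (lazy-numbers))))
;; => (2 5 10 17 26)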

I'll be honest, I was also going to talk about my latest thoughts on the square dissection problem, but this ended up being longer than I'd like as it is. It'll probably happen next time, I guess. Probably. Stay tuned if you're into that sort of thing.


Footnotes

1 - |back| - A note on functional, lazy sorts, because I was wondering about this both back when tuning the Haskell version of Life and as I was re-visiting it for today's feature. The way that lazy sorts seem to work is basically by using a destructured heapsort. Specifically, if you take a look at this pseudocode, what's happening is that a lazy sort runs heapify right away and passes up the first element, then pulls out the next element each time it's asked for one. That results in O(n) work for finding the first element (which, as far as I'm aware, is what you'd have to do in order to get the "nextest" element in the general case anyway), followed by O(log n) work for each next element. That's good because it means you don't have to do it all at once, and it means that if you only want the first 5 sorted elements of a list of 10000, you get to avoid most of the work. On the other hand, note that this doesn't save you much memory, if any; you still need to store that heap for the whole list, even if you only want the first few chunklets.
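Here's what that shape looks like as a rough Common Lisp sketch, using the take-lazy helper from the body of the post (sift-down, heapify and lazy-sort are names I'm making up here, and this version only handles numbers):

(defun sift-down (heap start end)
  ;; Restore the min-heap property for the subtree rooted at START,
  ;; looking only at indices below END.
  (loop with root = start
     for child = (+ (* 2 root) 1)
     while (< child end)
     do (when (and (< (1+ child) end)
                   (< (aref heap (1+ child)) (aref heap child)))
          (incf child))
       (if (< (aref heap child) (aref heap root))
           (progn (rotatef (aref heap root) (aref heap child))
                  (setf root child))
           (return))))

(defun heapify (heap)
  ;; Bottom-up heap construction; O(n) for the whole vector.
  (loop for i from (floor (length heap) 2) downto 0
     do (sift-down heap i (length heap)))
  heap)

(defun lazy-sort (numbers)
  ;; A lazy list (same cons-and-thunk shape as lazy-numbers) of NUMBERS
  ;; in ascending order: heapify runs up front, then each element you
  ;; actually ask for costs one O(log n) sift.
  (let ((heap (heapify (coerce numbers 'simple-vector)))
        (end (length numbers)))
    (labels ((pull ()
               (when (plusp end)
                 (let ((smallest (aref heap 0)))
                   (decf end)
                   (setf (aref heap 0) (aref heap end))
                   (sift-down heap 0 end)
                   (cons smallest (lambda () (pull)))))))
      (pull))))

(take-lazy 3 (lazy-sort (list 9 1 8 2 7 3))) ;; => (1 2 3)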

2 - |back| - Which is why I did this for the first pass of that program, not realizing that Haskell's lazy-by-default outlook would also make it about as efficient as it could be.

3 - |back| - In fact, the situation of "I want to write the pretty code, but run the ugly, optimized code" should send any Lisp veterans off thinking about how you'd defmacro your way out of this particular annoyance.
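A toy version of that escape hatch, sketched with a made-up name (filter-map-fused), is a macro that reads like the composed pipeline but expands into a single pass:

(defmacro filter-map-fused (pred fn list)
  ;; Reads roughly like (mapcar FN (remove-if-not PRED LIST)), but
  ;; expands into one loop with no intermediate list.
  (let ((x (gensym "X")))
    `(loop for ,x in ,list
        when (funcall ,pred ,x)
        collect (funcall ,fn ,x))))

(filter-map-fused #'evenp (lambda (n) (* n n)) (list 1 2 3 4 5 6))
;; => (4 16 36)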

