Collectors in Scheme

Collector functions are a functional way of processing and accumulating collections of data. I recently encountered this pattern while going through the book The Little Schemer. In the discussion of lambdas, there’s a function named multirember&co, which stands for “remove multiple members and collect”. Basically, it removes multiple elements from a list and collects the removed elements."

This function is a good, nontrivial example of collector functions. Once I worked through it, I decided writing a short blog post about it would be helpful so others could benefit.

(define multirember&co
  (lambda (a lat col)
    (cond
      ; (1) Base case. Collect empty lists.
      ((null? lat) (col (quote ()) (quote ())))
      ; (2) The head of the list is equal to `a`. Collect it in `seen`.
      ((eq? a (car lat))
       (multirember&co a
                       (cdr lat)
                       (lambda (newlat seen)
                         (col newlat (cons (car lat) seen)))))
      ; (3) The head of the list is not equal to `a`. Collect it in `newlat`.
      (else
       (multirember&co a
                       (cdr lat)
                       (lambda (newlat seen)
                         (col (cons (car lat) newlat) seen)))))))

This function has three arguments: the value a to be removed from the list, the list lat, and a collector function col. It looks at every element of the list lat to see if it is equal to a. If so, it collects them in a list (seen ). If not, it collects them in another list (newlat).

The recurrence for this function at first seems complicated. At each step, this function recurs by building a new lambda function that uses the given collector to prepend an element to one of the lists. It then passes this new lambda down to the next call of multirember&co.

Let’s walk through the expansion of (multirember&co 'b '(a b c b) list). I will name the collector lambda at each step to make it easier to see what’s happening. At each step, comments describe what’s going on with the expansion:

; Initial: Name the collector that's passed in as `col-0`
(define col-0 list)

; Step-1:
(multirember&co 'b '(a b c b) col-0)

; `b` does not match the head of the list `a`. The code builds a
; new lambda for recursion. Call it `col-1`.
(define col-1
	(lambda (newlat seen)
		(col-0 (cons 'a newlat) seen)))


; Step-2:
(multirember&co 'b '(b c b) col-1)

; `b` matches the head of the list. Name the new lambda for the
; recursion as `col-2`.
(define col-2
  (lambda (newlat seen)
	  (col-1 newlat (cons 'b seen))))


; Step-3:
(multirember&co 'b '(c b) col-2)

; `b` does not match the head of the list. Name the new lambda
; for the recursion as `col-3`.
(define col-3
  (lambda (newlat seen)
	  (col-2 (cons 'c newlat) seen)))


; Step-4:
(multirember&co 'b '(b) col-3)

; `b` matches the head of the list. Name the new lambda for the
; recursion as `col-4`:
(define col-4
  (lambda (newlat seen)
	  (col-3 newlat (cons 'b seen))))


; Step-5:
(multirember&co 'b '() col-4)
((null? lat) (col-4 '() '()))

At the last step, the stack of function calls logically looks as follows:

(col-4 '() '())
;|_____|
;   |
;   |
(define col-4
	(lambda (newlat seen)
		(col-3 newlat (cons 'b seen))))
	;   |_____|
	;      |
	;      |
		(define col-3
			(lambda (newlat seen)
				(col-2 (cons 'c newlat) seen)))
			;   |_____|
			;      |
			;      |
				(define col-2
					(lambda (newlat seen)
						(col-1 newlat (cons 'b seen))))
					;   |_____|
					;      |
					;      |
						(define col-1
							(lambda (newlat seen)
								(col-0 (cons 'a newlat) seen)))
							;   |_____|
							;      |
							;      |
								(define col-0 list)

Without the labels, the actual expansion (below) is much harder to parse and understand!

((lambda (newlat4 seen4) ; col-4
   ((lambda (newlat3 seen3) ; col-3
      ((lambda (newlat2 seen2) ; col-2
         ((lambda (newlat1 seen1); col-1

            (col-0 (cons 'a newlat1) seen1)) ; arguments to col-0
          newlat2 (cons 'b seen2))) ; arguments to col-1
       (cons 'c newlat3) seen3)) ; arguments to col-2
    newlat4 (cons 'b seen4))) ; arguments to col-3

 '() '()
 )

Executing the above block results in '((a c) (b b)), which is what we expect.

Another way of looking at the expansion above is with the computed arguments for each lambda:

(col-4 '() '())
(col-3 '() '(b))
(col-2 '(c) '(b))
(col-1 '(c) '(b b))
(col-0 '(a c) '(b b))

; `col-0` is `list`
(list '(a c) '(b b))
('(a c) '(b b))

Hopefully this expansion of multirember&co gives a clearer idea of how collector functions behave and are used in Scheme. This analysis applies generally to other functional programming languages as well.