A deeper look at for

Introduction

The for function is at the same time the most important and the most complex control-flow function in PicoLisp. There is some empirical evidence for the first claim, at least an analysis of more than 600 Rosetta Code Solutions yielded the following numbers:

| function | count |
|----------+-------|
| for      |   458 |
| do       |   145 |
| mapcar   |   143 |
| while    |   106 |
| mapc     |    80 |
| loop     |    69 |
| until    |    36 |
| map      |    20 |
| mapcan   |    10 |
| mapcon   |     2 |



While the selection of control-flow functions in this table is a bit subjective (the original Rosetta Code Analysis gives numbers for some 370 global symbols in PicoLisp), and the Rosetta Code Solutions might not be representative for PicoLisp programs in other domains, the overall picture is pretty clear: for is used ca. three times as often as the next popular functions do and mapcar, and more than four times as often as while.

The second claim is of a more subjective nature - I find it often quite difficult to understand the for-loops in PicoLisp code, and this article is an attempt to learn more about the for function while letting others benefit from the effort. However, the following original documentation of for should somehow support the claim that it is a complex function.

The Official (rather terse) Documentation


(for sym 'num ['any | (NIL 'any . prg) | (T 'any . prg) ..]) -> any
(for sym|(sym2 . sym) 'lst ['any | (NIL 'any . prg) | (T 'any . prg) ..]) ->
    any
(for (sym|(sym2 . sym) 'any1 'any2 [. prg]) ['any | (NIL 'any . prg) | (T 'any
    . prg) ..]) -> any

    Conditional loop with local variable(s) and multiple conditional
    exits:

    In the first form, the value of sym is saved, sym is bound to 1,
    and the body is executed with increasing values up to (and
    including) num.

    In the second form, the value of sym is saved, sym is subsequently
    bound to the elements of lst, and the body is executed each time.

    In the third form, the value of sym is saved, and sym is bound to
    any1. If sym2 is given, it is treated as a counter variable, first
    bound to 1 and then incremented for each execution of the body.
    While the condition any2 evaluates to non-NIL, the body is
    repeatedly executed and, if prg is given, sym is re-bound to the
    result of its evaluation.

    If a clause has NIL or T as its CAR, the clause's second element
    is evaluated as a condition and - if the result is NIL or non-NIL,
    respectively - the prg is executed and the result returned. If the
    body is never executed, NIL is returned.

    See also do and loop.

    # First form:
    : (for N 5 (printsp N))
    1 2 3 4 5 -> 5
    : (for N 5 (printsp N) (NIL (< N 3) (printsp 'enough)))
    1 2 3 enough -> enough
    : (for N 5 (T (> N 3) (printsp 'enough)) (printsp N))
    1 2 3 enough -> enough

    # Second form:
    : (for X (1 a 2 b) (printsp X))
    1 a 2 b -> b
    : (for (I . X) '(a b c) (println I X))
    1 a
    2 b
    3 c
    -> c

    # Third form:
    : (for (L (1 2 3 4 5) L) (printsp (pop 'L)))
    1 2 3 4 5 -> 5
    : (for (N 1 (>= 5 N) (inc N)) (printsp N))
    1 2 3 4 5 -> 5
    : (for ((I . L) '(a b c d e f) L (cddr L)) (println I L))
    1 (a b c d e f)
    2 (c d e f)
    3 (e f)
    -> (e f)


The First Form

(for sym 'num ['any | (NIL 'any . prg) | (T 'any . prg) ..]) -> any

    In the first form, the value of sym is saved, sym is bound to 1,
    and the body is executed with increasing values up to (and
    including) num.


Lets walk through the examples given in the documentation.

Example 1.1

    : (for N 5 (printsp N))
    1 2 3 4 5 -> 5


First, as a reminder, the documentation of printsp:

(printsp 'any ..) -> any
    Prints all any arguments to the current output
    channel, followed by a space. If there is more
    than one argument, a space is printed between
    successive arguments.


In this example, N is the symbol sym whose value is saved first and which is then bound to 1. The number 5 is the upper limit num. (printsp N) is the body of the function, and it is a clause that neither has NIL or T as its CAR.

So the body is executed five times while the counter N is incremented from 1 to 5. Simple enough.

Example 1.2

    : (for N 5 (printsp N) (NIL (< N 3) (printsp 'enough)))
    1 2 3 enough -> enough


The second example is identical to the first one except for an additional clause in the body of for:

(printsp N) (NIL (< N 3) (printsp 'enough))

Lets recall the original documentation to understand this:

    If a clause has NIL or T as its CAR, the clause's second element is
    evaluated as a condition and - if the result is NIL or non-NIL,
    respectively - the prg is executed and the result returned. If the body is
    never executed, NIL is returned.


The second clause has NIL as its CAR, therefore its second element, (< N 3), is evaluated as a condition. Here, in the first two iterations, N has the value 1 and then 2, thus the condition returns T. In the third iteration, the condition (< 3 3) returns NIL and the associated prg ((printsp 'enough)) is executed and the result enough returned.

Example 1.3

The third example yields the same result as the second, but in a different way:

    : (for N 5 (T (> N 3) (printsp 'enough)) (printsp N))
    1 2 3 enough -> enough


Here, the first clause has T in its CAR, but the condition is (> N 3) this time, thus in the first three iterations NIL is returned and the second clause (printsp N) evaluated. When N is incremented to 4, the condition becomes true, the associated prg is executed and its result (enough) returned.

The Second Form

(for sym|(sym2 . sym) 'lst ['any | (NIL 'any . prg) | (T 'any . prg) ..]) ->
    any

    In the second form, the value of sym is saved, sym is subsequently bound to
    the elements of lst, and the body is executed each time.


While the part inside the square brackets ['any ...] is the same as in the first form, we can see two differences in the definition:



The PicoLisp Reference gives the following explanation for cons pairs like (sym2 . sym):

Lists are used in PicoLisp to emulate composite data
structures like arrays, trees, stacks or queues.

In contrast to lists, numbers and symbols are
collectively called "Atoms".

Typically, the CDR of each cell in a list points to
the following cell, except for the last cell which
points to NIL. If, however, the CDR of the last cell
points to an atom, that cell is called a "dotted
pair" (because of its I/O syntax with a dot '.'
between the two values).


Lets walk again through the examples given in the documentation. Of course, the special clauses with T or NIL in the CAR can be used like in the first form, although no further examples are given for this.

Example 2.1

   : (for X (1 a 2 b) (printsp X))
    1 a 2 b -> b


Here, X is sym, (1 a 2 b) is 'lst and (printsp X) is 'any. Hence, there are 4 iterations, since the list has 4 elements, and the symbol X is bound to one list element after another. In each iteration, the program (printsp X) is executed with the actual value of X.

Example 2.2

    : (for (I . X) '(a b c) (println I X))
    1 a
    2 b
    3 c
    -> c


In this example, a dotted pair (sym2 . sym) is used instead of a single symbol sym. The documentation explains its use:

If sym2 is given, it is treated as a counter variable, first bound to
1 and then incremented for each execution of the body.


The output of the example shows that symbol X is bound subsequently to a, b and c, while the counter variable I is incremented in each iteration.

Third Form

(for (sym|(sym2 . sym) 'any1 'any2 [. prg]) ['any | (NIL 'any .  prg) | (T 'any . prg) ..]) -> any

    In the third form, the value of sym is saved, and sym is bound to
    any1. If sym2 is given, it is treated as a counter variable, first
    bound to 1 and then incremented for each execution of the body.
    While the condition any2 evaluates to non-NIL, the body is
    repeatedly executed and, if prg is given, sym is re-bound to the
    result of its evaluation.


The part in the square brackets ['any ...] is the same as in the two previously discussed forms. And, like in form 2, either a single symbol sym or a cons pair (sym2 . sym) can be used. However, there are a few differences that make this the most general, versatil and complex of the three forms:



Here, sym is bound to 'any1, and, if sym2 is given too, it is treated as a counter variable just like in form 2. 'any2 is a condition, and the body of the for-function is repeated as often as this condition evaluates to non-NIL. If [. prg] is given, sym is rebound to the result of its evaluation after each iteration.

Lets have a look at the examples:

Example 3.1

    : (for (L (1 2 3 4 5) L) (printsp (pop 'L)))
    1 2 3 4 5 -> 5


First, as a reminder, the documentation of pop:

(pop 'var) -> any
    Pops the first element (CAR) from the stack in var.


In this example, L is sym, the list (1 2 3 4 5) is 'any1, and L is again used as condition 'any2. The body of the function is (printsp (pop 'L)).

Thus the symbol L is bound to the whole list (1 2 3 4 5) in this case, and not, as in form 2, subsequently to its elements. The number of iterations in form 3 is determined by the condition 'any2, in this case the symbol L again. Evaluating L returns the list 'any1, i.e. a non-NIL value. Without changing the value of L, this for-function would loop forever. But in the body of the function, the CAR of the list in L is popped in each iteration, resulting in a list with one element less. After five iterations, the list is empty, and since in PicoLisp (like in other Lisps) the empty list () means NIL, evaluating the condition 'any2 (the list L) in iteration 6 returns NIL and the for-loop is terminated.

Example 3.2

    : (for (N 1 (>= 5 N) (inc N)) (printsp N))
    1 2 3 4 5 -> 5


Lets start with the documentation of inc:

(inc 'num) -> num
    [inc] returns the value of num incremented by 1.


In this example, the for keyword is followed by two clauses, with the second clause (printsp N) representing the body of the function.

The interesting things happen in the first clause here: N is sym, which is bound to 'any1, i.e. the number 1 in this case. Then follows a condition 'any2, (>= 5 N), and a [.prg] associated with this condition, (inc N).

Thus, in the first iteration, N is bound to 1, it is checked that 5 >= 1, the function body is executed, i.e. 1 is printed followed by a space, and finally [.prg] in the first clause is evaluated and the symbol N bound to the result of this evaluation (2).

This is repeated five times, until the condition (>= 5 6) returns NIL in the sixth iteration.

Example 3.3

    : (for ((I . L) '(a b c d e f) L (cddr L)) (println I L))
    1 (a b c d e f)
    2 (c d e f)
    3 (e f)
    -> (e f)


Lets start again with a bit of documentation for cddr, println should be obvious:

(c[ad]*ar 'var) -> any
(c[ad]*dr 'lst) -> any
    List access shortcuts. Combinations of the car
    and cdr functions, with up to four letters 'a'
    and 'd'.

    : (cdr '(a b c d))
    -> (b c d)
    : (cdr '(b c d))
    -> (c d)
    : (cddr '(a b c d))
    -> (c d)


Here, like in Example 2.2, (sym2 . sym) instead of just sym is used, i.e. the L in the cons pair (I . L) is used as symbol and the I as counter variable starting with 1 and then incremented in each iteration. The function body is the same as in Example 2.2, except that the printed results are separated by newlines rather than spaces.

Again, we find 'any1, 'any2 and [.prg] in the first clause after the for keyword. This time, like in Example 3.1, 'any1 is a list instead of a number, and symbol L is bound to the whole list. The condition 'any2 checked in each iteration is simply L - is L the empty list returning NIL on evaluation, or does it still contain elements? At the end of each iteration, [.prg] is evaluated and symbol L bound to the result. In this case, (cddr L) returns the list L without its first two elements. Since the original list has 6 elements, L is the empty list after 3 iterations, and evaluating the condition 'any2 (in this case simply the list L) in the fourth iteration returns NIL what leads to the termination of the for loop.

https://picolisp.com/wiki/?deeperlookatfor

24nov12    tj