Pour Some (Syntactic) Sugar On Me

In my previous post, I went over the first of two code-transformers intended to replace S-expressions with something a little easier on the eyes. In this post I’ll be going over the second.

All of the code for this is up at http://github.com/mavant/pythonic, but for the sake of reading inline, here’s the current source code as a gist:

In brief:

  • Lines 1-4 comprise the macro that converts from quoted S-expressions to the whitespace-based syntax.
  • Lines 7, 8, and 10 are helper functions for the reverse transformation, back from return-indent notation to parenthetical notation.
  • Line 12 is the function that converts from a string in indent form to a string in parenthetical form.
  • Line 14 converts from a string in indent form to an S-expression that the evaluator can understand, but it just does this by piggybacking on the builtin reader.
  • Lines 16+ are examples demonstrating the transformations back and forth.

Lines 1-4 and 12 are due a little explanation, especially since I glossed over explaining 1-4 in the last post.

Lines 1-4 define a macro, so its arguments are not evaluated before being passed to it. Instead, it operates on the nested list structure of the expressions it is given. It is also a multiple-arity macro, meaning it can be called with either one or two arguments. The one-argument form (called with just a list of expressions) is a wrapper: it calls the two-argument form and makes a string out of the result.

The two-argument form is somewhat more involved. For every sublist of the list it is given, it prepends a newline and some number of tab indents, then recursively calls itself on all the items of that sublist, with the number of tabs incremented by one. For items that are not lists, it just returns the item unaltered. The result is that every subexpression is represented as a deeper level of indentation. This is not tail-recursive, and cannot be, so hopefully nothing is nested deeply enough to cause a stack overflow.
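
To make the recursion concrete, here is a rough sketch of the same idea written as a plain function over quoted forms rather than as the macro in the gist. The name sexp->indented and the spacing rule for atoms are assumptions made for illustration; the real code may well differ.

```clojure
(require '[clojure.string :as str])

;; Hypothetical function version of the S-expression -> indent transform.
(defn sexp->indented
  ;; One-argument form: a wrapper that starts at depth 0 and trims the
  ;; leading newline produced by the outermost list.
  ([form]
   (str/triml (sexp->indented form 0)))
  ;; Two-argument form: does the actual work.
  ([form depth]
   (if (seq? form)
     (let [tabs  (apply str (repeat depth "\t"))
           items (map #(sexp->indented % (inc depth)) form)]
       ;; Every sublist starts on a new line at the current depth.
       ;; Nested sublists already begin with "\n", so only atoms need
       ;; a separating space.
       (str "\n" tabs
            (first items)
            (apply str (map #(if (str/starts-with? % "\n") % (str " " %))
                            (rest items)))))
     ;; Anything that is not a list is returned as its printed form.
     (pr-str form))))

(sexp->indented '(defn f [x] (+ x 1)))
;; => "defn f [x]\n\t+ x 1"
```

One caveat of this sketch: an atom that follows a nested list ends up on that list's line, so a form like (if (> x 0) x y) would not survive a round trip.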

Line 12, by contrast, is not a macro, so its arguments ARE evaluated before being passed. Since this function is just parsing strings, that’s fine. It is single-arity, but it dispatches on the type of its argument. If it’s given a string, it splits the string at every newline character, then passes the resulting list back to itself and prepends a do-block (that part could be done with loop-recur, but since it’s only ever one call I didn’t bother).

If it’s called with a list as an argument, it first splits the list up by indentation level, then pairs each unindented line with the indented lines that follow it. For each of these pairs, it creates a vector of the unindented line followed by a subvector of all of the indented lines with the first tab removed from each, and recursively calls itself on that subvector. This, again, is not tail-recursive, so each level of nesting adds stack frames. The result is a nested list structure (because map returns a sequence, which prints as a list rather than a vector, even when passed vectors as arguments) in which each bottom-level element is a string. Calling print-str on this structure yields a string of syntactically correct S-expressions.
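
The reverse direction can be sketched along the same lines. This version splits the work into small helpers instead of dispatching on argument type inside one function, so it is only an approximation of the approach described above; indent->parens and the helper names are hypothetical, not the names used in the gist.

```clojure
(require '[clojure.string :as str])

(defn- indented? [line]
  (str/starts-with? line "\t"))

;; Pair each unindented line with the run of indented lines after it.
(defn- group-blocks [lines]
  (loop [lines lines, blocks []]
    (if (empty? lines)
      blocks
      (let [[head & more] lines
            [children remaining] (split-with indented? more)]
        (recur remaining (conj blocks [head children]))))))

;; Turn each block into (head child-block ...), stripping one leading
;; tab from the children before recursing. Not tail-recursive.
(defn- lines->tree [lines]
  (map (fn [[head children]]
         (cons head (lines->tree (map #(subs % 1) children))))
       (group-blocks lines)))

;; Hypothetical stand-in for the indent -> parentheses converter.
(defn indent->parens [text]
  (let [lines (remove str/blank? (str/split-lines text))]
    ;; print-str prints the strings without quotes and the nested
    ;; sequences with parentheses, which yields S-expression text.
    (print-str (cons "do" (lines->tree lines)))))

(indent->parens "defn f [x]\n\t+ x 1\nf 2")
;; => "(do (defn f [x] (+ x 1)) (f 2))"
```

The trick worth noticing is that nothing here parses tokens: each line stays an opaque string, and the parentheses come entirely from how print-str renders the nesting.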

Line 14 just calls the built-in read-string function on the output of whitespace->parentheses. The reader takes care of the rest of the work for us.
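
One detail of read-string is worth a quick illustration: it reads a single form from the string and ignores whatever follows. That is presumably why the do-block wrapper matters, though that motivation is my reading rather than something the code states. The strings below are made-up stand-ins for converter output.

```clojure
;; Two top-level forms, with and without a wrapping do-block.
(def unwrapped "(def x 41) (inc x)")
(def wrapped   "(do (def x 41) (inc x))")

;; Without the wrapper, only the first form is read.
(read-string unwrapped)
;; => (def x 41)

;; With the wrapper, the whole program is one form, so it can be
;; read and evaluated in a single step.
(eval (read-string wrapped))
;; => 42
```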

And that’s really all there is to it. We have a whitespace-based syntax that represents all the same things S-expressions do. Of course, we are somewhat hampered by the fact that we can’t write in this syntax directly, but it would be pretty easy to wrap read and read-string to run these transformers before evaluating…

Maybe that’ll be part three.