From Stateful Parsing to Transactional Parsing

(This page is a mirrored copy of an article originally posted on the LShift blog; see the archive index here.)

Thinking further on the problem of stateful parsing (from yesterday’s article), the way we’ve done it in the past (essentially by using Scheme’s parameter mechanism) is a special case of a transactional system, with rollback on error and commit after each toplevel expression parsed.

This suggests that if Scheme had an implementation of a Software Transactional Memory, then it would be a good fit for a stateful packrat parser. The symbol-table (or whatever state needs to be speculatively propagated along a line of parsing) would simply be placed under transactional control. Each time an attempt is made to match against a particular nonterminal a nested transaction would be entered. If the nonterminal matched the input successfully, the nested transaction would be committed; otherwise it would be rolled back, and any alternative options would be tried in their own nested transactions.