LMX Parser
The LMX parser is a reverse XML parser: it reads a document from its end toward its beginning. It is written in Objective-C and used from Objective-C code. The project was started by Peter Hosey as a contribution to Adium's new log format, which is XML-based.
Adium has a feature called "message history". With this enabled, when you open a chat with a contact you've chatted with before, Adium fetches the last few messages exchanged between you and the contact and re-displays them in the new chat window, setting up context for the new chat.
In all pre-1.0 releases of Adium, the message history was stored separately. With the new log format, that has stopped, and the history is fetched from the logs instead. But with a conventional (forwards) XML parser, the entire file must be parsed to get those last few messages, making the required amount of time proportional to the length of the log. With LMX, Adium can stop parsing as soon as it has the requisite n messages, so the time taken depends on those last n messages rather than on how long the log is.
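The saving can be sketched in plain C, which also compiles as Objective-C. This is not LMX's implementation, only an illustration of why a backward scan can stop early. The function name tail_offset, the examined counter, and the "<message" tag string are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scan `log` backwards for the start of the n-th-from-last occurrence of
 * `open_tag` (hypothetically "<message"). Returns its byte offset, or `len`
 * if the log holds fewer than n occurrences. `*examined` receives the number
 * of positions tested, which depends only on the tail of the log. */
static size_t tail_offset(const char *log, size_t len,
                          const char *open_tag, size_t n, size_t *examined)
{
    size_t tag_len = strlen(open_tag);
    size_t found = 0;

    assert(tag_len > 0 && n > 0);
    *examined = 0;
    if (len < tag_len)
        return len;

    /* pos runs from the last position where the tag could fit down to 0. */
    for (size_t pos = len - tag_len + 1; pos-- > 0; ) {
        ++*examined;
        if (memcmp(log + pos, open_tag, tag_len) == 0 && ++found == n)
            return pos;   /* stop early: nothing before pos is ever read */
    }
    return len;           /* fewer than n messages in the log */
}
```

For a log that ends in two messages followed by the closing root tag, asking for n = 2 returns the offset of the second-to-last message, and the examined count is the same whether three or three hundred messages come before it.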
Official website
LMX's official webpage is on Peter Hosey's website. It includes the full version history and a tarball for every version.
Source code
The code is under a BSD license, and is in a Subversion repository hosted by Network Redux (thanks to Evan Schoenberg for setting this up).
To check out the current LMX sources, type:
svn co svn://svn.adiumx.com/liblmx/branches/LMX-1.0
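This creates a folder named "LMX-1.0" in the current directory (Subversion names the working copy after the last component of the URL unless you give a different path), containing all the sources and Xcode projects for the LMX library and test application. You will need to compile these yourself into a usable form.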
Test application
The LMX source code comes with a test application, LMXTest. This application opens any .xml file and displays a log of the events of parsing it. You will note that elements end before they are started; this follows from the reverse nature of LMX.
The test application also serves as sample code. It isn't long; you should be able to figure out what's going on from the contents of LMXTestDocument.m.
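To show the kind of event log to expect, here is a toy reverse tokenizer in plain C (also valid Objective-C). It is not LMX and does not use LMX's API; the names emit and reverse_events and the "end:", "text:", "start:" labels are invented for this sketch. It assumes well-formed input containing only plain <name> and </name> tags and character data, with no attributes, comments, entities, or self-closing tags.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append one event such as "end:a;" to the NUL-terminated buffer `out`. */
static void emit(char *out, size_t cap, const char *kind,
                 const char *s, size_t n)
{
    size_t used = strlen(out);
    snprintf(out + used, cap - used, "%s:%.*s;", kind, (int)n, s);
}

/* Walk `xml` from its last byte to its first and record events in the
 * order a reverse parser meets them. */
static void reverse_events(const char *xml, char *out, size_t cap)
{
    size_t end = strlen(xml);            /* one past the unread region */

    assert(cap > 0);
    out[0] = '\0';
    while (end > 0) {
        if (xml[end - 1] == '>') {
            /* A tag ends here: back up to its opening '<'. */
            size_t lt = end - 1;
            while (lt > 0 && xml[lt] != '<')
                lt--;
            if (xml[lt + 1] == '/')      /* </name>: met first in reverse */
                emit(out, cap, "end", xml + lt + 2, end - lt - 3);
            else                         /* <name>: met last in reverse */
                emit(out, cap, "start", xml + lt + 1, end - lt - 2);
            end = lt;
        } else {
            /* Character data: back up to just after the previous '>'. */
            size_t begin = end;
            while (begin > 0 && xml[begin - 1] != '>')
                begin--;
            emit(out, cap, "text", xml + begin, end - begin);
            end = begin;
        }
    }
}
```

Given the input <a><b>hi</b></a>, the buffer ends up holding end:a;end:b;text:hi;start:b;start:a; so each element is reported as ended before it is reported as started.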
Programming LMX
The LMX library contains one class, LMXParser, that works similarly to Foundation's NSXMLParser. You create an instance and set a delegate. The delegate responds to certain methods, which are called by LMX to inform the delegate of the progress of the parse. All of the methods are listed in a category in LMXParser.h; you don't need to implement all of them yourself.
There are two major differences from NSXMLParser:
- Rather than passing the whole data to the parser at init time, you pass it a chunk at a time (though the chunk can be the whole data if you want). Accordingly, -parseChunk: returns an enumeration constant that indicates not only whether the parse was successful, but also whether it was complete.
- Because of the reverse nature of the parser, elements end before they begin (hence the strange ordering of the delegate methods in the header file).
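The chunk-at-a-time pattern with a three-way result can be sketched in plain C (also valid Objective-C). This is a toy that only tracks element nesting; it is not LMXParser, and the names toy_result, toy_parser, toy_parse_chunk, and feed_backwards are hypothetical. LMX's actual enumeration constants are declared in LMXParser.h and are not reproduced here. The sketch assumes input without self-closing tags, comments, or an XML declaration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_INCOMPLETE, TOY_COMPLETE, TOY_ERROR } toy_result;

typedef struct {
    int  depth;   /* end tags seen whose start tags are still pending */
    char prev;    /* character processed last, i.e. the next one in the file */
} toy_parser;

/* Consume one chunk from its last byte to its first. State is kept in `p`,
 * so a tag may be split across chunk boundaries. */
static toy_result toy_parse_chunk(toy_parser *p, const char *chunk, size_t len)
{
    for (size_t i = len; i-- > 0; ) {
        char c = chunk[i];
        if (c == '<') {
            if (p->prev == '/') {
                p->depth++;               /* "</": an element ends */
            } else {
                if (p->depth == 0)
                    return TOY_ERROR;     /* start tag with no end tag */
                if (--p->depth == 0)
                    return TOY_COMPLETE;  /* root start tag reached */
            }
        }
        p->prev = c;
    }
    return TOY_INCOMPLETE;                /* ran out of data: feed more */
}

/* Feed `data` to the toy parser in chunks, last chunk first, and stop as
 * soon as the result is no longer "incomplete". */
static toy_result feed_backwards(const char *data, size_t len,
                                 size_t chunk_size, size_t *chunks_fed)
{
    toy_parser p = { 0, '\0' };
    toy_result r = TOY_INCOMPLETE;
    size_t end = len;

    assert(chunk_size > 0);
    *chunks_fed = 0;
    while (end > 0 && r == TOY_INCOMPLETE) {
        size_t n = end < chunk_size ? end : chunk_size;
        end -= n;
        r = toy_parse_chunk(&p, data + end, n);
        ++*chunks_fed;
    }
    return r;
}
```

The caller keeps supplying earlier chunks while the result is "incomplete" and stops on "complete" or "error", which is the role the return value of -parseChunk: plays in the description above.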