I recently needed to navigate some XML in Haxe and noticed that there were few options for doing this quickly and easily.

I did find Oleg’s Walker class, which brings some of the E4X functionality of AS3 to Haxe.
While the resulting code was more elegant than hand-writing loops and tests, it still felt too verbose, and I decided to add some macro sugar to it to cut down the syntax (and bring it closer in line with the E4X spec).

The result is the E4X class, which reduces the amount of code by a factor of 2-3 (in comparison to a fully runtime, function-based solution). Due to Haxe language restrictions, the resulting syntax is not quite as compact as the AS3 equivalent, but it’s close.

Usage

E4X expressions must be wrapped in the macro call, and they return an iterator of values (the type of which is based on the last part of the expression).

To get all children:

var nodes:Iterator<Xml> = E4X.x(xml.child());

Here are some different ways to get a list of all the child nodes with the name “node”:

var xml:Xml;
var nodes:Iterator<Xml> = E4X.x(xml.node);

// or (for example)
var nodes:Iterator<Xml> = E4X.x(xml.child("node"));

// or (using an expression which will be wrapped in a function call)
var nodes:Iterator<Xml> = E4X.x(xml.child(nodeName=="node"));

// all of which are shortcuts for this filter expression
var nodes:Iterator<Xml> = E4X.x(xml.child(function(xml:Xml, _i:Int):Bool{return xml.nodeName=="node";}));
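For comparison, here is a rough sketch of the hand-written loop that these one-liners stand in for. It uses only the standard Xml API and is not taken from the E4X class itself; the class name ChildrenByHand and the helper childrenNamed are made up for illustration.

```haxe
class ChildrenByHand {
    // Hypothetical helper: collects the direct child elements of `parent`
    // whose node name equals `name`.
    public static function childrenNamed(parent:Xml, name:String):Array<Xml> {
        var result:Array<Xml> = [];
        // elements() only yields element children, so nodeName is safe to read.
        for (child in parent.elements()) {
            if (child.nodeName == name) result.push(child);
        }
        return result;
    }

    static function main() {
        var root = Xml.parse('<root><node id="a"/><other/><node id="b"/></root>').firstElement();
        // Traces the id of each matching child ("a", then "b").
        for (n in childrenNamed(root, "node")) trace(n.get("id"));
    }
}
```

Unlike the macro version, this returns an Array rather than an Iterator, which keeps the sketch short.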

To get the text of a node use the text() method:

var nodes:Iterator<String> = E4X.x(xml.text());

To access descendants, use the “desc()” method, or the underscore shortcut:

var nodes:Iterator<Xml> = E4X.x(xml.desc());
// or
var nodes:Iterator<Xml> = E4X.x(xml._());
// or just
var nodes:Iterator<Xml> = E4X.x(xml._);

Here are some different ways to get a list of all the descendant nodes with the name “node”:

var xml:Xml;
var nodes:Iterator<Xml> = E4X.x(xml._("node"));

// or (for example)
var nodes:Iterator<Xml> = E4X.x(xml.desc("node"));

// or (using an expression which will be wrapped in a function call)
var nodes:Iterator<Xml> = E4X.x(xml.desc(nodeName=="node"));

// all of which are shortcuts for this filter expression
var nodes:Iterator<Xml> = E4X.x(xml.desc(function(xml:Xml):Bool{return xml.nodeName=="node";}));
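To give an idea of what a descendant query saves you from writing, here is a minimal recursive walk using only the standard Xml API. It is a sketch rather than the E4X implementation: the names DescendantsByHand and descendantsNamed are hypothetical, and it assumes the starting node itself is not part of the result.

```haxe
class DescendantsByHand {
    // Hypothetical helper: collects every descendant element of `parent`
    // named `name`, in document order. `out` is the accumulator used by
    // the recursion and can be omitted by callers.
    public static function descendantsNamed(parent:Xml, name:String, ?out:Array<Xml>):Array<Xml> {
        if (out == null) out = [];
        for (child in parent.elements()) {
            if (child.nodeName == name) out.push(child);
            // Recurse so that nested matches are found as well.
            descendantsNamed(child, name, out);
        }
        return out;
    }

    static function main() {
        var src = '<root><node id="1"><node id="2"/></node><other><node id="3"/></other></root>';
        var root = Xml.parse(src).firstElement();
        for (n in descendantsNamed(root, "node")) trace(n.get("id"));
    }
}
```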

Getting a list of descendants that have an “id” attribute would be done like this (the a("id") call acts like a filter):

var nodes:Iterator<Xml> = E4X.x(xml._(a("id")));

// which could also be written as
var nodes:Iterator<Xml> = E4X.x(xml._(a(attName=="id")));

// both of which will be expanded to
var nodes:Iterator<Xml> = E4X.x(xml._(a(function(attName:String, attValue:String, xml:Xml):Bool{return attName=="id";})));

Whereas, if you wanted to get the “id” attributes themselves, you could do this:

var nodes:Iterator<Hash<String>> = E4X.x(xml._.a("id"));

To get all of the ancestors of any nodes with an “id” attribute equal to “test”, you could do this:

var nodes:Iterator<Xml> = E4X.x(xml._(a("id")=="test").ances());
// or (a little less legible, but will perform slightly better)
var nodes:Iterator<Xml> = E4X.x(xml._(a(attName=="id" && attValue=="test")).ances());

Comparison with AS3 E4X

Getting children with a specific node name (e.g. “node”):
AS3 E4X: xmlRoot.node
Haxe E4X: xmlRoot.node

Getting descendants with a specific node name (e.g. “node”):
AS3 E4X: xmlRoot..node
Haxe E4X: xmlRoot._("node")

Getting an attribute:
AS3 E4X: xmlRoot.@id
Haxe E4X: xmlRoot.a("id")

Getting all descendants with an “id” attribute:
AS3 E4X: xmlRoot..(@id.length())
Haxe E4X: xmlRoot._(a("id"))

Note that all of these examples should be wrapped in the E4X.x() call, as in the code snippets above.

Performance

I also ran some performance tests for several targets (and the equivalent tests in AS3 E4X), the results of which are below.
This helped me make some performance improvements to Oleg’s original code, and I managed to squeeze an extra 25-30% increase in performance out of it.

Surprisingly, the JS target seems to perform best overall (although this is probably more a result of Chrome’s JS engine).
Even after my improvements, the AS3 target was woefully slow in comparison to its native counterpart, although all of the other targets seemed to hold their own, with more complex expressions becoming faster than the AS3 E4X equivalent (if anyone knows why this performs so poorly, let me know).

Test                                  AS3 E4X   Hx > Flash   Hx > JS   Hx > C++   Hx > Neko
Get Children                             0.00         0.21      0.03       0.22        0.03
Get Children With Attrib                 0.08         1.02      0.24       0.10        0.31
Get Descendants                          0.92         7.20      0.52       0.57        0.94
Get Descendant Text                      1.48        19.40      1.97       1.83        3.22
Get Descendants by Name                  2.33        11.05      0.24       0.51        1.42
Get Descendants with matched Attrib.     2.70        30.10      1.20       2.23        7.51
Measurements are in seconds per 1000 calls
JS tests done in Chrome 24 Win64
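
If you want to run a similar measurement yourself, a timing loop along these lines is one way to get a “seconds per 1000 calls” figure. This is only a sketch and not the harness used for the numbers above: Bench and secondsFor are made-up names, and the query being timed is a plain standard-library call standing in for an E4X expression.

```haxe
class Bench {
    // Hypothetical helper: runs `fn` `calls` times and returns the
    // elapsed wall-clock time in seconds.
    public static function secondsFor(calls:Int, fn:Void->Void):Float {
        var start = haxe.Timer.stamp();
        for (i in 0...calls) fn();
        return haxe.Timer.stamp() - start;
    }

    static function main() {
        var root = Xml.parse("<root><node/><node/></root>").firstElement();
        var elapsed = secondsFor(1000, function() {
            // Stand-in workload: walk the children named "node".
            for (n in root.elementsNamed("node")) {}
        });
        trace("seconds per 1000 calls: " + elapsed);
    }
}
```

A real comparison would use a much larger document and repeat each run several times, since a single pass over a tiny tree is dominated by timer resolution.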

If anyone has any idea how to use the @ symbol in method names in Haxe (without the compiler complaining), let me know and I’ll make attribute accessors match the spec.

I will be releasing this code as part of an upcoming Haxe library called “xml-tools”, but until then, feel free to download the E4X class here.

Shout out if you have any issues.