VTD-XML Benchmark Report for Version 2.3

(Part II) Parsing Based Update Performance

Objective

This benchmark reports some of the latest performance numbers for VTD-XML version 2.3 and compares them to those of Xerces 2.7.1.

Methodology Discussion

What is measured?

This benchmark report focuses on the combination of three facets of XML content update performance: parsing, XPath evaluation, and XML modification.

How to measure?

The benchmark code performs the following steps:

  1. Read the XML file into an in-memory buffer.
  2. Parse the document.
  3. Evaluate a single pre-compiled XPath expression.
  4. Remove the nodes in the result set from the XML document.
  5. Write the updated document out into a ByteArrayOutputStream.

All benchmark programs first loop through the benchmarked code for a number of iterations so that the server JVM compiles it into native code and reaches optimal performance. Only then does the real measurement begin: the average latency of parsing, XPath evaluation, and output generation combined.
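
The warm-up-then-measure pattern described above can be sketched in isolation. This is a minimal illustration, not the report's harness: the class name WarmupTimer, the method averageLatencyMs, and the toy workload are hypothetical, and System.nanoTime() is swapped in for the System.currentTimeMillis() calls used by the benchmark programs.

```java
// Hypothetical helper illustrating the methodology; not part of VTD-XML or Xerces.
class WarmupTimer {

    /**
     * Runs the task untimed for warmupIterations (so the JIT can compile it),
     * then times rounds * iterationsPerRound calls and returns the average
     * latency of one call in milliseconds.
     */
    static double averageLatencyMs(Runnable task, int warmupIterations,
                                   int rounds, int iterationsPerRound) {
        // Warm-up phase: results are discarded.
        for (int k = 0; k < warmupIterations; k++) {
            task.run();
        }
        // Measurement phase: accumulate elapsed time per round.
        long totalNanos = 0;
        for (int j = 0; j < rounds; j++) {
            long start = System.nanoTime();
            for (int i = 0; i < iterationsPerRound; i++) {
                task.run();
            }
            totalNanos += System.nanoTime() - start;
        }
        // Convert to milliseconds and average over every timed call.
        return totalNanos / 1e6 / rounds / iterationsPerRound;
    }

    public static void main(String[] args) {
        // Toy workload standing in for parse + XPath + update + serialize.
        StringBuilder sink = new StringBuilder();
        Runnable task = () -> {
            sink.setLength(0);
            for (int i = 0; i < 1000; i++) {
                sink.append(i);
            }
        };
        double avg = averageLatencyMs(task, 2000, 10, 1000);
        System.out.println(" average combined latency ==> " + avg + " ms");
    }
}
```

Dividing by both the number of rounds and the iterations per round mirrors the "lt / total / 10" computation in the benchmark code.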

Test Setup

Hardware

  • Processor: Core 2 Duo T9300, 2.5 GHz (6 MB L2 integrated cache)
  • Memory: 3 GB, 800 MHz FSB

Software

  • Windows Vista
  • JDK version 1.6.0; default set to server JVM.
  • XML parsers: Xerces 2.7.1 DOM (with and without deferred node expansion) and SAX; VTD-XML 2.3 (with and without buffer reuse)

The XML files

Four XML files of similar structure, but different sizes, are used for the test.

<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
    </shipTo>
    <billTo country="US">
        <name> Robert Smith </name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
    </billTo>
    <comment>Hurry, my lawn is going wild!</comment>
    <items>
        <item partNum="872-AA">
            <productName>Lawnmower</productName>
            <quantity></quantity>
            <USPrice>148.95</USPrice>
            <comment>Confirm this is electric</comment>
        </item>
        <item partNum="926-AA">
            <productName>Baby Monitor</productName>
            <quantity>1</quantity>
            <USPrice>39.98</USPrice>
            <shipDate>1999-05-21</shipDate>
        </item>
        ...
    </items>
</purchaseOrder>

The respective file sizes are:

  • "po_small.xml":  6,780 bytes
  • "po_medium.xml": 112,238 bytes
  • "po_big.xml":    1,219,388 bytes
  • "po_huge.xml":   9,907,759 bytes

The following XPath expressions are used for the test:

  • /*/*/*[position() mod 2 = 0]
  • /purchaseOrder/items/item[USPrice<100]
  • /*/*/*/quantity/text()
  • //item/comment
  • //item/comment/../quantity

The Benchmark Code

DOM Code

// Warm-up: run the full parse / evaluate / remove / serialize cycle
// so the server JVM compiles it to native code before timing starts.
Document d = null;
k = total;
while (k > 0) {
        bais.reset();
        d = parser.parse(bais);
        NodeList nodeList = (NodeList) xPathExpression.evaluate(d,
                XPathConstants.NODESET);
        //System.out.println("# of nodes ==>" + nodeList.getLength());

        // remove nodes from DOM tree
        for (int z = 0; z < nodeList.getLength(); z++) {
            nodeList.item(z).getParentNode().removeChild(
                    nodeList.item(z));
        }
        baos.reset();
        tf.transform(new DOMSource(d), new StreamResult(baos));
        k--;
}

// Measurement: 10 timed rounds of `total` iterations each.
long l = 0, lt = 0;
for (int j = 0; j < 10; j++) {
        l = System.currentTimeMillis();
        for (int i = 0; i < total; i++) {
            bais.reset();
            d = parser.parse(bais);
            NodeList nodeList = (NodeList) xPathExpression.evaluate(d,
                    XPathConstants.NODESET);
            //System.out.println("# of nodes ==>" + nodeList.getLength());

            // remove nodes from DOM tree
            for (int z = 0; z < nodeList.getLength(); z++) {
                nodeList.item(z).getParentNode().removeChild(
                        nodeList.item(z));
            }
            baos.reset();
            tf.transform(new DOMSource(d), new StreamResult(baos));
        }
        long l2 = System.currentTimeMillis();
        lt = lt + (l2 - l);
}
// Average over all 10 * total timed iterations.
System.out.println(" average combined latency ==> "
        + ((double) (lt) / total / 10) + " ms");

VTD-XML Code

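// i1, l, and lt are assumed to be declared earlier in the program
// (an int and two longs, with lt starting at 0).

// Warm-up: run the full parse / evaluate / remove / output cycle
// so the server JVM compiles it to native code before timing starts.
k = total;
VTDNav vn = null;
while (k > 0) {
        vg.setDoc(b);
        vg.parse(true);
        vn = vg.getNav();
        ap.bind(vn);
        xm.bind(vn);
        // mark every node in the result set for removal
        while ((i1 = ap.evalXPath()) != -1) {
            xm.remove();
        }
        xm.output(baos);
        ap.resetXPath();
        xm.reset();
        baos.reset();
        k--;
}

// Measurement: 10 timed rounds of `total` iterations each.
for (int j = 0; j < 10; j++) {
        l = System.currentTimeMillis();
        for (int i = 0; i < total; i++) {
            vg.setDoc(b);
            vg.parse(true);
            vn = vg.getNav();
            ap.bind(vn);
            xm.bind(vn);
            // mark every node in the result set for removal
            while ((i1 = ap.evalXPath()) != -1) {
                xm.remove();
            }
            xm.output(baos);
            ap.resetXPath();
            xm.reset();
            baos.reset();
        }
        long l2 = System.currentTimeMillis();
        lt = lt + (l2 - l);
        //System.out.println("j ==> "+j);
}
// Average over all 10 * total timed iterations.
System.out.println(" average combined latency ==> "
        + ((double) (lt) / total / 10) + " ms");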

 

Results

Absolute Latency Comparison

/*/*/*[position() mod 2 = 0]

File (size)                      DOM deferred (ms)  DOM full (ms)  VTD-XML (ms)  VTD-XML buffer reuse (ms)
po_small.xml (6,780 bytes)       1.0919             1.014          0.102         0.0938
po_medium.xml (112,238 bytes)    9.98625            8.867          1.3475        1.308
po_big.xml (1,060,823 bytes)     112.05             95.825         14.47         14.251
po_huge.xml (9,907,759 bytes)    1456.3             1331.63        119.74        123.7

/purchaseOrder/items/item[USPrice<100]

File (size)                      DOM deferred (ms)  DOM full (ms)  VTD-XML (ms)  VTD-XML buffer reuse (ms)
po_small.xml (6,780 bytes)       1.212              1.12           0.126         0.1165
po_medium.xml (112,238 bytes)    11.91575           10.925         1.863         1.78275
po_big.xml (1,060,823 bytes)     137.6666667        116.4          19.83         18.996
po_huge.xml (9,907,759 bytes)    1634.25            1452.17        159.53        158.57

/*/*/*/quantity/text()

File (size)                      DOM deferred (ms)  DOM full (ms)  VTD-XML (ms)  VTD-XML buffer reuse (ms)
po_small.xml (6,780 bytes)       2.416              1.917          0.187         0.176
po_medium.xml (112,238 bytes)    36.104             28.646         2.841         2.766
po_big.xml (1,060,823 bytes)     319.81             347.065        38.355        31.511
po_huge.xml (9,907,759 bytes)    1773.59            1597.37        159.53        158.57

//item/comment

File (size)                      DOM deferred (ms)  DOM full (ms)  VTD-XML (ms)  VTD-XML buffer reuse (ms)
po_small.xml (6,780 bytes)       2.488              2.058          0.25          0.237
po_medium.xml (112,238 bytes)    38.948             31.365         3.738         3.637
po_big.xml (1,060,823 bytes)     361.52             375.741        47.101        42.126
po_huge.xml (9,907,759 bytes)    1833.15            1661.23        208.66        210.4

//item/comment/../quantity

File (size)                      DOM deferred (ms)  DOM full (ms)  VTD-XML (ms)  VTD-XML buffer reuse (ms)
po_small.xml (6,780 bytes)       2.531              2.067          0.259         0.248
po_medium.xml (112,238 bytes)    41.212             32.832         3.98          3.865
po_big.xml (1,060,823 bytes)     395.87             383.85         49.355        44.413
po_huge.xml (9,907,759 bytes)    1872.19            1667.06        224.07        219.81
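
A relative comparison can be derived from the absolute numbers by dividing one latency by another. The sketch below does this for the first table above (/*/*/*[position() mod 2 = 0]), taking "DOM full" as the baseline and plain "VTD-XML" as the candidate. The choice of these two columns is an assumption made for illustration, since the text does not say how the charts are normalized, and the class name RelativeLatency is hypothetical.

```java
// Hypothetical helper for illustration; figures are copied from the first results table.
class RelativeLatency {

    /** Returns how many times slower the baseline is than the candidate. */
    static double speedup(double baselineMs, double candidateMs) {
        return baselineMs / candidateMs;
    }

    public static void main(String[] args) {
        String[] files = {"po_small.xml", "po_medium.xml", "po_big.xml", "po_huge.xml"};
        double[] domFullMs = {1.014, 8.867, 95.825, 1331.63};
        double[] vtdXmlMs = {0.102, 1.3475, 14.47, 119.74};

        for (int i = 0; i < files.length; i++) {
            System.out.printf("%-14s DOM full / VTD-XML = %.2f%n",
                    files[i], speedup(domFullMs[i], vtdXmlMs[i]));
        }
    }
}
```

For po_small.xml, for example, 1.014 ms divided by 0.102 ms gives a ratio of a little under 10.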

 

Graphical View of Relative Latency Comparison

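[Charts not reproduced here. One relative latency chart is given for each XPath expression:]

  • /*/*/*[position() mod 2 = 0]
  • /purchaseOrder/items/item[USPrice<100]
  • /*/*/*/quantity/text()
  • //item/comment
  • //item/comment/../quantity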