VTD-XML Benchmark Report for Version 2.3

(Part I) Indexing-Based Update Performance

Objective

This benchmark reports indexing-related performance numbers for VTD-XML version 2.3 and compares them to those of Xerces 2.7.1.

Methodology Discussion

How to measure?

The benchmark code performs the following steps:

  1. Load the in-memory XML index file into a VTDNav instance.
  2. Evaluate a single pre-compiled XPath expression.
  3. Remove the nodes in the result set from the XML.
  4. Write the updated document out into a ByteArrayOutputStream object.

All benchmark programs first loop through the benchmarked code for a number of iterations so that the server JVM compiles it into native code and reaches optimal performance. Only then is the real measurement taken: the average latency combining index loading (parsing, in the DOM case), XPath evaluation, node removal, and output generation.
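The warm-up-then-measure pattern described above can be sketched in isolation. This is a minimal illustration, not the benchmark's actual harness: the class name WarmupHarness, the method averageLatencyMs, and the stand-in workload are all hypothetical, and System.nanoTime() is used here in place of the System.currentTimeMillis() calls found in the benchmark code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical harness; class and method names are illustrative only.
final class WarmupHarness {

    /**
     * Runs the task `iterations` times untimed (warm-up), then times
     * `rounds` batches of `iterations` runs each and returns the mean
     * latency of a single run in milliseconds.
     */
    static double averageLatencyMs(Runnable task, int iterations, int rounds) {
        // Warm-up pass: gives the JIT compiler a chance to compile the hot path.
        for (int k = 0; k < iterations; k++) {
            task.run();
        }
        // Timed passes: accumulate elapsed time over all rounds.
        long totalNanos = 0;
        for (int j = 0; j < rounds; j++) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                task.run();
            }
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / iterations / rounds;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[64 * 1024];
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // Stand-in workload (an assumption): copy a buffer into a reused stream.
        Runnable task = () -> {
            baos.reset();
            baos.write(payload, 0, payload.length);
        };
        double ms = averageLatencyMs(task, 1000, 10);
        System.out.println(" average combined latency ==> " + ms + " ms");
    }
}
```

Reusing one output stream and calling reset() on it, as the benchmark code does, keeps buffer allocation out of the timed region.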

Test Setup

Hardware

  • Processor: Core2 Duo T9300 2.5GHz (6MB L2 integrated cache).
  • Memory: 3GB 800MHz FSB

Software

  • Windows Vista
  • JDK version 1.6.0; default set to server JVM.
  • XML parsers: Xerces DOM (with and without node expansion) and SAX version 2.7.1, VTD-XML 2.3 (with and without buffer reuse)

The XML files

Four XML files of similar structure but different sizes are used for the test.

<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
    <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
    </shipTo>
    <billTo country="US">
        <name> Robert Smith </name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
    </billTo>
    <comment>Hurry, my lawn is going wild!</comment>
    <items>
        <item partNum="872-AA">
            <productName>Lawnmower</productName>
            <quantity></quantity>
            <USPrice>148.95</USPrice>
            <comment>Confirm this is electric</comment>
        </item>
        <item partNum="926-AA">
            <productName>Baby Monitor</productName>
            <quantity>1</quantity>
            <USPrice>39.98</USPrice>
            <shipDate>1999-05-21</shipDate>
        </item>
        ...
    </items>
</purchaseOrder>

The respective file sizes are:

  • "po_small.xml" ----  6,780 bytes
  • "po_medium.xml" ---- 112,238 bytes
  • "po_big.xml" -----   1,219,388 bytes
  • "po_huge.xml" ----- 9,907,759 bytes

The following XPath expressions are used for the test:

  • /*/*/*[position() mod 2 = 0]
  • /purchaseOrder/items/item[USPrice<100]
  • /*/*/*/quantity/text()
  • //item/comment
  • //item/comment/../quantity
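To make concrete what these expressions select, the following sketch counts the matching nodes for each one using the JDK's built-in DOM and XPath APIs. The document string is an abridged, hand-made version of the purchase order shown earlier (an assumption made for brevity, not one of the benchmark files), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Illustrative only: the class name and the abridged document are assumptions.
final class XPathSelectionDemo {

    // Abridged purchase order with two items; not one of the benchmark files.
    static final String SAMPLE =
        "<purchaseOrder orderDate=\"1999-10-20\">"
        + "<shipTo country=\"US\"><name>Alice Smith</name><zip>90952</zip></shipTo>"
        + "<items>"
        + "<item partNum=\"872-AA\"><productName>Lawnmower</productName>"
        + "<quantity></quantity><USPrice>148.95</USPrice>"
        + "<comment>Confirm this is electric</comment></item>"
        + "<item partNum=\"926-AA\"><productName>Baby Monitor</productName>"
        + "<quantity>1</quantity><USPrice>39.98</USPrice>"
        + "<shipDate>1999-05-21</shipDate></item>"
        + "</items></purchaseOrder>";

    /** Parses the XML and returns how many nodes the XPath expression selects. */
    static int countMatches(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/*/*/*[position() mod 2 = 0]",
            "/purchaseOrder/items/item[USPrice<100]",
            "/*/*/*/quantity/text()",
            "//item/comment",
            "//item/comment/../quantity"
        };
        for (String expression : expressions) {
            System.out.println(expression + " ==> "
                + countMatches(SAMPLE, expression) + " node(s)");
        }
    }
}
```

The expressions range from purely positional selection to value predicates and descendant-axis searches, so they exercise different parts of an XPath engine.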

The Benchmark Code

DOM Code

Document d = null;
// Warm-up: run the full parse/evaluate/remove/serialize cycle so the
// server JVM compiles it into native code before timing starts.
k = total;
while (k > 0) {
    bais.reset();
    d = parser.parse(bais);
    NodeList nodeList = (NodeList) xPathExpression.evaluate(d,
            XPathConstants.NODESET);
    //System.out.println("# of nodes ==>" + nodeList.getLength());

    // remove nodes from DOM tree
    for (int z = 0; z < nodeList.getLength(); z++) {
        nodeList.item(z).getParentNode().removeChild(
                nodeList.item(z));
    }
    baos.reset();
    tf.transform(new DOMSource(d), new StreamResult(baos));
    k--;
}

// Measurement: 10 rounds of `total` iterations each.
long l = 0, lt = 0;
for (int j = 0; j < 10; j++) {
    l = System.currentTimeMillis();
    for (int i = 0; i < total; i++) {
        bais.reset();
        d = parser.parse(bais);
        NodeList nodeList = (NodeList) xPathExpression.evaluate(d,
                XPathConstants.NODESET);
        //System.out.println("# of nodes ==>" + nodeList.getLength());

        // remove nodes from DOM tree
        for (int z = 0; z < nodeList.getLength(); z++) {
            nodeList.item(z).getParentNode().removeChild(
                    nodeList.item(z));
        }
        baos.reset();
        tf.transform(new DOMSource(d), new StreamResult(baos));
    }
    long l2 = System.currentTimeMillis();
    lt = lt + (l2 - l);
}
System.out.println(" average combined latency ==> "
        + ((double) (lt) / total / 10) + " ms");

VTD-XML Code

// Warm-up: load the index, evaluate the XPath, remove the matches and
// write the output, so the server JVM compiles this path into native code.
k = total;
VTDNav vn = null;
int i1;
while (k > 0) {
    vn = vg.loadIndex(ba);
    ap.bind(vn);
    xm.bind(vn);
    while ((i1 = ap.evalXPath()) != -1) {
        xm.remove();
    }
    xm.output(baos);
    ap.resetXPath();
    xm.reset();
    baos.reset();
    k--;
}

// Measurement: 10 rounds of `total` iterations each.
long l = 0, lt = 0;
for (int j = 0; j < 10; j++) {
    l = System.currentTimeMillis();
    for (int i = 0; i < total; i++) {
        vn = vg.loadIndex(ba);
        ap.bind(vn);
        xm.bind(vn);
        while ((i1 = ap.evalXPath()) != -1) {
            xm.remove();
        }
        xm.output(baos);
        ap.resetXPath();
        xm.reset();
        baos.reset();
    }
    long l2 = System.currentTimeMillis();
    lt = lt + (l2 - l);
    //System.out.println("j ==> "+j);
}
System.out.println(" average combined latency ==> "
        + ((double) (lt) / total / 10) + " ms");

 

Results

Absolute Latency Comparison

/*/*/*[position() mod 2 = 0]

File (size)                      DOM deferred (ms)   DOM full (ms)   VTD-XML indexing (ms)
po_small.xml (6,780 bytes)       1.09                1.00515         0.0398
po_medium.xml (112,238 bytes)    10.02               8.7095          0.465
po_big.xml (1,060,823 bytes)     110.43              94.191          4.993
po_huge.xml (9,907,759 bytes)    1442.62             1305.76         47.02

/purchaseOrder/items/item[USPrice<100]

File (size)                      DOM deferred (ms)   DOM full (ms)   VTD-XML indexing (ms)
po_small.xml (6,780 bytes)       1.2                 1.1             0.064
po_medium.xml (112,238 bytes)    11.95               10.65           0.888
po_big.xml (1,060,823 bytes)     133.64              111.57          9.555
po_huge.xml (9,907,759 bytes)    1651.1              1466.54         84.725

/*/*/*/quantity/text()

File (size)                      DOM deferred (ms)   DOM full (ms)   VTD-XML indexing (ms)
po_small.xml (6,780 bytes)       1.18                1.1             0.056
po_medium.xml (112,238 bytes)    12.08               11.195          0.84
po_big.xml (1,060,823 bytes)     140.59              121.39          9.08
po_huge.xml (9,907,759 bytes)    1651.1              1584.82         83.39

//item/comment

File (size)                      DOM deferred (ms)   DOM full (ms)   VTD-XML indexing (ms)
po_small.xml (6,780 bytes)       1.29                1.19            0.091
po_medium.xml (112,238 bytes)    13.27               12.47           1.32
po_big.xml (1,060,823 bytes)     152.95              137.94          15.13
po_huge.xml (9,907,759 bytes)    1812                1645.15         130.95

//item/comment/../quantity

File (size)                      DOM deferred (ms)   DOM full (ms)   VTD-XML indexing (ms)
po_small.xml (6,780 bytes)       1.25                1.21            0.0982
po_medium.xml (112,238 bytes)    13.69               12.42           1.47
po_big.xml (1,060,823 bytes)     165.7               146.77          17.585
po_huge.xml (9,907,759 bytes)    1841.8              1678.94         144.4
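A relative comparison can be derived directly from the absolute latencies. The sketch below divides each DOM latency by the corresponding VTD-XML latency for the first expression, /*/*/*[position() mod 2 = 0], using the values reported in its table. Treating "relative latency" as this ratio is an assumption about how the comparison is normalized, and the class and method names are hypothetical.

```java
import java.util.Locale;

// Illustrative helper; names are hypothetical. Latency values are copied
// from the "/*/*/*[position() mod 2 = 0]" results table.
final class RelativeLatency {

    /** Returns how many times slower the DOM run is than the VTD-XML run. */
    static double speedup(double domMs, double vtdMs) {
        return domMs / vtdMs;
    }

    public static void main(String[] args) {
        String[] files = {"po_small.xml", "po_medium.xml", "po_big.xml", "po_huge.xml"};
        double[] domDeferred = {1.09, 10.02, 110.43, 1442.62};
        double[] domFull = {1.00515, 8.7095, 94.191, 1305.76};
        double[] vtdIndexing = {0.0398, 0.465, 4.993, 47.02};

        for (int i = 0; i < files.length; i++) {
            System.out.printf(Locale.ROOT,
                "%-14s DOM deferred / VTD-XML = %.1fx, DOM full / VTD-XML = %.1fx%n",
                files[i],
                speedup(domDeferred[i], vtdIndexing[i]),
                speedup(domFull[i], vtdIndexing[i]));
        }
    }
}
```

The same calculation applies to the other four tables by substituting their rows.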

 

Graphical View of Relative Latency Comparison

[Figures: relative latency charts, one for each of the five XPath expressions: /*/*/*[position() mod 2 = 0], /purchaseOrder/items/item[USPrice<100], /*/*/*/quantity/text(), //item/comment, and //item/comment/../quantity.]