XML Parsing Performance Benchmark of VTD-XML 1.5

Objective

Since its initial release, VTD-XML has undergone several rounds of improvement. This report shows how well VTD-XML fares against some well-known XML parsers. The old version of the benchmark can be found here.

Testing Methodology

Hardware

Software

Benchmark Apps

    For DOM and VTD-XML, the benchmark programs generate hierarchical structures.

    For SAX and Pull parsers, the benchmark programs scan each document in its entirety without any processing logic.
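The two kinds of benchmark run described above can be sketched with the parsers bundled in the JDK. This is only an illustration under stated assumptions, not the original benchmark code: the class name `ParseModes`, its method names, and the sample document are hypothetical, and the VTD-XML side is omitted because it needs the VTD-XML library on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; names are illustrative, not from the benchmark.
final class ParseModes {

    // DOM-style run: parse the bytes and return the in-memory tree.
    static Document buildDom(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX-style run: DefaultHandler ignores every event, so the parser
    // only scans the document and no processing logic is executed.
    static void scanSax(byte[] xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new ByteArrayInputStream(xml), new DefaultHandler());
        } catch (Exception e) {
            throw new IllegalStateException("SAX scan failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document, not one of the benchmark files.
        byte[] xml = "<catalog><cd><title>Demo</title></cd></catalog>"
                .getBytes(StandardCharsets.UTF_8);
        Document doc = buildDom(xml);
        scanSax(xml);
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
    }
}
```

The DOM run returns a structure that can be navigated afterwards, while the SAX run leaves nothing behind once the scan completes.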

Notes on Performance Tuning and Performance Measurement

    For performance numbers, all benchmark programs first loop through the parsing code for a number of iterations so that the server JVM compiles it into native code for optimal performance. The real measurement of parsing throughput and latency starts only after this warm-up.

    It should be noted that comparing VTD-XML with SAX or Pull is not really a fair comparison: VTD-XML allows random access, whereas SAX and Pull are forward-only.

    A wide selection of XML files, ranging from very small (1 KB) to big (15 MB), was chosen and grouped into small (<10 KB), medium-sized (<1 MB), and big.

    Benchmark programs for measuring parsing performance can be downloaded below:

    Benchmark programs for measuring memory usage can be downloaded below:

    Benchmark programs for doing node iteration can be downloaded below:

    XML files used in the benchmark can be downloaded here.
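The warm-up procedure described in the notes above can be sketched as a small timing harness. This is a minimal sketch under assumptions, not the original benchmark program: the class `WarmupBenchmark`, its method names, the iteration counts, the generated document, and the choice of 1 MB = 1024 × 1024 bytes for throughput are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical harness; names and iteration counts are illustrative.
final class WarmupBenchmark {

    // Runs the task warmupRuns times untimed (giving the JIT compiler a
    // chance to compile the hot path), then returns the average latency
    // in milliseconds over measuredRuns timed runs.
    static double averageLatencyMs(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1e6 / measuredRuns;
    }

    // Throughput in MB/s, assuming 1 MB = 1024 * 1024 bytes.
    static double throughputMBps(long sizeBytes, double latencyMs) {
        return (sizeBytes / (1024.0 * 1024.0)) / (latencyMs / 1000.0);
    }

    public static void main(String[] args) throws Exception {
        // Generated stand-in document, not one of the benchmark files.
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < 1000; i++) {
            sb.append("<order id=\"").append(i).append("\">item</order>");
        }
        sb.append("</orders>");
        byte[] xml = sb.toString().getBytes(StandardCharsets.UTF_8);

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DefaultHandler noOp = new DefaultHandler();
        Runnable parseOnce = () -> {
            try {
                parser.parse(new ByteArrayInputStream(xml), noOp);
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        };

        double ms = averageLatencyMs(parseOnce, 2000, 2000);
        System.out.printf("latency: %.4f ms, throughput: %.1f MB/s%n",
                ms, throughputMBps(xml.length, ms));
    }
}
```

Keeping the warm-up loop outside the timed region means the reported latency reflects compiled code rather than interpreter start-up.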

Parsing Performance

Throughput Comparison

 

Latency Comparison

Small Files

File name/size | VTD-XML (ms) | VTD-XML buffer reuse (ms) | SAX (ms) | DOM (ms) | DOM deferred (ms) | Piccolo (ms) | Pull (ms)
soap2.xml (1727 bytes) | 0.0446 | 0.0346 | 0.0782 | 0.1122 | 0.16225 | 0.092 | 0.066
nav_48_0.xml (4608 bytes) | 0.1054 | 0.0928 | 0.266 | 0.37 | 0.385 | 0.2784 | 0.1742
cd_catalog.xml (5035 bytes) | 0.118 | 0.108 | 0.19 | 0.348 | 0.4 | 0.2 | 0.214
nav_63_0.xml (6848 bytes) | 0.149 | 0.135 | 0.354 | 0.513 | 0.557 | 0.484 | 0.242
nav_78_0.xml (6920 bytes) | 0.153 | 0.142 | 0.3704 | 0.588 | 0.52 | 0.42 | 0.29

Mid-Sized Files

File name/size | VTD-XML (ms) | VTD-XML buffer reuse (ms) | SAX (ms) | DOM (ms) | DOM deferred (ms) | Piccolo (ms) | Pull (ms)
nav_50_0.xml (10304 bytes) | 0.2 | 0.185 | 0.55 | 0.802 | 0.773 | 0.701 | 0.398
officeOrder.xml (10591 bytes) | 0.186 | 0.174 | 0.41 | 0.617 | 0.615 | 0.526 | 0.432
form.xml (15845 bytes) | 0.274 | 0.258 | 0.227 | 0.214 | 0.486 | 0.773 | 0.921
book.xml (22996 bytes) | 0.368 | 0.354 | 0.743 | 2.391 | 2.046 | 0.843 | 0.857
soap_small.xml (26734 bytes) | 0.58 | 0.563 | 1.221 | 3.825 | 3.068 | 1.346 | 1.137
cd.xml (30831 bytes) | 0.569 | 0.549 | 1.205 | 5.092 | 4.376 | 1.211 | 1.362
bioinfo.xml (34759 bytes) | 0.529 | 0.517 | 1.068 | 4.126 | 4.366 | 1.188 | 1.33
soap_mid.xml (134334 bytes) | 2.885 | 2.804 | 6.028 | 32.846 | 21.896 | 6.668 | 5.828

Large Files

File name/size | VTD-XML (ms) | VTD-XML buffer reuse (ms) | SAX (ms) | DOM (ms) | DOM deferred (ms) | Piccolo (ms) | Pull (ms)
po1m.xml (1.01 MB) | 25.71 | 20.08 | 36.4 | 186.16 | 115.67 | 47.62 | 63.27
soap.xml (2.59 MB) | 64.7 | 57.27 | 123.18 | 502.32 | 380.74 | 134.8 | 393.96
bioinfo_big.xml (4.27 MB) | 70.1 | 73.9 | 131.8 | 629.1 | 442.02 | 151.62 | 177.64
SUAS.xml (13.13 MB) | 359.91 | 315.24 | 665.36 | 1961.01 | 1296.08 | 820.38 | 637.72
address.xml (15.24 MB) | 315.06 | 276 | 658.56 | 2158.5 | 1822.22 | 617.48 | 684.57

 

Memory Usage

Because SAX and Pull do not build data structures in memory, the meaningful comparison is between DOM and VTD-XML. To that end, we benchmark the multiplying factor, which is the ratio of memory usage to document size.
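One rough way to estimate the multiplying factor for a DOM tree is to sample the used heap before and after parsing and divide the difference by the document size. The sketch below is an assumption about how such a measurement could look, not the benchmark's actual method: `MemoryFactor` and its methods are hypothetical names, and because `System.gc()` is only a hint to the JVM, the result is approximate.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper; heap sampling gives an estimate, not an exact figure.
final class MemoryFactor {

    // Multiplying factor = memory used by the parsed structure / document size.
    static double multiplyingFactor(long usedBytes, long documentBytes) {
        if (documentBytes <= 0) {
            throw new IllegalArgumentException("document size must be positive");
        }
        return (double) usedBytes / documentBytes;
    }

    // Approximate used heap; System.gc() is only a request to the JVM.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws Exception {
        // Generated stand-in document, not one of the benchmark files.
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < 20000; i++) {
            sb.append("<item>").append(i).append("</item>");
        }
        sb.append("</root>");
        byte[] xml = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Create the builder first so its own allocations are not counted.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        long before = usedHeapBytes();
        Document doc = builder.parse(new ByteArrayInputStream(xml));
        long after = usedHeapBytes();

        System.out.printf("multiplying factor: %.2f%n",
                multiplyingFactor(after - before, xml.length));
        // Use the tree after measuring so it stays reachable during sampling.
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
    }
}
```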

Navigation Performance

The goal of this part is to benchmark how quickly each XML parser visits every single node once it has finished building the hierarchical structure.
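For the DOM side, visiting every node can be sketched as a recursive walk over child nodes. This is an illustrative sketch only: `DomWalk` and its methods are hypothetical names, the sample document is made up, and the walk follows child links, so attributes (which are not children in DOM) are not counted. The VTD-XML counterpart is omitted because it requires the VTD-XML library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper; names are illustrative, not from the benchmark.
final class DomWalk {

    // Visits the given node and all of its descendants, returning how many
    // nodes were visited (the starting node included).
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    // Builds the DOM tree that the walk operates on.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<catalog><cd><title>Demo</title></cd><cd/></catalog>");
        long start = System.nanoTime();
        int visited = countNodes(doc);
        long elapsedNanos = System.nanoTime() - start;
        System.out.println("visited " + visited + " nodes in " + elapsedNanos / 1e6 + " ms");
    }
}
```

Timing only the walk, after the tree is already built, matches the separation between parsing cost and navigation cost made in this section.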

Small Files

File name/size | VTD-XML (ms) | DOM (ms)
soap2.xml (1727 bytes) | 0.00671 | 0.00676
nav_48_0.xml (4608 bytes) | 0.028 | 0.0155
cd_catalog.xml (5035 bytes) | 0.0388 | 0.0385
nav_63_0.xml (6848 bytes) | 0.0431 | 0.0238
nav_78_0.xml (6920 bytes) | 0.043 | 0.0244

Mid-Sized Files

File name/size | VTD-XML (ms) | DOM (ms)
nav_50_0.xml (10304 bytes) | 0.063 | 0.034
officeOrder.xml (10591 bytes) | 0.0788 | 0.051
form.xml (15845 bytes) | 0.065 | 0.046
book.xml (22996 bytes) | 0.149 | 0.144
soap_small.xml (26734 bytes) | 0.225 | 0.193
cd.xml (30831 bytes) | 0.226 | 0.3
bioinfo.xml (34759 bytes) | 0.236 | 0.178
soap_mid.xml (134334 bytes) | 1.61 | 1.151

Large Files

File name/size | VTD-XML (ms) | DOM (ms)
po1m.xml (1.01 MB) | 11.19 | 10.84
soap.xml (2.59 MB) | 32.84 | 35.44
bioinfo_big.xml (4.27 MB) | 30.43 | 38.26
SUAS.xml (13.13 MB) | 21.43 | 21.82
address.xml (15.24 MB) | 132.18 | 130.8

 

Conclusion

As a next-generation XML parser, VTD-XML delivers compelling improvements in both memory usage and parsing performance. Moreover, these benefits apply across file sizes and purposes. It should therefore satisfy even the most demanding XML processing needs and enable new and exciting XML applications.