iQ, meet DOM

I’m spending some time on iQ today.

I’m trying to stay mindful of performance impacts.

  1. I anticipate users may want to work on large iXBRL documents. The Massive Dynamic sample is 2.8MB.
  2. Or users may leverage iXBRL document sets, which they would like to query as a single document
  3. Or iQ tools might make it possible to stitch together disparate iXBRL documents (e.g. Google vs. Apple)

iQ is document-based. It’s not supposed to handle database-sized quantities of data. But I don’t want to be short-sighted.
I wanted to apply what I’ve learned recently about document.querySelectorAll.

The results are enlightening!

  • It is a method which accepts CSS selector(s); “compiled code” in the heart of the browser then returns a NodeList
  • It’s like jQuery, except that jQuery’s source code is first of all “JavaScript code” (not compiled code), though it may try to route to compiled code
    • jQuery’s own JavaScript has to run before it can route your selector to something like document.querySelectorAll.

The two functions below are run against the Massive Dynamic iXBRL sample document:

Both functions use a startDate/endDate strategy to measure the speed of a selector method

  1. The first uses jQuery, in the form of $
  2. The second uses document.querySelectorAll
    1. (Notice that the colon which separates the prefix from the local part must be escaped with a backslash, for the selector’s sake. That backslash must itself be escaped with a backslash, for JavaScript’s sake!)
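function test1() {
  var startDate = new Date();
  // The timed call is missing from the post; a jQuery selector such as
  // $('ix\\:nonFraction') is assumed to have gone here.
  var endDate = new Date();
  return endDate - startDate; // elapsed milliseconds
}

function test2() {
  var startDate = new Date();
  // Assumed counterpart: document.querySelectorAll('ix\\:nonFraction')
  var endDate = new Date();
  return endDate - startDate; // elapsed milliseconds
}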
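
The double escaping described above can be checked without a browser. This is only a minimal sketch: `escapeQName` is a hypothetical helper name (not part of iQ or jQuery), and it handles nothing but the colon.

```javascript
// Hypothetical helper: make a prefixed name like "ix:nonFraction" safe to use
// as a CSS type selector by putting a backslash before each colon.
function escapeQName(qname) {
  // '\\:' in JavaScript source is the two characters \ and :
  return qname.replace(/:/g, '\\:');
}

var selector = escapeQName('ix:nonFraction');
console.log(selector);        // the selector text the browser actually sees
console.log(selector.length); // one character longer than the input
```

In a browser, the result could then be handed to document.querySelectorAll(selector) or $(selector).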

The results (on Chrome 27.0.1453.116 m) of a single test:

  • test1:
    • Uses jQuery
    • 7-8 ms
  • test2:
    • Uses document.querySelectorAll
    • 2-3ms

That means document.querySelectorAll took about 66% less time! (Update: See below. This is exaggerated. I need to do more scientific tests!)

Increasing the number of tests showed that it usually took between 33% and 66% less time. Still substantial!
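
For reference, here is how a figure like 66% falls out of the timings, assuming the percentage means time saved relative to jQuery and taking the midpoints of the two ranges (7.5 ms and 2.5 ms, my assumption). `percentTimeSaved` is a hypothetical name.

```javascript
// Percentage of elapsed time saved by the faster method, relative to the slower one.
function percentTimeSaved(slowMs, fastMs) {
  return ((slowMs - fastMs) / slowMs) * 100;
}

// Midpoints of the 7-8 ms and 2-3 ms ranges (assumed, not measured values).
var saved = percentTimeSaved(7.5, 2.5);
console.log(saved.toFixed(1) + '% less time');
```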

What does this mean for iQ? iQ will have many “financial” methods — helping users find values with certain elements, or dates, or precisions.

Can/should these be mapped to CSS selectors?

  • It would require “prepping” the DOM beforehand.
    • For instance: putting dates directly onto iXBRL elements
    • A context pointing to 
      • <xbrli:instant>2013-07-01</xbrli:instant>
    • Becomes an attribute on the value, like
      • data-date="2013-07-01"
  • And mapping the “financial” methods, such as a filter like {gt: ‘2013-06-30’} (all values after June 30, 2013), to a corresponding CSS selector.
    • For instance, filtering an array of context dates in the iXBRL document,
      • [‘2012-12-31’, ‘2013-03-31’, ‘2013-07-01’, ‘2013-09-30’, ‘2013-12-31’]
    • removing those which don’t meet the criteria to reduce the array:
      • [‘2013-07-01’, ‘2013-09-30’, ‘2013-12-31’]
    • and then forming a CSS selector
      • ‘[data-date="2013-07-01"], [data-date="2013-09-30"], [data-date="2013-12-31"]’
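
The filter-then-join steps above could be sketched roughly like this. It assumes dates are ISO strings (so plain string comparison orders them correctly) and that only a `gt` criterion is supported; `selectorForDates` is a hypothetical name, not an iQ method.

```javascript
// Keep the dates that satisfy the criteria, then turn each survivor into a
// [data-date="..."] attribute selector and join them into one selector group.
function selectorForDates(dates, criteria) {
  var kept = dates.filter(function (d) {
    // ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return criteria.gt === undefined || d > criteria.gt;
  });
  // Note: an empty result yields '', which querySelectorAll would reject.
  return kept.map(function (d) {
    return '[data-date="' + d + '"]';
  }).join(', ');
}

var contextDates = ['2012-12-31', '2013-03-31', '2013-07-01', '2013-09-30', '2013-12-31'];
var dateSelector = selectorForDates(contextDates, { gt: '2013-06-30' });
console.log(dateSelector);
```

The resulting string is what would be passed to document.querySelectorAll once the DOM has been prepped with data-date attributes.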

In summary, this approach requires:

  1. prepping the DOM; mapping iXBRL syntax (like contexts) into DOM/CSS-understandable syntax (like data-attributes)
  2. filtering separated arrays of characteristics; date arrays, member arrays, etc.
    1. And I guess this isn’t necessary for elements, since they’re already DOM/CSS-understandable; they’re an attribute:
      1. name=”us-gaap:Assets”

Interesting. Any thoughts?


querySelectorAll Overloads

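document.querySelectorAll appears to have two “signatures” (as if it were “overloaded” in a JavaScript sense!)

  1. The first accepts one string which is a CSS selector (or a comma-separated group of selectors)
  2. The second accepts an array of strings, each a CSS selector, and returns the combined results

Strictly speaking, only the string form is defined. The array form works because the argument is converted to a string, and converting an array to a string joins its items with commas.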

Given this, I wondered about the performance difference between the two, since they produce the same result:

  1. document.querySelectorAll(‘div,p,span’)
  2. document.querySelectorAll([‘div’, ‘p’, ‘span’])

In my case, I wasn’t looking for ‘div’, ‘p’, and ‘span’, but for iXBRL elements:

  1. “ix\:nonFraction”
  2. “ix\:nonNumeric”
  3. “ix\:denominator”
  4. “ix\:fraction”
  5. “ix\:numerator”
  6. “ix\:tuple”

I did the first method with a single comma-delimited string, and the second method with the array.

I ran this test 100 times for each set of results seen below. The “winner” (the lower average) in each set is marked.

  • Set 1: array average 7.15 (winner), string average 7.39
  • Set 2: array average 7.53, string average 7.46 (winner)
  • Set 3: array average 6.88 (winner), string average 6.95
  • Set 4: array average 7.03 (winner), string average 7.05
  • Set 5: array average 7.36, string average 7.16 (winner)

The results suggest neither has a strong performance benefit.
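
An averaging harness along these lines would produce figures like the ones above. This is a sketch, not the code I actually ran: `averageTime` is a hypothetical name, and Date.now() only has millisecond resolution, so short calls need many runs.

```javascript
// Run `fn` `runs` times and return the average elapsed milliseconds per run.
function averageTime(fn, runs) {
  var total = 0;
  for (var i = 0; i < runs; i++) {
    var start = Date.now();
    fn();
    total += Date.now() - start;
  }
  return total / runs;
}

// Placeholder workload; in the browser this would be the selector call under test.
var calls = 0;
var avg = averageTime(function () { calls++; }, 100);
console.log(calls, avg);
```

In the browser, the placeholder would be replaced by something like function () { document.querySelectorAll(selector); }.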

This does not surprise me. The difference is a matter of splitting or joining on a comma. That is trivial overhead, one way or the other.
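
The comma join can be seen directly, without a browser. The sketch assumes querySelectorAll applies ordinary string conversion to its argument, so an array arrives as its comma-joined text.

```javascript
var names = ['ix\\:nonFraction', 'ix\\:nonNumeric', 'ix\\:denominator',
             'ix\\:fraction', 'ix\\:numerator', 'ix\\:tuple'];

// Converting an array to a string joins its items with commas...
var fromArray = String(names);
// ...which is the same text as joining by hand.
var joined = names.join(',');

console.log(fromArray === joined); // true
```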

If pressed for performance, or if my implementation made it easier (I discuss filtering arrays, above…), I would use the array.

jQuery NodeList Constructor

I was curious if jQuery objects could be created from NodeLists (since they can be created from Nodes)

In fact, this may account for the difference between $(‘…’) and document.querySelectorAll(‘…’); the former may just return $(resultsOfTheLatter)

It’s as easy as using the jQuery constructor:

var nodeList = document.querySelectorAll('ix\\:nonFraction'),
    $nonFractions = $(nodeList);

And in 100 tests, converting a NodeList to a jQuery object took 0.035 (or 3.5%) of the time it took to create the NodeList itself.

So I could convert the NodeList and still keep most of the savings (3.5% overhead against a gain of at least 33%) over using jQuery with a selector directly.

Browser Differences, Namespaces, and Scientific Methods

I realized that browsers may behave differently with regard to the two selector approaches. Maybe I could try BrowserStack or jsPerf.

I also realize the namespace prefix may account for part of the difference, even though querySelectorAll can handle prefixes natively.

I’m doing some tests now in Firefox’s console, and jQuery’s performance is not quite so bad.

  • jQuery: 49 ms to find the comma-delimited string of ix elements defined above
  • document.querySelectorAll: 43 ms to do the same
  • document.querySelectorAll: 41 ms to do the same with the raw list of ix elements (before comma-joining)

3 thoughts on “iQ, meet DOM”

  1. Nate-O! I have been, albeit as a novice, parsing more and more data, either through XBRL RSS or some random DOM of a website that has the data I want for my analysis. Having not thought about or opined on ix as much as you, the epiphany that I would be parsing the DOM/CSS, not the element, is going to make my life easier in many ways. I will have to parse down to the iXBRL elements, so I can contribute to this blog in a meaningful way. Thank you for the education in querySelectorAll(‘…’). Also, your first link, iQ, takes me to edit your WordPress.

    I am not sure I understand the entirety of this blog, but, I am a big fan and I will continue to struggle to wrap my head around these brilliant and progressive posts.

  2. redpeas says:

    Thanks, Jordan. I’m glad you are making more out of the data! Unfortunately I’m doing a lot of pontificating, not a lot of parsing! Thank you for pointing out the broken link. I think I’ve fixed it.

    At some point I’ll make more progress on iQ. This blog does not make much sense. And now I realize I should be more scientific about the tests. And I still haven’t decided what approach makes the most sense!

    Someday, I hope!

    X-Be-R-Well, Nate

  3. […] methods accept a string with a CSS3 selector; like the native Javascript document.querySelectorAll(). In this case, they select the […]
