
@wycats

wycats/sizzle.md Secret

Created May 21, 2012 01:02

Sizzle

This document explains why Sizzle can't simply be replaced with querySelectorAll, and what browsers would need to do to make that possible.

API Surface

First of all, Sizzle is used as a general-purpose selector-related library for jQuery, so it has more features than querySelectorAll. You can think of it as a cleanup and superset of Selectors API Level 1. A number of the issues it addresses are addressed in Selectors API Level 2 and I'll call those out as appropriate.

This particular document is limited to document.querySelectorAll and document.findAll. A forthcoming document will cover additional Sizzle APIs.

API Note: XML

In general, Sizzle's methods work on both HTML and XML documents. In some cases, this requires additional shimming.

Sizzle(selector)

  • Similar native API: "Selectors 3" document.querySelectorAll(selector)
  • Improved native API: "Selectors 4" document.findAll(selector)

Caveat 1: XML namespaces

Both Selectors API 1 and Selectors API 2 explicitly punt on resolving namespaces. jQuery does naïve namespace resolution that works for most cases.

If possible, Selectors API 2 should define namespace resolution.

Caveat 2: XML id and class

jQuery allows .className and #ident to be used in XML. .className is aliased to [class~=className] and #ident is aliased to [id=ident].

Selectors 3 does not apply class to non-HTML documents, and both Selectors 3 and Selectors 4 leave the definition of #ident and .className to the document language. This is probably semantically correct.

  • Option 1: jQuery deprecates these aliases
  • Option 2: The browser allows these aliases in XML
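The aliasing described above can be sketched as a plain string rewrite. This is a naive illustration rather than Sizzle's actual implementation: expandXmlShorthand is a hypothetical name, and the regexes assume simple identifiers with no CSS escapes.

```javascript
// Rewrite .className and #ident into the attribute selectors they alias.
// Existing [...] blocks are split out first so that values such as
// [href="#top"] are left untouched.
function expandXmlShorthand(selector) {
  return selector.split(/(\[[^\]]*\])/).map(function (part, index) {
    if (index % 2 === 1) {
      return part; // odd indexes hold the captured [...] blocks
    }
    return part
      .replace(/\.([\w-]+)/g, '[class~="$1"]')
      .replace(/#([\w-]+)/g, '[id="$1"]');
  }).join("");
}
```

With this sketch, `expandXmlShorthand("item.note#first")` yields `item[class~="note"][id="first"]`, which could then be handed to the native API in an XML document.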

Caveat 3: Positional Selectors

jQuery provides a series of positional selectors, like :eq and :first. The semantics of these selectors are defined in terms of a top-down search, whereas most selector engines (Sizzle included) use a mostly bottom-up approach. That said, positional selectors are extremely useful:

// all p's inside first div in the document
$("div:first p")

It would be very difficult for jQuery to deprecate positional selectors, although it might be possible.

It's also not so easy to see how positional selectors could be implemented with script-defined pseudo-selectors, but it might be possible if several additional APIs were added.

  • Option 1: jQuery deprecates positional selectors
  • Option 2: jQuery leaves around Sizzle and uses it only when positional selectors are used (requires regex matching)
  • Option 3: Browsers provide enough API to implement positional selectors on top of Selectors 2
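The regex matching that Option 2 requires might look roughly like the following. It is a sketch under assumptions: the pattern covers only the positional pseudos documented by jQuery, and nativeQuery and sizzleQuery are hypothetical callbacks standing in for the two engines.

```javascript
// The lookahead keeps standard pseudos such as :first-child and
// :last-of-type from being mistaken for :first and :last.
var POSITIONAL = /:(?:eq|gt|lt|first|last|even|odd)(?![\w-])/;

function usesPositionalSelector(selector) {
  return POSITIONAL.test(selector);
}

// Route to Sizzle only when a positional selector appears to be present.
function query(selector, nativeQuery, sizzleQuery) {
  return usesPositionalSelector(selector)
    ? sizzleQuery(selector)
    : nativeQuery(selector);
}
```

The check can produce false positives (for example, a quoted attribute value containing ":first"), but those only send the selector down the slower Sizzle path, so the result should still be correct.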

Caveat 4: Invalid Support for [href^=#]

Because hrefs starting with # are commonly used, jQuery has special support for the use of # in an attribute value where an identifier would normally be required.

In my opinion, this is a mistake, and jQuery should require a String here.
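If jQuery did require a String, existing selectors could be migrated mechanically. The following is a rough sketch of such a rewrite; quoteAttributeValues is a hypothetical helper, and the regex assumes values contain no quotes, whitespace, or closing brackets.

```javascript
// Wrap unquoted attribute values in double quotes:
//   [href^=#]  becomes  [href^="#"]
// Values that are already quoted do not match and are left alone.
function quoteAttributeValues(selector) {
  return selector.replace(
    /\[\s*([\w-]+)\s*([~|^$*!]?=)\s*([^\]"'\s]+)\s*\]/g,
    '[$1$2"$3"]'
  );
}
```

Valid identifiers such as `[type=button]` are quoted too, which is harmless because the quoted and unquoted forms are equivalent.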

Caveat 5: Attribute Not Equal

jQuery has support for "attribute not equal" via a[hreflang!='en']. This is not present in either Selectors 3 or Selectors 4.

  • Option 1: jQuery could deprecate != and encourage people to use :not([hreflang=en])
  • Option 2: Selectors 4 could add support for attr!=val
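Option 1 amounts to a mechanical rewrite, which could be sketched as below. rewriteNotEqual is a hypothetical helper, and the regex assumes attribute values do not themselves contain a closing bracket.

```javascript
// Rewrite jQuery's [attr!=value] into the standard :not([attr=value]).
function rewriteNotEqual(selector) {
  return selector.replace(
    /\[\s*([\w-]+)\s*!=\s*([^\]]*?)\s*\]/g,
    ":not([$1=$2])"
  );
}
```

For example, `rewriteNotEqual("a[hreflang!='en']")` gives `a:not([hreflang='en'])`, which the native API can evaluate.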

Caveat 6: Aliases

jQuery has support for a number of pseudo-selectors that are essentially aliases for other pseudoselectors or attribute selectors.

For example, :button is an alias for button, input[type=button]. Most of these aliases are for convenience when working with forms (see http://api.jquery.com/category/selectors/form-selectors/). The :header alias is an alias for h1, h2, h3, h4, h5, h6.

Some aliases are just for clarity. For example, :selected is allowed as an alias for :checked. The :parent selector is an alias for :not(:empty).

jQuery also provides :visible and :hidden aliases that reflect whether an element takes up space in the document. It also provides an :animated selector that reflects whether an element is in the process of being animated.

The strategies for dealing with these issues differ based on the alias.

For aliases that map directly onto existing selectors or lists of selectors:

  • Option 1: jQuery could deprecate the aliases
  • Option 2: The browser could provide a mechanism for defining pseudo-selector aliases.

I personally would prefer Option 2. It would also provide a mechanism for some amount of forwards-compatibility, as Selectors 5+ features that could be implemented in terms of pure aliases could be backported to Selectors 4 browsers.
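Until browsers offer such a mechanism, pure aliases can be approximated by expanding them before the selector reaches the native API. The sketch below assumes the alias table is supplied by the library; expandAliases and ALIASES are hypothetical names, and :matches is the Selectors 4 draft pseudo used elsewhere in this document.

```javascript
// Each alias maps a pseudo name to the selector text that replaces it.
var ALIASES = {
  button: ":matches(button, input[type=button])",
  header: ":matches(h1, h2, h3, h4, h5, h6)",
  parent: ":not(:empty)",
  selected: ":checked"
};

// Single-pass expansion: pseudos not in the table are left untouched.
function expandAliases(selector, aliases) {
  return selector.replace(/:([\w-]+)/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(aliases, name)
      ? aliases[name]
      : match;
  });
}
```

Because the replacement runs in a single pass, the text an alias expands to is not itself re-expanded.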

For aliases that require JavaScript help, but reflect general-purpose concerns (:visible and :hidden):

  • Option 1: jQuery could deprecate the aliases
  • Option 2: Selectors 4 could add support for :visible, and jQuery could use the alias feature above to map :hidden to :not(:visible)
  • Option 3: Selectors 4 could add a mechanism for defining pseudo-selectors via script. The specific API would need to be discussed in much more detail.
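To make Option 3 concrete, here is one possible shape for a script-defined pseudo-selector registry. Everything in it is an assumption for illustration: definePseudo and filterByPseudo are hypothetical names, not a proposed browser API, and the visibility test only approximates "takes up space in the document" using offsetWidth and offsetHeight.

```javascript
var pseudos = Object.create(null);

// Register a predicate that decides whether an element matches the pseudo.
function definePseudo(name, test) {
  pseudos[name] = test;
}

// Apply a registered pseudo as a post-filter over a list of elements.
function filterByPseudo(elements, name) {
  var test = pseudos[name];
  if (typeof test !== "function") {
    throw new Error("Unknown pseudo-selector: " + name);
  }
  return elements.filter(function (el) {
    return test(el);
  });
}

definePseudo("visible", function (el) {
  return el.offsetWidth > 0 || el.offsetHeight > 0;
});
definePseudo("hidden", function (el) {
  return !pseudos.visible(el);
});
```

A real mechanism would also need to say how such pseudos compose with combinators and :not, which is part of the detail that still needs discussion.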

For aliases that require JavaScript help, and refer to jQuery concerns (:animated):

  • Option 1: jQuery could deprecate the aliases
  • Option 2: Allow pseudo-selectors to be defined via script

Note: Selectors 4 may want to add an :animated pseudoselector that reflects CSS-initiated animations.

Caveat 7: The :has Pseudoselector

jQuery allows the use of the :has() pseudoselector to select elements with descendants matching a particular selector. It supports selector lists, and is allowed inside :not.

This general pattern is now supported in Selectors 4:

// in jQuery
$("div:has(h1, p.title)")

// in Selectors 4
document.findAll("div! :matches(h1, p.title)")

The fact that this pattern is possible in Selectors 4 is very exciting. I personally consider the :has version to be somewhat clearer.

  • Option 1: jQuery could deprecate :has
  • Option 2: Selectors 4 could add support for :has
  • Option 3: A script-defined pseudoselector mechanism might allow :has to be implemented in terms of that mechanism
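In script, :has can be approximated today as a post-filter over an already-selected list, using the element-rooted querySelector. This is a minimal sketch, not how Sizzle implements it; filterHas is a hypothetical name, and the inner selector is assumed to be simple enough that the scoping quirks of element-rooted queries do not matter.

```javascript
// Keep only the elements that have at least one descendant matching
// innerSelector. Selector lists such as "h1, p.title" work because
// querySelector accepts them directly.
function filterHas(elements, innerSelector) {
  return elements.filter(function (el) {
    return el.querySelector(innerSelector) !== null;
  });
}
```

In a browser, the jQuery example above would correspond roughly to `filterHas(Array.prototype.slice.call(document.querySelectorAll("div")), "h1, p.title")`.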

Caveat 8: The :contains Pseudoselector

jQuery has support for the :contains pseudoselector. This pseudo has been under consideration for Selectors 3 in the past, but was removed due to performance considerations that mostly apply to usage in style sheets.

  • Option 1: jQuery could deprecate :contains
  • Option 2: Selectors 4 could add support for :contains
  • Option 3: Selectors 4 could define batch-processing extensions, including :contains and possibly other expensive selectors, that work in the Selectors API but not in style sheets
  • Option 4: A script-defined pseudoselector mechanism might allow :contains to be implemented in terms of that mechanism
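Because the performance concern is about style sheets rather than one-off queries, a script-side version can run the cheap part natively and apply the text test afterwards. The sketch below assumes a case-sensitive substring match against textContent; queryContains is a hypothetical name, and root is any object exposing querySelectorAll.

```javascript
// Two-phase query: the native engine handles baseSelector, then script
// filters the results by text content.
function queryContains(root, baseSelector, text) {
  var candidates = Array.prototype.slice.call(
    root.querySelectorAll(baseSelector)
  );
  return candidates.filter(function (el) {
    return (el.textContent || "").indexOf(text) !== -1;
  });
}
```

Something like `queryContains(document, "li", "Sizzle")` would then stand in for `$("li:contains(Sizzle)")`.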

Caveat 9: The :not Pseudoselector

jQuery's implementation of :not is aligned with Selectors 4's more expansive definition. This "caveat" is just to say that efforts to remove the more expansive definition before Selectors 4 is finalized would stymie this effort.

Caveat 10: Performance

jQuery looks for simple selectors that map directly onto older primitives (getElementById, getElementsByTagName, getElementsByClassName) and uses the old primitives directly.

It also optimizes the case of findAll("body"), as people do it a lot, and you don't actually need to do a query of any kind to find the body.

The case of getElementById is a bit murky because jQuery's optimization is technically non-conforming with Selectors 4:

Document languages may contain attributes that are declared to be of type ID. What makes attributes of type ID special is that no two such attributes can have the same value in a conformant document, regardless of the type of the elements that carry them; whatever the document language, an ID typed attribute can be used to uniquely identify its element. In HTML all ID attributes are named "id"; XML applications may name ID attributes differently, but the same restriction applies.

jQuery's implementation essentially assumes that documents are conforming, while browsers do not. The specifics notwithstanding, browsers should optimize these common simple selectors.
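The kind of fast-path dispatch described in this caveat can be sketched as follows. It is an illustration under assumptions rather than jQuery's actual code: fastQuery and SIMPLE_SELECTOR are hypothetical names, doc is any document-like object, and the pattern only recognizes identifiers without CSS escapes.

```javascript
// Matches a lone #id, tag name, or .class (no combinators, no escapes).
var SIMPLE_SELECTOR = /^(?:#([\w-]+)|([\w-]+)|\.([\w-]+))$/;

function toArray(list) {
  return Array.prototype.slice.call(list);
}

function fastQuery(doc, selector) {
  // "body" needs no query at all.
  if (selector === "body" && doc.body) {
    return [doc.body];
  }
  var match = SIMPLE_SELECTOR.exec(selector);
  if (match) {
    if (match[1]) {
      // Assumes ids are unique, as in a conforming document.
      var el = doc.getElementById(match[1]);
      return el ? [el] : [];
    }
    if (match[2]) {
      return toArray(doc.getElementsByTagName(match[2]));
    }
    if (match[3]) {
      return toArray(doc.getElementsByClassName(match[3]));
    }
  }
  // Anything more complex goes through the general selector engine.
  return toArray(doc.querySelectorAll(selector));
}
```

The #id branch is where the conformance question arises: it returns at most one element, whereas a general query on a document with duplicate ids would return all of them.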

@paulirish

While it is a performance fix more than a bug fix, the check to see if it's a simple ID selector and reroute to getElementById is easy for jQuery, but WebKit still doesn't implement this optimization. WebKit runs document.querySelectorAll('#foo') through all the same codepaths as a complex selector. Unsure about the other engines.

@wycats
Author

wycats commented May 21, 2012

Added
