@kangax kangax/gist:94ea9cade0cebfb16c02 Secret
Last active Sep 24, 2015

  1. Let s be the empty string.
  2. For each descendant of the context node in tree order:
    1. If the node is a Text node:
      1. If the CSS "white-space" property of the node's parent is "normal":
        1. Let collapsed_s be the value of the node's data.
        2. Replace each sequence of one or more whitespace characters in collapsed_s with a single space character.
        3. Append collapsed_s to s.
      2. Otherwise, append the node's data to s.
      3. If the node's parent is a <td> or <th> element, append "\t" to s.
    2. If the node is an Element node:
      1. If the element is any of <script>, <style>, <link>, <canvas>, proceed to the next node.
      2. If the element is hidden (that is, its CSS "display" property is set to "none"), proceed to the next node.
      3. If the element is a block-styled element (that is, its CSS "display" property is set to one of: "block", "list-item", "table", "table-caption", "table-row"):
        1. Append "\n" to s.
        2. For each of the element's child nodes in tree order, perform the same algorithm recursively.
        3. Append "\n" to s.
      4. If the element is a <br> element, append "\n" to s.
  3. Trim s (that is, remove leading and trailing whitespace).
  4. Return s.
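
The steps above can be sketched in Python against a tiny stand-in for the DOM. This is only an illustration under several assumptions, not a definitive implementation: the `Text` and `Element` classes, the `display` and `white_space` fields (standing in for computed CSS values), and the function names are all hypothetical. It also reads the algorithm as a single recursive walk in which a skipped or hidden element takes its whole subtree with it, and inline elements are descended into without adding line breaks; the original text leaves both points open.

```python
import re
from dataclasses import dataclass, field
from typing import List, Union

# "display" values the algorithm treats as block-styled.
BLOCK_DISPLAYS = {"block", "list-item", "table", "table-caption", "table-row"}
# Elements whose content is never emitted.
SKIPPED_TAGS = {"script", "style", "link", "canvas"}


@dataclass
class Text:
    """Hypothetical stand-in for a DOM Text node."""
    data: str


@dataclass
class Element:
    """Hypothetical stand-in for a DOM Element with its computed styles."""
    tag: str
    children: List[Union["Element", Text]] = field(default_factory=list)
    display: str = "inline"
    white_space: str = "normal"


def _walk(node, parent, out):
    if isinstance(node, Text):
        if parent.white_space == "normal":
            # Collapse each run of whitespace to a single space.
            out.append(re.sub(r"\s+", " ", node.data))
        else:
            out.append(node.data)
        if parent.tag in ("td", "th"):
            out.append("\t")
        return

    # Assumption: "proceed to the next node" skips the entire subtree.
    if node.tag in SKIPPED_TAGS or node.display == "none":
        return
    if node.tag == "br":
        out.append("\n")
        return

    is_block = node.display in BLOCK_DISPLAYS
    if is_block:
        out.append("\n")
    # Assumption: inline elements are also descended into, without newlines.
    for child in node.children:
        _walk(child, node, out)
    if is_block:
        out.append("\n")


def inner_text(context):
    """Return the rendered-text approximation for the context element."""
    out = []
    for child in context.children:
        _walk(child, context, out)
    # str.strip() removes leading and trailing whitespace, tabs included.
    return "".join(out).strip()


doc = Element("div", display="block", children=[
    Element("p", display="block", children=[Text("Hello   \n world")]),
    Element("script", children=[Text("ignored()")]),
    Element("span", children=[Text("a"), Element("br"), Text("b")]),
    Element("pre", display="block", white_space="pre", children=[Text("x  y")]),
    Element("i", display="none", children=[Text("hidden")]),
])
print(repr(inner_text(doc)))
```

Collecting fragments in a list and joining once at the end keeps the walk linear in the amount of text; appending directly to a string s, as the steps are worded, would behave the same on small documents.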