Last active
August 29, 2015 13:57
require 'nokogiri'
html = %q(
<td class="data-display-field" width="60%">
<a href="https://sellercentral.amazon.co.uk/gp/orders-v2/contact?ie=UTF8&buyerID=ABAZTC59A7Z12&orderID=205-7265948-3023534">Syed Ali</a>
<span id="_myo_buyerEmail_progressIndicator" style="vertical-align: middle; display: none;">
<img src="https://images-na.ssl-images-amazon.com/images/G/02/rainier/ajax/snake._V192262569_.gif" id="_myo_buyerEmail_loadingBar" style="display:inline">
</span>
<b id="_myo_buyerEmail_showRepeatOrders" buyeremail="hbbxb99wq139140@marketplace.amazon.co.uk" class="tiny"> </b>
</td>)
doc = Nokogiri::HTML(html)
# at_css is effectively css('xxx').first: it returns the first matching element. at_css may return nil, whereas css always returns a NodeSet, though its size may be 0.
span = doc.at_css('#_myo_buyerEmail_progressIndicator')
# To get the buyeremail attribute, use next_element.
b = span.next_element['buyeremail'] # hbbxb99wq139140@marketplace.amazon.co.uk
# next returns a whitespace node here, because whitespace and line breaks are nodes too, of type Text. next_element skips these whitespace Text nodes.
blank = span.next # #(Text "\n ")
# Of course, you can also step over the whitespace node manually.
span.next.next['buyeremail'] == span.next_element['buyeremail'] # true
# Look at the parent node.
img = doc.at_css('img')
img.parent['id'] # _myo_buyerEmail_progressIndicator
# For more of the related API, see https://github.com/sparklemotion/nokogiri/wiki/Cheat-sheet#working-with-a-nokogirixmlnode