@fukata
Last active October 15, 2018 06:57
WebScraper - transform feature

Recipe

recipe:
  - url: 'https://fukata.org'
    steps:
      - key: title
        dom: 'html > head > title'
        action: get_text
        transform:
          - translate:
              from: ja_JP
              to:
                - en_US
                - ch_ZH
      - key: date_str
        dom: '#publish_at'
        action: get_text
        transform:
          - regex:
              re: '/([0-9]{4})年([0-9]{1,2})月([0-9]{1,2})日/'
              output: '\1-\2-\3'

Output

{
  "title": "Original title",
  "title_en_US": "Title translated into English",
  "title_ch_ZH": "Title translated into Chinese",
  "date_str": "2018年10月15日",
  "date_str_regex": "2018-10-15"
}
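
The regex transform in the recipe can be sketched in Python as follows. This is only an illustration of how `date_str_regex` could be derived from `date_str`; it is not WebScraper's actual implementation. The function names `apply_regex_transform` and `run_step` are hypothetical, and two behaviours are assumptions: the surrounding slashes in `re` are treated as delimiters, and a value that does not match is returned unchanged.

```python
import re


def apply_regex_transform(value: str, pattern: str, output: str) -> str:
    """Rewrite `value` using the capture groups of `pattern` (hypothetical helper)."""
    # Assumption: the recipe writes patterns as '/.../', so strip the delimiters.
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        pattern = pattern[1:-1]
    match = re.search(pattern, value)
    if match is None:
        # Assumption: leave the value untouched when nothing matches.
        return value
    # Expand backreferences such as \1 in the output template.
    return match.expand(output)


def run_step(key: str, value: str, transforms: list) -> dict:
    """Keep the original value under `key` and add one entry per regex transform."""
    result = {key: value}
    for transform in transforms:
        for name, options in transform.items():
            if name == "regex":
                # The key suffix follows the output shown above: <key>_regex.
                result[f"{key}_regex"] = apply_regex_transform(
                    value, options["re"], options["output"]
                )
    return result


record = run_step(
    "date_str",
    "2018年10月15日",
    [
        {
            "regex": {
                "re": "/([0-9]{4})年([0-9]{1,2})月([0-9]{1,2})日/",
                "output": r"\1-\2-\3",
            }
        }
    ],
)
print(record)
```

The translate transform is left out of the sketch because it would depend on an external translation service that the gist does not name.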