@hiroyuki-sato
Created July 7, 2016 06:09
digdag and Liquid
.
|-- _test.yml.liquid
|-- config.yml.liquid
|-- csv
|   `-- sample_01.csv.gz
`-- hoge.dig

hoge.dig

timezone: UTC

+step1:
  embulk>: ./config.yml.liquid

config.yml.liquid

in:
{% include 'test' %}
out: {type: stdout}

_test.yml.liquid

#
  type: file
  path_prefix: /private/tmp/hoge/csv/sample_
  decoders:
  - {type: gzip}
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    escape: '"'
    null_string: 'NULL'
    trim_if_not_quoted: false
    skip_header_lines: 1
    allow_extra_columns: false
    allow_optional_columns: false
    columns:
    - {name: id, type: long}
    - {name: account, type: long}
    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
    - {name: purchase, type: timestamp, format: '%Y%m%d'}
    - {name: comment, type: string}

embulk preview -G config.yml.liquid
2016-07-07 15:05:18.423 +0900: Embulk v0.8.9
2016-07-07 15:05:20.069 +0900 [INFO] (0001:preview): Listing local files at directory '/private/tmp/hoge/csv' filtering filename by prefix 'sample_'
2016-07-07 15:05:20.078 +0900 [INFO] (0001:preview): Loading files [/private/tmp/hoge/csv/sample_01.csv.gz]
*************************** 1 ***************************
      id (     long) : 1
 account (     long) : 32,864
    time (timestamp) : 2015-01-27 19:23:49 UTC
purchase (timestamp) : 2015-01-27 00:00:00 UTC
 comment (   string) : embulk
*************************** 2 ***************************
      id (     long) : 2
 account (     long) : 14,824
    time (timestamp) : 2015-01-27 19:01:23 UTC
purchase (timestamp) : 2015-01-27 00:00:00 UTC
 comment (   string) : embulk jruby
*************************** 3 ***************************
      id (     long) : 3
 account (     long) : 27,559
    time (timestamp) : 2015-01-28 02:20:02 UTC
purchase (timestamp) : 2015-01-28 00:00:00 UTC
 comment (   string) : Embulk "csv" parser plugin
*************************** 4 ***************************
      id (     long) : 4
 account (     long) : 11,270
    time (timestamp) : 2015-01-29 11:54:36 UTC
purchase (timestamp) : 2015-01-29 00:00:00 UTC
 comment (   string) : 

2016-07-07 15:07:36 +0900: Digdag v0.8.3
2016-07-07 15:07:37 +0900 [WARN] (main): Reusing the last session time 2016-07-07T00:00:00+00:00.
2016-07-07 15:07:37 +0900 [INFO] (main): Using session .digdag/status/20160707T000000+0000.
2016-07-07 15:07:37 +0900 [INFO] (main): Starting a new session project id=1 workflow name=hoge session_time=2016-07-07T00:00:00+00:00
2016-07-07 15:07:38 +0900 [INFO] (0018@+hoge+step1): embulk>: ./config.yml.liquid
2016-07-07 15:07:43.199 +0900: Embulk v0.8.9
while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

	at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(org/yaml/snakeyaml/scanner/ScannerImpl.java:420)
	at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(org/yaml/snakeyaml/scanner/ScannerImpl.java:226)
	at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingValue.produce(org/yaml/snakeyaml/parser/ParserImpl.java:586)
	at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(org/yaml/snakeyaml/parser/ParserImpl.java:158)
	at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(org/yaml/snakeyaml/parser/ParserImpl.java:143)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:132)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(org/yaml/snakeyaml/composer/Composer.java:231)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:155)
	at org.yaml.snakeyaml.composer.Composer.composeDocument(org/yaml/snakeyaml/composer/Composer.java:122)
	at org.yaml.snakeyaml.composer.Composer.getSingleNode(org/yaml/snakeyaml/composer/Composer.java:105)
	at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(org/yaml/snakeyaml/constructor/BaseConstructor.java:120)
	at org.yaml.snakeyaml.Yaml.loadFromReader(org/yaml/snakeyaml/Yaml.java:481)
	at org.yaml.snakeyaml.Yaml.load(org/yaml/snakeyaml/Yaml.java:400)
	at org.embulk.config.ConfigLoader.fromYamlString(org/embulk/config/ConfigLoader.java:66)
	at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)
	at RUBY.read_config(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:131)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:55)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_run.rb:306)
	at RUBY.<top>(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_main.rb:2)
	at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:937)
	at RUBY.(root)(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1)
	at Users.hsato.$_dot_embulk.bin.embulk.embulk.command.embulk_bundle.<top>(file:/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_bundle.rb:51)
	at java.lang.invoke.MethodHandle.invokeWithArguments(java/lang/invoke/MethodHandle.java:627)
	at org.embulk.cli.Main.main(org/embulk/cli/Main.java:23)

Error: while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

2016-07-07 15:07:45 +0900 [ERROR] (0018@+hoge+step1): Task failed with unexpected error: Command failed: 2016-07-07 15:07:43.199 +0900: Embulk v0.8.9
while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

	at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(org/yaml/snakeyaml/scanner/ScannerImpl.java:420)
	at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(org/yaml/snakeyaml/scanner/ScannerImpl.java:226)
	at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingValue.produce(org/yaml/snakeyaml/parser/ParserImpl.java:586)
	at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(org/yaml/snakeyaml/parser/ParserImpl.java:158)
	at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(org/yaml/snakeyaml/parser/ParserImpl.java:143)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:132)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(org/yaml/snakeyaml/composer/Composer.java:231)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:155)
	at org.yaml.snakeyaml.composer.Composer.composeDocument(org/yaml/snakeyaml/composer/Composer.java:122)
	at org.yaml.snakeyaml.composer.Composer.getSingleNode(org/yaml/snakeyaml/composer/Composer.java:105)
	at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(org/yaml/snakeyaml/constructor/BaseConstructor.java:120)
	at org.yaml.snakeyaml.Yaml.loadFromReader(org/yaml/snakeyaml/Yaml.java:481)
	at org.yaml.snakeyaml.Yaml.load(org/yaml/snakeyaml/Yaml.java:400)
	at org.embulk.config.ConfigLoader.fromYamlString(org/embulk/config/ConfigLoader.java:66)
	at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)
	at RUBY.read_config(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:131)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:55)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_run.rb:306)
	at RUBY.<top>(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_main.rb:2)
	at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:937)
	at RUBY.(root)(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1)
	at Users.hsato.$_dot_embulk.bin.embulk.embulk.command.embulk_bundle.<top>(file:/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_bundle.rb:51)
	at java.lang.invoke.MethodHandle.invokeWithArguments(java/lang/invoke/MethodHandle.java:627)
	at org.embulk.cli.Main.main(org/embulk/cli/Main.java:23)

Error: while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

java.lang.RuntimeException: Command failed: 2016-07-07 15:07:43.199 +0900: Embulk v0.8.9
while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

	at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(org/yaml/snakeyaml/scanner/ScannerImpl.java:420)
	at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(org/yaml/snakeyaml/scanner/ScannerImpl.java:226)
	at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingValue.produce(org/yaml/snakeyaml/parser/ParserImpl.java:586)
	at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(org/yaml/snakeyaml/parser/ParserImpl.java:158)
	at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(org/yaml/snakeyaml/parser/ParserImpl.java:143)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:132)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(org/yaml/snakeyaml/composer/Composer.java:231)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:155)
	at org.yaml.snakeyaml.composer.Composer.composeDocument(org/yaml/snakeyaml/composer/Composer.java:122)
	at org.yaml.snakeyaml.composer.Composer.getSingleNode(org/yaml/snakeyaml/composer/Composer.java:105)
	at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(org/yaml/snakeyaml/constructor/BaseConstructor.java:120)
	at org.yaml.snakeyaml.Yaml.loadFromReader(org/yaml/snakeyaml/Yaml.java:481)
	at org.yaml.snakeyaml.Yaml.load(org/yaml/snakeyaml/Yaml.java:400)
	at org.embulk.config.ConfigLoader.fromYamlString(org/embulk/config/ConfigLoader.java:66)
	at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)
	at RUBY.read_config(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:131)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:55)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_run.rb:306)
	at RUBY.<top>(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_main.rb:2)
	at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:937)
	at RUBY.(root)(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1)
	at Users.hsato.$_dot_embulk.bin.embulk.embulk.command.embulk_bundle.<top>(file:/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_bundle.rb:51)
	at java.lang.invoke.MethodHandle.invokeWithArguments(java/lang/invoke/MethodHandle.java:627)
	at org.embulk.cli.Main.main(org/embulk/cli/Main.java:23)

Error: while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

	at io.digdag.standards.operator.EmbulkOperatorFactory$EmbulkOperator.runTask(EmbulkOperatorFactory.java:117)
	at io.digdag.util.BaseOperator.run(BaseOperator.java:49)
	at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:261)
	at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:663)
	at io.digdag.core.agent.OperatorManager.runWithWorkspace(OperatorManager.java:227)
	at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$2(OperatorManager.java:120)
	at io.digdag.core.agent.NoopWorkspaceManager.withExtractedArchive(NoopWorkspaceManager.java:20)
	at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:118)
	at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:102)
	at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:645)
	at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:70)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2016-07-07 15:07:45 +0900 [WARN] (0018@+hoge^failure-alert): Skipped
error: 
  * +hoge+step1:
    Command failed: 2016-07-07 15:07:43.199 +0900: Embulk v0.8.9
while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^

	at org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(org/yaml/snakeyaml/scanner/ScannerImpl.java:420)
	at org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(org/yaml/snakeyaml/scanner/ScannerImpl.java:226)
	at org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingValue.produce(org/yaml/snakeyaml/parser/ParserImpl.java:586)
	at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(org/yaml/snakeyaml/parser/ParserImpl.java:158)
	at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(org/yaml/snakeyaml/parser/ParserImpl.java:143)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:132)
	at org.yaml.snakeyaml.composer.Composer.composeMappingNode(org/yaml/snakeyaml/composer/Composer.java:231)
	at org.yaml.snakeyaml.composer.Composer.composeNode(org/yaml/snakeyaml/composer/Composer.java:155)
	at org.yaml.snakeyaml.composer.Composer.composeDocument(org/yaml/snakeyaml/composer/Composer.java:122)
	at org.yaml.snakeyaml.composer.Composer.getSingleNode(org/yaml/snakeyaml/composer/Composer.java:105)
	at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(org/yaml/snakeyaml/constructor/BaseConstructor.java:120)
	at org.yaml.snakeyaml.Yaml.loadFromReader(org/yaml/snakeyaml/Yaml.java:481)
	at org.yaml.snakeyaml.Yaml.load(org/yaml/snakeyaml/Yaml.java:400)
	at org.embulk.config.ConfigLoader.fromYamlString(org/embulk/config/ConfigLoader.java:66)
	at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)
	at RUBY.read_config(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:131)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/runner.rb:55)
	at RUBY.run(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_run.rb:306)
	at RUBY.<top>(/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_main.rb:2)
	at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:937)
	at RUBY.(root)(uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1)
	at Users.hsato.$_dot_embulk.bin.embulk.embulk.command.embulk_bundle.<top>(file:/Users/hsato/.embulk/bin/embulk!/embulk/command/embulk_bundle.rb:51)
	at java.lang.invoke.MethodHandle.invokeWithArguments(java/lang/invoke/MethodHandle.java:627)
	at org.embulk.cli.Main.main(org/embulk/cli/Main.java:23)

Error: while scanning for the next token
found character % '%' that cannot start any token. (Do not use % for indentation)
 in 'string', line 2, column 2:
    {% include 'test' %}
     ^


Task state is saved at .digdag/status/20160707T000000+0000 directory.
  * Use --session <daily | hourly | "yyyy-MM-dd[ HH:mm:ss]"> to not reuse the last session time.
  * Use --rerun, --start +NAME, or --goal +NAME argument to rerun skipped tasks.
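
The trace above fails in org.embulk.config.ConfigLoader.fromYamlString on the literal {% include 'test' %} line, which suggests the template reached the YAML parser without being rendered by Liquid, whereas the direct embulk preview of the same file worked. One possible workaround, sketched here and not tested against Digdag v0.8.3, is to call the embulk command through Digdag's sh> operator so that Embulk itself loads config.yml.liquid. The task name +step1 is kept from hoge.dig; the sketch assumes that embulk is on the PATH of the Digdag process and that the workflow runs from the project directory.

```yaml
timezone: UTC

+step1:
  # Assumption: Embulk renders the .yml.liquid file itself when invoked directly,
  # as it did for `embulk preview` above.
  sh>: embulk run config.yml.liquid
```

If this behaves as assumed, the include of _test.yml.liquid would be resolved by Embulk relative to config.yml.liquid, the same way as in the preview run.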