Snowplow failures
:aws:
  :access_key_id: <snip>
  :secret_access_key: <snip>
:s3:
  :region: us-east-1
  :buckets:
    # Update assets if you want to host the serde and HiveQL yourself
    :assets: s3://snowplowtest/
    :log: s3://snowplow-logger/
    :in: s3://snowplowtest-log/
    :processing: s3://snowplow-processing/
    :out: s3://snowplow-out/out/ # Make sure this bucket is in the US standard region if you wish to use Redshift
    :archive: s3://snowplow-archive/
:emr:
  # Can bump the below as EMR upgrades Hadoop
  :hadoop_version: 1.0.3
  :placement: us-east-1a
  :ec2_key_name: ec2-keypair
  # Adjust your Hive cluster below
  :jobflow:
    :instance_count: 2
    :master_instance_type: m1.small
    :slave_instance_type: m1.small
:etl:
  :job_name: SnowPlow ETL # Give your job a name
  :implementation: hadoop # Or 'hive' for legacy ETL
  :collector_format: cloudfront # Or 'clj-tomcat' for the Clojure Collector
  :continue_on_unexpected_error: false # You can switch to 'true' if you really don't want the ETL throwing exceptions. Doesn't work for Hadoop ETL yet
  :storage_format: redshift # Or 'hive' or 'mysql-infobright'. Doesn't work for Hadoop ETL yet (always outputs redshift format)
# Can bump the below as SnowPlow releases new versions
:snowplow:
  :hadoop_etl_version: 0.2.0 # Version of the Hadoop ETL
  :serde_version: 0.5.5 # Version of the Hive deserializer
  :hive_hiveql_version: 0.5.7
  :mysql_infobright_hiveql_version: 0.0.8
  :redshift_hiveql_version: 0.0.1
#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id
2013-05-10 22:14:19 SFO5 794 69.163.48.214 GET d3da8gtxngrv8u.cloudfront.net /i 403 http://10.0.1.131:380/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010.7;%20rv:20.0)%20Gecko/20100101%20Firefox/20.0 &e=ad&ad_ba=1&ad_ca=2&ad_ad=12&ad_uid=1&dtm=1368224038757&tid=016219&vp=1920x884&ds=1920x884&vid=4&duid=2d3353b4272270f9&p=web&tv=js-0.11.1&fp=1156021008&lang=en-US&cs=UTF-8&tz=America%252FLos_Angeles&f_pdf=1&f_qt=1&f_realp=0&f_wma=0&f_dir=0&f_fla=1&f_java=1&f_gears=0&f_ag=0&res=1920x1080&cd=24&cookie=1&url=http%253A%252F%252F10.0.1.131%253A380%252F - Error 44alQ_qANhQ6RTvRfQ-WSLKDUsN0XkyatjnpUDNd3eBOWg0fc0AEPg==
2013-05-10 22:14:19 SFO5 794 69.163.48.214 GET d3da8gtxngrv8u.cloudfront.net /i 403 http://10.0.1.131:380/ Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010.7;%20rv:20.0)%20Gecko/20100101%20Firefox/20.0 &e=se&ev_ca=test%2520Category&ev_ac=test%2520Action&ev_la=test%2520Label&dtm=1368224038769&tid=033880&vp=1920x884&ds=1920x884&vid=4&duid=2d3353b4272270f9&p=web&tv=js-0.11.1&fp=1156021008&lang=en-US&cs=UTF-8&tz=America%252FLos_Angeles&f_pdf=1&f_qt=1&f_realp=0&f_wma=0&f_dir=0&f_fla=1&f_java=1&f_gears=0&f_ag=0&res=1920x1080&cd=24&cookie=1&url=http%253A%252F%252F10.0.1.131%253A380%252F - Error X3C1zOgHoIcLRd-L_KT5G_hOv9kd1Xdt-taXVT7o7Pwm6hy7F3O8KA==
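The two rows above can be read against the `#Fields` header. The following is a minimal sketch, assuming the values are separated by whitespace as they appear here (the raw log files may use tabs) and that the tracker parameters are percent-encoded twice, which is how they look in this sample. The function names `parse_log_line` and `decode_query` are hypothetical helpers, not part of SnowPlow, and the sample row has its query string shortened.

```python
from urllib.parse import parse_qs, unquote

# Header line copied from the log above.
FIELDS_HEADER = (
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
    "cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query "
    "cs(Cookie) x-edge-result-type x-edge-request-id"
)

# Second log row from above, with the query string abbreviated for readability.
SAMPLE_LINE = (
    "2013-05-10 22:14:19 SFO5 794 69.163.48.214 GET "
    "d3da8gtxngrv8u.cloudfront.net /i 403 http://10.0.1.131:380/ "
    "Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010.7;%20rv:20.0)"
    "%20Gecko/20100101%20Firefox/20.0 "
    "&e=se&ev_ca=test%2520Category&tz=America%252FLos_Angeles - Error "
    "X3C1zOgHoIcLRd-L_KT5G_hOv9kd1Xdt-taXVT7o7Pwm6hy7F3O8KA=="
)


def parse_log_line(line, header=FIELDS_HEADER):
    """Pair each whitespace-separated value with its name from the #Fields header."""
    names = header.split()[1:]  # drop the leading "#Fields:" token
    values = line.split()
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} fields, got {len(values)}")
    return dict(zip(names, values))


def decode_query(query):
    """Decode tracker parameters: parse_qs undoes one layer of encoding, unquote the second."""
    params = parse_qs(query.lstrip("&"), keep_blank_values=True)
    return {key: unquote(vals[0]) for key, vals in params.items()}


record = parse_log_line(SAMPLE_LINE)
params = decode_query(record["cs-uri-query"])
print(record["sc-status"], record["x-edge-result-type"])
print(params["e"], params["ev_ca"], params["tz"])
```

Both sample rows carry `sc-status` 403 and `x-edge-result-type` Error, which may be worth checking separately from the jobflow failure below.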
Last State Change: Jar doesn't exist: s3://snowplowtest/3-enrich/hadoop-etl/snowplow-hadoop-etl-0.2.0.jar
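The missing path appears to be built from the `:assets:` bucket and `:hadoop_etl_version:` in the config. A minimal sketch that reproduces it follows; the function name `hadoop_etl_jar_uri` and the path template are assumptions inferred from the error message, not taken from the EmrEtlRunner source.

```python
def hadoop_etl_jar_uri(assets_bucket, hadoop_etl_version):
    """Rebuild the jar URI seen in the error from two config values (inferred template)."""
    base = assets_bucket if assets_bucket.endswith("/") else assets_bucket + "/"
    return f"{base}3-enrich/hadoop-etl/snowplow-hadoop-etl-{hadoop_etl_version}.jar"


# Values taken from the config above.
print(hadoop_etl_jar_uri("s3://snowplowtest/", "0.2.0"))
```

If that reading is right, the jar has to exist at that key under the `:assets:` bucket. The config comment says to update `:assets:` only when hosting the serde and HiveQL yourself.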
# Description
Name: SnowPlow ETL
Creation Date: 2013-05-11 08:01 PDT
Start Date: -
End Date: 2013-05-11 08:01 PDT
Availability Zone: us-east-1a
Instance Count: -
Master Instance Type: -
Slave Instance Type: -
Key Name: ec2-keypair
Log URI: s3n://snowplow-logger/
Ami Version: 2.3.5
Master Public DNS Name: -
Hadoop Version: 1.0.3
Keep Alive: false
Termination Protected: false
Visible To All Users: false
Subnet Id: -
Supported Products: -
# Steps
Step Name: Elasticity Custom Jar Step
State: CANCELLED
Start Date: -
End Date: -
JAR: s3://snowplowtest/3-enrich/hadoop-etl/snowplow-hadoop-etl-0.2.0.jar
Main Class: -
Args: com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob --hdfs --input_folder s3://snowplow-processing/ --input_format cloudfront --output_folder s3://snowplow-out/out/2013-05-11-07-59-53/ --bad_rows_folder 2013-05-11-07-59-53/
# Bootstrap actions
(none)
# Instance Groups
Instance Group Id  Role    Instance Type  State  Market     Bid Price  Running Count  Request Count  Creation DateTime     Last State Change
ig-1YKFUCPOFLJ0M   MASTER  m1.small       ENDED  ON_DEMAND  -          0              1              2013-05-11 08:01 PDT  Job flow terminated
ig-1HBGPI3LG6CRS   CORE    m1.small       ENDED  ON_DEMAND  -          0              1              2013-05-11 08:01 PDT  Job flow terminated
# Monitoring
(nothing drawn on graphs)
dgoodwinlaptop:emr-etl-runner douglasgoodwin$ bundle exec bin/snowplow-emr-etl-runner -c config/config.yml
Staging CloudFront logs...
moving files from s3://snowplowtest-log/ to s3://snowplow-processing/
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.8AI8sZln.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.8AI8sZln.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.INfm5AP1.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.INfm5AP1.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.Jhp9Bc1G.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.Jhp9Bc1G.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.Y0sUR1v6.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.Y0sUR1v6.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.VOVrxhym.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.VOVrxhym.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.cTsuImZF.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.cTsuImZF.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.ddAPgoP4.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.ddAPgoP4.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.oaS7lsEN.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.oaS7lsEN.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.ddAPgoP4.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.VOVrxhym.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.Y0sUR1v6.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.Jhp9Bc1G.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.8AI8sZln.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.VOVrxhym.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.qscG1gtG.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.qscG1gtG.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.ddAPgoP4.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.rrUuWz98.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.rrUuWz98.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.Y0sUR1v6.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.x9I6wJNy.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.x9I6wJNy.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.xwE9RRM0.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.xwE9RRM0.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.8AI8sZln.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.zYovwGaw.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.zYovwGaw.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.Jhp9Bc1G.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.2upHgXVI.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.2upHgXVI.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.INfm5AP1.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.oaS7lsEN.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.cTsuImZF.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.oaS7lsEN.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.GGoyleBS.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.GGoyleBS.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.INfm5AP1.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.HWKHkvst.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.HWKHkvst.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.rrUuWz98.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.cTsuImZF.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.UCONVsrw.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.UCONVsrw.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.x9I6wJNy.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.2upHgXVI.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.zYovwGaw.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.x9I6wJNy.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.WyTGvFfU.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.WyTGvFfU.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.qscG1gtG.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.rrUuWz98.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.wVjtNASf.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.wVjtNASf.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.2upHgXVI.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.xCMAjvJn.gz -> snowplow-processing/E16PK26F794569.2013-05-10-23.xCMAjvJn.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.xwE9RRM0.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.HWKHkvst.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.qscG1gtG.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-11-00.quuKy4Ef.gz -> snowplow-processing/E16PK26F794569.2013-05-11-00.quuKy4Ef.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.GGoyleBS.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.xwE9RRM0.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.HWKHkvst.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.zYovwGaw.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.GGoyleBS.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.UCONVsrw.gz
+-> snowplow-processing/E16PK26F794569.2013-05-11-00.quuKy4Ef.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.UCONVsrw.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-11-00.quuKy4Ef.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.WyTGvFfU.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.xCMAjvJn.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.WyTGvFfU.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.xCMAjvJn.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-23.wVjtNASf.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-23.wVjtNASf.gz
Waiting a minute to allow S3 to settle (eventual consistency)
Initializing EMR jobflow
EMR jobflow j-20TWZJ3ID51KK failed, check Amazon logs for details. Data files not archived.
dgoodwinlaptop:emr-etl-runner douglasgoodwin$
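Because the staging output is interleaved, it is hard to confirm by eye that every file made it across. The following is a rough sketch for checking that, assuming `+->` marks a finished copy and `x` marks a deleted source object; that is an interpretation of the output, not something the runner states. The name `check_staging` is hypothetical, and the sample is a short excerpt of the session above in which the second file is deliberately left unconfirmed.

```python
import re

# Matches "MOVE <source> -> <target>", even when several appear on one line.
MOVE_RE = re.compile(r"MOVE (\S+) -> (\S+)")

# Short excerpt of the staging output; the second file has no confirmation lines here.
SAMPLE = """\
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz
MOVE snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.8AI8sZln.gz -> snowplow-processing/E16PK26F794569.2013-05-10-22.8AI8sZln.gz
+-> snowplow-processing/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz
x snowplowtest-log/cflog/E16PK26F794569.2013-05-10-22.0tqwzhuJ.gz
"""


def check_staging(log_text):
    """Return (number of moves, targets never confirmed copied, sources never confirmed deleted)."""
    moves = dict(MOVE_RE.findall(log_text))  # source -> target
    copied, deleted = set(), set()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("+-> "):
            copied.add(line[4:].strip())
        elif line.startswith("x "):
            deleted.add(line[2:].strip())
    not_copied = sorted(t for t in moves.values() if t not in copied)
    not_deleted = sorted(s for s in moves if s not in deleted)
    return len(moves), not_copied, not_deleted


total, not_copied, not_deleted = check_staging(SAMPLE)
print(f"{total} moves, {len(not_copied)} not copied, {len(not_deleted)} not deleted")
```

Running it over the full session text, rather than the excerpt, would show whether staging completed before the jobflow failed.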