(pipeline)hadoop@ip-172-30-0-176:~$ remote-task ImportCountryWorkflow --n-reduce-tasks 1 --host localhost --remote-name analyticstack --user hadoop --repo https://github.com/codetigerco/edx-analytics-pipeline --branch stage_edxapp --local-scheduler --verbose --wait --overwrite --interval-start 2017-01-01 --overwrite-n-days 16 --geolocation-data hdfs://localhost:9000/data/GeoIP.dat --override-config $HOME/edx-analytics-pipeline/config/devstack.cfg --skip-setup
Parsed arguments = Namespace(branch='stage_edxapp', extra_repo=None, host='localhost', job_flow_id=None, job_flow_name=None, launch_task_arguments=['ImportCountryWorkflow', '--n-reduce-tasks', '1', '--local-scheduler', '--overwrite', '--interval-start', '2017-01-01', '--overwrite-n-days', '16', '--geolocation-data', 'hdfs://localhost:9000/data/GeoIP.dat'], log_path=None, override_config='/edx/app/hadoop/edx-analytics-pipeline/config/devstack.cfg', package=None, private_key=None, remote_name='analyticstack', repo='https://github.com/codetigerco/edx-analytics-pipeline', secure_config=None, secure_config_branch=None, secure_config_repo=None, shell=None, skip_setup=True, sudo_user='hadoop', user='hadoop', vagrant_path=None, verbose=True, virtualenv_extra_args=None, wait=True, wheel_url=None, workflow_profiler=None)
Running commands from path = /edx/app/hadoop/pipeline/share/edx.analytics.tasks
Remote name = analyticstack
Running command = ['ssh', '-tt', '-o', 'ForwardAgent=yes', '-o', 'StrictHostKeyChecking=no', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PasswordAuthentication=no', '-o', 'User=hadoop', '-o', 'ConnectTimeout=10', 'localhost', "sudo -Hu hadoop /bin/bash -c 'cd /var/lib/analytics-tasks/analyticstack/repo && . $HOME/.bashrc && /var/lib/analytics-tasks/analyticstack/venv/bin/launch-task ImportCountryWorkflow --n-reduce-tasks 1 --local-scheduler --overwrite --interval-start 2017-01-01 --overwrite-n-days 16 --geolocation-data hdfs://localhost:9000/data/GeoIP.dat'"]
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
DEBUG:stevedore.extension:found extension EntryPoint.parse('sqoop-import = edx.analytics.tasks.common.sqoop:SqoopImportFromMysql')
DEBUG:stevedore.extension:found extension EntryPoint.parse('run-vertica-sql-script = edx.analytics.tasks.warehouse.run_vertica_sql_script:RunVerticaSqlScriptTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('obfuscation = edx.analytics.tasks.export.obfuscation:ObfuscatedCourseTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('enrollment_validation = edx.analytics.tasks.monitor.enrollment_validation:CourseEnrollmentValidationTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('inc-enrollments-report = edx.analytics.tasks.warehouse.enrollment.incremental_enrollments:WeeklyIncrementalUsersAndEnrollments')
DEBUG:stevedore.extension:found extension EntryPoint.parse('total-enrollments-report = edx.analytics.tasks.warehouse.enrollment.total_enrollments:WeeklyAllUsersAndEnrollments')
DEBUG:stevedore.extension:found extension EntryPoint.parse('orders = edx.analytics.tasks.warehouse.financial.orders_import:OrderTableTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('problem_response = edx.analytics.tasks.insights.problem_response:LatestProblemResponseDataTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('push_to_vertica_lms_courseware_link_clicked = edx.analytics.tasks.warehouse.lms_courseware_link_clicked:PushToVerticaLMSCoursewareLinkClickedTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('video = edx.analytics.tasks.insights.video:InsertToMysqlAllVideoTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('ed_services_report = edx.analytics.tasks.warehouse.financial.ed_services_financial_report:BuildEdServicesReportTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-internal-database = edx.analytics.tasks.warehouse.load_internal_reporting_database:ImportMysqlToVerticaTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('export-student-module = edx.analytics.tasks.export.database_exports:StudentModulePerCourseAfterImportWorkflow')
DEBUG:stevedore.extension:found extension EntryPoint.parse('calendar = edx.analytics.tasks.insights.calendar_task:CalendarTableTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-warehouse = edx.analytics.tasks.warehouse.load_warehouse:LoadWarehouseWorkflow')
DEBUG:stevedore.extension:found extension EntryPoint.parse('cybersource = edx.analytics.tasks.warehouse.financial.cybersource:DailyPullFromCybersourceTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-d-user = edx.analytics.tasks.warehouse.load_internal_reporting_user:LoadInternalReportingUserToWarehouse')
DEBUG:stevedore.extension:found extension EntryPoint.parse('location-per-course = edx.analytics.tasks.insights.location_per_course:LastCountryOfUser')
DEBUG:stevedore.extension:found extension EntryPoint.parse('payment_reconcile = edx.analytics.tasks.warehouse.financial.reconcile:ReconcileOrdersAndTransactionsTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('enrollments-report = edx.analytics.tasks.warehouse.enrollment.enrollments:EnrollmentsByWeek')
DEBUG:stevedore.extension:found extension EntryPoint.parse('financial_reports = edx.analytics.tasks.warehouse.financial.finance_reports:BuildFinancialReportsTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('engagement = edx.analytics.tasks.insights.module_engagement:ModuleEngagementDataTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('events_obfuscation = edx.analytics.tasks.export.events_obfuscation:ObfuscateCourseEventsTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('dump-student-module = edx.analytics.tasks.export.database_exports:StudentModulePerCourseTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('export-events-by-course = edx.analytics.tasks.export.event_exports_by_course:EventExportByCourseTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('noop = edx.analytics.tasks.monitor.performance:ParseEventLogPerformanceTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('course_blocks = edx.analytics.tasks.insights.course_blocks:CourseBlocksApiDataTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-d-certificates = edx.analytics.tasks.warehouse.load_internal_reporting_certificates:LoadInternalReportingCertificatesToWarehouse')
DEBUG:stevedore.extension:found extension EntryPoint.parse('user-activity = edx.analytics.tasks.insights.user_activity:CourseActivityWeeklyTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('run-vertica-sql-scripts = edx.analytics.tasks.warehouse.run_vertica_sql_scripts:RunVerticaSqlScriptTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('paypal = edx.analytics.tasks.warehouse.financial.paypal:PaypalTransactionsByDayTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('grade-dist = edx.analytics.tasks.data_api.studentmodule_dist:GradeDistFromSqoopToMySQLWorkflow')
DEBUG:stevedore.extension:found extension EntryPoint.parse('enrollments_and_registrations_workflow-manifest = edx.analytics.tasks.warehouse.enrollment.enrollments_and_registrations_workflow_manifest:EnrollmentsandRegistrationsWorkflow')
DEBUG:stevedore.extension:found extension EntryPoint.parse('database-import = edx.analytics.tasks.insights.database_imports:ImportAllDatabaseTablesTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('catalog = edx.analytics.tasks.warehouse.course_catalog:CourseCatalogWorkflow')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-course-catalog = edx.analytics.tasks.warehouse.load_internal_reporting_course_catalog:PullCourseCatalogAPIData')
DEBUG:stevedore.extension:found extension EntryPoint.parse('enrollments = edx.analytics.tasks.insights.enrollments:ImportEnrollmentsIntoMysql')
DEBUG:stevedore.extension:found extension EntryPoint.parse('event-type-dist = edx.analytics.tasks.warehouse.event_type_dist:PushToVerticaEventTypeDistributionTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('tags-dist = edx.analytics.tasks.insights.tags_dist:TagsDistributionPerCourse')
DEBUG:stevedore.extension:found extension EntryPoint.parse('export-events = edx.analytics.tasks.export.event_exports:EventExportTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('course-enroll = edx.analytics.tasks.warehouse.enrollment.course_enroll:CourseEnrollmentChangesPerDay')
DEBUG:stevedore.extension:found extension EntryPoint.parse('data_obfuscation = edx.analytics.tasks.export.data_obfuscation:ObfuscatedCourseDumpTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('course_list = edx.analytics.tasks.insights.course_list:CourseListApiDataTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-d-user-course = edx.analytics.tasks.warehouse.load_internal_reporting_user_course:LoadUserCourseSummary')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-d-country = edx.analytics.tasks.warehouse.load_internal_reporting_country:LoadInternalReportingCountryToWarehouse')
DEBUG:stevedore.extension:found extension EntryPoint.parse('overall_events = edx.analytics.tasks.monitor.overall_events:TotalEventsDailyTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('load-f-user-activity = edx.analytics.tasks.warehouse.load_internal_reporting_user_activity:LoadInternalReportingUserActivityToWarehouse')
DEBUG:stevedore.extension:found extension EntryPoint.parse('student_engagement = edx.analytics.tasks.data_api.student_engagement:StudentEngagementTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('answer-dist = edx.analytics.tasks.insights.answer_dist:AnswerDistributionPerCourse')
DEBUG:stevedore.extension:found extension EntryPoint.parse('insert-into-table = edx.analytics.tasks.common.mysql_load:MysqlInsertTask')
DEBUG:stevedore.extension:found extension EntryPoint.parse('all_events_report = edx.analytics.tasks.monitor.total_events_report:TotalEventsReportWorkflow')
DEBUG:edx.analytics.tasks.launchers.local:Using override.cfg
2017-01-19 18:01:24,426 INFO 16985 [luigi-interface] worker.py:267 - Scheduled ImportCountryWorkflow(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=None) (PENDING)
2017-01-19 18:01:24,428 INFO 16985 [luigi-interface] worker.py:267 - Scheduled InsertToMysqlCourseEnrollByCountryWorkflow(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:24,430 INFO 16985 [luigi-interface] worker.py:267 - Scheduled InsertToMysqlLastCountryPerCourseTask(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, database=stage_reports, credentials=/edx/etc/edx-analytics-pipeline/output.json, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:24,445 INFO 16985 [luigi-interface] worker.py:267 - Scheduled QueryLastCountryPerCourseTask(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:24,446 INFO 16985 [luigi-interface] worker.py:267 - Scheduled ImportAuthUserTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None) (PENDING)
2017-01-19 18:01:24,447 INFO 16985 [luigi-interface] worker.py:267 - Scheduled SqoopImportFromMysql(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/auth_user/dt=2017-01-19/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, num_mappers=None, verbose=False, table_name=auth_user, where=None, columns=('id', 'username', 'last_login', 'date_joined', 'is_active', 'is_superuser', 'is_staff', 'email'), null_string=\\N, fields_terminated_by=, delimiter_replacement= , mysql_delimiters=False) (PENDING)
2017-01-19 18:01:24,448 INFO 16985 [luigi-interface] worker.py:267 - Scheduled ExternalURL(url=/edx/etc/edx-analytics-pipeline/input.json) (DONE)
2017-01-19 18:01:24,449 INFO 16985 [luigi-interface] worker.py:267 - Scheduled ImportStudentCourseEnrollmentTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None) (PENDING)
2017-01-19 18:01:24,449 INFO 16985 [luigi-interface] worker.py:267 - Scheduled SqoopImportFromMysql(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/student_courseenrollment/dt=2017-01-19/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, num_mappers=None, verbose=False, table_name=student_courseenrollment, where=None, columns=('id', 'user_id', 'course_id', 'created', 'is_active', 'mode'), null_string=\\N, fields_terminated_by=, delimiter_replacement= , mysql_delimiters=False) (PENDING)
2017-01-19 18:01:24,451 INFO 16985 [luigi-interface] worker.py:267 - Scheduled LastCountryOfUserPartitionTask(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:38,736 INFO 16985 [luigi-interface] worker.py:267 - Scheduled LastCountryOfUserTableTask(warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/) (DONE)
2017-01-19 18:01:38,750 INFO 16985 [luigi-interface] worker.py:267 - Scheduled LastCountryOfUser(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:38,751 INFO 16985 [luigi-interface] worker.py:267 - Scheduled LastDailyIpAddressOfUserTask(source=('hdfs://localhost:9000/data/',), interval=2017-01-03-2017-01-19, expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/) (PENDING)
2017-01-19 18:01:47,033 INFO 16985 [luigi-interface] worker.py:267 - Scheduled PathSelectionByDateIntervalTask(source=('hdfs://localhost:9000/data/',), interval=2017-01-03-2017-01-19, expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d) (DONE)
2017-01-19 18:01:51,024 INFO 16985 [luigi-interface] worker.py:267 - Scheduled ExternalURL(url=hdfs://localhost:9000/data/GeoIP.dat) (DONE)
2017-01-19 18:01:59,535 INFO 16985 [luigi-interface] worker.py:267 - Scheduled PathSelectionByDateIntervalTask(source=('hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user',), interval=2017-01-01-2017-01-03, expand_interval=0:00:00, pattern=('.*?last_ip_of_user_(?P<date>\\d{4}-\\d{2}-\\d{2})',), date_pattern=%Y-%m-%d) (DONE)
2017-01-19 18:01:59,536 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-18/last_ip_of_user_2017-01-18) (DONE)
2017-01-19 18:01:59,536 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-17/last_ip_of_user_2017-01-17) (DONE)
2017-01-19 18:01:59,537 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-16/last_ip_of_user_2017-01-16) (DONE)
2017-01-19 18:01:59,537 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-15/last_ip_of_user_2017-01-15) (DONE)
2017-01-19 18:01:59,537 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-14/last_ip_of_user_2017-01-14) (DONE)
2017-01-19 18:01:59,538 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-13/last_ip_of_user_2017-01-13) (DONE)
2017-01-19 18:01:59,538 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-12/last_ip_of_user_2017-01-12) (DONE)
2017-01-19 18:01:59,538 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-11/last_ip_of_user_2017-01-11) (DONE)
2017-01-19 18:01:59,539 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-10/last_ip_of_user_2017-01-10) (DONE)
2017-01-19 18:01:59,539 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-09/last_ip_of_user_2017-01-09) (DONE)
2017-01-19 18:01:59,539 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-08/last_ip_of_user_2017-01-08) (DONE)
2017-01-19 18:01:59,540 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-07/last_ip_of_user_2017-01-07) (DONE)
2017-01-19 18:01:59,540 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-06/last_ip_of_user_2017-01-06) (DONE)
2017-01-19 18:01:59,540 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-05/last_ip_of_user_2017-01-05) (DONE)
2017-01-19 18:01:59,540 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-04/last_ip_of_user_2017-01-04) (DONE)
2017-01-19 18:01:59,541 INFO 16985 [luigi-interface] worker.py:267 - Scheduled UncheckedExternalURL(url=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/last_ip_of_user/dt=2017-01-03/last_ip_of_user_2017-01-03) (DONE)
2017-01-19 18:01:59,541 INFO 16985 [luigi-interface] worker.py:267 - Scheduled ExternalURL(url=/edx/etc/edx-analytics-pipeline/output.json) (DONE)
2017-01-19 18:01:59,542 INFO 16985 [luigi-interface] worker.py:267 - Scheduled InsertToMysqlLastCountryOfUserTask(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, database=stage_reports, credentials=/edx/etc/edx-analytics-pipeline/output.json, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:59,544 INFO 16985 [luigi-interface] worker.py:267 - Scheduled LoadInternalReportingCountryTableHive(source=('hdfs://localhost:9000/data/',), expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, geolocation_data=hdfs://localhost:9000/data/GeoIP.dat, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, interval=2017-01-01-2017-01-19) (PENDING)
2017-01-19 18:01:59,544 INFO 16985 [luigi-interface] interface.py:193 - Done scheduling tasks
2017-01-19 18:01:59,544 INFO 16985 [luigi-interface] worker.py:282 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) running SqoopImportFromMysql(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/auth_user/dt=2017-01-19/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, num_mappers=None, verbose=False, table_name=auth_user, where=None, columns=('id', 'username', 'last_login', 'date_joined', 'is_active', 'is_superuser', 'is_staff', 'email'), null_string=\\N, fields_terminated_by=, delimiter_replacement= , mysql_delimiters=False)
2017-01-19 18:02:25,432 INFO 16985 [luigi-interface] hadoop.py:242 - sqoop import --connect jdbc:mysql://stage-instituto-is.ct77wcfkafjj.us-east-1.rds.amazonaws.com/stage_edxapp --username stage_read_only --password-file /tmp/luigi/partial/luigitemp-591492406 --table auth_user --target-dir hdfs://localhost:9000/edx-analytics-pipeline/warehouse/auth_user/dt=2017-01-19/ --columns id,username,last_login,date_joined,is_active,is_superuser,is_staff,email --null-string \\N --null-non-string \\N --fields-terminated-by --hive-delims-replacement --direct
2017-01-19 18:02:27,324 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:27 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
2017-01-19 18:02:29,780 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:29 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2017-01-19 18:02:29,780 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:29 INFO tool.CodeGenTool: Beginning code generation
2017-01-19 18:02:30,694 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `auth_user` AS t LIMIT 1
2017-01-19 18:02:30,761 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `auth_user` AS t LIMIT 1
2017-01-19 18:02:30,791 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:30 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /edx/app/hadoop/hadoop
2017-01-19 18:02:36,891 INFO 16985 [luigi-interface] hadoop.py:273 - Note: /tmp/sqoop-hadoop/compile/70fd7a72a651031bc6338d8402d7e09f/auth_user.java uses or overrides a deprecated API.
2017-01-19 18:02:36,891 INFO 16985 [luigi-interface] hadoop.py:273 - Note: Recompile with -Xlint:deprecation for details.
2017-01-19 18:02:36,907 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:36 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/70fd7a72a651031bc6338d8402d7e09f/auth_user.jar
2017-01-19 18:02:36,932 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:36 WARN manager.DirectMySQLManager: Direct-mode import from MySQL does not support column
2017-01-19 18:02:36,933 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:36 WARN manager.DirectMySQLManager: selection. Falling back to JDBC-based import.
2017-01-19 18:02:36,933 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:36 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
2017-01-19 18:02:36,970 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:36 INFO mapreduce.ImportJobBase: Beginning import of auth_user
2017-01-19 18:02:37,019 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:37 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-01-19 18:02:37,085 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:37 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-01-19 18:02:37,366 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:37 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2017-01-19 18:02:44,269 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:44 INFO db.DBInputFormat: Using read commited transaction isolation
2017-01-19 18:02:44,270 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:44 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `auth_user`
2017-01-19 18:02:44,799 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:44 INFO mapreduce.JobSubmitter: number of splits:4
2017-01-19 18:02:45,144 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1484841390427_0009
2017-01-19 18:02:45,820 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:45 INFO impl.YarnClientImpl: Submitted application application_1484841390427_0009
2017-01-19 18:02:45,892 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:45 INFO mapreduce.Job: The url to track the job: http://ip-172-30-0-176:8088/proxy/application_1484841390427_0009/
2017-01-19 18:02:45,893 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:02:45 INFO mapreduce.Job: Running job: job_1484841390427_0009
2017-01-19 18:03:00,016 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:00 INFO mapreduce.Job: Job job_1484841390427_0009 running in uber mode : false
2017-01-19 18:03:00,017 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:00 INFO mapreduce.Job: map 0% reduce 0%
2017-01-19 18:03:43,917 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:43 INFO mapreduce.Job: map 25% reduce 0%
2017-01-19 18:03:44,983 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:44 INFO mapreduce.Job: map 50% reduce 0%
2017-01-19 18:03:45,988 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:45 INFO mapreduce.Job: map 100% reduce 0%
2017-01-19 18:03:48,054 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:48 INFO mapreduce.Job: Job job_1484841390427_0009 completed successfully
2017-01-19 18:03:48,439 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:48 INFO mapreduce.Job: Counters: 30
2017-01-19 18:03:48,439 INFO 16985 [luigi-interface] hadoop.py:273 - 	File System Counters
2017-01-19 18:03:48,440 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of bytes read=0
2017-01-19 18:03:48,440 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of bytes written=417068
2017-01-19 18:03:48,440 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of read operations=0
2017-01-19 18:03:48,440 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of large read operations=0
2017-01-19 18:03:48,441 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of write operations=0
2017-01-19 18:03:48,441 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of bytes read=394
2017-01-19 18:03:48,441 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of bytes written=781
2017-01-19 18:03:48,441 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of read operations=16
2017-01-19 18:03:48,441 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of large read operations=0
2017-01-19 18:03:48,442 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of write operations=8
2017-01-19 18:03:48,450 INFO 16985 [luigi-interface] hadoop.py:273 - 	Job Counters
2017-01-19 18:03:48,450 INFO 16985 [luigi-interface] hadoop.py:273 - 		Launched map tasks=4
2017-01-19 18:03:48,450 INFO 16985 [luigi-interface] hadoop.py:273 - 		Other local map tasks=4
2017-01-19 18:03:48,450 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total time spent by all maps in occupied slots (ms)=166540
2017-01-19 18:03:48,450 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total time spent by all reduces in occupied slots (ms)=0
2017-01-19 18:03:48,450 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total time spent by all map tasks (ms)=166540
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total vcore-seconds taken by all map tasks=166540
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total megabyte-seconds taken by all map tasks=170536960
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 	Map-Reduce Framework
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 		Map input records=8
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 		Map output records=8
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 		Input split bytes=394
2017-01-19 18:03:48,451 INFO 16985 [luigi-interface] hadoop.py:273 - 		Spilled Records=0
2017-01-19 18:03:48,452 INFO 16985 [luigi-interface] hadoop.py:273 - 		Failed Shuffles=0
2017-01-19 18:03:48,453 INFO 16985 [luigi-interface] hadoop.py:273 - 		Merged Map outputs=0
2017-01-19 18:03:48,454 INFO 16985 [luigi-interface] hadoop.py:273 - 		GC time elapsed (ms)=2285
2017-01-19 18:03:48,454 INFO 16985 [luigi-interface] hadoop.py:273 - 		CPU time spent (ms)=6610
2017-01-19 18:03:48,470 INFO 16985 [luigi-interface] hadoop.py:273 - 		Physical memory (bytes) snapshot=440082432
2017-01-19 18:03:48,470 INFO 16985 [luigi-interface] hadoop.py:273 - 		Virtual memory (bytes) snapshot=8846278656
2017-01-19 18:03:48,470 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total committed heap usage (bytes)=243531776
2017-01-19 18:03:48,470 INFO 16985 [luigi-interface] hadoop.py:273 - 	File Input Format Counters
2017-01-19 18:03:48,470 INFO 16985 [luigi-interface] hadoop.py:273 - 		Bytes Read=0
2017-01-19 18:03:48,471 INFO 16985 [luigi-interface] hadoop.py:273 - 	File Output Format Counters
2017-01-19 18:03:48,471 INFO 16985 [luigi-interface] hadoop.py:273 - 		Bytes Written=781
2017-01-19 18:03:48,505 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:48 INFO mapreduce.ImportJobBase: Transferred 781 bytes in 71.3525 seconds (10.9457 bytes/sec)
2017-01-19 18:03:48,554 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:03:48 INFO mapreduce.ImportJobBase: Retrieved 8 records.
2017-01-19 18:04:19,142 INFO 16985 [luigi-interface] worker.py:296 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) done SqoopImportFromMysql(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/auth_user/dt=2017-01-19/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, num_mappers=None, verbose=False, table_name=auth_user, where=None, columns=('id', 'username', 'last_login', 'date_joined', 'is_active', 'is_superuser', 'is_staff', 'email'), null_string=\\N, fields_terminated_by=, delimiter_replacement= , mysql_delimiters=False)
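[Note: a side remark on the DirectMySQLManager warning at 18:02:36 above. Sqoop's --direct (mysqldump-based) path cannot honor an explicit --columns list, so it silently falls back to a plain JDBC import; the import still succeeds, the flag is just a no-op for these tables. If the warning is unwanted, one option (a hypothetical variant, not something this run does) is to drop --direct, since the column list forces JDBC from the start anyway:

  # Hypothetical rewrite of the generated command logged at 18:02:25, with
  # --direct removed; the non-printable delimiter flags from the original
  # are elided here for readability.
  sqoop import \
      --connect jdbc:mysql://stage-instituto-is.ct77wcfkafjj.us-east-1.rds.amazonaws.com/stage_edxapp \
      --username stage_read_only --password-file /tmp/luigi/partial/luigitemp-591492406 \
      --table auth_user \
      --target-dir hdfs://localhost:9000/edx-analytics-pipeline/warehouse/auth_user/dt=2017-01-19/ \
      --columns id,username,last_login,date_joined,is_active,is_superuser,is_staff,email \
      --null-string '\\N' --null-non-string '\\N'
]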
2017-01-19 18:04:19,143 INFO 16985 [luigi-interface] worker.py:282 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) running ImportAuthUserTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:04:23,371 INFO 16985 [edx.analytics.tasks.insights.database_imports] database_imports.py:110 - Marking existing output as having been removed for task ImportAuthUserTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:04:23,371 INFO 16985 [luigi-interface] hive.py:358 - ['hive', '-f', '/tmp/tmpksUMTq', '--hiveconf', 'mapred.job.name=ImportAuthUserTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)']
2017-01-19 18:04:23,371 INFO 16985 [luigi-interface] hadoop.py:242 - hive -f /tmp/tmpksUMTq --hiveconf mapred.job.name=ImportAuthUserTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:04:26,796 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2017-01-19 18:04:26,797 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
2017-01-19 18:04:26,799 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
2017-01-19 18:04:26,799 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
2017-01-19 18:04:26,799 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
2017-01-19 18:04:26,800 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2017-01-19 18:04:26,872 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:26 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2017-01-19 18:04:27,530 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:27 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@76f2b07d:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2017-01-19 18:04:27,547 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:27 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@76f2b07d:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2017-01-19 18:04:27,553 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:04:27 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
2017-01-19 18:04:27,987 INFO 16985 [luigi-interface] hadoop.py:273 - Logging initialized using configuration in jar:file:/edx/app/hadoop/hive-0.11.0-bin/lib/hive-common-0.11.0.jar!/hive-log4j.properties
2017-01-19 18:04:28,008 INFO 16985 [luigi-interface] hadoop.py:273 - Hive history file=/tmp/hadoop/hive_job_log_hadoop_18723@ip-172-30-0-176.ec2.internal_201701191804_1780438655.txt
2017-01-19 18:04:28,397 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Class path contains multiple SLF4J bindings.
2017-01-19 18:04:28,398 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Found binding in [jar:file:/edx/app/hadoop/hadoop-2.3.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-01-19 18:04:28,398 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Found binding in [jar:file:/edx/app/hadoop/hive-0.11.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-01-19 18:04:28,398 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2017-01-19 18:04:28,401 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017-01-19 18:04:35,752 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:04:35,753 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 6.913 seconds
2017-01-19 18:04:39,964 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:04:39,976 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 4.222 seconds
2017-01-19 18:04:40,994 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:04:40,995 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 1.018 seconds
2017-01-19 18:04:41,786 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:04:41,787 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 0.792 seconds
2017-01-19 18:04:41,836 INFO 16985 [luigi-interface] worker.py:296 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) done ImportAuthUserTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:04:41,836 INFO 16985 [luigi-interface] worker.py:282 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) running SqoopImportFromMysql(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/student_courseenrollment/dt=2017-01-19/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, num_mappers=None, verbose=False, table_name=student_courseenrollment, where=None, columns=('id', 'user_id', 'course_id', 'created', 'is_active', 'mode'), null_string=\\N, fields_terminated_by=, delimiter_replacement= , mysql_delimiters=False)
2017-01-19 18:05:05,636 INFO 16985 [luigi-interface] hadoop.py:242 - sqoop import --connect jdbc:mysql://stage-instituto-is.ct77wcfkafjj.us-east-1.rds.amazonaws.com/stage_edxapp --username stage_read_only --password-file /tmp/luigi/partial/luigitemp-730345178 --table student_courseenrollment --target-dir hdfs://localhost:9000/edx-analytics-pipeline/warehouse/student_courseenrollment/dt=2017-01-19/ --columns id,user_id,course_id,created,is_active,mode --null-string \\N --null-non-string \\N --fields-terminated-by --hive-delims-replacement --direct
2017-01-19 18:05:09,463 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:09 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
2017-01-19 18:05:12,963 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:12 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2017-01-19 18:05:12,963 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:12 INFO tool.CodeGenTool: Beginning code generation
2017-01-19 18:05:13,825 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student_courseenrollment` AS t LIMIT 1
2017-01-19 18:05:13,919 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `student_courseenrollment` AS t LIMIT 1
2017-01-19 18:05:13,925 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:13 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /edx/app/hadoop/hadoop
2017-01-19 18:05:18,543 INFO 16985 [luigi-interface] hadoop.py:273 - Note: /tmp/sqoop-hadoop/compile/5d8c6d16892db7daefd7c7ce3c4d19e6/student_courseenrollment.java uses or overrides a deprecated API.
2017-01-19 18:05:18,547 INFO 16985 [luigi-interface] hadoop.py:273 - Note: Recompile with -Xlint:deprecation for details.
2017-01-19 18:05:18,559 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/5d8c6d16892db7daefd7c7ce3c4d19e6/student_courseenrollment.jar
2017-01-19 18:05:18,577 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 WARN manager.DirectMySQLManager: Direct-mode import from MySQL does not support column
2017-01-19 18:05:18,591 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 WARN manager.DirectMySQLManager: selection. Falling back to JDBC-based import.
2017-01-19 18:05:18,591 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
2017-01-19 18:05:18,606 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 INFO mapreduce.ImportJobBase: Beginning import of student_courseenrollment
2017-01-19 18:05:18,629 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-01-19 18:05:18,699 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:18 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-01-19 18:05:19,046 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:19 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2017-01-19 18:05:25,730 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:25 INFO db.DBInputFormat: Using read commited transaction isolation
2017-01-19 18:05:25,731 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:25 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `student_courseenrollment`
2017-01-19 18:05:25,839 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:25 INFO mapreduce.JobSubmitter: number of splits:4
2017-01-19 18:05:26,264 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1484841390427_0010
2017-01-19 18:05:27,181 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:27 INFO impl.YarnClientImpl: Submitted application application_1484841390427_0010
2017-01-19 18:05:27,345 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:27 INFO mapreduce.Job: The url to track the job: http://ip-172-30-0-176:8088/proxy/application_1484841390427_0010/
2017-01-19 18:05:27,351 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:27 INFO mapreduce.Job: Running job: job_1484841390427_0010
2017-01-19 18:05:43,842 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:43 INFO mapreduce.Job: Job job_1484841390427_0010 running in uber mode : false
2017-01-19 18:05:43,842 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:05:43 INFO mapreduce.Job: map 0% reduce 0%
2017-01-19 18:06:24,534 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:06:24 INFO mapreduce.Job: map 50% reduce 0%
2017-01-19 18:06:27,769 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:06:27 INFO mapreduce.Job: map 100% reduce 0%
2017-01-19 18:06:28,794 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:06:28 INFO mapreduce.Job: Job job_1484841390427_0010 completed successfully
2017-01-19 18:06:29,133 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:06:29 INFO mapreduce.Job: Counters: 30
2017-01-19 18:06:29,133 INFO 16985 [luigi-interface] hadoop.py:273 - 	File System Counters
2017-01-19 18:06:29,133 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of bytes read=0
2017-01-19 18:06:29,134 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of bytes written=417176
2017-01-19 18:06:29,182 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of read operations=0
2017-01-19 18:06:29,182 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of large read operations=0
2017-01-19 18:06:29,182 INFO 16985 [luigi-interface] hadoop.py:273 - 		FILE: Number of write operations=0
2017-01-19 18:06:29,197 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of bytes read=400
2017-01-19 18:06:29,197 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of bytes written=2311
2017-01-19 18:06:29,197 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of read operations=16
2017-01-19 18:06:29,198 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of large read operations=0
2017-01-19 18:06:29,198 INFO 16985 [luigi-interface] hadoop.py:273 - 		HDFS: Number of write operations=8
2017-01-19 18:06:29,198 INFO 16985 [luigi-interface] hadoop.py:273 - 	Job Counters
2017-01-19 18:06:29,198 INFO 16985 [luigi-interface] hadoop.py:273 - 		Launched map tasks=4
2017-01-19 18:06:29,198 INFO 16985 [luigi-interface] hadoop.py:273 - 		Other local map tasks=4
2017-01-19 18:06:29,199 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total time spent by all maps in occupied slots (ms)=158540
2017-01-19 18:06:29,199 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total time spent by all reduces in occupied slots (ms)=0
2017-01-19 18:06:29,199 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total time spent by all map tasks (ms)=158540
2017-01-19 18:06:29,199 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total vcore-seconds taken by all map tasks=158540
2017-01-19 18:06:29,199 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total megabyte-seconds taken by all map tasks=162344960
2017-01-19 18:06:29,200 INFO 16985 [luigi-interface] hadoop.py:273 - 	Map-Reduce Framework
2017-01-19 18:06:29,200 INFO 16985 [luigi-interface] hadoop.py:273 - 		Map input records=29
2017-01-19 18:06:29,200 INFO 16985 [luigi-interface] hadoop.py:273 - 		Map output records=29
2017-01-19 18:06:29,200 INFO 16985 [luigi-interface] hadoop.py:273 - 		Input split bytes=400
2017-01-19 18:06:29,200 INFO 16985 [luigi-interface] hadoop.py:273 - 		Spilled Records=0
2017-01-19 18:06:29,200 INFO 16985 [luigi-interface] hadoop.py:273 - 		Failed Shuffles=0
2017-01-19 18:06:29,201 INFO 16985 [luigi-interface] hadoop.py:273 - 		Merged Map outputs=0
2017-01-19 18:06:29,201 INFO 16985 [luigi-interface] hadoop.py:273 - 		GC time elapsed (ms)=2360
2017-01-19 18:06:29,201 INFO 16985 [luigi-interface] hadoop.py:273 - 		CPU time spent (ms)=6580
2017-01-19 18:06:29,201 INFO 16985 [luigi-interface] hadoop.py:273 - 		Physical memory (bytes) snapshot=438423552
2017-01-19 18:06:29,201 INFO 16985 [luigi-interface] hadoop.py:273 - 		Virtual memory (bytes) snapshot=8846278656
2017-01-19 18:06:29,202 INFO 16985 [luigi-interface] hadoop.py:273 - 		Total committed heap usage (bytes)=243531776
2017-01-19 18:06:29,202 INFO 16985 [luigi-interface] hadoop.py:273 - 	File Input Format Counters
2017-01-19 18:06:29,202 INFO 16985 [luigi-interface] hadoop.py:273 - 		Bytes Read=0
2017-01-19 18:06:29,202 INFO 16985 [luigi-interface] hadoop.py:273 - 	File Output Format Counters
2017-01-19 18:06:29,202 INFO 16985 [luigi-interface] hadoop.py:273 - 		Bytes Written=2311
2017-01-19 18:06:29,202 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:06:29 INFO mapreduce.ImportJobBase: Transferred 2.2568 KB in 70.4338 seconds (32.811 bytes/sec)
2017-01-19 18:06:29,203 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:06:29 INFO mapreduce.ImportJobBase: Retrieved 29 records.
2017-01-19 18:06:54,740 INFO 16985 [luigi-interface] worker.py:296 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) done SqoopImportFromMysql(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/student_courseenrollment/dt=2017-01-19/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, num_mappers=None, verbose=False, table_name=student_courseenrollment, where=None, columns=('id', 'user_id', 'course_id', 'created', 'is_active', 'mode'), null_string=\\N, fields_terminated_by=, delimiter_replacement= , mysql_delimiters=False)
2017-01-19 18:06:54,750 INFO 16985 [luigi-interface] worker.py:282 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) running ImportStudentCourseEnrollmentTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:06:58,935 INFO 16985 [edx.analytics.tasks.insights.database_imports] database_imports.py:110 - Marking existing output as having been removed for task ImportStudentCourseEnrollmentTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:06:58,936 INFO 16985 [luigi-interface] hive.py:358 - ['hive', '-f', '/tmp/tmpDUrda6', '--hiveconf', 'mapred.job.name=ImportStudentCourseEnrollmentTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)']
2017-01-19 18:06:58,936 INFO 16985 [luigi-interface] hadoop.py:242 - hive -f /tmp/tmpDUrda6 --hiveconf mapred.job.name=ImportStudentCourseEnrollmentTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:07:03,080 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2017-01-19 18:07:03,083 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
2017-01-19 18:07:03,083 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
2017-01-19 18:07:03,092 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
2017-01-19 18:07:03,093 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
2017-01-19 18:07:03,093 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2017-01-19 18:07:03,095 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:03 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2017-01-19 18:07:04,530 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:04 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@76f2b07d:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2017-01-19 18:07:04,597 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:04 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@76f2b07d:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2017-01-19 18:07:04,612 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:04 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
2017-01-19 18:07:05,948 INFO 16985 [luigi-interface] hadoop.py:273 - Logging initialized using configuration in jar:file:/edx/app/hadoop/hive-0.11.0-bin/lib/hive-common-0.11.0.jar!/hive-log4j.properties
2017-01-19 18:07:05,986 INFO 16985 [luigi-interface] hadoop.py:273 - Hive history file=/tmp/hadoop/hive_job_log_hadoop_20244@ip-172-30-0-176.ec2.internal_201701191807_449754460.txt
2017-01-19 18:07:06,775 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Class path contains multiple SLF4J bindings.
2017-01-19 18:07:06,775 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Found binding in [jar:file:/edx/app/hadoop/hadoop-2.3.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-01-19 18:07:06,795 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Found binding in [jar:file:/edx/app/hadoop/hive-0.11.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-01-19 18:07:06,795 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2017-01-19 18:07:06,883 INFO 16985 [luigi-interface] hadoop.py:273 - SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017-01-19 18:07:15,587 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:07:15,588 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 8.147 seconds
2017-01-19 18:07:19,642 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:07:19,642 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 4.053 seconds
2017-01-19 18:07:20,483 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:07:20,484 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 0.841 seconds
2017-01-19 18:07:21,352 INFO 16985 [luigi-interface] hadoop.py:273 - OK
2017-01-19 18:07:21,371 INFO 16985 [luigi-interface] hadoop.py:273 - Time taken: 0.887 seconds
2017-01-19 18:07:21,464 INFO 16985 [luigi-interface] worker.py:296 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) done ImportStudentCourseEnrollmentTask(destination=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/, credentials=/edx/etc/edx-analytics-pipeline/input.json, database=stage_edxapp, import_date=None)
2017-01-19 18:07:21,464 INFO 16985 [luigi-interface] worker.py:282 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) running LastDailyIpAddressOfUserTask(source=('hdfs://localhost:9000/data/',), interval=2017-01-03-2017-01-19, expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/) | |
2017-01-19 18:07:27,912 INFO 16985 [luigi-interface] hadoop.py:242 - /edx/app/hadoop/hadoop/bin/hadoop jar /edx/app/hadoop/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.3.0.jar -D mapred.job.name=LastDailyIpAddressOfUserTask(source=('hdfs://localhost:9000/data/',), interval=2017-01-03-2017-01-19, expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/) -D mapred.reduce.tasks=1 -mapper /usr/bin/python2.7 mrrunner.py map -reducer /usr/bin/python2.7 mrrunner.py reduce -file /var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/mrrunner.py -file /tmp/tmpFlhioS/packages.tar -file /tmp/tmpFlhioS/job-instance.pickle -input /data/tracking.log -output /edx-analytics-pipeline/marker/-969637439385692816-temp-2017-01-19T18-07-25.670339 | |
2017-01-19 18:07:29,727 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:29 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead. | |
2017-01-19 18:07:32,974 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:32 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 | |
2017-01-19 18:07:33,404 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:33 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 | |
2017-01-19 18:07:36,031 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:36 INFO mapred.FileInputFormat: Total input paths to process : 1 | |
2017-01-19 18:07:36,164 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:36 INFO mapreduce.JobSubmitter: number of splits:2 | |
2017-01-19 18:07:36,171 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:36 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name | |
2017-01-19 18:07:36,173 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:36 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces | |
2017-01-19 18:07:36,567 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1484841390427_0011 | |
2017-01-19 18:07:37,167 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:37 INFO impl.YarnClientImpl: Submitted application application_1484841390427_0011 | |
2017-01-19 18:07:37,314 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:37 INFO mapreduce.Job: The url to track the job: http://ip-172-30-0-176:8088/proxy/application_1484841390427_0011/ | |
2017-01-19 18:07:37,316 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:37 INFO mapreduce.Job: Running job: job_1484841390427_0011 | |
2017-01-19 18:07:50,192 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:50 INFO mapreduce.Job: Job job_1484841390427_0011 running in uber mode : false | |
2017-01-19 18:07:50,193 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:07:50 INFO mapreduce.Job: map 0% reduce 0% | |
2017-01-19 18:08:12,997 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:08:12 INFO mapreduce.Job: map 100% reduce 0% | |
2017-01-19 18:08:29,339 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:08:29 INFO mapreduce.Job: map 100% reduce 100% | |
2017-01-19 18:08:42,632 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:08:42 INFO mapreduce.Job: Task Id : attempt_1484841390427_0011_r_000000_0, Status : FAILED | |
2017-01-19 18:08:42,659 INFO 16985 [luigi-interface] hadoop.py:273 - Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 | |
2017-01-19 18:08:42,659 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) | |
2017-01-19 18:08:42,659 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) | |
2017-01-19 18:08:42,660 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134) | |
2017-01-19 18:08:42,660 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237) | |
2017-01-19 18:08:42,660 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) | |
2017-01-19 18:08:42,660 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) | |
2017-01-19 18:08:42,660 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) | |
2017-01-19 18:08:42,661 INFO 16985 [luigi-interface] hadoop.py:273 - at java.security.AccessController.doPrivileged(Native Method) | |
2017-01-19 18:08:42,661 INFO 16985 [luigi-interface] hadoop.py:273 - at javax.security.auth.Subject.doAs(Subject.java:422) | |
2017-01-19 18:08:42,661 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) | |
2017-01-19 18:08:42,661 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) | |
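
Note: `PipeMapRed.waitOutputThreads(): subprocess failed with code 1` means the streaming child process — here the Python reducer — exited non-zero. The Java side only sees the exit status, so the actual Python traceback is in the task attempt's stderr, not in this console output. A sketch for pulling it, assuming YARN log aggregation is enabled on this devstack (otherwise, open the failed attempt's logs via the tracking URL on port 8088 printed above); the application id is taken from the submission lines above:

    # fetch the aggregated task logs and jump to the Python error
    yarn logs -applicationId application_1484841390427_0011 | grep -B 2 -A 40 Traceback
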
2017-01-19 18:08:43,680 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:08:43 INFO mapreduce.Job: map 100% reduce 0% | |
2017-01-19 18:08:59,085 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:08:59 INFO mapreduce.Job: map 100% reduce 100% | |
2017-01-19 18:09:17,410 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:09:17 INFO mapreduce.Job: Task Id : attempt_1484841390427_0011_r_000000_1, Status : FAILED | |
2017-01-19 18:09:17,411 INFO 16985 [luigi-interface] hadoop.py:273 - Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 | |
2017-01-19 18:09:17,411 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) | |
2017-01-19 18:09:17,411 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) | |
2017-01-19 18:09:17,412 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134) | |
2017-01-19 18:09:17,412 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237) | |
2017-01-19 18:09:17,412 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) | |
2017-01-19 18:09:17,412 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) | |
2017-01-19 18:09:17,412 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) | |
2017-01-19 18:09:17,413 INFO 16985 [luigi-interface] hadoop.py:273 - at java.security.AccessController.doPrivileged(Native Method) | |
2017-01-19 18:09:17,414 INFO 16985 [luigi-interface] hadoop.py:273 - at javax.security.auth.Subject.doAs(Subject.java:422) | |
2017-01-19 18:09:17,414 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) | |
2017-01-19 18:09:17,414 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) | |
2017-01-19 18:09:18,417 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:09:18 INFO mapreduce.Job: map 100% reduce 0% | |
2017-01-19 18:09:33,705 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:09:33 INFO mapreduce.Job: map 100% reduce 100% | |
2017-01-19 18:09:46,970 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:09:46 INFO mapreduce.Job: Task Id : attempt_1484841390427_0011_r_000000_2, Status : FAILED | |
2017-01-19 18:09:46,983 INFO 16985 [luigi-interface] hadoop.py:273 - Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 | |
2017-01-19 18:09:46,983 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320) | |
2017-01-19 18:09:46,984 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533) | |
2017-01-19 18:09:46,984 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134) | |
2017-01-19 18:09:46,984 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237) | |
2017-01-19 18:09:46,984 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459) | |
2017-01-19 18:09:46,984 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392) | |
2017-01-19 18:09:46,985 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) | |
2017-01-19 18:09:46,985 INFO 16985 [luigi-interface] hadoop.py:273 - at java.security.AccessController.doPrivileged(Native Method) | |
2017-01-19 18:09:46,990 INFO 16985 [luigi-interface] hadoop.py:273 - at javax.security.auth.Subject.doAs(Subject.java:422) | |
2017-01-19 18:09:46,990 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) | |
2017-01-19 18:09:46,991 INFO 16985 [luigi-interface] hadoop.py:273 - at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) | |
2017-01-19 18:09:47,983 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:09:47 INFO mapreduce.Job: map 100% reduce 0% | |
2017-01-19 18:10:04,428 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:10:04 INFO mapreduce.Job: map 100% reduce 100% | |
2017-01-19 18:10:23,820 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:10:23 INFO mapreduce.Job: Job job_1484841390427_0011 failed with state FAILED due to: Task failed task_1484841390427_0011_r_000000 | |
2017-01-19 18:10:23,820 INFO 16985 [luigi-interface] hadoop.py:273 - Job failed as tasks failed. failedMaps:0 failedReduces:1 | |
2017-01-19 18:10:24,151 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:10:24 INFO mapreduce.Job: Counters: 37 | |
2017-01-19 18:10:24,152 INFO 16985 [luigi-interface] hadoop.py:273 - File System Counters | |
2017-01-19 18:10:24,187 INFO 16985 [luigi-interface] hadoop.py:273 - FILE: Number of bytes read=0 | |
2017-01-19 18:10:24,187 INFO 16985 [luigi-interface] hadoop.py:273 - FILE: Number of bytes written=195423 | |
2017-01-19 18:10:24,187 INFO 16985 [luigi-interface] hadoop.py:273 - FILE: Number of read operations=0 | |
2017-01-19 18:10:24,188 INFO 16985 [luigi-interface] hadoop.py:273 - FILE: Number of large read operations=0 | |
2017-01-19 18:10:24,191 INFO 16985 [luigi-interface] hadoop.py:273 - FILE: Number of write operations=0 | |
2017-01-19 18:10:24,191 INFO 16985 [luigi-interface] hadoop.py:273 - HDFS: Number of bytes read=686974 | |
2017-01-19 18:10:24,191 INFO 16985 [luigi-interface] hadoop.py:273 - HDFS: Number of bytes written=0 | |
2017-01-19 18:10:24,192 INFO 16985 [luigi-interface] hadoop.py:273 - HDFS: Number of read operations=6 | |
2017-01-19 18:10:24,192 INFO 16985 [luigi-interface] hadoop.py:273 - HDFS: Number of large read operations=0 | |
2017-01-19 18:10:24,192 INFO 16985 [luigi-interface] hadoop.py:273 - HDFS: Number of write operations=0 | |
2017-01-19 18:10:24,192 INFO 16985 [luigi-interface] hadoop.py:273 - Job Counters | |
2017-01-19 18:10:24,192 INFO 16985 [luigi-interface] hadoop.py:273 - Failed reduce tasks=4 | |
2017-01-19 18:10:24,204 INFO 16985 [luigi-interface] hadoop.py:273 - Launched map tasks=2 | |
2017-01-19 18:10:24,204 INFO 16985 [luigi-interface] hadoop.py:273 - Launched reduce tasks=4 | |
2017-01-19 18:10:24,205 INFO 16985 [luigi-interface] hadoop.py:273 - Data-local map tasks=2 | |
2017-01-19 18:10:24,205 INFO 16985 [luigi-interface] hadoop.py:273 - Total time spent by all maps in occupied slots (ms)=43964 | |
2017-01-19 18:10:24,205 INFO 16985 [luigi-interface] hadoop.py:273 - Total time spent by all reduces in occupied slots (ms)=118112 | |
2017-01-19 18:10:24,205 INFO 16985 [luigi-interface] hadoop.py:273 - Total time spent by all map tasks (ms)=43964 | |
2017-01-19 18:10:24,205 INFO 16985 [luigi-interface] hadoop.py:273 - Total time spent by all reduce tasks (ms)=118112 | |
2017-01-19 18:10:24,206 INFO 16985 [luigi-interface] hadoop.py:273 - Total vcore-seconds taken by all map tasks=43964 | |
2017-01-19 18:10:24,206 INFO 16985 [luigi-interface] hadoop.py:273 - Total vcore-seconds taken by all reduce tasks=118112 | |
2017-01-19 18:10:24,206 INFO 16985 [luigi-interface] hadoop.py:273 - Total megabyte-seconds taken by all map tasks=45019136 | |
2017-01-19 18:10:24,206 INFO 16985 [luigi-interface] hadoop.py:273 - Total megabyte-seconds taken by all reduce tasks=120946688 | |
2017-01-19 18:10:24,206 INFO 16985 [luigi-interface] hadoop.py:273 - Map-Reduce Framework | |
2017-01-19 18:10:24,206 INFO 16985 [luigi-interface] hadoop.py:273 - Map input records=1742 | |
2017-01-19 18:10:24,207 INFO 16985 [luigi-interface] hadoop.py:273 - Map output records=167 | |
2017-01-19 18:10:24,207 INFO 16985 [luigi-interface] hadoop.py:273 - Map output bytes=15809 | |
2017-01-19 18:10:24,207 INFO 16985 [luigi-interface] hadoop.py:273 - Map output materialized bytes=16155 | |
2017-01-19 18:10:24,207 INFO 16985 [luigi-interface] hadoop.py:273 - Input split bytes=182 | |
2017-01-19 18:10:24,207 INFO 16985 [luigi-interface] hadoop.py:273 - Combine input records=0 | |
2017-01-19 18:10:24,208 INFO 16985 [luigi-interface] hadoop.py:273 - Spilled Records=167 | |
2017-01-19 18:10:24,208 INFO 16985 [luigi-interface] hadoop.py:273 - Failed Shuffles=0 | |
2017-01-19 18:10:24,208 INFO 16985 [luigi-interface] hadoop.py:273 - Merged Map outputs=0 | |
2017-01-19 18:10:24,208 INFO 16985 [luigi-interface] hadoop.py:273 - GC time elapsed (ms)=651 | |
2017-01-19 18:10:24,208 INFO 16985 [luigi-interface] hadoop.py:273 - CPU time spent (ms)=3880 | |
2017-01-19 18:10:24,208 INFO 16985 [luigi-interface] hadoop.py:273 - Physical memory (bytes) snapshot=415764480 | |
2017-01-19 18:10:24,209 INFO 16985 [luigi-interface] hadoop.py:273 - Virtual memory (bytes) snapshot=4414054400 | |
2017-01-19 18:10:24,226 INFO 16985 [luigi-interface] hadoop.py:273 - Total committed heap usage (bytes)=317587456 | |
2017-01-19 18:10:24,226 INFO 16985 [luigi-interface] hadoop.py:273 - File Input Format Counters | |
2017-01-19 18:10:24,226 INFO 16985 [luigi-interface] hadoop.py:273 - Bytes Read=686792 | |
2017-01-19 18:10:24,226 INFO 16985 [luigi-interface] hadoop.py:273 - 17/01/19 18:10:24 ERROR streaming.StreamJob: Job not Successful! | |
2017-01-19 18:10:24,226 INFO 16985 [luigi-interface] hadoop.py:273 - Streaming Command Failed! | |
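
Note on the counters above: the job ran with `--n-reduce-tasks 1`, so `Failed reduce tasks=4` is the single logical reducer retried four times (the stock `mapreduce.reduce.maxattempts` default) before YARN killed the job; `Launched map tasks=2` matches the `number of splits:2` reported at submission, and `HDFS: Number of bytes written=0` confirms no partial output was committed. A sketch for checking whether the attempt limit was overridden locally — the config path is the conventional one under this box's hadoop home (`/edx/app/hadoop/hadoop`, per the command line above), not confirmed by the log:

    # default for mapreduce.reduce.maxattempts is 4; an override would show up here
    grep -B 1 -A 2 maxattempts /edx/app/hadoop/hadoop/etc/hadoop/mapred-site.xml
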
2017-01-19 18:10:24,237 ERROR 16985 [luigi-interface] worker.py:304 - [pid 16985] Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) failed LastDailyIpAddressOfUserTask(source=('hdfs://localhost:9000/data/',), interval=2017-01-03-2017-01-19, expand_interval=2 days, 0:00:00, pattern=('.*tracking.log.*',), date_pattern=%Y%m%d, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/) | |
Traceback (most recent call last): | |
File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/worker.py", line 292, in _run_task | |
task.run() | |
File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/edx/analytics/tasks/insights/location_per_course.py", line 163, in run | |
super(LastDailyIpAddressOfUserTask, self).run() | |
File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/hadoop.py", line 612, in run | |
self.job_runner().run_job(self) | |
File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/hadoop.py", line 482, in run_job | |
run_and_track_hadoop_job(arglist) | |
File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/hadoop.py", line 318, in run_and_track_hadoop_job | |
return track_process(arglist, tracking_url_callback, env) | |
File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/hadoop.py", line 302, in track_process | |
raise HadoopJobError(message + 'Also, no tracking url found.', out, err) | |
HadoopJobError: ('Streaming job failed with exit code 1. Also, no tracking url found.', 'packageJobJar: [/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/luigi/mrrunner.py, /tmp/tmpFlhioS/packages.tar, /tmp/tmpFlhioS/job-instance.pickle, /tmp/hadoop-hadoop/hadoop-unjar1279580823033983790/] [] /tmp/streamjob7319831079718813713.jar tmpDir=null\n', '17/01/19 18:07:29 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.\n17/01/19 18:07:32 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032\n17/01/19 18:07:33 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032\n17/01/19 18:07:36 INFO mapred.FileInputFormat: Total input paths to process : 1\n17/01/19 18:07:36 INFO mapreduce.JobSubmitter: number of splits:2\n17/01/19 18:07:36 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name\n17/01/19 18:07:36 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces\n17/01/19 18:07:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1484841390427_0011\n17/01/19 18:07:37 INFO impl.YarnClientImpl: Submitted application application_1484841390427_0011\n17/01/19 18:07:37 INFO mapreduce.Job: The url to track the job: http://ip-172-30-0-176:8088/proxy/application_1484841390427_0011/\n17/01/19 18:07:37 INFO mapreduce.Job: Running job: job_1484841390427_0011\n17/01/19 18:07:50 INFO mapreduce.Job: Job job_1484841390427_0011 running in uber mode : false\n17/01/19 18:07:50 INFO mapreduce.Job: map 0% reduce 0%\n17/01/19 18:08:12 INFO mapreduce.Job: map 100% reduce 0%\n17/01/19 18:08:29 INFO mapreduce.Job: map 100% reduce 100%\n17/01/19 18:08:42 INFO mapreduce.Job: Task Id : attempt_1484841390427_0011_r_000000_0, Status : FAILED\nError: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1\n\tat org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)\n\tat org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)\n\tat org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)\n\tat org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)\n\tat org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)\n\tat org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)\n\tat org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)\n\tat org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)\n\n17/01/19 18:08:43 INFO mapreduce.Job: map 100% reduce 0%\n17/01/19 18:08:59 INFO mapreduce.Job: map 100% reduce 100%\n17/01/19 18:09:17 INFO mapreduce.Job: Task Id : attempt_1484841390427_0011_r_000000_1, Status : FAILED\nError: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1\n\tat org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)\n\tat org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)\n\tat org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)\n\tat org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)\n\tat org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)\n\tat org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)\n\tat org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)\n\tat org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)\n\n17/01/19 18:09:18 INFO mapreduce.Job: map 100% reduce 0%\n17/01/19 18:09:33 INFO mapreduce.Job: map 100% reduce 100%\n17/01/19 18:09:46 INFO mapreduce.Job: Task Id : attempt_1484841390427_0011_r_000000_2, Status : FAILED\nError: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1\n\tat org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)\n\tat org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)\n\tat org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)\n\tat org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)\n\tat org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)\n\tat org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)\n\tat org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)\n\tat org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)\n\n17/01/19 18:09:47 INFO mapreduce.Job: map 100% reduce 0%\n17/01/19 18:10:04 INFO mapreduce.Job: map 100% reduce 100%\n17/01/19 18:10:23 INFO mapreduce.Job: Job job_1484841390427_0011 failed with state FAILED due to: Task failed task_1484841390427_0011_r_000000\nJob failed as tasks failed. failedMaps:0 failedReduces:1\n\n17/01/19 18:10:24 INFO mapreduce.Job: Counters: 37\n\tFile System Counters\n\t\tFILE: Number of bytes read=0\n\t\tFILE: Number of bytes written=195423\n\t\tFILE: Number of read operations=0\n\t\tFILE: Number of large read operations=0\n\t\tFILE: Number of write operations=0\n\t\tHDFS: Number of bytes read=686974\n\t\tHDFS: Number of bytes written=0\n\t\tHDFS: Number of read operations=6\n\t\tHDFS: Number of large read operations=0\n\t\tHDFS: Number of write operations=0\n\tJob Counters \n\t\tFailed reduce tasks=4\n\t\tLaunched map tasks=2\n\t\tLaunched reduce tasks=4\n\t\tData-local map tasks=2\n\t\tTotal time spent by all maps in occupied slots (ms)=43964\n\t\tTotal time spent by all reduces in occupied slots (ms)=118112\n\t\tTotal time spent by all map tasks (ms)=43964\n\t\tTotal time spent by all reduce tasks (ms)=118112\n\t\tTotal vcore-seconds taken by all map tasks=43964\n\t\tTotal vcore-seconds taken by all reduce tasks=118112\n\t\tTotal megabyte-seconds taken by all map tasks=45019136\n\t\tTotal megabyte-seconds taken by all reduce tasks=120946688\n\tMap-Reduce Framework\n\t\tMap input records=1742\n\t\tMap output records=167\n\t\tMap output bytes=15809\n\t\tMap output materialized bytes=16155\n\t\tInput split bytes=182\n\t\tCombine input records=0\n\t\tSpilled Records=167\n\t\tFailed Shuffles=0\n\t\tMerged Map outputs=0\n\t\tGC time elapsed (ms)=651\n\t\tCPU time spent (ms)=3880\n\t\tPhysical memory (bytes) snapshot=415764480\n\t\tVirtual memory (bytes) snapshot=4414054400\n\t\tTotal committed heap usage (bytes)=317587456\n\tFile Input Format Counters \n\t\tBytes Read=686792\n17/01/19 18:10:24 ERROR streaming.StreamJob: Job not Successful!\nStreaming Command Failed!\n') | |
2017-01-19 18:10:24,239 INFO 16985 [luigi-interface] notifications.py:96 - Skipping error email. Set `error-email` in the `core` section of the luigi config file to receive error emails. | |
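
Note: as the message above says, failure emails are off because `error-email` isn't set. A minimal sketch of enabling them, appended to the override config this run was launched with (`$HOME/edx-analytics-pipeline/config/devstack.cfg`), assuming that file is the luigi config the remote task reads and that a local MTA can deliver mail; the address is a placeholder:

    # hypothetical value; error-email is the option named in the log message above
    cat >> $HOME/edx-analytics-pipeline/config/devstack.cfg <<'EOF'
    [core]
    error-email = ops@example.com
    EOF
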
2017-01-19 18:10:32,881 INFO 16985 [luigi-interface] worker.py:337 - Done | |
2017-01-19 18:10:32,881 INFO 16985 [luigi-interface] worker.py:338 - There are no more tasks to run at this time | |
2017-01-19 18:10:32,882 INFO 16985 [luigi-interface] worker.py:343 - There are 8 pending tasks possibly being run by other workers | |
2017-01-19 18:10:32,924 INFO 16985 [luigi-interface] worker.py:117 - Worker Worker(salt=062530328, host=ip-172-30-0-176, username=hadoop, pid=16985) was stopped. Shutting down Keep-Alive thread |