@qqibrow
Created November 5, 2020 01:05
jm_log
2020-11-04 18:39:39,854 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 6917 @ 1604515179567 for job 47d07d9ba88330da5940b96d82c0e5b1.
2020-11-04 18:42:07,846 WARN org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 48188ms for sessionid 0x200572cda940ebb
2020-11-04 18:42:07,846 INFO org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 48188ms for sessionid 0x200572cda940ebb, closing socket connection and attempting reconnect
2020-11-04 18:42:07,857 INFO org.apache.flink.yarn.YarnResourceManager - The heartbeat of JobManager with id 19b2149e615087e5aeb8fcdb3f6d8daf timed out.
2020-11-04 18:42:07,857 INFO org.apache.flink.yarn.YarnResourceManager - Disconnect job manager 89ebd520ca11b33c2d37d55295e04f33@akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/jobmanager_0 for job 47d07d9ba88330da5940b96d82c0e5b1 from the resource manager.
2020-11-04 18:42:07,861 INFO org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager connection 8babe17f0e7ef2a320a4f3350794d019: The heartbeat of JobManager with id 19b2149e615087e5aeb8fcdb3f6d8daf timed out..
2020-11-04 18:42:07,861 INFO org.apache.flink.runtime.jobmaster.JobMaster - Connecting to ResourceManager akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/resourcemanager(819240c9c97ced16fd16859e671d47dc)
2020-11-04 18:42:07,871 INFO org.apache.flink.runtime.jobmaster.JobMaster - Resolved ResourceManager address, beginning registration
2020-11-04 18:42:07,871 INFO org.apache.flink.runtime.jobmaster.JobMaster - Registration at ResourceManager attempt 1 (timeout=100ms)
2020-11-04 18:42:07,871 INFO org.apache.flink.yarn.YarnResourceManager - Registering job manager 89ebd520ca11b33c2d37d55295e04f33@akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/jobmanager_0 for job 47d07d9ba88330da5940b96d82c0e5b1.
2020-11-04 18:42:07,921 INFO org.apache.flink.yarn.YarnResourceManager - Registered job manager 89ebd520ca11b33c2d37d55295e04f33@akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/jobmanager_0 for job 47d07d9ba88330da5940b96d82c0e5b1.
2020-11-04 18:42:07,921 INFO org.apache.flink.runtime.jobmaster.JobMaster - JobManager successfully registered at ResourceManager, leader id: 819240c9c97ced16fd16859e671d47dc.
2020-11-04 18:42:07,952 INFO org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
2020-11-04 18:42:07,952 WARN org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper suspended. The contender http://flinkhost-data-slave-prod-0a021443.ec2.pin220.com:41000 no longer participates in the leader election.
2020-11-04 18:42:07,953 WARN org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper suspended. The contender akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/dispatcher no longer participates in the leader election.
2020-11-04 18:42:07,953 WARN org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - ZooKeeper connection SUSPENDING. Changes to the submitted job graphs are not monitored (temporarily).
2020-11-04 18:42:07,956 WARN org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper suspended. The contender akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/jobmanager_0 no longer participates in the leader election.
2020-11-04 18:42:07,956 INFO org.apache.flink.yarn.YarnResourceManager - ResourceManager akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/resourcemanager was revoked leadership. Clearing fencing token.
2020-11-04 18:42:07,956 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Stopping ZooKeeperLeaderRetrievalService /leader/47d07d9ba88330da5940b96d82c0e5b1/job_manager_lock.
2020-11-04 18:42:07,957 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/dispatcher was revoked leadership.
2020-11-04 18:42:07,957 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping all currently running jobs of dispatcher akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/dispatcher.
2020-11-04 18:42:07,957 INFO org.apache.flink.runtime.jobmaster.JobManagerRunner - JobManager for job foo-job (47d07d9ba88330da5940b96d82c0e5b1) was revoked leadership at akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/jobmanager_0.
2020-11-04 18:42:07,960 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Stopping ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
2020-11-04 18:42:07,960 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl - Suspending the SlotManager.
2020-11-04 18:42:07,962 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job foo-job (47d07d9ba88330da5940b96d82c0e5b1) switched from state RUNNING to SUSPENDED.
org.apache.flink.util.FlinkException: JobManager is no longer the leader.
at org.apache.flink.runtime.jobmaster.JobManagerRunner.revokeJobMasterLeadership(JobManagerRunner.java:391)
at org.apache.flink.runtime.jobmaster.JobManagerRunner.lambda$revokeLeadership$5(JobManagerRunner.java:377)
at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981)
at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124)
at org.apache.flink.runtime.jobmaster.JobManagerRunner.revokeLeadership(JobManagerRunner.java:374)
at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService.notLeader(ZooKeeperLeaderElectionService.java:247)
at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.leader.LeaderLatch$8.apply(LeaderLatch.java:640)
at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.leader.LeaderLatch$8.apply(LeaderLatch.java:636)
at org.apache.flink.shaded.curator.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93)
at org.apache.flink.shaded.curator.org.apache.curator.shaded.com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at org.apache.flink.shaded.curator.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85)
at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.leader.LeaderLatch.setLeadership(LeaderLatch.java:635)
at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.leader.LeaderLatch.handleStateChange(LeaderLatch.java:623)
at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.leader.LeaderLatch.access$000(LeaderLatch.java:64)
at org.apache.flink.shaded.curator.org.apache.curator.framework.recipes.leader.LeaderLatch$1.stateChanged(LeaderLatch.java:82)
at org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager$2.apply(ConnectionStateManager.java:259)
at org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager$2.apply(ConnectionStateManager.java:255)
at org.apache.flink.shaded.curator.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93)
at org.apache.flink.shaded.curator.org.apache.curator.shaded.com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at org.apache.flink.shaded.curator.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85)
at org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager.processEvents(ConnectionStateManager.java:253)
at org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager.access$000(ConnectionStateManager.java:43)
at org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager$1.call(ConnectionStateManager.java:111)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-11-04 18:42:07,962 WARN org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Connection to ZooKeeper suspended. Can no longer retrieve the leader from ZooKeeper.
2020-11-04 18:42:07,962 WARN org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Connection to ZooKeeper suspended. Can no longer retrieve the leader from ZooKeeper.
2020-11-04 18:42:07,962 WARN org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper suspended. The contender akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/resourcemanager no longer participates in the leader election.
2020-11-04 18:42:07,962 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://flinkhost-data-slave-prod-0a021443.ec2.pin220.com:41000 lost leadership
2020-11-04 18:42:07,962 WARN org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Connection to ZooKeeper suspended. Can no longer retrieve the leader from ZooKeeper.
2020-11-04 18:42:08,204 WARN org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '/tmp/jaas-4928991522824363931.conf'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
2020-11-04 18:42:08,204 INFO org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Opening socket connection to server ip-10-12-69-206.ec2.internal/10.12.69.206:2181
2020-11-04 18:42:08,204 ERROR org.apache.flink.shaded.curator.org.apache.curator.ConnectionState - Authentication failed
2020-11-04 18:42:08,207 INFO org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Socket connection established to ip-10-12-69-206.ec2.internal/10.12.69.206:2181, initiating session
2020-11-04 18:42:08,211 INFO org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - Session establishment complete on server ip-10-12-69-206.ec2.internal/10.12.69.206:2181, sessionid = 0x200572cda940ebb, negotiated timeout = 60000
2020-11-04 18:42:08,211 INFO org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager - State change: RECONNECTED
2020-11-04 18:42:08,211 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper was reconnected. Leader election can be restarted.
2020-11-04 18:42:08,211 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper was reconnected. Leader election can be restarted.
2020-11-04 18:42:08,211 INFO org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - ZooKeeper connection RECONNECTED. Changes to the submitted job graphs are monitored again.
2020-11-04 18:42:08,212 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper was reconnected. Leader election can be restarted.
2020-11-04 18:42:08,213 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Connection to ZooKeeper was reconnected. Leader retrieval can be restarted.
2020-11-04 18:42:08,213 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Connection to ZooKeeper was reconnected. Leader election can be restarted.
2020-11-04 18:42:08,213 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Connection to ZooKeeper was reconnected. Leader retrieval can be restarted.
2020-11-04 18:42:08,218 INFO org.apache.flink.yarn.YarnResourceManager - ResourceManager akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/resourcemanager was granted leadership with fencing token 963488347a812f22637c492e3a5249e8
2020-11-04 18:42:08,218 INFO org.apache.flink.runtime.jobmaster.JobManagerRunner - JobManagerRunner already shutdown.
2020-11-04 18:42:08,218 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher akka.tcp://flink@flinkhost-data-slave-prod-0a021443.ec2.pin220.com:35779/user/dispatcher was granted leadership with fencing token 209ab2c2-da88-473a-b42e-cc60ee45cf65
2020-11-04 18:42:08,218 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl - Starting the SlotManager.
2020-11-04 18:42:08,218 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://flinkhost-data-slave-prod-0a021443.ec2.pin220.com:41000 was granted leadership with leaderSessionID=224e1d2a-8da7-4f39-bd22-75ae3c9e1fa3
2020-11-04 18:42:08,218 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Recovering all persisted jobs.
2020-11-04 18:42:08,240 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000209 --> akka.tcp://flink@flinkhost-data-slave-prod-0a021347.ec2.pin220.com:37185/user/taskmanager_0,38625
2020-11-04 18:42:08,250 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000310 --> akka.tcp://flink@flinkhost-data-slave-prod-0a021663.ec2.pin220.com:34113/user/taskmanager_0,44359
2020-11-04 18:42:08,251 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000306 --> akka.tcp://flink@flinkhost-data-slave-prod-0a02136c.ec2.pin220.com:38739/user/taskmanager_0,37285
2020-11-04 18:42:08,252 INFO org.apache.flink.yarn.YarnResourceManager - Registering TaskManager with ResourceID container_e02_1599158147594_30378_01_000209 (akka.tcp://flink@flinkhost-data-slave-prod-0a021347.ec2.pin220.com:37185/user/taskmanager_0) at ResourceManager
2020-11-04 18:42:08,256 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000198 --> akka.tcp://flink@flinkhost-data-slave-prod-0a021f64.ec2.pin220.com:39097/user/taskmanager_0,44669
2020-11-04 18:42:08,256 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000225 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0205a4.ec2.pin220.com:39779/user/taskmanager_0,39649
2020-11-04 18:42:08,257 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000233 --> akka.tcp://flink@flinkhost-data-slave-prod-0a020e46.ec2.pin220.com:43629/user/taskmanager_0,35419
2020-11-04 18:42:08,257 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000213 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0200f7.ec2.pin220.com:39649/user/taskmanager_0,36721
2020-11-04 18:42:08,257 INFO org.apache.flink.yarn.YarnResourceManager - Registering TaskManager with ResourceID container_e02_1599158147594_30378_01_000310 (akka.tcp://flink@flinkhost-data-slave-prod-0a021663.ec2.pin220.com:34113/user/taskmanager_0) at ResourceManager
2020-11-04 18:42:08,258 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000254 --> akka.tcp://flink@flinkhost-data-slave-prod-0a021a4d.ec2.pin220.com:37401/user/taskmanager_0,44997
2020-11-04 18:42:08,258 INFO org.apache.flink.yarn.YarnResourceManager - Registering TaskManager with ResourceID container_e02_1599158147594_30378_01_000306 (akka.tcp://flink@flinkhost-data-slave-prod-0a02136c.ec2.pin220.com:38739/user/taskmanager_0) at ResourceManager
2020-11-04 18:42:08,259 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000231 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0209bb.ec2.pin220.com:38603/user/taskmanager_0,35761
2020-11-04 18:42:08,259 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000315 --> akka.tcp://flink@flinkhost-data-slave-prod-0a020cab.ec2.pin220.com:41879/user/taskmanager_0,38993
2020-11-04 18:42:08,259 INFO org.apache.flink.yarn.YarnResourceManager - Registering TaskManager with ResourceID container_e02_1599158147594_30378_01_000198 (akka.tcp://flink@flinkhost-data-slave-prod-0a021f64.ec2.pin220.com:39097/user/taskmanager_0) at ResourceManager
2020-11-04 18:42:08,259 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000194 --> akka.tcp://flink@flinkhost-data-slave-prod-0a020f3d.ec2.pin220.com:35945/user/taskmanager_0,34625
2020-11-04 18:42:08,260 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000175 --> akka.tcp://flink@flinkhost-data-slave-prod-0a021b15.ec2.pin220.com:35471/user/taskmanager_0,44273
2020-11-04 18:42:08,260 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000252 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0217de.ec2.pin220.com:41557/user/taskmanager_0,39281
2020-11-04 18:42:08,260 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000162 --> akka.tcp://flink@flinkhost-data-slave-prod-0a021c4b.ec2.pin220.com:39237/user/taskmanager_0,45009
2020-11-04 18:42:08,261 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000247 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0203d1.ec2.pin220.com:42657/user/taskmanager_0,39763
2020-11-04 18:42:08,261 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000248 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0218f4.ec2.pin220.com:43453/user/taskmanager_0,40499
2020-11-04 18:42:08,262 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000202 --> akka.tcp://flink@flinkhost-data-slave-prod-0a0216da.ec2.pin220.com:38183/user/taskmanager_0,35915
2020-11-04 18:42:08,262 INFO org.apache.flink.yarn.YarnResourceManager - container info: container_e02_1599158147594_30378_01_000232 --> akka.tcp://flink@flinkhost-data-slave-prod-0a020e2c.ec2.pin220.com:44347/user/taskmanager_0,36463
......
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the results produced by task execution 206501984bb11e0d0ed3d8ce27591aa3.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (27/30) (80b6abde031ae87aeb4c6e6a5822daca) switched from RUNNING to CANCELING.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (27/30) (80b6abde031ae87aeb4c6e6a5822daca) switched from CANCELING to CANCELED.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the results produced by task execution 80b6abde031ae87aeb4c6e6a5822daca.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (28/30) (8e338ff505bbbb9f0c3ea09f7440c167) switched from RUNNING to CANCELING.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (28/30) (8e338ff505bbbb9f0c3ea09f7440c167) switched from CANCELING to CANCELED.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the results produced by task execution 8e338ff505bbbb9f0c3ea09f7440c167.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (29/30) (59071c4c8d6f025885de8ddb162bac8e) switched from RUNNING to CANCELING.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (29/30) (59071c4c8d6f025885de8ddb162bac8e) switched from CANCELING to CANCELED.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the results produced by task execution 59071c4c8d6f025885de8ddb162bac8e.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (30/30) (bec1ab0a6a3e2806d9b22b39d497efe6) switched from RUNNING to CANCELING.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Sink: reporting-sink (30/30) (bec1ab0a6a3e2806d9b22b39d497efe6) switched from CANCELING to CANCELED.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the results produced by task execution bec1ab0a6a3e2806d9b22b39d497efe6.
2020-11-04 18:42:09,350 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job 47d07d9ba88330da5940b96d82c0e5b1.
2020-11-04 18:42:09,351 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Suspending
2020-11-04 18:42:09,365 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter - Shutting down.
2020-11-04 18:42:09,365 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job 47d07d9ba88330da5940b96d82c0e5b1 has been suspended.
2020-11-04 18:42:09,368 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Suspending SlotPool.
2020-11-04 18:42:09,368 INFO org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager connection 8babe17f0e7ef2a320a4f3350794d019: JobManager is no longer the leader..
2020-11-04 18:42:09,368 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job foo-job(47d07d9ba88330da5940b96d82c0e5b1).
2020-11-04 18:42:09,372 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Stopping SlotPool.
2020-11-04 18:42:09,373 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Stopping ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/47d07d9ba88330da5940b96d82c0e5b1/job_manager_lock'}.
2020-11-04 18:42:09,378 INFO org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Released locks of job graph 47d07d9ba88330da5940b96d82c0e5b1 from ZooKeeper.
2020-11-04 18:42:09,417 WARN org.apache.flink.configuration.Configuration - Configuration cannot evaluate value false as a long integer number
2020-11-04 18:42:15,883 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/jobmanager_1 .
2020-11-04 18:42:15,883 INFO org.apache.flink.runtime.jobmaster.JobMaster - Initializing job foo-job (47d07d9ba88330da5940b96d82c0e5b1).
2020-11-04 18:42:15,884 INFO org.apache.flink.runtime.jobmaster.JobMaster - Using restart strategy FixedDelayRestartStrategy(maxNumberRestartAttempts=2147483647, delayBetweenRestartAttempts=10000) for foo-job (47d07d9ba88330da5940b96d82c0e5b1).
2020-11-04 18:42:15,885 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job recovers via failover strategy: full graph restart
2020-11-04 18:42:15,885 INFO org.apache.flink.runtime.jobmaster.JobMaster - Running initialization on master for job foo-job (47d07d9ba88330da5940b96d82c0e5b1).
2020-11-04 18:42:15,885 INFO org.apache.flink.runtime.jobmaster.JobMaster - Successfully ran initialization on master in 0 ms.
2020-11-04 18:42:16,051 INFO org.apache.flink.runtime.util.ZooKeeperUtils - Initialized ZooKeeperCompletedCheckpointStore in '/checkpoints/47d07d9ba88330da5940b96d82c0e5b1'.
2020-11-04 18:42:16,054 INFO org.apache.flink.runtime.jobmaster.JobMaster - Using application-defined state backend: RocksDBStateBackend{checkpointStreamBackend=File State Backend (checkpoints: 's3a://bucket/foo-prod/checkpoints/_entropy_/prod', savepoints: 'null', asynchronous: UNDEFINED, fileStateThreshold: -1), localRocksDbDirectories=null, enableIncrementalCheckpointing=TRUE, numberOfTransferingThreads=-1}
2020-11-04 18:42:16,054 INFO org.apache.flink.runtime.jobmaster.JobMaster - Configuring application-defined state backend with job/cluster config
2020-11-04 18:42:16,054 INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Using predefined options: FLASH_SSD_OPTIMIZED.
2020-11-04 18:42:16,056 INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Using default options factory: DefaultConfigurableOptionsFactory{configuredOptions={}}.
2020-11-04 18:42:43,523 ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler - Unhandled exception.
org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token mismatch: Ignoring message LocalFencedMessage(826fbfb893e4dbce095f5f1b1d75426b, LocalRpcInvocation(requestJob(JobID, Time))) because the fencing token 826fbfb893e4dbce095f5f1b1d75426b did not match the expected fencing token b42ecc60ee45cf65209ab2c2da88473a.
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:81)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2020-11-04 18:43:04,247 ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler - Unhandled exception.
org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token mismatch: Ignoring message LocalFencedMessage(826fbfb893e4dbce095f5f1b1d75426b, LocalRpcInvocation(requestJob(JobID, Time))) because the fencing token 826fbfb893e4dbce095f5f1b1d75426b did not match the expected fencing token b42ecc60ee45cf65209ab2c2da88473a.
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:81)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
......
2020-11-04 18:44:08,291 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000190
2020-11-04 18:44:08,291 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000190 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,295 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000312.
2020-11-04 18:44:08,295 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000312 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,295 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000312
2020-11-04 18:44:08,295 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000158.
2020-11-04 18:44:08,297 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000158 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,297 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000158
2020-11-04 18:44:08,299 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000163.
2020-11-04 18:44:08,299 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000163 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,299 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000163
2020-11-04 18:44:08,302 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000206.
2020-11-04 18:44:08,302 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000206 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,302 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000206
2020-11-04 18:44:08,304 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - Opening proxy : flinkhost-data-slave-prod-0a021324.ec2.pin220.com:8041
2020-11-04 18:44:08,304 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000182.
2020-11-04 18:44:08,304 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000182 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,305 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000182
2020-11-04 18:44:08,307 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000247.
2020-11-04 18:44:08,307 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000247 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,307 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000247
2020-11-04 18:44:08,309 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000227.
2020-11-04 18:44:08,309 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000227 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,309 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000227
2020-11-04 18:44:08,309 INFO org.apache.flink.yarn.YarnResourceManager - Stopping container container_e02_1599158147594_30378_01_000211.
2020-11-04 18:44:08,310 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_e02_1599158147594_30378_01_000211 because: TaskExecutor exceeded the idle timeout.
2020-11-04 18:44:08,311 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Processing Event EventType: STOP_CONTAINER for Container container_e02_1599158147594_30378_01_000211