Skip to content

Instantly share code, notes, and snippets.

@debraj-manna
Created February 24, 2021 16:38
Show Gist options
  • Save debraj-manna/3732616fe78db439c0d6453454e8e02b to your computer and use it in GitHub Desktop.
Save debraj-manna/3732616fe78db439c0d6453454e8e02b to your computer and use it in GitHub Desktop.
Node Manager Logs
2021-02-24 16:36:34,190 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1614159836384_0047_000001 (auth:SIMPLE)
2021-02-24 16:36:34,201 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1614159836384_0047_01_000001 by user ubuntu
2021-02-24 16:36:34,203 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1614159836384_0047
2021-02-24 16:36:34,203 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1614159836384_0047 transitioned from NEW to INITING
2021-02-24 16:36:34,204 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1614159836384_0047 transitioned from INITING to RUNNING
2021-02-24 16:36:34,204 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Adding container_1614159836384_0047_01_000001 to application application_1614159836384_0047
2021-02-24 16:36:34,204 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu IP=127.0.0.1 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1614159836384_0047 CONTAINERID=container_1614159836384_0047_01_000001
2021-02-24 16:36:34,219 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1614159836384_0047_01_000001 transitioned from NEW to LOCALIZING
2021-02-24 16:36:34,219 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1614159836384_0047
2021-02-24 16:36:34,220 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1614159836384_0047_01_000001
2021-02-24 16:36:34,222 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1614159836384_0047_01_000001.tokens
2021-02-24 16:36:34,222 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user ubuntu
2021-02-24 16:36:34,224 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1614159836384_0047_01_000001.tokens to /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1614159836384_0047/container_1614159836384_0047_01_000001.tokens
2021-02-24 16:36:34,224 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1614159836384_0047 = file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1614159836384_0047
2021-02-24 16:36:34,247 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer: Disk Validator: yarn.nodemanager.disk-validator is loaded.
2021-02-24 16:36:34,268 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { file:/home/ubuntu/.flink/application_1614159836384_0047/flink-dist_2.12-1.12.1.jar, 1614184593000, FILE, null } failed: File file:/home/ubuntu/.flink/application_1614159836384_0047/flink-dist_2.12-1.12.1.jar does not exist
java.io.FileNotFoundException: File file:/home/ubuntu/.flink/application_1614159836384_0047/flink-dist_2.12-1.12.1.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:223)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2021-02-24 16:36:34,271 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1614159836384_0047_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2021-02-24 16:36:34,271 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1614159836384_0047_01_000001 sent RELEASE event on a resource request { file:/home/ubuntu/.flink/application_1614159836384_0047/flink-dist_2.12-1.12.1.jar, 1614184593000, FILE, null } not present in cache.
2021-02-24 16:36:34,271 WARN org.apache.hadoop.ipc.Client: interrupted waiting to send rpc request to server
java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1141)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy75.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:63)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:324)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:197)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:179)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1209)
2021-02-24 16:36:34,274 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1614159836384_0047 CONTAINERID=container_1614159836384_0047_01_000001
"/var/log/hadoop-yarn/hadoop-yarn-nodemanager-vrni-platform.log" [readonly] 105L, 13274C 1,1 Top
java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1141)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy75.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:63)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:324)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:197)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:179)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1209)
2021-02-24 16:36:34,274 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: LOCALIZATION_FAILED APPID=application_1614159836384_0047 CONTAINERID=container_1614159836384_0047_01_000001
2021-02-24 16:36:34,275 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Unknown localizer with localizerId container_1614159836384_0047_01_000001 is sending heartbeat. Ordering it to DIE
2021-02-24 16:36:34,276 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed for container_1614159836384_0047_01_000001
java.io.IOException: java.io.IOException: java.lang.InterruptedException
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:199)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:179)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1209)
Caused by: java.io.IOException: java.lang.InterruptedException
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy75.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:63)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:324)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:197)
... 2 more
Caused by: java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1141)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
... 9 more
2021-02-24 16:36:34,276 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher thread interrupted
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1223)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:265)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
2021-02-24 16:36:34,276 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1614159836384_0047_01_000001 transitioned from LOCALIZATION_FAILED to DONE
2021-02-24 16:36:34,277 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[LocalizerRunner for container_1614159836384_0047_01_000001,5,main] threw an Exception.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:273)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1233)
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1223)
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:265)
... 1 more
2021-02-24 16:36:34,279 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1614159836384_0047_01_000001 from application application_1614159836384_0047
2021-02-24 16:36:34,282 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1614159836384_0047_01_000001
2021-02-24 16:36:34,282 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1614159836384_0047
2021-02-24 16:36:35,286 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1614159836384_0047_01_000001]
2021-02-24 16:36:35,286 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1614159836384_0047 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2021-02-24 16:36:35,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1614159836384_0047
2021-02-24 16:36:35,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1614159836384_0047 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2021-02-24 16:36:35,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler: Scheduling Log Deletion for application: application_1614159836384_0047, with delay of 604800 seconds
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment