@tagomoris
Created June 5, 2013 04:30
I checked that:
In /etc/init.d/hive-server of CDH4.1.2 & CDH4.2.0
* specified 'LOG_FILE="/var/log/hive/${NAME}.log"'
* same filename specified as 'HADOOP_OPTS=\"-Dhive.log.dir=`dirname $LOG_FILE` -Dhive.log.file=${DAEMON}.log'
So all hive-server logs in the 'hive-server.log' file are truncated every time hive-server starts.
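The init script's redirection line is not quoted here, so the following is only a minimal sketch of the suspected mechanism. It assumes the script sends the daemon's stdout/stderr to `$LOG_FILE` with a truncating `>` redirect. The temporary directory and the fake "daemon" (`sh -c 'echo ...'`) are hypothetical stand-ins for /var/log/hive and hive-server.

```shell
#!/bin/sh
# Hypothetical demo: a temp dir stands in for /var/log/hive.
LOG_DIR=$(mktemp -d)
LOG_FILE="$LOG_DIR/hive-server.log"

# Pretend a previous run left valuable log lines behind.
printf 'previous run: line 1\nprevious run: line 2\n' > "$LOG_FILE"

# Assumed behaviour of the old init script: '>' empties the file
# before the new process writes anything.
sh -c 'echo "new run: started"' > "$LOG_FILE" 2>&1
TRUNCATED_LINES=$(wc -l < "$LOG_FILE" | tr -d ' ')

# Same scenario with '>>': the earlier lines survive the restart.
printf 'previous run: line 1\nprevious run: line 2\n' > "$LOG_FILE"
sh -c 'echo "new run: started"' >> "$LOG_FILE" 2>&1
APPENDED_LINES=$(wc -l < "$LOG_FILE" | tr -d ' ')

echo "lines after '>'  redirect: $TRUNCATED_LINES"
echo "lines after '>>' redirect: $APPENDED_LINES"
```

If the init script and the logging framework both point at the same file, any truncating redirect at startup would discard the previous run's normal logs in the way described above.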
In etc/rc.d/init.d of CDH4.3.0 (tree extracted from the rpm)
* specified 'LOG_FILE=/var/log/hive/${DAEMON}.out'
* specified 'HADOOP_OPTS=\"-Dhive.log.dir=`dirname $LOG_FILE` -Dhive.log.file=${DAEMON}.log'
So stdout/stderr of the server daemon is redirected to the '.out' file, which is separate from the file that holds hive-server's normal logs.
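To make the two destinations concrete, here is a small sketch that evaluates the CDH4.3.0 assignments quoted above. It assumes `DAEMON` is set to `hive-server` (the value is not shown in the quoted lines), and the helper variables `HIVE_LOG_DIR` and `HIVE_LOG_FILE` are introduced only for illustration.

```shell
#!/bin/sh
# Assumption: DAEMON=hive-server (not shown in the quoted init script lines).
DAEMON="hive-server"

# stdout/stderr of the daemon goes here.
LOG_FILE="/var/log/hive/${DAEMON}.out"

# The logging properties reuse only the directory of LOG_FILE,
# and name a different file for the normal logs.
HIVE_LOG_DIR=$(dirname "$LOG_FILE")
HIVE_LOG_FILE="${DAEMON}.log"
HADOOP_OPTS="-Dhive.log.dir=${HIVE_LOG_DIR} -Dhive.log.file=${HIVE_LOG_FILE}"

echo "stdout/stderr : $LOG_FILE"
echo "normal logs   : ${HIVE_LOG_DIR}/${HIVE_LOG_FILE}"
echo "HADOOP_OPTS   : $HADOOP_OPTS"
```

With these values the two paths differ only in their suffix ('.out' versus '.log'), so truncating the '.out' file no longer touches the normal log.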
(I could not find any CDH4.2.1 package that includes the hive-server init script. Where can we get one?)
Conclusion:
* In CDH4.2.0 or earlier, the hive-server init script truncates valuable logs such as 'hive-server.log'.
* In CDH4.3.0 (or later, or perhaps CDH4.2.1 or later?), hive-server writes its normal logs to 'hive-server.log' and its abnormal output (stdout/stderr, nearly empty?) to the '.out' file, and the normal log 'hive-server.log' is no longer truncated.
OK, I understand now that the problem I found 4 months ago is solved in CDH4.3.0.
Finally:
bq. Since these messages are only relevant to the last execution, the convention in CDH, Apache Bigtop, and other projects is to truncate them.
This convention is really wrong. Scripts for start-stop-daemon MUST NOT delete any logs on the user's system, even if those logs are empty in normal situations. These log files are exactly what is needed in abnormal situations, and if they are truncated on restart, no means of investigation remains.
If you want the current '.out' to contain only the logs relevant to the last execution, you should ship a logrotate configuration file with the distribution and rotate the logs on restart.
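As a rough illustration of "rotate instead of truncate", the sketch below moves the old '.out' file aside before a new run starts. It is a hand-rolled stand-in, not logrotate itself: the function `rotate_out_file`, its `keep` parameter, and the temporary paths are all hypothetical. A real package would more likely install a logrotate configuration file and have the init script run `logrotate -f` on it before starting the daemon.

```shell
#!/bin/sh
# rotate_out_file FILE [KEEP]
# Hypothetical helper: shifts FILE.1 -> FILE.2 ... up to FILE.KEEP,
# then moves FILE to FILE.1, so the next start writes to a fresh file
# without destroying the previous run's output.
rotate_out_file() {
    out_file=$1
    keep=${2:-5}
    # Nothing to preserve if the file is missing or empty.
    [ -s "$out_file" ] || return 0
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$out_file.$prev" ]; then
            mv -f "$out_file.$prev" "$out_file.$i"
        fi
        i=$prev
    done
    mv -f "$out_file" "$out_file.1"
}

# Demo in a temp dir standing in for /var/log/hive.
WORK_DIR=$(mktemp -d)
OUT_FILE="$WORK_DIR/hive-server.out"

echo "run 1: stack trace from a crash" > "$OUT_FILE"
rotate_out_file "$OUT_FILE" 3          # restart: keep run 1 as .1
echo "run 2: clean start" > "$OUT_FILE"
rotate_out_file "$OUT_FILE" 3          # restart: run 1 -> .2, run 2 -> .1
echo "run 3: clean start" > "$OUT_FILE"

ls "$WORK_DIR"
```

The current '.out' still contains only the last execution, while the crash output from earlier runs stays available for investigation until it ages out past the `keep` limit.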
oza commented Jun 5, 2013
