@rampage644
Last active March 21, 2019 15:07
Impala + HDP

Downloads

HDP sandbox

Installation

yum-config-manager --add-repo http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
yum install impala-server impala-catalog impala-state-store impala-shell
ln -sf /usr/lib/hbase/lib/hbase-client.jar /usr/lib/impala/lib
ln -sf /usr/lib/hbase/lib/hbase-common.jar /usr/lib/impala/lib
ln -sf /usr/lib/hbase/lib/hbase-protocol.jar /usr/lib/impala/lib
echo export JAVA_HOME=/usr/jdk64/jdk1.7.0_45 >> /etc/default/bigtop-utils
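
The echo line above appends to /etc/default/bigtop-utils every time it is run, so repeating the installation leaves duplicate export lines. A minimal sketch of an idempotent variant follows; the helper name append_once is hypothetical and not part of any package.

```shell
#!/bin/sh
# append_once LINE FILE: append LINE to FILE only if FILE does not
# already contain exactly that line. (Hypothetical helper name.)
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example, run as root (path and JDK version are taken from the step above):
# append_once 'export JAVA_HOME=/usr/jdk64/jdk1.7.0_45' /etc/default/bigtop-utils
```

Running the example twice leaves a single export line in the file.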

Manage Impala daemons

for i in server state-store catalog ; do service "impala-$i" start ; done
for i in server state-store catalog ; do service "impala-$i" status ; done
for i in server state-store catalog ; do service "impala-$i" stop ; done
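
The three loops above can be folded into one helper. The sketch below also orders the services the way the Impala documentation suggests (statestore first, then catalog, then the impalad server, and the reverse when stopping); the loops above work as well, since the daemons retry their connections. The function name impala_ctl and the SERVICE_CMD variable are hypothetical; SERVICE_CMD exists only so the helper can be dry-run with echo.

```shell
#!/bin/sh
# impala_ctl ACTION: run "service impala-<name> ACTION" for all three daemons.
# SERVICE_CMD is a hypothetical override for dry runs; it defaults to `service`.
impala_ctl() {
    action=$1
    case $action in
        start|status) order="state-store catalog server" ;;
        stop)         order="server catalog state-store" ;;
        *) echo "usage: impala_ctl start|stop|status" >&2; return 2 ;;
    esac
    rc=0
    for i in $order; do
        # Keep going on failure so every daemon is reported, but remember it.
        ${SERVICE_CMD:-service} "impala-$i" "$action" || rc=1
    done
    return $rc
}

# Dry run:  SERVICE_CMD=echo impala_ctl start
# Real run: impala_ctl start
```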

Configuration

IMPORTANT! Impala looks for configuration files in directories found in $CLASSPATH.

Add the following to /etc/hadoop/conf/core-site.xml:

<property>
	<name>dfs.client.read.shortcircuit</name>
	<value>true</value>
</property>

<property>
	<name>dfs.client.read.shortcircuit.skip.checksum</name>
	<value>false</value>
</property>

<property>
	<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
	<value>true</value>
</property>

Add the following to /etc/hadoop/conf/hdfs-site.xml:

<property>
	<name>dfs.datanode.hdfs-blocks-metadata.enabled</name> 
	<value>true</value>
</property>
<property> 
	<name>dfs.block.local-path-access.user</name> 
	<value>impala</value>
</property>
<property>
	<name>dfs.client.file-block-storage-locations.timeout.millis</name>
	<value>60000</value>
</property>
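
After editing both files, it can help to confirm that the values are actually picked up. The sketch below uses hdfs getconf -confKey, which prints the value the HDFS client resolves from its configuration directory; it shows what Hadoop sees, not what Impala has loaded. The function names check_key and check_all and the HDFS_CMD variable are hypothetical; HDFS_CMD exists only for dry runs.

```shell
#!/bin/sh
# check_key KEY EXPECTED: compare the value the HDFS client resolves for KEY
# with EXPECTED. HDFS_CMD is a hypothetical override; it defaults to `hdfs`.
check_key() {
    key=$1
    expected=$2
    actual=$(${HDFS_CMD:-hdfs} getconf -confKey "$key" 2>/dev/null)
    if [ "$actual" = "$expected" ]; then
        echo "ok   $key=$actual"
    else
        echo "FAIL $key=${actual:-<unset>} (expected $expected)"
        return 1
    fi
}

# check_all: verify every property listed in the two snippets above.
check_all() {
    rc=0
    check_key dfs.client.read.shortcircuit true || rc=1
    check_key dfs.client.read.shortcircuit.skip.checksum false || rc=1
    check_key dfs.datanode.hdfs-blocks-metadata.enabled true || rc=1
    check_key dfs.block.local-path-access.user impala || rc=1
    check_key dfs.client.file-block-storage-locations.timeout.millis 60000 || rc=1
    return $rc
}

# Usage: check_all
```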

Copy the configuration files to the Impala configuration directory:

cp /etc/hadoop/conf/*.xml /etc/impala/conf
cp /etc/hive/conf/hive-site.xml /etc/impala/conf
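
Because these are copies rather than links, the files in /etc/impala/conf go stale whenever the Hadoop configuration changes. A small sketch for detecting that, assuming plain cmp is available; the function name conf_in_sync is hypothetical.

```shell
#!/bin/sh
# conf_in_sync SRC DST: succeed only if every *.xml file in SRC has a
# byte-identical copy in DST. (Hypothetical helper name.)
conf_in_sync() {
    src=$1
    dst=$2
    rc=0
    for f in "$src"/*.xml; do
        [ -e "$f" ] || continue        # SRC contains no *.xml files
        name=${f##*/}                  # strip the directory part
        # cmp -s is silent and fails if the files differ or DST copy is missing
        if ! cmp -s "$f" "$dst/$name"; then
            echo "out of date: $dst/$name" >&2
            rc=1
        fi
    done
    return $rc
}

# Usage: conf_in_sync /etc/hadoop/conf /etc/impala/conf || echo "re-run the cp step"
```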

Fix permissions so that Impala can access the HDFS directory:

chmod a+rx /var/lib/hadoop-hdfs

Restart Hadoop and Impala.

Troubleshooting

If something goes wrong, check the logs first:

  1. /var/log/impala/impala-server.log
  2. /var/log/impala/impala-state-store.log
  3. /var/log/impala/impala-catalog.log
  4. /var/log/impala/impalad.ERROR
  5. /var/log/impala/catalogd.ERROR
  6. /var/log/impala/statestored.ERROR
  7. /var/log/hadoop/hdfs/*
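
To look at the *.ERROR files from the list above in one go, something like the following sketch may be enough. The function name tail_error_logs is hypothetical; the default of 20 lines is an arbitrary choice.

```shell
#!/bin/sh
# tail_error_logs DIR [N]: print the last N lines (default 20) of every
# non-empty *.ERROR file in DIR, each preceded by a header with its path.
tail_error_logs() {
    dir=$1
    n=${2:-20}
    for f in "$dir"/*.ERROR; do
        [ -s "$f" ] || continue        # skip missing or empty files
        printf '==> %s <==\n' "$f"
        tail -n "$n" "$f"
    done
}

# Usage: tail_error_logs /var/log/impala
```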

Empty queries

If queries unexpectedly return empty results, try running invalidate metadata; in impala-shell.
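
The same statement can be issued non-interactively with the -q option of impala-shell, which is convenient after loading data through Hive. A minimal sketch; the function name invalidate_metadata and the IMPALA_SHELL variable are hypothetical, and IMPALA_SHELL exists only for dry runs.

```shell
#!/bin/sh
# invalidate_metadata [TABLE]: reload metadata for TABLE, or for all tables
# when no argument is given.
# IMPALA_SHELL is a hypothetical override; it defaults to `impala-shell`.
invalidate_metadata() {
    # Append " TABLE" only when an argument was passed.
    query="invalidate metadata${1:+ $1}"
    ${IMPALA_SHELL:-impala-shell} -q "$query"
}

# Usage:
#   invalidate_metadata            # all tables
#   invalidate_metadata my_table   # a single (hypothetical) table
```

Invalidating a single table is cheaper than invalidating everything when the catalog is large.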

@daranil

daranil commented Nov 14, 2016

I am facing the same issue as @prakash12

Is anyone able to resolve the issue?

@rachmaninovquartet

I too am facing the same issue as @prakash12
Any updates?

@rachmaninovquartet

Seeing this, in /var/log/impala/statestored.ERROR:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::thread_resource_error> >'
what(): boost::thread_resource_error: Resource temporarily unavailable
