Skip to content

Instantly share code, notes, and snippets.

View ranqiqiang's full-sized avatar
🤒
Out sick

ranqiqiang

🤒
Out sick
View GitHub Profile
@epiphani
epiphani / CDHTez.md
Last active February 14, 2024 08:03
Getting Tez enabled on CDH5.4+

So Hive in CDH is horribly, painfully slow. Cloudera ships Hive 1.1, which is actually moderately modern. It is, however, very badly configured out of the box and patched with custom code from Cloudera. With a bit of effort, we managed to improve hive performance considerably. We really shouldn't have to do this, but Cloudera is actively working against supporting a performant Hive.

First, building Tez was fairly straightforward. Using the instructions at https://github.com/apache/tez/blob/master/docs/src/site/markdown/install.md, the only change was to use the version string "2.6.0" for the build. I believe that was the default. Don't use the CDH string, it won't work.

At the bottom of the installation instructions, there's mention of the fact that to use the local hadoop jars (rather than those packaged with tez) you must unpack the jars in HDFS rather than using the tarball. In this case, unpack the tez-minimal tarball and upload the contents to /apps/tez-0.7.0 (or whatever you prefer). Don't fo

@rednaxelafx
rednaxelafx / DirectMemorySize.java
Created January 11, 2012 07:18
An Serviceability-Agent based tool to see stats of NIO direct memory, as an alternative on JDK6 without JMX support for direct memory monitoring. Only works on JDK6; to work on JDK7 will need some tweaking because static variables are moved to Java mirror
import java.io.*;
import java.util.*;
import sun.jvm.hotspot.memory.*;
import sun.jvm.hotspot.oops.*;
import sun.jvm.hotspot.debugger.*;
import sun.jvm.hotspot.runtime.*;
import sun.jvm.hotspot.tools.*;
import sun.jvm.hotspot.utilities.*;
public class DirectMemorySize extends Tool {