@prasanthj
prasanthj / s3-get-speed-private.sh
Last active March 1, 2019 17:17
S3 GET Speed Private Bucket
#!/bin/bash
set -e
# Fail early if either credential is unset or empty.
: "${AWS_ACCESS_KEY_ID:?AWS_ACCESS_KEY_ID should be set in script or exported}"
: "${AWS_SECRET_ACCESS_KEY:?AWS_SECRET_ACCESS_KEY should be set in script or exported}"
if [[ $# -eq 0 ]]; then
  # A missing argument is a usage error: report on stderr and exit non-zero.
  echo 'S3 object URL expected as argument. Usage: ./s3-get-speed-private.sh <s3-private-object-uri>' >&2
  exit 1
fi
@prasanthj
prasanthj / orc-file-dump-total-row-count.txt
Created February 21, 2019 03:06
Total row count from orc file dumps
hive --orcfiledump <orc-table-path> | grep "Rows:" | cut -f2 -d":" | awk '{s+=$1}END{print s}'
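The summing part of the pipeline above can be tried without Hive. This is a minimal sketch: `sum_rows` is a hypothetical helper name, and the sample input is made up to resemble `Rows:` lines rather than taken from a real dump.

```shell
#!/bin/bash
# Hypothetical helper: sum the number after "Rows:" on every matching stdin line.
# Same grep | cut | awk chain as the one-liner, minus the hive call.
sum_rows() {
  grep "Rows:" | cut -f2 -d":" | awk '{s+=$1} END {print s+0}'
}

# Made-up sample input standing in for orcfiledump output.
printf 'File Version: 0.12\nRows: 10\nRows: 32\n' | sum_rows
```

Here `s+0` makes awk print `0` instead of an empty line when no row counts were added. With real data the function would be fed by `hive --orcfiledump <orc-table-path> | sum_rows`.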
@epiphani
epiphani / CDHTez.md
Last active February 14, 2024 08:03
Getting Tez enabled on CDH5.4+

So Hive in CDH is horribly, painfully slow. Cloudera ships Hive 1.1, which is actually moderately modern. It is, however, very badly configured out of the box and patched with custom code from Cloudera. With a bit of effort, we managed to improve Hive performance considerably. We really shouldn't have to do this, but Cloudera is actively working against supporting a performant Hive.

First, building Tez was fairly straightforward. We followed the instructions at https://github.com/apache/tez/blob/master/docs/src/site/markdown/install.md, and the only change was to use the Hadoop version string "2.6.0" for the build. I believe that was the default. Don't use the CDH version string; it won't work.
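A minimal sketch of that build step, assuming the Hadoop version is passed as the Maven property `hadoop.version` on the command line (the Tez instructions describe setting it in the top-level `pom.xml`; overriding it with `-D` is an assumption here). `tez_build_cmd` and `RUN_BUILD` are hypothetical names. By default the script only prints the command.

```shell
#!/bin/bash
# Stock Hadoop version to build against; the notes say not to use a CDH string.
HADOOP_VERSION="${HADOOP_VERSION:-2.6.0}"

# Hypothetical helper: print the Maven command for a given Hadoop version,
# refusing anything that looks like a CDH version string.
tez_build_cmd() {
  local version="$1"
  if [[ "$version" == *cdh* ]]; then
    echo "refusing CDH version string: $version" >&2
    return 1
  fi
  echo "mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Dhadoop.version=$version"
}

# Dry run by default; set RUN_BUILD=1 inside a Tez source checkout to build.
if cmd="$(tez_build_cmd "$HADOOP_VERSION")"; then
  echo "$cmd"
  if [[ "${RUN_BUILD:-0}" == "1" ]]; then
    eval "$cmd"
  fi
fi
```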

The bottom of the installation instructions mentions that to use the local Hadoop jars (rather than those packaged with Tez), you must unpack the jars in HDFS rather than using the tarball. In this case, unpack the tez-minimal tarball and upload the contents to /apps/tez-0.7.0 (or whatever you prefer). Don't fo
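The unpack-and-upload step could look roughly like the sketch below. The tarball file name, the local work directory, and the helper `tez_upload_cmds` are assumptions for illustration; only the HDFS target `/apps/tez-0.7.0` comes from the notes. The script prints the commands instead of running them, so it is safe to try on a machine without Hadoop.

```shell
#!/bin/bash
# Assumed names; adjust to match your build output and cluster layout.
TEZ_TARBALL="${TEZ_TARBALL:-tez-0.7.0-minimal.tar.gz}"
WORK_DIR="${WORK_DIR:-tez-minimal}"
HDFS_DIR="${HDFS_DIR:-/apps/tez-0.7.0}"

# Hypothetical helper: print the commands that unpack the minimal tarball
# locally and copy its contents (not the tarball itself) into HDFS.
tez_upload_cmds() {
  local tarball="$1" work_dir="$2" hdfs_dir="$3"
  echo "mkdir -p $work_dir"
  echo "tar -xzf $tarball -C $work_dir"
  echo "hadoop fs -mkdir -p $hdfs_dir"
  echo "hadoop fs -put $work_dir/* $hdfs_dir/"
}

tez_upload_cmds "$TEZ_TARBALL" "$WORK_DIR" "$HDFS_DIR"
```

Piping the output to `bash` would execute the commands once they have been reviewed.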

@bunkat
bunkat / index.html
Last active January 10, 2024 18:18
Swimlane Chart using d3.js
<!--
The MIT License (MIT)
Copyright (c) 2013 bill@bunkat.com
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
Index: apc_bin.c
===================================================================
--- apc_bin.c 2010-11-30 11:18:31.000000000 +0100
+++ apc_bin.c 2010-12-30 17:11:03.000000000 +0100
@@ -412,12 +412,6 @@
if((*bp_prev)->pListLast) {
apc_swizzle_ptr(bd, ll, &(*bp_prev)->pListLast);
}
- if((*bp_prev)->pNext) {
- apc_swizzle_ptr(bd, ll, &(*bp_prev)->pNext);
Index: Zend/zend_hash.c
===================================================================
--- Zend/zend_hash.c 2010-12-30 17:09:14.000000000 +0100
+++ Zend/zend_hash.c 2010-12-31 01:45:14.000000000 +0100
@@ -5,7 +5,7 @@
| Copyright (c) 1998-2010 Zend Technologies Ltd. (http://www.zend.com) |
+----------------------------------------------------------------------+
| This source file is subject to version 2.00 of the Zend license, |
- | that is bundled with this package in the file LICENSE, and is |
+ | that is bundled with this package in the file LICENSE, and is |