Shubham Gupta y2k-shubham

🏁
Chasing Checkpoints
View GitHub Profile
@ian-whitestone
ian-whitestone / notes.md
Last active March 1, 2023 01:45
Best practices for presto sql

Presto Specific

  • Don’t use SELECT *; name the columns you need explicitly (the store is columnar, so only the named columns are read)
  • Avoid large JOINs (filter each table first)
    • Presto joins tables in the order they are listed in the query!
    • Join small tables earlier in the plan and leave larger fact tables to the end
    • Avoid cross joins and one-to-many joins, as these can degrade performance
  • ORDER BY and GROUP BY are expensive
    • Use ORDER BY in subqueries only if it is really necessary
  • When using GROUP BY, order the columns from highest cardinality (that is, the largest number of unique values) to lowest.
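
The points above can be sketched in a single query. This is only an illustration: the tables `dim_country` (small) and `fact_orders` (large) and all of their columns are hypothetical. The table order follows the note above about listing smaller tables first, and is worth confirming with EXPLAIN on your own cluster.

```sql
-- Hypothetical tables: dim_country (small dimension), fact_orders (large fact).
SELECT
    o.user_id,
    c.country_name,
    SUM(o.amount) AS total_amount          -- explicit columns, no SELECT *
FROM (
    -- Filter the small table before joining.
    SELECT country_code, country_name
    FROM dim_country
    WHERE region = 'EMEA'
) AS c
JOIN (
    -- Filter the large table before joining, and read only the needed columns.
    SELECT user_id, country_code, amount
    FROM fact_orders
    WHERE order_date >= DATE '2023-01-01'
) AS o
    ON o.country_code = c.country_code
GROUP BY
    o.user_id,        -- assumed highest cardinality, listed first
    c.country_name;   -- assumed lower cardinality, listed last
```

There is no ORDER BY here on purpose; sorting can be added at the outermost level if the consumer really needs ordered output.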
@giwa
giwa / file0.txt
Last active March 27, 2020 11:31
Install hive on Mac with Homebrew ref: http://qiita.com/giwa/items/dabf0bb21ae242532423
$ brew update
$ brew install hive
@hrwgc
hrwgc / aws-cli-s3cmd-du.sh
Last active June 19, 2023 15:32
aws-cli get total size of all objects within s3 prefix. (mimic behavior of `s3cmd du` with aws-cli)
#!/bin/bash
function s3du(){
# Split s3://bucket/prefix/... into bucket name and key prefix
bucket=$(cut -d/ -f3 <<< "$1")
prefix=$(awk -F/ '{for (i=4; i<NF; i++) printf "%s/", $i; print $NF}' <<< "$1")
# Sum object sizes and count objects under the prefix
aws s3api list-objects --bucket "$bucket" --prefix "$prefix" --output json --query '[sum(Contents[].Size), length(Contents[])]' | jq '. |{ size:.[0],num_objects: .[1]}'
}
s3du "$1";
@sebsto
sebsto / gist:19b99f1fa1f32cae5d00
Created August 8, 2014 15:53
Install Maven with Yum on Amazon Linux
sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
sudo yum install -y apache-maven
mvn --version
@ysaotome
ysaotome / install_pyenv.sh
Last active August 7, 2021 13:27
pyenv install for CentOS 6.5 x86_64
#!/bin/zsh
# pyenv install for CentOS 6.5 x86_64
yum install -y gcc gcc-c++ make git patch openssl-devel zlib-devel readline-devel sqlite-devel bzip2-devel
git clone https://github.com/yyuu/pyenv.git ~/.pyenv
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init -)"
@jiewmeng
jiewmeng / 1-batch.c
Created September 26, 2012 10:24
Parallel Computing - Assignment 1 - Levels & Size of Caches
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define KB 1024
#define MB (1024 * 1024)
int main() {
unsigned int steps = 256 * 1024 * 1024;
static int arr[4 * 1024 * 1024];