Ruey-Cheng Chen rueycheng

@ian-whitestone
ian-whitestone / notes.md
Last active March 1, 2023 01:45
Best practices for presto sql

Presto Specific

  • Don't use SELECT *; specify explicit column names (the store is columnar, so only the listed columns are read)
  • Avoid large JOINs (filter each table first)
    • In Presto, tables are joined in the order they are listed!
    • Join small tables earlier in the plan and leave larger fact tables to the end
    • Avoid cross joins and one-to-many joins, as these can degrade performance
  • ORDER BY and GROUP BY take time
    • Use ORDER BY in subqueries only if it is really necessary
  • When using GROUP BY, order the columns from highest cardinality (that is, the largest number of unique values) to lowest.
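
A minimal sketch of a query that follows some of these points: explicit columns, each table filtered before the join, and GROUP BY columns listed from higher to lower cardinality. The tables (`orders`, `users`) and all column names are hypothetical, and `user_id` is only assumed to have more distinct values than `country`. The example does not try to illustrate join order. The query is kept in a shell function so it can be handed to whichever Presto client is in use.

```shell
# Print an example query. All table and column names are made up for illustration.
presto_query() {
    cat <<'SQL'
SELECT o.user_id, u.country, count(*) AS order_count
FROM (
    -- filter first, and select only the columns that are needed
    SELECT user_id
    FROM orders
    WHERE order_date >= DATE '2023-01-01'
) o
JOIN (
    SELECT user_id, country
    FROM users
    WHERE is_active
) u ON o.user_id = u.user_id
-- user_id is assumed to have higher cardinality than country
GROUP BY o.user_id, u.country
SQL
}

presto_query
```

If the Presto CLI is installed, something like `presto --execute "$(presto_query)"` should run it, assuming the server, catalog, and schema are already configured.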
@ryderdamen
ryderdamen / gce-to-gcs-uploads.md
Created December 4, 2018 15:43
Uploading Files from Google Compute Engine (GCE) VMs to Google Cloud Storage (GCS)

I had a bit of trouble configuring permissions to upload files from my Google Compute Engine instance to my Google Cloud Storage bucket. The process isn't as intuitive as you might think: a few permissions need to be configured before uploads will work. Here are the steps I took to get things working.

Let's say you want to upload yourfile.txt to a GCS bucket from your virtual machine. You can use the gsutil command line tool that comes installed on all GCE instances.

If you've never used the gcloud or gsutil command line tools on this machine before, you will need to initialize them with a service account.
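
As a rough sketch of the two steps just described, the commands might look like the following. The key file name and bucket name are placeholders, not values from these notes, and the `run` helper with its `DRY_RUN` switch is only an illustration device so the commands can be inspected before anything is executed.

```shell
# Placeholder values: replace with your own key file and bucket.
KEY_FILE="${KEY_FILE:-service-account-key.json}"
BUCKET="${BUCKET:-your-bucket-name}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Authenticate as the service account, then copy one file to the bucket.
upload_to_gcs() {
    run gcloud auth activate-service-account --key-file="$KEY_FILE" &&
    run gsutil cp "$1" "gs://$BUCKET/"
}

upload_to_gcs yourfile.txt
```

Setting `DRY_RUN=0` runs the real commands. The upload can still fail if the service account or the VM lacks write permission on the bucket.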

@1duo
1duo / centos.install.boost.md
Last active April 4, 2024 18:39
Install Boost library from source on CentOS 7.

Download the Boost library from http://www.boost.org (choose the version you need):

wget https://cfhcable.dl.sourceforge.net/project/boost/boost/1.54.0/boost_1_54_0.tar.gz
wget https://phoenixnap.dl.sourceforge.net/project/boost/boost/1.58.0/boost_1_58_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.64.0/source/boost_1_64_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.65.1/source/boost_1_65_1.tar.gz
wget https://dl.bintray.com/boostorg/release/1.67.0/source/boost_1_67_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.68.0/source/boost_1_68_0.tar.gz
wget https://dl.bintray.com/boostorg/release/1.69.0/source/boost_1_69_0.tar.gz
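
The excerpt stops at the download step. A hedged sketch of what usually follows for a Boost source build is below: unpack, run `bootstrap.sh`, then `b2 install`. The helper names and the default install prefix are assumptions for illustration, not taken from these notes.

```shell
# Map a Boost version such as 1.69.0 to the name used in the tarballs above.
boost_dirname() {
    printf 'boost_%s\n' "$(printf '%s' "$1" | tr . _)"
}

# Unpack and build a tarball that has already been downloaded.
# $1: version, $2: install prefix (hypothetical default).
build_boost() {
    version=$1
    prefix=${2:-/usr/local}
    dir=$(boost_dirname "$version")
    tar -xzf "$dir.tar.gz" &&
    cd "$dir" &&
    ./bootstrap.sh --prefix="$prefix" &&
    ./b2 install
}

boost_dirname 1.69.0   # prints boost_1_69_0
```

With a tarball in the current directory, `build_boost 1.69.0 /opt/boost` would build and install it. Installing into a system prefix typically needs root.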
@takeit
takeit / INSTALL.md
Last active March 23, 2024 19:03
Write to NTFS on macOS Sierra (osxfuse + ntfs-3g)
  1. Install osxfuse:
brew cask install osxfuse
  2. Reboot your Mac.

  3. Install ntfs-3g:

@subfuzion
subfuzion / curl.md
Last active July 18, 2024 17:12
curl POST examples

Common Options

-#, --progress-bar Make curl display a simple progress bar instead of the more informational standard meter.

-b, --cookie <name=data> Supply cookie with request. If no =, then specifies the cookie file to use (see -c).

-c, --cookie-jar <file name> File to save response cookies to.
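
A small sketch of how these three options can combine in a login-then-POST flow. The URLs, form field names, and cookie file name are hypothetical, and the `CURL` variable exists only so the commands can be swapped out for a dry run.

```shell
CURL="${CURL:-curl}"                      # overridable, e.g. for a dry run
COOKIE_JAR="${COOKIE_JAR:-cookies.txt}"   # hypothetical file name

# POST login form data and save the response cookies with -c.
# $1: base URL, $2: user, $3: password
login() {
    "$CURL" -# -c "$COOKIE_JAR" -d "user=$2" -d "password=$3" "$1/login"
}

# POST data and send the saved cookies back with -b.
# The argument has no '=', so curl treats it as a cookie file.
# $1: URL, $2: form data
post_with_cookies() {
    "$CURL" -# -b "$COOKIE_JAR" -d "$2" "$1"
}
```

For example, `login https://example.com demo secret` followed by `post_with_cookies https://example.com/api 'name=value'` would reuse the session cookies from the first request.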

@sebsto
sebsto / gist:19b99f1fa1f32cae5d00
Created August 8, 2014 15:53
Install Maven with Yum on Amazon Linux
sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
sudo yum install -y apache-maven
mvn --version
@mblondel
mblondel / kernel_kmeans.py
Last active January 4, 2024 11:45
Kernel K-means.
"""Kernel K-means"""
# Author: Mathieu Blondel <mathieu@mblondel.org>
# License: BSD 3 clause
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin
from sklearn.metrics.pairwise import pairwise_kernels
from sklearn.utils import check_random_state
@stephenhardy
stephenhardy / git-clearHistory
Created April 26, 2013 22:14
Steps to clear out the history of a git/github repository
-- Remove the history from
rm -rf .git
-- recreate the repos from the current content only
git init
git add .
git commit -m "Initial commit"
-- push to the github remote repos ensuring you overwrite history
git remote add origin git@github.com:<YOUR ACCOUNT>/<YOUR REPOS>.git
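
To see the effect of the local steps without touching a real repository, here is a self-contained sketch that runs them in a throwaway directory. The file names, commit messages, and demo identity are made up for illustration.

```shell
# Work in a temporary directory so nothing real is affected.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Commit with a throwaway identity so the demo does not depend on global git config.
commit() {
    git -c user.name="Demo" -c user.email="demo@example.com" commit -q "$@"
}

# Build a small repository with two commits of history.
git init -q .
echo one > file.txt
git add .
commit -m "first"
echo two >> file.txt
commit -a -m "second"

# The steps from the notes: drop the history, then recommit the current content only.
rm -rf .git
git init -q .
git add .
commit -m "Initial commit"

# Count the commits reachable from HEAD after the reset.
commit_count=$(git rev-list --count HEAD)
echo "commits after reset: $commit_count"
```

The working files survive and only the history is gone. Because the excerpt is cut off before the push, the exact command the notes use is not shown here; overwriting the remote would typically need a forced push such as `git push -u --force origin <branch>`.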
@andreyvit
andreyvit / tmux.md
Created June 13, 2012 03:41
tmux cheatsheet

tmux cheat sheet

(C-x means ctrl+x, M-x means alt+x)

Prefix key

The default prefix is C-b. If you (or your muscle memory) prefer C-a, you need to add this to ~/.tmux.conf:

remap prefix to Control + a
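
The excerpt ends before the configuration lines themselves. A commonly used set of tmux commands for this remap is sketched below, wrapped in a hypothetical shell helper that appends them to a config file; this is an assumption about what the cheat sheet goes on to show, not a copy of it.

```shell
# Append a C-a prefix remap to a tmux config file.
# $1: path to the config file (defaults to ~/.tmux.conf)
add_prefix_remap() {
    conf=${1:-$HOME/.tmux.conf}
    cat >> "$conf" <<'EOF'
# remap prefix to Control + a
set -g prefix C-a
unbind C-b
bind C-a send-prefix
EOF
}
```

Calling `add_prefix_remap` with no argument appends to `~/.tmux.conf`. The change takes effect for new tmux servers, or after running `tmux source-file ~/.tmux.conf` in an existing session. The `bind C-a send-prefix` line lets you press C-a twice to send a literal C-a to the program running inside tmux.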