@etheleon
etheleon / kfp_components_readme.md
Last active February 25, 2022 07:46
Kubeflow Pipeline components

There are multiple ways to generate a kfp component:

  1. from python function
  2. from file/text

From file/text

Use components.load_component_from_<file|text> (from the kfp package) when you need to interface with a command line tool.
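As a minimal sketch of the text-based route, the snippet below wraps a command line tool in a component definition and loads it. It assumes the kfp SDK is installed; the component name, the `alpine:3.14` image, the `message` input, and the helper `load_echo_op` are hypothetical and only there for illustration.

```python
# Hypothetical component spec: wraps the `echo` command line tool.
COMPONENT_TEXT = """
name: Echo message
description: Wraps the echo command line tool.
inputs:
- {name: message, type: String}
implementation:
  container:
    image: alpine:3.14
    command: [echo, {inputValue: message}]
"""


def load_echo_op():
    """Return a task factory built from the YAML text above."""
    # Imported lazily so this module can be read without kfp installed.
    from kfp import components

    return components.load_component_from_text(COMPONENT_TEXT)
```

Calling `load_echo_op()` returns a factory that can be used inside a pipeline function, for example `echo_op(message="hello")`. Saving the same YAML to a file and calling `components.load_component_from_file` on its path should behave the same way.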

@etheleon
etheleon / presto2spark.md
Last active August 11, 2021 01:48
translating PRESTO to SPARKSQL

Common tasks

| Presto | Spark |
| --- | --- |
| array_join(array[year, month, day], '-', 'NA') | CONCAT_WS('-', year, month, day) |
| DATE_ADD('day', -7, date '2021-07-01') | date_sub('2021-07-01', 7) |
| array['one', 'two', 'three'][1] (one-indexed) | array('one', 'two', 'three')[0] (zero-indexed) |

Partition Naming convention

Class

class Upper {
  // Upper-cases every string passed in as a vararg.
  def upper(strings: String*): Seq[String] = {
    strings.map((s: String) => s.toUpperCase())
  }
}
@etheleon
etheleon / add_partition_to_hive_table.sql
Last active May 18, 2021 08:03
accompanying gist for datalake article
-- ALTER TABLE schema.table DROP IF EXISTS PARTITION (year='2021', month='01', day='11', hour='01')
ALTER TABLE pricing.demand_tbl ADD
PARTITION (year='2021', month='01', day='11', hour='01')
LOCATION 's3://datascience-bucket/wesley.goi/data/pricing/demand_tbl/year=2021/month=01/day=11/hour=01'
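When partitions are added on a schedule, the statement above is usually generated rather than typed by hand. The following is a rough sketch of that idea; the function name `add_partition_sql` and its parameters are hypothetical, and it assumes the Hive-style `key=value` directory layout used in the LOCATION above.

```python
def add_partition_sql(table, base_location, partition):
    """Build an ALTER TABLE ... ADD PARTITION statement.

    `partition` maps partition column names to values, in partition order.
    """
    # Partition spec, e.g. year='2021', month='01'
    spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
    # Hive-style path suffix, e.g. year=2021/month=01
    path = "/".join(f"{key}={value}" for key, value in partition.items())
    return (
        f"ALTER TABLE {table} ADD\n"
        f"PARTITION ({spec})\n"
        f"LOCATION '{base_location}/{path}'"
    )


statement = add_partition_sql(
    "pricing.demand_tbl",
    "s3://datascience-bucket/wesley.goi/data/pricing/demand_tbl",
    {"year": "2021", "month": "01", "day": "11", "hour": "01"},
)
print(statement)
```

The values are interpolated directly into the string, so this is only suitable for trusted, internally generated partition values.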
@etheleon
etheleon / how-to-install-kubeflow1.2.md
Last active December 25, 2021 18:50
installing kubeflow 1.2

Introduction

Installing kubeflow on a local machine is not a simple task, and the documentation on the official website might be outdated. At the time of writing, the suggested solutions include miniKF and microk8s. The latter sets up GPU passthrough effortlessly.

@etheleon
etheleon / grpc_newbie.md
Created December 18, 2020 11:16
querying gRPC using gRPC

There's no need to create a custom heartbeat endpoint. With gRPC you can use grpc-health-probe, which queries the standard gRPC health checking service on the server.

You can use either of two tools: grpcurl, or its web equivalent grpcUI (similar to Postman; you cannot store a collection, but you can keep the request data as a .json file).

Installation

grpcURL

Use the nbconvert / jupyter-nbconvert CLI to convert IPYNB to HTML

$ NOTEBOOK=file_name.ipynb
$ jupyter-nbconvert --to html $NOTEBOOK

You can add tags to hide cells, e.g. code or outputs that you do not want shown in the HTML.
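Tags live in each cell's metadata inside the notebook JSON, so they can also be added programmatically. The sketch below uses only the standard library; the function name `tag_code_cells` and the tag name `remove_input` are assumptions chosen for illustration.

```python
import json


def tag_code_cells(notebook, tag="remove_input"):
    """Add `tag` to every code cell of a notebook dict (modified in place)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            tags = cell.setdefault("metadata", {}).setdefault("tags", [])
            if tag not in tags:
                tags.append(tag)
    return notebook


def tag_notebook_file(path, tag="remove_input"):
    """Read an .ipynb file, tag its code cells, and write it back."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    tag_code_cells(notebook, tag)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(notebook, handle, indent=1)
```

nbconvert's TagRemovePreprocessor can then be pointed at the tag, for example with `--TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_input_tags remove_input`. Check the exact flag syntax against the nbconvert documentation for your version.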

@etheleon
etheleon / first_mac.md
Last active September 4, 2020 07:46
New mac setup

You've got your machine and you want to start working immediately on your first day at work. What do you do?

  1. Install iterm2
curl -O https://iterm2.com/downloads/stable/iTerm2-3_3_12.zip
  2. Install brew (and xcode)
@etheleon
etheleon / how-to-mount-GCS.md
Created July 10, 2019 16:56
how to mount gcs

Step 1: Install gcsFUSE (linux)

export GCSFUSE_REPO=gcsfuse-`lsb_release -c -s`
echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install gcsfuse