Vikas Chitturi - Open Source Contributor absognety

import copy
# write to a path using the Hudi format
def hudi_write(df, schema, table, path, mode, hudi_options):
    hudi_options = {
        "hoodie.datasource.write.recordkey.field": "recordkey",
        "hoodie.datasource.write.precombine.field": "precombine_field",
        "hoodie.datasource.write.partitionpath.field": "partitionpath_field",
        "hoodie.datasource.write.operation": "write_operation",
        "hoodie.datasource.write.table.type": "table_type",
@RobertAKARobin
RobertAKARobin / python.md
Last active June 13, 2024 04:24
Python Is Not A Great Programming Language
@shravan-kuchkula
shravan-kuchkula / apache-airflow.md
Last active September 12, 2022 07:27
Install apache-airflow locally on mac

Using Docker and docker-compose to manage Apache Airflow on mac

Using our beloved docker and docker-compose, we can very quickly bring up an Apache Airflow instance on our mac.

Contents of docker-compose.yml

About the only thing you need to customize in this docker-compose.yml file is the volumes section. It tells docker to map the given directory containing your Airflow DAGs/plugins to the container file system.

version: '3'
services:
@tamiroze
tamiroze / sql2sf.py
Last active May 13, 2024 20:06
Converts Oracle, SQL-Server, and other DDL to Snowflake DDL
#!/usr/bin/python
# $Id: $
# Converts Oracle, SQL-Server, and other DDL to Snowflake DDL
def usage():
    print("""\
# Usage: sql2sf.py input-file [output-file]
""")
sudo add-apt-repository -y ppa:apt-fast/stable
sudo add-apt-repository -y ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get -y install apt-fast
# prompts
sudo apt-fast -y upgrade
sudo apt-fast install -y python3-pip ubuntu-drivers-common libvorbis-dev libflac-dev libsndfile-dev cmake build-essential libgflags-dev libgoogle-glog-dev libgtest-dev google-mock zlib1g-dev libeigen3-dev libboost-all-dev libasound2-dev libogg-dev libtool libfftw3-dev libbz2-dev liblzma-dev libgoogle-glog0v5 gcc-6 gfortran-6 g++-6 doxygen graphviz libsox-fmt-all parallel exuberant-ctags vim-nox python-powerline python3-pip ack lsyncd
sudo apt-fast install -y tigervnc-standalone-server firefox mesa-common-dev
@absognety
absognety / scala_env_setup.md
Last active April 11, 2023 06:22
Setup Scala environment in Linux/RHEL/CentOS/Debian/Ubuntu/Fedora releases
Check for Java Development Kit (JDK) version
$ java -version
java version "1.7.0_171"
OpenJDK Runtime Environment (rhel-2.6.13.0.el7_4-x86_64 u171-b01)
OpenJDK 64-Bit Server VM (build 24.171-b01, mixed mode)

Setting up Scala requires JDK 8 or later.
If the installed JDK/OpenJDK version is lower than 1.8, follow the steps below.
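
The version check above can also be scripted. The following is a minimal sketch, not part of the original gist: the function names `java_major_version` and `needs_jdk_upgrade` are hypothetical, and the regular expression assumes the version string appears in the quoted form shown in the sample output.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in output")
    first = int(match.group(1))
    # Legacy scheme: "1.7.0_171" means Java 7.
    # Newer scheme: "11.0.2" means Java 11.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def needs_jdk_upgrade(version_output: str, minimum: int = 8) -> bool:
    """Return True when the reported JDK is older than `minimum`."""
    return java_major_version(version_output) < minimum


# Sample line taken from the output shown above.
sample = 'java version "1.7.0_171"'
print(needs_jdk_upgrade(sample))
```

Note that `java -version` writes to standard error, so a script that captures the output itself would need to read stderr rather than stdout.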

@bgauduch
bgauduch / multiple-repository-and-identities-git-configuration.md
Last active June 18, 2024 15:14
Git config with multiple identities and multiple repositories

Set up multiple git identities & git user information

/!\ Be very careful with your setup: any misconfiguration makes the whole git config fail silently! Go through this guide step by step and it should be fine 😉

Set up multiple SSH identities for git

  • Generate your SSH keys as per your git provider documentation.
  • Add each public SSH key to your git provider accounts.
  • In your ~/.ssh/config, set each SSH key for each repository as in this example:
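
The gist's own example is cut off in this listing. As a hedged illustration only, the sketch below renders one `~/.ssh/config` entry per identity. The helper `ssh_host_block`, the aliases `github-work` and `github-personal`, and the key paths are all hypothetical; only the `Host`, `HostName`, `User`, `IdentityFile` and `IdentitiesOnly` keywords are standard OpenSSH options.

```python
def ssh_host_block(alias, hostname, key_path, user="git"):
    """Render one Host entry that pins a specific key to an alias."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
        # Only offer the key named above, not every key the agent holds.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines)


# Hypothetical identities: one alias per key, both pointing at the same host.
entries = [
    ssh_host_block("github-work", "github.com", "~/.ssh/id_work"),
    ssh_host_block("github-personal", "github.com", "~/.ssh/id_personal"),
]
print("\n\n".join(entries))
```

With entries like these, a repository remote would use the alias in place of the real host name, so that SSH picks the matching key.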
@asmaier
asmaier / load_parquet_s3.py
Last active March 5, 2021 03:43
PySpark script for downloading a single parquet file from Amazon S3 via the s3a protocol. It also reads the credentials from "~/.aws/credentials", so we don't need to hardcode them. See also https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html.
#
# Some constants
#
aws_profile = "your_profile"
aws_region = "your_region"
s3_bucket = "your_bucket"
#
# Reading environment variables from aws credential file
#
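
The rest of the script is not shown in this listing. One way the credential lookup could work is sketched below with the standard-library `configparser`; this is an assumption about the approach, not the author's code. The helper names `read_aws_credentials` and `s3a_options` are hypothetical, while `aws_access_key_id` / `aws_secret_access_key` are the usual keys in the credentials file and `fs.s3a.access.key` / `fs.s3a.secret.key` are the Hadoop S3A property names.

```python
import configparser
import os


def read_aws_credentials(profile, path=None):
    """Return (access_key_id, secret_access_key) for the given profile."""
    path = path or os.path.expanduser("~/.aws/credentials")
    config = configparser.ConfigParser()
    # read() returns the list of files it managed to parse.
    if not config.read(path):
        raise FileNotFoundError(path)
    if not config.has_section(profile):
        raise KeyError(f"profile {profile!r} not found in {path}")
    return (
        config.get(profile, "aws_access_key_id"),
        config.get(profile, "aws_secret_access_key"),
    )


def s3a_options(profile, path=None):
    """Map the credentials to the Hadoop S3A property names."""
    access_key, secret_key = read_aws_credentials(profile, path)
    return {
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
    }
```

The returned dictionary could then be applied to the Hadoop configuration of the Spark session, which keeps the keys out of the script itself.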
# Install R + RStudio on Ubuntu 14.04
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
# Ubuntu 12.04: precise
# Ubuntu 14.04: trusty
# Ubuntu 16.04: xenial
# Basic format of next line deb https://<my.favorite.cran.mirror>/bin/linux/ubuntu <enter your ubuntu version>/
sudo add-apt-repository 'deb https://ftp.ussg.iu.edu/CRAN/bin/linux/ubuntu trusty/'
sudo apt-get update
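
The repository line follows a fixed pattern, so it can be built from the Ubuntu release number. The snippet below is a small sketch of that pattern; the helper `cran_repo_line` is hypothetical, and the codename table only covers the three releases listed in the comments above.

```python
# Release-to-codename mapping taken from the comments above.
UBUNTU_CODENAMES = {
    "12.04": "precise",
    "14.04": "trusty",
    "16.04": "xenial",
}


def cran_repo_line(ubuntu_version, mirror="ftp.ussg.iu.edu/CRAN"):
    """Build the `deb` line passed to add-apt-repository."""
    codename = UBUNTU_CODENAMES[ubuntu_version]
    return f"deb https://{mirror}/bin/linux/ubuntu {codename}/"


print(cran_repo_line("14.04"))
# → deb https://ftp.ussg.iu.edu/CRAN/bin/linux/ubuntu trusty/
```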