Nar Kumar Chhantyal (chhantyal)

@chhantyal
chhantyal / baby.py
Last active May 23, 2021 15:32
Baby announcement
#!/usr/bin/env python
from dataclasses import dataclass


@dataclass
class Baby:
    name: str
    birthday: str
    mother: str = "Mother's name"
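
Example usage (the values below are placeholders, not from the gist):

baby = Baby(name="Baby Chhantyal", birthday="2021-05-23")
print(baby)  # Baby(name='Baby Chhantyal', birthday='2021-05-23', mother="Mother's name")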
@chhantyal
chhantyal / visa-free.md
Last active April 14, 2019 21:16
EU permanent residents - a list of countries outside the Schengen/EU area where a visa is NOT required.

Visa-free travel for EU permanent residents

This contains a list of countries outside the Schengen/EU area where a visa is NOT required for EU permanent residents.

  1. Turkey - visa on arrival
  2. Cuba - Tourist Card
  3. Mexico
@chhantyal
chhantyal / install_python_package_git.md
Last active May 7, 2024 01:45
Pipenv or pip: install a Python package from Git (GitHub, GitLab, Bitbucket, etc.) using a Git tag for versioning. Works for branches too.

Install from Git tag

pipenv install git+ssh://git@github.com/chhantytal/parquet-cli.git@v1.1#egg=parq

Install from branch name

pipenv install git+ssh://git@github.com/chhantytal/parquet-cli.git@master#egg=parq

Works for pip as well.
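
For example, the equivalent pip command for the tag install above:

pip install git+ssh://git@github.com/chhantytal/parquet-cli.git@v1.1#egg=parq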

@chhantyal
chhantyal / luigi_run_python3.md
Last active November 8, 2018 08:58
Command for running Luigi with Python 3

Python 3

python -m luigi --module my_module MyTask --x 100 --local-scheduler

Python 2

luigi --module my_module MyTask --x 123 --y 456 --local-scheduler
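
A minimal sketch of the my_module referenced above (the task body is an assumption, not part of this gist):

# my_module.py
import luigi

class MyTask(luigi.Task):
    x = luigi.IntParameter()
    y = luigi.IntParameter(default=456)

    def run(self):
        # Placeholder work; real task logic goes here
        print("x is {}, y is {}".format(self.x, self.y))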

@chhantyal
chhantyal / hammer.sh
Last active October 7, 2018 13:02
Build single egg distribution from multiple Python packages (source files only for now). It's a CRAZY idea, don't try at home or work 😂
#!/bin/bash
# Idea: copy all source files (packages in different repos) into one place
# and use the setup.py located in the project to build a single egg file.
temp_dir=/tmp/ira/
curr_dir=$(pwd)
dist_dir=${curr_dir}/dist/
# Copy project to a dir
@chhantyal
chhantyal / spark-sqlserver-jdbc.md
Last active September 3, 2018 14:54
SQL Server (Azure SQL Database) JDBC driver installation for Apache Spark on OSX or Linux
  1. Download & unpack driver from https://www.microsoft.com/en-us/download/details.aspx?id=57175
  2. Find jar file inside: sqljdbc_{version}/enu/jre{version}/sqljdbc{version}.jar

There are a few ways to use it.

  • Update the Spark config to include this path (so the jar is always on the classpath):
    • mv {SPARK_HOME}/conf/spark-defaults.conf.template {SPARK_HOME}/conf/spark-defaults.conf
    • Add the line spark.driver.extraClassPath /path/to/sqljdbc.jar to spark-defaults.conf
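
Once the jar is on the classpath, a JDBC read from PySpark might look like this (server, database, table, and credentials below are placeholders):

# Placeholder connection details -- replace with your own server, database, and credentials
df = spark.read.format("jdbc") \
    .option("url", "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb") \
    .option("dbtable", "dbo.my_table") \
    .option("user", "my_user") \
    .option("password", "my_password") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .load()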
@chhantyal
chhantyal / vi-worldcup-2018-prediction.md
Last active June 4, 2018 13:04
Vi World Cup 2018 prediction game

Vi World Cup 2018 prediction game

We collect 10€ from each participant.

Register

https://www.kicktipp.com/vi-worldcup/

Prize

The collected money will be distributed to the highest point earners based on the Kicktipp ranking.

@chhantyal
chhantyal / Vagrantfile
Last active April 2, 2018 18:27
Run Cockpit Linux server manager (cockpit-project.org) on Mac OSX or Windows using Vagrant and VirtualBox
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/artful64"
  config.vm.network "forwarded_port", guest: 9090, host: 9090
  config.vm.provision "shell", inline: <<-SHELL
    apt update
    # Assumed completion -- the gist preview is truncated after "apt update"
    apt install -y cockpit
  SHELL
end
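
To try it (assuming Vagrant and VirtualBox are installed): run vagrant up next to this Vagrantfile, then open https://localhost:9090 in a browser to reach Cockpit.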
@chhantyal
chhantyal / send.js
Last active June 14, 2022 14:37
One thousand (1000) HTTP requests: async using Node.js vs sync using Python vs Python with threads
// npm install request
let request = require('request')
let range = n => Array.from(Array(n).keys())
let data = range(1000)
data.forEach(function (item) {
    request.get("https://httpbin.org/ip", function (error, response, body) {
        console.log("Request " + item + " complete.")
    })
})
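
For comparison, a minimal sync Python version might look like this (the gist's actual Python files are not shown in this preview):

# pip install requests
import requests

for item in range(1000):
    # Each request blocks until it completes, unlike the async Node.js version above
    requests.get("https://httpbin.org/ip")
    print("Request {} complete.".format(item))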
@chhantyal
chhantyal / spark_rdd_to_pandas_distributed.py
Last active April 27, 2023 23:53
Convert Spark RDD to Pandas DataFrame inside Spark executors and make Spark DataFrame from resulting RDD. This is distributed i.e. no need for collecting RDD to driver.
"""
Spark DataFrame is distributed but it lacks many features compared to Pandas.
If you want to use Pandas, you can't just convert Spark DF to Pandas because that means collecting it to driver.
It can be slow & not work at all when data size is big.
So only way to use Pandas is to create mini dataframes inside executors.
This gist shows how to create DataFrame from RDD inside Spark executors & build Spark DataFrame from final output.
"""
# Convert function to use in mapPartitions
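
A minimal sketch of that convert function (the column names and the surrounding usage are assumptions; the gist preview is truncated here):

import pandas as pd

def partition_to_pandas(partition):
    # Build a small Pandas DataFrame from one partition's rows, inside the executor
    pdf = pd.DataFrame(list(partition), columns=["a", "b"])  # assumed column names
    # ... Pandas-only transformations would go here ...
    # Yield plain tuples so Spark can rebuild a distributed DataFrame from the result
    for row in pdf.itertuples(index=False):
        yield tuple(row)

# Usage sketch:
# result_rdd = rdd.mapPartitions(partition_to_pandas)
# spark_df = spark.createDataFrame(result_rdd, ["a", "b"])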