@nguyenvulebinh
nguyenvulebinh / flatten_all_spark_schema.py
Last active August 8, 2023 15:08
Flatten a Spark DataFrame schema (include struct and array type)
import typing as T
import cytoolz.curried as tz
import pyspark
from pyspark.sql.functions import explode
def schema_to_columns(schema: pyspark.sql.types.StructType) -> T.List[T.List[str]]:
    columns = list()
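The preview above stops before the body of the function, so the author's actual traversal is not visible. As a rough sketch of the general idea only, the snippet below walks a nested schema recursively and collects the path to every leaf field. It uses plain nested dicts as a stand-in for `StructType` so that it runs without Spark, and the name `nested_field_paths` is hypothetical, not taken from the gist.

```python
from typing import Any, Dict, List, Tuple


def nested_field_paths(schema: Dict[str, Any], prefix: Tuple[str, ...] = ()) -> List[List[str]]:
    """Return the path to every leaf field of a nested dict schema.

    Assumption: a dict value stands for a struct, anything else is a leaf type.
    """
    paths: List[List[str]] = []
    for name, field_type in schema.items():
        path = prefix + (name,)
        if isinstance(field_type, dict):
            # Struct-like field: descend and collect the paths of its children.
            paths.extend(nested_field_paths(field_type, path))
        else:
            # Leaf field: record the full path from the root.
            paths.append(list(path))
    return paths


# Hypothetical example schema, invented for illustration.
example_schema = {
    "id": "long",
    "user": {"name": "string", "address": {"city": "string"}},
}

# Join each path with "_" to get flat column names.
flat_names = ["_".join(path) for path in nested_field_paths(example_schema)]
```

With a real DataFrame, the same paths could be turned into dotted column references and aliased to the flat names; array fields would additionally need `explode`, which this sketch does not cover.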
@michelp
michelp / postgrest-quick.sh
Last active April 13, 2022 21:42
From nothing to REST API with PostgREST
# Minimal example of getting a PostgREST API running from scratch for
# testing purposes. It uses docker to launch a postgres database and
# a postgrest api server.
# This should not be used to deploy a production system but to
# understand how postgrest works. In particular there is no security
# implemented, see the docs for more.
# https://postgrest.org/en/v4.4/
@aafwu00
aafwu00 / intellij_tips.md
Last active March 4, 2024 09:21
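IntelliJ Tips, for those not yet familiar with it

IntelliJ tip collection, as of 2019.2.1

A collection of tips to help those who are not yet familiar with IntelliJ

  • Stay on the latest version whenever possible
  • Versions take the form 2019.x; whenever x changes, I wipe the installation cleanly with App Cleaner and start fresh (personal preference)
    • IntelliJ relies heavily on its cache, which occasionally gets tangled during an update; after a clean wipe it runs somewhat faster once the initial indexing has finished
    • On recent macOS versions, Shift + Command + A is registered as a system shortcut, so uncheck System Preferences -> Keyboard -> Shortcuts -> Services -> Search man Page Index in Terminal

Shortcuts (Mac, using the Default Mac OS X keymap)

  • The IntelliJ Learn Plugin adds a follow-along mode: knowing this alone is enough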
@colbyford
colbyford / SparkML_DataPrep_BinaryClassification.py
Last active September 23, 2022 16:41
SparkML Data Preparation Steps for Binary Classification Models
########################################
## Title: Spark MLlib Classification Data Prep Script
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml import Pipeline
from pyspark.ml.feature import OneHotEncoder, OneHotEncoderEstimator, StringIndexer, VectorAssembler
label = "dependentvar"
@yancya
yancya / EXCEPT.sql
Created December 15, 2017 11:04
INTERSECT and EXCEPT for BigQuery
#standardsql
WITH a AS (
SELECT * FROM UNNEST([1,2,3,4]) AS n
), b AS (
SELECT * FROM UNNEST([4,5,6,7]) AS n)
SELECT * FROM a
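The rest of the query is cut off in this preview. To illustrate what intersecting and subtracting the two inputs above would yield, here is a small sketch that models `a` and `b` as Python sets. It assumes DISTINCT semantics (duplicates removed) and is not a translation of the author's SQL.

```python
# The two input relations from the query, modelled as sets of n values.
a = {1, 2, 3, 4}
b = {4, 5, 6, 7}

# Rows present in both a and b (what an INTERSECT DISTINCT would keep).
intersect_rows = sorted(a & b)

# Rows present in a but not in b (what an EXCEPT DISTINCT would keep).
except_rows = sorted(a - b)

print(intersect_rows)
print(except_rows)
```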
@rsperl
rsperl / lockfile.sh
Last active July 25, 2022 15:27
using lockfiles in bash #snippet
# src: http://www.davidpashley.com/articles/writing-robust-shell-scripts/
# noclobber will not redirect to an existing file
if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null;
then
# we have the lockfile, so be sure to remove it if the script exits early
trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
# do our critical stuff
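The same idea (create the lockfile only if it does not exist yet, and clean it up afterwards) can be sketched in Python. This is a swapped-in technique, not part of the gist: it uses `os.open` with `O_CREAT | O_EXCL`, which fails when the file already exists, much as the `noclobber` redirect does. The function names `acquire_lock` and `release_lock` are hypothetical.

```python
import os


def acquire_lock(lockfile: str) -> bool:
    """Try to create the lockfile atomically; return True if we now own it."""
    try:
        # O_EXCL makes the call fail if the file already exists.
        fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        # Record our PID, like `echo "$$"` in the shell version.
        handle.write(str(os.getpid()))
    return True


def release_lock(lockfile: str) -> None:
    """Remove the lockfile; do nothing if it is already gone."""
    try:
        os.remove(lockfile)
    except FileNotFoundError:
        pass
```

A caller would typically wrap the critical section in `try` / `finally` so that `release_lock` runs even when the work fails, which plays the role of the `trap` in the shell script.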
@ipmb
ipmb / settings.py
Last active November 24, 2023 20:25
Django logging example
import logging.config
import os
from django.utils.log import DEFAULT_LOGGING
# Disable Django's logging setup
LOGGING_CONFIG = None
LOGLEVEL = os.environ.get('LOGLEVEL', 'info').upper()
logging.config.dictConfig({
@LucasMagnum
LucasMagnum / queries.py
Created August 2, 2017 14:52
#DjangoTip - Playing with querysets - Print Queryset
print(adult_products.query)
"""
SELECT `product_product`.`title`, `product_product`.`is_adult`, `product_product`.`is_active`
FROM `product_product` WHERE `product_product`.`is_adult` = True
"""
print(active_products.query)
"""
SELECT `product_product`.`title`, `product_product`.`is_adult`, `product_product`.`is_active`
FROM `product_product` WHERE `product_product`.`is_active` = True
@krishpop
krishpop / export-toby.js
Last active March 21, 2024 22:12
Export Toby
// code courtesy of Toby team
chrome.storage.local.get("state", o => (
((f, t) => {
let e = document.createElement("a");
e.setAttribute("href", `data:text/plain;charset=utf-8,${encodeURIComponent(t)}`);
e.setAttribute("download", f);
e.click();
})(`TobyBackup${Date.now()}.json`, o.state)
));
@mdamien
mdamien / 0readme.md
Last active February 22, 2024 12:11
404 link detector with scrapy

List all the broken links on your website

Requirements:

python3 and scrapy (pip install scrapy)

Usage

  • scrapy runspider -o items.csv -a site="https://yoursite.org" 1spider.py
  • python3 2format_results.py