@monbang
monbang / The Best Medium-Hard Data Analyst SQL Interview Questions
Created May 3, 2020 16:15
The Best Medium-Hard Data Analyst SQL Interview Questions
# The Best Medium-Hard Data Analyst SQL Interview Questions
By Zachary Thomas ([zthomas.nc@gmail.com](mailto:zthomas.nc@gmail.com), [Twitter](https://twitter.com/zach_i_thomas), [LinkedIn](https://www.linkedin.com/in/thomaszi/))
**Tip:** See the Table of Contents (document outline) by hovering over the vertical line on the right side of the page.
## Background & Motivation
> The first 70% of SQL is pretty straightforward but the remaining 30% can be pretty tricky.
@eduardorost
eduardorost / merge-schemas.scala
Last active December 19, 2023 09:36
Merge Schema with structs
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._
import org.slf4j.{Logger, LoggerFactory}
object Main {
val logger: Logger = LoggerFactory.getLogger(this.getClass)
private lazy val sparkConf: SparkConf = new SparkConf()
.setMaster("local[*]")
@rmoff
rmoff / kafkacat.adoc
Last active January 5, 2024 19:59
Show last three messages from a Kafka topic with kafkacat
kafkacat -b localhost:9092 \
         -t _kafka-connect-group-01-status \
         -C \
         -o-3 \
         -c3 \
         -f 'Topic %t / Partition %p / Offset: %o / Timestamp: %T\nHeaders: %h\nKey (%K bytes): %k\nPayload (%S bytes): %s\n--\n'
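The same invocation can be parameterised for any topic and message count. The sketch below only builds and prints the command line rather than running it, so it works without a broker; the function name `last_n_cmd` and its argument order are hypothetical and not part of the original gist. It assumes the `-o-N` (start N messages from the end) and `-cN` (exit after N messages) flags used above.

```shell
#!/bin/sh
# Hypothetical helper: print a kafkacat command that reads the last N messages.
# Arguments: broker address, topic name, number of messages.
last_n_cmd() {
  broker=$1
  topic=$2
  n=$3
  # -C        consumer mode
  # -o-N      start N messages before the end of each partition
  # -cN       exit after consuming N messages in total
  printf 'kafkacat -b %s -t %s -C -o-%s -c%s\n' "$broker" "$topic" "$n" "$n"
}

last_n_cmd localhost:9092 _kafka-connect-group-01-status 3
```

Note that on a multi-partition topic the offset is relative to each partition, so the messages returned are not necessarily the most recent ones across the whole topic.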
@yashk
yashk / SparkScaling.md
Last active July 9, 2019 13:56
Spark Scaling
@afranzi
afranzi / SparkWithHadoop.md
Last active February 18, 2024 18:45
Spark 2.4.0 with Hadoop 2.8.5

Set up environment variables for Hadoop.

export HADOOP_VERSION=2.8.5
export HADOOP_HOME=${HOME}/hadoop-$HADOOP_VERSION
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=${HADOOP_HOME}/bin:$PATH
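A quick way to sanity-check these variables before going further is sketched below. It re-derives the same paths in a subshell (so nothing leaks into the calling shell) and reports whether the Hadoop directory exists yet. The function name `check_hadoop_env` is hypothetical and not part of the original gist; the only assumption is that `HOME` is set.

```shell
#!/bin/sh
# Hypothetical helper: print the derived Hadoop paths and whether
# HADOOP_HOME already exists. The body runs in a subshell, so the
# variables it sets do not affect the caller's environment.
check_hadoop_env() (
  HADOOP_VERSION=2.8.5
  HADOOP_HOME="${HOME}/hadoop-${HADOOP_VERSION}"
  HADOOP_CONF_DIR="${HADOOP_HOME}/etc/hadoop"
  echo "HADOOP_HOME=${HADOOP_HOME}"
  echo "HADOOP_CONF_DIR=${HADOOP_CONF_DIR}"
  if [ -d "${HADOOP_HOME}" ]; then
    echo "status=present"
  else
    echo "status=missing"
  fi
)

check_hadoop_env
```

A `status=missing` line simply means the Hadoop files have not been downloaded and unpacked into that directory yet.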

Download Hadoop files.

@ericnormand
ericnormand / 00_script.clj
Last active May 18, 2024 08:30
Boilerplate for running Clojure as a shebang script
#!/bin/sh
#_(
#_DEPS is same format as deps.edn. Multiline is okay.
DEPS='
{:deps {clj-time {:mvn/version "0.14.2"}}}
'
#_You can put other options here
OPTS='
@laughedelic
laughedelic / sbt-dependency-management-guide.md
Last active May 14, 2024 16:55
Explicit dependency management in sbt

Some of these practices might be based on wrong assumptions and I'm not aware of it, so I would appreciate any feedback.

  1. avoiding some dependency conflicts:

    • install `sbt-explicit-dependencies` globally in your `~/.sbt/{0.13,1.0}/plugins/plugins.sbt`
    • run `undeclaredCompileDependencies` and make the obvious missing dependencies explicit by adding them to `libraryDependencies` of each sub-project
    • (optionally) run `unusedCompileDependencies` and remove some obvious unused libraries. This has false positives, so run `; reload; Test/compile` after each change and ultimately run all tests to see that it didn't break anything
    • (optionally) add `undeclaredCompileDependenciesTest` to the CI pipeline, so that it will fail if you have some undeclared dependencies
  2. keeping dependencies up to date and resolving conflicts:

    • install sbt-updates globally in your `~/.sbt/{0.13,1.0}/plugins/plugins.
@mayankcpdixit
mayankcpdixit / install-kafka-mac.md
Last active April 19, 2022 02:25
Install Kafka in local (mac)

Install Kafka on your local Mac.

Run the following commands:

brew install kafka
sudo mkdir -p /usr/local/var/run/zookeeper/data
sudo chmod 777 /usr/local/var/run/zookeeper/data
zkServer start

mkdir -p /usr/local/var/lib/kafka-logs
@hansohn
hansohn / brew_install_hadoop_2.7.3.sh
Last active April 3, 2020 21:52
Install Hadoop 2.7.3 via brew
#!/usr/bin/env bash
BREW_PREFIX=$(brew --prefix);
# abort if already installed, else continue
if ! grep -q hadoop <(brew list); then
# unlink broken installation attempts
brew unlink hadoop > /dev/null 2>&1;
# install hadoop 2.7.3
@squarism
squarism / multiline.exs
Last active March 25, 2024 15:27
Multiline Anonymous Functions in Elixir
# Examples of Different Styles and Syntax
# legal / positive cases / things you can do
# ---------------------------------------------------------------------------
# single line, nothing special
Enum.map(0..2, fn i -> IO.puts i end)
"""
0
1