Gabor Somogyi gaborgsomogyi

The ContainerId string format changes if the RM restarts with work-preserving recovery enabled.
It used to have this format:
container_{clusterTimestamp}_{appId}_{attemptId}_{containerId}
e.g.: container_1410901177871_0001_01_000005.
It now has this format:
container_e{epoch}_{clusterTimestamp}_{appId}_{attemptId}_{containerId}
e.g.: container_e17_1410901177871_0001_01_000005.
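
Scripts that pick fields out of container IDs need to accept both formats. The following is a minimal sketch, not taken from Hadoop itself: the function name `parse_container_id` and the `key=value` output layout are hypothetical, and it assumes a `sed` that supports `-E` (GNU and BSD both do). The `e{epoch}_` part is matched as an optional group, so `epoch=` is left empty for IDs in the old format.

```shell
# Hypothetical helper: split a YARN container ID into its fields.
# Accepts both the old format and the newer one with the e{epoch} prefix.
parse_container_id() {
  # sed prints the rewritten line only when the whole ID matches the pattern.
  parsed=$(printf '%s\n' "$1" | sed -nE \
    's/^container_(e([0-9]+)_)?([0-9]+)_([0-9]+)_([0-9]+)_([0-9]+)$/epoch=\2 clusterTimestamp=\3 appId=\4 attemptId=\5 containerId=\6/p')
  if [ -z "$parsed" ]; then
    echo "not a container ID: $1" >&2
    return 1
  fi
  printf '%s\n' "$parsed"
}

# Example calls using the two IDs from the note above.
parse_container_id container_1410901177871_0001_01_000005
parse_container_id container_e17_1410901177871_0001_01_000005
```

Anchoring the pattern at both ends means a malformed ID is rejected with a non-zero exit status instead of being partially parsed.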

Common:

  • val groupIdPrefix = spark-kafka-sources or configured with kafka.groupIdPrefix

Driver:

  • var nextId = 0
  • s"${groupIdPrefix}-${UUID.randomUUID}-${metadataPath.hashCode}-driver-${nextId}"
  • nextId += 1

Executor:

  • s"${groupIdPrefix}-${UUID.randomUUID}-${metadataPath.hashCode}-executor"

  • Using Gawk:

git log --author="Your_Name_Here" --pretty=tformat: --numstat | gawk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s removed lines: %s total lines: %s\n", add, subs, loc }' -

  • Using Awk on Mac OSX:

git log --author="Your_Name_Here" --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -

git config --global merge.tool meld
git config --global diff.tool meld
git config --global mergetool.meld.path "C:\Program Files (x86)\Meld\meld\meld.exe"
class KafkaSinkStreamingSuite
...
  test("single node streaming") {
    val input = MemoryStream[String]
    val topic = newTopic()
    testUtils.createTopic(topic)

    val writer = createKafkaWriter(
      input.toDF(),
class KafkaSinkBatchSuiteV2
...
  test("single node batch") {
    val topic = newTopic()
    testUtils.createTopic(topic)
    val rand = new Random()
    val data = Seq.fill(100000)(Row(topic, rand.nextInt().toString))

    val df = spark.createDataFrame(
cd external/docker-integration-tests
mvn install -DskipTests -Dscalastyle.skip=true -Dcheckstyle.skip
  • Go to IntelliJ and recompile docker-integration-tests project.
  • Start test from IntelliJ
# Setup minikdc
cat << 'EOF' > minikdc_deps.gradle
apply plugin: 'java'

repositories {
   mavenCentral()
}

dependencies {