# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.
# Example:
# spark.master spark://master:7077
spark.eventLog.enabled true
spark.eventLog.dir file:///home/ozawa/sparkeventlogs
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
@oza
oza / SparkOnYARN.md
Last active October 9, 2022 08:53
How to run Spark on YARN with dynamic resource allocation

YARN

  1. General-purpose resource management layer for Hadoop clusters, running alongside HDFS
  2. Part of Hadoop

Spark

  1. In-memory processing framework

Spark on YARN
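
As a rough sketch of what dynamic resource allocation on YARN usually involves, the `spark-defaults.conf` entries below turn the feature on. This assumes Spark 2.x or later (where `spark.master yarn` is accepted) and that the external shuffle service is available on every NodeManager. The executor counts and the idle timeout are illustrative values, not settings taken from the original notes.

```
# Run on YARN (assumption: Spark 2.x or later)
spark.master                                 yarn

# Let Spark grow and shrink the number of executors with the workload
spark.dynamicAllocation.enabled              true

# External shuffle service, so shuffle files survive executor removal
spark.shuffle.service.enabled                true

# Illustrative bounds and timeout (hypothetical values, tune per cluster)
spark.dynamicAllocation.minExecutors         1
spark.dynamicAllocation.maxExecutors         10
spark.dynamicAllocation.executorIdleTimeout  60s
```

On the YARN side, the shuffle service is typically registered in `yarn-site.xml` by adding `spark_shuffle` to `yarn.nodemanager.aux-services` and setting `yarn.nodemanager.aux-services.spark_shuffle.class` to `org.apache.spark.network.yarn.YarnShuffleService`, with the Spark YARN shuffle jar on the NodeManager classpath.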

oza / tez-reading.md
Last active May 23, 2021 05:32
Apache Tez Source code reading

Presto source code reading #1

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
oza / yarn-site.xml
Last active January 20, 2018 09:29
yarn-site.xml for ResourceManager HA
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,

Keybase proof

I hereby claim:

  • I am oza on github.
  • I am ozw (https://keybase.io/ozw) on keybase.
  • I have a public key ASBEgZ5dBs8y7vz5MdexvhUgBH9vijrAse0P2YodzXZxxQo

To claim this, I am signing this object:

oza / FT.md
Created December 20, 2013 10:12
oza / LT1.md
Last active October 18, 2017 12:51
Running Kudu with MapReduce framework (Lightning talk in Cloudera World Tokyo)

Kudu

What's Kudu?

  • From http://getkudu.io/
    • Kudu completes Hadoop's storage layer to enable fast analytics on fast data.
    • Distributed Insertable/Updatable columnar store.
  • Schema on write.