OZAWA Tsuyoshi oza

View keybase.md

Keybase proof

I hereby claim:

  • I am oza on github.
  • I am ozw (https://keybase.io/ozw) on keybase.
  • I have a public key ASBEgZ5dBs8y7vz5MdexvhUgBH9vijrAse0P2YodzXZxxQo

To claim this, I am signing this object:

oza / hadoop-shaded-thirdparty
Created May 5, 2017
Contents of hadoop-shaded-thirdparty.jar
View hadoop-shaded-thirdparty
$ unzip hadoop-shaded-thirdparty-3.0.0-alpha3-SNAPSHOT.jar
Archive: hadoop-shaded-thirdparty-3.0.0-alpha3-SNAPSHOT.jar
creating: META-INF/
inflating: META-INF/MANIFEST.MF
inflating: META-INF/LICENSE.txt
creating: META-INF/maven/
inflating: META-INF/maven/remote-resources.xml
creating: META-INF/maven/org.apache.hadoop/
creating: META-INF/maven/org.apache.hadoop/hadoop-shaded-thirdparty/
inflating: META-INF/maven/org.apache.hadoop/hadoop-shaded-thirdparty/pom.xml
oza / HADOOP-14284.md
Last active Apr 19, 2017
How to replace Guava imports with the shaded package
View HADOOP-14284.md
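# Rewrite plain and static Guava imports to the shaded package (sed -i as used here assumes GNU sed; -print0/-0 keeps unusual file names intact).
find . -name "*.java" -print0 | xargs -0 sed -i -e "s/import com\.google\.common\./import org.apache.hadoop.shaded.com.google.common./"
find . -name "*.java" -print0 | xargs -0 sed -i -e "s/import static com\.google\.common\./import static org.apache.hadoop.shaded.com.google.common./"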

git diff --ignore-space-change > 1.patch
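
After a bulk rewrite like the one in HADOOP-14284.md, it can help to check that no unshaded Guava imports remain before generating the patch. The following is a minimal sketch, not part of the original gist: the function name `count_unshaded_guava_imports` is hypothetical, and the `--include` option assumes GNU or BSD grep.

```shell
#!/bin/sh
# Hypothetical helper (not from the original gist): count import lines that
# still reference com.google.common directly, i.e. were not shaded.
count_unshaded_guava_imports() {
  dir="${1:-.}"
  # -r: recurse into dir, -h: omit file names, -E: extended regex.
  # --include limits the search to Java sources (GNU/BSD grep extension).
  # wc -l counts matching lines; tr strips padding some wc versions add.
  grep -rhE --include='*.java' \
    '^[[:space:]]*import( static)? com\.google\.common\.' "$dir" \
    | wc -l | tr -d ' '
}
```

Calling `count_unshaded_guava_imports .` from the source root prints the number of remaining unshaded import lines; `0` suggests the replacement covered every plain and static import.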
View gist:39e37a20f5af22b655b6
<configuration>
<property>
<name>tez.am.am-rm.heartbeat.interval-ms.max</name>
<value>250</value>
</property>
<property>
<name>tez.am.container.idle.release-timeout-max.millis</name>
<value>20000</value>
View JDK.md

JDK and Hadoop

  • Two levels of support
    • Runtime-level support
    • Source-level support

Current status

Runtime Support

oza / LT1.md
Last active Oct 18, 2017
Running Kudu with the MapReduce framework (lightning talk at Cloudera World Tokyo)
View LT1.md

Kudu

What's Kudu?

  • From http://getkudu.io/
    • Kudu completes Hadoop's storage layer to enable fast analytics on fast data.
    • Distributed Insertable/Updatable columnar store.
    • Schema on write.
View java-map-perf-result.log
11.44% libz.so.1.2.8 [.] crc32
4.33% [kernel] [k] isolate_freepages_block
2.04% [kernel] [k] copy_user_enhanced_fast_string
1.55% [kernel] [k] _raw_spin_unlock_irqrestore
1.27% libjvm.so [.] SpinPause
1.18% libc-2.19.so [.] __memcpy_sse2_unaligned
0.85% [kernel] [k] __reset_isolation_suitable
0.78% [kernel] [k] get_page_from_freelist
0.71% [kernel] [k] clear_page_c_e
0.61% [kernel] [k] compaction_alloc
View mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
oza / SparkOnYARN.md
Last active Nov 19, 2019
How to run Spark on YARN with dynamic resource allocation
View SparkOnYARN.md

YARN

  1. General resource management layer on HDFS
  2. A part of Hadoop

Spark

  1. In-memory processing framework

Spark on YARN

View tez-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software