Artem Tyrnov tyrnov

  • Saint-Petersburg, Russia
@sderosiaux
sderosiaux / 0-spark.sh
Last active June 27, 2020 08:09
How to configure a Spark Streaming job on YARN for good resilience (http://mkuthan.github.io/blog/2016/09/30/spark-streaming-on-yarn/)
spark-submit --master yarn --deploy-mode cluster # yarn cluster mode (driver in yarn)
--conf spark.yarn.maxAppAttempts=4 # default 2
--conf spark.yarn.am.attemptFailuresValidityInterval=1h # reset the count every hour (a streaming app can last months)
--conf spark.yarn.max.executor.failures={8 * num_executors} # default is max(2*num_executors, 3). So for 4 executors: 32 against 8.
--conf spark.yarn.executor.failuresValidityInterval=1h # same as before, but for the executors
--conf spark.task.maxFailures=8 # default is 4
--queue realtime_queue # do not mess with the default queue
--conf spark.speculation=true # slow tasks get a second, speculative copy launched, so make sure the job is idempotent
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties # eh, logging to some logstash for instance
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties
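
The annotated listing above reads well but does not run as pasted, because each trailing comment ends the command. A minimal sketch of a runnable form follows, assuming a wrapper function. `submit_streaming_job`, the `NUM_EXECUTORS` variable, and the `SPARK_SUBMIT` override are hypothetical names introduced here and are not part of the original gist. The flags themselves are the ones listed above.

```shell
#!/bin/sh
# Sketch only: the resilience flags from the listing above, wrapped in a function.
# Assumption: the executor count is known up front (hypothetical variable).
NUM_EXECUTORS="${NUM_EXECUTORS:-4}"

# Hypothetical wrapper. Pass the application jar and its arguments.
# Set SPARK_SUBMIT=echo to print the command instead of running it.
#   maxAppAttempts=4                  default is 2
#   am.attemptFailuresValidityInterval reset the attempt count every hour
#   max.executor.failures             8 * num_executors instead of max(2 * n, 3)
#   task.maxFailures=8                default is 4
#   speculation=true                  requires an idempotent job
submit_streaming_job() {
  "${SPARK_SUBMIT:-spark-submit}" \
    --master yarn --deploy-mode cluster \
    --queue realtime_queue \
    --conf spark.yarn.maxAppAttempts=4 \
    --conf spark.yarn.am.attemptFailuresValidityInterval=1h \
    --conf "spark.yarn.max.executor.failures=$((8 * NUM_EXECUTORS))" \
    --conf spark.yarn.executor.failuresValidityInterval=1h \
    --conf spark.task.maxFailures=8 \
    --conf spark.speculation=true \
    --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties \
    --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties \
    "$@"
}
```

A dry run such as `SPARK_SUBMIT=echo submit_streaming_job app.jar` (with `app.jar` as a placeholder) prints the assembled command, which is a cheap way to check the computed executor failure limit before submitting to the cluster.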
@errordeveloper
errordeveloper / Swarm.md
Last active March 27, 2021 15:03
A quick guide to using Weave Net with Docker Swarm

Weave Net and Docker Swarm

This example shows how to set up a very simple Swarm cluster with Weave Net enabled.

## Infrastructure

There are two hosts that have a recent version of Docker Engine running.

These hosts are named host1 and host2. They might have the following in /etc/hosts, or you can simply substitute IP addresses in the commands shown below.

@fetep
fetep / 00-logstash.conf
Created December 31, 2011 06:54
Logstash JSON filter
input {
  file {
    type => syslog
    path => "/var/log/messages"
  }
}
filter {
  grok {
    type => syslog