- http://www.slideshare.net/miguno/introducing-apache-kafkas-streams-api-kafka-meetup-munich-jan-25-2017
- https://technology.amis.nl/2017/02/12/apache-kafka-streams-running-top-n-grouped-by-dimension-from-and-to-kafka-topic/
- https://engineering.linkedin.com/blog/2016/04/kafka-ecosystem-at-linkedin
- https://jeqo.github.io/post/2017-01-31-kafka-rewind-consumers-offset/
- http://www.slideshare.net/Naveen1914/confluent-kafka-meetupseattle-jan2017
- https://community.hortonworks.com/articles/49789/kafka-best-practices.html
- https://www.slideshare.net/HadoopSummit/apache-kafka-best-practices
- I must read this before I use Kafka in production.
- https://www.confluent.io/blog/stories-front-lessons-learned-supporting-apache-kafka/
- https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
- https://www.udemy.com/apache-kafka-series-setup-administration-in-production/
- http://hortonworks.com/webinar/apache-kafka-apache-nifi-better-together/
- https://logallthethings.com/2016/07/11/min-insync-replicas-what-does-it-actually-do/
- https://databricks.com/blog/2017/04/26/processing-data-in-apache-kafka-with-structured-streaming-in-apache-spark-2-2.html
- https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
- https://medium.com/@jaykreps/exactly-once-support-in-apache-kafka-55e1fdd0a35f
- https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/
- http://blog.cloudera.com/blog/2015/07/deploying-apache-kafka-a-practical-faq/
- https://www.youtube.com/watch?v=sv8DwKeHiEo
- https://qiita.com/brfrn169/items/1fc596f0c5070f9be091
- Accordion (HBase). In Japanese.
- https://community.hortonworks.com/articles/74766/optimizing-hbase-for-large-scale-hbase-implementat.html
- https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_cluster-planning/content/hardware-selection-for-hbase.1.html
- https://www.slideshare.net/edvorkin/learning-stream-processing-with-apache-storm
- Storm
- https://github.com/pmerienne/trident-ml
- Storm, Trident, Machine Learning
- https://www.slideshare.net/ptgoetz/scaling-apache-storm-strata-hadoopworld-2014
- https://community.hortonworks.com/articles/550/unofficial-storm-and-kafka-best-practices-guide.html
- https://community.hortonworks.com/questions/10868/best-practices-for-storm-deployment-on-a-hadoop-cl.html
- https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_storm-component-guide/content/ch_storm-configure.html
- http://qiita.com/kimutansk/items/bcf9a147eafc8c1adf5f
- Apex
- https://streamsets.com/blog/blogreplicating-relational-databases-with-streamsets-data-collector/
- StreamSets
- http://www.slideshare.net/LevBrailovskiy/data-ingestion-and-distribution-with-apache-nifi
- NiFi
- https://www.youtube.com/watch?v=sVjbKI9062w
- Apex
- https://www.slideshare.net/AndrFucsdeMiranda/logging-at-scale-doing-more-with-less
- NiFi
- https://www.slideshare.net/JontheBeach/realizing-the-promise-of-portability-with-apache-beam
- Beam
- https://github.com/xerial/streamdb-readings
- http://www.slideshare.net/fhueske/stream-analytics-with-sql-on-apache-flink
- https://data-artisans.com/continuous-queries-on-dynamic-tables-analyzing-data-streams-with-sql/
- http://qiita.com/kimutansk/items/eccb022cda019fe48599
- https://www.slideshare.net/DawidWysakowicz/flink-complex-event-processing
- "Flink Complex Event Processing"
- https://data-artisans.com/blog/apache-flink-at-mediamath-rescaling-stateful-applications
- https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html
- https://www.slideshare.net/getindata/flinkcep-library-dawid-wysakowicz-getindata-whug
- https://data-artisans.com/blog/on-designing-a-stream-processing-benchmark
- http://qiita.com/ogibayashi/items/bb5c4ae61dc27025bde7
- https://github.com/ajermakovics/jvm-mon
- JVM monitor that runs in the console.
- http://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/index.html
- Java
- https://www.youtube.com/watch?v=M9o1LVfGp2A
- Java
- http://prestodb.rocks/code/simd/
- Java + SIMD
- https://docs.oracle.com/javase/9/migrate/toc.htm
- https://www.ibm.com/developerworks/jp/java/library/j-java8idioms4/index.html
- Java 8 idioms
- http://blog.palominolabs.com/2014/02/10/java-8-performance-improvements-longadder-vs-atomiclong/
- "Java 8 Performance Improvements: LongAdder vs AtomicLong"
- see also https://docs.oracle.com/javase/jp/8/docs/api/java/util/concurrent/atomic/LongAdder.html
- https://hortonworks.com/blog/update-hive-tables-easy-way-2/
- https://github.com/jwills/hive-scd
- SCD = Slowly changing dimension
- https://community.hortonworks.com/articles/149486/llap-sizing-and-setup.html
- LLAP
- https://community.hortonworks.com/content/kbentry/149894/llap-a-one-page-architecture-overview.html
- LLAP
- https://community.hortonworks.com/content/kbentry/149892/llap-troubleshooting-and-debugging.html
- LLAP
- http://crazyadmins.com/automate-hdp-installation-using-ambari-blueprints-part-6/
- http://crazyadmins.com/automate-hdp-installation-using-ambari-blueprints-part-5/
- http://crazyadmins.com/automate-hdp-installation-using-ambari-blueprints-part-4/
- http://crazyadmins.com/automate-hdp-installation-using-ambari-blueprints-part-2-2/
- http://blog.cloudera.com/blog/2017/06/apache-solr-memory-tuning-for-production/
- http://blog.cloudera.com/blog/2017/06/solr-memory-tuning-for-production-part-2/
- https://apex.sh/ping/
- monitoring tool?
- http://postd.cc/learning-about-distributed-systems/
- distributed system
- http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/
- distributed system
- https://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html
- distributed system
- http://twill.apache.org/
- "Apache Twill allows you to use YARN’s distributed capabilities with a programming model that is similar to running threads."
- https://www.lambdanote.com/collections/frontpage/products/tls
- "BULLETPROOF SSL AND TLS" in Japanese.
- http://blog.packagecloud.io/eng/2016/06/22/monitoring-tuning-linux-networking-stack-receiving-data/
- Linux, network
- http://jp.hortonworks.com/webinar/%E5%88%9D%E3%82%81%E3%81%A6%E3%81%AE-hortonworks-data-platform-%E3%80%9C-apache-hadoop-spark%E3%82%92%E3%83%99%E3%83%BC%E3%82%B9%E3%81%AB%E3%83%87%E3%83%BC%E3%82%BF%E3%81%AE%E8%93%84%E7%A9%8D/
- https://fabxc.org/blog/2017-04-10-writing-a-tsdb/
- "Writing a Time Series Database from Scratch"
- http://d.hatena.ne.jp/yohei-a/20170420/1492685305
- column-oriented database
- https://www.mapd.com/products/core/
- MapD - "The world's fastest in-memory, distributed GPU database powers the world's most immersive data exploration experience."
- https://aajisaka.github.io/hadoop-document/hadoop-project/hadoop-yarn/hadoop-yarn-site/DockerContainers.html
- Hadoop 3 Docker containers
- https://blog.cloudera.co.jp/introduction-to-hdfs-erasure-coding-in-apache-hadoop-c13910ba15d4
- HDFS Erasure Coding
- in Japanese
- https://github.com/hortonworks/data-tutorials
- HDP
- https://www.slideshare.net/julianhyde/a-smarter-pig-building-a-sql-interface-to-apache-pig-using-apache-calcite
- Calcite, Pig
- https://spotbugs.github.io/
- "SpotBugs is the spiritual successor of FindBugs"
- http://g.oswego.edu/dl/html/malloc.html
- memory allocator
- http://dev.classmethod.jp/event/cmdevio-2017-report-how-to-create-yarn-app/
- "How to create Yarn Application"
- http://blog.cloudera.com/blog/2017/04/apache-kudu-read-write-paths/
- https://hortonworks.com/blog/part-4-sams-stream-builder-building-complex-stream-analytics-apps-without-code/
- http://blog.cloudera.com/blog/2017/06/offset-management-for-apache-kafka-with-apache-spark-streaming/
- https://www.ibm.com/developerworks/jp/java/library/j-spring-boot-basics-perry/?cmp=dw&cpb=dwjav&ct=dwrss&cr=dwrss&ccy=jp&csr=092917
- Spring Boot
- http://matthewrocklin.com/blog/work/2017/10/16/streaming-dataframes-1
- https://www.classtools.net/blog/using-plutchiks-wheel-of-emotions-to-improve-the-evaluation-of-sources/
- English
- https://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases
- Linux, NUMA
- https://www.youtube.com/watch?v=fNuoVnCFV6U
- Spark, CBO
- https://qiita.com/kumagi/items/535c9b7a761d2ed52bc0
- Paxos
- https://qiita.com/kumagi/items/3867862c6be65328f89c
- Eventual Consistency.
- Read. But...
- https://www.slideshare.net/HadoopSummit/debugging-apache-hadoop-yarn-cluster-in-production-63887902
- https://jp.hortonworks.com/blog/ambari-kerberos-support-hbase-1/
- http://blog.cloudera.com/blog/2017/12/hadoop-delegation-tokens-explained/
- https://jp.hortonworks.com/blog/top-11-customer-stories-2017/
- https://jp.hortonworks.com/blog/yarn-capacity-scheduler/
- https://seekingalpha.com/article/4135112-big-data-hot-investors-carry-risk-others
- http://cs231n.stanford.edu/syllabus.html
- https://www.amazon.co.jp/dp/4061529021
- https://www.amazon.co.jp/dp/4873117585
- https://github.com/oxford-cs-deepnlp-2017/lectures
- http://cs224d.stanford.edu/syllabus.html
- https://www.coursera.org/learn/neural-networks/
- https://drive.google.com/file/d/1SuwiICLERd7SfYo3FiqNG0tCEBUjKcT7/view
- https://www.facebook.com/nipsfoundation/videos/1552060484885185/
- https://www.oreilly.co.jp/books/9784873117140/
- Graph database, Neo4j.
- http://gihyo.jp/book/2013/978-4-7741-5879-2
- however I prefer Python these days...