Francisco Lopez frank-leap

  • GSR
  • Spain
class HowardHinnantDate < Formula
  desc "C++ library for date and time operations based on <chrono>"
  homepage "https://github.com/HowardHinnant/date"
  url "https://github.com/HowardHinnant/date/archive/v2.4.1.tar.gz"
  sha256 "98907d243397483bd7ad889bf6c66746db0d7d2a39cc9aacc041834c40b65b98"
  bottle do
    cellar :any
    sha256 "4a838948afe43157af491b4310d36ae88e5cb731181568a19f66819198f24aee" => :catalina
  end
class MysqlConnectorCxx < Formula
  desc "MySQL database connector for C++ applications"
  homepage "https://dev.mysql.com/downloads/connector/cpp/"
  url "https://dev.mysql.com/get/Downloads/Connector-C++/mysql-connector-c++-8.0.18-src.tar.gz"
  sha256 "63b20e446c0aadeddbbc5cef36db8222d602793e6f1e6de511bdf7bcb2181f86"
  revision 2
  depends_on "boost" => :build
  depends_on "cmake" => :build
  depends_on "mysql-client"
In neural network terminology:
- one epoch = one forward pass and one backward pass over all the training examples
- batch size = the number of training examples in one forward/backward pass. The larger the batch size, the more memory you'll need.
- number of iterations = number of passes, each pass using [batch size] examples. To be clear, one pass = one forward pass + one backward pass (the forward pass and the backward pass are not counted as two separate passes).
Example: if you have 1000 training examples and your batch size is 500, it takes 2 iterations to complete 1 epoch.
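The arithmetic in the example above can be sketched in a few lines of Python. The function name is hypothetical, and rounding up for a final partial batch is an assumption (some training loops drop the incomplete batch instead).

```python
import math


def iterations_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of forward/backward passes needed to see every example once.

    Assumes a final, smaller batch still counts as one iteration.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_examples / batch_size)


# The example from the notes: 1000 examples, batch size 500.
print(iterations_per_epoch(1000, 500))
```

With a batch size that does not divide the data set evenly, for instance 1000 examples and a batch size of 300, the same function counts the last partial batch as a fourth iteration.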
#!/bin/bash
# forked from http://codegists.com/code/spark-submit-emr/
# Minimum TODOs on a per job basis:
# 1. define name, application jar path, main class, queue and log4j-yarn.properties path
# 2. remove properties not applicable to your Spark version (Spark 1.x vs. Spark 2.x)
# 3. tweak num_executors, executor_memory (+ overhead), and backpressure settings
# the two most important settings:
num_executors=6
frank-leap / s3count.md
Created February 9, 2018 16:11 — forked from cjdd3b/s3count.md
How to count files in an S3 bucket

Counting files in S3 buckets and folders is harder than it should be. But here's a way to get it done using s3cmd:

  1. Install s3cmd
     • On Mac, brew install s3cmd
     • On Windows, go here
  2. From the command line, run s3cmd --configure

  3. Add your credentials when prompted.
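Once s3cmd is configured, one way to get a count is to tally the lines of a recursive listing such as `s3cmd ls --recursive s3://BUCKET`. The sketch below only does the counting step on listing text that has already been captured. The function name, the sample bucket, and the assumed output layout (one line per object, with `DIR` marking prefixes in non-recursive listings) are illustrative assumptions, not taken from the gist.

```python
def count_objects(listing: str) -> int:
    """Count object lines in captured `s3cmd ls` output.

    Assumes one object per non-empty line; lines that start with DIR
    (prefix entries) are skipped.
    """
    count = 0
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("DIR"):
            continue
        count += 1
    return count


# Hypothetical listing text, shaped like s3cmd output.
sample = """\
2018-02-09 16:11      1024   s3://example-bucket/a.csv
2018-02-09 16:12      2048   s3://example-bucket/logs/b.csv
                       DIR   s3://example-bucket/empty/
"""
print(count_objects(sample))
```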

frank-leap / coursera_deep_learning_3.md
Created December 21, 2017 20:08
coursera_deep_learning_3.md

orthogonalization: know what to tune to achieve which effect. It helps to have orthogonal controls, each with a well-defined impact (steering wheel, acceleration, braking), but that is not usually the case in machine learning

chain of assumptions in ML, each with its own knobs:

  • fit the training set well on the cost function (roughly human-level): knobs would be: bigger network, better optimization algorithm (Adam)
  • hope it does well on the dev set: knobs would be: bigger training set, regularization
  • hope it does well on the test set: knob would be: bigger dev set
  • performs well in the real world: knobs would be: change the dev set or the cost function
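The chain above can be read as a lookup: find the first stage that falls short, then reach only for the knobs tied to that stage. Here is a minimal sketch of that idea. The function name, the error figures, and the `tolerance` threshold are all hypothetical, and the real-world stage is listed but not checked because it cannot be measured from these three error numbers.

```python
# Knobs from the notes, keyed by the stage they are meant to fix.
KNOBS = {
    "training": ["bigger network", "better optimization algorithm (Adam)"],
    "dev": ["bigger training set", "regularization"],
    "test": ["bigger dev set"],
    "real world": ["change dev set", "change cost function"],
}


def first_failing_stage(human_err, train_err, dev_err, test_err, tolerance=0.01):
    """Return the first stage whose error gap exceeds `tolerance`, or None.

    `tolerance` is an illustrative threshold, not a value from the notes.
    """
    gaps = [
        ("training", train_err - human_err),
        ("dev", dev_err - train_err),
        ("test", test_err - dev_err),
    ]
    for stage, gap in gaps:
        if gap > tolerance:
            return stage
    return None


# Made-up errors: training is close to human level, dev is far behind training.
stage = first_failing_stage(human_err=0.01, train_err=0.015, dev_err=0.10, test_err=0.105)
print(stage, KNOBS[stage])
```

Checking the stages in order mirrors the point of orthogonalization: each gap is attributed to one stage, so the knobs chosen for it should not disturb the others.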
frank-leap / spark_k8s.md
Last active October 12, 2017 13:28
Spark in Kubernetes
frank-leap / polyglot.md
Created October 8, 2017 07:48
Polyglot

K8s + Istio + Spring Boot + Golang? + Spark + ReactJS?

frank-leap / istio.md
Last active October 8, 2017 07:44
Research on Istio + Spring Boot + Django + Spark
frank-leap / build.sbt
Created November 20, 2016 19:24 — forked from r-wheeler/build.sbt
build.sbt for stanford NLP
name := "Simple Project"
version := "1.0"
libraryDependencies += "edu.stanford.nlp" % "stanford-corenlp" % "3.3.0"
libraryDependencies += "edu.stanford.nlp" % "stanford-corenlp" % "3.3.0" classifier "models"