Guru Dharmateja Medasani gdtm86

  • Domino Data Lab
  • Chicago, IL
@fchollet
fchollet / classifier_from_little_data_script_2.py
Last active September 13, 2023 03:34
Updated to the Keras 2.0 API.
'''This script goes along with the blog post
"Building powerful image classification models using very little data"
from blog.keras.io.
It uses data that can be downloaded at:
https://www.kaggle.com/c/dogs-vs-cats/data
In our setup, we:
- created a data/ folder
- created train/ and validation/ subfolders inside data/
- created cats/ and dogs/ subfolders inside train/ and validation/
- put the cat pictures index 0-999 in data/train/cats
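The folder layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `make_layout` is a hypothetical helper name that does not appear in the original script, and it only creates the empty folders; it does not download or copy any pictures.

```python
import tempfile
from pathlib import Path


def make_layout(root):
    """Create data/{train,validation}/{cats,dogs} under root (hypothetical helper)."""
    created = []
    for split in ("train", "validation"):
        for label in ("cats", "dogs"):
            folder = Path(root) / "data" / split / label
            # parents=True also creates data/ and the split folder if missing
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Demonstrate in a throwaway directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    for folder in make_layout(tmp):
        print(folder.relative_to(tmp))
```

Calling `make_layout(".")` instead would create the same tree in the current working directory.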
@baraldilorenzo
baraldilorenzo / readme.md
Created January 16, 2016 12:57
VGG-19 pre-trained model for Keras

## VGG19 model for Keras

This is the Keras model of the 19-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman

@baraldilorenzo
baraldilorenzo / readme.md
Last active June 13, 2024 03:07
VGG-16 pre-trained model for Keras

## VGG16 model for Keras

This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman

@gdtm86
gdtm86 / Mail.scala
Last active August 29, 2015 14:06 — forked from mariussoutier/Mail.scala
package object mail {
implicit def stringToSeq(single: String): Seq[String] = Seq(single)
implicit def liftToOption[T](t: T): Option[T] = Some(t)
sealed abstract class MailType
case object Plain extends MailType
case object Rich extends MailType
case object MultiPart extends MailType
@jkreps
jkreps / benchmark-commands.txt
Last active June 17, 2024 03:54
Kafka Benchmark Commands
Producer
Setup
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3
Single thread, no replication
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

Recent versions of Cloudera's Impala added NDV, a "number of distinct values" aggregate function that uses the HyperLogLog algorithm to estimate this number, in parallel, in a fixed amount of space.

This can make a really, really big difference: in a large table I tested this on, which had roughly 100M unique values of mycolumn, using NDV(mycolumn) got me an approximate answer in 27 seconds, whereas the exact answer using count(distinct mycolumn) took ... well, I don't know how long, because I got tired of waiting for it after 45 minutes.

It's fun to note, though, that because of another recent addition to Impala's dialect of SQL, the fnv_hash function, you don't actually need to use NDV; instead, you can build HyperLogLog yourself from mathematical primitives.

HyperLogLog hashes each value it sees and then assigns it to a bucket based on the low-order bits of the hash. It's common to use 1024 buckets; since 1024 = 2^10, a bitwise & with 1023 keeps the low 10 bits of the hash and gives us the bucket:

select
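The bucketing step can also be sketched outside of SQL. The snippet below is a minimal Python illustration, not the query from the post: it uses `hashlib.blake2b` as a stand-in 64-bit hash (an assumption; Impala's `fnv_hash` will produce different values), and `hash64` and `bucket_of` are hypothetical helper names.

```python
import hashlib

NUM_BUCKETS = 1024  # 2**10, so the mask below keeps the low 10 bits


def hash64(value):
    """Stand-in 64-bit hash (assumption: this is NOT Impala's fnv_hash)."""
    digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bucket_of(value):
    """Pick a bucket from the low-order bits of the hash."""
    return hash64(value) & (NUM_BUCKETS - 1)


# Spread 10,000 distinct values over the buckets and look at the extremes.
counts = [0] * NUM_BUCKETS
for i in range(10000):
    counts[bucket_of(i)] += 1
print(min(counts), max(counts))
```

Masking with `NUM_BUCKETS - 1` only works as a bucket selector because the bucket count is a power of two; for other counts a modulo would be needed instead.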
@mushkevych
mushkevych / Dockerfile
Last active December 30, 2015 07:39
Docker CDH 4.5
FROM ubuntu:precise
MAINTAINER Bohdan Mushkevych
# Installing Oracle JDK
RUN apt-get -y install python-software-properties ;\
add-apt-repository ppa:webupd8team/java ;\
apt-get update && apt-get -y upgrade ;\
echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections ;\
apt-get -y install oracle-java7-installer && apt-get clean ;\
update-alternatives --display java ;\
@airawat
airawat / 00-LogParser-Hive-Regex
Last active October 4, 2020 01:56
Log parser in Hive using regex serde
This gist includes HiveQL scripts to create an external partitioned table for Syslog-generated
log files using the regex SerDe.
Use case: count the number of occurrences of processes that got logged, by year, month,
day and process.
Includes:
---------
Sample data and structure: 01-SampleDataAndStructure
Data download: 02-DataDownload
Data load commands: 03-DataLoadCommands
@dholbrook
dholbrook / Tree.scala
Created June 21, 2012 17:59
Scala binary tree
/**
* D Holbrook
*
* Code Club: PO1
*
* (*) Define a binary tree data structure and related fundamental operations.
*
* Use whichever language features are the best fit (this will depend on the language you have selected). The following operations should be supported:
*
* Constructors
#!/usr/bin/env python
import argparse
import fileinput
from operator import itemgetter

# Command-line interface: a target size in MB plus one or more vmtouch output files.
parser = argparse.ArgumentParser()
parser.add_argument('--target-mb', action='store', dest='target_mb', default=61000, type=int)
parser.add_argument('vmtouch_output_file', action='store', nargs='+')
args = parser.parse_args()