Skip to content

Instantly share code, notes, and snippets.

View fforbeck's full-sized avatar

Felipe Forbeck fforbeck

View GitHub Profile
@fforbeck
fforbeck / gist:43f3e946357a045a6cf3
Created January 20, 2016 16:49 — forked from duydo/elasticsearch_best_practices.txt
ElasticSearch - Index best practices from Shay Banon
If you want, I can try and help with pointers as to how to improve the indexing speed you get. Its quite easy to really increase it by using some simple guidelines, for example:
- Use create in the index API (assuming you can).
- Relax the real time aspect from 1 second to something a bit higher (index.engine.robin.refresh_interval).
- Increase the indexing buffer size (indices.memory.index_buffer_size), it defaults to the value 10% which is 10% of the heap.
- Increase the number of dirty operations that trigger automatic flush (so the translog won't get really big, even though its FS based) by setting index.translog.flush_threshold (defaults to 5000).
- Increase the memory allocated to elasticsearch node. By default its 1g.
- Start with a lower replica count (even 0), and then once the bulk loading is done, increate it to the value you want it to be using the update_settings API. This will improve things as possibly less shards will be allocated to each machine.
- Increase the number of machines you have so
// installed Clojure packages:
//
// * BracketHighlighter
// * lispindent
// * SublimeREPL
// * sublime-paredit
{
"word_separators": "/\\()\"',;!@$%^&|+=[]{}`~?",
"paredit_enabled": true,
{:user {:dependencies [[org.clojure/tools.namespace "0.2.3"]
[spyscope "0.1.3"]
[criterium "0.4.1"]]
:injections [(require '(clojure.tools.namespace repl find))
; try/catch to workaround an issue where `lein repl` outside a project dir
; will not load reader literal definitions correctly:
(try (require 'spyscope.core)
(catch RuntimeException e))]
:plugins [[lein-pprint "1.1.1"]
[lein-beanstalk "0.2.6"]
import static org.apache.commons.lang.StringUtils.isBlank;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.log4j.Logger;
import net.tanesha.recaptcha.ReCaptchaImpl;
import net.tanesha.recaptcha.ReCaptchaResponse;

Interesse dos usuários

Motivação

Para empresas que trabalham com publicidade online, mais precisamente DSPs e suas plataformas de Real-time Bidding é muito importante coletar a analisar informações a cerca do comportamento e interesses dos usuários enquanto navegam na internet. Sendo assim, descrevo nesse graph-gist uma abordagem básica que pode ser utilizada para analisar tais dados, considerando um determinado período de tempo e os produtos visualizados por cada usuário. Vamos considerar que alguns personagens de Breaking Bad navegaram na internet há alguns dias e encontraram alguns produtos, elementos químicos e, possuem interesse em comprá-los. Tal informação será de extrema importância no momento de dar um lance em um leilão de publicidade, sabendo o perfil e interesse de um determinado usuário. Então armazenamos tais usuários, produtos e datas das visualizações para que possamos extrair essas informações futuramente.

@fforbeck
fforbeck / Graph Gist Sample
Last active December 23, 2015 17:59
Neo4j and Cypher (tests) Finding the users that liked some product a few days ago.
#New graph
START root=node(0)
CREATE
(User1 { name:'User1' }),
(User2 { name: 'User2' }),
(User3 { name: 'User3' }),
(Mac { name: 'Mac' }),
(Samsung { name: 'Samsung' }),
(Brastemp { name: 'Brastemp' }),

Interest of users

Motivation

For companies that work with online advertising, more precisely DSPs and their Real-time Bidding platforms is very important to consider collecting information about the behavior and interests of users while they surfing on the internet. Therefore, in this graph gist, I describe a basic approach that can be used to analyze such data, considering a certain period of time and the products visualized by each user. Let’s consider some characters from Breaking Bad surfed on the internet a few days ago and found some products interesting, chemical elements, and they are thinking about buying them. Such information is extremely important when making a bid at auction advertising, knowing the profile and interests of a given user. Then we store these users, products and dates of views so we can extract this information in the future.

@fforbeck
fforbeck / add_ebs_to_ubuntu_ec2
Last active December 19, 2015 02:48
Add EBS to Ubuntu EC2 Instance
You need to format the EBS volume (block device) with a file system between step 1 and step 2. So the entire process with your sample mount point is:
Create EBS volume.
Attach EBS volume to /dev/sdf (EC2's external name for this particular device number).
Format file system /dev/xvdf (Ubuntu's internal name for this particular device number):
sudo mkfs.ext4 /dev/xvdf
Mount file system (with update to /etc/fstab so it stays mounted on reboot):
@fforbeck
fforbeck / fix_java_plugin_macos
Created June 25, 2013 01:21
Fixing Java plugin for osx
sudo mkdir -p /Library/Internet\ Plug-Ins/disabled
sudo mv /Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin /Library/Internet\ Plug-Ins/disabled
sudo ln -sf /System/Library/Java/Support/Deploy.bundle/Contents/Resources/JavaPlugin2_NPAPI.plugin /Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin
sudo ln -sf /System/Library/Frameworks/JavaVM.framework/Commands/javaws /usr/bin/javaws
@fforbeck
fforbeck / git_install.sh
Created June 21, 2013 20:28
To install the latest stable from command line...
sudo apt-get install python-software-properties
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
sudo apt-get install git