For companies that work with online advertising, specifically DSPs and their real-time bidding platforms, it is very important to collect and analyze information about the behavior and interests of users as they browse the internet. In this graph gist I therefore describe a basic approach for analyzing such data, considering a given period of time and the products viewed by each user. Suppose some characters from Breaking Bad browsed the internet a few days ago, found some products (chemical elements), and are interested in buying them. Knowing a user's profile and interests is extremely valuable when placing a bid in an advertising auction. So we store these users, products, and view dates so that we can extract this information later.
If you want, I can help with pointers on how to improve the indexing speed you get. It's quite easy to increase it substantially by following some simple guidelines, for example:
- Use create in the index API (assuming you can).
- Relax the real time aspect from 1 second to something a bit higher (index.engine.robin.refresh_interval).
- Increase the indexing buffer size (indices.memory.index_buffer_size); it defaults to 10%, which means 10% of the heap.
- Increase the number of dirty operations that trigger automatic flush (so the translog won't get really big, even though it's FS based) by setting index.translog.flush_threshold (defaults to 5000).
- Increase the memory allocated to the elasticsearch node. By default it's 1g.
- Start with a lower replica count (even 0), and then once the bulk loading is done, increase it to the value you want using the update_settings API. This will improve things, as possibly fewer shards will be allocated to each machine.
- Increase the number of machines you have so
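The replica advice above can be sketched as a small helper that builds the request bodies for the update_settings API. This is only an illustration under stated assumptions: the helper name `replica_settings_body` and the index name `my_index` are hypothetical, and the `index.number_of_replicas` key is not named in the text above; check it against the settings reference for your elasticsearch version.

```python
import json


def replica_settings_body(replicas: int) -> str:
    """Build the JSON body for an update-settings request that sets the replica count.

    Hypothetical helper: it only builds the body, it does not send the request.
    """
    if replicas < 0:
        raise ValueError("replica count must be non-negative")
    return json.dumps({"index": {"number_of_replicas": replicas}})


# Before bulk loading: drop replicas to 0 ("my_index" is an assumed index name).
before_load = replica_settings_body(0)
# After bulk loading: restore the replica count you actually want, here 2.
after_load = replica_settings_body(2)

print("PUT /my_index/_settings", before_load)
print("PUT /my_index/_settings", after_load)
```

The two printed lines are the requests you would send before and after the bulk load, with whatever HTTP client you already use.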
// installed Clojure packages:
//
// * BracketHighlighter
// * lispindent
// * SublimeREPL
// * sublime-paredit
{
  "word_separators": "/\\()\"',;!@$%^&|+=[]{}`~?",
  "paredit_enabled": true,
{:user {:dependencies [[org.clojure/tools.namespace "0.2.3"]
                       [spyscope "0.1.3"]
                       [criterium "0.4.1"]]
        :injections [(require '(clojure.tools.namespace repl find))
                     ; try/catch to work around an issue where `lein repl` outside a project dir
                     ; will not load reader literal definitions correctly:
                     (try (require 'spyscope.core)
                          (catch RuntimeException e))]
        :plugins [[lein-pprint "1.1.1"]
                  [lein-beanstalk "0.2.6"]
import static org.apache.commons.lang.StringUtils.isBlank;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.log4j.Logger;
import net.tanesha.recaptcha.ReCaptchaImpl;
import net.tanesha.recaptcha.ReCaptchaResponse;
#New graph
START root=node(0)
CREATE
  (User1 { name: 'User1' }),
  (User2 { name: 'User2' }),
  (User3 { name: 'User3' }),
  (Mac { name: 'Mac' }),
  (Samsung { name: 'Samsung' }),
  (Brastemp { name: 'Brastemp' }),
For companies that work in online advertising, more precisely DSPs and their real-time bidding platforms, it is very important to collect and analyze information about the behavior and interests of users while they surf the internet. Therefore, in this graph gist, I describe a basic approach that can be used to analyze such data, considering a certain period of time and the products viewed by each user. Let's say some characters from Breaking Bad surfed the internet a few days ago, found some interesting products (chemical elements), and are thinking about buying them. Such information is extremely important when bidding in an advertising auction, because it tells us the profile and interests of a given user. So we store these users, products, and view dates so that we can extract this information in the future.
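The idea of storing users, products, and view dates and then querying by period can be sketched in plain Python, independent of any graph database. This is a minimal illustration, not the gist's actual model: the sample records, the dates, and the function name `products_viewed` are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data: (user, product, date of the view).
VIEWS = [
    ("Walter", "Bromine", date(2013, 9, 1)),
    ("Walter", "Barium", date(2013, 9, 3)),
    ("Jesse", "Lithium", date(2013, 9, 5)),
    ("Jesse", "Bromine", date(2013, 8, 20)),
]


def products_viewed(views, start, end):
    """Group products by user for views whose date falls within [start, end]."""
    interests = defaultdict(set)
    for user, product, viewed_on in views:
        if start <= viewed_on <= end:
            interests[user].add(product)
    return dict(interests)


# Which products did each user view during the first week of September?
recent = products_viewed(VIEWS, date(2013, 9, 1), date(2013, 9, 7))
for user, products in sorted(recent.items()):
    print(user, sorted(products))
```

A bidder could consult a result like `recent` at auction time to see what a given user looked at within the chosen window. In a graph store the same question becomes a traversal from the user node over dated view relationships.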
You need to format the EBS volume (block device) with a file system between step 1 and step 2. So the entire process with your sample mount point is:
1. Create EBS volume.
2. Attach EBS volume to /dev/sdf (EC2's external name for this particular device number).
3. Format file system /dev/xvdf (Ubuntu's internal name for this particular device number):
    sudo mkfs.ext4 /dev/xvdf
4. Mount file system (with update to /etc/fstab so it stays mounted on reboot):
sudo mkdir -p /Library/Internet\ Plug-Ins/disabled
sudo mv /Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin /Library/Internet\ Plug-Ins/disabled
sudo ln -sf /System/Library/Java/Support/Deploy.bundle/Contents/Resources/JavaPlugin2_NPAPI.plugin /Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin
sudo ln -sf /System/Library/Frameworks/JavaVM.framework/Commands/javaws /usr/bin/javaws
sudo apt-get install python-software-properties
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
sudo apt-get install git