We have moved: https://github.com/magnetikonline/linuxmicrosoftievirtualmachines
Due to the popularity of this Gist, and the work in keeping it updated via a Gist, all future updates will take place at the above location. Thanks!
stratified = function(df, group, size) {
  # USE: * Specify your data frame and grouping variable (as column
  #        number) as the first two arguments.
  #      * Decide on your sample size. For a sample proportional to the
  #        population, enter "size" as a decimal. For an equal number
  #        of samples from each group, enter "size" as a whole number.
  #
  # Example 1: Sample 10% of each group from a data frame named "z",
  #            where the grouping variable is the fourth variable, use:
  #
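The body of `stratified` is cut off here, so the following is only a minimal Python sketch of the behavior the usage comment describes, not the original R implementation. The name `stratified_sample`, the `seed` parameter, the rounding of proportional sizes, and the use of a 0-based column index (R counts columns from 1) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(rows, group_index, size, seed=None):
    """Draw a sample from every group found in column `group_index`.

    Hypothetical helper mirroring the usage notes above:
      size < 1   -> proportion of each group to keep (rounded to a whole row count)
      size >= 1  -> fixed number of rows to keep from each group
    """
    rng = random.Random(seed)

    # Bucket the rows by the value of the grouping column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_index]].append(row)

    sample = []
    for key, members in groups.items():
        n = round(len(members) * size) if size < 1 else int(size)
        if n > len(members):
            raise ValueError(f"group {key!r} has only {len(members)} rows, need {n}")
        # Sample without replacement inside the group.
        sample.extend(rng.sample(members, n))
    return sample
```

With 50 rows in group "a" and 30 in group "b", `stratified_sample(rows, 1, 0.1)` returns 8 rows (5 from "a", 3 from "b"), while `stratified_sample(rows, 1, 3)` returns 3 rows from each group.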
/***************************************************************************
 *
 * This is a simple Monte Carlo simulation to see whether our
 * salesperson should execute a strategy of 'many' big deals and 'few'
 * small ones, or vice versa.
 *
 * Author: Ido Green | plus.google.com/+greenido
 * Date: 16 July 2013
 *
 ***************************************************************************/
This guide sets up a non-clustered Nutch crawler that stores its data in HBase. It does not cover how to set up Hadoop and related tools, only the bare minimum needed to crawl and index websites on a single machine.