Jim Plush jiminoc
public void testThatDifferentRPCandHttpPortsAreOK()
throws IOException {
Configuration conf = new HdfsConfiguration();
FileSystem.setDefaultUri(conf, "hdfs://localhost:8000");
conf.set(DFSConfigKeys.DFS_NAMENODE_HTTP_ADDRESS_KEY, "127.0.0.1:9000");
try {
NameNode nameNode = new NameNode(conf); // should be OK!
} catch (Exception e) {
jiminoc / schema.pig
Created August 12, 2011 16:25
Example of a Pig import Macro
-- Data would be in the following format
-- 2011-01-01^pageview^http://somesite.com/url1^randomuserid123
DEFINE grvschema(inputdata)
returns BEACONS {
$BEACONS = LOAD '$inputdata' USING PigStorage('^') as (date:chararray, action:chararray, url:chararray, userId:chararray);
};
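For checking the record layout outside Pig, here is a minimal Python sketch of the same four-field, `^`-delimited format. The names `Beacon` and `parse_beacon` are hypothetical, and the strict field-count check is an assumption of this sketch, not something the macro itself enforces.

```python
from typing import NamedTuple


class Beacon(NamedTuple):
    """One record, using the field names from the LOAD ... AS clause."""
    date: str
    action: str
    url: str
    userId: str


def parse_beacon(line: str) -> Beacon:
    """Split a '^'-delimited line into its four fields."""
    fields = line.rstrip("\n").split("^")
    if len(fields) != 4:
        # Assumption of this sketch: reject lines with the wrong field count.
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return Beacon(*fields)


sample = "2011-01-01^pageview^http://somesite.com/url1^randomuserid123"
beacon = parse_beacon(sample)
print(beacon.userId)  # randomuserid123
```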
jiminoc / uniques.pig
Created August 12, 2011 16:29
import macro and calculate uniques
-- Set the default number of reducers for this script
set default_parallel 15;
set job.name 'unique_users';
-- INCLUDE THE OFFICIAL SCHEMA, beacons will be a relation returned
IMPORT 'schema.pig';
beacons = grvschema('$input');
-- project the user id column; the schema in schema.pig names it userId
fb = FOREACH beacons GENERATE userId;
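The preview stops after the projection. Assuming the rest of the script counts distinct user ids, the intended result can be sanity-checked on a small sample with a Python sketch like the one below. The function name `count_unique_users` is hypothetical, and skipping malformed lines is an assumption of this sketch.

```python
from typing import Iterable


def count_unique_users(lines: Iterable[str]) -> int:
    """Count distinct values in the fourth '^'-delimited field (the user id)."""
    user_ids = set()
    for line in lines:
        fields = line.rstrip("\n").split("^")
        if len(fields) == 4:  # assumption: silently skip malformed lines
            user_ids.add(fields[3])
    return len(user_ids)


sample_lines = [
    "2011-01-01^pageview^http://somesite.com/url1^user1",
    "2011-01-01^pageview^http://somesite.com/url2^user1",
    "2011-01-02^click^http://somesite.com/url1^user2",
]
print(count_unique_users(sample_lines))  # 2
```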
jiminoc / loanpattern.scala
Created September 16, 2011 17:21
Example of the loan pattern in Scala
/**
* Here we test the loan pattern using something we do all the time: iterating over files and working with
* the lines in those files. Java makes it a pain in the arse to just read simple lines in a file: lots of setup
* code and resource management. The nice part of this pattern is that once you write that setup code, you can
* use a nice clean API like the one seen below.
* You can see here we're passing a function to "withFileIterator" that accepts a "line" string; we can then
* work on that line string freely, and the resource will be closed in a finally block behind the scenes.
*/
@Test
def loanPatternWithFileIterator() {
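The preview is cut off before the body, so the Scala helper itself is not shown. As a rough illustration of the idea described in the comment above, here is a minimal Python sketch of the loan pattern: the helper owns the file handle, lends each line to a caller-supplied function, and closes the file in a `finally` block. The name `with_file_iterator` and the temporary demo file are assumptions of this sketch, not the gist's actual code.

```python
import os
import tempfile
from typing import Callable


def with_file_iterator(path: str, handle_line: Callable[[str], None]) -> None:
    """Open `path`, lend each line to `handle_line`, and always close the file."""
    f = open(path, encoding="utf-8")
    try:
        for line in f:
            handle_line(line.rstrip("\n"))
    finally:
        f.close()  # runs even if handle_line raises


# Usage: the caller only supplies the per-line function.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    out.write("first\nsecond\n")

seen = []
with_file_iterator(demo_path, seen.append)
print(seen)  # ['first', 'second']
os.remove(demo_path)
```

In everyday Python a `with open(...)` block does the same job; the explicit `try`/`finally` is kept here only to mirror what the comment describes.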
jiminoc / loanpattern-writer.scala
Created September 16, 2011 17:32
Loan Pattern using BufferedWriter
/**
* We're going to receive an injected output buffer back to our curried function so we can just write lines
* to the output file. The resource will be flushed and closed for us at the end of the run, so we don't have to
* worry about resource management.
*/
@Test
def workWithBufferedWriterTest() {
val tmpDir = System.getProperty("java.io.tmpdir")
val tmpFilename = tmpDir + "/scalatowntest.log"
println("writing to file: " + tmpFilename)
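The rest of the test is not shown in the preview. The idea from the comment (the helper lends a writer to the caller's function, then flushes and closes it) can be sketched in Python as follows. The names `with_buffered_writer` and `write_lines` and the log file name are hypothetical, and unlike the Scala version this sketch passes the function as a plain second argument instead of currying.

```python
import os
import tempfile
from typing import Callable, TextIO


def with_buffered_writer(path: str, body: Callable[[TextIO], None]) -> None:
    """Open `path` for writing, lend the writer to `body`, then flush and close it."""
    writer = open(path, "w", encoding="utf-8")
    try:
        body(writer)
    finally:
        writer.close()  # close() flushes any buffered output first


def write_lines(writer: TextIO) -> None:
    """Caller-side code: it only writes and never manages the resource."""
    writer.write("line one\n")
    writer.write("line two\n")


tmp_filename = os.path.join(tempfile.gettempdir(), "loan_writer_sketch.log")
with_buffered_writer(tmp_filename, write_lines)

# Read the file back to confirm the output was flushed and closed.
with open(tmp_filename, encoding="utf-8") as f:
    written = f.read()
os.remove(tmp_filename)
print(written)
```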
<p class="textBodyBlack">
<span id="byLine"></span>
<table cellpadding="0" cellspacing="0" border="0" width="1%" align="left" style="padding:5 15 0 0;">
<tr>
<td>
<img src="http://media.cnbc.com/i/..Shortened for brevity>
</td>
</tr>
</table>
<b><strong>
<p class="textBodyBlack"><span id="byLine"></span></p>
<table cellpadding="0" cellspacing="0" border="0" width="1%" align="left" style="padding:5 15 0 0;">
<tr>
<td><img src="http://media.cnbc.com/i/CNBC/Sections/News_And_Analysis/__Story_Inserts/graphics/__FEDERAL_RESERVE/FED_RESERVE3.jpg" border="0" align="Left" height="150" width="200" vspace="0" hspace="0" /></td>
</tr>
</table>
<b><strong>
<a href="/id/44612678/"><strong>The Federal Reserve’s Operation Twist </strong></a></strong></b>was everything it was cracked up to be, and even a bit more.
TRACE com.gravity.goose.Crawler - crawler: STARTING TO RELEASE ALL RESOURCES
=======================::. ARTICLE REPORT .::======================
URL: http://www.nytimes.com/2011/09/20/arts/design/preserving-the-american-folk-art-museums-place-in-new-york.html?_r=1&ref=arts
TITLE: Preserving the American Folk Art Museum’s Place in New York
IMAGE: http://graphics8.nytimes.com/images/2011/09/20/arts/20folkart-web/20folkart-web-articleLarge.jpg
IMGKIND: bigimage
CONTENT: Please. Someone, everyone, do something to save the American Folk Art Museum from dissolution and dispersal. Or at least slow down the process, so that all options can be thoroughly considered. New York’s contemporary artists, and New York as a whole, need the creative energy of this stubborn, single-minded little institution, its outstanding exhibition program and its wondrous collection, an unparalleled mixture of classic American folk art and 20th-century outsider geniuses. At the moment it almost seems that the m
// create a pool of workers who will do the work
val worker = system.actorOf(Props[ZWorker].withRouter(
RoundRobinRouter(nrOfInstances = 10)),
name = "testworker")
val subSocket = system.newSocket(SocketType.Pull, Listener(listener), Connect("tcp://127.0.0.1:1235"), SubscribeAll)
import zmq
context = zmq.Context()
# Socket to talk to server
print("Connecting to server")
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5559")
print("Client started")
poll = zmq.Poller()