Davy Suvee datablend
// Create the graph
GraphDatabaseService graph = new EmbeddedGraphDatabase("yelp-graph");
Parser parser = new Parser("yelp_academic_dataset_business.json","yelp_academic_dataset_checkin.json");
Set<Business> businesses1 = parser.getBusinesses();
Set<Business> businesses2 = parser.getBusinesses();
// Double iteration
for (Business business1 : businesses1) {
    Transaction tx = graph.beginTx();
1. Datablend datastore: 9 sec (MacBook Pro Retina)
2. Oracle: 14 sec (+55 %) (4-node cluster)
3. Vertica: 97 sec (+977 %) (4-node cluster)
datablend / datasize
Last active December 12, 2015 07:49
1. Oracle HCC Query High: 262 MB
2. Datablend datastore: 296 MB (+13 %)
3. Oracle HCC Query Low: 361 MB (+38 %)
4. Vertica: 560 MB (+113 %)
sourceValues = sourceRow.getValues();
for (sourceValue : sourceValues) {
    targetValues = rows[sourceValue].getValues();
    for (targetValue : targetValues) {
        if (sourceValues.contains(targetValue)) {
            triangles++;
        }
    }
}
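The pseudocode above can be made runnable along the following lines. This is a minimal sketch, not the original implementation: the class `TriangleCount`, the method names, and the use of a `Map` of `Set`s for the rows are all assumptions made for illustration. It also assumes that each row lists only higher-numbered neighbours, as in the sample rows `0 - 1 2 7 8 9`, `1 - 2 5` and `2 - 5 7479` shown on this page, so that every triangle is counted once.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of the original gist.
class TriangleCount {

    // Counts triangles over adjacency rows (vertex -> set of neighbours).
    static long countTriangles(Map<Integer, Set<Integer>> rows) {
        long triangles = 0;
        for (Set<Integer> sourceValues : rows.values()) {
            for (Integer sourceValue : sourceValues) {
                // A vertex without a row of its own has no listed neighbours.
                Set<Integer> targetValues =
                        rows.getOrDefault(sourceValue, Collections.emptySet());
                for (Integer targetValue : targetValues) {
                    // The third edge closes the triangle.
                    if (sourceValues.contains(targetValue)) {
                        triangles++;
                    }
                }
            }
        }
        return triangles;
    }

    // The three sample rows from this page.
    static Map<Integer, Set<Integer>> sampleRows() {
        Map<Integer, Set<Integer>> rows = new LinkedHashMap<>();
        rows.put(0, Set.of(1, 2, 7, 8, 9));
        rows.put(1, Set.of(2, 5));
        rows.put(2, Set.of(5, 7479));
        return rows;
    }

    public static void main(String[] args) {
        // Triangles found: (0, 1, 2) and (1, 2, 5).
        System.out.println(countTriangles(sampleRows())); // prints 2
    }
}
```

Hash sets keep the `contains` check at constant expected time; sorted arrays with a merge-style intersection would be another reasonable choice.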
0 - 1 2 7 8 9
1 - 2 5
2 - 5 7479
datablend / rows
Last active December 12, 2015 07:38
0 - 1 7 8 9
1 - 2 5
2 - 0 5 7479
0 1
0 7
0 8
0 9
1 2
1 5
2 0
2 5
2 7489
-- Retrieve the input parameters
local inputCompound = ARGV[1];
local similarity = ARGV[2];
-- Get the number of fingerprints of the input compound
local countToFind = redis.call('scard', inputCompound .. ':f');
-- Calculate the max, min and number of fingerprints to consider
local maxFingerprints = math.floor(countToFind / similarity);
local minFingerprints = math.floor(countToFind * similarity);
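The bound calculation in the Lua fragment above can be checked in isolation with a small Java sketch. The class `FingerprintBounds` and its method names are hypothetical and only mirror the two `math.floor` lines. The reading behind it is an assumption, since the fragment does not name the similarity measure: if similarity is a Tanimoto-style ratio of shared to combined fingerprints, a candidate can only reach the threshold when its fingerprint count lies between `countToFind * similarity` and `countToFind / similarity`.

```java
// Hypothetical helper class mirroring the arithmetic of the Lua fragment.
class FingerprintBounds {

    // Largest fingerprint count a candidate may have, as in the Lua script.
    static long maxFingerprints(long countToFind, double similarity) {
        checkSimilarity(similarity);
        return (long) Math.floor(countToFind / similarity);
    }

    // Smallest fingerprint count a candidate may have, as in the Lua script.
    static long minFingerprints(long countToFind, double similarity) {
        checkSimilarity(similarity);
        return (long) Math.floor(countToFind * similarity);
    }

    // Assumption: the threshold is a ratio in (0, 1].
    private static void checkSimilarity(double similarity) {
        if (!(similarity > 0.0 && similarity <= 1.0)) {
            throw new IllegalArgumentException("similarity must be in (0, 1]");
        }
    }

    public static void main(String[] args) {
        long countToFind = 10;
        double similarity = 0.5;
        System.out.println(minFingerprints(countToFind, similarity)
                + " .. " + maxFingerprints(countToFind, similarity)); // prints 5 .. 20
    }
}
```

Rounding the lower bound down, as the Lua code does, only widens the candidate range, so no qualifying compound is excluded by the pre-filter.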
RandomAccessMDLReader reader = new RandomAccessMDLReader(new File(...));
EncodingFingerprint fingerprinter = new Encoding2DMolprint();
// We will use a pipeline in order to speedup the persisting process
Pipeline p = jedis.pipelined();
// Iterate the compounds one by one
for (int i = 0; i < reader.getSize(); i++) {
    // Retrieve the molecule and the fingerprints for this molecule
Current relationships:
DavyUpdated -> knows -> Marko
Marko -> knows -> Peter
Relationships at checkpoint Tue Jul 23 20:21:17 CEST 2012:
Davy -> knows -> Marko