@mrocklin
Created May 19, 2014 23:53
{
"metadata": {
"name": "",
"signature": "sha256:1fa3c2a70db6eabcfb847614c5d10ff979afcc6bd9f37c6326b2b33a11dd9345"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Blaze Data"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from blaze.data import *\n",
"schema = '{transaction: int, sender: int, recipient: int, date: string, value: real}'\n",
"dd = CSV('user_edges.txt', schema=schema)\n",
"tuple(dd.py[:3])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 1,
"text": [
"((1, 2, 2, u'20130410142250', 24.375),\n",
" (1, 2, 782477, u'20130410142250', 0.7709),\n",
" (2, 620423, 4571210, u'20111227114312', 614.17495129))"
]
}
],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Blaze Expressions"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from blaze.expr.table import *\n",
"t = TableSymbol(schema)\n",
"outdegree = By(t, t['sender'], t['recipient'].count()).sort('recipient', ascending=False)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Blaze Computation"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from blaze.compute.python import compute\n",
"compute(outdegree, dd.py[::1000])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
"[(25, 8116),\n",
" (1374, 1329),\n",
" (11, 554),\n",
" (645014, 317),\n",
" (29, 206),\n",
" (35087, 140),\n",
" (1877737, 138),\n",
" (3958, 134),\n",
" (12564, 133),\n",
" (193558, 133),\n",
" (74, 121),\n",
" (586816, 113),\n",
" (870051, 99),\n",
" (164206, 88),\n",
" (631, 85),\n",
" (373150, 79),\n",
" (27, 75),\n",
" (507544, 74),\n",
" (32668, 74),\n",
" (540461, 71),\n",
" (17321, 65),\n",
" (97650, 64),\n",
" (1450491, 60),\n",
" (453676, 53),\n",
" (466119, 50),\n",
" (1, 48),\n",
" (532, 45),\n",
" (264077, 45),\n",
" (9672, 44),\n",
" (116, 43),\n",
" (782688, 42),\n",
" (115, 41),\n",
" (1912, 41),\n",
" (240624, 39),\n",
" (28712, 38),\n",
" (540566, 37),\n",
" (151, 35),\n",
" (6500, 35),\n",
" (12676, 35),\n",
" (20, 34),\n",
" (142970, 34),\n",
" (5167, 33),\n",
" (513230, 33),\n",
" (362229, 32),\n",
" (26468, 32),\n",
" (91638, 32),\n",
" (32720, 31),\n",
" (180005, 30),\n",
" (858050, 29),\n",
" (423191, 29),\n",
" (520254, 28),\n",
" (32611, 28),\n",
" (824, 27),\n",
" (20868, 27),\n",
" (65132, 27),\n",
" (7, 26),\n",
" (39, 26),\n",
" (493524, 26),\n",
" (537178, 26),\n",
" (793, 25),\n",
" (89322, 25),\n",
" (335603, 24),\n",
" (456360, 24),\n",
" (534911, 24),\n",
" (11902, 23),\n",
" (240635, 23),\n",
" (14496, 23),\n",
" (48648, 22),\n",
" (842023, 21),\n",
" (7466, 21),\n",
" (224319, 21),\n",
" (39337, 20),\n",
" (374476, 20),\n",
" (520844, 20),\n",
" (131411, 19),\n",
" (2169, 19),\n",
" (27910, 19),\n",
" (845875, 19),\n",
" (505134, 19),\n",
" (640, 18),\n",
" (333668, 18),\n",
" (177807, 18),\n",
" (2119, 17),\n",
" (35081, 17),\n",
" (32574, 17),\n",
" (304202, 17),\n",
" (240490, 17),\n",
" (17047, 17),\n",
" (116261, 17),\n",
" (215152, 17),\n",
" (20302, 17),\n",
" (27322, 17),\n",
" (522065, 17),\n",
" (230521, 16),\n",
" (377601, 16),\n",
" (513869, 16),\n",
" (459615, 15),\n",
" (33711, 15),\n",
" (16146, 15),\n",
" (382692, 15),\n",
" (187131, 15),\n",
" (355608, 15),\n",
" (523645, 15),\n",
" (309, 14),\n",
" (126039, 14),\n",
" (797458, 14),\n",
" (518891, 14),\n",
" (3112, 14),\n",
" (283711, 14),\n",
" (34299, 13),\n",
" (1380463, 13),\n",
" (6912, 13),\n",
" (76047, 13),\n",
" (110625, 13),\n",
" (1196016, 13),\n",
" (25146, 13),\n",
" (255601, 13),\n",
" (162286, 13),\n",
" (30528, 13),\n",
" (656615, 12),\n",
" (2184, 12),\n",
" (5093, 12),\n",
" (7718, 12),\n",
" (1859351, 12),\n",
" (277895, 12),\n",
" (119412, 12),\n",
" (316326, 12),\n",
" (383378, 12),\n",
" (539804, 12),\n",
" (126106, 12),\n",
" (64358, 12),\n",
" (71, 11),\n",
" (1813, 11),\n",
" (3536, 11),\n",
" (1013806, 11),\n",
" (270942, 11),\n",
" (74925, 11),\n",
" (75435, 11),\n",
" (520906, 11),\n",
" (428598, 11),\n",
" (671550, 11),\n",
" (14108, 11),\n",
" (123571, 11),\n",
" (23595, 11),\n",
" (813726, 11),\n",
" (519188, 11),\n",
" (5104, 11),\n",
" (37978, 11),\n",
" (31569, 11),\n",
" (40, 10),\n",
" (653, 10),\n",
" (460129, 10),\n",
" (1876, 10),\n",
" (34947, 10),\n",
" (299429, 10),\n",
" (169171, 10),\n",
" (203769, 10),\n",
" (597697, 10),\n",
" (9004, 10),\n",
" (13285, 10),\n",
" (113524, 10),\n",
" (408638, 10),\n",
" (212657, 10),\n",
" (114384, 10),\n",
" (412631, 10),\n",
" (292775, 10),\n",
" (708815, 10),\n",
" (283886, 10),\n",
" (54558, 10),\n",
" (121669, 10),\n",
" (547953, 10),\n",
" (710262, 10),\n",
" (56986, 10),\n",
" (484890, 10),\n",
" (126296, 10),\n",
" (388823, 10),\n",
" (30180, 10),\n",
" (360491, 9),\n",
" (46, 9),\n",
" (82, 9),\n",
" (114, 9),\n",
" (195, 9),\n",
" (365, 9),\n",
" (366, 9),\n",
" (66263, 9),\n",
" (33511, 9),\n",
" (494469, 9),\n",
" (36495, 9),\n",
" (3314718, 9),\n",
" (5711, 9),\n",
" (38656, 9),\n",
" (497739, 9),\n",
" (31868, 9),\n",
" (7122, 9),\n",
" (8247, 9),\n",
" (533434, 9),\n",
" (9551, 9),\n",
" (76003, 9),\n",
" (11187, 9),\n",
" (799871, 9),\n",
" (669767, 9),\n",
" (14474, 9),\n",
" (62707, 9),\n",
" (671552, 9),\n",
" (3621385, 9),\n",
" (8287, 9),\n",
" (38963, 9),\n",
" (509554, 9),\n",
" (444506, 9),\n",
" (2312492, 9),\n",
" (20211, 9),\n",
" (185343, 9),\n",
" (342966, 9),\n",
" (223634, 9),\n",
" (193650, 9),\n",
" (522605, 9),\n",
" (32573, 9),\n",
" (75, 8),\n",
" (164016, 8),\n",
" (114628, 8),\n",
" (1867, 8),\n",
" (34851, 8),\n",
" (398726, 8),\n",
" (726626, 8),\n",
" (71777, 8),\n",
" (6568, 8),\n",
" (532014, 8),\n",
" (237788, 8),\n",
" (76057, 8),\n",
" (3898994, 8),\n",
" (243470, 8),\n",
" (1720242, 8),\n",
" (116775, 8),\n",
" (117114, 8),\n",
" (477919, 8),\n",
" (232722, 8),\n",
" (86213, 8),\n",
" (86927, 8),\n",
" (346005, 8),\n",
" (25245, 8),\n",
" (408501, 8),\n",
" (670736, 8),\n",
" (75474, 8),\n",
" (17871, 8),\n",
" (29666, 8),\n",
" (3569809, 8),\n",
" (884533, 8),\n",
" (180, 7),\n",
" (211, 7),\n",
" (402, 7),\n",
" (664, 7),\n",
" (856, 7),\n",
" (334, 7),\n",
" (3081, 7),\n",
" (201884, 7),\n",
" (465394, 7),\n",
" (39646, 7),\n",
" (2562899, 7),\n",
" (9123, 7),\n",
" (468491, 7),\n",
" (84555, 7),\n",
" (242451, 7),\n",
" (13516, 7),\n",
" (3683710, 7),\n",
" (1127913, 7),\n",
" (145042, 7),\n",
" (1646322, 7),\n",
" (278186, 7),\n",
" (802550, 7),\n",
" (442450, 7),\n",
" (442843, 7),\n",
" (181756, 7),\n",
" (249100, 7),\n",
" (20569, 7),\n",
" (3627, 7),\n",
" (481211, 7),\n",
" (1849, 7),\n",
" (514536, 7),\n",
" (57830, 7),\n",
" (20491, 7),\n",
" (492967, 7),\n",
" (419517, 7),\n",
" (59571, 7),\n",
" (116610, 7),\n",
" (290707, 7),\n",
" (463616, 7),\n",
" (586574, 7),\n",
" (5881357, 7),\n",
" (31709, 7),\n",
" (4, 6),\n",
" (140, 6),\n",
" (330, 6),\n",
" (356, 6),\n",
" (707, 6),\n",
" (726, 6),\n",
" (427232, 6),\n",
" (526055, 6),\n",
" (1946, 6),\n",
" (644795, 6),\n",
" (1870265, 6),\n",
" (428590, 6),\n",
" (264911, 6),\n",
" (593283, 6),\n",
" (593284, 6),\n",
" (429473, 6),\n",
" (331693, 6),\n",
" (922569, 6),\n",
" (530339, 6),\n",
" (6069, 6),\n",
" (891072, 6),\n",
" (4332231, 6),\n",
" (39665, 6),\n",
" (465703, 6),\n",
" (170861, 6),\n",
" (41308, 6),\n",
" (402613, 6),\n",
" (4040485, 6),\n",
" (534523, 6),\n",
" (10759, 6),\n",
" (1735991, 6),\n",
" (11178, 6),\n",
" (241186, 6),\n",
" (372948, 6),\n",
" (307757, 6),\n",
" (215217, 6),\n",
" (1531589, 6),\n",
" (931606, 6),\n",
" (800951, 6),\n",
" (636018, 6),\n",
" (15325, 6),\n",
" (80949, 6),\n",
" (998666, 6),\n",
" (3545382, 6),\n",
" (507859, 6),\n",
" (2180035, 6),\n",
" (476301, 6),\n",
" (116109, 6),\n",
" (280043, 6),\n",
" (444782, 6),\n",
" (288184, 6),\n",
" (19407, 6),\n",
" (3767, 6),\n",
" (260457, 6),\n",
" (186979, 6),\n",
" (533708, 6),\n",
" (57196, 6),\n",
" (1007540, 6),\n",
" (3399842, 6),\n",
" (188589, 6),\n",
" (518110, 6),\n",
" (417715, 6),\n",
" (27884, 6),\n",
" (1863800, 6),\n",
" (1996212, 6),\n",
" (31385, 6),\n",
" (4817301, 6),\n",
" (294656, 6),\n",
" (1146666, 6),\n",
" (360325, 6),\n",
" (150, 5),\n",
" (33056, 5),\n",
" (426295, 5),\n",
" (4194663, 5),\n",
" (471, 5),\n",
" (612, 5),\n",
" (656361, 5),\n",
" (3441842, 5),\n",
" (66816, 5),\n",
" (263441, 5),\n",
" (1932, 5),\n",
" (133519, 5),\n",
" (4426297, 5),\n",
" (4391898, 5),\n",
" (3099, 5),\n",
" (3050557, 5),\n",
" (36127, 5),\n",
" (530343, 5),\n",
" (3817, 5),\n",
" (27955, 5),\n",
" (201633, 5),\n",
" (2331765, 5),\n",
" (28200, 5),\n",
" (2102771, 5),\n",
" (235051, 5),\n",
" (399163, 5),\n",
" (1000, 5),\n",
" (6257, 5),\n",
" (399577, 5),\n",
" (813107, 5),\n",
" (301484, 5),\n",
" (323352, 5),\n",
" (1350817, 5),\n",
" (7815, 5),\n",
" (4890627, 5),\n",
" (464009, 5),\n",
" (958655, 5),\n",
" (467258, 5),\n",
" (241807, 5),\n",
" (471208, 5),\n",
" (238931, 5),\n",
" (448827, 5),\n",
" (76945, 5),\n",
" (2764093, 5),\n",
" (12249, 5),\n",
" (40425, 5),\n",
" (1586456, 5),\n",
" (406918, 5),\n",
" (518784, 5),\n",
" (412104, 5),\n",
" (10539, 5),\n",
" (48853, 5),\n",
" (474889, 5),\n",
" (1425487, 5),\n",
" (1295286, 5),\n",
" (254237, 5),\n",
" (411880, 5),\n",
" (543176, 5),\n",
" (19255, 5),\n",
" (445701, 5),\n",
" (19777, 5),\n",
" (904528, 5),\n",
" (200190, 5),\n",
" (1103878, 5),\n",
" (40358, 5),\n",
" (219547, 5),\n",
" (55963, 5),\n",
" (166625, 5),\n",
" (777193, 5),\n",
" (766130, 5),\n",
" (1335531, 5),\n",
" (483654, 5),\n",
" (533921, 5),\n",
" (28950, 5),\n",
" (976132, 5),\n",
" (491900, 5),\n",
" (1174063, 5),\n",
" (475551, 5),\n",
" (86782, 5),\n",
" (42978, 5),\n",
" (356848, 5),\n",
" (815855, 5),\n",
" (616603, 5),\n",
" (1332047, 5),\n",
" (246296, 5),\n",
" (62733, 5),\n",
" (63115, 5),\n",
" (70, 4),\n",
" (80, 4),\n",
" (273119, 4),\n",
" (361, 4),\n",
" (622, 4),\n",
" (628, 4),\n",
" (1396669, 4),\n",
" (394070, 4),\n",
" (1007, 4),\n",
" (33870, 4),\n",
" (405941, 4),\n",
" (1380, 4),\n",
" (296475, 4),\n",
" (3737285, 4),\n",
" (329428, 4),\n",
" (526082, 4),\n",
" (1846, 4),\n",
" (2328, 4),\n",
" (491931, 4),\n",
" (2695, 4),\n",
" (2754, 4),\n",
" (232273, 4),\n",
" (2997, 4),\n",
" (199609, 4),\n",
" (3189, 4),\n",
" (68730, 4),\n",
" (3246, 4),\n",
" (3260, 4),\n",
" (3334, 4),\n",
" (4360, 4),\n",
" (266656, 4),\n",
" (203107, 4),\n",
" (4817767, 4),\n",
" (3511406, 4),\n",
" (22857, 4),\n",
" (2070770, 4),\n",
" (104790, 4),\n",
" (17470, 4),\n",
" (39305, 4),\n",
" (3545594, 4),\n",
" (203400, 4),\n",
" (6926, 4),\n",
" (6979, 4),\n",
" (72714, 4),\n",
" (7889, 4),\n",
" (663498, 4),\n",
" (1458, 4),\n",
" (238335, 4),\n",
" (41778, 4),\n",
" (225466, 4),\n",
" (1683655, 4),\n",
" (1634, 4),\n",
" (337544, 4),\n",
" (326613, 4),\n",
" (10257, 4),\n",
" (444083, 4),\n",
" (10309, 4),\n",
" (797100, 4),\n",
" (1092186, 4),\n",
" (10846, 4),\n",
" (273767, 4),\n",
" (995995, 4),\n",
" (307126, 4),\n",
" (503943, 4),\n",
" (503944, 4),\n",
" (133212, 4),\n",
" (2055620, 4),\n",
" (1487533, 4),\n",
" (210133, 4),\n",
" (13575, 4),\n",
" (111969, 4),\n",
" (2864494, 4),\n",
" (931522, 4),\n",
" (1554121, 4),\n",
" (21918, 4),\n",
" (51616, 4),\n",
" (1198499, 4),\n",
" (375355, 4),\n",
" (14933, 4),\n",
" (670423, 4),\n",
" (408386, 4),\n",
" (15250, 4),\n",
" (4930732, 4),\n",
" (15591, 4),\n",
" (15868, 4),\n",
" (16325, 4),\n",
" (1916974, 4),\n",
" (5063049, 4),\n",
" (377602, 4),\n",
" (532644, 4),\n",
" (377840, 4),\n",
" (377918, 4),\n",
" (116383, 4),\n",
" (536988, 4),\n",
" (5327042, 4),\n",
" (252673, 4),\n",
" (3162, 4),\n",
" (8633, 4),\n",
" (1428173, 4),\n",
" (4738594, 4),\n",
" (315129, 4),\n",
" (86231, 4),\n",
" (20943, 4),\n",
" (349264, 4),\n",
" (882957, 4),\n",
" (612534, 4),\n",
" (842335, 4),\n",
" (481980, 4),\n",
" (154554, 4),\n",
" (187507, 4),\n",
" (515644, 4),\n",
" (937915, 4),\n",
" (89936, 4),\n",
" (57294, 4),\n",
" (24827, 4),\n",
" (1925815, 4),\n",
" (843202, 4),\n",
" (64333, 4),\n",
" (484386, 4),\n",
" (200957, 4),\n",
" (222713, 4),\n",
" (142722, 4),\n",
" (1009327, 4),\n",
" (1631985, 4),\n",
" (255770, 4),\n",
" (32746, 4),\n",
" (59873, 4),\n",
" (59979, 4),\n",
" (3631740, 4),\n",
" (60213, 4),\n",
" (1010838, 4),\n",
" (3108061, 4),\n",
" (4718, 4),\n",
" (162179, 4),\n",
" (1484875, 4),\n",
" (29151, 4),\n",
" (356905, 4),\n",
" (3589704, 4),\n",
" (193240, 4),\n",
" (23764, 4),\n",
" (455723, 4),\n",
" (12201, 4),\n",
" (784439, 4),\n",
" (97173, 4),\n",
" (1604551, 4),\n",
" (2325992, 4),\n",
" (32754, 4),\n",
" (327631, 4),\n",
" (18, 3),\n",
" (21, 3),\n",
" (27312, 3),\n",
" (54, 3),\n",
" (393277, 3),\n",
" (846542, 3),\n",
" (266, 3),\n",
" (33073, 3),\n",
" (388, 3),\n",
" (491929, 3),\n",
" (447, 3),\n",
" (983555, 3),\n",
" (290680, 3),\n",
" (920, 3),\n",
" (880580, 3),\n",
" (33983, 3),\n",
" (361839, 3),\n",
" (1454, 3),\n",
" (398923, 3),\n",
" (1246739, 3),\n",
" (1572, 3),\n",
" (1589, 3),\n",
" (1786, 3),\n",
" (526081, 3),\n",
" (218757, 3),\n",
" (1181474, 3),\n",
" (1960, 3),\n",
" (5146617, 3),\n",
" (493606, 3),\n",
" (1345674, 3),\n",
" (35041, 3),\n",
" (291411, 3),\n",
" (21905, 3),\n",
" (2669, 3),\n",
" (2732, 3),\n",
" (821934, 3),\n",
" (2785, 3),\n",
" (1149673, 3),\n",
" (2866, 3),\n",
" (68625, 3),\n",
" (3234, 3),\n",
" (3249, 3),\n",
" (3255, 3),\n",
" (199972, 3),\n",
" (3377, 3),\n",
" (3422, 3),\n",
" (530338, 3),\n",
" (3540, 3),\n",
" (200243, 3),\n",
" (3688, 3),\n",
" (986926, 3),\n",
" (4110, 3),\n",
" (1380423, 3),\n",
" (168074, 3),\n",
" (561301, 3),\n",
" (1284214, 3),\n",
" (4924, 3),\n",
" (4949, 3),\n",
" (4976, 3),\n",
" (4989, 3),\n",
" (1512467, 3),\n",
" (5159, 3),\n",
" (148325, 3),\n",
" (234627, 3),\n",
" (324798, 3),\n",
" (234746, 3),\n",
" (824762, 3),\n",
" (136762, 3),\n",
" (5768, 3),\n",
" (1009, 3),\n",
" (6073, 3),\n",
" (6126, 3),\n",
" (1316923, 3),\n",
" (137310, 3),\n",
" (137311, 3),\n",
" (1480969, 3),\n",
" (30765, 3),\n",
" (3545383, 3),\n",
" (366952, 3),\n",
" (792980, 3),\n",
" (2922938, 3),\n",
" (530950, 3),\n",
" (1677748, 3),\n",
" (1874674, 3),\n",
" (138251, 3),\n",
" (367106, 3),\n",
" (48931, 3),\n",
" (531763, 3),\n",
" (251274, 3),\n",
" (2989805, 3),\n",
" (4038400, 3),\n",
" (4399003, 3),\n",
" (261052, 3),\n",
" (776958, 3),\n",
" (74895, 3),\n",
" (75018, 3),\n",
" (181836, 3),\n",
" (42450, 3),\n",
" (9759, 3),\n",
" (30880, 3),\n",
" (992921, 3),\n",
" (42678, 3),\n",
" (19972, 3),\n",
" (4105977, 3),\n",
" (468803, 3),\n",
" (26525, 3),\n",
" (2019325, 3),\n",
" (534521, 3),\n",
" (108624, 3),\n",
" (525180, 3),\n",
" (141498, 3),\n",
" (829909, 3),\n",
" (502256, 3),\n",
" (395022, 3),\n",
" (371419, 3),\n",
" (1190672, 3),\n",
" (174926, 3),\n",
" (2595982, 3),\n",
" (11095, 3),\n",
" (366992, 3),\n",
" (2437661, 3),\n",
" (56549, 3),\n",
" (275032, 3),\n",
" (459005, 3),\n",
" (2502570, 3),\n",
" (110585, 3),\n",
" (274426, 3),\n",
" (209110, 3),\n",
" (241924, 3),\n",
" (526392, 3),\n",
" (297024, 3),\n",
" (2077339, 3),\n",
" (13422, 3),\n",
" (46324, 3),\n",
" (5119529, 3),\n",
" (210408, 3),\n",
" (13901, 3),\n",
" (210553, 3),\n",
" (2177269, 3),\n",
" (46736, 3),\n",
" (243792, 3),\n",
" (407720, 3),\n",
" (362870, 3),\n",
" (3137272, 3),\n",
" (80413, 3),\n",
" (703016, 3),\n",
" (47436, 3),\n",
" (61822, 3),\n",
" (507306, 3),\n",
" (1261175, 3),\n",
" (221127, 3),\n",
" (376578, 3),\n",
" (114436, 3),\n",
" (2127231, 3),\n",
" (527026, 3),\n",
" (475210, 3),\n",
" (13726, 3),\n",
" (405920, 3),\n",
" (115266, 3),\n",
" (49825, 3),\n",
" (377794, 3),\n",
" (120607, 3),\n",
" (17904, 3),\n",
" (17915, 3),\n",
" (50747, 3),\n",
" (116708, 3),\n",
" (149481, 3),\n",
" (64991, 3),\n",
" (346146, 3),\n",
" (313942, 3),\n",
" (19093, 3),\n",
" (19099, 3),\n",
" (19143, 3),\n",
" (445316, 3),\n",
" (19838, 3),\n",
" (839144, 3),\n",
" (249695, 3),\n",
" (2609021, 3),\n",
" (287383, 3),\n",
" (282536, 3),\n",
" (479229, 3),\n",
" (20488, 3),\n",
" (872512, 3),\n",
" (20775, 3),\n",
" (20779, 3),\n",
" (184705, 3),\n",
" (184924, 3),\n",
" (1856153, 3),\n",
" (946715, 3),\n",
" (120290, 3),\n",
" (382507, 3),\n",
" (186034, 3),\n",
" (4445940, 3),\n",
" (317184, 3),\n",
" (448268, 3),\n",
" (481098, 3),\n",
" (1688737, 3),\n",
" (2959985, 3),\n",
" (219605, 3),\n",
" (4282841, 3),\n",
" (383742, 3),\n",
" (1533089, 3),\n",
" (384451, 3),\n",
" (417021, 3),\n",
" (4066, 3),\n",
" (528298, 3),\n",
" (28052, 3),\n",
" (391786, 3),\n",
" (122495, 3),\n",
" (974476, 3),\n",
" (1058075, 3),\n",
" (1859350, 3),\n",
" (975075, 3),\n",
" (352517, 3),\n",
" (3203530, 3),\n",
" (57832, 3),\n",
" (484003, 3),\n",
" (254749, 3),\n",
" (58388, 3),\n",
" (58917, 3),\n",
" (26011, 3),\n",
" (255422, 3),\n",
" (5531076, 3),\n",
" (222731, 3),\n",
" (503882, 3),\n",
" (7376, 3),\n",
" (2057934, 3),\n",
" (222949, 3),\n",
" (59034, 3),\n",
" (80963, 3),\n",
" (911861, 3),\n",
" (289302, 3),\n",
" (92704, 3),\n",
" (1239678, 3),\n",
" (187505, 3),\n",
" (26483, 3),\n",
" (436092, 3),\n",
" (408872, 3),\n",
" (487174, 3),\n",
" (389215, 3),\n",
" (3144, 3),\n",
" (135927, 3),\n",
" (6091380, 3),\n",
" (15815, 3),\n",
" (783072, 3),\n",
" (1733414, 3),\n",
" (4322130, 3),\n",
" (3371871, 3),\n",
" (29578, 3),\n",
" (2126784, 3),\n",
" (86928, 3),\n",
" (620053, 3),\n",
" (390898, 3),\n",
" (16002, 3),\n",
" (423824, 3),\n",
" (4257786, 3),\n",
" (227344, 3),\n",
" (587851, 3),\n",
" (391257, 3),\n",
" (3569808, 3),\n",
" (489726, 3),\n",
" (260669, 3),\n",
" (916099, 3),\n",
" (32539, 3),\n",
" (25306, 3),\n",
" (64471, 3),\n",
" (97707, 3),\n",
" (2158255, 3),\n",
" (26, 2),\n",
" (37, 2),\n",
" (4161578, 2),\n",
" (76, 2),\n",
" (85, 2),\n",
" (153, 2),\n",
" (198, 2),\n",
" (2818315, 2),\n",
" (295196, 2),\n",
" (321, 2),\n",
" (2780210, 2),\n",
" (33162, 2),\n",
" (33203, 2),\n",
" (3441103, 2),\n",
" (458843, 2),\n",
" (524854, 2),\n",
" (33359, 2),\n",
" (601, 2),\n",
" (33383, 2),\n",
" (393873, 2),\n",
" (708, 2),\n",
" (393937, 2),\n",
" (4260602, 2),\n",
" (33590, 2),\n",
" (836, 2),\n",
" (525181, 2),\n",
" (917, 2),\n",
" (262304, 2),\n",
" (965, 2),\n",
" (1011, 2),\n",
" (17901, 2),\n",
" (4195357, 2),\n",
" (1178, 2),\n",
" (689312, 2),\n",
" (1222, 2),\n",
" (333346, 2),\n",
" (1237, 2),\n",
" (3802410, 2),\n",
" (1323, 2),\n",
" (361871, 2),\n",
" (787856, 2),\n",
" (1430, 2),\n",
" (1882247, 2),\n",
" (1513, 2),\n",
" (34296, 2),\n",
" (75339, 2),\n",
" (1082912, 2),\n",
" (1476163, 2),\n",
" (1660523, 2),\n",
" (1674, 2),\n",
" (132749, 2),\n",
" (67286, 2),\n",
" (4163291, 2),\n",
" (1830, 2),\n",
" (67393, 2),\n",
" (128817, 2),\n",
" (1980, 2),\n",
" (26255, 2),\n",
" (100365, 2),\n",
" (1076330, 2),\n",
" (395442, 2),\n",
" (2294, 2),\n",
" (1149178, 2),\n",
" (2300, 2),\n",
" (125998, 2),\n",
" (5409062, 2),\n",
" (4163885, 2),\n",
" (4163899, 2),\n",
" (4163924, 2),\n",
" (4164076, 2),\n",
" (1313471, 2),\n",
" (2296524, 2),\n",
" (382757, 2),\n",
" (385263, 2),\n",
" (50719, 2),\n",
" (199490, 2),\n",
" (2722629, 2),\n",
" (807534, 2),\n",
" (2952, 2),\n",
" (3020, 2),\n",
" (3021, 2),\n",
" (830635, 2),\n",
" (3110, 2),\n",
" (3134, 2),\n",
" (1150044, 2),\n",
" (3171, 2),\n",
" (3177, 2),\n",
" (3187, 2),\n",
" (232592, 2),\n",
" (1838235, 2),\n",
" (524828, 2),\n",
" (3257, 2),\n",
" (363739, 2),\n",
" (3314, 2),\n",
" (363771, 2),\n",
" (2526482, 2),\n",
" (1314074, 2),\n",
" (1862373, 2),\n",
" (1215976, 2),\n",
" (3509774, 2),\n",
" (69160, 2),\n",
" (3654, 2),\n",
" (3677, 2),\n",
" (200300, 2),\n",
" (331413, 2),\n",
" (4413394, 2),\n",
" (1675030, 2),\n",
" (69402, 2),\n",
" (855886, 2),\n",
" (36688, 2),\n",
" (841716, 2),\n",
" (1871850, 2),\n",
" (1282215, 2),\n",
" (4143, 2),\n",
" (1617248, 2),\n",
" (33964, 2),\n",
" (4187, 2),\n",
" (459451, 2),\n",
" (856244, 2),\n",
" (4283, 2),\n",
" (4286, 2),\n",
" (4349, 2),\n",
" (1032919, 2),\n",
" (4100410, 2),\n",
" (3772758, 2),\n",
" (2068883, 2),\n",
" (4526588, 2),\n",
" (463514, 2),\n",
" (4791, 2),\n",
" (463593, 2),\n",
" (1839866, 2),\n",
" (2954027, 2),\n",
" (2003786, 2),\n",
" (4965, 2),\n",
" (4133758, 2),\n",
" (1381261, 2),\n",
" (988101, 2),\n",
" (922568, 2),\n",
" ...]"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Who are the big spenders?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"big_spenders = (By(t, t['sender'], t['value'].sum())\n",
" .sort('value', ascending=False)\n",
" .head(10))\n",
"list(compute(big_spenders, dd.py[::1000]))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"[(3119971, 135140.0),\n",
" (4665331, 43039.33),\n",
" (4979659, 42926.094281),\n",
" (4896090, 42923.6281703),\n",
" (4798189, 37461.9390543),\n",
" (4825891, 36855.3859677),\n",
" (4796882, 31164.1292553),\n",
" (5095778, 28785.376222),\n",
" (2348831, 27015.1529336),\n",
" (4960807, 25632.8584088)]"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Let's do that again, now in SQL\n",
"\n",
"This will require:\n",
"\n",
"1. Migrating our data from a CSV file into a SQLite database\n",
"2. Executing our query inside SQLite\n",
"\n",
"Blaze will help us with both the migration and the computation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Migration\n",
"\n",
"First, let's migrate our data to SQL."
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"sql = SQL('sqlite:///btc.db', 'user_edges', schema=schema)\n",
"sql.extend(dd)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Execute\n",
"\n",
"That was easy. Now let's form a query and execute it against the SQL backend."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Translate Blaze expression into a SQLAlchemy query\n",
"from blaze.compute.sql import compute\n",
"query = compute(big_spenders, sql.table)\n",
"print(query)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"SELECT user_edges.sender, sum(user_edges.value) AS value \n",
"FROM user_edges GROUP BY user_edges.sender ORDER BY value DESC\n",
" LIMIT :param_1\n"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"with sql.engine.connect() as conn:\n",
"    print(list(conn.execute(query)))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[(11, 52461821.94165766), (1374, 23394277.034151807), (25, 13178095.975724494), (29, 5330179.983046564), (12564, 3669712.399824968), (782688, 2929023.064647781), (74, 2122710.961163437), (91638, 2094827.8251607446), (27, 2058124.131470339), (20, 1182868.148780274)]\n"
]
}
],
"prompt_number": 14
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Affecting the compute system\n",
"\n",
"Blaze acts as a thin layer on top of other mature compute systems. Maximizing performance often requires diving deep into your backend. We endeavor to keep your backend just barely under the surface, always within reach as your performance demands increase.\n",
"\n",
"Here the SQLAlchemy engine is directly accessible. We add an index onto the data to accelerate queries."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# The schema has no \"from\" column; the query groups by sender, so index that column\n",
"with sql.engine.connect() as conn:\n",
"    conn.execute('CREATE INDEX sender_index ON user_edges (sender);')"
],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}
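
The SQL that Blaze generates for `big_spenders` can be tried without Blaze at all. The following is a minimal sketch using only Python's standard `sqlite3` module, not the notebook's own code path. It assumes an in-memory database, reuses the three sample rows printed by the first cell rather than the full `user_edges.txt` file, and introduces a hypothetical helper `top_senders` and a hypothetical index name `sender_index`. The column names come from the notebook's schema string.

```python
import sqlite3

# The three rows shown in the notebook's first output cell:
# (transaction, sender, recipient, date, value)
rows = [
    (1, 2, 2, "20130410142250", 24.375),
    (1, 2, 782477, "20130410142250", 0.7709),
    (2, 620423, 4571210, "20111227114312", 614.17495129),
]


def top_senders(conn, limit=10):
    """Hypothetical helper: total value sent per sender, largest first.

    The statement mirrors the SQL printed in the notebook
    (GROUP BY sender, ORDER BY the summed value, LIMIT n).
    """
    query = (
        "SELECT sender, sum(value) AS value "
        "FROM user_edges GROUP BY sender ORDER BY value DESC LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()


# In-memory database (an assumption; the notebook writes to btc.db).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_edges "
    '("transaction" INTEGER, sender INTEGER, recipient INTEGER, '
    "date TEXT, value REAL)"
)
conn.executemany("INSERT INTO user_edges VALUES (?, ?, ?, ?, ?)", rows)

# Index the grouping column; the index name is illustrative.
conn.execute("CREATE INDEX sender_index ON user_edges (sender)")

result = top_senders(conn)
print(result)
```

With only three rows the index makes no measurable difference. It is included to show where such a statement would sit relative to the query.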