@vvuksan
Created July 8, 2011 15:10
Ganglia MonetDB integration
https://github.com/jmlowe/Ganglia-MonetDB-plugin
(16:07:47) jmlowe: monetdb's performance is still very flat 3400 rows, 4.3 msec
(16:11:03) bernardl: jmlowe: this is with gmetad-python constantly updating stuff?
(16:11:13) jmlowe: yes, it's banging away
(16:11:19) bernardl: jmlowe: cool -- how many hosts?
(16:11:24) jmlowe: just one
(16:11:27) bernardl: ok
(16:11:41) bernardl: how easy is it to setup monetdb? is it readily packaged up in Linux?
(16:12:19) jmlowe: maintenance day is next tuesday, I can probably have 1024 nodes for awhile
(16:12:31) bernardl: jmlowe: cool
(16:12:33) jmlowe: I think there are rpm's for rhel
(16:12:47) jmlowe: debian repositories also
(16:12:48) bernardl: yeah, once you got everything down, i'd like to test it out
(16:13:14) jmlowe: http://dev.monetdb.org/downloads/
(16:13:44) jmlowe: osx, solaris, windows, debian, rpms
(16:14:18) jmlowe: ymmv with getting the python libs
(16:14:46) bernardl: right now i only care about EL :)
(16:16:06) jmlowe: rhel5 is what I'm currently testing on, so you should be set
(16:16:12) bernardl: awesome
(16:16:35) bernardl: look forward to it!
(16:20:00) jmlowe: this is where I got it https://code.google.com/p/monetdb-rhel-5/
(16:24:10) bernardl: jmlowe: we should get that picked up in EPEL
(16:24:59) jmlowe: yeah, would be nice
(16:26:54) bernardl: jmlowe: but i suppose it shouldn't be too hard to rebuild the RPMs from Fedora :)
(16:31:08) jmlowe: bernardl: sent you what I have so far, a schema and a butchered version of rrd_plugin.py
(16:31:17) bernardl: heh ok
(16:31:35) bernardl: i don't suppose we need any of that RRD stuff any more :)
(16:33:28) jmlowe: I never like to take away a choice from somebody else, when I teach somebody to wire I let them try the hard way first before I show them how to work smarter not harder
(16:36:15) bernardl: jmlowe: well, it's a plugin, so the user can choose whether to use rrdtool or monetdb :)
sql>select count(*) from floats; select metric,avg(val) from floats group by metric;
+-------+
| L7    |
+=======+
|   504 |
+-------+
1 tuple
+---------------+------------------------+
| metric        | L10                    |
+===============+========================+
| bytes_in      |     3463.6079166666673 |
| pkts_in       |     51.849166666666655 |
| cpu_wio       |                      0 |
| load_one      |   0.048750000000000009 |
| bytes_out     |     5678.3087499999992 |
| mem_total     |               12292744 |
| mem_free      |                6024477 |
| pkts_out      |     44.432916666666671 |
| mem_buffers   |                 332489 |
| mem_shared    |                      0 |
| cpu_user      |    0.21250000000000002 |
| cpu_system    |   0.029166666666666664 |
| load_fifteen  |  0.0029166666666666668 |
| load_five     |   0.051666666666666694 |
| part_max_used |                     32 |
| swap_total    |               14352376 |
| mem_cached    |                3307717 |
| swap_free     |               13680448 |
| cpu_idle      |     99.758333333333326 |
| cpu_aidle     |                      0 |
| cpu_nice      |                      0 |
+---------------+------------------------+
21 tuples
Timer 3.435 msec 21 rows
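
The per-metric query above can be tried out without a MonetDB install. The following is a rough sketch only: Python's built-in sqlite3 stands in for MonetDB, the `ts` and `host` columns are assumptions (the log only shows `metric` and `val` in the `floats` table), and the sample rows are made up. It also shows one possible way to bucket readings by hour, assuming `ts` holds epoch seconds.

```python
import sqlite3

# sqlite3 stands in for MonetDB so the sketch runs anywhere.
# The ts and host columns are assumptions; the log only shows metric and val.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE floats (ts INTEGER, host TEXT, metric TEXT, val REAL)"
)

# Hypothetical samples: two metrics, three readings spread over two hours.
samples = [
    (0,    "node1", "load_one", 0.10),
    (1800, "node1", "load_one", 0.30),
    (3600, "node1", "load_one", 0.50),
    (0,    "node1", "cpu_idle", 99.0),
    (1800, "node1", "cpu_idle", 97.0),
    (3600, "node1", "cpu_idle", 95.0),
]
conn.executemany("INSERT INTO floats VALUES (?, ?, ?, ?)", samples)

# Same shape as the query in the log: one average per metric.
per_metric = dict(
    conn.execute("SELECT metric, avg(val) FROM floats GROUP BY metric")
)

# Hourly averages: bucket the epoch timestamp into whole hours
# (integer division, since ts is stored as an integer).
hourly = conn.execute(
    "SELECT metric, ts / 3600 AS hour, avg(val) "
    "FROM floats GROUP BY metric, hour ORDER BY metric, hour"
).fetchall()
```

`per_metric` maps each metric name to its overall average, and `hourly` holds one `(metric, hour, average)` tuple per metric per hour. The real schema in the plugin repository may differ.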
(10:52:38) jmlowe: left my monetdb experimental collector running overnight, 71ms to select hourly averages of 21 metrics across 100738 rows
(10:55:26) vvuksan: jmlowe: can you give me a rundown on monetdb
(10:56:30) jmlowe: imagine a file with a pile of arrays, [col1,col2,col3],[col1,col2,col3]...
(10:57:01) jmlowe: if you want to take the average you need to sum all of col1 and divide by n
(10:57:07) vvuksan: correct
(10:57:55) jmlowe: now imagine how that looks to the disk subsystem, you would need to grab every block that the db is stored in
(10:58:36) jmlowe: now imagine if you swapped the orientation [col1,col1,col1...],[col2,col2,col2...]...
(10:59:57) vvuksan: k
(11:00:36) jmlowe: you would load far fewer blocks, your caches would have far more hits, there is far more autocorrelation (read: compressible data) between any contiguous chunks of data, and you can trivially use SIMD (single instruction, multiple data: MMX, SSE, AltiVec, etc.)
(11:01:34) vvuksan: k
(11:01:55) jmlowe: the one caveat is that modifying data is very expensive, but if you have time series measurements you would never need to modify existing data unlike account balances for example
(11:02:12) vvuksan: right
(11:03:42) vvuksan: monetdb does this automatically ?
(11:04:25) jmlowe: monetdb is a column store db implementation that takes advantage of the hardware
(11:05:26) jmlowe: unlike the classical db and I think it will outperform rrd and other lossy types of compression
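
The row-versus-column orientation jmlowe describes can be illustrated in a few lines of Python. This is a toy sketch, not how MonetDB stores data internally: the field names and values are hypothetical, and the standard-library `array` module stands in for a contiguous on-disk column.

```python
from array import array

# Row-oriented: one record per sample, all fields stored together.
rows = [
    (0, 0.10, 99.0),   # (ts, load_one, cpu_idle), hypothetical fields
    (60, 0.30, 97.0),
    (120, 0.50, 95.0),
]

# Column-oriented: one contiguous typed array per field.
columns = {
    "ts": array("q", (r[0] for r in rows)),
    "load_one": array("d", (r[1] for r in rows)),
    "cpu_idle": array("d", (r[2] for r in rows)),
}

def avg_from_rows(rows, index):
    # Has to walk every record, even though only one field is needed.
    return sum(r[index] for r in rows) / len(rows)

def avg_from_column(col):
    # Touches only the one array that holds the field.
    return sum(col) / len(col)

row_avg = avg_from_rows(rows, 2)
col_avg = avg_from_column(columns["cpu_idle"])

# Time-series data is append-only: new samples go on the end of each
# column, and existing values are never modified in place.
for name, value in zip(("ts", "load_one", "cpu_idle"), (180, 0.20, 98.0)):
    columns[name].append(value)
```

Both functions return the same average; the difference is how much data each one has to read. In the columnar layout an aggregate over one metric never touches the other fields, which is the effect on disk blocks and caches described in the log.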