pt-query-digest mysql.slow.log --no-report --filter 'print "put mysql.slowqueries $event->{timestamp} $event->{Query_time} query_md5=" . make_checksum($event->{fingerprint}) . " host=$event->{host} db=$event->{db} dbuser=$event->{user}\n"' | nc opentsdb 4242

format:

put mysql.slow.query 1335559893 1.889435 query_md5=AECBE3F75D62FCA4 host=api1 db=prod dbuser=app
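The key trick above is that the checksum is computed over the query *fingerprint* (the normalized query text, with literals abstracted away), so every execution of the same kind of query maps to one tag value. A minimal Python sketch of the idea, assuming pt-query-digest's `make_checksum` produces the first 16 uppercase hex characters of the fingerprint's MD5 (treat that detail as an assumption about the tool's internals):

```python
import hashlib

def make_checksum(fingerprint: str) -> str:
    # Assumed behavior: truncated uppercase MD5 of the normalized
    # query fingerprint, matching 16-hex-char values like
    # "AECBE3F75D62FCA4" in the sample output above.
    return hashlib.md5(fingerprint.encode()).hexdigest().upper()[:16]

# Two executions of the "same" query differ only in literals, so
# after normalization they share a fingerprint -- and a checksum.
fingerprint = "select * from users where id = ?"
print(make_checksum(fingerprint))
```

Because the tag cardinality tracks query *types* rather than query *executions*, the number of distinct `query_md5` values stays bounded by the variety of queries your application issues.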
If you have very few data points because you have very few slow queries, performance will not degrade much. The cost of the query is O(N), where N is the number of data points for the metric mysql.slow.query in the time range your query covers.
Addendum: the reason for my comment above is that it's generally not recommended to have a script that can potentially create an unbounded number of tag values, like you do with the MD5 sum. If there's a hiccup in your database and all of a sudden you log 20k slow queries, then you'll "waste" 20k tag value UIDs, and it's annoying/hard to "recycle" them.
tsuna, you are incorrect. The md5sum is of the /normalized/ query, and it is extremely unlikely that there will be 20k /different kinds of queries/. In my experience, most database servers have fewer than a couple hundred types of queries executed against them.
Ah OK, my bad then, I wasn't aware of that. Then yes it's probably fine.
Since the MD5s are fingerprints of supposedly rare queries, I don't (currently) expect enough variation in the normalized queries (coming from Hibernate) to reach that many values; probably never more than 5k in this case. The ability to filter on a specific query, though, is huge for my use case. Will performance degrade as I approach that limit, or linearly?