pt-query-digest mysql.slow.log --no-report --filter 'print "put mysql.slowqueries $event->{timestamp} $event->{Query_time} query_md5=" . make_checksum($event->{fingerprint}) . " host=$event->{host} db=$event->{db} dbuser=$event->{user}\n"' | nc opentsdb 4242
format:
put mysql.slow.query 1335559893 1.889435 query_md5=AECBE3F75D62FCA4 host=api1 db=prod dbuser=app
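To make the layout of that line explicit, here is a minimal Python sketch that splits a `put` line into metric, timestamp, value, and tags. The function name `parse_put_line` is hypothetical and exists only for illustration; it is not part of OpenTSDB or pt-query-digest.

```python
def parse_put_line(line):
    """Split a telnet-style 'put' line into (metric, timestamp, value, tags)."""
    parts = line.split()
    # Layout: put <metric> <timestamp> <value> <tag=value> [<tag=value> ...]
    if len(parts) < 5 or parts[0] != "put":
        raise ValueError("expected: put <metric> <timestamp> <value> <tag=value> ...")
    metric = parts[1]
    timestamp = int(parts[2])   # Unix epoch seconds
    value = float(parts[3])     # here: the query time in seconds
    tags = dict(tag.split("=", 1) for tag in parts[4:])
    return metric, timestamp, value, tags


sample = ("put mysql.slow.query 1335559893 1.889435 "
          "query_md5=AECBE3F75D62FCA4 host=api1 db=prod dbuser=app")
metric, timestamp, value, tags = parse_put_line(sample)
print(metric, timestamp, value, tags)
```

Every distinct value that shows up on the right-hand side of a tag (such as each `query_md5`) consumes one tag value UID, which is what the discussion below is about.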
Since the MD5s are query fingerprints of supposedly rare queries, I don't (currently) expect enough variation in the normalized queries (coming from Hibernate) to reach that many values; probably never more than 5k in this case. The ability to filter on a specific query is huge for my use case, though. Will performance degrade only when approaching that limit, or linearly with the number of values?
If you have very few data points because you have very few slow queries, then the performance will not degrade much. The cost of the query is O(N), where N is the number of data points for the metric mysql.slow.query in the time range your query covers.
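As a rough illustration of that O(N) claim, the toy model below filters an in-memory list of points by tag. It is only a sketch of the cost model, not how OpenTSDB actually stores or scans data; `scan_metric` and the sample points are made up for the example.

```python
def scan_metric(points, start, end, tag_filter):
    """Return (matching points, number of points examined) for one metric."""
    examined = 0
    matches = []
    for timestamp, value, tags in points:
        if not (start <= timestamp <= end):
            continue  # outside the queried time range, so not counted in N
        examined += 1  # every in-range point is looked at, matching or not
        if all(tags.get(key) == wanted for key, wanted in tag_filter.items()):
            matches.append((timestamp, value))
    return matches, examined


# Hypothetical data points for a single metric.
points = [
    (1335559893, 1.889435, {"query_md5": "AECBE3F75D62FCA4", "host": "api1"}),
    (1335559900, 2.5, {"query_md5": "0000000000000001", "host": "api1"}),
    (1335559950, 3.1, {"query_md5": "AECBE3F75D62FCA4", "host": "api2"}),
    (1335569999, 4.0, {"query_md5": "AECBE3F75D62FCA4", "host": "api1"}),
]

matches, examined = scan_metric(
    points, 1335559000, 1335560000, {"query_md5": "AECBE3F75D62FCA4"}
)
print(len(matches), "matches after examining", examined, "points")
```

In this model, filtering on one `query_md5` still touches every point of the metric in the time range, so the cost follows the volume of slow-query data points rather than the number of distinct MD5 values.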
Addendum: the reason for my comment above is that it's generally not recommended to have a script that can potentially create an unbounded number of tag values, as yours does with the MD5 sum. If there's a hiccup in your database and all of a sudden you log 20k slow queries, then you'll "waste" 20k tag value UIDs, and it's annoying/hard to "recycle" them.
tsuna, you are incorrect. The md5sum is of the /normalized/ query, and it is extremely unlikely that there will be 20k /different kinds of queries/. Most database servers have fewer than a couple hundred types of queries executed against them in my experience.
Ah OK, my bad then, I wasn't aware of that. Then yes it's probably fine.
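To illustrate why normalization keeps the number of distinct MD5 values small, here is a toy Python sketch. The normalization rules in `toy_fingerprint` are a deliberate simplification and are not pt-query-digest's actual fingerprint algorithm; likewise, taking the last 16 hex characters in `toy_checksum` is an assumption about how make_checksum shortens the digest. Both function names are hypothetical.

```python
import hashlib
import re


def toy_fingerprint(query):
    """Very rough normalization: replace literals so similar queries collapse."""
    q = query.strip().lower()
    q = re.sub(r"'[^']*'", "?", q)   # quoted string literals -> ?
    q = re.sub(r"\b\d+\b", "?", q)   # bare numbers -> ?
    q = re.sub(r"\s+", " ", q)       # collapse runs of whitespace
    return q


def toy_checksum(fingerprint):
    """16 uppercase hex characters taken from the MD5 of the fingerprint."""
    digest = hashlib.md5(fingerprint.encode("utf-8")).hexdigest()
    return digest[-16:].upper()


q1 = "SELECT * FROM users WHERE id = 42"
q2 = "select *  from users where id = 7"
q3 = "SELECT * FROM orders WHERE status = 'open'"

for q in (q1, q2, q3):
    print(toy_checksum(toy_fingerprint(q)), toy_fingerprint(q))
```

The first two queries differ only in a literal and in whitespace, so they share one fingerprint and therefore one tag value; the third is a different kind of query and gets its own.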
Putting the MD5 in a tag is a bad idea. Remember the default installation only allows up to 16777216 tag values, so use them wisely. There is no way to change the maximum number of tag values on an existing tsdb table.
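For a sense of scale, the short calculation below compares that limit with the 5k estimate from the thread. It assumes the 16777216 figure comes from a 3-byte UID width (256^3 matches the number exactly); the variable names are just for illustration.

```python
UID_WIDTH_BYTES = 3  # assumption: width of a tag value UID in a default install
max_tag_values = 256 ** UID_WIDTH_BYTES

expected_fingerprints = 5_000  # upper estimate given in the discussion
share_used = expected_fingerprints / max_tag_values

print(max_tag_values)
print(f"{share_used:.4%} of the tag value space")
```

Under these assumptions, 5k fingerprints use roughly 0.03% of the available tag values, so the concern applies mainly to tags whose values are truly unbounded.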