Last active
August 29, 2015 14:10
Talk notes: Jon Haddad: Diagnosing Cassandra Problems in Production
Jon Haddad: Diagnosing Cassandra Problems in Production
http://www.meetup.com/ladevops/events/213408212/
where do systems fail?
spof
typical : replication - master failure
master insert bottleneck
san & nas performance
hardware/software upgrades
any # of cloud issues
lack of viz - prepare for everything
dc failure == ??
----
"it worked in dev"
- devs rarely use legit datasets
- unpredictable query perf (joins are insane)
- index updates
- test on ssd, deploy to spinning rust?
- usually no contention in dev
- clusters are very different beasts from single instances
the problem with most dbs
- clustering is an afterthought
  - bolt on
- developers start with acid, but give it up when they start using replication
- bad practices are encouraged
- failover is an afterthought
  - edge cases in failover process
- multi-dc requires a round trip (RT) to the master cross-country to INSERT/UPDATE
  - you can't cheat latency
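To put a rough number on "you can't cheat latency": a back-of-the-envelope sketch. The distance and the speed of light in fibre are round illustrative assumptions, not figures from the talk, and real paths add routing and queuing on top.

```python
# Lower bound on a cross-country round trip, ignoring routing and queuing.
FIBRE_KM_PER_S = 200_000   # assumption: roughly 2/3 of c in optical fibre
DISTANCE_KM = 3_900        # assumption: approximate coast-to-coast distance


def min_round_trip_ms(distance_km, speed_km_per_s=FIBRE_KM_PER_S):
    """Best-case round trip in milliseconds for one request/response."""
    one_way_s = distance_km / speed_km_per_s
    return 2 * one_way_s * 1000


rtt_ms = min_round_trip_ms(DISTANCE_KM)
# Writes one client can issue per second if each waits for the remote master.
max_serial_writes_per_s = 1000 / rtt_ms
print(f"{rtt_ms:.0f} ms per write, at most {max_serial_writes_per_s:.0f} serial writes/s")
```

Even in this best case every INSERT/UPDATE pays tens of milliseconds before the master has done any work.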
--
how is cassandra different?
ring based replication
only 1 type of server (cassandra)
all nodes hold data and can answer queries
data is stored on RF=N servers
no spof
eventually consistent
data is found by key (CQL)
built for HA & scale
multi-dc
runs on JVM
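A toy sketch of ring based replication, to make "data is found by key" and "stored on RF=N servers" concrete. This is a simplified model under assumptions: one token per node, an MD5-based toy hash, and replicas taken by walking the ring clockwise. The `Ring` class and node names are hypothetical and this is not Cassandra's actual placement code.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string onto the ring (toy hash, not Cassandra's partitioner)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor):
        self.rf = replication_factor
        # Each node owns one token; sort so the ring can be walked clockwise.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key: str):
        """Return the RF nodes that store this key."""
        # First node whose token is >= the key's token, wrapping at the end.
        start = bisect.bisect_left(self.tokens, token(partition_key))
        count = min(self.rf, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)
```

Any node can compute the same owner list from the key alone, which is why there is no master to ask and no spof.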
----
preventative measures
- opscenter
- metrics integration
- munin
- monit
- icinga
- graphite / statsd (app level)
- logstash
--------
- weird consistency issues - NTP working?
  - last write wins -- time skew
- problems with streaming / repair - version conflicts
- run cleanup after you add nodes (reclaim disk space)
- slow queries
  - compaction
  - histograms
  - tracing
- nodes flapping/failing
  - check opscenter
  - dig into system metrics
  - jvm gc issues
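Why time skew shows up as "weird consistency issues": a minimal sketch of last write wins, assuming each write carries a timestamp from the clock of whichever machine stamped it. The function names and the 5 second offset are made up for illustration.

```python
def last_write_wins(writes):
    """Pick the surviving value: the highest timestamp wins."""
    return max(writes, key=lambda w: w["timestamp"])["value"]


# Real-world order: "old" is written first, "new" one second later.
TRUE_TIME_OLD, TRUE_TIME_NEW = 100.0, 101.0


def stamped(value, true_time, clock_offset):
    """A write stamped by a machine whose clock is off by clock_offset seconds."""
    return {"value": value, "timestamp": true_time + clock_offset}


in_sync = [stamped("old", TRUE_TIME_OLD, 0.0), stamped("new", TRUE_TIME_NEW, 0.0)]
skewed = [stamped("old", TRUE_TIME_OLD, +5.0), stamped("new", TRUE_TIME_NEW, 0.0)]

print(last_write_wins(in_sync))  # the later write survives
print(last_write_wins(skewed))   # the earlier write survives: its clock ran ahead
```

With the skewed clock the newer value is silently discarded, so checking that NTP is working is the first step.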
---
compaction
- compaction merges sstables
- too much compaction?
  - opscenter provides insight into compaction cluster-wide
  - nodetool
    - compactionhistory
    - getcompactionthroughput
- leveled vs SizeTiered vs DateTiered
  - leveled on SSD + read heavy
  - size tiered on spinning rust
  - size tiered is great for write heavy time series workloads
  - DateTiered -- (NEW) good for time series ?
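What "compaction merges sstables" means, as a toy model. Each sstable is assumed to be a dict of key to (timestamp, value), with `None` standing in for a tombstone. The `compact` helper is hypothetical; real compaction streams sorted files from disk and keeps tombstones for a grace period.

```python
def compact(*sstables):
    """Merge sstables into one: newest timestamp per key wins, sorted by key."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    # Simplification: drop tombstoned rows immediately after merging.
    live = {k: v for k, v in merged.items() if v[1] is not None}
    return dict(sorted(live.items()))


older = {"a": (1, "apple"), "b": (1, "banana"), "c": (1, "cherry")}
newer = {"b": (2, "blueberry"), "c": (3, None)}  # overwrite b, delete c
result = compact(older, newer)
print(result)
```

Until the merge happens, a read for "b" has to consult both tables, which is one way a compaction backlog turns into slow queries.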
------ sysutils
iostat
htop
iftop & netstat
dstat
strace
------
- nodetool proxyhistograms
  - high level read and write times
  - includes network latency
- nodetool cfhistograms <keyspace> <table>
  - reports stats for single table on a single node
  - used to identify tables with perf problems
--- query tracing
TRACING ON;
select * from blah where pk=1 limit 100;
---- jvm gc
- generational gc (parnew & cms)
- new gen (eden + survivor0 + survivor1)
- old generation
- new obj are created in eden
- minor gc
  - occurs when new gen fills up
  - stops the world
  - dead objects are removed
  - live obj are copied into a survivor space, then promoted to old gen
  - removing objects is fast, promoting objects is slow
---- old gen
- obj are promoted to old gen from new gen
- major gc
  - starts when old gen fills up to some %
  - mostly concurrent
  - 2 short stop the world pauses
- full gc
  - occurs when old gen fills up or obj can't be promoted
  - stop the world
  - collects all generations
  - these are bad!
------------- GC profiling
- opscenter gc stats
  - look for correlations between gc spikes and r/w latency
- cassandra gc logging
  - can be activated in cassandra-env.sh
- jstat
  - prints gc activity
  - jstat -gcutil 89760 250 10000   (pid, interval in ms, number of samples)
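A small sketch for reading `jstat -gcutil` output programmatically. It parses columns by header name, so it does not depend on one JVM version's layout. The sample text is made up for illustration (Java 7 style columns), and the helper names are hypothetical. Note that under CMS the FGC counter can also tick for the short CMS pauses, so the time delta in FGCT is the more telling number.

```python
def parse_gcutil(text):
    """Parse `jstat -gcutil` output into a list of dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(float, row))) for row in rows]


def fgc_increases(samples):
    """Return indexes of samples where the FGC counter went up."""
    return [i for i in range(1, len(samples))
            if samples[i]["FGC"] > samples[i - 1]["FGC"]]


# Made-up sample output; values are illustrative only.
SAMPLE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  41.20  63.10  71.50  59.80   1204   38.112     3    1.904   40.016
 37.80   0.00  12.40  74.90  59.80   1205   38.151     3    1.904   40.055
  0.00   0.00   3.20  31.00  59.80   1205   38.151     4    4.310   42.461
"""

samples = parse_gcutil(SAMPLE)
for i in fgc_increases(samples):
    pause = samples[i]["FGCT"] - samples[i - 1]["FGCT"]
    print(f"sample {i}: FGC went up, about {pause:.1f}s of collection time")
```

Lining these sample indexes up against r/w latency graphs is one way to check the correlation mentioned above.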
--- look out for
- long multi-second pauses
  - caused by full gcs. old gen is filling up faster than the concurrent gc can keep up.
    typ. means garbage is being promoted out of the new gen too soon
- long minor gc
  - many of the objects in the new gen are being promoted to the old gen
  - most commonly caused by new gen being too big
  - sometimes caused by obj being promoted prematurely
smaller new gen size = shorter, more predictable pauses when gc occurs
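A deliberately crude model of premature promotion. It assumes an object is promoted once it has survived a fixed number of minor GCs (loosely analogous to a tenuring threshold) and ignores survivor space overflow, which is another real cause. The workload numbers are hypothetical.

```python
def minor_gc_outcome(lifetimes, tenuring_threshold):
    """Split objects into (collected in new gen, promoted to old gen).

    lifetimes[i] is how many minor GCs object i stays reachable. An object
    still live after `tenuring_threshold` minor GCs gets promoted.
    """
    promoted = sum(1 for life in lifetimes if life >= tenuring_threshold)
    return len(lifetimes) - promoted, promoted


# Hypothetical workload: mostly short-lived request objects, a few long-lived.
lifetimes = [0] * 90 + [2] * 8 + [1000] * 2

for threshold in (1, 4):
    collected, promoted = minor_gc_outcome(lifetimes, threshold)
    print(f"threshold={threshold}: {collected} die young, {promoted} promoted")
```

In this model the low threshold pushes the eight medium-lived objects into old gen, where they become garbage that only a major or full gc can reclaim.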
---------
@rustyrazorblade
in depth disk analysis @AlTobey
planetcassandra.com
blake eggleston blog on JVM tuning
http://tech.shift.com/post/74311817513/cassandra-tuning-the-jvm-for-read-heavy-workloads
---
tablesnap - can do backups automatically to S3