Talk notes: Jon Haddad: Diagnosing Cassandra Problems in Production
http://www.meetup.com/ladevops/events/213408212/
where do systems fail?
SPOF (single point of failure)
typical: replication - master failure
master insert bottleneck
SAN & NAS performance
hardware/software upgrades
any # of cloud issues
lack of visibility - prepare for everything
dc failure == ??
----
"it worked in dev"
- devs rarely use legit datasets
- unpredictable query perf (joins are insane)
- index updates
- test on ssd, deploy to spinning rust?
- usually no contention in dev
- clusters are very different beasts from single instances
the problem with most dbs
- clustering is an afterthought
- bolt on
- developers start with ACID, but give it up when they start using replication
- bad practices are encouraged
- failover is an afterthought
- edge cases in failover process
- multi-dc requires a round trip to the master cross-country for INSERT/UPDATE
- you can't cheat latency
--
how is cassandra different?
ring based replication
only 1 type of server (cassandra)
all nodes hold data and can answer queries
data is stored on RF=N servers
no spof
eventually consistent
data is found by key (CQL)
built for HA & scale
multi-dc
runs on JVM
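a hedged example of what "RF=N" and multi-dc look like in practice (keyspace and dc names are placeholders; dc names must match what your snitch reports):
CREATE KEYSPACE myapp
  WITH replication = {'class': 'NetworkTopologyStrategy', 'us_east': 3, 'us_west': 3};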
----
preventative measures
- opscenter
- metrics integration
- munin
- monit
- icinga
- graphite / statsd (app level)
- logstash
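a minimal sketch of the kind of check monit/icinga could run against a node (assumes nodetool is on the local PATH):
nodetool status | grep -E '^D[NLJM]' && echo "CRITICAL: down node(s) detected"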
--------
- weird consistency issues - NTP working?
- last write wins -- time skew
- problems with streaming / repair - version conflicts
- run cleanup after you add nodes (reclaim disk space) -- see commands after this list
- slow queries
- compaction
- histograms
- tracing
- nodes flapping/failing
- check opscenter
- dig into system metrics
- jvm gc issues
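rough commands for the NTP and cleanup items above (the keyspace name is a placeholder):
ntpq -p                       # is this node actually syncing? compare offsets across the cluster
nodetool cleanup my_keyspace  # after adding nodes, drops data this node no longer owns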
---
compaction
- compaction merges sstables
- too much compaction?
- opscenter provides insight into compaction cluster-wide
- nodetool (example invocations after this list)
- compactionhistory
- getcompactionthroughput
- Leveled vs SizeTiered vs DateTiered
- leveled on SSD + Read Heavy
- size tiered on spinning rust
- size tiered is great for write heavy time series workloads
- DateTiered -- (new) good for time series?
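example nodetool invocations for the compaction items above (the 32 MB/s value is only a placeholder, not a recommendation):
nodetool compactionhistory
nodetool getcompactionthroughput
nodetool setcompactionthroughput 32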
------ sysutils
iostat
htop
iftop & netstat
dstat
strace
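typical invocations (the interface name and pid are assumptions):
iostat -x 2         # extended per-device stats every 2s: await, %util
dstat 5             # cpu / disk / net / paging summary every 5s
iftop -i eth0       # per-connection network bandwidth
strace -c -p <pid>  # syscall counts and timings for a running process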
------
- proxyhistograms
- high level read and write times
- includes network latency
- cfhistograms <keyspace> <table>
- reports stats for single table on a single node
- used to identify tables with perf problems
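example invocations (keyspace/table names are placeholders):
nodetool proxyhistograms
nodetool cfhistograms my_keyspace my_table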
--- query tracing
TRACING on;
select * from blah where pk=1 limit 100;
---- jvm gc
- generational gc (ParNew & CMS)
- new gen (eden + survivor0 + survivor1)
- old generation
- new obj are created in eden
- minor gc
- occurs when new gen fills up
- stops the world
- dead objects are removed
- live obj are promoted into survivor, then old gen
- removing objects is fast, promoting objects is slow
---- old gen
- obj are promoted to old gen from new gen
- major gc
- occurs when old gen fills past some %
- mostly concurrent
- 2 short stop the world pauses
- full gc
- occurs when old gen fills up or obj can't be promoted
- stop the world
- collects all generations
- these are bad!
------------- GC profiling
- opscenter gc stats
- look for correlations between gc spikes and r/w latency
jstat -gcutil 89760 250 10000   (pid, interval in ms, number of samples)
- cassandra gc logging
- can be activated in cassandra-env.sh (see snippet after this list)
- jstat
- prints gc activity
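the gc logging flags ship commented out in cassandra-env.sh; roughly as below (exact flags and log path vary by version, so treat this as a sketch):
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"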
--- look out for
- long multi-second pauses
- caused by full gcs. old gen is filling up faster than the concurrent gc can keep up with it.
  typically means garbage is being promoted out of the new gen too soon
- long minor gc
- many of the objects in the new gen are being promoted to the old gen
- most commonly caused by new gen being too big
- sometimes caused by obj being promoted prematurely
smaller new gen size = shorter, more predictable pauses when gc occurs
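heap and new gen sizes are set in cassandra-env.sh; a hedged sketch (the numbers are placeholders, not recommendations):
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="400M"   # smaller new gen -> shorter, more predictable minor gc pauses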
---------
@rustyrazorblade
in-depth disk analysis: @AlTobey
planetcassandra.com
Blake Eggleston blog post on JVM tuning
http://tech.shift.com/post/74311817513/cassandra-tuning-the-jvm-for-read-heavy-workloads
---
tablesnap - can do backups automatically to S3