@rendaw
Created June 28, 2018 12:21
Gitter chat record

On decommissioning a node (https://www.cockroachlabs.com/docs/stable/remove-nodes.html), the instructions say to check the node before decommissioning: "Open the Admin UI, click Metrics on the left, select the Replication dashboard, and hover over the Replicas per Store and Leaseholders per Store graphs", but they don't say what to do with that information. The example scenarios linked right before that say decommissioning will hang if there aren't enough nodes, but they don't say which metric or indicator tells you whether there are enough nodes to decommission. My own naive guess would be replicated ranges: the docs say that by default you want 3 replicas of a range, but the screenshots show replicated ranges at ~70. Does this mean there's 20+ times the required capacity in that example? As a separate question, what metrics should be used to decide whether a cluster should be scaled out? Data usage near max capacity? Or will replicas drop towards some critical value?

kena @knz 20:29 these are all very good questions!

rendaw @rendaw_gitlab 20:29

kena @knz 20:31 the first step is there so that you can write down the values pre-decommission, to serve as a reference in subsequent steps. I agree this should be clarified.

rendaw @rendaw_gitlab 20:31 Ah, I see

kena @knz 20:32 as to "enough nodes", the rule is simple really and explained in the intro using the example: if your replication factor is N, you should have at least N nodes remaining after the decommission. so if you use the default replication factor of 3 and try to decommission a node away from a 3-node cluster, it will hang, but going from 4 to 3 should be fine. I suppose this could be clarified as well
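
As a rough illustration of the rule above (not an official formula; the node counts and replication factor are made-up inputs you would substitute from your own cluster):

    # Decommission is expected to hang if fewer than `replication_factor`
    # nodes would remain to hold the required number of replicas.
    def decommission_is_safe(live_nodes: int, nodes_to_remove: int,
                             replication_factor: int = 3) -> bool:
        return (live_nodes - nodes_to_remove) >= replication_factor

    print(decommission_is_safe(live_nodes=4, nodes_to_remove=1))  # True: 3 nodes remain for 3 replicas
    print(decommission_is_safe(live_nodes=3, nodes_to_remove=1))  # False: only 2 nodes remain, decommission hangs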

rendaw @rendaw_gitlab 20:33 Can't nodes have a subset of the whole data as well though?

kena @knz 20:33 yes absolutely! that's why it's fine to have 10 nodes with a replication factor of 3; this way each node contains about 30% of the overall data. but if you want 3 separate copies, you need at least 3 nodes :) as to when to scale out, I suppose your app will tell you: once the capacity of your existing cluster is under pressure (max CPU, max RAM, max disk I/O), the number of transactions per second will saturate, or, even worse, perhaps start to go down; that's when you need to scale out. now all of these points would make great doc updates, so I do recommend you file github issues on the docs repo, github.com/cockroachdb/docs. you can perhaps include this conversation too
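
The "about 30%" figure follows from simple arithmetic (illustrative numbers only): with replication factor 3, every logical byte is stored 3 times and the copies are spread roughly evenly across the nodes.

    logical_data_gb = 100                                   # hypothetical amount of user data
    replication_factor = 3
    node_count = 10

    total_stored_gb = logical_data_gb * replication_factor  # 300 GB of replicated data in total
    per_node_gb = total_stored_gb / node_count              # 30 GB on each node
    print(per_node_gb / logical_data_gb)                    # 0.3, i.e. ~30% of the logical data per node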

rendaw @rendaw_gitlab 20:37 Great, will do! If you have one more second though

kena @knz 20:37 yes

rendaw @rendaw_gitlab 20:37 So I guess my confusion is: suppose I have 100 GB of data and each node has a 10 GB disk, and I have a 30-node cluster (replication factor 3). I definitely can't scale down to 3 nodes here, right? So the metric doesn't seem to just be minimum nodes = replication factor, but something else

kena @knz 20:38 well, you're asking something different here, which is about hardware capacity. If you were not using CockroachDB and instead, say, PostgreSQL, and you were asking me "can PostgreSQL work with only 1 CPU core?", I would say it's not clever, because there are background tasks, and the rule of thumb is 2-4 cores at least. But now it seems like you'd tell me "yes, but my app needs 30000 transactions per second, so perhaps 4 cores is not sufficient? so the number of background tasks is not the only factor?" Of course, whatever your app needs dictates your scale. to come back to cockroachdb: if you try to downreplicate 30x10GB into just 3x10GB (assuming repl factor 3), the replication process will start and move data to those 3 nodes. at some point these 3 nodes will become full and the downreplication will stall (not hang). if, say, you dynamically add more disk space to these 3 nodes, the downreplication will continue
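
A back-of-the-envelope version of that 30x10GB vs 3x10GB example (made-up numbers; the point is just that replicated data needs its logical size times the replication factor in raw disk):

    replication_factor = 3
    node_disk_gb = 10

    def fits(logical_data_gb, node_count):
        # Replicated data needs logical_data_gb * replication_factor GB of raw disk.
        return logical_data_gb * replication_factor <= node_count * node_disk_gb

    print(fits(logical_data_gb=100, node_count=30))  # True: 300 GB needed, 300 GB raw available (zero headroom)
    print(fits(logical_data_gb=100, node_count=3))   # False: 300 GB needed, only 30 GB raw, so downreplication stalls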

rendaw @rendaw_gitlab 20:42 I'm assuming disk size per node is fixed... with 100 GB of data I'd imagine it would stall immediately (since there are exactly 3 replicas of all data)

kena @knz 20:43 well

rendaw @rendaw_gitlab 20:43 So is there a metric that says you're at max capacity for the given replicas?

kena @knz 20:43 if the nodes were completely full to start with, cockroachdb would not work properly :) each node has 1 or more "stores", and each store has a capacity metric expressed in bytes (or gigabytes); you can view this in the web UI too. the thing is, and the reason why I did not mention disk capacity as a starting point, is that a healthy cluster typically provisions more disk capacity than needed at any point in time (on account of storage typically being cheaper)
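
Outside the web UI, one way to read those per-store capacity gauges is the Prometheus-format endpoint each node serves at /_status/vars. A minimal sketch, assuming the metric names capacity, capacity_available and capacity_used and the default admin port 8080 (verify both against your CockroachDB version):

    import urllib.request

    def read_capacity_metrics(base_url="http://localhost:8080"):
        """Scrape one node's metrics page and sum the capacity gauges across its stores."""
        wanted = {"capacity", "capacity_available", "capacity_used"}
        values = {}
        with urllib.request.urlopen(f"{base_url}/_status/vars") as resp:
            for line in resp.read().decode().splitlines():
                if line.startswith("#"):
                    continue
                parts = line.split()
                if len(parts) == 2:
                    name = parts[0].split("{")[0]  # strip Prometheus labels such as {store="1"}
                    if name in wanted:
                        values[name] = values.get(name, 0.0) + float(parts[1])
        return values

    print(read_capacity_metrics())  # e.g. total/available/used capacity in bytes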

rendaw @rendaw_gitlab 20:45 Yeah, that makes perfect sense :)

kena @knz 20:45 now

rendaw @rendaw_gitlab 20:45 I'm just wondering if there's a metric for this health

kena @knz 20:46 there's no pre-computed metric, no. and such a metric would need to take other parameters into account, the ones I mentioned earlier: CPU usage, RAM usage, disk I/O ops per second

rendaw @rendaw_gitlab 20:46 I'm monitoring the performance metrics as well separately

kena @knz 20:47 these also act as barriers to downscaling

rendaw @rendaw_gitlab 20:50 I have this cluster, and processes are adding and deleting data all the time. The performance metrics all look okay, but the database might fill up or it might empty out, and I'm trying to monitor it so I can scale out if it comes close to max storage capacity and scale in when large chunks of data have been removed. If I understand correctly, what I want to check to see if it's okay to scale in is (node count X node disk size X replicas): if cluster used storage is much less than this value, I can initiate a scale-in without stalling (assuming performance metrics are green), and I should scale out before it approaches that number. The Cockroach metric to watch is then the "cluster used storage" metric; disk size and replicas are constants, and node count comes from my hosting solution
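
A sketch of the kind of scale-in check described above, with made-up numbers. One assumption worth calling out: whether you multiply by the replication factor depends on whether the used-storage figure you read is logical (pre-replication) or already counts every replica, so the formula below is an interpretation to verify against your own metrics, not an official one. It also keeps the earlier rule that at least replication-factor nodes must remain.

    node_disk_gb = 10
    replication_factor = 3
    headroom = 0.7  # only plan to fill ~70% of raw capacity, leaving room for rebalancing

    def safe_to_scale_in(used_logical_gb, current_nodes, nodes_to_remove):
        remaining_nodes = current_nodes - nodes_to_remove
        remaining_raw_gb = remaining_nodes * node_disk_gb
        needed_raw_gb = used_logical_gb * replication_factor  # assumes the metric is pre-replication
        return remaining_nodes >= replication_factor and needed_raw_gb <= remaining_raw_gb * headroom

    print(safe_to_scale_in(used_logical_gb=50, current_nodes=30, nodes_to_remove=5))   # True: 250 GB raw holds 150 GB comfortably
    print(safe_to_scale_in(used_logical_gb=50, current_nodes=30, nodes_to_remove=25))  # False: 50 GB raw cannot hold 150 GB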

rendaw @rendaw_gitlab 20:58 The concern I have with this method is that if there's a sudden influx of data, I assume it might take some time for the data to be distributed evenly (and/or for data to be reallocated to nodes?)... if there's an "average replicas" number that I could aim to always keep above the desired replica count, that seems like it would be a better metric, since during redistribution the number of replicas would be low and I could hold off on/cancel any scale-in actions
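
CockroachDB does report under-replication: recent versions expose gauges along the lines of ranges_underreplicated and ranges_unavailable at /_status/vars (metric names assumed here, so check your version). Gating scale-in on those being zero approximates the "replicas currently below the desired count" signal described above; a minimal sketch:

    import urllib.request

    def underreplicated_ranges(base_url="http://localhost:8080"):
        """Sum the under-replicated range gauge reported by one node (metric name assumed)."""
        total = 0.0
        with urllib.request.urlopen(f"{base_url}/_status/vars") as resp:
            for line in resp.read().decode().splitlines():
                if line.startswith("ranges_underreplicated"):
                    total += float(line.split()[1])
        return total

    # Example gate for an autoscaler: hold off on scale-in while ranges are still re-replicating.
    if underreplicated_ranges() == 0:
        print("no under-replicated ranges reported; scale-in can proceed")
    else:
        print("ranges are still re-replicating; hold off on scale-in")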

rendaw @rendaw_gitlab 21:06 So while I can't rely just on data usage/capacity, it's surely something I need to watch? I couldn't find anything in the docs that addresses this (specifically on the decommissioning-a-node page, where this is probably something that should be confirmed before downscaling)... although perhaps I missed it somewhere, or it's a sort of common knowledge for those with more experience than me (or I'm just especially confused).

kena @knz 21:07 I think here you are discovering an insight we haven't had so far, this % of replicas. I think it's a very good and valuable insight

rendaw @rendaw_gitlab 21:10 That's a relief - I'm generally clueless so I'm a bit hesitant to ask. I don't need an answer on this quickly, either

kena @knz 21:11 your reasoning above does not qualify as "clueless" to me

rendaw @rendaw_gitlab 21:17 Thanks @knz, I much appreciate the discussion and answers! I'll put this in a github ticket for reference and further discussion, if anyone's interested
