@rendaw
Created June 28, 2018 12:21
Gitter chat record

On decommissioning a node (https://www.cockroachlabs.com/docs/stable/remove-nodes.html), the instructions say to check the node before decommissioning: "Open the Admin UI, click Metrics on the left, select the Replication dashboard, and hover over the Replicas per Store and Leaseholders per Store graphs", but they don't say what to do with that information. The example scenarios linked right before that say decommissioning will hang if there aren't enough nodes, but they don't say which metric or indicator tells you whether there are enough nodes to decommission. My own naive guess would be replicated ranges: the docs say that by default you want 3 replicas of a range, but the screenshots show replicated ranges at ~70. Does this mean there's 20+ times the required capacity in that example? As a separate question, what metrics should be used to decide whether a cluster should be scaled out? Data usage near max capacity? Or will replicas drop towards some critical value?

kena @knz 20:29 these are all very good questions!

rendaw @rendaw_gitlab 20:29

kena @knz 20:31 the first step is there so that you can write down the values pre-decommission, to serve as a reference in subsequent steps. I agree this should be clarified.

rendaw @rendaw_gitlab 20:31 Ah, I see

kena @knz 20:32 as to "enough nodes", the rule is simple really and explained in the intro using the example: if your replication factor is N, you should have at least N nodes remaining after the decommission. so if you use the default replication factor of 3 and try to decommission a node away from a 3-node cluster, it will hang, but going from 4 to 3 should be fine. I suppose this could be clarified as well
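
As a rough illustration of the rule above (not an official formula; the node counts and replication factor are made-up inputs you would substitute from your own cluster):

    # Decommission is expected to hang if fewer than `replication_factor`
    # nodes would remain to hold the required number of replicas.
    def decommission_is_safe(live_nodes: int, nodes_to_remove: int,
                             replication_factor: int = 3) -> bool:
        return (live_nodes - nodes_to_remove) >= replication_factor

    print(decommission_is_safe(live_nodes=4, nodes_to_remove=1))  # True: 3 nodes remain for 3 replicas
    print(decommission_is_safe(live_nodes=3, nodes_to_remove=1))  # False: only 2 nodes remain, decommission hangs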

rendaw @rendaw_gitlab 20:33 Can't nodes have a subset of the whole data as well though?

kena @knz 20:33 yes absolutely! that's why it's fine to have 10 nodes with a replication factor of 3; this way each node contains about 30% of the overall data. but if you want 3 separate copies, you need at least 3 nodes :) as to when to scale out, I suppose your app will tell you: once the capacity of your existing cluster is under pressure (max CPU, max RAM, max disk I/O), the number of transactions per second will saturate, or, even worse, perhaps start to go down; that's when you need to scale out. now all of these points would make great doc updates, so I do recommend you file github issues on the docs repo, github.com/cockroachdb/docs. you can perhaps include this conversation too
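
The "about 30%" figure follows from simple arithmetic (illustrative numbers only): with replication factor 3, every logical byte is stored 3 times and the copies are spread roughly evenly across the nodes.

    logical_data_gb = 100                                   # hypothetical amount of user data
    replication_factor = 3
    node_count = 10

    total_stored_gb = logical_data_gb * replication_factor  # 300 GB of replicated data in total
    per_node_gb = total_stored_gb / node_count              # 30 GB on each node
    print(per_node_gb / logical_data_gb)                    # 0.3, i.e. ~30% of the logical data per node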

rendaw @rendaw_gitlab 20:37 Great, will do! If you have one more second though

kena @knz 20:37 yes

rendaw @rendaw_gitlab 20:37 So I guess my confusion is: suppose I have 100 GB of data and each node has a 10 GB disk, and I have a 30-node cluster (replication factor 3). I definitely can't scale down to 3 nodes here, right? So the metric doesn't seem to just be minimum nodes = replication factor, but something else

kena @knz 20:38 well, you're asking something different here, which is about hardware capacity. If you were not using CockroachDB and instead, say, PostgreSQL, and you were asking me "can PostgreSQL work with only 1 CPU core?", I would say it's not clever, because there are background tasks, and the rule of thumb is 2-4 cores at least. But now it seems like you'd tell me "yes, but my app needs 30000 transactions per second, so perhaps 4 cores is not sufficient? so the number of background tasks is not the only factor?" Of course, whatever your app needs dictates your scale. to come back to cockroachdb: if you try to downreplicate 30x10GB into just 3x10GB (assuming repl factor 3), the replication process will start and move data to those 3 nodes. at some point these 3 nodes will become full and the downreplication will stall (not hang). if, say, you dynamically add more disk space to these 3 nodes, the downreplication will continue
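
A back-of-the-envelope version of that 30x10GB vs 3x10GB example (made-up numbers; the point is just that replicated data needs its logical size times the replication factor in raw disk):

    replication_factor = 3
    node_disk_gb = 10

    def fits(logical_data_gb, node_count):
        # Replicated data needs logical_data_gb * replication_factor GB of raw disk.
        return logical_data_gb * replication_factor <= node_count * node_disk_gb

    print(fits(logical_data_gb=100, node_count=30))  # True: 300 GB needed, 300 GB raw available (zero headroom)
    print(fits(logical_data_gb=100, node_count=3))   # False: 300 GB needed, only 30 GB raw, so downreplication stalls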

rendaw @rendaw_gitlab 20:42 I'm assuming disk size per node is fixed... with 100 GB of data I'd imagine it would stall immediately (since there are exactly 3 replicas of all data)

kena @knz 20:43 well

rendaw @rendaw_gitlab 20:43 So is there a metric that says you're at max capacity for the given replicas?

kena @knz 20:43 if the nodes were completely full to start with, cockroachdb would not work properly :) each node has 1 or more "stores", and each store has a capacity metric expressed in bytes (or gigabytes); you can view this in the web UI too. the thing is, and the reason why I did not mention disk capacity as a starting point, is that a healthy cluster typically provisions more disk capacity than needed at any point in time (on account of storage typically being cheaper)
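
Outside the web UI, one way to read those per-store capacity gauges is the Prometheus-format endpoint each node serves at /_status/vars. A minimal sketch, assuming the metric names capacity, capacity_available and capacity_used and the default admin port 8080 (verify both against your CockroachDB version):

    import urllib.request

    def read_capacity_metrics(base_url="http://localhost:8080"):
        """Scrape one node's metrics page and sum the capacity gauges across its stores."""
        wanted = {"capacity", "capacity_available", "capacity_used"}
        values = {}
        with urllib.request.urlopen(f"{base_url}/_status/vars") as resp:
            for line in resp.read().decode().splitlines():
                if line.startswith("#"):
                    continue
                parts = line.split()
                if len(parts) == 2:
                    name = parts[0].split("{")[0]  # strip Prometheus labels such as {store="1"}
                    if name in wanted:
                        values[name] = values.get(name, 0.0) + float(parts[1])
        return values

    print(read_capacity_metrics())  # e.g. total/available/used capacity in bytes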

rendaw @rendaw_gitlab 20:45 Yeah, that makes perfect sense :)

kena @knz 20:45 now

rendaw @rendaw_gitlab 20:45 I'm just wondering if there's a metric for this health

kena @knz 20:46 there's no pre-computed metric, no. and such a metric would need to take other parameters into account, the ones I mentioned earlier: CPU usage, RAM usage, disk I/O ops per second

rendaw @rendaw_gitlab 20:46 I'm monitoring the performance metrics as well separately

kena @knz 20:47 these also act as barriers to downscaling

rendaw @rendaw_gitlab 20:50 I have this cluster, and processes are adding and deleting data all the time. The performance metrics all look okay, but the database might fill up or it might empty out, and I'm trying to monitor it so I can scale out if it comes close to max storage capacity and scale in when large chunks of data have been removed. If I understand correctly, what I want to check to see if it's okay to scale in is (node count X node disk size X replicas): if cluster used storage is much less than this value, I can initiate a scale-in without stalling (assuming performance metrics are green), and I should scale out before it approaches that number. The Cockroach metric to watch is then the "cluster used storage" metric; disk size and replicas are constants, and node count comes from my hosting solution
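
A sketch of the kind of scale-in check described above, with made-up numbers. One assumption worth calling out: whether you multiply by the replication factor depends on whether the used-storage figure you read is logical (pre-replication) or already counts every replica, so the formula below is an interpretation to verify against your own metrics, not an official one. It also keeps the earlier rule that at least replication-factor nodes must remain.

    node_disk_gb = 10
    replication_factor = 3
    headroom = 0.7  # only plan to fill ~70% of raw capacity, leaving room for rebalancing

    def safe_to_scale_in(used_logical_gb, current_nodes, nodes_to_remove):
        remaining_nodes = current_nodes - nodes_to_remove
        remaining_raw_gb = remaining_nodes * node_disk_gb
        needed_raw_gb = used_logical_gb * replication_factor  # assumes the metric is pre-replication
        return remaining_nodes >= replication_factor and needed_raw_gb <= remaining_raw_gb * headroom

    print(safe_to_scale_in(used_logical_gb=50, current_nodes=30, nodes_to_remove=5))   # True: 250 GB raw holds 150 GB comfortably
    print(safe_to_scale_in(used_logical_gb=50, current_nodes=30, nodes_to_remove=25))  # False: 50 GB raw cannot hold 150 GB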

rendaw @rendaw_gitlab 20:58 The concern I have with this method is that if there's a sudden influx of data, I assume it might take some time for the data to be distributed evenly (and/or for data to be reallocated to nodes?)... if there's an "average replicas" number that I could aim to always keep above the desired replica count, that seems like it would be a better metric, since during redistribution the number of replicas would be low and I could hold off on/cancel any scale-in actions
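
CockroachDB does report under-replication: recent versions expose gauges along the lines of ranges_underreplicated and ranges_unavailable at /_status/vars (metric names assumed here, so check your version). Gating scale-in on those being zero approximates the "replicas currently below the desired count" signal described above; a minimal sketch:

    import urllib.request

    def underreplicated_ranges(base_url="http://localhost:8080"):
        """Sum the under-replicated range gauge reported by one node (metric name assumed)."""
        total = 0.0
        with urllib.request.urlopen(f"{base_url}/_status/vars") as resp:
            for line in resp.read().decode().splitlines():
                if line.startswith("ranges_underreplicated"):
                    total += float(line.split()[1])
        return total

    # Example gate for an autoscaler: hold off on scale-in while ranges are still re-replicating.
    if underreplicated_ranges() == 0:
        print("no under-replicated ranges reported; scale-in can proceed")
    else:
        print("ranges are still re-replicating; hold off on scale-in")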

rendaw @rendaw_gitlab 21:06 So while I can't rely just on data usage/capacity, it's surely something I need to watch? I couldn't find anything in the docs that addresses this (specifically on the decommissioning-a-node page, where this is probably something that should be confirmed before downscaling)... although perhaps I missed it somewhere, or it's a sort of common knowledge for those with more experience than me (or I'm just especially confused).

kena @knz 21:07 I think here you are discovering an insight we haven't had so far, this % of replicas. I think it's a very good and valuable insight

rendaw @rendaw_gitlab 21:10 That's a relief - I'm generally clueless so I'm a bit hesitant to ask. I don't need an answer on this quickly, either

kena @knz 21:11 your reasoning above does not qualify as "clueless" to me

rendaw @rendaw_gitlab 21:17 Thanks @knz, I much appreciate the discussion and answers! I'll put this in a github ticket for reference and further discussion, if anyone's interested
