@rduplain
Last active November 1, 2015 04:20

Linode NodeBalancer: Passive health checks lead to many false positives.

Linode's NodeBalancer assumes (as of Apr 2015) that a 500 response means the node should be removed from rotation. Exceptions happen, so this is a serious design limitation for any application that lets uncaught exceptions surface as 500 responses. I opened a support ticket; the discussion is copied here.

Ultimately, we had to rewrite our 500 responses to a non-50x status, which is foreign to our application, but at least the change was limited to an nginx config and a single line of JavaScript to treat our substitute status code as a server error. Linode specifically advised using a non-50x response. All we need is a NodeBalancer configuration option to skip passive checks on 500 Internal Server Error responses. There is no such option.

Due to the head-scratching nature of this configuration, we used 418 I'm a teapot in place of 500 responses. In nginx:

# Rewrite application 500s to 418 so the NodeBalancer's passive check ignores them.
proxy_intercept_errors on;
error_page 500 =418 /_error/internal-server-error.html;

Unfortunately, nginx replies with 418 OK: the error handler serves a static file, which on its own would produce 200 OK, and the error_page statement rewrites only the status code, not the reason phrase. Ideally it would say something to the effect of 418 Internal Server Error with Assumptive Load Balancer. If needed, you can get a custom reason phrase by routing the error to a @named location that proxies to a small backend (e.g. a PHP script), as sketched below.
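
A minimal sketch of that approach, assuming a small local handler on 127.0.0.1:8080 (a hypothetical address, not our production config) that emits the custom status line:

proxy_intercept_errors on;
error_page 500 = @teapot;

location @teapot {
    # Hypothetical local handler (e.g. a PHP script) replying with
    # "HTTP/1.1 418 Internal Server Error with Assumptive Load Balancer";
    # nginx preserves the upstream status line when proxying, so the
    # custom reason phrase reaches the client.
    proxy_pass http://127.0.0.1:8080;
}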

In JavaScript:

// 418 is our rewritten 500 (see the nginx config above); treat it like any real 5xx.
return response.status === 418 || (response.status >= 500 && response.status <= 599);

I'm a teapot

More details follow.

2015-04-28 11:53pm US/Eastern

Hello Linode,

We are testing our new configuration with intent to put NodeBalancer into production this week, but in our testing, we found that the passive health check feature is leading to many false positives. After a series of 500 errors from our application, which is otherwise performing fine, the NodeBalancer decides that the node is unhealthy and removes it from rotation.

(1) Is there any way to disable passive health checks? We only want active health checks on TCP handshaking.

(2) How many 50x errors happen before the NodeBalancer pulls the node out of rotation?

(3) Is (2) configurable?

(4) Is there any way that we can find out when/why the NodeBalancer removed a node from rotation, in general? i.e. log file or API call.

(5) Can we configure the NodeBalancer to never take the last remaining node out of rotation?

Our code generally has a let-it-crash approach. This is essential to our architecture, where the code running in the client can read 500 codes and determine what to do. Unless one of the questions above leads to some options, we will not be able to use NodeBalancer and we will have to question our long-term plans for using Linode. I'd rather stay with Linode than move to another vendor.

We're looking for thoughtful answers here, not a quick reply, so please let us know if we are able to use NodeBalancer or if it simply is not meant for let-it-crash code.

Thanks!

Ron

2015-04-29 12:30pm US/Eastern

To frame the problem a bit differently:

I think that the NodeBalancer design is limiting our use of it, specifically in that I cannot turn off passive health checks (1). We cannot get Real IP headers if we balance via TCP mode, so we use HTTPS. But HTTPS balancing assumes that 500 errors are the responsibility of the NodeBalancer.

In practice, errors happen, and the 500 is likely an issue with a sub-system within our code, which would mean that all nodes respond 500.

In an ideal world, database state has full integrity all the time. But bugs happen. If the database is in a weird state for exactly one user using exactly one feature, I don't want NodeBalancer to pull my node out of rotation.

If after considering that, you conclude that we should be using a status code other than 500, then I'd appreciate your recommendation. It's just that this is indeed a 500 Internal Server Error by our interpretation. The error is so deeply "internal" that it doesn't mean that the node itself is unhealthy.

Thanks for your time and thoughts on this,

Ron

2015-04-29 12:41pm US/Eastern

For (2), I suspect that it's not a matter of how many 500 errors happen, but that there's a lag after the first one happens. We are testing with 15 responses of 500 in rapid succession (loading various img sources on a page), and consistently the first 10 or 11 are 500s from our application, with the remainder returned as 503s by the NodeBalancer.

2015-04-29 1:06pm US/Eastern

Thank you for contacting Linode. Our NodeBalancer will detect error code 500 and remove the Linode from rotation when it is encountered. In order to prevent this, using a different error code when this internal issue occurs would be a suitable approach.

If you have any other questions or concerns, please let us know.

Sincerely,

Max

2015-04-29 1:08pm US/Eastern

Thanks for the response. I'm not sure how Linode would like to receive feature requests, but I am in strong support of making passive checks configurable, at least on/off (we want off).

-Ron

2015-04-29 1:22pm US/Eastern

Hello Ron,

I will certainly make sure to pass this request along to our developers. If you have any other questions or concerns please let us know.

Kind Regards,

Sal

2015-04-30 1:13am US/Eastern

This is strange. Exceptions happen. Why remove the node on 500? Due to the head-scratching nature of this configuration, we used 418 I'm a teapot in place of 500 responses.

But, we do now have NodeBalancer in production.

Please do take this feature request seriously. We've provided a lot of thought and engineering input on this thread. I'm concerned that this thread will be lost in support space instead of actually being escalated to the engineers who designed the NodeBalancer. All we need is a flag to say that the default passive check should allow 500 responses.

Yours,

Ron

2015-04-30 2:01am US/Eastern

Ron,

Thanks for the update! We appreciate your feedback and we welcome your suggestions to make our platform work better for you. We will let our developers know of your request, but due to the number of requests we have, we cannot offer a time frame for when they will have a chance to assess the benefits of these features.

To answer your question, we remove the node with the 500 error because it often indicates there is an issue with the node. Most of our customers do not utilize the functionality of the NodeBalancer to the level that you are trying to. Our platform is designed to cater more towards the developer side of the VPS business, so it would be great to incorporate these features into our system.

Let us know if you have any other questions for us!

Thanks,

Jeffrey Rosenberg

Linode Support Team

2015-04-30 10:48am US/Eastern

Thanks! We can close this ticket.

-Ron

#### #linode on oftc
#### 2015-04-29 US/Eastern
11:39:20 <rduplain> anyone know if NodeBalancer passive checks are configurable?
11:39:25 <rduplain> The let-it-crash programming model falls to pieces, because after some arbitrary number of 50x, your node gets removed from rotation.
11:42:29 <akerl> Um... that's the let-it-crash model working
11:43:01 <akerl> and no, they aren't configurable, except in the way that if you switch to TCP mode there are no passive checks
11:44:35 <rduplain> akerl: that's not my experience. passive checks are clearly happening with TCP active mode.
11:44:48 <akerl> that is false
11:45:02 <rduplain> I disagree.
11:45:16 <akerl> in TCP mode (the mode, not the style of active checks), there are no passive checks on 5xx errors because it's not reading the HTTP traffic, because it's just proxying the TCP connections
11:45:31 <rduplain> I see. Thanks for clarifying.
11:45:34 <rduplain> I'm using HTTPS mode.
11:46:37 <MajObviousman> so then how are you passively scanning responses for 50x?
11:46:46 <akerl> Yes, so you get 5xx passive checks. If your backends are throwing 5xx errors, it should mean they are unhealthy, ergo they get pulled
11:47:14 <rduplain> We send 500 responses to the client, we don't want NodeBalancer to make that assumption for us.
11:47:19 <akerl> MajObviousman: HTTPS mode terminates SSL at the NodeBal, and the NodeBal does passive health checks where if your backends throw 5xx codes back for requests from users, they get pulled
11:47:34 <akerl> rduplain: What's a scenario where you throw a 5xx that doesn't mean "server error"
11:47:51 <MajObviousman> ahhh duh. Yeah sorry, I forgot that was a feature
11:48:02 <MajObviousman> load up that NB!
11:48:57 -*- MajObviousman spent way too long working with load balancers where SSL termination was an expensive add-on feature, and so nobody opted for it
11:49:46 <akerl> I'm only really a fan of SSL termination when it's followed by SSL renegotiation, which doesn't happen here and has pretty bleh performance characteristics at scale
11:50:42 <MajObviousman> so then what's the purpose of the termination if you're just re-encrypting it again to the back-end node?
11:50:59 <MajObviousman> just to look at the contents?
11:51:17 <MajObviousman> you don't have to re-encrypt to do that
11:51:36 <akerl> Mostly load balancing. Also your backend nodes and balancer nodes can trust based on their own happy internal certs rather than the expensive dangerous public cert
11:52:34 <MajObviousman> to each his own, I suppose
11:53:25 -*- MajObviousman personally doesn't mind having the expensive, dangerous public cert in both the LB and back-end nodes, if SSL to the node is mandatory
11:55:43 <jrhunt> akerl, is your objection that keeping the private key to the public cert on *all* the webservers in the pool + the LBs is more dangerous than just having it on the LBs?
11:57:31 <akerl> I wouldn't call it an objection, but yes, keeping a secret in more places is absolutely less secure
11:57:38 <MajObviousman> sure
11:57:59 <MajObviousman> I suspect we are assigning vastly different weight to our risk assessments of that particular item
11:58:01 <akerl> In my case, the backend nodes have 0 access to the internet, and I also don't want to trust the network that is not entirely in my control
11:58:39 <akerl> so everything inside the circle does trust on internal CAs already, and everything outside the circle does trust on the external cert already
11:59:07 <akerl> Thus, having the LB -> backend connection use certs that already exist everywhere they need to be Just Makes Sense
12:02:45 <MajObviousman> it makes sense from a security standpoint, but there's a thought running around in the back of my head shouting, "This won't scale cheaply!"
12:03:30 <akerl> You mean the SSL renegotiation cost? or having to deal with certs for all the backend nodes?
12:04:03 <MajObviousman> no the certs are free
12:04:07 <MajObviousman> but SSL computation is not
12:04:14 <MajObviousman> you're trebling it
12:04:17 <akerl> Yea, that was my initial sadness :)
12:04:30 <akerl> "and has pretty bleh performance characteristics at scale"
12:04:40 <MajObviousman> oh yes, yes you did state that up front
12:04:55 <MajObviousman> again, different criteria in our individual risk assessments :)
12:05:12 <MajObviousman> unrelated topic, coming down to NC this year?
12:05:15 <MajObviousman> or did I already ask you that?
12:05:33 <akerl> I might be. Depends on how crazy the real world is
12:05:55 <akerl> is it at the same place this year?
12:06:00 <MajObviousman> yep
12:08:27 <rduplain> akerl: "server error" != node is unhealthy. in our case, the server error would happen on all nodes.
12:08:39 <rduplain> i.e. some weird state happened
12:09:08 <rduplain> akerl: the reason I moved from TCP mode to HTTPS is that I want real IP. Is there a way to get that with TCP mode?
12:09:14 <akerl> rduplain: I feel like this is a fundamental difference regarding the spec
12:09:33 <akerl> No, there is not a way to get the originating IP in TCP mode
12:09:46 <akerl> Yes, 5xx errors mean the node is unhealthy, because otherwise it wouldn't be throwing errors
12:10:04 <rduplain> errors happen
12:10:11 <rduplain> it's not the node that's unhealthy
12:10:16 <rduplain> so I don't want it removed
12:10:40 <akerl> If the error being thrown means something else ("client gave me a bad method", or "there were no results" or "try again later"), give the right code for that, they exist, all in happy 4xx land
12:10:43 <rduplain> I agree that NodeBalancer wasn't designed for that, but it's surprising to me, since we're not doing anything weird (though clearly you think we are).
12:10:49 <akerl> You are
12:10:55 <rduplain> Haha, okay.
12:11:02 <akerl> 5xx is designed to represent "the server is misbehaving"
12:11:29 <akerl> and, as you cited up front, the idea of let-it-crash is that a misbehaving server should die and be replaced with a non-misbehaving server
12:12:52 <rduplain> the issue here is that NodeBalancer doesn't let me configure it to have my code decide how to replace the misbehaving server
12:13:10 <rduplain> because it's not the server, it's some subsystem of mine
12:13:26 <akerl> That's because nodebalancers implement the balancing part
12:13:31 <rduplain> I got that.
12:13:36 <rduplain> I still want configuration here.
12:13:38 <akerl> You'd handle detecting and acting on failures via the API
12:13:51 <rduplain> I just want to turn off passive checks.
12:14:31 <rduplain> Thanks a lot for the discussion, akerl. This has been useful.