We cannot always share details about our work with customers, but it is still nice to show our achievements and share some solutions.
In certain projects we use external Elasticsearch instances running in the cloud (using elastic.co cloud). However, running a service in the cloud does not guarantee higher availability. It probably should, but real-life experience shows otherwise. We have had our share of problems and outages in the cloud, and the Elasticsearch cloud is no exception.
To suffer "less" from a failed Elasticsearch instance, we have set up two instances in two different regions (Elastic.co uses AWS in the background). The idea: both Elasticsearch instances contain (more or less) the same data, and we can balance between them or fail over from one instance to the other. Placing a load balancer (HAProxy) between the application and the two instances is the obvious choice.
However, when the applications tried to connect to Elasticsearch via the load balancer, the following error message showed up:
$ curl https://esloadbalancer.internal:9243
This error message comes from the target Elasticsearch instance. This means the connection from the application via HAProxy to the Elasticsearch instance worked. However, Elastic.co's cloud instances have stopped serving requests for "unknown" host names. Only requests whose HTTP Host header contains the "real" instance name (e.g. 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io) are allowed. Elastic removed support for additional DNS names / CNAMEs a while ago. From their FAQ:
We don’t support custom SSL certificates, which means that a custom CNAME for an Elasticsearch Service endpoint such as mycluster.mycompanyname.com also is not supported.
What happens in this case is the following: the application sends an HTTP request to HAProxy using the Host header "esloadbalancer.internal". By default, HAProxy simply forwards the whole HTTP request, including the headers. The HTTP request from HAProxy to one of the Elasticsearch instances now looks like this (simplified with curl):
$ curl -H "Host: esloadbalancer.internal" https://12345678912345678912345678912345.eu-central-1.aws.cloud.es.io:9243
Of course, the Host header could be rewritten to a static value using http-request set-header, for example:
http-request set-header Host 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io
server ES1 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io:9243 id 1 maxconn 2000 check ssl verify none
server ES2 98765432198765432198765432198765.eu-central-1.aws.cloud.es.io:9243 id 2 maxconn 2000 check ssl verify none
But this would only work for the first backend server (ES1): the rewritten Host header would not match ES2, which would return the same error again. And because this HAProxy deployment should be as dynamic as possible, and may even run with more than two backend servers, this is not a solution.
Luckily there's another possibility described in the HAProxy documentation: http-send-name-header. Without reading the documentation this probably wouldn't mean much, but it actually does exactly what we need in our case:
Add the server name to a request. Use the header string given by <header>
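Applied to the setup above, a minimal sketch could look like the following. It rests on a few assumptions: the backend name "elasticsearch" is made up for illustration, and the server names ES1/ES2 are replaced by the full instance host names, because the value sent in the header is the server name as written after the "server" keyword. How HAProxy handles Host as the target header may differ between versions, so check the documentation of the release in use.

```haproxy
# Hypothetical backend name; adjust to the existing configuration.
backend elasticsearch
    mode http
    # Send the name of the selected server in the Host header.
    http-send-name-header Host
    # The server name (first argument) must equal the host name the instance expects.
    server 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io:9243 id 1 maxconn 2000 check ssl verify none
    server 98765432198765432198765432198765.eu-central-1.aws.cloud.es.io 98765432198765432198765432198765.eu-central-1.aws.cloud.es.io:9243 id 2 maxconn 2000 check ssl verify none
```

With this in place, each backend server should receive its own host name in the Host header, no matter which one HAProxy selects, and further servers can be added without touching the header rule.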