HAProxy: How to use different HTTP Host header based on each backend server address

Published on December 11th 2019


In certain projects we use external Elasticsearch instances running in the cloud (using elastic.co cloud). However, running a service in the cloud does not guarantee higher availability. It probably should, but real-life experience shows otherwise: we have seen problems and outages in the cloud, and the Elasticsearch cloud is no exception.

Fail over between multiple Elasticsearch instances

To suffer "less" from a failed Elasticsearch instance, we set up two instances in two different regions (Elastic.co uses AWS in the background). The idea: both Elasticsearch instances contain (more or less) the same data, so we can balance across them or fail over from one to the other. Placing a load balancer (HAProxy) between the application and the two instances is the obvious choice.

[Figure: Elasticsearch balancing]

However, when the applications tried to connect to Elasticsearch via the load balancer, the following error message showed up:

$ curl https://esloadbalancer.internal:9243
{"ok":false,"message":"Unknown deployment."}

Beware the HTTP Host header

This error message comes from the target Elasticsearch instance, which means the connection from the application via HAProxy to the Elasticsearch instance did work. However, Elastic.co's cloud instances do not serve requests for "unknown" host names. Only requests whose HTTP Host header contains the "real" instance name (e.g. 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io) are allowed. Elastic removed support for additional DNS names / CNAMEs a while ago. From their FAQ:

We don’t support custom SSL certificates, which means that a custom CNAME for an Elasticsearch Service endpoint such as mycluster.mycompanyname.com also is not supported.

What happens in this case is the following: the application sends an HTTP request to HAProxy using the Host header "esloadbalancer.internal". By default, HAProxy simply forwards the whole HTTP request, including its headers. The HTTP request from HAProxy to one of the Elasticsearch instances therefore looks like this (simplified with curl):

$ curl -H "Host: esloadbalancer.internal" https://12345678912345678912345678912345.eu-central-1.aws.cloud.es.io:9243
{"ok":false,"message":"Unknown deployment."}

Of course the Host header could be rewritten to something static using http-request set-header, for example:

backend es-https-out
  http-request set-header Host 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io
  server ES1 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io:9243 id 1 maxconn 2000 check ssl verify none
  server ES2 98765432198765432198765432198765.eu-central-1.aws.cloud.es.io:9243 id 2 maxconn 2000 check ssl verify none

But of course this would only work for the first backend server (ES1): the static Host header would not match ES2, which would return the same error again. And because this HAProxy deployment should be as dynamic as possible, and may even run with more than two backend servers, this is not a solution.

http-send-name-header to the rescue!

Luckily there's another possibility described in the HAProxy documentation: http-send-name-header. Without reading the documentation this probably wouldn't mean much, but it actually does exactly what we need in our case:

http-send-name-header [<header>]
Add the server name to a request. Use the header string given by <header>

The important part is understanding what "server name" means. It may not be obvious at first, but the option sets a header of your choice (a new one, or replacing an existing one) to the name of the backend server that handles the request. Using it with the "Host" header therefore overwrites the existing Host header with the server name. But how is the server name actually defined? Again from the documentation:

 server <name> <address>[:port] [settings ...]
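
To make the effect concrete, the header replacement can be modeled outside HAProxy. The following is only a rough sketch in shell: rewrite_host is a hypothetical helper, not part of HAProxy, and it simply replaces the Host line of a plain-text request with a given server name.

```shell
#!/bin/sh
# Hypothetical model of what "http-send-name-header Host" does to a request:
# replace the value of the Host header with the name of the chosen server.
rewrite_host() {
  awk -v name="$1" '
    tolower($1) == "host:" { print "Host: " name; next }  # overwrite Host
    { print }                                             # keep other lines
  '
}

# The request as the application sends it to the load balancer,
# rewritten for a backend server that is named ES1 in the HAProxy config.
printf 'GET / HTTP/1.1\nHost: esloadbalancer.internal\n\n' | rewrite_host ES1
```

The Host line printed by this model is "Host: ES1", so whatever is chosen as <name> ends up as the header value.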

In the config example above, the server name would be either "ES1" or "ES2", depending on which backend server was contacted. That would not work, of course, since neither is a host name the Elasticsearch service knows. But if the server name is set to the actual address of the backend server, the HTTP Host header is correct. This results in the following configuration:

backend es-https-out
  http-send-name-header Host
  server 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io 12345678912345678912345678912345.eu-central-1.aws.cloud.es.io:9243 id 1 maxconn 2000 check ssl verify none
  server 98765432198765432198765432198765.eu-central-1.aws.cloud.es.io 98765432198765432198765432198765.eu-central-1.aws.cloud.es.io:9243 id 2 maxconn 2000 check ssl verify none

Sure, not nice to the eyes but technically correct and working!
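
Because the server name and the server address have to stay identical, the backend section can also be generated from a list of host names instead of being typed twice per line. A minimal sketch, assuming a hypothetical helper called gen_backend and the same port and server options as in the configuration above:

```shell
#!/bin/sh
# Hypothetical helper: print an HAProxy backend in which every server name
# equals the server host name, so http-send-name-header sends a matching Host.
gen_backend() {
  echo "backend es-https-out"
  echo "  http-send-name-header Host"
  id=1
  for host in "$@"; do
    # Name and address are the same host; port 9243 as used in this article.
    echo "  server ${host} ${host}:9243 id ${id} maxconn 2000 check ssl verify none"
    id=$((id + 1))
  done
}

# Prints a backend section with one server line per host name given.
gen_backend \
  12345678912345678912345678912345.eu-central-1.aws.cloud.es.io \
  98765432198765432198765432198765.eu-central-1.aws.cloud.es.io
```

Adding a third instance then only means adding one more host name to the call. How the output is merged into the real HAProxy configuration is left open here.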

[Figure: HAProxy using different Host header based on backend server]

The applications can now correctly access the target Elasticsearch instances via the internal load balancer:

$ curl https://esloadbalancer.internal:9243 -u user:pass
{
  "name" : "instance-0000000007",
  "cluster_name" : "12345678912345678912345678912345",
  "cluster_uuid" : "b_XXXXXXXXXXXXXXXXXXXX",
  "version" : {
    "number" : "6.8.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "b506955",
    "build_date" : "2019-07-24T15:24:41.545295Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
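
To check that both backends really answer through the balancer, the request can be repeated and the returned cluster_name values counted. The following is a sketch only: cluster_name and probe_balancer are hypothetical helpers, the URL and credentials are the placeholders used in this article, and it assumes HAProxy actually distributes requests across both servers.

```shell
#!/bin/sh
# Hypothetical helper: extract the cluster_name value from the JSON on stdin.
cluster_name() {
  sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: ask the load balancer several times and count
# how often each cluster answered.
probe_balancer() {
  url=$1
  tries=${2:-4}
  i=0
  while [ "$i" -lt "$tries" ]; do
    curl -s -u user:pass "$url" | cluster_name
    i=$((i + 1))
  done | sort | uniq -c
}

# Usage (needs network access to the balancer):
# probe_balancer https://esloadbalancer.internal:9243 6
```

If only one cluster name ever shows up, either the second server is down or the balancing setup sends all traffic to one server.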

Of course this applies not only to Elasticsearch backends but to any web service or application that strictly checks the HTTP Host header (or any other header, for that matter).

