Using HAProxy as an API Gateway, Part 3 [Health Checks]

Achieving high availability rests on having good health checks. HAProxy as an API gateway gives you several ways to do this.

Run your service on multiple servers.
Place your servers behind an HAProxy load balancer.
Enable health checking to quickly remove unresponsive servers.

These are all steps from a familiar and trusted playbook. Accomplishing them ensures that your applications stay highly available. What you may not have considered, though, is the flexibility that HAProxy gives you in how you can monitor your servers for downtime. HAProxy offers more ways to track the health of your servers, and to react more quickly, than any other load balancer, API gateway, or appliance on the market.

In the previous two blog posts of this series, you were introduced to using HAProxy as an API gateway and implementing OAuth authorization. You’ve seen how placing HAProxy in front of your API services gives you a simplified interface for clients. It provides easy load balancing, rate limiting and security—combined into a centralized service.

In this blog post, you will learn several ways to create highly available services using HAProxy as an API gateway. You’ll get to know active and passive health checks, how to use an external agent, how to monitor queue length, and strategies for avoiding failure. The benefits of each health-checking method will become apparent, allowing you to make the best choice for your scenario.

Active Health Checks

The easiest way to check whether a server is up is with the server directive’s check parameter. This is known as an active health check. It means that HAProxy polls the server on a fixed interval by trying to make a TCP connection. If it can’t contact a server, the check fails and that server is removed from the load balancing rotation.

Consider the following example:
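A minimal sketch of what that looks like (server names and addresses are placeholders):

```
backend webservers
    balance roundrobin
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check
```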

When you’ve enabled active checking, you can then add other, optional parameters, such as to change the polling interval and/or the number of allowed failed checks:

| Parameter | What it does |
|---|---|
| inter | The interval between checks (defaults to milliseconds, but you can set seconds with the s suffix). |
| downinter | The interval between checks when the server is already in the down state. |
| fall | The number of failed checks before marking the server as down. |
| rise | The number of successful checks before marking a server as up again. |
| fastinter | The interval between checks when the server is up but a check has failed, or down but a check has passed (i.e. while transitioning towards down or up). This allows you to speed up the interval during the transition. |

The next example demonstrates these parameters:
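For instance, a sketch that combines these parameters (the values are illustrative):

```
backend webservers
    balance roundrobin
    server web1 192.168.1.10:80 check inter 5s downinter 15s fastinter 2s fall 3 rise 5
```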

If you’re load balancing web applications, then instead of monitoring a server based on whether you can make a TCP connection, you can send an HTTP request. Add option httpchk to a backend and HAProxy will continually send requests and expect to receive valid HTTP responses that have a status code in the 2xx to 3xx range.
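In its simplest form, that is (a sketch):

```
backend webservers
    option httpchk
    server web1 192.168.1.10:80 check
```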

The option httpchk directive also lets you choose the HTTP method (e.g. GET, HEAD, OPTIONS), as well as the URL to monitor. Having this flexibility means that you can dedicate a specific webpage to be the health-check endpoint. For example, if you’re using a tool like Prometheus, which exposes its metrics on a dedicated page within the application, you could target its URL. Or, you could point HAProxy at the homepage of your website, which is arguably the most important.

You can also set a different IP address and/or port to check by adding the addr and port parameters, respectively, to the server line. In the following snippet, we target port 80 for our health checks, even though normal web traffic is sent to port 443:
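A sketch of that setup (addresses are placeholders; an addr parameter could be added in the same way to target a different IP):

```
backend webservers
    option httpchk HEAD /
    # Regular traffic is re-encrypted to port 443; health checks go to port 80
    server web1 192.168.1.10:443 ssl verify none check port 80
```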

Some web servers will reject requests that don’t include certain headers, such as a Host header. You can pass HTTP headers with the health check, like so:
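For example, with the classic option httpchk syntax the Host header can be embedded directly in the check request (the hostname is a placeholder; newer HAProxy versions also offer http-check send for this):

```
backend webservers
    option httpchk GET /health HTTP/1.1\r\nHost:\ www.example.com
    server web1 192.168.1.10:80 check
```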

You can accept only specific responses from the server, such as a specific status code or string within the HTTP body. Use http-check expect with either the status or string keyword. In the following example, only health checks that return a 200 OK response are classified as successful:
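A sketch:

```
backend webservers
    option httpchk GET /health
    http-check expect status 200
    server web1 192.168.1.10:80 check
```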

Or, require that the response body contain a certain case-sensitive string of text:
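For instance (the expected string is a placeholder):

```
backend webservers
    option httpchk GET /health
    http-check expect string OK
    server web1 192.168.1.10:80 check
```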

You can also have one server inherit its health status from another server by adding a track parameter to its server line. Set it to the name of a backend and server, such as be_web/web1. If the tracked server is down, the server tracking it will also be marked as down.
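A sketch of the idea, with placeholder backend and server names:

```
backend be_web
    option httpchk
    server web1 192.168.1.10:80 check

backend be_api
    # web1_api inherits its health status from be_web/web1
    server web1_api 192.168.1.10:8080 track be_web/web1
```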

Passive Health Checks

Active checks are easy to configure and provide a simple monitoring strategy. They work well in a lot of cases, but you may also wish to monitor real traffic for errors, which is known as passive health checking. In HAProxy, a passive health check is configured by adding an observe parameter to a server line. You can monitor at the TCP or HTTP layer. In the following example, the observe parameter is set to layer4 to watch real-time TCP connections.
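A sketch matching the description below:

```
backend webservers
    balance roundrobin
    server web1 192.168.1.10:80 check observe layer4 error-limit 10 on-error mark-down inter 2m
```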

Here, we’re observing all TCP connections for problems. We’ve set error-limit to 10, which means that ten connections must fail before the on-error parameter is invoked and marks the server as down. When that happens, you’ll see a message in the HAProxy logs telling you that the server has been marked down and why.

The only way to revive the server is with the regular health checks. In this example, we’ve used inter to set the polling health check interval to two minutes. When they do run and mark the server as healthy again, you’ll see a message in the logs telling you that the server is back up.

It’s a good idea to include option redispatch so that if a client runs into an error connecting, they’ll instantly be redirected to another healthy server within that backend. That way, they’ll never know that there was an issue. You can also add a retries directive to the backend so that HAProxy retries the connection the given number of times before giving up. The delay between retries is set with timeout connect.
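A sketch of those directives together (the values are illustrative):

```
backend webservers
    option redispatch
    retries 3
    timeout connect 5s
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check
```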

As an alternative to watching TCP traffic, you can monitor live HTTP requests by setting observe layer7. This will automatically remove servers if users experience HTTP errors.

If any webpage returns a status code other than 100-499, 501 or 505, it will count towards the error limit. Be aware, however, that if enough users are getting errors on even a single, misconfigured webpage, it could cause the entire server to be removed. If that same page affects all of your servers, you may see widespread loss of service, even if the rest of the site is functioning normally. Note that the active health checks will still run and eventually bring the server back online, as long as the server is reachable.

Gauging Health with an External Agent

Connecting to a service’s IP and port or sending an HTTP request will give you a good idea about whether that application is functioning. One downside, though, is that it doesn’t give you a rich sense of the server’s state, such as its CPU load, free disk space, and network throughput.

With HAProxy, you can query an external agent, which is a piece of software running on the server that’s separate from the application you’re load balancing. Since the agent has full access to the remote server, it has the ability to check its vitals more closely.

External agents have an edge over other types of health checks: they can send signals back to HAProxy to force some kind of change in state. For example, they can mark the server as up or down, put it into maintenance mode, change the percentage of traffic flowing to it, or increase and decrease the maximum number of concurrent connections allowed. The agent will trigger your chosen action when some condition occurs, such as when CPU usage spikes or disk space runs low.

Consider this example:
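A sketch, with placeholder addresses and an agent assumed to listen on port 9999:

```
backend webservers
    balance roundrobin
    server web1 192.168.1.10:80 check weight 100 agent-check agent-addr 192.168.1.10 agent-port 9999 agent-inter 5s
```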

The server directive’s agent-check parameter tells HAProxy to connect to an external agent. The agent-addr and agent-port parameters set the agent’s IP and port. The interval between checks is set with agent-inter. Note that this communication is not HTTP, but rather a raw TCP connection over which the agent communicates back to HAProxy by sending ASCII text over the wire.

Here are a few things that it might send back. Note that an end-of-line character (e.g. \n) is required after the message:

| Text it sends back | Result |
|---|---|
| down\n | The server is put into the down state. |
| up\n | The server is put into the up state. |
| maint\n | The server is put into maintenance mode. |
| ready\n | The server is taken out of maintenance mode. |
| 50%\n | The server's weight is halved. |
| maxconn:10\n | The server's maxconn is set to 10. |

The agent can be any custom process that can return a string whenever HAProxy connects to it. The following Go code creates a TCP server that listens on port 9999 and measures the current CPU idle time. If that metric falls below 10, the code sends back the string, 50%\n, setting the server’s weight in HAProxy to half of what it is currently.
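The original program isn't reproduced here, so below is a minimal Go sketch of the idea. It assumes the third-party gopsutil library for sampling CPU usage (the original may have measured idle time differently); the port 9999 and the idle threshold of 10 come from the description above.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"

	"github.com/shirou/gopsutil/cpu"
)

// cpuIdle returns the overall CPU idle percentage over a one-second sample.
func cpuIdle() (float64, error) {
	busy, err := cpu.Percent(time.Second, false)
	if err != nil {
		return 0, err
	}
	if len(busy) == 0 {
		return 0, fmt.Errorf("no CPU data")
	}
	return 100 - busy[0], nil
}

func main() {
	ln, err := net.Listen("tcp", ":9999")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go func(c net.Conn) {
			defer c.Close()
			idle, err := cpuIdle()
			if err == nil && idle < 10 {
				// High CPU usage: tell HAProxy to halve this server's weight.
				fmt.Fprint(c, "50%\n")
				return
			}
			// Otherwise report full weight.
			fmt.Fprint(c, "100%\n")
		}(conn)
	}
}
```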

You can then artificially spike the CPU with a tool like stress. Use the HAProxy Stats page to see the effect on the load balancer. Here, the server’s weight began at 100, but is set to 50 when there is high CPU usage.

You can enable the Stats page by adding a listen section with a stats enable directive to your HAProxy configuration file. This will start the HAProxy Stats page on port 8404:
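A sketch of such a section:

```
listen stats
    bind :8404
    stats enable
    stats uri /
    stats refresh 10s
```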

Queue Length as a Health Indicator

When you’re thinking about how to keep tabs on your servers’ health, you might wonder, what constitutes healthy anyway? Certainly, loss of connectivity and returning errors fall into the unhealthy category. Those are failed states. However, before it gets to that point, maybe there are warning signs that the server is on a downward spiral, moving towards failure.

One early warning sign is queue length. Queue length is the number of sessions in HAProxy that are waiting for the next available connection slot to a server. Whereas other load balancers may simply fail when there are no servers ready to receive the connection, HAProxy queues clients until a server becomes available.

Following best practices, you should have a maxconn specified on each server line. Without one, there’s no limit to the number of connections HAProxy can open. While that might seem like a good thing, servers have finite capacity. The maxconn setting caps it so that the server isn’t overloaded with work. The following configuration limits the number of concurrent connections to 30:
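For example (addresses are placeholders):

```
backend webservers
    balance roundrobin
    server web1 192.168.1.10:80 check maxconn 30
```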

Under heavy load, or if the server is sluggish, the connections may exceed maxconn and will then be queued in HAProxy. If the queue length grows past a certain number or if sessions in the queue have a long wait time, it’s a sign that the server isn’t able to keep up with demand. Maybe it’s processing a slow database query or perhaps the server isn’t powerful enough to handle the volume of requests.

When queue length grows, clients begin to experience increasingly long wait times. With a bottleneck, downstream client applications may exhaust their worker threads, run into their own timeouts, or fail in unexpected ways. You can see why monitoring queue length is a good way to gauge the health of the system. So, what can you do when you see this happening?

Applying Backpressure with a Timeout

By default, each backend queue in HAProxy is unbounded and there’s no timeout on it whatsoever. However, it’s easy to add a limit with the timeout queue directive. This ensures that clients don’t stay queued for too long before receiving a 503 Service Unavailable error.
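A sketch, using a five-second limit as an illustrative value:

```
backend webservers
    timeout queue 5s
    server web1 192.168.1.10:80 check maxconn 30
```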

You might think that sending back an error is bad news. If we didn’t queue at all, then I’d say that was true. However, under unusual circumstances, failure can be better than leaving clients waiting for a long time. Client applications can build logic around this. For example, they could retry after a few seconds, display a message to the user, or even scale out server resources if given access to the right APIs. Giving feedback to downstream components about the health of a server is known as backpressure. It’s a mechanism for letting the client react to early warning signs, such as by backing off.

If you do not set timeout queue, it defaults to the value of timeout connect. It can also be set in a defaults section.

Failing Over to a Remote Load Balancer

Another way to deal with a growing queue is to direct clients to another load balancer. Consider a scenario where you have geographically redundant data centers: one in North America and another in Europe. Ordinarily, you’d send clients to the data center nearest to them. However, if a backend’s avg_queue, which is the combined queue length of all servers within that backend divided by the number of servers, grows past a certain point, you could send clients to the other data center’s load balancer.

Your North American load balancer would be configured like this:
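The original configuration isn't reproduced here; the following sketch shows the idea, with placeholder addresses and hostnames and a queue threshold of 20 as described below:

```
frontend fe_main
    bind :80
    acl northamerica_toobusy avg_queue(northamerica) gt 20
    acl europe_up srv_is_up(europe/eu_loadbalancer)
    redirect location https://eu.example.com code 307 if northamerica_toobusy europe_up
    default_backend northamerica

backend northamerica
    balance roundrobin
    server web1 192.168.1.10:80 check maxconn 30
    server web2 192.168.1.11:80 check maxconn 30

backend europe
    # Health is judged via the URL exposed by the European load balancer (see below)
    option httpchk GET /check
    server eu_loadbalancer eu.example.com:8080 check
```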

In the frontend section, there are two ACLs. The first, northamerica_toobusy, checks if the average queue length of the northamerica backend is greater than 20. The second, europe_up, uses the srv_is_up fetch method to check whether the server in the europe backend is up.

The interesting thing here is that the europe_up ACL checks a remote address in the European data center by pointing to a backend that contains the address of the European data center’s load balancer. In other words, the North American load balancer is not checking the health of the European servers directly; it monitors a URL that the European load balancer exposes. That’s because the North American load balancer has no information about connections it didn’t proxy and so can’t see the full picture of those servers’ state. This avoids the problem of each load balancer sending clients directly to the origin servers without adequate knowledge of all of the connections in play, which could swamp those servers with connections from multiple load balancers.

When the northamerica backend has too many sessions queued and the europe backend is up, clients get redirected to the Europe URL. It’s a way to offload work until things calm down. The European load balancer would be configured like this:
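Again as a sketch (placeholder addresses; the /check URI and port 8080 match the description below):

```
frontend health_check
    bind :8080
    acl europe_toobusy avg_queue(europe) gt 20
    monitor-uri /check
    monitor fail if europe_toobusy

frontend fe_main
    bind :80
    default_backend europe

backend europe
    balance roundrobin
    server web1 192.168.2.10:80 check maxconn 30
    server web2 192.168.2.11:80 check maxconn 30
```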

Here, you’re using the monitor-uri directive to intercept requests for /check on port 8080 and having HAProxy automatically send back a 200 OK response. However, if the europe backend becomes too busy, the monitor fail directive becomes true and begins sending back 503 Service Unavailable. In this way, the European data center can be monitored from the North American data center and requests will only be redirected there if it’s healthy.

Real-Time Dashboard

HAProxy Enterprise combines the stable codebase of HAProxy with an advanced suite of add-ons, expert support and professional services. One of these add-ons is the Real-Time Dashboard, which allows you to monitor the health of a cluster of load balancers.

Out of the box, HAProxy provides the Stats page, which gives you observability over a single load balancer instance. The Real-Time Dashboard aggregates information across multiple load balancers for easier and faster administration. You can enable, disable, and drain connections from any of your servers, and the list can be filtered by server name or by server status.

Conclusion

In this blog post, you learned various ways to health check your servers so that your APIs maintain a high level of reliability. This includes enabling active or passive health checks, communicating with an external agent, and using queue length as an early warning signal for failure.

HAProxy Enterprise offers expert support services and a variety of advanced security modules that are necessary in today’s threat landscape. Want to learn more? Sign up for a free trial or contact us. You can stay in the loop about topics like these by subscribing to our blog or following us on Twitter. You can also join the conversation on Slack.

Test Driving “Power of Two Random Choices” Load Balancing

The Power of Two Random Choices load-balancing algorithm has piqued some curiosity. In this blog post, we see how it stacks up against other modern-day algorithms available in HAProxy.

Recently, I was asked twice about my opinion on supporting an algorithm known as the Power of Two Random Choices. Some believe that it’s the next big thing in load balancing since it was recently implemented in some other load balancers.

I took some time to read the 2001 report of the research by Mitzenmacher, Richa & Sitaraman and also the original 1996 study by Mitzenmacher. Attentive readers will note that the idea first emerged 23 years ago—before HAProxy even existed—and was further explored five years later while HAProxy was still in its infancy. I really liked the overall explanation, even though, I must confess, I quickly skimmed some of the mathematical proofs. The approach sounded inherently good and efficient.

The principle is this: The algorithm decides which server will respond to each request by picking two random servers from the fleet and choosing the one with the fewest active connections. This check for active connections presents the nice property of fairness by making it harder for heavily loaded servers to be chosen while less loaded servers are available. Randomly choosing two servers makes it possible to add this fairness without needing to check each individual server’s load.

The whole purpose is to save the load balancer from the cost of having to check all servers, while still making a better choice than a purely random decision. The papers discuss how the algorithm gets even better when choosing between more than two servers, although they report that the extra gains are less impressive and only become linear past two.

The principle is extremely smart, while also easy to understand: by randomly picking a small number of entries among a list and then selecting the least loaded one, the probability of choosing an overloaded server decreases. This is especially true as the number of servers in the fleet grows and the distribution of selected servers widens. The system balances itself: The wider the distribution, the fairer the outcome.
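Purely as an illustration (this is not HAProxy's code), the selection step can be sketched in a few lines of Go:

```go
package main

import (
	"fmt"
	"math/rand"
)

type server struct {
	name        string
	activeConns int
}

// pickPowerOfTwo draws two servers at random and keeps the less loaded one.
func pickPowerOfTwo(fleet []server) server {
	a := fleet[rand.Intn(len(fleet))]
	b := fleet[rand.Intn(len(fleet))]
	if b.activeConns < a.activeConns {
		return b
	}
	return a
}

func main() {
	fleet := []server{{"web1", 12}, {"web2", 3}, {"web3", 27}}
	fmt.Println("chosen:", pickPowerOfTwo(fleet).name)
}
```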

It sounds good on paper. But given that it’s been around for more than two decades and nobody has asked for it during this time, you have to wonder whether it still provides any benefit over other algorithms employed by modern load balancers.

A Closer Look

Let’s think about what makes the Power of Two algorithm stand apart and, at the same time, compare it with alternatives available in HAProxy.

Fairness, or the likelihood that traffic will be spread evenly across servers, is our first consideration. When load balancing equally loaded servers, you’d typically choose the Round Robin algorithm. While Round Robin and Power of Two differ in appearance and you’d think that Round Robin would be fairer, statistically they will provide nearly the same results.

For dealing with varying response times, HAProxy supports the Least Connections algorithm, which picks the least loaded server among all of them and not only among two like Power of Two does by default. On the face of it, the Power of Two algorithm seems like the better choice since it doesn’t have to compare the load of every server before choosing one. However, when you take a closer look, you see that this advantage typically only matters when a load balancer isn’t able to implement Least Connections efficiently. When using Least Connections in HAProxy, the load balancer always knows which server has the least amount of load because it sorts them by outstanding request count into a binary tree. Finding the least loaded one is as trivial as picking the first one in the tree.

There is something that randomness provides when dealing with highly dynamic environments, though. Newly added servers aren’t swamped by connections, which can happen when a server that has no load is hit with a burst of traffic until it reaches the same level of load as its peers. A naive Least Connections algorithm would send all of the traffic to the newly added server. HAProxy instead implements a slow-start mechanism that progressively raises a new server’s weight over a period of time. A slow start is usually enabled on services that need a warm-up period, such as those relying on the Least Connections or Consistent Hashing algorithms. The fact that some other load balancers do not fully support weights may be the reason why they choose Power of Two.

With all of this in mind, the HAProxy implementation of Least Connections seems to be almost equivalent to the Power of Two algorithm. However, the studies on Power of Two address a particular, interesting point that is not sufficiently emphasized. That is, when operating multiple load balancers, which all must make an independent decision in parallel, there’s a risk that they will choose the same server and overload it. This is especially true if they base their decision on a local measurement, or, in other words, pick the least loaded server without consulting one another. Some randomness might alleviate this problem.

The Random algorithm was already added to HAProxy to address this specific concern. Multiple load balancers is common in service mesh architectures where HAProxy is deployed as a sidecar proxy. In this scenario, there are as many load balancers as there are service nodes.

Given that we had already implemented Random a few months ago, adding support for an argument to configure the number of draws, or servers to choose from, and pick the least loaded one was absolutely trivial to do. It only required 17 lines of code. So, the motivation to perform a benchmark and get true data, rather than anecdotal evidence, started to build up.

Benchmark Setup

For our benchmark tests, we needed an application and landed on a simple one made of four REST/JSON services. You can download it from our git repository. This application, called MyTime, greets a user and tells them what time it is on their side of the world, based upon the time zone stored in the local database.

The application relies on a Time service that returns the current time of day, a User service that mimics a database of known users and their attributes (e.g. full name and time zone), and a Log service that logs the event represented by the user’s request. The Log service also queries Time and User to retrieve some extra information to be logged. All of these services are accessed through a REST/JSON API.

This application was deployed on six low-power, quad-core ARM servers from our lab with HAProxy as a sidecar next to each service. Since HTTP traffic has to reach the MyTime application, a seventh node was prepared with HAProxy to serve as the edge load balancer in front of the cluster.

This results in six load balancers being present in front of the Log service (the MyTime sidecars) and 12 load balancers in front of the Time and User services (MyTime and Log sidecars).

Several algorithm settings were compared under sustained load to see which would fare better:

  • Round Robin
  • Least Connections
  • Random (with a draw of 1)
  • Power of Two
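For reference, these settings roughly correspond to the following balance directives; this is a sketch rather than the exact benchmark configuration, and as noted in the conclusion, configuring the number of draws for Random (for example random(2) for Power of Two) only became available in HAProxy 2.0:

```
backend time_service
    # One of the following per test run:
    balance roundrobin
    # balance leastconn
    # balance random      # a single random draw
    # balance random(2)   # Power of Two (HAProxy 2.0 and later)
    server time1 10.0.0.11:8001 check
```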

In addition, given that we have an edge load balancer available, it sounded really appealing to compare this setup to the optimal case where a single load balancer is present in front of each service. So, we added an extra test with an external load balancer, which is one instance per service, running on the edge machine. In this case, it doesn’t change anything for the services, it’s just that their sidecars only know a single endpoint per service, which is the target service’s load balancer.

The tests were run under three different load levels:

  • No contention: each service runs on a dedicated CPU core, to see how the algorithms compare for services running on dedicated servers with no external stress.
  • Moderate contention: all four services share two CPU cores, to match VMs and containers running on moderately loaded servers, like in most clouds.
  • High contention: all four services share the same CPU core, to match the case of overloaded VMs or containers with noisy neighbors.

For each test, we measure:

  • Application performance: the number of requests per second during the whole test.
  • User experience: the application’s response time during the whole test.
  • Load-balancing fairness: the maximum load inflicted upon each service during the test. Or, in other words, how evenly servers were chosen by a load-balancing algorithm.

Results

No Contention

Under no contention, all algorithms are roughly equal. Random does the least well. Power of Two is not as good as Least Connections or Round Robin, but it is very close. This makes sense because, if there is no contention, the highest fairness should provide the smoothest distribution of load across servers and, hence, the lowest queuing and the lowest response times. But based on this, all algorithms, except for Random, could be considered equal.

Moderate Contention

Under moderate contention, Random becomes very bad in terms of load distribution, which matches what was predicted in the studies. The request rate is around 10% lower than the best alternatives. Round-robin is not good either. Very likely, it causes some excess queuing on already slow servers and leaves them with few options to recover from transient overloads.

Power of Two shows a much better distribution than these last two, with peak connection counts about 30% lower. However, its performance in requests-per-second and response time is exactly identical to Round Robin. In that regard, it’s already better than Round Robin in this case.

Least Connections performs very well here, with peak loads about 4% lower than Power of Two. It also shows request rates about 4% higher and response times that are about 4% lower. This probably matches the paper’s prediction regarding the possibility that drawing more than two nodes will improve the overall distribution and performance. The Least Connections algorithm compares the load of all servers, not just two.

Finally, the external load balancer is, as expected, the best solution given that it always picks the best server based on its centralized knowledge of each server’s load. Its performance is about 3% better than the distributed Least Connections (i.e. sidecars connecting directly to other service nodes) on all metrics. So, it is about 7% better than Power of Two. It’s worth noting that this extra 3% difference is not huge, but it indicates that one server could be turned off in a farm of 30 servers by simply pointing all sidecars to each service’s edge load balancer. That is, assuming that the edge load balancer is sufficiently sized.

It was also interesting to see that the complete test finished 26 seconds faster on Least Connections than on Power of Two or Round Robin. The external load balancer test finished 38 seconds faster than Power of Two or Round Robin. While most often, this doesn’t matter for web-facing applications, it definitely affects microservice architectures that are comprised of long service chains. In that scenario, the total processing time directly depends on the processing time of individual services.

High Contention

Under high contention, the load distribution provided by the Random algorithm is disastrous. Round Robin is also quite poor.

Power of Two manages to reduce the peak load on the servers beyond what others manage to do. Interestingly, it is actually an imbalance that causes this reduction due to the competition between the Time and the User services. Time was granted less CPU time and User was granted more, but they still achieved the same results. It’s possible that we’re observing some artifacts of the scheduler’s tick (4 ms at 250 Hz), which fixes the period of time during which a task can run uninterrupted.

Regarding the measured performance, Power of Two remains at the exact same request rate and response time as Round Robin. Least Connections consistently remains about 4% better in both reports. The centralized load balancer is even 3% better.

Here, the complete test finished 24 seconds earlier on Least Connections than on Power of Two or Round Robin. The external load balancer test finished 36 seconds earlier than Power of Two or Round Robin.

Analysis

As expected, in all cases, relying on an external, central load balancer is better in environments where there is moderate or high contention for system resources. This is quickly followed by the Least Connections algorithm, then by Power of Two. Finally, Round Robin, which is not very good in this case, and the Random algorithm, which creates significant peaks and has a distribution that likely follows a long tail, round out the set.

Power of Two consistently gives the same performance in regards to requests per second and average response time as Round Robin. However, it has a much better distribution under load. It makes sense to always use it as an alternative to Round Robin when a good Least Connections implementation is not available. Least Connections consistently performs better.

How is it possible that Least Connections works better than Power of Two in a distributed environment? Very likely the explanation stems from the Round Robin effect that happens between two similarly loaded servers in Least Connections. Indeed, if one load balancer sees servers A and B as equal and another load balancer sees servers A and C as equal, a naive Least Connections will pick A for both of them. That would result in server A taking twice the expected load. In the case of HAProxy’s Least Connections, it depends on the order in which they were released. For two pairs of servers seen by two load balancers, you can have four possible outcomes:

| LB 1 selects | LB 2 selects | Outcome |
|---|---|---|
| A | A | Bad |
| A | C | Good |
| B | A | Good |
| B | C | Good |

As you can see, the case where two load balancers pick, and thus overload, the same server only happens a quarter of the time. With three load balancers the probability is even lower, and it drops further as the number of servers grows. So this form of randomness is already present in the construction of HAProxy’s Least Connections algorithm, which always tries to maximize the interval between two picks of the same server.

Conclusion

At the very least, Power of Two is always better than Random. So, we decided to change the default number of draws for the Random algorithm from one to two, matching Power of Two. This will improve the quality of the Random algorithm’s distribution. It, of course, also supports weights. This will be available in the 2.0 release, wherein you’ll be able to set the number of draws as a parameter when using the Random algorithm.

The HAProxy implementation of Least Connections is already significantly better than Power of Two and matches the highest degree of accuracy that the theoretical, best scenario version of the algorithm can produce: comparing all servers within the fleet. It avoids inflicting bursts of traffic onto a single server and, so, is perfectly suitable for both single and distributed load balancer deployments.

As explained in the papers cited, Power of Two was designed as a poor man’s Least Connections to be used in situations where a true Least Connections would be impractical, difficult to implement, or come with a significant performance hit. While it’s an excellent workaround, especially as an alternative to Round Robin, whenever you have the choice you should still prefer Least Connections since it demonstrates better server response times and reduces the cost of server load and processing power.

Furthermore, whenever possible (i.e. if it doesn’t induce any extra cost), you should prefer to configure the sidecars to always pass through a central load balancer as it improves performance by about 3% over the distributed Least Connections. It is also 7% better than Power of Two running on sidecars.

Introduction to HAProxy Logging

When it comes to operationalizing your log data, HAProxy provides a wealth of information. In this blog post, we demonstrate how to set up HAProxy logging, target a Syslog server, understand the log fields, and suggest some helpful tools for parsing log files.

[On Demand Webinar] Deep Dive Into HAProxy Logging

HAProxy sits in the critical path of your infrastructure. Whether used as an edge load balancer, a sidecar, or as a Kubernetes ingress controller, getting meaningful logs out of HAProxy is a must-have.

Logging gives you insights about each connection and request. It enables observability needed for troubleshooting and can even be used to detect problems early. It’s one of the many ways to get information from HAProxy. Other ways include getting metrics using the Stats page or Runtime API, setting up email alerts, and making use of the various open-source integrations for storing log or statistical data over time. HAProxy provides very detailed logs with millisecond accuracy and generates a wealth of information about traffic flowing into your infrastructure. This includes:

  • Metrics about the traffic: timing data, connections counters, traffic size, etc.
  • Information about HAProxy decisions: content switching, filtering, persistence, etc.
  • Information about requests and responses: headers, status codes, payloads, etc.
  • The termination status of a session and the ability to track where failures are occurring (on the client side or the server side)

In this post, you’ll learn how to configure HAProxy logging and how to read the log messages that it generates. We’ll then list some tools that you’ll find helpful when operationalizing your log data.

Syslog Server

HAProxy can emit log messages for processing by a syslog server. This is compatible with familiar syslog tools like Rsyslog, as well as the newer systemd service journald. You can also utilize various log forwarders like Logstash and Fluentd to receive Syslog messages from HAProxy and ship them to a central log aggregator.

If you are working in a container environment, HAProxy supports Cloud Native Logging which allows you to send the log messages to stdout and stderr. In that case, skip to the next section where you’ll see how.

Before looking into how to enable logging via the HAProxy configuration file, you should first make sure that you have a Syslog server, such as rsyslog, configured to receive the logs. On Ubuntu, you’d install rsyslog using the apt package manager, like so:
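For example (assuming a Debian or Ubuntu system; rsyslog is often preinstalled already):

```
sudo apt update
sudo apt install -y rsyslog
```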

Once rsyslog is installed, edit its configuration to handle ingesting HAProxy log messages. Add the following either to /etc/rsyslog.conf or to a new file within the rsyslog.d directory, like /etc/rsyslog.d/haproxy.conf:
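A sketch of what that file might contain, matching the behavior described below (a UDP input on 127.0.0.1:514 and two output files split by severity); the exact directives in the original example may differ:

```
# Collect log messages over UDP
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514

# Split HAProxy messages into separate files based on severity level
local0.* /var/log/haproxy-traffic.log
local0.notice /var/log/haproxy-admin.log
```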

Then, restart the rsyslog service. In the example above, rsyslog listens on the IP loopback address, 127.0.0.1, on the default UDP port 514. This particular config writes to two log files. The file chosen is based on the severity level with which the message was logged. To understand this, take a closer look at the last two lines in the file, which begin with local0.* and local0.notice.

The Syslog standard prescribes that each logged message should be assigned a facility code and a severity level. Given the example rsyslog configuration above, you can assume that we’ll be configuring HAProxy to send all of its log messages with a facility code of local0.

The severity level is specified after the facility code, separated by a dot. Here, the first line captures messages at all severity levels and writes them to a file called haproxy-traffic.log. The second line captures only notice-level messages and above, logging them to a file called haproxy-admin.log.

HAProxy is hardcoded to use certain severity levels when sending certain messages. For example, it categorizes log messages related to connections and HTTP requests with the info severity level. Other events are categorized using one of the other, less verbose levels. From most to least important, the severity levels are:

| Severity Level | HAProxy Logs |
|---|---|
| emerg | Errors such as running out of operating system file descriptors. |
| alert | Some rare cases where something unexpected has happened, such as being unable to cache a response. |
| crit | Not used. |
| err | Errors such as being unable to parse a map file, being unable to parse the HAProxy configuration file, and when an operation on a stick table fails. |
| warning | Certain important, but non-critical, errors such as failing to set a request header or failing to connect to a DNS nameserver. |
| notice | Changes to a server's state, such as being UP or DOWN or when a server is disabled. Other events at startup, such as starting proxies and loading modules, are also included. Health check logging, if enabled, also uses this level. |
| info | TCP connection and HTTP request details and errors. |
| debug | You may write custom Lua code that logs debug messages. |

Modern Linux distributions are shipped with the service manager systemd, which introduces journald for collecting and storing logs. The journald service is not a Syslog implementation, but it is Syslog compatible since it will listen on the same /dev/log socket. It will collect the received logs and allow the user to filter them by facility code and/or severity level using the equivalent journald fields (SYSLOG_FACILITY, PRIORITY).

HAProxy Logging Configuration

The HAProxy configuration manual explains that logging can be enabled with two steps: The first is to specify a Syslog server in the global section by using a log directive:
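In its simplest form, that looks like this:

```
global
    log 127.0.0.1:514 local0
```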

The log directive instructs HAProxy to send logs to the Syslog server listening at 127.0.0.1:514. Messages are sent with facility local0, which is one of the standard, user-defined Syslog facilities. It’s also the facility that our rsyslog configuration is expecting. You can add more than one log statement to send output to multiple Syslog servers.

You can control how much information is logged by adding a Syslog level to the end of the line:
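For instance:

```
global
    log 127.0.0.1:514 local0 info
```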

The second step to configuring logging is to update the different proxies (frontend, backend, and listen sections) to send messages to the Syslog server(s) configured in the global section. This is done by adding a log global directive. You can add it to the defaults section, as shown:
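A sketch:

```
defaults
    log global
```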

The log global directive basically says, use the log line that was set in the global section. Putting a log global directive into the defaults section is equivalent to putting it into all of the subsequent proxy sections. So, this will enable logging on all proxies. You can read more about the sections of an HAProxy configuration file in our blog post The Four Essential Sections of an HAProxy Configuration.

By default, output from HAProxy is minimal. Adding the line option httplog to your defaults section will enable more verbose HTTP logging, which we will explain in more detail later.

A typical HAProxy configuration looks like this:
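Something along these lines (addresses and timeouts are placeholders):

```
global
    log 127.0.0.1:514 local0

defaults
    log global
    mode http
    option httplog
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend fe_main
    bind :80
    default_backend be_main

backend be_main
    server web1 192.168.1.10:80 check
```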

Using global logging rules is the most common HAProxy setup, but you can put them directly into a frontend section instead. It can be useful to have a different logging configuration as a one-off. For example, you might want to point to a different target Syslog server, use a different logging facility, or capture different severity levels depending on the use case of the backend application. Consider the following example in which the frontend sections, fe_site1 and fe_site2, set different IP addresses and severity levels:
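A sketch (the Syslog addresses, severity levels, and backend names are placeholders):

```
frontend fe_site1
    bind :80
    log 10.0.0.5:514 local0 notice
    default_backend be_site1

frontend fe_site2
    bind :81
    log 10.0.0.6:514 local0 warning
    default_backend be_site2
```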

When logging to a local Syslog service, writing to a UNIX socket can be faster than targeting the TCP loopback address. Generally, on Linux systems, a UNIX socket listening for Syslog messages is available at /dev/log because this is where the syslog() function of the GNU C library is sending messages by default. Target the UNIX socket like this:
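For example:

```
global
    log /dev/log local0
```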

However, you should keep in mind that if you’re going to use a UNIX socket for logging and at the same time you are running HAProxy within a chrooted environment—or you let HAProxy create a chroot directory for you by using the chroot configuration directive—then the UNIX socket must be made available within that chroot directory. This can be done in one of two ways.

First, when rsyslog starts up, it can create a new listening socket within the chroot filesystem. Add the following to your HAProxy rsyslog configuration file:
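A sketch, assuming the chroot directory is /var/lib/haproxy (adjust to match your chroot setting):

```
# Create an additional Syslog listening socket inside the HAProxy chroot
$ModLoad imuxsock
$AddUnixListenSocket /var/lib/haproxy/dev/log
```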

The second way is to manually add the socket to the chroot filesystem by using the mount command with the --bind option.
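A sketch of that approach, again assuming /var/lib/haproxy as the chroot directory:

```
mkdir -p /var/lib/haproxy/dev
touch /var/lib/haproxy/dev/log
mount --bind /dev/log /var/lib/haproxy/dev/log
```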

Be sure to add an entry to your /etc/fstab file or to a systemd unit file so that the mount persists after a reboot. Once you have logging configured, you’ll want to understand how the messages are structured. In the next section, you’ll see the fields that make up the TCP and HTTP-level logs.

If you need to limit the amount of data stored, one way is to sample only a portion of log messages. Set the log level to silent for a random number of requests, like so:
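A sketch that keeps roughly 5% of requests and silences the rest (the percentage is illustrative):

```
frontend fe_main
    bind :80
    # Log about 5% of requests; set the rest to the silent level
    http-request set-log-level silent unless { rand(100) lt 5 }
    default_backend be_main
```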

Note that, if possible, it’s better to capture as much data as you can. That way, you don’t have missing information when you need it most. You can also modify the ACL expression so that certain conditions override the rule.

Another way to limit the number of messages logged is to set option dontlog-normal in your defaults or frontend. That way, only timeouts, retries and errors are captured. You probably would not want to enable this all of the time, but only during certain times, such as when performing benchmarking tests.

If you are running HAProxy inside of a Docker container and you’re using HAProxy version 1.9, then instead of sending log output to a Syslog server you can send it to stdout and/or stderr. Set the address to stdout or stderr, respectively. In that case, it’s also preferable to set the format of the message to raw, like so:
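For example (the facility here is arbitrary):

```
global
    log stdout format raw daemon
```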

HAProxy Log Format

The type of logging you’ll see is determined by the proxy mode that you set within HAProxy. HAProxy can operate either as a Layer 4 (TCP) proxy or as Layer 7 (HTTP) proxy. TCP mode is the default. In this mode, a full-duplex connection is established between clients and servers, and no layer 7 examination will be performed. If you’ve set your rsyslog configuration based on our discussion in the first section, you’ll find the log file at /var/log/haproxy-traffic.log.

When in TCP mode, which is set by adding mode tcp, you should also add option tcplog. With this option, the log format defaults to a structure that provides useful information like Layer 4 connection details, timers, byte count, etc. If you were to re-create this format using log-format, which is used to set a custom format, it would look like this:
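Roughly, that works out to:

```
log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq"
```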

Descriptions of these fields can be found in the TCP log format documentation, although we’ll describe several in the upcoming section.

When HAProxy is run as a Layer 7 proxy via mode http, you should add the option httplog directive. It ensures that HTTP requests and responses are analyzed in depth and that no RFC-compliant content will go uncaptured. This is the mode that really highlights the diagnostic value of HAProxy. The HTTP log format provides the same level of information as the TCP format, but with additional data specific to the HTTP protocol. If you were to re-create this format using log-format, it would look like this:
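Again roughly:

```
log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"
```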

Detailed descriptions of the different fields can be found in the HTTP log format documentation.

You can also define a custom log format, capturing only what you need. Use the log-format (or log-format-sd for structured-data syslog) directive in your defaults or frontend. Read our blog post HAProxy Log Customization to learn more and see some examples.

In the next few sections, you’ll become familiar with the fields that are included when you use option tcplog or option httplog.

Proxies

Within the log file that’s produced, each line begins with the frontend, backend, and server to which the request was sent. For example, if you had the following HAProxy configuration, you would see lines that describe requests as being routed through the http-in frontend to the static backend and then to the srv1 server.
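A sketch using those names (addresses are placeholders):

```
frontend http-in
    bind :80
    default_backend static

backend static
    server srv1 192.168.1.10:80 check
```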

This becomes vital information when you need to know where a request was sent, such as when seeing errors that only affect some of your servers.

Timers

Timers are provided in milliseconds and cover the events happening during a session. The timers captured by the default TCP log format are Tw / Tc / Tt. Those provided by the default HTTP log format are TR / Tw / Tc / Tr / Ta. These translate as:

| Timer | Meaning |
|---|---|
| TR | The total time to get the client request (HTTP mode only). |
| Tw | The total time spent in the queues waiting for a connection slot. |
| Tc | The total time to establish the TCP connection to the server. |
| Tr | The server response time (HTTP mode only). |
| Ta | The total active time for the HTTP request (HTTP mode only). |
| Tt | The total TCP session duration time, between the moment the proxy accepted it and the moment both ends were closed. |

You’ll find a detailed description of all of the available timers in the HAProxy documentation. The following diagram also demonstrates where time is recorded during a single end-to-end transaction. Note that the purple lines on the edges denote timers.

Session State at Disconnection

Both TCP and HTTP logs include a termination state code that tells you the way in which the TCP or HTTP session ended. It’s a two-character code. The first character reports the first event that caused the session to terminate, while the second reports the TCP or HTTP session state when it was closed.

Here are some termination code examples:

| Two-character code | Meaning |
|---|---|
| -- | Normal termination on both sides. |
| cD | The client did not send nor acknowledge any data and the client timeout (timeout client) eventually expired. |
| SC | The server explicitly refused the TCP connection. |
| PC | The proxy refused to establish a connection to the server because the process' socket limit was reached while attempting to connect. |

There’s a wide variety of reasons a connection may have been closed. Detailed information about all possible termination codes can be found in the HAProxy documentation.

Counters

Counters indicate the health of the system when a request went through. HAProxy records five counters for each connection or request. They can be invaluable in determining how much load is being placed on the system, where the system is lagging, and whether limits have been hit. When looking at a line within the log, you’ll see the counters listed as five numbers separated by slashes: 0/0/0/0/0.

In either TCP or HTTP mode, these break down as:

  • The total number of concurrent connections on the HAProxy process when the session was logged.
  • The total number of concurrent connections routed through this frontend when the session was logged.
  • The total number of concurrent connections routed to this backend when the session was logged.
  • The total number of concurrent connections still active on this server when the session was logged.
  • The number of retries attempted when trying to connect to the backend server.

Other Fields

HAProxy doesn’t record everything out-of-the-box, but you can tweak it to capture what you need. An HTTP request header can be logged by adding the http-request capture directive:
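For example, to capture the Host and User-Agent headers (the lengths are illustrative):

```
frontend website
    bind :80
    option httplog
    http-request capture req.hdr(Host) len 64
    http-request capture req.hdr(User-Agent) len 128
    default_backend webservers
```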

The log will show the captured headers between curly braces, separated by pipe symbols. For the example above, that means the Host and User-Agent values for each request.

A response header can be logged by adding an http-response capture directive:
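A sketch capturing the Server response header (the header and length are illustrative):

```
frontend website
    bind :80
    option httplog
    declare capture response len 64
    http-response capture res.hdr(Server) id 0
    default_backend webservers
```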

In this case, you must also add a declare capture response directive, which allocates a capture slot where the response header, once it arrives, can be stored. Each slot that you add is automatically assigned an ID starting from zero. Reference this ID when calling http-response capture. Response headers are logged after the request headers, within a separate set of curly braces.

Cookie values can be logged in a similar way with the http-request capture directive.
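For instance, assuming a hypothetical SESSIONID cookie:

```
frontend website
    bind :80
    http-request capture req.cook(SESSIONID) len 32
    default_backend webservers
```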

Anything captured with http-request capture, including HTTP headers and cookies, will appear within the same set of curly braces. The same goes for anything captured with http-response capture.

You can also use http-request capture to log sampled data from stick tables. If you were tracking user request rates with a stick-table, you could log them like this:
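A sketch, tracking request rates per client IP over a ten-second window:

```
frontend website
    bind :80
    stick-table type ip size 1m expire 10s store http_req_rate(10s)
    http-request track-sc0 src
    http-request capture sc_http_req_rate(0) len 4
    default_backend webservers
```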

So, making a request to a webpage that contains the HTML document and two images would show the user’s request rate incrementing to three.

You can also log the values of fetch methods, such as to record the version of SSL/TLS that was used (note that there is a built-in log variable for getting this called %sslv):
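For example (the certificate path is a placeholder):

```
frontend website
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    http-request capture ssl_fc_protocol len 8
    default_backend webservers
```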

Variables set with http-request set-var can also be logged.

ACL expressions evaluate to either true or false. You can’t log them directly, but you can set a variable based on whether the expression is true. For example, if the user visits /api, you could set a variable called req.is_api to a value of Is API and then capture that in the logs.
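A sketch of that idea; the req.is_api variable and the "Is API" value follow the description above:

```
frontend website
    bind :80
    acl is_api path_beg /api
    http-request set-var(req.is_api) str("Is API") if is_api
    http-request capture var(req.is_api) len 10
    default_backend webservers
```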

Enabling HAProxy Profiling

With the release of HAProxy 1.9, you can record CPU time spent on processing a request within HAProxy. Add the profiling.tasks directive to your global section:
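For example:

```
global
    profiling.tasks on
```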

There are new fetch methods that expose the profiling metrics:

| Fetch method | Description |
|---|---|
| date_us | The microseconds part of the date. |
| cpu_calls | The number of calls to the task processing the stream or current request since it was allocated. It is reset for each new request on the same connection. |
| cpu_ns_avg | The average number of nanoseconds spent in each call to the task processing the stream or current request. |
| cpu_ns_tot | The total number of nanoseconds spent in each call to the task processing the stream or current request. |
| lat_ns_avg | The average number of nanoseconds spent between the moment the task handling the stream is woken up and the moment it is effectively called. |
| lat_ns_tot | The total number of nanoseconds between the moment the task handling the stream is woken up and the moment it is effectively called. |

Add these to your log messages like this:
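For instance, by appending sample fetches to the standard HTTP log format (a sketch):

```
log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r cpu_calls:%[cpu_calls] cpu_ns_avg:%[cpu_ns_avg] lat_ns_avg:%[lat_ns_avg]"
```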

This is a great way to gauge which requests cost the most to process.

Parsing HAProxy Logs

As you’ve learned, HAProxy has a lot of fields that provide a tremendous amount of insight about connections and requests. However, reading them directly can lead to information overload. Oftentimes, it’s easier to parse and aggregate them with external tools. In this section, you’ll see some of these tools and how they can leverage the logging information provided by HAProxy.

HALog

HALog is a small but powerful log analysis tool that’s shipped with HAProxy. It was designed to be deployed onto production servers where it can help with manual troubleshooting, such as when facing live issues. It is extremely fast and able to parse TCP and HTTP logs at 1 to 2 GB per second. By passing it a combination of flags, you can extract statistical information from the logs, including requests per URL and requests per source IP. Then, you can sort by response time, error rate, and termination code.

For example, if you wanted to extract per-server statistics from the logs, you could use the following command:
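Something like the following (flags may vary slightly between HALog versions):

```
halog -srv < /var/log/haproxy-traffic.log | column -t
```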

This is useful when you need to parse log lines per status code and quickly discover if a given server is unhealthy (e.g. returning too many 5xx responses). Or, a server may be denying too many requests (4xx responses), which is a sign of a brute-force attack. You can also get the average response time per server with the avg_rt column, which is helpful for troubleshooting.

With HALog, you can get per-URL statistics by using the following command:
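For instance, sorting URLs by total time (again, check your HALog version's flags):

```
halog -ut < /var/log/haproxy-traffic.log | column -t
```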

The output shows the number of requests, the number of errors, the total computation time, the average computation time, the total computation time of successful requests, the average computation time of successful requests, the average number of bytes sent, and the total number of bytes sent. In addition to parsing server and URL statistics, you can apply multiple filters to match logs with a given response time, HTTP status code, session termination code, etc.

HAProxy Stats Page

Parsing the logs with HALog isn’t the only way to get metrics out of HAProxy. The HAProxy Stats Page can be enabled by adding the stats enable directive to a frontend or listen section. It displays live statistics of your servers. The following listen section starts the Stats page listening on port 8404:
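A sketch of such a section:

```
listen stats
    bind :8404
    stats enable
    stats uri /
    stats refresh 10s
```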

The Stats Page is very useful for getting instant information about the traffic flowing through HAProxy. It does not store this data, though, and displays data only for a single load balancer.

HAProxy Enterprise Real-Time Dashboard

If you’re using HAProxy Enterprise, then you have access to the Real-Time Dashboard. Whereas the Stats page shows statistics for a single instance of HAProxy, the Real-Time Dashboard aggregates and displays information across a cluster of load balancers. This makes it easy to observe the health of all of your servers from a single screen. Data can be viewed for up to 30 minutes.

The dashboard stores and displays information about service health, request rates and load. It also makes it easier to perform administrative tasks, such as enabling, disabling and draining backends. At a glance, you can see which servers are up and for how long. You can also view stick table data, which, depending on what the stick table is tracking, may show you error rates, request rates, and other real-time information about your users. Stick table data can be aggregated as well.

The Real-Time Dashboard is one of many add-ons available with HAProxy Enterprise.

Conclusion

In this blog post, you learned how to configure HAProxy logging to get observability over your load balancer, which is a critical component within your infrastructure. HAProxy emits detailed Syslog messages when operating in either TCP or HTTP mode. These can be sent to a number of logging tools, such as rsyslog.

HAProxy ships with the HALog command-line utility, which simplifies parsing log data when you need information about the types of responses users are getting and the load on your servers. You can also get a visual of the health of your servers by using the HAProxy Stats Page or the HAProxy Enterprise Real-Time Dashboard.

Want to know when content like this is published? Subscribe to our blog or follow us on Twitter. You can also join the conversation on Slack! HAProxy Enterprise combines HAProxy with enterprise-class features, such as the Real-Time Dashboard, and premium support. Contact us to learn more or sign up for a free trial today!

The post Introduction to HAProxy Logging appeared first on HAProxy Technologies.

[On Demand Webinar] Deep Dive Into HAProxy Logging https://www.haproxy.com/blog/webinar-deep-dive-into-haproxy-logging/ Thu, 07 Feb 2019 21:31:41 +0000 https://www.haproxy.com/?p=220241 There’s more to logs than grep! In this deep dive, you’ll learn how to unleash the power of your HAProxy logs. See how understanding the data captured empowers you to operationalize that data, debug issues, and stay ahead of lurking problems. Join our live webinar where we will demonstrate how to: Interpret the HAProxy log […]

The post [On Demand Webinar] Deep Dive Into HAProxy Logging appeared first on HAProxy Technologies.

There’s more to logs than grep! In this deep dive, you’ll learn how to unleash the power of your HAProxy logs. See how understanding the data captured empowers you to operationalize that data, debug issues, and stay ahead of lurking problems.

Join our live webinar where we will demonstrate how to:

  • Interpret the HAProxy log fields
  • Know your options when it comes to collecting log data
  • Apply logs to debugging
  • Enrich your logs with geolocation data using HAProxy Enterprise
  • Utilize recent changes such as Profiling, introduced in version 1.9

Afterwards, participate in the Q&A session where you’ll be able to ask follow-up questions.

Register to watch now:



The post [On Demand Webinar] Deep Dive Into HAProxy Logging appeared first on HAProxy Technologies.

Using HAProxy as an API Gateway, Part 2 [Authentication] https://www.haproxy.com/blog/using-haproxy-as-an-api-gateway-part-2-authentication/ https://www.haproxy.com/blog/using-haproxy-as-an-api-gateway-part-2-authentication/#respond Tue, 22 Jan 2019 10:15:29 +0000 https://www.haproxy.com/?p=216781 HAProxy is a powerful API gateway due to its ability to provide load balancing, rate limiting, observability and other features to your service endpoints. It also integrates with OAuth 2, giving you control over who can access your APIs. In this blog post, you’ll see how.   Using HAProxy as an API Gateway, Part 1 […]

The post Using HAProxy as an API Gateway, Part 2 [Authentication] appeared first on HAProxy Technologies.


HAProxy is a powerful API gateway due to its ability to provide load balancing, rate limiting, observability and other features to your service endpoints. It also integrates with OAuth 2, giving you control over who can access your APIs. In this blog post, you’ll see how.

 

 

In the previous blog post, Using HAProxy as an API Gateway, Part 1 [Introduction], we touched upon how simple it is for you to evade that proverbial avalanche of complexity by setting up an immensely powerful point of entry to your services—an API gateway. HAProxy creates a unified front that clients can connect to, distributing requests to your backend without breaking a sweat, allowing you to operate at any scale and in any environment. HAProxy, at the same time, provides best-in-class load balancing, advanced DDoS and bot protection, rate limiting and observability.

The second part of our API gateway series focuses on how to authenticate and authorize users that want to connect. After all, APIs provide direct access to backend systems and may return sensitive information such as healthcare, financial and PII data. Recent data breaches due to API vulnerabilities have hit organizations as large as Amazon and the USPS. APIs often expose create, update and delete operations on your data too, which shouldn’t be open to just anyone.

In this post, we’ll demonstrate how HAProxy defends your APIs from unauthorized access via JWT access tokens and shrinks the attack surface that you might otherwise expose. You’ll learn how HAProxy can be extended with Lua, which provides a flexible way to integrate with other tools, protocols, and frameworks.

Authentication and Authorization

Let’s begin with a scenario where you have an API to protect. For example, let’s say that this API provides methods related to listing hamsters up for adoption. It has the following API endpoints:

API endpoint What it does
GET /api/hamsters Returns a list of hamsters ready to be adopted
POST /api/hamsters/{name} Adds a newly arrived hamster to the list
DELETE /api/hamsters/{name} Removes a hamster from the list after it’s found a home

This fictitious API lets you view available hamsters, add new hamsters to the list, and remove the furry critters after they’ve been adopted to loving homes. For example, you could call GET /api/hamsters like this:

GET https://api.mywebsite.com/api/hamsters

This would be consumed by your frontend application, perhaps through Ajax or when loading the page. For requests like this that retrieve non-sensitive information, you may not ask users to log in and there may not be any authentication necessary. For other requests, such as those that call the POST and DELETE endpoints for adding or deleting records, you may want users to log in first. If an anonymous user tries to call the POST and DELETE API methods, they should receive a 403 Forbidden response.

There are two terms that we need to explain: authentication and authorization. Authentication is the process of getting a user’s identity. Its primary question is: Who is using your API? Authorization is the process of granting access. Its primary question is: Is this client approved to call your API?

OAuth 2 is a protocol that authenticates a client and then gives back an access token that tells you whether or not that client is authorized to call your API. By and large, the concept of identity doesn’t play a big part in OAuth 2, which is mostly concerned with authorization. Think of it like going to the airport: at the first gate you are meticulously inspected against a set of criteria. Upon inspection, you are free to continue on to your terminal where you can buy overpriced coffee, duty-free souvenir keychains and maybe a breakfast bagel. Since you’ve been inspected and have raised no red flags, you are free to roam around.

In a similar way, OAuth 2 issues tokens that typically don’t tell you the identity of the person accessing the API. They simply show that the user, or the client application that the user has delegated their permissions to, should be allowed to use the API. That’s not to say that people never layer identity properties onto an OAuth token. However, OAuth 2 isn’t officially meant for that. Instead, other protocols like OpenID Connect should be used when you need identity information.

As we described in Part 1 of this series, an API gateway is a proxy between the client and your backend API services that routes requests intelligently. It also acts as a security layer. When you use HAProxy as your API gateway, you can validate OAuth 2 access tokens that are attached to requests.

To simplify your API gateway and keep the complicated authentication pieces out of it, you’ll offload the task of authenticating clients to a third-party service like Auth0 or Okta. These services handle logging users in and can distribute tokens to clients that successfully authenticate. A client application would then include the token with any requests it sends to your API.

After you’ve updated HAProxy with some custom Lua code, it will inspect each request and look at the token that the client is presenting. It will then decide whether or not to allow the request through.

OAuth2 Access Tokens

An access token uses the JSON Web Token (JWT) format and contains three base64-encoded sections:

  • A header that contains the type of token (“JWT” in this case) and the algorithm used to sign the token
  • A payload that contains:
    • the URL of the token issuer
    • the audience that the token is intended for (your API URL)
    • an expiration date
    • any scopes (e.g. read and write) that the client application should have access to
  • A signature to ensure that the token is truly from the issuer and that it has not been tampered with since being issued

In this article, we won’t focus on how a client application gets a token. In short, you’d redirect users to a login page hosted by a third-party service like Auth0 or Okta. Instead, we’ll highlight how to validate a token. You will see how HAProxy can inspect a token that’s presented to it and then decide whether to let the request proceed.

If you’re curious about what the JWT data looks like, you can use the debugger at https://jwt.io to decode it.

Some interesting fields to note are:

  • alg, the algorithm, which is RS256 in this example, that was used to sign the token
  • iss, the issuer, or the service that authenticated the client and created the token
  • aud, the audience, which is the URL of your API gateway
  • exp, the expiration date, which is a UNIX timestamp
  • scope, which lists the granular permissions that the client has been granted (Note that Okta calls this field “scp”, so the Lua code would have to be modified to suit.)

API Gateway Sample Application

To follow this tutorial, you have two options:

  1. You can clone the sample application from Github and use Vagrant to set it up.
  2. You can clone the JWT Lua code repository by itself. It provides an install script to assist with installing the Lua library and its dependencies into your own environment.

The workflow for authorizing users looks like this:

  1. A client application uses one of the grant workflows to request a token from the authentication service. For example, a frontend JavaScript application may use the implicit grant flow to get a token.
  2. Once the client has received a token, it stores it so that it can continue to use it until it expires.
  3. When calling an API method, the application attaches the token to the request in an HTTP header called Authorization. The header’s value is prefixed with Bearer; an example header is shown just after this list.
  4. HAProxy receives the request and performs the following checks:
    • Was the token signed using an algorithm that the Lua code understands?
    • Is the signature valid?
    • Is the token expired?
    • Is the issuer of the token (the authenticating service) who you expect it to be?
    • Is the audience (the URL of your API gateway) what you expect?
    • Are there any scopes that would limit which resources the client can access?
  5. The application continues to send the token with its requests until the token expires, at which time it repeats Step 1 to get a new one.
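
For reference, the Authorization header from step 3 would look something like the following, with the token value truncated here:

    Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOi...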

To test it out, sign up for an account with Auth0. Then, you can use curl to craft an HTTP request to get a new token using the client credential grant flow. POST a request to https://{your_account}.auth0.com/oauth/token and get an access token back. The Auth0 website gives you some helpful guidance on how to do this.

Here’s an example that asks for a new token via the /oauth/token endpoint. It sends a JSON object containing the client’s credentials, client_id and client_secret:
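
A sketch of that request follows; the account name, client ID, client secret and audience are placeholders you would replace with your own values:

    curl --request POST \
      --url 'https://{your_account}.auth0.com/oauth/token' \
      --header 'content-type: application/json' \
      --data '{"client_id":"YOUR_CLIENT_ID","client_secret":"YOUR_CLIENT_SECRET","audience":"https://api.mywebsite.com/api","grant_type":"client_credentials"}'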

You’ll get back a response that contains the JWT access token:

In a production environment, you’d use the client credentials grant workflow only with trusted client applications where you can protect the client ID and secret. It works really well for testing though.

Now that you have a token, you can call methods on your API. One of the benefits of OAuth 2 over other authorization schemes like session cookies is that you control the process of attaching the token to the request. Whereas cookies are always passed to the server with every request, even those submitted from an attacker’s website as in CSRF attacks, your client-side code controls sending the access token. An attacker will not be able to send a request to your API URL with the token attached.

The request will look like this:
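
For example, a request that deletes an adopted hamster might be sent like this; the hamster name is made up and the token is truncated:

    curl --request DELETE \
      --url 'https://api.mywebsite.com/api/hamsters/Fluffy' \
      --header 'authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...'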

In the next section, you’ll see how HAProxy can, with the addition of some Lua code, decode and validate access tokens.

Configuring HAProxy for OAuth 2

Before an issuer like Auth0 gives a client an access token, it signs it. Since you’ll want to verify that signature, you’ll need to download the public key certificate from the token issuer’s website. On the Auth0 site, you’ll find the download link under Applications > [Your application] > Settings > Show Advanced Settings > Certificates. Note, however, that it will give you a certificate in the following format:

This contains the public key that you can use to validate the signature, but also extra metadata that can’t be used. Invoke the following OpenSSL command to convert it to a file containing just the public key:
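
Assuming the downloaded certificate was saved as mycert.pem, the conversion looks like this:

    openssl x509 -pubkey -noout -in mycert.pem > pubkey.pem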

This will give you a new file called pubkey.pem that is much shorter:

In the sample project, I store this file in the pem folder and then Vagrant syncs that folder to the VM. I then use an environment variable to tell the Lua code where to find it. In fact, I use environment variables for passing in several other parameters as well. Use setenv in your HAProxy configuration file to set an environment variable.

A lua-load directive loads a Lua file called jwtverify.lua that contains code for validating access tokens. It gets this from the JWT Lua code repository.
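
A global section that wires these pieces together might look like the following sketch; the issuer and audience variable names match what the Lua script checks below, while the public key variable name and the file paths are illustrative:

    global
        lua-load /usr/local/share/lua/5.3/jwtverify.lua
        setenv OAUTH_PUBKEY_PATH /etc/haproxy/pem/pubkey.pem
        setenv OAUTH_ISSUER https://myaccount.auth0.com/
        setenv OAUTH_AUDIENCE https://api.mywebsite.com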

Next, the frontend receives requests on port 443 and performs various checks by invoking the jwtverify.lua file. Here we’re using ACL statements to define conditional logic that allows or denies a request. ACLs are a powerful and flexible system within HAProxy and one of the building blocks that make it so versatile.

The first http-request deny line rejects the request if the client did not send an Authorization header at all. The next line, http-request lua.jwtverify, invokes our Lua script, which will perform the following actions:

  • Decodes the JWT
  • Checks that the algorithm used to sign the token is supported (RS256)
  • Verifies the signature
  • Ensures that the token is not expired
  • Compares the issuer in the token to the OAUTH_ISSUER environment variable
  • Compares the audience in the token to the OAUTH_AUDIENCE environment variable
  • If any scopes are defined in the token, adds them to an HAProxy variable called req.oauth_scopes so that subsequent ACLs can check them
  • If everything passes, sets a variable called txn.authorized to true

The next http-request deny line rejects the request if the Lua script did not set a variable called txn.authorized to a value of true. Notice how booleans are evaluated by adding the -m bool flag.

The next two lines reject the request if the token does not contain a scope that matches what we expect for the HTTP path and method. Scopes in OAuth 2 allow you to define specific access restrictions. In this case, POST and DELETE requests require the write:hamsters permission. Scopes are optional and some APIs don’t use them. You can set them up on the Auth0 website and associate them with your API. If the client should have these scopes, they’ll be included in the token.
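
Putting those checks together, the frontend might look something like this sketch; the certificate path, backend name and scope are illustrative, and the variable names match what the Lua script sets:

    frontend api_gateway
        bind :443 ssl crt /etc/haproxy/pem/test.com.pem
        # reject requests that carry no token at all
        http-request deny unless { req.hdr(authorization) -m found }
        # validate the token
        http-request lua.jwtverify
        http-request deny unless { var(txn.authorized) -m bool }
        # require the write scope for POST and DELETE
        http-request deny if { path_beg /api/hamsters } { method POST } !{ var(req.oauth_scopes) -m sub write:hamsters }
        http-request deny if { path_beg /api/hamsters } { method DELETE } !{ var(req.oauth_scopes) -m sub write:hamsters }
        default_backend apiservers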

To summarize, any request for /api/hamsters must meet the following rules:

  • It must send an Authorization header containing a JWT
  • The token must be valid, per the jwtverify.lua script
  • The token must contain a scope that matches what you expect

With this configuration in place, you can use curl to send requests to your API, attaching a valid token, and expect to get a successful response. Using this same setup, you’d lock down your APIs so that only authenticated and approved clients can use them.

Conclusion

In this blog post, you learned more about using HAProxy as an API gateway, leveraging it to secure your API endpoints using OAuth 2. Clients request tokens from an authentication server, which sends back a JWT. That token is then used to gain access to your APIs. With the help of some Lua code, HAProxy can validate the token and protect your APIs from unauthorized use.

Did you find this article helpful? Want to stay up to date on similar topics? Subscribe to our blog! Also, follow us on Twitter for other HAProxy news. You can also join the conversation on Slack.

HAProxy Enterprise comes with a number of preinstalled Lua modules and makes it easy to add your own, as it comes bundled with the Lua runtime. Request a free trial or contact us to learn more! Our expert support team has experience setting up Lua modules and can help provide a tailored approach to your needs.

The post Using HAProxy as an API Gateway, Part 2 [Authentication] appeared first on HAProxy Technologies.

HAProxy 1.9.2 Adds gRPC Support https://www.haproxy.com/blog/haproxy-1-9-2-adds-grpc-support/ https://www.haproxy.com/blog/haproxy-1-9-2-adds-grpc-support/#respond Wed, 16 Jan 2019 18:12:29 +0000 https://www.haproxy.com/?p=217211 HAProxy provides end-to-end proxying of HTTP/2 traffic. Use HAProxy to route, secure, and observe gRPC traffic over HTTP/2. Read on to learn more. HAProxy 1.9 introduced the Native HTTP Representation (HTX). Not only does this allow you to use HTTP/2 end-to-end, it also paves the way for HAProxy to support newer versions of HTTP-based technologies […]

The post HAProxy 1.9.2 Adds gRPC Support appeared first on HAProxy Technologies.


HAProxy provides end-to-end proxying of HTTP/2 traffic. Use HAProxy to route, secure, and observe gRPC traffic over HTTP/2. Read on to learn more.

HAProxy 1.9 introduced the Native HTTP Representation (HTX). Not only does this allow you to use HTTP/2 end-to-end, it also paves the way for HAProxy to support newer versions of HTTP-based technologies and protocols at a faster pace.

Today, with the release of version 1.9.2, we’re excited to announce that HAProxy fully supports gRPC. This moment solidifies the vision we had when creating HTX. The gRPC protocol allows your services to communicate with low latency. HAProxy supports it in ways such as enabling bidirectional streaming of data, parsing and inspecting HTTP headers, and logging gRPC traffic.

HAProxy is known for its high performance, low latency, and flexibility. It provides the building blocks needed to quickly and easily solve a vast array of problems you may encounter. It brings increased observability that can help with troubleshooting, as well as built-in support for ACLs, which can be combined with stick tables to define rate-limiting rules that protect against bot threats and application-layer DDoS attacks.

In this blog post, you’ll learn how to set up an example project that uses gRPC and Protocol Buffers to stream messages between a client and a server with HAProxy in between. You’ll learn a bit of the history of how HAProxy came to support HTTP/2 and why it’s such a great choice as a load balancer for gRPC traffic.

The Return of RPC

If you’ve been writing services over the past ten years, you’ve seen the movement away from heavy, remote-procedure-call protocols like SOAP that passed around XML towards lighter, HTTP-friendly paradigms like REST. So complete was the industry’s move away from RPC that entire maturity models (see Richardson Maturity model) were developed that took us further into the land of using HTTP than anyone, I suspect, ever thought possible.

However, somewhere from here to there, we all settled on the notion that JSON was the best (only?) way to transfer data between our services. It made sense. JSON is flexible, easily parsed, and readily deserializes into objects in any given language.

This one-size-fits-all approach led many to implement backend services that communicate by passing JSON messages, even services that only speak among themselves within your own network. Even services that must send and receive a lot of data, or that communicate with half a dozen other services—they all relied on JSON.

In order to support services defined only by a collection of HTTP paths and methods, each with the potential to define how arguments should be sent differently (part of the URL? included in the JSON request?), implementers had to roll their own client libraries—a process that had to be repeated for every programming language used within the organization.

Then, gRPC, an RPC-style framework that uses a unique, binary serialization called Protocol Buffers appeared on the scene. It allowed messages to be passed faster and more efficiently. Data between a client and server can even be streamed continuously. Using Protocol Buffers, gRPC allows client SDKs and service interfaces to be auto-generated. Clearly, the RPC paradigm is back in a big way.

The Case for gRPC

What is gRPC and what problems does it try to solve? Back in 2015, Google open-sourced gRPC, a new framework for connecting distributed programs via remote procedure calls that they’d developed in collaboration with Square and other organizations. Internally, Google had been migrating most of its public-facing services to gRPC already. The framework offered features that were necessary for the scale Google’s services had achieved.

However, gRPC solves problems that the rest of the industry is seeing too. Think about how service-oriented architectures have changed. Initially, a common pattern was for a client to make a request to a single backend service, get a JSON response, then disconnect. Today, applications often decompose business transactions into many more steps. A single transaction may involve communicating with half a dozen services.

The gRPC protocol is an alternative to sending text-based JSON messages over the wire. Instead, it serializes messages using Protocol Buffers and transmits them as binary data, making the messages smaller and faster to send. As you increase the number of your services, reducing the latency between them becomes more noticeable and important.

Another change in the industry is the rapid growth of data that services must send and receive. This data might come from always-on IoT devices, rich mobile applications, or even your own logging and metrics collection. The gRPC protocol handles this by using HTTP/2 under the hood in order to enable bidirectional streaming between a client and a service. This allows data to be piped back and forth over a long-lived connection, breaking free of the limitations of the request/response-per-message paradigm.

Protocol Buffers also provides code generation. Using protoc, the Protocol Buffers compiler, you can generate client SDKs and interfaces for your services in a number of programming languages. This makes it easier to keep clients and services in sync and reduces the time spent writing this boilerplate code yourself.

Similar to how earlier frameworks like SOAP used XML to connect heterogeneous programming languages, gRPC uses Protocol Buffers as a shared, but independent, service description language. With gRPC, interfaces and method stubs are generated from a shared .proto file that contains language-agnostic function signatures. However, the implementation of those functions isn’t directly attached. Clients can, in fact, swap mock services in place of the real implementations to do unit testing or point to a completely different implementation if the need arises.

HAProxy HTTP/2 Support

In order to support gRPC, support for HTTP/2 is required. With the release of HAProxy 1.9, you can load balance HTTP/2 traffic between both the client and HAProxy and also between HAProxy and your backend service. This opens the door to utilizing gRPC as a message passing protocol. At this time, most browsers do not support gRPC. However, tools like the gRPC Gateway can be placed behind HAProxy to translate JSON to gRPC and you can, of course, load balance service-to-service, gRPC communication within your own network.

For the rest of this section, you’ll get to know the history of how HAProxy came to offer these features. Then, we’ll demonstrate an application that uses bidirectional streaming over gRPC.

HTTP/2 Between Client and Proxy

HAProxy added support for HTTP/2 between itself and the client (such as a browser) with the 1.8 release back at the end of 2017. This was a huge win for those using HAProxy because the latency you see is typically happening on the network segments that traverse the Internet between the server and browser. HTTP/2 allows for more efficient transfer of data due to its binary format (as opposed to the human-readable, text-based format of HTTP/1.1), header compression, and multiplexing of message frames over a single TCP connection.

Enabling this in HAProxy is incredibly simple. You simply ensure that you are binding over TLS and add an alpn parameter to the bind directive in a frontend.
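
For example, the following frontend offers HTTP/2 to clients while still allowing HTTP/1.1; the certificate path is illustrative:

    frontend fe_main
        mode http
        bind :443 ssl crt /etc/haproxy/pem/test.com.pem alpn h2,http/1.1
        default_backend be_main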

If you aren’t familiar with ALPN, here’s a short recap: When using TLS with HTTP/1.1, the convention is to listen on port 443. When HTTP/2 came along, the question became, why reinvent the wheel by listening on a different port than the one with which people are already familiar? However, there had to be a way to tell which version of HTTP the server and client would use. Of course, there could have been an entirely separate handshake that negotiated the protocol, but in the end it was decided to go ahead and encode this information into the TLS handshake, saving a round-trip.

The Application-Layer Protocol Negotiation (ALPN) extension, as described in RFC 7301, updated TLS to support a client and server agreeing on an application protocol. It was created to support HTTP/2 specifically, but will be handy for any other protocols that might need to be negotiated in the future.

ALPN allows a client to send a list of protocols, in preferred order, that it supports as a part of its TLS ClientHello message. The server can then return the protocol that it chooses as a part of its TLS ServerHello message. So, as you can see, being able to communicate which version of HTTP each side supports really does rely on an underlying TLS connection. In a way, it nudges us all towards a more secure web—at least if we want to support both HTTP/1.1 and HTTP/2 on the same port.

Adding HTTP/2 to the Backend

After the release of version 1.8, users of HAProxy could already see performance gains simply by switching on HTTP/2 in a frontend. However, protocols like gRPC require that HTTP/2 be used for the backend services as well. The open-source community and engineers at HAProxy Technologies got to work on the problem.

During the process, it became apparent that the time was right to refactor core parts of how HAProxy parses and modifies HTTP messages. An entirely new engine for handling HTTP messages was developed, which was named the Native HTTP Representation, or HTX mode, and released with version 1.9. In HTX mode, HAProxy is able to more easily manipulate any representation of the HTTP protocol. Before you can use HTTP/2 to a backend, you must add option http-use-htx.

Then, in your backend section, adding the alpn parameter to a server directive enables HAProxy to connect to the origin server using HTTP/2.

In the case of gRPC, which requires HTTP/2 and can’t fall back to HTTP/1.1, you can omit http/1.1 altogether. You can also use the proto parameter instead of alpn when specifying a single protocol. Here’s an example that uses proto on the bind and server lines:
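
One way that could look is sketched below; the addresses, ports and names are illustrative:

    frontend grpc_fe
        mode http
        option http-use-htx
        bind :3001 proto h2
        default_backend grpc_be

    backend grpc_be
        mode http
        option http-use-htx
        server grpc1 172.16.0.11:3000 proto h2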

When using proto, enabling TLS via the ssl parameter becomes optional. When not used, HTTP traffic is transferred in the clear. Note that you can use alpn in the frontend and proto in the backend, and vice versa.

You Could Always Do Layer 4 Proxying

It should be noted that you could always proxy HTTP/2 traffic using transport layer (Layer 4) proxying (e.g. setting mode tcp). That’s because, in this mode, the data that’s sent over the connection is opaque to HAProxy. The exciting news is the ability, via HTX, to proxy traffic end-to-end at the application layer (Layer 7) when using mode http.

This means that you can inspect the contents of HTTP/2 messages including headers, the URL, and the request method. You can also set ACL rules to filter traffic or to route it to a specific backend. For example, you might inspect the content-type header to detect gRPC messages and route them specifically.

In the next section, you’ll see an example of proxying gRPC traffic with HAProxy.

HAProxy gRPC Support

Follow along by downloading the sample HAProxy gRPC project from Github. It spins up an environment using Docker Compose. It demonstrates getting a new, random codename from the server (e.g. Bold Badger or Cheerful Coyote). It includes a simple gRPC request/response example and a more complex, bidirectional streaming example, with HAProxy in the middle.

The Proto File

First, take a look at the sample/codenamecreator/codenamecreator.proto file. This is a Protocol Buffers file and lists the methods that our gRPC service will support.

At the top, we’ve defined a NameRequest message type and a NameResult message type. The former takes a string called category as a parameter and the latter takes a string called name. A service called CodenameCreator is defined that has a function called GetCodename and another called KeepGettingCodenames. In this example project, GetCodename requests a single codename from the server and then exits. KeepGettingCodenames continuously receives codenames from the server in an endless stream.

When defining functions in a .proto file, adding stream before a parameter or return type makes it streamable, in which case gRPC leaves the connection open and allows requests and/or responses to continue to be sent on the same channel. It’s possible to define gRPC services with no streaming, streaming only from the client, streaming only from the server, and bidirectional streaming.

In order to generate client and server code from this .proto file, you’d use the protoc compiler. Code for different languages, including Golang, Java, C++, and C#, can be generated by downloading the appropriate plugin and passing it to protoc via an argument. In our example, we generate Golang .go files by installing the protoc-gen-go plugin and specifying it using the –go_out parameter. You’ll also need to install Protocol Buffers and the gRPC library for your language. Using the golang:alpine Docker container, the beginning of our client Dockerfile configures the environment like this:

A separate Dockerfile for our gRPC server is the same up to this point, since it also needs to generate code based off of the same .proto file. A file called codenamecreator.pb.go will be created for you. The rest of each Dockerfile (client and server) build and run the respective Go code that implements and calls the gRPC service.

In the next section, you’ll see how the server and client code is structured.

Server Code

Our gRPC service’s server.go file implements the GetCodename function that was defined in the .proto file like this:

Here, some custom code is used to generate a new, random codename (not shown, but available in the Github repository) and this is returned as a NameResult. There’s a lot more going on in the streaming example, KeepGettingCodenames, so suffice it to say that it implements the interface that was generated in codenamecreator.pb.go:

To give you an idea, the server calls stream.Send to send data down the channel. In a separate goroutine, it calls stream.Recv() to receive messages from the client using the same stream object. The server begins listening for connections on port 3000. You’re able to use transport-layer security by providing a TLS public certificate and private key when creating the gRPC server, as shown:

HAProxy is able to verify the server’s certificate by adding ca-file /path/to/server.crt to the backend server line. You can also disable TLS by calling grpc.NewServer without any arguments.

Client Code

The protoc compiler generates a Golang interface that your service implements, as well as a client SDK that you’d use to invoke the service functions from the client. In the case of Golang, all of this is included within the single, generated .go file. You then write code that consumes this SDK.

The client configures a secure connection to the server by passing its address into the grpc.Dial function. In order for it to use TLS to the server, it must be able to verify the server’s public key certificate using the grpc.WithTransportCredentials function:

Since HAProxy sits between the client and server, the address should be the load balancer’s and the public key should be the certificate portion of the .pem file specified on the bind line in the HAProxy frontend. You can also choose to not use TLS at all and pass grpc.WithInsecure() as the second argument to grpc.Dial. In that case, you would change your HAProxy configuration to listen without TLS and use the proto argument to specify HTTP/2:

The client.go file is able to call GetCodename and KeepGettingCodenames as though they were implemented in the same code. That’s the power of RPC services.

When calling a gRPC function that isn’t using streams, as with GetCodename, the function simply returns the result from the server and exits. This is probably how most of your services will operate.

For the streaming example, the client calls KeepGettingCodenames to get a stream object. From there, stream.Recv() is called in an endless loop to receive data from the server. At the same time, it calls stream.Send to send data back—in this case, a new category such as Science—to the server every ten seconds. In this way, both the client and server are sending and receiving data in parallel over the same connection.

On the client-side, you’ll see new, random codenames displayed:

Every ten seconds, the server will show that the client has requested a different category:

In the next section, you’ll see how to configure HAProxy to proxy gRPC traffic at Layer 7.

HAProxy Configuration

The HAProxy configuration for gRPC is really just an HTTP/2-compatible configuration.

Within the frontend, the bind line uses the alpn parameter (or proto) to specify that HTTP/2 (h2) is supported. Likewise, an alpn parameter is added to the server line in the backend, giving you end-to-end HTTP/2. Note that option http-use-htx is necessary to make this work.
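
As a concrete sketch of that configuration; the certificate paths and the server address are illustrative:

    frontend grpc_fe
        mode http
        option http-use-htx
        # log as soon as the request starts (useful for long-lived streams, explained below)
        option logasap
        bind :443 ssl crt /etc/haproxy/pem/test.com.pem alpn h2
        default_backend grpc_be

    backend grpc_be
        mode http
        option http-use-htx
        # ca-file lets HAProxy verify the gRPC server's certificate
        server grpc1 172.16.0.11:3000 ssl ca-file /etc/haproxy/pem/server.crt alpn h2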

There are a few other caveats to note. The first is that when streaming data bidirectionally between the client and server, because HAProxy defaults to only logging the traffic when the full request/response transaction has completed, you should use option logasap to tell HAProxy to log the connection right away. It will log a message at the start of the request:

You can also add debug to the global section to enable debug logging. Then you’ll see all of the HTTP/2 headers from the request and response.

When streaming data from the client to the server, be sure not to set option http-buffer-request. This would pause HAProxy until it receives the full request body, which, when streaming, will be a long time in coming.

Inspecting Headers and URL Paths

To demonstrate some of the Layer 7 features of proxying gRPC traffic, consider the need to route traffic based on the application protocol. You might, for example, want to use the same frontend to serve both gRPC and non-gRPC traffic, sending each to the appropriate backend. You’d use an acl statement to determine the type of traffic and then choose the backend with use_backend, like so:
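
A sketch of that routing; the backend names and certificate path are illustrative:

    frontend fe_proxy
        mode http
        option http-use-htx
        bind :443 ssl crt /etc/haproxy/pem/test.com.pem alpn h2,http/1.1
        # gRPC requests carry a content-type that begins with application/grpc
        acl is_grpc req.hdr(content-type) -m beg application/grpc
        use_backend grpc_be if is_grpc
        default_backend web_be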

Another use for inspecting headers is the ability to operate on metadata. Metadata is extra information that you can include with a request. You might utilize it to send a JWT access token or a secret passphrase, denying all requests that don’t contain it or performing more complex checks. When sending metadata from your client, your gRPC code will look like this (where the metadata package is google.golang.org/grpc/metadata):

Here’s an example that uses http-request deny to refuse any requests that don’t send the secret passphrase:
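
A sketch of that check; the header name matches the capture example below and the expected value is a placeholder:

    http-request deny unless { req.hdr(mysecretpassphrase) -m str mysupersecretvalue }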

You can also record metadata in the HAProxy logs by adding a capture request header line to the frontend, like so:
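
Assuming a maximum length of 64 characters for the captured value:

    capture request header mysecretpassphrase len 64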

The mysecretpassphrase header will be added to the log, surrounded by curly braces:

HAProxy can also route to a different backend based upon the URL path. In gRPC, the path is a combination of the service name and function. Knowing that, you can declare an ACL rule that matches the expected path, /CodenameCreator/KeepGettingCodenames, and route traffic accordingly, as in this example:
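
A sketch of that routing rule; the backend names are illustrative (gRPC paths take the form /<service>/<method>):

    acl is_stream path -m end /CodenameCreator/KeepGettingCodenames
    use_backend grpc_streaming_be if is_stream
    default_backend grpc_be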

Conclusion

In this blog post, you learned how HAProxy provides full support for HTTP/2, which enables you to use gRPC for communicating between services. You can use HAProxy to route gRPC requests to the appropriate backend, load balance equally among servers, enforce security checks based on HTTP headers and gRPC metadata, and get observability into the traffic.

Want to stay up to date on the latest HAProxy news? Subscribe to our blog. You can also follow us on Twitter and join the conversation on Slack.

HAProxy Enterprise offers a suite of extra security-related modules and expert support. Contact us to learn more and request a free trial.

The post HAProxy 1.9.2 Adds gRPC Support appeared first on HAProxy Technologies.

[On Demand Webinar] Introduction to HAProxy Maps https://www.haproxy.com/blog/webinar-introduction-to-haproxy-maps/ https://www.haproxy.com/blog/webinar-introduction-to-haproxy-maps/#respond Tue, 15 Jan 2019 12:22:14 +0000 https://www.haproxy.com/?p=217661 Your routing logic in HAProxy is simple in the beginning. Then it grows in complexity, perhaps requiring you to choose a different backend based upon the requested URL path, hostname or region. Not long after, you have dozens of similar rules—maybe even hundreds! When that day comes, it’s time to regain control by leveraging HAProxy […]

The post [On Demand Webinar] Introduction to HAProxy Maps appeared first on HAProxy Technologies.

Your routing logic in HAProxy is simple in the beginning. Then it grows in complexity, perhaps requiring you to choose a different backend based upon the requested URL path, hostname or region. Not long after, you have dozens of similar rules—maybe even hundreds! When that day comes, it’s time to regain control by leveraging HAProxy maps.

Maps are a key-value storage built into the load balancer. They simplify HAProxy configurations by giving you the ability to do fast lookups based on an input. Although maps are a core building block of HAProxy, many users have yet to unlock their full potential.

Join our live webinar where we will demonstrate how to:

  • Define maps and load them into HAProxy
  • Use converters to transform inputs
  • Edit existing map files by hand or dynamically with tools like the HAProxy Enterprise lb-update module
  • Create pseudo APIs for updating maps by using the http-request set-map directive
  • Learn real-world examples of when to use a map

Afterwards, participate in the Q&A session where we’ll cover more specific use cases.

Register to watch now:



The post [On Demand Webinar] Introduction to HAProxy Maps appeared first on HAProxy Technologies.

[Conference] RSA Conference 2019 https://www.haproxy.com/blog/conference-rsa-conference-2019/ https://www.haproxy.com/blog/conference-rsa-conference-2019/#respond Tue, 08 Jan 2019 05:27:13 +0000 https://www.haproxy.com/?p=213441 March 4 – 8, 2019 San Francisco, SF

The post [Conference] RSA Conference 2019 appeared first on HAProxy Technologies.

March 4 – 8, 2019
San Francisco, CA

The post [Conference] RSA Conference 2019 appeared first on HAProxy Technologies.

HAProxy 1.9 Has Arrived https://www.haproxy.com/blog/haproxy-1-9-has-arrived/ https://www.haproxy.com/blog/haproxy-1-9-has-arrived/#comments Wed, 19 Dec 2018 12:35:16 +0000 https://www.haproxy.com/?p=212181 HAProxy Technologies is proud to announce the release of HAProxy 1.9. This release brings a native HTTP representation (HTX) powering end-to-end HTTP/2 support and paving the way for future innovations such as HTTP/3 (QUIC). It also contains improvements to buffers and connection management including connection pooling to backends, threading optimizations, updates to the Runtime API, […]

The post HAProxy 1.9 Has Arrived appeared first on HAProxy Technologies.


HAProxy Technologies is proud to announce the release of HAProxy 1.9. This release brings a native HTTP representation (HTX) powering end-to-end HTTP/2 support and paving the way for future innovations such as HTTP/3 (QUIC). It also contains improvements to buffers and connection management including connection pooling to backends, threading optimizations, updates to the Runtime API, and much more.

 

UPDATE: HAProxy 1.9.2 Adds gRPC Support

 

 

HAProxy, the world’s fastest and most widely used software load balancer, was first released in December 2001. The load balancer landscape has changed significantly since then. Yet HAProxy, with 17 years of active development under its belt, has continued to evolve and innovate. Today, we’re announcing the release of HAProxy 1.9.

This release focuses on laying the foundation that will allow us to continue to provide best-in-class performance while accelerating cutting edge feature delivery for modern environments. Exciting near-term features on the roadmap, thanks to the core improvements in 1.9, include Layer 7 retries, circuit breaking, gRPC, the new Data Plane API, and much more.

The advancements in this version are thanks to a strong community of contributors. They provide code submissions covering new functionality and bug fixes, quality assurance test each feature, and correct documentation typos. Everyone has done their part to make this release possible!

We also saw a need to release updates more often. Going forward, HAProxy will be moving from an annual release cycle to a biannual release cycle. While previous major releases would happen each year around November/December, starting today we will begin releasing twice per year. Note that this version is backwards compatible with older configurations.

HAProxy 1.9 can be broken down into the following categories:

  • Buffer Improvements
  • Connection Management
  • Native HTTP Representation (HTX)
  • Improved Threading
  • Cache Improvements
  • Early Hints (HTTP 103)
  • Runtime API Improvements
  • Server Queue Priority Control
  • Random Load Balancing Algorithm
  • Cloud-Native Logging
  • New Fetches
  • New Converters
  • Miscellaneous Improvements
  • Regression Test Suite

In the following sections, we’ll dig into these categories and share the improvements you can expect to see.

Buffer Improvements

HAProxy already supports HTTP/2 to the client. A major goal of this release was to support end-to-end HTTP/2, including to the backend server. We also wanted to support any future version of HTTP, such as HTTP/3 (QUIC).

Our R&D team put a tremendous amount of effort into establishing what changes would be required to make this happen. A very important one that was discovered involves the way in which HAProxy handles buffers.

The buffer is an area of storage cut into two parts: input data that has not been analyzed and output data. The buffer can start and end anywhere. In previous versions of HAProxy there were 22 possible buffer cases (times two versions of the code: input and output), as the graphic below illustrates:

This shows a breakdown of the various types of buffers and how they are allocated, pre-version 1.9. A decision was made to rewrite the buffer handling and simplify buffer allocation. Below is a diagram showing the latest buffer changes:

The latest changes reduce the amount of buffer cases to seven with only one version of the code to maintain.

In addition to this refactoring, the buffer’s header, which describes the buffer state, has been split from the storage area. This means that it is no longer mandatory to have a single representation for the same data and that multiple actors may use the same storage in different states. This is typically used in the lower-layer muxes during data transfers to avoid memory copies (“zero-copy”) by letting the reader and the writer alias the same data block. This has resulted in a performance increase for HTTP/2.

This was not an easy task but will bring a lot of benefits. Namely, as mentioned, it paves the way for easier implementation of end-to-end HTTP/2. It also simplifies several other things including the internal API, handling of error messages, and rendering the Stats page.

Connection Management

The connection management in HAProxy 1.9 received some big improvements. The new implementation has moved from a callback-oriented model to an async events model with completion callbacks. This new design will be extremely beneficial and should reduce the number of bugs that can appear within the connection layer.

Some of the benefits of the new design include lower send() latency (it almost never polls), fewer round-trips between layers (better I-cache efficiency), straightforward usage within the upper layers, elimination of code duplication, and granular error reporting within the lower layers. It also provides the ability to retry failed connections using a different protocol (e.g. switching between HTTP/2 and HTTP/1 if ALPN indicates support for both and a failure happens on one of them).

The http-reuse directive now defaults to safe if not set. This means that the first request of a session to a backend server is always sent over its own connection. Subsequent requests may reuse other existing, idle connections. This has been the recommended setting for several years and it was decided that it was time to make it the default.

In addition, HAProxy now provides connection pooling. Idle connections between HAProxy and the server are no longer closed immediately if the frontend connection vanishes. They remain open on the server to be used by other requests.

Native HTTP Representation (HTX)

While researching the route needed to support future versions of HTTP, it was decided that the internal handling of HTTP messages required a redesign. Previously, HTTP messages were kept as a byte stream as it appears on the wire, and the disparate tasks of analyzing that data and processing it were mixed into a single phase.

The HTTP information was collected as a stream of bytes and manipulated using offsets. The main structure had two channels: the request/response and the HTTP transaction. The first channel buffered the request and response messages as strings. The HTTP transaction channel had two states: response/request and another with all of the offsets for the headers.

With everything stored as offsets, when it came to adding, removing and rewriting HTTP data, things became quite painful, constantly requiring the movement of the end of the headers and even, possibly, the HTTP body in the case of responses. Over time, the need for header manipulation has increased with cookies, keep-alive, compression and cache, making this task expensive.

The new design, which we call HTX, creates an internal, native representation of the HTTP protocol(s). It creates a list of strongly typed, well-delineated header fields that support gaps and out-of-order fields. Modifying headers now simply consists of marking the old one as deleted and appending the new one at the end.

This provides easy manipulation of any representation of the HTTP protocol, allows us to maintain HTTP transport and semantics from end-to-end, and provides higher performance when translating HTTP/2 to HTTP/1.1 or HTTP/1.1 to HTTP/2. It splits analyzing from processing so that, now, the analysis and formatting happen in the connection layer and the processing happens in the application layer.

Since we’re performing additional testing, HTX is not yet enabled by default. Enable it by using the following option in a defaults, frontend, backend or listen section:
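
The directive is simply:

    option http-use-htx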

Once turned on, you can use HTTP/2 to your backend servers. Add alpn h2 to a server line (or alpn h2,http/1.1 if you prefer to let HAProxy negotiate the protocol with the server).

Here is a full frontend + backend displaying end-to-end HTTP/2:
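
The following sketch shows one way to do that; the certificate paths and server address are illustrative:

    frontend fe_main
        mode http
        option http-use-htx
        bind :443 ssl crt /etc/haproxy/pem/test.com.pem alpn h2,http/1.1
        default_backend be_main

    backend be_main
        mode http
        option http-use-htx
        # alpn requires TLS; ca-file lets HAProxy verify the server certificate
        server server1 192.168.1.13:443 ssl ca-file /etc/haproxy/pem/server.crt alpn h2,http/1.1 check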

HAProxy 1.9 also supports the proto h2 directive which allows HAProxy to communicate using HTTP/2 without TLS, such as to HTTP/2-enabled backends like Varnish and H2O. You can enable this with the following server configuration:
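
For example, assuming an HTTP/2-capable backend listening in cleartext on port 8080:

    backend be_h2_clear
        mode http
        option http-use-htx
        server h2o1 192.168.1.20:8080 proto h2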

Improved Threading

Significant improvements were made to the threading in 1.9. These changes allow HAProxy to offer its superior performance. To achieve this there was a rework of the task scheduler. It now divides its work into three levels:

  • a priority-aware level, shared between all threads
  • a lockless, priority-aware level; one per thread
  • a per-thread list of already started tasks that can be used for I/O

This results in most of the scheduling work being performed without any locks, which scales much better. Also, an optimization was made in the scheduler regarding its wait queues. They are now mostly lock free. The memory allocator became lockless and uses a per-thread cache of recently used objects that are still hot in the CPU cache, resulting in much faster structure initialization. The file descriptor event cache became mostly lockless as well, allowing much faster concurrent I/O operations. Last, the file descriptor (FD) lock has been updated so that it’s used less frequently. Overall, you should expect to see about a 60% performance gain when using HAProxy 1.9 with threading enabled.

Cache Improvements

We introduced the Small Object Cache in HAProxy 1.8. At the time, we knew it was only the beginning of a feature many have asked for: caching within the proxy layer. Internally, we referred to it as the favicon cache because it was limited to caching objects smaller than tune.bufsize, which defaults to 16KB. Also, during that first version, it could only cache objects that returned a response code of HTTP 200 OK.

We’re happy to announce that, in HAProxy 1.9, you can now cache objects up to 2GB in size, set with max-object-size. The total-max-size setting determines the total size of the cache and can be increased up to 4095MB. We’re very excited about these changes and look forward to improving the cache even further in the future!

Early Hints (HTTP 103)

HAProxy now supports HTTP Status code 103, also known as Early Hints (RFC8297), which allows you to send a list of links to objects to preload to the client before the server even starts to respond. Still in early adoption, Early Hints is looking like it may replace HTTP/2 Server Push.

A few elements make Early Hints an improvement over Server Push. They are as follows:

  • Server Push can accelerate the delivery of resources, but only resources for which the server is authoritative. In other words, it must follow the same-origin policy, which in some cases hinders the usage of a CDN. Early Hints can point directly to a CDN-hosted object.
  • Early Hints can give the browser the opportunity to use a locally-cached version of the object. Server Push requires that the request be transmitted to the origin regardless of whether the client has the response cached.

To enable the use of Early Hints you would add something similar to the following to your HAProxy configuration file:
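
A sketch of what that might look like; the preloaded URLs and certificate path are illustrative:

    frontend fe_main
        bind :443 ssl crt /etc/haproxy/pem/test.com.pem alpn h2,http/1.1
        http-request early-hint Link "</css/style.css>; rel=preload; as=style"
        http-request early-hint Link "</js/app.js>; rel=preload; as=script"
        default_backend be_main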

While many browsers are still working to support this new feature, you can be sure that HAProxy will be at the forefront when it comes to providing enhancements that improve the browsing experience of your site.

Runtime API Improvements

We’ve updated the Runtime API. The first change modifies the master/worker model to support easier interaction with the workers and better observability into the processes. The master now has its own socket that can be used to communicate with it directly. This socket can then manage communication with each individual worker, even those that are exiting.

To begin using this new feature, HAProxy should be launched with the -W and -S options.

Then connect to the Runtime API via the master socket, like so:
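
Assuming a master socket path of /var/run/haproxy-master.sock, that might look like this:

    # start HAProxy in master/worker mode with a master CLI socket
    haproxy -W -S /var/run/haproxy-master.sock -f /etc/haproxy/haproxy.cfg

    # talk to the master process
    echo "show proc" | socat /var/run/haproxy-master.sock stdio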

The new show proc command displays the uptime of each process.

The new reload command reloads HAProxy and loads a new configuration file. It is exactly the same as sending a SIGUSR2 signal to the master process, except that it can be triggered by an external program after a new configuration file has been uploaded.

From the master socket, commands can be sent to each individual worker process by prefixing the command with an @ sign and the worker’s number. Here’s an example of how you would issue show info to the first worker process:
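
Assuming the same master socket path as above:

    echo "@1 show info" | socat /var/run/haproxy-master.sock stdio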

We’ve also added payload support, which allows you to insert multi-line values using the Runtime API. This is useful for updating map files, for example. At the moment, TLS certificate updating through the Runtime API is not supported, but stay tuned for HAProxy 2.0!

To update a map file using a payload, you would get the ID of the map that you want to update and then use add map to append new lines, separating lines with \n:

You can also append the contents of a file, like so:

HAProxy can already do OCSP stapling, in which the revocation status and expiration date of a certificate is attached to the TLS certificate. This saves the browser from having to contact the certificate vendor itself to verify. The new payload support allows you to more easily update OCSP files without reloading HAProxy.

First, you’d generate an .ocsp file for the certificate using the openssl ocsp command. Once you have the .ocsp file you can issue the following command, which will use the Runtime API with payload support to update within the running process:

The script below shows a complete example for automating this process:

A new show activity command has also been added to the Runtime API. It shows for each thread the total CPU time that was detected as stolen by the system, possibly in other processes running on the same processor, or by another VM shared by the same hypervisor. It also indicates the average processing latency experienced by all tasks, which may indicate that some heavy operations are in progress, such as very high usage of asymmetric cryptography, or extremely large ACLs involving thousands of regular expressions.

Similarly, CPU time and latency values can be reported in logs when profiling is enabled in the global section or enabled using the Runtime API. This helps indicate which TCP/HTTP requests cost a lot to process and which ones suffer from the other ones. To enable profiling within the global section, you would add:

Optionally, to set it using the Runtime API:

To verify that it’s been enabled:

Profiling exposes the following fetches which can be captured within the HAProxy log:

Fetch method Description
date_us The microseconds part of the date.
cpu_calls The number of calls to the task processing the stream or current request since it was allocated. It is reset for each new request on the same connection.
cpu_ns_avg The average number of nanoseconds spent in each call to the task processing the stream or current request.
cpu_ns_tot The total number of nanoseconds spent in each call to the task processing the stream or current request.
lat_ns_avg The average number of nanoseconds spent between the moment the task handling the stream is woken up and the moment it is effectively called.
lat_ns_tot The total number of nanoseconds between the moment the task handling the stream is woken up and the moment it is effectively called.

To use these in the logs, you would either extend the default HTTP log-format, like so:

Or, extend the default TCP log-format:

Server Queue Priority Control

HAProxy 1.9 allows you to prioritize some queued connections over others. This can be helpful to, for example, deliver JavaScript or CSS files before images. Or, you might use it to improve loading times for premium-level customers. Another way to use it is to give a lower priority to bots.

Set a higher server queue priority for JS or CSS files over images by adding an http-request set-priority-class directive that specifies the level of importance to assign to a request. In order to avoid starvation caused by a continuous stream of high-priority requests, there is also the set-priority-offset directive which sets an upper bound on the extra wait time that certain requests should experience compared to others. When you combine this with ACL rules, you gain the flexibility to decide when and how to prioritize connections.
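
A sketch of what that could look like; the class values and ACLs are illustrative and match the explanation that follows:

    frontend fe_main
        bind :80
        acl is_jscss path_end .js .css
        acl is_image path_end .jpg .jpeg .png .gif
        http-request set-priority-class int(1) if is_jscss
        http-request set-priority-class int(10) if is_image
        http-request set-priority-class int(100) if !is_jscss !is_image
        default_backend web_servers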

Lower numbers are given a higher priority. So, in this case, JavaScript and CSS files are given the utmost priority, followed by images, and then by everything else.

Random Load Balancing Algorithm

We’ve added a new random load-balancing algorithm. When used, a random number will be chosen as the key for the consistent hashing function. In this mode, server weights are respected. Dynamic weight changes take effect immediately, as do new server additions. Random load balancing is extremely powerful with large server fleets or when servers are frequently added and removed. When many load balancers are used, it lowers the risk that all of them will point to the same server, such as can happen with leastconn.

The hash-balance-factor directive can be used to further improve fairness of the load balancing by keeping the load assigned to a server close to the average, which is especially useful in situations where servers show highly variable response times.

To enable the random load-balancing algorithm, set balance to random in a backend.
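
For example (server names and addresses are placeholders):

    backend webservers
        balance random
        server s1 192.168.1.10:80 check
        server s2 192.168.1.11:80 check
        server s3 192.168.1.12:80 check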

We’re constantly looking to improve our load-balancing algorithms and hope to unveil even more options soon!

Cloud-Native Logging

HAProxy has long been able to log to a syslog server. However, in microservice architectures that utilize Docker, installing syslog into your containers goes against the paradigm, and users have often asked for alternative ways to send logs. We spent some time planning the best way to implement this without blocking, and we're pleased to announce that we've found a solution!

When using HAProxy 1.9, you will now be able to take advantage of three new ways to send logs: send them to a file descriptor, to stdout, or to stderr. These new methods can be added using the standard log statement.

To enable logging to stdout, use the stdout parameter:
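
A minimal example; the raw format and the daemon facility are one sensible combination for containers, not the only one:

    global
        log stdout format raw daemon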

The same can be done for stderr. An alternative way to do that is to log to a file descriptor as shown:
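
For instance, fd@2 targets stderr (and fd@1 would target stdout), here paired with the systemd-friendly short format:

    global
        log fd@2 format short daemon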

The fd@1 parameter is an alias for stdout and fd@2 is an alias for stderr. This change also comes with two new log formats: raw (better for Docker) and short (better for systemd).

New Fetches

Fetches in HAProxy provide a source of information from either an internal state or from layers 4, 5, 6, and 7. New fetches that you can expect to see in this release include:

Fetch method Description
date_us The microseconds part of the date.
cpu_calls The number of calls to the task processing the stream or current request since it was allocated. It is reset for each new request on the same connection.
cpu_ns_avg The average number of nanoseconds spent in each call to the task processing the stream or current request.
cpu_ns_tot The total number of nanoseconds spent in each call to the task processing the stream or current request.
lat_ns_avg The average number of nanoseconds spent between the moment the task handling the stream is woken up and the moment it is effectively called.
lat_ns_tot The total number of nanoseconds between the moment the task handling the stream is woken up and the moment it is effectively called.
srv_conn_free / be_conn_free Returns the number of available connections on the server/backend.
ssl_bc_is_resumed Returns true when the back connection was made over an SSL/TLS transport layer and the newly created SSL session was resumed using a cached session or a TLS ticket.
fe_defbe Returns the frontend's default backend name.
ssl_fc_session_key / ssl_bc_session_key Returns the SSL master key of the front/back connection.
ssl_bc_alpn / ssl_bc_npn Provides the ALPN and the NPN for an outgoing connection.
prio_class Returns the priority class of the current session for http mode or the connection for tcp mode.
prio_offset Returns the priority offset of the current session for http mode or the connection for tcp mode.

New Converters

Converters allow you to transform data within HAProxy and usually follow a fetch. The following converters have been added to HAProxy 1.9:

Converter Description
strcmp Compares the contents of <var> with the input value of type string.
concat Concatenates up to three fields after the current sample which is then turned into a string.
length Returns the length of a string.
crc32c Hashes a binary input sample into an unsigned, 32-bit quantity using the CRC32C hash function.
ipv6 added to “ipmask” converter Applies a mask to an IPv4/IPv6 address and uses the result for lookups and storage.
field/word converter extended Extended so it’s possible to extract field(s)/word(s) counting from the beginning/end and/or extract multiple fields/words (including separators).

Miscellaneous Improvements

Other, miscellaneous improvements were added to this version of HAProxy. They include:

  • New stick table counters, gpc1 and gpc1_rate, are available.
  • The resolvers section now supports resolv.conf.
  • busy-polling – allows reduction of request processing latency by 30 – 100 microseconds on machines using frequency scaling or supporting deep idle states.
  • The following updates were made to the Lua engine within HAProxy:
    • The Server class gained the ability to change a server’s maxconn value.
    • The TXN class gained the ability to adjust a connection’s priority within the server queue.
    • There is a new StickTable class that allows access to the content of a stick-table by key and allows dumping of the content.

Regression Test Suite

Regression testing is an extremely important part of releasing quality code. Tests that cover a wide range of code not only prevent past bugs from being reintroduced, but also help detect new ones.

Varnish ships with a tool named varnishtest that’s used to help do regression testing across the Varnish codebase. After reviewing this tool we found it to be the perfect candidate for HAProxy-specific tests. We worked with the Varnish team and contributed patches to varnishtest that allow it to be extended and used with HAProxy.

We’ve also begun creating and shipping tests with the code that can be run within your environment today. The tests are quite easy to create once you have an understanding of them. So, if you are interested in contributing to HAProxy but don’t know where to start, you might want to check them out and try creating your own tests!

To begin using the regression testing suite, you will want to install varnishtest, which is provided with the Varnish package. Once that has been installed, you will want to create a test vtc file. Here is a sample:
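
The sample below is a minimal sketch rather than one of the tests shipped with HAProxy; it follows the macro conventions used by the reg-tests (for example ${s1_addr} and ${h1_fe1_sock}), starting a mock server, an HAProxy instance, and a client that checks the response status:

    varnishtest "Basic end-to-end HTTP test"

    server s1 {
        rxreq
        txresp -status 200
    } -start

    haproxy h1 -conf {
        defaults
            mode http
            timeout connect 1s
            timeout client  1s
            timeout server  1s

        frontend fe1
            bind "fd@${fe1}"
            default_backend be1

        backend be1
            server srv1 ${s1_addr}:${s1_port}
    } -start

    client c1 -connect ${h1_fe1_sock} {
        txreq -url "/"
        rxresp
        expect resp.status == 200
    } -run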

To run this test, you would set the HAPROXY_PROGRAM environment variable to the path to the binary you’d like to test. Then call varnishtest.
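
For example, assuming the file above was saved as basic.vtc and the freshly built binary sits in the current directory:

    export HAPROXY_PROGRAM=$PWD/haproxy
    varnishtest basic.vtc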

HAProxy 2.0 Preview

HAProxy 1.9 will allow us to support the latest protocols and features that are becoming a necessity in the rapidly evolving technology landscape. You can expect to see the following features in HAProxy 2.0, which is scheduled to be released in May 2019:

  • HAProxy Data Plane API
  • gRPC
  • Layer 7 Retries
  • FastCGI integration
  • Circuit Breaking
  • Separation of TLS certificates from private keys
  • Ability to update TLS certificates and private keys using the Runtime API

Stay tuned, as we will continue to provide updates as we get closer to our next release!

Conclusion

HAProxy remains at the forefront of performance and innovation because of the commitment of the open-source community and the staff at HAProxy Technologies. We’re excited to bring you this news of the 1.9 release!

It paves the way for many exciting features and begins a new chapter in which you’ll see more frequent releases. It immediately brings support for end-to-end HTTP/2, improved buffering and connection management, updates to the runtime API and Small Object Cache, a new random load balancing algorithm, and even better observability via the runtime API and new fetch methods.

You will quickly see many of these advancements in HAProxy Enterprise as we backport them to the pending HAProxy Enterprise 1.8r2 release. Our philosophy is to always provide value to the open-source community first and then rapidly integrate features into the Enterprise suite, which has a focus on stability. You can compare versions on the Community vs Enterprise page.

Want to stay in the loop about content like this? Subscribe to our blog or follow us on Twitter. You can also join us on Slack. HAProxy Enterprise combines HAProxy with enterprise-class features and premium support. Contact us to learn more or sign up for a free trial today!

Building a Service Mesh with HAProxy and Consul

HashiCorp added a service mesh feature to Consul, its service-discovery and distributed storage tool. In this post, you’ll see how HAProxy is the perfect fit as a data plane for this architecture.

HAProxy is no stranger to the service mesh scene. Its high performance, low resource usage, and flexible design allow it to be embedded within various types of service-oriented architectures. For example, when Airbnb needed to scale its infrastructure to support a growing number of distributed services, it developed SmartStack. SmartStack is a service mesh solution that relies on instances of HAProxy relaying communication between services. Using HAProxy allowed SmartStack to take advantage of advanced load-balancing algorithms, traffic queuing, connection retries, and built-in health checking.

HAProxy Technologies is working with HashiCorp to bring you a Consul service mesh that utilizes HAProxy as a data plane. This will allow you to deploy the world’s fastest and most widely used software load balancer as a sidecar proxy, enabling secure and reliable communication between all of your services.

In Consul 1.2, HashiCorp released Connect, which is a feature that allows you to turn an existing Consul cluster into a service mesh. If you’re familiar with Consul, you’ll know it for its distributed key/value storage and dynamic service discovery. With the addition of Connect, you can register sidecar proxies that are colocated with each of your services and relay traffic between them, creating a service mesh architecture. Best of all, it’s a pluggable framework that allows you to choose the underlying proxy layer to pair with it.

With Connect, HAProxy can be combined with the Consul service mesh too. This provides you with many advantages when designing your architecture. First, let’s take a step back and cover what a service mesh is and why it benefits you to use one.

Why Use a Service Mesh?

Why would you use a service mesh anyway? It comes down to the challenges of operating distributed systems. When your backend services are distributed across servers, departments, and data centers, executing the steps of a business process often involves plenty of network communication. However, if you’ve ever had the responsibility of managing such a network, L. Peter Deutsch’s Fallacies of Distributed Computing will ring true. The fallacies are:

  • The network is reliable.
  • Latency is zero.
  • Bandwidth is infinite.
  • The network is secure.
  • Topology doesn’t change.
  • There is one administrator.
  • Transport cost is zero.
  • The network is homogeneous.

Obviously, the network is not always reliable. Topology does change. The network is not secure by default. How can you manage these risks? Ultimately, you want to apply mechanisms that address these issues without adding tons of complexity into every service that you own. In essence, each application shouldn’t need to be aware that a network exists at all.

That is why service meshes are becoming so popular. They abstract away the network from the application code. A service mesh is a design pattern where, instead of having services communicate directly to one another or directly to a message bus, requests are first passed to intermediary proxies that relay messages from one service to another.

The benefit is that the proxies can support functions for dealing with an unruly network that all services need, but which are more easily handled outside of the services themselves. For example, a proxy could retry a connection if it fails the first time and secure the communication with TLS encryption. You end up separating network functionality from application functionality.

The network-related problems that we mentioned before are mitigated by capabilities within HAProxy. For example, the solutions include:

  • The network is not reliable = Retry logic for connecting to services
  • Bandwidth is not infinite = Rate limiting to prioritize certain traffic or prevent overuse
  • Topology changes = Consistent endpoints, such as always communicating with localhost, service discovery, and DNS resolution at run time
  • The network is not secure = Truly end-to-end encryption with mutual TLS authentication
  • There is never just one administrator = Authorizing which services are allowed to connect to which other services
  • Transport cost is not zero = Observability of service traffic

Let’s see how the pieces fit together.

Control Plane and Data Plane

How does Consul relate to HAProxy? What are the responsibilities of each? Think of the whole thing like a real-time strategy video game. One of my favorites is Starcraft. In this game, you, the player, control an army of workers whose sole mission in life is to await their marching orders from you and then execute them. For example, you could tell them to go mine for minerals or explore a new area of the map and they’ll happily go off and do what’s been asked of them.

Compared to a service mesh, you, the player, represent Consul. Consul gives the marching orders to all of the proxies under its influence. It provides each one with the configuration details it needs: information about where other services can be found, which port to listen on for incoming requests, which TLS certificate to use for encryption, and which upstream services the local service depends on. In service mesh terminology, Consul is the control plane.

HAProxy, the Advanced Proxy

The proxies take on a more drone-like approach to life. You don’t need to configure each instance of HAProxy individually. They are responsible for querying the central Consul registry to get their configuration information on their own. This proxy layer is called the data plane. Of course, they are just as important as your video game workers. The proxy technology that you choose determines the capabilities your service mesh will have. Will it be able to encrypt communication? Will it have logic for reconnecting to a service if the first attempt fails?

HAProxy gives you features that a distributed architecture requires. It is the product of the hard-working open-source community and has become known as the fastest and most widely used software load balancer in the world. You get TLS encryption, connection retry logic, server persistence, rate limiting, authorization between services, and observability all out-of-the-box and in a lightweight process.

The Components

The key pieces of your service mesh will include the following:

  • Your service
  • A Consul agent, running locally, that your service registers with
  • The proxy, which is registered as a sidecar
  • A quorum of Consul agents that are in server mode, responsible for managing the service registry and configuration information

Let’s cover these components in more detail.

Your Service (aka your business logic)

Your service is at the heart of the design. It exposes functionality over a TCP port and, optionally, relies on other distributed services. The goal is to minimize the ways in which you need to change it in order to fit into the service mesh. Ideally, it should continue on as it always has, oblivious to the topology of the outside network. Consul gives you this separation of concerns.

Like in our video game analogy, let’s say that we’re talking about a service that mines for minerals. However, maybe it needs to talk to the map service to find out where to start working. Ordinarily, you would need to configure it with a known address or DNS name of the map service. If these settings ever changed, your mining service’s configuration also had to change to point to the new endpoint.

With Consul, service discovery allows you to query a central registry to find out where services live. However, with Connect, it gets even better. A local instance of HAProxy is created next to the service so that it’s listening on localhost. Then, your service always queries it as if it were the remote map service. This is known as a sidecar proxy: Each service gets its own local proxy and, through the relaying of messages, can reach other remote services as if they too were local. Now, you can point your local service’s configuration at localhost and never need to change that endpoint again.

Consul Agent

Also local to your service is a Consul agent. An agent is the Consul executable running in regular agent mode, as opposed to server mode. You register the local service with this agent by adding a JSON file to the /etc/consul.d folder. This gets synced to the server-mode agents.

Think of it like a walkie-talkie to the agents that are running in server mode, which hold the source of truth. The registration gets sent up and saved to the global registry. Then, when HAProxy wants to discover where it can find services on the network, it asks its local Consul agent and that agent pulls the information back down. Afterwards, it gives the answer to your proxy.

The same goes for the upstream services you’re calling. They each get their own local Consul agent. They register with it so that others can find them and they also tell Consul about any services that they, in turn, depend on. All of this information is used to configure the proxies so that in the end, all services talk to localhost and all communication becomes proxy-to-proxy.

The HAProxy Sidecar

Next to each service and Consul agent is a sidecar proxy, which is an instance of HAProxy. You don’t configure it directly. Instead, you install it with a specialized handler that queries the Consul agent to know which upstream services your local service depends on. Then, HAProxy sets up listeners on localhost so that your application can talk to the remote endpoints that it needs to, but without needing to know exactly where those endpoints live. In essence, you’re abstracting away the network.

A benefit to routing traffic through a local proxy is that the proxy can enforce fine-grained authorization rules. Consul lets you define intentions, which are rules that govern whether one service can talk to another. At runtime, HAProxy queries the local Consul agent to check if an incoming connection is allowed.

To review, your service registers with its local Consul agent information about itself and the upstream services on which it depends. That agent sends that information up to the Consul servers, which maintain the central registry. The local instance of HAProxy then asks the local Consul agent for configuration information and the agent pulls back the data that the Consul servers have compiled.

HAProxy then configures itself, listening for incoming requests to the local service and also for outgoing requests that the service makes to other services. The HAProxy handler continues to check for changes to service registrations and updates itself when needed.

In the end, all service-to-service communication ends up going through proxies. You can see why it’s important to choose a proxy implementation that meets your needs. When it comes to speed and the right mix of features, HAProxy has a lot of benefits.

Server-Mode Agents

You’ve probably got a good idea about how the Consul agents that are running in server mode fit into this architecture. They maintain the service registry, which is the source of truth about where each service endpoint can be found. They also store the settings that each proxy needs to configure itself, such as TLS certificates. The Consul servers host the Consul HTTP API that local agents use to send and receive information about the cluster.

Agents in server mode must elect a leader using a consensus protocol that is based on the Raft algorithm. For that reason, you should dedicate an odd number of nodes, such as three, to participate in the quorum. These agents should reside on separate machines, if possible, so that your cluster has resiliency.

The Implementation

Baptiste Assmann presented a solution for integrating HAProxy with Consul at this year’s HashiConf conference in San Francisco.

During his presentation, he demonstrated using HAProxy as a sidecar that’s configured using information from the local Consul agent. It uses a Golang binary as the handler for configuring HAProxy. That will be available in Q1 of 2019.

In the meantime, he has reproduced its behavior using Bash. The HAProxy/Consul Github repository uses a Bash script to configure local instances of HAProxy. You can spin up this demo environment using Docker Compose. To get started, download the repository and then run the following commands:
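
Roughly, the steps look like this; the clone URL is left as a placeholder (use the repository linked above), and you can drop the -d flag to watch the logs in the foreground:

    git clone <repository-url> haproxy-consul-demo
    cd haproxy-consul-demo
    docker-compose up -d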

You will then be able to open a browser and go to http://localhost:8500/ui to see the Consul dashboard. You’ll need to go to the ACL screen first and enter the master token secret: “mastertoken”, which we’ve set in consul-server/consul.d/basic_config.

Notice how a *-sidecar-proxy service has been generated for the two services we’re creating, redis and www. The www app is a Node.js application that connects to redis via the service mesh. It’s able to connect to Redis on localhost and the connection is routed to the right place.

For this example, each service is hosted inside of a Docker container. Within each container is the service, a local Consul agent, a running instance of HAProxy, and a script called controller.sh that configures HAProxy on the fly. There’s also a Lua file, authorize.lua, that validates the connections between services by checking the client certificate passed with each request. The shell script and Lua file are the same for both services.

They have different start.sh files though, which Docker executes on container startup. The start.sh script installs the service, which is the Node.js application for www and Redis for the redis container, and then registers it with the local Consul agent. Here is the JSON registration for www:
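
The exact file shipped with the demo may differ slightly, but based on the ports described below it has roughly this shape, using Consul's standard sidecar_service and upstreams fields and passing unsecured_bind_port through the opaque proxy config:

    {
      "service": {
        "name": "www",
        "port": 8080,
        "connect": {
          "sidecar_service": {
            "proxy": {
              "config": {
                "unsecured_bind_port": 21002
              },
              "upstreams": [
                {
                  "destination_name": "redis",
                  "local_bind_port": 6379
                }
              ]
            }
          }
        }
      }
    }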

The Node.js app listens on port 8080 locally, but is exposed through the service mesh on an automatically assigned port chosen by Consul. There’s a dependency on the upstream Redis service. HAProxy binds it locally to port 6379, but proxies it to the Redis container on a port chosen by Consul. In this way, the Node.js app can access Redis at 127.0.0.1:6379.

Also note that a configuration parameter called unsecured_bind_port allows you to access the app from outside of the service mesh. So, you can go to http://localhost:21002 on the machine where you’re running Docker. Here’s a screenshot of what it looks like:

In the example code, we’ve enabled Consul ACLs with a default deny rule. Before the www app can connect to Redis, you must add an Intention. Intentions are rules that allow or deny connections between services. You could allow the Node.js app to connect to Redis by adding a new Intention:
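
You can add the Intention from the Consul UI or with the CLI; assuming the master token mentioned earlier, a command along these lines should work:

    consul intention create -token=mastertoken www redis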

Now, the app will succeed when it tries to read from or write to Redis:

After you remove the Intention, it will fail again due to the default deny set up by the ACL.

You can also see the HAProxy Stats page for the app by going to http://localhost:1936. This is great for troubleshooting connection problems.

To extend this example, add more containers, patterning them off of the given Dockerfiles. Then, update your start.sh file to install your application and register it with Consul. Last, add the new service to the docker-compose.yml file.

Conclusion

Consul’s Connect feature enables you to transform a Consul cluster into a service mesh. Connect is a pluggable framework that allows you to choose the proxy technology that fits your needs best.

In this blog post, you learned how HAProxy can be selected as the data plane, giving you access to features like TLS encryption, connection retry logic, server persistence, rate limiting, authorization between services, and observability. I hope you’re as excited as we are about the possibilities that this creates! In the coming months, we will be releasing more information about our integration with Consul and the new Golang implementation.

Want to stay in the loop about blog posts like this? Subscribe to our blog or follow us on Twitter. Want to see what HAProxy Enterprise has to offer? Contact us to learn more or sign up for a free trial today! Our expert support engineers have helped some of the largest companies in the world implement service-oriented architectures.
