HTTP/1.1 is a text-based protocol where the message framing is mixed with its semantics, making it easy to parse incorrectly. The boundaries between messages are very weak because there is no clear delimiter between them. Thus, HTTP/1.1 parsers are especially vulnerable to request smuggling attacks.
In older HAProxy versions, HTTP/1.1 parsing was performed "in-place" on top of the raw TCP data. It was not an issue while connection keep-alive and massive header manipulations were not implemented/supported. However, when reusing connections to process several requests and adding more and more header manipulations (it is not uncommon to see configurations with hundreds of http-request rules), performance and security became concerns. In addition, supporting HTTP/2 raised compatibility challenges between both protocols, especially around the ability to transcode one version into another.
To be performant, secure, and future-proof, a new approach had to be envisioned. This is what we achieved by using our own internal HTTP representation, called the HTX.
HTX is the internal name for the HTTP abstraction layer in HAProxy. It serves as the interface between the low-level layers of HAProxy, particularly the one responsible for parsing and converting different HTTP versions, the HTTP multiplexers, and the application part. It standardizes the representation of messages between the different versions of HTTP.
Thanks to its design, almost all attacks that have appeared on HTTP/1 since the arrival of HTX have had no effect on HAProxy.
How HAProxy handled HTTP/1 before HTX
Originally, HAProxy was a TCP proxy and L4 load balancer. The HTTP/1 processing was added on top of it to be light, handle one request per connection, and only perform a few modifications. Over time, the trend has changed, and using HAProxy as an HTTP/1 proxy/load balancer has become the main usage, with more complex configurations and increasingly expensive HTTP processing.
To protect HAProxy and servers behind it from attacks against the HTTP/1.1 protocol, costly and complex manipulations were mandatory, making processing even more expensive and the maintenance harder. The limits of the pre-HTX model were reached, mainly because of its design:
We directly received HTTP/1 from the socket into a buffer, and on output, we directly emitted this buffer to another socket.
The buffer therefore contained requests with all their flaws and variations (extra spaces, etc). The start and end of headers were indexed on the fly, and the various analyzers had to account for all possible variations (upper/lower case header names, spaces after the :, spaces at the end of the line, lone line feeds (LF) instead of carriage return line feed (CRLF), forbidden characters).
Rewriting a header required anticipating the change in size (up or down), potentially deleting a trailing CR and LF if a header was deleted, or inserting one if a header was added. These modifications also required updating the header index so that subsequent analyzers remained well synchronized. Some rewrites could insert CRLFs haphazardly ("hacks"), resulting in headers not being detected by subsequent stages because they were not indexed. The same applied to checks.
Data brought up as chunks appeared in raw format with the chunk, including optional extensions, so it was not possible to perform simple string search processing without having to account for chunking. This is notably why the http-buffer-request directive only processed the first chunk.
For very light and historical uses (with minimal header consideration, a model closer to early HTTP/1.0), this operation was relatively optimal since everything received was sent back with very little analysis.
With the arrival of keep-alive, which required parsing many headers, performing many more checks (Content-Length vs. Transfer-Encoding, host vs. authority, connection, upgrade, etc.), and making even more changes (adaptation between close vs. keep-alive sides), the simplicity of the original model became a liability. All advanced parsing work had to be redone at each processing stage, often multiple times per request and response.
How HAProxy handled HTTP/2 before HTX
With the arrival of HTTP/2, HAProxy was faced with a completely different paradigm. A text-based protocol with no real framing for HTTP/1 against a binary-based protocol with a well-defined framing for HTTP/2. The main challenge was to find a way to add HTTP/2 support while making it compatible with the HTTP processing stack of HAProxy.
The HTTP/2 support was originally implemented as a protocol conversion layer between internal HTTP/1 and external HTTP/2. However, this raised several security issues due to the ambiguities of the conversion, and it also came with an unfortunate extra cost. The model was as follows:
On input, a block of HTTP/2 data was received and decoded. HEADERS frames were decompressed via the HPACK algorithm and produced a list of headers as (name, value) pairs. This list was then used to fabricate an HTTP/1.1 request by combining the method, URI, and adding the headers with : after the names and CRLFs after the values. This already posed several problems, because H2 is binary-transparent, meaning it is technically possible to encode : and CRLF in field values, making header injection possible if not enough care was taken.
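For instance (illustrative names and values), a binary-transparent field value containing raw CRLF bytes, if concatenated naively into HTTP/1.1 text, becomes an extra header line:

```
HPACK-decoded pair:     ("x-note", "ok\r\nInjected: 1")
Naive HTTP/1.1 output:  x-note: ok
                        Injected: 1      <- attacker-controlled header line
```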
The reconstructed URL was generally an absolute URL because the elements provided by the client (method, scheme, authority, path) were concatenated, and URL matching rules in configurations no longer worked (e.g., url_beg /static).
Data received in DATA frames resulted in HTTP chunks if the Content-Length was not announced. Similarly, it was difficult to analyze request chunks when needed (e.g., impossible to perform state modifications in HTTP/2 on the stats page, which used POST).
On output, the HTTP/2 converter had to re-parse HTTP/1.1 headers to fabricate a HEADERS frame. The parser used was simpler because it was assumed that HAProxy could be trusted to send valid protocol, but this caused many problems with error messages (errorfiles), responses forged in Lua, and the cache, which could sometimes store content that had undergone few protocol checks. Furthermore, to read the data, the parser also had to account for chunking and emit a DATA frame for each chunk. As a result, an HTTP/1 response sent back over HTTP/2 had to be parsed twice: once by the HTTP/1 parser for analysis, and a second time by the HTTP/1 parser integrated into the HTTP/2 converter.
The HTTP/2 converter also had to infer the correct operating mode for the response based on the Connection header and the announced HTTP version (HTTP/1.0 vs HTTP/1.1). It also struggled with edge cases involving Transfer-Encoding combined with Content-Length, as well as cases with no announced length where closing the server-side connection signaled the end of the response. Trailers were not supported because they were too risky to implement in this conversion model.
HTTP/2 was not implemented on the server side because of the increase in special cases to handle and the difficulty in converting these responses into valid HTTP/1. Furthermore, it quickly became clear that with this model, it was impossible to maintain a correct level of performance by doing end-to-end H2 because it required four conversions for each exchange.
The HTX: common internal representation for different HTTP versions
The "legacy" model having shown its limits, we logically decided to undertake a transition towards a more rational model. Continuing to base all internal operations on HTTP/1 was clearly a hindrance to the adoption of HTTP/2 and any other future versions of HTTP. The HTX was born from this thinking: to achieve a common internal representation for all current or future versions of HTTP.
If we take the previous example, here is how a request is transmitted today from an H2 client to an HTTP/1.1 server:
If we look more closely at what happens on the H2-to-HTX conversion side, we now obtain a structured message that has nothing to do with the previous HTTP/1 version:
It is at the moment of sending the request to the server that the conversion to HTTP/1.1 occurs, and this is done using standardized information:
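As a simplified illustration (hostnames and paths are made up, and the real internal layout differs), the pseudo-header fields of the H2 request become normalized HTX elements, which the output side serializes as HTTP/1.1:

```
HTTP/2 (input)                 HTX (internal)                     HTTP/1.1 (output)
:method    = GET               start-line: GET /img/logo.png      GET /img/logo.png HTTP/1.1
:scheme    = https        ->   header: ("host", "example.com") -> Host: example.com
:authority = example.com       end-of-headers marker              (CRLF added on output)
:path      = /img/logo.png
```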
How HTX works: a technical deep-dive
The HTX is a structured representation of an HTTP message, intended to serve as a common foundation for all versions of HTTP within HAProxy. Internally, an HTX message is stored in a buffer. It consists of a part containing information about the message, the metadata, followed by a set of blocks containing a portion of the message resulting from parsing. This eliminates formatting differences tied to specific HTTP versions and standardizes the data. For example, header names are stored in lowercase, and spaces at the beginning and end of header values are removed, as are CRLFs at the end of the header.
Because an HTX structure is limited to the size of a buffer, only part of a large HTTP message may be present in HTX at any one time. An HTX message can be thought of as a pipe: as parsing progresses, new blocks are appended; HAProxy processes them (header rewriting, body compression, etc.); and they are then removed on the output side to be formatted and sent to the remote peer.
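As a rough sketch (field names and layout are purely illustrative, not HAProxy's actual definitions), the metadata-plus-blocks layout can be pictured like this:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch of an HTX-like message layout; names and fields
 * are hypothetical, not HAProxy's actual structures. */

/* One block index: type and size packed into 32 bits, plus the offset
 * of the block data relative to the start of the block area. */
struct sk_blk {
    uint32_t info;   /* 4-bit type in the high bits, sizes below */
    uint32_t addr;   /* offset of the data in the block area */
};

/* Message-level metadata, followed by the block area itself: indexes
 * grow from the end of the area, data grows from the beginning. */
struct sk_htx {
    uint32_t size;   /* size of the block area */
    uint32_t data;   /* total size of the data currently stored */
    int32_t  head;   /* position of the oldest block index */
    int32_t  tail;   /* position of the newest block index */
    /* ... flags, etc. ... */
    char blocks[];   /* block area: data at the front, indexes at the back */
};
```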
Organization of HTX blocks
HTX blocks are stored in a contiguous memory area, an array of blocks. Each block is divided into two parts. The first part, the block index, contains information related to the block: its type, its size, and the address of its content in the array. These indexes are stored starting from the end of the block array. The second part, meanwhile, contains the block data and is stored starting from the beginning of the array.
The block indexes remain ordered and stored linearly. We use a positive position to identify a block index. This position can then be converted into an address relative to the beginning of the block array.
"head" is the position of the oldest block index
"tail" is that of the newest
The part corresponding to the block data is a memory space that can "wrap" and is located at the beginning of the block array. The block data is not directly accessible; one must go through the index of the corresponding block to know the address of its data, relative to the beginning of the block array.
When the free space between the index area and the data area is too small to store the data of a new block, we restart from the beginning to find free space. The advantage of managing block data as a circular memory area is to optimize the use of available space when blocks are manipulated — for example, when a header is deleted — or when the blocks are simply consumed to be formatted and sent to the remote peer.
However, this sometimes requires a defragmentation step when the distribution of blocks becomes too fragmented, and it is necessary to recover free space to continue processing. Concretely, the data can be arranged in two different ways:
Contiguous and ordered: with two possible free spaces — before and after the data. To preserve the order of the blocks as much as possible, additions are made primarily at the end of the array. The gaps between the blocks are not directly reusable:
Scattered and unordered: where the only usable space for inserting new blocks is located after the most recent data:
Defragmentation is necessary when the usable free space becomes too small. In this case, the block data are realigned at the beginning of the array to obtain the largest possible contiguous free space. The gaps between the block data are thus recovered. During defragmentation, unused block indexes are erased, and the index array is also defragmented.
Structure of HTX block indexes
A block index contains the following information about the block:
A 32-bit field: 4 bits for the block type and 28 bits for the size of the block data.
The address of the block data, on 32 bits, relative to the beginning of the block array.
Block types overview
Among the different types of HTX blocks, we find the elements that constitute an HTTP message:
A start-line: (method + uri) for a request, or (status + reason) for a response.
A header or a trailer: a (name + value) pair.
Data.
There are also internal block types used to mark the end of headers and trailers, or to mark a block as unused (for example, when it has been deleted).
Encoding details
Depending on the type, the information of a block will be encoded differently:

Header or trailer block:

0b 0000 0000 0000 0000 0000 0000 0000 0000
---- ------------------------ ---------
type value length (1MB max) name length (256B max)

Start-line or data block:

0b 0000 0000 0000 0000 0000 0000 0000 0000
---- ----------------------------------
type data length (256 MB max)

End-of-headers or end-of-trailers marker:

0b 0000 0000 0000 0000 0000 0000 0000 0001
---- ----------------------------------
type always set to 1

Unused:

0b 0000 0000 0000 0000 0000 0000 0000 0000
---- ----------------------------------
type always set to 0
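These encodings can be sketched with a few packing helpers (a hypothetical layout consistent with the sizes above; the exact bit order in HAProxy may differ):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing helpers for a 32-bit block info field:
 * a 4-bit type in the high bits, 28 bits of size information below. */

#define SK_TYPE_SHIFT 28

/* Header or trailer block: 20-bit value length, 8-bit name length. */
static inline uint32_t sk_hdr_info(uint32_t type, uint32_t nlen, uint32_t vlen)
{
    return (type << SK_TYPE_SHIFT) | ((vlen & 0xfffff) << 8) | (nlen & 0xff);
}

static inline uint32_t sk_name_len(uint32_t info)  { return info & 0xff; }
static inline uint32_t sk_value_len(uint32_t info) { return (info >> 8) & 0xfffff; }

/* Start-line or data block: a single 28-bit data length. */
static inline uint32_t sk_data_info(uint32_t type, uint32_t len)
{
    return (type << SK_TYPE_SHIFT) | (len & 0x0fffffff);
}

static inline uint32_t sk_type(uint32_t info)     { return info >> SK_TYPE_SHIFT; }
static inline uint32_t sk_data_len(uint32_t info) { return info & 0x0fffffff; }
```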
Block ordering
An HTX message is typically composed of the following blocks, in this order:
A start-line (request or response)
Zero or more header blocks
An end-of-headers marker
Zero or more data blocks (HTTP)
Zero or more trailer blocks
An end-of-trailers marker (optional, but always present if there is at least one trailer block)
Zero or more data blocks (TUNNEL)
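As an illustration, a simple chunked HTTP/1.1 POST might decompose into the following block sequence (simplified):

```
POST /upload HTTP/1.1         ->  start-line  (POST /upload)
Host: example.com             ->  header      ("host", "example.com")
Transfer-Encoding: chunked    ->  header      ("transfer-encoding", "chunked")
\r\n                          ->  end-of-headers marker
5\r\nhello\r\n                ->  data        ("hello")  (chunk framing removed)
0\r\n\r\n                     ->  end of message (no trailers here)
```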
Responses with interim status codes
In the case of responses, when there are interim responses (1xx), the first three blocks can be repeated before having the final response (2xx, 3xx, 4xx, or 5xx). In all cases, whether for requests or responses, the start-line, headers, and end-of-headers marker always remain grouped. This is true at the time of parsing, but also at the time of message formatting. The same applies to trailers and the end-of-trailers marker.
Structure of HTX block data
HTX block data comes in several forms, each representing a specific part of an HTTP message. The following sections describe how each block type is structured and used.
The start-line
The start line is the first block of an HTX message. The data in this block is structured. In HTTP/1.1, it is directly extracted from the first line of the request or response. In H2 and H3, it comes from the pseudo-headers (:method, :scheme, :authority, :path). Furthermore, because this block is emitted after all message headers have been parsed, it also contains information about the message itself, in the form of flags. For example, it can indicate whether the message is a request or a response, whether a size was announced via the Content-Length header, and so on.
Headers and trailers
Header data and trailers are stored the same way in HTX. There are two different block types to simplify internal header processing, but from the HTX perspective, there is no real difference, apart from their position in the message. Header blocks always come after a start-line and before an end-of-headers marker, while trailers always come after any data and are terminated by an end-of-trailers marker. In both cases, it is a {name, value} pair.
Data
The message payload is stored as data blocks, outside of any transfer encoding. Thus, in HTTP/1.1, the chunked formatting disappears. The same blocks are also used to store data exchanged in an HTTP tunnel.
End-of-headers and end-of-trailers markers

These are blocks without data. Depending on their type, they mark either the end of a message's headers or the end of its trailers. The end-of-headers marker is mandatory and separates the headers from the rest of the message. The end-of-trailers marker is optional; it marks the end of the HTTP message and can be followed by data exchanged in an HTTP tunnel.
The benefits of using HTX in HAProxy
Simplified and more secure manipulation
As header names are normalized and indexed, searching is straightforward; it is enough to iterate over the blocks and compare them without having to re-parse them. There are relatively few headers per request or response (very often less than ten per request, between one and two dozen per response), so any additional indexing would be superfluous. This simplifies operations by eliminating the need to preserve relative positions between all headers.
Rewrites are also well-controlled. The name of a header is normally not modified, and the value that needs modification (e.g., cookie modification) is already perfectly delimited and cleared of leading/trailing spaces and CRLFs. The API prevents insertion of forbidden characters such as NUL, CR, or LF. As a result, it is effectively impossible to leave CRLFs in a value accidentally, and each HTX header corresponds to exactly one outgoing header.
In HTTP/1, the CRLF at the end of each line is automatically added by the protocol conversion, so the user or analyzer working on HTX has no direct control. Except in cases of conversion bugs, this design makes it impossible to pass one header for another or inject portions of a request or response to cause smuggling. Risks of bugs in the analysis or application layer are minimized because this layer no longer needs to manage available space for modifications; HTX handles it automatically, making the API trivial to use.
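The kind of guarantee the API enforces is easy to sketch (hypothetical helper, not HAProxy's actual API): any attempt to set a value containing NUL, CR, or LF is rejected before it ever reaches the message:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: an HTX-style header API validates values up
 * front, so CRLF can never be smuggled into the serialized message. */
static int sk_value_is_valid(const char *v, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)v[i];
        if (c == '\0' || c == '\r' || c == '\n')
            return 0;  /* forbidden byte: refuse the rewrite */
    }
    return 1;
}
```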
Regarding the data, its delimitation is performed by the output converter. A block emitted in HTTP/1 will lead to the creation of a chunk, while the same block emitted in HTTP/2 will lead to one or more DATA frames. The total size of the converted data must match the announced size, otherwise an error is reported. Here too, the user has no control, so there is no risk of confusion about the output protocol. We again benefit from the intrinsic knowledge of the remaining data to be transmitted, avoiding the misinterpretation of boundary formatting that could lead to exploitation.
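The delimitation step on the HTTP/1 side can be sketched as follows (a minimal illustration, not HAProxy's code): one data block becomes one chunk, with the size line computed from the block's known length:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimal sketch: serialize one HTX-like data block as an HTTP/1.1
 * chunk. The same block would instead become one or more DATA frames
 * on an HTTP/2 connection. */
static int sk_emit_chunk(char *out, size_t outsz, const char *data, size_t len)
{
    /* size in hex, CRLF, payload, CRLF */
    return snprintf(out, outsz, "%zx\r\n%.*s\r\n", len, (int)len, data);
}
```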
This design is why nearly all HTTP/1 attacks since HTX’s introduction have had no effect on HAProxy. A few early issues affected the first versions due to missing checks in the conversion stage (e.g., spaces in the name of an HTTP/2 or HTTP/3 method), but these had very limited impact.
Closer to RFCs
Modern HTTP specs (RFC911x) have been carefully separated by distinguishing the semantic part from the protocol part. Thus, all HTTP versions are subject to the same rules described in RFC9110, and each version also has its own constraints. For example, HTTP/1 describes how the body length of a message is determined based on the method used, the status code, the presence or absence of a Transfer-Encoding header, etc., while HTTP/2 and HTTP/3 do not have this last header. Conversely, HTTP/2 and HTTP/3 are not allowed to let a Connection header pass and have specific rules regarding mandatory headers depending on the methods, and which may also be subject to negotiations within the framework of optional extensions (e.g., RFC8441 to pass WebSocket over HTTP/2).
These kinds of checks were tedious to implement in the analysis layer because every one of them had to be guarded by version checks to know what to verify. Mistakes are even easier to make on the response path: an HTTP/2 request forwarded over HTTP/1 leads to an HTTP/1 response that must be re-encoded in HTTP/2, and relying on the request's version rather than the response's version for a given operation is a very easy error to make. All of this code was therefore considered very sensitive and received very few improvements for fear of making it fragile.
The HTX made it possible to move the protocol checks out to the ends, into the converters, and to leave only semantic checks in the analyzers, operating on the HTX representation. Each protocol converter is thus freer with its checks and can stick to its own rules without the risk of side effects on the others.
More easily extensible
Adding support for new protocols only requires writing the new converters, potentially by drawing inspiration from another similar protocol. There is no longer any need to modify the analyzers or the core semantic layer. This is how HTTP/3 support on top of the QUIC layer was added so quickly—with fewer than 3000 lines of code, some of which came from the HTTP/2 implementation (and were later shared between them). Indeed, in practice, implementing a new protocol mostly comes down to writing the HTX transcoding code to/from this protocol and nothing else.
Support for the FastCGI protocol was introduced in much the same way, since this protocol is primarily just a different representation of HTTP. As a result, the codebase is now well positioned to accommodate new experimental versions of protocols while remaining maintainable. For example, when HTTP/4 eventually takes shape, the HTTP/3 code will probably be reused and adapted to form the beginning of HTTP/4, thus preserving a proven basis that can evolve alongside the protocol and be ready with a functional version once the protocol is ratified.
This approach keeps the focus on the essential, namely the protocol itself and its interoperability with the outside world rather than on the impacts it could have on the entire codebase. By comparison, the first implementation of HTTP/2, starting from an already ratified protocol, took more than a year to complete.
Conclusion
The HTX enables us, among other things, to free ourselves from the details related to the different versions of HTTP in the core of HAProxy, allowing the conversion layers to handle them on input and output. HAProxy is thus capable of enabling clients and servers to communicate regardless of the HTTP versions used on each side.
Ultimately, this abstraction immunizes HAProxy against numerous classes of bugs affecting HTTP/1. In HAProxy, HTTP/1 is not processed for analysis (rewriting, routing, or searching). Once the translation into HTX is done, HAProxy is no longer subject to HTTP/1 attacks. The focus is therefore concentrated solely on the conversion layer.
For example, a well-known request smuggling attack, which involves sending an HTTP/1 request with both a Content-Length header and a Transfer-Encoding header in order to hide a second request in the payload of the first, is not possible in HTX by design: information related to the data size is extracted from the message and stored as metadata.
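The classic form of that attack looks like this (illustrative payload): the two headers disagree on where the first request ends, so a parser honoring Content-Length sees one request while a parser honoring Transfer-Encoding sees the chunked body end early, leaving a second request behind:

```
POST / HTTP/1.1
Host: example.com
Content-Length: 47
Transfer-Encoding: chunked

0

GET /admin HTTP/1.1
Host: example.com

```

Once parsed into HTX, this ambiguity cannot survive: the framing decision is made once, at the input converter, and only the resulting unambiguous length metadata travels through HAProxy.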
While HAProxy cannot block every possible request smuggling attack (consisting of hiding one request inside another), it will not be vulnerable to them and will prevent a certain number of them from being sent to servers, precisely because with the switch to HTX, the HTTP message undergoes normalization, and HAProxy can only transmit what it understands.
Need more security?
HAProxy Enterprise features next-generation multi-layered security, including DDoS protection, bot management, web application firewall (WAF), and AI gateway. These security layers are exceptionally accurate with ultra-low latency, have low resource usage, feature dynamic and adaptive detection, and can be orchestrated at scale using HAProxy Fusion Control Plane.
HTTP/1.1 is indeed an ambiguous and complicated protocol to parse correctly. Before HTX, HAProxy was probably affected by some of these known attacks, and a lot of time was spent resolving parsing and processing bugs. Today, this no longer happens. This is a significant benefit brought by the switch to HTX.