Up until recently, I thought that the difference between long and short polling was just the interval between polls.
However, that is not actually the case. There is a more objective difference: where the “waiting” happens.
What is short polling?
In short polling, the client sends requests at a fixed interval, asking the server for new updates. The server replies immediately each time, even if there are no new updates.
Let’s imagine a real-world scenario: you are creating a chatbot for a messaging platform like Discord, and you want to know when you receive a new message. One way you could do it is by sending a request like
GET /messages/?unread=true every 500 ms or so.
The server would then reply with all of the messages that you (the chatbot) haven’t read yet.
This is the simplest form of polling, but it is also expensive: most responses carry no new data, and unless connections are kept alive and reused, every request pays the TCP handshake overhead again.
Thus, short polling is usually a no-go.
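To make the cost visible, here is a minimal sketch of a short-polling client loop. The names `short_poll`, `fetch_unread` and `handle` are hypothetical: `fetch_unread` stands in for the GET /messages/?unread=true call and is assumed to return immediately, with an empty list when nothing is new.

```python
import time
from typing import Callable, List


def short_poll(fetch_unread: Callable[[], List[str]],
               handle: Callable[[str], None],
               interval_s: float = 0.5,
               max_polls: int = 5) -> int:
    """Ask for unread messages every `interval_s` seconds.

    `fetch_unread` is a stand-in (assumption) for the HTTP request; it
    returns right away, with an empty list when there is nothing new.
    Returns how many requests were sent.
    """
    requests_sent = 0
    for _ in range(max_polls):
        requests_sent += 1
        for message in fetch_unread():  # often empty: a wasted round trip
            handle(message)
        time.sleep(interval_s)  # the "waiting" happens here, on the client
    return requests_sent
```

Note that the number of requests depends only on the interval, not on how many messages actually arrive.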
What is long polling?
In long polling, however, the client sends the request and the server holds it open until there are new updates (normally up to a configured timeout).
In the previous scenario, you would send the
GET /messages/?unread=true request, and the server would reply only once a new message arrives.
This is better since you send one request per update (or per timeout) instead of one every 500 ms, which makes it lighter on resources.
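On the server side, the idea can be sketched with a blocking queue. This is only an illustration under assumptions: `long_poll` and `inbox` are hypothetical names, and a real server would run this inside its request handler.

```python
import queue
import threading
from typing import List


def long_poll(inbox: "queue.Queue[str]", timeout_s: float = 30.0) -> List[str]:
    """Block until a message arrives or `timeout_s` elapses.

    Returns the waiting messages, or an empty list on timeout, after which
    the client would simply send the next long-poll request.
    """
    try:
        # The "waiting" happens here, on the server.
        first = inbox.get(timeout=timeout_s)
    except queue.Empty:
        return []
    messages = [first]
    while True:  # drain anything else that is already queued
        try:
            messages.append(inbox.get_nowait())
        except queue.Empty:
            return messages
```

For example, if another thread calls `inbox.put("hello")` while `long_poll(inbox)` is blocked, the call returns right away with that message instead of waiting for the next polling interval.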
Where long polling is used
There is a commonly used technology that makes use of long polling: Kafka.
In Kafka, the consumer polls the broker for new messages, and the request can be configured so that the broker responds once a minimum number of bytes is available (fetch.min.bytes) or once a time limit is reached (fetch.max.wait.ms).
This configurability, combined with being light on resources, is the benefit of long polling.
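The "minimum bytes or time limit" rule can be sketched as below. This is not Kafka's code, only an illustration of the condition: `fetch`, `read_available` and `tick_s` are hypothetical, while `min_bytes` and `max_wait_s` play the roles of the consumer settings fetch.min.bytes and fetch.max.wait.ms.

```python
import time
from typing import Callable, List


def fetch(read_available: Callable[[], List[bytes]],
          min_bytes: int = 1,
          max_wait_s: float = 0.5,
          tick_s: float = 0.01) -> List[bytes]:
    """Return once `min_bytes` are buffered or `max_wait_s` has passed.

    `read_available` is a stand-in (assumption) for reading newly
    appended records; it returns an empty list when nothing is new.
    """
    deadline = time.monotonic() + max_wait_s
    buffered: List[bytes] = []
    while True:
        buffered.extend(read_available())
        size = sum(len(record) for record in buffered)
        # Respond on whichever condition is met first.
        if size >= min_bytes or time.monotonic() >= deadline:
            return buffered
        time.sleep(tick_s)
```

A larger `min_bytes` trades latency for fewer, fuller responses, and `max_wait_s` caps how long that trade can delay a reply.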
Better alternatives for websites
For websites/browsers, however, polling has largely been replaced by WebSockets and Server-Sent Events (SSE), which are better alternatives for those cases.
WebSocket is a protocol, distinct from HTTP, in which a persistent connection is established (via an HTTP handshake) and supports a bidirectional flow of messages.
Thus, both the server and the client can send messages to each other.
In the diagram, different colors indicate that the messages are independent of one another.
The handshake is a GET request with the following key headers:
Upgrade: websocket
Connection: Upgrade
Then the server response is:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Those are the key headers. Another important one is Sec-WebSocket-Protocol, which specifies the subprotocol to be used for the messages exchanged.
From that point on, the connection uses the WebSocket protocol instead of HTTP.
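As a rough sketch of the server side of this handshake: besides the headers listed above, RFC 6455 also has the client send a Sec-WebSocket-Key header and the server answer with a derived Sec-WebSocket-Accept value. That pair is not covered above, so treat it as an addition from the RFC. The function names are hypothetical.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
_WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    raw = (sec_websocket_key + _WS_GUID).encode("ascii")
    return base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_value(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

In practice a WebSocket library does this for you. The sketch only shows that the upgrade is an ordinary HTTP exchange until the 101 response is sent.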
Server-sent events (SSE)
Server-sent events are, just like the name suggests, messages that the server sends straight to the client, without any interaction needed from the client.
The functional difference between this and WebSockets is that, after the client connects to the server, the flow is unidirectional, coming only from the server.
A key technical difference is that SSE uses the HTTP protocol. This is a big advantage, and we’ll see later why.
Although the client can’t send messages via the SSE connection, it can send them via normal HTTP requests.
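Because SSE is plain HTTP, the server only has to keep the response open (with Content-Type: text/event-stream) and write text in a simple line-based format. A minimal sketch of serializing one event follows. `format_sse` is a hypothetical helper, not part of any library.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Each line of the payload gets its own "data:" field.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

On the browser side, the built-in EventSource API parses this format and delivers each event to a listener.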
Which one to choose: WebSockets or SSE?
SSE might seem redundant, since with WebSockets the client can also use the connection to send messages.
However, since SSE relies on HTTP, it is much less likely to be blocked by firewalls, and it is better supported by Web Application Firewalls (WAFs).
Browsers have built-in security measures for HTTP, whereas WebSockets, being a separate protocol, require some extra precautions on the server side.
One example is that the Same-Origin Policy does not apply to WebSocket connections, which makes the following attack possible:
- the attacker sends you a link to their website, which runs a malicious JS script
- that script sends the WS handshake to Discord
  - if this were an SSE request, the browser wouldn’t let the script read the response unless the server opted in via CORS (due to the Same-Origin Policy)
- Discord receives that request
- if the Discord server doesn’t validate the Origin header, the attacker can read the messages you receive and send them to their own server
Check this article for more security practices.
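The Origin check mentioned above can be sketched as a small allow-list lookup, run on the handshake request before the server replies with 101. The function name and the example origins are hypothetical, and rejecting requests without an Origin header is an assumption that may be too strict for non-browser clients.

```python
from typing import Iterable, Mapping


def is_origin_allowed(headers: Mapping[str, str],
                      allowed_origins: Iterable[str]) -> bool:
    """Return True only if the handshake's Origin is on the allow-list."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    origin = normalized.get("origin")
    if origin is None:
        # Assumption: treat a missing Origin header as untrusted.
        return False
    return origin in set(allowed_origins)
```

A handshake from the attacker’s page carries the attacker’s origin, so it fails this check and the server can refuse the upgrade.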
Browsers limit the number of simultaneous HTTP connections per domain (the exact limit depends on the browser, and over HTTP/1.1 it is typically as low as six), so SSE is affected by this.
SSE also isn’t able to transmit binary data, since the data sent must be UTF-8 encoded text. A workaround is to use an encoding such as base64, but the client then needs to decode it on its end.
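The base64 workaround could look roughly like this, with both sides shown in Python for brevity (in a browser the decoding would happen in JS). Both function names are hypothetical.

```python
import base64


def encode_binary_for_sse(payload: bytes) -> str:
    """Server side: wrap binary data in a text-only SSE event."""
    return f"data: {base64.b64encode(payload).decode('ascii')}\n\n"


def decode_binary_from_sse(data_field: str) -> bytes:
    """Client side: recover the original bytes from the data field."""
    return base64.b64decode(data_field)
```

Keep in mind that base64 makes the payload roughly a third larger, which is part of why WebSockets tend to be the easier option for binary data.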
To be honest, the choice here comes down to which one is easiest for you to support:
- which one does your framework/language support? What about your WAF?
- do you intend to send binary data to the client?
  - if so, WS is probably the easiest choice
- can firewalls be an issue?
  - normally only hardened corporate ones are; if so, then SSE is the way to go
What about SPDY, Server Push…?
SPDY was a protocol that was very quickly made obsolete, and Server Push from HTTP/2 is being removed from Chrome, so I don’t think it’s worth going over.