WebSocket is a communication protocol that enables real-time, full-duplex communication between a client (such as a web browser) and a server. Understanding WebSockets matters because they provide efficient, persistent connections between clients and servers, particularly where real-time updates or interactive features are required.
This article explores one method for horizontally scaling WebSockets, which is challenging because the connections are stateful. The focus is on explaining the concept of scaling rather than providing a detailed implementation guide.
A WebSocket connection is stateful: it stays open on the particular server instance that accepted it. This becomes a problem in horizontally scaled environments. When multiple clients are connected and the backend needs to send a message to a specific client, it must determine which of the horizontally scaled instances holds that client's connection. Message routing therefore needs a way to keep track of which connection lives where.
For example:
In this scenario, we have a backend service that needs to communicate with a specific client, Client 1. The challenge arises because the WebSocket service is horizontally scaled into multiple replicas, each managing different client connections.
When “Another backend service” tries to send a message to Client 1, the WebSocket service must correctly identify which replica, in this case Replica 1, maintains the connection to Client 1. If the message were routed through Replica 2 instead, it could only reach Client 2, because Replica 2 holds no connection to Client 1. It is therefore crucial for the WebSocket service to track which client is connected through which replica so that messages are routed correctly.
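The routing problem can be reproduced in a few lines. The following is a minimal sketch, not a real WebSocket server: the `Replica` class, its methods, and the client identifiers are all hypothetical, and a plain list stands in for each open socket.

```python
class Replica:
    """Hypothetical stand-in for one WebSocket service replica."""

    def __init__(self, name):
        self.name = name
        # client_id -> list acting as a fake socket (collects sent messages)
        self.connections = {}

    def connect(self, client_id):
        # A connection is held by exactly one replica: the one that accepted it.
        self.connections[client_id] = []

    def send(self, client_id, message):
        """Deliver only if this replica holds the client's connection."""
        outbox = self.connections.get(client_id)
        if outbox is None:
            return False  # this replica has no connection to that client
        outbox.append(message)
        return True


replica_1 = Replica("replica-1")
replica_2 = Replica("replica-2")
replica_1.connect("client-1")
replica_2.connect("client-2")

# Asking the wrong replica fails; only the replica holding the connection can deliver.
delivered_wrong = replica_2.send("client-1", "hello")
delivered_right = replica_1.send("client-1", "hello")
```

The connection table lives inside each replica, so nothing outside a replica knows where a given client is connected. That missing knowledge is what the rest of the article addresses.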
To address the challenge of routing messages correctly in a horizontally scaled WebSocket environment, a solution is to enable each WebSocket service replica to operate independently while still being able to communicate with other replicas.
This means that each replica should have the capability to determine whether an incoming message is relevant to its current connections. If the message is intended for a client connected to a different replica, the replica receiving the message should have a method to forward the message to the correct replica. This requires a coordination mechanism among replicas, ensuring that messages are routed accurately to the client they are meant for, regardless of which replica initially receives the message.
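Implementing a broadcast mechanism is a feasible solution to manage message routing among replicas in a horizontally scaled WebSocket service. Here’s how it would work: the replica that receives a message broadcasts it to every replica; each replica then checks whether the target client is among its own connections, delivers the message if it is, and discards it otherwise.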
This approach ensures that every message reaches the appropriate client, regardless of which replica initially received it. However, it's important to consider the potential for increased network traffic and the need for efficient filtering mechanisms to handle irrelevant messages, to avoid overloading the replicas with unnecessary processing tasks.
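The broadcast-and-filter idea can be sketched as follows. This is an in-process illustration under stated assumptions: the `Replica` class, the `peers` list, and the `ignored` counter are hypothetical names introduced here, and a direct method call stands in for whatever network transport the replicas would actually use.

```python
class Replica:
    """Hypothetical stand-in for one WebSocket service replica."""

    def __init__(self, name):
        self.name = name
        self.connections = {}  # client_id -> list of delivered messages
        self.peers = []        # every replica in the cluster, including this one
        self.ignored = 0       # broadcasts dropped because the client is elsewhere

    def connect(self, client_id):
        self.connections[client_id] = []

    def receive_from_backend(self, client_id, message):
        """Entry point: whichever replica gets the message broadcasts it to all."""
        for replica in self.peers:
            replica.on_broadcast(client_id, message)

    def on_broadcast(self, client_id, message):
        """Filter step: deliver if the client is connected here, otherwise drop."""
        if client_id in self.connections:
            self.connections[client_id].append(message)
        else:
            self.ignored += 1


replicas = [Replica(f"replica-{i}") for i in (1, 2, 3)]
for r in replicas:
    r.peers = replicas
replicas[0].connect("client-1")
replicas[1].connect("client-2")
replicas[2].connect("client-3")

# The backend happens to reach replica 1, but the target client is on replica 3.
replicas[0].receive_from_backend("client-3", "order shipped")
```

In this sketch every message costs one lookup on each replica, and all but one of those lookups end in a drop. That is the extra traffic and filtering overhead mentioned above, and it grows with the number of replicas.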
The following image shows an example implementation of this idea:
This diagram illustrates the process of routing a message from "Another backend service" to Client 3 within a horizontally scaled WebSocket environment utilizing a publish/subscribe (pub/sub) queue system.
Here's a step-by-step breakdown:
This setup shows how messages can be routed across multiple service replicas so that, even in a scaled environment, each message reaches the client it was meant for. The pub/sub model supports a scalable architecture by decoupling senders from receivers: the sender does not need to know which replica holds the connection, and each replica independently handles only the messages relevant to its own clients.
Because each replica acts only on messages for clients it holds, delivery work stays on the replica that owns the connection; the cost that remains is the filtering step on every other replica.
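The pub/sub flow described above can be sketched with an in-memory stand-in for the queue. Everything here is an assumption made for illustration: the `PubSub` class is not a real broker client, the topic name `ws-messages` and the payload fields `client_id` and `body` are invented, and the sketch has the backend publish directly to the queue, which may differ from the setup in the diagram.

```python
from collections import defaultdict


class PubSub:
    """In-memory stand-in for a pub/sub queue (hypothetical, not a real broker)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Fan out: every subscriber of the topic gets its own copy of the payload.
        for callback in self.subscribers[topic]:
            callback(dict(payload))


class Replica:
    """Hypothetical WebSocket replica that subscribes to the shared topic."""

    def __init__(self, name, bus, topic="ws-messages"):
        self.name = name
        self.connections = {}  # client_id -> list of delivered messages
        bus.subscribe(topic, self.on_message)

    def connect(self, client_id):
        self.connections[client_id] = []

    def on_message(self, payload):
        # Deliver only when the target client is connected to this replica.
        outbox = self.connections.get(payload["client_id"])
        if outbox is not None:
            outbox.append(payload["body"])


bus = PubSub()
replicas = {n: Replica(n, bus) for n in ("replica-1", "replica-2", "replica-3")}
replicas["replica-1"].connect("client-1")
replicas["replica-2"].connect("client-2")
replicas["replica-3"].connect("client-3")

# The sender publishes without knowing which replica holds client-3.
bus.publish("ws-messages", {"client_id": "client-3", "body": "hello client 3"})
```

The sender only needs the topic name and the client identifier. In a real deployment the in-memory `PubSub` would be replaced by an actual message broker, and each replica would subscribe when it starts up.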