SSE, WebSockets, Hasura, Apollo Federation

You might be thinking that there isn't much to say about subscriptions. They are defined in the GraphQL specification and it should be very clear how they work and what they do.

But the specification doesn't really say much about the transport layer. In fact, it doesn't specify the transport layer at all. On the one hand, this is an advantage because you can use GraphQL in a wide variety of environments. On the other hand, we now have at least five different implementations of GraphQL subscriptions.

This means that you can't just use any GraphQL client, connect to a GraphQL server, and expect everything to work. You need to know which protocol the server supports and which client you need to use. Is this an ideal situation? Probably not, but we're going to change that!

We are the creators of WunderGraph (open source), the first cloud-based server-side GraphQL API Gateway. One of the challenges we faced was supporting all of the different GraphQL subscription protocols. Because the GraphQL specification is strictly protocol agnostic, several different protocols have been developed over the years.

If a client wants to use a GraphQL subscription, they need to know which protocol to use and implement the client side of that protocol.

With our Open Source API Gateway, we take things a step further and bring everything under one roof. If you're looking at using GraphQL subscriptions in your project, this post is a great way to quickly become familiar with the different protocols and their features.

Introduction – What are GraphQL Subscriptions?

GraphQL has three types of operations: Queries, Mutations, and Subscriptions. Queries and Mutations are used to retrieve and modify data. Subscriptions are used to subscribe to data changes.

Instead of polling the server for updates, subscriptions allow the client to subscribe to data changes, for example by subscribing to a chat room. When a new message is sent to the chat room, the server sends the message to the client.

With Queries and Mutations, control of the flow is in the hands of the client. The client sends a request to the server and waits for a response. With Subscriptions, flow control is in the hands of the server.
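
To make the difference in flow control concrete, here is a minimal sketch of the push model. The ChatRoom class and its method names are hypothetical and exist only for illustration; no network transport is involved. The client registers interest once, and from then on the server side decides when data flows.

```typescript
// Hypothetical in-memory chat room, used only to illustrate push-based delivery.
type Message = { id: number; text: string };
type Listener = (message: Message) => void;

class ChatRoom {
  private listeners = new Set<Listener>();

  // The client subscribes once and gets back a function to unsubscribe.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // The server controls the flow: each publish pushes to every subscriber.
  publish(message: Message): void {
    this.listeners.forEach((listener) => listener(message));
  }
}
```

With polling, the client would have to call the server repeatedly and compare results. Here it receives nothing until publish is called, and nothing after it unsubscribes.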

Here's an example of a GraphQL subscription:

subscription($roomId: ID!) {
  messages(roomId: $roomId) {
    id
    text
  }
}

The server will now send a continuous stream of messages to the client. Here is an example with 2 messages:

{
  "data": {
    "messages": {
      "id": 1,
      "text": "Hello Subscriptions!"
    }
  }
}
{
  "data": {
    "messages": {
      "id": 2,
      "text": "Hello WunderGraph!"
    }
  }
}

Now that we understand what GraphQL subscriptions are, let's look at the different protocols available.

GraphQL Subscriptions via WebSockets

The most widely used transport layer for GraphQL subscriptions is WebSockets. WebSocket is a bidirectional communication protocol. It allows the client and server to send messages to each other at any time.

There are two implementations of GraphQL subscriptions over WebSockets:

The first one is subscriptions-transport-ws from Apollo, the second is graphql-ws from Denis Badurina.

Both protocols are quite similar, although there are some minor differences. It is important to note that the Apollo protocol has been deprecated in favor of graphql-ws, but it is still widely used.

GraphQL Subscriptions via WebSockets: subscriptions-transport-ws vs graphql-ws

Both transports use JSON as the message format. The type field identifies the kind of message, and individual subscriptions are identified by the id field.

Clients initiate a connection by sending a connection_init message, which the server answers with a connection_ack message:

{"type": "connection_init"}
{"type": "connection_ack"}

This seems strange to me. It looks like we are re-creating TCP on multiple layers. To create a WebSocket connection, we first need to create a TCP connection. A TCP connection is established with a three-way handshake: the client sends a SYN packet, the server answers with SYN-ACK, and the client confirms with ACK. Thus, a handshake already occurs between the client and the server.

We then initiate a WebSocket connection by sending an HTTP Upgrade request, which the server accepts by sending an HTTP Upgrade response. This is the second handshake between the client and server.

Why do we need a third handshake? Are we not trusting the WebSocket protocol enough?

Anyway, after these three handshakes, we are finally ready to send messages. But before we talk about starting and stopping subscriptions, we need to make sure our WebSocket connection is still alive. Regarding heartbeat, there are some differences between the two protocols.

The Apollo protocol uses a {"type": "ka"} message to send a heartbeat from server to client. The protocol does not define how the client should respond. If the server sends a keepalive message to the client, but the client never responds, what is the point of the keepalive message? But there is one more problem. The protocol specifies that the server should start sending keepalive messages only after the connection has been acknowledged. In practice, we have found that Hasura can send keepalive messages before the connection is acknowledged. So if your implementation depends on strict message ordering, you should be aware of this.

The graphql-ws protocol improved on this. Instead of a one-way keepalive message, it specifies that the server should periodically send a {"type":"ping"} message, to which the client must respond with a {"type":"pong"} message. This assures both client and server that the other party is still alive.
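
The heartbeat rules of both protocols can be sketched as two small pure functions. The function and type names are my own and not part of either library. The second function shows one way to tolerate a keepalive that arrives before connection_ack, as we observed with Hasura.

```typescript
type Protocol = "subscriptions-transport-ws" | "graphql-ws";
type WsMessage = { type: string; id?: string; payload?: unknown };

// Returns the message the client should send back, or null if no reply is expected.
function heartbeatReply(protocol: Protocol, incoming: WsMessage): WsMessage | null {
  if (protocol === "graphql-ws" && incoming.type === "ping") {
    return { type: "pong" };
  }
  // Legacy "ka" messages are one-way: the protocol defines no client response.
  return null;
}

// Tracks whether the connection has been acknowledged. An early "ka" is
// simply ignored instead of being treated as a protocol violation.
function nextHandshakeState(acknowledged: boolean, incoming: WsMessage): boolean {
  if (incoming.type === "connection_ack") {
    return true;
  }
  return acknowledged;
}
```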

Now let's talk about starting a subscription. With the Apollo protocol, we have to send the following message:

{"type":"start","id":"1","payload":{"query":"subscription {online_users{id}}"}}

The type is start, and we must provide an id to uniquely identify the subscription. The subscription itself is sent as the query field of the payload object. I think this is confusing, and it comes from the fact that many people in the GraphQL community refer to operations as queries. It gets even more confusing: although the GraphQL operation is sent in a field called query, you must also specify the operationName field if your document contains several named operations.

Unfortunately, the graphql-ws protocol didn't improve on this. I'm guessing it's because its authors want to stay aligned with GraphQL over HTTP, a specification that attempts to unify the way GraphQL is used over HTTP.

Anyway, here's how we would start a subscription with the protocol graphql-ws:

{"type":"subscribe","id":"1","payload":{"query":"subscription {online_users{id}}"}}

The type start was replaced by subscribe; the rest remained unchanged.
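
Since the two messages differ only in their type, a client that speaks both protocols can build them with a single helper. This is a sketch; buildStartMessage and OperationPayload are names I made up for illustration.

```typescript
type Protocol = "subscriptions-transport-ws" | "graphql-ws";

interface OperationPayload {
  // Despite its name, this field carries any operation, including subscriptions.
  query: string;
  // Needed when the document contains several named operations.
  operationName?: string;
  variables?: Record<string, unknown>;
}

// Serializes the message that starts a subscription in the given protocol.
function buildStartMessage(protocol: Protocol, id: string, payload: OperationPayload): string {
  const type = protocol === "graphql-ws" ? "subscribe" : "start";
  return JSON.stringify({ type, id, payload });
}
```

Calling buildStartMessage("subscriptions-transport-ws", "1", { query: "subscription {online_users{id}}" }) produces the Apollo message shown earlier, and passing "graphql-ws" only changes the type to subscribe.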

After we have initiated the connection and started the subscription, we now need to understand how receiving messages works.

For subscription messages, the Apollo protocol uses the type data together with the id of the subscription; the actual data is sent in the payload field.

{"type":"data","id":"1","payload":{"data":{"online_users":[{"id":1},{"id":2}]}}}

The graphql-ws protocol uses the type next for subscription messages; the rest of the message remains unchanged.

{"type":"next","id":"1","payload":{"data":{"online_users":[{"id":1},{"id":2}]}}}

Now that we have started a subscription, we may want to stop it at some point.

The Apollo protocol uses the type stop for this. If the client wants to stop a subscription, it sends a stop message with the id of the subscription.

{"type":"stop","id":"1"}

The graphql-ws protocol simplified this. Both the client and the server can send a complete message with the id of the subscription to stop it or to notify the other party that it has been stopped.

{"type":"complete","id":"1"}

In the Apollo protocol, on the other hand, the complete message is used only by the server to notify the client that the subscription has been stopped and there is no more data to send.
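
Because the differences are mostly renamed message types, a proxy can translate between the two protocols with simple lookup tables. The sketch below covers only the message types discussed in this post, and the function names are hypothetical. A complete translation would also have to deal with heartbeats and error messages.

```typescript
type WsMessage = { type: string; id?: string; payload?: unknown };

// Client-to-server types: legacy name -> graphql-ws name.
const clientTypes = new Map<string, string>([
  ["start", "subscribe"],
  ["stop", "complete"],
]);

// Server-to-client types: graphql-ws name -> legacy name.
const serverTypes = new Map<string, string>([["next", "data"]]);

// Rewrites a message from a legacy client for a graphql-ws server.
function legacyClientToGraphqlWs(message: WsMessage): WsMessage {
  return { ...message, type: clientTypes.get(message.type) ?? message.type };
}

// Rewrites a message from a graphql-ws server for a legacy client.
function graphqlWsServerToLegacy(message: WsMessage): WsMessage {
  return { ...message, type: serverTypes.get(message.type) ?? message.type };
}
```

Messages such as connection_init, connection_ack, and the server-sent complete pass through unchanged because both protocols use the same names for them.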

This was a quick overview of the differences between the two protocols. But how does the client actually know which protocol to use, or how can the server know which protocol the client is using?

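This is where content negotiation comes into play. When a client initiates a WebSocket connection, it can send a list of supported protocols in the Sec-WebSocket-Protocol header. The server can then choose one of them and send it back in the Sec-WebSocket-Protocol header of the HTTP Upgrade response. Note that the names are easy to mix up: the subprotocol graphql-ws identifies the legacy subscriptions-transport-ws protocol, while the newer graphql-ws library uses the subprotocol graphql-transport-ws.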

This is what such an upgrade request might look like:

GET /graphql HTTP/1.1
Host: localhost:8080
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Sec-WebSocket-Protocol: graphql-ws, graphql-transport-ws

Here's how the server might respond:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: graphql-ws
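
On the server side, the selection step could look like the following sketch. The function name selectSubprotocol is hypothetical. It parses the comma-separated header value, respects the client's preference order, and only picks a protocol the server actually implements.

```typescript
// Picks the first subprotocol offered by the client that the server supports.
// Returns null when there is no overlap, in which case the server should not
// echo a Sec-WebSocket-Protocol header at all.
function selectSubprotocol(
  headerValue: string | undefined,
  supported: readonly string[]
): string | null {
  if (!headerValue) {
    return null;
  }
  const offered = headerValue
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
  for (const candidate of offered) {
    if (supported.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return null;
}
```

For the request above, a server that only implements graphql-transport-ws would skip the first entry and select the second one instead of blindly answering with the first.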

That is the theory. But does this work in practice? The simple answer is no, but I think it's worth looking into this issue in more detail.

GraphQL client and server implementations typically do not support content negotiation. The reason is that for a long time there was only one protocol, so there was no need for negotiations. Now that there are multiple protocols, it is too late to add support for content negotiation to existing implementations.

This means that even if the client sends a list of supported protocols, the server can simply ignore it and use the protocol it supports. Or, even worse, the server might choose the first protocol in the list even though it doesn't support it, and then act as if the second protocol was selected.

So you need to somehow “identify” the client or the server to understand which protocol it supports. Another option is to “just try” and see which protocol works. It's not ideal, but you have to work with it.
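
One way to “identify” a server is to look at the message types it sends, because some of them exist in only one of the two protocols. The following is a heuristic sketch with hypothetical names, not a reliable detection mechanism: a server that has sent nothing but connection_ack cannot be classified yet.

```typescript
type Protocol = "subscriptions-transport-ws" | "graphql-ws";

// Message types discussed in this post that appear in only one of the protocols.
const LEGACY_ONLY = ["ka", "data"];
const GRAPHQL_WS_ONLY = ["ping", "pong", "next"];

// Guesses the protocol from the message types observed so far.
function guessProtocol(observedTypes: readonly string[]): Protocol | "unknown" {
  for (const type of observedTypes) {
    if (LEGACY_ONLY.indexOf(type) !== -1) {
      return "subscriptions-transport-ws";
    }
    if (GRAPHQL_WS_ONLY.indexOf(type) !== -1) {
      return "graphql-ws";
    }
  }
  return "unknown";
}
```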

It would be nice if we had something like an OPTIONS query for GraphQL servers so that the client and server could learn about each other to choose the right protocol. But we'll come back to this later.

For now, let's summarize the full flow of the two protocols. Let's start with graphql-ws.

C: {"type": "connection_init"}
S: {"type": "connection_ack"}
S: {"type": "ping"}
C: {"type": "pong"}
C: {"type": "subscribe","id":"1","payload":{"query":"subscription {online_users{id}}"}}
S: {"type": "next","id":"1","payload":{"data":{"online_users":[{"id":1},{"id":2}]}}}
C: {"type": "complete","id":"1"}

For comparison, here is the flow of subscriptions-transport-ws:

C: {"type": "connection_init"}
S: {"type": "connection_ack"}
S: {"type": "ka"}
C: {"type": "start","id":"1","payload":{"query":"subscription {online_users{id}}"}}
S: {"type": "data","id":"1","payload":{"data":{"online_users":[{"id":1},{"id":2}]}}}
C: {"type": "stop","id":"1"}

Multiplexing GraphQL Subscriptions via WebSocket

What's great about these two protocols is that they both support multiplexing multiple subscriptions over a single WebSocket connection. This means that we can send multiple subscriptions over the same connection and receive multiple subscription messages on the same connection. At the same time, this is also a huge disadvantage because multiplexing is implemented at the application level.

When you implement a GraphQL server or client that uses WebSockets, you have to implement multiplexing yourself. Wouldn't it be much better if the transport layer did this? It turns out there is a protocol that does just that.
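
To give an idea of what implementing multiplexing yourself means on the client side, here is a minimal sketch of a demultiplexer. The class and method names are my own. It hands out subscription ids and routes every incoming frame from the shared connection to the handler registered for that id.

```typescript
type ServerMessage = { type: string; id?: string; payload?: unknown };
type Handler = (payload: unknown) => void;

class SubscriptionDemux {
  private handlers = new Map<string, Handler>();
  private nextId = 1;

  // Registers a handler and returns the id to use in the start/subscribe message.
  register(handler: Handler): string {
    const id = String(this.nextId++);
    this.handlers.set(id, handler);
    return id;
  }

  // Routes one raw frame from the shared socket to the matching subscription.
  dispatch(raw: string): void {
    const message = JSON.parse(raw) as ServerMessage;
    if (message.id === undefined) {
      return; // connection-level message such as connection_ack, ka or ping
    }
    const handler = this.handlers.get(message.id);
    if (!handler) {
      return; // unknown or already stopped subscription
    }
    if (message.type === "data" || message.type === "next") {
      handler(message.payload);
    } else if (message.type === "complete") {
      this.handlers.delete(message.id);
    }
  }
}
```

A server or gateway needs the mirror image of this bookkeeping, plus cleanup of all registered subscriptions when the connection drops.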

GraphQL via Server-Sent Events (SSE)

Server-Sent Events (SSE) is a transport layer protocol that allows a client to receive events from a server. It is a very simple protocol that is built on top of HTTP. Together with HTTP/2 and HTTP/3, it is one of the most efficient protocols for sending events from a server to a client. Most importantly, combined with HTTP/2 or HTTP/3, it solves the problem of multiplexing multiple subscriptions over a single connection at the transport layer. This means that the application layer no longer has to worry about multiplexing.

Let's see how the protocol works by looking at an implementation from GraphQL Yoga:

curl -N -H "accept:text/event-stream" "http://localhost:4000/graphql?query=subscription%20%7B%0A%20%20countdown%28from%3A%205%29%0A%7D"

data: {"data":{"countdown":5}}

data: {"data":{"countdown":4}}

data: {"data":{"countdown":3}}

data: {"data":{"countdown":2}}

data: {"data":{"countdown":1}}

data: {"data":{"countdown":0}}

It's no coincidence that we use curl here. The Server-Sent Events Protocol is a transport layer protocol that is built on top of HTTP. It is so simple that it can be used with any HTTP client that supports streaming, such as curl. The GraphQL subscription is sent as a URL-encoded query parameter.
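
The two client-side steps, encoding the subscription into the URL and reading the data lines from the response, can be sketched as follows. The function names are hypothetical. The parser expects a complete response body and only handles data fields; a real client would consume the stream incrementally and would also have to handle the other SSE fields.

```typescript
// Builds the request URL with the subscription as a URL-encoded query parameter.
function buildSubscriptionUrl(endpoint: string, query: string): string {
  return `${endpoint}?query=${encodeURIComponent(query)}`;
}

// Extracts the JSON payloads from a text/event-stream body.
function parseSseData(body: string): unknown[] {
  const results: unknown[] = [];
  // Events are separated by a blank line.
  for (const event of body.split(/\r?\n\r?\n/)) {
    const dataLines = event
      .split(/\r?\n/)
      .filter((line) => line.slice(0, 5) === "data:")
      // Drop the field name and the single optional space after the colon.
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length > 0) {
      // Multiple data lines of one event are joined with newlines.
      results.push(JSON.parse(dataLines.join("\n")));
    }
  }
  return results;
}
```

Fed with the curl output above, parseSseData returns one parsed object per countdown message.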

The subscription begins when the client connects to the server and ends when the client or server closes the connection. With HTTP/2 and HTTP/3, the same underlying connection (TCP for HTTP/2, QUIC for HTTP/3) can be used for multiple subscriptions. This is multiplexing at the transport level.

If a client does not support HTTP/2, it can still use HTTP/1.1 chunked encoding as a fallback.

In fact, this protocol is so simple that it doesn't even need to be explained.

Proxying GraphQL Subscriptions via SSE Gateway

As we just showed, the Server-Sent Events approach is the simplest approach. That's why we chose it for WunderGraph as the primary way to deliver subscriptions and live queries.

But how can you combine multiple GraphQL servers with different subscription protocols under one API? This will be the last part of this post…

Multiplexing multiple GraphQL subscriptions over a single WebSocket connection

We previously discussed how WebSocket protocols support multiplexing multiple subscriptions over a single WebSocket connection. This makes sense for the client, but gets a little more complicated when used in a proxy/API Gateway.

We can't just use the same WebSocket connection for all subscriptions because we need to handle authentication and authorization for each subscription.

Therefore, instead of using a single WebSocket connection for all subscriptions, we must “gather” all subscriptions together, which must be executed in one “secure context”. We do this by hashing all security-related information such as headers, origin, etc. to create a unique identifier for each secure context.

If a WebSocket connection for this hash already exists, we use it. Otherwise, we create a new WebSocket connection for this secure context.
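
The idea can be sketched as a small connection pool keyed by a hash of the security context. This is not the actual WunderGraph implementation. The names are hypothetical, which headers count as security-related is an assumption left to the caller, and the connection type is generic so that the sketch does not depend on a WebSocket library.

```typescript
import { createHash } from "node:crypto";

interface SecurityContext {
  origin: string;
  // Only the headers considered security-related (assumption: chosen by the caller).
  headers: Record<string, string>;
}

// Derives a stable key. Header names are lowercased and sorted so that
// casing and ordering do not produce different keys.
function secureContextKey(ctx: SecurityContext): string {
  const canonicalHeaders = Object.keys(ctx.headers)
    .map((name) => [name.toLowerCase(), ctx.headers[name]] as const)
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  const canonical = JSON.stringify({ origin: ctx.origin, headers: canonicalHeaders });
  return createHash("sha256").update(canonical).digest("hex");
}

// Keeps one upstream connection per secure context.
class ConnectionPool<C> {
  private readonly connections = new Map<string, C>();
  private readonly connect: (ctx: SecurityContext) => C;

  constructor(connect: (ctx: SecurityContext) => C) {
    this.connect = connect;
  }

  // Reuses the connection for this context or opens a new one.
  get(ctx: SecurityContext): C {
    const key = secureContextKey(ctx);
    const existing = this.connections.get(key);
    if (existing !== undefined) {
      return existing;
    }
    const created = this.connect(ctx);
    this.connections.set(key, created);
    return created;
  }
}
```

Two clients with identical credentials and origin share one upstream connection, while a different Authorization value results in a separate one.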

Authentication for GraphQL subscriptions via WebSockets

Some GraphQL APIs, such as those from Reddit, expect the client to send an Authorization header with the WebSocket connection. This is a bit problematic because browsers can't send custom headers with WebSocket upgrade requests; the browser API simply doesn't support it.

So how does Reddit handle this? For example, go to reddit.com/r/graphql and open developer tools. If you filter connections by websocket (“ws”) you should see the WebSocket connection to wss://gql-realtime.reddit.com/query.

If you look at the first message you will see that it is connection_init with some special content:

{"type":"connection_init","payload":{"Authorization":"Bearer XXX-Redacted-XXX"}}

The client sends an “authorization header” as part of the payload of the connection_init message. We were wondering how we could implement this without knowing what payload you would like to send with the connection_init message. Reddit sends a Bearer token in the Authorization field, but you may want to send some other information.

So we decided to let our users define a custom hook that can change the payload of the connection_init message any way they want.

Here's an example:

// wundergraph.server.ts
export default configureWunderGraphServer<HooksConfig, InternalClient>(() => ({
  hooks: {
    global: {
      wsTransport: {
        onConnectionInit: {
          // counter is the id of the introspected api (data source id), defined in the wundergraph.config.ts
          enableForDataSources: ['counter'],
          hook: async (hook) => {
            let token = hook.clientRequest.headers.get('Authorization') || ''
            // we can have a different logic for each data source
            if (hook.dataSourceId === 'counter') {
              token = 'secret'
            }
            return {
              // this payload will be passed to the ws `connection_init` message payload
              payload: {
                Authorization: token,
              },
            }
          },
        },
      },
    },
  },
  graphqlServers: [],
}))

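This hook takes the Authorization header from the client request (SSE) and inserts it into the payload of the connection_init message. For the counter data source, the example replaces it with a static value to show that each data source can have its own logic.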

This not only simplifies authentication for WebSocket subscriptions, but also makes the implementation much more secure.

The Reddit implementation provides the client with a Bearer token. This means that the JavaScript client in the browser has access to the Bearer token. This token may be leaked or may be accessible to malicious JavaScript code that has been injected into the page.

The SSE implementation is different. We do not disclose any tokens to the client. Instead, the user's credentials are stored in an encrypted, HTTP-only cookie that JavaScript running in the page cannot read.

Manipulating/filtering GraphQL subscription messages

Another problem you may encounter is the desire to manipulate/filter messages that are sent to the client. You may want to integrate a third party GraphQL API, and before sending messages to the client, you will want to filter out some fields that contain sensitive information.

We also implemented a hook for this:

// wundergraph.server.ts
export default configureWunderGraphServer<HooksConfig, InternalClient>(() => ({
  hooks: {
    global: {},
    queries: {},
    mutations: {},
    subscriptions: {
      Ws: {
        mutatingPreResolve: async (hook) => {
          // here we modify the input before request is sent to the data source
          hook.input.from = 7
          return hook.input
        },
        postResolve: async (hook) => {
          // here we log the response we got from the ws server (not the modified one)
          hook.log.info(`postResolve hook: ${hook.response.data!.ws_countdown}`)
        },
        mutatingPostResolve: async (hook) => {
          // here we modify the response before it gets sent to the client
          let count = hook.response.data!.ws_countdown!
          count++
          hook.response.data!.ws_countdown = count
          return hook.response
        },
        preResolve: async (hook) => {
          // here we log the request input
          hook.log.info(
            `preResolve hook input, counter starts from: ${hook.input.from}`
          )
        },
      },
    },
  },
}))

There are four hooks in your toolkit that allow you to inspect and manipulate the subscription input before it is sent to the data source, and each response before it is sent to the client.

The most interesting hook may be mutatingPostResolve, as it allows you to filter and manipulate the response, which we talked about earlier.
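
As an illustration of the filtering use case, here is a sketch of a helper you might call from such a hook. The stripFields function and the field names in the usage note are hypothetical and not part of the WunderGraph API.

```typescript
// Recursively removes the blocked field names from a JSON-like value
// and returns a cleaned copy without modifying the input.
function stripFields(value: unknown, blocked: readonly string[]): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => stripFields(item, blocked));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      if (blocked.indexOf(key) === -1) {
        result[key] = stripFields(source[key], blocked);
      }
    }
    return result;
  }
  return value;
}
```

Applied to a response such as {"data":{"user":{"id":1,"email":"a@b.c"}}} with ["email"] as the blocked list, the result contains only the id.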

Proxying GraphQL subscriptions to federated GraphQL APIs (Apollo Federation/Supergraph/Subgraph)

Proxying GraphQL subscriptions to federated GraphQL APIs adds a whole new layer of complexity to the problem. You need to start a subscription to the root field on one of the subgraphs, and then “merge” the response from one or more subgraphs into one message.

If you're interested in seeing an example of how this works, check out the Apollo Federation example in our monorepository.

I'll do a more detailed description of this topic in the future, but for now let me give you a quick overview of how it works.

We break a GraphQL federated subscription into multiple operations, one subscription for the root field and one or more queries for the rest of the response tree.

We then execute the subscription like any other subscription and switch to “normal” execution mode as soon as a new subscription message arrives from the root field.

This also allows us to use the “magic” field _join to combine a subscription with a REST API or any other data source.

Once you've figured out the part of managing multiple WebSocket connections, the rest is just a matter of combining responses from different data sources, be they federated or non-federated GraphQL APIs, REST APIs, or even gRPC.
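
The split into one root subscription and follow-up queries can be sketched like this. It is a heavily simplified, synchronous illustration with hypothetical names, not our actual execution engine: a real gateway resolves the follow-up queries asynchronously against the other subgraphs and merges nested fields, not just top-level ones.

```typescript
type Entity = Record<string, unknown>;

// Hypothetical follow-up query: resolves fields owned by another data source
// based on the entity delivered by the root subscription.
type FollowUpQuery = (root: Entity) => Entity;

// Called for every message of the root subscription. Runs the follow-up
// queries and shallow-merges their results into one response for the client.
function resolveSubscriptionEvent(root: Entity, followUps: readonly FollowUpQuery[]): Entity {
  let merged: Entity = { ...root };
  for (const query of followUps) {
    merged = { ...merged, ...query(root) };
  }
  return merged;
}
```

Every message from the root field triggers one pass of “normal” query execution, so from the client's point of view the subscription delivers complete, merged responses.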

Examples

This was quite a lot to understand, so let's look at some examples to make this a little more concrete.

WunderGraph as an API gateway in front of Hasura

This example shows how to use WunderGraph in front of Hasura.

WunderGraph with graphql-ws-subscriptions

The next example combines graphql-ws subscriptions with WunderGraph.

WunderGraph with Apollo GraphQL Subscriptions

If you are still using legacy Apollo GraphQL subscriptions, we've got you covered too.

WunderGraph and GraphQL SSE Subscriptions

This example uses the GraphQL SSE subscriptions implementation.

WunderGraph with GraphQL Yoga Subscriptions

One of the most popular GraphQL libraries, GraphQL Yoga, definitely should be on the list.

Example of hooks for WunderGraph subscriptions

Finally, we would like to wrap this up with an example of hooks for WunderGraph subscriptions, demonstrating the available hooks.

Conclusion

As you've learned, understanding and implementing all of the different GraphQL subscription protocols involves some complexity.

I think the GraphQL community really lacks a standardized “GraphQL Server Capability” protocol. Such a protocol would allow the client to quickly determine what capabilities the GraphQL server has and what protocols it supports.

In its current state, it is not always guaranteed that a GraphQL client can automatically determine how to communicate with a GraphQL server. If we want the GraphQL ecosystem to grow, we must set standards so that clients can communicate with GraphQL servers without human intervention.

If you're trying to combine multiple GraphQL APIs under one umbrella, you're likely running into the same issues we ran into. We hope we were able to give you some hints on how to resolve these issues.

And of course, if you're just looking for a pre-built, programmable GraphQL API gateway that handles all the complexity for you, check out the examples above and give WunderGraph a try. It is Open Source (Apache 2.0) and free to use.
