This tutorial shows how to create a multi-room, realtime chat service using WebSockets with a persistent connection for bidirectional communication. With WebSockets, both client and server can push messages to each other without polling the server for updates.
Although you can configure Cloud Run to use session affinity, session affinity is best effort, which means any new request can still be routed to a different instance. As a result, user messages in the chat service need to be synchronized across all instances, not just among the clients connected to a single instance.
Design overview
This sample chat service uses a Memorystore for Redis instance to store and synchronize user messages across all instances. Redis provides a Pub/Sub mechanism (not to be confused with the Cloud Pub/Sub product) that pushes data to subscribed clients connected to any instance, which eliminates HTTP polling for updates.
However, even with push updates, any instance that is spun up only receives new messages pushed to that container. To load prior messages, the message history must be stored in and retrieved from a persistent storage solution. This sample uses Redis's conventional key-value store functionality to cache and retrieve message history.
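For example, a room's history can be kept in a Redis list and read back when a client joins. The following is a minimal sketch assuming the node-redis client; the key format and function names are illustrative, not the sample's exact code:

const { createClient } = require('redis');

// Connect to the Memorystore instance whose host is passed in the REDISHOST
// environment variable (set during deployment later in this tutorial).
// Call redisClient.connect() once at startup before using the client.
const redisClient = createClient({ url: `redis://${process.env.REDISHOST}:6379` });

// Append a message to the room's history list.
async function addMessageToHistory(room, message) {
  await redisClient.rPush(`messages:${room}`, JSON.stringify(message));
}

// Read the full history for a room, for example when a client joins.
async function getRoomHistory(room) {
  const raw = await redisClient.lRange(`messages:${room}`, 0, -1);
  return raw.map((entry) => JSON.parse(entry));
}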
The Redis instance is protected from the internet by a private IP, and access is limited to services running on the same Virtual Private Cloud (VPC) network as the Redis instance; therefore, the Cloud Run service needs a Serverless VPC Access connector to connect to Redis. Learn more about Serverless VPC Access.
Limitations
This tutorial does not show end user authentication or session caching. To learn more about end user authentication, refer to the Cloud Run tutorial for end user authentication.
This tutorial does not implement a database such as Firestore for indefinite storage and retrieval of chat message history.
Additional elements are needed to make this sample service production ready. For example, a Standard Tier Redis instance is recommended because it provides high availability through replication and automatic failover.
Setting up gcloud defaults
To configure gcloud with defaults for your Cloud Run service:
Set your default project:
gcloud config set project PROJECT_ID
Replace PROJECT_ID with the ID of the project you created for this tutorial.
Configure gcloud for your chosen region:
gcloud config set run/region REGION
Replace REGION with the supported Cloud Run region of your choice.
Cloud Run locations
Cloud Run is regional, which means the infrastructure that runs your Cloud Run services is located in a specific region and is managed by Google to be redundantly available across all the zones within that region.
Latency, availability, and durability requirements are the primary factors for selecting the region where your Cloud Run services run. You can generally select the region nearest to your users, but you should also consider the location of the other Google Cloud products that your Cloud Run service uses. Using Google Cloud products together across multiple locations can affect your service's latency as well as its cost.
Cloud Run is available in the following regions:
Subject to Tier 1 pricing:
asia-east1 (Taiwan)
asia-northeast1 (Tokyo)
asia-northeast2 (Osaka)
asia-south1 (Mumbai, India)
europe-north1 (Finland) Low CO2
europe-north2 (Stockholm) Low CO2
europe-southwest1 (Madrid) Low CO2
europe-west1 (Belgium) Low CO2
europe-west4 (Netherlands) Low CO2
europe-west8 (Milan)
europe-west9 (Paris) Low CO2
me-west1 (Tel Aviv)
northamerica-south1 (Mexico)
us-central1 (Iowa) Low CO2
us-east1 (South Carolina)
us-east4 (Northern Virginia)
us-east5 (Columbus)
us-south1 (Dallas) Low CO2
us-west1 (Oregon) Low CO2
Subject to Tier 2 pricing:
africa-south1 (Johannesburg)
asia-east2 (Hong Kong)
asia-northeast3 (Seoul, South Korea)
asia-southeast1 (Singapore)
asia-southeast2 (Jakarta)
asia-south2 (Delhi, India)
australia-southeast1 (Sydney)
australia-southeast2 (Melbourne)
europe-central2 (Warsaw, Poland)
europe-west10 (Berlin)
europe-west12 (Turin)
europe-west2 (London, UK) Low CO2
europe-west3 (Frankfurt, Germany)
europe-west6 (Zurich, Switzerland) Low CO2
me-central1 (Doha)
me-central2 (Dammam)
northamerica-northeast1 (Montreal) Low CO2
northamerica-northeast2 (Toronto) Low CO2
southamerica-east1 (Sao Paulo, Brazil) Low CO2
southamerica-west1 (Santiago, Chile) Low CO2
us-west2 (Los Angeles)
us-west3 (Salt Lake City)
us-west4 (Las Vegas)
If you already created a Cloud Run service, you can view the region in the Cloud Run dashboard in the Google Cloud console.
Retrieving the code sample
To retrieve the code sample for use:
Clone the sample repository to your local machine:
Node.js
git clone https://github.com/GoogleCloudPlatform/nodejs-docs-samples.git
Alternatively, you can download the sample as a zip file and extract it.
Change to the directory that contains the Cloud Run sample code:
Node.js
cd nodejs-docs-samples/run/websockets/
Understanding the code
Socket.io is a library that enables real-time, bidirectional communication between the browser and the server. Although Socket.io is not a WebSocket implementation, it wraps that functionality to provide a simpler API over multiple communication protocols, with additional features such as improved reliability, automatic reconnection, and broadcasting to all clients or a subset of clients.
Client-side integration
The client instantiates a new Socket instance for every connection. Because this sample is server-side rendered, the server URL does not need to be defined. The socket instance can emit and listen for events.
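A minimal client-side sketch follows; the event names sendMessage and message are illustrative and may differ from the sample:

// Served by the same Cloud Run service, so no URL argument is needed.
const socket = io();

// Emit an event to the server.
socket.emit('sendMessage', { room: 'my-room', text: 'Hello!' });

// Listen for events pushed from the server.
socket.on('message', (msg) => {
  console.log(`${msg.user}: ${msg.text}`);
});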
Server-side integration
On the server side, the Socket.io server is initialized and attached to the HTTP server. As on the client side, once the Socket.io server accepts a connection, a socket instance is created for every connected client, which can be used to emit and listen for messages. Socket.io also provides a simple interface for creating "rooms": arbitrary channels that sockets can join and leave.
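A minimal server-side sketch follows; the event and room names are illustrative, and app stands in for your Express (or other) request handler:

const http = require('http');
const { Server } = require('socket.io');

const server = http.createServer(app);
const io = new Server(server);

io.on('connection', (socket) => {
  // A socket instance is created for each connected client.
  socket.on('joinRoom', (room) => {
    socket.join(room); // rooms are arbitrary channels that sockets can join and leave
  });

  socket.on('sendMessage', ({ room, text }) => {
    // Broadcast to every socket in the room.
    io.to(room).emit('message', { text });
  });
});

// Cloud Run provides the port to listen on in the PORT environment variable.
server.listen(process.env.PORT || 8080);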
Socket.io also provides a Redis adapter to broadcast events to all clients regardless of which server is serving the socket. Socket.io only uses Redis's Pub/Sub mechanism and does not store any data.
Socket.io's Redis adapter can reuse the Redis client used to store each room's message history. Each instance creates its own connection to the Redis instance, and even with the large number of instances Cloud Run can create, the total remains well under the 65,000 connections that Redis can support. If you need to support that amount of traffic, also evaluate the throughput of the Serverless VPC Access connector.
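The following sketch shows one way to wire up the adapter, assuming the @socket.io/redis-adapter package and the io server and redisClient from the earlier sketches; the sample may structure this differently:

const { createAdapter } = require('@socket.io/redis-adapter');

// The adapter needs separate publish and subscribe connections, so duplicate
// the existing client that is used to store message history.
const pubClient = redisClient.duplicate();
const subClient = redisClient.duplicate();

Promise.all([pubClient.connect(), subClient.connect()]).then(() => {
  io.adapter(createAdapter(pubClient, subClient));
});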
Reconnection
Cloud Run has a maximum request timeout of 60 minutes, so you need to add reconnection logic to handle timeouts. In some cases, Socket.io automatically attempts to reconnect after disconnection or connection error events. There is no guarantee that the client will reconnect to the same instance.
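For example, the client-side reconnection behavior can be tuned with Socket.io's built-in options; the values shown here are illustrative:

const socket = io({
  reconnection: true,          // enabled by default
  reconnectionAttempts: Infinity,
  reconnectionDelay: 1000,     // wait 1 second before the first retry
  reconnectionDelayMax: 10000, // back off to at most 10 seconds between retries
});

socket.on('disconnect', (reason) => {
  // A disconnect initiated by the server is not retried automatically.
  if (reason === 'io server disconnect') {
    socket.connect();
  }
});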
Instances persist while they have an active connection, until all requests close or time out. Even if you use Cloud Run session affinity, new requests can be load balanced to instances that are already active, which allows other instances to scale in. If you are concerned about large numbers of instances persisting after a spike in traffic, you can lower the maximum timeout value so that unused sockets are cleaned up more frequently.
Shipping the service
Create a Memorystore for Redis instance:
gcloud redis instances create INSTANCE_ID --size=1 --region=REGION
Replace INSTANCE_ID with a name for the instance, such as my-redis-instance, and REGION with the region for all your resources and services, for example, europe-west1.
The instance is automatically allocated an IP range from the default service network range. This tutorial uses 1 GB of memory for the local cache of messages in the Redis instance. Learn more about Determining the initial size of a Memorystore instance for your use case.
Set up a Serverless VPC Access connector:
To connect to your Redis instance, your Cloud Run service needs access to the Redis instance's authorized VPC network.
Every VPC connector requires its own /28 subnet to place connector instances on. This IP range must not overlap with any existing IP address reservations in your VPC network. For example, 10.8.0.0/28 works in most new projects, or you can specify another unused custom IP range such as 10.9.0.0/28. You can see which IP ranges are currently reserved in the Google Cloud console.
gcloud compute networks vpc-access connectors create CONNECTOR_NAME \
  --region REGION \
  --range "10.8.0.0/28"
Replace CONNECTOR_NAME with the name for your connector.
This command creates a connector in the default VPC network, the same network as the Redis instance, with the e2-micro machine size. Increasing the machine size of the connector can improve its throughput but also increases cost. The connector must be in the same region as the Redis instance. Learn more about Configuring Serverless VPC Access.
Define an environment variable with the IP address of your Redis instance's authorized network:
export REDISHOST=$(gcloud redis instances describe INSTANCE_ID --region REGION --format "value(host)")
Create a service account to serve as the service identity. By default, this service account has no privileges other than project membership.
gcloud iam service-accounts create chat-identity

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member=serviceAccount:chat-identity@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/serviceusage.serviceUsageConsumer
Build and deploy the container image to Cloud Run:
gcloud run deploy chat-app --source . \
  --vpc-connector CONNECTOR_NAME \
  --allow-unauthenticated \
  --timeout 3600 \
  --service-account chat-identity \
  --update-env-vars REDISHOST=$REDISHOST
If you are prompted to enable required APIs, respond with y. You only need to do this once for a project. Respond to any other prompts by supplying the platform and region, if you haven't set defaults for these as described in the setup page. Learn more about Deploying from source code.
Trying it out
To try out the complete service:
Navigate your browser to the URL provided by the deployment step above.
Enter your name and a chat room name to sign in.
Send a message to the room!
If you choose to continue developing these services, remember that they have restricted Identity and Access Management (IAM) access to the rest of Google Cloud and will need to be given additional IAM roles to access many other services.