At the heart of the MODE Cloud platform is the routing of commands and events. We need to direct commands from apps to the correct devices standing by at remote locations. At the same time, we need to syndicate events coming from these devices to the relevant listeners (apps and webhooks).

Essentially, we’ve created a kind of message broker for this part of our system. In our implementation, we have two goals:

  • Be able to serve millions of publishers and subscribers.
  • Be able to automatically recover from failures in the underlying hardware.

We looked at a few software components (including Kafka) that we might leverage. In the end, we decided to build the system from scratch and evolve it incrementally toward our stated goals.

Start With a Single Server

In the initial phase, we created a simple message broker that runs on a single machine. Clients (publishers and subscribers) connect to a pool of frontend REST API (web) servers and MQTT servers, which in turn connect to this single backend server.

For example, a device may establish a connection to one of our MQTT servers. The MQTT server then connects to the message broker and starts listening for commands destined for the device. When an app sends a command using our REST API, the web server simply connects to the same message broker and “publishes” the command. The message broker hands the command to the listener (the MQTT server), which then sends it over to the device.
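To make this flow concrete, here is a minimal sketch in Go of the kind of in-memory pub/sub core a single-server broker needs. This is not our production code; the type names, channel buffer size, and drop-on-backpressure behavior are purely illustrative.

    // A sketch only: the real broker handles many more concerns (auth,
    // persistence, backpressure), and these names are illustrative.
    package broker

    import "sync"

    type Message struct {
        Topic   string // e.g. a device ID for commands, or an event stream name
        Payload []byte
    }

    type Broker struct {
        mu   sync.RWMutex
        subs map[string][]chan Message // topic -> subscriber channels
    }

    func New() *Broker {
        return &Broker{subs: make(map[string][]chan Message)}
    }

    // Subscribe registers a listener (e.g. an MQTT server acting on behalf of
    // a device) and returns a channel on which matching messages arrive.
    func (b *Broker) Subscribe(topic string) <-chan Message {
        ch := make(chan Message, 16)
        b.mu.Lock()
        b.subs[topic] = append(b.subs[topic], ch)
        b.mu.Unlock()
        return ch
    }

    // Publish delivers a message (e.g. a command posted via the REST API) to
    // every current subscriber of the topic.
    func (b *Broker) Publish(msg Message) {
        b.mu.RLock()
        defer b.mu.RUnlock()
        for _, ch := range b.subs[msg.Topic] {
            select {
            case ch <- msg:
            default: // drop if a subscriber falls behind; real code would do better
            }
        }
    }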

Break It Into Shards

Obviously, a single message broker server in the backend can only handle a certain number of publishers/subscribers at a time. In the second phase, we scaled the system horizontally by sharding, i.e., deploying the message broker across multiple servers. Each subscriber is assigned to one of the shards by a hashing algorithm.
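For illustration, a hash-based shard assignment can be as simple as the following sketch (the package and function names are hypothetical, not our actual code):

    // Illustrative only: map a subscriber ID to a shard index by hashing.
    package sharding

    import "hash/fnv"

    // ShardFor returns the index of the shard responsible for the given
    // subscriber (e.g. a device ID or an app's listener key).
    func ShardFor(subscriberID string, numShards int) int {
        h := fnv.New32a()
        h.Write([]byte(subscriberID))
        return int(h.Sum32() % uint32(numShards))
    }

Because a failed shard is replaced in place by a spare machine (more on this below), the subscriber-to-shard mapping stays stable even as the underlying hardware changes.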

This is straightforward enough. But there is a wrinkle. We want an efficient way for the frontend web servers and MQTT servers to keep track of the list of shards. We also want this list to be dynamic so that we can easily swap machines in and out. Fortunately, we quickly realized that Etcd would be the perfect solution for this.

We like Etcd for its resilient multi-node cluster architecture. We built such a cluster and stored the directory of shards in it. We updated the web/MQTT servers to query the Etcd cluster to locate which shard to use for a particular subscriber. We further optimized this by caching the directory in the memory of the web/MQTT servers, which keep the data up-to-date by using the “watch” feature of the Etcd protocol.
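The sketch below shows how a frontend server might load and cache the shard directory with the etcd v3 Go client. The key layout ("/shards/<index>" mapping to a broker address) and the type names are assumptions made for this example; Get, Watch, and WithPrefix are standard clientv3 calls.

    // A sketch of a frontend server's shard directory cache.
    package directory

    import (
        "context"
        "sync"
        "time"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    type Directory struct {
        mu     sync.RWMutex
        shards map[string]string // etcd key -> message broker address
    }

    func Load(endpoints []string) (*Directory, error) {
        cli, err := clientv3.New(clientv3.Config{
            Endpoints:   endpoints,
            DialTimeout: 5 * time.Second,
        })
        if err != nil {
            return nil, err
        }

        d := &Directory{shards: make(map[string]string)}

        // Initial load of the whole directory.
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
        resp, err := cli.Get(ctx, "/shards/", clientv3.WithPrefix())
        cancel()
        if err != nil {
            return nil, err
        }
        for _, kv := range resp.Kvs {
            d.shards[string(kv.Key)] = string(kv.Value)
        }

        // Keep the in-memory copy current by watching the same prefix,
        // starting right after the revision we just read.
        watchCh := cli.Watch(context.Background(), "/shards/",
            clientv3.WithPrefix(), clientv3.WithRev(resp.Header.Revision+1))
        go func() {
            for wresp := range watchCh {
                for _, ev := range wresp.Events {
                    d.mu.Lock()
                    switch ev.Type {
                    case clientv3.EventTypePut:
                        d.shards[string(ev.Kv.Key)] = string(ev.Kv.Value)
                    case clientv3.EventTypeDelete:
                        delete(d.shards, string(ev.Kv.Key))
                    }
                    d.mu.Unlock()
                }
            }
        }()

        return d, nil
    }

    // AddrFor returns the cached broker address stored under a shard key.
    func (d *Directory) AddrFor(shardKey string) (string, bool) {
        d.mu.RLock()
        defer d.mu.RUnlock()
        addr, ok := d.shards[shardKey]
        return addr, ok
    }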

Add Automatic Failover

We were pretty happy with our implementation of sharding in conjunction with Etcd. We can connect as many devices and apps as we need, as long as we have a sufficient number of shards at our disposal. So we moved on to tackle our second goal: achieving high availability.

We want to detect failures in any of the shards and swap them out with minimal disruption to the pub/sub activities. We proceeded to create a monitor that periodically checks the health of each shard. We also keep a set of spare machines around, whose health is monitored as well.

When any one of the shards goes down, the monitor will activate one of the spare machines and update the directory of shards. Thanks to our use of Etcd, this can be done easily. The monitor simply writes the new shard assignment to Etcd. All the frontend web/MQTT servers that have been watching for changes in Etcd will soon be updated and start using the new machine. Automatic failover is accomplished.
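Here is a rough sketch of the monitor's per-shard loop. The TCP health check, polling interval, and spare-pool handling are simplified assumptions; the key point is that promoting a spare is just a write to Etcd, which every watching frontend server picks up on its own.

    // A simplified sketch of the failover monitor, not our actual code.
    package monitor

    import (
        "context"
        "net"
        "time"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    // healthy is a stand-in health check; in practice an application-level
    // ping would be more appropriate than a bare TCP dial.
    func healthy(addr string) bool {
        conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
        if err != nil {
            return false
        }
        conn.Close()
        return true
    }

    // watchShard checks one shard periodically and, when it goes down,
    // promotes a spare machine by rewriting that shard's entry in etcd.
    func watchShard(cli *clientv3.Client, shardKey, addr string, spares chan string) {
        for range time.Tick(10 * time.Second) {
            if healthy(addr) {
                continue
            }
            spare := <-spares // take the next available spare machine
            ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
            _, err := cli.Put(ctx, shardKey, spare)
            cancel()
            if err != nil {
                spares <- spare // return the spare and retry on the next tick
                continue
            }
            addr = spare
        }
    }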

Conclusion

With this architecture, we were able to implement a message broker system that is scalable and highly available. A similar approach was also applied to our implementation of the system that queues and executes webhook calls (for our Webhooks Smart Module).