The Couchbase Sync Gateway 2.8 release announced support for enterprise-grade cloud-to-edge data sync. The new inter-Sync Gateway replication technology provides scalable, secure sync between cloud and edge data centers in a distributed cloud environment, catering to the demands of Edge Computing applications.

In this post, we provide an overview of the feature with some examples of how to configure your deployment. For more details, refer to the documentation pages.

First, some use cases …

Use Cases

Distributed cloud deployments, where data storage and processing are distributed and handled closer to the apps, are growing in relevance as applications demand guaranteed high availability, real-time responses, and adherence to data privacy and regulatory restrictions, all while dealing with massive volumes of data. This paradigm of computing is referred to as “Edge Computing”. You can learn more about it in this blog on “Architecting Edge Computing Solutions with Couchbase”.

Here are a few examples of apps that benefit from such a distributed cloud architecture –

  • Retail :
    Large retail outlets can continue to serve their customers even during internet outages by running off their local on-prem servers. Business downtime is not only detrimental to customer experience, it can also have a long-standing impact on reputation. In this case, guaranteed high availability of apps and resiliency are key drivers.
  • Travel :
    Passengers on cruise ships can take advantage of all on-board services even when ships are disconnected from the Internet for days or months. In this case, the on-cruise data center will continue to serve the passengers en voyage. This is another example where guaranteed high availability of apps and resiliency are key drivers.
  • Hospitality :
    Hotel properties can ensure that guests are checked in even when there is an internet outage. The on-property Property Management Systems (PMS) will ensure that guest experience is not compromised. This is another example where guaranteed high availability of apps and resiliency are key drivers.
  • Healthcare :
    Patient Monitoring Systems in hospitals can locally process patient data and take immediate remedial action. Real time data processing and data privacy are key drivers in this case.
  • IoT :
    IoT apps are a key driver of Edge Computing architectures. Apps in this space generate massive volumes of data that need to be analyzed in real time. Transferring all that data to the backend servers imposes a lot of overhead on the network as well as the servers. Moreover, much of the data is ephemeral in nature, and it does not make much sense to transfer it to remote servers just to be processed and discarded. As a specific example in the IIoT space, factories can monitor, collect and analyze sensor data from equipment locally for preventative maintenance. Only aggregated data is sent up to the cloud data center. Real-time data processing and cost savings from reduced bandwidth usage are the main drivers in this case.

Typical Cloud to Edge Deployment

A typical deployment of a distributed cloud architecture using Couchbase is shown below.

How does Couchbase fit in? You have Couchbase Server in the remote cloud data centers, where it is responsible for storing and processing data across all edge data centers. You then have a smaller footprint of Couchbase Server in each of the edge data centers. The servers in the edge data centers will be significantly smaller than those in the cloud data centers, since each edge serves a smaller population of clients. Data local to the edge is processed by the on-prem Couchbase Server cluster.

But what about data movement? In other words, how does data between the cloud and the edge remain in sync? That’s where inter-Sync Gateway replication comes in. For this, you have Sync Gateway deployed at the cloud and edge data centers, where it is responsible for replicating data. You also have to consider that the sync is happening over the internet, which is untrustworthy. So you need to ensure that the data is encrypted and that there are strict security controls in place to ensure authorized access to data. Furthermore, you can have different access control policies deployed at the cloud and at the edge, and can ensure that a compromised edge does not impact the cloud or the other edge data centers.

Deployment Configuration Tip

The Sync Gateway cluster on which the replication is initialized or scheduled is the “Active Cluster” and the remote Sync Gateway cluster that is the target of the replication is the “Passive Cluster”.

If you have multiple replications to be configured between two clusters, it is recommended that you pick one cluster as the active cluster for all your replications. This is true regardless of the direction of your replication: push, pull or push-pull. This configuration makes it simpler to deploy, administer and troubleshoot your replications.

Particularly within the context of cloud-to-edge sync, we envision that the edge will be the active cluster initiating replications to the remote cloud cluster. It is likely that the edge clusters are not accessible over an external network.

Attributes of the Sync Technology

Handling sync at scale over an untrusted network under unreliable network conditions is not an easy challenge. There are several considerations, and here is an overview of how inter-Sync Gateway replication addresses them. Refer to the documentation for details.

Feature | inter-Sync Gateway Replication
Scalability | The number of edge data centers can range from tens to hundreds to thousands. The protocol is capable of scaling to handle that number of edges.
Security | The sync is happening over the internet, which is inherently untrustworthy. All data is encrypted over TLS and there are strict access controls in place to prevent unauthorized access to data. Furthermore, you can have different access control policies deployed at the cloud and at the edge, and can ensure that a compromised edge does not impact the cloud or the other edge data centers.
Network Resiliency | The protocol implements an exponential backoff retry algorithm. The backoff period is configurable.
Efficiency | To optimize network bandwidth usage and reduce transfer costs, the protocol supports delta sync, the ability to sync only the parts of a document that have changed. The sync can operate in either continuous mode or one-shot, on-demand mode, so apps control when to sync data; for instance, they can choose to do so during off-peak hours.
Data Conflicts | Comprehensive conflict resolution strategy. Sync Gateway provides automatic conflict resolution with out-of-box resolvers, and you can define your own conflict resolver. Similar to the way the sync function is defined, you define a JS function as part of the Sync Gateway config file.
Operational Ease | High availability of replications, automatic load balancing that distributes replications uniformly across Sync Gateway nodes, and a REST interface for remote administration and management.
Flexible Topologies | Hierarchical. The number of tiers in the hierarchy can be greater than 1. For instance, a cloud data center can talk to downstream data centers, which in turn can talk to further downstream data centers.
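
To make the "Network Resiliency" row a little more concrete, here is a minimal Python sketch of the general exponential backoff technique. It is not Sync Gateway's actual implementation; the function name, the parameters and the default values are all invented for this illustration. The cap stands in for the configurable backoff period mentioned above.

```python
def backoff_delays(max_attempts, base_seconds=1.0, max_backoff_seconds=300.0):
    """Yield the wait (in seconds) before each reconnect attempt.

    Illustrative only: the wait doubles after every failed attempt and is
    capped at max_backoff_seconds (a stand-in for the configurable period).
    """
    for attempt in range(max_attempts):
        yield min(base_seconds * (2 ** attempt), max_backoff_seconds)


if __name__ == "__main__":
    # Six attempts with a 10 second cap.
    print(list(backoff_delays(6, base_seconds=1.0, max_backoff_seconds=10.0)))
```

The point of the doubling is that an edge that has lost its link stops hammering the cloud endpoint quickly, while the cap keeps the reconnect latency bounded once the link comes back.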

Sample Configurations

In this section, we provide a few examples of typical replicator configurations.

Replications are scoped to a database and can be configured in the Sync Gateway config file and scheduled during launch or they can be initialized via the _replication endpoint at any point after startup.

By default, all nodes participate in replications. This implies that replications configured for a cluster of Sync Gateway nodes are uniformly distributed across all the nodes. A Sync Gateway node can be configured to opt out from participating in the replication using the sgreplicate_enabled config option.

Pull-only one-shot replication with default conflict resolution

In this example, a replication with ID pull-from-target-oneshot is set up to do a one-time pull of documents belonging to the channel channel:storechannel from the stores database at the remote endpoint. The documents are replicated to the local my_local_store database. The replicating user’s credentials are specified via the username and password parameters.

The replication is initially in the stopped state and can be started at a later point via the _replicationStatus endpoint. Conflicts are automatically handled by Sync Gateway using predefined policies.
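
A rough Python sketch of this setup follows. It only builds the replication definition and the admin REST requests; nothing is sent. The host names, admin port, user name and password are placeholders, and the property names and endpoint paths reflect my reading of the 2.8 documentation, so treat them as assumptions and check them against the docs before use.

```python
import json
import urllib.request

# Placeholders for illustration only.
ADMIN_URL = "http://localhost:4985"                      # assumed local admin API
REMOTE_DB_URL = "https://cloud.example.com:4984/stores"  # hypothetical remote endpoint


def pull_oneshot_replication():
    """Replication definition for the one-shot pull described above."""
    return {
        "replication_id": "pull-from-target-oneshot",
        "remote": REMOTE_DB_URL,
        "direction": "pull",
        "continuous": False,                  # one-shot
        "filter": "sync_gateway/bychannel",   # restrict to the listed channels
        "query_params": {"channels": ["channel:storechannel"]},
        "username": "replicator_user",        # placeholder credentials
        "password": "change-me",
        "initial_state": "stopped",           # do not start when scheduled
        "conflict_resolution_type": "default",
    }


def replication_request(local_db, definition):
    """Build (but do not send) the request that creates the replication."""
    url = f"{ADMIN_URL}/{local_db}/_replication/{definition['replication_id']}"
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def start_request(local_db, replication_id):
    """Build (but do not send) the request that starts a stopped replication."""
    url = f"{ADMIN_URL}/{local_db}/_replicationStatus/{replication_id}?action=start"
    return urllib.request.Request(url, method="PUT")


if __name__ == "__main__":
    req = replication_request("my_local_store", pull_oneshot_replication())
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would send it to a running Sync Gateway.
```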

Bi-directional continuous replication with out-of-box conflict resolver

In this example, a replication with ID pushandpull-with-target-continuous is set up to continuously push and pull documents belonging to the channel channel:storechannel between the local my_local_store database and the stores database at the remote endpoint. The replicating user’s credentials are specified via the username and password parameters.

The replication is automatically started when scheduled, which is the default behavior of the initial_state flag. In case of conflicts, the remote side is the winner.
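
As a hedged sketch, the definition could be generated for the config-file route like this. The remote URL and credentials are placeholders, and the property names (including the "remoteWins" value and the nesting under databases and replications) are my understanding of the 2.8 config schema rather than something this post spells out.

```python
import json


def push_pull_continuous_replication(remote_url, username, password):
    """Continuous bi-directional replication where the remote side wins conflicts."""
    return {
        "replication_id": "pushandpull-with-target-continuous",
        "remote": remote_url,
        "direction": "pushAndPull",
        "continuous": True,
        "filter": "sync_gateway/bychannel",
        "query_params": {"channels": ["channel:storechannel"]},
        "username": username,
        "password": password,
        "conflict_resolution_type": "remoteWins",
        # initial_state is omitted on purpose so the default (start when
        # scheduled) applies.
    }


def database_config(local_db, replication):
    """Nest the replication under its database, keyed by replication ID."""
    body = {k: v for k, v in replication.items() if k != "replication_id"}
    return {
        "databases": {
            local_db: {"replications": {replication["replication_id"]: body}}
        }
    }


if __name__ == "__main__":
    rep = push_pull_continuous_replication(
        "https://cloud.example.com:4984/stores",  # hypothetical remote endpoint
        "replicator_user",                        # placeholder credentials
        "change-me",
    )
    print(json.dumps(database_config("my_local_store", rep), indent=2))
```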

Bi-directional continuous replication with custom conflict resolver

This example is identical to the previous case except that we associate a custom conflict resolver with the replicator. Now every time Sync Gateway detects a conflict during replication, the conflict resolver is invoked with the conflicting revisions. The resolver has access to the full document body and metadata, which can then be used for resolving the conflict. Of course, you can choose to return either of the conflicting revisions to implement the equivalent of the “LocalWins” or “RemoteWins” strategy.

BTW, don’t get too attached to the details of what’s going on in the resolver or its accuracy/efficiency. I am sure there are better ways to do this in JavaScript. This is just to demonstrate the concept.
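
In Sync Gateway the resolver itself is a JavaScript function in the config file. The Python below is only a model of the decision logic such a resolver might carry, written so it can be run and tested on its own. The updated_at and tags fields are hypothetical document properties chosen for this sketch.

```python
def local_wins(local_doc, remote_doc):
    """Equivalent of the LocalWins strategy: keep the local revision."""
    return local_doc


def remote_wins(local_doc, remote_doc):
    """Equivalent of the RemoteWins strategy: keep the remote revision."""
    return remote_doc


def latest_update_wins(local_doc, remote_doc):
    """Pick the revision with the newer (hypothetical) updated_at timestamp.

    Assumes ISO-8601 UTC strings in one fixed format, which sort correctly
    as plain text. Ties go to the remote revision.
    """
    if local_doc.get("updated_at", "") > remote_doc.get("updated_at", ""):
        return local_doc
    return remote_doc


def merge_with_tags(local_doc, remote_doc):
    """Return a merged copy: newest revision as the base, tags from both sides."""
    merged = dict(latest_update_wins(local_doc, remote_doc))
    merged["tags"] = sorted(
        set(local_doc.get("tags", [])) | set(remote_doc.get("tags", []))
    )
    return merged


if __name__ == "__main__":
    local = {"_id": "store::1", "updated_at": "2020-09-01T10:00:00Z", "tags": ["sale"]}
    remote = {"_id": "store::1", "updated_at": "2020-09-01T09:00:00Z", "tags": ["new"]}
    print(merge_with_tags(local, remote))
```

Returning one of the inputs unchanged corresponds to picking a winner, while returning a new document corresponds to a merge.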

Of course, there are plenty of other configuration options to choose from that allow you to customize replications to suit your application's needs. You can refer to our documentation for the details.

Monitoring Replications

Once your replications are up and running, you can monitor them via the _replicationStatus endpoint. In 2.8, we also released a new metrics endpoint in Developer Preview mode. This endpoint exports stats in Prometheus format, which makes it easy to monitor with Prometheus and visualize using Grafana. You can learn more about this in an upcoming blog.
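
As a small illustration of what a monitoring script might do with the status output, the Python sketch below flags replications that are in an unexpected state. The sample payload is made up for this example, and the field names (replication_id, status, docs_read, docs_written, error_message) and state values are assumptions about the response shape that should be verified against the documentation.

```python
import json

# Hypothetical payload, shaped like a list of per-replication status objects.
SAMPLE_STATUS = """
[
  {"replication_id": "pull-from-target-oneshot", "status": "stopped",
   "docs_read": 120, "docs_written": 120},
  {"replication_id": "pushandpull-with-target-continuous", "status": "error",
   "error_message": "remote unreachable"}
]
"""


def unhealthy(statuses, healthy_states=("running", "stopped")):
    """Return IDs of replications whose status is not in healthy_states.

    "stopped" is treated as healthy here because a finished one-shot
    replication is expected to end up stopped (an assumption of this sketch).
    """
    return [
        s["replication_id"]
        for s in statuses
        if s.get("status") not in healthy_states
    ]


if __name__ == "__main__":
    print(unhealthy(json.loads(SAMPLE_STATUS)))
```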

What about “SG-Replicate”?

If you have been working with Sync Gateway, you are probably familiar with SG-Replicate, which can be used for replication between Sync Gateway nodes in different clusters. The new version of the protocol, which is WebSocket-based, is redesigned from the ground up to offer a number of enterprise-grade features such as automatic load balancing of replications across participating Sync Gateway nodes, High Availability (HA), built-in automatic conflict resolution with custom conflict resolvers, delta sync support, significant improvements in scalability and performance, and more.

While “SG-Replicate” remains supported in 2.8, it is deprecated, and existing apps should migrate to the new version of the inter-Sync Gateway replication technology.

What Next

Couchbase Sync Gateway’s cloud-to-edge sync solution is secure, scalable and easy to configure and manage. sync is the only peer-to-peer database sync solution that allows clients to directly communicate with each other in disconnected environments.

You can download Sync Gateway and evaluate the functionality for free.

If you want to dive into the details, here’s where you can find more information:
Connect Video with Demo: Using inter-Sync Gateway Replication
Documentation: inter-Sync Gateway Replication
Solutions Page: Edge Computing

The Couchbase Forums are a great place to reach out with questions. Please leave a comment below, or feel free to reach out to me via Twitter or email.

 

Author

Posted by Priya Rajagopal, Senior Director, Product Management

Priya Rajagopal is a Senior Director of Product Management at Couchbase responsible for developer platforms for the cloud and the edge. She has been professionally developing software for over 20 years in several technical and product leadership positions, with 10+ years focused on mobile technologies. As a TISPAN IPTV standards delegate, she was a key contributor to the IPTV standards specifications. She has 22 patents in the areas of networking and platform security.
