MyRocks in the Enterprise with Dropbox

Percona Technical Account Managers get the privilege of working with some of our largest enterprise clients day in and day out. As such, we get to really focus on how best to leverage our technology to generate measurable benefits for our users. While it is fun to “nerd out” and always strive to use the latest and greatest, we need to stay focused on demonstrating business value and a genuine need. Over the past few months, I’ve been working with one of my larger clients, Dropbox, along with our professional services team to validate the use of Percona Server for MySQL with the MyRocks storage engine over a large portion of their MySQL infrastructure.

Please note – this is not meant to be a deep dive into the technical details around MyRocks or the implementation.  Rather, it is meant to show how we determined the need, potential solution, and the planning that has started us down this path.  Look for a more detailed case study in the coming months as we hope to push this solution forward!

The Problem

In a single word, space.  When a client reaches a certain scale, space becomes a real concern.  With 10-20 servers, having an extra index or choosing the wrong data type can be noticeable.  However, with 10,000 servers, there is much less margin for error.  The challenge with this client is that the schema has already been sharded and optimized. Even with that, the aggregate dataset is still on the petabyte scale.  Think about what that means:

  • Large storage footprint per server (2+TB)
  • Replication multiplies that footprint for each “cluster”
  • Multiple backups per cluster result in a huge (PB+) backup footprint
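
To make the footprint concrete, a query like the following is a quick way to see where the space is going on a given server. This is a minimal sketch; the sizes reported by information_schema are estimates for some engines, and the schema names will of course vary:

    SELECT table_schema,
           ROUND(SUM(data_length + index_length) / POW(1024, 3), 1) AS size_gb
    FROM information_schema.tables
    GROUP BY table_schema
    ORDER BY size_gb DESC;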

There are some additional challenges created at this scale, too.  To mitigate the risk of filling a disk, there is a soft cap of 75 percent full on each server.  When a cluster hits this level, the shard is split to give each half some additional runway.  While this is great operationally, splitting shards results in doubling the cost for the cluster.  Again, on a scale of 1000s of servers, a “split” means an increase in cost in the hundreds of thousands of dollars.  This is not trivial and puts pressure on the team to delay splits as long as possible while maintaining high availability, durability, and performance.

MyRocks Use Case

After much discussion and understanding that space (and, in turn, cost) is a major business driver, we decided to investigate a change to MyRocks as the storage engine. While standard InnoDB compression provided a small band-aid, the thinking was that MyRocks would give substantial savings. At a very high level, MyRocks is an LSM-tree-based storage engine built on top of RocksDB. This type of storage has numerous advantages over traditional B+Tree storage engines (like InnoDB), including a smaller disk footprint and reduced write amplification. This can translate to business value in several ways (a minimal conversion sketch follows the list below), including:

  • Decreased cloud/HDFS storage cost for backups
  • Reduced write amplification results in longer disk lifetimes, reducing fleet turnover
  • Decreased instance space requires fewer splits and increases runway
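
As a rough illustration of what a conversion looks like, here is a minimal sketch, assuming Percona Server with the MyRocks plugin enabled (for example, via ps-admin --enable-rocksdb). The table name is illustrative, and any conversion should be tested on a replica first:

    -- Confirm ROCKSDB is listed as an available engine
    SHOW ENGINES;

    -- Bulk-load mode speeds up the rebuild; use it only while the
    -- table is not receiving writes
    SET SESSION rocksdb_bulk_load = 1;
    ALTER TABLE mydb.events ENGINE = ROCKSDB;
    SET SESSION rocksdb_bulk_load = 0;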

After several rounds of discussion around the benefits and any blockers, we decided to begin testing the feasibility of converting from InnoDB to MyRocks.

Initial Testing

While the use case seemed like a solid fit, extensive testing is always needed.  So far, the initial testing looks promising.  We’ve seen a large reduction in space combined with an increase in performance.  One example cluster saw nearly a 75 percent reduction in space, to the point that we were able to test running two replica instances on a single server in parallel.

What makes this interesting is that the test cluster in question has replicas that periodically show lag and generally run close to their replication throughput limit. In contrast, the two MyRocks replicas showed no lag during the same test period, despite running two instances on one physical server.
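
One simple way to watch for the kind of lag described above is the standard replica status output; this is a minimal sketch (on MySQL 8.0.22 and later, the equivalent statement is SHOW REPLICA STATUS):

    -- Seconds_Behind_Master is coarse but useful for spotting sustained lag;
    -- comparing Retrieved_Gtid_Set with Executed_Gtid_Set shows applier backlog
    SHOW SLAVE STATUS\G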

While this isn’t something that would be done in production, it was impressive to see double the workload operating so efficiently on the same hardware. I/O and CPU utilization were both noticeably lower than on the single replica running InnoDB. This shows the potential for increased server longevity and the less frequent splits that we were hoping to see. If these numbers and this performance were to hold in production, we could see savings on the order of millions of dollars across the entire fleet.

Note – this early testing has been limited to replicas only, so we can’t yet validate this performance at the write concurrency of a production primary. These results were achieved with only four replica applier threads, so contention has been minor.
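
For reference, a common way to configure four parallel applier threads looks like the sketch below. The post does not state how the test replicas were configured, so treat this purely as an illustration (variable names as in MySQL/Percona Server 5.7 and early 8.0):

    -- Changing the parallelization policy requires a stopped applier
    STOP SLAVE SQL_THREAD;
    SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
    SET GLOBAL slave_parallel_workers = 4;
    START SLAVE SQL_THREAD;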

Looking Forward

So far, our limited testing has shown real promise. There are still some blockers to overcome, but the early results are encouraging. Currently, the biggest technical challenges in this project include:

  • MyRocks currently lacks pt-table-checksum support for verifying replica consistency (see the interim sketch after this list)
  • Modifying and validating existing backup/restore/clone automation
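
Until pt-table-checksum support lands, one coarse stopgap for spot-checking consistency is MySQL’s engine-agnostic CHECKSUM TABLE, run on the primary and each replica and compared by hand. Note that it scans (and read-locks) the tables, so it is only practical for small tables or quiet periods; the table names here are illustrative:

    CHECKSUM TABLE mydb.users, mydb.events EXTENDED;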

Both fixes have been identified and are on the project roadmap, so once we are able to fully validate this solution and clear the remaining challenges in the coming months, look for a more detailed case study. While not every organization needs to make such a drastic change, this use case is becoming more common at the enterprise level as data sets continue to explode.

Want to learn more? Check out the Percona MyRocks Introduction page on our website!
