explicitClick to confirm you are 18+

Denormalization Angels and Demons

Mark HardingDec 14, 2018, 10:34:33 AM

Minds has always had scaling at its forefront, particularly horizontal scaling where machines can be spun up and down as required with minimal effort. Back in 2012, when the Minds HQ was briefly relocated to Rockaway Beach (more about my luxurious digs in future blog), we decided to take the leap into the world of NoSQL, departing from both Elgg and MySQL.

After a lot of research we decided to go with Cassandra, a highly scalable and decentralized database, over alternatives such as MongoDB or CouchDB. The most appealing aspect of Cassandra, besides its incredible performance, was the fact that it could easily scale out with a few simple commands. Over the past 6 years Cassandra has proven an invaluable resource and a cornerstone of our infrastructure. Everything is stored here, posts, users, indexes, etc.

The key difference between a column based NoSQL database such as Cassandra and an RDBMS database such as MySQL is the lack of relationships between the data. In other words, you can’t simply ask the database to return a set of posts that a user is subscribed to; instead we have to store the data as we are going to query it in the future. This approach has both benefits and tradeoffs, with the plus points being improved read performance and the downsides being less fluid, more restrictive queries.

In order to get around the lack of relational queries in Cassandra, we ‘denormalize’ a lot of the data and maintain user-centric indexes for feeds. Denormalization is the process of replicating/duplicating data. For example, in order to improve the performance of reading a post, we store a denormalized version of the owner on the post. You may notice that if you change your name that your old posts persist the old user information; here we arrive at stale data.

Denormalization of posts

About a year ago we noticed that we were over-denormalizing. Perhaps we got a little bit too carried away! Our databases quickly passed over 6TB in storage, with most of this being duplicated data. A particularly bad culprit of this was the notifications system.

Notifications structure.. too much data!

We were denormalizing the sender entity, receiver entity, post entity and container entity on every single notification, and particularly for group notifications we were sending out 100’s of thousands of the same notification to different users multiple times per day. Luckily notifications are largely ephemeral (temporary) so we could purge a lot of this data, but as more notifications are delivered, this was not a scalable solution.

An additional issue we encountered was the way that Cassandra stores its data. Cassandra stores data in immutable files called SSTables. When we add new data to the database, new SSTables are created and regularly compacted (merged/squashed together) to create new ones. When you delete rows or columns, a tombstone is created and the data remains in these SSTables until a compaction job is executed. You are correct in thinking that tombstones don’t sound like a good thing! When your tables get to a certain size you run into issues when you are not able to run the compaction jobs quick enough to efficiently store or purge your data.

Our original approach of throwing everything into our Key-Value ‘entities_by_time’ table reached its peak and we solved this issue by separating out, and partitioning a lot of the data into new tables. We also made use of a new compaction strategy called LeveledCompaction, rather than SizeTieredCompaction, which is particularly suited to updates and regular compaction jobs.

Moving forwards, we are still heavily denormalizing our data for performance, but we only denormalize where we would have previously performed joins in a relational system. We still store the owners on posts, but it is not necessary to denormalize the post on every users newsfeeds or notifications, instead we can cache this information on read and simply store a guid. Additionally in we are exploring new database such as CockroachDB (a NewSQL solution) along with DAT, a peer-to-peer protocol powering our fully decentralized prototype called Nomad, but more on those in later blogs...

If you are passionate about tackling these kind of issues and creating scalable solutions, we are hiring!