Write-Through Cacheing is an Essential Part of a Healthy Scaling Strategy
The debate around “Does Rails scale?” I’ll leave to the armchair architects. But one thing I know is that having one 100% Beef Database doesn’t scale.
With replication you can scale your database. With Master-Slave replication, you can have any number of read-slaves for one master; you perform reads (SELECT queries) on the slaves and issue writes (INSERT/UPDATE/DELETEs) on the master. With this strategy your write capacity wont scale but your read capacity can scale almost indefinitely. And anyway, since write and read actions have different performance characteristics, you’ll get the most out of your beefy machines if you’re into this role-play. By the way, the best Rails plugin for this is masochism.
Replication Lag
Still, Replication has its problems. The latency of information propagation from the master to the slave (and from slave to slave if you have a tree topology) can lead to data inconsistency.
Consider a user who creates a new account on your web site. A new record is INSERTed into the `users` table on the master database. On the next page load their record SELECTed from a random slave. But the information might not yet be propagated to the database slave! So the user gets an HTTP 500 error.
Typically, it takes only a fraction of a second for information to propagate. But it isn’t unusual to measure replication lag in seconds, minutes, or hours because of bugs in mysql, partial outages, and expensive queries. This is an every-day occurrence and you must plan accordingly.
Fortunately there is a solution, and it is write-through caching. At the same time as writing to the master database, write “through” a cache layer such as memcached. If for all reads you first try to read from cache, you will always find data: data that hasn’t yet propagated from database to database is newest and therefore most likely to be in the cache.
Easier said, harder done. You’ll have smear your code with cream-cheese to get this to work. Unless there were some magical plugin that solved this problem for you…
Testing for Replication Lag-induced defects
But you can only obtain code cream cheese if you can even figure out what part of the site needs cacheing. You’d be crazy to test, live in production, bugs caused by non-deterministic replication lag. So you need to simulate replication lag somehow.
One technique that is almost foolproof is to use a comprehensive Selenium test-suite. Create a new Rails environment with your master and slave as two local databases. Load your fixtures into both databases. Don’t establish any replication relationship between the two databases — voila! you’ve just simulated infinite replication lag. Now run your Selenium suite. I’m sure it won’t be pretty, 500s upon 500s, but that’s what will happen to your users unless you implement write-through cacheing.
Other Scaling Strategies
Now, there are other database scaling strategies than master-slave replication. Master-Master replication can skirt replication lag if you use sticky sessions. But MySQL support for master-master replication is arguably not production-ready, and in any case such replication, especially combined with sticky sessions, has worse failure characteristics than master-slave.
The unicorn strategy of “sharding”/Database partitioning does not obviate replication strategies; usually each partition has replication strategies if only for fault tolerance.
My fantasy scaling strategy involves skipping the database altogether. Rather than synchronously write to a database, write to a message queue. Have a daemon read from the queue and write to your (partitioned) (replicated) database at its leisure. In the event of a write-database failure, the daemon gets backed up but there is no downtime since your site is completely decoupled from the write database. If you’re not concerned with in-order delivery, message queues scale effortlessly. (Please in no way consider this an endorsement of either ActiveMQ or RabbitMQ). In any case, writing to the database asynchronously from a daemon is only possible with sufficient write-through cacheing.
This is why I say write-through cacheing is critical to scaling. And in the next few weeks I’ll bring forth some code that shows how to do it right.
Great article. I’d love to hear more on your thoughts about using something like ActiveResource and avoiding direct database calls altogether.
Nick Gerakines
November 24, 2008 at 7:53 am
Nick, I >50% understood this. I’m anticipating some of your posts will get detailed & technical, but know you have some readers who appreciate the occasional high-level entry like this oe.
Seth Kenvin
November 24, 2008 at 12:14 pm
Hi Nick,
Apologies for what is a most unnecessarily pedantic comment on your excellent post.
First off, you say “The latency of information propagation from the master to the slave [..] can lead to data inconsistency.” I assume you mean ‘coherency’ and not ‘consistency’? (ref – http://docs.cray.com/books/S-2315-50/html-S-2315-50/zfixedgtci5niw.html)
In either case, my main comment is that your argument at face value applies to both write-behind and write-through caching.
You say that: “there is a solution, and it is write-through caching”. You then say: “If for all reads you first try to read from cache, you will always find data: data that hasn’t yet propagated [..] is newest”. My extremely pedantic point is that this statement is also true of write-behind caching.
I would define write-through caching to be a synchronous update of two stores: the in memory cache; and the ‘backing’ data store on disk. The latter is usually a database but could also be a logfile (eg if you were caching to a persistent queue). I would define write-behind caching to be an update to the in memory cache, logically followed by an asynchronous update to the database or other backing store on disk.
So, in both cases, write-through and write-behind, if you always read the cache first, then you read the ‘latest’ update. (Caveat – this of course assumes no other writer can update the database without writing to the cache either first, or synchronously) This means your strategy works equally well for write-behind and write-through caching.
Also, you say “If for all reads you first try to read from cache, you will always find data: data that hasn’t yet propagated from database to database is newest and therefore most likely to be in the cache”. This is only true if the cache replicates to other cache copies faster than the database replicates to database copies. Yes, that is usually true.
But it is worth – again pedantically – spelling out that cache-cache replication can be write-through or write-behind. Strictly speaking your strategy works in either of two cases: (1) where all updates are write-behind, and the reader gets data from the most recently updated cache; and (2) where cache-cache replication is synchronous write-through and the reader gets data from any cache. In both cases cache-database replication can be either write-behind or write-through.
Of course none of this matters one jot if you don’t require the ‘latest’ write. The only case where logically you must have the latest write is if your next piece of work depends on it, and in many systems this means the same as “your process performed the latest update”. Your use case of course falls into this category: “Consider a user who creates a new account on your web site. A new record is INSERTed into the `users` table on the master database. On the next page load their record SELECTed from a random slave”.
The retort ‘yes, I know all this but did not want to bore my readers’ could be safely inserted here
In any case – I’d be very interested in your views on how far you can take lazy replication either at cache or at database level. You allude to this when you say that “Rather than synchronously write to a database, write to a message queue. Have a daemon read from the queue and write to your (partitioned) (replicated) database at its leisure. In the event of a write-database failure, the daemon gets backed up but there is no downtime since your site is completely decoupled from the write database.”
Given that either the queue, or the daemon, provides a ‘write-through’ checkpoint to disk, from which your system can safely reply to the end user, this is of course ‘safe’ write-behind writ large
… Or put another way – use the MQ redo log instead of the DB redo log …
Finally, you say that: “If you’re not concerned with in-order delivery, message queues scale effortlessly … In any case, writing to the database asynchronously from a daemon is only possible with sufficient write-through cacheing”. I am not sure if either of these statements are entirely correct, but again I am being pedantic.
Cheers,
alexis
alexis
November 28, 2008 at 3:32 pm
Hi Alexis,
I appreciate your comments and corrections. I agree with everything you said, and especially the point that write-behind cacheing “writ large” is what I’m proposing. I wonder if I’ll be able to implement this in production–I certainly hope so–but it’s not priority #1 right now. I think it provides an unparalleled level of fault tolerance, which is why I’m so interested in it. Pne of the interesting requirements is having an globally unique id generator (which must be incrementing for my site since API clients use that for ordering). Since my site uses artificial primary keys all over the place, and this cannot change any time soon if ever, I need to generate pks for objects even if I don’t write to the db. There’s probably a simple solution involving hostname, pid, and timestamp, but it’s worth thinking about. Another issue is commiting out-of-order writes to the database. I claim I don’t care if the message queue guarantees ordering. This is true but two updates to the same object must then be idempotent. This is actually not so hard if you are storing your data in normal form (i.e., no denormalized aggregates), and keep a version number in all the tables. Just throw away messages with a bad version number. Timestamps must be part of the enqueued message and not set on INSERT.
So, onto another thing you mentioned. One place where I must be precise is that I was describing an environment without cache *replication*. The environment I work in uses partitioning but not replication (it’s memcached). This is important for addressing this point: “you must have the latest write is if your next piece of work depends on it” — this issue is actually VERY common in web application since it is the essence of a user transaction (a user performs a write then a read, either as the response to the write request or after clicking around a few pages later). You mention that likely “your process performed the latest update” but this is precisely not the case in a typical share-nothing least-connection load-balanced web architecture. So I skirt this issue altogether by not having cache-replication and by partitioning the cache.
nkallen
December 9, 2008 at 7:06 am
[...] In Uncategorized on December 11, 2008 at 6:25 am Pre-requisite: please read my article on Write-through caching to understand why this is [...]
Introducing Cache Money « Magic Scaling Sprinkles
December 11, 2008 at 6:25 am
[...] Write-Through Cacheing is an Essential Part of a Healthy Scaling Strategy [...]
Ennuyer.net » Blog Archive » 2009-02-08- Today’s Ruby/Rails Reading
February 8, 2009 at 11:13 am
Note that this approach does not deal with failures and can lose updates that have been read by others.
Yang
February 10, 2010 at 9:21 pm