nkallen

The Meaning of Information Technology

In Uncategorized on November 2, 2009 at 2:00 am

The first commercial computer was the Lyons Electronic Office I and was used in 1951 to perform vast calculations pertaining to the fabrication and consumption of biscuits. You see, after the war, J. Lyons & Co., a popular chain of British tea shops, was confronted with an appetite for pastries so astronomical (which is understandable given years of tedious disputes with Germany), that the human mind was incapable of solving unaided the problem of distributing tea cakes to their customers.

Hidden in this story is the true meaning of all information technology.

A brief statement of the problem.

There is an old logical puzzle called the Sorites Paradox, first articulated by the Megarian logician Eubulides of Miletus. It predates the stored program computer by 2,000 years but it similarly concerns the production of pastries:

Would you describe a single grain of wheat as a heap? No. Would you describe two grains of wheat as a heap? No…. You must admit the presence of a heap sooner or later, so where do you draw the line?

This problem was of keen interest to the philosophical community for thousands of years, principally because the Greek recipe for tea cakes called for two heaping tablespoons of sugar. Some philosophers went so far as to vow to grow a beard and engage in pederasty until a solution to the problem was found. But all efforts were in vain; the problem remains unsolved to this day.

Unfortunately, the problem has only become ever more acute in the modern era. In fact, far from only destabilizing the fabrication of pastries, it has further undermined every area of society. Consider the process of voting. If no one voted, one vote would affect the outcome. But if millions of people vote, one vote makes little difference.

In fact, the defining characteristic of the modern era is that every aspect of society is heaping. To understand how this came to be, we must revisit ancient history.

Some scientifical facts.

During the pre-historic era, when mankind lived in trees and swam in lagoons, we lived in small clans and tribes of dozens up to a few hundred people. During this long honeymoon period in Homo sapiens history, we evolved our cognitive abilities through careful grooming and diligent fornicating. The brain developed the ability to speak languages and sympathize with other people and feel jealousy and kindness and all that other stuff. In other words, we evolved a social technology that equipped us for living in society and dealing with the ordeals caused by other people.

But that social technology is ill-equipped to deal with the humongous heaps of the modern world. We meet hundreds of people every year and can’t remember any of their names. We evolved language and vocal cords to cover long distances but somebody put skyscrapers in the way and anyway we now live like really long distances away, like you have to fly an airplane to see them or at least ride a bicycle. And then there are these celebrity neanderthals like Ashton Kutcher who are adored for their plumage by a multitude who greedily read every banal detail of his private life in magazines like Star that ship millions of copies to every end of a giant sphere whose radius is 6,378.1 kilometers; or on a Twitter that delivers every inane thing he tweets to 4 million people, 10 times daily. It goes without saying that this would not be possible if not for a defect in our programming in the face of the massive scale of the world.

Everything has become a giant fucking heap.

The modern world is profoundly inhumane. Mankind is incapable of reasoning about the heaping constructs of mass culture using the technes of intimacy that are an hundred thousand years old. For example, we need to be constantly reassured that celebrities are just like us. They eat waffles and pick up dry cleaning. If we do not share this understanding of Ashton Kutcher, we become overwhelmed by existential anomie and commit suicide.

Human beings need to understand one another in terms of primordial intimacies because man has no other tools for understanding the solicitations of man. But if the size of the world is no longer amenable to intimacy technologies, then mankind must invent information technologies that rehumanize the world.

Thus the proliferation of social software on the web. The reification of the social graph in Friendster; the Facebook Newsfeed and the Twitter; and the Foursquare all serve this one purpose: to rehumanize an inhumane world. Let’s consider each of these technologies one at a time.

  1. Friendster’s reification of the social graph makes it possible to understand the ties that bind us all together when we only have room in our brains for the intrigues of a few dozen relationships.
  2. The Facebook Newsfeed and the Twitter make it possible to share in the thoughts and intimate moments of those who inhabit different neighborhoods, and different schools, and different jobs, and make different choices than us from amongst the vast cornucopia of mass-produced art sold to us by the culture industry. Finally,
  3. the Foursquare coordinates the alienated existence of cosmopolitan voluptuaries into a shared bacchanal.

Technology and self-criticism.

I cannot help but be a technological optimist because technology is mankind’s only bulwark against the barbarism of heaps. But I’ll grant that technology is imperfect; it is sometimes fair to criticize the Tyranny of Technology. The usual argument goes that all these tweets and text messages and notifications that “a software update is available” leave no space quiet, provide no room for contemplation. It is true: we do live in a world of interruptions; interruptions created by information technology. But we should not be surprised by this fact and no more should we despair of it. One generation of technology solves the problems of the previous but causes problems of its own. The next generation of technology repeats this story; a story as old as mankind itself. This is the dialectics of history.

Do not doubt, then, that a technology will arise to solve this problem too. We, the makers, shall fabricate a machine to produce quiet and contemplation. In fact, at this very moment, I have a patent pending on a pair of contemplation goggles. And in just a few weeks I will release a beta of a contemplation-inducing goggular Twitter client. I’m sure it will be received to massive acclaim and to the profound benefit of humanity.

So! Old men: do not fear for the future. We young people, we hackers and makers, we have it all under control. We know the true meaning of information technology. We shall save us all from the giant fucking heaps.

Introducing Cache Money

In Uncategorized on December 11, 2008 at 6:25 am

Pre-requisite: please read my article on Write-through caching to understand why this is useful.

Most caching solutions in the Rails world involve something like Cache-Fu: an alternative API to ActiveRecord that explicitly annotates all call sites with cache rules.

  • User.find(1) becomes User.get_cache(1)
  • User.find(:all, ...) becomes User.get_cache("query_name", :ttl => 5.minutes) { User.find(:all, ... )}

I hate this kind of interface, which places the burden on the caller and meekly surrenders any attempt at encapsulation. Your codebase will be littered with haphazard cache rules in your controllers, views, and models.

But even worse are the explicit cache expiry rules. As you cache non-trivial queries, you’ll have to find all of the writes in your system that could possibly invalidate the results of the query. It’s a tedious and onerous effort: after hundreds of hours of debugging you’ll finally get expire_cache in just the right places.

A solution to this brittle, messy coding style is now available, and ready for production use. `Cache Money` is a plugin for ActiveRecord that transparently provides write-through and read-through caching functionality using Memcached. With `Cache Money`, queries are automatically cached for you; and similarly, cache expiry happens automatically as after_save and after_destroy events.
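
To make the idea concrete, here is a minimal sketch of read-through and write-through caching. It is not how `Cache Money` is implemented: `TinyStore`, its methods, and the key format are all invented for illustration, and two plain Hashes stand in for Memcached and the database.

```ruby
# Hypothetical stand-ins: one Hash plays Memcached, another plays the database.
# None of this is Cache Money's real code.
class TinyStore
  attr_reader :db, :cache, :db_reads

  def initialize
    @db = {}        # pretend database table: id => attributes
    @cache = {}     # pretend Memcached: key => attributes
    @db_reads = 0   # counts how often we had to fall back to the database
  end

  # Read-through: try the cache first, fall back to the database on a miss
  # and populate the cache on the way out.
  def find(id)
    key = cache_key(id)
    return @cache[key] if @cache.key?(key)

    @db_reads += 1
    row = @db[id]
    @cache[key] = row unless row.nil?
    row
  end

  # Write-through: the moral equivalent of an after_save hook.
  def save(id, attributes)
    @db[id] = attributes
    @cache[cache_key(id)] = attributes
  end

  # Expiry: the moral equivalent of an after_destroy hook.
  def destroy(id)
    @db.delete(id)
    @cache.delete(cache_key(id))
  end

  private

  # Invented key format, for illustration only.
  def cache_key(id)
    "User:id/#{id}"
  end
end

store = TinyStore.new
store.save(1, { screen_name: "bob" })
store.find(1)     # served from the cache; the database is never read
store.destroy(1)
store.find(1)     # cache miss and database miss
```

The point of the sketch is that the caller only ever says `find`, `save`, and `destroy`; the cache rules live in one place instead of at every call site.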

This doesn’t just apply to trivial queries. Very complex, sophisticated queries are handled effortlessly; the vast majority of ActiveRecord usage is transparently materialized, indexed, and kept fresh in Memcached. Here are some examples:

  • User.find(1)
  • User.find(:all, :conditions => { :screen_name => 'bob' })
  • Friendship.find(:all, :conditions => ['friendships.creator_id = ? AND friendships.receiver_id = ?', ...])
  • users.direct_messages
  • users.direct_messages.find(1)
  • users.direct_messages.count
  • User.find(:all, :limit => 10, :order => 'id DESC')

All of these, and much more, will automatically be cached and kept fresh as you write to the database. This greatly lessens the load on your database and makes your site impervious to catastrophic replication lag.

The way `Cache Money` works is by materializing the equivalent of database indices in Memcached. It’s as if you store your indices in a distributed hash table instead of an in-process BTree. Just as with a database, you declare your indices:


    class User < ActiveRecord::Base
      index :screen_name
    end

    class Friendship < ActiveRecord::Base
      index [:creator_id, :receiver_id]
    end

    class DirectMessage < ActiveRecord::Base
      index :user_id
      index [:user_id, :id]
    end
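
What does it mean to keep an index in a hash table? Roughly this, as a sketch: each distinct combination of indexed values becomes a key, and the value is the list of matching ids. `TinyIndex` and its key format are invented for illustration and a Hash stands in for Memcached; the plugin's real internals may differ.

```ruby
# A sketch only: a Hash stands in for Memcached, and the key format is invented.
class TinyIndex
  def initialize(model_name, attributes)
    @model_name = model_name
    @attributes = Array(attributes)   # accepts :user_id or [:user_id, :id]
    @store = {}                       # pretend hash table: index key => list of ids
  end

  # Called after a record is saved: file its id under the key for its values.
  def add(record)
    ids = (@store[key_for(record)] ||= [])
    ids << record[:id] unless ids.include?(record[:id])
  end

  # Called after a record is destroyed: take its id back out.
  def remove(record)
    ids = @store[key_for(record)]
    ids.delete(record[:id]) if ids
  end

  # Answers "find all where these attributes match" without touching the database.
  def lookup(conditions)
    @store.fetch(key_for(conditions), [])
  end

  private

  def key_for(hash)
    pairs = @attributes.map { |attr| "#{attr}/#{hash.fetch(attr)}" }
    "#{@model_name}:#{pairs.join('/')}"
  end
end

friendships = TinyIndex.new("Friendship", [:creator_id, :receiver_id])
friendships.add({ id: 10, creator_id: 1, receiver_id: 2 })
friendships.add({ id: 11, creator_id: 1, receiver_id: 3 })
friendships.lookup({ creator_id: 1, receiver_id: 2 })   # => [10]
```

A query whose conditions match a declared index turns into a single key lookup; a query that matches no index would still have to go to the database.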

There are lots of configurable options like TTLs:

    index :user_id, :ttl => 1.day

You can also specify limits to ensure that your indices do not grow too large:

    index :user_id, :limit => 500, :buffer => 20

(This keeps a rolling window of 500 items. The buffer option indicates how many “extra” you want to keep around in case of deletes in order to maintain at least 500 items. If more than 20 are deleted, the index will be repopulated to ensure there are at least 500 items in it.)
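
Here is one reading of how `:limit` and `:buffer` interact, sketched with plain arrays. `WindowedIndex` and `source` are hypothetical names: `source` is any callable that returns the newest `n` ids and stands in for a database query. The real plugin may behave differently at the edges.

```ruby
# One interpretation of :limit and :buffer, not the plugin's actual code.
class WindowedIndex
  attr_reader :ids, :repopulations

  def initialize(limit:, buffer:, source:)
    @limit = limit
    @buffer = buffer
    @source = source          # hypothetical: ->(n) { newest n ids from the database }
    @repopulations = 0
    @ids = @source.call(@limit + @buffer)
  end

  # New items go on the front; anything beyond limit + buffer falls off the end.
  def add(id)
    @ids.unshift(id)
    @ids = @ids.first(@limit + @buffer)
  end

  # Deletes eat into the buffer. Once the window drops below the limit,
  # refill it from the source of truth.
  def remove(id)
    @ids.delete(id)
    return if @ids.size >= @limit

    @ids = @source.call(@limit + @buffer)
    @repopulations += 1
  end
end

# A tiny "table" of ids, newest first, with a small window so the effect is visible.
table = (1..100).to_a.reverse
index = WindowedIndex.new(limit: 5, buffer: 2, source: ->(n) { table.first(n) })

[100, 99, 98].each do |id|
  table.delete(id)    # the row leaves the database...
  index.remove(id)    # ...and the index is told about it
end
index.repopulations   # the third delete exhausted the buffer and forced a refill
```

The buffer exists so that an occasional delete does not cost a trip to the database; only a run of deletes longer than the buffer does.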

This is just the tip of the iceberg. Many advanced utilities are included for even more sophisticated use, including shared locks to deal with distributed computations on shared memory, simulated transactions in Memcached (which obviates the need for locks in most cases), high-performance mocks for your tests, and in-process caches to minimize network operations during a single request-response cycle.

A version of this code is in production use at Twitter and is one part of the reason Twitter’s uptime has improved so much over the last several months. This is real, pragmatic, unmagical, production-ready code that can be a big part of your Rails scaling strategy. It is designed with massive datasets and real-world operational challenges in mind. And it’s almost effortless to use, since it requires no changes to how you use ActiveRecord.

Check out `Cache Money` on github.

Write-Through Caching is an Essential Part of a Healthy Scaling Strategy

In Uncategorized on November 24, 2008 at 7:05 am

The debate around “Does Rails scale?” I’ll leave to the armchair architects. But one thing I know is that having one 100% Beef Database doesn’t scale.

With replication you can scale your database. With Master-Slave replication, you can have any number of read-slaves for one master; you perform reads (SELECT queries) on the slaves and issue writes (INSERT/UPDATE/DELETEs) on the master. With this strategy your write capacity won’t scale but your read capacity can scale almost indefinitely. And anyway, since write and read actions have different performance characteristics, you’ll get the most out of your beefy machines if you’re into this role-play. By the way, the best Rails plugin for this is masochism.
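
The routing rule itself is tiny. The sketch below is not how masochism works; `FakeConnection` and `ReadWriteRouter` are invented names, and the "connections" just record the SQL they are handed so the routing is visible.

```ruby
# Hypothetical connection object: anything that responds to #execute will do.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  # Records the statement and reports which server "ran" it.
  def execute(sql)
    @log << sql
    name
  end
end

# Sends SELECTs to a random slave and everything else to the master.
class ReadWriteRouter
  def initialize(master:, slaves:)
    @master = master
    @slaves = slaves
  end

  def execute(sql)
    connection_for(sql).execute(sql)
  end

  private

  def connection_for(sql)
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    verb == "SELECT" ? @slaves.sample : @master
  end
end

master = FakeConnection.new("master")
slaves = [FakeConnection.new("slave1"), FakeConnection.new("slave2")]
router = ReadWriteRouter.new(master: master, slaves: slaves)

router.execute("INSERT INTO users (screen_name) VALUES ('bob')")   # goes to the master
router.execute("SELECT * FROM users WHERE screen_name = 'bob'")    # goes to some slave
```

Anything that is not plainly a SELECT goes to the master in this sketch, on the theory that a misrouted read is merely slow while a misrouted write is a bug.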

Replication Lag

Still, Replication has its problems. The latency of information propagation from the master to the slave (and from slave to slave if you have a tree topology) can lead to data inconsistency.

Consider a user who creates a new account on your web site. A new record is INSERTed into the `users` table on the master database. On the next page load their record is SELECTed from a random slave. But the information might not yet be propagated to the database slave! So the user gets an HTTP 500 error.

Typically, it takes only a fraction of a second for information to propagate. But it isn’t unusual to measure replication lag in seconds, minutes, or hours because of bugs in MySQL, partial outages, and expensive queries. This is an every-day occurrence and you must plan accordingly.

Fortunately there is a solution, and it is write-through caching. At the same time as writing to the master database, write “through” a cache layer such as memcached. If for all reads you first try to read from cache, you will always find data: data that hasn’t yet propagated from database to database is newest and therefore most likely to be in the cache.
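
A toy model makes the argument easier to see. Everything here is invented for illustration: `LaggyCluster` uses three Hashes for the master, one slave, and the cache, and the slave only catches up when `replicate!` is called, which is a crude stand-in for replication lag.

```ruby
# A toy model of replication lag; nothing here talks to a real database.
class LaggyCluster
  attr_reader :master, :slave, :cache

  def initialize
    @master = {}
    @slave = {}    # only catches up when replicate! is called
    @cache = {}    # stand-in for memcached
  end

  # Write to the master and, at the same time, "through" the cache.
  def write(id, row)
    @master[id] = row
    @cache[id] = row
  end

  # Pretend the replication stream finally arrived.
  def replicate!
    @slave.replace(@master)
  end

  # What a site without write-through caching does on the next page load.
  def read_from_slave_only(id)
    @slave[id]
  end

  # Cache first, slave second.
  def read(id)
    @cache.fetch(id) { @slave[id] }
  end
end

cluster = LaggyCluster.new
cluster.write(1, { screen_name: "bob" })

cluster.read_from_slave_only(1)   # nil: the slave has not heard about bob yet
cluster.read(1)                   # bob's row, straight from the cache
```

Never calling `replicate!` is the worst case, and the cached read still succeeds; that is the whole trick.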

Easier said, harder done. You’ll have to smear your code with cream-cheese to get this to work. Unless there were some magical plugin that solved this problem for you…

Testing for Replication Lag-induced defects

But you can only obtain code cream cheese if you can even figure out what part of the site needs caching. You’d be crazy to test, live in production, bugs caused by non-deterministic replication lag. So you need to simulate replication lag somehow.

One technique that is almost foolproof is to use a comprehensive Selenium test-suite. Create a new Rails environment with your master and slave as two local databases. Load your fixtures into both databases. Don’t establish any replication relationship between the two databases, and voila: you’ve just simulated infinite replication lag. Now run your Selenium suite. I’m sure it won’t be pretty, 500s upon 500s, but that’s what will happen to your users unless you implement write-through caching.

Other Scaling Strategies

Now, there are other database scaling strategies than master-slave replication. Master-Master replication can skirt replication lag if you use sticky sessions. But MySQL support for master-master replication is arguably not production-ready, and in any case such replication, especially combined with sticky sessions, has worse failure characteristics than master-slave.

The unicorn strategy of “sharding”/Database partitioning does not obviate replication strategies; usually each partition has replication strategies if only for fault tolerance.

My fantasy scaling strategy involves skipping the database altogether. Rather than synchronously write to a database, write to a message queue. Have a daemon read from the queue and write to your (partitioned) (replicated) database at its leisure. In the event of a write-database failure, the daemon gets backed up but there is no downtime since your site is completely decoupled from the write database. If you’re not concerned with in-order delivery, message queues scale effortlessly. (Please in no way consider this an endorsement of either ActiveMQ or RabbitMQ). In any case, writing to the database asynchronously from a daemon is only possible with sufficient write-through caching.
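
The shape of that fantasy fits in a few dozen lines, with heavy caveats. In this sketch, Ruby's built-in `Queue` and a background thread stand in for a real message queue and a real daemon, Hashes stand in for the cache and the database, and `AsyncWriter` is an invented name. It shows the flow, not a production design.

```ruby
# A sketch of asynchronous writes behind a write-through cache.
# Queue + Thread stand in for a message broker and a daemon process.
class AsyncWriter
  attr_reader :cache, :database

  def initialize
    @queue = Queue.new
    @cache = {}      # stand-in for memcached
    @database = {}   # stand-in for the (partitioned) (replicated) database
    @worker = Thread.new { drain }
  end

  # The web request only touches the cache and the queue, never the database.
  def write(id, row)
    @cache[id] = row
    @queue << [id, row]
  end

  # Reads are served from the cache, so they see the write immediately.
  def read(id)
    @cache.fetch(id) { @database[id] }
  end

  # Ask the worker to finish what is queued and stop.
  def shutdown
    @queue << :stop
    @worker.join
  end

  private

  # The "daemon": pops messages and writes them to the database at its leisure.
  def drain
    loop do
      message = @queue.pop
      break if message == :stop

      id, row = message
      @database[id] = row
    end
  end
end

writer = AsyncWriter.new
writer.write(1, { screen_name: "bob" })
writer.read(1)     # available right away, whether or not the database has it yet
writer.shutdown    # after this, the database has caught up
```

Take away the cache and `read` would race the worker, which is exactly why the asynchronous write only works with write-through caching in front of it.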

This is why I say write-through caching is critical to scaling. And in the next few weeks I’ll bring forth some code that shows how to do it right.