Tuesday, April 24, 2007

The secret to making things easy: avoid hard problems

That may seem obvious, but in my experience most engineers prefer to focus on the hard problems. Working on hard problems is impressive to other engineers, but it's not a great way to build successful products. In fact, this is one of several reasons why YouTube beat Google Video: Google spent a lot of time solving technically challenging problems, while YouTube built a product that people actually used (using PHP and MySQL, I think, which is not at all technically impressive).

For me, the most effective method of getting things done quickly is to cheat (technically), take a lot of shortcuts, and find an easier way around the problem. (Before anyone jumps in with a comment about security or bank transactions: there are obviously a few exceptions.) You only need to think far enough ahead to avoid painting yourself into a corner, or to have a plausible plan for escaping the corner. There's always an easier way -- work lazier, not harder. Note that this doesn't preclude doing things that SEEM difficult -- easy solutions to important problems that LOOK really hard are the best.

I was reminded of this while replying to the comments on news.yc in response to my post on disks and databases. Whenever anyone mentions the possibility of not using a conventional database, a lot of people will immediately reply that databases solve a lot of very difficult problems, and that you shouldn't put a lot of work into reinventing the wheel. These people are, of course, correct.

The thing is, a lot of those difficult problems are irrelevant for 99% of products. For example, "real" databases can handle transactions that are too large to fit in memory. That was probably a really important feature in 1980. Today, you can buy a computer with 32GB of memory for around $5000. How many gigabyte-sized transactions do you suppose Twitter performs? My guess is zero -- I suspect that their average transaction size is closer to 0.0000002 GB (messages are limited to 140 characters).
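To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes one byte per character and ignores metadata and storage overhead (both are my simplifications, not measurements of any real system), so treat the result as an order-of-magnitude estimate only.

```python
# Rough size of one 140-character message.
# Assumptions: 1 byte per character, no metadata or storage overhead.
MESSAGE_CHARS = 140
BYTES_PER_CHAR = 1
BYTES_PER_GB = 10**9

message_bytes = MESSAGE_CHARS * BYTES_PER_CHAR
message_gb = message_bytes / BYTES_PER_GB

# How many such messages would fit in 32 GB of memory.
messages_per_32gb = (32 * BYTES_PER_GB) // message_bytes

print(f"{message_gb:.10f} GB per message")
print(f"{messages_per_32gb:,} messages fit in 32 GB")
```

Even if real messages carry a few times more overhead than this assumes, a single one is still many orders of magnitude smaller than memory.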

I want to be perfectly clear about one thing though: I'm not advising you to ditch your database! If your database is plenty fast, then the easiest thing to do is probably "nothing", and that's what I advise you do. If, however, your db is getting slow or overloaded, then you need to do two things:
  1. Understand the problem
  2. Fix the problem

The correct solution to your problem will depend on your situation. For example, if you have some data that's very important but doesn't change very often (username and password), and some data that gets updated continually but doesn't have to be correct (last active time or hit counters), then a simple solution would be to leave the important data in your database and move the less important data into something really simple but less reliable.

Want an example of "simple but less reliable"? Here's one (in one or two easy steps):
  1. All updates go into memcached, but not the database
  2. (optional) A background process occasionally copies entries from memcached to the db. Without this, the values will be completely lost when memcached restarts.
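
Here is a minimal sketch of those two steps, not a production design. To keep it self-contained, a tiny in-process class (`VolatileCache`, a hypothetical stand-in I made up) plays the role of memcached, and an in-memory SQLite table plays the role of the database. The table name, function names, and the "last active" use case are all assumptions for illustration. In real life the flush would run from a timer or a separate process rather than being called by hand.

```python
import sqlite3
import time


class VolatileCache:
    """Hypothetical stand-in for memcached: fast, but loses everything on restart."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def items(self):
        return list(self._data.items())

    def restart(self):
        # Simulates a memcached restart: all values are gone.
        self._data.clear()


def record_activity(cache, username, now=None):
    # Step 1: the update goes to the cache only, never directly to the database.
    cache.set(username, now if now is not None else time.time())


def flush_to_db(cache, conn):
    # Step 2 (optional): copy whatever is currently cached into the database.
    conn.executemany(
        "INSERT OR REPLACE INTO last_active (username, seen_at) VALUES (?, ?)",
        cache.items(),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE last_active (username TEXT PRIMARY KEY, seen_at REAL)")

cache = VolatileCache()
record_activity(cache, "alice", now=100.0)
record_activity(cache, "alice", now=250.0)  # overwrites in the cache; the db is untouched
flush_to_db(cache, conn)

record_activity(cache, "bob", now=300.0)
cache.restart()  # bob's update was never flushed, so it is lost
flush_to_db(cache, conn)

rows = conn.execute(
    "SELECT username, seen_at FROM last_active ORDER BY username"
).fetchall()
print(rows)
```

Only alice's latest value ends up in the table. Her first update never touched the database at all, and bob's update disappeared with the restart, which is exactly the "less reliable" trade-off: fine for hit counters and last-active times, not fine for passwords.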
