Thursday 17 July 2008

Tech Tip du Jour: Variable Length Records

Nothing pisses me off more than people who use variable length records.*

People coming from an Oracle background seem to be very keen on the idea, I suppose when your database is so shit that you have half the world's hardware to keep your CRM system up, it doesn't really matter. But actually, unless you really do have very variable data, variable length fields are a bad idea:

  1. Variable length fields all have a storage size overhead. I've seen far too many fucking idiots with VARCHAR(1) and VARCHAR(2) in their tables.
  2. People ALWAYS say: "well, since it's a VARCHAR, we'll make it the maximum possible size, because there's no downside to that." Well, guess what? There fucking is: when the engine inserts a variable length row, it checks to see if the maximum possible length of the row would fit in the remainder of the page.
    This means that if your maximum row length is more than 50% of your page size, each row will go on its own page, even if the actual row is only 5 bytes long. This is fixed in 11.1, but only if you a) configure MAX_FILL_DATA_PAGES and b) unload and reload all your data. Plus, the engine has extra work to do: without MAX_FILL_DATA_PAGES, the engine has to calculate the maximum row size; with it, the engine may have to move the row to another page if an UPDATE increases the row length sufficiently.
  3. I can count on the fingers of one hand the number of times I've seen VARCHARS and LVARCHARS used for a justifiable reason. Inevitably, people use VARCHARS because they're too fucking lazy to do any kind of analysis on the contents of their data. This is the bit that really pisses me off, because it's an indication that there's going to be lots of shit you need to pick up in years to come. Don't be a cunt, do the spade work -- it always pays off in the end.

*This is a lie. But not much.

No comments: