scale-out: notes on sharding, unique keys, foreign keys…

adapt.com uses sharding to scale out. In my talk on moving to MySQL, I mentioned a few design issues we had to become aware of before we rolled out a sharded database environment. I promised to give more details, so here they are…

We wanted some keys to be globally unique (eg, userid and accountid). How would we do this?

auto_increment_offset and auto_increment_increment

We had been using autoincremented values. So we considered using MySQL’s auto_increment_offset and auto_increment_increment to keep autoincrement values unique across shards. (This basically tells each shard to start at a certain number (auto_increment_offset), eg 1 for one shard, 2 for the next, etc, then to skip auto_increment_increment between values. So, eg, if we set that to 1000, our next keys would be 1001 for shard one, 1002 for shard 2, etc. Then 2001 for shard one, 2002 for shard 2, and so on. This would allow up to 1000 shards.) But we were worried that we would have to be careful to roll out shards with correct, unique values for auto_increment_offset, and thought there might be a better way.

UUIDs

We considered using a UUID. MySQL has a UUID() function. However, we also wanted to be able to use replication. With MySQL 5.0, apparently UUID’s can cause a problem for replication. If you use MySQL’s UUID() function, when you replicate, the statement you used is sent to the slave and executed there. But that means the slave will also call UUID(), creating a different value on the slave. (MySQL 5.1 promises row-based replication – ie, the generated values themselves are sent to the slave, not the SQL statement.)

So we considered using a UUID generated in the application code, and executing a statement with the UUID value in it. But then we wondered about the efficiency of using a UUID for an InnoDB primary key. (How would we store it? As a char(36)? Big for an InnoDB primary key. Also, UUIDs are a bit ugly for reading and working with. (See the MySQL Performance Blog entry “To UUID or not to UUID” for performance considerations.)

global view

Then we thought of using a global view, using the federated storage engine (think database links in Oracle), that would select from each shard. We thought our code could check this view and generate a next value and use it. But then, if any shard fails, the view will bomb out, and you’re out of luck. So we dropped that idea.

global tables

What we ended up doing is have a small, “global” database that has tables for some of our keys that need to be unique. The code inserts into them, gets the values, and uses them to populate the appropriate shard.

Next issue – what about foreign keys?

Currently – as far as I know – you can’t enforce foreign keys across servers. So we set up code to check and monitor some of our foreign key relationships and alert us if it finds any anomolies.

And that is how we arrived at sharding. So far so good…

Advertisements

10 Responses to “scale-out: notes on sharding, unique keys, foreign keys…”

  1. joel Says:

    How are you partitioning your data? Just using a regular scheme to uniformly distribute the data? You might want to consider deciding how to partition stuff yourself. What happens if your ten busiest users are all on the same partition? You want a way to move users around from partition to partition to distribute the load. For that you’d want to keep a lightweight database which associates users to shards and just do a quick look up each time so that you know which shard to go to or use mysql proxy to do that automatically for you.

  2. ben Says:

    Joel – We’re doing what you described. We have our lightweight database (we call it the global database) that associates users to shards. We’re working on scripts to move users from one shard to another if it becomes desirable.

  3. Scalable web architectures » Blog Archive » Sharding: Different from Partitioning and Federation ? Says:

    […] Notes on sharding, unique keys, foreign keys […]

  4. dsheardown Says:

    Hey, I am not really a DBA by any standard, although going through some design decisions about sharding with a new project. I have read countless discussions on how to effectively “scale out” and not “scale up”.

    If I may ask a rather junior question:

    In the global database, do you simply have say a table with the userid/accountid and then some identify to say which shard DB/server to look at?

    Might sound rather rudimentary, but I guess you don’t know if you don’t ask!

  5. ben Says:

    dsheardown –

    In our case, yes, we had a table in a small “global” database that contained accountid and shardids. So you’d look up an accountid and find out which shard that account was on. Of course, it’s important to have this table’s database backed up and probably replicated. If it is unavailable, that would make things difficult!

    We were able to move accounts to new shards by adding their data to a new shard then simply changing that one entry in the accountid/shardid table to point to the new shard.

    Thanks for reading!

    Ben

  6. Andrew Mangold Says:

    Ben,
    I have the following configuration: Two co-master InnoDB databases (Local number portability LNP) 175M rows, 50GB data 300-400K updates/day V5.1.22 using statement level replication and slaved to each other, only one database taking updates at any given time. I have a MySQL Cluster 5.1.23 housing data for an SCP. The SCP needs a subset of the LNP data (~10GB). I had planned on slaving the MySQL Cluster to the co-master InnoDB databases and using a Stored Trigger to drop unneeded columns as LNP replication transactions are applied to the MySQL Cluster. But…the carrier customer has decided to use MS SQL Server 2005 to house the database of record and requires that LNP transactions be feed from the InnoDB database to the MS SQL Server 2005 database and for the MS SQL Server to send update transactions to the MySQL Cluster. All of which needs to happen in real time. I was thinking about combining FreeTDS and the source for mysqlbinlog to build a utility to read the replication logs from InnoDB and apply them to MS SQL Server 2005. Not sure how to get replication transactions from Ms SQL 2005 server and apply them to MySQL Cluster. Thoughts???

  7. Sami Says:

    Ben,

    Could you please elaborate on how to make auto-incremented field globally unique? We are planning to implement sharding as a scaling solution for our database and having auto-incremented field globally unique is one of the requirements.

    Based on my understanding of your approach, It seems like I need to know ahead of time how many shards I want to add and then skip the count by that number. Is this correct? Or I missed something?

    Thanks,
    Sami

  8. Karuna Says:

    Please suggest on any specific sharding algorithm to implement the database sharding in mysql.

  9. jurriaanpersyn.com — Database Sharding at Netlog, with MySQL and PHP Says:

    […] oracle2mysql.wordpress.com/2007/08/23/scale-out-notes-on-sharding-unique-keys-foreign-keys […]

  10. Database Sharding at Netlog, with MySQL and PHP Says:

    […] oracle2mysql.wordpress.com/2007/08/23/scale-out-notes-on-sharding-unique-keys-foreign-keys […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: