scale-out vs scale-up

Scale-up vs scale-out – what are they?

Basically, “scaling up” means to add power to your server, or get a bigger server -build it up. “Scaling out” means to add new servers to your configuration – build it out.

Oracle touts both approaches, using RAC on big servers or on “blades”, or a “grid”.

MySQL generally touts scaling out, and most of the big sites using MySQL use “scale-out”. In my experience, and according to Jay Pipe’s excellent blog entry on the subject, that’s partially because MySQL doesn’t scale up as well as it scales out. (Another reason he mentions – that also fits my experience – is that people who pay for Oracle licensing match that by also buying expensive machines to put it on.)

In Oracle, whether you scale up or out, you’ll probably use RAC, which means setting up private interconnects and basically using “shared everything” with your servers. You’re still sort of “scaling up” the database itself, growing it on one server or many. (Correct me if I’m wrong..)

In MySQL, people tend to split up, or “shard”, their databases into pieces that are put into different databases on different servers. Eg, you might have one small, global database that contains information about which server a user’s data is stored on, and then multiple “user” databases on different servers. Your database would essentially be partitioned by user into smaller databases. (That’s what we do.) That’s scaling out. Then, if you get more users, and need to grow, just add more servers.

(On the other hand, if you have a small database – that can fit entirely into memory – you can use MySQL’s clustering technology and the NDB storage engine to create a cluster that’s more like RAC. In MySQL ~~6 you’ll be able to~~ 5.1 you can do this with disk-based databases. – But why?) (see comments below for why…)

Another aspect of scale-out can be to use MySQL’s replication technology in one way or another. You can use master-master replication to set up two (or a few) databases that are essentially copies of each other and keep in synch with each other. Or, more common (and what we do) is to use master-slave replication to make copies of your databases (shards) that you can use to spread read loads on (and for disaster recovery: if your master fails – promote a slave to be the new master). Most big LAMP sites seem to do this.

In summary, the common approach is to design your database to be split up into multiple shards (servers/databases), then use replication to make copies of the shards for further load-balancing and availability.

Here’s what some other sources have to say:

Definitions according to wikipedia’s entry on scalability:

Scale vertically

To scale vertically (or scale up) means to add resources to a single node in a system, typically involving the addition of CPUs or memory to a single computer. This could also mean expanding the number of running processes, such as increasing the number of Apache daemons.

Scale horizontally

To scale horizontally (or scale out) means to add more nodes to a system, such as adding a new computer to a distributed software application. An example might be scaling out from 1 web server to 3.

According to MySQL: (From the first page of “12 days of scale-out”)

What is Database Scale-Out?

Scale-Out is a modern computing architecture that enables organizations to improve application performance and scalability on an incremental, as-needed basis by adding multiple replicated database servers on low-cost commodity hardware. This is in contrast to a Scale-Up approach, which requires organizations to make a large up-front investment in more expensive and complex server hardware and database licenses in order to add capacity.

Jay Pipe’s “excellent blog entry”: The Ambiguously Vague Duo: Scale-Out and Scale-Up (SNL anyone?)

A white paper from MySQL: Guide to Cost-effective Database Scale-Out using MySQL

This entry was posted on August 22, 2007 at 7:26 pm and is filed under MySQL. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

9 Responses to “scale-out vs scale-up”

Keith Murphy Says:
August 23, 2007 at 1:26 am | Reply
So, a couple of points. One..MySQL 5.0 has totally in-memory clustering using the NDB engine. With MySQL 5.1, on the other hand, you only have to keep indexes in memory. Actually data can be kept on disk.

Why would you do this? Well, primarily reliability. If you have multiple SQL and data nodes and it is properly configured you can have any server go down and the cluster keeps running.

In theory it will allow you to scale your app as needed. That is more theoretical at this point from what I have seen, but I am sure this will continue to improve.

I am having this discussion right now about scale-out vs. scale-up. Yes, scale out is cheap(er) because higher-end hardware is more expensive. But, more servers require more people to maintain them (sys admins) and more dba’s to work on them. So, it gets murky as to which is better.

It has been proven over and over that MySQL can scale-out. Not as much info is available about scale-up. I haven’t read the postings you reference, but I will. The info I have heard is that MySQL under Linux will handle around 64 gigs of RAM and around 8 cores without to much loss due to inefficiency. That is actually a pretty hefty machine….one that could replace quite a few smaller servers.

Another argument aside from the reduced admin cost is that some applications simply don’t shard so well. There is a great deal of complexity in some applications that make this prohibitive.

I hope to benchmark some of this down the road. I will certainly blog about it if I get the chance.
Matthew Montgomery Says:
August 23, 2007 at 4:24 am | Reply
I must correct you on one point… Disk based columns for NDB (MySQL Cluster) tables is available in 5.1.

See: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-disk-data.html

NDB Engine gives you the ability to scale-out while preserving a single logical image of your data without having to code sharding into your application. It also provides high availability with synchronous replication, sub-second failover and automatic recovery.
ben Says:
August 23, 2007 at 5:14 pm | Reply
Thanks for the corrections and clarifications! I was writing from memory and not checking my facts – I knew disk-based NDB was coming soon, but didn’t realize how soon (to “production”, that is – I guess it’s already here)!
Sean Bannister Says:
September 28, 2007 at 7:23 am | Reply
Thanks for the great article that was really useful, im currently researching different ways to shard.
Sebastian Wallroth Says:
October 28, 2007 at 11:39 pm | Reply
Thank you for your article. I’m in research for scaling-up or -out. As a side effect I translated yoúr article into German: http://habacht.blogspot.com/2007/10/scale-out-vs-scale-up-skalierung-durch.html
Архитектура Wikimedia | Insight IT Says:
March 28, 2008 at 12:33 pm | Reply
[…] scale-out vs scale-up из блога “Oracle to MySQL” […]
Архитектура Wikimedia | Всё интересное Says:
August 12, 2008 at 11:57 am | Reply
[…] scale-out vs scale-up из блога “Oracle to MySQL” […]
Wikimedia Architecture | DBASoul.com Says:
June 28, 2011 at 5:42 am | Reply
[…] scale-out vs scale-up in the from Oracle to MySQL blog. […]
royolsen Says:
June 16, 2014 at 12:25 pm | Reply
As you point out, Oracle RAC does not offer scale-out in every sense of the word as it depends, among other things, on shared storage. You can however scale out Oracle Database using other Oracle technologies, such as DataGuard, Active DataGuard or GoldenGate. The latter is very flexible and allows for advanced replication and partitioning schemes – even across database platforms. You could, say, offload some reporting to MySQL, leaving your RAC free to handle transactions.

GoldenGate even makes multi-master replication look manageable, almost simple, perhaps a little tempting? Don’t fall for it, it’s still a horrible idea.

Naturally, the option of partitioning/distributing data in the application layer applies to Oracle databases just as any other DBMS.

There are also third party options available (GoldenGate used to be one) such as Continuent Tungsten, and it is possible to do basic log shipping and application with a few simple scripts.