Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.

The best argument for scale out


I am writing a presentation, and I thought it would be interesting to get some numbers:

| Server | Cost |
|---|---|
| PowerEdge T110 II (Basic) – 8 GB, 3.1 GHz Quad 4T | $1,350.00 |
| PowerEdge T110 II (Basic) – 32 GB, 3.4 GHz Quad 8T | $12,103.00 |
| PowerEdge C2100 - 192 GB, 2 x 3 GHz | $19,960.00 |
| IBM System x3850 X5 – 8 x 2.4 GHz, 2048 GB | $645,605.00 |
| Blue Gene/P – 14 teraflops, 4,096 CPUs | $1,300,000 |
| K Computer (fastest supercomputer) - 10 petaflops, 705,024 cores, 1,377 TB | $10,000,000 annual operating cost (no data on actual cost to build) |

And then what?
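
One rough way to read these numbers is to ask how many of the cheapest boxes the price of each bigger machine would buy, and how much RAM those boxes add up to. The sketch below does only that. It uses the prices and RAM sizes from the list above, the helper names are made up for illustration, and it deliberately ignores CPU, power, networking and the cost of writing distributed software.

```python
# Figures copied from the list above (RAM in GB, price in whole dollars).
# Only the rows that give both a price and a RAM size are included.
SERVERS = [
    ("PowerEdge T110 II, 8 GB", 8, 1_350),
    ("PowerEdge T110 II, 32 GB", 32, 12_103),
    ("PowerEdge C2100, 192 GB", 192, 19_960),
    ("IBM System x3850 X5, 2048 GB", 2_048, 645_605),
]

# The cheapest machine is the scale-out building block in this sketch.
BASE_NAME, BASE_RAM_GB, BASE_PRICE = SERVERS[0]


def small_boxes_for(price: int) -> int:
    """How many entry-level boxes the same budget buys (whole machines only)."""
    return price // BASE_PRICE


def scale_out_ram_gb(price: int) -> int:
    """Aggregate RAM across those entry-level boxes."""
    return small_boxes_for(price) * BASE_RAM_GB


for name, ram_gb, price in SERVERS[1:]:
    print(
        f"{name}: ${price:,} buys {small_boxes_for(price)} small boxes "
        f"with {scale_out_ram_gb(price):,} GB in total "
        f"(vs {ram_gb:,} GB in one box)"
    )
```

On these figures the picture is mixed. The x3850 X5 budget buys 478 small boxes with 3,824 GB between them, against 2,048 GB in the single machine. The C2100 budget buys only 14 small boxes with 112 GB, less than the 192 GB the C2100 has on its own.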


Comments

Scooletz

Then you start reading Jacek Dukaj, with his idea of Ultimative Inclusions :) http://dukaj.pl/English

Addys

But hardware is only a minor component of overall service cost. There are also power and cooling, datacenter rack space, networking equipment, software licensing, and management costs. All of these rise linearly with the number of machines and (almost) orthogonally to the processing power of each machine. And that doesn't even consider the huge added complexity (= cost!) of developing and running distributed software.

The bottom line is that there are a few textbook scenarios where scale-out is clearly superior, and many others where scale-up or a mixed approach is more effective. And even then, cost-effectiveness usually isn't the deciding factor.

petar repac

So the best argument for scale-out is hardware cost? Maybe. But then it is also the best argument for scale-up.

We are missing context here. For what kind of apps? Business apps, web sites? Yes, I'd agree.

But then, I doubt that Blue Gene runs web sites :))

Frans Bouma

Most people never need more power than a single PowerEdge pizza-box server. Mind you, Stack Overflow ran for quite some time on 3 of those (I think they still do, not sure).

petar repac

10 Dell R610 IIS web servers (3 dedicated to Stack Overflow):
    1x Intel Xeon Processor E5640 @ 2.66 GHz, quad core with 8 threads
    16 GB RAM
    Windows Server 2008 R2

2 Dell R710 database servers:
    2x Intel Xeon Processor X5680 @ 3.33 GHz
    64 GB RAM
    8 spindles
    SQL Server 2008 R2

+ HAProxy servers + Redis servers + .....

http://highscalability.com/blog/2011/3/3/stack-overflow-architecture-update-now-at-95-million-page-vi.html

Josh Reuben

IBM Sequoia in 2012 - 20 Petaflops - which is equivalent to the processing power of the human brain

Peter Ritchie

Yeah, kinda looks more like an argument for scale-up. To get 705,034 cores with the PowerEdge C2100 you're looking at $1,172,706,553.00, or $40,878,080.00 to get 4,096 CPUs with the C2100...

Of course, that assumes you need to scale to that amount of power at least sometime...
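
The arithmetic in this comment can be reproduced with a short sketch. The C2100 price and its two CPUs come from the post. The figure of 12 cores per C2100 (two six-core CPUs) is an assumption: it is not stated anywhere in the thread, but it is the value under which the quoted dollar amounts come out. Note that the comment uses 705,034 cores, while the post lists 705,024 for the K Computer.

```python
C2100_PRICE = 19_960    # dollars, from the post
CPUS_PER_C2100 = 2      # "2 x 3 Ghz" in the post
CORES_PER_C2100 = 12    # ASSUMPTION: two six-core CPUs; not stated in the thread


def cost_for_cpus(cpus: int) -> float:
    """Dollars of C2100 hardware needed to reach a given CPU count."""
    return cpus * C2100_PRICE / CPUS_PER_C2100


def cost_for_cores(cores: int) -> float:
    """Dollars of C2100 hardware needed to reach a given core count.

    Fractional machines are allowed, which is what the quoted figure implies.
    """
    return cores * C2100_PRICE / CORES_PER_C2100


print(f"4,096 CPUs:    ${cost_for_cpus(4_096):,.2f}")
print(f"705,034 cores: ${cost_for_cores(705_034):,.2f}")
print(f"705,024 cores: ${cost_for_cores(705_024):,.2f}")
```

Under that assumption, 4,096 CPUs and 705,034 cores give the two amounts quoted in the comment (the second to the nearest dollar), and the post's own 705,024 cores would come to $1,172,689,920.00.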

petar repac

@Josh Reuben: whose brain? :)) Do you think that if I got a job at IBM they would pay me accordingly? :D

Rafal

@petar I'd be happy to sell 50% of my brain's processing power to some data center. It's a used brain in far from perfect condition, so maybe it has only 2 or 3 petaflops left, but still, this should be worth a few million a year.

Ayende Rahien

Scooletz, I am not familiar with him. What is his Ultimate Inclusion?

Ayende Rahien

Addys,
1 EC2 machine (large) for 1 year - $1,756.80
100 EC2 machines (large) for 1 year - $175,680

Seems to be a pretty linear scale to me. Actually, the more you use, the better deal you can get.

No one said it is going to be cheap, but it is usually more cost effective.

Yes, distributed programming is more complex, but even if you are on a single machine, it isn't like you can assume only a single thread is running, and a lot of the same issues you have to deal with are there anyway.

Jeff Tucker

Strongly agree. I use a chart similar to this in my advanced networking class to show precisely why scaling out is often far better than scaling up after a certain point. The only problem is that a LOT of devs just have no idea precisely how to scale out and how to write that type of software, and no clue that mathematical landmines are waiting for them that would actually render their entire architecture unusable (not an opinion or observation, there's a formal mathematical proof for these). You really need to take a look at optimistic concurrency models and serial equivalence of transactions, CAP, the Fischer Consensus problem, and queuing theory, and truly understand the algorithms that suffer from and address these problems. If not, you're opening yourself up to synchronization problems, deadlocks, consistency issues, and weird one-off Heisenbugs that aren't actually just edge cases. They will utterly destroy you outside of a test environment, because they don't repro regularly in test, but the rate at which those bugs scale is exponential, so you see a ridiculous number of them start occurring in a big hurry. In conclusion, it's called computer science for a reason, so start sciencing! (And if sargable is a word, then sciencing is a word too.)

Hendry

Re @Addys, "All these rise linearly with the # of machines...": isn't that a point that precisely supports the original argument? The figures in this post show that scaling up raises costs exponentially. Linear costs seem quite a jolly good bargain when it comes to scalability.

Craig D

I'm all for scale out, even more so for cloud.

Especially in a web scenario, there are diminishing returns for scale-up performance. Scale-out performance is nearly linear, and that's in addition to the added redundancy, etc.

The additional benefit cloud presents is "Scale fast, fail fast". It's not the ability to turn on servers quickly, it's the ability to discard them when they're no longer needed.

Scooletz

He's a Polish hard SF writer :) The Ultimative Inclusion is the optimal computer you can get in a universe based on given physics. To create a better one, you need to create a new universe with 'better' constants. Each of his novels is a masterpiece. If you like SF (and I think you do), he's a very good addition to your to-read list :)

Sean Kearon

@Scooletz - Dukaj looks very interesting. Are there any English translations available?

silk

@Sean, unfortunately not. Dukaj said at a convention that so far none of the negotiations with foreign publishers have ended in a contract, and he does not expect any to succeed in the near future :(
