Ayende @ Rahien

http://ayende.com
Copyright (C) Ayende Rahien 2004 - 2021 (c) 2026

Colin Jack commented on Architecting Twitter
Really good stuff, nice to see a discussion of a real-world example. On the negligence/incompetence angle, if they truly had only 3 engineers and one operations staff in late 2007 (as reported) then I don't find it that surprising that they weren't able to get the time to rethink their architecture. Dare Obasanjo also had an interesting post on this topic: http://www.google.com/reader/view/#search/twitter/2/feed%2Fhttp%3A%2F%2Ffeeds.feedburner.com%2FCarnage4life
http://ayende.com/3346/architecting-twitter#comment21
Sun, 15 Jun 2008 10:16:25 GMT

Ayende Rahien commented on Architecting Twitter
James, you do that as part of the processing of an incoming message.
http://ayende.com/3346/architecting-twitter#comment20
Fri, 13 Jun 2008 10:12:42 GMT

James commented on Architecting Twitter
Regarding tagging a message, how would you implement that in such a simple setup? One point you made above was that no joins should happen when reading the timeline. Would you put the tags for a message in a varchar field and perform a text search on that field, or use a relationship between message and tag? Surely a properly indexed join would be faster than the comparison on a text field.
http://ayende.com/3346/architecting-twitter#comment19
Fri, 13 Jun 2008 06:02:35 GMT

Martin commented on Architecting Twitter
What do you guys think of using Amazon SimpleDB and Amazon's queueing service (for building something like Twitter)?
http://ayende.com/3346/architecting-twitter#comment18
Sun, 08 Jun 2008 21:13:01 GMT

Ayende Rahien commented on Architecting Twitter
Mihailo, yes, there are constraints to that.
But when you are faced with routine downtimes, this is bad. This means that your priority should be to fix this, now. All of that said, the outlined architecture isn't really hard to build. About architecture, I write about that when I find interesting things to write about. Feel free to send me topics.
http://ayende.com/3346/architecting-twitter#comment17
Wed, 04 Jun 2008 16:13:59 GMT

Mihailo commented on Architecting Twitter
Hi All, interestingly, no one is considering aspects other than technology and architecture. Surely, architecture and technology are very important, but real-world projects are funded by real-world money (consider money somewhat loosely: personal time, fame, personal satisfaction) and developed by developers who have different skills and abilities, and deadlines. All projects I worked on had serious constraints on the mentioned "resources". You are constrained by the technologies known by your developers, which again affects architecture, since if you want to use something that is not "common" knowledge you need training etc. Then once you develop a prototype, managers say: oh, you already have an almost ready product. We always agree that a prototype is a prototype and should be discarded, but at the end of the day it is not you who is going to decide on that but someone who is not really in technology. Then you enter production and half of the team is heavy on support, and the other half is reassigned to other projects, and we start all over again. Sorry if this wasn't technical enough, I just wanted to point out that there are other things that are at least as important as architecture, whether you like it or not. Btw, the article is cool, and as people asked already, why not write more on architecture?
Cheers, Mihailo
http://ayende.com/3346/architecting-twitter#comment16
Wed, 04 Jun 2008 16:02:21 GMT

Ayende Rahien commented on Architecting Twitter
Mike, I don't think that there should be usage of either. There is a sharding table, which maps a user to a particular DB, and that is all. You allocate users to databases in round robin fashion, with a cap on how much you allow per DB, perhaps, but that is all.
http://ayende.com/3346/architecting-twitter#comment15
Tue, 03 Jun 2008 15:38:16 GMT

Ayende Rahien commented on Architecting Twitter
1/ This architecture should be usable for just about any number of users. The only scaling out approach would be to introduce additional DB servers and additional writers. There isn't any single place with load.
2/ PowerPoint & Paint
http://ayende.com/3346/architecting-twitter#comment14
Tue, 03 Jun 2008 15:32:15 GMT

Mike D commented on Architecting Twitter
Ayende,
DNS - username.twitter.com = server1.twitter.com
Algorithm - A routing table based on characters in the name.
http://ayende.com/3346/architecting-twitter#comment13
Tue, 03 Jun 2008 15:04:50 GMT

M commented on Architecting Twitter
1) Have you tried some simple benchmarks to measure the performance of the architecture? I think the pinch would be felt with real numbers. What would the performance be if you doubled the users and then doubled them again?
2) Slightly off topic: what do you use to create your diagrams?
http://ayende.com/3346/architecting-twitter#comment12
Tue, 03 Jun 2008 13:57:13 GMT

Ayende Rahien commented on Architecting Twitter
Mike, I don't think that I follow. What routing options are you talking about?
http://ayende.com/3346/architecting-twitter#comment11
Tue, 03 Jun 2008 13:09:09 GMT

Mike D commented on Architecting Twitter
Great post... What are the other routing options you considered: DNS, algorithm? Would be great if Udi would chime in on the messaging bits. Please do more on architecting.
http://ayende.com/3346/architecting-twitter#comment10
Tue, 03 Jun 2008 13:03:19 GMT

Tobin Harris commented on Architecting Twitter
@Ayende Cheers for the answers!
@Casey Yeah, looks like they put in lots of instrumentation, then added memcaching and replication. From what I've read, this is often a first port of call for Rails scalability. However, I think you're right, they focused on scaling the wrong architecture rather than finding a better one.
http://ayende.com/3346/architecting-twitter#comment9
Tue, 03 Jun 2008 11:11:43 GMT

Ayende Rahien commented on Architecting Twitter
Tobin, for myself, I would want to use the lowest tech solution possible. In this case, the data is naturally sharded, and there is no need for cross DB operations. This means that I would simply create a new connection and work from there. The less I rely on the infrastructure, the better. That reduces the amount of unusual stuff going on, reduces the amount of configuration needed, and ensures that I don't have to have a homogeneous mix of servers.
http://ayende.com/3346/architecting-twitter#comment8
Tue, 03 Jun 2008 10:44:56 GMT

Casey commented on Architecting Twitter
@Tobin I have no idea what they did use, but I suspect that they don't have a well architected solution, but a big ball of mud, otherwise they would have solved their problems long ago - whether by using queues+shards under Rails, or by using some other technological solution ...
technology is not important, only the architecture is. A good architecture would have allowed them to quickly adapt to their newfound success; a bad architecture will submerge them deeper and deeper in the mud as they grow. As Oren points out, even with some fairly simple thought applied, a large number of the problems Twitter has would go away.
http://ayende.com/3346/architecting-twitter#comment7
Tue, 03 Jun 2008 10:27:07 GMT

Tobin Harris commented on Architecting Twitter
@Casey I wonder if they just misjudged their scaling efforts and used memcached + replication instead of message queues + sharding? Rails would work well with either AFAIK.
http://ayende.com/3346/architecting-twitter#comment6
Tue, 03 Jun 2008 09:47:13 GMT

Casey commented on Architecting Twitter
I suspect Twitter is a classic case of letting the technology drive the business ... someone with Rails experience had a good idea, they implemented it in the technology they knew, they got funding, they expanded it the way they knew how to ... and then they got where they are ... Twitter is a pretty simple system, it isn't complicated, it doesn't do very much, and as your post clearly shows, a few hours of thought could probably re-architect most of the problems out ... but then it seems the guys at Twitter are tied to a technology they picked, and the approach that technology gives them, or surely they would have rewritten most of it by now ...
http://ayende.com/3346/architecting-twitter#comment5
Tue, 03 Jun 2008 09:06:57 GMT

Ken Egozi commented on Architecting Twitter
To the point. De-normalising the DB for the actual use (reads vs. writes) is 100% correct.
Doing async computations while keeping the HTTP response quick is 100% correct (when a "processing" response is expected).
http://ayende.com/3346/architecting-twitter#comment4
Tue, 03 Jun 2008 08:56:14 GMT

Paul Stovell commented on Architecting Twitter
I like your architecture. It's hard to make too many assumptions about Twitter itself, but this is a nice design for a service like Twitter. Given the large number of users, with each user having essentially a different timeline to see, instead of caching (which would be very fragmented - there's probably not a lot of "reusable" stuff to cache, is there?) I'd be experimenting with pre-computing the content. Someone tweets? Don't write it to a database, just append the HTML there and then to the user's HTML file. Got web services? Append it to the XML file your web service is returning. Then you don't wear the overhead of database inserts at all, and you don't need indexes.
http://ayende.com/3346/architecting-twitter#comment3
Tue, 03 Jun 2008 08:30:44 GMT

Tobin Harris commented on Architecting Twitter
This is a really informative post, it's great to see your perspective on it. I'm looking forward to seeing responses to this. 2 implementation questions if I may, as I've not done much distributed stuff...
- Do you have a favourite way of implementing sharding?
- Would you use linked databases and actually use the server name in the SQL, or would clients get the appropriate server name from the routing server and make a separate connection?
Cheers
http://ayende.com/3346/architecting-twitter#comment2
Tue, 03 Jun 2008 08:23:20 GMT

Jim Burger commented on Architecting Twitter
Personally, I'm just happy it's a free service, with no ads. The fact that it goes down all the time is neither here nor there to me. Like blogs, the information is asynchronous and disparate.
I can see why others are frustrated though. So, given that a service that doesn't crash would be an order of magnitude nicer, and it's not every day free advice gets given out, hopefully the guys at Twitter are able to take advantage of your point of view.
http://ayende.com/3346/architecting-twitter#comment1
Tue, 03 Jun 2008 05:40:56 GMT
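
Ayende's reply to Mike D in this thread describes a sharding table that maps each user to one database, with users allocated round robin and a cap per database. A minimal sketch of that idea might look like the following. The class name, the method name, the database names, and the in-memory dictionaries are all assumptions made for illustration and do not come from the thread; a real table would presumably live in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class ShardingTable:
    """Hypothetical sharding table: maps each user to exactly one database."""

    databases: list[str]
    cap_per_db: int
    _assignments: dict[str, str] = field(default_factory=dict)
    _counts: dict[str, int] = field(default_factory=dict)
    _next: int = 0  # round-robin cursor into `databases`

    def database_for(self, user: str) -> str:
        """Return the user's database, allocating one on first lookup."""
        if user in self._assignments:
            return self._assignments[user]
        # Round robin: starting at the cursor, try each database once
        # and skip any that has already reached its cap.
        for offset in range(len(self.databases)):
            index = (self._next + offset) % len(self.databases)
            db = self.databases[index]
            if self._counts.get(db, 0) < self.cap_per_db:
                self._assignments[user] = db
                self._counts[db] = self._counts.get(db, 0) + 1
                self._next = (index + 1) % len(self.databases)
                return db
        raise RuntimeError("all databases are at their cap; add another one")


table = ShardingTable(databases=["db1", "db2"], cap_per_db=2)
table.database_for("alice")  # allocated to db1
table.database_for("bob")    # allocated to db2
table.database_for("alice")  # still db1: the mapping is stable once made
```

Because a user's messages all live in the database the table returns, a client can open a plain connection to that one database, which matches the "no cross DB operations" point made in the reply to Tobin.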