Building a social media platform without going bankrupt: Part I–Laying the numbers

time to read 5 min | 832 words

Following the discussion a few days ago, I thought that I would share my high level architecture for building a social media platform in a way that would make sense. In other words, building software that is performant and efficient, and that doesn’t waste multiples of your yearly budget on unnecessary hardware.

It turns out that 12 years ago, I wrote a post that discusses how I would re-architect Twitter. At the time, the Fail Whale would make an appearance several times a week and Twitter couldn’t really handle its load. I think that a lot of what I wrote then is still applicable and would probably allow you to scale your system without breaking the bank. That said, I would like to think that I have learned a lot since then, so it is worth re-visiting the topic.

Let’s outline the scenario. In terms of features, we are talking about basically cloning the key parts of Twitter. Core features include:

  • Tweets
  • Replies
  • Mentions
  • Tags

Such an application does quite a lot at the frontend, which I’m not going to touch. I’m focusing solely on the backend processing here. There are also a lot of other things that we’ll likely need to deal with (metrics, analytics, etc.), which are separate and not that interesting. They can be handled via existing analytics platforms and don’t require specialized behavior.

One of the best parts of a social media platform is that by its very nature, it is eventually consistent. It doesn’t matter if I post a tweet and you see it now or in 5 seconds. That gives us a huge amount of flexibility in how we can implement this system efficiently.

Let’s talk about numbers I can easily find:

There is a problem with those stats, however. A lot of them are old, and some of them are very old, nearly a decade!

Given that I’m writing this blog to myself (and you, my dear reader), I’m going to make some assumptions so we can move forward:

  • 50 million users, but we’ll assume that they are more engaged than the usual group.
  • Out of those, 50% (25 million) will post in a given month.
  • 80% of those active users post fewer than 5 posts a month. That means 20 million users who post very rarely.
  • 20% of the active users (5 million or so) post more frequently, with a maximum of around 300 posts a month.
  • 1% of those frequent posters (50,000) post even more frequently, to the tune of a couple of hundred posts a day.

Checking my math, that means that:

  • 50,000 highly active users with 150 posts a day, for a total of 225 million posts a month.
  • 5 million active users with 300 posts a month, for another 1.5 billion posts.
  • 20 million other users with 5 posts a month, giving us another 100 million posts.

Total monthly posts, in this case, would be:

  • About 1.825 billion posts a month.
  • Roughly 2.5 million posts an hour.
  • Around 700 posts a second.
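
To keep myself honest, here is the same back-of-the-envelope math as a small Python sketch (assuming a 30-day month; the figures are the assumptions above, not real data):

```python
# Back-of-the-envelope check of the monthly posting volume, assuming a 30-day month.
# The user counts and posting rates are the assumptions from above, not real data.
highly_active = 50_000 * 150 * 30      # 225 million posts
active        = 5_000_000 * 300        # 1.5 billion posts
occasional    = 20_000_000 * 5         # 100 million posts

posts_per_month  = highly_active + active + occasional    # ~1.825 billion
posts_per_hour   = posts_per_month / (30 * 24)            # ~2.5 million
posts_per_second = posts_per_hour / 3600                  # ~700

print(f"{posts_per_month:,} posts/month, "
      f"{posts_per_hour:,.0f} posts/hour, "
      f"{posts_per_second:,.0f} posts/second")
```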

That assumes a constant load on the system, which is probably not correct. For example, the 2016 Super Bowl saw a record of 152,000 tweets per minute, with close to 17 million tweets posted over the course of the game.

What this means is that the load is highly variable. We may see anywhere from a few hundred posts per second in quiet periods to thousands at peak. Note that 152,000 posts per minute is “just” 2,533 posts per second, which sounds a lot less scary, even though it means the same thing.

We’ll start by setting 2,500 posts per second as our current target for peak load.
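
As a quick sanity check on that target, here is a small sketch that reuses the average rate computed above:

```python
# Relating the Super Bowl peak to the steady-state average from the earlier sketch.
super_bowl_peak_per_minute = 152_000
peak_per_second = super_bowl_peak_per_minute / 60            # ~2,533 posts/sec
average_per_second = 1_825_000_000 / (30 * 24 * 3600)        # ~704 posts/sec

print(f"peak is about {peak_per_second:,.0f} posts/sec, "
      f"roughly {peak_per_second / average_per_second:.1f}x the average")
```

In other words, the 2,500 posts per second target is about 3.5 times the expected steady-state average, which is in line with the kind of spikes we have actually seen.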

One very important factor that we have to understand is what exactly we mean by “processing” a post. That means recording the act of posting and doing so within an acceptable time frame; we’ll call that a 200 ms latency budget at the 99.99th percentile.
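
To put that budget in perspective, here is a quick calculation (using the 2,500 posts per second peak target from above) of how much slack the 99.99th percentile actually leaves us:

```python
# How many slow posts does a 99.99th percentile / 200 ms budget allow at peak?
# This is only arithmetic on the post's own targets, not a benchmark.
peak_posts_per_second = 2_500
slo_percentile = 0.9999

allowed_slow_per_second = peak_posts_per_second * (1 - slo_percentile)   # 0.25
print(f"{allowed_slow_per_second:.2f} posts/sec may exceed the 200 ms budget")
print(f"that is roughly one slow post every {1 / allowed_slow_per_second:.0f} seconds")
```

At peak, only about one post every four seconds is allowed to take longer than 200 ms, which is a fairly demanding bar.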

I’m going to focus on text-only posts, because for binaries (pictures and movies) the solution is to throw the data on a CDN and link to it; nothing more really needs to be done. Most CDNs will already handle things like re-encoding, formatting, etc., so that isn’t something that you need to worry about to start with.

Now that I have laid the groundwork, we can start thinking about how to actually process this. That is going to be handled in a few separate pieces: first, how we accept and process a post, and then how we distribute it to all the followers. I’ll start dissecting those issues in my next post.

More posts in "Building a social media platform without going bankrupt" series:

  1. (05 Feb 2021) Part X–Optimizing for whales
  2. (04 Feb 2021) Part IX–Dealing with the past
  3. (03 Feb 2021) Part VIII–Tagging and searching
  4. (02 Feb 2021) Part VII–Counting views, replies and likes
  5. (01 Feb 2021) Part VI–Dealing with edits and deletions
  6. (29 Jan 2021) Part V–Handling the timeline
  7. (28 Jan 2021) Part IV–Caching and distribution
  8. (27 Jan 2021) Part III–Reading posts
  9. (26 Jan 2021) Part II–Accepting posts
  10. (25 Jan 2021) Part I–Laying the numbers