DEV Community

Jaspreet singh

HLD Fundamentals #7: Back-of-the-Envelope Calculations

When designing systems like Facebook, WhatsApp, Netflix, Amazon, or Instagram, some of the first questions a system designer asks are:

  • Can a single server handle the traffic?
  • How much storage will be needed?
  • Do we need caching?
  • How much RAM should our cache have?
  • How many servers should we deploy?

Before discussing databases, load balancers, microservices, or caching layers, we need a rough understanding of the scale.

This is where Back-of-the-Envelope Calculations come into the picture.


Why Do We Need Back-of-the-Envelope Calculations?

Imagine you're asked to design Facebook.

If you immediately start drawing:

Load Balancer
     ↓
Application Servers
     ↓
Redis Cache
     ↓
Database

without knowing the expected traffic, you're designing blindly.

System design is fundamentally about making trade-offs.

To make those trade-offs, we first need estimates.

Back-of-the-envelope calculations help us answer:

  • How much traffic will the system receive?
  • How much data will be generated?
  • How much cache memory is required?
  • How many servers are needed?

The numbers don't need to be perfect.

They only need to be close enough to make architectural decisions.


What Exactly Is a Back-of-the-Envelope Calculation?

A back-of-the-envelope calculation is a quick estimation technique used to approximate:

  • Traffic
  • Storage
  • Memory
  • Server Capacity

using rough assumptions.

Think of it as:

"Getting the order of magnitude correct rather than getting the exact number correct."

A system designer rarely needs perfect accuracy during interviews.

They need reasonable estimates.


The Standard Estimation Flow

Whenever you get a System Design question, work through the estimates in this order:

Users
  ↓
Traffic
  ↓
Storage
  ↓
RAM / Cache
  ↓
Number of Servers
  ↓
Architecture Design

Always estimate first.

Design later.


The Ultimate Estimation Cheat Sheet

Storage Units

Unit | Value
1 KB | 10³ Bytes
1 MB | 10⁶ Bytes
1 GB | 10⁹ Bytes
1 TB | 10¹² Bytes
1 PB | 10¹⁵ Bytes

Time Units

Unit | Value
1 Minute | 60 Seconds
1 Hour | 3,600 Seconds
1 Day | 86,400 Seconds

Common Assumptions

Metric | Approximation
Peak Traffic | 3× Average Traffic
Active Users | 10-20% of Total Users
Hot Data | 20% of Total Data
Cache Hit Rate | 80% of Requests

These assumptions are commonly used in interviews.
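
The cheat sheet above can be kept as a handful of constants. Here is a minimal sketch in Python, assuming decimal (power-of-ten) units as in the table. The names (`KB`, `SECONDS_PER_DAY`, `format_bytes`, and so on) are illustrative choices of mine, not part of any library.

```python
# Decimal storage units, matching the cheat sheet (1 KB = 10^3 bytes).
KB = 10**3
MB = 10**6
GB = 10**9
TB = 10**12
PB = 10**15

# Time units in seconds.
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400

# Rule-of-thumb ratios from the "Common Assumptions" table.
PEAK_TO_AVERAGE = 3        # peak traffic is roughly 3x the average
HOT_DATA_FRACTION = 0.20   # roughly 20% of data is "hot"
CACHE_HIT_RATE = 0.80      # roughly 80% of requests are served from cache


def format_bytes(num_bytes: float) -> str:
    """Render a byte count using the largest unit that keeps the value >= 1."""
    for name, size in (("PB", PB), ("TB", TB), ("GB", GB), ("MB", MB), ("KB", KB)):
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {name}"
    return f"{num_bytes:.0f} B"


print(format_bytes(26 * TB))   # 26.0 TB
print(format_bytes(500 * KB))  # 500.0 KB
```

Keeping the units as named constants makes it harder to slip a factor of 1,000 during the later steps.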


Example: Facebook Capacity Planning

Let's walk through a complete example.


Step 1: User Estimation

Assume Facebook has:

1 Billion Registered Users

Not every user is active daily.

Let's assume:

20% Daily Active Users

Therefore:

DAU = 1 Billion × 20%

DAU = 200 Million Users
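
As a quick sanity check, the same step in Python. This is only a sketch: `daily_active_users` is a hypothetical helper name, and the 20% figure is the assumption made above, not a measured value.

```python
def daily_active_users(registered_users: int, active_percent: int) -> int:
    """Estimate DAU as a percentage of registered users (integer math)."""
    return registered_users * active_percent // 100


dau = daily_active_users(1_000_000_000, 20)
print(f"{dau:,}")  # 200,000,000
```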

Step 2: Traffic Estimation

Assume each active user opens Facebook:

10 times per day

Total daily requests:

200 Million × 10

= 2 Billion Requests / Day

Now convert this into Requests Per Second (RPS).

2,000,000,000 / 86,400

≈ 23,000 RPS
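
The conversion above can be sketched in a few lines of Python. It follows the article's simplification that one app open counts as one request; `average_rps` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400


def average_rps(active_users: int, requests_per_user_per_day: int) -> float:
    """Average requests per second over a full day."""
    daily_requests = active_users * requests_per_user_per_day
    return daily_requests / SECONDS_PER_DAY


rps = average_rps(200_000_000, 10)
print(round(rps))  # 23148, which the article rounds to 23,000

# Mental shortcut: a day is roughly 10^5 seconds.
rough_rps = 200_000_000 * 10 // 10**5
print(rough_rps)  # 20000, same order of magnitude
```

The shortcut is off by about 15%, which is fine when the goal is the right order of magnitude.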

Peak Traffic Estimation

Traffic is never evenly distributed.

Peak traffic is usually around:

3 × Average Traffic

Therefore:

Peak RPS

= 23,000 × 3

≈ 70,000 RPS
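
A small sketch of the peak calculation, including the round-up from 69,000 to 70,000. The 3× factor is the rule of thumb from the cheat sheet; `peak_rps` and `round_up` are hypothetical helper names, and rounding to the nearest 10,000 is my reading of how the article arrives at 70,000.

```python
def peak_rps(avg_rps: int, peak_factor: int = 3) -> int:
    """Scale average traffic by a rule-of-thumb peak factor."""
    return avg_rps * peak_factor


def round_up(value: int, step: int) -> int:
    """Round value up to the next multiple of step (integer ceiling)."""
    return -(-value // step) * step


peak = peak_rps(23_000)
planning_peak = round_up(peak, 10_000)
print(peak, planning_peak)  # 69000 70000
```

Rounding up rather than down keeps the estimate on the conservative side.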

[Diagram: User → Load Balancer → Application Servers]


Step 3: Number of Servers Required

Assume:

1 Application Server

Handles 10,000 RPS

Required servers:

70,000 / 10,000

= 7 Servers

Always keep extra capacity for failures and traffic spikes.

So we provision:

10 Servers

Final Estimate

Application Servers Needed ≈ 10
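
The server count can be sketched the same way. The article goes from 7 to 10 servers without stating a headroom figure; the 40% default below is my assumption, chosen only because it reproduces that jump. `ceil_div` and `servers_needed` are hypothetical helper names.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def servers_needed(peak_rps: int, rps_per_server: int,
                   headroom_percent: int = 40) -> tuple[int, int]:
    """Return (bare minimum, provisioned) server counts.

    headroom_percent is an assumed safety margin for failures and spikes.
    """
    base = ceil_div(peak_rps, rps_per_server)
    provisioned = ceil_div(base * (100 + headroom_percent), 100)
    return base, provisioned


base, provisioned = servers_needed(70_000, 10_000)
print(base, provisioned)  # 7 10
```

Always round up when dividing load by capacity: 6.1 servers of load still needs 7 machines.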

Step 4: Storage Estimation

Now let's estimate how much storage Facebook needs.


User Profile Storage

Assume each user profile stores:

  • Name
  • Bio
  • Settings
  • Metadata

Approximate size:

1 KB per User

Storage required:

1 Billion × 1 KB

= 1 TB
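
Working in powers of ten makes this kind of multiplication easy to check: 10⁹ users × 10³ bytes = 10¹² bytes. A tiny sketch, using the 1 KB per profile assumption from above (variable names are illustrative):

```python
TB = 10**12

users = 10**9               # 1 billion registered users
bytes_per_profile = 10**3   # 1 KB per profile (assumed)

total_bytes = users * bytes_per_profile
print(total_bytes == 10**12)  # True: exponents add, 9 + 3 = 12
print(total_bytes // TB)      # 1
```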

User profiles consume surprisingly little storage.


Post Storage

Assume every active user creates:

1 Post / Day

Daily posts:

200 Million Posts

Average post size:

5 KB

Storage:

200 Million × 5 KB

≈ 1 TB / Day

Image Storage

This is where the real storage explosion happens.

Assume:

50 Million Images Uploaded Daily

Average image size:

500 KB

Storage:

50 Million × 500 KB

≈ 25 TB / Day

Total Daily Storage

Posts  = 1 TB
Images = 25 TB

Total = 26 TB / Day
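
The daily storage breakdown can be sketched as a small table of (count, size) pairs. All counts and sizes are the assumptions stated above; the dictionary layout and variable names are just one illustrative way to organize them.

```python
KB = 10**3
TB = 10**12

# (items per day, bytes per item), using the article's assumptions.
daily_items = {
    "posts": (200_000_000, 5 * KB),
    "images": (50_000_000, 500 * KB),
}

daily_bytes = {name: count * size for name, (count, size) in daily_items.items()}
total_bytes = sum(daily_bytes.values())

for name, num_bytes in daily_bytes.items():
    share = num_bytes / total_bytes
    print(f"{name}: {num_bytes / TB:.0f} TB/day ({share:.0%})")
print(f"total: {total_bytes / TB:.0f} TB/day")
# posts: 1 TB/day (4%)
# images: 25 TB/day (96%)
# total: 26 TB/day
```

Printing each item's share makes it obvious that images dominate, so that is where estimation effort should go.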

Annual Storage Growth

26 TB/Day × 365 Days

≈ 9,500 TB

≈ 9.5 PB
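
A sketch of the annual projection. The article's figure is for raw data only; the `replication_factor` parameter is my own addition to show how keeping extra copies would scale the number, and the value 3 is purely illustrative.

```python
DAYS_PER_YEAR = 365
TB_PER_PB = 1_000


def annual_growth_tb(daily_tb: int, replication_factor: int = 1) -> int:
    """Yearly storage growth in TB; replication_factor is a hypothetical knob."""
    return daily_tb * DAYS_PER_YEAR * replication_factor


raw_tb = annual_growth_tb(26)
print(raw_tb, raw_tb / TB_PER_PB)  # 9490 9.49

replicated_tb = annual_growth_tb(26, replication_factor=3)
print(replicated_tb / TB_PER_PB)   # 28.47
```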

This explains why large social media platforms rely heavily on distributed storage systems.


[Diagram: Posts + Images → Distributed Storage Cluster]


Step 5: RAM / Cache Estimation

Storage tells us how much data exists.

Cache tells us how much data needs to stay in memory.


Why Cache?

Fetching everything from a database is expensive.

Frequently accessed content should be stored in memory.

Examples:

  • News Feed
  • User Profiles
  • Trending Posts
  • Friend Lists

Hot Data Estimation

A common assumption is:

20% of data generates 80% of traffic

(Pareto Principle)

Assume total active data:

1 TB

Frequently accessed data:

20% of 1 TB

= 200 GB

Required Cache Size

Adding some safety buffer:

200 GB + Buffer

≈ 300 GB Cache
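
The cache sizing above, sketched in Python. The article does not say how large the buffer is; the 50% default below is my assumption, picked because it turns 200 GB into the stated 300 GB. `cache_size_bytes` is a hypothetical helper name.

```python
GB = 10**9
TB = 10**12


def cache_size_bytes(active_data_bytes: int, hot_percent: int = 20,
                     buffer_percent: int = 50) -> int:
    """Hot data (a share of active data) plus an assumed safety buffer."""
    hot_bytes = active_data_bytes * hot_percent // 100
    return hot_bytes * (100 + buffer_percent) // 100


size = cache_size_bytes(1 * TB)
print(size // GB)  # 300
```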

Cache Servers Required

Assume:

1 Redis Server

100 GB RAM

Then:

300 GB / 100 GB

= 3 Cache Servers
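
A sketch of the cache server count. The article treats all 100 GB per server as usable for cached data; the `usable_percent` parameter is my own hypothetical addition to show how the answer shifts if some RAM is reserved for overhead, and the 75% value is illustrative only.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def cache_servers(cache_gb: int, ram_per_server_gb: int,
                  usable_percent: int = 100) -> int:
    """Servers needed when only usable_percent of each server's RAM holds data."""
    # Scale both sides by 100 to stay in integer arithmetic.
    return ceil_div(cache_gb * 100, ram_per_server_gb * usable_percent)


print(cache_servers(300, 100))                     # 3
print(cache_servers(300, 100, usable_percent=75))  # 4
```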

[Diagram: Application Layer → Redis Cache → Database]


Final Facebook Estimation

Metric | Estimate
Total Users | 1 Billion
Daily Active Users | 200 Million
Daily Requests | 2 Billion
Average RPS | 23,000
Peak RPS | 70,000
Application Servers | 10
Daily Storage | 26 TB
Annual Storage | 9.5 PB
Cache Requirement | 300 GB
Redis Servers | 3
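
The whole walkthrough can be collected into one small script, so that changing a single assumption updates every downstream number. This is a sketch, not a capacity-planning tool: the `Assumptions` class and `estimate` function are hypothetical names, and the 40% server headroom and 50% cache buffer are my own assumptions chosen to reproduce the 7 → 10 and 200 GB → 300 GB steps in the article.

```python
from dataclasses import dataclass

KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15
SECONDS_PER_DAY = 86_400


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


@dataclass(frozen=True)
class Assumptions:
    registered_users: int = 1_000_000_000
    active_percent: int = 20
    requests_per_user_per_day: int = 10
    peak_factor: int = 3
    rps_per_app_server: int = 10_000
    server_headroom_percent: int = 40   # assumed; reproduces 7 -> 10
    posts_per_active_user: int = 1
    post_bytes: int = 5 * KB
    images_per_day: int = 50_000_000
    image_bytes: int = 500 * KB
    active_data_bytes: int = 1 * TB
    hot_percent: int = 20
    cache_buffer_percent: int = 50      # assumed; reproduces 200 -> 300 GB
    ram_per_cache_server_bytes: int = 100 * GB


def estimate(a: Assumptions) -> dict:
    """Run Users -> Traffic -> Servers -> Storage -> Cache in one pass."""
    dau = a.registered_users * a.active_percent // 100
    daily_requests = dau * a.requests_per_user_per_day
    avg_rps = daily_requests // SECONDS_PER_DAY
    peak_rps = avg_rps * a.peak_factor
    base_servers = ceil_div(peak_rps, a.rps_per_app_server)
    app_servers = ceil_div(base_servers * (100 + a.server_headroom_percent), 100)
    daily_storage = (dau * a.posts_per_active_user * a.post_bytes
                     + a.images_per_day * a.image_bytes)
    hot_bytes = a.active_data_bytes * a.hot_percent // 100
    cache_bytes = hot_bytes * (100 + a.cache_buffer_percent) // 100
    return {
        "dau": dau,
        "daily_requests": daily_requests,
        "avg_rps": avg_rps,
        "peak_rps": peak_rps,
        "app_servers": app_servers,
        "daily_storage_tb": daily_storage / TB,
        "annual_storage_pb": daily_storage * 365 / PB,
        "cache_gb": cache_bytes // GB,
        "cache_servers": ceil_div(cache_bytes, a.ram_per_cache_server_bytes),
    }


result = estimate(Assumptions())
for metric, value in result.items():
    print(f"{metric}: {value:,}")
```

The script skips the intermediate rounding (23,148 rather than 23,000 RPS), so a few values differ slightly from the table while landing on the same server counts.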

Interview Tips

Whenever you're asked to design:

  • Facebook
  • WhatsApp
  • Instagram
  • Netflix
  • Uber
  • YouTube

Don't start with databases.

Start with:

Let's estimate:

1. Users
2. Traffic
3. Storage
4. Cache
5. Servers

This immediately demonstrates senior-level thinking.

Interviewers love candidates who quantify before they architect.


Common Mistakes

Using Exact Numbers

Don't say:

Facebook has exactly 3.07 billion users.

Say:

Let's assume 1 billion users.

Forgetting Peak Traffic

Always calculate:

Peak RPS

because systems fail under peak load, not average load.
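
To see why, compare a fleet sized for the average with what the peak demands. A short sketch using the numbers from the Facebook example (23,000 average RPS, 70,000 peak RPS, 10,000 RPS per server); the variable names are illustrative.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


AVERAGE_RPS = 23_000
PEAK_RPS = 70_000
RPS_PER_SERVER = 10_000

fleet_for_average = ceil_div(AVERAGE_RPS, RPS_PER_SERVER)
fleet_for_peak = ceil_div(PEAK_RPS, RPS_PER_SERVER)
print(fleet_for_average, fleet_for_peak)  # 3 7

# How much peak traffic a fleet sized for the average cannot serve.
capacity = fleet_for_average * RPS_PER_SERVER
unserved = max(0, PEAK_RPS - capacity)
print(f"{unserved:,} RPS unserved ({unserved / PEAK_RPS:.0%} of peak)")
# 40,000 RPS unserved (57% of peak)
```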


Ignoring Cache

Many candidates estimate storage but completely forget memory requirements.

In large-scale systems, RAM costs far more per GB than disk, so cache sizing often constrains the design more than raw storage does.
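
One way to see what the cache buys you is to estimate how much traffic still reaches the database. The sketch below assumes every request triggers exactly one data lookup (a simplification of mine) and uses the 80% cache figure from the cheat sheet; `backend_rps` is a hypothetical helper name.

```python
def backend_rps(total_rps: int, hit_percent: int) -> int:
    """Requests per second that miss the cache and reach the database."""
    return total_rps * (100 - hit_percent) // 100


PEAK_RPS = 70_000

for hit_percent in (0, 80, 90):
    print(hit_percent, backend_rps(PEAK_RPS, hit_percent))
# 0 70000
# 80 14000
# 90 7000
```

Under these assumptions, a few hundred GB of RAM takes the database from 70,000 to 14,000 RPS at peak.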


Trade-Offs

Approach | Pros | Cons
No Estimation | Faster | Poor Design Decisions
Back-of-the-Envelope | Quick & Practical | Approximate
Detailed Capacity Planning | Accurate | Time Consuming

Key Takeaways

  • Back-of-the-envelope calculations are rough estimates used before designing a system.
  • Estimate Users → Traffic → Storage → Cache → Servers.
  • Use assumptions instead of exact values.
  • Always calculate both Average RPS and Peak RPS.
  • Estimate storage growth annually.
  • Estimate hot data for cache sizing.
  • This skill is heavily tested in System Design Interviews.

System design is not about drawing architecture diagrams first.

It's about understanding scale first and then choosing the right architecture.

How do you approach capacity planning during system design interviews? Do you estimate traffic first, storage first, or both together?
