Designing Comment Threads for TikTok-Style Applications

In modern content platforms like TikTok, the comment thread is an essential interaction space. Users can post top-level comments and reply to others, forming shallow trees. While this might seem simple, designing this system to be efficient, scalable, and user-friendly takes some thoughtful architecture.

This post explores how comment threads are modeled, stored, and reconstructed, then closes with a bonus section on using Redis for performance.

Comment Thread Structure

Each video can have:

  • Many top-level comments
  • Optional replies, each pointing to a parent_comment_id

Data Model (Relational Database)

CREATE TABLE comments (
    id BIGINT PRIMARY KEY,
    video_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    parent_comment_id BIGINT NULL,  -- NULL for top-level comments
    content TEXT NOT NULL,
    replies_n BIGINT NOT NULL DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
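
The schema only stays useful for ranking if replies_n is kept in step with the replies actually stored. The sketch below shows one way to do that, assuming replies_n is a denormalized reply counter (the post does not say so explicitly). It uses Python's built-in sqlite3 module so it runs anywhere, which means the column types and the timestamp default are SQLite adaptations of the DDL above, and add_comment is a hypothetical helper name.

```python
import sqlite3

# SQLite adaptation of the schema above (assumption: a production database
# would keep BIGINT and NOW() as written in the original DDL).
SCHEMA = """
CREATE TABLE comments (
    id INTEGER PRIMARY KEY,
    video_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    parent_comment_id INTEGER NULL,
    content TEXT NOT NULL,
    replies_n INTEGER NOT NULL DEFAULT 0,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def add_comment(conn, video_id, user_id, content, parent_comment_id=None):
    """Insert a comment; if it is a reply, bump the parent's replies_n.

    Hypothetical helper: both statements run in one transaction so the
    counter cannot drift from the rows it counts.
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT INTO comments (video_id, user_id, parent_comment_id, content) "
            "VALUES (?, ?, ?, ?)",
            (video_id, user_id, parent_comment_id, content),
        )
        if parent_comment_id is not None:
            conn.execute(
                "UPDATE comments SET replies_n = replies_n + 1 WHERE id = ?",
                (parent_comment_id,),
            )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

top_id = add_comment(conn, video_id=123, user_id=1, content="Nice vid!")
add_comment(conn, video_id=123, user_id=2, content="Agree!", parent_comment_id=top_id)
add_comment(conn, video_id=123, user_id=3, content="Same!", parent_comment_id=top_id)

replies_n = conn.execute(
    "SELECT replies_n FROM comments WHERE id = ?", (top_id,)
).fetchone()[0]
print(replies_n)
```

Updating the counter in the same transaction as the insert is a design choice, not the only option: a high-traffic system might instead batch counter updates asynchronously and accept a briefly stale count.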

Reconstructing Comment Threads

To render a full comment thread:

1. Fetch Top-Level Comments

SELECT * FROM comments
WHERE video_id = :video_id AND parent_comment_id IS NULL
ORDER BY replies_n DESC
LIMIT 50;

2. Fetch Replies for Those Comments

SELECT * FROM comments
WHERE parent_comment_id IN (:top_level_comment_ids);
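
Most database drivers cannot bind a whole list to a single parameter such as :top_level_comment_ids, so the IN clause usually has to be expanded to one placeholder per id. The following is a minimal sketch of that expansion using Python's sqlite3 module. The function name fetch_replies, the trimmed-down table, and the ORDER BY clause are assumptions made for illustration.

```python
import sqlite3


def fetch_replies(conn, top_level_comment_ids):
    """Fetch every reply whose parent is in the given id list (hypothetical helper)."""
    ids = list(top_level_comment_ids)
    if not ids:
        return []  # many databases reject an empty "IN ()", so short-circuit
    # One "?" per id; the values themselves are still bound as parameters,
    # so no user data is interpolated into the SQL string.
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT id, parent_comment_id, content FROM comments "
        f"WHERE parent_comment_id IN ({placeholders}) "
        "ORDER BY created_at, id"
    )
    rows = conn.execute(sql, ids).fetchall()
    return [
        {"id": r[0], "parent_comment_id": r[1], "content": r[2]} for r in rows
    ]


# Tiny demo table with only the columns this query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, parent_comment_id INTEGER, "
    "content TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        (1, None, "Nice vid!", "2024-01-01 10:00:00"),
        (2, None, "First", "2024-01-01 10:01:00"),
        (3, None, "Other thread", "2024-01-01 10:01:30"),
        (4, 1, "Agree!", "2024-01-01 10:02:00"),
        (5, 1, "Same!", "2024-01-01 10:03:00"),
        (6, 3, "Not on this page", "2024-01-01 10:04:00"),
    ],
)

replies = fetch_replies(conn, [1, 2])
print([r["id"] for r in replies])
```

Because step 1 caps the page at 50 top-level comments, the placeholder list stays short. With much larger id lists, a driver or database limit on bound parameters may apply.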

3. Group Replies by Parent (In Memory)

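from collections import defaultdict

# `top_comments` and `replies` are the row lists (as dicts) from steps 1 and 2.
replies_by_parent = defaultdict(list)
for reply in replies:
    replies_by_parent[reply["parent_comment_id"]].append(reply)

thread = []
for comment in top_comments:
    # .get() avoids creating empty entries for comments without replies
    comment["replies"] = replies_by_parent.get(comment["id"], [])
    thread.append(comment)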

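This reconstruction approach is efficient: it runs in O(N + M) time, where N is the number of top-level comments and M is the total number of replies, because each reply is appended once and each top-level comment needs a single hash lookup.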

Scaling Considerations

To handle millions of comments:

  • Paginate top-level comments (and load replies lazily)
  • Use indexes on video_id, parent_comment_id
  • Consider denormalizing data for hot content
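
The first two bullets can be sketched together: a composite index that matches the top-level query, plus keyset (cursor-based) pagination instead of OFFSET. This is an illustrative sketch under stated assumptions, not the post's prescribed approach. The index names, the fetch_top_level_page function, and the (replies_n, id) cursor are hypothetical, and sqlite3 stands in for the production database.

```python
import sqlite3


def fetch_top_level_page(conn, video_id, page_size, after=None):
    """Return (rows, next_cursor) for one page of top-level comments.

    `after` is the (replies_n, id) pair of the last row on the previous
    page, or None for the first page.
    """
    sql = (
        "SELECT id, replies_n, content FROM comments "
        "WHERE video_id = ? AND parent_comment_id IS NULL "
    )
    params = [video_id]
    if after is not None:
        last_replies_n, last_id = after
        # Continue strictly after the last row in (replies_n DESC, id DESC) order.
        sql += "AND (replies_n < ? OR (replies_n = ? AND id < ?)) "
        params += [last_replies_n, last_replies_n, last_id]
    sql += "ORDER BY replies_n DESC, id DESC LIMIT ?"
    params.append(page_size)
    rows = conn.execute(sql, params).fetchall()
    # A short page means there is nothing left to fetch.
    next_cursor = (rows[-1][1], rows[-1][0]) if len(rows) == page_size else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, video_id INTEGER NOT NULL, "
    "parent_comment_id INTEGER, content TEXT NOT NULL, "
    "replies_n INTEGER NOT NULL DEFAULT 0)"
)
# Hypothetical index names; the column order mirrors the WHERE and ORDER BY clauses.
conn.execute(
    "CREATE INDEX idx_comments_top_level "
    "ON comments (video_id, parent_comment_id, replies_n, id)"
)
conn.execute("CREATE INDEX idx_comments_parent ON comments (parent_comment_id)")
conn.executemany(
    "INSERT INTO comments (id, video_id, parent_comment_id, content, replies_n) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, 123, None, "a", 5),
        (2, 123, None, "b", 3),
        (3, 123, None, "c", 3),
        (4, 123, None, "d", 0),
        (5, 123, None, "e", 9),
    ],
)

page1, cursor1 = fetch_top_level_page(conn, 123, page_size=2)
page2, cursor2 = fetch_top_level_page(conn, 123, page_size=2, after=cursor1)
page3, cursor3 = fetch_top_level_page(conn, 123, page_size=2, after=cursor2)
print([row[0] for row in page1 + page2 + page3])
```

One caveat with this design: replies_n changes as new replies arrive, so a comment can move between pages while a user is scrolling. Paginating on an immutable key such as (created_at, id) avoids that at the cost of a different sort order.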

🔥 Bonus: Redis Caching for Hot Threads

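To offload frequent reads from the database and keep response times low, cache the full reconstructed thread in Redis. A cache hit then costs one key lookup and one JSON decode instead of two SQL queries and an in-memory grouping pass.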

Example Structure

{
  "video_id": 123,
  "comments": [
    {
      "id": 1,
      "content": "Nice vid!",
      "replies": [
        { "id": 4, "content": "Agree!" },
        { "id": 5, "content": "Same!" }
      ]
    }
  ]
}

Redis Key

comment_thread:video:123

Recompute and Cache

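import json

def recompute_comment_thread(video_id):
    # `db` and `redis` are assumed to be already-configured client objects.
    top_comments = db.query(...)
    replies = db.query(...)
    ...
    # Cache the serialized thread for 10 minutes (ex is a TTL in seconds).
    redis.set(f"comment_thread:video:{video_id}", json.dumps(thread), ex=600)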

You can trigger recompute on new comment events or via background jobs.
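
For completeness, here is a hedged sketch of the read path that goes with that recompute step: check the cache first, fall back to rebuilding on a miss, and refresh the entry when a new comment arrives. To keep the example runnable without a Redis server, InMemoryCache is a hypothetical stand-in that mimics only the get and set calls used here. With the redis-py client the same two calls exist, though get returns bytes by default. The build_thread callback and the function names are also assumptions.

```python
import json


class InMemoryCache:
    """Hypothetical stand-in for a Redis client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None):
        # A real Redis server expires the key after `ex` seconds;
        # this stub stores the value and ignores the TTL.
        self._data[key] = value


def thread_key(video_id):
    # Same key format as the post: comment_thread:video:<id>
    return f"comment_thread:video:{video_id}"


def recompute_comment_thread(cache, video_id, build_thread, ttl_seconds=600):
    """Rebuild the thread from the source of truth and overwrite the cache."""
    thread = build_thread(video_id)
    cache.set(thread_key(video_id), json.dumps(thread), ex=ttl_seconds)
    return thread


def get_comment_thread(cache, video_id, build_thread):
    """Return the cached thread, rebuilding it only on a cache miss."""
    cached = cache.get(thread_key(video_id))
    if cached is not None:
        return json.loads(cached)
    return recompute_comment_thread(cache, video_id, build_thread)


# Demo: count how often the (expensive) rebuild actually runs.
build_calls = []


def build_thread(video_id):
    build_calls.append(video_id)
    return {
        "video_id": video_id,
        "comments": [{"id": 1, "content": "Nice vid!", "replies": []}],
    }


cache = InMemoryCache()
first = get_comment_thread(cache, 123, build_thread)   # miss: rebuilds
second = get_comment_thread(cache, 123, build_thread)  # hit: no rebuild
recompute_comment_thread(cache, 123, build_thread)     # new-comment event
third = get_comment_thread(cache, 123, build_thread)   # hit again
print(len(build_calls))
```

Recomputing eagerly on every new comment keeps reads fast but can be wasteful for very active threads. Deleting the key and letting the next read rebuild it, or relying on the TTL alone, are common alternatives.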

Conclusion

Designing comment threads involves balancing simplicity, flexibility, and performance. Whether you're serving 100 users or 100 million, the pattern of parent-child comment structuring and smart reconstruction gives you a strong foundation.

And when performance matters most? Add a Redis layer to make it fly.