The Messenger
Build the foundation of distributed communication. You will implement a Maelstrom node that handles JSON messages, processes initialization, and responds to echo requests. This track teaches the fundamental protocol that underlies all subsequent challenges.
Subtracks & Tasks
Hello, Distributed World
Implement Basic JSON Message Parser
In distributed systems, nodes communicate by exchanging messages. The Maelstrom framework uses JSON messages over stdin/stdout for simplicity and lang...
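As a minimal sketch of the parsing step, assuming each message arrives as one line of JSON with `src`, `dest`, and `body` fields (the helper names `parse_message`, `encode_message`, and `send` are illustrative, not part of Maelstrom):

```python
import json
import sys


def parse_message(line):
    """Decode one newline-delimited JSON message into (src, dest, body)."""
    msg = json.loads(line)
    return msg["src"], msg["dest"], msg["body"]


def encode_message(src, dest, body):
    """Encode a message as a single line of JSON (no trailing newline)."""
    return json.dumps({"src": src, "dest": dest, "body": body})


def send(src, dest, body, out=sys.stdout):
    """Write one message per line and flush so it is not held in a buffer."""
    out.write(encode_message(src, dest, body) + "\n")
    out.flush()
```

Flushing after every write matters here: a buffered reply never reaches the test harness, and the request looks like it timed out.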
Handle Init Message and Store Cluster Metadata
Before processing any workload, Maelstrom sends an init message to each node. This message tells your node its identity and the full cluster membershi...
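One way to store the cluster metadata, sketched under the assumption that the init body carries `node_id` and `node_ids` and expects an `init_ok` reply (the `Node` class and `handle_init` method are hypothetical names for illustration):

```python
class Node:
    """Holds the identity and membership announced by the init message."""

    def __init__(self):
        self.node_id = None
        self.node_ids = []

    def handle_init(self, msg):
        body = msg["body"]
        self.node_id = body["node_id"]
        # Copy the list so later mutation of the message cannot change our view.
        self.node_ids = list(body["node_ids"])
        return {
            "src": self.node_id,
            "dest": msg["src"],
            "body": {"type": "init_ok", "in_reply_to": body["msg_id"]},
        }
```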
Implement Echo Service with Proper Acknowledgment
The echo workload is the simplest Maelstrom workload. Clients send echo messages containing a value, and your node must echo that value back. Request...
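The echo exchange can be sketched as a pure function that builds the reply, assuming the request body has `msg_id` and `echo` fields and the reply type is `echo_ok` (`handle_echo` and its parameters are illustrative names):

```python
def handle_echo(node_id, msg, next_msg_id):
    """Build the echo_ok reply for an echo request."""
    body = msg["body"]
    return {
        "src": node_id,
        "dest": msg["src"],          # reply goes back to the sender
        "body": {
            "type": "echo_ok",
            "msg_id": next_msg_id,   # this node's own message counter
            "in_reply_to": body["msg_id"],  # ties the reply to the request
            "echo": body["echo"],
        },
    }
```

The `in_reply_to` field is the acknowledgment: it is how the client matches this reply to the request it sent.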
Add Message Envelope Validation
Production systems must handle malformed input gracefully. Your node should validate that incoming messages have the required structure before process...
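A rough sketch of envelope validation follows. The exact rules are an assumption (string `src` and `dest`, an object `body` with a string `type`), and `validate_envelope` and `error_body` are hypothetical helpers. Error code 12 is the malformed-request code defined by the Maelstrom protocol.

```python
MALFORMED_REQUEST = 12  # Maelstrom's "malformed-request" error code


def validate_envelope(msg):
    """Return None if msg is well-formed, otherwise a human-readable reason."""
    if not isinstance(msg, dict):
        return "message must be a JSON object"
    for field in ("src", "dest"):
        if not isinstance(msg.get(field), str):
            return f"missing or non-string field: {field}"
    body = msg.get("body")
    if not isinstance(body, dict):
        return "missing or non-object field: body"
    if not isinstance(body.get("type"), str):
        return "body.type must be a string"
    return None


def error_body(reason, in_reply_to=None):
    """Build an error body; in_reply_to is omitted when the request had no msg_id."""
    body = {"type": "error", "code": MALFORMED_REQUEST, "text": reason}
    if in_reply_to is not None:
        body["in_reply_to"] = in_reply_to
    return body
```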
Create Async Event Loop for Concurrent Message Handling
Real distributed systems handle many messages concurrently. Your current synchronous implementation processes one message at a time, which limits thro...
RPC and the Request-Response Model
Implement Synchronous RPC with Timeout
In a distributed system, nodes often need to call remote procedures on other nodes and wait for the result. This is called **synchronous RPC** (Remote...
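The blocking wait at the heart of a synchronous RPC could look like the sketch below. It assumes a separate reader thread delivers replies; `PendingCall` is a hypothetical helper built on the standard `threading.Event`.

```python
import threading


class PendingCall:
    """One in-flight RPC: the caller blocks in wait() until complete() runs."""

    def __init__(self):
        self._event = threading.Event()
        self._reply = None

    def complete(self, reply):
        """Called by the reader thread when the matching reply arrives."""
        self._reply = reply
        self._event.set()

    def wait(self, timeout):
        """Block for up to `timeout` seconds; raise TimeoutError if no reply came."""
        if not self._event.wait(timeout):
            raise TimeoutError("RPC timed out")
        return self._reply
```

In a full node, the caller would register the `PendingCall` under the request's `msg_id` before sending, so the reader thread can find it by `in_reply_to`.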
Implement Timeout and Retry Loop for RPC
In distributed systems, messages can be lost or delayed indefinitely. A single RPC call with a timeout is not enough — you need a **retry loop** to ha...
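A retry loop around a single timed call might be sketched like this. It assumes `send_rpc(timeout)` is a caller-supplied function that returns the reply or raises `TimeoutError`; both that contract and the name `call_with_retries` are illustrative.

```python
def call_with_retries(send_rpc, max_attempts=3, timeout=1.0):
    """Call send_rpc(timeout) until it returns, giving up after max_attempts."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return send_rpc(timeout)
        except TimeoutError as err:
            last_error = err  # remember the failure and try again
    raise TimeoutError(f"no reply after {max_attempts} attempts") from last_error
```

Note that a retry can deliver the same request twice when the original reply was merely slow, so the receiving handler should be safe to run more than once.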
Implement Async RPC Using Callbacks
Synchronous RPC blocks the caller until a reply arrives, which prevents the node from handling other messages during that time. In high-throughput dis...
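A callback-based client can be sketched as a table from `msg_id` to callback. Here `RpcClient` and the injected `send(dest, body)` transport are assumptions made for illustration.

```python
class RpcClient:
    """Sends requests without blocking and dispatches replies to callbacks."""

    def __init__(self, send):
        self.send = send        # hypothetical transport: send(dest, body)
        self.next_msg_id = 0
        self.callbacks = {}     # msg_id -> callback awaiting a reply

    def rpc(self, dest, body, callback):
        """Send body to dest and return immediately; callback runs on reply."""
        self.next_msg_id += 1
        msg_id = self.next_msg_id
        self.callbacks[msg_id] = callback
        self.send(dest, {**body, "msg_id": msg_id})
        return msg_id

    def handle_reply(self, body):
        """Run and remove the callback for body['in_reply_to'], if one exists."""
        callback = self.callbacks.pop(body.get("in_reply_to"), None)
        if callback is None:
            return False        # unknown or already-handled reply
        callback(body)
        return True
```

Popping the callback before running it means a duplicated reply is ignored rather than handled twice.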
Implement Callback Reaper for Leaked RPCs
When a node sends an async RPC but the recipient crashes or the network drops the message, the callback stays in memory forever. This is a **resource ...
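One possible shape for a reaper is sketched below: each callback is stored with a deadline, and a periodic `reap()` call discards the expired ones. The `CallbackTable` class, its `ttl` parameter, and the injectable `clock` are assumptions for illustration.

```python
import time


class CallbackTable:
    """Pending RPC callbacks with deadlines so abandoned ones can be reclaimed."""

    def __init__(self, ttl=5.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock      # injectable so tests can control time
        self.pending = {}       # msg_id -> (deadline, callback)

    def register(self, msg_id, callback):
        self.pending[msg_id] = (self.clock() + self.ttl, callback)

    def resolve(self, msg_id, reply):
        """Run the callback for msg_id if it is still pending."""
        entry = self.pending.pop(msg_id, None)
        if entry is None:
            return False
        entry[1](reply)
        return True

    def reap(self):
        """Drop every entry whose deadline has passed; return the reaped ids."""
        now = self.clock()
        expired = [m for m, (deadline, _) in self.pending.items() if deadline <= now]
        for msg_id in expired:
            del self.pending[msg_id]
        return expired
```

`time.monotonic` is used rather than wall-clock time so that a system clock adjustment cannot expire or revive entries.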
Implement Exponential Backoff for Retries
Fixed-interval retries can overwhelm a recovering system. When many nodes retry at the same interval, they create a **thundering herd** that prevents ...
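The delay schedule can be sketched as a generator. This version uses capped exponential backoff with full jitter, which is one common variant and not necessarily the one the task prescribes; `backoff_delays` and its parameters are illustrative names.

```python
import random


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=5, rng=random.random):
    """Yield one sleep duration (seconds) per retry.

    The ceiling grows as base * factor**attempt up to `cap`; the actual delay
    is a random fraction of that ceiling so nodes do not retry in lockstep.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * factor ** attempt)
        yield rng() * ceiling
```

With the defaults, the ceilings are 0.1, 0.2, 0.4, 0.8, and 1.6 seconds. The randomization is what breaks up the herd: without it, nodes that failed together would still retry together, only less often.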
The Protocol Beneath
Model Message Format with Typed Schema
Raw JSON is just strings. A typed schema wraps the raw message in classes with explicit fields, validation, and serialization methods — making it impo...
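A typed envelope might be sketched with a frozen dataclass. The `Message` class, its validation rules, and its method names are assumptions for illustration, not a schema defined by Maelstrom.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A typed wrapper around the raw src/dest/body envelope."""

    src: str
    dest: str
    body: dict

    def __post_init__(self):
        # Reject structurally invalid messages at construction time.
        if not isinstance(self.src, str) or not isinstance(self.dest, str):
            raise TypeError("src and dest must be strings")
        if not isinstance(self.body, dict) or "type" not in self.body:
            raise ValueError("body must be an object with a 'type' field")

    @property
    def type(self):
        return self.body["type"]

    @classmethod
    def from_json(cls, line):
        raw = json.loads(line)
        return cls(src=raw["src"], dest=raw["dest"], body=raw["body"])

    def to_json(self):
        return json.dumps({"src": self.src, "dest": self.dest, "body": self.body})
```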
Add Message Envelope Logger with Timestamps
In production distributed systems, **message tracing** is critical for debugging. When something goes wrong, you need to answer: "What messages did th...
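A small tracing helper could look like the sketch below. It writes to stderr because stdout carries protocol messages; the function names and the log line layout are assumptions.

```python
import json
import sys
from datetime import datetime, timezone


def format_log_entry(direction, msg, now=None):
    """Return '<ISO-8601 UTC timestamp> <direction> <message as JSON>'."""
    now = now or datetime.now(timezone.utc)
    return f"{now.isoformat()} {direction} {json.dumps(msg, sort_keys=True)}"


def log_message(direction, msg, stream=sys.stderr):
    """Write one trace line; never use stdout, which is the message channel."""
    print(format_log_entry(direction, msg), file=stream, flush=True)
```

Calling `log_message("recv", msg)` on every inbound message and `log_message("send", msg)` on every outbound one gives a timeline that can be merged across nodes by timestamp.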
Implement Message Deduplication with LRU Cache
Networks can duplicate messages. If a sender retries because it did not receive an acknowledgment (but the original was actually delivered), the recei...
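A bounded deduplication cache can be sketched with `collections.OrderedDict`. Keying on `(src, msg_id)` is an assumption (message ids are only unique per sender), and `DedupCache` is a hypothetical name.

```python
from collections import OrderedDict


class DedupCache:
    """Remembers the most recently seen (src, msg_id) pairs, up to capacity."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.seen = OrderedDict()

    def is_duplicate(self, src, msg_id):
        """Return True if this message was seen before; otherwise record it."""
        key = (src, msg_id)
        if key in self.seen:
            self.seen.move_to_end(key)      # refresh: most recently used
            return True
        self.seen[key] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)   # evict the least recently used
        return False
```

The bound trades memory for correctness: a duplicate that arrives after its entry was evicted is processed again, so the capacity should comfortably exceed the number of messages that can arrive within the retry window.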
Benchmark Node Throughput and Latency
How fast is your node? In production systems, you need to measure **throughput** (messages per second) and **latency** (time to process each message)....
Add Chaos Mode with Random Message Dropping
Real networks drop messages. Netflix pioneered **chaos engineering** — deliberately injecting failures to test resilience. Your task is to add a "chao...
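A drop decision can be sketched as a small seeded filter. The `ChaosFilter` class, `drop_rate` parameter, and `seed` option are illustrative assumptions; seeding makes a failing run reproducible.

```python
import random


class ChaosFilter:
    """Decides, per message, whether to silently drop it."""

    def __init__(self, drop_rate=0.1, seed=None):
        if not 0.0 <= drop_rate <= 1.0:
            raise ValueError("drop_rate must be between 0 and 1")
        self.drop_rate = drop_rate
        self.rng = random.Random(seed)  # private RNG so the seed is honored
        self.dropped = 0

    def should_drop(self):
        """Return True with probability drop_rate and count the drop."""
        drop = self.rng.random() < self.drop_rate
        if drop:
            self.dropped += 1
        return drop
```

A node would call `should_drop()` before handling each inbound message and skip the handler when it returns True, which exercises the retry and timeout paths built earlier.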
Interview Prep
Common interview questions for Distributed Systems / Backend Engineer roles that map directly to what you build in this track.
Questions are representative of real interview patterns. Model answers are starting points; adapt them with your own experience and the specific context of the interview.
Common Mistakes
The top 5 mistakes builders make in this track, and exactly how to fix them.
Comparison Mode
Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.
Rabbit Holes
For when you want to go deeper. Curated papers, posts, and talks beyond what this track covers.
Designing Data-Intensive Applications — Chapter 4: Encoding and Evolution
Kleppmann's chapter on serialization formats covers exactly why JSON, Protobuf, and Avro make the tradeoffs they do. Required reading before you pick a wire format for anything serious.
The Network is Reliable
Peter Bailis and Kyle Kingsbury's survey of real-world network failures that engineers assumed couldn't happen. Read this before you trust any "reliable" abstraction.
gRPC Design Principles
How the gRPC team decided what to build and what to leave out. The constraints they chose (HTTP/2, Protocol Buffers, streaming) are a masterclass in API design at scale.
Maelstrom Protocol Documentation
The full specification for the message protocol you just implemented. Understanding this deeply unlocks every subsequent track.