Subtracks & Tasks
Scheduling
Implement Job Scheduling System
A job scheduler runs tasks at a specific time or on a recurring schedule without the caller waiting. It is the backbone of background work: sending em...
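One way to picture the core of such a scheduler is a priority queue ordered by run time. The sketch below is a minimal in-memory illustration, not a production design: the `JobScheduler` class, its method names, and the caller-supplied `now` value are all assumptions made for the example, and persistence, worker threads, and error handling are left out.

```python
import heapq
import itertools
from typing import Callable, Optional


class JobScheduler:
    """Toy scheduler: run_pending(now) fires every job whose time has come."""

    def __init__(self) -> None:
        self._heap = []                # entries: (run_at, seq, func, interval)
        self._seq = itertools.count()  # tie-breaker so functions are never compared

    def schedule(self, run_at: float, func: Callable[[], None],
                 interval: Optional[float] = None) -> None:
        if interval is not None and interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._heap, (run_at, next(self._seq), func, interval))

    def run_pending(self, now: float) -> int:
        """Run all due jobs in time order and return how many ran."""
        ran = 0
        while self._heap and self._heap[0][0] <= now:
            run_at, _, func, interval = heapq.heappop(self._heap)
            func()
            ran += 1
            if interval is not None:
                # Recurring job: re-queue relative to its scheduled time,
                # not the actual run time, so the schedule does not drift.
                self.schedule(run_at + interval, func, interval)
        return ran
```

A real scheduler would call `run_pending(time.time())` from a background loop; passing the clock in explicitly keeps the sketch easy to test.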
Implement DAG-Based Task Scheduling
A DAG (Directed Acyclic Graph) scheduler orchestrates workflows where tasks have dependencies. Task B cannot start until Task A finishes, but independ...
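The dependency rule can be sketched with Kahn's algorithm, which repeatedly picks every task whose dependencies are all finished. The function name `execution_waves` and the shape of the `deps` mapping are assumptions for illustration; each returned "wave" is a group of tasks that could run in parallel.

```python
def execution_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """deps maps each task to the set of tasks it depends on."""
    remaining = {task: set(d) for task, d in deps.items()}
    # Tasks that appear only as dependencies have no prerequisites of their own.
    for d in deps.values():
        for task in d:
            remaining.setdefault(task, set())

    waves = []
    while remaining:
        # Every task with no unfinished dependency is ready to run now.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cycle detected: no task can make progress")
        waves.append(ready)
        for task in ready:
            del remaining[task]
        # Mark the finished wave as satisfied for everything still waiting.
        for d in remaining.values():
            d.difference_update(ready)
    return waves
```

For example, with `{"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}` the tasks B and C land in the same wave because neither depends on the other.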
Implement Resource Management for Jobs
Jobs compete for finite resources like CPU, memory, and GPU. Resource management tracks what is available, grants allocations when possible, and queue...
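The track-grant-queue cycle can be sketched as follows. This is a minimal illustration under stated assumptions: the `ResourceManager` class and its methods are hypothetical, resources are plain integer counts, and the queue is strictly first-in-first-out, so a large job at the head can block smaller jobs behind it.

```python
from collections import deque


class ResourceManager:
    """Tracks free capacity, grants requests that fit, and queues the rest."""

    def __init__(self, capacity: dict) -> None:
        self.free = dict(capacity)  # e.g. {"cpu": 4, "gpu": 1}
        self.allocations = {}       # job_id -> resources currently held
        self.waiting = deque()      # FIFO queue of (job_id, need)

    def _fits(self, need: dict) -> bool:
        return all(self.free.get(res, 0) >= n for res, n in need.items())

    def _grant(self, job_id: str, need: dict) -> None:
        for res, n in need.items():
            self.free[res] -= n
        self.allocations[job_id] = need

    def request(self, job_id: str, need: dict) -> bool:
        """Return True if granted immediately, False if the job was queued."""
        # Never jump ahead of jobs that are already waiting.
        if not self.waiting and self._fits(need):
            self._grant(job_id, need)
            return True
        self.waiting.append((job_id, need))
        return False

    def release(self, job_id: str) -> list:
        """Free a job's resources and return the queued jobs that now start."""
        need = self.allocations.pop(job_id)
        for res, n in need.items():
            self.free[res] += n
        started = []
        while self.waiting and self._fits(self.waiting[0][1]):
            next_id, next_need = self.waiting.popleft()
            self._grant(next_id, next_need)
            started.append(next_id)
        return started
```

Strict FIFO is the simplest fair policy; a scheduler that lets small jobs skip ahead (backfilling) improves utilization at the cost of possibly delaying the job at the head of the queue.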
Implement Job Monitoring and Observability
Job monitoring gives operators visibility into what is running, how long it takes, and when things go wrong. Without it, a failed job can go undetecte...
Implement Job Deadlines and Timeouts
Without deadlines, a hung job can occupy resources indefinitely and starve everything else. Timeout enforcement cancels overdue jobs and reclaims thei...
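As a rough sketch of timeout enforcement, Python's `asyncio.wait_for` cancels the awaited job when the timeout expires, and a `finally` block inside the job gives it a chance to hand back what it holds. The helper `run_with_deadline`, the `hung_job` coroutine, and the `released` list standing in for a resource pool are all hypothetical names for this example.

```python
import asyncio


async def run_with_deadline(coro, timeout_s: float):
    """Run a job coroutine; cancel it if it exceeds timeout_s seconds."""
    try:
        result = await asyncio.wait_for(coro, timeout=timeout_s)
        return "done", result
    except asyncio.TimeoutError:
        # wait_for has already cancelled the job by the time we get here.
        return "timed_out", None


async def hung_job(released: list):
    """Simulates a job that never finishes on its own."""
    try:
        await asyncio.sleep(3600)
    finally:
        # Runs on cancellation too, so the resource is always reclaimed.
        released.append("slot")


if __name__ == "__main__":
    released = []
    status, _ = asyncio.run(run_with_deadline(hung_job(released), 0.05))
    print(status, released)
```

Cancellation here is cooperative: a job that blocks the event loop or swallows `CancelledError` would not be stopped, which is why process-level schedulers usually fall back to killing the worker.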
Service Mesh
Implement Service Mesh Architecture
A service mesh adds a sidecar proxy to every service. All traffic flows through these proxies, which transparently handle service discovery, retries, ...
Implement mTLS Authentication in Service Mesh
In a service mesh, every service-to-service call must be authenticated. mTLS (mutual TLS) achieves this by requiring **both** the client and the serve...
Implement Traffic Splitting in Service Mesh
Traffic splitting lets you roll out a new version gradually instead of switching all users at once. You can send a small percentage to the new version...
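A percentage split can be sketched as a weighted choice over versions. The version below hashes a request key (for example a user ID) so the same caller consistently lands on the same version; that stickiness is an assumption of this example rather than something every mesh does, and the function name `pick_version` and the weight format are hypothetical. In a real mesh the split is configured declaratively and applied by the sidecar proxies.

```python
import hashlib


def pick_version(request_key: str, weights: dict) -> str:
    """Map a request key to a version in proportion to integer weights."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Hash the key into a stable bucket in [0, total).
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total

    # Walk the cumulative weight ranges until the bucket falls inside one.
    for version, weight in weights.items():
        if bucket < weight:
            return version
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below total")
```

With weights such as `{"v1": 90, "v2": 10}`, roughly one key in ten maps to `v2`; raising the `v2` weight step by step moves more callers over without reshuffling those already on it.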
Implement Circuit Breaking in Service Mesh
When a downstream service is failing, continuing to call it wastes resources and slows down your service. A circuit breaker detects this and **fails f...
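The fail-fast behavior is commonly modeled as a small state machine with closed, open, and half-open states. The sketch below assumes a consecutive-failure threshold and a fixed reset timeout; the class name, parameters, and injectable `clock` are illustrative choices, not any particular mesh's API, and in a mesh this logic lives in the sidecar proxy rather than in application code.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected without contacting the downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3,
                 reset_timeout_s: float = 30.0, clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout_s:
            return "half_open"  # allow a trial call through
        return "open"

    def call(self, func, *args, **kwargs):
        state = self.state
        if state == "open":
            raise CircuitOpenError("downstream marked unhealthy; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, opens the circuit.
            if state == "half_open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Passing `clock` in makes the timeout testable without sleeping; production breakers often track an error rate over a sliding window instead of a simple consecutive count.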
Implement Service Mesh Observability
Observability in a service mesh means collecting metrics, traces, and logs from every sidecar proxy automatically, without changing application code. ...
Interview Prep
Common interview questions for Platform / Infrastructure Engineer roles that map directly to what you build in this track.
Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.
Common Mistakes
The top 5 mistakes builders make in this track, and exactly how to fix them.
Comparison Mode
Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.
Prerequisites
Complete the previous tracks before starting this one; concepts build progressively throughout the curriculum.
Rabbit Holes
For when you want to go deeper. Curated papers, posts, and talks beyond what this track covers.
Borg, Omega, and Kubernetes
Google's retrospective on 15 years of container management, tracing the lineage from Borg to Omega to Kubernetes. Explains the architectural lessons that each system learned from its predecessor.
Istio Service Mesh Architecture
Istio's architecture documentation explains how the control plane (istiod) pushes configuration to Envoy sidecars using the xDS protocol, and how traffic management, mTLS, and observability are implemented without application code changes.
The Service Mesh Manifesto
Buoyant's essay on why the sidecar proxy pattern is the right architectural choice for service-to-service reliability concerns, written by Linkerd's creator William Morgan.