The Filesystem

Advanced
Storage | 10 tasks

GFS and HDFS showed the world how to store petabytes across thousands of cheap machines. Build a tiny distributed filesystem with chunk servers, replication, and master failover.

Subtracks & Tasks

Interview Prep

Common interview questions for Infrastructure / Storage Engineer roles that map directly to what you build in this track.

Questions are representative of real interview patterns. Model answers are starting points — adapt them with your own experience and the specific context of the interview.

Common Mistakes

The top 5 mistakes builders make in this track, with the root cause and the correct approach for each.

Comparison Mode

Side-by-side comparisons of the approaches, algorithms, and trade-offs you encounter in this track.

Concepts Covered

GFS architecture, master node, chunk server, 64MB chunks, replication factor, namespace tree, directory hierarchy, chunk mapping, metadata, WAL-backed, chunk allocation, placement policy, rack awareness, primary assignment, chunk replication, pipeline writes, primary-secondary, write acknowledgement, data flow, lease, primary election, lease renewal, lease expiry, consistency window, heartbeat, chunk server monitoring, liveness detection, chunk inventory, re-replication, under-replicated chunks, failure recovery, load balancing, chunk migration, disk utilization, rebalancing threshold, master failover, shadow master, WAL replay, hot standby, failover time, checksum, data integrity, corruption detection, per-block checksum, silent corruption

Prerequisites

Complete the previous tracks before starting this one; concepts build progressively throughout the curriculum.