Question 18: Scalable file creation

2024-07-19

Anant and Radhika are arguing about whether file creation within a directory can scale with the number of threads. That is, can n threads each create a file (all in the same directory, but with different names) in the same time that it takes one thread to create a file? Anant argues that since all the files are being created in the same directory, the threads need to serialize access to the directory, and hence file creation cannot scale. Radhika thinks that file creation can scale, if only they can come up with a data structure to represent the directory that allows for concurrent file creations.
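To make the two positions concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the locking structure each of them has in mind, not how any real file system implements directories. The names `CoarseDirectory`, `ShardedDirectory`, and `num_buckets` are made up for this example, and the sharded version assumes that hashing file names into independently locked buckets is one possible shape for Radhika's data structure.

```python
import threading
import zlib


class CoarseDirectory:
    """Anant's picture: a single lock guards every directory update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # file name -> inode number

    def create(self, name, inode):
        # Every creator waits on the same lock, so creations are serialized.
        with self._lock:
            if name in self._entries:
                raise FileExistsError(name)
            self._entries[name] = inode


class ShardedDirectory:
    """One possible shape for Radhika's idea (hypothetical): hash each
    name into a bucket and give every bucket its own lock."""

    def __init__(self, num_buckets=64):
        self._locks = [threading.Lock() for _ in range(num_buckets)]
        self._buckets = [{} for _ in range(num_buckets)]

    def _index(self, name):
        # crc32 is used only because it is stable across runs.
        return zlib.crc32(name.encode()) % len(self._buckets)

    def create(self, name, inode):
        i = self._index(name)
        # Only creators whose names land in the same bucket contend.
        with self._locks[i]:
            if name in self._buckets[i]:
                raise FileExistsError(name)
            self._buckets[i][name] = inode

    def listdir(self):
        # Visit buckets one at a time; this is not an atomic listing.
        names = []
        for lock, bucket in zip(self._locks, self._buckets):
            with lock:
                names.extend(bucket)
        return sorted(names)
```

Because of Python's global interpreter lock, this sketch will not show a real speedup. It only shows where threads would and would not have to wait for each other, and it leaves out everything that touches the disk.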

Who do you think is correct?

Solution coming up in the next post!

Solution for challenge snapshot:

Ram can implement the consistent snapshot algorithm described in this seminal paper.

The high-level idea is that each node takes a snapshot of its own state and sends a special marker message to all other nodes to notify them. When a node receives a marker message and has not yet taken its snapshot, it takes one right away (and sends out its own markers). This ensures that the snapshot contains no in-flight messages from the marker's sender to the receiving node. (This requires that messages between any two nodes are delivered in the order they were sent, which can be implemented independently.) If a node that receives a marker had already taken its snapshot, it considers all the messages it received from that sender between taking its snapshot and receiving the marker as in-flight messages in the snapshot.
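The marker rules above can be sketched as a small single-threaded simulation. This is a toy under stated assumptions, not the paper's pseudocode: each node's state is just a number (think of an account balance), each channel is a FIFO `deque`, and the names `Node`, `MARKER`, `deliver`, and `snapshot_total` are invented for the example.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for the marker message


class Node:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance    # application state
        self.peers = {}           # peer name -> Node
        self.inbox = {}           # sender name -> FIFO channel from that sender
        self.snapshot = None      # recorded local state, once taken
        self.recording = {}       # sender -> messages seen since our snapshot
        self.channel_state = {}   # sender -> in-flight messages in the snapshot

    def connect(self, other):
        self.peers[other.name] = other
        self.inbox[other.name] = deque()

    def send(self, to, amount):
        self.balance -= amount
        self.peers[to].inbox[self.name].append(amount)

    def start_snapshot(self):
        # Record our own state, then notify every other node with a marker.
        self.snapshot = self.balance
        self.recording = {src: [] for src in self.inbox}
        for peer in self.peers.values():
            peer.inbox[self.name].append(MARKER)

    def deliver(self, src):
        """Process the next message on the channel from src."""
        msg = self.inbox[src].popleft()
        if msg == MARKER:
            if self.snapshot is None:
                self.start_snapshot()
            # Whatever arrived from src after our snapshot and before this
            # marker was in flight; on a first marker that list is empty.
            self.channel_state[src] = self.recording.pop(src)
        else:
            self.balance += msg
            if src in self.recording:
                self.recording[src].append(msg)


def snapshot_total(nodes):
    # Sum of recorded node states plus recorded in-flight messages.
    return sum(
        n.snapshot + sum(sum(msgs) for msgs in n.channel_state.values())
        for n in nodes
    )


a, b = Node("A", 100), Node("B", 100)
a.connect(b)
b.connect(a)
b.send("A", 20)        # 20 is in flight from B to A
a.start_snapshot()     # A records 100 and sends a marker to B
a.deliver("B")         # the 20 arrives after A's snapshot: recorded as in flight
b.deliver("A")         # B sees the marker, records 80, sends its own marker
a.deliver("B")         # B's marker closes the B -> A channel
print(snapshot_total([a, b]))  # 200: no money created or lost in the snapshot
```

The interesting part is the 20 units sent from B to A: neither node's recorded state includes them, so the snapshot only adds up because A logs them as in-flight on the B to A channel.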

Highly recommend reading the paper to get a much better description and explanation of the algorithm!

#file-systems #qna #storage-systems