After Hours Academic

Question 4: Fault tolerant writes

Andrew is building a video storage app. Users can upload their videos to his app and then browse them later. During the upload process, Andrew first stores the video on S3 and then updates a database containing metadata about the video (e.g., title, upload date). This database is used to show all of a user's uploaded videos.

The upload process takes a while to complete because storing the video on S3 is slow. Andrew's boss, Ram, suggests that they update the database first and upload the video to S3 in a background thread. This can reduce the perceived duration of the upload because the user would see the video's information (e.g., title) in their app right away. Andrew disagrees with Ram, citing fault tolerance.

Do you agree with Andrew or Ram?

Solution

Andrew is correct in reasoning that writing the metadata before the upload completes is not fault tolerant. Consider a scenario in which the video upload to S3 fails (for whatever reason, e.g., a network connection error). The app would then be in an inconsistent state: the database lists the video as uploaded, but the video itself is not on S3.

This is an example of a more general rule for fault-tolerant or crash-consistent updates: write the data before the metadata. Soft updates describes the rules to follow for crash-consistent writes in a file system. This particular rule (write data before the metadata) applies well beyond file systems.
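
To make the difference concrete, here is a minimal sketch of the two orderings. It is only an illustration: `FakeBlobStore` is an in-memory stand-in for S3 (not the real S3 or boto3 API), a plain dict plays the role of the metadata database, and every name in it is hypothetical. It also runs Ram's ordering synchronously rather than in a background thread, since the thread changes when the failure is noticed but not what state is left behind.

```python
class UploadError(Exception):
    """Raised by the stand-in blob store when an upload fails."""


class FakeBlobStore:
    """In-memory stand-in for S3 (hypothetical, not the boto3 API)."""

    def __init__(self, fail=False):
        self.objects = {}
        self.fail = fail  # when True, every upload fails

    def put(self, key, data):
        if self.fail:
            raise UploadError(f"upload of {key} failed")
        self.objects[key] = data


def upload_data_first(store, metadata_db, video_id, data, title):
    """Andrew's order: write the data, then the metadata."""
    store.put(video_id, data)      # may raise; nothing is listed yet
    metadata_db[video_id] = title  # reached only if the upload succeeded


def upload_metadata_first(store, metadata_db, video_id, data, title):
    """Ram's order: write the metadata, then the data."""
    metadata_db[video_id] = title  # the video is now listed for the user
    store.put(video_id, data)      # may raise, leaving a dangling entry


def dangling_entries(store, metadata_db):
    """Return IDs that are listed in the metadata but missing from the store."""
    return [vid for vid in metadata_db if vid not in store.objects]


def run(upload, fail):
    """Run one upload against fresh state and report any dangling entries."""
    store, metadata_db = FakeBlobStore(fail=fail), {}
    try:
        upload(store, metadata_db, "v1", b"video-bytes", "My video")
    except UploadError:
        pass  # the failure happened; inspect the state it left behind
    return dangling_entries(store, metadata_db)
```

When the upload fails, `run(upload_data_first, fail=True)` leaves no dangling entries, while `run(upload_metadata_first, fail=True)` leaves `"v1"` listed with no video behind it. Ram's ordering could be made safe with extra machinery (for example, a "pending" flag on the metadata row plus a cleanup job), but that is additional work that the data-first ordering avoids.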

#computer-science #concurrency #distributed-systems #fault-tolerance #qna #storage-systems