Question 3: High tail latency for reads from an SSD

2024-06-26

Jenny buys a brand new state-of-the-art SSD for her gaming desktop. Before using it for gaming, though, she decides to benchmark the SSD to make sure that it is up to the mark. She chooses a workload with a 50/50 mix of random reads and random writes. She lets the workload run overnight and analyzes the data in the morning. One of the things that stands out is that the tail latency of the reads is very high. While the median latency is in the tens of microseconds, the tail latency is close to hundreds of milliseconds.

What might have caused such high tail latency?

Solution coming up in the next post!

Solution to log-structured vs copy-on-write:

Bill's argument is indeed correct. Both copy-on-write and log-structured storage systems eschew in-place updates/writes. Both instead write the new data to a new location and update the pointer to the data so that it points to the newly written location. Updating the pointer requires another write, which is again done out-of-place. This leads to a chain of out-of-place writes that trickles all the way up to the root of the metadata. Both log-structured and copy-on-write storage organizations use this style of updates. The only difference between them is that log-structured storage systems enforce the location of the new writes to be contiguous (to benefit from the superior sequential write performance of storage devices), whereas copy-on-write systems impose no such constraint.
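To make the chain of out-of-place writes concrete, here is a minimal sketch in Python. It is a toy in-memory tree, not how LFS or WAFL are actually implemented: the `Node` class, the `update` function, and the `writes` list are all hypothetical names made up for illustration. Updating one leaf produces a new leaf, a new parent pointing to it, and a new root, while the old version stays untouched.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Immutable block: a leaf holds data, an interior node holds child pointers."""
    data: Optional[str] = None
    children: Tuple["Node", ...] = ()


def update(node: Node, path: Tuple[int, ...], new_data: str, writes: List[Node]) -> Node:
    """Return a new root with the leaf at `path` replaced.

    No existing node is modified. Every node on the path is rewritten as a
    fresh node, and each fresh node is recorded in `writes` (hypothetical
    stand-in for the blocks that would be written to the device).
    """
    if not path:
        new_node = Node(data=new_data)  # new copy of the data block
    else:
        idx = path[0]
        new_child = update(node.children[idx], path[1:], new_data, writes)
        # The pointer changed, so the parent must be rewritten as well.
        children = node.children[:idx] + (new_child,) + node.children[idx + 1:]
        new_node = Node(children=children)
    writes.append(new_node)
    return new_node


# Toy tree (assumed layout): root -> two interior nodes -> leaves.
old_root = Node(children=(
    Node(children=(Node(data="a0"), Node(data="a1"))),
    Node(children=(Node(data="b0"),)),
))

writes: List[Node] = []
new_root = update(old_root, (0, 1), "a1-v2", writes)

print(len(writes))                            # 3: leaf, its parent, and the root
print(old_root.children[0].children[1].data)  # a1 (old version is intact)
print(new_root.children[0].children[1].data)  # a1-v2
```

In this sketch, the order of `writes` is the only thing that would differ between the two designs: a log-structured system would place these three blocks next to each other at the tail of the log, whereas a copy-on-write system would be free to place them in any available blocks.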

The LFS and WAFL file system papers are good references to learn more about log-structured and copy-on-write file systems. And if you want a lighter read than the papers, my summaries of LFS and WAFL are not too bad either ;).

#devices #qna #ssd #storage-systems