Snapshotting Mechanisms for Persistent Memory-Mapped Files
Montage offers impressive resilience against system-wide crash failures. However, it can be hardened further against failures of the persistent memory itself, which can lead to substantial data loss. To address this, we propose two new snapshotting mechanisms for memory-mapped files: stop-the-world snapshotting and online snapshotting.
Both mechanisms copy only the data that has been modified since the last snapshot, which significantly reduces the volume of data copied per snapshot. To ensure data consistency and speed up snapshotting, we modify both Montage and its allocator. These modifications include a parallel chunk-copying strategy for the stop-the-world implementation. In addition, the online mechanism allows updates to any chunk that the snapshotter is not currently copying, which improves system responsiveness.
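The idea of copying only modified chunks, and copying them in parallel, can be illustrated with a minimal Python sketch. It is not Montage's implementation: the class ChunkedRegion, the 4096-byte CHUNK_SIZE, the in-memory bytearray standing in for a memory-mapped file, and the thread pool are all assumptions made for illustration. The caller is assumed to have paused all writers before taking the snapshot.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4096  # hypothetical chunk granularity, not taken from Montage


class ChunkedRegion:
    """Stand-in for a memory-mapped region with per-chunk dirty tracking."""

    def __init__(self, num_chunks):
        self.data = bytearray(num_chunks * CHUNK_SIZE)      # live region
        self.snapshot = bytearray(num_chunks * CHUNK_SIZE)  # replica
        self.dirty = set()  # indices of chunks modified since last snapshot

    def write(self, offset, payload):
        """Write payload at offset and mark every touched chunk dirty."""
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside the region")
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        first = offset // CHUNK_SIZE
        last = (offset + len(payload) - 1) // CHUNK_SIZE
        self.dirty.update(range(first, last + 1))

    def _copy_chunk(self, index):
        start = index * CHUNK_SIZE
        end = start + CHUNK_SIZE
        self.snapshot[start:end] = self.data[start:end]

    def stop_the_world_snapshot(self, workers=4):
        """Copy only the dirty chunks, spread across worker threads.

        Assumes writers are paused. Returns the number of chunks copied.
        """
        chunks = sorted(self.dirty)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(self._copy_chunk, chunks))  # wait for all copies
        self.dirty.clear()
        return len(chunks)
```

A write that straddles a chunk boundary marks both chunks dirty, and a second snapshot taken with no intervening writes copies nothing. In this sketch the cost of a snapshot depends on how many chunks were modified, not on the size of the region.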
We have also developed an algorithm that disables the reader locks whenever no snapshot is in progress, which removes the overhead of these locks and further improves performance.
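One possible way to skip per-chunk locking while no snapshot is in progress is sketched below in Python. It is an assumption-laden illustration, not the algorithm developed in this work: the class ElidableChunkLocks and its methods are hypothetical, a Condition stands in for the atomic flag and counter that a native implementation would use, and plain mutexes replace reader-writer locks for brevity.

```python
import threading


class ElidableChunkLocks:
    """Per-chunk locks that updaters bypass while no snapshot is running."""

    def __init__(self, num_chunks):
        self._locks = [threading.Lock() for _ in range(num_chunks)]
        self._state = threading.Condition()  # stands in for atomics
        self._snapshotting = False
        self._unlocked_updates = 0  # updates in flight on the fast path

    def update(self, index, fn):
        """Run fn on chunk index; return which path was taken."""
        with self._state:
            fast = not self._snapshotting
            if fast:
                self._unlocked_updates += 1
        if fast:
            try:
                fn()  # no chunk lock taken
            finally:
                with self._state:
                    self._unlocked_updates -= 1
                    self._state.notify_all()
            return "fast"
        with self._locks[index]:  # snapshot running: lock the chunk
            fn()
        return "locked"

    def begin_snapshot(self):
        """Re-enable locking, then wait for fast-path updates to drain."""
        with self._state:
            self._snapshotting = True
            while self._unlocked_updates:
                self._state.wait()

    def copy_chunk(self, index, fn):
        """Snapshotter copies one chunk while holding only that chunk's lock."""
        with self._locks[index]:
            fn()

    def end_snapshot(self):
        """Disable locking again until the next snapshot."""
        with self._state:
            self._snapshotting = False
```

The handshake in begin_snapshot matters: the snapshotter first turns locking back on and only then waits for updates that started without a lock, so no unlocked update can overlap a chunk copy. Updates to chunks other than the one being copied still proceed during the snapshot, because the snapshotter holds only one chunk lock at a time.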