Storage Systems (XM_0092) Homepage
Storage Systems is a unique course due to its sole focus on NVM storage and its impact on research and education. We take inspiration from the 2018 Data Storage Research Vision 2025 report (https://dl.acm.org/doi/book/10.5555/3316807), which observes (see section 6.1):
“Many students may only associate storage systems with hard disk drives or a specific file system, which is obviously less attractive compared to, say, self-driving cars. This situation is partly due to the fact that there is no clearly defined course on storage systems in the majority of universities.”
Lecture slides
Storage Systems (VU catalogue number XM_0092) is an MSc-level course that was first offered in 2020. The course covers the rise of Non-Volatile Memory (NVM) storage technologies in commodity computing and their impact on system design (architecture, operating system), distributed systems, storage services, and application designs, as well as emerging trends (CXL and io_uring). The 2023 edition covers the following topics:
1. Introduction: History, HDDs, NAND flash, internal organization, the new triangle of Storage Hierarchy.
Zoom recording (Passcode: S6qD8tE^)
2. Host Interfacing and Software Implications: NVMe, storage and block-layer optimizations (multi-queue design).
Zoom recording (Passcode: S6qD8tE^)
3. Flash FTL and Garbage Collection: FTL and GC designs, concerns, and host-managed FTLs.
Zoom recording (Passcode: S6qD8tE^)
4. Flash Filesystems: Log-structured file systems, F2FS, DFS, and Nameless writes.
Zoom recording (Passcode: S6qD8tE^)
5. Flash KV Stores: B+ Trees, Hash Tables, and LSM trees on flash (LOCS, WiscKey, uTree, SILK).
Zoom recording (Passcode: S6qD8tE^)
6. Byte-addressable Persistent Memories: Optane, NVHeap, and Pmem/PMDK project.
Zoom recording (Passcode: S6qD8tE^)
7. Networked Flash: Disaggregated storage, NVMe-oF, Disaggregated Flash, and FlashNet.
Zoom recording (Passcode: S6qD8tE^)
8. Programmable Storage: Computational storage devices (CSDs), Willow, Biscuit, and INSIDER.
Zoom recording (Passcode: S6qD8tE^)
9. Distributed Storage - I: Distributed temporary data storage and formats (Crail and Albis).
Zoom recording (Passcode: S6qD8tE^)
10. Distributed Storage - II: The Corfu and Tango distributed transaction systems.
Zoom recording (Passcode: S6qD8tE^)
11. Emerging Topics: CXL and io_uring
Zoom recording (Passcode: S6qD8tE^)
Practical Work
For the practical work, students develop a flash translation layer (FTL), the essential part of any modern NVM storage device, for NVMe ZNS devices, and integrate a file system with RocksDB. There are five milestones in the practical work:
A new device is in town - set up the development environment with ZNS devices in QEMU, read the NVMe 1.4 and ZNS specifications, and use the nvme command to interact with NVMe devices.
I can’t read, is there a translator here? - implement a host-side hybrid log-data FTL. The log segment is page-mapped, while the data segment is zone-mapped. No GC at this stage.
It’s 2023, we recycle - implement a garbage collection algorithm of your choice for your FTL.
We love Rock(sDB) ‘n’ Roll! - design and implement a file system on top of your FTL and integrate it with the RocksDB FileSystem API.
Wake up, Neo - the last milestone requires you to persist and restart your FTL and filesystem states and pass the RocksDB persistency tests.
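The hybrid mapping from the second milestone can be illustrated with a small in-memory sketch. This is not the course's reference solution: the class name `HybridFTL`, the constant `PAGES_PER_ZONE`, and the helper `install_zone` are hypothetical names chosen for illustration, and real ZNS zones are replaced by Python lists. It only shows the lookup order: the page-mapped log segment is consulted first, then the zone-mapped data segment.

```python
from typing import Dict, List, Optional

PAGES_PER_ZONE = 4  # hypothetical; real zones hold far more pages


class HybridFTL:
    """Toy hybrid FTL: page-mapped log segment, zone-mapped data segment."""

    def __init__(self) -> None:
        self.log: List[bytes] = []                    # append-only log segment
        self.page_map: Dict[int, int] = {}            # logical page -> index in log
        self.zone_map: Dict[int, int] = {}            # logical zone -> physical zone
        self.data_zones: Dict[int, List[bytes]] = {}  # physical zone -> its pages

    def write(self, lpa: int, data: bytes) -> None:
        # Zones only accept sequential writes, so an update is appended to the
        # log and the page map is redirected to the newest copy.
        self.log.append(data)
        self.page_map[lpa] = len(self.log) - 1

    def install_zone(self, lzone: int, pzone: int, pages: List[bytes]) -> None:
        # Hypothetical helper: stands in for a fully written data zone.
        assert len(pages) == PAGES_PER_ZONE
        self.data_zones[pzone] = list(pages)
        self.zone_map[lzone] = pzone

    def read(self, lpa: int) -> Optional[bytes]:
        # 1. The log segment holds the most recent version, if any.
        if lpa in self.page_map:
            return self.log[self.page_map[lpa]]
        # 2. Otherwise fall back to the coarse zone-level mapping.
        lzone, offset = divmod(lpa, PAGES_PER_ZONE)
        pzone = self.zone_map.get(lzone)
        if pzone is None:
            return None  # never written
        return self.data_zones[pzone][offset]


ftl = HybridFTL()
ftl.install_zone(0, 7, [b"a", b"b", b"c", b"d"])
ftl.write(2, b"new")  # shadows page 2 of logical zone 0
```

After these calls, `ftl.read(2)` is served from the log, while `ftl.read(1)` is served from the data zone. The page map costs one entry per written page, whereas the zone map needs only one entry per zone, which is the usual reason for restricting fine-grained mapping to the log segment.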
The project handbook is publicly available here: 2023-2024-stosys-handbook-4.0.pdf. Drop me an email if you want access to more project-related resources.
License
The course content is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license: https://creativecommons.org/licenses/by/4.0/.
Feel free to modify and use the slides in your course as you see fit with attribution.
Acknowledgement
The project work is generously supported by Western Digital with their donation of ZNS devices and software support.