One Year Later… Dropbox Continues to Innovate with Western Digital
It’s truly an exciting time as software developers continue to create innovative cloud applications and, in the process, more and more data must be stored in the cloud.
As new large-scale applications are developed, Western Digital has been working closely with our customers to optimize our storage device behavior to align more closely to application requirements. Likewise, for an application developer it is critical to make architecture improvements in the software storage stack that can leverage the increased efficiencies provided by the underlying purpose-built storage device.
The notion of purpose-built storage for purpose-built applications reflects the need for optimized solutions at scale. Large-scale solutions benefit from HDD technologies like Shingled Magnetic Recording (SMR), which increases capacity density by storing data in sequential zones, and SSD technologies like Quad Level Cells (QLC) and Zoned Namespaces (ZNS) that are also optimized for sequential workloads and higher capacity density.
Dropbox has developed one of the most innovative cloud storage solutions. A pioneer in cloud storage, Dropbox developed their storage solution internally and from the ground up. Dubbed “Magic Pocket,” the in-house platform is designed for scaling to exabytes and beyond. Dropbox’s approach ensures complete control over their storage software stack and enables Dropbox to deploy innovative storage technologies like SMR that take advantage of higher storage densities.
Dropbox takes a look back… one year later. Check out their journey here.
Already a leader in developing high-capacity helium-sealed hard drives, Western Digital introduced its first host-managed enterprise SMR HDD, with Dropbox among the first to deploy 14TB SMR HDDs at exabyte scale (with our 14TB Ultrastar® DC HC620).
A year ago we were able to squeeze an additional 2TB from a similar 12TB platform. Today, customers add 25% more capacity with our current 15TB SMR drive, and we expect a 20TB SMR drive in 2020.
When we first introduced SMR drives to the market, few tools natively supported SMR. To kickstart the ecosystem, Western Digital released an open source library that allowed Dropbox to develop their software stack efficiently. Early versions of the Linux® kernel and tools did not even recognize SMR zoned devices. Fast-forward to today, and we have developed a broad set of open source tools and building blocks, and we continue to contribute to many open source Linux kernel and Linux tool projects.
To get started on this journey and obtain the latest information on the ecosystem readiness for both SMR HDDs and the upcoming ZNS SSDs, check out the Zoned Storage link toward the end of this blog.
What is an SMR zone?
SMR is a method for storing more data onto each platter inside the hard drive. In Conventional Magnetic Recording (CMR), data is stored in concentric circles, or tracks, with a gap between each track. This method allows each track to be read or written individually.
SMR removes that gap to enable higher capacity. The hard drive overlaps the tracks as it writes, leaving only one edge of the track exposed, just like a shingled roof. Each shingle is overlapped by the next one, with just a tiny bit of each shingle visibly exposed. On the hard drive, this translates to having a wide recording head lay down each track, leaving a narrow band to be read back by the read head.
The advantage of SMR is that it not only eliminates the wasted gap between tracks, but it also opens additional space because of the overlap. Since the read head is quite capable of reading a thin and narrow track, the drive can store up to 25% more data than using the conventional approach with non-overlapping tracks.
While the extra space essentially comes for free, it also means that random data can no longer be written into the middle of your shingled data. While reading random data is fully supported, the writes must be done sequentially. Random writes are not supported.
To retain some random-write flexibility, we utilize the notion of “zones.” Rather than creating a single zone of overlapping tracks, most SMR drives use many zones. Each zone is then managed independently. This method lets the host pick which zone to write to in any order, but within each zone the writes must be sequential.
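The zone behavior described above can be illustrated with a small toy model. This is only a sketch for intuition, not drive firmware or any Western Digital library: the `Zone` class, its method names, and the block-granular write pointer are all assumptions made for the example.

```python
class Zone:
    """Toy model of one SMR zone (hypothetical, for illustration only)."""

    def __init__(self, start_lba, num_blocks):
        self.start_lba = start_lba
        self.num_blocks = num_blocks
        self.write_pointer = start_lba  # the only LBA that may be written next
        self.blocks = {}

    def write(self, lba, data):
        # A full zone accepts no more writes until it is reset.
        if self.write_pointer >= self.start_lba + self.num_blocks:
            raise ValueError("zone is full")
        # Writes inside a zone must land exactly at the write pointer.
        if lba != self.write_pointer:
            raise ValueError(
                f"non-sequential write at LBA {lba}; "
                f"write pointer is at {self.write_pointer}"
            )
        self.blocks[lba] = data
        self.write_pointer += 1

    def read(self, lba):
        # Reads may target any block that has already been written.
        return self.blocks[lba]

    def reset(self):
        # Resetting discards the zone's data and rewinds the write pointer.
        self.blocks.clear()
        self.write_pointer = self.start_lba


# Zones can be chosen in any order, but each one fills front to back.
zone_a, zone_b = Zone(0, 4), Zone(4, 4)
zone_b.write(4, b"first")   # start with the second zone
zone_a.write(0, b"second")  # then go back to the first zone
zone_a.write(1, b"third")   # next sequential block in the first zone
```

Attempting `zone_a.write(3, ...)` at this point would raise an error, because the write pointer of that zone is still at block 2.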
Host Managed SMR means that the host server must manage the data. When any given zone is full, the host must keep track of which data is valid and which data is “garbage,” then make the optimal decision for when the garbage collection should occur. After enough “garbage” has accumulated, the host can read the valid data off of zone A, then rewrite the data back to zone A or to a new zone B. Unlike a client hard drive, which often utilizes Drive Managed SMR, enterprise applications benefit from using Host Managed SMR, which leaves the data management decision to the application software, and not the drive.
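To make the host's bookkeeping concrete, here is a minimal sketch of host-side garbage collection over append-only zones. It is not Dropbox's Magic Pocket design or any real library; the `HostManagedStore` class, its in-memory index, and the policy of relocating survivors into the currently open zone are all assumptions chosen to keep the example short.

```python
class HostManagedStore:
    """Toy host-side manager for append-only zones (hypothetical)."""

    def __init__(self, num_zones, zone_capacity):
        self.zone_capacity = zone_capacity
        self.zones = [[] for _ in range(num_zones)]  # each zone only appends
        self.index = {}      # key -> (zone_id, offset) of the valid copy
        self.open_zone = 0   # zone currently receiving writes

    def _find_empty_zone(self):
        for zone_id, zone in enumerate(self.zones):
            if not zone:
                return zone_id
        # A real system would keep spare zones in reserve to avoid this.
        raise RuntimeError("no empty zone; run garbage collection first")

    def put(self, key, data):
        if len(self.zones[self.open_zone]) >= self.zone_capacity:
            self.open_zone = self._find_empty_zone()
        zone = self.zones[self.open_zone]
        zone.append((key, data))
        # Any older copy of this key is now garbage in its old zone.
        self.index[key] = (self.open_zone, len(zone) - 1)

    def get(self, key):
        zone_id, offset = self.index[key]
        return self.zones[zone_id][offset][1]

    def garbage_count(self, zone_id):
        # A record is garbage if the index no longer points at it.
        return sum(
            1
            for offset, (key, _) in enumerate(self.zones[zone_id])
            if self.index.get(key) != (zone_id, offset)
        )

    def collect(self, zone_id):
        if zone_id == self.open_zone:
            raise ValueError("cannot collect the zone that is open for writes")
        # Read the still-valid records, rewrite them elsewhere, reset the zone.
        survivors = [
            (key, data)
            for offset, (key, data) in enumerate(self.zones[zone_id])
            if self.index.get(key) == (zone_id, offset)
        ]
        for key, data in survivors:
            self.put(key, data)
        self.zones[zone_id] = []  # models a zone reset


store = HostManagedStore(num_zones=3, zone_capacity=2)
store.put("a", 1)
store.put("b", 2)
store.put("a", 3)   # overwrites "a"; the old copy in zone 0 becomes garbage
store.collect(0)    # moves "b" out of zone 0, then resets zone 0
```

When to call `collect`, and which zone to pick, is exactly the decision that Host Managed SMR leaves to the application; a common heuristic is to choose the zone with the highest `garbage_count`.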
Get in the ZONE
In a world with ever-increasing storage requirements, data architects need to store data in the most cost-efficient way. Thinking in terms of zones is a mindset change from the traditional way of thinking about storage. Rather than managing a collection of random blocks, with each block containing sequential bytes, it is beneficial for the application to group blocks into zones and manage the data at a zone level. Each zone becomes a collection of sequential blocks, and by eliminating the random writes we can eliminate cost within the device.
In the case of SMR, that means storing sequential blocks of data in zones. While you can randomly read any data within a zone, you are restricted to writing sequentially only (i.e., no random writes).
Hard drives are not the only devices that can take advantage of Zoned Storage. Zoned Namespaces, a feature in development within the NVMe Consortium, is not far behind for SSDs. In fact, during the 2019 Open Compute Project (OCP) Global Summit, Microsoft presented a session that highlighted SMR and ZNS as two sides of the same coin.
ZNS is a new NVMe standard that we expect will be ratified soon, and Western Digital will be providing libraries and “how-to” information to complement our existing Zoned Storage efforts.
So as you can see, cloud applications are being designed for zones. Whether these zones are SMR Zones, in the case of Dropbox, or a future ZNS application, cloud architects are looking for cost-effective storage by architecting solutions around purpose-built zoned storage.
“We’re always looking for innovative solutions that help us scale with greater efficiency, without compromising the performance of our products,” said Akhil Gupta, vice president of engineering at Dropbox. “Deploying SMR as a cost-effective storage solution continues to result in long-term payoff for Dropbox customers and our business.”
Getting Started with Zones
Deploying SMR is getting easier as the ecosystem develops, but it still requires an architected software solution. Dropbox has demonstrated the benefit of re-architecting by thinking in terms of zones, rather than blocks. By optimizing and managing the data placement, Dropbox can then optimize their storage deployment.
Western Digital has ongoing contributions to various open source projects and standards organizations. For HDDs, SMR is supported by the industry standard Zoned Block Commands (ZBC) and Zoned Device ATA Command Set (ZAC) for SAS and SATA HDDs respectively. To find out what tools are available today, head over to https://ZonedStorage.io.
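As a quick first check on a Linux host, the kernel reports each block device's zone model through sysfs. The sketch below reads the `queue/zoned` and `queue/chunk_sectors` attributes; the helper names and the `sysfs_root` parameter are assumptions for this example (the parameter exists only so the functions can be pointed at a test directory), and older kernels without zoned block device support will not expose these files.

```python
from pathlib import Path


def zoned_model(device, sysfs_root="/sys/block"):
    """Return 'none', 'host-aware' or 'host-managed', or None if unknown."""
    path = Path(sysfs_root) / device / "queue" / "zoned"
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # Device does not exist, or the kernel predates zoned device support.
        return None


def zone_size_bytes(device, sysfs_root="/sys/block"):
    """Return the zone size in bytes for a zoned device, or None if unknown."""
    path = Path(sysfs_root) / device / "queue" / "chunk_sectors"
    try:
        sectors = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
    # sysfs reports this value in 512-byte sectors.
    return sectors * 512


if __name__ == "__main__":
    # "sdb" is a placeholder device name; substitute your own.
    print(zoned_model("sdb"), zone_size_bytes("sdb"))
```

A host-managed SMR drive should report `host-managed` here; a conventional drive reports `none`.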
Looking into the future, we are working with the industry to bring zones to SSDs in the form of Zoned Namespaces (ZNS) for NVMe SSDs. You can find a ZNS introduction here, and as the standards are ratified we will be publishing additional tools and documentation on the ZonedStorage.io website.
We invite you to join our journey into the zone as we enable more efficient, purpose-built storage devices that, in turn, form the storage foundation for achieving the vision of your next scale-out application.
Certain blog and other posts on this website may contain forward-looking statements, including statements relating to expectations for our product portfolio, the market for our products, product development efforts, and the capacities, capabilities and applications of our products. These forward-looking statements are subject to risks and uncertainties that could cause actual results to differ materially from those expressed in the forward-looking statements, including development challenges or delays, supply chain and logistics issues, changes in markets, demand, global economic conditions and other risks and uncertainties listed in Western Digital Corporation’s most recent quarterly and annual reports filed with the Securities and Exchange Commission, to which your attention is directed. Readers are cautioned not to place undue reliance on these forward-looking statements and we undertake no obligation to update these forward-looking statements to reflect subsequent events or circumstances.