What the Industry is Doing About its Zetta-Scale Problem
By Ted Marena
It’s not headline news that the amount of data generated is accelerating at a breakneck pace. But for those of us on the data infrastructure end of things, it’s what keeps us up at night. Any way you look at it – storage, memory, processing, or networking – we as an industry have to push the boundaries of existing technologies, or the world won’t be able to keep up with data.
So, what are we doing about it? A lot. Our new series of meetups brings together experts from across the industry to talk about what’s on the cusp of next-gen data architectures.
Get Used to Talking in Zettabytes
Some stats to feed your data hunger: in 2025, the US is expected to generate 30.6ZB of data while China will generate 48.6ZB of data.1 If that’s a little hard to grasp, let me help you.
Numerically speaking, a Zettabyte is 1,000 Exabytes, or 1,000,000 Petabytes, or 1,000,000,000 (a billion) Terabytes, or 1,000,000,000,000 (one trillion) Gigabytes. If you think that’s a far-off future, you’re wrong. Two years ago, the industry already shipped almost a zettabyte of new storage devices. We crossed that chasm before most even took notice.2 And we’ve been busy building the foundation to manage where our Herculean sea of data is headed.
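To make the scale concrete, here is a minimal Python sketch of the conversions above. It assumes the decimal (SI) units the text uses, where each step is a factor of 1,000; the `convert` helper and the `UNITS` list are illustrative names, not part of any library.

```python
# Decimal (SI) storage units, smallest to largest; each step is a factor of 1,000.
UNITS = ["GB", "TB", "PB", "EB", "ZB"]


def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal storage units (illustrative helper)."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000 ** steps


# One zettabyte expressed in each smaller unit.
for unit in ["EB", "PB", "TB", "GB"]:
    print(f"1 ZB = {convert(1, 'ZB', unit):,} {unit}")
```

The same helper works in the other direction, so `convert(1_000_000, "PB", "ZB")` gives back one zettabyte.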
What Did You Miss?
Back in July, the Bay Area Storage Solutions Meetup group held an inaugural event. Our first topics discussed were how a graphics processing company views fabrics and NVMe-oF™, how Western Digital is rethinking storage efficiencies through Zoned Storage and how one startup is challenging the control of main memory with OmniXtend™, a cache coherent memory fabric.
Ahead of the next virtual Meetup on Dec. 2, 2020, let me get you up to speed on the discussions and the technologies presented in July.
The first presentation explained why NVMe-over-Fabrics (NVMe-oF) is needed given the faster speeds of storage that exist today. In some architectures, the network has now become the bottleneck for performance. NVMe-oF is an open standard that defines how to share storage across multiple servers/CPUs. Improving the storage throughput enables increased performance for applications such as machine learning and AI.
This presentation walked through the presenter’s RDMA (Remote Direct Memory Access) implementations, specifically RDMA support over Ethernet, known as RoCE (RDMA over Converged Ethernet). Application examples and network performance benchmarks were shown for various implementations. Supporting RDMA requires both hardware at the network interface and RDMA-aware system software on the host.
The next presentation was on Zoned Storage by Dave Landsman from Western Digital. Dave is on the board of the NVM Express group, which sets the open standards for storage and data architectures. He explained how both HDDs and SSDs consist of numerous regions, blocks or zones, and that the media in each can physically only be written sequentially. For most systems, this restriction was not apparent because the drive controller was doing the data management. This conventional implementation works but does not scale to higher densities.
The zoned storage standard for HDDs is SMR (Shingled Magnetic Recording). The zoned storage implementation for SSDs is called ZNS (Zoned Namespaces). This standard requires the host software to cooperate in organizing the data to be stored on a ZNS SSD. There are numerous zoned block software options that can be implemented. These software details can be found at www.zonedstorage.io.
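The sequential-write rule that the host has to cooperate with can be pictured with a toy model. The sketch below is only an illustration: the `Zone` class and its methods are hypothetical names, not the ZNS command set or any real driver API. It assumes each zone tracks a write pointer, accepts writes only at that pointer, and can be rewound with a reset.

```python
class Zone:
    """Toy model of one zone: writes must land exactly at the write pointer."""

    def __init__(self, start_lba, size_blocks):
        self.start = start_lba
        self.size = size_blocks
        self.write_pointer = start_lba  # next block that may be written

    def write(self, lba, num_blocks):
        """Accept a write only if it continues from the write pointer."""
        if lba != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        if lba + num_blocks > self.start + self.size:
            raise ValueError("write exceeds zone capacity")
        self.write_pointer += num_blocks

    def reset(self):
        """Discard the zone's contents and rewind the write pointer."""
        self.write_pointer = self.start


zone = Zone(start_lba=0, size_blocks=8)
zone.write(0, 4)      # accepted: starts at the write pointer
zone.write(4, 2)      # accepted: continues sequentially
try:
    zone.write(2, 1)  # rejected: would overwrite data in place
    rejected = False
except ValueError:
    rejected = True
```

In this model the host, not the drive, decides where data goes. Rewriting data means writing it to another zone and resetting the old one, rather than overwriting in place.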
The advantage of implementing zoned storage is that data is intelligently placed on the drives. By doing this, the drive controller has minimal data management tasks to perform. The result is that zoned storage enables higher densities, better QoS and lower TCO. Zoned storage SMR HDDs and ZNS SSDs can address the explosive data growth and support zettabyte scale for data centers and cloud providers. Recently Western Digital announced its first ZNS SSD, the Ultrastar® DC ZN540 ZNS NVMe SSD. Learn more at www.westerndigital.com/zoned-storage
The last presentation was about OmniXtend. This architecture breaks the CPU’s stranglehold on main memory. OmniXtend is an open, cache coherent memory fabric based on low-cost Ethernet. Although there are many memory interface architectures, none of the others is at once open, based on Ethernet and coherency-preserving.
OmniXtend allows all nodes on a network to share main memory equally. No longer does a CPU own main memory. OmniXtend serializes TileLink, the cache coherency bus, and sends it at layer 2 in Ethernet frames. This enables not just CPUs but also GPUs, FPGAs, ML accelerators and other devices to share main memory equally and coherently.
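The idea of carrying a serialized coherence message directly in a layer 2 Ethernet frame can be sketched as follows. This is not the OmniXtend wire format: the message layout (opcode, address, length), the function names and the EtherType are all assumptions made for illustration. The EtherType used is a placeholder from the IEEE local experimental range.

```python
import struct

# Placeholder EtherType from the IEEE "local experimental" range.
# This is NOT the value OmniXtend uses; it is an assumption for the sketch.
ETHERTYPE_PLACEHOLDER = 0x88B5


def encapsulate(dst_mac, src_mac, opcode, address, data):
    """Pack a hypothetical coherence message into an Ethernet frame (no padding, no FCS)."""
    header = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_PLACEHOLDER)
    # Hypothetical layout: 1-byte opcode, 8-byte address, 2-byte payload length.
    message = struct.pack("!BQH", opcode, address, len(data)) + data
    return header + message


def decapsulate(frame):
    """Recover the fields written by encapsulate()."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    opcode, address, length = struct.unpack("!BQH", frame[14:25])
    return dst, src, ethertype, opcode, address, frame[25:25 + length]


frame = encapsulate(
    dst_mac=bytes.fromhex("020000000002"),
    src_mac=bytes.fromhex("020000000001"),
    opcode=4,                 # hypothetical opcode value
    address=0x8000_0000,
    data=b"\x00" * 64,        # one 64-byte cache line
)
```

Because the message rides directly in the frame with no IP or TCP layer, any device that can send and receive Ethernet frames could in principle take part, which is the point the presentation made about GPUs, FPGAs and accelerators.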
The open source hardware group, CHIPS Alliance, is developing OmniXtend further. Currently there is an FPGA implementation of multiple quad RISC-V cores that share L2 cache and can also access the cache on the other boards via Ethernet. Although it is early in the development of OmniXtend as a standard, if adopted it would enable new data center architectures and better serve memory-intensive workloads.
Join Us on Dec. 2
ZNS SSDs are picking up. Get to know the ecosystem surrounding zoned storage and hear from some of the industry’s biggest movers and shakers. Save the date here.
1. Data Age 2025, IDC, May 2020