Baseball is back. And with the return of a new season comes hours of content: player statistics, team records, video footage of every play in every game from multiple camera angles in stadiums across the globe.
In fact, MLB Network uploads up to 50 TB of new content per day, and holds even more in its archives. To deliver timely data during games, in daily highlights, and in future coverage, data management teams must decide where to store hot, warm, or cold data depending on how quickly and how often they need to access it.
Whether it’s sports statistics, healthcare, compliance, or historical data, more data is generated daily than can be analyzed, and the rate continues to grow.
Industry experts estimate that data is growing at approximately 30% annually and could reach as much as 175 ZB by 2025. Though not all data sets need to be analyzed right away, they may still be important to keep. That’s where cold storage comes in.
Cold storage retains any data that is not actively in use. Data can be kept in archival, or “cold,” lower-cost, infrequently accessed storage tiers, as opposed to live, “hot” production data, such as financial transactions, that needs to be accessed immediately.
According to industry analysts, 60% or more of stored data can be archived or stored in cooler storage tiers until it’s needed.
“The world is generating and storing more archival data than ever before. That’s why cold storage is the fastest growing segment in storage,” said Steffen Hellmold, vice president of Corporate Strategic Initiatives at Western Digital. “There’s a major disruption underway. As more and more bits are stored, cloud providers are reinventing their architectures with accessible archives to manage all that data.”
Why go cold?
As we enter the Zettabyte Age, the more data is stored, the more it costs. The largest pools of data are typically unstructured or semi-structured, such as video footage, genomics data, or data used to train machine learning and AI models. Cold, or secondary, storage is less expensive than hot, or primary, storage, so it makes sense to keep data that is not actively needed in pools of cooler storage at a lower cost.
“The biggest consideration is how frequently you need to access the data, or how readily available you want it to be when you do need it,” said Mark Pastor, director of Platform Product Management at Western Digital. Today’s cloud storage service SLAs are based on how often data needs to be accessed and how long a customer is willing to wait. For some cloud providers, data stored in a cooler tier might take five to 12 hours to access, whereas nearline data is stored in a warmer tier and available immediately, but at a price.
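The tradeoff Pastor describes can be sketched as a simple tiering policy. The function and thresholds below are purely illustrative assumptions for this article, not any cloud provider’s actual SLA terms:

```python
# Illustrative tiering policy. The tier names and all thresholds here are
# hypothetical assumptions, not any cloud provider's real SLA.

def choose_tier(accesses_per_month: float, max_wait_hours: float) -> str:
    """Pick a storage tier from access frequency and tolerable retrieval delay."""
    if accesses_per_month >= 30 or max_wait_hours == 0:
        return "hot"   # immediate access, highest cost per GB
    if accesses_per_month >= 1 or max_wait_hours < 1:
        return "warm"  # nearline: retrieved quickly, moderate cost
    return "cold"      # archival: retrieval may take hours, lowest cost

print(choose_tier(100, 0))   # frequently read production data -> hot
print(choose_tier(0.1, 12))  # rarely touched compliance archive -> cold
```

Real tiering decisions also weigh egress fees, minimum storage durations, and early-deletion penalties, but the core of the decision is the same access-frequency-versus-wait-time tradeoff.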
“Aside from cost and accessibility, the third factor is a psychological one. It almost goes against human nature to delete anything in case you might need it sometime down the line,” said Kurt Chan, vice president of Platforms at Western Digital. “You never know what data is going to be valuable later on.”
Cold storage options to date
Until now, most secondary (cold) storage has been contained on either tape or hard disk drives (HDDs), with hot data moving to solid-state drives (SSDs). Western Digital supplies SSDs, HDDs, and tape heads, but sees secondary storage growing even faster than primary storage. According to Horison Information Strategies, today at least 60% of all digital data can be classified as archival, and it could reach 80% or more by 2025, making it by far the largest and fastest growing storage class while presenting the next great storage challenge.
Tape is less expensive than HDDs, but its much higher data access latency makes it suitable mainly for the coldest data. If the value of data is tied to the ability to access and mine it, there’s an order of magnitude difference between storing it on disk versus tape.
In other words, data accessibility increases data value.
HDDs are evolving to next-generation disk technologies and platforms that enable both better TCO and accessibility for active archive solutions. Advancements in HDD technology include new data placement technologies (i.e. zoning), higher areal densities, mechanical innovations, intelligent data storage, and new materials innovations.
Future cold storage technologies
Hyperscalers, who house the largest pools of data, are looking for the most cost-effective ways to store the ever-increasing amount of data. Thus, new tiers are emerging for cold storage with IT organizations reinventing their archival storage architectures.
With long-term data storage moving to the century-scale mark – data that needs to be stored for 100 years or more – new cold storage solutions are in development, including DNA, optical, and even undersea deep-freeze storage.
In November last year, Western Digital, together with Twist Bioscience, Illumina, and Microsoft as founding members, announced the DNA Data Storage Alliance to advance the field of DNA data storage. Due to its high density, DNA has the capacity to pack large amounts of information into a small space. DNA can also last for thousands of years, making it an attractive medium for archival storage.
As data generation continues to grow at incredible rates, cold storage will prove integral to preserving that data affordably and for the long term. Storage innovators are creating long-term data storage solutions that make valuable data accessible both in the near term and for generations to come.