DCD>Survey:
Data storage 2024
Exploring the latest attitudes
and trends in data center storage
Introduction

As the cornerstone of our increasingly digitized lives, data centers house data owned by individuals, enterprises, or government institutions. Due to the varied nature of the applications they serve, not all data centers are created equal, with different facilities designed to cater to different requirements - serving cloud storage, the (industrial) Internet of Things, or business models such as software as a service and streaming portals, to name but a few.

Storage infrastructures are the foundation of these facilities, but those aren't created equal either, which is why it's so important to get it right. These are the fundamental mechanisms that, among other things, are responsible for the capacity, security, scalability, and accessibility of this often mission-critical data.

For some businesses, storing data in a secure, structured manner will take precedence; for others, how quickly the data can be accessed might be the deciding factor that dictates which storage solution is best suited to the needs of the organization.

And with the proliferation of AI and other high-density workloads now taking center stage and producing vast new, increasingly complex data sets, is the way we store this information changing? To find out, in this survey we explore the attitudes and trends we're seeing across the industry to ascertain where we are now, and what we can expect next.

Contents

What kind of storage does your data center use?
What types of storage services do you typically provide for your customers?
What are your primary drivers when choosing storage for your facilities, or what drivers are you seeing from your clients/customers?
Has the advent of AI made you consider reevaluating your software stack, such as employing a DCIM system or employing AI to manage data center operations themselves?
When considering data storage, do you tailor your requirements to different loads and use cases?
When looking at data storage, do you consider the file system alongside the hardware itself?
Do you consider cybersecurity/ransomware/bad actors when making decisions about your storage needs?
Deduplication can reduce the overall latency and TCO of your data center. Is your facility capable of managing deduplication for your customers?
Do you think that your facility will see capacity issues over the coming decade as a result of the sheer quantities of data being stored?
Analysis
What kind of storage does your data center use?
Analysis
Taken as a single technology, all-flash storage came out on top among our respondents, with tape somewhat predictably bringing up the rear. This is likely because reading and writing data to tape is more time-consuming and labor-intensive than other methods of storage, and tape is slower to access, limiting which workloads it makes sense for.
Suggested actions
Know your workloads
Not every data center will be running exclusively high-density applications at all times.
Visibility into the requirements of each workload - in terms of factors such as speed,
scalability, cost, capacity, and accessibility - will be key in tailoring the right solutions for
your facility, and will help ensure you're not paying a premium for storage you don't need.
Many companies don’t actually know what data they’re storing or how it’s used, so this is
a crucial first step.
Mix and match
As our respondents have indicated, considering a hybrid solution allows you to use
the best parts of different storage types for different workloads and use cases. The
fastest and most expensive option does not make sense for cold storage, while slower
approaches are outdated for hot workloads. Consider a tiered approach designed for
tiers of workload, focused on price per required performance.

[Chart: storage types in use - All flash, Mechanical, Combination, Tape, Other]
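As a concrete illustration of "price per required performance," here is a minimal sketch in Python that picks the cheapest tier meeting each workload's latency requirement. The tier names, latencies, and prices are hypothetical placeholders, not survey data or vendor benchmarks:

```python
# Minimal sketch: pick the cheapest storage tier that still meets each
# workload's latency requirement. All figures are illustrative placeholders.

TIERS = {
    # name: (approx. access latency in ms, $ per TB per month)
    "all_flash":  (0.5, 40.0),
    "mechanical": (10.0, 10.0),
    "tape":       (60_000.0, 2.0),  # roughly a minute to first byte
}

def cheapest_tier(max_latency_ms: float) -> str:
    """Return the lowest-cost tier whose latency is acceptable."""
    candidates = [
        (cost, name)
        for name, (latency, cost) in TIERS.items()
        if latency <= max_latency_ms
    ]
    if not candidates:
        raise ValueError("No tier meets this latency requirement")
    return min(candidates)[1]

# Example: a transactional database vs. a compliance archive.
print(cheapest_tier(1.0))        # -> all_flash
print(cheapest_tier(100_000.0))  # -> tape (cheapest acceptable option)
```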
What types of storage services do you typically provide for your customers?
Analysis
With the majority of our respondents providing hot storage services to their customers, this
would suggest an increased appetite for speedy data access. This kind of service would be most
applicable to real-time applications or transactional systems where mission-critical data must
be immediately retrievable. In contrast, cold storage, i.e., data that is rarely accessed, such as
compliance and regulatory data, was less prevalent, although not entirely absent, reinforcing
the continued place of tape storage and HDDs. We would also expect services pertaining to east-west
data to account for a significant chunk of the responses, given the increase in data moving
among servers within a data center (as opposed to outside of it, as with north-south
traffic). This is likely due to advances in virtualization and private cloud, as well as the increased
adoption of both converged and hyper-converged infrastructure.
Suggested actions
Define and design
Getting a little more granular, when it comes to hot and cold data in particular, outlining
criteria for categorizing which is which - based on factors such as access patterns and
performance needs - can be really helpful. Once you've 'defined your data,' consider
designing a tiered storage architecture; this way you can more easily allocate higher-performance
storage systems for hot data, while reserving more cost-effective choices
like hard drives or tape libraries for cold data (see the sketch below).
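To make "define your data" concrete, here is a minimal sketch, assuming access recency and frequency are the categorization criteria; the thresholds and dataset names are illustrative, not survey findings:

```python
# Minimal sketch: categorize datasets as hot, warm, or cold from simple
# access statistics, then route each category to a storage tier.
# Thresholds are illustrative placeholders - tune them to your workloads.

from dataclasses import dataclass

@dataclass
class DatasetStats:
    name: str
    days_since_last_access: int
    reads_per_day: float

def categorize(stats: DatasetStats) -> str:
    if stats.days_since_last_access <= 7 and stats.reads_per_day >= 100:
        return "hot"   # e.g., transactional or real-time data
    if stats.days_since_last_access <= 90:
        return "warm"  # occasionally accessed
    return "cold"      # compliance/archive data

TIER_FOR = {"hot": "all-flash", "warm": "HDD", "cold": "tape library"}

for ds in [
    DatasetStats("orders-db", 0, 5_000.0),
    DatasetStats("q1-reports", 30, 2.0),
    DatasetStats("2019-audit-logs", 400, 0.01),
]:
    category = categorize(ds)
    print(f"{ds.name}: {category} -> {TIER_FOR[category]}")
```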
Future forecasting

It's clear to see that the scale and types of data being created are evolving, with IDC's
Global Datasphere Forecast predicting the creation of over 220ZB by 2026. Traditional
workloads required traditional storage; however, with east-west traffic and an appetite
for hot storage on the rise, do your own forecasting to future-proof your storage
requirements without needless and costly overprovisioning.

[Chart: storage services provided - Hot storage, constantly being accessed; Cold storage, mostly going in rather than out; East-west data, constantly on the move; Other]
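For a sense of what "do your own forecasting" can look like, here is a minimal compound-growth sketch; the starting capacity and growth rate are hypothetical inputs you would replace with your own telemetry:

```python
# Minimal sketch: project future storage demand from an observed annual
# growth rate, to size capacity without blind overprovisioning.
# Inputs are hypothetical placeholders - substitute your own measurements.

def project_demand(current_tb: float, annual_growth: float, years: int) -> list[float]:
    """Compound current usage forward, one entry per year."""
    demand = []
    usage = current_tb
    for _ in range(years):
        usage *= 1 + annual_growth
        demand.append(usage)
    return demand

current_tb = 500.0    # storage in use today
annual_growth = 0.30  # 30% observed year-over-year growth
for year, tb in enumerate(project_demand(current_tb, annual_growth, 5), start=1):
    print(f"Year {year}: {tb:,.0f} TB")
# With these inputs demand grows ~3.7x over five years - useful when
# deciding how much headroom (and no more) to provision.
```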
What are your primary drivers when choosing storage for your facilities,
or what drivers are you seeing from your clients/customers?
Analysis
It may be slightly surprising to see ‘security’ and ‘capacity’ rank above ‘cost’ for this question.
However, with respondents from the data center industry, it’s understandable, given that the
safety of mission-critical data as well as the adequate space to store it – and scale up as and
when needed – is paramount. That said, looking after the bottom line is always a priority in
any business, as indicated by 'cost' coming in as the third most popular response. Factors
such as 'energy efficiency' and 'compatibility with existing infrastructure' also ranked
strongly, and both are intrinsically linked with keeping expenditure to a minimum, showing us that
capex and opex alike are still (unsurprisingly) key considerations when choosing a storage solution.
Suggested actions
Mind the gap
As we've established, cold storage still has its place in the modern data center - not
only in terms of cost-efficiency when dealing with large magnitudes of data, but also for the
cybersecurity benefit of being able to air-gap a backup copy. Working in concert with
cybersecurity software, flash, and replication technology as data changes status from
hot to cold, air-gapping becomes a cost-effective way to store long-term data. Flash is
similarly key to cybersecurity, due to the speed of recovery it can offer.
Have a backup plan
To help ensure the availability of your data, consider backup deployments. Data stored
in a backup environment can be restored from an isolated and immutable location,
allowing you to instantly review snapshots of your data and recover from any point in
time. Alternatively, or in combination, you could opt for object storage, which protects
data by creating immutable backups, preserving the longevity and integrity of your
data. If all that seems a bit too complex, consider DRaaS (disaster recovery as a
service) to bolster business continuity.

[Chart: primary drivers - Security, Capacity, Cost, Energy efficiency, Read/write speeds, Compatibility with existing infrastructure, Futureproofing, Footprint, Easy install/maintenance, Other]
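As an illustration of the point-in-time recovery described above, here is a minimal sketch: given a recovery point, it selects the most recent snapshot taken at or before that moment. The snapshot timestamps are invented for the example:

```python
# Minimal sketch: pick the snapshot to restore from for a given recovery
# point - the most recent snapshot taken at or before that moment.
# Timestamps below are made-up examples.

from datetime import datetime

snapshots = [
    datetime(2024, 6, 1, 0, 0),
    datetime(2024, 6, 1, 6, 0),
    datetime(2024, 6, 1, 12, 0),
    datetime(2024, 6, 1, 18, 0),
]

def snapshot_for(recovery_point: datetime) -> datetime:
    eligible = [s for s in snapshots if s <= recovery_point]
    if not eligible:
        raise ValueError("No snapshot predates the recovery point")
    return max(eligible)

# Ransomware detected at 14:30? Roll back to the noon snapshot; at most
# 2.5 hours of data is lost - the effective RPO of this schedule.
print(snapshot_for(datetime(2024, 6, 1, 14, 30)))  # -> 2024-06-01 12:00:00
```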
Has the advent of AI made you consider reevaluating your software stack,
such as employing a DCIM system or employing AI to manage data center operations themselves?
Analysis
As the old adage goes, you can't manage – or improve upon – what you don't monitor. So
although 21.4 percent of our respondents claim to already be using an optimal software stack
now, how optimal will it be in two years' time? There is little that can't be improved with some
holistic visibility, so the fact that the majority of our respondents seem to be considering their
options in terms of AI or DCIM deployments for data storage tells us this is becoming more
commonplace across the industry.
Suggested actions
Ask the right questions
For those respondents that weren’t sure where to begin, when considering a potential
DCIM solution, ask the following questions: Is the solution used by industry leaders?
Does it automatically track assets? Can it grow with my needs? Does it support other
systems (e.g., reporting integration)? Is it secure? Is the solution non-proprietary
and open? And finally, is the solution going to be robust enough to meet and exceed my
needs in the long term?
AI optimization
With maintenance and cooling flagged as two of the primary use cases for AI in
terms of data center operations, it’s worth mentioning AI can deliver so much more,
particularly in terms of data storage. To name but a few benefits, AI can help optimize
your storage solutions based on usage patterns, enhance security through real-time
anomaly detection, and improve energy management by adjusting workloads and
predicting energy trends; it can even help with staffing support.

[Chart: responses - We already use an optimal software stack; We're looking at our options; We'd consider AI optimization of functions like maintenance and optimized cooling; We don't know where to start]
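To show what "real-time anomaly detection" on storage telemetry can mean in its simplest form, here is a minimal rolling z-score sketch; the metric stream and threshold are invented for illustration:

```python
# Minimal sketch: flag anomalous storage metrics (e.g., a sudden spike in
# write throughput) with a rolling z-score. Data and threshold are invented.

from collections import deque
from statistics import mean, stdev

def detect_anomalies(stream, window=20, threshold=3.0):
    """Yield (index, value) for points more than `threshold` standard
    deviations away from the rolling window's mean."""
    history = deque(maxlen=window)
    for i, value in enumerate(stream):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield i, value
        history.append(value)

# Steady write throughput with one spike - e.g., ransomware mass-encrypting.
metrics = [100.0 + (i % 5) for i in range(50)]
metrics[40] = 900.0
for i, v in detect_anomalies(metrics):
    print(f"anomaly at sample {i}: {v}")  # -> anomaly at sample 40: 900.0
```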
When considering data storage, do you tailor your requirements
to different loads and use cases?
Analysis
Most of our respondents do try to tailor their storage solutions where they can, or at least
to a degree, with some ensuring every appliance is assigned to its intended use case. With
technologies such as AI and ML proliferating at such breakneck speed, this is the way forward.
After all, AI is not only changing the way we do business, it’s changing our infrastructure needs.
Those respondents using the same storage type throughout may have similar workloads
across their data centers and don’t necessarily need a multi-faceted tailored solution, but there
is still optimization to be found at a more granular level.
Suggested actions
What do you have in the pipeline?
In an AI data pipeline, various stages align with specific storage needs to ensure efficient
data processing and utilization. By viewing AI processing as part of a project data
pipeline, enterprises can make sure their generative AI models are trained effectively and the
storage selection is fit for purpose at each stage, keeping those models both effective and
scalable (see the sketch below).
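As a simple illustration of matching pipeline stages to storage, here is a minimal sketch; the stage names and tier pairings are a plausible example, not a prescription from the survey:

```python
# Minimal sketch: map AI pipeline stages to storage characteristics so
# each stage's tier is chosen deliberately. Pairings are illustrative.

PIPELINE_STORAGE = {
    "ingest":      {"tier": "object storage",   "why": "cheap, scalable landing zone"},
    "preparation": {"tier": "HDD/hybrid",       "why": "bulk transforms, moderate I/O"},
    "training":    {"tier": "all-flash (NVMe)", "why": "GPUs must not starve for data"},
    "inference":   {"tier": "all-flash",        "why": "low-latency model/feature reads"},
    "archive":     {"tier": "tape/cold object", "why": "retain raw data for retraining"},
}

for stage, spec in PIPELINE_STORAGE.items():
    print(f"{stage:<12} -> {spec['tier']:<18} ({spec['why']})")
```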
Decisions, decisions

Before starting an AI project, a major decision you need to make is whether to use
cloud resources, on-premise data center resources, or both, in a hybrid cloud setup.
When it comes to tailoring your storage solution for AI, the cloud offers various types
and classes to match different pipeline stages, while on-premise storage can be more
limited, resulting in a universal solution for various workloads. Depending on your on-prem
setup, training can be faster in the cloud - but if you have multiple tiers on-prem,
it can end up being cheaper in the long term.

[Chart: responses - Yes, every appliance is tailored for its expected use; No, we use the same storage type throughout; Somewhat, we have a range of appliances]
When looking at data storage, do you consider the file
system alongside the hardware itself?
Analysis
For this question, the responses were fairly evenly split, with the largest share of respondents saying
they leave such matters to their storage providers. This, combined with 31.2 percent saying this
'wasn't their area at all,' is a testament to how complex data storage can be, particularly now
that operators have AI and other high-traffic, high-density workloads to store and manage.
Suggested actions
Go global
As IT teams and data center operators grapple with what are often several incompatible
storage protocols, particularly when it comes to unstructured data, global file systems,
also known as distributed file systems, are gaining traction. Although not new, these
systems put enterprise data under a single file access namespace, so that businesses
can access data from anywhere, offering the flexibility, resilience, and capacity of the
cloud, while retaining the simplicity of NAS storage.
Ask an expert
Data storage is no longer just about storing data; it's about extracting its value to gain
a competitive edge. If you feel out of your depth, customizable support is available that
can be tailored to meet the needs of bandwidth-hungry IT environments. Between 80 and 90
percent of data collected today is unstructured, and waiting to be extracted from it is a
wealth of information crucial to informed decision-making. Seek a third party who can
help you tap into it.

[Chart: responses - Yes, but I leave this to my storage provider/partner; Yes, I am fully engaged in the software stack as well as the hardware; No, this isn't my area at all]
Do you consider cybersecurity/ransomware/bad actors
when making decisions about your storage needs?
Analysis
The responses to this question echo the sentiments of question three, which cited security
as the primary driver when choosing a storage solution. It is clear, then, that when it comes to
storage, security is front of mind.
Suggested actions
Back(up) to basics
Having a backup plan is integral when dealing with mission-critical client data.
Storing all your data in one place not only isn't smart, it's uneconomical. Beyond simply
keeping multiple copies of your data, a typical backup strategy might include
snapshots running on your storage, a local backup of both files and images on a separate
storage device, as well as an offsite backup. It can also be helpful to define ideal recovery
points and times for your business. At the same time, however, data deduplication is a
critical part of improving storage utilization and reducing costs - and in some sectors is
required. Make sure that the only duplication you have is intentional (see the sketch below).
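That "snapshots, local copy, offsite copy" pattern is essentially the classic 3-2-1 rule: at least three copies, on two media types, with one offsite. Here is a minimal sketch of checking a backup inventory against it; the inventory records are invented for illustration:

```python
# Minimal sketch: check a backup inventory against the 3-2-1 rule -
# at least 3 copies, on at least 2 media types, at least 1 offsite.
# The inventory below is invented for illustration.

from typing import NamedTuple

class Copy(NamedTuple):
    location: str  # "onsite" or "offsite"
    medium: str    # e.g., "flash", "HDD", "tape", "object"

def satisfies_3_2_1(copies: list[Copy]) -> bool:
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.location == "offsite" for c in copies)
    )

inventory = [
    Copy("onsite", "flash"),  # primary snapshots on the array
    Copy("onsite", "HDD"),    # local backup on a separate device
    Copy("offsite", "tape"),  # air-gapped offsite copy
]
print(satisfies_3_2_1(inventory))  # -> True
```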
People and procedures
The storage solution you choose could have the highest security credentials on the
market, but we’re only human, and people make mistakes. Ensure human error is kept
to a minimum by training employees on security best practices. Performing security
audits and assessments regularly, as well as developing policies for not only data
storage, but transmission and disposal too, will ensure security remains a priority
throughout the data's entire lifecycle.

[Chart: responses - It's a major consideration; No, we do it all with our DCIM/security stack; I didn't know it was a consideration]
Deduplication can reduce the overall latency and TCO of your data center.
Is your facility capable of managing deduplication for your customers?
Analysis
As the question suggests, data deduplication is of value as it can significantly reduce storage
space, while lessening the amount of bandwidth wasted moving data to and from
remote storage locations. It can also help slash backup and recovery times, improving data
center efficiency by using less onsite power. All of this ultimately contributes toward a reduced
TCO for your data center; and from a customer standpoint, fewer redundant copies of data
mean a reduced risk of data loss or corruption.
Suggested actions
There are no guarantees
Remember that deduplication ratios – the metric used to measure the success of the
deduplication process – provided by vendors tend to be best-case estimates. Not all data
is created equal, and the nature of your data is a vital component in determining how
effective the process will be. Take them with a pinch of salt.
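To see why the ratio depends so heavily on the data, here is a minimal sketch of deduplication at fixed block granularity; real systems use variable-size chunking and persistent fingerprint indexes, so treat this purely as an illustration:

```python
# Minimal sketch: estimate a deduplication ratio by hashing fixed-size
# blocks and counting unique ones. This only illustrates why the
# achievable ratio depends entirely on the data itself.

import hashlib
import os

def dedupe_ratio(data: bytes, block_size: int = 4096) -> float:
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    unique = {hashlib.sha256(b).digest() for b in blocks}
    return len(blocks) / len(unique)  # logical blocks per physical block

# Highly repetitive data (e.g., many near-identical VM images) dedupes well...
repetitive = b"A" * (4096 * 100)
print(f"{dedupe_ratio(repetitive):.1f}:1")   # -> 100.0:1

# ...while unique (or encrypted/compressed) data barely dedupes at all.
random_data = os.urandom(4096 * 100)
print(f"{dedupe_ratio(random_data):.1f}:1")  # -> 1.0:1
```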
Is deduplication right for you?
On the face of it, data deduplication is a good thing; however, 38.1 percent of our
respondents seem to disagree. There are several reasons you might not offer
deduplication as a service, for example: performance issues; data loss if data is
incorrectly matched; difficulties with implementation and maintenance; as well as the
fact that deduplication creates new metadata, which can require storage space and create
data integrity issues. But remember, there are times when deduplication is required.
The bottom line? Know what type of data you're dealing with.

[Chart: responses - Yes, and our customers see the benefit; Yes, but our customers don't engage with it; It's not something we offer]
Do you think that your facility will see capacity issues over the coming
decade as a result of the sheer quantities of data being stored?
Analysis
It’s safe to say it’s no longer a case of ‘if’ we will face capacity issues but when, as confirmed
by our respondents, with some looking to build new facilities, or consolidate existing ones to
create more space and the scope to scale.
Suggested actions
Get cold
To deal with ever-increasing data quantities, consider moving colder data to cold storage,
including object storage with a tape tier. Thanks to massive improvements in density in
cold storage tiers like HDDs and tape, it is possible to keep up with the data surge without
massively increasing one's footprint.
Get your head in the (hybrid) cloud
In this case, the advantages of cloud storage are two-pronged: not only will it
help maximize on-site storage capacity, but it will provide some respite from data
management. Moving infrequently accessed data to the hybrid cloud will mean little to no
oversight is required, allowing storage admins the time to focus on higher-priority data
requiring high-performance storage solutions.
[Chart: responses - Yes, but we've planned for it; Yes, and we will be looking to consolidation/densification; Yes, and will look to build out more facilities; No, our retention policies mean we'll remain in scope; No, it'll work itself out]
Analysis
In terms of the type of storage being implemented by our respondents in their data centers, the results showed an increasing move to ever-faster storage types, but a similarly growing fear of a data deluge.

There is no single answer to this dilemma - each storage technology has its own merits and pitfalls - and no data center or customer would rely on one approach. Which combination you choose is entirely dependent on the workloads your facility is running, with a hybrid solution tailored to the individual needs of the data center an advisable approach to avoid costly overprovisioning (which won't do anyone's ESG targets any favors either).

This is where the importance of knowing your workloads and defining your data cannot be overstated. With many of our respondents housing both hot and cold data, alongside an increase in east-west data, ensuring you have criteria for categorization will take the headache out of designing an effective tiered storage architecture. Forecasting is also paramount, to ensure your solution can scale with future workloads; this is particularly pertinent when 'capacity' issues are being cited across the industry as one of the biggest concerns when it comes to data storage in the AI era.

Regardless of which solution is best for you, putting all your eggs in one basket is never a good idea, not only from a TCO perspective, but in terms of keeping critical data safe. The fact security came out on top as the primary driver when choosing data center storage is a testament to this, and a reminder that it's not just about what you store data on, but how you secure it, and how many copies you keep.

Consider backup appliances, alone or in combination with object storage, to protect data by means of isolated and immutable backups, whilst ensuring business continuity through the utilization of DRaaS.

And when it comes to managing data center operations, many of our respondents said they were either already implementing, or had considered, AI or DCIM. These tools can help optimize storage based on usage patterns, with real-time anomaly detection to enhance security, as well as the ability to improve energy management by predicting energy trends and adjusting workloads accordingly. So, despite the initial investment, the right management software will pay dividends when storing data long-term, by providing the holistic visibility required to make informed decisions, innovate, and improve.

That said, it's not just about optimizing your tech; it's important to optimize your workforce, too. Why invest in your storage infrastructure just to leave it open to human error? Ensuring personnel are properly trained in process and procedure is crucial, not only to prevent mistakes in the first place, but so they know what to do should an issue arise, without it resulting in disaster. You may have a disaster recovery plan in place for your data, but what about your staff?

Final thoughts

Once you have your storage solution, capacity, and placement sorted, you need to manage that data, particularly if it's unstructured. Consider an end-to-end data management platform that supports the entire AI pipeline and unstructured data lifecycle – from the all-flash performance required to power AI, to low-cost archiving to train AI models using your unique data. Today, extracting the value from what you have is the competitive differentiator.

Ultimately, storing new types (and volumes) of data requires a new way of thinking, alongside a toolbox approach that tailors your storage solution to meet your needs. The way we move forward will involve asking the right questions, and collaborating with trusted partners to fill any gaps in your knowledge. After all, the IT landscape is experiencing a seismic shift, and you don't need to go it alone. Put your faith in the people in the know to help you navigate your way through, realizing the potential AI has to offer, both inside and outside the data center.