NetApp CTO Predictions 2018

Many have heard us at NetApp talking recently about how the world is changing fundamentally and quickly: digital transformation is, or should be, the focus of any enterprise's IT strategy. And squarely at the center of that focus is data. As data continues to become even more distributed, dynamic and diverse, everything from IT infrastructures to application architectures to provisioning strategies will have to change in response to new realities in the hybrid cloud world. It's in that context that we conceived our top five CTO predictions for 2018. We look forward to watching as the coming year unfolds.
Evolving from “Big Data” to “Huge Data” will demand new solid-state-driven architectures
As the demand to analyze enormous sets of data ever more rapidly increases, we need to move the data closer to the compute resource.
Persistent memory will allow ultra-low-latency computing without data loss, and these latency demands will finally force software architectures to change, creating new data-driven opportunities for businesses. Flash technology has been a hot topic in the industry, but the software running on it didn't really change; it just got faster.
This shift is being driven by the evolution of IT's role in the organization. In the past, IT's primary function was to automate and optimize processes such as ordering, billing and accounts receivable. Today, IT is integral to enriching customer relationships by offering always-on services, mobile apps and rich web experiences. The next step will be to monetize the data collected through various sensors and devices to create new business opportunities, and it is this step that will require new application architectures supported by technologies like persistent memory.
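To make the persistent-memory point concrete, here is a minimal sketch of the kind of change application code undergoes: instead of serializing records and pushing them through a block I/O stack, the application maps a region of persistent memory and updates it in place with ordinary loads and stores. The path, record layout and use of a file on a DAX-style mount are illustrative assumptions, not a NetApp API.

```python
import mmap
import os
import struct

PMEM_PATH = "/mnt/pmem/counters.bin"  # hypothetical file on a DAX-mounted persistent-memory device
REGION_SIZE = 4096                     # one page is enough for this sketch

# Create and size the backing region once (a real application would do this at setup time).
with open(PMEM_PATH, "a+b") as f:
    if os.path.getsize(PMEM_PATH) < REGION_SIZE:
        f.truncate(REGION_SIZE)

with open(PMEM_PATH, "r+b") as f:
    region = mmap.mmap(f.fileno(), REGION_SIZE)

    # Update a counter in place: a byte-addressable store, not a block write.
    count = struct.unpack_from("<Q", region, 0)[0]
    struct.pack_into("<Q", region, 0, count + 1)

    # With true persistent memory, durability comes from ordering the store with a cache flush;
    # flush() stands in for that step on a conventional memory map.
    region.flush()
    region.close()
```

The point of the sketch is the shape of the code: no serialization layer and no I/O stack in the data path, which is exactly the kind of software change the prediction anticipates.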
Data becomes self-aware

Today, we have processes that act on data and determine how it's moved, managed and protected. But what if the data defined itself instead?
As data becomes self-aware and even more diverse than it is today, the metadata will make it possible for the data to proactively transport, categorize, analyze and protect itself. The flow between data, applications and storage elements will be mapped in real time as the data delivers the exact information a user needs at the exact time they need it. This also introduces the ability for data to self-govern. The data itself will determine who has the right to access, share and use it, which could have wider implications for external data protection, privacy, governance and sovereignty.
For example, if you are in a car accident, there may be a number of different groups that want or demand access to the data from your car. A judge or insurance company may need it to determine liability, while an auto manufacturer may want it to optimize the performance of the brakes or other mechanical systems. When data is self-aware, it can be tagged so that it controls who sees which parts of it and when, without additional time-consuming and potentially error-prone human intervention to subdivide, approve and disseminate the valuable data.
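As a rough illustration of what such tagging might look like in software, here is a minimal sketch in which each field of a telemetry record carries its own access policy and the record filters itself for a given requester. The roles, field names and SelfGoverningRecord class are invented for the example; they are not a NetApp interface.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set

@dataclass
class SelfGoverningRecord:
    """A data record whose metadata says who may see each field."""
    values: Dict[str, Any]
    # Per-field policy: field name -> set of roles allowed to read it.
    policy: Dict[str, Set[str]] = field(default_factory=dict)

    def view(self, requester_role: str) -> Dict[str, Any]:
        """Return only the fields this requester's role is allowed to see."""
        return {
            name: value
            for name, value in self.values.items()
            if requester_role in self.policy.get(name, set())
        }

# Hypothetical crash-telemetry record from the car-accident example.
crash_record = SelfGoverningRecord(
    values={"speed_kph": 72, "brake_pressure": 0.9, "gps_trace": "..."},
    policy={
        "speed_kph": {"insurer", "court", "manufacturer"},
        "brake_pressure": {"manufacturer"},
        "gps_trace": {"court"},
    },
)

print(crash_record.view("insurer"))       # {'speed_kph': 72}
print(crash_record.view("manufacturer"))  # {'speed_kph': 72, 'brake_pressure': 0.9}
```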
Data will grow faster than the ability to transport it... and that's OK!

It's no secret that data has become incredibly dynamic and is being generated at an unprecedented rate that will greatly exceed the ability to transport it.

However, instead of moving the data, the applications and resources needed to process it will be moved to the data, and that has implications for new architectures like edge, core and cloud. In the future, the amount of data ingested in the core will always be less than the amount generated at the edge, but this won't happen by accident. It must be enabled very deliberately to ensure that the right data is being retained for later decision making.

For example, autonomous car manufacturers are adding sensors that will generate so much data that there's no network fast enough between the car and data centers to move it. Historically, devices at the edge haven't created a lot of data, but now, with sensors in everything from cars to thermostats to wearables, edge data is growing so fast that it will exceed the capacity of the network connections to the core. Autonomous cars and other edge devices require real-time analysis at the edge in order to make critical in-the-moment decisions. As a result, we will move the applications to the data.
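A sketch of what "moving the application to the data" can mean in practice: rather than shipping every raw sensor reading to the core, an edge process summarizes locally and forwards only what the core actually needs. The summarization rule, threshold and send_to_core stub below are illustrative assumptions, not a prescribed design.

```python
from statistics import mean
from typing import Dict, Iterable, List

def summarize_at_edge(readings: Iterable[float], anomaly_threshold: float = 90.0) -> Dict[str, float]:
    """Reduce a window of raw sensor readings to the few numbers the core needs."""
    window: List[float] = list(readings)
    return {
        "count": float(len(window)),
        "mean": mean(window),
        "max": max(window),
        # Only the count of anomalous readings travels upstream, not the readings themselves.
        "anomalies": float(sum(1 for r in window if r > anomaly_threshold)),
    }

def send_to_core(summary: Dict[str, float]) -> None:
    """Stand-in for the uplink to the core data center (assumed, not specified here)."""
    print("uplink:", summary)

# One window of hypothetical proximity readings, processed where they were produced.
send_to_core(summarize_at_edge([12.1, 11.8, 95.2, 12.0, 11.9]))
```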
Virtual machines become “rideshare” machines

It will be faster, cheaper and more convenient to manage increasingly distributed data using virtual machines provisioned on webscale infrastructure than it will be on real machines.

This can be thought of in terms of buying a car versus leasing one or using a rideshare service like Uber or Lyft. If you are someone who hauls heavy loads every day, it would make sense for you to buy a truck. However, someone else may only need a certain kind of vehicle for a set period of time, making it more practical to lease. And then there are those who only need a vehicle to get them from point A to point B, one time only: the type of vehicle doesn't matter, just speed and convenience, so a rideshare service is the best option.
This same thinking applies in the context of virtual versus physical machine instances. Custom hardware can be expensive, but for consistent, intensive workloads it might make more sense to invest in the physical infrastructure. A virtual machine instance in the cloud supporting variable workloads would be like leasing: users can access the virtual machine without owning it or needing to know any details about it, and at the end of the “lease,” it's gone. Virtual machines provisioned on webscale infrastructure (that is, serverless computing) are like the rideshare service of computing: the user simply specifies the task that needs to be done and leaves the rest of the details for the cloud provider to sort out, making it more convenient and easier to use than traditional models for certain types of workloads.
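To ground the analogy, here is a deliberately simplified sketch: the "rideshare" user writes only the task itself as a handler function, while a stand-in run_serverless helper plays the role of the provider that provisions, executes and tears everything down. The helper and event shape are invented for illustration; real serverless platforms differ in their interfaces.

```python
from typing import Any, Callable, Dict

# The only thing the "rideshare" user writes: the task itself.
def process_upload(event: Dict[str, Any]) -> Dict[str, Any]:
    """A sample task: pretend to process one uploaded object and return a result."""
    return {"object": event["object_key"], "status": "processed"}

def run_serverless(handler: Callable[[Dict[str, Any]], Dict[str, Any]],
                   event: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the provider: provisioning, scaling and teardown are not the user's concern."""
    # (In a real platform, capacity would be allocated here and released afterwards.)
    return handler(event)

print(run_serverless(process_upload, {"object_key": "reports/q3.pdf"}))
```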
Emergence of decentralized, immutable mechanisms for managing data

Mechanisms to manage data in a trustworthy, immutable and truly distributed way (meaning no central authority) will emerge and have a profound impact on the data center. Blockchain is a prime example of this.
Decentralized mechanisms like blockchain challenge the traditional sense of data protection and management. Because there is no central point of control, such as a centralized server, it is effectively impossible to change or delete information recorded on a blockchain, and all transactions are irreversible.
Think of it as a biological system: a host of small organisms that each know what they're supposed to do without having to communicate with anything else or be told what to do. Then you throw in a bunch of nutrients, in this case data. The nutrients know what to do, and it all starts operating in a cooperative manner, without any central control, like a coral reef. Current data centers and applications, by contrast, operate like commercially managed farms, with a central point of control (the farmer) managing the surrounding environment.

The decentralized, immutable mechanisms for managing data will offer microservices that the data can use to perform necessary functions. The microservices and data will work cooperatively, without overall centrally managed control.
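The immutability claim is easiest to see in a toy hash chain. The sketch below illustrates only the linking idea; a real blockchain adds consensus and distribution of the chain across many independent parties, which is what turns "tampering is detectable" into "tampering is effectively impossible."

```python
import hashlib
import json
from typing import Any, Dict, List

def _block_hash(contents: Dict[str, Any]) -> str:
    """Hash a block's record together with the hash of the block before it."""
    payload = json.dumps(contents, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain: List[Dict[str, Any]], record: Any) -> None:
    """Link a new record to the tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"record": record, "prev_hash": prev_hash}
    block["hash"] = _block_hash({"record": record, "prev_hash": prev_hash})
    chain.append(block)

def chain_is_valid(chain: List[Dict[str, Any]]) -> bool:
    """Every block must match its own hash and point at its predecessor's hash."""
    for i, block in enumerate(chain):
        expected = _block_hash({"record": block["record"], "prev_hash": block["prev_hash"]})
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

ledger: List[Dict[str, Any]] = []
append_block(ledger, {"txn": "store object A"})
append_block(ledger, {"txn": "replicate object A to edge"})
print(chain_is_valid(ledger))   # True

ledger[0]["record"]["txn"] = "delete object A"   # tamper with history
print(chain_is_valid(ledger))   # False: the change is detected, not silently accepted
```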
© 2017 NetApp, Inc. All Rights Reserved. NETAPP, the NETAPP logo, and the marks listed at http://www.netapp.com/TM are trademarks of NetApp, Inc.
Other company and product names may be trademarks of their respective owners. October 2017