This post is part of the blog series related to Storage Field Day 13 and the Pure Storage Accelerate conferences.
I had the incredible chance to attend two events last week in the United States: Storage Field Day Exclusive at Pure Accelerate 2017 (in San Francisco), followed by three full days of Storage Field Day 13 in Colorado, a well-known industry event organised by Gestalt IT.
During those two events, one thing struck me (and probably others): the emergence of real-time storage analytics as a de facto standard offering in storage arrays. To my knowledge, the first company to offer this feature was Nimble Storage (now part of HPE), and many others have since jumped on the bandwagon. Having recently been invited to attend Pure Accelerate, I will combine my view of these technological advances with their practical application in Pure Storage’s own implementation of real-time storage analytics: a platform called Pure1 and its machine-learning, AI-leaning extension called META.
Real-time Storage Analytics: the journey or the destination?
Are real-time storage analytics the peak feature of storage arrays, or is there more to come? I believe that while this is a much-needed feature, it is only an iteration towards the true and ultimate goal of self-aware, AI-driven storage (hopefully, these storage arrays will be kind enough to agree to host our precious data).
In October 2016, while at VMworld Europe in Barcelona, I attended the Intel Storage Builders conference and took part in a panel where prominent storage experts discussed the probable characteristics of next-generation storage arrays. Later that day I built upon my recollection of the arguments raised during the discussion, added my own thoughts, and summarised these in this post, which I recommend reading for some extra context.
The “Analytics / Intelligence” section of that late-2016 post is the most relevant to today’s discussion. We are only at the beginning of fundamental changes in the way storage arrays operate. Considering how real-time storage analytics solutions work, I think it makes little sense for such analytics platforms to run as segregated islands, as their power comes from the aggregation and analysis of googolesque amounts of data. Pure Storage, with their Pure1 META platform, claim for example to collect trillions of data points per day at this stage – an astonishing and enormous amount of data, but one that is necessary to produce recommendations that are as accurate as possible. The quality and accuracy of such recommendations would hardly be achievable if each customer retained their telemetry data locally.
Pure1 META – A current implementation of Machine Learning
We are currently at the inception of this new era. The stage that has been reached now, and that is the basis of any AI system, is the implementation of Machine Learning. Pure Storage have implemented Machine Learning in META. The technique consists of developing a mathematical model based on past data, then feeding the machines with large amounts of data and having them learn the possible future outcomes from the existing data.
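To make the idea a bit more concrete, here is a minimal, hypothetical sketch of what “learning from past data to predict a future outcome” could look like. The metric, values and forecasting horizon are invented purely for illustration; this is not Pure Storage’s actual implementation, which is obviously far more sophisticated than a simple trend fit.

```python
# A toy illustration of model-based prediction on past telemetry.
# All numbers below are invented for the sake of the example.
import numpy as np

# Daily capacity-used samples (TiB) for one array over the last two weeks
days = np.arange(14)
capacity_used = np.array([40.1, 40.5, 41.0, 41.2, 41.9, 42.3, 42.8,
                          43.1, 43.6, 44.2, 44.5, 45.1, 45.4, 46.0])

# "Learn" the trend from past data (here: a simple least-squares fit)
slope, intercept = np.polyfit(days, capacity_used, deg=1)

# Predict the likely outcome 90 days from now
forecast = slope * (days[-1] + 90) + intercept
print(f"Projected capacity used in 90 days: {forecast:.1f} TiB")
```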
The model created for META is called “Workload DNA”. So, what’s in a Workload DNA? Per Pure Storage, more than a thousand metrics are used to build the DNA of a given workload. They analysed metrics across 100,000+ workloads in their global network (read: the systems that report centrally to Pure1) and normalised the results to compare how certain workloads behave.
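Purely to illustrate the concept (the real Workload DNA relies on more than a thousand metrics and a much larger data set), one can picture a Workload DNA as a normalised vector of metrics that can be compared across workloads. The metric names and values below are invented:

```python
# Hypothetical sketch of a "Workload DNA": a vector of normalised metrics
# per workload that can be compared across the installed base.
import numpy as np

metrics = ["read_iops", "write_iops", "io_size_kb", "dedupe_ratio", "latency_ms"]

workload_a = np.array([12000.0, 3000.0, 32.0, 4.2, 0.6])   # e.g. an OLTP database
workload_b = np.array([11500.0, 2800.0, 28.0, 4.0, 0.7])   # a similar database
workload_c = np.array([300.0, 9000.0, 512.0, 1.3, 2.1])    # e.g. a backup target

def normalise(v: np.ndarray) -> np.ndarray:
    """Scale a raw metric vector to unit length so that workloads of
    different absolute size can be compared by shape alone."""
    return v / np.linalg.norm(v)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two normalised Workload DNA vectors."""
    return float(np.dot(normalise(a), normalise(b)))

print(similarity(workload_a, workload_b))  # close to 1.0: similar DNA
print(similarity(workload_a, workload_c))  # much lower: different behaviour
```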
Thanks to META and the vast amount of information amassed and used for machine learning, predictive scenarios can be used to determine how workloads with different Workload DNAs will fit together. Because similar workloads have been observed over time, it is very likely that an application using an Oracle Database Server or an MS SQL Server with a given profile (think standard applications deployed across many industries) already has a Workload DNA in Pure1 META; it is thus also possible to predict how these workloads will evolve and grow over time.
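Again as a purely illustrative sketch with invented numbers (real placement decisions would weigh far more factors), a placement check based on such growth predictions could look like this:

```python
# Hypothetical placement check: given projected demand for workloads with
# known DNA profiles, will they fit together on the same array?
ARRAY_LIMITS = {"capacity_tib": 100.0, "peak_iops": 200_000}

# Projected demand in 12 months, e.g. derived from the growth curves of
# similar workloads already observed in the fleet
projected = [
    {"name": "oracle-erp",  "capacity_tib": 38.0, "peak_iops": 90_000},
    {"name": "mssql-bi",    "capacity_tib": 25.0, "peak_iops": 60_000},
    {"name": "vdi-cluster", "capacity_tib": 30.0, "peak_iops": 70_000},
]

def fits(workloads, limits):
    """Return True if the combined projected demand stays within the array."""
    total_cap = sum(w["capacity_tib"] for w in workloads)
    total_iops = sum(w["peak_iops"] for w in workloads)
    return total_cap <= limits["capacity_tib"] and total_iops <= limits["peak_iops"]

print(fits(projected, ARRAY_LIMITS))      # False: projected IOPS exceed the array
print(fits(projected[:2], ARRAY_LIMITS))  # True: the first two fit comfortably
```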
What’s next?
The capabilities offered by AI-enabled storage arrays are just starting to blossom; they will continue to grow, and it is hard to predict which features and outcomes will become available. What we can expect is to see less and less human interaction with storage arrays when it comes to configuration and fine-tuning.
On the sci-fi & cool side of things, we can imagine some vendors implementing a Natural Language Processing interface to their platforms, where users can query the array or the entire fleet of arrays just as they interact with Siri or Alexa: “Hey META, how much free space do I have left?” or “Hey META, what is my current dedupe ratio?” – now that would be cool, but there needs to be business value in it for development efforts to head that way, and those efforts will most certainly be spent on further fine-tuning the performance and data placement algorithms instead.
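Just for fun, a toy version of that imagined interface could be as simple as mapping keywords in a question to fleet metrics; everything below (intents, metrics, values) is made up, and real NLP would of course be far more involved:

```python
# A toy stand-in for the imagined "Hey META" interface: naive keyword
# matching maps a question to a fleet metric. All values are invented.
FLEET_METRICS = {"free_space_tib": 42.7, "dedupe_ratio": 4.6}

INTENTS = {
    "free space": ("free_space_tib", "You have {:.1f} TiB of free space left."),
    "dedupe":     ("dedupe_ratio",   "Your current dedupe ratio is {:.1f}:1."),
}

def ask_meta(question: str) -> str:
    """Return an answer by matching keywords against known intents."""
    q = question.lower()
    for keyword, (metric, template) in INTENTS.items():
        if keyword in q:
            return template.format(FLEET_METRICS[metric])
    return "Sorry, I did not understand the question."

print(ask_meta("Hey META, how much free space do I have left?"))
print(ask_meta("Hey META, what is my current dedupe ratio?"))
```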
Obviously, those advances require a high-performance infrastructure capable of running those complex algorithms in real time and at very wide scale, since the premise of Pure1 META is that of a centralised, cloud-like offering for all Pure Storage customers. We can expect other storage vendors to adopt a similar model, because massive ingestion and processing of data in real time is essential for the Machine Learning / AI system to learn and self-improve; there is no room for segmentation unless customers are willing to pay the price, both from a cost and an effectiveness perspective.
One point that is still unclear to me is how this technology can be applied across different hardware platform architectures. For example, how will the system be trained to differentiate between the imperatives and requirements of a primary storage platform (such as FlashArray //X) versus an object storage platform (such as FlashBlade)? I’m still trying to wrap my head around this, and the entire argument might be moot, as the AI system would in both cases attempt to make the best-advised placement recommendations based on the Workload DNA as described earlier.
Self-Driven Storage and Ethics
In that article on next-gen storage arrays I discussed the topic of AI reasoning mechanisms and their similarity (or not) to human reasoning. An extra dimension worth adding covers the ethical aspects of AI decision-making. Let’s imagine an environment constrained in resources and with no place to move a workload. Several workloads compete, and one of them oversees critical life-support systems in a hospital.
How do we factor the life-saving or moral aspects into AI decision-making processes? Should the system just go ahead and throttle the VM’s performance as it would with any workload, at the risk of loss of human life, or should a kind of “Asimov’s laws of robotics” be incorporated into AIs? Most humans (except maybe BOFHs) would prioritise the preservation of life, but they too need to know that the system plays a critical role. So how is an AI supposed to know the criticality of a workload, when even humans are not necessarily aware of it? It seems that it is still up to humans to classify and categorise the purpose and criticality of systems.
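To illustrate that last point, here is a trivial sketch in which a throttling engine can only respect criticality because a human tagged the workloads first; the tags and policy values are invented:

```python
# Minimal sketch: the throttling decision depends entirely on a
# human-assigned criticality classification. Tags are invented.
WORKLOAD_CRITICALITY = {
    "hospital-life-support": "never-throttle",
    "erp-production":        "high",
    "dev-test-cluster":      "best-effort",
}

def throttle_candidates(workloads, criticality):
    """Return workloads that may be throttled under resource pressure,
    least critical first; untagged workloads are treated as best-effort."""
    order = {"best-effort": 0, "high": 1}
    eligible = [w for w in workloads
                if criticality.get(w, "best-effort") != "never-throttle"]
    return sorted(eligible, key=lambda w: order[criticality.get(w, "best-effort")])

print(throttle_candidates(list(WORKLOAD_CRITICALITY), WORKLOAD_CRITICALITY))
# ['dev-test-cluster', 'erp-production'] -- life support is never a candidate
```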
Conclusion
Real-time Storage Analytics are the next big thing (as in feature) in modern storage arrays. They are essential not just because they help customers gain better insights into their infrastructure. Off-premises solutions will become prevalent, and they are not only a necessary step for vendors to “up their game” against the competition: they are in fact a critical capability that storage vendors need in order to collect enormous amounts of data, which will help them build data models that will in turn be used to feed various simulations and AI experiments. Ultimately, they will help vendors build the next generation of storage arrays, which should likely emerge within 3 to 5 years (a conservative estimate would be 5 to 7 years), i.e. in the next decade, the 2020s, which is approaching fast.
This next generation of storage arrays will very likely incorporate a “first generation” specialised AI engine with an initial set of capabilities related to storage management, workload placement and performance optimisation, which will further alleviate the burden of day-to-day storage management. These features, and the way such specialised AIs react, may diverge from Machine Learning as we know it today, even if ML forms the basis of AI training.
This will have far-reaching consequences, such as rapidly accelerating the demise of the storage administrator profession as a specialised discipline (as I wrote earlier in 2016). Ultimately, with the advent of specialised AIs in other fields (not only storage; we could envision workload placement from a compute perspective and many more applications), we are likely to see similar changes take place in other parts of the data center infrastructure stack.
Disclosure
This disclosure is written specifically for the Storage Field Day 13 and Pure Storage Accelerate events. I was invited to the Pure Storage Accelerate and Storage Field Day 13 events by Gestalt IT & Pure Storage. Gestalt IT & Pure Storage covered travel expenses to the events; accommodation and food were also covered for the entire duration of both events. Transportation from home to PRG airport and back, transportation from SFO airport to the hotel, as well as food and accommodation costs on 10-Jun-17, were covered by me.
I did not receive any compensation for participating in this event, for which I took unpaid time off in order to attend (as is the case with any event I participate in). I am not obliged to blog or produce any kind of content. Any tweets, blog articles or any other form of content I have produced or may produce in the future related to this event are the exclusive product of my interest in technology and my will to share information with my peers.
In line with the concept of freedom of thought and critical thinking, I commit to sharing only my own point of view and analysis of any products, technologies, strategies & concepts I was introduced to.