If Salvador Dali were at our NetApp MAX Data Tech Field Day presentation, I’m sure he’d start painting melting NVDIMM or Optane modules. Excuse my nerdiness, though: have you ever heard of NetApp MAX Data? And what if I said Plexistor?
Perhaps it will ring a bell for some of you who follow the developments in the persistent memory space very closely. It happens that Plexistor was acquired by NetApp in May 2017, barely a year after presenting their product at Storage Field Day 9. Long story short, NetApp rebranded Plexistor to Memory-Accelerated Data, which shortens very smoothly to MAX Data.
What is MAX Data?
As hinted earlier, MAX Data is a software-defined architecture that leverages persistent memory for data storage. Prior to the NetApp acquisition, Plexistor used to call this SDM, as in Software-Defined Memory. Two key concepts are important here. First, MAX Data is a software solution. Second, it’s an in-server solution using memory slots on the motherboard to either leverage NVDIMM modules or Intel Optane DC Persistent Memory modules.
MAX Data consists of two storage tiers, and as such is a data tiering solution. The primary tier (PMEM) consists of NVDIMM or 3DXPoint (Optane) storage. This is the area where application mount points are configured to write to and read from. The secondary tier consists of either local flash SSDs or a LUN served from an all-flash array, and is used to store data that is less frequently accessed. NetApp recommends (and enforces) a 1:25 ratio between the primary tier capacity and the secondary tier capacity: each GB of PMEM must be backed by 25 GB of flash storage. MAX FS is the filesystem that layers on top of both storage tiers and is installed on the PMEM tier (on Linux, CentOS or Red Hat). The way the solution is implemented allows for single-digit microsecond latency (<10 μs), which is nothing short of astounding. The diagram below depicts the architecture of the solution.
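To put the 1:25 sizing rule in concrete terms, here is a minimal back-of-the-envelope sketch. The function name and the example server configuration are mine; the only figure taken from NetApp's guidance is the ratio itself.

```python
# Hypothetical sizing helper for the MAX Data 1:25 tiering rule.
# NetApp's guidance: each GB of PMEM (Tier 1) must be backed
# by 25 GB of flash (Tier 2).

TIER_RATIO = 25  # flash GB required per PMEM GB


def required_flash_gb(pmem_gb: float) -> float:
    """Return the minimum Tier 2 (flash) capacity for a given Tier 1 size."""
    return pmem_gb * TIER_RATIO


# Example: a server populated with 12 x 16 GB NVDIMMs = 192 GB of PMEM
pmem = 12 * 16
print(required_flash_gb(pmem))  # 4800 (GB of flash on the secondary tier)
```

In other words, even a modest PMEM tier pulls in a sizeable flash tier behind it, which is worth keeping in mind when sizing the backing all-flash array.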
Providing local persistent memory storage to a server isn’t a challenge per se: populate the DIMM slots with the relevant PMEM devices and you’re good to go. Of course, if you want to use a tiering technology such as MAX Data, it gets a bit spicier than that. The challenge lies in delivering the expected performance with the enterprise-grade data protection levels that end users (and especially paranoid storage admins, note the pleonasm) are accustomed to.
You may have spotted in the diagram above two words that are very familiar to NetApp users: Snapshot and Mirror. The snapshot feature leverages the usual Ontap snapshot capabilities and consists of taking a snapshot of the entire primary tier onto the secondary tier via Snapshot / SnapMirror.
Mirroring works differently: NetApp is working on a feature called MAX Data Recovery, where it will be possible to use a dual-node (active-passive) setup to replicate data from the Tier 1 of one server to the Tier 1 of another server. This would be done directly via memory mapping and over an RDMA connection between both servers (NetApp recommends a 100 GbE card in that case). The solution operates at roughly 10 μs latency, which allows writes to be protected at memory speeds. One element to consider is that although the data is copied from one server to the other, it remains active only on the primary server. One of the interesting things with MAX Data Recovery is that the recovery server, when properly sized, can act as the replication target for up to 8 distinct primary servers.
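A rough sketch of what "properly sized" might mean for that recovery server, under one big assumption of mine that NetApp did not spell out: that the recovery server's Tier 1 must hold a full replica of each primary's Tier 1. The function and numbers are illustrative only; the 8-primary fan-in limit is the figure from the presentation.

```python
# Hypothetical fan-in sizing for a MAX Data Recovery target.
# Assumption (mine, not confirmed by NetApp): the recovery server's PMEM
# must accommodate a full replica of every primary server's Tier 1.

MAX_PRIMARIES = 8  # fan-in limit mentioned by NetApp


def recovery_pmem_gb(primary_tier1_gb: list) -> float:
    """Minimum PMEM on the recovery server for the given primaries."""
    if len(primary_tier1_gb) > MAX_PRIMARIES:
        raise ValueError("at most %d primaries per recovery server" % MAX_PRIMARIES)
    return sum(primary_tier1_gb)


# Example: four primaries, each with 192 GB of PMEM Tier 1
print(recovery_pmem_gb([192, 192, 192, 192]))  # 768
```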
A great thing with MAX Data is how it integrates with the NetApp Data Fabric. It is in fact possible to replicate data from the secondary tier onto NetApp Cloud Volumes Services Ontap.
NVDIMM or Optane?
NetApp says that their solution supports NVDIMM or Optane, so which one should we choose? The major difference between NVDIMM and Optane lies in price and capacity. NVDIMM modules are commonly available in 8 or 16 GB capacities, with Crucial announcing yesterday the availability of their first 32 GB NVDIMM modules. Other memory manufacturers have announced 32 GB modules in the past, but for now we’re pretty much confined to 16 GB modules. Optane, on the other hand, is supposedly cheaper than NVDIMM (33% of the price of NVDIMM at capacity parity) and offers greater capacities. It’s worth noting that MAX Data also supports DRAM (which is volatile memory), but only for testing purposes.
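Taking the "33% of the price at capacity parity" figure at face value, the trade-off is easy to put into numbers. The per-GB prices below are placeholders I made up for illustration, not quotes; only the 0.33 ratio comes from the presentation.

```python
# Illustrative Tier 1 cost comparison, assuming Optane costs ~33% of
# NVDIMM per GB at capacity parity (ratio from the presentation;
# the dollar figures are invented placeholders).

NVDIMM_PRICE_PER_GB = 30.0  # placeholder $/GB
OPTANE_PRICE_PER_GB = NVDIMM_PRICE_PER_GB * 0.33


def tier1_cost(pmem_gb: float, price_per_gb: float) -> float:
    """Cost of populating a Tier 1 of the given size at the given $/GB."""
    return pmem_gb * price_per_gb


pmem = 512  # GB of Tier 1
print(tier1_cost(pmem, NVDIMM_PRICE_PER_GB))           # 15360.0
print(round(tier1_cost(pmem, OPTANE_PRICE_PER_GB), 2))  # 5068.8
```

At the same capacity, the Optane build comes out at roughly a third of the cost, which is before even considering that NVDIMM tops out at much smaller module sizes.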
The Market & Use Cases for MAX Data
Unsurprisingly, persistent memory currently offers the highest performance and lowest latency. It is also the most expensive storage tier and, as such, will be leveraged first by the most demanding applications on the market. Those are usually related to real-time analytics, AI, deep learning, financial markets and, in general, whatever needs the lowest microsecond-scale latency.
How profitable is this for NetApp and other contenders? An unquoted source on one of the presentation slides mentioned that by 2020, the In-Memory Technologies market would grow to $13B. The adoption and democratization of such technologies beyond 2020 will certainly grow this market further, although it is not clear what exactly is being counted in it. Is it the software used to enable In-Memory Technologies (such as MAX Data)? Is it the hardware (3DXPoint / NVDIMM and their eventual successors)? Is it the applications that require In-Memory Technologies? We’re a bit stranded here, and more research might be needed.
The recorded Tech Field Day Extra presentation at NetApp Insight 2018 gives a great overview of those use cases with striking examples. Skip to 51:57 in the video below to get to the use cases (and if you have time, watch it fully!)
Max’s Opinion
I had wanted to hear about Plexistor, but then they got gobbled up by NetApp, and at a time when I wasn’t really looking at NetApp anymore (see this post for context). So this startup that some industry friends were excited about just went through a black hole for me.
I’ve been impressed by MAX Data and what NetApp has been developing here. They were able to transform a cutting-edge concept into a product that integrates seamlessly (it seems) with their own “traditional” technology stack (Ontap, SnapMirror, etc.). It allows NetApp to talk with their customers, in a language they understand, about a revolutionary product.
What I also find interesting is that we’re finding one more use case for 3DXPoint memory. Perhaps Optane isn’t as useless as it seems? Perhaps the word isn’t “useless”, but “niche”. After all, it’s still cheaper than NVDIMM and provides greater capacity.
But I digress. What is interesting here are the possibilities opened up by this product. When MAX Data Recovery makes it to the market, it will be one more compelling reason for critical-workload customers (those who need not just all of the IOPS, but also extremely low latencies) to seriously consider MAX Data.
The major differentiator between MAX Data and other Tier-0 solutions is the deep integration with the NetApp ecosystem. NetApp understood what Plexistor would offer them if they acquired it. They understood that they had the potential to unleash a powerful product with enterprise-class integrations, and they were able to execute on this idea. Decidedly, NetApp is full of surprises this year, and it confirms the thoughts I had on the Data Fabric. Damn, this is not the boring 2016 NetApp anymore – those guys had been working in stealth mode all along to bring us the cool things of NetApp Insight 2018. I’m glad of it, and happy for them. Keep rocking on, NetApp!
Disclosure
This post is part of my Tech Field Day post series. I was invited to the event by Gestalt IT, who will cover travel, accommodation and food expenses for the duration of the event. I will not receive any compensation for participating in this event, and I am also not obliged to blog or produce any kind of content. Any tweets, blog articles or any other form of content I may produce are the exclusive product of my interest in technology and my desire to share information with my peers. I commit to sharing only my own point of view and analysis of the products and technologies I will see and hear about during this event.