Last week I was in Silicon Valley to attend Storage Field Day 17. If you don’t know what Storage Field Day is about, I encourage you to check out this recent post of mine. The goal of this article, however, is to cover computational storage, and to cover it briefly. Why? Spoiler at the end of the article!
What is computational storage?
Computational storage is an emerging concept which aims to perform data processing directly at the storage layer. In traditional von Neumann architectures, data sits dormant on the storage layer. It first needs to be read, then processed by the compute layer (CPU / RAM / GPU), and once processed, the result is finally written back to the storage layer.
Computational storage proposes to offload data processing to the storage layer itself. For this to happen, the storage layer must be able to process data: it needs a compute engine with one or more processors, RAM, and optionally one or more chips (ASIC or FPGA) with dedicated functions (encryption or encoding, for example) that further increase data processing speeds at the storage layer.
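To make the difference concrete, here is a minimal Python sketch of the two data paths. Everything in it is an assumption made for illustration: `SimulatedDrive`, `read_all` and `run_on_device` are hypothetical names, not a real device API, and the log records are made up. The only thing it measures is how many bytes have to travel from the storage layer to the host in each case.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimulatedDrive:
    """Toy stand-in for a storage device (hypothetical, not a real API)."""
    records: List[bytes]
    bytes_sent_to_host: int = 0

    def read_all(self) -> List[bytes]:
        # Traditional path: every record crosses the bus to the host.
        self.bytes_sent_to_host += sum(len(r) for r in self.records)
        return list(self.records)

    def run_on_device(self, predicate: Callable[[bytes], bool]) -> List[bytes]:
        # Computational storage path: the filter runs next to the data,
        # and only the matching records cross the bus.
        matches = [r for r in self.records if predicate(r)]
        self.bytes_sent_to_host += sum(len(r) for r in matches)
        return matches


def is_error(record: bytes) -> bool:
    return record.startswith(b"ERROR")


def make_records() -> List[bytes]:
    # 1,000 made-up log lines of 64 bytes each; one in every 100 is an error.
    records = []
    for i in range(1000):
        prefix = b"ERROR" if i % 100 == 0 else b"INFO "
        records.append((prefix + b" event %06d" % i).ljust(64, b"."))
    return records


# Host-side processing: read everything, then filter on the host CPU.
host_drive = SimulatedDrive(make_records())
host_matches = [r for r in host_drive.read_all() if is_error(r)]

# Device-side processing: push the filter down to the drive.
device_drive = SimulatedDrive(make_records())
device_matches = device_drive.run_on_device(is_error)

print("host-side filter moved  ", host_drive.bytes_sent_to_host, "bytes")
print("device-side filter moved", device_drive.bytes_sent_to_host, "bytes")
```

Both paths return the same records. What changes is the volume of data that has to leave the drive, and that gap grows with the size of the data set and the selectivity of the query.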
Why reinvent the wheel? Current compute architectures in the data center are overwhelmingly based on the x86 architecture. This architecture isn’t exactly optimal for processing huge data sets. Processors have a high power usage, and systems standardized on rackmount form factors require a lot of space, all of which drives up power and cooling requirements. If we take the topic of edge computing (or fog computing, as some put it), we are faced with mobility use cases, tight power budgets and low bandwidth, even though massive amounts of data are generated. While we can miniaturize components to a certain extent, we’re still stuck with bandwidth bottlenecks for data transfer. Some systems have constraints that force them to process heaps of raw data locally, and that’s where computational storage will help.
Now that we’ve covered what computational storage is, how is it delivered? It can come in the form of dedicated hardware, or as a software-defined storage solution. Below we will cover the case of NGD Systems, a company that presented at Storage Field Day 17.
NGD Systems – an example of computational storage implementation
NGD Systems is a startup founded in 2013 that has raised 25M USD in two funding rounds. It was founded by flash industry veterans who have been working in enterprise flash since approximately 2003.
Drawing on their experience in enterprise flash, NGD Systems set out to build a hardware-based computational storage architecture on top of flash. Their implementation looks very much like a standard enterprise flash device, yet it is anything but.
When breaking apart an NGD Systems flash device, we discover that besides flash modules, the device is equipped with an ARM processor and with RAM (2 GB of RAM for every 4 TB of flash storage). There is also an onboard ASIC which processes specific data sets. The innovation doesn’t stop at the hardware level though. For a computational storage device to work, it needs to function as a computer, because it needs to receive instructions to do any kind of local computing. To do so, it needs a small boot partition, an operating system and ideally a way to receive those instructions. In the case of NGD Systems, a Linux distribution is installed, and it comes with a development environment which even allows containerized Docker apps to run directly on the storage device. It even supports Resin.io and much more.
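To give a sense of scale for the RAM-to-flash ratio quoted above, here is a quick back-of-the-envelope calculation in Python. The only figure taken from the presentation is 2 GB of RAM per 4 TB of flash; the device capacities in the loop are hypothetical examples, not actual NGD Systems models, and I’m assuming the ratio scales linearly.

```python
# Ratio quoted above: 2 GB of RAM for every 4 TB of flash.
RAM_GB_PER_UNIT = 2
FLASH_TB_PER_UNIT = 4


def onboard_ram_gb(flash_tb: float) -> float:
    """Onboard RAM implied by the ratio, assuming it scales linearly."""
    return flash_tb / FLASH_TB_PER_UNIT * RAM_GB_PER_UNIT


# Hypothetical capacities, chosen only to illustrate the ratio.
for capacity_tb in (4, 8, 16):
    print(f"{capacity_tb} TB of flash -> {onboard_ram_gb(capacity_tb):.0f} GB of RAM")
```

In other words, the compute engine has very little memory relative to the data it sits next to, so workloads that stream through the data (filtering, searching, encoding) look like a more natural fit than ones that need to hold large working sets in RAM.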
There are also a lot of under-the-hood improvements in NGD Systems devices. For example, they reuse nothing of what a standard flash device has: they come with their own Flash Translation Layer and with specific ways of managing data for the best endurance. Again, what is essential here is not just offloading data processing from the main CPU to the computational storage devices, but doing so in a meaningful fashion that delivers real outcomes. In the case of NGD Systems, the use of ARM processors allows for much lower power consumption and a reduced footprint: for a given workload, it was possible to cut the amount of main x86 compute (servers needed) down to 25% of the footprint originally envisioned without computational storage.
In the video below, Scott Shadley (NGD Systems VP of Marketing) provides an overview of NGD Systems Solutions. There are more videos recorded at Storage Field Day 17 available here.
Where next? More resources coming soon!
Did you find this post too superficial? Did you miss my epic 1,500 to 2,500+ word posts? Fear not! I usually end my posts with a conclusion called Max’s Opinion. Exceptionally, I will skip this section. Why? Simply because I’ve been working very hard on a research document on computational storage, which will be published very soon (hopefully within the next week). This research paper will be the inaugural one of the “TECHunplugged Industry Insights” series.
It will cover computational storage in greater depth, starting with the shortcomings and challenges of data processing in traditional architectures, then moving on to what computational storage is and how it is delivered, what the relevant use cases are and what the current market landscape looks like. Finally, it will also cover where computational storage is headed next, who the potential newcomers to the market could be and whether or not NVMe-oF (NVMe over Fabrics) presents a threat to computational storage.
Stay tuned, and if you haven’t already, make sure to follow me on Twitter (@darkkavenger) as well as our TECHunplugged account (@tech_unplugged).
Disclosure
This post is part of my Storage Field Day 17 post series. I am invited to the event by Gestalt IT. Gestalt IT will cover expenses related to travel, accommodation and food for the duration of the event. I will not receive any compensation for participating in this event, and I am not obliged to blog or produce any kind of content. Any tweets, blog articles or any other form of content I may produce are the exclusive product of my interest in technology and my will to share information with my peers. I commit to sharing only my own point of view and analysis of the products and technologies I will see and hear about during this event.