I thought a few months ago that it would be worth having a closer look at what Hedvig is doing in the storage space. My upcoming Storage Field Day 10 participation (where Hedvig will present their solution) gives me the opportunity to finally dig down further. This post is part of the SFD10 post series.
Who is Hedvig?
When I first heard about Hedvig, I thought it was a charming Scandinavian princess from the fairy tales of our childhood. Much to my kids' chagrin, but to my geek delight, Hedvig is a software-defined storage startup headquartered in Santa Clara, CA. Hedvig was founded by Avinash Lakshman, who is currently CEO. We learn from his corporate bio that he co-authored Amazon Dynamo as well as Apache Cassandra (if you know Nutanix, you know that their solution is based, among other software components, on a « heavily modified Cassandra »).
What does Hedvig offer?
Hedvig has developed a software-based solution called the « Hedvig Storage Distributed Platform » (SDP). SDP runs on commodity x86 and ARM servers. Most of us are acquainted with x86 virtualization, but what about ARM servers? I’m by no means an expert in these, but it seems that the main use case for ARM servers is running microservices and container platforms such as Docker. ARM servers also generally have a smaller energy and thermal footprint, which may make them better suited to modern cloud-based workloads than the x86 architecture. It’s a bold and visionary move from Hedvig to embrace upfront what may well be adopted by the industry in a few years.
SDP is hypervisor-agnostic and supports VMware vSphere, Microsoft Hyper-V, Citrix XenServer and KVM. SDP also allows storage to be provisioned from Docker Datacenter through a plugin. That will be all on Docker, since this is a playground where I don’t play yet.
Hedvig can be deployed either as hyper-scale storage (in a model where the growth between compute and storage is decoupled), as hyper-converged storage (where compute and storage scale linearly), or a combination of both.
Hedvig supports the following protocols:
- Block-based: iSCSI
- File-based: NFS (v2, v3 and v4)
- Object-based: Swift and S3
- We also hear that SMBv3 is on the roadmap for Hyper-V support
Hedvig Architecture
To my understanding, a Hedvig deployment is made of three components:
- Hedvig Storage Service – The Hedvig OS that runs on each storage node
- Hedvig Proxy – A lightweight VM/service that runs on each hypervisor node
- Hedvig Virtual Disk – A volume or mount point (NFS, iSCSI, S3/Swift) that is made available to the hypervisor or to the container/platform/cloud environment
I have derived the assumptions above (and screenshot below) from page 11 of the Hedvig + Cisco Reference Architecture document. I strongly recommend this document because it will give you a real technical view of how Hedvig can be implemented.
Hedvig supports synchronous and asynchronous data replication. Data is replicated and automatically balanced across the cluster. The larger the number of nodes, the faster the rebuild in case of node failure. An interesting feature is called Replication Policies, where the user can select either Agnostic (no copy placement preference), Rack Aware (stores copies on different racks within the same data center) or Data Center Aware (stores copies in different data centers). Hedvig supports from 1 to 6 copies. I also understand (but may be wrong) that it’s possible to replicate data to AWS or Azure.
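To make the three replication policies more concrete, here is a minimal Python sketch of how copy placement could work. This is my own illustration, not Hedvig's implementation: the `Node` class, the `place_copies` function, the policy names as strings, and the choice of the first node's data center as the "home" data center for Rack Aware are all assumptions. Only the 1 to 6 copies limit and the meaning of each policy come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical storage node with its failure-domain labels."""
    name: str
    rack: str
    datacenter: str


def place_copies(nodes, copies, policy="agnostic"):
    """Pick `copies` nodes, spreading them over failure domains per policy."""
    if not 1 <= copies <= 6:
        raise ValueError("copies must be between 1 and 6")
    if not nodes:
        raise ValueError("no nodes available")

    if policy == "agnostic":
        # No placement preference: every node is its own "domain".
        candidates, domain = list(nodes), (lambda n: n.name)
    elif policy == "rack_aware":
        # Assumption: stay in the first node's data center, spread over racks.
        home = nodes[0].datacenter
        candidates = [n for n in nodes if n.datacenter == home]
        domain = lambda n: n.rack
    elif policy == "datacenter_aware":
        candidates, domain = list(nodes), (lambda n: n.datacenter)
    else:
        raise ValueError(f"unknown policy: {policy}")

    if copies > len(candidates):
        raise ValueError("not enough nodes for the requested copies")

    # Group candidates by failure domain, then pick round-robin across domains
    # so that copies land in as many distinct domains as possible.
    queues = defaultdict(list)
    for node in candidates:
        queues[domain(node)].append(node)

    chosen = []
    while len(chosen) < copies:
        for queue in queues.values():
            if queue and len(chosen) < copies:
                chosen.append(queue.pop(0))
    return chosen


cluster = [
    Node("n1", "r1", "dc1"),
    Node("n2", "r1", "dc1"),
    Node("n3", "r2", "dc1"),
    Node("n4", "r1", "dc2"),
]
rack_copies = place_copies(cluster, 2, "rack_aware")
dc_copies = place_copies(cluster, 2, "datacenter_aware")
```

With this toy cluster, the Rack Aware call picks one node per rack inside `dc1`, while the Data Center Aware call picks one node in each data center. A real system would also weigh capacity and load, which this sketch ignores.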
Snapshots and clones are taken instantaneously (metadata-based) without any performance impact and are unlimited per the product spec sheet. A feature I hadn’t heard about before is called I/O sequentialization, where random I/O is aggregated to streamline data writes to the cluster. The management interface also allows you to pin certain workloads to either HDD or Flash storage.
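Since I/O sequentialization was new to me, I tried to picture it with a small Python sketch. It shows one common way to turn random writes into sequential ones: buffer incoming writes, then flush them in one pass to an append-only log while keeping an index of where each block landed. The `WriteSequencer` class, its `flush_threshold` parameter and the log/index layout are hypothetical and are not taken from Hedvig documentation.

```python
class WriteSequencer:
    """Toy model: aggregate random block writes into sequential log appends."""

    def __init__(self, flush_threshold=4):
        self.flush_threshold = flush_threshold
        self.pending = {}        # logical block number -> latest data
        self.log = bytearray()   # append-only backing store
        self.index = {}          # logical block number -> (offset, length)
        self.flushes = 0

    def write(self, block, data):
        # Random writes only touch memory; a rewrite replaces the pending copy.
        self.pending[block] = data
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One sequential pass: append every pending block to the end of the log.
        for block in sorted(self.pending):
            data = self.pending[block]
            self.index[block] = (len(self.log), len(data))
            self.log += data
        self.pending.clear()
        self.flushes += 1

    def read(self, block):
        # Serve from the buffer first, then from the log via the index.
        if block in self.pending:
            return self.pending[block]
        offset, length = self.index[block]
        return bytes(self.log[offset:offset + length])


seq = WriteSequencer(flush_threshold=3)
seq.write(42, b"aa")
seq.write(7, b"bb")
seq.write(19, b"cc")   # third write reaches the threshold and triggers a flush
seq.write(7, b"dd")    # newer version of block 7 stays buffered for now
```

The three scattered writes end up as a single contiguous append, which is the kind of access pattern spinning disks and flash both handle well. Old versions left in the log would need garbage collection in a real system, which I left out.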
Hedvig natively supports inline compression as well as inline global deduplication. I suppose that global deduplication means that the deduplication library (index) is global and distributed across all the nodes in the Hedvig cluster. Also, the Hedvig Storage Proxy service provides server-side caching capabilities by leveraging either local SSD or PCIe flash.
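Here is how I picture a global deduplication index, as a minimal Python sketch under my own assumptions: data is cut into fixed-size chunks, each chunk is fingerprinted with SHA-256, and a single index shared by all volumes stores each unique chunk once. The `DedupStore` class and its chunk size are hypothetical, and a plain dictionary stands in for an index that would really be distributed across the cluster nodes. I don't know which chunking or hashing scheme Hedvig actually uses.

```python
import hashlib


class DedupStore:
    """Toy global dedup: one fingerprint index shared by every volume."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}    # fingerprint -> chunk bytes (the "global" index)
        self.volumes = {}   # volume name -> ordered list of fingerprints

    def write(self, volume, data):
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if no volume has written it before.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.volumes[volume] = recipe

    def read(self, volume):
        # Rebuild the volume content from its list of fingerprints.
        return b"".join(self.chunks[fp] for fp in self.volumes[volume])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore(chunk_size=4)
store.write("vm1", b"AAAABBBBAAAA")
store.write("vm2", b"BBBBCCCC")
```

Here 20 bytes are written across two volumes, but only three unique chunks are kept, because the `BBBB` chunk is shared between `vm1` and `vm2`. That cross-volume sharing is what I understand « global » to mean.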
Hedvig advertises their solution as being fit for classical data center use cases such as server virtualization, VDI and backup/DR, but also for modern requirements such as production clouds, test/dev clouds and Big Data. Undeniably, the massive scalability of the solution and the support for commodity x86 servers make it very appealing. It is worth mentioning that Hedvig is also API-driven and provides a full set of RESTful APIs.
Online research indicates that two licensing models exist: a perpetual license (priced per tiered capacity) and a duration-based subscription model, which is available on demand. Without clear figures it’s hard to determine which model makes more sense.
Why might you want to look at Hedvig?
One of the major advantages is that you can rely on commodity, off-the-shelf hardware instead of being dependent on storage arrays or hyper-converged appliances. If you’re a large x86 commodity server consumer who operates at a large scale, the savings can be huge.
Built-in cluster features such as auto-tiering, self-healing, auto-balancing and intelligent data distribution remove the hassle and burden of balancing data and moving workloads from one place to another, and allow the customer to exit the gloomy swamps of primitive manual work. Another operational plus is the way snapshots and clones are handled.
Finally, support for all major hypervisors, for orchestration tools such as OpenStack/Mesos and for container technologies (Docker/CoreOS) gives you some assurance that you can continue leveraging this storage platform even if you decide to reorient your strategy (from one hypervisor to another), or if you progressively shift some of your workloads and invest in containers and microservices.
Final thoughts
Is Hedvig the solution to all our storage problems? I don’t know, but in theory I see vast possibilities. The broad hypervisor support, cloud integration, support for Docker and OpenStack, and the support of both x86 and ARM show that careful thought has been put into developing this solution. I am interested in Hedvig’s application to large corporate environments with hundreds of nodes distributed across the globe. I want to understand how cross-data center replication happens, what the technical requirements for it are, and how such a large environment can be centrally managed. I’d also like to understand how Hedvig can help corporations operating in highly regulated environments (SOX, GxP). It would be great to see a real-world use case. The potential is fantastic, and I look forward to the SFD10 session with Hedvig.