Architecture Deep Dive
Hedvig doesn’t have a Hardware Compatibility List. They do, however, have a set of “recommended configurations” available on demand, plus a reference architecture based on Cisco hardware. This article is also interesting and provides sizing recommendations and information about building a Hedvig cluster on commodity x86 hardware.
« The quality of servers determine IOPS, the quantity of servers determines availability » Avinash Lakshman, Hedvig CEO – on Hedvig’s availability & performance
- HBLOCK Process (data): takes care of replication and of storing data locally
- Pages Process (metadata): knows which nodes are up or down and what state they are in (usage spikes, etc.), and also knows which blocks were successfully replicated to which nodes
- Metacache (metadata) – needed at all times.
- Block cache (data) – can be turned off. It lives either in DRAM or on SSD and is present at the proxy level. Customers with NVMe at the proxy level have seen 90,000 IOPS from cache
- DedupCache (global deduplication cache): caches the fact that a block with the same hash already exists, so that only the map page is updated
- Hashing is done at the proxy level. The hash is sent to the Pages Process, and if it matches a hash already present in the metadata, only the pointer to the hash is updated
- Hash IDs are unique to avoid potential hash collision/reuse
- Replication Factor (RF): 1 to 6, default 3. The replication quorum is (RF/2)+1. Writes are performed synchronously to a quorum of drives, and the remaining writes are then done asynchronously. RF is fixed and cannot be changed (unless you make a clone or snapshot).
- Replication Policies (Agnostic, Rack Aware, DC Aware)
- Deduplication, Compression, Client caching (caches data blocks at the proxy level – read cache only)
- Block granularity is set at the vDisk level.
- Container 1 = nodes 1, 2, 3
- Container 2 = nodes 4, 5, 6
- Container 3 = nodes 7, 8, 9
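The quorum rule for replication listed above, (RF/2)+1, can be worked through with a few lines of Python. This is only a sketch of the arithmetic, not Hedvig code: the function name `write_quorum` is made up for illustration, and it assumes that RF/2 is an integer division (rounded down), which is the usual meaning of a majority quorum but was not spelled out during the session.

```python
def write_quorum(replication_factor: int) -> int:
    """Return how many replicas must acknowledge a write synchronously.

    Assumption: RF/2 is rounded down (integer division), so the quorum
    is a strict majority of the replicas.
    """
    if not 1 <= replication_factor <= 6:
        # The session mentioned a supported range of 1 to 6.
        raise ValueError("replication factor must be between 1 and 6")
    return replication_factor // 2 + 1


if __name__ == "__main__":
    for rf in range(1, 7):
        quorum = write_quorum(rf)
        # Replicas beyond the quorum are written asynchronously.
        print(f"RF={rf}: {quorum} synchronous, {rf - quorum} asynchronous")
```

Under that assumption, the default RF of 3 means two replicas are written synchronously and the third catches up asynchronously, while an RF of 6 means four synchronous writes and two asynchronous ones.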
Hedvig Architecture Whiteboard with Bharat Naik from Stephen Foskett on Vimeo.
Hedvig stated that they currently have a customer base of 100 users. They seem to already have a large deployment in place, and although the product can work in hyper-scale or hyper-converged mode, hyper-converged seems to be the preferred one. Currently, 70% of their installed base runs on top of VMware. Hedvig is not an officially certified VMware solution, and one may ponder the advantages and drawbacks of this, as they claim they will support any deployment on top of VMware. Certification programs aren’t free, and it is understandable that Hedvig prefers to spend its money on product development. Licensing is available on a perpetual basis or as a yearly subscription. The licensing unit is per TB (or was that TiB?) of nominal capacity, if I remember correctly.
Positioning and future?
Considering that Hedvig came out of stealth about one year ago, it seems that they have a pretty solid SDS offering in place. There are, however, some additional challenges that will need to be addressed. The product is currently at version 1.0 and the graphical user interface could see some improvement; priority seems to have been given to APIs. In Hedvig’s defense, one can understand that they focused their efforts heavily on having a robust backend. There were concerns during the session about security around the Proxy component, and Hedvig remained, in my opinion, vague on these topics. While it’s true that it’s hard to make sense of the data, the potential for harm from an ill-intentioned individual shouldn’t be underestimated. I wonder if Hedvig have mitigations in place, such as using separate VLANs for the Proxy-Storage Services communication and restricting traffic through either firewall rules or ACLs.
While Primary Data’s goal is to rein in existing storage technologies and enterprise chaos by applying a divide & conquer approach (through overlay and policy), Hedvig have a totally different approach. By leveraging commodity x86 hardware (and commodity doesn’t necessarily mean cheap, but readily available) and providing a fully fledged software approach to the storage problem, while supporting any kind of deployment (hyper-scale or hyper-converged) on any available platform (Hyper-V, VMware, KVM, containers and even bare metal), Hedvig probably aims to become the de facto standard in Software-defined Storage. Nothing comes for free (you have to pay for a license), but certain customers can expect cost savings by avoiding custom-tailored appliances, for which vendors generally charge a premium.
The Hedvig presentation and deep dive matched my high expectations, and it was IMHO one of the best sessions of SFD10. Although not everyone will “give a damn”, if I may say so, about the innards and architecture of a storage solution, we got “bang for the buck” with a detailed technical session and a lot of delegate involvement as well.
Several veteran delegates shared that this session reminded them of the Nutanix and PernixData sessions in 2012. It was certainly an exciting session, and while there is room for improvement in certain areas, Hedvig has huge potential to become a successful company with a great product.
SFD10 Disclosure: this post is a part of my SFD10 post series. I am invited to the Storage Field Day 10 event by Gestalt IT. Gestalt IT will cover travel, accommodation and food for the duration of the SFD10 event. I will not receive any compensation for participation in this event, and I am also not obliged to blog or produce any kind of content. Any tweets, blog articles or any other form of content I may produce are the exclusive product of my interest in technology and my will to share information with my peers. I commit to sharing only my own point of view and analysis of the products and technologies I will be seeing and hearing about during this event.