This article is the first part of a two-part series covering Oracle Cloud Strategy. This article will cover the Oracle Ravello Cloud Service (ORCS). The second article will cover the broader Oracle Cloud offering and will present my analysis and conclusions about the general Oracle Cloud Strategy.
I was invited by TechReckoning (John Mark Troyer and his fantastic team) to participate in the very first Oracle Ravello Blogger Day, a full-day event that was held last week on 23-May-17 in Redwood Shores, CA, a pigeon’s flight away (in full compliance with RFC 1149) from the imposing Oracle corporate headquarters.
I gladly accepted the invite because I’ve used Ravello in the past (pre-Oracle acquisition) and loved it, although I seldom get to use it nowadays. The other, and major, reason is that I hoped that besides Ravello content we’d be served some serious content about Oracle Cloud solutions (oh boy, was that fulfilled), as I had absolutely no idea what they were doing.
Special thanks go to my friend Andrea Mauro, without whom this wouldn’t have been possible: he was not able to attend the original event for organizational reasons and thus put me in touch with John.
Before we delve into the core of this article, a quick sum-up of the event format. We were invited to the Oracle Conference Center in Redwood Shores, with a relatively large audience of well-known influencers (approx. 25), most of them also being VMware vExperts, a traditional audience group for Ravello Systems. It felt, to some extent, like a family & friends reunion, with a good share of the attendees also being Tech Field Day delegates. Props to TechReckoning for doing their best in having an audience that celebrates diversity!
From Ravello Systems to Oracle Ravello Cloud Service
While the event was specifically named Ravello Blogger Day, it should no longer be news to anyone that Ravello Systems was acquired last year by Oracle for the hefty sum of 500 million USD. When this happened, I remember a feeling of consternation and a wait-and-see attitude filling the ranks of the VMware community of Ravello Systems users. To many VMware-centric folks, Oracle is a kind of secretive company that they believe to be run by lawyers and auditors, and whose main offerings are demanding database systems overwhelmingly predominant in the enterprise IT world as well as RISC architecture-based SPARC servers. I’ll stop painting a grim picture here, and will say that because of Oracle’s partial views on licensing of its technologies in non-Oracle virtualized environments, it has suffered a pretty terrible reputation among virtualization practitioners. I therefore came to the event with an overwhelming curiosity, but without being fully able to erase a mindset imprinted by a decade of defiance.
Without digressing into Clay Magouyrk’s (VP of Oracle Bare Metal Cloud Services) masterful show in the first 5 minutes of presentations (I’ll cover that in my 2nd post), I must say that the defiance faded very quickly as I found myself hearing not from some individuals trying to explain hard partitioning vs soft partitioning and x86 vs SPARC CPU core ratios, but from IT professionals working in the same industry as ours.
To make things right, the official name of Ravello Systems’ offering is the Oracle Ravello Cloud Service. Lord Of The Rings fans shall rejoice as I will now use the ORCS acronym to make things shorter: Three cheers for Sauron! And to the grammar oriented people: I shall use singular, as ORCS refers to the Oracle Ravello Cloud Service (one single service), not to Orcs as a fictional species. Much to the dismay of the readers I’ll attempt to build sentences in a context that makes them hardly trollable.
Historically, Ravello Systems have positioned themselves as an abstraction layer/virtualization platform that aims to provide their customers with the ability to deploy non-production environments on top of public cloud providers’ infrastructure through the use of nested virtualization.
The major challenge faced with running workloads in the cloud is the differences in architectures between an on-premises environment and a cloud environment. From virtualization to storage, and through that crucial piece that is networking, everything is different. Ravello’s goal was to build an abstraction layer (their HVX hypervisor) that would make these differences disappear and would thus allow applications and virtual machines to run on top of cloud services as if they were configured the same way as on-premises.
Before we move along, it’s important for the reader to understand the abstraction layer model of HVX: currently HVX runs on top of cloud VMs and does not (yet) support bare metal deployments. At the very bottom of any Ravello deployment are one or more cloud VMs (think Amazon EC2 instances for example) on top of which HVX will execute. The VMs running on top of the ORCS platform are executed within the HVX environment that sits on top of one or more cloud VMs.
The HVX Hypervisor
The first element is HVX, which is a binary translation hypervisor running on top of the cloud VMs (and as we’ll see below, HVX is getting some nice improvements very soon). It supersedes the need to use VMware ESXi (unless obviously you’re using the Ravello service explicitly to build a nested VMware lab) and provides the same functionalities as one would expect to see in a VMware environment. HVX is built on QEMU and is massively customized with support for VMware devices (VMXNET3 / E1000 network adapters, PVSCSI adapter) and a few more components and logic bits that make it look, from a VM’s perspective, as if the VM were running natively on ESXi.
Networking: the SDN Approach
Another, even greater challenge remains to be solved: networking. Each cloud VM has its own IP address. The cloud provider also has its own network space to manage. Ravello took a clever approach to this by implementing a Software-Defined Network (SDN) solution.
The Ravello SDN overlay exposes a clean L2 network to VMs with support for broadcast/multicast. Customizations are available: VLANs, subnets and routers can be added, as well as DHCP and DNS services. Virtual network appliances can also be imported to support L3 communication in the virtualized environment. Needless to say, this was implemented in a very smooth visual way by Ravello, a way that we’re yet to see on major hypervisors, as per Figure 2 below.
Storage, Management and other bits & bytes
From a storage standpoint, block or object storage from the cloud provider environment is presented directly to the guest VM running on top of HVX as if it were native storage per on-premises specifications, using the same PVSCSI adapters, which makes sure that the guest VM runs unaltered.
Finally, from a management perspective users can manage their VMs directly from the ORCS interface (or use REST APIs if they’re the programming kind of person). They can structure those VMs into applications, arrange start/stop order and perform a myriad of other actions. Applications can be optimized to run based on cost or performance, in a variety of regions (21 currently supported, 9 upcoming). And indeed, customers can create their own libraries of VMs as well as create application templates, which in turn allow them to spin up many instances of the same application for whatever need this may fit.
While we are talking about regions, the concept of High-Availability also deserves a mention. It’s possible to implement host anti-affinity rules between guest VMs, making sure that they run on separate cloud VMs (cloud hosts). In a similar way, it’s possible to run CPU-hungry VMs on separate hosts for better performance. Guest VMs can also be restarted in case a VM failure is detected.
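To give an idea of what “arranging start/stop order” means in practice, here is a minimal sketch of the logic. It is not the ORCS REST API (which I am not reproducing here); the `GuestVM` class, its `start_order` field and the VM names are all hypothetical. The assumption is simply that VMs with the same order value start together, and that each wave waits for the previous one.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class GuestVM:
    name: str
    start_order: int  # hypothetical field: lower values start first


def start_waves(vms):
    """Group VM names into waves; each wave starts after the previous one."""
    ordered = sorted(vms, key=lambda vm: (vm.start_order, vm.name))
    return [
        [vm.name for vm in wave]
        for _, wave in groupby(ordered, key=lambda vm: vm.start_order)
    ]


def stop_waves(vms):
    """Assumption: shutdown simply walks the start waves in reverse."""
    return list(reversed(start_waves(vms)))


# Made-up three-tier application: database first, web front ends last.
app = [
    GuestVM("web-01", 3),
    GuestVM("db-01", 1),
    GuestVM("app-01", 2),
    GuestVM("web-02", 3),
]

for number, wave in enumerate(start_waves(app), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

With this made-up application the database comes up first and the two web front ends come up last, together.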
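The anti-affinity idea can be sketched in a few lines. This is my own illustration and not how Ravello’s scheduler works (that was not detailed at the event): a greedy placement that puts each guest VM on the least-loaded cloud host not already holding a member of the same anti-affinity group. The function name, host names and VM names are hypothetical, and I assume each VM belongs to at most one group.

```python
def place_with_anti_affinity(vms, hosts, anti_affinity_groups):
    """Greedy sketch: least-loaded host that holds no VM from the same group.

    Assumption: a VM belongs to at most one anti-affinity group.
    """
    group_of = {vm: frozenset(group) for group in anti_affinity_groups for vm in group}
    placement = {}
    load = {host: 0 for host in hosts}

    for vm in vms:
        peers = group_of.get(vm, frozenset())
        # Hosts already used by a placed member of the same group are off limits.
        blocked = {placement[peer] for peer in peers if peer in placement}
        candidates = [host for host in hosts if host not in blocked]
        if not candidates:
            raise ValueError(f"no host satisfies anti-affinity for {vm}")
        chosen = min(candidates, key=lambda host: load[host])
        placement[vm] = chosen
        load[chosen] += 1

    return placement


# Made-up example: two domain controllers that must never share a cloud host.
placement = place_with_anti_affinity(
    vms=["dc-01", "dc-02", "web-01"],
    hosts=["host-a", "host-b"],
    anti_affinity_groups=[{"dc-01", "dc-02"}],
)
print(placement)
```

If there are fewer hosts than members of a group, the sketch raises an error rather than silently breaking the rule.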
A great showcase for non-production workloads
The strengths of the ORCS platform, which were explained at length above, were undeniably validated by the amazing customers / use cases presented to us during the event. We had clear proof of how Ravello Systems is a powerful enabler for companies who need to set up complex test environments for their customers, spin up virtual labs to train their personnel or even create isolated environments for security/penetration testing or malware/antivirus infection spreading scenarios.
While this is certainly a series of powerful use cases, until now the ORCS platform has been mainly confined to the world of non-production workloads, held back by limits inherent to software-assisted virtualization and by constraints due to running on top of underlying cloud virtual machines. We heard that larger VMs are now available, with up to 32 vCPU and 200 GB RAM, but still, the question that every single attendee must have had is: “wait now, what about supporting production workloads, if ever? Size is a good thing, what about performance?”
Upcoming Improvements – Production Ready?
So, the support of production workloads on the ORCS platform is heavily dependent on one thing at least: until now, the HVX hypervisor offered only software-based nested virtualization, which is perfectly fine for functional testing scenarios but awfully slow for performance testing cases or even production workloads, because cloud hosts mask the hardware extensions that would make nested virtualization performance more bearable.
Those performance issues are a known problem for Ravello, and their engineers have been working very hard to address them. They appear to have succeeded and will soon offer two additional ways of running workloads, for a total of three performance tiers:
- Software based nested virtualization (runs on any cloud provider/platform)
- Hardware assisted nested virtualization (may be limited to specific platforms/VMs)
- HVX on bare metal
Software based nested virtualization represents how the ORCS offering has always existed. Hardware assisted nested virtualization requires cloud VMs that expose hardware virtualization extensions (such as Intel VT-x for example) to the child VMs. Finally, HVX on bare metal is an entirely new class of service, which guarantees the best performance as an entire layer of abstraction (the cloud provider VMs) is removed, allowing HVX to leverage full hardware performance.
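The relationship between the three tiers boils down to a simple decision based on what the underlying host offers. The sketch below is only my reading of the presentation, not Ravello code; the `Tier` enum and the two boolean parameters are names I made up for the illustration.

```python
from enum import Enum


class Tier(Enum):
    SOFTWARE_NESTED = "software based nested virtualization"
    HARDWARE_NESTED = "hardware assisted nested virtualization"
    BARE_METAL = "HVX on bare metal"


def best_tier(is_bare_metal: bool, exposes_hw_extensions: bool) -> Tier:
    """Pick the fastest tier a host can offer (hypothetical decision logic)."""
    if is_bare_metal:
        # No cloud provider VM in between: HVX talks to the hardware directly.
        return Tier.BARE_METAL
    if exposes_hw_extensions:
        # The cloud VM passes extensions such as Intel VT-x through to HVX.
        return Tier.HARDWARE_NESTED
    # Extensions are masked: fall back to the tier that runs anywhere.
    return Tier.SOFTWARE_NESTED


print(best_tier(is_bare_metal=False, exposes_hw_extensions=False).value)
```

The software tier is the fallback because it is the only one that runs on any cloud provider.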
Another feature which I liked very much (although I didn’t get whether it is already implemented vs in the making) is the ability to comply with data sovereignty measures: VMs can be locked in different VM repositories based on the region where they are to be used. Equally, guest VMs and applications can be locked to run only in specific cloud regions.
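The region locking rule is easy to picture as a pre-deployment check. Again, this is a hypothetical sketch and not the ORCS implementation (which, as said, I am not even sure is released yet): each VM carries a set of allowed regions, an empty set meaning “no restriction” by my own convention, and the region names are invented.

```python
def blocked_vms(app_vms: dict[str, set[str]], target_region: str) -> list[str]:
    """Return the VMs whose region lock forbids deploying to target_region.

    Assumption: an empty set of allowed regions means the VM is unrestricted.
    """
    return sorted(
        name
        for name, allowed in app_vms.items()
        if allowed and target_region not in allowed
    )


# Made-up application: the HR database must stay in one (invented) EU region.
application = {
    "hr-db": {"eu-region-1"},
    "web-frontend": set(),
}

print(blocked_vms(application, "us-region-1"))
print(blocked_vms(application, "eu-region-1"))
```

An application would only be allowed to start in a region for which this list comes back empty.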
Business Value vs Technical Aspects
While the Oracle Ravello Cloud Service is indeed an impressive technical solution, not least because of the complexity of the means in place to run VMware environments “as is” on totally different target platforms, it would be just that, a technical feat only, if there were no business driver or value behind it. So where is the business value of the ORCS platform?
I see several areas where the business value of this platform is currently best put to use, considering the fact that hardware assisted nested virtualization and bare metal virtualization are not yet available (announced as “coming soon” with no committed date):
- Lift and Shift cases (for environments/applications which are not bound by performance requirements)
- Creation of test/dev environments with ability to create many clones of the environments as needed by development and UAT testing requirements
- Spin-up of multiple identical -but isolated- environments for personnel training (labs / exam certifications)
- Replication of production environments in isolation for security testing use cases: penetration testing, propagation of malware etc.
Future updates should eventually offer an even broader set of use cases, thanks to the ability of the HVX hypervisor to run on bare metal cloud hosts and to offer significant performance jumps that will catapult HVX and Ravello into the field of production-capable cloud platforms. Lifting and shifting production-grade VMs, multi-VM applications, and parts of or even entire infrastructures will become possible. Reasons to do so could be based on economic decisions to shift workloads either temporarily (running from cloud in case of prolonged maintenance; covering the gap between out-of-support hardware and delivery/implementation of new hardware in some cases) or permanently (decommission of on-premises infrastructure and shift to cloud).
A side benefit would be the reduction of VMware vSphere licensing costs, as HVX running on bare metal would allow customers to run a fully VMware-compatible environment without having to leverage ESXi and vCenter.
Performance testing and, to a larger and much more impactful extent, the ability to run production workloads seem to be finally within Ravello Systems’ grasp. While this is not a fundamental game changer for larger cloud providers, it opens yet another breach into the world of on-premises IT infrastructures, and it is a service that AWS and the like do not offer, as they expect Enterprise IT players to redesign their applications to support cloud computing imperatives in terms of high availability.
This is especially true for Enterprise IT organizations running complex pre-cloud era applications, who are likely to have built their environments on top of VMware vSphere and who would face challenges in leveraging traditional cloud offerings despite their goodwill and efforts. Oracle & Ravello will soon be able to offer this conservative customer base a path to gradually lift and shift their on-premises infrastructure, whether on a per-application basis or through larger chunks.
The obvious benefit is a shift from a CAPEX-based model of running on-site infrastructure (with the associated refresh investments in new hardware, support costs and ancillary expenses such as data center estate, power, cooling, security, maintenance personnel etc.) to an OPEX-based model, which also bears the non-negligible advantage of potentially cutting down VMware licensing costs for vSphere and vCenter, as the VMs would run natively on top of HVX without losing their native VMware environment configuration.
Enterprises willing to adopt this solution (and consumption model) should nevertheless bear in mind and scrupulously assess the costs that come with running infrastructure on top of cloud services, whether these are compute, storage or network traffic metrics, the latter being notoriously underestimated.
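A back-of-the-envelope calculation is enough to see why the network line deserves attention. The sketch below is purely illustrative: the usage figures and the rates are numbers I made up, not Oracle, Ravello or any other provider’s pricing, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    vm_hours: float    # total hours across all running VMs
    storage_gb: float  # provisioned storage
    egress_gb: float   # traffic leaving the cloud


@dataclass
class Rates:
    per_vm_hour: float
    per_storage_gb: float
    per_egress_gb: float


def monthly_cost(usage: MonthlyUsage, rates: Rates) -> dict[str, float]:
    """Break the monthly bill down by metric (compute, storage, network)."""
    return {
        "compute": usage.vm_hours * rates.per_vm_hour,
        "storage": usage.storage_gb * rates.per_storage_gb,
        "network": usage.egress_gb * rates.per_egress_gb,
    }


# All figures below are invented for the example.
usage = MonthlyUsage(vm_hours=10 * 730, storage_gb=2_000, egress_gb=5_000)
rates = Rates(per_vm_hour=0.10, per_storage_gb=0.05, per_egress_gb=0.09)

bill = monthly_cost(usage, rates)
total = sum(bill.values())
for metric, amount in bill.items():
    print(f"{metric:8s} {amount:10.2f}  ({amount / total:.0%} of total)")
```

Plugging in real usage metrics and the provider’s actual price list, metric by metric, is the kind of assessment I mean.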
I was invited to the Ravello Blogger Day 1 event by TechReckoning. TechReckoning covered travel expenses to the event. Accommodation and food were also covered for the event duration (22 and 23-May-17).
I did not receive any compensation for participating in this event, for which I took unpaid time off to be able to attend (as is the case with any event I participate in). I am not obliged to blog or produce any kind of content. Any tweets, blog articles or any other form of content I have produced or may produce in the future related to this event are the exclusive product of my interest in technology and my will to share information with my peers.
In line with the concept of freedom of thought/critical thinking I commit to share only my own point of view and analysis about any products, technologies, strategies & concepts I was introduced to.