Service Overview
Nutanix Private Cloud Disaster Recovery (DRaaS) is a Nutanix-powered platform that allows you to recover Nutanix-hosted workloads on dedicated, enterprise-class hardware in the event of a disaster, without the burden of maintaining the infrastructure. Expedient maintains the hardware platform, software updates, licensing, and ongoing maintenance as a managed service. Clients have full admin access to their virtual machines, along with the ability to perform many of the same operations as in their on-premises Nutanix environment.
Service Features
Dedicated hardware platform per client
Supports replication to and from Expedient AHV for Nutanix AHV or client-owned ESXi environments
Pay per host for resources + per VM for fully managed disaster recovery
Encryption at rest and in flight
Protection groups for grouping applications
24x7x365 support for declaring disasters and initiating failovers
Prism Central access to manage multiple clusters and locations
REST API and CLI for management access and automation
Native replication and disaster recovery using Nutanix Disaster Recovery
To and from Nutanix clusters in Expedient Nutanix Private Cloud
Between on-premises Nutanix clusters and Expedient Nutanix Private Cloud
Two-factor authentication
Multiple RPO options available
15-minute RPO (NearSync)
1-hour RPO (Async)
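As an illustration of the REST API feature listed above, the following is a minimal sketch of how an automation script might prepare a request to list protection policies or recovery plans in Prism Central. The hostname is hypothetical, the path follows the Prism Central v3 list-style endpoints and should be confirmed against the API reference for the deployed version, and authentication is deliberately left out because this document does not specify how API credentials are issued. The request is only built here, not sent.

```python
import json
from urllib.request import Request

# Hypothetical hostname; substitute the Prism Central address provided for your environment.
PRISM_CENTRAL = "prism-central.example.com"


def build_list_request(kind: str, length: int = 20) -> Request:
    """Build (but do not send) a v3-style list request for one entity kind.

    Assumption: list endpoints use the plural of the kind,
    e.g. kind "protection_rule" -> /protection_rules/list.
    """
    url = f"https://{PRISM_CENTRAL}:9440/api/nutanix/v3/{kind}s/list"
    body = json.dumps({"kind": kind, "length": length}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Prepare a request for protection policies; credentials would be added before sending.
req = build_list_request("protection_rule")
print(req.get_method(), req.full_url)
```

The same helper could be called with `"recovery_plan"` to prepare a request for recovery plans.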
Licensing
Expedient’s Nutanix Private Cloud comes with NCI Ultimate and NCM Starter.
Disaster recovery features are under the “NCI” or Nutanix Cloud Infrastructure feature set.
See Nutanix Cloud Platform Software Options for more details.
Disaster Recovery Implications
For clients with their own Nutanix cluster replicating to an Expedient Nutanix Private Cloud:
If your cluster has NCI Ultimate, you will have full access to the disaster recovery suite.
If your cluster has NCI Starter, you will not have access to the full disaster recovery suite and will be limited to Protection Domains on Prism Element.
If your cluster has NCI Pro, you can use asynchronous replication (RPO of 1 hour or longer) on Prism Central, or you can purchase the Advanced Replication add-on license.
Matching Features and Licensing
While Expedient clusters have NCI Ultimate, replication must be configured to match the licensing level of your Nutanix cluster.
The Advanced Replication add-on enables Advanced Orchestration (Multiple Boot Stages, Script Execution, Re-IP, Test Failover) and NearSync Replication (RPO of 1 to 15 minutes).
Starter and Pro licensing can do Async Replication (1-hour RPO) but cannot perform test failovers.
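The licensing rules above can be sketched as a small lookup, which may help when planning which replication settings a client-owned source cluster can use. This is an illustrative model of the statements in this section only, not a Nutanix licensing tool. The names `SourceLicense`, `dr_capabilities`, and `replication_type` are hypothetical, and the add-on is modeled for NCI Pro only because that is the only pairing this document describes.

```python
from dataclasses import dataclass

ASYNC_MIN_RPO_MINUTES = 60    # Async: 1-hour RPO
NEARSYNC_RPO_MINUTES = (1, 15)  # NearSync: RPO of 1 to 15 minutes


@dataclass(frozen=True)
class SourceLicense:
    tier: str                           # "starter", "pro", or "ultimate"
    advanced_replication: bool = False  # add-on; assumed to apply to Pro only


def dr_capabilities(lic: SourceLicense) -> dict:
    """Return the DR features this section attributes to a license level."""
    tier = lic.tier.lower()
    if tier not in {"starter", "pro", "ultimate"}:
        raise ValueError(f"unknown NCI tier: {lic.tier}")
    # Ultimate has the full suite; Pro gains it through the add-on.
    full = tier == "ultimate" or (tier == "pro" and lic.advanced_replication)
    return {
        "async_replication": True,
        "nearsync_replication": full,
        "advanced_orchestration": full,
        "test_failover": full,
        "management": "Prism Element (Protection Domains)"
        if tier == "starter" else "Prism Central",
    }


def replication_type(lic: SourceLicense, rpo_minutes: int) -> str:
    """Pick the replication type listed in this document for a target RPO."""
    if rpo_minutes >= ASYNC_MIN_RPO_MINUTES:
        return "async"
    low, high = NEARSYNC_RPO_MINUTES
    if low <= rpo_minutes <= high:
        if dr_capabilities(lic)["nearsync_replication"]:
            return "nearsync"
        raise ValueError("NearSync requires NCI Ultimate or the Advanced Replication add-on")
    raise ValueError("no RPO option listed in this document for that value")


print(replication_type(SourceLicense("pro", advanced_replication=True), 15))
```

Because replication must match the licensing level of the client cluster, the check is made against the source license rather than the NCI Ultimate license on the Expedient side.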
Client Experience Expectations
Clients will connect their on-premises clusters to Expedient via VPN, which will provide the path for replication to the Nutanix cluster on the Expedient side. Clients will log in to Prism Central through OneLogin to access the platform, view replication status, and run recovery runbooks to test and execute a workload failover.
Expedient will monitor replication jobs, platform availability, hardware status, and resource utilization through Prism Central metrics, and will alert the client via SMC ticket.
Expedient will support failover from on-premises clusters to Expedient clusters and can perform the failover for a client on their behalf. Expedient will troubleshoot failovers and ensure the recovery completes successfully.
Support Summary
Expedient-owned Nutanix Private Cloud as a Source
Expedient will configure discovery on private cloud instances, including:
Site Pairing to and from Expedient-owned private cloud instances
Configuration of Recovery Plans to orchestrate restoring protected VMs at a secondary location
Configuration of Protection Policies to automate the creation and replication of snapshots across all the clusters managed by Prism Central
Default – Async with a 3-day retention with Rollup selected. This will create 24 hourly snapshots and 3 daily snapshots.
Assign the VMs that should be protected to default Nutanix categories.
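The snapshot counts for the default policy can be reproduced with a short calculation. This is a minimal sketch that assumes Rollup keeps one day of snapshots at the RPO interval plus one daily snapshot per retention day. That matches the 24 hourly and 3 daily snapshots stated above, but applying it to other retention settings is an assumption, and the function name is hypothetical.

```python
def rollup_snapshot_counts(retention_days: int, rpo_hours: int = 1) -> dict:
    """Estimate retained snapshots for an Async policy with Rollup selected.

    Assumption: one day of snapshots at the RPO interval, plus one daily
    snapshot for each day of retention.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    if rpo_hours < 1 or 24 % rpo_hours != 0:
        raise ValueError("rpo_hours must divide evenly into 24")
    hourly = 24 // rpo_hours
    daily = retention_days
    return {"hourly": hourly, "daily": daily, "total": hourly + daily}


# Default policy: 1-hour Async RPO with 3-day retention.
print(rollup_snapshot_counts(3))  # {'hourly': 24, 'daily': 3, 'total': 27}
```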
Default Deployment Settings
Available in all Expedient locations with two options:
1) Private Cloud with Unmanaged Disaster Recovery
Client is responsible for disaster recovery setup (replication, protection groups, etc.)
Expedient will manage and maintain hardware, hypervisor, and Prism Central
2) Fully Managed Disaster Recovery as a Service
In addition to managing and maintaining hardware, hypervisor, and Prism Central, Expedient will assist with disaster recovery setup, including configuring replication, protection policies, and recovery plans.
Protection Policies
Multiple policies can be configured. If you have tiered applications, for example, you could create a separate category (along with the appropriate VMs) and add it to a different stage.
Expedient will monitor replication jobs and alert clients to issues via SMC ticket
Networking will be configured for failover between Expedient locations
Clients will be given access to Prism Central and will authenticate to it
Clients can also contact the OSC to initiate a test failover or declare an actual disaster
Clients are provided a Disaster Recovery runbook outlining the steps to execute a failover operation to be incorporated with a larger DR or Business Continuity plan
Clients that declare a disaster or commit to a failover operation will be asked to remain in the secondary site until reverse replication completes successfully, typically 24 to 48 hours depending on the size and change rate of the environment.
Data protection and monitoring services will be reestablished after a failover and commit to the secondary site that lasts longer than the 48-hour timeframe.
Client user directory (e.g., Active Directory) will be configured for authentication
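To illustrate why the reverse replication window mentioned above depends on the size and change rate of the environment, the following rough estimate divides the data to be replicated by the usable link throughput. It is a back-of-the-envelope sketch, not how Expedient or Nutanix calculates the window. The function name, the example figures, and the 0.7 efficiency factor are all illustrative assumptions.

```python
def reverse_replication_hours(
    data_gib: float,
    link_mbps: float,
    efficiency: float = 0.7,  # assumed share of the link usable for replication
) -> float:
    """Estimate hours needed to replicate data_gib back over a link_mbps link."""
    if data_gib < 0:
        raise ValueError("data_gib must not be negative")
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    bits = data_gib * 1024**3 * 8                  # GiB -> bits
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 3600    # seconds -> hours


# Hypothetical example: 2,000 GiB to send back over a 500 Mbps link.
hours = reverse_replication_hours(2000, 500)
print(f"Estimated reverse replication time: {hours:.1f} hours")
```

A larger environment or a higher change rate increases the amount of data to send, which lengthens the estimate proportionally.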
For Client-owned Nutanix Private Cloud as a Source
Ensure availability and health of Expedient-owned target site.
Monitor target-side Availability Zone connectivity
Assist with the DR configuration at the client site.
Default guidance and best practices will be provided.
DR Testing
Nutanix DR failover options are "Planned Failover", "Unplanned Failover", and "Test Failover".
Nutanix offers non-disruptive testing referred to as Validating a Recovery Plan, and an Unplanned Failover / Disruptive Test option.
Expedient will conduct DR testing on Expedient-owned Nutanix Private Cloud upon request.
Expedient will assist in DR testing for client-owned Nutanix source clusters upon request.
Clients may contact Expedient to schedule a test. At least one non-disruptive test is recommended near the end of the delivery phase.
Recovery Workflows
There are multiple options clients can use to perform a disaster recovery of their workloads.
Clients can perform their own recoveries by following this procedure:
Clients can contact Expedient's Operations Support Center and Expedient can perform the failover on the client's behalf.
Responsibility and Accountability Matrix
AHV DRaaS Responsibility Matrix
Task | Expedient Unmanaged DR | Expedient Managed DRaaS | Client | Co-Managed | Co-Managed with Self-Service Option |
Platform Infrastructure and Supporting Hardware Monitoring | X | X | |||
Platform Infrastructure and Supporting Hardware Break/Fix | X | X | |||
Firmware Updates | X | X | |||
Platform Licensing | X | X | |||
Platform Updates/Patches | X | X | |||
Virtual Machine Management | X | Expedient will not have OS access without OS Management service | |||
Operating System Licensing | X | ||||
Operating System Management, Patching, and Virus protection | X | Expedient will not have OS access without OS Management service | |||
Replication Software | X | X | |||
Replication Software Installation / Configuration | N/A | X | Expedient will assist with client site installs. Expedient will fully install on Expedient platforms | ||
Replication Monitoring (Success/Failure) | N/A | X | Expedient will monitor replications and clients will have access to monitoring | ||
Replication Failure Remediation | N/A | X | Expedient will assist with failure remediation in client site deployments. Expedient will remediate on Expedient platforms. | ||
Workload Protection Configuration | N/A | X | Expedient will assist with configuration of workload protection | ||
Failover of Virtual Machines to Secondary Site | N/A | X | Expedient will assist with failovers | ||
Creation of Disaster Recovery Runbook | N/A | X | |||
Declaration of Disaster Recovery Enactment | X | ||||
Failback of Virtual Machines to Primary Site | X | X | Expedient will assist with failback | ||
Creation of Bubble Networks | N/A | X | |||
Hygiene Check of Operating Systems | X | ||||
Validation of Application Functionality | X |