vSphere Replication Architecture

A deep dive into the vSphere Replication architecture for readers interested in Business Continuity and Disaster Recovery.

The diagram below is from the VMware vSphere Replication administration guide.

[Figure: vr-arch, the vSphere Replication architecture diagram]

vSphere Replication Appliance (VR Appliance)

An appliance with 2 integrated components:

  • vSphere Replication Management Server (VRMS)
    • Manages the entire replication infrastructure
    • Maintains mapping between source and target Virtual Machines
    • Handles authentication
    • Provides management interface to Site Recovery Manager
    • Configured with either the internal DB (vPostgres) or an external DB (MS SQL Server or Oracle). All configuration and monitoring information is stored in this DB
    • Only one VRMS can exist per vCenter
  • vSphere Replication Server (VRS)
    • The component that receives replicated data at the receiving end, whether the direction is Source->Target or the reverse
    • Up to 9 additional vSphere Replication Servers can be deployed per vCenter for load balancing
    • Some of the configuration information from the VRMS DB is pushed to the embedded MySQL database on the VRS
    • vSphere Replication Server is available as an add-on component in the form of an appliance

Note: While Virtual Machines are replicated from Source->Target, the vSphere Replication Server at the Source site plays no role. It is the vSphere Replication Server at the Target site that actually receives the data. If the replication direction is Target->Source, the vSphere Replication Server at the Source site receives the replicated data from Target.
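
The relationship described in the note can be sketched in a few lines of Python. This is only an illustrative model, not VMware code: the `Site` and `Replication` classes and their field names are made up for this example, and the only facts taken from the text are the limit of 9 additional VRS per vCenter and the rule that the receiving side's VRS is the one in the data path.

```python
from dataclasses import dataclass, field

MAX_ADDITIONAL_VRS = 9  # limit stated in the list above


@dataclass
class Site:
    """One vCenter site: a single VR Appliance plus optional add-on VRS appliances."""
    name: str
    additional_vrs: list = field(default_factory=list)

    def add_vrs(self, vrs_name: str) -> None:
        if len(self.additional_vrs) >= MAX_ADDITIONAL_VRS:
            raise ValueError(f"{self.name}: at most {MAX_ADDITIONAL_VRS} additional VRS")
        self.additional_vrs.append(vrs_name)


@dataclass
class Replication:
    """A VRMS-style mapping entry for one replicated VM (hypothetical fields)."""
    vm: str
    sender: Site
    receiver: Site

    def active_vrs_site(self) -> str:
        # Only the VRS on the receiving side takes part in the data path.
        return self.receiver.name

    def reversed(self) -> "Replication":
        # Same VM, opposite replication direction.
        return Replication(self.vm, sender=self.receiver, receiver=self.sender)


source, target = Site("Source"), Site("Target")
forward = Replication("app-vm-01", sender=source, receiver=target)
print(forward.active_vrs_site())             # Target
print(forward.reversed().active_vrs_site())  # Source
```

Reversing the direction changes only which site's VRS is active, which is all the note above is saying.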

VR Agent (vSphere Replication Agent)

It has 2 sub-components:

  • vSphere Replication Service
    • Integrated with ESXi hostd agent
    • Manages the replication process for virtual machines
    • Takes care of replicating virtual machine metadata and storing the virtual machine replication configuration
    • Importantly, it coordinates the replication schedule to meet the configured RPOs
  • vSCSI Filter
    • vSCSI Filter driver is part of VMKernel
    • Tracks the regions of disks that have changed, recording that information in memory
    • The driver that actually sends the data from the Source site to the vSphere Replication Server on the Target site
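
To make the idea of tracking changed disk regions in memory more concrete, here is a minimal Python sketch. It is not how the vSCSI Filter is implemented: the `ChangedBlockTracker` class, the 4096-byte block size, and the use of a plain set are all assumptions chosen for illustration.

```python
class ChangedBlockTracker:
    """Toy in-memory tracker of dirty disk regions (hypothetical, not VMware code)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.dirty = set()  # indices of blocks written since the last sync

    def record_write(self, offset, length):
        """Mark every block touched by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.update(range(first, last + 1))

    def drain(self):
        """Return the dirty blocks in order and reset tracking for the next cycle."""
        blocks = sorted(self.dirty)
        self.dirty.clear()
        return blocks


tracker = ChangedBlockTracker(block_size=4096)
tracker.record_write(offset=0, length=100)       # touches block 0
tracker.record_write(offset=4000, length=200)    # spans blocks 0 and 1
tracker.record_write(offset=40960, length=4096)  # exactly block 10
changed = tracker.drain()
print(changed)  # [0, 1, 10]
```

Only the block indices are remembered, not the data itself, so repeated writes to the same region cost nothing extra until the next replication cycle reads the current contents of those blocks.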

vSphere Replication under the hood

Initial Replication (Initial Full-Sync)

  1. The administrator configures replication for a VM; the configuration is provisioned by the VRMS, whose plug-in is integrated into the vSphere Web Client
  2. An Initial Full-Sync of the VM starts, which creates an empty disk (vmdk) at the Target site
  3. MD5 (a Message Digest hash algorithm) checksums are used to compare the Source and Target disks
  4. As the Target disk is empty, the vSCSI Filter will start preparing data for replication to Target
    • If network compression is enabled for the VM while configuring replication, FastLZ (a fast Open Source compression algorithm) takes care of compressing the data at Source before it is replicated to Target (see the FastLZ benchmark details: http://fastlz.org/index.html)
    • Compression significantly reduces network bandwidth utilization, at the cost of a few CPU cycles at Source and Target for compression and decompression
    • If application quiescing is enabled while configuring replication, the VSS Requestor and VSS Provider that ship with VMware Tools work with the Microsoft VSS component to take an application-consistent snapshot for VSS-aware applications (note that VSS application quiescing is available only on Windows)
    • For application quiescing in Linux, custom scripts specific to the application have to be used.
  5. Once the replication traffic hits the vSphere Replication Server at receiving end, a CRC (Cyclic Redundancy Check algorithm) is used for error detection.
  6. Together, the MD5 and CRC checks described above help ensure the integrity and consistency of the replicated data
  7. The replicated data is copied from the vSphere Replication Server to the target datastore through the NFC (Network File Copy) service available with ESXi
  8. The replicated data is first written to a snapshot file (redo log file) and then committed to the base disk once all replicated data is available in the redo log file
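
The steps above can be sketched end to end in Python. This is a simplified model under several assumptions, not the actual vSphere Replication data path: `zlib` stands in for FastLZ (which is not in the Python standard library), the MD5 comparison is done per 4096-byte block, the empty target disk is modelled as a zero-filled buffer, and all function names are hypothetical.

```python
import hashlib
import zlib

BLOCK = 4096  # hypothetical block size


def md5_of(block: bytes) -> str:
    return hashlib.md5(block).hexdigest()


def blocks_to_send(source: bytes, target: bytes):
    """Yield (index, data) for blocks whose MD5 differs between source and target."""
    for i in range(0, len(source), BLOCK):
        src = source[i:i + BLOCK]
        tgt = target[i:i + BLOCK]
        if md5_of(src) != md5_of(tgt):
            yield i // BLOCK, src


def pack(data: bytes, compress: bool):
    """Source side: optionally compress, then attach a CRC of the payload."""
    payload = zlib.compress(data) if compress else data  # zlib as a FastLZ stand-in
    return payload, zlib.crc32(payload), compress


def unpack(payload: bytes, crc: int, compressed: bool) -> bytes:
    """Receiving side: verify the CRC before decompressing."""
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC mismatch, block corrupted in transit")
    return zlib.decompress(payload) if compressed else payload


def full_sync(source: bytes, compress: bool = True) -> bytes:
    target = bytearray(len(source))  # step 2: empty (zero-filled) target disk
    redo_log = {}                    # step 8: writes are staged here first
    for index, data in blocks_to_send(source, bytes(target)):
        redo_log[index] = unpack(*pack(data, compress))
    # Commit to the base disk only after every block is in the redo log.
    for index, data in redo_log.items():
        start = index * BLOCK
        target[start:start + len(data)] = data
    return bytes(target)


source_disk = b"A" * BLOCK + bytes(BLOCK) + b"B" * 100
replica = full_sync(source_disk)
print(replica == source_disk)  # True
```

In this toy run the all-zero middle block is never sent, because its checksum already matches the empty target. Staging writes in `redo_log` before touching `target` mirrors the point of step 8: a sync that fails midway leaves the base disk untouched.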

Subsequent Replication (Delta Sync)

  1. The vSCSI Filter tracks all the blocks that have changed since the last replication
  2. The vSphere Replication Service gives the go-ahead according to the RPO setting, and the vSCSI Filter sends the changed data to the vSphere Replication Server at the Target site
  3. Steps 4a to 8 of the initial replication remain the same
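
The interplay between change tracking and the RPO-driven go-ahead can be illustrated with a small simulation. The policy below (ship whatever is dirty once the RPO interval has elapsed) is a deliberately naive assumption; the real scheduler is not described in this post, and it would presumably have to start earlier to leave time for the transfer itself. The `simulate` function and its parameters are hypothetical.

```python
def simulate(writes, rpo_minutes, total_minutes):
    """Simulate delta syncs.

    writes: dict mapping a minute to the set of block indices written in that minute.
    Returns a list of (minute, sorted changed blocks), one entry per delta sync.
    """
    dirty = set()
    last_sync = 0
    shipped = []
    for minute in range(1, total_minutes + 1):
        dirty |= writes.get(minute, set())     # change tracking
        if minute - last_sync >= rpo_minutes:  # the go-ahead per the RPO setting
            if dirty:
                shipped.append((minute, sorted(dirty)))
                dirty.clear()
            last_sync = minute
    return shipped


writes = {1: {3}, 2: {3, 7}, 16: {1}, 40: {9}}
deltas = simulate(writes, rpo_minutes=15, total_minutes=45)
print(deltas)  # [(15, [3, 7]), (30, [1]), (45, [9])]
```

Block 3 is written twice before the first sync but shipped only once, which is why a delta sync moves far less data than the initial full sync.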
