Monday, September 21 • 11:35am - 12:25pm
Integrity of In-memory Data Mirroring in Distributed Systems

Data in memory can be in a modified state relative to its on-disk copy. Also, unlike the on-disk copy, in-memory data might not be checksummed, replicated, or backed up every time it is modified. The data must therefore be checksummed before mirroring to guard against network corruption. But checksumming the data in the application has its own overheads. First, the application must handle networking functions such as retransmission and congestion handling. Second, if it delays validation of the mirrored data, it may be difficult to recover the correct state of the system.
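To make the conventional approach concrete, here is a minimal sketch of application-level checksumming, assuming a simple length-plus-CRC32 frame. The frame layout, the choice of CRC32, and the names `frame` and `unframe` are illustrative assumptions, not details from the talk.

```python
import struct
import zlib

# Hypothetical frame layout: 4-byte payload length, 4-byte CRC32, payload.
HEADER = struct.Struct("!II")


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32 before mirroring."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unframe(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if it fails validation."""
    if len(message) < HEADER.size:
        raise ValueError("mirrored block is shorter than its header")
    length, expected_crc = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != expected_crc:
        # The application now has to request a resend itself, which is
        # the retransmission burden the abstract describes.
        raise ValueError("mirrored block failed checksum validation")
    return payload
```

When `unframe` raises, recovery is the application's problem: it must notice the failure promptly, ask the peer to resend, and keep its own state consistent in the meantime.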

Providing mirrored-data integrity as a transport-protocol function leads to a modular design and better performance. We propose a novel approach that uses TCP with MD5 signatures to handle the network-integrity overhead, so the application can focus on its primary task. We discuss an evaluation and a use case of this approach (NVM mirroring in Data Domain HA) to demonstrate its advantages over the conventional approach of checksumming in the application.
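For readers unfamiliar with the TCP MD5 signature option (RFC 2385), the sketch below shows one way it could be enabled on a Linux socket from user space. It is not the implementation presented in the talk. It assumes IPv4, a kernel built with TCP MD5 support, and the `struct tcp_md5sig` layout from `linux/tcp.h`; the option number 14 is the Linux value of `TCP_MD5SIG`, used as a fallback in case the Python `socket` module does not export it. The function names, peer address, and key are hypothetical.

```python
import socket
import struct

# Linux value of TCP_MD5SIG; fall back to it if the constant is not exported.
TCP_MD5SIG = getattr(socket, "TCP_MD5SIG", 14)
TCP_MD5SIG_MAXKEYLEN = 80
SOCKADDR_STORAGE_LEN = 128


def build_md5sig(peer_ip: str, key: bytes) -> bytes:
    """Pack a struct tcp_md5sig for an IPv4 peer (assumed layout)."""
    if not 0 < len(key) <= TCP_MD5SIG_MAXKEYLEN:
        raise ValueError("key must be 1 to 80 bytes long")
    # sockaddr_in: family (native order), port (unused here), IPv4 address.
    addr = struct.pack("=HH4s", int(socket.AF_INET), 0,
                       socket.inet_aton(peer_ip))
    addr = addr.ljust(SOCKADDR_STORAGE_LEN, b"\0")
    # flags, prefix length, key length, interface index, zero-padded key.
    tail = struct.pack("=BBHI80s", 0, 0, len(key), 0, key)
    return addr + tail


def enable_tcp_md5(sock: socket.socket, peer_ip: str, key: bytes) -> None:
    """Ask the kernel to sign and verify segments exchanged with peer_ip."""
    sock.setsockopt(socket.IPPROTO_TCP, TCP_MD5SIG,
                    build_md5sig(peer_ip, key))
```

Both endpoints would call `enable_tcp_md5` with the same key before connecting or listening. A segment whose signature does not verify is dropped by the receiving TCP stack, so the sender's ordinary TCP retransmission covers recovery and the application needs no resend logic of its own.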

Learning Objectives

Designing efficient data mirroring in backup and recovery systems, where reliability is paramount
Linux kernel TCP know-how for using the MD5 option
Analysis of the conventional approach vs. TCP MD5
Use case: TCP MD5 option for NVM mirroring in Data Domain HA

Tejas Wanjari

Senior Software Engineer, EMC Data Domain
Tejas Wanjari is a Senior Software Engineer at EMC Data Domain, where he is involved in the design and architecture of distributed systems infrastructure and Data Domain OS. Previously he was a member of the Parallel Data Lab (PDL) at Carnegie Mellon University while pursuing his Master...

Monday September 21, 2015 11:35am - 12:25pm PDT
Cypress Room