Philippe is responsible for defining the technology vision and future directions for Bull’s storage business unit. As an evangelist, he promotes emerging storage technologies and is also Bull’s primary representative at SNIA¹.
IT managers face continuous data growth, a growing number of mission-critical applications, fragmentation of information assets, and emerging regulations and standards. Backup to disk, snapshots, data deduplication, continuous data protection, and data replication are recent storage industry innovations that have reached maturity and help to address these data protection challenges.
Bull StoreWay constantly evaluates and integrates technology innovations, delivering state-of-the-art, mature and cost-effective infrastructures.
1- Disk backup, for flexibility, performance and simplification
Despite a higher cost per GB, higher power consumption and greater thermal dissipation than tape, disk backup, also known as D2D² or D2D2T³, is now part of all data protection implementations. The maturity of SATA HDD technology, available in most enterprise RAID storage systems, and continuous capacity improvements (1 TB being the highest capacity currently available) make disk a reliable and affordable backup storage medium.
Among other benefits, D2D backup increases the success rate of backup operations by avoiding tape-related failures, and it better protects the stored data by using RAID capabilities. Disks are flexible and adaptive in terms of performance, even when day-to-day operations change or the information assets to protect evolve. They deliver more deterministic backup windows. Concurrent backup streams, simultaneous backup and restore operations, simultaneous backup and media duplication, and direct access to any block of data are also significant advantages. Device consolidation is easier, for example by sharing a disk device between different backup applications, or by reusing an existing storage system to store backup data sets on low-cost SATA HDDs. Tape library virtualization opens additional simplification opportunities.
Two main options are available to integrate D2D backup in data protection infrastructures:
- General purpose disk storage systems, through DAS, SAN or NAS attachment. All enterprise-class backup software now supports disk as a backup medium, although with major differences in the ability to move data seamlessly between disks and tapes.
- Virtual Tape Libraries. These disk storage systems emulate tape libraries, enabling a fully transparent integration through the swap-and-replace of legacy tape libraries. Optimized for sequential data flow, they integrate capacity optimization technologies such as compression and deduplication, and can off-load from servers the task of moving backup sets from disk to tapes, or from a primary site to a recovery site.
The advanced features of VTLs make them increasingly attractive. Alternatively, general purpose disk storage systems preserve investments in terms of devices and know-how. In both cases, the best practice is now to use disk for daily and weekly backups, relegating tapes to deep backup and off-site vaulting.
Bull StoreWay delivers all the components of disk backup solutions. The Optima 1200 (ideally suited for D2D backup), EMC CX, DMX, and NetApp FAS are delivered with 750 GB or 1 TB SATA HDDs. StoreWay Virtuo is a VTL with optional connection to physical tape libraries, to deliver very high capacity with low cost per GB and green efficiency. StoreWay Virtuo fully automates tasks such as moving backup data sets from disk to tapes, duplicating media, and copying data sets to disaster recovery sites. The StoreWay Calypso suite offers advanced D2D solutions, for storage systems and VTLs, with a seamless and powerful migration of data sets between disks and tapes, based on data life-cycle management policies.
2- Snapshots, the new recovery tier
Snapshots are widely available technologies for creating a near-instantaneous image of a data set without provisioning the full capacity. Snapshots help to solve backup window issues by allowing production and backup operations to run in parallel.
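As an illustration of how an image can be created almost instantly without provisioning the full capacity, the following minimal sketch uses copy-on-write, one common snapshot technique. It is an assumption made for illustration (products may use other techniques, such as redirect-on-write), and the `Volume` class and its methods are hypothetical names, not a product API.

```python
class Volume:
    """A toy block volume supporting copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # Creating a snapshot copies no data: it starts as an empty map
        # of block index -> preserved original content.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting a block, preserve its old content in every
        # snapshot that has not yet saved that block (copy-on-write).
        for snap in self.snapshots:
            snap.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        # Preserved blocks come from the snapshot; unchanged blocks are
        # still read from the live volume.
        return snap.get(index, self.blocks[index])


vol = Volume(["a", "b", "c", "d"])
snap = vol.snapshot()          # near-instantaneous, consumes no block capacity
vol.write(1, "B")              # production keeps running after the snapshot
image = [vol.read_snapshot(snap, i) for i in range(4)]
print(image, "blocks held by snapshot:", len(snap))
```

In this sketch the snapshot only consumes capacity for blocks that change after it is taken, which is why a backup can read a stable image while production continues to write.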
More recently, snapshots have come to be used directly as a data set to restore from. While a snapshot is not a protection against device failure, it is a valuable tool for recovering from deletion and corruption issues, the most frequent causes of recovery requests. Snapshots can be taken as often as hourly, significantly improving the RPO. The RTO can also be improved, as the recovery data is online.
Recent improvements within leading data protection software, such as StoreWay Calypso, provide a seamless integration of snapshot targets in the backup catalogue, fully automating and simplifying the use of snapshots. Such advanced capabilities enable the delivery of data-centre wide, sub-hour RPO with RTO measured in minutes, with virtually no management overhead.
3- Deduplication, the promise to divide the cost of D2D by a factor of 20 to 50
Data deduplication offers an impressive space saving ratio compared to compression algorithms. Its efficiency relies on information redundancy: for example, files sent to several recipients, or multiple versions of a document with few differences. Another typical scenario is data protection, as daily backup sets usually contain files that are identical or very similar to those in the previous day’s set. Data deduplication detects identical bit streams, stores them once, and uses pointers for further occurrences. The efficiency clearly depends on the stored data set and, from that point of view, backup data sets offer a very high potential for deduplication.
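The principle of storing identical bit streams once and using pointers for further occurrences can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any product: it assumes fixed-size chunks identified by a SHA-256 digest, whereas real implementations may use variable-size chunking and other identification schemes. The `DedupStore` class is a hypothetical name.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: fixed-size chunks, keyed by SHA-256 digest."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size   # assumption: fixed-size chunking
        self.chunks = {}               # digest -> chunk bytes, stored once

    def put(self, data):
        # Split the data into chunks and return a "recipe": the list of
        # digests (pointers) needed to rebuild it.
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if new
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        # Rebuild the original data by following the pointers.
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"      # one chunk changed since Monday
r1 = store.put(monday)
r2 = store.put(tuesday)
logical = len(monday) + len(tuesday)
print("logical bytes:", logical, "stored bytes:", store.stored_bytes())
```

In this toy example two full backups share three of their four chunks, so only five chunks are stored for eight written. The ratio achieved in practice depends entirely on how redundant the data set is.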
This impressive space-saving capability has several positive impacts:
- D2D backup is less expensive, as less capacity is required.
- Data replication to remote sites, mainly for disaster recovery purposes, is also more efficient. Duplicated bit sequences are not transmitted on the wire, enabling the use of lower speed links, greater distances and, of course, cost savings.
- Data protection processes can be simplified. Instead of trying to optimize using a mix of full, incremental, and differential policies, storage administrators can run daily full backups and let data deduplication optimize the stored and copied data sets.
Data deduplication is clearly positioned as a technology that enables the retention of a huge amount of data on disk at a much lower cost than traditional solutions. Because deduplication is a processing-intensive operation, its performance is a compromise: it sits between high-performance D2D backup, based on general purpose storage systems or non-deduplicated VTLs, and legacy tape-based backups.
Data deduplication is available in several StoreWay products, such as the Calypso suite, EMC Centera, and NetApp FAS. As this technology is very promising, complementary solutions offering data deduplication are being investigated.
4- Data Replication, to recover from a disaster in minutes or a few hours
For years, disk arrays have enabled the replication of information in real time to a remote site, enabling the resumption of operations in minutes or hours and limiting data loss to a few transactions. Data replication delivers crash-consistent data sets, so that applications can be restarted on a recovery site just as they would be restarted locally after an unplanned server reboot.
What is new, and must be taken into account, is how universal this technology has become:
- Data replication is available from high-end to low-end disk arrays, for SAN, NAS, DAS, and internal HDDs.
- Data replication can be delivered outside the storage array, either at the server level or within the storage network.
- Synchronous data replication can be implemented beyond campus areas, thanks to the development of WDM technologies and optical MANs.
- The convergence of storage and IP protocols simplifies interoperability, and the increased tolerance to latency and link failures allows replication over corporate WANs or over IP services from telco operators, as well as replication over longer distances.
- Data replication is applied to both production data sets and secondary data sets, such as D2D backup on general purpose storage systems or VTLs.
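To give a sense of why distance matters for synchronous replication, the rough calculation below estimates the propagation delay added to each write that must wait for a remote acknowledgement. It is a back-of-the-envelope sketch, not a sizing tool: it assumes light travels at roughly 200,000 km/s in optical fibre and one round trip per write, and it ignores equipment delay, protocol overhead and the actual fibre route length. The function name is hypothetical.

```python
# Assumption: light travels at roughly 200,000 km/s in optical fibre,
# i.e. about 200 km per millisecond.
FIBRE_KM_PER_MS = 200.0


def sync_write_penalty_ms(distance_km, round_trips=1):
    """Propagation delay added to one synchronous write, in milliseconds.

    Counts only the time for the signal to travel to the remote site and
    back; switches, arrays and protocol handshakes add further delay.
    """
    return round_trips * 2 * distance_km / FIBRE_KM_PER_MS


for distance in (10, 100, 500):
    penalty = sync_write_penalty_ms(distance)
    print(f"{distance:>4} km: about {penalty:.1f} ms added per write")
```

Under these assumptions, each additional 100 km adds about one millisecond to every synchronous write, which is one reason asynchronous replication is generally preferred over longer distances.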
Unquestionably, data replication must be revisited and evaluated as a replacement for legacy off-site tape vaulting. It is no longer optional for delivering 24 x 7 operations, and even when service level objectives are less demanding, data replication greatly secures and simplifies recovery procedures. The integration with server-based virtualization such as VMware offers unprecedented flexibility.
Bull has already deployed data replication over distances of up to several hundred kilometers, and offers array-based, network-based and host-based data replication solutions. The StoreWay portfolio is rich in data replication solutions: the Calypso suite, the FDA, Optima 5000, EMC CX, DMX and NetApp FAS disk arrays, the IBM SVC virtualization appliance (Vivo), and the Virtuo VTL all provide data replication.
5- Continuous Data Protection, a new paradigm for data protection
CDP captures changes in real time, without application downtime, and stores them on another storage system. CDP removes the backup window and provides a near-zero RPO, meaning virtually no data loss. Changes are logged, enabling recovery from any point in time.
Application-aware CDP avoids the drawback of restarting from crash-consistent rather than application-consistent recovery points. Cooperation with the application allows application-consistent data sets to be flushed to disk periodically, and these points in time are tagged in the replication flow. Recovering applications from these tagged recovery points is a compromise on the RPO, which is tied to the frequency of the application synchronization, but it greatly improves the RTO.
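The idea of logging every change and tagging application-consistent points can be sketched with a simple journal. This is a minimal illustration under stated assumptions, not the mechanism of any product: it models the protected data as a key-value dictionary and identifies points in time by a sequence number rather than a timestamp. The `CdpJournal` class and its methods are hypothetical names.

```python
class CdpJournal:
    """Toy CDP journal: logs every write and replays it up to a chosen point."""

    def __init__(self, initial):
        self.base = dict(initial)   # state when protection started
        self.entries = []           # ordered log of (sequence, key, value)
        self.tags = {}              # label -> sequence of a consistent point

    def write(self, key, value):
        # Every change is captured as it happens; nothing is overwritten.
        seq = len(self.entries) + 1
        self.entries.append((seq, key, value))
        return seq

    def tag(self, label):
        # Called when the application reports that it has flushed a
        # consistent data set; marks the current position in the log.
        self.tags[label] = len(self.entries)

    def recover(self, seq):
        # Rebuild the state as it was after the given sequence number.
        state = dict(self.base)
        for entry_seq, key, value in self.entries:
            if entry_seq > seq:
                break
            state[key] = value
        return state

    def recover_tagged(self, label):
        return self.recover(self.tags[label])


journal = CdpJournal({"balance": 100})
journal.write("balance", 80)
journal.write("pending", 20)
journal.tag("app-consistent-1")    # application flushed a consistent state
journal.write("balance", 0)        # e.g. an unwanted change or corruption

print(journal.recover(1))                          # any point in time
print(journal.recover_tagged("app-consistent-1"))  # tagged recovery point
```

Recovering to an arbitrary sequence number gives the finest RPO, while recovering to a tagged point gives up the changes made after the tag in exchange for a state the application can restart from directly.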
Using StoreWay Calypso, storage administrators can either freeze recovery points on demand through snapshotting, or select application-aware CDP, with automatic creation of application-consistent recovery points.
Taking the broader view
StoreWay promotes a global strategy to meet or exceed SLAs while minimizing costs. It is based on data classification and a combination of two approaches: preventive (“how do I avoid my data being lost?”) and remedial (“I have lost my data, how do I get it back?”).
On the preventive side, Bull identifies two major domains:
- Protection against the failure of isolated hardware components, such as disks, controllers, network switches, ports, HBAs, etc.
- Archiving. Perhaps surprisingly, Bull considers archiving to be part of a preventive data protection strategy. Preserving data for long periods is a challenge: it involves identifying the information that needs to be archived, deciding how long it needs to be retained and with what levels of security and authentication, and classifying and indexing the information so that it can be searched and retrieved.
On the remedial side, there are also two domains:
- Data recovery. Responding to data loss, data recovery restores the lost data from a data copy. The most frequent cause of lost data is accidental deletion, but other causes include viruses, software failure, and major hardware failure.
- Disaster recovery. This is the IT implementation of an organization’s Business Continuity plan. For the IT systems, this typically means responding to the loss of an entire data centre through events such as fire, flood, electrical failure, terrorism, and civil unrest.
A major challenge for IT management is to consider these four areas as parts of the same data protection jigsaw, and not to manage them as separate data protection silos. But the benefits are measurable. For example, archiving information reduces the volume of daily backups. A robust failure prevention implementation enables the safe deployment of a snapshot-based recovery tier. Don’t hesitate to review your strategy with StoreWay consultants.
IT managers should continually re-evaluate their data protection implementations, and check that they are aligned with the changing risk management requirements of their company.
Bull storage consultants, Bull professional services, and the StoreWay portfolio provide all the ingredients for state-of-the-art solutions, tailored to the business objectives of each project.
¹ SNIA: Storage Networking Industry Association
² D2D: Disk to Disk
³ D2D2T: Disk to Disk to Tape