It seems that the more data we have, the more complicated and voluminous the backups become. I was recently asked to think about backups from a database archiving perspective. Here are a couple of easy pointers that could help simplify your backups:
1) Ensure that production databases only contain data that needs to be backed up – It is often estimated that up to 90% of the data in a production database is static (that is, historical) and therefore accessed far less frequently than the 10% that is constantly being updated. Given this, the static 90% can be relocated to a purpose-built data retention repository, where de-duplication and compression significantly reduce the data footprint while the data stays online and accessible for ongoing look-ups and queries. In this scenario, backing up the combined repositories runs far more efficiently and is therefore more cost-effective: if 10 terabytes of historical data can be kept online but reduced to less than 1 terabyte, the overall cost of managing that data drops considerably. Furthermore, test, development, and cloned copies of the production database become smaller and cheaper to create. A minimal sketch of such an archiving job appears after this list.
2) Do not use backups as an archiving strategy – Backups should serve principally as a disaster recovery mechanism. However, because the cost of keeping data online and accessible has traditionally been high, backups have often doubled as an archiving option. Retrieving data from traditional backups is an expensive process, particularly from tape, because the data must first be restored to its original form before end users can access it. Today there is an alternative: specialized data repositories that keep data online and accessible at a cost comparable to tape while making on-demand retrieval effectively instantaneous. With a good online retention strategy handling the archive, backups can be simplified and refocused on disaster recovery; the second sketch below shows how the two retention policies can then be separated.
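To make the first pointer concrete, here is a minimal sketch of an archiving job. The table name, column names, file names, and the two-year cutoff are all hypothetical, chosen purely for illustration; a real system would batch the move, run it inside a transaction, and let the archive store apply de-duplication and compression. The idea is simply that historical rows are copied into a separate archive database and removed from production, so routine backups only have to cover the active fraction:

```python
import sqlite3
from datetime import datetime, timedelta

# Illustrative values: a two-year active-data window on a
# hypothetical "orders" table with a "created_at" timestamp.
CUTOFF = (datetime.now() - timedelta(days=730)).isoformat()
SCHEMA = "CREATE TABLE IF NOT EXISTS orders (id INTEGER, created_at TEXT, payload TEXT)"

prod = sqlite3.connect("production.db")  # hypothetical production database
archive = sqlite3.connect("archive.db")  # hypothetical retention repository

prod.execute(SCHEMA)
archive.execute(SCHEMA)

# Copy historical rows into the archive repository...
rows = prod.execute(
    "SELECT id, created_at, payload FROM orders WHERE created_at < ?",
    (CUTOFF,),
).fetchall()
archive.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
archive.commit()

# ...then remove them from production, shrinking the backup footprint.
prod.execute("DELETE FROM orders WHERE created_at < ?", (CUTOFF,))
prod.commit()
```

Note that the archived rows remain queryable in the archive database the whole time; nothing has to be restored before an end user can look something up.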
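And for the second pointer, a small sketch of the separation of concerns, assuming a hypothetical 30-day disaster recovery window. Once long-term retention is the archive repository's job, the backup set only needs to cover the DR window, and everything older can be expired:

```python
from datetime import date, timedelta

# Illustrative policy: backups exist for disaster recovery only.
# Long-term retention is handled by the online archive repository,
# so the backup set does not need to stretch back for years.
DR_WINDOW = timedelta(days=30)

def prune_backups(backup_dates, today=None):
    """Split a backup set into the copies needed for DR and the rest."""
    today = today or date.today()
    cutoff = today - DR_WINDOW
    keep = [d for d in backup_dates if d >= cutoff]
    expire = [d for d in backup_dates if d < cutoff]
    return keep, expire

# Example: a year of daily backups shrinks to roughly 30 once
# archiving is no longer the backup system's responsibility.
backups = [date.today() - timedelta(days=n) for n in range(365)]
keep, expire = prune_backups(backups)
print(f"retain {len(keep)} backups for DR, expire {len(expire)}")
```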