Jeff Tofano - SEPATON
Archives for: March 2012
Warning: Big Data will Break Your Backup
11:10:41 am, Categories: Notes
The volume of data that every enterprise deals with continues to grow at an alarming rate, and backup has been at the forefront of this wave. In fact, the amount of data being protected is growing so fast, and the existing solutions are so broken, that it's like pouring gasoline on a fire. But this is just the beginning: as technologies like cloud and big data take hold, the problem will get significantly worse, and most traditional storage approaches will fail under the weight of this growth. The "big data" wave will ripple through enterprise environments, starting in primary storage and building into a tsunami of epic proportions by the time it reaches the data protection environment. It will create fire drill after fire drill.
Backup is the first place customers will feel the pain. Any non-scalable data protection technology will leave them with a hard choice: keep buying difficult-to-manage data protection silos, or switch to truly scalable data protection architectures that have a chance of solving the problem. "Big Data Backup" is about to hit us all and break the infrastructure we've been building for the last decade.
But the trend won't stop there. Current methodologies for archiving and staging data for traditional analytics platforms will also fail as data sets grow to sizes that turn simple data transfers into major pain points. The traditional scale-out NAS approach just won't keep up. Newer, massively scalable object repositories will become the only viable solutions going forward, and any second-tier solutions not built around these contemporary technologies will eventually fall out of competitive position.
The pain of building scalable primary cloud and analytic repositories is already being felt. Most enterprise customers simply can't approach the problem the way Google, Yahoo, and other engineering-dominant companies do. Massively scalable, manageable analytic platforms are needed, and they must interoperate efficiently with similarly designed, scalable data protection platforms going forward.
Interoperability is not just a nice-to-have; it is essential. Really big data needs to move as little as possible, and when it does move, efficiency becomes critical. Most traditional analytics storage technologies are already buckling under the current load, and with no slowdown in sight, we're sure to see these approaches fail completely in the near future.
So what is a customer to do? First, be aware of the size and scope of the problem. It's overwhelming, and it's going to get much, much worse. Second, be cautious about committing to tactical fixes and traditional siloed approaches. Even if they let you "get by" for a while, they will eventually fail on all fronts, causing massive added cost, weak manageability, and data movement that will crush you.
Third, start evaluating emerging "Big Data Backup," "Big Archive," and "Big Data" solutions, and build plans to integrate them into your storage ecosystem soon. Finally, remember that emerging solutions must not only scale and perform cost-effectively; they must also be manageable in every way. IT staff are being asked to manage more and more data per person. In fact, the ratio of IT staff to volume of data managed is getting ridiculous, so issues like ease of deployment, upgrade, and normal operation are critical. These solutions must also automate data management and integrity tasks, periodically scrubbing data to catch silent corruption. The bigger data stores get (and they will get huge in the coming years), the more data integrity becomes an issue to worry about.
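The data scrubbing mentioned above boils down to periodically re-reading stored data and verifying it against recorded checksums. As a rough illustration of the idea (the `scrub` function and the JSON manifest format here are my own sketch, not any vendor's actual implementation), a minimal scrub pass might look like:

```python
import hashlib
import json
import os

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files are never loaded whole into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def scrub(root, manifest_path):
    """Compare every file listed in a checksum manifest against its current contents.

    The manifest is assumed to be a JSON map of {relative_path: hex_digest},
    written when the data was originally stored. Returns the paths whose
    checksum no longer matches -- candidates for repair from a replica.
    """
    with open(manifest_path) as f:
        manifest = json.load(f)
    corrupted = []
    for rel_path, expected in manifest.items():
        full = os.path.join(root, rel_path)
        if not os.path.exists(full) or sha256_of(full) != expected:
            corrupted.append(rel_path)
    return corrupted
```

Production systems do this continuously in the background and repair from redundant copies automatically; the point is that at petabyte scale this verification has to be built into the platform, not left to operators.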
In an upcoming blog post, I will discuss in more detail the demands that big data is placing on backup environments.