Enterprising Developments

Where are You Going to Store All That Big Data?

Joe McKendrick
Insurance Experts' Forum, August 2, 2013

There's no end to the chatter out there about the power and awesomeness of big data. (Okay, I'm guilty of that as well.) And we're only getting started — big data has many applications we haven't even dreamed of yet.

But nobody is really talking about where insurance companies are going to put all that data.

The problem is, once data is converted into something meaningful — customer records, insights, communications, analyses — there's more of an onus to keep it around. In fact, there may even be legal requirements (or threats of legal actions) that necessitate holding data for seven years to life.

So, when an organization is dealing with 500 terabytes of data from various sources and for various purposes, guess what? Even allowing for compression, that's at least 100 terabytes of disk space. The data needs to be stored on disks or tapes, and still be accessible. Then there's metadata, or data about the data, on top of that.
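The arithmetic above can be sketched in a few lines. This is a rough sizing estimate only; the 5:1 compression ratio and 10 percent metadata overhead are illustrative assumptions, not figures from the article:

```python
def storage_footprint_tb(raw_tb, compression_ratio=5.0, metadata_overhead=0.10):
    """Estimate on-disk footprint: compressed data plus metadata overhead.

    compression_ratio and metadata_overhead are illustrative assumptions;
    real ratios vary widely by data type (text compresses well, images don't).
    """
    compressed = raw_tb / compression_ratio
    return compressed * (1 + metadata_overhead)

# 500 TB raw at 5:1 compression, plus 10% metadata overhead:
print(round(storage_footprint_tb(500), 1))  # 110.0 TB on disk
```

In practice the ratio depends heavily on the mix of structured records, documents, and media, so any such estimate should be validated against a sample of real workloads.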

So the numbers for disk storage systems, which could also include off-site or cloud-type storage, begin to add up. What is the smart way to handle all this overhead?

Bill Kleyman, writing in Data Center Knowledge, provides an excellent discussion of what needs to be considered when storing all that big data — especially for a distributed, “cloud-ready” environment:

Consider bandwidth. For efficiency, it’s important to calculate bandwidth, Kleyman says, and this requires understanding a number of factors, such as “distance the data has to travel (number of hops), failover requirements, amount of data being transmitted, and the number of users accessing the data concurrently.”
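A back-of-the-envelope bandwidth calculation along these lines might look like the following sketch. The 70 percent link-efficiency figure and the example workload are hypothetical assumptions, not numbers from Kleyman:

```python
def transfer_time_hours(data_gb, link_gbps, concurrent_users=1, efficiency=0.7):
    """Rough wall-clock estimate for moving data over a shared link.

    efficiency is an assumed factor for protocol overhead and hop latency;
    concurrent_users models the link being shared equally.
    """
    effective_gbps = link_gbps * efficiency / concurrent_users
    seconds = (data_gb * 8) / effective_gbps  # gigabytes -> gigabits
    return seconds / 3600

# Hypothetical example: replicating 10 TB over a 1 Gbps link shared by 4 users
print(f"{transfer_time_hours(10_000, 1.0, concurrent_users=4):.1f} hours")
```

Even a crude model like this makes Kleyman's point concrete: failover and replication windows that look fine on paper can stretch to days once concurrency and overhead are factored in.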

Develop a replication policy. “In some cases, certain types of databases or applications being replicated between storage systems have their own resource needs. Make sure to identify where the information is going and create a solid replication policy.”

Pick the right storage platform. Factors to consider include whether the system can support planned and future utilization, how easily data can be migrated, and what data control mechanisms it offers.

Control the data flow. “Basically, there needs to be consistent visibility in how storage traffic is flowing and how efficiently it’s reaching the destination.”

Use intelligent storage (thin provisioning/deduplication). An intelligent dedup strategy will free up immense amounts of space on disks, Kleyman advises. In addition, he adds, “look for controllers which are virtualization-ready. This means that environments deploying technologies like VDI, application virtualization or even simple server virtualization should look for systems which intelligently provision space – without creating unnecessary duplicates.”
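To illustrate why deduplication frees so much space, here is a toy sketch of block-level dedup: fixed-size blocks are stored once, keyed by a content hash. Production systems add variable-size chunking, reference counting, and collision handling; this is a minimal illustration only:

```python
import hashlib

def dedup_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and store each unique block once,
    keyed by its SHA-256 digest. A recipe of digests lets the original
    byte stream be reconstructed from the deduplicated store."""
    store = {}   # digest -> unique block contents
    recipe = []  # ordered digests to rebuild the data
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe

# Highly repetitive data (common in VDI images) dedupes dramatically:
data = b"A" * 4096 * 100 + b"B" * 4096 * 100
store, recipe = dedup_blocks(data)
print(len(recipe), len(store))  # 200 logical blocks, 2 unique blocks stored
```

This is exactly why Kleyman singles out virtualization workloads: hundreds of near-identical desktop or server images collapse into a small set of unique blocks.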

Joe McKendrick is an author, consultant, blogger and frequent INN contributor specializing in information technology.

Readers are encouraged to respond to Joe. He can be reached at joe@mckendrickresearch.com.

This blog was exclusively written for Insurance Networking News. It may not be reposted or reused without permission from Insurance Networking News.

The opinions of bloggers on www.insurancenetworking.com do not necessarily reflect those of Insurance Networking News.

