Do You Need Big Data Governance? Maybe.
Insurance Networking News, January 25, 2013
Your book talks about big data governance as part of a broader data governance plan. You mention issues like politics and stakeholders, but how else is it similar or different?
I think the disciplines of traditional data governance apply to big data. You’ve got to think about data quality, metadata, privacy, managing the information lifecycle and the people who act as stewards. Where it differs, first, is in the implementation. If you are thinking about the example I gave you of MDM and social media, you’ve got to ask yourself, “Do I need a customer steward who understands the ins and outs of privacy laws and regulations in social media?” Or, instead, “Do I need a dedicated social media steward who can negotiate with legal and privacy on what we can and cannot do?”
That’s a big leap to make in terms of commitment and maybe funding.
Yes, it is. Some clients I talked to said they started out having the customer steward deal with all the social media, and they very quickly ran out of steam. They couldn’t do the governance and their day job, or even gather all the expertise around regulations. So instead, in that example, they appointed dedicated social media stewards.
Wouldn’t an issue like data quality raise a similar conflict?
When you get into data quality now you’re looking at different things like how to deal with streaming data that’s flowing in, and that’s a different kind of data quality than most in that field have dealt with. In some examples you are trying to match multiple feeds from different sensors, maybe a temperature sensor and a motion sensor. You might expect the temperature sensor to respond 10 times a second. For some reason you lose three seconds and that’s potentially a data quality issue. In the book I talk about temporal alignment and the rate of arrival. It’s a different implementation of data quality, though things like metadata still apply. If you’re thinking about clickstream analytics, which is big data, how do you define a unique visitor to a website? How do you define a session, one that is closed or one that is returned to while open? I found many governance issues in that vein that may not have been considered.
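To make the rate-of-arrival point concrete, here is a minimal Python sketch of a gap check on a sensor feed. It is an illustration only, not a method taken from the book: the function name `find_gaps`, the `tolerance` parameter and the sample timestamps are all assumptions made for this example.

```python
from typing import List, Tuple


def find_gaps(
    timestamps: List[float],
    expected_hz: float = 10.0,
    tolerance: float = 1.5,
) -> List[Tuple[float, float]]:
    """Return (last_seen, next_seen) pairs where readings arrived too far apart.

    timestamps  -- arrival times in seconds, sorted ascending (assumed)
    expected_hz -- how many readings per second the sensor should deliver
    tolerance   -- hypothetical slack factor; an interval longer than
                   tolerance / expected_hz is flagged as a gap
    """
    max_interval = tolerance / expected_hz
    gaps = []
    # Compare each reading with the one that follows it.
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > max_interval:
            gaps.append((prev, curr))
    return gaps


# Made-up feed: a 10 Hz temperature sensor that goes silent
# between t = 2.0 s and t = 5.0 s, then resumes.
readings = [i / 10 for i in range(0, 21)] + [i / 10 for i in range(50, 61)]

for start, end in find_gaps(readings):
    print(f"No readings between {start:.1f}s and {end:.1f}s")
```

Whether a three-second silence counts as a quality defect, and how much slack to allow, is a policy decision; the sketch only shows where such a rule would be applied.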
In your book you seem to use dictionaries and metadata as the connecting point of where these things can be aligned. Is that a kind of overlay or abstraction as opposed to an attempt to conform the data?
Yes, exactly, and if you want to align a customer’s Twitter feeds with their master record, you still have to define what a customer is. You think about whether customers are prospects or active clients just like in any other system.
What are some of the unknowns in big data governance that companies need to manage before they take their experiments out of quarantine and into production?
First, you are right, I haven’t seen a lot of companies ready to integrate their big data governance policies with the rest. There are just so many things that need to be understood first, which is why governance is there in the first place. If I work in credit, can I use your Twitter account to make a loan decision? If I am in collections, can I use Facebook info under the Fair Debt Collection Practices Act? You definitely have to start writing policies by jurisdiction. The state of Maryland and others now have policies that don’t allow employers to use social data to pre-screen candidates. There are concerns that a lot of social media contains protected information like age, race, gender or sexual orientation. You cannot consult social media and later claim you didn’t discriminate with that knowledge.
It reminds me of some of the unintended consequences marketers have experienced using analytics against customer records that backfired after they dug too deeply into a person’s history.
I think that’s a similar challenge for big data because so many regulations are evolving. There’s also reason to worry about reputational backlash if you cross a line with social information that is also deemed personal. I advise clients to be conscious of both the regulatory and reputational areas, but remind them there are many types of big data to take advantage of. That can also be a problem when you start integrating multiple types of data and focus a lot of analytical power that can push the edges of privacy, but again, that’s where governance comes in. It will be interesting to follow how people in privacy and legal departments will have their own take on data governance and risk.