Cluster Heat Maps Could Help Insurers Make Sense of Big Data

Thiru Sampath
Insurance Experts' Forum, July 20, 2012

With more than 90 percent of today’s data having been produced within the last two years, we are experiencing a data explosion that makes traditional ad-hoc analysis obsolete for gaining business insights. This creates an ever-growing gap between the entities that generate so-called big data (for instance, a sensor that sends braking information from a vehicle) and the systems that successfully analyze that data to yield business insights.

Statisticians often cluster large amounts of data to make sense of them. Similar data points are grouped into the same cluster, and the clusters are kept as far apart from each other as possible.

A heat map is another powerful method employed to visualize big data. It plots data with various colors according to its “intensity.” In the end, it resembles a thermal image.

In a cluster heat map, the rows and columns are hierarchically clustered. Data points are laid out as a matrix of rectangular tiles, with each tile shaded on a color scale to represent the value of the corresponding element. The axes are ordered so that similar rows and similar columns sit near each other. Cluster heat maps are already used extensively in bioinformatics, where the volume of data is huge. They’re beginning to be used on insurance and healthcare data.
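The axis ordering described above can be sketched in a few lines. This is a minimal illustration, assuming average-linkage agglomerative clustering on Euclidean distance; the function name `cluster_order` and the tiny matrix of counts are made up for the example, and a production tool would more likely rely on a statistics library.

```python
from itertools import combinations
from math import dist


def cluster_order(rows):
    """Return row indices ordered by average-linkage agglomerative clustering.

    Each cluster is a list of row indices. Merging two clusters concatenates
    their lists, so rows that merge early end up next to each other.
    """
    clusters = [[i] for i in range(len(rows))]

    def avg_distance(a, b):
        # Average Euclidean distance over all cross-cluster pairs of rows.
        return sum(dist(rows[i], rows[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters[0] if clusters else []


# Made-up matrix of values, one row per group and one column per area.
matrix = [
    [9, 8, 1, 0],
    [1, 0, 9, 8],
    [8, 9, 0, 1],
    [0, 1, 8, 9],
]
row_order = cluster_order(matrix)
col_order = cluster_order(list(zip(*matrix)))  # cluster the columns the same way
reordered = [[matrix[r][c] for c in col_order] for r in row_order]
```

After reordering, rows with similar profiles are adjacent, which is what makes blocks of similar color visible in the finished map.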

For instance, consider a “pay how you drive” system where a black box installed in your vehicle sends data to a system that then keeps track of violations. Your premium can be determined from the number of violations you have compared with the mean violations in the area where you drive. To analyze the mean violations in an area, the following can be done.
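As a rough illustration of comparing a driver with the area mean, the sketch below scales a base premium by the difference between the two. The function name `adjusted_premium` and the `sensitivity`, `floor` and `cap` values are hypothetical choices for the example, not figures from any real rating plan.

```python
from statistics import mean


def adjusted_premium(base_premium, driver_violations, area_violations,
                     sensitivity=0.25, floor=0.5, cap=2.0):
    """Scale a base premium by how a driver compares with the area mean.

    sensitivity, floor and cap are hypothetical tuning values.
    """
    area_mean = mean(area_violations)
    # Positive when the driver has more violations than the area average.
    difference = driver_violations - area_mean
    factor = 1.0 + sensitivity * difference
    # Keep the adjustment within the assumed floor and cap.
    return base_premium * min(max(factor, floor), cap)


# Made-up violation counts for other drivers in the same area.
area = [2, 4, 6]
quote = adjusted_premium(1000.0, driver_violations=6, area_violations=area)
```

A driver who matches the area mean pays the base premium under this scheme; the floor and cap simply stop a single extreme count from dominating the result.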

Since the violation counts are the data points, they can be tracked according to parts of the city (on the x-axis) and by the age of the driver (on the y-axis). Both of these axes can be hierarchically clustered. On the x-axis, a major metropolitan area like Detroit can be divided into cities (Detroit, Farmington Hills, Royal Oak, Birmingham, etc.), and each city can be subdivided into zip codes. The y-axis can have age categories such as 30 and under and over 30. The 30-and-under category can be subdivided into bands such as 17 to 21 and 22 to 30.
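The two-level axes can be sketched as a simple aggregation. The code below assumes each record is a `(city, zip_code, driver_age, violation_count)` tuple, treats the younger group as ages 30 and under with inclusive bands, and uses placeholder zip labels; the bands above 30 and all names are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical age hierarchy: (top-level group, band, low, high), bounds inclusive.
AGE_BANDS = [
    ("30 and under", "17 to 21", 17, 21),
    ("30 and under", "22 to 30", 22, 30),
    ("over 30", "31 to 50", 31, 50),       # assumed band
    ("over 30", "51 and over", 51, 200),   # assumed band
]


def age_band(age):
    """Return (group, band) for a driver's age."""
    for group, band, low, high in AGE_BANDS:
        if low <= age <= high:
            return group, band
    raise ValueError(f"age {age} is outside the configured bands")


def mean_violations(records, detail=True):
    """Mean violations per heat-map cell.

    detail=True keys cells by ((city, zip), (group, band));
    detail=False rolls up to (city, group).
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [violation sum, driver count]
    for city, zip_code, age, violations in records:
        group, band = age_band(age)
        cell = ((city, zip_code), (group, band)) if detail else (city, group)
        totals[cell][0] += violations
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}
```

Keeping sums and counts rather than precomputed means is what lets the same records be rolled up from zip code to city without distorting the averages.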

The result is a cluster heat map. Once you’ve built it from data collected from various subjects, it can be used as a quick guide to estimate the risk involved in insuring an individual from a particular area. The map can also give insight into how risk relates across the regions of the city and how it shifts from one age group to the next. Color-coding can then be used to represent the “cells” (for instance, red, green, blue and black, in that order, to represent violation levels).
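One way to turn cell values into the four colors mentioned above is to split the observed range into equal-width bins. This is only a sketch: the names `PALETTE`, `color_for` and `color_cells` are invented here, and it assumes the first color in the list stands for the lowest bin, which the example in the text does not specify. Reverse the palette if the opposite reading is intended.

```python
# Colors in the order given in the text; which end means "most violations"
# is an assumption (here the first color is the lowest bin).
PALETTE = ("red", "green", "blue", "black")


def color_for(value, low, high, palette=PALETTE):
    """Pick a color by equal-width binning of value within [low, high]."""
    if high <= low:
        return palette[0]  # all cells share one value
    position = (value - low) / (high - low)
    index = int(position * len(palette))
    # Clamp so that the maximum value falls in the last bin.
    return palette[min(max(index, 0), len(palette) - 1)]


def color_cells(cell_means):
    """Map each cell's mean violation count to a color name."""
    if not cell_means:
        return {}
    low, high = min(cell_means.values()), max(cell_means.values())
    return {cell: color_for(value, low, high) for cell, value in cell_means.items()}
```

Equal-width bins are the simplest choice; quantile bins would spread the colors more evenly when a few cells have unusually high counts.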

Cluster heat maps serve as a quick two-dimensional scorecard that can represent billions of data points. As the amount of data grows exponentially, cluster heat maps will become increasingly valuable to insurers that must make sense out of mountains of data.

Thiru Sampath is a business intelligence architect at X by 2, Inc., in Farmington Hills, Mich. X by 2 is a technology company specializing in software and data architecture and transformation projects.

Thiru can be reached at tsampath@xby2.com.

This blog was exclusively written for Insurance Networking News. It may not be reposted or reused without permission from Insurance Networking News.

The opinions of bloggers on www.insurancenetworking.com do not necessarily reflect those of Insurance Networking News.
