4 Commonsense Strategies for Data Testing

Syed Haider
Insurance Experts' Forum, September 5, 2012

Without thorough testing, databases often have hidden problems that eventually give insurers big headaches. In a previous blog, I laid out the reasons that data applications don’t get the same level of testing as software. Here are some steps IT managers and data architects can take to remedy that situation.


1. Get smart testers

For data projects, IT leadership should be wary of hiring "black-box" testers whose core strength is comparing the output of a batch process or report against a given spec or legacy report. While such mechanical analysis is inevitably needed, we should still look for testers who have more advanced skills, such as:

• White-box testing. This requires the ability to understand complex SQL code (keep in mind that SQL and its variants, such as PL/SQL and T-SQL, are notoriously hard to document) and to create test data and scenarios that go beyond the “happy” path.

• Independent coding. This is the ability to write code against the specification, independently of the developers, whose output should match the output of the code being tested. This puts the testers in pretty much the same league as the developers. IT managers should not underestimate this core competence, because it makes the task of organizing the testing group all the more complex.

• Data analysis. In BI projects, the end users’ confidence in IT gets a real boost when the testers perform rudimentary data analysis, such as basic trending, before the end users get to use the data.


2. Invest in configuration management

Since configuration management is still not as cut and dried in data projects as it is in the application development arena, it merits special attention upfront. One shouldn’t be deterred by the general lack of built-in features like revision management in most modern database platforms, as there are some decent third-party tools emerging in this space.


3. Automate routine testing

The idea here is to recycle your test-cases into automated “smoke-tests” that can be incorporated into production jobs. This way, we can eliminate the need for manual testing of routine production processes, such as incoming/outgoing feeds and batch processes.

It’s a win-win situation for all because smart testers wouldn’t typically enjoy the mundanity of such testing. Smoke tests can be written at a purely database level and don’t require any specialized tool to execute. All we really need is some sort of a framework that orchestrates these tests. Consider the following scenarios:

• As an incoming file is loaded into staging, the ETL invokes automated tests to do basic balancing of control totals against the loaded data. If it fails, an email goes out to the IT admins reporting a problem in the feed.

• As a number-crunching process finishes, it kicks off an automated test that runs a number of sniff tests using ‘acceptable’ thresholds (as specified by the business), and reports if it finds any data failing to meet them.
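The first scenario can be sketched in a few lines. The example below assumes the feed arrives with a control record stating the expected row count and total paid amount; the `staging_claims` table, its columns and the sample values are hypothetical, and sqlite3 stands in for whatever database the ETL actually loads.

```python
import sqlite3


def load_feed(conn, rows):
    """Load (claim_id, paid_amount_cents) rows into a staging table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_claims "
        "(claim_id TEXT, paid_amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO staging_claims VALUES (?, ?)", rows)


def balance_control_totals(conn, expected_count, expected_total_cents):
    """Compare loaded data with the feed's control totals.

    Returns a list of problem descriptions; an empty list means the feed balances.
    """
    count, total = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(paid_amount_cents), 0) FROM staging_claims"
    ).fetchone()
    problems = []
    if count != expected_count:
        problems.append(f"row count: expected {expected_count}, loaded {count}")
    if total != expected_total_cents:
        problems.append(f"paid amount: expected {expected_total_cents}, loaded {total}")
    return problems


conn = sqlite3.connect(":memory:")
load_feed(conn, [("C1", 10050), ("C2", 2500), ("C3", 99)])

# Control totals as they might appear in the feed's trailer record.
problems = balance_control_totals(conn, expected_count=3, expected_total_cents=12649)
```

Amounts are kept in whole cents so that the comparison is exact. When `problems` is not empty, the ETL job would pass the list to whatever notification step it uses, such as the e-mail to the IT admins described above.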

As we can see, it doesn’t matter which ETL or process-automation tool we employ. As long as the testing framework offers the following broad features, we can automate our testing. The framework should:

• enable developers/testers to define, activate/deactivate test cases and suites;

• enable a variety of test types: boundary cases, control totals, expected outcome, negative tests, exception scenarios, user-defined, etc.;

• withstand failures caused by incorrectly written or obsolete tests;

• communicate the results using some collaboration mechanism such as e-mail.
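A framework with these features does not have to be large. The following is a rough sketch, not a finished tool: the `SmokeTest` class, the `run_suite` function, the loss-ratio thresholds and the sample checks are all made up for illustration. Each test can be switched on or off, a test that crashes is reported without stopping the rest, and the results go to a `notify` callable that could just as well send an e-mail.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SmokeTest:
    name: str
    check: Callable[[], Optional[str]]  # returns None on pass, or a failure message
    active: bool = True                 # deactivated tests are skipped


def run_suite(tests, notify=print):
    """Run every active test and report data failures and broken tests separately."""
    failures, errors = [], []
    for test in tests:
        if not test.active:
            continue
        try:
            message = test.check()
        except Exception as exc:
            # An incorrectly written or obsolete test must not stop the suite.
            errors.append(f"{test.name}: test itself raised {exc!r}")
            continue
        if message is not None:
            failures.append(f"{test.name}: {message}")
    if failures or errors:
        notify("\n".join(["Smoke test problems:"] + failures + errors))
    return failures, errors


def loss_ratio_within_threshold(ratios, low=0.0, high=1.5):
    """Build a sniff test; the thresholds are placeholders for business-supplied values."""
    def check():
        bad = {k: v for k, v in ratios.items() if not low <= v <= high}
        return f"out of range: {bad}" if bad else None
    return check


def obsolete_check():
    """Stands in for a test that refers to something that no longer exists."""
    raise KeyError("column dropped_in_last_release")


suite = [
    SmokeTest("loss_ratio", loss_ratio_within_threshold({"MI": 0.62, "OH": 2.4})),
    SmokeTest("obsolete", obsolete_check),
    SmokeTest("disabled", lambda: "should never run", active=False),
]
```

Calling `run_suite(suite)` reports one data failure (the OH loss ratio) and one broken test, and skips the deactivated one. Keeping the two lists apart lets the recipients tell a data problem from a test that needs maintenance.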


4. Focus on regression testing

As new features and improvements get introduced into the database, consider regression testing. The best way to avoid last-minute surprises is to keep your regression-test suite up-to-date as development continues. The main decision we often have to make in this regard is whether to kick off a full regression every time or make it more targeted.

The answer lies in our ability to track dependencies. We cannot fully count on relational database management system platforms’ innate ability to report dependencies. Often, a broken dependency is reported by the end users in the most unexpected of places after we have promoted some new features.

There are some third-party solutions to track dependencies more effectively from outside the database, and if they can be employed, we can run a partial regression test on potentially affected objects. In the absence of such solutions, our best bet is to run a full regression.
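The choice between partial and full regression comes down to a walk over the dependency graph. Here is a minimal sketch, assuming the dependency map has already been obtained from some external tool; the object names, the `DEPENDENTS` map and the test names are invented for the example.

```python
from collections import deque

# Hypothetical dependency map: object -> objects that read from it.
DEPENDENTS = {
    "stg_claims": ["claims_fact"],
    "claims_fact": ["v_loss_ratio", "rpt_monthly_claims"],
    "policy_dim": ["v_loss_ratio"],
    "v_loss_ratio": ["rpt_executive_summary"],
}

# Hypothetical mapping from database objects to their regression tests.
REGRESSION_TESTS = {
    "claims_fact": ["test_claims_fact_totals"],
    "v_loss_ratio": ["test_loss_ratio_bounds"],
    "rpt_monthly_claims": ["test_monthly_claims_report"],
    "rpt_executive_summary": ["test_exec_summary"],
    "policy_dim": ["test_policy_dim_keys"],
}


def affected_objects(changed, dependents):
    """Breadth-first walk: the changed objects plus everything downstream of them."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        obj = queue.popleft()
        for child in dependents.get(obj, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def select_tests(changed, dependents=None, tests=REGRESSION_TESTS):
    """Pick a partial regression if dependencies are known, else fall back to a full one."""
    if dependents is None:
        return sorted({t for names in tests.values() for t in names})
    affected = affected_objects(changed, dependents)
    return sorted({t for obj in affected for t in tests.get(obj, [])})
```

With the map above, `select_tests(["policy_dim"], DEPENDENTS)` picks the three tests covering `policy_dim`, `v_loss_ratio` and `rpt_executive_summary`, while `select_tests(["policy_dim"])` returns the whole suite. The partial run is only as trustworthy as the dependency map behind it.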

The above list of mitigating approaches is by no means exhaustive, but it does try to capture the big-ticket items. Let’s hope that some of these strategies become unnecessary in the future as new developments make life simpler for database developers and testers.

Syed Haider is an architect with X by 2, a technology company in Farmington Hills, Mich., specializing in software and data architecture and transformation projects for the insurance industry.

This blog was exclusively written for Insurance Networking News. It may not be reposted or reused without permission from Insurance Networking News.


Comments (1)

very wisely stated. unfortunately any application implementation ignores Data migration to be a project in its own thereby under-estimating the time, skills and efforts

Posted by: Sridevi R | September 5, 2012 10:05 PM
